United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
Bug ID: JDK-4979017 java.util.BitSet.equals(Object) can be optimized
JDK-4979017 : java.util.BitSet.equals(Object) can be optimized

Details
Type:
Enhancement
Submit Date:
2004-01-15
Status:
Resolved
Updated Date:
2005-02-19
Project Name:
JDK
Resolved Date:
2005-02-19
Component:
core-libs
OS:
windows_xp
Sub-Component:
java.util
CPU:
x86
Priority:
P5
Resolution:
Fixed
Affected Versions:
1.4.2
Fixed Versions:

Related Reports
Relates:

Sub Tasks

Description
Name: jl125535			Date: 01/15/2004


A DESCRIPTION OF THE REQUEST :
The equals() method of java.util.BitSet performs unnecessary slow checks to compare all items in the common units in use.
But given the INVARIANT about unitsInUse where all bits[i] items starting at i >= unitsInUse will all be zeroes, there is no need to compare the bits[] content if unitsInUse are different between both objects.
So, by just comparing unitsInUse, we can immediately detect when two BitSet's are different in most cases. This is a notable optimization for the case of sparse bitsets


JUSTIFICATION :
See the documented invariant within the Java source code of BitSet:
    /**
     * The number of units in the logical size of this BitSet.
     * INVARIANT: unitsInUse is nonnegative.
     * INVARIANT: bits[unitsInUse - 1] is nonzero unless unitsInUse is zero.
     */
    private transient int unitsInUse = 0;


EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
This is not a bug, there's no error, just that BitSet.equals perform a full scan of the compared bitsets in all cases, despite this could be avoided in most cases where unitsInUse are different between the two compared objects, and return immediately false instead of scanning.
ACTUAL -
When BitSet.equals compare two BitSet's A and B that have distinct unitsInUse, then

if A.unitsInUse < B.unitsInUse,
then:
    for all i >= A.unitsUnitUse,
        A.bits[i] == 0;
and also:
    there exists j in [A.unitsInUse, B.unitsInUse[, where
        B.bits[j] != 0;
So
    there exists j in [A.unitsInUse, B.unitsInUse[, where
       A.bits[i] == 0 and B.bits[j] != 0;

Necessarily, the two bitsets are then different...

---------- BEGIN SOURCE ----------
Change the implementation of the BitSet.equals(Object) method:

   public boolean equals(Object obj) {
        if (!(obj instanceof BitSet))
            return false;
        if (this == obj)
            return true;
        BitSet set = (BitSet) obj;

        // This fast check works only if both sets respect the INVARIANT
        // about their private unitsInUse property
        if (unitsInUse != set.unitsInUse)
            return false;

        // Check units in use by both BitSets
        for (int i = unitsInUse - 1; i >= 0; i--)
            if (bits[i] != set.bits[i])
                return false;
        return true;
    }

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
No need for a work-around, as this creates no run-time bug, but unnecessarily slow fullscans when comparing bitsets.

When performance is an issue with large collections of sparse bitsets, this simple optimization really allows faster code.
(Incident Review ID: 233272) 
======================================================================

                                    

Comments
EVALUATION

Looks like a good optimization. This RFE is a good candidate for the next dot release.
###@###.### 2004-01-15

I am implementing the suggestion of optimizing methods based on
the INVARIANTs in the source code.  Unfortunately, the invariants
are not always currently preserved (and this in turn gives rise to
some obscure bugs), so this will have to be fixed first.
###@###.### 2005-1-27 03:01:22 GMT
                                     
2005-01-27



Hardware and Software, Engineered to Work Together