JDK-4979017 : java.util.BitSet.equals(Object) can be optimized
  • Type: Enhancement
  • Component: core-libs
  • Sub-Component: java.util
  • Affected Version: 1.4.2
  • Priority: P5
  • Status: Resolved
  • Resolution: Fixed
  • OS: windows_xp
  • CPU: x86
  • Submitted: 2004-01-15
  • Updated: 2005-02-19
  • Resolved: 2005-02-19
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 6
6 b25Fixed
Related Reports
Relates :  
Description
Name: jl125535			Date: 01/15/2004


A DESCRIPTION OF THE REQUEST :
The equals() method of java.util.BitSet performs unnecessary slow checks to compare all items in the common units in use.
But given the INVARIANT about unitsInUse where all bits[i] items starting at i >= unitsInUse will all be zeroes, there is no need to compare the bits[] content if unitsInUse are different between both objects.
So, by just comparing unitsInUse, we can immediately detect when two BitSet's are different in most cases. This is a notable optimization for the case of sparse bitsets


JUSTIFICATION :
See the documented invariant within the Java source code of BitSet:
    /**
     * The number of units in the logical size of this BitSet.
     * INVARIANT: unitsInUse is nonnegative.
     * INVARIANT: bits[unitsInUse - 1] is nonzero unless unitsInUse is zero.
     */
    private transient int unitsInUse = 0;


EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
This is not a bug, there's no error, just that BitSet.equals perform a full scan of the compared bitsets in all cases, despite this could be avoided in most cases where unitsInUse are different between the two compared objects, and return immediately false instead of scanning.
ACTUAL -
When BitSet.equals compare two BitSet's A and B that have distinct unitsInUse, then

if A.unitsInUse < B.unitsInUse,
then:
    for all i >= A.unitsUnitUse,
        A.bits[i] == 0;
and also:
    there exists j in [A.unitsInUse, B.unitsInUse[, where
        B.bits[j] != 0;
So
    there exists j in [A.unitsInUse, B.unitsInUse[, where
       A.bits[i] == 0 and B.bits[j] != 0;

Necessarily, the two bitsets are then different...

---------- BEGIN SOURCE ----------
Change the implementation of the BitSet.equals(Object) method:

   public boolean equals(Object obj) {
        if (!(obj instanceof BitSet))
            return false;
        if (this == obj)
            return true;
        BitSet set = (BitSet) obj;

        // This fast check works only if both sets respect the INVARIANT
        // about their private unitsInUse property
        if (unitsInUse != set.unitsInUse)
            return false;

        // Check units in use by both BitSets
        for (int i = unitsInUse - 1; i >= 0; i--)
            if (bits[i] != set.bits[i])
                return false;
        return true;
    }

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
No need for a work-around, as this creates no run-time bug, but unnecessarily slow fullscans when comparing bitsets.

When performance is an issue with large collections of sparse bitsets, this simple optimization really allows faster code.
(Incident Review ID: 233272) 
======================================================================

Comments
EVALUATION Looks like a good optimization. This RFE is a good candidate for the next dot release. ###@###.### 2004-01-15 I am implementing the suggestion of optimizing methods based on the INVARIANTs in the source code. Unfortunately, the invariants are not always currently preserved (and this in turn gives rise to some obscure bugs), so this will have to be fixed first. ###@###.### 2005-1-27 03:01:22 GMT
27-01-2005