JDK-6250140 : (coll) AbstractSet.removeAll implementation may lead to incorrect NullPointerExceptions
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util:collections
  • Affected Version: 5.0
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • OS: linux
  • CPU: x86
  • Submitted: 2005-04-04
  • Updated: 2017-06-12
Related Reports
Relates :  
Description
FULL PRODUCT VERSION :
java version "1.5.0_02"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_02-b09)
Java HotSpot(TM) Client VM (build 1.5.0_02-b09, mixed mode, sharing)


ADDITIONAL OS VERSION INFORMATION :
Linux itppc27 2.4.19-4GB #1 Mon Aug 4 23:38:42 UTC 2003 i686 unknown

A DESCRIPTION OF THE PROBLEM :
Collection.removeAll may optionally throw NullPointerException:
 
* @throws NullPointerException if this collection contains one or more
*         null elements and the specified collection does not support
*         null elements (optional).

But it may *not* throw NullPointerException if the *specified* collection contains one or more null elements and *this* collection does not support null elements.


The optimized implementation of removeAll in AbstractSet breaks this: for optimization may delegate to remove(Object), and that method may (depending on the concrete implementation) throw NullPointerException if passed 'null' (because the specification allows it here).

Either the documentation for Collection is incorrect (and should also allow NullPointerException here); or the implementation in AbstractSet should be fixed (catch NullPointerException for 'null' and ignore it).

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Execute the given code.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
No NullPointerException (as the conditions given in the specification are not given).
ACTUAL -
NullPointerException (see below)


ERROR MESSAGES/STACK TRACES THAT OCCUR :
Exception in thread "main" java.lang.NullPointerException
        at java.util.TreeMap.compare(TreeMap.java:1093)
        at java.util.TreeMap.getEntry(TreeMap.java:347)
        at java.util.TreeMap.remove(TreeMap.java:506)
        at java.util.TreeSet.remove(TreeSet.java:223)
        at java.util.AbstractSet.removeAll(AbstractSet.java:143)
        at AbstractSetBug.main(AbstractSetBug.java:18)
b

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.util.*;

public class AbstractSetBug {
    public static void main(String[] args) {
        Set s = new TreeSet();
        s.add("ABC");
        s.add("DEF");

        Set r = new HashSet();
        r.add(null);
        s.removeAll(r);
    }
}

---------- END SOURCE ----------
###@###.### 2005-04-04 17:23:43 GMT

Comments
EVALUATION The specifications and implementations of the *All methods are tricky. These methods can often be implemented in two ways, by iterating over "this" collection, or by iterating over the "specified" collection, and this will naturally lead to different exception specifications (assuming the implementation doesn't catch the exceptions). The intended implementation unfortunately leaks through to the specification. This duality is most reflected in removeAll, which logically calls remove on the intersection of two collections. Intersection is a symmetric operation, and so can typically be performed in mirrored ways, calling contains(E) on either "this" or the "specified" collection. In fact, it's more complicated, especially if you care about performance. If either collection has a "fast" contains method, it is best to iterate over the other collection. In the absence of such information, it is a little better to iterate over the larger collection (for cache effects). Sets are much more likely to have a fast contains method, so it is easy to understand why AbstractSet would want to iterate over the "specified" collection, while AbstractCollection is content to iterate over "this". retainAll seems to have the same issues. The implementation can build a data structure containing the intersection of the two collections and then replace this collection's data structure with the new one. Yet another idea, if neither collection is a Set, is to implement removeAll(c) with removeAll(new HashSet(c)). I recommend relaxing the exception spec to allow optional exceptions to be thrown by query operations on both collections, making the current implementation of AbstractSet.removeAll legal, and enabling further optimizations like the one in the previous paragraph. (This is a very big deal, replacing an order N*M algorithm with order N+M).
06-08-2005