JDK-7186403 : (coll) AbstractCollection.retainAll(coll) is slow when collection "coll" is empty
  • Type: Enhancement
  • Component: core-libs
  • Sub-Component: java.util:collections
  • Affected Version: 6u31
  • Priority: P5
  • Status: Open
  • Resolution: Unresolved
  • OS: linux_ubuntu
  • CPU: x86
  • Submitted: 2012-07-24
  • Updated: 2019-02-14
Related Reports
Relates :  
Description
FULL PRODUCT VERSION :
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)

ADDITIONAL OS VERSION INFORMATION :
Linux 3.0.0-12-generic #20-Ubuntu

A DESCRIPTION OF THE PROBLEM :
The method "AbstractCollection.retainAll(Collection<?> c)" is very
slow when the parameter "c" is empty (and "retainAll" should just
clear "this" collection).  I attached a test that exposes this problem
and a one-line patch that fixes it.  For this test, the patch provides
822X speedup on my machine.

The patch is very simple, just call:

if (c.isEmpty()) { clear(); return true; }

at the beginning of "retainAll".

  To run the test, do:

$ java Test

The output for the un-patched version is:
Time is 822

The output for the patched version is:
Time is 1

There are other ways to speed up "retainAll", but they are more
complex, e.g., bug 5028425, whereas the above one-line patch is
trivial and provides considerable speedup for certain usage scenarios.

The method "AbstractCollection.removeAll(Collection<?> c)" can be
similarly optimized by adding "if (c.isEmpty()) return false;", but
the speed up is not as dramatic as for "retainAll".

Can you please fix this behavior?


The patch is:
======================================================================
--- AbstractCollection.java	2012-07-23 14:04:13.525513594 -0500
+++ AbstractCollection_Fix.java	2012-07-23 14:20:49.725537261 -0500
@@ -363,6 +363,7 @@
      * @see #contains(Object)
      */
     public boolean retainAll(Collection<?> c) {
+        if (c.isEmpty()) { clear(); return true; }
 	boolean modified = false;
 	Iterator<E> e = iterator();
 	while (e.hasNext()) {
======================================================================


STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
$ java Test

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Time is 1
ACTUAL -
Time is 822

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.util.ArrayList;

public class Test {
    public static void main(String[] args) {
        ArrayList<Integer> collection = new ArrayList<Integer>();
        for (int i = 0; i < 100000; i++) collection.add(i);
        ArrayList<Integer> other = new ArrayList<Integer>();
        
        long start = System.currentTimeMillis();
        collection.retainAll(other);
        System.out.println("Time is " + (System.currentTimeMillis() - start));
    }
}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
I realize one can solve this problem from the caller site,
replacing "collection.retainAll(other)" with something like:

if (other.isEmpty()) collection.clear();
else collection.retainAll(other);

However, this clutters the caller sites a lot.