United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
JDK-6529800 : (coll) ArrayList.removeAll should be O(n), but is O(n*n)

Details
Type:
Bug
Submit Date:
2007-03-01
Status:
Closed
Updated Date:
2012-10-08
Project Name:
JDK
Resolved Date:
2011-05-17
Component:
core-libs
OS:
windows_xp
Sub-Component:
java.util:collections
CPU:
x86
Priority:
P3
Resolution:
Fixed
Affected Versions:
6
Fixed Versions:

Related Reports

Sub Tasks

Description
A DESCRIPTION OF THE REQUEST :
ArrayList.removeAll runs in worst case O(n^2) time. The problem with ArrayList.removeAll was discovered in Sun's Java 6 SDK. ArrayList does not directly implement the removeAll method: rather, it is implemented by its superclass AbstractCollection. The listing below shows the implementation of that method.

 1  public boolean removeAll(Collection c) {
 2    boolean modified = false;
 3    Iterator e = iterator();
 4    while (e.hasNext()) {
 5      if (c.contains(e.next())) {
 6        e.remove();
 7        modified = true;
 8      }
 9    }
10    return modified;
11  }

This raises red flags because remove operations on array lists are O(n) operations, so while the code above is optimal for a linked list, it is definitely not for an array list. In particular, at every iteration in the loop, it is possible to trigger a remove, hence the worst case performance is O(n2).

JUSTIFICATION :
There is no reason why removeAll can't have O(n) worst-case performance.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
removeAll should have worst case O(n) behavior, disregarding the behavior of the call to "contains"
ACTUAL -
removeAll has worst case O(n^2) behavior, assuming O(1) behavior for the "contains" invocation.

---------- BEGIN SOURCE ----------
http://www.ahmadsoft.org/code/arraytest.zip
---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
See http://www.ahmadsoft.org/articles/removeall/index.html for two O(n) implementations as well as benchmarks and analysis.

                                    

Comments
EVALUATION

Amin Ahmad has put up a beautiful web page analysing this problem:
http://www.ahmadsoft.org/articles/removeall/index.html

I think we should go with Amin's solution #1.
                                     
2007-03-01



Hardware and Software, Engineered to Work Together