Name: boT120536 Date: 04/02/2001
java version "1.3.0"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0)
Java HotSpot(TM) Client VM (build 1.3.0, mixed mode)
During an algorithm engineering course, we have looked for but did not
find an efficient means to implement a workset algorithm using the
java.util.Collection interfaces.
Workset algorithms are frequent in algorithm design and have
more or less the following form:
Set s; Object x; Collection y;
while (!s.isEmpty()) {
x = chooseAnyElement(s);
s.remove(x);
y = findNewElementsToProcess(x);
s.addAll(y);
}
A naive implementation of chooseAnyElement is as follows:
Iterator it;
while (!s.isEmpty()) {
it = s.iterator();
x = it.next();
s.remove(x); // or: it.remove();
...
}
There are two problems: First, a new iterator object is created in every
run. Although annoying, this is a minor issue and a direct consequence of
the fail-fast iterator concept (see also #4131252).
Second, if the set is a HashSet, the iterator always starts at
the same location (position zero). This will empty the hash table in an
unbalanced way and finally render a linear time algorithm to a quadratic
one. This clearly is not acceptable.
Note that worklist algorithms are easy to handle as they do not require iterators.
I came up with at least two alternatives:
Solution 1: Define a method Object removeOne() in Set
(or even Collection) that will efficiently remove and report an element
of any non-empty collection. To do so in acceptable time for hash tables
(O(1) in the average, amortized) would require an additional internal cyclic
index even though this functionality is not very frequently needed.
Solution 2: Define a method void reset() in Iterator
that sets the position back to the first element, updates the modification
count and works as a new iterator except that it might report elements
in a different order (unless the collection is ordered).
Iterator it = s.iterator();
while (!s.isEmpty()) {
x = it.next();
s.remove(x); // or: it.remove();
it.reset();
...
}
For hash tables, the iterator can simply count the number of elements
to report and cycle through the table until finished. As iterators are
comparably short lived objects, the additional field does not hurt too much.
If anyone comes up with a better idea, I would greatly appreciate that.
(Review ID: 119891)
======================================================================