JDK-6602600 : Fast removal of cancelled scheduled thread pool tasks
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.concurrent
  • Affected Version: 7
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2007-09-10
  • Updated: 2017-05-16
  • Resolved: 2008-03-22
  • Fixed in: JDK 7 b25
Related Reports
Relates :  
Description
Doug Lea wrote:
Sun Feb 18 17:50:26 EST 2007

I'm finally looking into improving STPE's space/time performance in
cases where the vast majority of scheduled tasks are cancelled. This
is very common in many network-related applications, where
timed tasks deal with network timeouts that rarely occur.
This is a high priority issue for upcoming JSR203 usages.
(Hence CC to Alan Bateman, JSR203 spec lead.)

(Alan: Your replies might sit around for a while before gating to
list.)

I tentatively put in place a new internal specialization of
delay queue that lets tasks record their own index in the internal
priority queue, which allows much faster removal (O(log n) rather
than O(n)) without making anything else slower.
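To make the idea concrete, here is a minimal, hypothetical sketch of
such an index-aware heap (IndexedHeap and Node are invented names;
this is not the actual implementation, which works on delay-ordered
RunnableScheduledFutures):

    import java.util.Arrays;

    // Sketch only: a binary min-heap whose elements record their
    // current slot, so remove() can go straight to that slot
    // (O(log n)) instead of scanning the array (O(n)).
    final class IndexedHeap<E extends Comparable<E>> {

        static final class Node<E> {
            final E value;
            int heapIndex = -1;   // maintained by the heap on every move
            Node(E value) { this.value = value; }
        }

        @SuppressWarnings("unchecked")
        private Node<E>[] heap = new Node[16];
        private int size;

        void add(Node<E> n) {
            if (size == heap.length) heap = Arrays.copyOf(heap, size * 2);
            siftUp(size++, n);
        }

        boolean remove(Node<E> n) {
            int i = n.heapIndex;
            if (i < 0 || i >= size || heap[i] != n) return false; // not queued
            n.heapIndex = -1;
            Node<E> last = heap[--size];
            heap[size] = null;
            if (i != size) {              // refill the vacated slot
                siftDown(i, last);
                if (heap[i] == last) siftUp(i, last);
            }
            return true;
        }

        private void siftUp(int k, Node<E> n) {
            while (k > 0) {
                int parent = (k - 1) >>> 1;
                Node<E> p = heap[parent];
                if (n.value.compareTo(p.value) >= 0) break;
                heap[k] = p; p.heapIndex = k;   // record every move
                k = parent;
            }
            heap[k] = n; n.heapIndex = k;
        }

        private void siftDown(int k, Node<E> n) {
            int half = size >>> 1;
            while (k < half) {
                int child = 2 * k + 1;
                if (child + 1 < size
                    && heap[child].value.compareTo(heap[child + 1].value) > 0)
                    child++;
                Node<E> c = heap[child];
                if (n.value.compareTo(c.value) <= 0) break;
                heap[k] = c; c.heapIndex = k;   // record every move
                k = child;
            }
            heap[k] = n; n.heapIndex = k;
        }
    }

This looks sufficient to solve the problem, modulo the following
issues: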

1. Currently ScheduledFutureTask.cancel does NOT try to remove
the cancelled task. Instead, you must call STPE.remove(task), which
almost no one seems to know about. In fact, I think that failure
to call remove is responsible for a large part of the space/time
performance problems people experience.
One reason people don't call it is that remove
is not even in the ScheduledExecutorService interface; it is just a
TPE/STPE-specific method. Another reason is that when people
cancel tasks, they usually don't have a handle to the STPE
to call remove on. But most likely, people simply assume
that cancellation will remove the task from the queue, because
the specs don't say anything about this one way or the other, and
it is a natural assumption to make. I think we should
change this to remove automatically, and add a line to the javadoc
specs saying that we do so.

This would be a slightly controversial change: Code that
currently DOES call remove when cancelling will be functionally
unaffected and will even speed up a bit (even though there will now
be two removal attempts, both are faster).
It is conceivable, though, that someone out there actually relies
on cancelled tasks not being removed, so they can iterate through
them. But any code doing this would be unreliable at best, since
even now cancelled tasks are removed when they hit the head of the
queue. So I don't think this is a real concern.

Can anyone think of a reason not to do this?
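For reference, the pattern required today looks like the following
sketch. It assumes, as holds in the reference implementation, that
the returned ScheduledFuture is also the Runnable actually sitting
in the queue:

    import java.util.concurrent.*;

    class CancelAndRemove {
        public static void main(String[] args) throws Exception {
            ScheduledThreadPoolExecutor stpe =
                new ScheduledThreadPoolExecutor(1);

            // A "network timeout" that will almost never fire.
            ScheduledFuture<?> timeout = stpe.schedule(
                new Runnable() {
                    public void run() { System.out.println("timed out"); }
                },
                30, TimeUnit.SECONDS);

            // The common case: the awaited event arrives, so cancel.
            timeout.cancel(false);

            // Today cancel() leaves the task queued until it reaches
            // the head, so callers must also remove it explicitly.
            stpe.remove((Runnable) timeout);

            System.out.println("queued: " + stpe.getQueue().size()); // 0
            stpe.shutdown();
        }
    }

Alternatively, ThreadPoolExecutor.purge() attempts to remove all
cancelled Future tasks from the queue in one sweep, but almost no
one knows to call that, either.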

2. Our addition of decorators to STPE complicates the new
cancellation support: While ScheduledFutureTasks maintain heap
indices, the queued tasks could instead be decorated versions (which,
for all we know, could even be complete replacements). So the new
delay queue code falls back to linear scans for any task that is not
a ScheduledFutureTask. This means that cancellation will be slower
in such cases. Probably not a big deal -- the kinds of applications
requiring fast cancellation probably don't intersect much with those
that add before/after methods and the like via decorators. Or, can
anyone think of cases that do? (Even if so, there's not much we can do
about it.)
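To illustrate the decorator case, here is a hedged sketch of a
subclass that replaces queued tasks via decorateTask (the LoggingTask
wrapper is invented for the example):

    import java.util.concurrent.*;

    // After decoration the queue holds LoggingTasks rather than
    // ScheduledFutureTasks, so index-based removal cannot apply and
    // remove() degrades to the linear scan described above.
    class DecoratingSTPE extends ScheduledThreadPoolExecutor {
        DecoratingSTPE(int corePoolSize) { super(corePoolSize); }

        @Override
        protected <V> RunnableScheduledFuture<V> decorateTask(
                Runnable runnable, RunnableScheduledFuture<V> task) {
            return new LoggingTask<V>(task);
        }

        // Minimal wrapper adding before/after hooks, delegating the rest.
        static final class LoggingTask<V> implements RunnableScheduledFuture<V> {
            final RunnableScheduledFuture<V> delegate;
            LoggingTask(RunnableScheduledFuture<V> d) { this.delegate = d; }

            public void run() {
                System.out.println("before");
                delegate.run();
                System.out.println("after");
            }
            public boolean isPeriodic() { return delegate.isPeriodic(); }
            public long getDelay(TimeUnit unit) { return delegate.getDelay(unit); }
            public int compareTo(Delayed other) { return delegate.compareTo(other); }
            public boolean cancel(boolean mayInterrupt) {
                return delegate.cancel(mayInterrupt);
            }
            public boolean isCancelled() { return delegate.isCancelled(); }
            public boolean isDone() { return delegate.isDone(); }
            public V get() throws InterruptedException, ExecutionException {
                return delegate.get();
            }
            public V get(long timeout, TimeUnit unit)
                    throws InterruptedException, ExecutionException,
                           TimeoutException {
                return delegate.get(timeout, unit);
            }
        }
    }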

3. Automatic removal results in more contention on the single
lock protecting the delay queue. Because of this, I was expecting
the above heap-based scheme to not work so well and that I'd need
to turn to alternative concurrent PQ algorithms. But happily,
it seems to work fine. This is apparently because, even though
there are more calls to lock-protected methods, their internals are
faster, in turn mainly because the queue is kept smaller. So,
on balance, I'm unable to create tests where this lock seems to
be a bottleneck. On the other hand, I don't yet have a feeling
for how well it will work in the most demanding cases, because I
don't yet have good enough performance tests for such things (even
after getting some from Alan). Any thoughts on creating such tests
would be welcome.
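In that spirit, one possible shape for such a test is sketched below
(not a tuned benchmark; names invented): several threads schedule
far-off timeout tasks and immediately cancel them, so the queue lock
sees a mix of concurrent additions and removals:

    import java.util.concurrent.*;

    public class CancelStress {
        public static void main(String[] args) throws Exception {
            final ScheduledThreadPoolExecutor stpe =
                new ScheduledThreadPoolExecutor(4);
            final Runnable noop = new Runnable() { public void run() {} };
            final int producers = 8, perProducer = 100000;
            final CountDownLatch done = new CountDownLatch(producers);

            long start = System.nanoTime();
            for (int t = 0; t < producers; t++) {
                new Thread(new Runnable() {
                    public void run() {
                        for (int i = 0; i < perProducer; i++) {
                            // A far-off "timeout" that, as in the
                            // network case, never actually fires...
                            ScheduledFuture<?> f =
                                stpe.schedule(noop, 1, TimeUnit.HOURS);
                            // ...because the awaited event arrives first.
                            f.cancel(false);
                        }
                        done.countDown();
                    }
                }).start();
            }
            done.await();
            long elapsedMs = (System.nanoTime() - start) / 1000000;

            // With automatic removal the queue stays near empty;
            // without it, 800000 cancelled tasks linger here.
            System.out.println("elapsed ms: " + elapsedMs
                + ", queued: " + stpe.getQueue().size());
            stpe.shutdownNow();
        }
    }

Comparing the queue size and elapsed time with and without automatic
removal gives a rough read on both the space saving and the added
lock traffic.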

Comments
EVALUATION Here's a benchmark program that shows dramatic improvements, due to minimizing context switches in the new DelayedWorkQueue implementation:

    import java.util.concurrent.*;

    public class Stress {
        public static void main(String[] args) throws Throwable {
            final CountDownLatch count = new CountDownLatch(1000);
            final ScheduledThreadPoolExecutor pool =
                new ScheduledThreadPoolExecutor(100);
            pool.prestartAllCoreThreads();
            final Runnable incTask = new Runnable() {
                public void run() { count.countDown(); }};
            pool.scheduleAtFixedRate(incTask, 0, 10, TimeUnit.MILLISECONDS);
            count.await();
            pool.shutdown();
            pool.awaitTermination(1L, TimeUnit.DAYS);
        }
    }

    $ for v in dolphin concurrent; do echo $v; time jver $v jr Stress.java; done
    dolphin
    ==> javac -Xlint:all Stress.java
    ==> java -esa -ea Stress
    jver $v jr Stress.java  5.12s user 3.81s system 64% cpu 13.895 total
    concurrent
    ==> javac -Xlint:all Stress.java
    ==> java -esa -ea Stress
    jver $v jr Stress.java  1.99s user 0.63s system 18% cpu 13.893 total
26-09-2007

EVALUATION Doug Lea is providing an implementation
10-09-2007