JDK-8323659 : LinkedTransferQueue add and put methods call overridable offer
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.concurrent
  • Affected Version: 21.0.2,22
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2024-01-12
  • Updated: 2024-07-03
  • Resolved: 2024-01-16
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 21 JDK 22 JDK 23
21.0.3-oracleFixed 22Fixed 23 b06Fixed
Related Reports
Relates :  
Description
The refactoring in JDK-8301341 resulted in `put` and `add` calling overridable `offer`. This change, while technically correct, does have an observable affect on existing code. For example, the below code snippet use to work with all previous JDK's, but no longer successfully executes the tasks with JDK 22 EA.

This snippet is a reduced artificial reproducer that demonstrates the issue. The crux of what the code is attempting to do is to scale the executor to max pool size.

I'm adding the regression label, as the affect of the change appears as a regression when upgrading from JDK 21 to JDK 22 EA. However, I do understand and accept that LinkedTransferQueue::offer is not really supposed to return false!

---
import java.util.concurrent.*;

public class Test {

    public static void main(String... args) throws Exception {
        ExecutorScalingQueue<Runnable> queue = new ExecutorScalingQueue<>();
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
            0,
            1,
            60,
            TimeUnit.SECONDS,
            queue,
            Executors.defaultThreadFactory(),
            new ForceQueuePolicy()
        );
        queue.executor = executor;

        final CountDownLatch countDownLatch = new CountDownLatch(10);

        class TestTask implements Runnable {
            @Override
            public void run() {
                countDownLatch.countDown();
                if (countDownLatch.getCount() > 0) {
                    executor.execute(TestTask.this);
                }
            }
        }

        executor.execute(new TestTask());
        countDownLatch.await();
        executor.shutdown();
        boolean terminated = executor.awaitTermination(10, TimeUnit.SECONDS);
        assert terminated;
        System.out.println("FINISHED");
    }

    static class ExecutorScalingQueue<E> extends LinkedTransferQueue<E> {
        ThreadPoolExecutor executor;

        @Override
        public boolean offer(E e) {
            // first try to transfer to a waiting worker thread
            if (tryTransfer(e) == false) {
                // check if there might be spare capacity in the thread pool executor
                int left = executor.getMaximumPoolSize() - executor.getCorePoolSize();
                if (left > 0) {
                    // reject queuing the task to force the thread pool
                    // executor to add a worker if it can; combined
                    // with ForceQueuePolicy, this causes the thread
                    // pool to always scale up to max pool size and we
                    // only queue when there is no spare capacity
                    return false;
                } else {
                    return super.offer(e);
                }
            } else {
                return true;
            }
        }
    }

    static class ForceQueuePolicy implements RejectedExecutionHandler {

        @Override
        public void rejectedExecution(Runnable task, ThreadPoolExecutor executor) {
            if (executor.isShutdown()) {
                throw new RejectedExecutionException();
            } else {
                put(executor, task);
                // we need to check again the executor state as it might have been concurrently shut down; in this case
                // the executor's workers are shutting down and might have already picked up the task for execution.
                if (executor.isShutdown() && executor.remove(task)) {
                    throw new RejectedExecutionException();
                }
            }
        }

        void put(ThreadPoolExecutor executor, Runnable task) {
            try {
                executor.getQueue().put(task);
            } catch (final InterruptedException e) {
                throw new AssertionError(e);
            }
        }
    }
}
Comments
FYI, Corretto 21.0.2 was respinned with this and other fixes: https://github.com/corretto/corretto-21/blob/release-21.0.2.14.1/CHANGELOG.md
13-02-2024

As another data point, to help determine if a 21.0.2.1 is warranted or not, CrateDB also ran into this issue in 21.0.2, see https://github.com/crate/crate/pull/15408
24-01-2024

There is another regression with 21.0.2: JDK-8275509 with fix JDK-8290041. And I think I would find two more we could include if we do a respin... Ah, one is JDK-8310844. The other is JDK-8318562/JDK-8322985.
17-01-2024

I'm on the fence about whether a respin is warranted or not. It's an incredibly impactful issue, at least the way that it affects Elasticsearch - tasks submitted to the executor service vanish without a trace! But I do accept that the specific use case may not be all that common. For me this that the decider. If we think others are likely to run into it, then a respin is warranted for sure. Otherwise, it can wait for 03.
17-01-2024

Fix request [21u] Clean backport. New and existing tests passing in CI. The fix reverts an accidental behavioural change that was backported from JDK 22 EA. The behavioural change has been reverted in JDK 22 EA also.
17-01-2024

A pull request was submitted for review. URL: https://git.openjdk.org/jdk21u-dev/pull/180 Date: 2024-01-17 11:21:18 +0000
17-01-2024

A pull request was submitted for review. URL: https://git.openjdk.org/jdk21u/pull/433 Date: 2024-01-17 10:35:16 +0000
17-01-2024

Let's fix this for 21.0.3. I'm recognizing that this might cause problems for some, but it doesn't seem critical enough to warrant a 21.0.2.1 release.
17-01-2024

Sounds like this warrants the 21.0.2 respin, [~sgehwolf]?
17-01-2024

A pull request was submitted for review. URL: https://git.openjdk.org/jdk22/pull/80 Date: 2024-01-16 12:23:43 +0000
16-01-2024

Changeset: ee4d9aa4 Author: Chris Hegarty <chegar@openjdk.org> Date: 2024-01-16 12:13:57 +0000 URL: https://git.openjdk.org/jdk/commit/ee4d9aa4c11c47e7cf15f2742919ac20311f9ea7
16-01-2024

Setting fix version to 23, as per https://mail.openjdk.org/pipermail/jdk-dev/2023-December/008560.html
12-01-2024

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/17393 Date: 2024-01-12 10:38:40 +0000
12-01-2024

I raised this PR just to help with the discussion and confirm that the minimal change works as expected. https://github.com/openjdk/jdk/pull/17393
12-01-2024

Elasticsearch issue where we're encountering this: https://github.com/elastic/elasticsearch/issues/103963
12-01-2024