Bug ID: JDK-8319662 ForkJoinPool trims worker threads too slowly

Type: Bug
Component: core-libs
Sub-Component: java.util.concurrent
Affected Version: 22

Priority: P4
Status: Resolved
Resolution: Fixed

Submitted: 2023-11-07
Updated: 2024-03-26
Resolved: 2023-12-06

Fixed in: JDK 22, build b27

Unlike other ExecutorServices, ForkJoinPool releases (terminates and makes eligible for GC) at most one worker thread per keepAlive interval, which can be extremely slow. (This problem worsened under JDK-8288899.) A better approach, which still conforms to specs and user expectations, is to trigger follow-on trims after the first with almost-immediate deadlines. For example, this would shorten a full trim of a 64-thread pool with the default 1-minute keepAlive setting from more than an hour to just over a minute.
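
The arithmetic behind the 64-thread example can be sketched as follows. This is a minimal model, not the pool's actual code: the class TrimEstimate, its method names, and the 20 ms follow-on deadline are assumptions chosen for illustration; the report only says that follow-on deadlines are almost immediate.

```java
import java.time.Duration;

final class TrimEstimate {
    // Behavior described in the report: at most one worker is released
    // per keepAlive interval, so a full trim takes workers * keepAlive.
    static Duration onePerInterval(int workers, Duration keepAlive) {
        return keepAlive.multipliedBy(workers);
    }

    // Proposed behavior: the first trim waits the full keepAlive, and each
    // follow-on trim waits only a short deadline (followOn is an assumed value).
    static Duration withFollowOnTrims(int workers, Duration keepAlive, Duration followOn) {
        if (workers <= 0) {
            return Duration.ZERO;
        }
        return keepAlive.plus(followOn.multipliedBy(workers - 1));
    }

    public static void main(String[] args) {
        Duration keepAlive = Duration.ofMinutes(1);  // default keepAlive per the report
        Duration followOn = Duration.ofMillis(20);   // assumption, not taken from the report
        System.out.println(onePerInterval(64, keepAlive).toMinutes());             // 64
        System.out.println(withFollowOnTrims(64, keepAlive, followOn).toMillis()); // 61260
    }
}
```

Under these assumptions the full trim drops from 64 minutes to roughly 61 seconds, which matches the "more than an hour to just over a minute" estimate.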

Yes, to be clear, another issue was only discovered because this issue caused a pool thread to terminate much, much sooner than expected. But if I set a keepAlive time of 30s, then I would expect a worker to wait for work for at least 30s. I really don't understand how the original issue with too-slow trimming arose. If I have a dozen threads and there is no work, then after 30s all would time out.
26-03-2024
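
For readers who want to try the 30s expectation described above, here is a minimal sketch of creating a pool with an explicit keep-alive through the public ten-argument ForkJoinPool constructor. The class KeepAlivePoolDemo, the helper newPool, and the chosen pool-size arguments are illustrative assumptions, not settings taken from this report.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

final class KeepAlivePoolDemo {
    // Builds a pool whose idle workers are eligible for trimming after keepAlive.
    static ForkJoinPool newPool(int parallelism, long keepAlive, TimeUnit unit) {
        return new ForkJoinPool(
                parallelism,
                ForkJoinPool.defaultForkJoinWorkerThreadFactory,
                null,         // no UncaughtExceptionHandler
                false,        // asyncMode off
                0,            // corePoolSize (assumed; 0 lets all workers be trimmed)
                parallelism,  // maximumPoolSize (assumed equal to parallelism)
                1,            // minimumRunnable
                null,         // saturate predicate: default behavior
                keepAlive, unit);
    }

    public static void main(String[] args) {
        ForkJoinPool pool = newPool(4, 30, TimeUnit.SECONDS);
        try {
            // Run one short task so that at least one worker is started.
            int answer = pool.submit(() -> 21 * 2).join();
            System.out.println("result = " + answer);
            System.out.println("pool size after task = " + pool.getPoolSize());
        } finally {
            pool.shutdown();
        }
    }
}
```

Polling getPoolSize() while the pool sits idle is one way to observe when workers are actually released; the timing seen will depend on the JDK build in use.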
The JVMTI test GetOwnedMonitorInfoTest doesn't rely on this; the issue that David linked to is a bug related to threads terminating while holding a monitor entered with JNI MonitorEnter. If nothing else, an implNote in ForkJoinPool on trimming might be useful. The FJP instance used for virtual threads uses a keep-alive of 30s, and it can easily be changed to choose a sensible default for the current system.
23-03-2024
The change was to time out proportionally to full deployment, which is closer to what we think people expect and improves Loom behavior. I hadn't realized that anything relied on not doing such things. Sigh. I'll contemplate changes.
22-03-2024
[~dl] Hi Doug, I don't think this fix is quite working as intended; at least it has a surprising effect. If you create a short-lived virtual thread, followed by a second, then when the pool worker thread waits for its next task, it doesn't wait for anywhere near the keep-alive time, but instead for keepAlive/parallelism, based on this code:

    if (quiet) {                // use timeout if trimmable
        int nt = (short)(qc >>> TC_SHIFT);
        long delay = keepAlive; // scale if not at target
        if (nt != (nt = Math.max(nt, parallelism)) && nt > 0)
            delay = Math.max(TIMEOUT_SLOP, delay / nt);
        if ((deadline = delay + System.currentTimeMillis()) == 0L)
            deadline = 1L;      // avoid zero
    }

On my system with a parallelism of 16, the worker thread times out and terminates after 1875ms instead of the expected keep-alive of 30 seconds (30000/16 = 1875). This was detected via test/hotspot/jtreg/serviceability/jvmti/GetOwnedMonitorInfo/GetOwnedMonitorInfoTest.java - see JDK-8327743. I instrumented the test and FJP:

    TEST: Virtual-Worker-Thread-0 is terminating at 1710894777172
    TEST: Virtual-Worker-Thread-1 is terminating at 1710894778215
    DEBUG: ForkJoinPool-1-worker-1 is terminating at 1710894780090 <= 1875ms after vthread terminated

Filed: JDK-8328769
22-03-2024
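
The scaling in the snippet quoted in the comment above can be reproduced in isolation to check the 1875ms figure. This is a simplified sketch, not the ForkJoinPool source: the class ScaledKeepAlive and the method delayMillis are hypothetical, totalWorkers stands in for the count extracted from the pool's control word, and the TIMEOUT_SLOP value of 20 ms is an assumption.

```java
final class ScaledKeepAlive {
    // Assumed lower bound on the delay, in milliseconds (stand-in for TIMEOUT_SLOP).
    static final long TIMEOUT_SLOP = 20L;

    // Mirrors the quoted logic: if the pool has fewer workers than its
    // parallelism target, the keep-alive is divided by that target.
    static long delayMillis(int totalWorkers, int parallelism, long keepAliveMillis) {
        int nt = Math.max(totalWorkers, parallelism);
        if (totalWorkers != nt && nt > 0) {
            return Math.max(TIMEOUT_SLOP, keepAliveMillis / nt);
        }
        return keepAliveMillis; // at or above target: full keep-alive
    }

    public static void main(String[] args) {
        // One active worker, parallelism 16, keep-alive 30s, as in the comment.
        System.out.println(delayMillis(1, 16, 30_000L));  // 1875
        // A fully deployed pool keeps the whole keep-alive.
        System.out.println(delayMillis(16, 16, 30_000L)); // 30000
    }
}
```

In this model, a pool that has started only a few of its workers times them out far sooner than the configured keep-alive, which is the effect reported as JDK-8328769.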
Changeset: cc25d8b1
Author: Doug Lea <dl@openjdk.org>
Date: 2023-12-06 16:12:59 +0000
URL: https://git.openjdk.org/jdk/commit/cc25d8b12bbab9dde9ade7762927dcb8d27e23c5
06-12-2023
A pull request was submitted for review.
URL: https://git.openjdk.org/jdk/pull/16725
Date: 2023-11-19 17:36:01 +0000
19-11-2023
See https://github.com/openjdk/jdk/pull/16725
19-11-2023

Relates :	JDK-8327743 - JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1
Relates :	JDK-8328769 - ForkJoinPool trims worker threads too quickly after JDK-8319662