JDK-8319662 : ForkJoinPool trims worker threads too slowly
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.concurrent
  • Affected Version: 22
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2023-11-07
  • Updated: 2024-03-26
  • Resolved: 2023-12-06
JDK 22
22 b27 (Fixed)
Description
Unlike other ExecutorServices, ForkJoinPool releases (kills and allows GC, etc.) at most one worker thread per keepAlive interval, which can be extremely slow. (This problem worsened under JDK-8288899.) A better approach, which still conforms to specs and user expectations, is to trigger follow-on trims after the first with almost-immediate deadlines. For example, this would shorten a full trim on a 64-thread pool with the default 1-minute keepAlive setting from more than an hour to just over a minute.
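The arithmetic behind the 64-thread example can be sketched as follows. This is only a back-of-the-envelope model, not the ForkJoinPool implementation: the class and method names are hypothetical, and the 20 ms follow-on delay is an assumed stand-in for "almost-immediate". Under those assumptions, one-at-a-time trimming takes 64 x 60,000 ms = 3,840,000 ms (64 minutes), while follow-on trimming takes 60,000 + 63 x 20 = 61,260 ms.

```java
// Hypothetical model of total time to trim an idle pool; not JDK code.
class TrimEstimate {
    // Behavior described in this report: at most one worker is released
    // per keepAlive interval, so the total grows linearly with pool size.
    static long oneAtATimeMillis(int workers, long keepAliveMillis) {
        return workers * keepAliveMillis;
    }

    // Proposed behavior: the first trim waits the full keepAlive, and each
    // follow-on trim uses a short deadline (followOnDelayMillis is assumed).
    static long followOnMillis(int workers, long keepAliveMillis, long followOnDelayMillis) {
        if (workers <= 0) {
            return 0L;
        }
        return keepAliveMillis + (workers - 1) * followOnDelayMillis;
    }

    public static void main(String[] args) {
        long slow = oneAtATimeMillis(64, 60_000L);
        long fast = followOnMillis(64, 60_000L, 20L);
        System.out.println("one per keepAlive: " + slow + " ms");
        System.out.println("follow-on trims:   " + fast + " ms");
    }
}
```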
Comments
Yes, to be clear, another issue was only discovered because this issue caused a pool thread to terminate much sooner than expected. But if I set a keepAlive time of 30s, then I would expect a worker to wait for work for at least 30s. I really don't understand how the original issue with too-slow trimming arose. If I have a dozen threads and there is no work, then after 30s all would time out.
26-03-2024
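
For readers unfamiliar with where a keepAlive of 30s comes from, the following is a minimal sketch of configuring it through the ten-argument ForkJoinPool constructor (available since Java 9). The class name KeepAlivePool, the helper newPool, and the particular values for the other arguments are illustrative assumptions, not settings taken from this report or from the virtual thread scheduler.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

// Hypothetical helper showing where keepAliveTime is supplied; not JDK code.
class KeepAlivePool {
    static ForkJoinPool newPool(int parallelism, long keepAliveSeconds) {
        return new ForkJoinPool(
            parallelism,
            ForkJoinPool.defaultForkJoinWorkerThreadFactory,
            null,               // no uncaught exception handler
            false,              // asyncMode off: LIFO local queues
            0,                  // corePoolSize: 0 behaves like the default
            parallelism + 256,  // maximumPoolSize (illustrative value)
            1,                  // minimumRunnable
            null,               // saturate predicate: none
            keepAliveSeconds,   // idle time before a worker may be trimmed
            TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        ForkJoinPool pool = newPool(4, 30L);
        Callable<Integer> task = () -> 6 * 7;
        int result = pool.submit(task).get();
        System.out.println("result = " + result + ", pool size = " + pool.getPoolSize());
        pool.shutdown();
    }
}
```

How long an idle worker actually waits before terminating depends on the trimming policy discussed in this issue, so the configured value should be read as an input to that policy rather than a guarantee.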

The JVMTI test GetOwnedMonitorInfoTest doesn't rely on this; the issue that David linked to is a bug related to threads terminating while holding a monitor entered with JNI MonitorEnter. If nothing else, an implNote in ForkJoinPool on trimming might be useful. The FJP instance used for virtual threads uses a keep-alive of 30s, and it can easily be changed to choose a sensible default for the current system.
23-03-2024

The change was to time out proportionally to full deployment, which is closer to what we think people expect and improves Loom behavior. I hadn't realized that anything relied on not doing such things. Sigh. I'll contemplate changes.
22-03-2024

[~dl] Hi Doug, I don't think this fix is quite working as intended - at least it has a surprising effect. If you create a short-lived virtual thread, followed by a second, then when the pool worker thread waits for its next task, it doesn't wait for anywhere near the keep-alive time, but instead for keepAlive/parallelism, based on this code:

    if (quiet) {                       // use timeout if trimmable
        int nt = (short)(qc >>> TC_SHIFT);
        long delay = keepAlive;        // scale if not at target
        if (nt != (nt = Math.max(nt, parallelism)) && nt > 0)
            delay = Math.max(TIMEOUT_SLOP, delay / nt);
        if ((deadline = delay + System.currentTimeMillis()) == 0L)
            deadline = 1L;             // avoid zero
    }

On my system with a parallelism of 16, the worker thread times out and terminates after 1875ms instead of the expected keep-alive of 30 seconds (30000/16 = 1875). This was detected via test/hotspot/jtreg/serviceability/jvmti/GetOwnedMonitorInfo/GetOwnedMonitorInfoTest.java - see JDK-8327743. I instrumented the test and FJP:

    TEST: Virtual-Worker-Thread-0 is terminating at 1710894777172
    TEST: Virtual-Worker-Thread-1 is terminating at 1710894778215
    DEBUG: ForkJoinPool-1-worker-1 is terminating at 1710894780090 <= 1875ms after vthread terminated

Filed: JDK-8328769
22-03-2024
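
The scaling in the snippet quoted in the comment above can be reproduced in isolation. The sketch below copies only the delay computation into a standalone method; the class and method names are hypothetical, the total thread count is passed in directly rather than decoded from the pool's control word, and the TIMEOUT_SLOP value of 20 ms is an assumption. With keepAlive = 30000 ms, parallelism = 16, and a single live worker, it yields 1875 ms, matching the timestamps reported above.

```java
// Standalone model of the quoted delay computation; not the JDK source.
class ScaledKeepAlive {
    static final long TIMEOUT_SLOP = 20L; // assumed value, in milliseconds

    // totalCount stands in for the thread count that the quoted code
    // extracts from the control word.
    static long scaledDelay(int totalCount, int parallelism, long keepAlive) {
        int nt = totalCount;
        long delay = keepAlive;
        // The left operand is read before the assignment, so the test is
        // true only when the thread count is below the parallelism target.
        if (nt != (nt = Math.max(nt, parallelism)) && nt > 0) {
            delay = Math.max(TIMEOUT_SLOP, delay / nt);
        }
        return delay;
    }

    public static void main(String[] args) {
        System.out.println(scaledDelay(1, 16, 30_000L));  // below target: scaled
        System.out.println(scaledDelay(16, 16, 30_000L)); // at target: full keepAlive
    }
}
```

In this model, a pool that never reaches its parallelism target always uses the scaled delay, which is consistent with the early worker termination observed in the test.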

Changeset: cc25d8b1
Author: Doug Lea <dl@openjdk.org>
Date: 2023-12-06 16:12:59 +0000
URL: https://git.openjdk.org/jdk/commit/cc25d8b12bbab9dde9ade7762927dcb8d27e23c5
06-12-2023

A pull request was submitted for review.
URL: https://git.openjdk.org/jdk/pull/16725
Date: 2023-11-19 17:36:01 +0000
19-11-2023

See https://github.com/openjdk/jdk/pull/16725
19-11-2023