JDK-8324983 : Race in CompileBroker::possibly_add_compiler_threads
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 11,17,21,22,23
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2024-01-30
  • Updated: 2024-02-12
  • Resolved: 2024-02-05
  • JDK 22: 22 (Fixed)
  • JDK 23: 23 b09 (Fixed)
Description
UseDynamicNumberOfCompilerThreads adds and removes compiler threads as necessary, and the logic is driven primarily by AbstractCompiler::_num_compiler_threads. When a thread is removed, the running CompilerThread decrements this number in CompileBroker::can_remove. Later, possibly_add_compiler_threads notices that it can spin up a new thread and uses JavaThread::start_internal_daemon to associate a new CompilerThread with the existing java.lang.Thread instance.

The problem is that CompileBroker::can_remove is called from the running CompilerThread before it has actually exited, and the exiting thread clears java.lang.Thread.eetop on its way out. A new CompilerThread can therefore be created before the old one has finished exiting, in which case the old thread's exit clears the eetop value that the new thread has just set. The result is that Thread.eetop is null for a JavaThread that is actually running. For C1 and C2 this mostly doesn't cause problems, but for libgraal it can lead to hangs, particularly with ReentrantLock, since a null eetop makes the thread appear dead.
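The interleaving described above can be replayed in a small, self-contained Java model. This is only a sketch under stated assumptions and is not HotSpot code: the class EetopRaceSketch, the method replayRace, and the eetop and numCompilerThreads variables are hypothetical stand-ins for java.lang.Thread.eetop and AbstractCompiler::_num_compiler_threads, and latches force the bad ordering that in the real VM only happens intermittently.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class EetopRaceSketch {

    /**
     * Replays the problematic ordering and returns the eetop value that the
     * replacement thread observes while it is still running.
     */
    public static long replayRace() {
        // Hypothetical stand-in for Thread.eetop on the reused Thread object;
        // 0 models "no native thread attached", 1 is the old thread.
        AtomicLong eetop = new AtomicLong(1L);
        // Hypothetical stand-in for AbstractCompiler::_num_compiler_threads.
        AtomicInteger numCompilerThreads = new AtomicInteger(1);

        CountDownLatch countDecremented = new CountDownLatch(1);
        CountDownLatch replacementAttached = new CountDownLatch(1);
        CountDownLatch oldThreadExited = new CountDownLatch(1);
        AtomicLong observed = new AtomicLong(-1L);

        Thread oldThread = new Thread(() -> {
            // Models can_remove: the count drops while this thread still runs.
            numCompilerThreads.decrementAndGet();
            countDecremented.countDown();
            // Forced delay: the exit happens only after the replacement attached.
            await(replacementAttached);
            // Models the exit path clearing eetop on the shared Thread object.
            eetop.set(0L);
            oldThreadExited.countDown();
        });

        Thread replacement = new Thread(() -> {
            // Models possibly_add_compiler_threads reacting to the lower count.
            await(countDecremented);
            numCompilerThreads.incrementAndGet();
            eetop.set(2L); // the new thread attaches to the same Thread object
            replacementAttached.countDown();
            await(oldThreadExited);
            // The replacement is still running, yet eetop now reads as 0.
            observed.set(eetop.get());
        });

        oldThread.start();
        replacement.start();
        try {
            oldThread.join();
            replacement.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed.get();
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prints "eetop seen by the running replacement thread: 0"
        System.out.println("eetop seen by the running replacement thread: " + replayRace());
    }
}
```

In this model the replacement thread reads 0 even though it is alive, which corresponds to the state that makes a running thread appear dead. The sketch only illustrates the ordering; it does not attempt to show how the changeset resolves it.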
Comments
Please make sure to push before RDP 3, which starts on the morning of Feb 8.
06-02-2024

A pull request was submitted for review. URL: https://git.openjdk.org/jdk22/pull/108 Date: 2024-02-06 16:46:23 +0000
06-02-2024

Also add 22 as affected version
06-02-2024

I suggest running mach5 testing with JDK 22 before integration.
06-02-2024

Approved for JDK 22.
06-02-2024

ILW = Deadlock in common Truffle workloads, occurs regularly in Truffle CI pipeline, -XX:-UseDynamicNumberOfCompilerThreads = HHL = P2
06-02-2024

Fix Request: This race can cause deadlock in Truffle workloads and can only be worked around with either -XX:+UnlockDiagnosticVMOptions -XX:-ReduceNumberOfCompilerThreads to disable idling of compiler threads or -XX:-UseDynamicNumberOfCompilerThreads to disable dynamic threads completely. We'd like to have this fix in for GraalVM 24.0 which will be based on JDK 22. This fix is simple and has gone through a round of CI testing in the mainline.
06-02-2024

Changeset: 19936526 Author: Tom Rodriguez <never@openjdk.org> Date: 2024-02-05 17:43:34 +0000 URL: https://git.openjdk.org/jdk/commit/1993652653eab8dd7ce2221a97cd2e401f2dcf56
05-02-2024

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/17662 Date: 2024-01-31 21:35:05 +0000
31-01-2024

Graal has only been affected since JDK-8319980 in JDK 22, but I think the underlying issue has existed since JDK-8198756 in JDK 11. ILW = Race condition when reusing Thread instance for UseDynamicNumberOfCompilerThreads, intermittent with libgraal and Truffle, -XX:-UseDynamicNumberOfCompilerThreads = HLM = P3
31-01-2024