JDK-8346727 : JvmtiVTMSTransitionDisabler deadlock
  • Type: Bug
  • Component: hotspot
  • Sub-Component: jvmti
  • Affected Version: 24
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2024-12-20
  • Updated: 2025-01-20
  • Resolved: 2025-01-11
JDK 25: 25 b06 (Fixed)
Related Reports
Relates :  
Description
JGroups reports a deadlock using virtual threads with async profiler.

A mixed mode thread dump shows a thread of interest waiting in JvmtiVTMSTransitionDisabler::VTMS_transition_disable_for_all, and the Attach Listener waiting in the same function, as it calls JVMTI SetEventNotificationMode from its Agent_OnAttach function.

It seems that a virtual thread is posting an event while holding the interrupt lock.

Note that we don't have a reproducer yet.


----------------- 10823 -----------------
"ForkJoinPool-1-worker-4" #52 daemon prio=5 tid=0x00007f64948545e0 nid=10823 waiting on condition [0x00007f64718e9000]
   java.lang.Thread.State: RUNNABLE
   JavaThread state: _thread_blocked
0x00007f6499fce169    __futex_abstimed_wait_common + 0xa9
0x00007f6499fd0e72    ___pthread_cond_timedwait + 0x262
0x00007f64996ea49c    PlatformMonitor::wait(unsigned long) + 0x5c
0x00007f64996991cc    Monitor::wait(unsigned long) + 0x6c
0x00007f649953282d    JvmtiVTMSTransitionDisabler::VTMS_transition_disable_for_all() + 0xad
0x00007f64994f9c40    JvmtiHandshake::execute(JvmtiUnitedHandshakeClosure*, _jobject*) + 0x60
0x00007f64994e9ed2    JvmtiEnv::GetStackTrace(_jobject*, int, int, jvmtiFrameInfo*, int*) + 0x52
0x00007f64994a1947    jvmti_GetStackTrace + 0x107
0x00007f6467a26a3d    Profiler::recordSample(void*, unsigned long long, EventType, Event*) + 0x14d
0x00007f6467a273ea    LockTracer::recordContendedLock(EventType, unsigned long long, unsigned long long, char const*, _jobject*, long) + 0xaa
0x00007f6467a27544    LockTracer::MonitorContendedEntered(_jvmtiEnv*, JNIEnv_*, _jobject*, _jobject*) + 0xd4
0x00007f6499508d9e    JvmtiExport::post_monitor_contended_entered(JavaThread*, ObjectMonitor*) + 0x23e
0x00007f64996bab53    ObjectMonitor::enter(JavaThread*) + 0x613
0x00007f649986da24    ObjectSynchronizer::enter(Handle, BasicLock*, JavaThread*) + 0xf4
0x00007f649979d45f    SharedRuntime::complete_monitor_locking_C(oopDesc*, BasicLock*, JavaThread*) + 0x6f


----------------- 10929 -----------------
"Attach Listener" #320934 daemon prio=9 tid=0x00007f63e8000f50 nid=10929 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   JavaThread state: _thread_blocked
0x00007f6499fce169    __futex_abstimed_wait_common + 0xa9
0x00007f6499fd0e72    ___pthread_cond_timedwait + 0x262
0x00007f64996ea49c    PlatformMonitor::wait(unsigned long) + 0x5c
0x00007f64996991cc    Monitor::wait(unsigned long) + 0x6c
0x00007f649953282d    JvmtiVTMSTransitionDisabler::VTMS_transition_disable_for_all() + 0xad
0x00007f64994e7b70    JvmtiEnv::SetEventNotificationMode(jvmtiEventMode, jvmtiEvent, _jobject*, ...) + 0x50
0x00007f64994a334d    jvmti_SetEventNotificationMode + 0xfd
0x00007f6467a06dd5    LockTracer::stop() + 0x25
0x00007f6467a34f24    Profiler::stop(bool) + 0x194
0x00007f6467a36637    Profiler::runInternal(Arguments&, std::ostream&) + 0x8b7
0x00007f6467a3698a    Profiler::run(Arguments&) [clone .part.749] + 0x16a
0x00007f6467a369f0    Profiler::run(Arguments&) + 0x20
0x00007f6467a40545    Agent_OnAttach + 0x215
0x00007f6499491144    JvmtiAgent::load(outputStream*) + 0x1d4
0x00007f6499491c60    JvmtiAgentList::load_agent(char const*, char const*, char const*, outputStream*) + 0x70
0x00007f6498e818d7    AttachListenerThread::thread_entry(JavaThread*, JavaThread*) + 0x1c7
0x00007f64993092e8    JavaThread::thread_main_inner() [clone .part.0] + 0xb8
0x00007f64998bc56f    Thread::call_run() + 0x9f
0x00007f64996df3a5    thread_native_entry(Thread*) + 0xd5
0x00007f6499fd1897    start_thread + 0x2f7
Comments
Changeset: 31452788
Branch: master
Author: Serguei Spitsyn <sspitsyn@openjdk.org>
Date: 2025-01-11 07:07:27 +0000
URL: https://git.openjdk.org/jdk/commit/3145278847428ad3a855a3e2c605b77f74ebe113
11-01-2025

A pull request was submitted for review.
Branch: master
URL: https://git.openjdk.org/jdk/pull/22997
Date: 2025-01-09 05:03:33 +0000
09-01-2025

Initial comments from Alan and Patricio:
Alan: A researcher at Red Hat working on clustering contacted me about a potential hang/deadlock when using JGroups in conjunction with async profiler. There are a bunch of mails but I think this one is interesting. Look at Attach Listener and ForkJoinPool-1-worker-4: both are in VTMS_transition_disable_for_all.
Patricio: It would be good to have the Java and native outputs for the same process, because I think the ones we have are for different ones. But my guess of the issue from what I've seen is the following: The Attach Listener starts the VTMS_transition_disable_for_all() operation and waits for all vthreads that are inside a transition. There is a thread in unmount(), still in the transition, that is trying to grab the interrupt lock. The interrupt lock is held by somebody else that has already stopped due to the VTMS_transition_disable_for_all(). My guess is that this thread is ForkJoinPool-1-worker-4, which is doing the JvmtiExport::post_monitor_contended_entered() upcall and blocks in VTMS_transition_disable_for_all(). So now we deadlock. In fact, if ForkJoinPool-1-worker-4 is the one holding the interrupt lock, then since it also calls VTMS_transition_disable_for_all(), that alone would cause the deadlock, even without the Attach Listener.
Serguei: There are some similarities with the bug JDK-8311218.
Alan: I'm trying some changes to separate the interrupt lock from the lock we need to coordinate async access to the carrier. I'll send a note after these experiments.
Patricio: Another option would be to avoid posting events if the thread holds the interrupt lock. We have already marked all such places in the VirtualThread class with notifyJvmtiDisableSuspend(true)/notifyJvmtiDisableSuspend(false), so we would have to check for thread->is_disable_suspend(). We already check for thread->is_in_any_VTMS_transition() and avoid posting events in that case, so we could add another condition.
Serguei: Agreed. I was thinking about the same: to extend the use of thread->is_disable_suspend() so that it also skips posting the events.
Alan: A good thing about that approach is that it avoids adding a field to VirtualThread.
03-01-2025
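
The option discussed in the comment above (skip posting an event when the thread is in a VTMS transition or has suspend disabled) can be sketched as a small guard. This is only an illustrative model, not the code from the changeset: `ThreadState` and `should_post_jvmti_event` are hypothetical names, and the two flags merely stand in for the results of `thread->is_in_any_VTMS_transition()` and `thread->is_disable_suspend()` mentioned in the discussion.

```cpp
// Hypothetical stand-in for the per-thread state referred to in the comments.
// The real HotSpot JavaThread is far larger; only these two flags matter here.
struct ThreadState {
    bool in_vtms_transition;  // models thread->is_in_any_VTMS_transition()
    bool disable_suspend;     // models thread->is_disable_suspend(), i.e. the
                              // thread is inside a notifyJvmtiDisableSuspend(true)
                              // region such as one holding the interrupt lock

    constexpr bool is_in_any_VTMS_transition() const { return in_vtms_transition; }
    constexpr bool is_disable_suspend() const { return disable_suspend; }
};

// Hypothetical helper (assumption, not a HotSpot function): returns true only
// when neither condition that should suppress event posting holds.
constexpr bool should_post_jvmti_event(const ThreadState& t) {
    return !t.is_in_any_VTMS_transition()  // existing check, per the comments
        && !t.is_disable_suspend();        // additional condition proposed above
}
```

Under this model, a thread that holds the interrupt lock never reaches the agent's MonitorContendedEntered callback, so it cannot call GetStackTrace and block in VTMS_transition_disable_for_all() while other threads wait for that lock.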