JDK-4461173 : Linux:intermittent hang due to mutex being granted to suspended thread
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 1.4.0
  • Priority: P1
  • Status: Closed
  • Resolution: Fixed
  • OS: linux,solaris_8
  • CPU: generic,x86,sparc
  • Submitted: 2001-05-21
  • Updated: 2012-10-08
  • Resolved: 2001-11-03
Other
1.4.0 rc1: Fixed
Related Reports
Duplicate :  
Duplicate :  
Relates :  
Relates :  
Relates :  
Relates :  
Description
The Merlin JDK build hangs intermittently on Linux.

See attachment for stacktrace.

Comments
CONVERTED DATA
BugTraq+ Release Management Values
COMMIT TO FIX: merlin-rc1
FIXED IN: merlin-rc1
INTEGRATED IN: merlin-rc1
14-06-2004

EVALUATION

From the stack trace, the VM is in its shutdown phase. The VM thread (thread 3) is stuck waiting for ThreadCritical and never finishes the VM_Exit operation. However, no thread appears to be inside ThreadCritical. The compiler thread (thread 9) is also waiting for ThreadCritical, but it has been suspended, apparently by the VM_Exit operation.

LinuxThreads knows nothing about the suspension/resumption mechanism used in the VM. If more than one thread is waiting for the same mutex when it is unlocked, LinuxThreads grants the mutex to the longest-waiting thread. It is therefore possible for LinuxThreads to grant the mutex to a thread that the VM has already suspended. In this case, it looks like the ThreadCritical mutex was granted to the compiler thread after it had been suspended, which caused the hang.

hui.huang@Eng 2001-05-21

With the help of a modified VM and a testcase that hangs quickly, I investigated this hang further. The VM_Exit operation grabs the ThreadCritical lock before it suspends a thread and releases the lock after the thread is suspended. This is done for each thread that needs to be suspended during VM shutdown.

What happens in this hang is that the compiler thread tries to grab ThreadCritical right after the VM thread has entered it, and the VM thread then sends the suspension signal to suspend the compiler thread. Once that is done, the VM thread leaves ThreadCritical, and LinuxThreads grants the mutex to the now-suspended compiler thread. When the VM thread needs to suspend another thread, it must enter ThreadCritical again. But because the compiler thread is now sleeping indefinitely while holding ThreadCritical, the VM thread waits forever, and the VM hangs.

The fundamental problem in this bug is that LinuxThreads knows nothing about Java suspension/resumption and may grant a mutex to a suspended thread. This problem was one of the major issues behind jdb not working on Linux (4369489) and is discussed in 4413752.
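The sequence described above can be replayed in a small, single-threaded toy model. This is only an illustrative sketch, not HotSpot or LinuxThreads code: the class HandoffMutex, the function simulate_hang and the thread names "vm" and "compiler" are hypothetical, and the model assumes only what the evaluation states, namely that an unlocked mutex is handed directly to the longest-waiting thread whether or not that thread is suspended.

```python
from collections import deque


class HandoffMutex:
    """Toy mutex: on unlock, ownership passes to the longest waiter (assumption)."""

    def __init__(self):
        self.owner = None
        self.waiters = deque()

    def lock(self, thread):
        """Return True if the lock was taken, False if the thread was queued."""
        if self.owner is None:
            self.owner = thread
            return True
        self.waiters.append(thread)
        return False

    def unlock(self, thread):
        assert self.owner == thread, "only the owner may unlock"
        # Hand-off ignores whether the next waiter is suspended.
        self.owner = self.waiters.popleft() if self.waiters else None


def simulate_hang():
    thread_critical = HandoffMutex()
    suspended = set()

    thread_critical.lock("vm")         # VM thread enters ThreadCritical
    thread_critical.lock("compiler")   # compiler thread queues behind it
    suspended.add("compiler")          # VM thread suspends the compiler thread
    thread_critical.unlock("vm")       # mutex is handed to the suspended thread

    # VM thread re-enters ThreadCritical to suspend the next thread.
    acquired = thread_critical.lock("vm")
    hung = (not acquired) and thread_critical.owner in suspended
    return thread_critical.owner, hung
```

In this model the mutex ends up owned by the suspended compiler thread, and the VM thread's second lock attempt can never succeed, which matches the hang described in the evaluation.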
The simplest workaround for this problem is to make the VM thread hold ThreadCritical for the entire period of suspending all other threads; that is, it does not leave ThreadCritical after each thread has been suspended.

hui.huang@Eng 2001-05-21

---------------------------------------------

Changed the synopsis to reflect the nature of this hang. It does not occur only during shutdown suspension; it can also happen during safepoint suspension and profiler suspension. See the duplicate bugs for other testcases.

###@###.### 2001-10-17

---------------------------------------------

The main issue is that a thread gets "signal suspended" inside SR_handler by HotSpot, and LinuxThreads then grants a mutex to the suspended thread. It is fixed by not holding the thread inside SR_handler if the thread is waiting for a mutex.

###@###.### 2001-11-02
02-11-2001
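The workaround from the 2001-05-21 evaluation (hold ThreadCritical across the whole suspension pass) can be sketched with the same kind of toy model. Again, this is a hypothetical illustration rather than the actual VM change: HandoffMutex, suspend_all_holding_lock and the thread names are invented for the example, and the hand-off-to-longest-waiter behaviour is an assumption taken from the evaluation text.

```python
from collections import deque


class HandoffMutex:
    """Toy mutex: on unlock, ownership passes to the longest waiter (assumption)."""

    def __init__(self):
        self.owner = None
        self.waiters = deque()

    def lock(self, thread):
        """Return True if the lock was taken, False if the thread was queued."""
        if self.owner is None:
            self.owner = thread
            return True
        self.waiters.append(thread)
        return False

    def unlock(self, thread):
        assert self.owner == thread, "only the owner may unlock"
        self.owner = self.waiters.popleft() if self.waiters else None


def suspend_all_holding_lock(targets):
    thread_critical = HandoffMutex()
    suspended = []
    blocked_attempts = 0

    # VM thread enters ThreadCritical once, before suspending anything.
    if not thread_critical.lock("vm"):
        blocked_attempts += 1
    thread_critical.lock("compiler")   # compiler thread queues behind it

    # No unlock/lock between suspensions, so there is no hand-off
    # to a suspended waiter while the pass is still in progress.
    for thread in targets:
        suspended.append(thread)

    return suspended, blocked_attempts, thread_critical.owner
```

In this model the VM thread never blocks during the suspension pass and still owns the mutex at the end. Note that the model would still hand the mutex to the suspended compiler thread once the VM thread finally unlocks, so it only illustrates why the workaround avoids the hang during the pass itself.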