JDK-4413752 : Linux: suspended thread blocks raw ObjectMonitor entry
  • Type: Bug
  • Component: vm-legacy
  • Sub-Component: jvmdi
  • Affected Version: 1.2.0,1.3.1
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: linux,solaris_7
  • CPU: generic
  • Submitted: 2001-02-09
  • Updated: 2021-03-02
  • Resolved: 2002-09-06
  • Fixed Versions: 1.3.1 rc2, 1.4.0
Description
daniel.daugherty@Eng 2001-02-09

This bug was encountered while chasing the following bug:

4369489 2/5 jdb does not work on Linux platform.

This bug is the second layer of that onion.

I have a test case that exercises JVM/DI raw monitors and thread
suspend/resume. This test case passes 6000 loops (3 worker threads in
each loop for 1 minute of execution time on an Ultra 30) with the
current service_baseline on Solaris SPARC, Solaris X86 and Win32.

However, on RedHat 6.2 Linux, the test case hangs on the *first*
iteration.

Here is a transaction diagram of the test:

//
// main               blocker           contender            resumer
// =================  ================  ===================  ================
// launch blocker
// <launch returns>   blocker running
// launch contender   enter threadLock
// <launch returns>   wait for notify   contender running
// launch resumer     :                 block on threadLock
// <launch returns>   :                 :                    resumer running
// suspend contender  :                 <suspended>          wait for notify
// <ready to test>    :                 :                    :
// :                  :                 :                    :
// notify blocker     exit threadLock   :                    :
// join blocker       :                 :                    enter threadLock*
// <join returns>     blocker exits     <resumed>            resume contender
// join resumer                         :                    exit threadLock
// <join returns>                       enter threadLock     resumer exits
// join contender                       exit threadLock
// <join returns>                       contender exits
//

On Linux, the resumer thread fails to enter threadLock even though
there is no owner. The blocker thread has exited threadLock and the
contender thread is suspended.

I have added a test case that exercises JVM/PI raw monitors and thread
suspend/resume. Initially this test case failed on all four platforms.
But after discussing this with Karen, I changed JVM/PI SuspendThread()
to immediately suspend threads in state _thread_in_native. With that
fix in place, this test case passes 1900 loops (3 worker threads in
each loop for 1 minute of execution time on an Ultra 30) with the
current service_baseline on Solaris SPARC, Solaris X86 and Win32.

However, on RedHat 6.2 Linux, the JVM/PI test case also hangs on the
first iteration.

I have added a test case that exercises Java ObjectMonitors and JVM/DI
thread suspend/resume. As in 4333847, this test shows that a JVM/DI
SuspendThread() call must not allow a pending ObjectMonitor.enter()
call to complete. This test case fails on all four platforms.

I have added a test case that exercises Java ObjectMonitors and JVM/PI
thread suspend/resume. As in 4333847, this test shows that a JVM/PI
SuspendThread() call must not allow a pending ObjectMonitor.enter()
call to complete. This test case fails on all four platforms.

daniel.daugherty@Eng 2001-03-23

Just for completeness, I have added a test case that exercises Java
ObjectMonitors and Thread.suspend() and Thread.resume(). I realize
that Thread.suspend() and Thread.resume() are deprecated, but we need
to make sure that bad things don't happen.

Comments
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX: ladybird-rc1
FIXED IN: ladybird, merlin-beta, merlin-beta2
INTEGRATED IN: ladybird-rc2, merlin-beta
14-06-2004

EVALUATION

daniel.daugherty@Eng 2001-02-09

For the JVM/DI and JVM/PI raw monitor and suspend/resume test cases:
I have added debug code to src/os/linux/vm/objectMonitor_linux.cpp:
ObjectMonitor::raw_enter() and verified that there is no owner of the
ObjectMonitor when _mutex.lock() fails to get the underlying mutex.
My guess is that Linux is waiting for the suspended thread to enter
the mutex. Ramki calls this a "(blind) FIFO service policy".

For the Java ObjectMonitor and JVM/DI and JVM/PI suspend/resume test
case, the problem is the same as described in 4333847.
11-06-2004
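
The guess in the evaluation above can be illustrated with a toy model.
The sketch below is single-threaded Java, not HotSpot or Linux code, and
every class and method name in it is hypothetical. It assumes that
unlock() hands the mutex straight to the longest waiter without checking
whether that waiter is suspended, and that the monitor only shows an
owner once the granted thread actually runs. It also assumes, as the
transaction diagram shows, that the blocker holds threadLock while the
contender queues on it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Toy model of a mutex with a "(blind) FIFO" hand-off policy. Hypothetical. */
final class FifoHandoffMutexModel {
    private final Deque<String> waiters = new ArrayDeque<>();
    private final Set<String> suspended = new HashSet<>();
    private String holder; // thread the mutex is currently granted to, or null

    /** Returns true if the caller gets the mutex now, false if it is queued. */
    boolean lock(String thread) {
        if (holder == null) {
            holder = thread;
            return true;
        }
        waiters.addLast(thread);
        return false;
    }

    /** Releases the mutex and grants it to the head waiter, suspended or not. */
    void unlock(String thread) {
        if (!thread.equals(holder)) {
            throw new IllegalStateException(thread + " does not hold the mutex");
        }
        holder = waiters.pollFirst(); // null when nobody is waiting
    }

    void suspend(String thread) { suspended.add(thread); }

    void resume(String thread) { suspended.remove(thread); }

    /** Thread the mutex has been handed to, even if it cannot run. */
    String grantedTo() { return holder; }

    /** Owner as the monitor would report it: only a running thread records itself. */
    String visibleOwner() {
        return (holder != null && !suspended.contains(holder)) ? holder : null;
    }
}

/** Replays the blocker/contender/resumer steps from the transaction diagram. */
final class RawMonitorHangScenario {
    /** Returns true if the resumer can enter before the contender is resumed. */
    static boolean resumerEntersBeforeResume() {
        FifoHandoffMutexModel threadLock = new FifoHandoffMutexModel();
        threadLock.lock("blocker");      // blocker enters threadLock
        threadLock.lock("contender");    // contender blocks on threadLock
        threadLock.suspend("contender"); // main suspends contender
        threadLock.unlock("blocker");    // mutex is handed to the suspended thread
        return threadLock.lock("resumer");
    }
}
```

In this model RawMonitorHangScenario.resumerEntersBeforeResume() returns
false: visibleOwner() is null after the blocker exits, yet the resumer
is queued behind the suspended contender, and only the resumer could
resume it. That matches the reported symptom, but it remains a model of
the guess, not a confirmed description of the Linux mutex.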


SUGGESTED FIX

daniel.daugherty@Eng 2001-03-23

Change the Linux version of ObjectMonitor::raw_enter() to indicate
that it is trying to enter a mutex. Change the SR_handler to recognize
when it receives a signal that grants it a mutex as opposed to
receiving a "resume" signal. When the SR_handler is granted a mutex,
it returns to ObjectMonitor::raw_enter() which recognizes that the
thread should really be suspended. raw_enter() already has logic to
unlock the mutex, yield to another thread, and come around again.

daniel.daugherty@Eng 2001-03-27

See webrev-0308.tar for the changes made in combination with the fix
for 4333847. See webrev-0327.tar for the rest of the changes.
27-03-2001
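
The "unlock the mutex, yield to another thread, and come around again"
logic described above can be sketched roughly as follows. This is a
minimal Java illustration, not the HotSpot change: SuspendAwareRawEnter,
rawEnter() and the shouldBeSuspended callback are hypothetical names,
and the callback stands in for whatever state the real raw_enter() and
SR_handler share. A fair java.util.concurrent ReentrantLock stands in
for the FIFO mutex.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.BooleanSupplier;

/** Sketch of a suspend-aware entry loop. Hypothetical, not HotSpot code. */
final class SuspendAwareRawEnter {
    /**
     * Acquires the mutex, but gives it back whenever the caller is supposed
     * to be suspended, so the mutex never stays with a thread that must not
     * make progress. On return the caller holds the mutex.
     *
     * @return how many times the mutex was acquired and handed back
     */
    static int rawEnter(ReentrantLock mutex, BooleanSupplier shouldBeSuspended) {
        int handBacks = 0;
        while (true) {
            mutex.lock();
            if (!shouldBeSuspended.getAsBoolean()) {
                return handBacks;  // not suspended: keep the mutex
            }
            mutex.unlock();        // suspended: let another waiter have it
            handBacks++;
            Thread.yield();        // then come around again
        }
    }
}
```

A caller would pair it with an ordinary unlock, for example
rawEnter(threadLock, () -> suspendRequested.get()) followed later by
threadLock.unlock(), where suspendRequested is an assumed AtomicBoolean
set by the suspending thread. The sketch spins while the flag is set;
the fix described above instead has the thread actually suspended until
it receives the "resume" signal.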