Bug ID: JDK-6459874 Performance regression in ThreadReference.ownedMonitors()

Type: Bug
Component: hotspot
Sub-Component: runtime
Affected Version: 6

Priority: P3
Status: Closed
Resolution: Fixed
OS: generic,windows_xp
CPU: generic,x86

Submitted: 2006-08-14
Updated: 2012-10-08
Resolved: 2006-11-14

JDK 6	JDK 7	Other
6u4Fixed	7Fixed	hs10Fixed

During nightly results analysis following problem was found: since b92 execution of ThreadReference.ownedMonitors() and ThreadReference.ownedMonitorsAndFrames() take 6sec if thread for which this methods are called sleepeng (with previous builds this method take several ms). Attached test case demonstrates this, debuggee starts 2 threads, one of this threads calls sleep(Long.MAX_VALUE) another one executes infinite loop.

Test output for b92:
Time for SleepengThread
ownedMonitors: Time: 6s (6668ms)
ownedMonitorsAndFrames: Time: 6s (6668ms)
Time for LoopThread
ownedMonitors: Time: 0s (1ms)
ownedMonitorsAndFrames: Time: 0s (1ms)
debuggee out>>>Command received: QUIT
debuggee out>>>Exit

Test output for b91:
Time for SleepengThread
ownedMonitors: Time: 0s (1ms)
ownedMonitorsAndFrames: Time: 0s (1ms)
Time for LoopThread
ownedMonitors: Time: 0s (0ms)
ownedMonitorsAndFrames: Time: 0s (0ms)
debuggee out>>>Command received: QUIT
debuggee out>>>Exit

Also I couldn't reproduce this problem just calling JVMTI functions 'GetOwnedMonitorStackDepthInfo' and 'GetOwnedMonitorInfo'.

SUGGESTED FIX See attached 6459874-webrev-cr0.tgz for the proposed fix.
08-09-2006
WORK AROUND The fix for 6440070 also added two new product flags: /* 50 retries * (5 * current_retry_count) millis = ~6.375 seconds / \ / typically, at most a few retries are needed / \ product(intx, SuspendRetryCount, 50, \ "Maximum retry count for an external suspend request") \ \ product(intx, SuspendRetryDelay, 5, \ "Milliseconds to delay per retry ( current_retry_count)") \ To work around an instance of this bug, reduce the number of retries. For example, -XX:SuspendRetryCount=25 will reduce the delay to about 1 second.
18-08-2006
EVALUATION Adding suspend-equivalent support is easy on Linux and Win32 os::sleep(). Things are a little more complicated with Solaris os::sleep() due to interruptible I/O. The interruptible I/O support structure does not like to have the thread already in state _thread_blocked. However, in order for suspend-equivalence to function properly, the following steps have to be taken in the proper order: - switch thread state to _thread_blocked - set suspend-equivalent condition - do blocking operation - check for pending external suspend request - if needed, (clear suspend-equivalent condition and) self-suspend otherwise clear suspend-equivalent condition - restore thread state The suspend-equivalent flag must be set after _thread_blocked and cleared very carefully before the thread returns to its previous thread state in order to prevent a suspend requesting thread from getting confused. To reiterate, while the suspend-equivalent flag can be set without holding the SR_lock, it must be cleared while holding the SR_lock in order to prevent a suspend requesting thread from being fooled into thinking that the thread is still in a suspend-equivalent condition. Also if there is a pending external suspend request, then the thread will self-suspend while in the _thread_blocked state. This will honor the safepoint protocol. The Solaris interruptible I/O mechanism will need a minor tweak in order for the suspend-equivalent protocol to be honored.
16-08-2006
EVALUATION I've confirmed that the performance problem is due to the fact that os::sleep() is not considered a suspend-equivalent condition. This needs to be changed in the Linux, Solaris and Win32 os_*.cpp files.
15-08-2006
EVALUATION The following bug is also a likely suspect: 6440070 2/2 NotifyFramePop throws unexpected error: JVMTI_ERROR_THREAD_NOT_SUSPENDED Swamy backed out the JNI monitor changes and the performance regression is still there. Since the regression is with a thread that calls Thread.sleep(), I suspect the fact that os::sleep() is not considered a suspend-equivalent condition to be a good place to look.
14-08-2006
EVALUATION =>export TESTJDK=/java/re/jdk/1.6.0/promoted/all/b92/binaries/solaris-sparc =>export TESTJAVA=/java/re/jdk/1.6.0/promoted/all/b92/binaries/solaris-sparc =>$TESTJAVA/bin/java -classpath $TESTJDK/lib/tools.jar:. MonitorsInfoTest Time for SleepengThread ownedMonitors: Time: 6s (6739ms) ownedMonitorsAndFrames: Time: 6s (6739ms) Time for LoopThread ownedMonitors: Time: 0s (1ms) ownedMonitorsAndFrames: Time: 0s (1ms) debuggee out>>>Command received: QUIT debuggee out>>>Exit /home/swamyv/mywork/bugs/6459874 =>export TESTJAVA=/java/re/jdk/1.6.0/promoted/all/b91/binaries/solaris-sparc =>$TESTJAVA/bin/java -classpath $TESTJDK/lib/tools.jar:. MonitorsInfoTest Time for SleepengThread ownedMonitors: Time: 1s (1091ms) ownedMonitorsAndFrames: Time: 1s (1091ms) Time for LoopThread ownedMonitors: Time: 0s (1ms) ownedMonitorsAndFrames: Time: 0s (1ms) debuggee out>>>Command received: QUIT debuggee out>>>Exit Nothing changed in jdk b/w b92 and b91. The difference is changes in hotspot. In b92 we fixed the monitors to report monitors owned by jni monitors and also we fixed the memory leak in owned monitors. This addition has caused this performance regression.
14-08-2006

Duplicate :	JDK-6515686 - Some debug actions are sometimes slow for DestroyJavaVM thread.
Relates :	JDK-6471657 - JNI Direct Buffer API calls don't set the correct thread state during initialization
Relates :	JDK-6440070 - NotifyFramePop throws unexpected error: JVMTI_ERROR_THREAD_NOT_SUSPENDED