JDK-8046287 : [TESTBUG] runtime/Thread/TestThreadDumpMonitorContention.java failed error_cnt=12
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 7u80,9
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2014-06-07
  • Updated: 2016-03-11
  • Resolved: 2014-06-13
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 7 JDK 8 JDK 9
7u80Fixed 8u20Fixed 9 b22Fixed
Related Reports
Relates :  
Description
runtime/Thread/TestThreadDumpMonitorContention.java

ERROR: header line does not match runnable or waiting patterns.
ERROR: header_line='"ContendingThread-1" #11 prio=5 os_prio=0 tid=0x285d3000 nid=0x1ddc waiting on condition [0x2888f000]'
ERROR: locked_line='	- locked <0x1b15bb48> (a TestThreadDumpMonitorContention$1)'
java.lang.AssertionError: error_cnt=12
	at TestThreadDumpMonitorContention.main(TestThreadDumpMonitorContention.java:135)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:484)
	at com.sun.javatest.regtest.MainWrapper$MainThread.run(MainWrapper.java:94)
	at java.lang.Thread.run(Thread.java:745)

Comments
Actually it happens in the second thread (t@49) while unlocking a Java monitor and in the first thread (t@48) while locking a Java monitor. These are the two entry points: src/share/vm/interpreter/interpreterRuntime.cpp 590 IRT_ENTRY_NO_ASYNC(void, InterpreterRuntime::monitorenter(JavaThread* thread, BasicObjectLock* elem)) 615 IRT_ENTRY_NO_ASYNC(void, InterpreterRuntime::monitorexit(JavaThread* thread, BasicObjectLock* elem)) src/share/vm/runtime/interfaceSupport.hpp 471 #define IRT_ENTRY_NO_ASYNC(result_type, header) \ 472 result_type header { \ 473 ThreadInVMfromJavaNoAsyncException __tiv(thread); \ 474 VM_ENTRY_BASE(result_type, header, thread) \ 475 debug_only(VMEntryWrapper __vew;) 355 ~VMEntryWrapper() { <snip> 371 if (DeoptimizeALot || DeoptimizeRandom) { 372 InterfaceSupport::deoptimizeAll(); 373 }
11-06-2014

Right - Monitor::wait is why we are "waiting on condition". Which is an artifact of the deopt requiring a VM operation execution. Which happens while unlocking a Java monitor. Bleggghh!
11-06-2014

Based on code review comments from David H, I went back to grab more info. Here's the native stack traces for the two contending threads: THREAD t@48 t@48(l@48) stopped in ___lwp_cond_wait at 0xfffffd7fff2735ea 0xfffffd7fff2735ea: ___lwp_cond_wait+0x000a: jae ___lwp_cond_wait+0x18 [ 0xfffffd7fff2735f8, .+0xe ] current thread: t@48 [1] ___lwp_cond_wait(0x11c3b50, 0x11c3b38, 0x0, 0x0, 0xfffffd7fec600000, 0x1400000), at 0xfffffd7fff2735ea [2] __lwp_cond_wait(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff25873c =>[3] os::PlatformEvent::park(this = ???) (optimized), at 0xfffffd7ffe0000dd (line ~5482) in "os_solaris.cpp" [4] Monitor::ILock(this = ???, Self = ???) (optimized), at 0xfffffd7ffdea3a12 (line ~414) in "mutex.cpp" [5] Monitor::lock_without_safepoint_check(this = ???) (optimized), at 0xfffffd7ffdea7eab (line ~949) in "mutex.cpp" [6] SafepointSynchronize::block(thread = ???) (optimized), at 0xfffffd7ffe253611 (line ~679) in "safepoint.cpp" [7] Monitor::wait(this = ???, no_safepoint_check = ???, timeout = ???, as_suspend_equivalent = ???) (optimized), at 0xfffffd7ffdeaada9 (line ~184) in "interfaceSupport.hpp" [8] VMThread::execute(op = ???) (optimized), at 0xfffffd7ffe65a461 (line ~615) in "vmThread.cpp" [9] InterfaceSupport::deoptimizeAll() (optimized), at 0xfffffd7ffd361043 (line ~213) in "interfaceSupport.cpp" [10] InterpreterRuntime::monitorenter(thread = ???, elem = ???) (optimized), at 0xfffffd7ffd3a591b (line ~372) in " interfaceSupport.hpp" [11] 0xfffffd7fec648775(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fec648775 [12] 0xfffffd7fec648775(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fec648775 [13] 0xfffffd7fecc193a5(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fecc193a5 [14] 0xfffffd7e6d03eea0(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7e6d03eea0 Current function is os::PlatformEvent::park (optimized) 5482 status = os::Solaris::cond_wait(_cond, _mutex); THREAD t@49 t@49(l@49) stopped in ___lwp_cond_wait at 0xfffffd7fff2735ea 0xfffffd7fff2735ea: ___lwp_cond_wait+0x000a: jae ___lwp_cond_wait+0x18 [ 0xfffffd7fff2735f8, .+0xe ] current thread: t@49 [1] ___lwp_cond_wait(0xd80150, 0xd80138, 0x0, 0x0, 0xfffffd7fec600000, 0x1400000), at 0xfffffd7fff2735ea [2] __lwp_cond_wait(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff25873c =>[3] os::PlatformEvent::park(this = ???) (optimized), at 0xfffffd7ffe0000dd (line ~5482) in "os_solaris.cpp" [4] Monitor::IWait(this = ???, Self = ???, timo = ???) (optimized), at 0xfffffd7ffdea5426 (line ~414) in "mutex.cpp " [5] Monitor::wait(this = ???, no_safepoint_check = ???, timeout = ???, as_suspend_equivalent = ???) (optimized), at 0xfffffd7ffdeaafb8 (line ~1116) in "mutex.cpp" [6] VMThread::execute(op = ???) (optimized), at 0xfffffd7ffe65a461 (line ~615) in "vmThread.cpp" [7] InterfaceSupport::deoptimizeAll() (optimized), at 0xfffffd7ffd361043 (line ~213) in "interfaceSupport.cpp" [8] InterpreterRuntime::monitorexit(thread = ???, elem = ???) (optimized), at 0xfffffd7ffd3a6726 (line ~372) in "in terfaceSupport.hpp" [9] 0xfffffd7fec648da9(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fec648da9 [10] 0xfffffd7fec648da9(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fec648da9 [11] 0xfffffd7fecc193a5(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fecc193a5 [12] 0x500006718(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x500006718 Current function is os::PlatformEvent::park (optimized) 5482 status = os::Solaris::cond_wait(_cond, _mutex); So now I have to figure out where the OSThreadWaitState helper object is created since I mis-identified Parker::park() as the creator in my comments above. Update: Our OSThreadWaitState helper object is created by the Monitor::wait() call that's done as part of the deoptimize VM op: src/share/vm/runtime/vmThread.cpp: 562 void VMThread::execute(VM_Operation* op) { <snip> 610 if (!concurrent) { 611 // Wait for completion of request (non-concurrent) 612 // Note: only a JavaThread triggers the safepoint check when locking 613 MutexLocker mu(VMOperationRequest_lock); 614 while(t->vm_operation_completed_count() < ticket) { 615 VMOperationRequest_lock->wait(!t->is_Java_thread()); 616 } 617 } 1075 bool Monitor::wait(bool no_safepoint_check, long timeout, bool as_suspend_equivalent) { <snip> 1100 if (no_safepoint_check) { 1101 wait_status = IWait (Self, timeout) ; 1102 } else { 1103 assert (Self->is_Java_thread(), "invariant") ; 1104 JavaThread *jt = (JavaThread *)Self; 1105 1106 // Enter safepoint region - ornate and Rococo ... 1107 ThreadBlockInVM tbivm(jt); 1108 OSThreadWaitState osts(Self->osthread(), false /* not Object.wait() */); 1109 1110 if (as_suspend_equivalent) { 1111 jt->set_suspend_equivalent(); 1112 // cleared by handle_special_suspend_equivalent_condition() or 1113 // java_suspend_self() 1114 } 1115 1116 wait_status = IWait (Self, timeout) ;
10-06-2014

Testing an updated version of test right now.
09-06-2014

Here's more complete snippets of stack traces from a failing run: "ContendingThread-2" #24 prio=5 os_prio=64 tid=0x0000000000ec8800 nid=0x31 waiting on condition [0xfffffd7bbfffe000] java.lang.Thread.State: RUNNABLE JavaThread state: _thread_blocked Thread: 0x0000000000ec8800 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 JavaThread state: _thread_blocked at TestThreadDumpMonitorContention$1.run(TestThreadDumpMonitorContention.java:114) - locked <0xfffffd7e6d03eb28> (a TestThreadDumpMonitorContention$1) at java.lang.Thread.run(Thread.java:745) "ContendingThread-1" #23 prio=5 os_prio=64 tid=0x0000000001556000 nid=0x30 waiting on condition [0xfffffd7fbf17e000] java.lang.Thread.State: RUNNABLE JavaThread state: _thread_blocked Thread: 0x0000000001556000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 JavaThread state: _thread_blocked at TestThreadDumpMonitorContention$1.run(TestThreadDumpMonitorContention.java:114) at java.lang.Thread.run(Thread.java:745) "ContendingThread-2" is locked: - locked <0xfffffd7e6d03eb28> (a TestThreadDumpMonitorContention$1) blocked on a safepoint: JavaThread state: _thread_blocked Thread: 0x0000000000ec8800 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 JavaThread state: _thread_blocked and waiting on a condition: "ContendingThread-2" #24 prio=5 os_prio=64 tid=0x0000000000ec8800 nid=0x31 waiting on condition [0xfffffd7bbfffe000] Now that I've entered all that data, the reason comes clear to me: The fix for JDK-8036823 included this change: --- a/src/share/vm/runtime/objectMonitor.cpp Thu May 29 08:58:51 2014 -0400 +++ b/src/share/vm/runtime/objectMonitor.cpp Fri May 30 07:20:51 2014 -0700 @@ -385,6 +385,15 @@ jt->java_suspend_self(); } Self->set_current_pending_monitor(NULL); + + // We cleared the pending monitor info since we've just gotten past + // the enter-check-for-suspend dance and we now own the monitor free + // and clear, i.e., it is no longer pending. The ThreadBlockInVM + // destructor can go to a safepoint at the end of this block. If we + // do a thread dump during that safepoint, then this thread will show + // as having "-locked" the monitor, but the OS and java.lang.Thread + // states will still report that the thread is blocked trying to + // acquire it. } Atomic::dec_ptr(&_count); Right after grabbing the ownership of the Java level monitor, the ThreadBlockInVM destructor fires and if we go to a safepoint then things start to get a bit confusing. So what happens when we go to a safepoint? All the threads that are running along will call SafepointSynchronize::block() and eventually block trying to acquire the Threads_lock. The VMThread is holding the Threads_lock so all the contenders call Parker::park() and that's where a OSThreadWaitState helper object gets created at we get this line: "ContendingThread-2" #24 prio=5 os_prio=64 tid=0x0000000000ec8800 nid=0x31 waiting on condition [0xfffffd7bbfffe000] It all depends on where the thread gets caught when we go to a safepoint to do the thread dump.
09-06-2014

Looks like the '-XX:+DeoptimizeALot' option is key to repo'ing this failure: $ cat doit.ksh JDK="/work/local/jdk/1.9.0/solaris-x64" JTREG="/java/re/jtreg/4.1/promoted/latest/binaries/jtreg/bin/jtreg" #extra_options=" -XX:+SafepointALot" #extra_options=" -Xcomp" extra_options=" -Xcomp -XX:+DeoptimizeALot" $JTREG \ -jdk "$JDK" \ -vmoptions:"-showversion -server_rt_baseline_clone_dcubed-fast" \ -vmoptions:"-XX:+PrintCommandLineFlags$extra_options" \ test/runtime/Thread/TestThreadDumpMonitorContention.java status="$?" echo "JTREG status=$status" echo "JTREG .jtr files:" find JTwork -name '*.jtr' -print Reproduced in 3 of 3 runs on mt-haku....
09-06-2014

Here's the code that prints the strings: src/share/vm/runtime/osThread.cpp: 42 // Printing 43 void OSThread::print_on(outputStream *st) const { 44 st->print("nid=0x%lx ", thread_id()); 45 switch (_state) { 46 case ALLOCATED: st->print("allocated "); break; 47 case INITIALIZED: st->print("initialized "); break; 48 case RUNNABLE: st->print("runnable "); break; 49 case MONITOR_WAIT: st->print("waiting for monitor entry " ); break; 50 case CONDVAR_WAIT: st->print("waiting on condition "); break; 51 case OBJECT_WAIT: st->print("in Object.wait() "); break; 52 case BREAKPOINTED: st->print("at breakpoint"); break; 53 case SLEEPING: st->print("sleeping"); break; 54 case ZOMBIE: st->print("zombie"); break; 55 default: st->print("unknown state %d", _state); break; 56 } 57 } And here's the helper object that sets the CONDVAR_WAIT state: src/share/vm/runtime/osThread.hpp: 131 // Utility class for use with condition variables: 132 class OSThreadWaitState : public StackObj { 133 OSThread* _osthread; 134 ThreadState _old_state; 135 public: 136 OSThreadWaitState(OSThread* osthread, bool is_object_wait) { 137 _osthread = osthread; 138 _old_state = osthread->get_state(); 139 if (is_object_wait) { 140 osthread->set_state(OBJECT_WAIT); 141 } else { 142 osthread->set_state(CONDVAR_WAIT); 143 } 144 } 145 ~OSThreadWaitState() { 146 _osthread->set_state(_old_state); 147 } 148 }; And down in the platform specific Parker::park() code we use: src/os/linux/vm/os_linux.cpp: 5772 OSThreadWaitState osts(thread->osthread(), false /* not Object.wait() */); Only using Linux as an example...
09-06-2014

Here's the failing test code: 57 final static Pattern HEADER_WAITING_PATTERN = Pattern.compile( 58 "^\"ContendingThread-.* waiting for monitor entry .*"); 59 final static Pattern HEADER_RUNNABLE_PATTERN = Pattern.compile( 60 "^\"ContendingThread-.* runnable .*"); <snip> 184 matcher = HEADER_RUNNABLE_PATTERN.matcher(header_line); 185 if (!matcher.matches()) { 186 // It's strange, but a locked line can also 187 // match the HEADER_WAITING_PATTERN. 188 matcher = HEADER_WAITING_PATTERN.matcher(header_line); 189 if (!matcher.matches()) { 190 System.err.println(); 191 System.err.println("ERROR: header line does " + 192 "not match runnable or waiting patterns."); 193 System.err.println("ERROR: header_line='" + 194 header_line + "'"); 195 System.err.println("ERROR: locked_line='" + line + " '"); 196 error_cnt++; 197 } 198 } And here's the failing line from the jstack output: ERROR: header_line='"ContendingThread-1" #11 prio=5 os_prio=0 tid=0x285d3000 nid=0x1ddc waiting on condition [0x2888f000]'
09-06-2014

This test was added for the following fix: JDK-8036823 Stack trace sometimes shows 'locked' instead of 'waiting to lock'
07-06-2014