JDK-8047720 : Xprof hangs on Solaris
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 8u20, 9
  • Priority: P4
  • Status: Closed
  • Resolution: Fixed
  • OS: solaris
  • CPU: generic
  • Submitted: 2014-06-20
  • Updated: 2015-03-03
  • Resolved: 2014-06-27
  • Fix Version: JDK 9 b23 (Fixed)
Description
Sometimes (~3% of runs) nsk/coverage/arguments/arguments008 hangs. The test runs the JVM with the -Xprof flag.

   Steps to reproduce:

1. Take a Solaris (x86) box.
2. Prepare a testlist with 100 items:
     for i in {1..100}; do echo "nsk/coverage/arguments/arguments008 execute_positive quick,jpda,hprof" >> testlist.run.100; done
3. Execute the test (note the vmopts):
     /net/vmsqe.ru.oracle.com/export/gtee/bin/ute -bits d64 -jdk $JAVA_HOME -vmopts '-server -Xmixed' -component vm -tlists empty -testlist testlist.run.100
4. Observe timeouts.


Running with /net/vmsqe.ru.oracle.com/export/gtee/bin/ute ... -env TONGA_JAR=./tonga-utils-dtonga-hang-fix.jar and export TONGA_DUMP_CORE=true set
(the JAR taken from http://stt-13.ru.oracle.com/export/home0/projects/Tonga/dist/archive/4.0.1/tonga-utils-dtonga-hang-fix.jar),
I got the following stack (the full dump is attached); only the interesting threads are shown:
----------------- t@2 -----------------
0xfffffd7fff2835ea      ___lwp_cond_wait + 0xa
0xfffffd7ffdfbed65      void os::PlatformEvent::park() + 0xcd
0xfffffd7ffde63602      void Monitor::ILock(Thread*) + 0x65e
0xfffffd7ffde67a9b      void Monitor::lock_without_safepoint_check() + 0x97
0xfffffd7ffe43a2a8      void WatcherThread::stop() + 0x30
0xfffffd7ffd3a7bb7      void before_exit(JavaThread*) + 0x163
0xfffffd7ffd6faee3      JVM_Halt + 0x2db
0xfffffd7ffb7a4a7f      Java_java_lang_Shutdown_halt0 + 0xb
0xfffffd7fec6258e3      * java.lang.Shutdown.halt0(int) bci:0 (Interpreted frame)
0xfffffd7fec607ca6      * java.lang.Shutdown.halt(int) bci:7 line:139 (Interpreted frame)
0xfffffd7fec607ca6      * java.lang.Shutdown.exit(int) bci:99 line:213 (Interpreted frame)
0xfffffd7fec607ca6      * java.lang.Runtime.exit(int) bci:14 line:109 (Interpreted frame)
0xfffffd7fec607ca6      * java.lang.System.exit(int) bci:4 line:968 (Interpreted frame)
0xfffffd7fec607ca6      * nsk.coverage.arguments.arguments008.arguments008.main(java.lang.String[]) bci:10 line:11 (Interpreted frame)
0xfffffd7fec600671      <StubRoutines>
0xfffffd7ffd3b4bb3      void JavaCalls::call_helper(JavaValue*,methodHandle*,JavaCallArguments*,Thread*) + 0x1667
0xfffffd7ffd3b350b      void JavaCalls::call(JavaValue*,methodHandle,JavaCallArguments*,Thread*) + 0x3f
0xfffffd7ffd5232bc      void jni_invoke_static(JNIEnv_*,JavaValue*,_jobject*,JNICallType,_jmethodID*,JNI_ArgumentPusher*,Thread*) + 0xb80
0xfffffd7ffd582dbd      jni_CallStaticVoidMethod + 0x765
0xfffffd7fff337fc3      JavaMain + 0x287
0xfffffd7fff27d9db      _thr_setup + 0x5b
0xfffffd7fff27dc10      _lwp_start
Locked ownable synchronizers:
    - None
----------------- t@10 -----------------
0xfffffd7fff2835ea      ___lwp_cond_wait + 0xa
0xfffffd7ffdfbed65      void os::PlatformEvent::park() + 0xcd
0xfffffd7ffde63602      void Monitor::ILock(Thread*) + 0x65e
0xfffffd7ffde67a9b      void Monitor::lock_without_safepoint_check() + 0x97
0xfffffd7ffe210786      void SafepointSynchronize::block(JavaThread*) + 0x19e
0xfffffd7ffe4447cf      void JavaThread::check_safepoint_and_suspend_for_native_trans(JavaThread*) + 0x277
0xfffffd7ffc95e57f      void ciEnv::register_method(ciMethod*,int,CodeOffsets*,int,CodeBuffer*,int,OopMapSet*,ExceptionHandlerTable*,ImplicitExceptionTable*,AbstractCompiler*,int,bool,bool,RTMState) + 0x1c3
0xfffffd7ffc685e9a      Compilation::Compilation #Nvariant 1(AbstractCompiler*,ciEnv*,ciMethod*,int,BufferBlob*) + 0x872
0xfffffd7ffc688637      void Compiler::compile_method(ciEnv*,ciMethod*,int) + 0x1cf
0xfffffd7ffcb50eda      void CompileBroker::invoke_compiler_on_method(CompileTask*) + 0x4ea
0xfffffd7ffcb5046d      void CompileBroker::compiler_thread_loop() + 0x3c5
0xfffffd7ffe43e8c5      void JavaThread::thread_main_inner() + 0x521
0xfffffd7ffe43dfd9      void JavaThread::run() + 0x8a1
0xfffffd7ffdfa6e4e      java_start + 0x1ce
0xfffffd7fff27d9db      _thr_setup + 0x5b
0xfffffd7fff27dc10      _lwp_start
Locked ownable synchronizers:
    - None
----------------- t@12 -----------------
0xfffffd7fff2835ea      ___lwp_cond_wait + 0xa
0xfffffd7ffdfbed65      void os::PlatformEvent::park() + 0xcd
0xfffffd7ffde65016      int Monitor::IWait(Thread*,long) + 0x37e
0xfffffd7ffde6ac8d      bool Monitor::wait(bool,long,bool) + 0x669
0xfffffd7ffe612536      void VMThread::execute(VM_Operation*) + 0x3e6
0xfffffd7ffe4441a5      void JavaThread::java_suspend() + 0x1c1
0xfffffd7ffcdeec85      void FlatProfiler::record_thread_ticks() + 0x305
0xfffffd7ffe3d1496      void PeriodicTask::real_time_tick(int) + 0x106
0xfffffd7ffe4395aa      void WatcherThread::run() + 0x42a
0xfffffd7ffdfa6e4e      java_start + 0x1ce
0xfffffd7fff27d9db      _thr_setup + 0x5b
0xfffffd7fff27dc10      _lwp_start

So I hope there are no deadlocks between the FlatProfiler and the Compiler, and that this is just another issue with PlatformEvent::park.



Comments
This fix is not correct; see JDK-8072439 ("fix for 8047720 may need more work") for a revised version of the fix. Update: I added this entry a little too early. I haven't been able to prove that this fix is incorrect, but it certainly is confusing folks, so it is going to be redone via JDK-8072439.
17-02-2015

Checked 9 b23 with the reproduction steps:
  TOTAL TESTS IN RUN: 100
  TEST PASS: 100; 100% rate
  TEST FAIL: 0; 0% rate
22-07-2014

RULE nsk/coverage/arguments/arguments008 Timeout none
22-07-2014

I accidentally logged myself off my Solaris X86 server so here are the final testing results:

$ grep -v PASS doit_loop.prod.log; echo; ls -l doit_loop.prod.log; grep -v PASS doit_loop.jvmg.log; echo; ls -l doit_loop.jvmg.log; grep -v PASS doit_loop.fast.log; echo; ls -l doit_loop.fast.log
Copy 0: loop #115502...

-rw-r--r--   1 dcubed   green    3469475 Jun 26 17:09 doit_loop.prod.log
Copy 1: loop #25518...

-rw-r--r--   1 dcubed   green     754452 Jun 26 17:09 doit_loop.jvmg.log
Copy 2: loop #54447...

-rw-r--r--   1 dcubed   green    1622322 Jun 26 17:09 doit_loop.fast.log

so these tests ran for about 72 hours without a failure.
26-06-2014

Current test run has been going since Monday afternoon (06.23):

$ grep -v PASS doit_loop.prod.log; echo; ls -l doit_loop.prod.log; grep -v PASS doit_loop.jvmg.log; echo; ls -l doit_loop.jvmg.log; grep -v PASS doit_loop.fast.log; echo; ls -l doit_loop.fast.log
Copy 0: loop #106051...

-rw-r--r--   1 dcubed   green    3176494 Jun 26 10:54 doit_loop.prod.log
Copy 1: loop #23551...

-rw-r--r--   1 dcubed   green     695442 Jun 26 10:54 doit_loop.jvmg.log
Copy 2: loop #50302...

-rw-r--r--   1 dcubed   green    1497972 Jun 26 10:54 doit_loop.fast.log

The Aurora Adhoc test run has been stalled due to issues with the testbase itself and the infrastructure that supports Aurora.
26-06-2014

With the previous comment's code that makes the deadlock very reproducible still in place, the following patch appears to fix the deadlock:

diff -r fa67e3591b91 src/share/vm/runtime/thread.cpp
--- a/src/share/vm/runtime/thread.cpp   Fri Jun 20 10:12:02 2014 -0700
+++ b/src/share/vm/runtime/thread.cpp   Mon Jun 23 14:26:52 2014 -0600
@@ -1357,14 +1357,24 @@ void WatcherThread::make_startable() {
 }
 
 void WatcherThread::stop() {
-  {
-    MutexLockerEx ml(PeriodicTask_lock, Mutex::_no_safepoint_check_flag);
-    _should_terminate = true;
-    OrderAccess::fence();  // ensure WatcherThread sees update in main loop
-
+  // Get the PeriodicTask_lock if we can. If we cannot, then the
+  // WatcherThread is using it and we don't want to block on that lock
+  // here because that might cause a safepoint deadlock depending on
+  // what the current WatcherThread tasks are doing.
+  bool have_lock = PeriodicTask_lock->try_lock();
+
+  _should_terminate = true;
+  OrderAccess::fence();  // ensure WatcherThread sees update in main loop
+
+  if (have_lock) {
     WatcherThread* watcher = watcher_thread();
-    if (watcher != NULL)
+    if (watcher != NULL) {
+      // If we managed to get the lock, then we should unpark the
+      // WatcherThread so that it can see we want it to stop.
       watcher->unpark();
+    }
+
+    PeriodicTask_lock->unlock();
   }
 
   // it is ok to take late safepoints here, if needed

My current test run has passed 100 iterations and is still going strong.

I strongly suspect that this deadlock is not restricted to the Flat Profiler. Any PeriodicTask that causes a VM operation to happen at the same time that the main thread is trying to shut down the VM could run into this deadlock. I'm also wondering if this scenario is possible:

- Thread 1 does something to cause a safepoint to start (VMop or ...)
- The WatcherThread is executing a task and the task blocks on the safepoint while holding the PeriodicTask_lock
- The main thread is trying to exit, is in before_exit(), and calls WatcherThread::stop()

The main thread blocks on PeriodicTask_lock and does not reach a safepoint, the VM thread never reaches a safepoint, and the system grinds to a halt.
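The core of the patch is a non-blocking shutdown handshake. A minimal sketch of the same pattern in portable C++11, not HotSpot code (periodic_task_lock, task_cv, and watcher_stop are illustrative stand-ins, not HotSpot APIs): the stopper publishes the terminate flag unconditionally and only takes the lock to wake the worker when the lock is immediately available, so it can never block behind a periodic task that is stuck at a safepoint.

#include <atomic>
#include <condition_variable>
#include <mutex>

std::mutex periodic_task_lock;      // stand-in for PeriodicTask_lock
std::condition_variable task_cv;    // the worker parks here between ticks
std::atomic<bool> should_terminate{false};

void watcher_stop() {
  // Publish the flag first; the worker re-checks it on every pass of its
  // main loop, so it will notice even if we fail to get the lock below.
  should_terminate.store(true);

  // Wake the worker only if we can do so without blocking. If try_lock
  // fails, the worker is awake inside a tick (it holds the lock) and will
  // observe should_terminate when the tick finishes.
  if (periodic_task_lock.try_lock()) {
    task_cv.notify_one();           // analogous to watcher->unpark()
    periodic_task_lock.unlock();
  }
}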
23-06-2014

The following patch has changed the deadlock from being intermittent to happening in 3 of 3 runs:

diff -r fa67e3591b91 src/share/vm/runtime/fprofiler.cpp
--- a/src/share/vm/runtime/fprofiler.cpp   Fri Jun 20 10:12:02 2014 -0700
+++ b/src/share/vm/runtime/fprofiler.cpp   Mon Jun 23 11:37:14 2014 -0600
@@ -893,6 +893,11 @@ void FlatProfiler::record_thread_ticks()
   // Suspend each thread. This call should just return
   // for any threads that have already self-suspended
   //   Net result should be one safepoint
+// At this point, this thread is holding PeriodicTask_lock
+// Add a slight delay to improve the odds that the main
+// thread will block on PeriodicTask_lock while trying to
+// terminate the VM.
+os::naked_short_sleep(999);
   for (int j = 0; j < suspendedthreadcount; j++) {
     JavaThread *tp = threadsList[j];
     if (tp) {
diff -r fa67e3591b91 src/share/vm/runtime/java.cpp
--- a/src/share/vm/runtime/java.cpp   Fri Jun 20 10:12:02 2014 -0700
+++ b/src/share/vm/runtime/java.cpp   Mon Jun 23 11:37:14 2014 -0600
@@ -505,8 +505,13 @@ void before_exit(JavaThread * thread) {
   Universe::heap()->stop();
 
   // Terminate watcher thread - must before disenrolling any periodic task
-  if (PeriodicTask::num_tasks() > 0)
+  if (PeriodicTask::num_tasks() > 0) {
+// At this point, the Flat Profiler thread is holding PeriodicTask_lock
+// for an extra second. Add a slight delay to improve the odds that this
+// thread will line up in that window and block in WatcherThread::stop().
+os::naked_short_sleep(250);
     WatcherThread::stop();
+  }
 
   // Print statistics gathered (profiling ...)
   if (Arguments::has_profile()) {
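The technique generalizes: an intermittent interleaving becomes near-deterministic when each side sleeps at the point where the other side has to line up. A hedged, hypothetical sketch of such a hook in portable C++11 (STRESS_RACE_WINDOWS and stress_delay are made-up names, not HotSpot code):

#include <chrono>
#include <thread>

// Hypothetical stress hook: a no-op unless the stress build flag is set.
// Call it just inside a suspect critical section (cf. the 999 ms sleep
// above) and just before the racing acquirer blocks (cf. the 250 ms sleep
// above) to widen a window that hits in ~3% of runs to nearly every run.
inline void stress_delay(std::chrono::milliseconds ms) {
#ifdef STRESS_RACE_WINDOWS
  std::this_thread::sleep_for(ms);
#else
  (void)ms;  // keep unused-parameter warnings quiet in normal builds
#endif
}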
23-06-2014

Can we retire the flat profiler? :)
21-06-2014

SUMMARY

So the Flat Profiler (t@46) is holding the PeriodicTask_lock trying to suspend the "main" thread (t@2), which is trying to shut down the VM. Part of shutting down the VM is stopping the WatcherThread, but the "main" thread needs the PeriodicTask_lock to do that. Classic deadlock.

Here's the per-thread analysis. 'truss -p' doesn't show activity in ANY thread.

The VMThread is trying to begin a safepoint:

src/share/vm/runtime/safepoint.cpp:
   340      // wait until all threads are stopped
   341      while (_waiting_to_block > 0) {
   342        if (TraceSafepoint) tty->print_cr("Waiting for %d thread(s) to block", _waiting_to_block);
   343        if (!SafepointTimeout || timeout_error_printed) {
>  344          Safepoint_lock->wait(true);  // true, means with no safepoint checks
   345        } else {

(dbx) print _waiting_to_block
_waiting_to_block = 1

And the -XX:+SafepointTimeout output shows us:

# SafepointSynchronize::begin: Timeout detected:
# SafepointSynchronize::begin: Timed out while waiting for threads to stop.
# SafepointSynchronize::begin: Threads which did not reach the safepoint:
# "main" #1 prio=5 os_prio=64 tid=0x0000000000422000 nid=0x2 runnable [0xfffffd7ffc8ce000]
   java.lang.Thread.State: RUNNABLE
   JavaThread state: _thread_in_vm
Thread: 0x0000000000422000  [0x 2] State: _call_back _has_called_back 0 _at_poll_safepoint 0
   JavaThread state: _thread_in_vm
# SafepointSynchronize::begin: (End of list)

THREAD t@26

t@26(l@26) stopped in ___lwp_cond_wait at 0xfffffd7fff2835ea
0xfffffd7fff2835ea: ___lwp_cond_wait+0x000a:  jae  ___lwp_cond_wait+0x18  [ 0xfffffd7fff2835f8, .+0xe ]
current thread: t@26
  [1] ___lwp_cond_wait(0x71c250, 0x71c238, 0x0, 0x0, 0xfffffd7fff268720, 0x71c200), at 0xfffffd7fff2835ea
  [2] __lwp_cond_wait(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff26873c
=>[3] os::Solaris::cond_wait(cv = 0x71c250, mx = 0x71c238), line 221 in "os_solaris.hpp"
  [4] os::PlatformEvent::park(this = 0x71c200), line 5482 in "os_solaris.cpp"
  [5] ParkCommon(ev = 0x71c200, timo = 0), line 414 in "mutex.cpp"
  [6] Monitor::IWait(this = 0x41fe30, Self = 0x71b000, timo = 0), line 792 in "mutex.cpp"
  [7] Monitor::wait(this = 0x41fe30, no_safepoint_check = true, timeout = 0, as_suspend_equivalent = false), line 1101 in "mutex.cpp"
  [8] SafepointSynchronize::begin(), line 344 in "safepoint.cpp"
  [9] VMThread::loop(this = 0x71b000), line 474 in "vmThread.cpp"
  [10] VMThread::run(this = 0x71b000), line 265 in "vmThread.cpp"
  [11] java_start(thread_addr = 0x71b000), line 861 in "os_solaris.cpp"
  [12] _thr_setup(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff27d9db
  [13] _lwp_start(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff27dc10
Current function is os::Solaris::cond_wait
  221   static int cond_wait(cond_t *cv, mutex_t *mx) { return _cond_wait(cv, mx); }

This is the thread that failed to reach the safepoint:

src/share/vm/runtime/java.cpp:
   462    void before_exit(JavaThread * thread) {
<snip>
   473      { MutexLocker ml(BeforeExit_lock);
<snip>
   507      // Terminate watcher thread - must before disenrolling any periodic task
   508      if (PeriodicTask::num_tasks() > 0)
>  509        WatcherThread::stop();
   510
   511      // Print statistics gathered (profiling ...)
   512      if (Arguments::has_profile()) {
   513        FlatProfiler::disengage();
   514        FlatProfiler::print(10);
   515      }

which brings us here:

src/share/vm/runtime/thread.cpp:
  1359    void WatcherThread::stop() {
  1360      {
  1361        MutexLockerEx ml(PeriodicTask_lock, Mutex::_no_safepoint_check_flag);

and we can't grab that lock:

(dbx) frame 9
(dbx) print *this->_mutex
*this->_mutex = {
    _LockWord       = {
        FullWord = 4340481
        Address  = 0x423b01
        Bytes    = "\001;B"
    }
    _owner          = 0x990800
    _EntryList      = (nil)
    _OnDeck         = (nil)
    _WaitLock       = (0)
    _WaitSet        = (nil)
    _snuck          = false
    NotifyCount     = -235802127
    _name           = "PeriodicTask_lock"
    _allow_vm_block = true
    _rank           = 21
    _next           = (nil)
    _last_owner     = 0x990800
}

So the lock is owned by the JavaThread 0x990800.

THREAD t@2

t@2(l@2) stopped in ___lwp_cond_wait at 0xfffffd7fff2835ea
0xfffffd7fff2835ea: ___lwp_cond_wait+0x000a:  jae  ___lwp_cond_wait+0x18  [ 0xfffffd7fff2835f8, .+0xe ]
current thread: t@2
  [1] ___lwp_cond_wait(0x423b50, 0x423b38, 0x0, 0x0, 0xfffffd7fff268720, 0x423b00), at 0xfffffd7fff2835ea
  [2] __lwp_cond_wait(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff26873c
=>[3] os::Solaris::cond_wait(cv = 0x423b50, mx = 0x423b38), line 221 in "os_solaris.hpp"
  [4] os::PlatformEvent::park(this = 0x423b00), line 5482 in "os_solaris.cpp"
  [5] ParkCommon(ev = 0x423b00, timo = 0), line 414 in "mutex.cpp"
  [6] Monitor::ILock(this = 0x4216c0, Self = 0x422000), line 481 in "mutex.cpp"
  [7] Monitor::lock_without_safepoint_check(this = 0x4216c0, Self = 0x422000), line 949 in "mutex.cpp"
  [8] Monitor::lock_without_safepoint_check(this = 0x4216c0), line 955 in "mutex.cpp"
  [9] MutexLockerEx::MutexLockerEx(this = 0xfffffd7ffc8ce398, mutex = 0x4216c0, no_safepoint_check = true), line 225 in "mutexLocker.hpp"
  [10] WatcherThread::stop(), line 1361 in "thread.cpp"
  [11] before_exit(thread = 0x422000), line 509 in "java.cpp"
  [12] JVM_Halt(code = 95), line 422 in "jvm.cpp"
  [13] Java_java_lang_Shutdown_halt0(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffc75bc8e
  [14] 0xfffffd7fed62ad43(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fed62ad43
  [15] 0xfffffd7fed62ad43(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fed62ad43
  [16] 0xfffffd7fed607ea6(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fed607ea6
  [17] 0xfffffd7fed607ea6(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fed607ea6
  [18] 0xfffffd7fed607ea6(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fed607ea6
  [19] 0xfffffd7fed607ea6(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fed607ea6
  [20] 0xfffffd7fed607ea6(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fed607ea6
  [21] 0xfffffd7fed6007b1(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fed6007b1
  [22] JavaCalls::call_helper(result = 0xfffffd7ffc8cee40, m = 0xfffffd7ffc8ceb30, args = 0xfffffd7ffc8cebb8, __the_thread__ = 0x422000), line 393 in "javaCalls.cpp"
  [23] os::os_exception_wrapper(f = 0xfffffd7ffdef3c70 = &JavaCalls::call_helper(JavaValue*,methodHandle*,JavaCallArguments*,Thread*), value = 0xfffffd7ffc8cee40, method = 0xfffffd7ffc8ceb30, args = 0xfffffd7ffc8cebb8, thread = 0x422000), line 3884 in "os_solaris.cpp"
  [24] JavaCalls::call(result = 0xfffffd7ffc8cee40, method = CLASS, args = 0xfffffd7ffc8cebb8, __the_thread__ = 0x422000), line 307 in "javaCalls.cpp"
  [25] jni_invoke_static(env = 0x422228, result = 0xfffffd7ffc8cee40, receiver = (nil), call_type = JNI_STATIC, method_id = 0xae61e8, args = 0xfffffd7ffc8cedf0, __the_thread__ = 0x422000), line 1117 in "jni.cpp"
  [26] jni_CallStaticVoidMethod(env = 0x422228, cls = 0x424078, methodID = 0xae61e8, ...), line 1998 in "jni.cpp"
  [27] JavaMain(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff33af37
  [28] _thr_setup(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff27d9db
  [29] _lwp_start(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff27dc10
Current function is os::Solaris::cond_wait
  221   static int cond_wait(cond_t *cv, mutex_t *mx) { return _cond_wait(cv, mx); }

Next is the JavaThread that owns the PeriodicTask_lock; this JavaThread is also trying to suspend JavaThread 0x422000, which is the thread that's trying to shut down the VM:

(dbx) frame 13
src/share/vm/runtime/task.cpp:
    75      MutexLockerEx ml(PeriodicTask_lock, Mutex::_no_safepoint_check_flag);
    76      int orig_num_tasks = _num_tasks;
    77
    78      for(int index = 0; index < _num_tasks; index++) {
>   79        _tasks[index]->execute_if_pending(delay_time);

THREAD t@46

t@46(l@46) stopped in ___lwp_cond_wait at 0xfffffd7fff2835ea
0xfffffd7fff2835ea: ___lwp_cond_wait+0x000a:  jae  ___lwp_cond_wait+0x18  [ 0xfffffd7fff2835f8, .+0xe ]
current thread: t@46
  [1] ___lwp_cond_wait(0x991b50, 0x991b38, 0x0, 0x0, 0xfffffd7fff268720, 0x991b00), at 0xfffffd7fff2835ea
  [2] __lwp_cond_wait(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff26873c
=>[3] os::Solaris::cond_wait(cv = 0x991b50, mx = 0x991b38), line 221 in "os_solaris.hpp"
  [4] os::PlatformEvent::park(this = 0x991b00), line 5482 in "os_solaris.cpp"
  [5] ParkCommon(ev = 0x991b00, timo = 0), line 414 in "mutex.cpp"
  [6] Monitor::IWait(this = 0x4200d0, Self = 0x990800, timo = 0), line 792 in "mutex.cpp"
  [7] Monitor::wait(this = 0x4200d0, no_safepoint_check = true, timeout = 0, as_suspend_equivalent = false), line 1101 in "mutex.cpp"
  [8] VMThread::execute(op = 0xfffffd7fc0482b08), line 615 in "vmThread.cpp"
  [9] JavaThread::java_suspend(this = 0x422000), line 2235 in "thread.cpp"
  [10] FlatProfiler::record_thread_ticks(), line 899 in "fprofiler.cpp"
  [11] FlatProfilerTask::task(this = 0x886bc0), line 954 in "fprofiler.cpp"
  [12] PeriodicTask::execute_if_pending(this = 0x886bc0, delay_time = 19), line 86 in "task.hpp"
  [13] PeriodicTask::real_time_tick(delay_time = 19), line 79 in "task.cpp"
  [14] WatcherThread::run(this = 0x990800), line 1330 in "thread.cpp"
  [15] java_start(thread_addr = 0x990800), line 861 in "os_solaris.cpp"
  [16] _thr_setup(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff27d9db
  [17] _lwp_start(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff27dc10
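For readers who want to experiment with the failure shape outside of HotSpot, below is a minimal, self-contained C++11 model of the circular wait described above. It is illustrative only (every name is a stand-in; none of this is HotSpot code), and it uses timed waits so the demo reports the stall instead of hanging:

#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

std::timed_mutex periodic_task_lock;   // stand-in for PeriodicTask_lock
std::mutex state_lock;
std::condition_variable state_cv;
bool main_suspended = false;           // the "safepoint" main never reaches

// Stand-in for the FlatProfiler tick: it runs with periodic_task_lock
// held and waits for the main thread to suspend, which it never will.
void watcher_tick() {
  std::lock_guard<std::timed_mutex> guard(periodic_task_lock);
  std::unique_lock<std::mutex> st(state_lock);
  state_cv.wait_for(st, std::chrono::seconds(2),
                    [] { return main_suspended; });
  std::cout << "watcher: timed out waiting for main to suspend\n";
}

int main() {
  std::thread watcher(watcher_tick);
  std::this_thread::sleep_for(std::chrono::milliseconds(100));

  // Stand-in for WatcherThread::stop(): main needs periodic_task_lock,
  // but the watcher holds it while waiting for main -- a circular wait.
  if (!periodic_task_lock.try_lock_for(std::chrono::seconds(1))) {
    std::cout << "main: stuck behind periodic_task_lock (deadlock shape)\n";
  } else {
    periodic_task_lock.unlock();
  }
  watcher.join();
  return 0;
}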
21-06-2014

Reran the test with the '-XX:+SafepointTimeout' option added. This time it didn't hang until loop #684... It was worth it though since I got this extra info:

+ /work/local/jdk/1.9.0/solaris-x64/bin/java -showversion -server_bh_hsx_rt_latest_exp_dcubed-jvmg -Xmixed -XX:+SafepointTimeout -Xprof nsk.coverage.arguments.arguments008.arguments008
java version "1.9.0-ea"
Java(TM) SE Runtime Environment (build 1.9.0-ea-b19)
Java HotSpot(TM) 64-Bit Server VM (build 1.9.0-bh_hsx_rt_latest_exp_dcubed-product-debug, mixed mode)

# SafepointSynchronize::begin: Timeout detected:
# SafepointSynchronize::begin: Timed out while waiting for threads to stop.
# SafepointSynchronize::begin: Threads which did not reach the safepoint:
# "main" #1 prio=5 os_prio=64 tid=0x0000000000422000 nid=0x2 runnable [0xfffffd7ffc8ce000]
   java.lang.Thread.State: RUNNABLE
   JavaThread state: _thread_in_vm
Thread: 0x0000000000422000  [0x 2] State: _call_back _has_called_back 0 _at_poll_safepoint 0
   JavaThread state: _thread_in_vm
# SafepointSynchronize::begin: (End of list)
21-06-2014

Attached threads.2584.log from my full debug core file.
21-06-2014

Reproduced the hang on my Solaris X86 server on run #52 using these bits:

+ /work/local/jdk/1.9.0/solaris-x64/bin/java -showversion -server_bh_hsx_rt_latest_exp_dcubed-jvmg -Xmixed -Xprof nsk.coverage.arguments.arguments008.arguments008
java version "1.9.0-ea"
Java(TM) SE Runtime Environment (build 1.9.0-ea-b19)
Java HotSpot(TM) 64-Bit Server VM (build 1.9.0-bh_hsx_rt_latest_exp_dcubed-product-debug, mixed mode)

These are full debug bits so I should be able to attach a debugger to the core file that I grabbed.
20-06-2014

INTJDK-7603055 reports this hang/timeout on Solaris SPARC and Solaris SPARC64, so this issue is not Solaris X86 specific.
20-06-2014