JDK-8032223 : nsk/regression/b4663146 gets assert(SafepointSynchronize::is_at_safepoint() || JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits))
  • Type: Bug
  • Component: hotspot
  • Sub-Component: jvmti
  • Affected Version: 9
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2014-01-19
  • Updated: 2023-08-30
  • Resolved: 2014-02-05
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 8 JDK 9
8u401Fixed 9 b04Fixed
Related Reports
Relates :  
Description
Test run URL: http://aurora.ru.oracle.com/functional/faces/RunDetails.xhtml?names=347388.JAVASE.NIGHTLY.VM.RT_Baseline.2014-01-17-198&show-limit=0&filter=
Host:  sca00aje, Intel x86 2893 MHz, 2 cores, 11G, Solaris / Solaris 11, i86pc
Options:  -d64 -server -Xmixed -XX:MaxRAMFraction=8 -XX:+CreateMinidumpOnCrash -XX:ReservedCodeCacheSize=256M

# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/opt/jprt/T/P1/162547.cphillim/s/src/share/vm/prims/jvmtiThreadState.cpp:221), pid=23862, tid=9
#  assert(SafepointSynchronize::is_at_safepoint() || JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits)) failed: at safepoint or must be suspended
#

See attached hs_err file.

Comments
URL: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/e4062d6c5f22 User: lana Date: 2014-02-28 23:55:25 +0000
01-03-2014

URL: http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/e4062d6c5f22 User: sspitsyn Date: 2014-02-05 02:02:18 +0000
05-02-2014

The suggested fix is ready for review now. The patch is: diff -r ceddae1a16c9 src/share/vm/prims/jvmtiEnv.cpp --- a/src/share/vm/prims/jvmtiEnv.cpp Tue Jan 28 06:26:36 2014 -0800 +++ b/src/share/vm/prims/jvmtiEnv.cpp Tue Feb 04 05:10:29 2014 -0800 @@ -1360,8 +1360,10 @@ JvmtiEnv::GetFrameCount(JavaThread* java if (state == NULL) { return JVMTI_ERROR_THREAD_NOT_ALIVE; } - uint32_t debug_bits = 0; - if (is_thread_fully_suspended(java_thread, true, &debug_bits)) { + + // It is only safe to perform the direct operation on the current + // thread. All other usage needs to use a vm-op for safety. + if (java_thread == JavaThread::current()) { err = get_frame_count(state, count_ptr); } else { // get java stack frame count at safepoint. @@ -1476,9 +1478,10 @@ jvmtiError jvmtiError JvmtiEnv::GetFrameLocation(JavaThread* java_thread, jint depth, jmethodID* method_ptr, jlocation* location_ptr) { jvmtiError err = JVMTI_ERROR_NONE; - uint32_t debug_bits = 0; - if (is_thread_fully_suspended(java_thread, true, &debug_bits)) { + // It is only safe to perform the direct operation on the current + // thread. All other usage needs to use a vm-op for safety. + if (java_thread == JavaThread::current()) { err = get_frame_location(java_thread, depth, method_ptr, location_ptr); } else { // JVMTI get java stack frame location at safepoint. diff -r ceddae1a16c9 src/share/vm/prims/jvmtiEnvBase.hpp --- a/src/share/vm/prims/jvmtiEnvBase.hpp Tue Jan 28 06:26:36 2014 -0800 +++ b/src/share/vm/prims/jvmtiEnvBase.hpp Tue Feb 04 05:10:29 2014 -0800 @@ -533,7 +533,11 @@ public: VMOp_Type type() const { return VMOp_GetFrameCount; } jvmtiError result() { return _result; } void doit() { - _result = ((JvmtiEnvBase*)_env)->get_frame_count(_state, _count_ptr); + _result = JVMTI_ERROR_THREAD_NOT_ALIVE; + JavaThread* jt = _state->get_thread(); + if (Threads::includes(jt) && !jt->is_exiting() && jt->threadObj() != NULL) { + _result = ((JvmtiEnvBase*)_env)->get_frame_count(_state, _count_ptr); + } } }; @@ -559,8 +563,12 @@ public: VMOp_Type type() const { return VMOp_GetFrameLocation; } jvmtiError result() { return _result; } void doit() { - _result = ((JvmtiEnvBase*)_env)->get_frame_location(_java_thread, _depth, - _method_ptr, _location_ptr); + _result = JVMTI_ERROR_THREAD_NOT_ALIVE; + if (Threads::includes(_java_thread) && !_java_thread->is_exiting() && + _java_thread->threadObj() != NULL) { + _result = ((JvmtiEnvBase*)_env)->get_frame_location(_java_thread, _depth, + _method_ptr, _location_ptr); + } } };
04-02-2014

It was decided in the review discussion for the JDK-6471769 that this bug is not a dup of the 6471769. Moreover, the fix initially suggested for 6471769 really belongs to this one (8032223).
04-02-2014

Closing this issue a dup of the JDK-6280037. Also, I already have a suggested fix for the JDK-6280037.
31-01-2014

The older issue JDK-6471769 looks like another manifestation of this bug. As this particular instance is based on the assert it can be reproducible in the debug mode only. The old issue is more likely to be reproducible in product mode.
31-01-2014

I was not able to reproduce this issue in 1000 runs on Solaris-sparcv9.
31-01-2014

This is the interesting piece of code that I'm going to focus on: bool JvmtiEnvBase::is_thread_fully_suspended(JavaThread* thr, bool wait_for_suspend, uint32_t *bits) { // "other" threads require special handling if (thr != JavaThread::current()) { if (wait_for_suspend) { // We are allowed to wait for the external suspend to complete // so give the other thread a chance to get suspended. if (!thr->wait_for_ext_suspend_completion(SuspendRetryCount, SuspendRetryDelay, bits)) { // didn't make it so let the caller know return false; } } // We aren't allowed to wait for the external suspend to complete // so if the other thread isn't externally suspended we need to // let the caller know. else if (!thr->is_ext_suspend_completed_with_lock(bits)) { return false; } } return true; } Here is my THEORY: The issue with this function is that it sometimes returns different results depending on the passed value of the argument "wait_for_suspend". In the jvmtiEnv::GetFrameCount() the function is called with "true', but in the JvmtiThreadState::count_frames() it is called with "false". Suppose, some kind of transition eventually happens for the thread being suspended. The "true" will force the function to wait for external suspension to complete, e.g. it will wait until the thread reaches a stable state (finishes its transition). If called with "false", the function will return false right away without waiting. The important part of the theory above is that a different and pretty complex logic is involved when the is_thread_fully_suspended() is called with "false" instead of "true". I continue learning the details . . .
30-01-2014

Here is the full sequence of calls: debugLoop_run() -> getFrameCount() -> jvmti_GetFrameCount() -> jvmtiEnv::GetFrameCount() -> JvmtiEnvBase::get_frame_count() -> JvmtiThreadState::count_frames() 1. The getFrameCount() checked the state of the target thread and found it as suspended. 2. The jvmtiEnv::GetFrameCount() checked the target thread state again and only then calls the get_frame_count(): if (is_thread_fully_suspended(java_thread, true, &debug_bits)) { err = get_frame_count(state, count_ptr); Otherwise, it would launch a VM_Operation VM_GetFrameCount(). 3. The JvmtiEnvBase::get_frame_count() does nothing else but calls the JvmtiThreadState::count_frames(). 4. The JvmtiThreadState::count_frames() does the same check again in an assert: assert(SafepointSynchronize::is_at_safepoint() || JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits), "at safepoint or must be suspended"); This time the same check failed, the JvmtiEnv::is_thread_fully_suspended() returns false. A TMP conclusion: Something is going on with the suspended thread. At some point it can become non-suspended again. But I suspect that the target thread is in an internal transition that still MUST BE considered as suspended state of the thread. We see that apparently it is not the case. I'm still investigating why.
30-01-2014

The debugger back-end code validates that the target thread is suspended before call to the JVMTI GetFrameCount. Please, see the validateSuspendedThread() call below: static jboolean getFrameCount(PacketInputStream *in, PacketOutputStream *out) { . . . if (!validateSuspendedThread(out, thread)) { return JNI_TRUE; } error = JVMTI_FUNC_PTR(gdata->jvmti,GetFrameCount) (gdata->jvmti, thread, &count); . . . return JNI_TRUE; } It looks like the second case above takes place. Somehow, the jvmtiEnv::is_thread_fully_suspended() returns false when the thread is really suspended or must be considered suspended. A real puzzle here.
30-01-2014

The target thread stack trace from the pstack dump is: ----------------- lwp# 2 / thread# 2 -------------------- ffff80ffbf53ab8a ___lwp_cond_wait () + a ffff80fb2eb27f51 void os::PlatformEvent::park() () + cd ffff80fb2e9d1162 int Monitor::IWait(Thread*,long) () + 37e ffff80fb2e9d6dd9 bool Monitor::wait(bool,long,bool) () + 669 ffff80fb2efae859 int JavaThread::java_suspend_self() () + d1 ffff80fb2e4d7616 int JvmtiRawMonitor::raw_enter(Thread*) () + 6a ffff80fb2e44731b jvmtiError JvmtiEnv::RawMonitorEnter(JvmtiRawMonitor*) () + bf ffff80fd70c72695 debugMonitorEnter () + 2d ffff80fd70c624fa enqueueCommand () + f2 ffff80fd70c639e9 eventHelper_reportEvents () + e1 ffff80fd70c5e42e reportEvents () + 112 ffff80fd70c5e8f0 event_callback () + 304 ffff80fd70c5eee0 cbBreakpoint () + 128 ffff80fb2e479e8b void JvmtiExport::post_raw_breakpoint(JavaThread*,Method*,unsigned char*) () + 1e5b ffff80fb2df09bb9 void InterpreterRuntime::_breakpoint(JavaThread*,Method*,unsigned char*) () + 1cd ffff80ffaae738a0 * nsk/regression/b4663146/b4663146a.main([Ljava/lang/String;)V+0 ffff80ffaae006ad * nsk/regression/b4663146/b4663146a.main([Ljava/lang/String;)V+-6320 (line 51) ffff80fb2df3883a void JavaCalls::call_helper(JavaValue*,methodHandle*,JavaCallArguments*,Thread*) () + 1622 ffff80fb2df371d7 void JavaCalls::call(JavaValue*,methodHandle,JavaCallArguments*,Thread*) () + 3f ffff80fb2e0a51f0 void jni_invoke_static(JNIEnv_*,JavaValue*,_jobject*,JNICallType,_jmethodID*,JNI_ArgumentPusher*,Thread*) () + b80 ffff80fb2e106d21 jni_CallStaticVoidMethod () + 765 ffff80fd70c97fc3 JavaMain () + 287 ffff80ffbf531555 _thrp_setup () + a5 ffff80ffbf531800 _lwp_start () This means that the target JavaThread was suspended at the moment when the core dump was taken. This is how the java_suspend_self() ends up waiting on a Monitor: int JavaThread::java_suspend_self() { . . . while (is_external_suspend()) { ret++; this->set_ext_suspended(); // _ext_suspended flag is cleared by java_resume() while (is_ext_suspended()) { this->SR_lock()->wait(Mutex::_no_safepoint_check_flag); } } return ret; } So, there can be one of the two issues here: - there is a race between target thread (T#2) and debugLoop thread (T#9) so that the debugLoop thread was able to see the target thread as not suspended just before it was suspended - the debugLoop thread somehow did not recognize the target thread as suspended when it was suspended
30-01-2014

My conclusion above is wrong. The JvmtiEnv::is_thread_fully_suspended() checks and returns "true" if the thread is current JavaThread: if (thr != JavaThread::current()) { . . . } return true;
30-01-2014

As the assert itself is not correct and contradicts with the JVMTI spec, it worth to fix separately from the issue 6471769. But a side note of this is that these two issues are close to each other.
29-01-2014

This issue looks as another manifestation of this one: https://bugs.openjdk.java.net/browse/JDK-6471769 Error: assert(_cur_stack_depth == count_frames(),"cur_stack_depth out of sync") I'll try to prove it before closing as a dup. One prove would be to see if a fix for the 6471769 will resolve this one as well. But the issue is that these issues are normally very hard to reproduce, so that there is no way to be sure the issue was fixed or not.
29-01-2014

This is stack trace from the hs_err log: ;; Using jvm: "/export/local/aurora/sandbox/sca/vmsqe/jdk/nightly/fastdebug/rt_baseline/solaris-amd64/jre/lib/amd64/server/libjvm.so" # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (/opt/jprt/T/P1/162547.cphillim/s/src/share/vm/prims/jvmtiThreadState.cpp:221), pid=23862, tid=9 # assert(SafepointSynchronize::is_at_safepoint() || JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits)) failed: at safepoint or must be suspended # # JRE version: Java(TM) SE Runtime Environment (8.0-b123) (build 1.8.0-ea-fastdebug-b123) # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b62-internal-201401171625.cphillim.rt-missing-test-fastdebug compiled mode solaris-amd64 compressed oops) # Core dump written. Default location: /export/local/aurora/sandbox/results/ResultDir/b4663146/core or core.23862 # # If you would like to submit a bug report, please visit: # http://bugreport.sun.com/bugreport/crash.jsp # --------------- T H R E A D --------------- Current thread (0x0000000000784800): JavaThread "JDWP Transport Listener: dt_socket" daemon [_thread_in_vm, id=9, stack(0xffff80ffbcc25000,0xffff80ffbcd25000)] Stack: [0xffff80ffbcc25000,0xffff80ffbcd25000], sp=0xffff80ffbcd22a20, free space=1014k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x2905bac] void VMError::report(outputStream*)+0x92c;; __1cHVMErrorGreport6MpnMoutputStream__v_+0x92c V [libjvm.so+0x29070f5] void VMError::report_and_die()+0x579;; __1cHVMErrorOreport_and_die6M_v_+0x579 V [libjvm.so+0xfdf8db] void report_vm_error(const char*,int,const char*,const char*)+0x55f;; __1cPreport_vm_error6Fpkci11_v_+0x55f V [libjvm.so+0x1d8d5f4] int JvmtiThreadState::count_frames()+0x2c4;; __1cQJvmtiThreadStateMcount_frames6M_i_+0x2c4 V [libjvm.so+0x1c1076a] jvmtiError JvmtiEnvBase::get_frame_count(JvmtiThreadState*,int*)+0x1a;; __1cMJvmtiEnvBasePget_frame_count6MpnQJvmtiThreadState_pi_nKjvmtiError__+0x1a V [libjvm.so+0x1bf7350] jvmtiError JvmtiEnv::GetFrameCount(JavaThread*,int*)+0x338;; __1cIJvmtiEnvNGetFrameCount6MpnKJavaThread_pi_nKjvmtiError__+0x338 V [libjvm.so+0x1b2d129] jvmti_GetFrameCount+0x671;; jvmti_GetFrameCount+0x671 C [libjdwp.so+0x13131] getFrameCount+0xb5;; getFrameCount+0xb5 C [libjdwp.so+0x1a989] debugLoop_run+0x1b1;; debugLoop_run+0x1b1 C [libjdwp.so+0x2e4a5] connectionInitiated+0x8d;; connectionInitiated+0x8d C [libjdwp.so+0x2e6f5] attachThread+0x51;; attachThread+0x51 V [libjvm.so+0x1c8980d] void JvmtiAgentThread::start_function_wrapper(JavaThread*,Thread*)+0x219;; __1cQJvmtiAgentThreadWstart_function_wrapper6FpnKJavaThread_pnGThread__v_+0x219 V [libjvm.so+0x2768dbd] void JavaThread::thread_main_inner()+0x521;; __1cKJavaThreadRthread_main_inner6M_v_+0x521 V [libjvm.so+0x27684d3] void JavaThread::run()+0x84f;; __1cKJavaThreadDrun6M_v_+0x84f V [libjvm.so+0x22cf9b2] java_start+0x1ce;; java_start+0x1ce C [libc.so.1+0x121555] _thrp_setup+0xa5;; _thrp_setup+0xa5 C [libc.so.1+0x121800] _lwp_start+0x0;; _lwp_start+0x0 The assert is not exactly correct: # assert(SafepointSynchronize::is_at_safepoint() || JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits)) failed: at safepoint or must be suspended The JVMTI GetFrameCount spec tells this: "If this function is called for a thread actively executing bytecodes (for example, not the current thread and not suspended), the information returned is transient." First, it is perfectly fine to get the frame count information on the current thread. No need to be at safepoint or suspended. Second, it is Ok to get transient returned data.
29-01-2014

ILW=HLH=P2 Crash, seems rare, no known workaround Note that this occurred on a JDK9 Hotspot together with a JDK8 JDK and VM testbase 8. We should try to reproduce this with a proper build and testbase.
23-01-2014