JDK-8233549 : Thread interrupted state must only be accessed when not in a safepoint-safe state
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 14
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2019-11-04
  • Updated: 2019-11-21
  • Resolved: 2019-11-15
JDK 14: 14 b24 (Fixed)
Sub Tasks
JDK-8233784 :  
JVMTI scenario tests fail after JDK-8229516 due to unexpected interrupt:


Started recording 1. No limit specified, using maxsize=250MB as default.

Use jcmd 44608 JFR.dump name=1 filename=FILEPATH to copy recording data to file.
The following fake exception stacktrace is for failure analysis. 
nsk.share.Fake_Exception_for_RULE_Creation: (jvmti_tools.cpp:683) error
	at nsk_lvcomplain(nsk_tools.cpp:172)
# ERROR: jvmti_tools.cpp, 683: error
#   jvmti error: code=52, name=JVMTI_ERROR_INTERRUPT


URL: https://hg.openjdk.java.net/jdk/jdk/rev/76ae9aa0e794 User: dholmes Date: 2019-11-15 03:37:13 +0000

The issue is not just _thread_in_native, although that is why the calls to JavaThread::name() were crashing. The basic issue is that we must not access the fields of a Java object while we are in a safepoint-safe state (_thread_in_native, _thread_blocked), because the oop could be relocated whilst we are accessing the field. Essentially, we must always be _thread_in_vm when we call JavaThread::is_interrupted. This is a problem for JVM TI RawMonitorWait because it deliberately transitions directly from _thread_in_native to _thread_blocked: the two are equivalent safepoint-safe states, and we can't block for a safepoint there because we may deadlock with the VMThread if we own the raw monitor. It is also a problem for ObjectMonitor::wait, which makes one call to is_interrupted whilst _thread_blocked, and similarly for Parker::park.
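
The invariant described here can be illustrated with a small stand-alone model. This is only a sketch, not HotSpot code: ThreadState, ToyThread and is_safepoint_safe are hypothetical names invented for the illustration, and a plain bool stands in for the interrupted state that, per the comment above, is read from a Java object. The model simply refuses to read the flag unless the thread is in the in_vm state.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for the thread states named in the report.
enum class ThreadState { in_native, in_vm, blocked };

// In a safepoint-safe state a safepoint may proceed without this thread,
// so objects (oops) can be relocated while the thread keeps running.
bool is_safepoint_safe(ThreadState s) {
  return s == ThreadState::in_native || s == ThreadState::blocked;
}

struct ToyThread {
  ThreadState state = ThreadState::in_native;
  bool interrupted = false;  // assumption: models a field read via an oop

  // Models the rule "always be _thread_in_vm when calling is_interrupted".
  bool is_interrupted() const {
    if (is_safepoint_safe(state)) {
      throw std::logic_error("interrupted state read in a safepoint-safe state");
    }
    return interrupted;
  }
};
```

In this model a call made while in_native or blocked throws, whereas the same call made after switching to in_vm returns the flag. The real VM has no such guard on this path, which is why the failure showed up as a crash or a spurious JVMTI_ERROR_INTERRUPT rather than a clean error.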

Fix:
- For ObjectMonitor::wait we move the calls to is_interrupted so they occur whilst still _thread_in_vm.
- For Parker::park we remove the call to is_interrupted that occurred immediately before parking (as the event will be signalled if an interrupt has occurred since the interrupt state was checked earlier). This affects POSIX and Solaris platforms only.
- For JVM TI RawMonitorWait we use proper thread-state transitions to go from native -> in_vm -> blocked, but we push that and the interrupt checks down to the lowest-level code so that we can perform the transitions after the raw monitor has been unlocked (thus avoiding the deadlock potential with the VMThread).
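
The ordering described for RawMonitorWait can be sketched as a toy model. This is not the actual patch: WaitState, WaitingThread, ToyRawMonitor and toy_raw_monitor_wait are hypothetical names, the park callback merely simulates whatever happens while the thread is blocked, and details such as clearing the interrupt flag or timed waits are left out. The point it tries to show is the sequence: release the raw monitor first, then go native -> in_vm, check the interrupt state, go to blocked only to park, and return to in_vm before the second check.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

enum class WaitState { in_native, in_vm, blocked };
enum class WaitResult { ok, interrupted };

struct ToyRawMonitor {
  bool owned = false;
};

struct WaitingThread {
  WaitState state = WaitState::in_native;
  bool interrupted = false;
  std::vector<std::string> trace;  // records the order of steps

  // The interrupt state may only be read while in_vm.
  bool is_interrupted() {
    assert(state == WaitState::in_vm);
    trace.push_back("check");
    return interrupted;
  }
};

// Hypothetical model of the fixed wait path; `park` simulates the blocked period.
WaitResult toy_raw_monitor_wait(WaitingThread& t, ToyRawMonitor& m,
                                const std::function<void(WaitingThread&)>& park) {
  assert(m.owned && t.state == WaitState::in_native);

  // 1. Release the raw monitor first, so that a transition which may block
  //    for a safepoint cannot deadlock with a thread that needs the monitor.
  m.owned = false;
  t.trace.push_back("unlock");

  // 2. native -> in_vm, so the interrupt state can be read safely.
  t.state = WaitState::in_vm;
  t.trace.push_back("in_vm");

  WaitResult result = WaitResult::ok;
  if (t.is_interrupted()) {
    result = WaitResult::interrupted;
  } else {
    // 3. in_vm -> blocked only for the duration of the park.
    t.state = WaitState::blocked;
    t.trace.push_back("blocked");
    park(t);
    // 4. Back to in_vm before re-checking the interrupt state.
    t.state = WaitState::in_vm;
    t.trace.push_back("in_vm");
    if (t.is_interrupted()) {
      result = WaitResult::interrupted;
    }
  }

  // 5. Return to native and re-acquire the monitor.
  t.state = WaitState::in_native;
  m.owned = true;
  t.trace.push_back("lock");
  return result;
}
```

Calling toy_raw_monitor_wait with a park callback that sets interrupted yields WaitResult::interrupted from the second check. A thread that is already interrupted on entry never reaches the blocked state.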

Synopsis updated to reflect actual problem.

Thanks Dan. No need for additional logs etc. Problem is well understood. Solution in the pipeline.

Here are the logs for my jdk-14+22 Linux-X64 sightings:

$ unzip -l jdk-14+22_linux.8233549.zip
Archive:  jdk-14+22_linux.8233549.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
    31465  2019-11-07 23:14   jdk-14+22_1/failures.linux-x86_64/TestDescription.jtr.slowdebug
    19458  2019-11-07 23:17   jdk-14+22_1/failures.linux-x86_64/TestDescription.jtr.slowdebug.1
    35307  2019-11-08 16:16   jdk-14+22_2/failures.linux-x86_64/TestDescription.jtr.slowdebug
    30179  2019-11-09 09:21   jdk-14+22_3/failures.linux-x86_64/TestDescription.jtr.slowdebug
---------                     -------
   116409                     4 files

I just realized what the problem is in both cases. I'm attempting to access a field of a Java object while the thread is _thread_in_native which means I'm racing with the GC moving the oop.

I've established that there is a problem with the oop of the JVMTI Agent Thread. Even before any of the thread interrupt changes are applied, if I simply try to print the name of the JavaThread when it calls RawMonitorWait then I crash:

# Internal Error (open/src/hotspot/share/classfile/javaClasses.inline.hpp:64), pid=11161, tid=11184
# assert(is_instance(java_string)) failed: must be java_string

--------------- T H R E A D ---------------

Current thread (0x00007f7b90404800):
--------------- T H R E A D ---------------

Current thread (0x00007f7b90404800):
[error occurred during error reporting (printing current thread), id 0xe0000000, Internal Error (open/src/hotspot/share/classfile/javaClasses.inline.hpp:64)]

Stack: [0x00007f7b40cf4000,0x00007f7b40df5000], sp=0x00007f7b40df3b90, free space=1022k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xdd0d1a] java_lang_String::value(oop)+0x1aa
V [libjvm.so+0xe05b5e] java_lang_String::as_utf8_string(oop)+0x3e
V [libjvm.so+0x16dccbc] JavaThread::get_thread_name_string(char*, int) const+0x1dc
V [libjvm.so+0x10f5d08] JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*, long)+0x58
C [libap12t001.so+0x9a9d] rawMonitorWait+0xd
C [libap12t001.so+0x9c01] nsk_jvmti_waitForSync+0x71
C [libap12t001.so+0x9d8f] agentProc+0x11f
C [libap12t001.so+0x9a37] agentThreadWrapper+0x57
V [libjvm.so+0x11203c3] JvmtiAgentThread::call_start_function()+0x133
V [libjvm.so+0x16e5046] JavaThread::thread_main_inner()+0x226
V [libjvm.so+0x16ea836] Thread::call_run()+0xf6
V [libjvm.so+0x140d1fe] thread_native_entry(Thread*)+0x10e

I've booby-trapped JavaThread::interrupt, JVM_InterruptThread and java_lang_Thread::set_interrupted, and none of these are getting called! Investigating further at RawMonitor::wait, I'm seeing unexpected NULL values whilst debugging. Something strange is happening with the JVMTI Agent Thread.

The agent thread used by the test is being interrupted whilst doing a RawMonitor::wait. This is unexpected. Trying to determine what may be issuing the interrupt - it doesn't seem to be part of the test.

Not Windows only. I have reproduced once on Linux. I have a suspicion it relates to timeouts and possible attempts to abort things if the timeout happens.

Seems to be Windows only which likely means it relates to the interrupt event - though that is only used by Process.waitFor.