JDK-8269934 : RunThese24H.java failed with EXCEPTION_ACCESS_VIOLATION in java_lang_Thread::get_thread_status
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 8,11,13,15,17,18
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: windows
  • CPU: x86_64
  • Submitted: 2021-07-06
  • Updated: 2021-08-31
  • Resolved: 2021-08-04
Fix versions:
  • JDK 11: 11.0.13 (Fixed)
  • JDK 13: 13.0.9 (Fixed)
  • JDK 15: 15.0.5 (Fixed)
  • JDK 17: 17.0.1 (Fixed)
  • JDK 18: 18 b09 (Fixed)
Description
The following test failed in the JDK18 CI:

applications/runthese/RunThese24H.java

Here's a snippet from the log file:

[stress.process.out] testReplacementAfterExchange: Passed. OKAY
[stress.process.out] testTimedExchange_InterruptedException: Passed. OKAY
[stress.process.out] java.lang.NullPointerException: Cannot invoke "String.startsWith(String)" because the return value of "java.lang.management.ThreadInfo.getLockName()" is null
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.dumpTestThreads(JSR166TestCase.java:659)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.threadRecordFailure(JSR166TestCase.java:320)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.threadFail(JSR166TestCase.java:399)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.awaitTermination(JSR166TestCase.java:1045)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.awaitTermination(JSR166TestCase.java:1056)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.ExchangerTest.testExchange(ExchangerTest.java:49)
[stress.process.out] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[stress.process.out] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
[stress.process.out] 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[stress.process.out] 	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
[stress.process.out] 	at javasoft.sqe.tests.api.junit.TestCase.invokeTestCase(TestCase.java:50)
[stress.process.out] 	at javasoft.sqe.javatest.lib.MultiTest.run(MultiTest.java:193)
[stress.process.out] 	at javasoft.sqe.javatest.lib.MultiTest.run(MultiTest.java:125)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.ExchangerTest.main(ExchangerTest.java:27)
[stress.process.out] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[stress.process.out] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
[stress.process.out] 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[stress.process.out] 	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
[stress.process.out] 	at applications.kitchensink.process.stress.modules.JckStressModule$TestRunner$1.run(JckStressModule.java:275)
[stress.process.out] testExchange: Failed. Test case throws exception: java.lang.NullPointerException: Cannot invoke "String.startsWith(String)" because the return value of "java.lang.management.ThreadInfo.getLockName()" is null
[stress.process.out] testSpliterator_characteristics: Passed. OKAY
[stress.process.out] testSpliterator_getComparator: Passed. OKAY
[stress.process.out] testNanoTime: Failed. tearDown failed: java.lang.NullPointerException: Cannot invoke "String.startsWith(String)" because the return value of "java.lang.management.ThreadInfo.getLockName()" is null
[stress.process.out] testCallable3: Passed. OKAY
[stress.process.out] testCallableNPE1: Passed. OKAY
[stress.process.out] testNewCachedThreadPool1: Passed. OKAY
[stress.process.out] testNewSingleThreadExecutor3: Passed. OKAY
[stress.process.out] testPrivilegedCallableWithPrivs: Passed. OKAY
[stress.process.out] testNewScheduledThreadPool: Passed. OKAY
[stress.process.out] testTimedCallable: Passed. OKAY
[stress.process.out] testCallable1: Passed. OKAY
[stress.process.out] testCallableNPE4: Passed. OKAY
[stress.process.out] testCallableNPE3: Passed. OKAY
[stress.process.out] testCallable4: Passed. OKAY
[stress.process.out] testCallable2: Passed. OKAY
[stress.process.out] testCallableNPE2: Passed. OKAY
[stress.process.out] testCreatePrivilegedCallableUsingCCLWithNoPrivs: Failed. tearDown failed: java.lang.NullPointerException: Cannot invoke "String.startsWith(String)" because the return value of "java.lang.management.ThreadInfo.getLockName()" is null
[stress.process.out] testUnconfigurableScheduledExecutorServiceNPE: Failed. tearDown failed: java.lang.NullPointerException: Cannot invoke "String.startsWith(String)" because the return value of "java.lang.management.ThreadInfo.getLockName()" is null
[stress.process.out] testNewFixedThreadPool4: Passed. OKAY
[stress.process.out] testNewSingleThreadScheduledExecutor: Passed. OKAY
[stress.process.out] testUnconfigurableExecutorServiceNPE: Passed. OKAY
[stress.process.out] testPrivilegedCallableUsingCCLWithPrivs: Passed. OKAY
[stress.process.out] java.lang.NullPointerException: Cannot invoke "String.startsWith(String)" because the return value of "java.lang.management.ThreadInfo.getLockName()" is null
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.dumpTestThreads(JSR166TestCase.java:659)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.threadRecordFailure(JSR166TestCase.java:320)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.threadUnexpectedException(JSR166TestCase.java:510)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.await(JSR166TestCase.java:1217)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.await(JSR166TestCase.java:1222)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.ExecutorsTest.testDefaultThreadFactory(ExecutorsTest.java:332)
[stress.process.out] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[stress.process.out] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
[stress.process.out] 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[stress.process.out] 	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
[stress.process.out] 	at javasoft.sqe.tests.api.junit.TestCase.invokeTestCase(TestCase.java:50)
[stress.process.out] 	at javasoft.sqe.javatest.lib.MultiTest.run(MultiTest.java:193)
[stress.process.out] 	at javasoft.sqe.javatest.lib.MultiTest.run(MultiTest.java:125)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.ExecutorsTest.main(ExecutorsTest.java:38)
[stress.process.out] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[stress.process.out] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
[stress.process.out] 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[stress.process.out] 	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
[stress.process.out] 	at applications.kitchensink.process.stress.modules.JckStressModule$TestRunner$1.run(JckStressModule.java:275)
[stress.process.out] testTwoParties: Passed. OKAY
[stress.process.out] testResetAfterCommandException: Passed. OKAY
[stress.process.out] testMoreTasksThanParties: Passed. OKAY
[stress.process.out] testAwait5_Timeout_BrokenBarrier: Passed. OKAY
[stress.process.out] testReset_NoBrokenBarrier: Passed. OKAY
[stress.process.out] testConstructor2: Passed. OKAY
[stress.process.out] testConstructor1: Passed. OKAY
[stress.process.out] testAwait2_Interrupted_BrokenBarrier: Passed. OKAY
[stress.process.out] testAwait1_Interrupted_BrokenBarrier: Passed. OKAY
[stress.process.out] testReset_Leakage: Passed. OKAY
[stress.process.out] testGetParties: Passed. OKAY
[stress.process.out] testSingleParty: Passed. OKAY
[stress.process.out] testBarrierAction: Passed. OKAY
[stress.process.out] testResetAfterTimeout: Passed. OKAY
[stress.process.out] testAwait3_TimeoutException: Passed. OKAY
[stress.process.out] testAwait4_Timeout_BrokenBarrier: Passed. OKAY
[stress.process.out] testReset_BrokenBarrier: Passed. OKAY
[stress.process.out] testResetAfterInterrupt: Failed. tearDown failed: java.lang.NullPointerException: Cannot invoke "String.startsWith(String)" because the return value of "java.lang.management.ThreadInfo.getLockName()" is null
[stress.process.out] java.lang.NullPointerException: Cannot invoke "String.startsWith(String)" because the return value of "java.lang.management.ThreadInfo.getLockName()" is null
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.dumpTestThreads(JSR166TestCase.java:659)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.threadRecordFailure(JSR166TestCase.java:320)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.threadFail(JSR166TestCase.java:399)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.awaitTermination(JSR166TestCase.java:1045)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.awaitTermination(JSR166TestCase.java:1056)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.CyclicBarrierTest.testResetWithoutBreakage(CyclicBarrierTest.java:382)
[stress.process.out] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[stress.process.out] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
[stress.process.out] 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[stress.process.out] 	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
[stress.process.out] 	at javasoft.sqe.tests.api.junit.TestCase.invokeTestCase(TestCase.java:50)
[stress.process.out] 	at javasoft.sqe.javatest.lib.MultiTest.run(MultiTest.java:193)
[stress.process.out] 	at javasoft.sqe.javatest.lib.MultiTest.run(MultiTest.java:125)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.CyclicBarrierTest.main(CyclicBarrierTest.java:33)
[stress.process.out] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[stress.process.out] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
[stress.process.out] 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[stress.process.out] 	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
[stress.process.out] 	at applications.kitchensink.process.stress.modules.JckStressModule$TestRunner$1.run(JckStressModule.java:275)
[stress.process.out] testResetWithoutBreakage: Failed. Test case throws exception: java.lang.NullPointerException: Cannot invoke "String.startsWith(String)" because the return value of "java.lang.management.ThreadInfo.getLockName()" is null
[stress.process.out] testGetAndSetDefaultUncaughtExceptionHandler: Failed. tearDown failed: java.lang.NullPointerException: Cannot invoke "String.startsWith(String)" because the return value of "java.lang.management.ThreadInfo.getLockName()" is null
[stress.process.out] testGetAndSetUncaughtExceptionHandler: Failed. tearDown failed: java.lang.NullPointerException: Cannot invoke "String.startsWith(String)" because the return value of "java.lang.management.ThreadInfo.getLockName()" is null
[stress.process.out] testDefaultThreadFactory: Failed. Test case throws exception: java.lang.NullPointerException: Cannot invoke "String.startsWith(String)" because the return value of "java.lang.management.ThreadInfo.getLockName()" is null
[stress.process.out] java.lang.NullPointerException: Cannot invoke "String.startsWith(String)" because the return value of "java.lang.management.ThreadInfo.getLockName()" is null
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.dumpTestThreads(JSR166TestCase.java:659)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.threadRecordFailure(JSR166TestCase.java:320)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.threadUnexpectedException(JSR166TestCase.java:510)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase$CheckedRunnable.run(JSR166TestCase.java:1068)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.runWithSecurityManagerWithPermissions(JSR166TestCase.java:816)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.runWithPermissions(JSR166TestCase.java:788)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.JSR166TestCase.runWithoutPermissions(JSR166TestCase.java:828)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.ExecutorsTest.testPrivilegedCallableWithNoPrivs(ExecutorsTest.java:455)
[stress.process.out] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[stress.process.out] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
[stress.process.out] 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[stress.process.out] 	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
[stress.process.out] 	at javasoft.sqe.tests.api.junit.TestCase.invokeTestCase(TestCase.java:50)
[stress.process.out] 	at javasoft.sqe.javatest.lib.MultiTest.run(MultiTest.java:193)
[stress.process.out] 	at javasoft.sqe.javatest.lib.MultiTest.run(MultiTest.java:125)
[stress.process.out] 	at javasoft.sqe.tests.api.java.util.concurrent.ExecutorsTest.main(ExecutorsTest.java:38)
[stress.process.out] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[stress.process.out] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
[stress.process.out] 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[stress.process.out] 	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
[stress.process.out] 	at applications.kitchensink.process.stress.modules.JckStressModule$TestRunner$1.run(JckStressModule.java:275)
[stress.process.out] #
[stress.process.out] # A fatal error has been detected by the Java Runtime Environment:
[stress.process.out] #
[stress.process.out] #  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00007ffbc228d1d7, pid=49332, tid=9016
[stress.process.out] #
[stress.process.out] # JRE version: Java(TM) SE Runtime Environment (18.0+5) (build 18-ea+5-146)
[stress.process.out] # Java VM: Java HotSpot(TM) 64-Bit Server VM (18-ea+5-146, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, windows-amd64)
[stress.process.out] # Problematic frame:
[stress.process.out] # V  [jvm.dll+0x38d1d7]  java_lang_Thread::get_thread_status+0x7
[stress.process.out] #
[stress.process.out] # No core dump will be written. Minidumps are not enabled by default on client versions of Windows
[stress.process.out] #
[stress.process.out] # JFR recording file will be written. Location: T:\\testoutput\\test-support\\jtreg_closed_test_hotspot_jtreg_applications_runthese_RunThese24H_java\\scratch\\0\\hs_err_pid49332.jfr
[stress.process.out] #
[stress.process.out] Unsupported internal testing APIs have been used.
[stress.process.out] 
[stress.process.out] # An error report file with more information is saved as:
[stress.process.out] # T:\\testoutput\\test-support\\jtreg_closed_test_hotspot_jtreg_applications_runthese_RunThese24H_java\\scratch\\0\\hs_err_pid49332.log


Please note that the test was reporting NULL return values from
java.lang.management.ThreadInfo.getLockName() before the
EXCEPTION_ACCESS_VIOLATION crash. It is possible that the
code that is calling ThreadInfo.getLockName() is not properly coded
to handle NULL return values. ThreadInfo.getLockName() is racy
with respect to thread exit, so it is possible for the target thread to
exit while the ThreadInfo.getLockName() query is running. In that
case, the API is supposed to return NULL.
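
The NullPointerException in the log comes from calling String.startsWith() directly on the result of ThreadInfo.getLockName(), which is documented to return null when the thread is not blocked waiting on any object. A minimal sketch of a null-safe caller follows; the class name LockNameCheck and both helper methods are hypothetical and are not taken from the JSR166TestCase source.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of the failing test suite.
final class LockNameCheck {

    /** Null-safe equivalent of info.getLockName().startsWith(prefix). */
    static boolean lockNameStartsWith(ThreadInfo info, String prefix) {
        if (info == null) {
            return false; // no snapshot available, e.g. the thread was not alive
        }
        String lockName = info.getLockName();
        // getLockName() returns null when the thread is not blocked on anything.
        return lockName != null && lockName.startsWith(prefix);
    }

    /** Counts live threads whose captured lock name starts with the prefix. */
    static int countThreadsBlockedOn(String prefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        int count = 0;
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (lockNameStartsWith(info, prefix)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countThreadsBlockedOn("java.util.concurrent"));
    }
}
```

With a guard like this the test harness would report "not blocked" instead of failing tearDown with an NPE; it does not affect the VM crash itself.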

Here's the crashing thread's stack:

---------------  T H R E A D  ---------------

Current thread (0x000001c75b74c810):  VMThread "VM Thread" [stack: 0x0000000f91400000,0x0000000f91500000] [id=9016]

Stack: [0x0000000f91400000,0x0000000f91500000],  sp=0x0000000f914ff0e8,  free space=1020k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [jvm.dll+0x38d1d7]  java_lang_Thread::get_thread_status+0x7  (javaClasses.cpp:1836)


siginfo: EXCEPTION_ACCESS_VIOLATION (0xc0000005), reading address 0x0000000000000028

It's the VM thread that's crashing here. I suspect a NULL ptr
has been passed in via some API. The VMThread usually
protects itself against such things so I'm not sure what
happened here.
Comments
Fix Request (11u): A rather simple sanitizing fix. Does not apply cleanly. Testing: GHA: hs:tier1
26-08-2021

Fix Request (13u): A rather simple sanitizing fix. Does not apply cleanly. Testing: hs:tier1
24-08-2021

Fix Request (17u): This is a clean backport in 17u. The fix is extremely simple and low risk and addresses a bug that can easily be encountered by applications. The fix has been tested in JDK 18. Thanks.
24-08-2021

Fix Request (15u): A rather simple sanitizing fix. Does not apply cleanly. Full regression testing performed.
20-08-2021

Changeset: 7e518f42
Author: David Holmes <dholmes@openjdk.org>
Date: 2021-08-04 02:08:30 +0000
URL: https://git.openjdk.java.net/jdk/commit/7e518f42c9346abdf0c8059b45d3dfef95ed69bb
04-08-2021

That all said I can't actually find any evidence that the crashing test attaches any native threads this way.
28-07-2021

I've trawled through the code now and as far as I can see we do not preclude threads that are in the process of attaching, and so may have a NULL threadObj when ThreadSnapshot::initialize calls get_thread_status. Code that uses the ThreadSnapshot does seem to guard against a NULL ts->threadObj(), but initialize itself does not. We can trivially add a check and report a thread status of NEW in that case.

I hacked the VM to put an attaching thread in infinite sleep, then hacked a test that attaches a native thread so that we use JMM dumpAllThreads whilst the native thread is still attaching. This produced the following SEGV crash:

---------------  T H R E A D  ---------------

Current thread (0x00007f35b81ac370):  VMThread "VM Thread" [stack: 0x00007f356d630000,0x00007f356d730000] [id=31125]

Stack: [0x00007f356d630000,0x00007f356d730000],  sp=0x00007f356d72e5e0,  free space=1017k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x5908f8]  std::enable_if<HasDecorator<64ul, 64ul>::value, int>::type RawAccessBarrier<64ul>::load_internal<64ul, int>(void*)+0xc
V  [libjvm.so+0x590789]  int RawAccessBarrier<64ul>::load<int>(void*)+0x18
V  [libjvm.so+0x6f727c]  EnableIf<HasDecorator<272448ul, 2048ul>::value&&AccessInternal::PreRuntimeDispatch::CanHardwireRaw<272448ul>::value, int>::type AccessInternal::PreRuntimeDispatch::load<272448ul, int>(void*)+0x18
V  [libjvm.so+0x6f6fa0]  EnableIf<HasDecorator<272448ul, 2048ul>::value, int>::type AccessInternal::PreRuntimeDispatch::load_at<272448ul, int>(oopDesc*, long)+0x2b
V  [libjvm.so+0x6f6d07]  EnableIf<!HasDecorator<270400ul, 2048ul>::value, int>::type AccessInternal::PreRuntimeDispatch::load_at<270400ul, int>(oopDesc*, long)+0x34
V  [libjvm.so+0x6f6ad9]  int AccessInternal::load_at<262144ul, int>(oopDesc*, long)+0x30
V  [libjvm.so+0x6f68f5]  AccessInternal::LoadAtProxy<262144ul>::operator int<int>() const+0x27
V  [libjvm.so+0x6f6609]  oopDesc::int_field(int) const+0x43
V  [libjvm.so+0xaddb16]  java_lang_Thread::get_thread_status(oopDesc*)+0xac
V  [libjvm.so+0x1196025]  ThreadSnapshot::initialize(ThreadsList*, JavaThread*)+0xf5
V  [libjvm.so+0x1194a9f]  ThreadDumpResult::add_thread_snapshot(JavaThread*)+0x79
V  [libjvm.so+0x1220f09]  VM_ThreadDump::snapshot_thread(JavaThread*, ThreadConcurrentLocks*)+0x2b
V  [libjvm.so+0x1220d0b]  VM_ThreadDump::doit()+0x15b
28-07-2021
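
The guard proposed in the comment above (treat a thread whose threadObj is still NULL as NEW rather than dereferencing it) can be illustrated with a small model. This is only a sketch written in Java for readability: the real change lives in HotSpot's C++ code, and every name here (SnapshotGuardSketch, VmThread, ThreadObj, snapshotStatus) is a hypothetical stand-in, not the VM's API.

```java
// Hypothetical Java model of the null guard; not HotSpot code.
final class SnapshotGuardSketch {

    enum Status { NEW, RUNNABLE, TERMINATED }

    /** Stand-in for the java.lang.Thread object that carries the status field. */
    static final class ThreadObj {
        final Status status;
        ThreadObj(Status status) { this.status = status; }
    }

    /** Stand-in for a VM-level thread; threadObj stays null while attaching. */
    static final class VmThread {
        final ThreadObj threadObj;
        VmThread(ThreadObj threadObj) { this.threadObj = threadObj; }
    }

    /** Reads the status for a snapshot without dereferencing a null threadObj. */
    static Status snapshotStatus(VmThread thread) {
        ThreadObj obj = thread.threadObj;
        if (obj == null) {
            return Status.NEW; // still attaching: report NEW instead of crashing
        }
        return obj.status;
    }

    public static void main(String[] args) {
        System.out.println(snapshotStatus(new VmThread(null)));
        System.out.println(snapshotStatus(new VmThread(new ThreadObj(Status.RUNNABLE))));
    }
}
```

Reporting NEW matches what the comment suggests; whether consumers of the snapshot need further handling for such threads is outside what this model shows.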

That stack really just confirms what we already suspected: during a thread dump operation we find a thread that apparently has a NULL threadObj() - which should not be possible ... except now that I'm looking closely again I don't see where we filter out JNI attaching threads in this code ...
27-07-2021

Here's the crashing thread's stack for the jdk-18+8-305-tier8 sighting:

applications/runthese/RunThese24H.java

---------------  T H R E A D  ---------------

Current thread (0x00007f34bc092e60):  VMThread "VM Thread" [stack: 0x00007f346846d000,0x00007f346856d000] [id=13874]

Stack: [0x00007f346846d000,0x00007f346856d000],  sp=0x00007f346856b888,  free space=1018k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x7c7d77]  java_lang_Thread::get_thread_status(oopDesc*)+0x7
V  [libjvm.so+0xd4c459]  ThreadDumpResult::add_thread_snapshot(JavaThread*)+0x69
V  [libjvm.so+0xdbdb91]  VM_ThreadDump::doit()+0xf1
V  [libjvm.so+0xdbddfa]  VM_Operation::evaluate()+0xea
V  [libjvm.so+0xdbf6f8]  VMThread::evaluate_operation(VM_Operation*)+0xb8
V  [libjvm.so+0xdbfbd4]  VMThread::inner_execute(VM_Operation*)+0x1d4
V  [libjvm.so+0xdbfebf]  VMThread::run()+0xbf
V  [libjvm.so+0xd412ae]  Thread::call_run()+0xde
V  [libjvm.so+0xba8961]  thread_native_entry(Thread*)+0xe1

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000028

That stack should be much more helpful!
26-07-2021

Thanks for clarifying Dan.
20-07-2021

Sorry, I think I've confused the analysis of this failure. This error message:

[stress.process.out] java.lang.NullPointerException: Cannot invoke "String.startsWith(String)" because the return value of "java.lang.management.ThreadInfo.getLockName()" is null

made me think of a different failure mode that I've chased before. Just to complete the history of what I was thinking about:

src/hotspot/share/services/threadService.cpp: ThreadSnapshot::initialize() calls ObjectSynchronizer::get_lock_owner() and has to handle the fact that the lock owner may be NULL, because the object may no longer be locked: the thread may have unlocked the object, or may even have unlocked the object AND exited. Even if the information returned in the ThreadInfo is non-NULL at the time that it is gathered, the situation may have changed by the time the caller of the jmm_GetThreadInfo() API gets to the point of processing the info that was returned. The thread that owned the monitor may have unlocked the object, or it may have unlocked the object and exited. It's just a race that's permitted by the jmm_GetThreadInfo() API.

So that's what I was thinking about when I saw the

the return value of "java.lang.management.ThreadInfo.getLockName()" is null

snippet. Okay, what does that have to do with crashing in:

V  [jvm.dll+0x38d1d7]  java_lang_Thread::get_thread_status+0x7  (javaClasses.cpp:1836)

Probably nothing. For the java_lang_Thread::get_thread_status() crash, it's probably a case of a stale/GC'ed threadObj value that may or may not be related to an M&M API. Without a more complete stack, it's really hard to say what's going on here or how we got to java_lang_Thread::get_thread_status().

Again, sorry for confusing the analysis.
20-07-2021

[~dcubed] This is Java code.

    public String getLockName() {
        return lockName;
    }

It makes no difference what state the thread is in when this is called; the value was captured when the thread snapshot was taken. And if it returns null and the caller doesn't check, then we will get an NPE, not a crash.
20-07-2021
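
The point that ThreadInfo holds values captured at snapshot time can be checked with a small experiment. The sketch below is an assumption-laden illustration, not code from the bug or the test suite: the class SnapshotDemo and its method are hypothetical, and it relies only on the standard ThreadMXBean.getThreadInfo(long) call.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical demo: a ThreadInfo keeps its captured lock name after thread exit.
final class SnapshotDemo {

    /** Returns { lock name read while the thread waits, lock name read after it exits }. */
    static String[] lockNameBeforeAndAfterExit() {
        final Object lock = new Object();
        final boolean[] released = { false };

        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                while (!released[0]) {
                    try {
                        lock.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        });
        waiter.start();

        // Poll until a snapshot shows the thread waiting on the lock object.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo info = mx.getThreadInfo(waiter.getId());
        while (info == null || info.getLockName() == null) {
            Thread.yield();
            info = mx.getThreadInfo(waiter.getId());
        }
        String before = info.getLockName();

        // Let the thread finish, then read the same snapshot again.
        synchronized (lock) {
            released[0] = true;
            lock.notifyAll();
        }
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        String after = info.getLockName();
        return new String[] { before, after };
    }

    public static void main(String[] args) {
        String[] names = lockNameBeforeAndAfterExit();
        System.out.println(names[0].equals(names[1]));
    }
}
```

Both reads come from the same ThreadInfo object, so the second one does not consult the VM about the (now terminated) thread.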

You can have a ThreadInfo that refers to a thread that is racing to exit. At the point that you call ThreadInfo.getLockName(), it is possible that the thread is no longer alive. In that case, ThreadInfo.getLockName() is supposed to return NULL. We have specific logic in the VM that handles this case. However, if the code that is calling ThreadInfo.getLockName() is not coded properly to handle a NULL return, then we can crash.
19-07-2021

> ThreadInfo.getLockName() is racy with respect to thread exit

[~dcubed] I'm not sure what you mean by this. The ThreadInfo was put together in a snapshot, and once it has been collected the thread can have terminated before any of the info is then used. But that info is all stored at the Java level by then and doesn't require examining anything in the VM AFAICS.
14-07-2021

There is almost nothing to go on here:

---------------  T H R E A D  ---------------

Current thread (0x000001c75b74c810):  VMThread "VM Thread" [stack: 0x0000000f91400000,0x0000000f91500000] [id=9016]

Stack: [0x0000000f91400000,0x0000000f91500000],  sp=0x0000000f914ff0e8,  free space=1020k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [jvm.dll+0x38d1d7]  java_lang_Thread::get_thread_status+0x7  (javaClasses.cpp:1836)

VM_Operation (0x0000000fd81fe450): ThreadDump, mode: safepoint, requested by thread 0x000001c787a79880

Don't know why there is not more stack for the VMThread.

There are 1033 threads in the current thread list, and 1030 of those are waiting to be deleted.

The error does suggest a NULL threadObj was passed - but that should not be possible.

It is somewhat disturbing to see:

OutOfMemory and StackOverflow Exception counts:
OutOfMemoryError java_heap_errors=561
StackOverflowErrors=76
LinkageErrors=520098
07-07-2021

There were no changes in jdk-18+5-146-tier8 that seem obviously potentially linked to this issue.
07-07-2021

ILW = HLM = P3
06-07-2021