JDK-8212207 : runtime/InternalApi/ThreadCpuTimesDeadlock.java crashes with SEGV in pthread_getcpuclockid+0x0
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 11,12
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • Submitted: 2018-10-15
  • Updated: 2020-08-21
  • Resolved: 2018-11-28
JDK 12: 12 b22, Fixed
Description
Test runtime/InternalApi/ThreadCpuTimesDeadlock.java crashes very intermittently (only 2 out of 1000 runs) with:
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fde8808dc20, pid=24398, tid=24438
#
# JRE version: Java(TM) SE Runtime Environment (12.0+15) (fastdebug build 12-ea+15)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 12-ea+15, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C  [libpthread.so.0+0xcc20]  pthread_getcpuclockid+0x0
#
.....

---------------  T H R E A D  ---------------

Current thread (0x00007fde80397000):  JavaThread "MainThread" [_thread_in_vm, id=24438, stack(0x00007fde580cd000,0x00007fde581ce000)]

Stack: [0x00007fde580cd000,0x00007fde581ce000],  sp=0x00007fde581cb678,  free space=1017k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libpthread.so.0+0xcc20]  pthread_getcpuclockid+0x0
V  [libjvm.so+0x1314cc8]  ThreadTimesClosure::do_thread(Thread*)+0x248
V  [libjvm.so+0x17ac65d]  Threads::non_java_threads_do(ThreadClosure*)+0xad
V  [libjvm.so+0x131d171]  jmm_GetInternalThreadTimes+0x431
J 715  sun.management.HotspotThread.getInternalThreadTimes0([Ljava/lang/String;[J)I java.management@12-ea (0 bytes) @ 0x00007fde70751b4c [0x00007fde707519e0+0x000000000000016c]
J 720 c2 sun.management.HotspotThread.getInternalThreadCpuTimes()Ljava/util/Map; java.management@12-ea (79 bytes) @ 0x00007fde70755778 [0x00007fde70755520+0x0000000000000258]
j  jdk.internal.reflect.GeneratedMethodAccessor2.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+36 java.base@12-ea
J 668 c1 jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object; java.base@12-ea (10 bytes) @ 0x00007fde692d2454 [0x00007fde692d22a0+0x00000000000001b4]
J 667 c1 java.lang.reflect.Method.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object; java.base@12-ea (65 bytes) @ 0x00007fde692d1afc [0x00007fde692d1680+0x000000000000047c]
j  sun.reflect.misc.Trampoline.invoke(Ljava/lang/reflect/Method;Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+7
j  jdk.internal.reflect.GeneratedMethodAccessor1.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+40
J 668 c1 jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object; java.base@12-ea (10 bytes) @ 0x00007fde692d2454 [0x00007fde692d22a0+0x00000000000001b4]
J 711 c1 sun.reflect.misc.MethodUtil.invoke(Ljava/lang/reflect/Method;Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object; java.base@12-ea (111 bytes) @ 0x00007fde692de814 [0x00007fde692ddc80+0x0000000000000b94]
j  com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(Ljava/lang/reflect/Method;Ljava/lang/Object;[Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;+3 java.management@12-ea
j  com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(Ljava/lang/Object;Ljava/lang/Object;[Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;+9 java.management@12-ea
j  com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(Ljava/lang/Object;Ljava/lang/Object;[Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;+6 java.management@12-ea
j  com.sun.jmx.mbeanserver.PerInterface.getAttribute(Ljava/lang/Object;Ljava/lang/String;Ljava/lang/Object;)Ljava/lang/Object;+71 java.management@12-ea
j  com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(Ljava/lang/String;)Ljava/lang/Object;+13 java.management@12-ea
j  com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(Ljavax/management/ObjectName;Ljava/lang/String;)Ljava/lang/Object;+100 java.management@12-ea
J 690 c1 com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(Ljavax/management/ObjectName;Ljava/lang/String;)Ljava/lang/Object; java.management@12-ea (16 bytes) @ 0x00007fde692e5bcc [0x00007fde692e56a0+0x000000000000052c]
j  ThreadCpuTimesDeadlock.main([Ljava/lang/String;)V+112
v  ~StubRoutines::call_stub
V  [libjvm.so+0xea139a]  JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x85a
V  [libjvm.so+0x15c9952]  invoke(InstanceKlass*, methodHandle const&, Handle, bool, objArrayHandle, BasicType, objArrayHandle, bool, Thread*) [clone .constprop.102]+0xc02
V  [libjvm.so+0x15cd196]  Reflection::invoke_method(oop, Handle, objArrayHandle, Thread*)+0x196
V  [libjvm.so+0x101ba26]  JVM_InvokeMethod+0x1d6
j  jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Ljava/lang/reflect/Method;Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+0 java.base@12-ea
j  jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+100 java.base@12-ea
j  jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+6 java.base@12-ea
j  java.lang.reflect.Method.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+59 java.base@12-ea
j  com.sun.javatest.regtest.agent.MainWrapper$MainThread.run()V+172
j  java.lang.Thread.run()V+11 java.base@12-ea
v  ~StubRoutines::call_stub
V  [libjvm.so+0xea139a]  JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x85a
V  [libjvm.so+0xe9d34f]  JavaCalls::call_virtual(JavaValue*, Handle, Klass*, Symbol*, Symbol*, Thread*)+0x3df
V  [libjvm.so+0x1004b31]  thread_entry(JavaThread*, Thread*)+0x91
V  [libjvm.so+0x17b6367]  JavaThread::thread_main_inner()+0x2c7
V  [libjvm.so+0x17b669a]  JavaThread::run()+0x22a
V  [libjvm.so+0x14a09d0]  thread_native_entry(Thread*)+0x100

Comments
Unfortunately, we cannot backport this to 11u. The fix was effectively backed out by JDK-8214097 (see the discussion in JDK-8214428).
21-08-2020

I've also seen this bug in recent jdk11u. Added affects-version accordingly.
19-08-2020

In addition we modify the NJT list protocol so that threads are removed as they terminate, not as part of the destructor sequence. This is because most, if not all, NJTs don't actually have their destructor called at present - even if they terminate. It also seems that the only time any NJTs terminate is during VM shutdown e.g. WatcherThread, G1 Refine threads. Update: the NJT-list changes are moved to JDK-8214097.
20-11-2018

For the record I'm unable to get this test to fail locally either before or after the fix. Over 5600 runs.
19-11-2018

On further reflection it seems the best course is to fix this in the actual thread closure that will process the thread. We can check for an uninitialized NJT by using:
if (!thread->is_Java_thread() && thread->stack_size() == 0) {
noting that only partially constructed NJTs are possible, as JTs only get added to the ThreadsList after they are fully constructed and have executed at least to the point where call_run() is invoked.
16-11-2018
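The check proposed in the comment above can be illustrated with a small standalone sketch. It is not the HotSpot code: MockThread, SampleTimesClosure and sample_all are hypothetical names, and the only thing carried over from the comment is the condition itself (a non-Java thread whose stack size is still zero is treated as not yet initialized and is skipped).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for HotSpot's Thread: only the two queries used by
// the proposed check are modelled.
struct MockThread {
    std::string name;
    bool java_thread;
    std::size_t stack_bytes;  // stays 0 until the thread has run and recorded its stack

    bool is_Java_thread() const { return java_thread; }
    std::size_t stack_size() const { return stack_bytes; }
};

// Hypothetical closure, loosely modelled on ThreadTimesClosure::do_thread.
class SampleTimesClosure {
public:
    void do_thread(const MockThread* thread) {
        assert(thread != nullptr);
        // Same condition as in the comment: skip a non-Java thread that has
        // not started yet, instead of querying its native state.
        if (!thread->is_Java_thread() && thread->stack_size() == 0) {
            ++skipped_;
            return;
        }
        sampled_.push_back(thread->name);
    }
    const std::vector<std::string>& sampled() const { return sampled_; }
    int skipped() const { return skipped_; }

private:
    std::vector<std::string> sampled_;
    int skipped_ = 0;
};

// Applies the closure to every thread in the (mock) list.
inline SampleTimesClosure sample_all(const std::vector<MockThread>& threads) {
    SampleTimesClosure closure;
    for (const MockThread& t : threads) {
        closure.do_thread(&t);
    }
    return closure;
}
```

In this sketch a list holding a started "VM Thread", an unstarted "G1 Refine#1" and a Java thread yields two sampled names and one skipped entry. Whether stack_size() == 0 is a reliable marker in the real VM is the assumption made in the comment, not something this sketch verifies.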

Moving the list addition until the thread actually runs causes a problem for the BarrierSet. The NJTs are created before the BarrierSet is available and so have to be added later, but the logic that does that expects to find them in the NJT list and they are not there yet. To continue on this path we need to have each NJT register itself with the BarrierSet. But that is not straightforward, and we have to account for attaching threads that won't execute call_run(). Another option would be for each NJT to have a "fully initialized" state that is only set when it runs in call_run(). The NJT::Iterator could then be set up to either include or exclude threads that are not fully initialized, though that is also problematic, as its use is buried within Threads::threads_do, which calls Threads::non_java_threads_do, which uses the Iterator.
15-11-2018
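The "fully initialized" alternative mentioned in the comment above might look roughly like the following sketch. All names here (MockNJT, FilteringIterator, count_visible) are hypothetical and the list is a plain std::vector rather than the real NonJavaThread::List; the sketch only shows the idea of a flag set in call_run() and an iterator that can include or exclude threads that have not set it.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-Java thread record with a "fully initialized" flag.
struct MockNJT {
    const char* name;
    std::atomic<bool> fully_initialized{false};

    explicit MockNJT(const char* n) : name(n) {}

    // Stand-in for the point where the real thread would enter call_run().
    void call_run() { fully_initialized.store(true, std::memory_order_release); }
};

// Hypothetical iterator that either includes or excludes threads that have
// not reached call_run() yet.
class FilteringIterator {
public:
    FilteringIterator(const std::vector<MockNJT*>& list, bool include_uninitialized)
        : list_(list), include_uninitialized_(include_uninitialized) {
        skip_hidden();
    }
    bool end() const { return pos_ >= list_.size(); }
    MockNJT* current() const {
        assert(!end());
        return list_[pos_];
    }
    void step() {
        ++pos_;
        skip_hidden();
    }

private:
    // Advance past entries that must not be shown to the caller.
    void skip_hidden() {
        while (!include_uninitialized_ && pos_ < list_.size() &&
               !list_[pos_]->fully_initialized.load(std::memory_order_acquire)) {
            ++pos_;
        }
    }
    const std::vector<MockNJT*>& list_;
    std::size_t pos_ = 0;
    bool include_uninitialized_;
};

// Counts the threads a caller would see with the given filter setting.
inline int count_visible(const std::vector<MockNJT*>& list, bool include_uninitialized) {
    int n = 0;
    for (FilteringIterator it(list, include_uninitialized); !it.end(); it.step()) {
        ++n;
    }
    return n;
}
```

The difficulty raised in the comment remains: a caller that reaches the iterator indirectly through Threads::threads_do has no obvious place to choose the filter setting.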

The handling of NonJavaThreads is done by the NonJavaThread constructor and destructor. The code notes:
// Provides iteration over the list of NonJavaThreads. Because list
// management occurs in the NonJavaThread constructor and destructor,
// entries in the list may not be fully constructed instances of a
// derived class.
but it seems the users of this functionality (which are not always obvious, as it is called as part of Threads::threads_do) are not aware of this limitation. They do not account for threads that may not only be partially constructed but may also not have started execution yet, and so do not have their dynamic runtime state initialized, such as the native thread ID. This is the problem reported in JDK-8213434. Either we move the list management code into the thread_native_entry code, or perhaps the Thread::call_run method, or else we have to check every ThreadClosure that may be passed to Threads::non_java_threads_do and ensure they will work correctly. The former seems much simpler, provided it does not break any of the existing ThreadClosures. We also need to validate the destructor path and ensure it is always called before a thread can terminate.
15-11-2018
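The hazard described in the comment above can be reproduced in miniature. In this sketch, which uses hypothetical names (Registry, Worker, count_unstarted) and no real threads, an object that registers itself in its constructor is visible to anyone walking the list before its runtime state (here a native ID) has been set, whereas registering at run time avoids that window.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Worker;

// Hypothetical global-list stand-in for the NonJavaThread list.
struct Registry {
    std::vector<const Worker*> entries;
};

// Hypothetical thread object: native_id is only known once run() executes.
struct Worker {
    Registry& registry;
    bool registered = false;
    long native_id = 0;  // 0 models "not yet assigned"

    Worker(Registry& r, bool register_in_constructor) : registry(r) {
        // Constructor-time registration exposes the object before it has run.
        if (register_in_constructor) add_to_registry();
    }

    // Stand-in for thread start-up (thread_native_entry / call_run).
    void run(long id) {
        assert(id != 0);
        native_id = id;
        // Run-time registration: the entry appears only once it is usable.
        if (!registered) add_to_registry();
    }

private:
    void add_to_registry() {
        registry.entries.push_back(this);
        registered = true;
    }
};

// Counts list entries whose runtime state has not been initialized yet.
inline std::size_t count_unstarted(const Registry& registry) {
    std::size_t n = 0;
    for (const Worker* w : registry.entries) {
        if (w->native_id == 0) ++n;
    }
    return n;
}
```

With constructor-time registration, count_unstarted reports one entry between construction and run(); with run-time registration the list stays empty until the worker has its ID.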

The code in question does:
// All NonJavaThreads (i.e., every non-JavaThread in the system).
void Threads::non_java_threads_do(ThreadClosure* tc) {
  NoSafepointVerifier nsv(!SafepointSynchronize::is_at_safepoint(), false);
  for (NonJavaThread::Iterator njti; !njti.end(); njti.step()) {
    tc->do_thread(njti.current());
  }
}
where the NonJavaThread::Iterator synchronizes with the NonJavaThread::List implementation, which should ensure that a thread can't terminate until there are no active iterators. Need to check that all NonJavaThreads will terminate cleanly and destruct themselves as part of that process.
14-11-2018

There's another interesting part of the hs_err log:
Other Threads:
  0x00007fde801c7800 VMThread "VM Thread" [stack: 0x00007fde58bd9000,0x00007fde58cd9000] [id=24407]
  0x00007fde8032b800 WatcherThread [stack: 0x00007fde581cf000,0x00007fde582cf000] [id=24436]
  0x00007fde8004c800 GCTaskThread "GC Thread#0" [stack: 0x00007fde84266000,0x00007fde84366000] [id=24402]
  0x00007fde50001000 GCTaskThread "GC Thread#1" [stack: 0x00007fde236cd000,0x00007fde237cd000] [id=24476]
  0x00007fde50003000 GCTaskThread "GC Thread#2" [stack: 0x00007fde235cc000,0x00007fde236cc000] [id=24477]
  0x00007fde50004800 GCTaskThread "GC Thread#3" [stack: 0x00007fde234cb000,0x00007fde235cb000] [id=24478]
  0x00007fde80056800 ConcurrentGCThread "G1 Main Marker" [stack: 0x00007fde68135000,0x00007fde68235000] [id=24403]
  0x00007fde80059000 ConcurrentGCThread "G1 Conc#0" [stack: 0x00007fde68034000,0x00007fde68134000] [id=24404]
  0x00007fde800ef800 ConcurrentGCThread "G1 Refine#0" [stack: 0x00007fde596f8000,0x00007fde597f8000] [id=24405]
  0x00007fde54001000 ConcurrentGCThread "G1 Refine#1" [error occurred during error reporting (printing all threads), id 0xe0000000, Internal Error (/scratch/opt/mach5/mesos/work_dir/slaves/83bb4d84-382c-4ead-b585-d8ef0018fefe-S250/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/45bc2563-3e2b-4940-b498-260011b05e81/runs/cb371fac-e60b-485b-8560-803e88f8994c/workspace/open/src/hotspot/share/runtime/thread.hpp:640)]
So we hit a secondary failure:
address stack_base() const { assert(_stack_base != NULL,"Sanity check"); return _stack_base; }
I need to check if this code can see a newly created thread that may not yet have run and set its own stack information.
16-10-2018

Here are the threads typically sampled: {Sweeper thread=4561034, G1 Conc#0=69082, VM Periodic Task Thread=973074, C2 CompilerThread0=671711288, Service Thread=118815, G1 Young RemSet Sampling=930723, VM Thread=50974194, GC Thread#1=485273305, GC Thread#2=454793399, G1 Refine#0=648924, GC Thread#3=462066956, C1 CompilerThread0=380377675, G1 Main Marker=732363, GC Thread#0=446960551}
16-10-2018