JDK-8258027 : [linux] SIGSEGV pthread_getcpuclockid crash
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 16
  • Priority: P2
  • Status: Resolved
  • Resolution: External
  • OS: linux_suse_sles_11,linux_ubuntu
  • CPU: generic
  • Submitted: 2020-12-10
  • Updated: 2024-03-28
  • Resolved: 2020-12-18
Description
Crash while using JMC 8.0.0 with the latest JDK 16 or with JDK 11.0.9 b07.

Steps to Reproduce:
1. Download the latest JMC 8.0.0 build and extract it.
2. Launch JMC with additional arguments: "~/pathtojmc/jmc -vm $JDK16_HOME/bin -consoleLog -debug"
3. In the "JVM browser", right-click a running JVM instance and select "Start JMX Console".
4. In the "JVM browser", right-click and select "Start Flight Recording". Optionally reduce the "Recording time" from the default "1 m" to "10 s", then click "Next" and "Finish" and wait for the recording to complete.
5. Close the JMC application. (This produces the call stack attached to this bug.)
6. If JMC does not crash, do not close the application; instead repeat step 4 several times until it crashes. (This leads to the crash described in JDK-8258031.)

Note: The crash occurs only on Ubuntu 18.04 / 20.04, OEL 7.6 and SUSE Linux, not on Windows or macOS.

Part of the call stack is shown below; the complete call stack is attached.
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007ff09219d905, pid=15665, tid=15672
#
# JRE version: Java(TM) SE Runtime Environment (16.0+27) (build 16-ea+27-1884)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (16-ea+27-1884, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C  [libpthread.so.0+0xf905]  pthread_getcpuclockid+0x5
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P %E" (or dumping to /home/guruhb/ade/temp/8b05/core.15665)
#
# JFR recording file will be written. Location: /home/guruhb/ade/temp/8b05/hs_err_pid15665.jfr
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#

---------------  S U M M A R Y ------------

...

---------------  T H R E A D  ---------------

Current thread (0x00007ff08c170860):  VMThread "VM Thread" [stack: 0x00007ff057170000,0x00007ff057270000] [id=15672]

Stack: [0x00007ff057170000,0x00007ff057270000],  sp=0x00007ff05726e908,  free space=1018k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libpthread.so.0+0xf905]  pthread_getcpuclockid+0x5
V  [libjvm.so+0xd71571]  Thread::print_on(outputStream*, bool) const+0x41
V  [libjvm.so+0xd76ea0]  Threads::print_on(outputStream*, bool, bool, bool, bool)+0x190
V  [libjvm.so+0xdf9e6a]  VM_Operation::evaluate()+0xea
V  [libjvm.so+0xdfb745]  VMThread::evaluate_operation(VM_Operation*)+0xb5
V  [libjvm.so+0xdfbb68]  VMThread::inner_execute(VM_Operation*)+0x1c8
V  [libjvm.so+0xdfbe2f]  VMThread::run()+0xbf
V  [libjvm.so+0xd7801d]  Thread::call_run()+0xfd
V  [libjvm.so+0xbd0347]  thread_native_entry(Thread*)+0xe7
Comments
Based on the analysis I am closing this as an "External" issue. The SWT callback code calls jni_AttachAsDaemon, performs a Java upcall and then detaches from the VM again. The suspicion is that a GTK error causes the attached thread to terminate abruptly without ever detaching from the VM.
18-12-2020

ILW = HMM = P2
15-12-2020

Thanks [~pchilanomate] for the additional investigation and analysis.
14-12-2020

I suspect this may be a long-standing potential bug in GTK+/SWT whereby an error in the thread attached to the VM causes it to abort via pthread_exit without detaching from the VM. If that happens the VM has no way to detect, or correct for, an invalid pthread id. If we're lucky, pthread_getcpuclockid returns ESRCH; if unlucky, it crashes.
14-12-2020

This is looking to me like a JMC issue. I downloaded JMC as directed and simply started it with no arguments and after a few seconds it just crashed with the same fault, but this is with 8u262!
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fe579fb8f10, pid=25939, tid=0x00007fe50511c700
#
# JRE version: OpenJDK Runtime Environment (8.0_262-b10) (build 1.8.0_262-b10)
# Java VM: OpenJDK 64-Bit Server VM (25.262-b10 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libpthread.so.0+0xcf10]  pthread_getcpuclockid+0x0
11-12-2020

If pthread_getcpuclockid is crashing that suggests a bug in the pthreads implementation. We know the VM can sometimes encounter a terminated thread that failed to detach (which is an application bug), in which case we pass an "invalid" pthread_t to pthread_getcpuclockid, but that should result in an ESRCH error at worst, not a crash. Update: the validity check is very basic. It just assumes the value is a pointer to a pthread struct and then checks that one field is >= 0. So if the value is completely bogus, the attempt to read that field can fault.
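As an illustration of the call discussed above, here is a minimal sketch in Python, whose `time` module exposes `pthread_getcpuclockid` on Unix. It is not the HotSpot code path; the helper name `current_thread_cpu_seconds` is hypothetical. The sketch only ever passes the id of the calling thread, which is alive by construction. Python's own documentation warns that passing an invalid or expired thread id may result in undefined behavior such as a segmentation fault, which is the same hazard described in this comment.

```python
import threading
import time


def current_thread_cpu_seconds():
    """Return the CPU time used by the calling thread, or None if unsupported.

    Hypothetical helper for illustration only.
    """
    if not hasattr(time, "pthread_getcpuclockid"):
        return None  # the platform does not expose the call
    # threading.get_ident() identifies the *calling* thread, so the id is
    # guaranteed to refer to a live thread at this point.
    clock_id = time.pthread_getcpuclockid(threading.get_ident())
    return time.clock_gettime(clock_id)


cpu = current_thread_cpu_seconds()
print("CPU seconds for this thread:", cpu)
```

The safe pattern is to query a thread's CPU clock only while the thread is known to be alive; an id kept after the thread has terminated cannot be validated reliably by the caller.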
11-12-2020

Looking at more of the hs_err logs I see this problem in a number of them, e.g. https://bugs.openjdk.java.net/secure/attachment/91488/hs_err_pid145310.log
0x00007f80f0027990 JavaThread "Thread-5" daemon [_thread_in_native, id=145396, stack(0x00007f807bbff000,0x00007f807c3fe000)]
0x00007f818802eca0 JavaThread "Thread-23" daemon [_thread_in_native, id=145525, stack(0x00007f807bbff000,0x00007f807c3fe000)]
0x00007f8144609bb0 JavaThread "Thread-31" daemon [_thread_in_native, id=146142, stack(0x00007f807bbff000,0x00007f807c3fe000)]
What are these nondescript daemon threads? Are they native threads that have attached to the VM? How can they have the same stack?
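The check done by eye above (several JavaThread entries reporting one stack range) can be automated. The following is a rough sketch, assuming the thread-list line format shown in this comment; the regular expression and the function name `threads_sharing_a_stack` are illustrative and not part of any JDK tool.

```python
import re
from collections import defaultdict

# Matches one JavaThread entry in the thread list of an hs_err log,
# assuming the line layout quoted in this comment.
THREAD_RE = re.compile(
    r'JavaThread "(?P<name>[^"]+)".*?'
    r'id=(?P<tid>\d+), stack\((?P<low>0x[0-9a-f]+),(?P<high>0x[0-9a-f]+)\)'
)


def threads_sharing_a_stack(log_text):
    """Map each stack range claimed by more than one thread to the thread names."""
    by_stack = defaultdict(list)
    for m in THREAD_RE.finditer(log_text):
        stack_range = (int(m["low"], 16), int(m["high"], 16))
        by_stack[stack_range].append(m["name"])
    return {rng: names for rng, names in by_stack.items() if len(names) > 1}


# The three entries quoted from hs_err_pid145310.log.
SAMPLE = """\
0x00007f80f0027990 JavaThread "Thread-5" daemon [_thread_in_native, id=145396, stack(0x00007f807bbff000,0x00007f807c3fe000)]
0x00007f818802eca0 JavaThread "Thread-23" daemon [_thread_in_native, id=145525, stack(0x00007f807bbff000,0x00007f807c3fe000)]
0x00007f8144609bb0 JavaThread "Thread-31" daemon [_thread_in_native, id=146142, stack(0x00007f807bbff000,0x00007f807c3fe000)]
"""

for (low, high), names in threads_sharing_a_stack(SAMPLE).items():
    print(f"stack({low:#x},{high:#x}) is claimed by {', '.join(names)}")
```

A non-empty result is a hint that at least some of the listed threads are stale entries whose native thread has already gone away.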
11-12-2020

I took a look at the possibly related Eclipse crash, and the failure modes do all seem to be the same. In each case the faulting address is close to the start of a thread's stack (stacks grow down). What I noticed with the hs_err log from the Eclipse crash was very interesting:
0x00007fcc80953000 JavaThread "Thread-92" daemon [_thread_in_native, id=26348, stack(0x00007fcb1b501000,0x00007fcb1bd00000)]
0x00007fcbc408c000 JavaThread "Thread-99" daemon [_thread_in_native, id=26381, stack(0x00007fcb1b501000,0x00007fcb1bd00000)]
0x00007fcbf8406800 JavaThread "Thread-105" daemon [_thread_in_native, id=26505, stack(0x00007fcb1b501000,0x00007fcb1bd00000)]
0x00007fcc14048000 JavaThread "Thread-109" daemon [_thread_in_native, id=26562, stack(0x00007fcb1b501000,0x00007fcb1bd00000)]
0x00007fcc5893e800 JavaThread "Thread-116" daemon [_thread_in_native, id=26592, stack(0x00007fcb1b501000,0x00007fcb1bd00000)]
We have 5 threads all claiming to have the same stack! And the faulting address is near the start of that stack (0x00007fcb1bcff9d0).
https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=954822;filename=hs_err_pid25471.log;msg=5
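To make "near the start of that stack" concrete, the arithmetic can be worked through with the addresses quoted in this comment. This is only a small worked example; the function name `bytes_below_stack_top` is hypothetical, and "top" here means the high end of the range, since stacks grow down.

```python
def bytes_below_stack_top(fault_addr, stack_low, stack_high):
    """Distance from the high end of [stack_low, stack_high) down to fault_addr."""
    if not stack_low <= fault_addr < stack_high:
        raise ValueError("address is outside the stack range")
    return stack_high - fault_addr


# Values taken from the Eclipse hs_err log quoted in this comment.
STACK_LOW = 0x00007FCB1B501000
STACK_HIGH = 0x00007FCB1BD00000
FAULT_ADDR = 0x00007FCB1BCFF9D0

offset = bytes_below_stack_top(FAULT_ADDR, STACK_LOW, STACK_HIGH)
print(f"{offset} bytes (0x{offset:x}) below the top of the stack")
# prints: 1584 bytes (0x630) below the top of the stack
```

The faulting address therefore lies within the last page of the shared stack range, about 1.5 KB below its high end.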
11-12-2020

Unfortunately I am unable to reproduce the crash with my fastdebug build.
11-12-2020