JDK-8312617 : SIGSEGV in ConnectionGraph::verify_ram_nodes
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 22
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: linux
  • CPU: x86
  • Submitted: 2023-07-24
  • Updated: 2024-01-25
  • Resolved: 2023-08-02
JDK 22
22 b09 Fixed
Related Reports
Duplicate :  
Duplicate :  
Relates :  
Description
This crash happened in weekly promo build testing of 22-b7 using SPECjvm2008 scimark.monte_carlo.

It seems method ConnectionGraph::verify_ram_nodes came with JDK-8287061.


#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f50bfeef731, pid=2570475, tid=2570489
#
# JRE version: Java(TM) SE Runtime Environment (22.0+7) (build 22-ea+7-489)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (22-ea+7-489, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x742731]  ConnectionGraph::verify_ram_nodes(Compile*, Node*)+0xc1
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#

---------------  S U M M A R Y ------------

Command Line: -XX:+UseG1GC -XX:-PrintWarnings -XX:+UseLargePages -XX:+PerfDataSaveToFile -Xlog:gc* ./SPECjvm2008.jar --showversion scimark.monte_carlo -ikv

Host: Intel(R) Xeon(R) Gold 6354 CPU @ 3.00GHz, 72 cores, 502G, Oracle Linux Server release 8.8
Time: Fri Jul 21 00:16:28 2023 UTC elapsed time: 10.288761 seconds (0d 0h 0m 10s)

---------------  T H R E A D  ---------------

Current thread (0x00007f50b81e7520):  JavaThread "C2 CompilerThread0" daemon [_thread_in_native, id=2570489, stack(0x00007f50695d6000,0x00007f50696d7000) (1028K)]


Current CompileTask:
C2:  10288  717       4       spec.benchmarks.scimark.monte_carlo.MonteCarlo::run (26 bytes)

Stack: [0x00007f50695d6000,0x00007f50696d7000],  sp=0x00007f50696d2840,  free space=1010k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x742731]  ConnectionGraph::verify_ram_nodes(Compile*, Node*)+0xc1
V  [libjvm.so+0x64256b]  Compile::Optimize()+0x13db
V  [libjvm.so+0x643aca]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0xf2a
V  [libjvm.so+0x571a0d]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x13d
V  [libjvm.so+0x649757]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0xa97
V  [libjvm.so+0x64c838]  CompileBroker::compiler_thread_loop()+0x698
V  [libjvm.so+0x8f5448]  JavaThread::thread_main_inner() [clone .part.0]+0xb8
V  [libjvm.so+0xe96cf8]  Thread::call_run()+0xa8
V  [libjvm.so+0xcbccfa]  thread_native_entry(Thread*)+0xda

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000028

Registers:
RAX=0x00007f4fe4397600, RBX=0x00000000000008a4, RCX=0x0000000000000000, RDX=0x00007f4fe4397620
RSP=0x00007f50696d2840, RBP=0x00007f50696d28e0, RSI=0x0000000000000000, RDI=0x00007f4fe439f600
R8 =0x00007f4fe4397600, R9 =0x00007f4fe4397600, R10=0x0000000000000000, R11=0x0000000000000000
R12=0x00007f50696d4b20, R13=0x00007f50696d2b30, R14=0x0000000000000000, R15=0x00007f50696d2980
RIP=0x00007f50bfeef731, EFLAGS=0x0000000000010202, CSGSFS=0x002b000000000033, ERR=0x0000000000000004
  TRAPNO=0x000000000000000e

Comments
Verified both compiler/arraycopy/stress/StressCharArrayCopy.java and compiler/arraycopy/TestCloneWithStressReflectiveCode.java passed in JDK22 ATR.
25-01-2024

Changeset: 64467923
Author: Cesar Soares Lucas <cslucas@openjdk.org>
Committer: Tobias Hartmann <thartmann@openjdk.org>
Date: 2023-08-02 14:27:07 +0000
URL: https://git.openjdk.org/jdk/commit/6446792327c629dbd1dfc1edfb547065f6fce651
02-08-2023

A pull request was submitted for review.
URL: https://git.openjdk.org/jdk/pull/15048
Date: 2023-07-26 22:26:05 +0000
26-07-2023

Agree.
26-07-2023

Hi Cesar, I think your idea is good. And you could also add another `failing()` test just after `mexp.eliminate_macro_nodes()` call since if compilation has failed we don’t need to do `igvn.optimize()` anymore.
26-07-2023

Yeah, `verify_ram_nodes` is SEGFAULTing because `root()` is returning null since the compilation is already failing with failure reason "retry without locks coarsening". However, we are seeing the SEGFAULT now because I moved the `if (failing()) return;` that was just after the call to `mexp.eliminate_macro_nodes();` to _after_ calling `verify_ram_nodes()`. I'm going to submit a patch to make `verify_ram_nodes()` check if we are already `failing()` and if so just return. What do you all think?
25-07-2023

It seems that Cesar's patch in JDK-8287061 only exposes the bug and is **not the root cause**. I reset the source tree with git to the commit just before that patch and still observed that `C->root()` returns null and `_failure_reason` is "retry without locks coarsening" after `mexp.eliminate_macro_nodes()` runs in `Compile::Optimize()`. The reason seems to be that, with the fix for JDK-8268347, failure needs to be checked after calling `eliminate_macro_nodes()`. That fix only added the check here: https://github.com/openjdk/jdk17/commit/4d8b5c70dff51470210a0ca93b932af1b27c9f27#diff-2faebd05d08f9115f8d9ef771644cf05087a6986c2f9013d7163c6aa720169c3R2577-R2578 and missed it here (in `Compile::Optimize()`): https://github.com/openjdk/jdk/blob/9606cbcd2314506d0054ecba1804e5e0c2670cd6/src/hotspot/share/opto/compile.cpp#L2323-L2324 `eliminate_macro_nodes()` calls `coarsened_locks_consistent()`, which contains a `record_failure()` call that sets `_root` to null, so the subsequent access to `_root` in `ConnectionGraph::verify_ram_nodes()`, added by Cesar, crashes.
25-07-2023
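
The mechanism described in the comment above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `FailingGuardSketch`, its nested `Compile` and `Node` classes, and every method in it are hypothetical stand-ins written in Java for illustration; they are not HotSpot code (the real code is C++ in `compile.cpp` and `escape.cpp`). It shows how recording a failure that clears the root makes an unguarded verifier dereference null, and how a `failing()` check avoids that.

```java
// Toy model only: names mimic the discussion, nothing here is HotSpot code.
public class FailingGuardSketch {

    /** Hypothetical stand-in for an IR node. */
    static final class Node {
        final int inputCount = 1;
    }

    /** Hypothetical stand-in for the compilation state. */
    static final class Compile {
        private Node root = new Node();
        private String failureReason; // null while the compilation is healthy

        Node root() { return root; }

        boolean failing() { return failureReason != null; }

        // Mirrors the behavior described in the comment: recording a
        // failure also clears the root.
        void recordFailure(String reason) {
            if (failureReason == null) {
                failureReason = reason;
            }
            root = null;
        }
    }

    // Stand-in for eliminate_macro_nodes(): it may bail out and record a failure.
    static void eliminateMacroNodes(Compile c, boolean locksConsistent) {
        if (!locksConsistent) {
            c.recordFailure("retry without locks coarsening");
        }
    }

    // Unguarded verifier: dereferences the root unconditionally.
    static int verifyUnguarded(Compile c) {
        return c.root().inputCount; // throws NullPointerException if root is null
    }

    // Guarded verifier: returns early when the compilation is already failing.
    // Returns true only if verification actually ran.
    static boolean verifyGuarded(Compile c) {
        if (c.failing()) {
            return false;
        }
        return c.root().inputCount > 0;
    }

    public static void main(String[] args) {
        Compile c = new Compile();
        eliminateMacroNodes(c, false);
        System.out.println("failing: " + c.failing());
        System.out.println("guarded verification ran: " + verifyGuarded(c));
        try {
            verifyUnguarded(c);
        } catch (NullPointerException e) {
            System.out.println("unguarded verification dereferenced a null root");
        }
    }
}
```

In this model the null dereference surfaces as a `NullPointerException`; in the JVM's native code the equivalent access is the SIGSEGV at a small address (`si_addr: 0x28`) shown in the report.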

Hi all, I'm not the reporter but I find that the following reduced version of the test case in JDK-8312748 also triggers the same crash (V [libjvm.so+0x635d16] Unique_Node_List::push(Node*)+0x20) on my machine (Linux x64). Hope it helps :P.
```
public class TestSyncStatic implements Cloneable {
    public static void main(String[] args) throws CloneNotSupportedException {
        TestSyncStatic t = new TestSyncStatic();
        for (int i = 0; i < 50_000; ++i) {
            synchronized (TestSyncStatic.class) {
                synchronized (TestSyncStatic.class) {
                    t.clone();
                }
                synchronized(TestSyncStatic.class) {
                    for (int var0 = 1; var0 < 32; var0++);
                }
            }
        }
    }
}
```
25-07-2023

The reproducer from (duplicate) JDK-8312744 / JDK-8312748 works:
java compiler.arraycopy.stress.StressCharArrayCopy
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f4120152f76, pid=1874891, tid=1874905
#
# JRE version: Java(TM) SE Runtime Environment (22.0+8) (fastdebug build 22-ea+8-534)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 22-ea+8-534, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xba7f76]  ConnectionGraph::verify_ram_nodes(Compile*, Node*)+0x146
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
25-07-2023

Cesar, please have a look. Unfortunately, no replay file was saved and I'm not (yet) able to reproduce.
25-07-2023

ILW = Crash during Escape Analysis (regression in JDK 22 b07 from JDK-8287061), intermittent with SPECjvm2008 MonteCarlo benchmark, -XX:-ReduceAllocationMerges = HML = P2
25-07-2023