JDK-8253756 : C2 CompilerThread0 crash in Node::add_req(Node*)
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 11,12,13,14,15,16
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: x86_64
  • Submitted: 2020-09-27
  • Updated: 2021-01-14
  • Resolved: 2020-10-08
Fixed versions:
  • JDK 11: 11.0.11-oracle (Fixed)
  • JDK 15: 15.0.2 (Fixed)
  • JDK 16: 16 b20 (Fixed)
Description
ADDITIONAL SYSTEM INFORMATION :
MacBook Pro (Retina, 15-inch, Late 2013)
macOS Catalina 10.15.6
openjdk JDK 15 GA (but also 14 and 13 update versions)

A DESCRIPTION OF THE PROBLEM :
While running performance tests on the BouncyCastle implementation of the Ed448 KeyPairGenerator, I noticed messages like this being written to the output (if enough iterations are performed):
    Default case invoked for: 
       opcode = 0, "Node" 

It's somewhat deterministic, in that the message would appear anywhere from 0 to ~5 times. Note that this didn't initially crash the process or apparently lead to any error (unit tests continued passing). However, I was a bit concerned, so I started simplifying the code to drill down to what might be causing the problem. At a certain point I was able to produce a crash.

At least superficially this appears very similar to, and may be a regression of, https://bugs.openjdk.java.net/browse/JDK-8208275 .

ERROR MESSAGES/STACK TRACES THAT OCCUR :
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000106006714, pid=16221, tid=23299
#
# JRE version: OpenJDK Runtime Environment (15.0+36) (build 15+36-1562)
# Java VM: OpenJDK 64-Bit Server VM (15+36-1562, mixed mode, sharing, tiered, compressed oops, g1 gc, bsd-amd64)
# Problematic frame:
# V  [libjvm.dylib+0x806714]  Node::add_req(Node*)+0xb4
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/peter/Documents/Workspaces/BouncyCastle/bc-tests/hs_err_pid16221.log
#
# Compiler replay data is saved as:
# /Users/peter/Documents/Workspaces/BouncyCastle/bc-tests/replay_pid16221.log
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#

(I have hs_err and replay files that I will attach).

REGRESSION : Last worked in version 12

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Simply use the "EdDSA" KeyPairGenerator from the BouncyCastle provider to generate thousands of keys; once the JIT kicks in, you should see the "Default case invoked for" message appear. There might be rare crashes here too, but I only reached reliably crashing code while stripping away layers to isolate the code that was causing the message.

ACTUAL -
Usually displays the "Default case invoked for" message up to 5 times or so. Crashes are rare, but specific code tweaks can lead to (apparently) deterministic crashing.

---------- BEGIN SOURCE ----------
This is simple code that will usually cause "Default case invoked for" message:

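        // Imports needed by this fragment:
        //   java.security.KeyPairGenerator
        //   java.security.spec.NamedParameterSpec
        //   org.bouncycastle.jce.provider.BouncyCastleProvider
        KeyPairGenerator kpGen = KeyPairGenerator.getInstance("EdDSA", new BouncyCastleProvider());
        kpGen.initialize(NamedParameterSpec.ED448);

        // Iterate enough times for the JIT to compile the key generation code
        for (int i = 0; i < 10000; ++i)
        {
            kpGen.generateKeyPair();
        }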

The code used to get the crash is too complicated to attach here, but I can try to put a reproducer together if the logs are not enough.
---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
The bug (message or crash) appears never to happen if either -XX:-TieredCompilation or -XX:LoopStripMiningIter=0 is used.
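
When testing the workaround above, it can help to confirm that the JVM really was started with one of the two flags. The following is a minimal sketch, not part of the original report: the class name WorkaroundCheck and its helper method are hypothetical, and it only matches the exact flag strings quoted above (a different spelling or value would not be detected).

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
public class WorkaroundCheck {
    // The two flags named in the workaround section of this report.
    static final List<String> WORKAROUND_FLAGS =
            List.of("-XX:-TieredCompilation", "-XX:LoopStripMiningIter=0");

    /** Returns the workaround flags that appear verbatim in the given JVM argument list. */
    static List<String> activeWorkarounds(List<String> jvmArgs) {
        List<String> found = new ArrayList<>();
        for (String flag : WORKAROUND_FLAGS) {
            if (jvmArgs.contains(flag)) {
                found.add(flag);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not the application arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        List<String> active = activeWorkarounds(jvmArgs);
        if (active.isEmpty()) {
            System.out.println("No workaround flag detected; the C2 issue may still trigger.");
        } else {
            System.out.println("Workaround flag(s) in effect: " + active);
        }
    }
}
```

Running it as, for example, `java -XX:LoopStripMiningIter=0 WorkaroundCheck` should report that flag as in effect; running it without either flag should report that none was detected.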
 

FREQUENCY : often



Comments
Fix Request (11u): Should be backported for parity with 11.0.11-oracle. Applies cleanly.
22-12-2020

Fix request (JDK 15u): This patch fixes a crash during C2 compilation that is due to loop strip mining (JDK-8186027). The fix applies cleanly, is low risk and has been tested for a while in the JDK 16 CI.
11-11-2020

This also affects JDK 11. Updating corresponding fields.
11-11-2020

I think this should be backported to 15.0.2.
13-10-2020

Changeset: 76a58527
Author: Roland Westrelin <roland@openjdk.org>
Date: 2020-10-08 08:39:40 +0000
URL: https://git.openjdk.java.net/jdk/commit/76a58527
08-10-2020

Seems to be related to loop strip mining. Roland, could you please have a look? Thanks.
05-10-2020

Reproduced with debug build:
#  Internal Error (/opt/mach5/mesos/work_dir/slaves/52628e90-e5e7-4ef3-af97-10d8776d10db-S967/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/7eae27fa-c008-4f61-be47-7beaaed81640/runs/0e7e0834-608e-4e86-859e-a5c2c98e8f71/workspace/open/src/hotspot/share/opto/loopnode.cpp:1107), pid=1350509, tid=1350531
#  assert(expect_skeleton == 1 || expect_skeleton == -1) failed: unexpected skeleton node

Current CompileTask:
C2: 669 1 b 4 org.bouncycastle.math.ec.rfc8032.Ed448::generatePublicKey (58 bytes)

Stack: [0x00007f2f0471f000,0x00007f2f04820000], sp=0x00007f2f0481b180, free space=1008k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x11780df] LoopNode::verify_strip_mined(int) const+0xf8f
V [libjvm.so+0x8fe1bb] Compile::final_graph_reshaping_main_switch(Node*, Final_Reshape_Counts&, unsigned int)+0x51b
V [libjvm.so+0x900085] Compile::final_graph_reshaping_impl(Node*, Final_Reshape_Counts&) [clone .part.0]+0x85
V [libjvm.so+0x9008a0] Compile::final_graph_reshaping_walk(Node_Stack&, Node*, Final_Reshape_Counts&)+0x1c0
V [libjvm.so+0x90768c] Compile::final_graph_reshaping()+0x46c
V [libjvm.so+0x90b2fa] Compile::Optimize()+0x11ba
V [libjvm.so+0x90c95c] Compile::Compile(ciEnv*, ciMethod*, int, bool, bool, bool, DirectiveSet*)+0x15ac
V [libjvm.so+0x759695] C2Compiler::compile_method(ciEnv*, ciMethod*, int, DirectiveSet*)+0x175
V [libjvm.so+0x91b9c0] CompileBroker::invoke_compiler_on_method(CompileTask*)+0xd60
V [libjvm.so+0x91c788] CompileBroker::compiler_thread_loop()+0x6c8
V [libjvm.so+0x16ca5cc] JavaThread::thread_main_inner()+0x21c
V [libjvm.so+0x16d0310] Thread::call_run()+0x100
V [libjvm.so+0x13d49a6] thread_native_entry(Thread*)+0x116
05-10-2020

Crash is reproduced in JDK 15 and JDK 16 ea. Reproducer attached (Repro-9156035.zip).

OS: Windows 10
JDK 14.0.2: Pass
JDK 15+22: Pass
JDK 15+23: Fail
JDK 15+36: Fail
JDK 16ea17: Fail

ILW=HMM=P2
Impact = High, it is crash and regression found in current GA build
Likelihood = medium, issue is reproducible in corner case, and reported only once
Workaround = medium, no known workaround yet.

Moving it to dev team for further analysis.
29-09-2020

Requested submitter to provide additional information:
Please provide the complete hs_err_pid.log and replay file to analyze this issue further.
28-09-2020

Please provide the files below for further triage:
/Users/peter/Documents/Workspaces/BouncyCastle/bc-tests/hs_err_pid16221.log
/Users/peter/Documents/Workspaces/BouncyCastle/bc-tests/replay_pid16221.log
28-09-2020