JDK-8308675 : C2: assert(no_dead_loop) failed: dead loop detected - irreducible loops are broken
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 8u291,11,17,21
  • Priority: P3
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2023-05-23
  • Updated: 2024-08-08
  • Fix Version: tbd (Unresolved)
Description
My jasm fuzzer found a dead-loop.

Attached you can find the full fuzzer test X.jasm as well as a reduced version X3.jasm.

Since the error is a bit intermittent, we require:
-XX:RepeatCompilation=1000 -XX:+StressIGVN

java -jar ~/Documents/asmtools-7.0-build/release/lib/asmtools.jar jasm X.jasm
java -XX:+UnlockExperimentalVMOptions -Xcomp -XX:CompileCommand=compileonly,X::test* -XX:-TieredCompilation -XX:RepeatCompilation=1000 -XX:+StressIGVN X


java -jar ~/Documents/asmtools-7.0-build/release/lib/asmtools.jar jasm X3.jasm
java -XX:+UnlockExperimentalVMOptions -Xcomp -XX:CompileCommand=compileonly,X3::test* -XX:-TieredCompilation -XX:RepeatCompilation=1000 -XX:+StressIGVN X3

Result is this:
dist dump
---------------------------------------------
   2     0  Root  === 0 2 3 4 33 34 1 1 20 1 32  [[ 0 1 256 338 125 222 104 388 393 395 510 138 331 689 129 111 169 109 170 442 443 115 214 726 155 208 199 352 495 157 463 576 246 110 131 132 136 851 855 1148 ]] 
   1     1  Con  === 0  [[ ]]  #top
   0   149  OrI  === _ 149 1  [[ 149 40 51 52 218 47 761 ]]  !orig=181,[2909] !jvms: X::test @ bci:2727
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/.../src/hotspot/share/opto/phaseX.cpp:956), pid=4189197, tid=4189210
#  assert(no_dead_loop) failed: dead loop detected

Current CompileTask:
C2:    107    6       4       X::test (3794 bytes)

Stack: [0x00007f3061264000,0x00007f3061365000],  sp=0x00007f306135f630,  free space=1005k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x177ac78]  PhaseGVN::dead_loop_check(Node*) [clone .part.0]+0x158  (phaseX.cpp:943)
V  [libjvm.so+0x17897e9]  PhaseIterGVN::transform_old(Node*)+0x4e9
V  [libjvm.so+0x17817ee]  PhaseIterGVN::optimize()+0x6e
V  [libjvm.so+0xaef32a]  PhaseIdealLoop::optimize(PhaseIterGVN&, LoopOptsMode)+0x6aa
V  [libjvm.so+0xae8640]  Compile::Optimize()+0x4c0
V  [libjvm.so+0xaed10e]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x18ce
V  [libjvm.so+0x8fec67]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x4e7
V  [libjvm.so+0xafa22c]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0xa7c
V  [libjvm.so+0xafafe8]  CompileBroker::compiler_thread_loop()+0x5d8
V  [libjvm.so+0x1067896]  JavaThread::thread_main_inner()+0x206
V  [libjvm.so+0x1a6a8b0]  Thread::call_run()+0x100
V  [libjvm.so+0x16ff053]  thread_native_entry(Thread*)+0x103


Different failure mode with JDK 8u and 17u:

# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (.../hotspot/src/share/vm/opto/loopnode.cpp:3133), pid=3446922, tid=0x00007f058a92a700
#  assert(!in->is_CFG()) failed: CFG Node with no controlling input?

V  [libjvm.so+0x119823d]  VMError::report_and_die()+0x2fd
V  [libjvm.so+0x793e11]  report_vm_error(char const*, int, char const*, char const*)+0x61
V  [libjvm.so+0xd10227]  PhaseIdealLoop::build_loop_early(VectorSet&, Node_List&, Node_Stack&)+0x587
V  [libjvm.so+0xd16ef3]  PhaseIdealLoop::build_and_optimize(bool, bool)+0x7e3
V  [libjvm.so+0x6f7ba3]  Compile::Optimize()+0x453
V  [libjvm.so+0x6f9375]  Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool, bool)+0xe05
V  [libjvm.so+0x58db18]  C2Compiler::compile_method(ciEnv*, ciMethod*, int)+0xe8
V  [libjvm.so+0x708442]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x942
V  [libjvm.so+0x709a28]  CompileBroker::compiler_thread_loop()+0x858
V  [libjvm.so+0x1109b28]  JavaThread::thread_main_inner()+0x198
V  [libjvm.so+0x1109e30]  JavaThread::run()+0x2c0
V  [libjvm.so+0xea1902]  java_start(Thread*)+0x102
C  [libpthread.so.0+0x8609]  start_thread+0xd9

Comments
I'm unassigning this now. We probably need to remove irreducible loops completely from our IR, by transforming the ciTypeFlow graph during bytecode parsing. The issue was basically this: we rely on a "constant time" check of whether a loop is still reachable (vs. dead). This works reliably with reducible loops (single entry): we just check whether that single entry has died. But for irreducible loops (multiple entries) it is hard to detect when the last entry has died. We would always have to do a global reachability check, and that is expensive.
08-08-2024
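
To illustrate the reachability argument in the comment dated 08-08-2024, here is a minimal sketch on a toy control-flow graph. It is not HotSpot code: the class name, the integer node ids and the successor map are all assumptions made for illustration. It contrasts the cheap single-entry check with the global walk from the root that an irreducible loop would need.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only; names and graph encoding are hypothetical, not C2's IR.
public class LoopReachabilitySketch {

    // Reducible loop: one entry, so "is the loop alive?" is a constant-time
    // question about that single entry edge.
    static boolean reducibleLoopAlive(boolean singleEntryAlive) {
        return singleEntryAlive;
    }

    // Irreducible loop: any of several entries may keep it alive, so in
    // general we have to walk the whole graph from the root.
    static boolean reachableFromRoot(Map<Integer, List<Integer>> succ, int root, int target) {
        Set<Integer> seen = new HashSet<>();
        Deque<Integer> work = new ArrayDeque<>();
        work.push(root);
        while (!work.isEmpty()) {
            int n = work.pop();
            if (!seen.add(n)) {
                continue; // already visited
            }
            if (n == target) {
                return true;
            }
            for (int s : succ.getOrDefault(n, List.of())) {
                work.push(s);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Nodes 2 and 3 form a loop (2 -> 3 -> 2) with two entries: 0 -> 2 and 1 -> 3.
        Map<Integer, List<Integer>> bothEntries = Map.of(
                0, List.of(1, 2), 1, List.of(3), 2, List.of(3), 3, List.of(2));
        // Entry 0 -> 2 has died, but the loop is still entered through 1 -> 3.
        Map<Integer, List<Integer>> oneEntry = Map.of(
                0, List.of(1), 1, List.of(3), 2, List.of(3), 3, List.of(2));
        // Both entries have died: the loop is now a dead cycle.
        Map<Integer, List<Integer>> noEntry = Map.of(
                0, List.of(1), 1, List.of(), 2, List.of(3), 3, List.of(2));

        System.out.println(reachableFromRoot(bothEntries, 0, 2)); // true
        System.out.println(reachableFromRoot(oneEntry, 0, 2));    // true
        System.out.println(reachableFromRoot(noEntry, 0, 2));     // false
    }
}
```

In the `oneEntry` case, checking only the entry edge 0 -> 2 would wrongly report the loop as dead. That is why a single-entry check is not enough once a loop has multiple entries.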

Abandoning this for now. The draft at https://github.com/openjdk/jdk/pull/14374 is still there, but it is not working. I hope that we can instead eventually remove irreducible loops completely; that would improve things drastically.
03-10-2023

I have seen similar issues with Phi nodes corroding before the Region, without irreducible loops, when working on JDK-8308149. Maybe we need to reconsider our approach to checking Region reachability in general. One solution we could try: if a Phi node has a top input, it must wait until the corresponding Region slot also dies and the Region removes that slot from the Phi as well. I think it should not happen that only the Phi node collapses one input while the Region keeps both. That would mean that we take the data from one branch and ignore whatever happens in the other. We should add verification, so that at the end of IGVN we do not have a Phi node that has a top input.
31-05-2023
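
The verification proposed in the comment dated 31-05-2023 could look roughly like the sketch below. This is a toy model, not HotSpot code: inputs are plain strings, `"top"` stands in for C2's top node, and slot i of the Phi is assumed to pair with slot i of the Region (the real IR keeps the Region itself at Phi input 0). All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; not C2's Node/PhiNode/RegionNode classes.
public class PhiTopVerifySketch {
    // Stand-in for C2's "top" (dead) value.
    static final String TOP = "top";

    // Slots where the Phi already sees top while the Region still treats
    // the matching control input as live: the inconsistent state described above.
    static List<Integer> inconsistentSlots(String[] controlIn, String[] dataIn) {
        if (controlIn.length != dataIn.length) {
            throw new IllegalArgumentException("Region and Phi must have the same number of slots");
        }
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < dataIn.length; i++) {
            if (TOP.equals(dataIn[i]) && !TOP.equals(controlIn[i])) {
                bad.add(i);
            }
        }
        return bad;
    }

    // Stricter end-of-IGVN rule from the comment: no Phi input may be top at all.
    static boolean hasTopInput(String[] dataIn) {
        for (String in : dataIn) {
            if (TOP.equals(in)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] regionInputs = {"ctrlA", "ctrlB"};
        String[] phiInputs = {"x", TOP};
        System.out.println(inconsistentSlots(regionInputs, phiInputs)); // [1]
        System.out.println(hasTopInput(phiInputs));                     // true
    }
}
```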

It looks like we have top sneaking through the Phi nodes first, and only later does it reach the Regions (all properly marked as irreducible). The fix in JDK-8280126 seems to be incomplete: it makes the Phi nodes wait for the Regions to collapse first, but only if the Region already has a top input. If top is not directly at the Region yet, but a few nodes further up and will only come down later, then the Phi does not notice that and can already kill the data loop, creating inconsistent states such as the dead loop on the OrI node.
31-05-2023

Simplified it a bit more to X4.jasm:

java -jar ~/Documents/asmtools-7.0-build/release/lib/asmtools.jar jasm X4.jasm
./java -XX:+UnlockExperimentalVMOptions -Xcomp -XX:CompileCommand=compileonly,X4::test* -XX:-TieredCompilation -XX:RepeatCompilation=1000 -XX:+StressIGVN X4
24-05-2023

Old issue, also triggers with 8u.
23-05-2023

ILW = Dead data loop in IGVN, only with fuzzer generated byte code, disable compilation of affected method = HLM = P3
23-05-2023