JDK-8296389 : C2: PhaseCFG::convert_NeverBranch_to_Goto must handle both orders of successors
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 17,20,21
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2022-11-04
  • Updated: 2022-12-19
  • Resolved: 2022-12-12
Fix Versions:
  JDK 17: 17.0.7-oracle (Fixed)
  JDK 21: 21 b02 (Fixed)
Description
During fuzzer work on JDK-8280126, I found the same assert, but with irreducible loops. I suspected it would also be possible to trigger the same bug without irreducible loops, so I went and constructed R3.java, using various tricks.

To reproduce, use either:
$ java -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -XX:+PrintOptoAssembly -XX:+TraceLoopOpts -XX:+PrintInlining -Xcomp -XX:CompileCommand=compileonly,R3::test -XX:-TieredCompilation -XX:PerMethodTrapLimit=0 R3.java
$ java -Xcomp -XX:CompileCommand=compileonly,R3::test -XX:PerMethodTrapLimit=0 R3.java

#  Internal Error (/home/emanuel/Documents/fork2-jdk/open/src/hotspot/share/opto/node.cpp:830), pid=111698, tid=111711
#  assert(idx < _cnt) failed: oob
#
# JRE version: Java(TM) SE Runtime Environment (20.0) (slowdebug build 20-internal-2022-10-06-1045569.emanuel...)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (slowdebug 20-internal-2022-10-06-1045569.emanuel..., compiled mode, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x1095a1e]  Node::del_req(unsigned int)+0x26

Current CompileTask:
C2:   5472   83    b        R3::test (96 bytes)

Stack: [0x00007f3430cdc000,0x00007f3430ddd000],  sp=0x00007f3430dd7f90,  free space=1007k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x1095a1e]  Node::del_req(unsigned int)+0x26  (node.cpp:830)
V  [libjvm.so+0x656230]  PhaseCFG::convert_NeverBranch_to_Goto(Block*)+0x232  (block.cpp:653)
V  [libjvm.so+0x65669c]  PhaseCFG::remove_empty_blocks()+0x100  (block.cpp:744)
V  [libjvm.so+0x89bb42]  Compile::Code_Gen()+0x354  (compile.cpp:2978)
V  [libjvm.so+0x8920fd]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x159f  (compile.cpp:863)
V  [libjvm.so+0x780a29]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x179  (c2compiler.cpp:113)
V  [libjvm.so+0x8b0c56]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x916  (compileBroker.cpp:2240)
V  [libjvm.so+0x8af8bf]  CompileBroker::compiler_thread_loop()+0x3ed  (compileBroker.cpp:1916)
V  [libjvm.so+0x8d0008]  CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x72  (compilerThread.cpp:58)
V  [libjvm.so+0xc5df2a]  JavaThread::thread_main_inner()+0x144  (javaThread.cpp:699)
V  [libjvm.so+0xc5dde2]  JavaThread::run()+0x182  (javaThread.cpp:684)
V  [libjvm.so+0x132fe8f]  Thread::call_run()+0x195  (thread.cpp:224)
V  [libjvm.so+0x10dd81b]  thread_native_entry(Thread*)+0x19b  (os_linux.cpp:710)

TraceLoopOpts:
PHASE_PHASEIDEALLOOP1 start
Counted          Loop: N237/N231  counted [76,0),-1 (-1 iters) 
Parallel IV: 59     Loop: N237/N231  counted [76,0),-1 (-1 iters)  has_sfpt strip_mined
Counted          Loop: N252/N136  counted [0,4),+1 (-1 iters) 
Loop: N0/N0  has_sfpt
  Loop: N236/N235 
    Loop: N237/N231  counted [76,0),-1 (-1 iters)  has_sfpt strip_mined
  Loop: N251/N250 
    Loop: N252/N136  counted [0,4),+1 (-1 iters)  has_sfpt strip_mined
Empty with zero trip guard       Loop: N237/N231  counted [76,0),-1 (-1 iters)  has_sfpt strip_mined
MaxUnroll  4     Loop: N252/N136  counted [0,4),+1 (2147483648 iters)  has_sfpt strip_mined
Unroll 2( 4)     Loop: N252/N136  counted [0,4),+1 (2147483648 iters)  has_sfpt strip_mined
PHASE_PHASEIDEALLOOP1 end
PHASE_PHASEIDEALLOOP2 start
Loop: N0/N0  has_sfpt
  Loop: N251/N250  sfpts={ 253 }
    Loop: N277/N136  counted [0,4),+2 (2147483648 iters)  has_sfpt strip_mined
  Loop: N309/N170  sfpts={ 170 }
MaxUnroll  2     Loop: N277/N136  counted [0,4),+2 (2147483648 iters)  has_sfpt strip_mined
Unroll 4( 2)     Loop: N277/N136  counted [0,4),+2 (2147483648 iters)  has_sfpt strip_mined
Peel           Loop: N309/N170  sfpts={ 170 }
PHASE_PHASEIDEALLOOP2 end
PHASE_PHASEIDEALLOOP3 start
Loop: N0/N0  has_sfpt
  Loop: N309/N170  sfpts={ 170 }
PHASE_PHASEIDEALLOOP3 end
Loop: N0/N0  has_sfpt
  Loop: N309/N170  sfpts={ 170 }
Comments
Changeset: fabda246 Author: Emanuel Peter <epeter@openjdk.org> Date: 2022-12-12 12:11:02 +0000 URL: https://git.openjdk.org/jdk/commit/fabda246960cfdfff13c5a87de53d97169248172
12-12-2022

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/11481 Date: 2022-12-02 12:48:31 +0000
02-12-2022

ILW = C2 assertion failure with obscure code pattern, only with infinite loops and mainly with irreducible loops, use large enough PerMethodTrapLimit value = HLM = P3
14-11-2022

Just ran it with the openjdk 17.0.4 2022-07-19 product build. It reproduces with a SIGSEGV, which was to be expected, since we modify the blocks and nodes in unintended ways.
04-11-2022

Suggested solution: Fix PhaseCFG::convert_NeverBranch_to_Goto.

At first sight it looks like the code should handle both successor orders:
  [[ "succ", "dead" ]] -> works
  [[ "dead", "succ" ]] -> broken

The second case is broken because we do:

  Block *succ = b->_succs[idx];
  ...
  b->_succs.map(0, succ);
  ...
  Block* dead = b->_succs[1 - idx];

This only works if idx == 0. If idx == 1, we overwrite position 0 before we read "dead" from it, and so read "succ" again instead. The code below then thinks it is deleting inputs of "dead", but in fact deletes inputs of "succ". Some of those input positions do not exist in "succ", and we hit the assert in Node::del_req.
04-11-2022
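
A minimal standalone sketch (not HotSpot code) of the read-after-overwrite ordering described in the comment above, using a plain two-element array in place of b->_succs. It only illustrates the general remedy of reading both entries before mutating slot 0; the actual fix in the linked changeset may differ in detail.

// Standalone sketch (not HotSpot code) of the ordering issue: reading the
// "dead" entry only after slot 0 has already been overwritten.
#include <cassert>
#include <cstdio>
#include <cstring>

int main() {
  // Successors in the problematic order: "dead" at 0, "succ" at 1 (idx == 1).
  const char* succs[2] = { "dead", "succ" };
  const int idx = 1;  // index of the live successor

  // Buggy order, as quoted in the comment above:
  //   const char* succ = succs[idx];
  //   succs[0] = succ;                   // clobbers slot 0 ...
  //   const char* dead = succs[1 - idx]; // ... so this reads "succ" again

  // Working order: read both entries before mutating the array.
  const char* succ = succs[idx];
  const char* dead = succs[1 - idx];
  succs[0] = succ;  // now it is safe to compact the live successor into slot 0

  assert(std::strcmp(dead, "dead") == 0);  // the dead entry is preserved
  std::printf("succ=%s dead=%s\n", succ, dead);
  return 0;
}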

Context:

Normal case: during matching, the "live"/"succ" projection is added as an output of NeverBranch before the "dead" projection leading to Halt. Thus, the outputs of NeverBranch are normally [[ "succ", "dead" ]].

Details: During the DFS, we usually go from Halt to NeverBranch, then via the Region/Loop take the backedge and find the "live"/"succ" edge. We already have its inputs (the NeverBranch), so we can post-visit the live edge right away and attach it to the NeverBranch first. Later, once we have processed the whole infinite loop, we post-visit out of NeverBranch to the "dead" projection edge, which we attach second.

In R3.java: Abnormal order: the "dead" projection is attached to NeverBranch first, and the "live"/"succ" projection is added second.

Details (see R3.java.6.png): In our pathological case, during the bottom-up DFS traversal for matching, we come in through the shared 224 Halt, first visit the 345 NeverBranch of the peeled iteration, and visit everything above it. But there is no backedge, so we do not find the "live" edge, and we post-visit the "dead" edge first (350 CProj). Then we take the second branch of the Halt (351 Region -> 222 CProj) and visit the peeled loop. From there we finally find the "live" projection (346 CProj) of the peeled iteration's NeverBranch, and attach it second.

Why is the order of the "live"/"succ" projection vs the "dead" one relevant? It determines in what order we later DFS-traverse the nodes when scheduling them into blocks, and thereby also the order of the successor blocks. The code in PhaseCFG::convert_NeverBranch_to_Goto expects the "succ" projection and block to have idx=0. At first appearance the code looks like it should also handle the inverted case where "succ" has idx=1, but it has a bug: we overwrite the _succs slot where the "dead" block reference is stored before we read it off.

This is how the nodes are scheduled in the block:

  B6: # out( B12 B7 ) <- in( N55 N57 )  Freq: 0.9
  17  Region  === 17 54 53  [[ 17 16 42 ]]  !jvms: R3::test @ bci:57 (line 29)
  42  Phi  === 17 43 44  [[ 40 ]]  #int:1..3 !jvms: R3::test @ bci:57 (line 29)
  16  NeverBranch  === 17  [[ 15 27 ]]
  15  CProj  === 16  [[ 52 ]] #1
  27  CProj  === 16  [[ 61 ]] #0 !jvms: R3::test @ bci:65 (line 30)

Why did this bug not trigger before? It is quite an obscure code pattern; many things need to play together. We need an infinite loop which is peeled. This leads to the shared HaltNode, where the peeled iteration is visited first. However, we also cannot have a HaltNode further down after the peeled loop, because that one would be traversed first. To cause the peeling, the infinite loop needs an if with a loop exit. Further, some of the ifs need to collapse at exactly the right loop-opts phase to trigger this sequence of events. It turns out that with irreducible loops this is a bit more common; at least my bytecode fuzzer found an example quite fast.

Note: We need the flag -XX:PerMethodTrapLimit=0, but we could probably just exhaust the traps with some extra code above the failing code. Without traps, we cannot insert predicates at parsing, which would act as fake "loop-exits", and hence we would not add the NeverBranch nodes that trigger the bug.
04-11-2022
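
Below is a toy post-order DFS sketch (simplified, made-up node structure; not HotSpot code) illustrating the ordering effect described in the comment above: a node is attached as an output of its inputs when it is post-visited, so whether the "live" or the "dead" projection lands first on the NeverBranch depends on which path the bottom-up traversal reaches it by.

// Toy model (not HotSpot code; names are made up for this illustration) of the
// ordering effect: outputs are attached in post-visit order of a bottom-up DFS.
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

struct ToyNode {
  std::string name;
  std::vector<ToyNode*> inputs;    // edges walked bottom-up, as during matching
  std::vector<ToyNode*> outputs;   // filled in post-visit order
  bool visited = false;
  explicit ToyNode(std::string n) : name(std::move(n)) {}
};

static void dfs(ToyNode* n) {
  if (n->visited) return;
  n->visited = true;
  for (ToyNode* in : n->inputs) dfs(in);
  // Post-visit: attach this node as an output of each of its inputs.
  for (ToyNode* in : n->inputs) in->outputs.push_back(n);
}

static void print_outputs(const char* label, const ToyNode& n) {
  std::printf("%s", label);
  for (const ToyNode* out : n.outputs) std::printf(" %s", out->name.c_str());
  std::printf("\n");
}

int main() {
  // Normal shape: from Halt we reach the NeverBranch, take the loop backedge,
  // and find the live projection while still inside the loop, so it is
  // post-visited (and attached) before the dead projection.
  ToyNode nb("NeverBranch"), region("Region"), body("body"),
          live("live"), dead("dead"), halt("Halt");
  nb.inputs     = {&region};
  live.inputs   = {&nb};
  dead.inputs   = {&nb};
  body.inputs   = {&live};
  region.inputs = {&body};        // backedge through the loop body
  halt.inputs   = {&dead};
  dfs(&halt);
  print_outputs("normal   NeverBranch outputs:", nb);   // prints: live dead

  // Abnormal shape (as in R3.java): the peeled iteration's NeverBranch has no
  // backedge, so its live projection is only reached via the second branch of
  // the shared Halt, after the dead projection was already post-visited.
  ToyNode nb2("NeverBranch"), region2("Region"),
          live2("live"), dead2("dead"), halt2("Halt");
  live2.inputs   = {&nb2};
  dead2.inputs   = {&nb2};
  region2.inputs = {&live2};
  halt2.inputs   = {&dead2, &region2};
  dfs(&halt2);
  print_outputs("abnormal NeverBranch outputs:", nb2);  // prints: dead live
  return 0;
}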