JDK-8296389 : C2: PhaseCFG::convert_NeverBranch_to_Goto must handle both orders of successors
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 17,20,21
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2022-11-04
  • Updated: 2022-12-19
  • Resolved: 2022-12-12
Fix Versions:
  JDK 17: 17.0.7-oracle (Fixed)
  JDK 21: 21 b02 (Fixed)
Description
During fuzzer work on JDK-8280126, I found the same assert, but with irreducible loops. I suspected it would also be possible to trigger the same bug without irreducible loops, so I went and constructed R3.java, using various tricks.

To reproduce, use either:
$ java -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -XX:+PrintOptoAssembly -XX:+TraceLoopOpts -XX:+PrintInlining -Xcomp -XX:CompileCommand=compileonly,R3::test -XX:-TieredCompilation -XX:PerMethodTrapLimit=0 R3.java
$ java -Xcomp -XX:CompileCommand=compileonly,R3::test -XX:PerMethodTrapLimit=0 R3.java

#  Internal Error (/home/emanuel/Documents/fork2-jdk/open/src/hotspot/share/opto/node.cpp:830), pid=111698, tid=111711
#  assert(idx < _cnt) failed: oob
#
# JRE version: Java(TM) SE Runtime Environment (20.0) (slowdebug build 20-internal-2022-10-06-1045569.emanuel...)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (slowdebug 20-internal-2022-10-06-1045569.emanuel..., compiled mode, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x1095a1e]  Node::del_req(unsigned int)+0x26

Current CompileTask:
C2:   5472   83    b        R3::test (96 bytes)

Stack: [0x00007f3430cdc000,0x00007f3430ddd000],  sp=0x00007f3430dd7f90,  free space=1007k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x1095a1e]  Node::del_req(unsigned int)+0x26  (node.cpp:830)
V  [libjvm.so+0x656230]  PhaseCFG::convert_NeverBranch_to_Goto(Block*)+0x232  (block.cpp:653)
V  [libjvm.so+0x65669c]  PhaseCFG::remove_empty_blocks()+0x100  (block.cpp:744)
V  [libjvm.so+0x89bb42]  Compile::Code_Gen()+0x354  (compile.cpp:2978)
V  [libjvm.so+0x8920fd]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x159f  (compile.cpp:863)
V  [libjvm.so+0x780a29]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x179  (c2compiler.cpp:113)
V  [libjvm.so+0x8b0c56]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x916  (compileBroker.cpp:2240)
V  [libjvm.so+0x8af8bf]  CompileBroker::compiler_thread_loop()+0x3ed  (compileBroker.cpp:1916)
V  [libjvm.so+0x8d0008]  CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x72  (compilerThread.cpp:58)
V  [libjvm.so+0xc5df2a]  JavaThread::thread_main_inner()+0x144  (javaThread.cpp:699)
V  [libjvm.so+0xc5dde2]  JavaThread::run()+0x182  (javaThread.cpp:684)
V  [libjvm.so+0x132fe8f]  Thread::call_run()+0x195  (thread.cpp:224)
V  [libjvm.so+0x10dd81b]  thread_native_entry(Thread*)+0x19b  (os_linux.cpp:710)

TraceLoopOpts:
PHASE_PHASEIDEALLOOP1 start
Counted          Loop: N237/N231  counted [76,0),-1 (-1 iters) 
Parallel IV: 59     Loop: N237/N231  counted [76,0),-1 (-1 iters)  has_sfpt strip_mined
Counted          Loop: N252/N136  counted [0,4),+1 (-1 iters) 
Loop: N0/N0  has_sfpt
  Loop: N236/N235 
    Loop: N237/N231  counted [76,0),-1 (-1 iters)  has_sfpt strip_mined
  Loop: N251/N250 
    Loop: N252/N136  counted [0,4),+1 (-1 iters)  has_sfpt strip_mined
Empty with zero trip guard       Loop: N237/N231  counted [76,0),-1 (-1 iters)  has_sfpt strip_mined
MaxUnroll  4     Loop: N252/N136  counted [0,4),+1 (2147483648 iters)  has_sfpt strip_mined
Unroll 2( 4)     Loop: N252/N136  counted [0,4),+1 (2147483648 iters)  has_sfpt strip_mined
PHASE_PHASEIDEALLOOP1 end
PHASE_PHASEIDEALLOOP2 start
Loop: N0/N0  has_sfpt
  Loop: N251/N250  sfpts={ 253 }
    Loop: N277/N136  counted [0,4),+2 (2147483648 iters)  has_sfpt strip_mined
  Loop: N309/N170  sfpts={ 170 }
MaxUnroll  2     Loop: N277/N136  counted [0,4),+2 (2147483648 iters)  has_sfpt strip_mined
Unroll 4( 2)     Loop: N277/N136  counted [0,4),+2 (2147483648 iters)  has_sfpt strip_mined
Peel           Loop: N309/N170  sfpts={ 170 }
PHASE_PHASEIDEALLOOP2 end
PHASE_PHASEIDEALLOOP3 start
Loop: N0/N0  has_sfpt
  Loop: N309/N170  sfpts={ 170 }
PHASE_PHASEIDEALLOOP3 end
Loop: N0/N0  has_sfpt
  Loop: N309/N170  sfpts={ 170 }
Comments
Changeset: fabda246 Author: Emanuel Peter <epeter@openjdk.org> Date: 2022-12-12 12:11:02 +0000 URL: https://git.openjdk.org/jdk/commit/fabda246960cfdfff13c5a87de53d97169248172
12-12-2022

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/11481 Date: 2022-12-02 12:48:31 +0000
02-12-2022

ILW = C2 assertion failure with obscure code pattern, only with infinite loops and mainly with irreducible loops, use large enough PerMethodTrapLimit value = HLM = P3
14-11-2022

Just ran it with the openjdk 17.0.4 2022-07-19 product build. It reproduces with a SIGSEGV, which was to be expected, since we modify the blocks and nodes in unintended ways.
04-11-2022

Suggested solution: Fix PhaseCFG::convert_NeverBranch_to_Goto.

At first sight it looks like the code should handle both successor orders:
  [[ "succ", "dead" ]] -> works
  [[ "dead", "succ" ]] -> broken

The second case is broken because we do:

  Block *succ = b->_succs[idx];
  ...
  b->_succs.map(0, succ);
  ...
  Block* dead = b->_succs[1 - idx];

This only works if idx == 0. If idx == 1, we overwrite position 0 before we read "dead" from it, and so read "succ" again instead. The code below then thinks it is deleting inputs of "dead", but in fact deletes inputs of "succ". Some of those input positions do not exist in "succ", and we hit the assert in Node::del_req.
04-11-2022
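
A minimal standalone sketch (not HotSpot code) of the read-after-overwrite ordering described in the comment above, using a plain two-element array in place of b->_succs. It only illustrates the general remedy of reading both entries before mutating slot 0; the actual fix in the linked changeset may differ in detail.

// Standalone sketch (not HotSpot code) of the ordering issue: reading the
// "dead" entry only after slot 0 has already been overwritten.
#include <cassert>
#include <cstdio>
#include <cstring>

int main() {
  // Successors in the problematic order: "dead" at 0, "succ" at 1 (idx == 1).
  const char* succs[2] = { "dead", "succ" };
  const int idx = 1;  // index of the live successor

  // Buggy order, as quoted in the comment above:
  //   const char* succ = succs[idx];
  //   succs[0] = succ;                   // clobbers slot 0 ...
  //   const char* dead = succs[1 - idx]; // ... so this reads "succ" again

  // Working order: read both entries before mutating the array.
  const char* succ = succs[idx];
  const char* dead = succs[1 - idx];
  succs[0] = succ;  // now it is safe to compact the live successor into slot 0

  assert(std::strcmp(dead, "dead") == 0);  // the dead entry is preserved
  std::printf("succ=%s dead=%s\n", succ, dead);
  return 0;
}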

Context:

Normal case: during matching, the "live"/"succ" projection is added as an output of NeverBranch before the "dead" projection leading to Halt. Thus, the outputs of NeverBranch are normally [[ "succ", "dead" ]].

Details: During the DFS, we usually go from Halt to NeverBranch, then via the Region/Loop take the backedge and find the "live"/"succ" edge. We already have its inputs (the NeverBranch), so we can post-visit the live edge right away and attach it to the NeverBranch first. Later, once we have processed the whole infinite loop, we post-visit out of NeverBranch to the "dead" projection edge, which we attach second.

In R3.java: Abnormal order: the "dead" projection is attached to NeverBranch first, and the "live"/"succ" projection is added second.

Details (see R3.java.6.png): In our pathological case, during the bottom-up DFS traversal for matching, we come in through the shared 224 Halt, first visit the 345 NeverBranch of the peeled iteration, and visit everything above it. But there is no backedge, so we do not find the "live" edge, and we post-visit the "dead" edge first (350 CProj). Then we take the second branch of the Halt (351 Region -> 222 CProj) and visit the peeled loop. From there we finally find the "live" projection (346 CProj) of the peeled iteration's NeverBranch, and attach it second.

Why is the order of the "live"/"succ" projection vs the "dead" one relevant? It determines in what order we later DFS-traverse the nodes when scheduling them into blocks, and thereby also the order of the successor blocks. The code in PhaseCFG::convert_NeverBranch_to_Goto expects the "succ" projection and block to have idx=0. At first appearance the code looks like it should also handle the inverted case where "succ" has idx=1, but it has a bug: we overwrite the _succs slot where the "dead" block reference is stored before we read it off.

This is how the nodes are scheduled in the block:

  B6: # out( B12 B7 ) <- in( N55 N57 )  Freq: 0.9
  17  Region  === 17 54 53  [[ 17 16 42 ]]  !jvms: R3::test @ bci:57 (line 29)
  42  Phi  === 17 43 44  [[ 40 ]]  #int:1..3 !jvms: R3::test @ bci:57 (line 29)
  16  NeverBranch  === 17  [[ 15 27 ]]
  15  CProj  === 16  [[ 52 ]] #1
  27  CProj  === 16  [[ 61 ]] #0 !jvms: R3::test @ bci:65 (line 30)

Why did this bug not trigger before? It is quite an obscure code pattern; many things need to play together. We need an infinite loop which is peeled. This leads to the shared HaltNode, where the peeled iteration is visited first. However, we also cannot have a HaltNode further down after the peeled loop, because that one would be traversed first. To cause the peeling, the infinite loop needs an if with a loop exit. Further, some of the ifs need to collapse at exactly the right loop-opts phase to trigger this sequence of events. It turns out that with irreducible loops this is a bit more common; at least my bytecode fuzzer found an example quite fast.

Note: We need the flag -XX:PerMethodTrapLimit=0, but we could probably just exhaust the traps with some extra code above the failing code. Without traps, we cannot insert predicates at parsing, which would act as fake "loop-exits", and hence we would not add the NeverBranch nodes that trigger the bug.
04-11-2022
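
Below is a toy post-order DFS sketch (simplified, made-up node structure; not HotSpot code) illustrating the ordering effect described in the comment above: a node is attached as an output of its inputs when it is post-visited, so whether the "live" or the "dead" projection lands first on the NeverBranch depends on which path the bottom-up traversal reaches it by.

// Toy model (not HotSpot code; names are made up for this illustration) of the
// ordering effect: outputs are attached in post-visit order of a bottom-up DFS.
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

struct ToyNode {
  std::string name;
  std::vector<ToyNode*> inputs;    // edges walked bottom-up, as during matching
  std::vector<ToyNode*> outputs;   // filled in post-visit order
  bool visited = false;
  explicit ToyNode(std::string n) : name(std::move(n)) {}
};

static void dfs(ToyNode* n) {
  if (n->visited) return;
  n->visited = true;
  for (ToyNode* in : n->inputs) dfs(in);
  // Post-visit: attach this node as an output of each of its inputs.
  for (ToyNode* in : n->inputs) in->outputs.push_back(n);
}

static void print_outputs(const char* label, const ToyNode& n) {
  std::printf("%s", label);
  for (const ToyNode* out : n.outputs) std::printf(" %s", out->name.c_str());
  std::printf("\n");
}

int main() {
  // Normal shape: from Halt we reach the NeverBranch, take the loop backedge,
  // and find the live projection while still inside the loop, so it is
  // post-visited (and attached) before the dead projection.
  ToyNode nb("NeverBranch"), region("Region"), body("body"),
          live("live"), dead("dead"), halt("Halt");
  nb.inputs     = {&region};
  live.inputs   = {&nb};
  dead.inputs   = {&nb};
  body.inputs   = {&live};
  region.inputs = {&body};        // backedge through the loop body
  halt.inputs   = {&dead};
  dfs(&halt);
  print_outputs("normal   NeverBranch outputs:", nb);   // prints: live dead

  // Abnormal shape (as in R3.java): the peeled iteration's NeverBranch has no
  // backedge, so its live projection is only reached via the second branch of
  // the shared Halt, after the dead projection was already post-visited.
  ToyNode nb2("NeverBranch"), region2("Region"),
          live2("live"), dead2("dead"), halt2("Halt");
  live2.inputs   = {&nb2};
  dead2.inputs   = {&nb2};
  region2.inputs = {&live2};
  halt2.inputs   = {&dead2, &region2};
  dfs(&halt2);
  print_outputs("abnormal NeverBranch outputs:", nb2);  // prints: dead live
  return 0;
}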