JDK 11 | JDK 17 | JDK 21 | JDK 22 | JDK 7 | JDK 8 | Other |
---|---|---|---|---|---|---|
11.0.23-oracle (Fixed) | 17.0.11-oracle (Fixed) | 21.0.2 (Fixed) | 22 b22 (Fixed) | 7u421 (Fixed) | 8u411 (Fixed) | openjdk8u422 (Fixed) |
    java -Xmx1G -XX:+IgnoreUnrecognizedVMOptions -XX:CompileCommand=quiet -XX:CompileCommand=compileonly,*Test*::* -XX:-TieredCompilation -Xcomp -XX:+UnlockDiagnosticVMOptions -XX:+StressGCM -XX:UseAVX=2 Test.java

    # A fatal error has been detected by the Java Runtime Environment:
    #
    # Internal Error (.../open/src/hotspot/share/opto/regalloc.hpp:85), pid=1681802, tid=1681816
    # assert(idx < _node_regs_max_index) failed: Exceeded _node_regs array
    #
    # JRE version: Java(TM) SE Runtime Environment (22.0+17) (fastdebug build 22-ea+17-1342)
    # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 22-ea+17-1342, compiled mode, sharing, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
    # Problematic frame:
    # V [libjvm.so+0x6ebf45] PhaseCFG::insert_goto_at(unsigned int, unsigned int)+0x695

    Current CompileTask:
    C2: 3200 109 b Test::vMeth (119 bytes)

    Stack: [0x00007f12d5057000,0x00007f12d5158000], sp=0x00007f12d5153e90, free space=1011k
    Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
    V [libjvm.so+0x6ebf45] PhaseCFG::insert_goto_at(unsigned int, unsigned int)+0x695 (regalloc.hpp:85)
    V [libjvm.so+0x6ede8c] PhaseCFG::fixup_flow()+0x1ac
    V [libjvm.so+0x9ee37d] Compile::Code_Gen()+0x4ad
    V [libjvm.so+0x9f107e] Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x1c9e
    ...

FAILURE ANALYSIS

The failure is caused by a seemingly legal but degenerate Ideal graph where around 94% of the nodes (544 out of 579 after Compile::Optimize()) are floating-point additions (AddF). On x64, these nodes, whose second operand is `inc` (see attached TestSimpler.java), are initially implemented with addF_reg_reg machine nodes. Register allocation spills `inc`, and then PhaseChaitin::fixup_spills() replaces each of the addF_reg_reg machine nodes with their memory-operand version (addF_reg_mem).
PhaseRegAlloc has allocated 1118 elements for PhaseRegAlloc::_node_regs (612 + (612 >> 1) + 200 as per PhaseRegAlloc::alloc_node_regs), but each transformation from addF_reg_reg to addF_reg_mem creates a fresh node ID (Compile::next_unique()), and Compile::_unique eventually grows beyond the size of PhaseRegAlloc::_node_regs. This finally triggers the assertion failure when PhaseRegAlloc::set_pair is called for a newly created node post-register allocation (e.g. during the target-dependent peephole phase).

The reason why this failure occurs only after JDK-8287087 is that this changeset makes it possible to detect a reduction chain that was undetectable before, when the innermost loop has been fully unrolled:

    static float test(float inc) {
        int i = 0, j = 0;
        float f = dontInline();
        while (i++ < 128) {
            f += inc; f += inc; f += inc; f += inc;
            f += inc; f += inc; f += inc; f += inc;
            f += inc; f += inc; f += inc; f += inc;
            f += inc; f += inc; f += inc; f += inc;
        }
        return f;
    }

This stronger analysis result provided by JDK-8287087 leads to the SLP early unrolling policy (SuperWord::unrolling_analysis()) requesting additional unrolling of the outermost loop, but due to limitations in the superword framework, the loop is finally not vectorized, leaving a graph with a very high density of AddF nodes (512 AddF nodes in the main loop body).

Potential solutions include:

- reusing the node ID of the replaced nodes in PhaseChaitin::fixup_spills() and/or adjusting Compile::_unique appropriately,
- resizing _node_regs on an out-of-bounds attempt (e.g. using a growable array),
- further increasing the size of _node_regs, and
- adjusting the loop unrolling policy to avoid excessive unrolling for pure reduction loops.

A temporary workaround is to use -XX:-UseCISCSpill.
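
The sizing arithmetic in the analysis can be checked with a small model. What follows is a minimal Java sketch, not HotSpot code: the class `NodeRegsBudget` and its methods are hypothetical names introduced only for illustration. It assumes that 612 is the value of the node ID counter when the table is sized, and that only the addF_reg_reg replacements consume IDs afterwards (the real compiler creates other nodes as well, so the actual numbers will differ).

```java
// Simplified model of the _node_regs sizing described in the analysis.
// All names here are hypothetical; this is not the HotSpot implementation.
class NodeRegsBudget {

    // Sizing rule quoted in the analysis: size + (size >> 1) + 200.
    static int capacity(int size) {
        return size + (size >> 1) + 200;
    }

    // Number of fresh IDs that still map to a valid table index.
    static int headroom(int size) {
        return capacity(size) - size;
    }

    // ID that the next new node would receive after `replacements` node
    // replacements. With reuseIds == false, every replacement takes a fresh
    // ID (the behaviour described in the report). With reuseIds == true,
    // the replacement inherits the ID of the node it replaces (the first
    // potential solution listed in the report).
    static int nextId(int size, int replacements, boolean reuseIds) {
        return reuseIds ? size : size + replacements;
    }

    // True if a node created after the replacements still fits in the table.
    static boolean nextNodeFits(int size, int replacements, boolean reuseIds) {
        return nextId(size, replacements, reuseIds) < capacity(size);
    }

    public static void main(String[] args) {
        int size = 612;   // assumed ID counter when the table is sized
        int addF = 544;   // AddF node count taken from the report

        System.out.println("capacity           = " + capacity(size));
        System.out.println("headroom           = " + headroom(size));
        System.out.println("next ID, fresh IDs = " + nextId(size, addF, false));
        System.out.println("fits, fresh IDs    = " + nextNodeFits(size, addF, false));
        System.out.println("fits, reused IDs   = " + nextNodeFits(size, addF, true));
    }
}
```

Under these assumptions the table has room for 506 extra IDs, which is fewer than the 544 AddF nodes, so a node created after the replacements lands outside the table unless the replacements reuse existing IDs.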