JDK-8317507 : C2 compilation fails with "Exceeded _node_regs array"

Type: Bug
Component: hotspot
Sub-Component: compiler
Affected Version: 7,8,11,17,21,22

Priority: P2
Status: Closed
Resolution: Fixed

Submitted: 2023-10-04
Updated: 2025-06-26
Resolved: 2023-10-30

Versions (Unresolved/Resolved/Fixed)

Release	Version	Status
JDK 11	11.0.23-oracle	Fixed
JDK 17	17.0.11-oracle	Fixed
JDK 21	21.0.2	Fixed
JDK 22	22 b22	Fixed
JDK 7	7u421	Fixed
JDK 8	8u411	Fixed
Other	openjdk8u422, shenandoah8u412	Fixed

Related Reports

Blocks :	JDK-8319209 - C2: make node_regs and scheduling data structures growable
Relates :	JDK-8287087 - C2: perform SLP reduction analysis on-demand
Relates :	JDK-8318703 - C2 SuperWord: take reduction nodes into account in early unrolling analysis
Relates :	JDK-8318959 - C2: define MachNode::fill_new_machnode() statically

Description

java -Xmx1G -XX:+IgnoreUnrecognizedVMOptions -XX:CompileCommand=quiet -XX:CompileCommand=compileonly,*Test*::* -XX:-TieredCompilation -Xcomp -XX:+UnlockDiagnosticVMOptions -XX:+StressGCM -XX:UseAVX=2 Test.java

# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (.../open/src/hotspot/share/opto/regalloc.hpp:85), pid=1681802, tid=1681816
#  assert(idx < _node_regs_max_index) failed: Exceeded _node_regs array
#
# JRE version: Java(TM) SE Runtime Environment (22.0+17) (fastdebug build 22-ea+17-1342)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 22-ea+17-1342, compiled mode, sharing, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x6ebf45]  PhaseCFG::insert_goto_at(unsigned int, unsigned int)+0x695

Current CompileTask:
C2:   3200  109    b        Test::vMeth (119 bytes)

Stack: [0x00007f12d5057000,0x00007f12d5158000],  sp=0x00007f12d5153e90,  free space=1011k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x6ebf45]  PhaseCFG::insert_goto_at(unsigned int, unsigned int)+0x695  (regalloc.hpp:85)
V  [libjvm.so+0x6ede8c]  PhaseCFG::fixup_flow()+0x1ac
V  [libjvm.so+0x9ee37d]  Compile::Code_Gen()+0x4ad
V  [libjvm.so+0x9f107e]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x1c9e
...

FAILURE ANALYSIS

The failure is caused by a seemingly legal but degenerate Ideal graph in which around 94% of the nodes (544 out of 579 after Compile::Optimize()) are floating-point additions (AddF). On x64, these nodes, whose second operand is `inc` (see attached TestSimpler.java), are initially implemented with addF_reg_reg machine nodes. Register allocation spills `inc`, and PhaseChaitin::fixup_spills() then replaces each addF_reg_reg machine node with its memory-operand version (addF_reg_mem). PhaseRegAlloc has allocated 1118 elements for PhaseRegAlloc::_node_regs (612 + (612 >> 1) + 200, as per PhaseRegAlloc::alloc_node_regs), but each transformation from addF_reg_reg to addF_reg_mem creates a fresh node ID (Compile::next_unique()). Compile::_unique eventually grows beyond the size of PhaseRegAlloc::_node_regs, which triggers the assertion failure when PhaseRegAlloc::set_pair is called for a node newly created after register allocation (e.g. during the target-dependent peephole phase).
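
The sizing arithmetic above can be checked with a small toy model. The sketch below is not HotSpot code: the class and method names (NodeRegsSizing, capacityFor, headroom) are hypothetical, and only the formula 612 + (612 >> 1) + 200 and the value 612 are taken from the analysis. It shows roughly how many fresh node IDs fit in the table once it has been allocated.

```java
// Toy model of the sizing arithmetic quoted in the analysis.
// Hypothetical names; this is not the HotSpot implementation.
final class NodeRegsSizing {
    // Capacity formula quoted in the analysis: n + (n >> 1) + 200,
    // where n is the number of node IDs in use at allocation time.
    static int capacityFor(int uniqueAtAllocation) {
        return uniqueAtAllocation + (uniqueAtAllocation >> 1) + 200;
    }

    // Number of fresh node IDs that can still be handed out after
    // allocation before an index falls outside the table.
    static int headroom(int uniqueAtAllocation) {
        return capacityFor(uniqueAtAllocation) - uniqueAtAllocation;
    }

    public static void main(String[] args) {
        int unique = 612; // value reported in the analysis
        System.out.println(capacityFor(unique)); // prints 1118
        System.out.println(headroom(unique));    // prints 506
    }
}
```

Under this model the table leaves room for 506 IDs beyond the 612 that existed at allocation time. That is fewer than the 512 AddF nodes reported for the main loop body, assuming each of them is replaced by a freshly numbered addF_reg_mem node.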

The reason why this failure occurs only after JDK-8287087 is that this changeset makes it possible to detect a reduction chain that was undetectable before, when the innermost loop has been fully unrolled:

    static float test(float inc) {
        int i = 0, j = 0;
        float f = dontInline();
        while (i++ < 128) {
            f += inc;
            f += inc;
            f += inc;
            f += inc;
            f += inc;
            f += inc;
            f += inc;
            f += inc;
            f += inc;
            f += inc;
            f += inc;
            f += inc;
            f += inc;
            f += inc;
            f += inc;
            f += inc;
        }
        return f;
    }

The stronger analysis result provided by JDK-8287087 leads the SLP early unrolling policy (SuperWord::unrolling_analysis()) to request additional unrolling of the outermost loop. Due to limitations in the superword framework, however, the loop is ultimately not vectorized, which leaves a graph with a very high density of AddF nodes (512 AddF nodes in the main loop body).

Potential solutions include:
- reusing the node ID of the replaced nodes in PhaseChaitin::fixup_spills() and/or adjusting Compile::_unique appropriately,
- resizing _node_regs on an out-of-bounds attempt (e.g. using a growable array),
- further increasing the size of _node_regs, and
- adjusting the loop unrolling policy to avoid excessive unrolling for pure reduction loops.
A temporary workaround is to use -XX:-UseCISCSpill.
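
As an illustration of the second option above (resizing on an out-of-bounds attempt), here is a minimal sketch of a growable register table. It is written in Java for brevity and is purely hypothetical: GrowableRegTable and its methods do not correspond to HotSpot code, and the growth factor is an arbitrary choice for the example.

```java
import java.util.Arrays;

// Hypothetical illustration of "resize on out-of-bounds access".
// Names and growth policy are assumptions, not HotSpot's.
final class GrowableRegTable {
    private int[] regs;

    GrowableRegTable(int initialCapacity) {
        regs = new int[Math.max(1, initialCapacity)];
    }

    // Store a register number for a node index, growing the table if needed
    // instead of failing a bounds assertion.
    void set(int idx, int reg) {
        if (idx >= regs.length) {
            // Grow by at least 50% so repeated growth stays amortized cheap.
            int newLen = Math.max(idx + 1, regs.length + (regs.length >> 1));
            regs = Arrays.copyOf(regs, newLen);
        }
        regs[idx] = reg;
    }

    // Indices never written read as 0 here; a real allocator would use
    // an explicit "no register" sentinel instead.
    int get(int idx) {
        return idx < regs.length ? regs[idx] : 0;
    }

    int capacity() {
        return regs.length;
    }

    public static void main(String[] args) {
        GrowableRegTable table = new GrowableRegTable(1118);
        table.set(1200, 7); // index beyond the initial 1118 entries
        System.out.println(table.get(1200));        // prints 7
        System.out.println(table.capacity() > 1200); // prints true
    }
}
```

The trade-off in such a design is that every write pays for a bounds check, in exchange for no longer depending on an up-front estimate of how many nodes later phases will create.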

Comments

[jdk8u-fix-request] Approval Request from Martin Balao Alonso
8u is affected by this bug so I'd like to request approval for a backport. The 17u patch applied with minor changes, the test works fine and risk is minimal.
22-03-2024
[jdk11u-fix-request] Approval Request from Martin Balao Alonso
11u is affected by this bug so I'd like to request approval for a backport. The 17u patch applied cleanly, the test works fine and risk is minimal.
22-03-2024
A pull request was submitted for review. URL: https://git.openjdk.org/jdk8u-dev/pull/470 Date: 2024-03-22 00:54:09 +0000
22-03-2024
A pull request was submitted for review. URL: https://git.openjdk.org/jdk11u-dev/pull/2617 Date: 2024-03-22 00:50:04 +0000
22-03-2024
A pull request was submitted for review. URL: https://git.openjdk.org/jdk17u-dev/pull/2039 Date: 2023-12-11 17:10:54 +0000
12-12-2023
[jdk17u-fix-request] Approval Request from Aleksey Shipilëv
Clean backport to fix the C2 corner case. Applies cleanly. Tests pass.
12-12-2023
[jdk21u-fix-request] Approval Request from Aleksey Shipilëv
Clean backport to fix the C2 crash. Applies cleanly. Testing passes.
02-11-2023
A pull request was submitted for review. URL: https://git.openjdk.org/jdk21u/pull/317 Date: 2023-11-01 17:05:00 +0000
01-11-2023
Changeset: a5818972 Author: Roberto Castañeda Lozano <rcastanedalo@openjdk.org> Date: 2023-10-30 12:54:03 +0000 URL: https://git.openjdk.org/jdk/commit/a5818972c16bd883d768ff2fb23a8aa9e0142c65
30-10-2023
Ok, great. Thanks for testing! Would you mind PR-ing the manual test as new jtreg testcase, so we can backport it along with this fix?
25-10-2023
Good point [~shade], the failure can indeed be triggered in earlier JDK releases (latest update releases of JDK 8, 11, and 17) using the attached program Manual.java, where the loop is manually unrolled:
$ javac Manual.java
$ java -Xcomp -XX:CompileOnly=Manual::test -XX:CompileCommand=dontinline,Manual::dontInline Manual
25-10-2023
Looking at the fix, it seems that while the failure is triggered by the more aggressive optimizations added by JDK-8287087 in JDK 21, the bug is actually more generic and can trigger in earlier JDK releases? I.e., if a user writes a heavily unrolled loop by hand?
25-10-2023
A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/16306 Date: 2023-10-23 09:48:40 +0000
23-10-2023
Yes, raising priority.
19-10-2023
I just attached a slightly further simplified version of the reproducer (TestSimpler.java), which triggers the same assert failure without stress options:
java -Xcomp -XX:CompileOnly=TestSimpler::test -XX:CompileCommand=dontinline,TestSimpler::dontInline TestSimpler.java
In light of this, would it be justified to adjust the bug priority due to higher likelihood [~thartmann]?
19-10-2023
[~rcastanedalo], could you please have a look?
04-10-2023
ILW = Crash during C2 compilation, reproducible with simple test case and stress options, -XX:-UseSuperWord = HLM = P3
04-10-2023
I attached a simplified version of the test and narrowed it down to JDK-8287087 in JDK 21 b21:
java -XX:+IgnoreUnrecognizedVMOptions -XX:CompileCommand=quiet -XX:CompileCommand=compileonly,Test::test -XX:-TieredCompilation -Xcomp -XX:+UnlockDiagnosticVMOptions -XX:+StressGCM -XX:CompileCommand=dontinline,TestSimple::dontInline -XX:UseAVX=2 TestSimple.java
04-10-2023