JDK-8316396 : Endless loop in C2 compilation triggered by AddNode::IdealIL
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 19,20,21,22
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: linux
  • CPU: x86_64
  • Submitted: 2023-09-17
  • Updated: 2024-01-22
  • Resolved: 2023-10-04
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 21 JDK 22
21.0.2Fixed 22 b18Fixed
Related Reports
Duplicate :  
Relates :  
Description
The following test failed in the JDK22 CI:

applications/javafuzzer/MediumTest.java

Here's a snippet from the log file:

#section:driver
----------messages:(7/305)----------
command: driver applications.javafuzzer.JavaFuzzerRunner -NT 50
reason: User specified action: run driver/timeout=21600 applications.javafuzzer.JavaFuzzerRunner -NT 50 
started: Sun Sep 17 14:30:35 UTC 2023
Mode: agentvm
Agent id: 2
finished: Sun Sep 17 15:51:05 UTC 2023
elapsed time (seconds): 4829.941
----------configuration:(16/2449)----------

<snip>

----------System.out:(36/2669)----------
For random generator using seed: -606735860558289215
To re-run test with same seed value please add "-Djdk.test.lib.random.seed=-606735860558289215" to command line.
Using JRuby executable: /opt/mach5/mesos/work_dir/jib-master/install/org/jruby/jruby-dist/9.2.12.0/jruby-dist-9.2.12.0-bin.zip/jruby-9.2.12.0/bin/jruby
Starting JavaFuzzer: '/bin/bash /opt/mach5/mesos/work_dir/jib-master/install/com/oracle/jpg/bigapps/javafuzzer/javafuzzer/2.0/javafuzzer-2.0.zip/mrt.sh -R /opt/mach5/mesos/work_dir/slaves/cd627e65-f015-4fb1-a1d2-b6c9b8127f98-S173928/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/e66fbedc-caa6-4705-9f20-ac0723bad48b/runs/8cf948c1-2ab1-4277-8158-230b5d2f3b44/testoutput/test-support/jtreg_closed_test_hotspot_jtreg_applications_javafuzzer_MediumTest_java/scratch/0 -NT 50 -NP 12 -A -conf config.yml'
[2023-09-17T14:30:42.145497862Z] Gathering output for process 238310
[2023-09-17T15:51:05.554525337Z] Waiting for completion for process 238310
[2023-09-17T15:51:05.554789169Z] Waiting for completion finished for process 238310
Output and diagnostic info for process 238310 was saved into 'pid-238310-output.log'

Summary of the JavaFuzzer run:
------------------------------
Host:     ubuntu-22-04-x64-759186
Tests:    12 x 50
Args:     -conf config.yml

Started  at: Sun Sep 17 14:30:42 UTC 2023


r12- 50: 33 passed, 0 crashes, 0 fails, 0 hangs, 0 incorrect tests, 17 Reference Java failures
r7- 50: 32 passed, 0 crashes, 0 fails, 0 hangs, 0 incorrect tests, 18 Reference Java failures
r4- 50: 31 passed, 0 crashes, 0 fails, 0 hangs, 0 incorrect tests, 19 Reference Java failures
r5- 50: 35 passed, 0 crashes, 0 fails, 0 hangs, 0 incorrect tests, 15 Reference Java failures
r10- 50: 31 passed, 0 crashes, 0 fails, 0 hangs, 0 incorrect tests, 19 Reference Java failures
r2- 50: 27 passed, 0 crashes, 0 fails, 0 hangs, 0 incorrect tests, 23 Reference Java failures
r1- 50: 36 passed, 0 crashes, 0 fails, 0 hangs, 0 incorrect tests, 14 Reference Java failures
r3- 50: 35 passed, 0 crashes, 0 fails, 0 hangs, 0 incorrect tests, 15 Reference Java failures
r9- 50: 30 passed, 0 crashes, 0 fails, 0 hangs, 0 incorrect tests, 20 Reference Java failures
r11- 50: 27 passed, 0 crashes, 0 fails, 0 hangs, 0 incorrect tests, 23 Reference Java failures
r8- 50: 31 passed, 0 crashes, 2 fails, 0 hangs, 0 incorrect tests, 18 Reference Java failures
r6- 50: 33 passed, 0 crashes, 0 fails, 0 hangs, 0 incorrect tests, 17 Reference Java failures

Finished at: Sun Sep 17 15:51:05 UTC 2023


[2023-09-17T15:51:05.623446277Z] Waiting for completion for process 238310
[2023-09-17T15:51:05.623697265Z] Waiting for completion finished for process 238310
----------System.err:(15/908)----------
java.lang.RuntimeException: assertEquals: expected 1 to equal 2
	at jdk.test.lib.Asserts.fail(Asserts.java:594)
	at jdk.test.lib.Asserts.assertEquals(Asserts.java:205)
	at jdk.test.lib.Asserts.assertEquals(Asserts.java:189)
	at applications.javafuzzer.JavaFuzzerRunner.reportResults(JavaFuzzerRunner.java:244)
	at applications.javafuzzer.JavaFuzzerRunner.runWithConfigurations(JavaFuzzerRunner.java:265)
	at applications.javafuzzer.JavaFuzzerRunner.main(JavaFuzzerRunner.java:283)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:333)
	at java.base/java.lang.Thread.run(Thread.java:1570)

JavaTest Message: Test threw exception: java.lang.RuntimeException
JavaTest Message: shutting down test

result: Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: assertEquals: expected 1 to equal 2


The summary above has this line:

> r8- 50: 31 passed, 0 crashes, 2 fails, 0 hangs, 0 incorrect tests, 18 Reference Java failures

so we have two failures, but no obvious info about what those
failures are...

It has been a long time since I've seen a javafuzzer failure without
hs_err_pid files. I did download the test execution artifacts and I
did not find any hs_err_pid files.
Comments
[jdk21u-fix-request] Approval Request from Aleksey Shipilëv Clean backport to fix potential C2 problem. Fix applies cleanly, but the regression test passes even without a fix.
17-10-2023

A pull request was submitted for review. URL: https://git.openjdk.org/jdk21u/pull/259 Date: 2023-10-16 19:32:21 +0000
16-10-2023

Changeset: b0d6c847 Author: Roland Westrelin <roland@openjdk.org> Date: 2023-10-04 12:06:31 +0000 URL: https://git.openjdk.org/jdk/commit/b0d6c8472f69fef0265b41a4caf9e564adf68f6e
04-10-2023

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/15923 Date: 2023-09-26 13:48:56 +0000
26-09-2023

Showed up again with another fuzzer generated test (see JDK-8316692).
22-09-2023

ILW = Endless loop during C2 compilation, reproducible with targeted test, no workaround but disable compilation of affected method = HMM = P2
18-09-2023

I attached a simplified test (Reduced.java) that reproduces the issue with: java -XX:CompileCommand=compileonly,*Reduced*::test -XX:CompileCommand=quiet -XX:-TieredCompilation -Xcomp Reduced.java The problem seems to be in this code in AddNode::IdealIL which triggers an endless cycle of node creation and deletion: https://github.com/openjdk/jdk/blob/3828dc913a3ea28d622b69bd07f26949128eb5f7/src/hotspot/share/opto/addnode.cpp#L283 #0 0x00007f7efe5f8d81 in Compile::record_dead_node (this=this@entry=0x7f7ecaff9970, idx=100419) at /oracle/jdk/open/src/hotspot/share/opto/compile.hpp:838 #1 0x00007f7efe5fe647 in Compile::record_dead_node (idx=100419, this=0x7f7ecaff9970) at /oracle/jdk/open/src/hotspot/share/gc/shared/barrierSet.hpp:163 #2 Compile::remove_useless_node (this=0x7f7ecaff9970, dead=0x7f7ed97a70a8) at /oracle/jdk/open/src/hotspot/share/opto/compile.cpp:381 #3 0x00007f7eff1428cf in PhaseIterGVN::remove_globally_dead_node (this=this@entry=0x7f7ecaff76d0, dead=0x7f7ed97a70a8, dead@entry=0x7f7ed97a6ec0) at /oracle/jdk/open/src/hotspot/share/opto/phaseX.cpp:1378 #4 0x00007f7eff1438b4 in PhaseIterGVN::remove_dead_node (dead=0x7f7ed97a6ec0, this=0x7f7ecaff76d0) at /oracle/jdk/open/src/hotspot/share/opto/phaseX.hpp:528 #5 PhaseIterGVN::subsume_node (this=this@entry=0x7f7ecaff76d0, old=old@entry=0x7f7ed97a6ec0, nn=nn@entry=0x7f7ed97a7500) at /oracle/jdk/open/src/hotspot/share/opto/phaseX.cpp:1427 #6 0x00007f7eff14be45 in PhaseIterGVN::transform_old (this=0x7f7ecaff76d0, n=<optimized out>) at /oracle/jdk/open/src/hotspot/share/opto/phaseX.cpp:1215 #7 0x00007f7efe1f409c in AddNode::IdealIL (this=0x7f7ed97a6d40, phase=0x7f7ecaff76d0, can_reshape=<optimized out>, bt=<optimized out>) at /oracle/jdk/open/src/hotspot/share/opto/node.hpp:396 #8 0x00007f7eff14bd70 in PhaseGVN::apply_ideal (can_reshape=true, k=0x7f7ed97a6d40, this=0x7f7ecaff76d0) at /oracle/jdk/open/src/hotspot/share/opto/phaseX.cpp:667 #9 PhaseIterGVN::transform_old (this=0x7f7ecaff76d0, n=0x7f7ed97a6d40) at /oracle/jdk/open/src/hotspot/share/opto/phaseX.cpp:1196 #10 0x00007f7efe1f409c in AddNode::IdealIL (this=0x7f7ed97a6c40, phase=0x7f7ecaff76d0, can_reshape=<optimized out>, bt=<optimized out>) at /oracle/jdk/open/src/hotspot/share/opto/node.hpp:396 #11 0x00007f7eff14bd70 in PhaseGVN::apply_ideal (can_reshape=true, k=0x7f7ed97a6c40, this=0x7f7ecaff76d0) at /oracle/jdk/open/src/hotspot/share/opto/phaseX.cpp:667 #12 PhaseIterGVN::transform_old (this=0x7f7ecaff76d0, n=0x7f7ed97a6c40) at /oracle/jdk/open/src/hotspot/share/opto/phaseX.cpp:1196 #13 0x00007f7efe1f40e4 in AddNode::IdealIL (this=0x7f7ed979f370, phase=0x7f7ecaff76d0, can_reshape=<optimized out>, bt=<optimized out>) at /oracle/jdk/open/src/hotspot/share/opto/node.hpp:396 #14 0x00007f7eff14bd70 in PhaseGVN::apply_ideal (can_reshape=true, k=0x7f7ed979f370, this=0x7f7ecaff76d0) at /oracle/jdk/open/src/hotspot/share/opto/phaseX.cpp:667 #15 PhaseIterGVN::transform_old (this=0x7f7ecaff76d0, n=0x7f7ed979f370) at /oracle/jdk/open/src/hotspot/share/opto/phaseX.cpp:1196 #16 0x00007f7efe1f40e4 in AddNode::IdealIL (this=0x7f7ed9787970, phase=0x7f7ecaff76d0, can_reshape=<optimized out>, bt=<optimized out>) at /oracle/jdk/open/src/hotspot/share/opto/node.hpp:396 #17 0x00007f7eff14bd70 in PhaseGVN::apply_ideal (can_reshape=true, k=0x7f7ed9787970, this=0x7f7ecaff76d0) at /oracle/jdk/open/src/hotspot/share/opto/phaseX.cpp:667 #18 PhaseIterGVN::transform_old (this=0x7f7ecaff76d0, n=0x7f7ed9787970) at /oracle/jdk/open/src/hotspot/share/opto/phaseX.cpp:1196 A simple "_dead_node_count < 100000" assert in Compile::record_dead_node triggers right away. This only reproduces after JDK-8281429 which most likely just reveals an existing issue.
18-09-2023

Similar to JDK-8308146, the problem is a timeout in the C2 stress run (-Xmx1G -Xcomp -XX:CompileOnly=*Test*::* -XX:CompileCommand=quiet -XX:-TieredCompilation -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions -XX:+StressLCM -XX:+StressGCM -XX:+StressIGVN -XX:+StressCCP -XX:StressSeed=1367170464 -XX:MaxRAMPercentage=4.16667): i15 i16 d1 = 152,2565,-4615675667265282880 i17 i18 i19 = 172,-1912974912,2 i20 i21 f2 = 0,0,-1053711611 i22 i23 i24 = 151,495,172 i25 lArr fArr = -236,7115719948,-4588933268853751808 iArr2 = 215503528 Test.instanceCount Test.iFld Test.byFld = 0,344,114 vMeth2_check_sum: -2977560974 vMeth1_check_sum: -4083294497 vMeth_check_sum: 4610681715755279636 TIME_OUT! Looks like this time it's a real bug though (endless loop during compilation), I can reproduce it locally.
18-09-2023