JDK-8366940 : Test compiler/loopopts/superword/TestAliasingFuzzer.java timed out
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 26
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2025-09-05
  • Updated: 2025-09-15
  • Resolved: 2025-09-15
JDK 26
26 master Fixed
Description
TestAliasingFuzzer.java test cases:
 - compiler/loopopts/superword/TestAliasingFuzzer.java#vanilla
 - compiler/loopopts/superword/TestAliasingFuzzer.java#random-flags

--------------------------------------------------------------------------------------------------------------

TestAliasingFuzzer.java generates 30 subtests for every run. They are randomized. Some vectorize and execute faster, some fail to vectorize and execute slower.

Hence, some natural variance in the duration is expected.
On most machines, the "Running Tests" phase takes about 30-50sec (total test time about 35-70sec). But on some machines (macosx-x64-debug), execution is slower: 60-100sec in "Running Tests", with some outliers at 110+sec. These outliers occasionally trip the 120sec timeout, and when they do, they somehow cause the harness to take an excessive 9+min to shut everything down.
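
A toy model may make the variance argument more concrete. This is only a sketch: the count of 30 subtests comes from the description above, but the two per-subtest costs (FAST_SEC, SLOW_SEC) are made-up placeholders chosen for illustration, not measurements, and the class name is hypothetical.

```java
// Toy model: "Running Tests" time as a function of how many of the
// randomized subtests happen to vectorize. Costs are hypothetical.
final class SubtestVarianceSketch {
    static final int SUBTESTS = 30;       // from the description
    static final double FAST_SEC = 2.0;   // hypothetical cost of a vectorized subtest
    static final double SLOW_SEC = 3.7;   // hypothetical cost of a non-vectorized subtest

    /** Estimated "Running Tests" duration when `vectorized` subtests vectorize. */
    static double runningTestsSeconds(int vectorized) {
        if (vectorized < 0 || vectorized > SUBTESTS) {
            throw new IllegalArgumentException("vectorized must be in [0, " + SUBTESTS + "]");
        }
        return vectorized * FAST_SEC + (SUBTESTS - vectorized) * SLOW_SEC;
    }

    public static void main(String[] args) {
        // Fewer vectorized subtests means a longer run.
        for (int v = SUBTESTS; v >= 0; v -= 10) {
            System.out.printf("%2d vectorized -> %.1f sec%n", v, runningTestsSeconds(v));
        }
    }
}
```

Under these assumed costs, a run where few subtests vectorize lands near the top of the observed range, which is the kind of outlier that gets close to the timeout.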

Solutions:
- Option 1: generate fewer tests in TestAliasingFuzzer.java. Would be sad, the test has now found 2 real bugs within 2 weeks.
- Option 2: increase test timeout. That is what I'll do.
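
For illustration, Option 2 might look roughly like the following jtreg header. This is a sketch under assumptions: the timeout value of 200 and the class name TimeoutHeaderSketch are hypothetical, and the real test header carries further tags (such as library paths) that are omitted here.

```java
/*
 * @test id=vanilla
 * @summary Sketch only: a driver action with an explicit per-action timeout.
 * @run driver/timeout=200 TimeoutHeaderSketch
 */
// Hypothetical stand-in for the real test class; the timeout value above
// is an assumption, not necessarily the value used in the actual fix.
final class TimeoutHeaderSketch {
    public static void main(String[] args) {
        // The real test would generate, compile and run its subtests here.
        System.out.println("test body would run here");
    }
}
```

The /timeout option applies to the single @run action, so the subtest count stays unchanged and only the budget for the driver action grows.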

--------------------------------------------------------------------------------------------------------------

The main logs show e.g.

Output and diagnostic info for process 85024 was saved into 'pid-85024-output.log'

Code Generation:  1.2900121
Code Compilation: 12.093837
Running Tests:    109.559235
----------System.err:(3/35)----------

JavaTest Message: Test complete.

result: Error. "driver" action timed out with a timeout of 120 seconds on agent 42
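
The three phase durations in this log can be added up and compared with the timeout. The sketch below assumes that the "driver" timeout covers all three phases and that no timeout factor is applied; the class and method names are hypothetical, and the 200sec value is just a candidate for a raised timeout.

```java
// Compares the summed phase times from the log with a timeout budget.
final class TimeoutBudgetSketch {
    /** Sum of the three phases the test prints, in seconds. */
    static double totalSeconds(double codeGen, double compile, double running) {
        return codeGen + compile + running;
    }

    /** Remaining budget in seconds; a negative value means the timeout is exceeded. */
    static double headroomSeconds(double timeoutSec, double codeGen, double compile, double running) {
        return timeoutSec - totalSeconds(codeGen, compile, running);
    }

    public static void main(String[] args) {
        // Phase times copied from the failing run in the log.
        double gen = 1.2900121, comp = 12.093837, run = 109.559235;
        System.out.printf("total:             %.1f sec%n", totalSeconds(gen, comp, run));
        System.out.printf("headroom @ 120sec: %.1f sec%n", headroomSeconds(120, gen, comp, run));
        // 200sec is a hypothetical raised timeout.
        System.out.printf("headroom @ 200sec: %.1f sec%n", headroomSeconds(200, gen, comp, run));
    }
}
```

With the logged values the total is already above 120sec before any harness overhead is counted, which is consistent with the timeout being tripped.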

The timeout handler logs and crash dumps show nothing. The driver process appears to be executing no test code at all and the main thread is doing I/O:

"tid": "3",
            "time": "2025-09-04T18:28:03.735156Z",
            "name": "main",
            "state": "RUNNABLE",
            "stack": [
              "java.base\/sun.nio.ch.Net.poll(Native Method)",
              "java.base\/sun.nio.ch.NioSocketImpl.park(NioSocketImpl.java:190)",
              "java.base\/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:279)",
              "java.base\/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:302)",
              "java.base\/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:354)",
              "java.base\/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:798)",
              "java.base\/java.net.Socket$SocketInputStream.implRead(Socket.java:981)",
              "java.base\/java.net.Socket$SocketInputStream.read(Socket.java:971)",
              "java.base\/java.io.BufferedInputStream.fill(BufferedInputStream.java:289)",
              "java.base\/java.io.BufferedInputStream.read(BufferedInputStream.java:308)",
              "java.base\/java.io.FilterInputStream.read(FilterInputStream.java:71)",
              "com.sun.javatest.regtest.agent.AgentServer.run(AgentServer.java:238)",
              "com.sun.javatest.regtest.agent.AgentServer.main(AgentServer.java:71)"

Suspicion is that this could be a test harness and/or infra issue as we have seen similar odd timeouts.
Comments
Changeset: cf00f96f Branch: master Author: Emanuel Peter <epeter@openjdk.org> Date: 2025-09-15 06:59:56 +0000 URL: https://git.openjdk.org/jdk/commit/cf00f96fd49ac7e6e04fdde74a3015531a0b59c8
15-09-2025

[~syan] Ok, thank you. It seems your failure is a bit different from what we are seeing in our CI. I summarized what we see below.
On the machines where we see the timeout, "Running Tests" takes about 60-100 sec, with outliers at 110+ sec. Those outliers eventually trip the 120sec timeout. Other platforms seem to be faster in execution, so they don't trip the timeout.
The test generates random code: in some cases it can be optimized (auto-vectorization), in some not. Every run has many tests, so on average they do not time out, but if we have many tests that are not vectorized, they just run a little slower. That's my theory for the variance.
My fix for now: Increase the timeout from 120sec to 180 or 200 sec.
12-09-2025

All failures I could see so far happened on macosx-x64. What machine did it fail on for you [~syan]? Can you find anything more in those error logs from the compilation?
I'm seeing that all of our tests seem to basically finish execution, but then somehow hang later on, maybe in JTREG shutdown/cleanup?
----------------------------------------- snippet start ------------------------------------------------------
Code Generation: 1.1313885
Code Compilation: 10.754964
Running Tests: 111.78365
----------System.err:(3/35)----------
JavaTest Message: Test complete.
result: Error. "driver" action timed out with a timeout of 120 seconds on agent 94
test result: Error. "driver" action timed out with a timeout of 120 seconds on agent 94
----------------------------------------- snippet end ------------------------------------------------------
Aaaaah, but luckily I print how long the tests are running. And they all take over 100sec. So if the timeout is at about 120: 10 sec compilation + 100 sec execution gets us very close to that timeout.
Looking at some passing cases now. Examples:
One in the upper range:
----------------------------------------- snippet start ------------------------------------------------------
Code Generation: 1.4541821
Code Compilation: 12.60625
Running Tests: 96.294525
----------System.err:(3/35)----------
----------------------------------------- snippet end ------------------------------------------------------
And in the lower range:
----------------------------------------- snippet start ------------------------------------------------------
Code Generation: 0.1162127
Code Compilation: 1.8632681
Running Tests: 60.39385
----------System.err:(3/35)----------
----------------------------------------- snippet end ------------------------------------------------------
My conclusion: The distribution is around 60-100 sec in most cases, with outliers in the 110+ range. These outliers trip over the timeout.
12-09-2025

> What machine did it fail on for you [~syan]
The failure log was copied from a GHA job. It seems to have been a Windows system. I forgot which job showed this failure.
12-09-2025

The output that [~syan] shared seems to show a failure during the compile framework compilation. But even that could be a timeout due to something else. Though compilation is one of the first things that happen, and should not take more than about 10sec. It is possible that somehow we generate malformed code, and then javac in the compile framework generates too many errors, and chokes on those?
compiler.lib.compile_framework.CompileFrameworkException: Exception in Compile Framework: Invocation target:
    at compiler.lib.compile_framework.CompileFramework.invoke(CompileFramework.java:135)
    at compiler.loopopts.superword.TestAliasingFuzzer.main(TestAliasingFuzzer.java:191)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
    at java.base/java.lang.reflect.Method.invoke(Method.java:565)
    at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:335)
    at java.base/java.lang.Thread.run(Thread.java:1474)
Caused by: java.lang.reflect.InvocationTargetException
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:119)
    at java.base/java.lang.reflect.Method.invoke(Method.java:565)
    at compiler.lib.compile_framework.CompileFramework.invoke(CompileFramework.java:131)
    ... 5 more
Caused by: compiler.lib.ir_framework.driver.TestVMException: There were one or multiple errors. Please check stderr for more information.
    at compiler.lib.ir_framework.driver.TestVMProcess.throwTestVMException(TestVMProcess.java:251)
    at compiler.lib.ir_framework.driver.TestVMProcess.checkTestVMExitCode(TestVMProcess.java:232)
    at compiler.lib.ir_framework.driver.TestVMProcess.<init>(TestVMProcess.java:77)
    at compiler.lib.ir_framework.TestFramework.runTestVM(TestFramework.java:874)
    at compiler.lib.ir_framework.TestFramework.start(TestFramework.java:834)
    at compiler.lib.ir_framework.TestFramework.start(TestFramework.java:426)
    at compiler.loopopts.superword.templated.AliasingFuzzer.main(AliasingFuzzer.java:17)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
    ... 7 more
12-09-2025

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk/pull/27257 Date: 2025-09-12 12:01:25 +0000
12-09-2025

Happened again (see JDK-8366940). Emanuel, could you please have a look?
08-09-2025

Test compiler/loopopts/superword/TestAliasingFuzzer.java#vanilla fails "TestFramework test VM exited with code 1"
05-09-2025

I don't think this is related to JDK-8366845 because the pid-XXX-output.log files of the test VM don't show any crashes. Also from the test log, it looks like the test VM finished just fine:
[2025-09-04T21:28:10.120371Z] Gathering output for process 39891
[2025-09-04T21:28:10.120815Z] Waiting for completion for process 39891
[2025-09-04T21:29:57.100999Z] Waiting for completion finished for process 39891
Output and diagnostic info for process 39891 was saved into 'pid-39891-output.log'
Code Generation: 1.1445268
Code Compilation: 10.8918705
Running Tests: 111.74652
ILW = Test times out, intermittent (3x so far), no workaround = MLH = P4
05-09-2025

[~dholmes]
> Same test showed some failures in JDK-8366845, but unclear if timeout could be related.
Yes, it could be related. So let's wait for integration of JDK-8366845 and see if it keeps happening. It is a randomized test, so the issue would be intermittent.
05-09-2025

Same test showed some failures in JDK-8366845, but unclear if timeout could be related.
05-09-2025