JDK-8352587 : C2 SuperWord: we must avoid Multiversioning for PeelMainPost loops
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 25
  • Priority: P3
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2025-03-21
  • Updated: 2025-03-24
JDK 25: Unresolved
Related Reports
Causes :  
Description
The attached Java Fuzzer test started to fail after JDK-8350756. It can be reproduced with either the original or the reduced test:

$ java -XX:CompileCommand=compileonly,*Test*::* -XX:-TieredCompilation -Xcomp -XX:PerMethodTrapLimit=0 Test.java
$ java -XX:CompileCommand=compileonly,*Reduced*::* -XX:-TieredCompilation -Xcomp -XX:PerMethodTrapLimit=0 Reduced.java

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/System/Volumes/Data/mesos/work_dir/slaves/d228d36c-581b-4156-829e-5c5a441dd0ce-S520/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/3d55c41a-1fd9-42b1-b51b-7e4e4e12782a/runs/590cfc57-d55d-4441-9694-fb7f86b887ba/workspace/open/src/hotspot/share/opto/vectorization.cpp:142), pid=12893, tid=25347
#  assert(_cl->is_multiversion_fast_loop() == (_multiversioning_fast_proj != nullptr)) failed: must find the multiversion selector IFF loop is a multiversion fast loop
#
# JRE version: Java(TM) SE Runtime Environment (25.0) (fastdebug build 25-internal-LTS-2025-03-13-1510029.leonid.mesnik.jdk-td-comp)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 25-internal-LTS-2025-03-13-1510029.leonid.mesnik.jdk-td-comp, compiled mode, sharing, compressed oops, compressed class ptrs, g1 gc, bsd-amd64)
............
Command Line: -XX:+UnlockDiagnosticVMOptions -XX:RepeatCompilation=68 -Xmx1G -XX:+IgnoreUnrecognizedVMOptions -XX:CompileCommand=quiet -XX:CompileCommand=compileonly,*Test*::* -XX:-TieredCompilation -Xcomp -XX:+UnlockDiagnosticVMOptions -XX:+StressLCM -XX:+StressGCM -XX:+StressIGVN -XX:+StressCCP -XX:+StressMacroExpansion -XX:+UnlockExperimentalVMOptions -XX:PerMethodSpecTrapLimit=0 -XX:PerMethodTrapLimit=0 -XX:+VerifyLoopOptimizations -XX:VerifyIterativeGVN=10 -XX:MaxRAMPercentage=4.16667 -Dtest.boot.jdk=/System/Volumes/Data/mesos/work_dir/jib-master/install/jdk/23/37/bundles/macos-x64/jdk-23_macos-x64_bin.tar.gz/jdk-23.jdk/Contents/Home -Djava.io.tmpdir=/System/Volumes/Data/mesos/work_dir/slaves/d228d36c-581b-4156-829e-5c5a441dd0ce-S496/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/60f6eafb-740c-4147-85e1-45e1183f6a4c/runs/32ca0ef2-d96d-43d1-8c76-c4135eb316a5/testoutput/test-support/jtreg_closed_test_hotspot_jtreg_applications_javafuzzer_LongRunningTests_java/tmp Test
............
C2:1362    5    b        Test::mainTest (673 bytes)

Stack: [0x0000700003146000,0x0000700003246000],  sp=0x0000700003241830,  free space=1006k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.dylib+0x135338e]  VMError::report(outputStream*, bool)+0x1eee  (vectorization.cpp:142)
V  [libjvm.dylib+0x1357102]  VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void const*, void const*, char const*, int, unsigned long)+0x612
V  [libjvm.dylib+0x6d9ba8]  report_vm_error(char const*, int, char const*, char const*, ...)+0xd8
V  [libjvm.dylib+0x13156e5]  VLoop::check_preconditions_helper()+0x435
V  [libjvm.dylib+0x13151ed]  VLoop::check_preconditions()+0x9d
V  [libjvm.dylib+0xe39bd1]  PhaseIdealLoop::auto_vectorize(IdealLoopTree*, VSharedData&)+0xe1
V  [libjvm.dylib+0xe19391]  PhaseIdealLoop::build_and_optimize()+0xe91
V  [libjvm.dylib+0x62cfd9]  PhaseIdealLoop::optimize(PhaseIterGVN&, LoopOptsMode)+0x89
V  [libjvm.dylib+0x62d5c0]  Compile::optimize_loops(PhaseIterGVN&, LoopOptsMode)+0x80
V  [libjvm.dylib+0x62525f]  Compile::Optimize()+0xf0f
V  [libjvm.dylib+0x622a9d]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x186d
V  [libjvm.dylib+0x4e50c0]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x200
V  [libjvm.dylib+0x645372]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0xc32
V  [libjvm.dylib+0x644317]  CompileBroker::compiler_thread_loop()+0x3d7
V  [libjvm.dylib+0xa49b80]  JavaThread::thread_main_inner()+0x1b0
V  [libjvm.dylib+0x12a089c]  Thread::call_run()+0xbc
V  [libjvm.dylib+0xfe1f83]  thread_native_entry(Thread*)+0x123
C  [libsystem_pthread.dylib+0x64e1]  _pthread_start+0x7d
C  [libsystem_pthread.dylib+0x1f6b]  thread_start+0xf

Comments
A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk/pull/24183 Date: 2025-03-24 08:22:57 +0000
24-03-2025

I investigated. There seem to be at least 2 issues, and I think both need to be addressed:
1) We call PhaseIdealLoop::maybe_multiversion_for_auto_vectorization_runtime_checks when we expect PreMainPost. But we also call it for PeelMainPost, i.e. when peel_only == true. The latter case is useless: the pattern matching later expects a pre-loop, but none is created, because we only peel a single iteration. So later we never find a pre-loop, and hence cannot find the multiversion_if. Then the OpaqueMultiversioningNode is marked useless, and the multiversion_if is removed again.
   Solution: check for peel_only == false, otherwise do not multiversion.
   Alternative: somehow try to pattern-match even if we have no pre-loop, but that could be much more difficult.
2) When we mark the OpaqueMultiversioningNode as useless and remove the multiversion_if, as well as the stalled slow_loop path, we currently still keep the main fast_loop marked as multiversioned. That is not great; we could remove the marking. If it stays marked, we later find it again in SuperWord and expect to find the multiversion_if, but of course we cannot find it because it was removed.
   Solution: unmark a multiversioned main loop in PhaseIdealLoop::eliminate_useless_multiversion_if if we cannot find the multiversion_if / OpaqueMultiversioningNode.
21-03-2025
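
The two issues described in the investigation comment above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyLoop` and every function below are hypothetical stand-ins written for illustration, not the HotSpot classes or the actual patch. The model reduces the loop to three booleans and shows how the invariant from the failing assert (the loop is marked as a multiversion fast loop if and only if the selector can be found) breaks for a peeled loop, and how either proposed solution restores it.

```cpp
#include <cassert>

// Toy model only: hypothetical stand-ins, not HotSpot code.
struct ToyLoop {
    bool multiversion_fast = false;   // the "multiversion_fast" marking on the loop
    bool has_pre_loop = false;        // true for PreMainPost, false for PeelMainPost
    bool has_multiversion_if = false; // the selector If is still in the graph
};

// Issue 1 (sketch): only multiversion when a pre-loop will be created,
// i.e. when peel_only == false.
void maybe_multiversion(ToyLoop& loop, bool peel_only) {
    loop.has_pre_loop = !peel_only;
    if (peel_only) {
        return; // later pattern matching needs a pre-loop, so do not multiversion
    }
    loop.multiversion_fast = true;
    loop.has_multiversion_if = true;
}

// In this model the selector can only be found by going through the pre-loop.
bool find_multiversion_if(const ToyLoop& loop) {
    return loop.has_pre_loop && loop.has_multiversion_if;
}

// Issue 2 (sketch): when the selector cannot be found, drop it and also
// remove the marking from the loop, so later phases do not expect a selector.
void eliminate_useless_multiversion_if(ToyLoop& loop) {
    if (!find_multiversion_if(loop)) {
        loop.has_multiversion_if = false;
        loop.multiversion_fast = false;
    }
}

// The condition checked by the failing assert, expressed on the toy model.
bool invariant_holds(const ToyLoop& loop) {
    return loop.multiversion_fast == find_multiversion_if(loop);
}
```

In this model, a loop that is still marked `multiversion_fast` after its selector was removed violates `invariant_holds`, which corresponds to the reported assertion failure. Calling `eliminate_useless_multiversion_if` on such a loop, or never marking it in the first place via the `peel_only` guard, makes the invariant hold again.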

[~chagedorn] Thanks for triaging! It looks like the multiversion_if disappears, and the multiversion_delayed_slow with it... how can that be?
...
Loop: N0/N0 has_sfpt
Loop: N200/N198 counted [4,0),-1 (-1 iters) has_sfpt
Loop: N249/N250 sfpts={ 252 }
Loop: N239/N247 counted [5,56),+1 (2147483648 iters) rc multiversion_delayed_slow has_sfpt strip_mined
Loop: N324/N332 counted [5,int),+1 (4 iters) pre rc multiversion_fast has_sfpt
Loop: N221/N220 sfpts={ 223 }
Loop: N545/N216 counted [254,33),-32 (2147483648 iters) main multiversion_fast has_sfpt strip_mined
Loop: N390/N392 counted [int,3),-2 (4 iters) post multiversion_fast has_sfpt
Unroll 32( 7) Loop: N545/N216 counted [254,33),-32 (2147483648 iters) main multiversion_fast has_sfpt strip_mined
Loop: N0/N0 has_sfpt
Loop: N200/N198 counted [4,0),-1 (-1 iters) has_sfpt
Loop: N324/N332 counted [5,int),+1 (4 iters) pre rc multiversion_fast has_sfpt
Loop: N221/N220 sfpts={ 223 }
Loop: N568/N216 counted [254,65),-64 (2147483648 iters) main multiversion_fast has_sfpt strip_mined
Loop: N390/N392 counted [int,3),-2 (4 iters) post multiversion_fast has_sfpt
Investigating...
21-03-2025

[~epeter] can you have a look?
21-03-2025

ILW = Assertion failure in Superword, single fuzzer test, -XX:-UseSuperWord or disable compilation of affected method = HLM = P3
21-03-2025