JDK-8328938 : C2 SuperWord: disable vectorization for large stride and scale
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 20,21,22,23
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2024-03-25
  • Updated: 2024-06-11
  • Resolved: 2024-04-04
JDK 21: 21.0.4, Fixed
JDK 23: 23 b17, Fixed
Description
Probably a regression of JDK-8310190.

Only hits the assert in fastdebug, not slowdebug, and product seems unaffected (no failure, results correct).

Reproduce:

java --add-modules java.base --add-exports java.base/jdk.internal.misc=ALL-UNNAMED --add-exports java.base/jdk.internal.util=ALL-UNNAMED -XX:CompileCommand=compileonly,TestMinIntScale::test* -XX:CompileCommand=printcompilation,TestMinIntScale::test* -XX:CompileCommand=TraceAutoVectorization,TestMinIntScale::test*,ALIGN_VECTOR -XX:+TraceNewVectors -Xbatch -XX:LoopUnrollLimit=1000 -XX:+TraceLoopOpts TestMinIntScale.java

# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/oracle-work/jdk-fork2/open/src/hotspot/share/utilities/powerOfTwo.hpp:76), pid=550535, tid=550549
#  assert(is_power_of_2(value)) failed: value must be a power of 2: 0xffffffff80000000
#
# JRE version: Java(TM) SE Runtime Environment (23.0) (fastdebug build 23-internal-2024-03-12-0734066.emanuel...)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 23-internal-2024-03-12-0734066.emanuel..., mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x172ebba]  SuperWord::adjust_pre_loop_limit_to_align_main_loop_vectors()+0xaba
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /oracle-work/jdk-fork2/build/linux-x64-slowdebug/jdk/bin/core.550535)
#
# An error report file with more information is saved as:
# /oracle-work/jdk-fork2/build/linux-x64-slowdebug/jdk/bin/hs_err_pid550535.log
#
# Compiler replay data is saved as:
# /oracle-work/jdk-fork2/build/linux-x64-slowdebug/jdk/bin/replay_pid550535.log
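
The value in the assert message, 0xffffffff80000000, is what a 32-bit min_int looks like after sign extension to 64 bits. The following is a minimal Java sketch of why such a value cannot pass a power-of-2 check. The class and the helper `isPowerOfTwo` are hypothetical names for illustration and are not the HotSpot code in powerOfTwo.hpp.

```java
public class PowerOfTwoCheck {
    // Hypothetical helper: true only for positive values with exactly one bit set.
    public static boolean isPowerOfTwo(long value) {
        return value > 0 && (value & (value - 1)) == 0;
    }

    public static void main(String[] args) {
        // Widening an int to long sign-extends, so min_int keeps its sign bit
        // and gains 32 leading one-bits.
        long widened = Integer.MIN_VALUE;
        System.out.println(Long.toHexString(widened)); // ffffffff80000000
        System.out.println(isPowerOfTwo(widened));     // false: the value is negative
        System.out.println(isPowerOfTwo(1L << 31));    // true: same magnitude, positive
    }
}
```

The magnitude 2^31 is a power of 2, but it is not representable as a positive int, so any code path that widens the signed value ends up with a negative number instead.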
Comments
[jdk21u-fix-request] Approval Request from Aleksey Shipilëv
Unclean backport to prevent accidents in C2 loop optimizations. The patch is unclean, because JDK 21u misses major SuperWord refactorings; 21u PR acked by Emanuel and Volker. Passes all aggressive compiler testing. Risk is usual for C2 changes, but on the lower side, as it bails out of optimizations cleanly; minor risk of performance regressions.
18-04-2024

A pull request was submitted for review. URL: https://git.openjdk.org/jdk21u-dev/pull/495 Date: 2024-04-12 08:04:17 +0000
15-04-2024

Changeset: 29314587 Author: Emanuel Peter <epeter@openjdk.org> Date: 2024-04-04 05:01:30 +0000 URL: https://git.openjdk.org/jdk/commit/2931458711244e20eb7845a1aefcf6ed4206bce1
04-04-2024

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/18485 Date: 2024-03-26 10:03:29 +0000
27-03-2024

Seems to be a regression in JDK 20 b2 (commit-search is currently broken, so I can't narrow it down): https://bugs.openjdk.org/issues/?jql=project%20%3D%20JDK%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%20%2220%22%20AND%20component%20%3D%20hotspot%20AND%20%22Resolved%20In%20Build%22%20%3D%20b02%20AND%20Subcomponent%20%3D%20compiler%20ORDER%20BY%20resolved%20DESC
26-03-2024

[~thartmann] Turns out that this reproducer fails further back, with -XX:+AlignVector. It fails with JDK20, but not with JDK19 for me. I get:

/oracle-work/jdk-20.0.2/fastdebug/bin/java --add-modules java.base --add-exports java.base/jdk.internal.misc=ALL-UNNAMED --add-exports java.base/jdk.internal.util=ALL-UNNAMED -XX:CompileCommand=printcompilation,TestMinIntScale::test* -Xbatch -XX:LoopUnrollLimit=1000 -XX:+AlignVector TestMinIntScale.java

4296 1590 % b 3 TestMinIntScale::test3 @ 16 (197 bytes)
4297 1591 b 3 TestMinIntScale::test3 (197 bytes)
4298 1592 % b 4 TestMinIntScale::test3 @ 16 (197 bytes)
4522 1593 % b 3 TestMinIntScale::test1 @ 5 (173 bytes)
4522 1594 b 3 TestMinIntScale::test1 (173 bytes)
4523 1595 % b 4 TestMinIntScale::test1 @ 5 (173 bytes)
4525 1596 b 4 TestMinIntScale::test1 (173 bytes)
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGFPE (0x8) at pc=0x00007f783bc6e21a, pid=843863, tid=843876
#
# JRE version: Java(TM) SE Runtime Environment (20.0.2+6) (fastdebug build 20.0.2+6-59)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 20.0.2+6-59, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x19d221a]  SuperWord::ref_is_alignable(SWPointer&) [clone .part.0]+0x47a
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /oracle-work/jdk-fork2/build/linux-x64-slowdebug/jdk/bin/core.843863)
#
# An error report file with more information is saved as:
# /oracle-work/jdk-fork2/build/linux-x64-slowdebug/jdk/bin/hs_err_pid843863.log
#
# Compiler replay data is saved as:
# /oracle-work/jdk-fork2/build/linux-x64-slowdebug/jdk/bin/replay_pid843863.log
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#
Aborted (core dumped)

Stack: [0x00007f77b6537000,0x00007f77b6638000], sp=0x00007f77b6631c70, free space=1003k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x19d221a]  SuperWord::ref_is_alignable(SWPointer&) [clone .part.0]+0x47a
V  [libjvm.so+0x19dafb4]  SuperWord::find_align_to_ref(Node_List&, int&)+0x194
V  [libjvm.so+0x19e436e]  SuperWord::find_adjacent_refs()+0x12e
V  [libjvm.so+0x19eba58]  SuperWord::SLP_extract()+0x498
V  [libjvm.so+0x19ebd92]  SuperWord::transform_loop(IdealLoopTree*, bool)+0x262
V  [libjvm.so+0x14afb11]  PhaseIdealLoop::build_and_optimize()+0x1361
V  [libjvm.so+0xaef581]  PhaseIdealLoop::optimize(PhaseIterGVN&, LoopOptsMode)+0x261
V  [libjvm.so+0xae964f]  Compile::Optimize()+0xe2f
V  [libjvm.so+0xaed7ae]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x18ce
V  [libjvm.so+0x8ff7c7]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x4e7
V  [libjvm.so+0xafa84c]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0xa7c
V  [libjvm.so+0xafb608]  CompileBroker::compiler_thread_loop()+0x5d8
V  [libjvm.so+0x106a046]  JavaThread::thread_main_inner()+0x206
V  [libjvm.so+0x1a702c0]  Thread::call_run()+0x100
V  [libjvm.so+0x17038b3]  thread_native_entry(Thread*)+0x103

The failing instruction is the idiv, with a zero input. I suspect it must be these lines:

int span = preloop_stride * p.scale_in_bytes();
...
if (vw % span == 0) {

If "span == 0", then the idiv from the modulo gets a division by zero -> SIGFPE.

Draft PR: https://github.com/openjdk/jdk/pull/18485
26-03-2024
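
The suspected overflow in the comment above can be reproduced with plain Java arithmetic, since Java int multiplication wraps the same way. This is only a sketch: the stride of 2 and vector width of 32 are assumed values chosen for illustration (the actual values in the failing compilation are not stated in this report), and `spanInt` / `spanLong` are hypothetical helper names.

```java
public class SpanOverflowDemo {
    // 32-bit product, analogous to "int span = preloop_stride * p.scale_in_bytes()".
    public static int spanInt(int stride, int scale) {
        return stride * scale;
    }

    // Same product computed in 64 bits, which cannot wrap for int inputs.
    public static long spanLong(int stride, int scale) {
        return (long) stride * scale;
    }

    public static void main(String[] args) {
        int stride = 2;                 // assumed value, for illustration only
        int scale = Integer.MIN_VALUE;  // the "scale = min_int" case from this report
        int vw = 32;                    // assumed vector width in bytes

        System.out.println(spanInt(stride, scale));  // 0: -2^32 wraps to zero in 32 bits
        System.out.println(spanLong(stride, scale)); // -4294967296

        try {
            System.out.println(vw % spanInt(stride, scale));
        } catch (ArithmeticException e) {
            // Java reports integer division by zero as an exception;
            // the native idiv in the VM raises SIGFPE instead.
            System.out.println("division by zero");
        }
        System.out.println(vw % spanLong(stride, scale)); // 32: well defined in 64 bits
    }
}
```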

I strongly suspect JDK-8286197, which is part of JDK20 b2. It makes certain Unsafe patterns vectorize, and I need this pattern for my reproducer.
26-03-2024

ILW = Assert during C2 compilation (regression), edge case but easy to reproduce, disable superword or compilation of affected method = HLM = P3
25-03-2024

"scale = min_int" can really only be achieved with UNSAFE; otherwise RangeChecks would not allow it to reach SuperWord. This is certainly not a very common use case. Hence, I could either forbid it from vectorizing, or handle the calculations more carefully by using "jlong" instead of "int".
25-03-2024
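
The two options mentioned in the comment above (forbid vectorization, or compute in "jlong") can be combined in a guard that widens first and then bails out on large values. The sketch below is a Java illustration under assumptions: the threshold `LIMIT`, the class name, and the method `safeToVectorize` are hypothetical and do not reproduce the actual checks in the fix.

```java
public class ScaleStrideGuard {
    // Hypothetical bound; the real fix may use a different limit and more conditions.
    static final long LIMIT = 1L << 30;

    // Returns false when scale, stride, or their product is too large to be
    // handled safely by later 32-bit alignment arithmetic.
    public static boolean safeToVectorize(int scale, int stride) {
        // Widen before taking the absolute value: Math.abs(int) of min_int is still negative.
        long absScale = Math.abs((long) scale);
        long absStride = Math.abs((long) stride);
        // Short-circuiting keeps the product below 2^60, so it cannot overflow a long.
        return absScale < LIMIT && absStride < LIMIT && absScale * absStride < LIMIT;
    }

    public static void main(String[] args) {
        System.out.println(safeToVectorize(4, 1));                 // true
        System.out.println(safeToVectorize(Integer.MIN_VALUE, 1)); // false
        System.out.println(safeToVectorize(1 << 20, 1 << 20));     // false: product is 2^40
    }
}
```

Bailing out keeps the rare UNSAFE-only case away from the alignment code entirely, at the cost of not vectorizing such loops.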