JDK-8334228 : C2 SuperWord: fix JDK-24 regression in VPointer::cmp_for_sort after JDK-8325155
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 24
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2024-06-13
  • Updated: 2024-06-17
  • Resolved: 2024-06-17
JDK 24: 24 (master), Fixed
Related Reports
Relates :  
Description
JDK-8325155 introduced a regression in VPointer::cmp_for_sort.

As a result, the VPointers are not always sorted correctly.
In product builds, this only affects performance: we may not find all possible pairs and therefore vectorize less often, but only in extreme special cases.
In debug builds, we hit the assert that verifies the sorting.
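
This report does not show the body of VPointer::cmp_for_sort, so the following is only a sketch under an assumption: that the comparator returned the difference of two int offsets. With offsets as far apart as the ones in the trace in this report, such a difference does not fit in an int and wraps around, which would order the large positive offsets before the large negative ones. The class and method names are hypothetical and do not come from HotSpot.

```java
import java.util.Arrays;
import java.util.Comparator;

final class OffsetSortSketch {
    // Hypothetical comparator (an assumption, not the HotSpot code): returns
    // the int difference, which wraps around when the true difference is
    // larger than Integer.MAX_VALUE.
    static int compareBySubtraction(int a, int b) {
        return a - b;
    }

    // Overflow-safe alternative: compares without computing a difference.
    static int compareSafely(int a, int b) {
        return Integer.compare(a, b);
    }

    // Sorts a copy so the input array is left untouched.
    static Integer[] sortedCopy(Integer[] offsets, Comparator<Integer> cmp) {
        Integer[] copy = offsets.clone();
        Arrays.sort(copy, cmp);
        return copy;
    }

    public static void main(String[] args) {
        // Four offsets taken from the VLoopVPointers::print output in this report.
        Integer[] offsets = { -1077936112, 1077936144, -1077936108, 1077936148 };

        // The true difference is 2155872256, which does not fit in an int,
        // so the result is negative and claims "positive < negative".
        System.out.println(compareBySubtraction(1077936144, -1077936112));

        // Positive offsets end up first with the subtraction comparator.
        System.out.println(Arrays.toString(
                sortedCopy(offsets, OffsetSortSketch::compareBySubtraction)));

        // Ascending order with the overflow-safe comparator.
        System.out.println(Arrays.toString(
                sortedCopy(offsets, OffsetSortSketch::compareSafely)));
    }
}
```

The order produced by the subtraction comparator in this sketch (positive offsets first, then negative offsets) has the same shape as the group printed by SuperWord::create_adjacent_memop_pairs in this report.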

/oracle-work/jdk-fork2/build/linux-x64-debug/jdk/bin/java -XX:CompileCommand=compileonly,TestA::* -XX:CompileCommand=printcompilation,TestA::* -XX:CompileCommand=TraceAutoVectorization,*::test*,PRECONDITIONS,BODY,SW_REJECTIONS,POINTERS,SW_INFO,SW_ADJACENT_MEMOPS -XX:+TraceNewVectors -XX:+TraceLoopOpts -Xcomp -XX:-TieredCompilation TestA.java

VLoopVPointers::print:
  VPointer[mem: 1013     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936112) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1006     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936144) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1005     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936108) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1004     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936148) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1003     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936104) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1002     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936152) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1001     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936100) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1000     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936156) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  999     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936096) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  998     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936160) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  997     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936092) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  996     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936164) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  995     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936088) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  994     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936168) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  993     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936084) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  992     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936172) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  860     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936080) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  847     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936176) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  846     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936076) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  845     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936180) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  844     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936072) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  837     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936184) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  836     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936068) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  835     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936188) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  727     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936064) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  720     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936192) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  719     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936060) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  718     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936196) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  590     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936056) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  580     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936200) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  121     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936052) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  144     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936204) + invar[1164] + scale(   4) * iv]

SuperWord::transform_loop:
    Loop: N1109/N155  counted [int,985),+16 (2147483648 iters)  main has_sfpt strip_mined
 1109  CountedLoop  === 1109 174 155  [[ 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1013 1075 1109 835 1113 836 837 844 845 846 847 860 121 718 719 720 727 170 580 590 144 ]] inner stride: 16 main of N1109 strip mined !orig=[898],[749],[597],[175],[166],[71] !jvms: TestA::test206 @ bci:12 (line 18)

SuperWord::create_adjacent_memop_pairs:
 group:
  VPointer[mem: 1006     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936144) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1004     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936148) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1002     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936152) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1000     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936156) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  998     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936160) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  996     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936164) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  994     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936168) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  992     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936172) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  847     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936176) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  845     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936180) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  837     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936184) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  835     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936188) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  720     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936192) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  718     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936196) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  580     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936200) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  144     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936204) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1013     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936112) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1005     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936108) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1003     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936104) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1001     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936100) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  999     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936096) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  997     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936092) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  995     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936088) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  993     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936084) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  860     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936080) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  846     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936076) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  844     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936072) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  836     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936068) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  727     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936064) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  719     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936060) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  590     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936056) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  121     StoreI, base:   96, adr:   96,  base[  96] + offset(-1077936052) + invar[1164] + scale(   4) * iv]
 pair:
  VPointer[mem: 1006     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936144) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1004     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936148) + invar[1164] + scale(   4) * iv]
 pair:
  VPointer[mem: 1004     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936148) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1002     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936152) + invar[1164] + scale(   4) * iv]
 pair:
  VPointer[mem: 1002     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936152) + invar[1164] + scale(   4) * iv]
  VPointer[mem: 1000     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936156) + invar[1164] + scale(   4) * iv]
 pair:
  VPointer[mem: 1000     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936156) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  998     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936160) + invar[1164] + scale(   4) * iv]
 pair:
  VPointer[mem:  998     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936160) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  996     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936164) + invar[1164] + scale(   4) * iv]
 pair:
  VPointer[mem:  996     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936164) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  994     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936168) + invar[1164] + scale(   4) * iv]
 pair:
  VPointer[mem:  994     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936168) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  992     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936172) + invar[1164] + scale(   4) * iv]
 pair:
  VPointer[mem:  992     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936172) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  847     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936176) + invar[1164] + scale(   4) * iv]
 pair:
  VPointer[mem:  847     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936176) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  845     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936180) + invar[1164] + scale(   4) * iv]
 pair:
  VPointer[mem:  845     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936180) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  837     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936184) + invar[1164] + scale(   4) * iv]
 pair:
  VPointer[mem:  837     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936184) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  835     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936188) + invar[1164] + scale(   4) * iv]
 pair:
  VPointer[mem:  835     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936188) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  720     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936192) + invar[1164] + scale(   4) * iv]
 pair:
  VPointer[mem:  720     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936192) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  718     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936196) + invar[1164] + scale(   4) * iv]
 pair:
  VPointer[mem:  718     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936196) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  580     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936200) + invar[1164] + scale(   4) * iv]
 pair:
  VPointer[mem:  580     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936200) + invar[1164] + scale(   4) * iv]
  VPointer[mem:  144     StoreI, base:   96, adr:   96,  base[  96] + offset(1077936204) + invar[1164] + scale(   4) * iv]
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/oracle-work/jdk-fork2/open/src/hotspot/share/opto/superword.cpp:588), pid=3446714, tid=3446728
#  assert(p1->offset_in_bytes() <= p2->offset_in_bytes()) failed: must be sorted by offset
#
# JRE version: Java(TM) SE Runtime Environment (24.0) (fastdebug build 24-internal-2024-06-13-0840142.emanuel...)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 24-internal-2024-06-13-0840142.emanuel..., compiled mode, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x17250de]  SuperWord::create_adjacent_memop_pairs_in_one_group(GrowableArray<VPointer const*> const&, int, int)+0x56e
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /oracle-work/jdk-fork2/build/linux-x64-debug/jdk/bin/core.3446714)
#
# An error report file with more information is saved as:
# /oracle-work/jdk-fork2/build/linux-x64-debug/jdk/bin/hs_err_pid3446714.log
#
# Compiler replay data is saved as:
# /oracle-work/jdk-fork2/build/linux-x64-debug/jdk/bin/replay_pid3446714.log
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#
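
The failing assert requires that, within a group, each offset is no larger than the offset that follows it. A minimal sketch of that check over plain ints is shown here; the class and method names are hypothetical, and the real code in superword.cpp operates on VPointer objects rather than int arrays.

```java
final class SortedByOffsetCheck {
    // Returns the index i of the first adjacent pair (i, i + 1) that violates
    // offsets[i] <= offsets[i + 1], or -1 if the array is sorted by offset.
    static int firstUnsortedPair(int[] offsets) {
        for (int i = 0; i + 1 < offsets.length; i++) {
            if (offsets[i] > offsets[i + 1]) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Four consecutive offsets from the printed group (mem 580, 144, 1013, 1005).
        int[] group = { 1077936200, 1077936204, -1077936112, -1077936108 };

        // The pair (1077936204, -1077936112) is the first one out of order.
        System.out.println(firstUnsortedPair(group));
    }
}
```

This matches the trace: pairs are printed up to (mem 580, mem 144), and the next candidate pair would combine offset 1077936204 with offset -1077936112.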
Comments
Changeset: 29b63928
Author: Emanuel Peter <epeter@openjdk.org>
Date: 2024-06-17 06:58:55 +0000
URL: https://git.openjdk.org/jdk/commit/29b63928387a8b6ab387057cb3eac4771b1bfff1
17-06-2024

ILW = Performance regression in product builds, sanity assert in debug builds; edge case?; disable compilation of the affected method or disable SuperWord with -XX:-UseSuperWord = MLM = P4
13-06-2024

A pull request was submitted for review.
URL: https://git.openjdk.org/jdk/pull/19696
Date: 2024-06-13 13:05:40 +0000
13-06-2024