JDK-8323582 : C2 SuperWord AlignVector: misaligned vector memory access with unaligned native memory
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 20,21,22,23
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2024-01-11
  • Updated: 2024-11-21
Other: tbd (Unresolved)
Description
With SuperWord AlignVector, we analyze the mem_ref's pointer (VPointer).
Since arrays are objects, they are 8-byte aligned (ObjectAlignmentInBytes), so we can ignore the base of the address (the address of the array header).
But if we do not use an array and instead get an address to native memory allocated with Unsafe.allocateMemory, we can do arbitrary pointer arithmetic on that address and pass the result as an unaligned base. This breaks the assumption, and we can get misaligned vector memory accesses even when AlignVector forbids them.
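
To make the broken assumption concrete, here is a minimal sketch in plain Java. It is not HotSpot code: the class name, method names, and addresses are hypothetical, and it only models the arithmetic. It compares the alignment computed from the offset alone (base assumed aligned) with the alignment of the real address.

```java
// Hypothetical model of the alignment arithmetic; not HotSpot code.
final class BaseAlignmentModel {
    // Alignment the analysis reports when it ignores the base,
    // i.e. when it treats the base as a multiple of alignWidth.
    static long assumedAlignment(long offset, long alignWidth) {
        return Math.floorMod(offset, alignWidth);
    }

    // Alignment of the address that is actually accessed.
    static long actualAlignment(long base, long offset, long alignWidth) {
        return Math.floorMod(base + offset, alignWidth);
    }

    public static void main(String[] args) {
        long alignWidth = 8;            // bytes, illustrative value
        long arrayBase = 0x1000;        // stands in for an aligned array object
        long nativeBase = 0x1000 + 1;   // stands in for a native address plus 1

        for (long offset = 0; offset <= 8; offset += 4) {
            System.out.println("offset=" + offset
                + " assumed=" + assumedAlignment(offset, alignWidth)
                + " array=" + actualAlignment(arrayBase, offset, alignWidth)
                + " native=" + actualAlignment(nativeBase, offset, alignWidth));
        }
    }
}
```

For the aligned base the assumed and actual values agree. For the shifted native base the assumed value still reports 0 at offsets 0 and 8, while the real address is off by one byte.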

Reproduce with JDK23, after JDK-8310190 (which added a runtime assert to verify that the address is aligned):
./java --add-modules java.base --add-exports java.base/jdk.internal.misc=ALL-UNNAMED -XX:CompileCommand=compileonly,Test::test* -XX:+TraceLoopOpts -XX:+TraceSuperWord -XX:+AlignVector -XX:+VerifyAlignVector Test.java

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/oracle-work/jdk-fork0/open/src/hotspot/cpu/x86/macroAssembler_x86.cpp:831), pid=3090218, tid=3090219
#  fatal error: DEBUG MESSAGE: verify_vector_alignment found a misaligned vector memory access
#
# JRE version: Java(TM) SE Runtime Environment (23.0) (fastdebug build 23-internal-2024-01-10-0740221.emanuel...)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 23-internal-2024-01-10-0740221.emanuel..., mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x12eca65]  MacroAssembler::debug64(char*, long, long*)+0x45
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /oracle-work/jdk-fork0/build/linux-x64-debug/jdk/bin/core.3090218)
#
# An error report file with more information is saved as:
# /oracle-work/jdk-fork0/build/linux-x64-debug/jdk/bin/hs_err_pid3090218.log
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#
Aborted (core dumped)

Older versions JDK20-JDK22:
/oracle-work/jdk-22/fastdebug/bin/java --add-modules java.base --add-exports java.base/jdk.internal.misc=ALL-UNNAMED -XX:CompileCommand=compileonly,Test::test* -XX:+TraceLoopOpts -XX:+TraceSuperWord -XX:+AlignVector Test.java

This appears to run fine, but the computed alignment is wrong and the code performs misaligned vector memory accesses. We just don't fail on x64 since it does not actually require strict alignment. In the log, the alignment/offset looks aligned, but only because we assume the base to be aligned.

After find_adjacent_refs
packset
Pack: 0
 align: 0 	 417  StoreI  === 445 448 420 418  [[ 414 416 ]]  @rawptr:BotPTR, idx=Raw; unaligned unsafe  Memory: @rawptr:BotPTR, idx=Raw; !orig=343,291,248,123,262 !jvms: Test::test @ bci:38 (line 16)
 align: 4 	 414  StoreI  === 445 417 422 415  [[ 411 413 ]]  @rawptr:BotPTR, idx=Raw; unaligned unsafe  Memory: @rawptr:BotPTR, idx=Raw; !orig=340,288,123,262 !jvms: Test::test @ bci:38 (line 16)

I wonder if this has an impact on machines that do require AlignVector, such as ARM32? A first investigation with [~fgao] suggested that it does not reproduce there, since the loop did not vectorize for some other reason. Maybe the reason is that unaligned memory accesses are not intrinsified into simple loads / stores?

JDK19 and older versions do not vectorize this code anyway, so they are not affected.
Comments
This machinery can then also be used for aliasing analysis, with a fast path and a slow path. At first, we could speculatively compile only the fast path, trap if the condition does not hold, and then recompile with both a fast path and a slow path.
11-11-2024

I have a plan for this, discussed with [~chagedorn]: I will add a new predicate that checks if the native memory address is alignable. If it is not, we trap. Later, we can also add unswitching, so that the failing case goes into a slow path with a loop that is not vectorized.
11-11-2024
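
A rough model of what such an "alignable" check could compute, written as plain Java rather than C2 code. All names here are hypothetical, and the gcd-based condition is an assumption about what "alignable" means: some pre-loop iteration count k must exist with (base + k * stride) % alignWidth == 0, which is solvable exactly when gcd(stride, alignWidth) divides base.

```java
// Hypothetical model of a runtime alignability predicate; not the C2 implementation.
final class AlignablePredicateModel {
    static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    // True if some k >= 0 gives (base + k * strideBytes) % alignWidth == 0.
    // Assumes strideBytes > 0 and alignWidth > 0.
    static boolean isAlignable(long base, long strideBytes, long alignWidth) {
        return Math.floorMod(base, gcd(strideBytes, alignWidth)) == 0;
    }

    // Smallest number of scalar pre-loop iterations that reaches an aligned
    // address, or -1 if the address can never be aligned.
    static long preLoopIterations(long base, long strideBytes, long alignWidth) {
        if (!isAlignable(base, strideBytes, alignWidth)) {
            return -1;
        }
        long k = 0;
        while (Math.floorMod(base + k * strideBytes, alignWidth) != 0) {
            k++;
        }
        return k;
    }

    // Models the later unswitching step: vectorized fast path if the check
    // passes, scalar slow path otherwise. In the first step the failing case
    // would trap instead of taking a slow path.
    static String choosePath(long base, long strideBytes, long alignWidth) {
        return isAlignable(base, strideBytes, alignWidth) ? "vectorized" : "scalar";
    }

    public static void main(String[] args) {
        long stride = 4;      // int accesses
        long alignWidth = 64; // illustrative vector width in bytes
        for (long base : new long[] {0x1000, 0x1004, 0x1001}) {
            System.out.println("base=0x" + Long.toHexString(base)
                + " path=" + choosePath(base, stride, alignWidth)
                + " preLoop=" + preLoopIterations(base, stride, alignWidth));
        }
    }
}
```

With a 4-byte stride, a base that is off by 4 bytes can still be aligned by running scalar pre-loop iterations, while a base that is off by 1 byte can never reach an aligned address, so it must take the non-vectorized route.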

And yes, we have the same issue with MemorySegment. See Test2.java
./java -XX:CompileCommand=compileonly,Test2::test1 -XX:CompileCommand=printcompilation,Test2::* -XX:CompileCommand=TraceAutoVectorization,*::*,PRECONDITIONS,BODY,SW_REJECTIONS -XX:+TraceNewVectors -Xbatch -XX:+TraceLoopOpts -XX:+AlignVector -XX:+VerifyAlignVector Test2.java

We vectorize:
TraceNewVectors [AutoVectorization]: 2673 Replicate === _ 37 [[ ]] #vectorz<I,16>
TraceNewVectors [AutoVectorization]: 2674 LoadVector === 2559 2571 2550 [[ ]] @rawptr:BotPTR, idx=Raw; mismatched #vectorz<I,16> (does not depend only on test, unknown control)
TraceNewVectors [AutoVectorization]: 2675 AddVI === _ 2674 2673 [[ ]] #vectorz<I,16>
TraceNewVectors [AutoVectorization]: 2676 StoreVector === 2559 2571 2550 2675 [[ ]] @rawptr:BotPTR, idx=Raw; mismatched Memory: @rawptr:BotPTR, idx=Raw;

And then hit the Halt node from VerifyAlignVector:
# fatal error: DEBUG MESSAGE: verify_vector_alignment found a misaligned vector memory access
11-11-2024

ILW = Crash with VerifyAlignVector in debug VM and possibly with product VM on platforms requiring strict alignment such as ARM32; only with VerifyAlignVector or on ARM32 (not observed yet); disable VerifyAlignVector and/or AlignVector if possible, or disable compilation of the affected method = MLM = P4
12-01-2024

I worked from this test:
https://github.com/openjdk/jdk/blame/master/test/hotspot/jtreg/compiler/c2/irTests/TestVectorizationMismatchedAccess.java
Which was added with JDK-8300258:
https://github.com/openjdk/jdk/commit/dc523a58a6ece87e5865bea0342415a969172c77

Hence there is probably some interest in vectorizing loops over memory allocated with Unsafe.allocateMemory.
The question is how much we care about vectorizing with "-XX:+AlignVector"? Do these Unsafe memory accesses even get vectorized on the platforms where AlignVector matters?

I see these possible responses:
1) Ignore this failure (Won't fix). This only works if there is no platform where it matters.
2) Disable vectorization for Unsafe memory addresses where we do not have an object base address (e.g. Unsafe.allocateMemory) if "-XX:+AlignVector".
3) Add a runtime check instead, and de-opt and disable vectorization for that method if the check ever fails.

Further question: Could similar bugs happen with MemorySegment?
11-01-2024

Digging into the example, we see that the Loads/Stores look as follows:

(rr) p mem_ref->dump_bfs(10,0,"#do")
dist dump
---------------------------------------------
8 212 AddI === _ 213 29 [[ 205 213 226 231 ]] !orig=164,[124] !jvms: Test::test @ bci:41 (line 14)
8 22 ConI === 0 [[ 213 45 169 ]] #int:0
7 213 Phi === 214 22 212 [[ 211 212 276 ]] #int:0..1073741822:www #tripcount !orig=94 !jvms: Test::test @ bci:11 (line 15)
6 388 ConI === 0 [[ 164 438 ]] #int:16
6 276 CastII === 234 213 [[ 278 ]] #int:0..1073741822:www unconditional dependency
5 164 AddI === _ 454 388 [[ 454 165 173 226 ]] !orig=[124] !jvms: Test::test @ bci:41 (line 14)
5 278 AddI === _ 276 29 [[ 454 ]] !orig=[235]
4 27 ConI === 0 [[ 448 182 211 ]] #int:2
4 454 Phi === 450 278 164 [[ 448 164 ]] #int:1..1073741822:www #tripcount !orig=[366],[307],257,94 !jvms: Test::test @ bci:11 (line 15)
4 3 Start === 3 0 [[ 3 5 6 7 8 9 10 ]] #{0:control, 1:abIO, 2:memory, 3:rawptr:BotPTR, 4:return_address, 5:long, 6:half}
3 448 LShiftI === _ 454 27 [[ 426 429 432 435 438 441 444 447 476 329 301 275 389 360 357 354 ]] !orig=[361],[302],[255],[101] !jvms: Test::test @ bci:18 (line 15)
3 10 Parm === 3 [[ 173 107 ]] Parm0: long !jvms: Test::test @ bci:-1 (line 14)
3 0 Root === 0 158 [[ 0 1 3 22 23 24 27 29 37 58 69 80 128 127 392 391 390 274 388 385 330 328 394 396 467 469 472 475 477 478 479 480 481 482 483 484 ]]
2 29 ConI === 0 [[ 278 116 42 423 417 420 241 178 183 251 207 212 292 295 339 342 345 348 402 405 408 411 414 ]] #int:1
2 424 LoadI === 450 453 425 [[ 423 ]] @rawptr:BotPTR, idx=Raw; unaligned unsafe #int (does not depend only on test, unknown control) !orig=349,296,252,111 !jvms: Test::test @ bci:21 (line 15)
2 426 ConvI2L === _ 448 [[ 425 ]] #long:minint..maxint:www !orig=351,298,254,102 !jvms: Test::test @ bci:19 (line 15)
2 107 CastX2P === _ 10 [[ 108 180 209 253 297 299 350 352 355 358 425 427 430 433 436 439 442 445 ]] !jvms: Test::test @ bci:21 (line 15)
2 1 Con === 0 [[ ]] #top
1 423 AddI === _ 424 29 [[ 422 ]] !orig=348,295,251,116 !jvms: Test::test @ bci:37 (line 16)
1 425 AddP === _ 1 107 426 [[ 422 424 ]] !orig=350,297,253,108 !jvms: Test::test @ bci:21 (line 15)
0 422 StoreI === 450 453 425 423 [[ 419 421 ]] @rawptr:BotPTR, idx=Raw; unaligned unsafe Memory: @rawptr:BotPTR, idx=Raw; !orig=347,294,250,123,265 !jvms: Test::test @ bci:38 (line 16)

What is noticeable is that the "425 AddP === _ 1 107 426" has TOP as its base address, and directly takes the unaligned base address: "107 CastX2P === _ 10".

(rr) p mem_ref_p.base()->dump()
1 Con === 0 [[ ]] #top
(rr) p mem_ref_p.adr()->dump()
107 CastX2P === _ 10 [[ 108 180 209 253 297 299 350 352 355 358 425 427 430 433 436 439 442 445 ]] !jvms: Test::test @ bci:21 (line 15)

In the definition of VPointer:
Node* _base; // null if unsafe nonheap reference
Node* _adr; // address pointer
11-01-2024