JDK-8358951 : C2: IntVector.fromMemorySegment does not vectorize with byte[] (performance regression)
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 23,24,25,26
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2025-06-09
  • Updated: 2025-06-10
Version table (Other): tbd, Unresolved
Description
Please see the attached Test.java

It seems that a MemorySegment backed by int[] or by native memory does produce vector code, but one backed by byte[] does not. I have not investigated other primitive array types yet.

It would be good to investigate the combination of all Vector types with all backing memory types, and to write an IR test and a benchmark for each. To go even further: what happens when you combine different backing memory types?
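
As a rough sketch of the scope of such an investigation (not part of the attached tests), the combinations could be enumerated as below. The class name VectorBackingMatrix and the case labels are hypothetical. The vector types are the six element-type classes in jdk.incubator.vector, and the backings are the array types accepted by MemorySegment.ofArray plus native memory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: enumerates (vector type, load backing, store backing) cases.
class VectorBackingMatrix {
    // Element-type vector classes in jdk.incubator.vector.
    static final List<String> VECTOR_TYPES = List.of(
        "ByteVector", "ShortVector", "IntVector",
        "LongVector", "FloatVector", "DoubleVector");

    // Array types accepted by MemorySegment.ofArray, plus native memory.
    static final List<String> BACKINGS = List.of(
        "byte[]", "char[]", "short[]", "int[]",
        "long[]", "float[]", "double[]", "native");

    // mixBackings == false: load and store use the same backing type.
    // mixBackings == true: every load backing is paired with every store backing.
    static List<String> cases(boolean mixBackings) {
        List<String> out = new ArrayList<>();
        for (String vectorType : VECTOR_TYPES) {
            for (String src : BACKINGS) {
                for (String dst : BACKINGS) {
                    if (!mixBackings && !src.equals(dst)) {
                        continue;
                    }
                    out.add(vectorType + ": load " + src + " -> store " + dst);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String c : cases(false)) {
            System.out.println(c);
        }
        System.out.println("same backing: " + cases(false).size() + " cases");
        System.out.println("mixed backing: " + cases(true).size() + " cases");
    }
}
```

With identical load and store backing this gives 6 x 8 = 48 cases. Allowing different backings for load and store gives 6 x 8 x 8 = 384 cases, which may argue for generating the IR tests rather than writing them by hand.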

This kind of investigation would be especially important if we want to replace some intrinsics that are currently written in platform-specific assembly with Java code that uses MemorySegment and the Vector API.

I ran the experiment like this:

[empeter@emanuel bin]$ ./java -XX:CompileCommand=compileonly,Test::testI -XX:CompileCommand=printcompilation,Test::* -Xbatch -XX:+PrintIdeal Test.java | grep LoadVector
 1388  LoadVector  === 1353 7 1387  |543  [[ 1088 1413 1153 765 1215 1233 887 1136 1122 902 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=16; mismatched #vectord<I,2> !jvms: ScopedMemoryAccess::loadFromMemorySegmentScopedInternal @ bci:29 (line 358) ScopedMemoryAccess::loadFromMemorySegment @ bci:14 (line 335) IntVector::fromMemorySegment0Template @ bci:33 (line 3564) Int64Vector::fromMemorySegment0 @ bci:3 (line 957) IntVector::fromMemorySegment @ bci:31 (line 3196) Test::testI @ bci:10 (line 28)
[empeter@emanuel bin]$ ./java -XX:CompileCommand=compileonly,Test::testB -XX:CompileCommand=printcompilation,Test::* -Xbatch -XX:+PrintIdeal Test.java | grep LoadVector
[empeter@emanuel bin]$ ./java -XX:CompileCommand=compileonly,Test::testN -XX:CompileCommand=printcompilation,Test::* -Xbatch -XX:+PrintIdeal Test.java | grep LoadVector
 1237  LoadVector  === 1205 7 1235  [[ 814 1263 1049 692 1063 1080 1015 829 ]]  @rawptr:BotPTR, idx=Raw; mismatched #vectord<I,2> (does not depend only on test, raw access) !jvms: ScopedMemoryAccess::loadFromMemorySegmentScopedInternal @ bci:29 (line 358) ScopedMemoryAccess::loadFromMemorySegment @ bci:14 (line 335) IntVector::fromMemorySegment0Template @ bci:33 (line 3564) Int64Vector::fromMemorySegment0 @ bci:3 (line 957) IntVector::fromMemorySegment @ bci:31 (line 3196) Test::testN @ bci:10 (line 38)
[empeter@emanuel bin]$ ./java -XX:CompileCommand=compileonly,Test::test* -XX:CompileCommand=printcompilation,Test::* -Xbatch -XX:+PrintIdeal Test.java | grep LoadVector
 1196  LoadVector  === 1193 1194 1191  [[ 809 1230 1042 1068 1048 672 1028 994 794 ]]  @rawptr:BotPTR, idx=Raw; mismatched #vectord<I,2> (does not depend only on test, raw access) !jvms: ScopedMemoryAccess::loadFromMemorySegmentScopedInternal @ bci:29 (line 358) ScopedMemoryAccess::loadFromMemorySegment @ bci:14 (line 335) IntVector::fromMemorySegment0Template @ bci:33 (line 3564) Int64Vector::fromMemorySegment0 @ bci:3 (line 957) IntVector::fromMemorySegment @ bci:31 (line 3196) Test::testI @ bci:10 (line 28)
 1196  LoadVector  === 1193 1194 1191  [[ 809 1230 1042 1068 1048 672 1028 994 794 ]]  @rawptr:BotPTR, idx=Raw; mismatched #vectord<I,2> (does not depend only on test, raw access) !jvms: ScopedMemoryAccess::loadFromMemorySegmentScopedInternal @ bci:29 (line 358) ScopedMemoryAccess::loadFromMemorySegment @ bci:14 (line 335) IntVector::fromMemorySegment0Template @ bci:33 (line 3564) Int64Vector::fromMemorySegment0 @ bci:3 (line 957) IntVector::fromMemorySegment @ bci:31 (line 3196) Test::testB @ bci:10 (line 33)
 1196  LoadVector  === 1193 1194 1191  [[ 809 1230 1042 1068 1048 672 1028 994 794 ]]  @rawptr:BotPTR, idx=Raw; mismatched #vectord<I,2> (does not depend only on test, raw access) !jvms: ScopedMemoryAccess::loadFromMemorySegmentScopedInternal @ bci:29 (line 358) ScopedMemoryAccess::loadFromMemorySegment @ bci:14 (line 335) IntVector::fromMemorySegment0Template @ bci:33 (line 3564) Int64Vector::fromMemorySegment0 @ bci:3 (line 957) IntVector::fromMemorySegment @ bci:31 (line 3196) Test::testN @ bci:10 (line 38)
[empeter@emanuel bin]$ ./java -XX:CompileCommand=compileonly,Test::testI -XX:CompileCommand=printcompilation,Test::* -Xbatch -XX:+PrintIdeal Test.java | grep StoreVector
 1039  ConP  === 0  [[ 1088 1233 1215 1153 1136 1122 ]]  #jdk/incubator/vector/IntVector$$Lambda+0x000000000a0fa4c8 (jdk/internal/vm/vector/VectorSupport$StoreVectorOperation):exact *  Oop:jdk/incubator/vector/IntVector$$Lambda+0x000000000a0fa4c8 (jdk/internal/vm/vector/VectorSupport$StoreVectorOperation):exact *
 1413  StoreVector  === 1313 7 1410 1388  |1212  [[ 1412 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=16; mismatched  Memory: @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=16; !jvms: ScopedMemoryAccess::storeIntoMemorySegmentScopedInternal @ bci:29 (line 441) ScopedMemoryAccess::storeIntoMemorySegment @ bci:15 (line 418) IntVector::intoMemorySegment0 @ bci:32 (line 3663) IntVector::intoMemorySegment @ bci:44 (line 3443) Test::testI @ bci:22 (line 29)
[empeter@emanuel bin]$ ./java -XX:CompileCommand=compileonly,Test::testB -XX:CompileCommand=printcompilation,Test::* -Xbatch -XX:+PrintIdeal Test.java | grep StoreVector
 1039  ConP  === 0  [[ 1268 1233 1215 1153 1136 1122 1088 ]]  #jdk/incubator/vector/IntVector$$Lambda+0x000000006d0fa4c8 (jdk/internal/vm/vector/VectorSupport$StoreVectorOperation):exact *  Oop:jdk/incubator/vector/IntVector$$Lambda+0x000000006d0fa4c8 (jdk/internal/vm/vector/VectorSupport$StoreVectorOperation):exact *
[empeter@emanuel bin]$ ./java -XX:CompileCommand=compileonly,Test::testN -XX:CompileCommand=printcompilation,Test::* -Xbatch -XX:+PrintIdeal Test.java | grep StoreVector
  966  ConP  === 0  [[ 1015 1080 1063 1049 ]]  #jdk/incubator/vector/IntVector$$Lambda+0x000000001c0fa4c8 (jdk/internal/vm/vector/VectorSupport$StoreVectorOperation):exact *  Oop:jdk/incubator/vector/IntVector$$Lambda+0x000000001c0fa4c8 (jdk/internal/vm/vector/VectorSupport$StoreVectorOperation):exact *
 1263  StoreVector  === 1169 7 1259 1237  [[ 1262 ]]  @rawptr:BotPTR, idx=Raw; mismatched  Memory: @rawptr:BotPTR, idx=Raw; !jvms: ScopedMemoryAccess::storeIntoMemorySegmentScopedInternal @ bci:29 (line 441) ScopedMemoryAccess::storeIntoMemorySegment @ bci:15 (line 418) IntVector::intoMemorySegment0 @ bci:32 (line 3663) IntVector::intoMemorySegment @ bci:44 (line 3443) Test::testN @ bci:22 (line 39)
[empeter@emanuel bin]$ ./java -XX:CompileCommand=compileonly,Test::test* -XX:CompileCommand=printcompilation,Test::* -Xbatch -XX:+PrintIdeal Test.java | grep StoreVector
  946  ConP  === 0  [[ 1068 994 1048 1042 1028 ]]  #jdk/incubator/vector/IntVector$$Lambda+0x00000000220fa4c8 (jdk/internal/vm/vector/VectorSupport$StoreVectorOperation):exact *  Oop:jdk/incubator/vector/IntVector$$Lambda+0x00000000220fa4c8 (jdk/internal/vm/vector/VectorSupport$StoreVectorOperation):exact *
 1230  StoreVector  === 1225 1226 1223 1196  [[ 1229 ]]  @rawptr:BotPTR, idx=Raw; mismatched  Memory: @rawptr:BotPTR, idx=Raw; !jvms: ScopedMemoryAccess::storeIntoMemorySegmentScopedInternal @ bci:29 (line 441) ScopedMemoryAccess::storeIntoMemorySegment @ bci:15 (line 418) IntVector::intoMemorySegment0 @ bci:32 (line 3663) IntVector::intoMemorySegment @ bci:44 (line 3443) Test::testI @ bci:22 (line 29)
  946  ConP  === 0  [[ 1068 994 1048 1042 1028 ]]  #jdk/incubator/vector/IntVector$$Lambda+0x00000000220fa4c8 (jdk/internal/vm/vector/VectorSupport$StoreVectorOperation):exact *  Oop:jdk/incubator/vector/IntVector$$Lambda+0x00000000220fa4c8 (jdk/internal/vm/vector/VectorSupport$StoreVectorOperation):exact *
 1230  StoreVector  === 1225 1226 1223 1196  [[ 1229 ]]  @rawptr:BotPTR, idx=Raw; mismatched  Memory: @rawptr:BotPTR, idx=Raw; !jvms: ScopedMemoryAccess::storeIntoMemorySegmentScopedInternal @ bci:29 (line 441) ScopedMemoryAccess::storeIntoMemorySegment @ bci:15 (line 418) IntVector::intoMemorySegment0 @ bci:32 (line 3663) IntVector::intoMemorySegment @ bci:44 (line 3443) Test::testB @ bci:22 (line 34)
  946  ConP  === 0  [[ 1068 994 1048 1042 1028 ]]  #jdk/incubator/vector/IntVector$$Lambda+0x00000000220fa4c8 (jdk/internal/vm/vector/VectorSupport$StoreVectorOperation):exact *  Oop:jdk/incubator/vector/IntVector$$Lambda+0x00000000220fa4c8 (jdk/internal/vm/vector/VectorSupport$StoreVectorOperation):exact *
 1230  StoreVector  === 1225 1226 1223 1196  [[ 1229 ]]  @rawptr:BotPTR, idx=Raw; mismatched  Memory: @rawptr:BotPTR, idx=Raw; !jvms: ScopedMemoryAccess::storeIntoMemorySegmentScopedInternal @ bci:29 (line 441) ScopedMemoryAccess::storeIntoMemorySegment @ bci:15 (line 418) IntVector::intoMemorySegment0 @ bci:32 (line 3663) IntVector::intoMemorySegment @ bci:44 (line 3443) Test::testN @ bci:22 (line 39)
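
Note that `grep StoreVector` in the runs above also matches ConP lines, because they mention VectorSupport$StoreVectorOperation. A minimal sketch of a stricter check follows, assuming PrintIdeal lines have the shape "<idx> <NodeName> === <inputs> ..." as in the output above. The class IdealGraphCheck and its methods are hypothetical and are not the attached test files. The flags are taken from the commands in this report, and -XX:+PrintIdeal is assumed to be available in the build being tested.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical harness: runs a test in a fresh VM and looks for an ideal-graph node.
class IdealGraphCheck {
    // True if some line has the form "<idx> <nodeName> === ...".
    // Matching the second token avoids false hits on lines that merely mention the name.
    static boolean containsNode(String printIdealOutput, String nodeName) {
        for (String line : printIdealOutput.split("\\R")) {
            String[] tokens = line.trim().split("\\s+");
            if (tokens.length >= 3
                    && tokens[1].equals(nodeName)
                    && tokens[2].equals("===")) {
                return true;
            }
        }
        return false;
    }

    // Launches the given java binary on a source file and returns stdout and stderr combined.
    static String runInSeparateVM(String javaBinary, String testMethod, String sourceFile)
            throws IOException, InterruptedException {
        List<String> cmd = List.of(
            javaBinary,
            "--add-modules=jdk.incubator.vector",
            "-XX:CompileCommand=compileonly," + testMethod,
            "-Xbatch",
            "-XX:+PrintIdeal",
            sourceFile);
        Process p = new ProcessBuilder(cmd).redirectErrorStream(true).start();
        String out = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        p.waitFor();
        return out;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.out.println("usage: IdealGraphCheck <java-binary> <Class::method> <source-file>");
            return;
        }
        String out = runInSeparateVM(args[0], args[1], args[2]);
        if (!containsNode(out, "LoadVector")) {
            throw new RuntimeException("Test failed: not contains LoadVector.");
        }
        System.out.println("Passed: found LoadVector in PrintIdeal output.");
    }
}
```

Pointing such a harness at several JDK builds in turn, for example with arguments like `<jdk>/bin/java Test::testB Test.java`, is one way to narrow down where vectorization was lost.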
Comments
From this, I suspect that JDK-8329555 caused a regression in JDK23 (see Test2.java), and that there is an additional regression in JDK24 (see Test.java and Test3.java).
10-06-2025

Things are getting even more complicated. I have now written a "Test2.java". This one runs only the byte[] case, not the others, and it behaves differently! It seems to point back to something between JDK23-b18 and JDK23-b21. I think it comes from JDK-8329555, integrated in JDK23-b20:
https://github.com/openjdk/jdk/commit/80b381e91bb649e440321a440ce641a54f89dfb4

How I run it:

/home/empeter/Documents/oracle/jdk-23-ea+21/fastdebug/bin/java --add-modules=jdk.incubator.vector Test2.java
...
Exception in thread "main" java.lang.RuntimeException: Test failed: not contains LoadVector.
    at Test2.runInSeparateVM(Test2.java:40)
    at Test2.main(Test2.java:63)

vs

/home/empeter/Documents/oracle/jdk-23-ea+18/fastdebug/bin/java --add-modules=jdk.incubator.vector Test2.java
...
Passed: found LoadVector in PrintIdeal output.

Note: Test2.java now runs ONLY an IntVector over a byte[]. In Test.java we also run the IntVector over byte[], int[] and native memory. That seems to have a different effect, maybe because of profiling. We should investigate both!

For this, I also wrote a Test3.java, which works like Test.java. Here, the regression seems to be between JDK23 and JDK24:

/home/empeter/Documents/oracle/jdk-23.0.2/fastdebug/bin/java --add-modules=jdk.incubator.vector Test3.java
...
Passed: found LoadVector in PrintIdeal output.

/home/empeter/Documents/oracle/jdk-24.0.2/fastdebug/bin/java --add-modules=jdk.incubator.vector Test3.java
...
Exception in thread "main" java.lang.RuntimeException: Test failed: not contains LoadVector.
    at Test3.runInSeparateVM(Test3.java:40)
    at Test3.main(Test3.java:81)
10-06-2025

[~dlong] I just found that this used to vectorize in JDK22, so it seems it is indeed a performance regression!

[empeter@emanuel bin]$ /home/empeter/Documents/oracle/jdk-22.0.2/fastdebug/bin/java --add-modules=jdk.incubator.vector -XX:CompileCommand=compileonly,Test::testB -XX:CompileCommand=printcompilation,Test::* -Xbatch -XX:+PrintIdeal Test.java | grep LoadVector
WARNING: Using incubator modules: jdk.incubator.vector
 1197 LoadVector === 1194 1195 1192 [[ 810 1231 1043 1069 1049 673 1029 995 795 ]] @rawptr:BotPTR, idx=Raw; mismatched #vectord[2]:{int} (does not depend only on test, raw access) !jvms: ScopedMemoryAccess::loadFromMemorySegmentScopedInternal @ bci:29 (line 357) ScopedMemoryAccess::loadFromMemorySegment @ bci:14 (line 334) IntVector::fromMemorySegment0Template @ bci:33 (line 3505) Int64Vector::fromMemorySegment0 @ bci:3 (line 881) IntVector::fromMemorySegment @ bci:31 (line 3140) Test::testB @ bci:10 (line 33)

It also still vectorizes with latest JDK23.

[empeter@emanuel bin]$ /home/empeter/Documents/oracle/jdk-23.0.2/fastdebug/bin/java --add-modules=jdk.incubator.vector -XX:CompileCommand=compileonly,Test::testB -XX:CompileCommand=printcompilation,Test::* -Xbatch -XX:+PrintIdeal Test.java | grep LoadVector
WARNING: Using incubator modules: jdk.incubator.vector
 1197 LoadVector === 1194 1195 1192 [[ 810 1231 1043 1069 1049 673 1029 995 795 ]] @rawptr:BotPTR, idx=Raw; mismatched #vectord[2]:{int} (does not depend only on test, raw access) !jvms: ScopedMemoryAccess::loadFromMemorySegmentScopedInternal @ bci:29 (line 357) ScopedMemoryAccess::loadFromMemorySegment @ bci:14 (line 334) IntVector::fromMemorySegment0Template @ bci:33 (line 3505) Int64Vector::fromMemorySegment0 @ bci:3 (line 881) IntVector::fromMemorySegment @ bci:31 (line 3140) Test::testB @ bci:10 (line 33)

But it does NOT vectorize with latest JDK24:

[empeter@emanuel bin]$ /home/empeter/Documents/oracle/jdk-24.0.2/fastdebug/bin/java --add-modules=jdk.incubator.vector -XX:CompileCommand=compileonly,Test::testB -XX:CompileCommand=printcompilation,Test::* -Xbatch -XX:+PrintIdeal Test.java | grep LoadVector
WARNING: Using incubator modules: jdk.incubator.vector

We should try to bisect this, to find the cause!
10-06-2025

Should we change this to a performance bug, or leave as an enhancement?
09-06-2025

I assigned this to myself so I won't forget it. But I'd be more than happy if someone else volunteers for this!
09-06-2025