JDK-8371065 : C2 SuperWord: VTransformLoopPhiNode::apply set wrong type, led to wrong constant folding of phi
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 26
  • Priority: P3
  • Status: In Progress
  • Resolution: Unresolved
  • Submitted: 2025-10-31
  • Updated: 2025-11-10
JDK 26: 26 (Unresolved)
Related Reports
Causes :  
Duplicate :  
Description
If the attached test case is run with StressIGVN for a large enough number of iterations, execution fails due to an incorrect result.

What I observe after superword is that, in the vectorized loop, one node is a StoreI with an address of the shape:

(AddP base (AddP base (Phi#1 ...) (Phi#2 ...)) 16)

Phi#2 has the constant 4 as input 1 and, on the backedge, an expression that depends on another Phi. However, superword sets Phi#2's type to the constant 4, which causes the Phi to constant fold, so the store always writes at a fixed offset.
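
To make the symptom concrete, here is a minimal Java sketch of the difference between a loop-carried offset and one that has been folded to its entry value. It is not the attached test case: the class name, the method names, and the "offset + 1" update on the backedge are assumptions chosen only for illustration.

```java
import java.util.Arrays;

public class FoldedPhiSketch {
    // Intended semantics: the offset is a loop-carried value (a Phi) that
    // enters the loop as 4 and is updated on every iteration.
    public static int[] storeWithLoopCarriedOffset(int n) {
        int[] a = new int[4 + n];
        int offset = 4;              // Phi input 1: value on loop entry
        for (int i = 0; i < n; i++) {
            a[offset] = i + 1;       // store at the current offset
            offset = offset + 1;     // backedge value (hypothetical update)
        }
        return a;
    }

    // Effect of folding the Phi to the constant 4: every store hits index 4.
    public static int[] storeWithFoldedOffset(int n) {
        int[] a = new int[4 + n];
        for (int i = 0; i < n; i++) {
            a[4] = i + 1;
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(storeWithLoopCarriedOffset(3)));
        System.out.println(Arrays.toString(storeWithFoldedOffset(3)));
    }
}
```

With n = 3 the first method fills indices 4, 5 and 6, while the second only ever overwrites index 4, which is the kind of incorrect result the test detects.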

I reproduce this with:

$ for i in `seq 100`; do echo $i;  ~/jdk-jdk/build/linux-x86_64-server-fastdebug/images/jdk/bin/java -XX:-TieredCompilation -XX:-UseOnStackReplacement -XX:-BackgroundCompilation -XX:CompileOnly=TestBrokenSuperWord::test1 -XX:CompileCommand=dontinline,TestBrokenSuperWord::notInlined -XX:PartialPeelNewPhiDelta=1 -XX:+StressIGVN TestBrokenSuperWord || break; done

It usually reproduces in fewer than 100 iterations, but not always.
Comments
Attached reduced version Reduced.java of JDK-8371472 (requires UseAVX=2):

$ java -XX:CompileCommand=quiet -XX:CompileCommand=compileonly,Reduced::* -XX:-TieredCompilation -Xbatch -XX:+StressIGVN -XX:RepeatCompilation=100 -XX:UseAVX=2 Reduced.java
10-11-2025

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk/pull/28113 Date: 2025-11-03 15:20:37 +0000
06-11-2025

Yes, it looks like this is the cause, and it is quite clearly wrong:
https://github.com/openjdk/jdk/pull/27704/files#diff-6c6ddfc4afe811f5d1eae1e4db638673ff3db0cb58d37bc569a75084a6a484c6R859-R863

We have:

 911 AddL === _ 1075 992 [[ 1045 882 ]] !orig=[510] !jvms: TestBrokenSuperWord::test1 @ bci:106 (line 42)
 900 ConL === 0 [[ 739 905 907 803 1045 ]] #long:4
 1047 CountedLoop === 1047 720 547 [[ 1019 1020 1022 1023 1025 1027 1029 1041 1046 1050 1047 955 1051 716 513 957 959 963 643 842 848 1054 1045 ]] inner stride: 8 main of N1047 strip mined multiversion_fast !orig=[969],[859],[721],[678],[641] !jvms: TestBrokenSuperWord::test1 @ bci:162 (line 49)
 1045 Phi === 1047 900 911 [[ 1044 ]] #long:4..3996, widen: 3 !orig=[967],915

And since we get the type of in1, and that can be a constant, e.g.:

(rr) p t->dump()
long:4

I think I had made the assumption that bottom_type would always return the whole range of the type, so "long" in this case. But that does not hold for constant nodes, it seems. That means that the phi can probably constant fold away. But we don't really want to use the type of in1, just constrain the type by its basic type.

Also: we don't really need to change the type if we keep it scalar. But we do need to change the type if we go from scalar to vector, e.g. long -> long vector.
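
The reasoning above can be illustrated with a toy model in Java. This is only a sketch under stated assumptions: LongRange is a hypothetical stand-in for a long value-range type and is not HotSpot's API, both helper methods are invented for illustration, and the backedge value 36 is an arbitrary example of a value the phi could take on a later iteration.

```java
public class PhiTypeSketch {
    /** Toy stand-in for a long type: the closed value range [lo, hi]. */
    public static final class LongRange {
        public final long lo;
        public final long hi;

        public LongRange(long lo, long hi) { this.lo = lo; this.hi = hi; }

        public static LongRange constant(long c) { return new LongRange(c, c); }

        public static LongRange full() { return new LongRange(Long.MIN_VALUE, Long.MAX_VALUE); }

        public boolean isConstant() { return lo == hi; }

        public boolean contains(long v) { return lo <= v && v <= hi; }
    }

    // Problematic choice: reuse the type of input 1 verbatim. If input 1 is a
    // constant node, the phi is then claimed to be that constant.
    public static LongRange phiTypeFromInput1(LongRange in1) {
        return in1;
    }

    // Conservative choice: keep only the basic type (long), i.e. the full range.
    public static LongRange phiTypeFromBasicType() {
        return LongRange.full();
    }

    public static void main(String[] args) {
        LongRange in1 = LongRange.constant(4);  // like a ConL with type long:4
        long backedgeValue = 36;                // hypothetical later value of the phi

        LongRange fromInput1 = phiTypeFromInput1(in1);
        LongRange fromBasicType = phiTypeFromBasicType();

        System.out.println("from in1, constant: " + fromInput1.isConstant());
        System.out.println("from in1, admits backedge value: " + fromInput1.contains(backedgeValue));
        System.out.println("from basic type, admits backedge value: " + fromBasicType.contains(backedgeValue));
    }
}
```

In this model the type taken from in1 is a constant that excludes the backedge value, which is what would allow the phi to fold away, whereas the type derived from the basic type alone still admits it.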
---------------------------

Here the "healthy" case:

VTransformLoopPhiNode::apply t: long
dist dump
---------------------------------------------
   3 1069 ConvI2L === _ 1054 [[ 1073 ]] #long:1..991, widen: 3
   3 21 ConI === 0 [[ 1028 904 915 910 1071 1038 1108 1073 ]] #int:2
   3 897 ConvI2L === _ 823 [[ 910 ]] #long:0..997, widen: 3
   2 993 ConL === 0 [[ 1026 919 ]] #long:32
   2 1073 LShiftL === _ 1069 21 [[ 919 1075 ]]
   2 905 ConL === 0 [[ 806 911 742 916 ]] #long:4
   2 910 LShiftL === _ 897 21 [[ 911 920 ]]
   1 919 AddL === _ 1073 993 [[ 1049 887 ]] !orig=[510] !jvms: TestBrokenSuperWord::test1 @ bci:106 (line 42)
   1 911 AddL === _ 910 905 [[ 887 1049 ]] !orig=[807],[510] !jvms: TestBrokenSuperWord::test1 @ bci:106 (line 42)
   0 1049 Phi === 1046 911 919 [[ 1045 ]] #long !orig=[976],[891]

And the "broken" case:

VTransformLoopPhiNode::apply t: long:4
dist dump
---------------------------------------------
   3 21 ConI === 0 [[ 1073 921 896 899 1032 1037 1108 1071 ]] #int:2
   3 1065 ConvI2L === _ 1054 [[ 1071 ]] #long:1..991, widen: 3
   2 992 ConL === 0 [[ 1034 920 ]] #long:32
   2 1071 LShiftL === _ 1065 21 [[ 1075 920 ]]
   1 920 AddL === _ 1071 992 [[ 1045 885 ]] !orig=[510] !jvms: TestBrokenSuperWord::test1 @ bci:106 (line 42)
   1 897 ConL === 0 [[ 898 900 740 804 1045 ]] #long:4
   0 1045 Phi === 1047 897 920 [[ 1044 ]] #long:4 !orig=[967],911

It seems that the graph looks just a little different. Probably something due to StressIGVN. That should probably also not be the case, but I don't know yet.
03-11-2025

[~roland] Thanks for reporting. And thanks [~thartmann] for narrowing it down! I'll have a look now.
03-11-2025

Emanuel, could you please have a look?
03-11-2025

ILW = Incorrect result of C2 compiled code (regression in JDK 26), intermittent but reproducible, disable superword or compilation of affected method = HLM = P3
03-11-2025

Looks like a regression in JDK 26, maybe from JDK-8324751, although it still reproduces with -XX:-UseAutoVectorizationSpeculativeAliasingChecks. Trying to narrow it down. Update: narrowed down to JDK-8369448.
03-11-2025