JDK-8370405 : C2: mismatched store from MergeStores wrongly scalarized in allocation elimination
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 23
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2025-10-22
  • Updated: 2025-12-17
  • Resolved: 2025-11-03
JDK 25: 25.0.3 (Fixed)
JDK 26: 26 b23 (Fixed)
Description
Found by Olivier Mattmann <olivier.mattmann@bluewin.ch> during work on his Master's thesis, in which he is developing a fuzzer for C2.

The attached test case (Test.java) fails due to the compiled method Test.micro2 returning a wrong result, starting with commit 3ccb64c0 (JDK-8318446) with MergeStores enabled, but not with MergeStores disabled. The failing test case is derived from compiler/escapeAnalysis/Test8331033.java (https://github.com/openjdk/jdk/blob/27c83c730d8b0f87bb51230c35e4fe261c9d2723/test/hotspot/jtreg/compiler/escapeAnalysis/Test8331033.java)
Comments
[jdk25u-fix-request] Approval Request from Aleksey Shipilëv: Fixes a JDK 23 regression that will show up in JDK 25 as a correctness bug. The fix is fairly new, but it is simple and conservative: it blocks scalar replacement when mismatched stores are detected. The new regression test fails without the fix and passes with it. All other tests pass. Risk is medium: the fix touches C2, but in a fairly conservative manner.
28-11-2025

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk25u-dev/pull/10 Date: 2025-11-26 10:15:17 +0000
27-11-2025

Changeset: 09a047f0 Branch: master Author: Emanuel Peter <epeter@openjdk.org> Date: 2025-11-03 06:55:32 +0000 URL: https://git.openjdk.org/jdk/commit/09a047f00c88d14505c42a966dedbc87b9be5bdf
03-11-2025

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk/pull/27997 Date: 2025-10-27 10:40:18 +0000
29-10-2025

Looked a bit closer at what happens during deopt / rematerialization:

#0 Deoptimization::reassign_type_array_elements (fr=0x7f3dbbb20a10, reg_map=0x7f3dbbb20ad0, sv=0x7f3db44579c8, obj=0x6230a7ac8, type=T_INT) at /home/empeter/Documents/oracle/jdk-fork2/open/src/hotspot/share/runtime/deoptimization.cpp:1360
#1 0x00007f3dbc75ddf1 in Deoptimization::reassign_fields (fr=0x7f3dbbb20a10, reg_map=0x7f3dbbb20ad0, objects=0x7f3db4457b88, realloc_failures=false, is_jvmci=false) at /home/empeter/Documents/oracle/jdk-fork2/open/src/hotspot/share/runtime/deoptimization.cpp:1618
#2 0x00007f3dbc7584fd in rematerialize_objects (thread=0x7f3db40308c0, exec_mode=2, compiled_method=0x7f3db887b188, deoptee=..., map=..., chunk=0x7f3db402f2d8, deoptimized_objects=@0x7f3dbbb20a90: false) at /home/empeter/Documents/oracle/jdk-fork2/open/src/hotspot/share/runtime/deoptimization.cpp:381
#3 0x00007f3dbc759529 in Deoptimization::fetch_unroll_info_helper (current=0x7f3db40308c0, exec_mode=2) at /home/empeter/Documents/oracle/jdk-fork2/open/src/hotspot/share/runtime/deoptimization.cpp:534
#4 0x00007f3dbc7621ee in Deoptimization::uncommon_trap (current=0x7f3db40308c0, trap_request=-187, exec_mode=2) at /home/empeter/Documents/oracle/jdk-fork2/open/src/hotspot/share/runtime/deoptimization.cpp:2638

(rr) p sv->print_fields_on(tty)
Fields: 0, 68719476737, 1048576, 256, 16777216

It seems to merge the (padding?) zero with the long.

(rr) p obj->print_on(tty)
[I
{0x00000006230a7ac8} - klass: {type array int}
 - flags: is_cloneable_fast
 - length: 5
 - 0: 0x1 1
 - 1: 0x10 16
 - 2: 0x100000 1048576
 - 3: 0x100 256
 - 4: 0x1000000 16777216

It seems that there is code in Deoptimization::reassign_type_array_elements that looks ahead, and if the next element is a long, it "merges" that into the current (int) and next (long) slot. That is how we lose the zero element and replace it with a 1. Of course the input values (sv with the long entry) are already not correct, but that is a separate question. We should probably have some assert for that. For now, I'm wondering why we do the merging in Deoptimization::reassign_type_array_elements.
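
The look-ahead described above can be mimicked with a small toy model. This is only a sketch, not HotSpot code: the class and method names are hypothetical, an Integer stands for a 32-bit scalar value and a Long for a 64-bit one, and a little-endian layout (lower array index holds the low 32 bits) is assumed. It is fed the field list printed by sv->print_fields_on above.

```java
import java.util.Arrays;
import java.util.List;

public class DeoptLookAheadSketch {
    // Toy stand-in for refilling an int[] from a list of scalar values.
    // Hypothetical helper, not the HotSpot implementation.
    static int[] reassignIntArray(List<Number> fields, int length) {
        int[] array = new int[length];
        int index = 0;
        for (int i = 0; i < fields.size(); i++) {
            boolean nextIsLong = i + 1 < fields.size()
                    && fields.get(i) instanceof Integer
                    && fields.get(i + 1) instanceof Long;
            if (nextIsLong) {
                // Look ahead: the current int value is dropped and the
                // following long fills two consecutive int slots.
                long merged = fields.get(++i).longValue();
                array[index++] = (int) merged;          // low 32 bits (little-endian assumed)
                array[index++] = (int) (merged >>> 32); // high 32 bits
            } else {
                array[index++] = fields.get(i).intValue();
            }
        }
        return array;
    }

    public static void main(String[] args) {
        // Field list as printed by sv->print_fields_on in the comment above.
        List<Number> fields = List.of(0, 68719476737L, 1048576, 256, 16777216);
        // Prints [1, 16, 1048576, 256, 16777216]
        System.out.println(Arrays.toString(reassignIntArray(fields, 5)));
    }
}
```

Under these assumptions the model yields the same five elements as the obj->print_on dump: the leading 0 disappears and the long 68719476737 (0x10_00000001) supplies elements 0 and 1.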
27-10-2025

[~qamai] Delaying could be a good idea. That would give allocation elimination a chance to happen before MergeStores prevents it. And yes: we are definitely missing some detection during scalar replacement; even just an assert would have helped. We could probably also do some type mismatch detection at the deopt point, where we materialize the array.
24-10-2025

In Valhalla, I added a piece of code to detect mismatched accesses during scalar replacement (instead of during analysis), which may be able to prevent this case: https://github.com/openjdk/valhalla/blob/60af17ff5995cfa5de075332355f7f475c163865/src/hotspot/share/opto/macro.cpp#L709
23-10-2025

Having a quick look at what might be wrong.

These reproduce:
- java Test.java
- java -XX:CompileCommand=compileonly,Test::* -XX:CompileCommand=printcompilation,Test::* -XX:CompileCommand=TraceMergeStores,Test::*,BASIC -XX:+MergeStores Test.java

These do not fail:
- java -Xint Test.java
- java -XX:-MergeStores Test.java

Here is an example run:

java -XX:CompileCommand=compileonly,Test::* -XX:CompileCommand=printcompilation,Test::* -XX:CompileCommand=TraceMergeStores,Test::*,SUCCESS -XX:+MergeStores Test.java
CompileCommand: compileonly Test.* bool compileonly = true
CompileCommand: PrintCompilation Test.* bool PrintCompilation = true
CompileCommand: TraceMergeStores Test.* const char* TraceMergeStores = 'SUCCESS'
4182 97 ! 3 Test::micro2 (99 bytes)
4183 98 % ! 4 Test::micro2 @ 37 (99 bytes)
4190 99 ! 4 Test::micro2 (99 bytes)
[TraceMergeStores]: Replace
90 StoreI === 85 60 89 88 [[ 93 ]] @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=6; Memory: @int[int:4] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[0] *,iid=45, idx=9; !jvms: Test::micro2 @ bci:22 (line 5)
93 StoreI === 85 90 92 88 [[ 576 ]] @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=6; Memory: @int[int:4] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[1] *,iid=45, idx=9; !orig=[160] !jvms: Test::micro2 @ bci:26 (line 6)
[TraceMergeStores]: with
731 ConL === 0 [[ 732 ]] #long:21474836485
732 StoreL === 85 60 89 731 [[ ]] @int[int:4] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[1] *,iid=45, idx=9; mismatched Memory: @int[int:4] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[0] *,iid=45, idx=9;
4197 100 ! 4 Test::micro2 (99 bytes)
[TraceMergeStores]: Replace
90 StoreI === 85 60 89 88 [[ 93 ]] @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=6; Memory: @int[int:4] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[0] *,iid=45, idx=9; !jvms: Test::micro2 @ bci:22 (line 5)
93 StoreI === 85 90 92 88 [[ 559 ]] @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=6; Memory: @int[int:4] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[1] *,iid=45, idx=9; !orig=[160] !jvms: Test::micro2 @ bci:26 (line 6)
[TraceMergeStores]: with
748 ConL === 0 [[ 749 ]] #long:21474836485
749 StoreL === 85 60 89 748 [[ ]] @int[int:4] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[1] *,iid=45, idx=9; mismatched Memory: @int[int:4] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[0] *,iid=45, idx=9;
4210 101 % 3 Test::main @ 4 (43 bytes)
4212 102 3 Test::main (43 bytes)
Exception in thread "main" java.lang.RuntimeException: Unexpected result: 999980
at Test.main(Test.java:27)

But these logs look ok. We just combine two int-stores to a long-store...

Issue:
- We merge the StoreI to a StoreL.
- Later we eliminate the allocation of the array, and scalar replace with the stored constants, see PhaseMacroExpand::scalar_replacement.
- But we don't handle the merged StoreL right, and don't get the right values that way.

There was a question why we need this line:
synchronized(Test.class) {}
I suspect one reason is that it keeps the StoreI from "drifting" into the allocation/initialization and becoming raw stores; otherwise we don't MergeStore them. Note: we can also just replace it with a call that is not inlined. That also keeps the stores separate from the allocate/initialize.

Having had a longer look, it seems that the long-constant just pushes out all other values, and the last entry drops out from the fields/array elements.

I also tried to use an unsafe StoreL.
It is marked as mismatched, just like the StoreL produced from MergeStores. But: it seems that we already detect the mismatched StoreL during analysis, and so we mark the Allocation as NSR (non scalar replaceable):

+++++ Initial worklist for static jint Test.test(jboolean) (ea_inv=0)
...
JavaObject(4) NoEscape(NoEscape) [ [ 58 ]] 46 AllocateArray === 5 6 7 8 1 (35 23 28 22 42 10 1 1 ) [[ 47 48 49 56 57 58 ]] rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, int, bool ) allocationKlass:[I Test::test @ bci:1 (line 9) !jvms: Test::test @ bci:1 (line 9)
...
+++++ Calculating escape states and scalar replaceability
...
JavaObject(4) NoEscape(NoEscape) is NSR. is used in LoadStore or mismatched access

This prevents the scalarization.

But with the array stores, we do the analysis and only see array stores that are NOT mismatched. After the analysis, we do MergeStores, and insert the mismatched StoreL. But the analysis is not run again. And so when we finally try allocation elimination, we still think it is scalar replaceable:

+++++ Initial worklist for static jint Test.test(jboolean) (ea_inv=0)
...
JavaObject(4) NoEscape(NoEscape) [ [ 58 ]] 46 AllocateArray === 5 6 7 8 1 (35 23 28 22 42 10 1 1 ) [[ 47 48 49 56 57 58 ]] rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, int, bool ) allocationKlass:[I Test::test @ bci:1 (line 9) !jvms: Test::test @ bci:1 (line 9)
...
NotScalar (Field load) 63 CheckCastPP === 60 58 [[ 126 67 67 74 74 81 81 258 89 89 94 188 177 166 155 144 ]] #int[int:4] (java/lang/Cloneable,java/io/Serializable):NotNull:exact *,iid=46 !jvms: Test::test @ bci:1 (line 9)
>>>> 224 LoadI === 220 208 67 [[ 230 ]] @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=6; #int !jvms: Test::test @ bci:73 (line 28)
Loop: N0/N0 has_call has_sfpt
Loop: N256/N255 limit_check short_running profile_predicated predicated auto_vectorization_check_predicate sfpts={ 258 }
Loop: N257/N219 limit_check short_running profile_predicated predicated auto_vectorization_check_predicate counted [0,10000),+1 (-1 iters) has_sfpt strip_mined
Empty with zero trip guard Loop: N257/N219 limit_check short_running profile_predicated predicated auto_vectorization_check_predicate counted [0,10000),+1 (9992 iters) has_sfpt strip_mined
[TraceMergeStores]: Replace
114 StoreI === 101 61 67 42 [[ 116 ]] @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=6; Memory: @int[int:4] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[0] *,iid=46, idx=9; !jvms: Test::test @ bci:31 (line 17)
116 StoreI === 101 114 74 29 [[ 98 208 ]] @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=6; Memory: @int[int:4] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[1] *,iid=46, idx=9; !orig=[197] !jvms: Test::test @ bci:36 (line 18)
[TraceMergeStores]: with
277 ConL === 0 [[ 278 ]] #long:68719476737
278 StoreL === 101 61 67 277 [[ ]] @int[int:4] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[1] *,iid=46, idx=9; mismatched Memory: @int[int:4] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[0] *,iid=46, idx=9;
Scalar 63 CheckCastPP === 60 58 [[ 126 67 67 89 89 94 ]] #int[int:4] (java/lang/Cloneable,java/io/Serializable):NotNull:exact *,iid=46 !jvms: Test::test @ bci:1 (line 9)
++++ Eliminated: 46 AllocateArray

It seems that we do not check if a mismatched store has reappeared. We probably just don't expect that.
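
The ConL constants in the traces above can be checked by hand. The following sketch packs two adjacent int values into the long that a single 8-byte store would write. The helper name is hypothetical, a little-endian layout is assumed, and the int pairs (5, 5) and (1, 16) are derived from the constants by arithmetic, not taken from Test.java.

```java
public class MergeStoresConstantSketch {
    // Packs two adjacent int values into one long, assuming the element at
    // the lower index occupies the low 32 bits (little-endian layout).
    static long pack(int lowIndexValue, int highIndexValue) {
        return ((long) highIndexValue << 32) | (lowIndexValue & 0xFFFF_FFFFL);
    }

    public static void main(String[] args) {
        System.out.println(pack(5, 5));  // 21474836485, the ConL in the Test::micro2 trace
        System.out.println(pack(1, 16)); // 68719476737, the ConL in the Test::test trace
    }
}
```

The mask on the low half matters for negative values: without it, sign extension of the low int would overwrite the high half.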
23-10-2025

It's interesting that MergeStores interferes with allocation elimination. Given that MergeStores produces mismatched accesses, which are likely to impede other optimizations, a possible approach is to delay merging stores until after macro expansion.
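
The phase-ordering concern can be illustrated with a toy model. Everything here is hypothetical (the classes Store and Allocation and the methods analyze, mergeStores and eliminate are illustrative names, not C2 code). It compares three variants: the analysis verdict going stale after merging, a re-check for mismatched stores at elimination time, and merging delayed until after elimination.

```java
import java.util.ArrayList;
import java.util.List;

public class PhaseOrderSketch {
    // Hypothetical stand-in for a store into the array; not a C2 node.
    static final class Store {
        final int index;
        final long value;
        final boolean mismatched; // store width differs from the element type
        Store(int index, long value, boolean mismatched) {
            this.index = index;
            this.value = value;
            this.mismatched = mismatched;
        }
    }

    static final class Allocation {
        final List<Store> stores = new ArrayList<>();
        boolean scalarReplaceable; // set once by analyze(), never refreshed
    }

    static boolean hasMismatchedStore(Allocation a) {
        return a.stores.stream().anyMatch(s -> s.mismatched);
    }

    // Analysis step: records a verdict based on the stores visible right now.
    static void analyze(Allocation a) {
        a.scalarReplaceable = !hasMismatchedStore(a);
    }

    // Merge step: replaces the int stores to elements 0 and 1 with one long store.
    static void mergeStores(Allocation a) {
        Store lo = a.stores.get(0);
        Store hi = a.stores.get(1);
        a.stores.clear();
        a.stores.add(new Store(0, (hi.value << 32) | (lo.value & 0xFFFF_FFFFL), true));
    }

    // Elimination step: returns true if the allocation would be scalar replaced.
    static boolean eliminate(Allocation a, boolean recheck) {
        if (recheck && hasMismatchedStore(a)) {
            return false; // conservative: keep the allocation
        }
        return a.scalarReplaceable;
    }

    static Allocation twoIntStores() {
        Allocation a = new Allocation();
        a.stores.add(new Store(0, 1, false));
        a.stores.add(new Store(1, 16, false));
        return a;
    }

    public static void main(String[] args) {
        Allocation stale = twoIntStores();
        analyze(stale);
        mergeStores(stale);
        // true: eliminated although a mismatched store is now present
        System.out.println("stale verdict, eliminated: " + eliminate(stale, false));

        Allocation rechecked = twoIntStores();
        analyze(rechecked);
        mergeStores(rechecked);
        // false: the re-check blocks scalar replacement
        System.out.println("re-check, eliminated: " + eliminate(rechecked, true));

        Allocation delayed = twoIntStores();
        analyze(delayed);
        // true: eliminated while only matching int stores exist
        System.out.println("merge delayed, eliminated: " + eliminate(delayed, false));
    }
}
```

In this model, both the re-check and the delayed merge avoid eliminating an allocation that has a mismatched store. The difference is that the re-check gives up the elimination, while delaying the merge keeps it.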
22-10-2025

ILW = Wrong execution of C2 compiled code, reproducible with single generated test, -XX:-MergeStores = HLM = P3
22-10-2025