JDK-8288180 : C2: VectorPhase must ensure that SafePointNode memory input is a MergeMemNode
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 16,17,18,19,20
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2022-06-10
  • Updated: 2022-09-19
  • Resolved: 2022-09-13
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 20
20 b15Fixed
Description
Running:
test/hotspot/jtreg/compiler/vectorapi/TestLoopStoreVector.java

With extra Flag:
-XX:+StressReflectiveCode

#  Internal Error (/home/emanuel/Documents/fork3-jdk/open/src/hotspot/share/opto/graphKit.hpp:516), pid=2509883, tid=2509899
#  assert(mem->is_MergeMem()) failed: parse memory is always pre-split
#
# JRE version: Java(TM) SE Runtime Environment (19.0) (fastdebug build 19-internal-2022-06-02-1355549.emanuel...)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 19-internal-2022-06-02-1355549.emanuel..., mixed mode, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xec1f94]  GraphKit::merged_memory()+0x64

Reproduce with JTreg:
~/Documents/jtreg/build/images/jtreg/bin/jtreg -va -s -jdk:/home/emanuel/Documents/fork3-jdk/build/linux-x64-debug/jdk/ -javaoptions:"-XX:+StressReflectiveCode" /home/emanuel/Documents/fork3-jdk/open/test/hotspot/jtreg/compiler/vectorapi/TestLoopStoreVector.java

Replay file should work:
./java -XX:+ReplayCompiles -XX:+ReplayIgnoreInitErrors -XX:ReplayDataFile=/home/emanuel/Documents/fork3-jdk/build/linux-x64-debug/jdk/bin/JTwork/scratch/replay_pid2513889.log -XX:+StressReflectiveCode


Probably the same bug, with the same extra flag (did reproduce locally on my ubuntu):
compiler/vectorapi/reshape/TestVectorReinterpret.java
compiler/vectorapi/reshape/TestVectorCastAVX1.java
compiler/vectorapi/VectorReinterpretTest.java
compiler/vectorapi/VectorMemoryAlias.java
compiler/vectorapi/VectorMaskLoadStoreTest.java
compiler/vectorapi/VectorMaskCastTest.java
compiler/vectorapi/VectorCastShape64Test.java
compiler/vectorapi/VectorCastShape128Test.java
compiler/vectorapi/TestVectorShuffleIotaByte.java
compiler/vectorapi/TestVectorShuffleIota.java
compiler/vectorapi/TestVectorShiftImm.java
compiler/vectorapi/TestNoInline.java
compiler/vectorapi/TestMaskedMacroLogicVector.java
compiler/vectorapi/TestLongVectorNeg.java
compiler/vectorapi/Test8259353.java

This may only reproduce on linux-aarch64-debug and macosx-aarch64-debug, did not even run on my ubuntu (no tests selected):
compiler/vectorapi/reshape/TestVectorCastNeon.java
compiler/vectorapi/VectorReplicateLongSpecialImmTest.java   (requires aarch64 in test)
Comments
Changeset: 6f2223fa Author: Emanuel Peter <epeter@openjdk.org> Date: 2022-09-13 13:14:05 +0000 URL: https://git.openjdk.org/jdk/commit/6f2223faa170a800f76a54a6637c160eadab6232
13-09-2022

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/10215 Date: 2022-09-08 09:08:50 +0000
12-09-2022

diff --git a/src/hotspot/share/opto/vector.cpp b/src/hotspot/share/opto/vector.cpp index f96076eb37e..9ab0d9fe910 100644 --- a/src/hotspot/share/opto/vector.cpp +++ b/src/hotspot/share/opto/vector.cpp @@ -163,6 +163,17 @@ static JVMState* clone_jvms(Compile* C, SafePointNode* sfpt) { for (uint i = 0; i < size; i++) { map->init_req(i, sfpt->in(i)); } + Node* mem = map->memory(); + if (!mem->is_MergeMem()) { + // Since we are not in parsing, the SafeNode does not guarantee that the memory + // input is necessarily a MergeMemNode. But we need to ensure that there is that + // MereMemNode, since the GraphKit assumes the memory input of the map to be a + // MergeMemNode, so that it can directly access the memory slices. + PhaseGVN& gvn = *C->initial_gvn(); + Node* mergemem = MergeMemNode::make(mem); + gvn.set_type_bottom(mergemem); + map->set_memory(mergemem); + } new_jvms->set_map(map); return new_jvms; }
08-09-2022

The GraphKit seems to assume that the memory input of the map (SafePointNode) is always a MergeMemNode. It requires this so that it can easily access the memory slices. However, the VectorPhase also generates some GraphKit instances, for example in PhaseVector::expand_vbox_alloc_node. However, at that point we are not in parsing, and the SafePointNode might have a folded memory state (not MergeMemNode). The assert in GrahpKit::merged_memory can thus be triggered. In this particular failure case, the SafePointNode was constructed with memory-phi, which was the result of a previous GraphKit::reset_memory call, which folded the memory (the MergeMemNode had only one input, the memory-phi). This on its own does not necessarily trigger our assert. In many cases, the new GraphKit first transforms the memory input and calls GraphKit::set_all_memory, which makes sure there is a MergeMemNode. But in our failure case, this never happens. Suggested Solution: VectorPhase must ensure that the map's memory input is MergeMemNode. We can do this in clone_jvms, which is called before we instanciate the GraphKit.
08-09-2022

I am now running the failing run with -XX:+StressReflectiveCode, and a comparison without that flag. The assert is triggered in GraphKit::merged_memory GraphKit::memory GraphKit::make_load GraphKit::make_load GraphKit::get_layout_helper GraphKit::new_array At the end of GraphKit::get_layout_helper we create a load (to get the layout helper), which calls all the way down, and asks for the merged_memory - this is not yet ready as not set_all_memory has been called yet. On the other hand, without StressReflectiveCode (there is a check for it in GraphKit::get_layout_helper), we actually do the logic, find out that we have a T_LONG array, and can fetch the constant layout helper. We return NULL, and do not generate a load, and so do not call the merged_memory before we reset_memory and set_all_memory later down in GraphKit::new_array.
07-09-2022

It is just difficult to see what the implicit assumptions are of the code. Should there always be a MergeMemNode as memory input to SafePoint - then we can always call merged_memory. Or is that not always supposed to be true, and merged_memory should not always be called?
06-09-2022

4) backtrace is now in Parse::Parse, whereas 1,2,3 were in Compile::Optimize, Compile::inline_incrementally 251 CallStaticJava === 239 6 7 8 1 (68 131 162 11 179 1 11 176 1 69 249 1 1 1 1 1 1 1 ) [[ ... ]] # Static 265 Proj === 251 [[ ... ]] #2 Memory mem: 18 MergeMem === _ 1 265 1 [[ 14 ]] { - } Memory _map: 14 SafePoint === 15 264 18 8 9 19 1 1 [[ ]] SafePoint 5) same state. 6) even a different GraphKit ! 251 CallStaticJava === 239 6 7 8 1 (68 ... ) [[ ... ]] # Static 265 Proj === 251 mem: 29 MergeMem === _ 1 265 1 [[ 25 ]] { - } Memory _map: 25 SafePoint === 26 264 29 8 9 1 1 1 304 1 1 1 [[ ]] SafePoint Yeah, also in this case, we have the map set, and then the memory is set via set_all_memory. I think that this is how it should be done.
06-09-2022

Here some examples. tracing merged_memory. 1) the broken case mem: 347 Proj === 345 [[ 350 349 381 ]] #2 Memory _map is set in constructor to: 381 SafePoint === 346 6 347 8 1 1 1 1 1 1 1 1 68 131 162 11 179 1 11 176 1 69 249 [[ ]] SafePoint And the memory input to SafePoint is set in constructor, cloned from: 350 VectorBoxAllocate === 346 6 347 8 1 (1 1 1 1 1 1 1 68 131 162 11 179 1 11 176 1 69 249 ) [[ 351 352 353 365 362 363 ]] # Static And here, the memory input was set at construction, in GraphKit::set_edges_for_java_call via reset_memory(). And in there, we originally had 348 MergeMem === _ 1 347 1 [[ ]] { - } Memory which then is put through the PhaseGVN::transform_no_reclaim, and unpacked because it has only one memory input. 2) mem: 348 MergeMem === _ 1 347 1 [[ 338 ]] { - } Memory _map: 338 SafePoint === 346 6 348 8 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 68 131 162 11 179 1 11 176 1 69 249 1 [[ ]] SafePoint 7 Parm === 3 [[ 251 244 213 145 51 84 339 340 345 ]] Memory 345 MemBarCPUOrder === 332 1 7 1 1 [[ 346 347 ]] 347 Proj === 345 [[ 348 ]] #2 Memory 348 MergeMem === _ 1 347 1 [[ ]] { - } Memory set_all_memory_call, called from insert_mem_bar, from LibraryCallKit::inline_vector_mem_operation. So this inserts the MergeMem, the Proj, and MemBarCPUOrder. insert_mem_bar calls reset_memory(), which unpacks the previous MergeMem, to just be 7 Param. And then 7 Param is attached as memory input to MemBarCPUOrder, which this far has nullptr memory input (expected, as it is to be replaced with reset_memory()) 3) mem: 339 MergeMem === _ 1 7 1 [[ 338 ]] { - } Memory _map: 338 SafePoint === 332 6 339 8 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 68 131 162 11 179 1 11 176 1 69 249 1 [[ ]] SafePoint We copy the _map from the previous jvms - we create LibraryCallKit, perform the inlining using the synthesized JVMState so we start out with 7 Parm === 3 [[ 251 244 213 145 51 84 339 ]] Memory 339 MergeMem === _ 1 7 1 [[ 338 ]] { - } Memory 338 SafePoint === 332 6 339 8 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 68 131 162 11 179 1 11 176 1 69 249 1 [[ ]] SafePoint 4)
06-09-2022

Maybe, but rather not. Memory is sometimes nullptr. The GraphKit is created with a SafePointNode, which here is a VectorBoxAllocateNode. That makes sense given the back-trace. This is generated before in GraphKit::box_vector and has values set in GraphKit::set_edges_for_java_call. The value for memory is set to reset_memory(). There, we ask for the previous memory, and send it through _gvn.transform(mem). This calls to PhaseGVN::transform_no_reclaim, where we transform 348 MergeMem === _ 1 347 1 [[ ]] into 347 Proj === 345 [[ 348 349 ]] #2 Memory via MergeMemNode::Identity We only have one memory input, so no split (No memory splits; ID on the one true input) What is reset_memory() really there for? It seems odd, as it does not really re-set anything directly, only in debug mode. It just transforms the input, and then is probably used as input elsewhere? Maybe we can trace the return value of reset_memory() Hmm, often it is done via reset_memory and set_all_memory, but not in all cases. I need to investigate some examples.
06-09-2022

Hmm, seems that reset_memory() can return anything really, it does not need to be a MergeMemNode. Somehow the memory() is fixed up in other ways. For example in LibraryCallKit::inline_vector_mem_operation, there is a call to insert_mem_bar(Op_MemBarCPUOrder) which in turn calls set_all_memory_call(membar). This makes sure it is wrapped in a MergeMemNode. Ok, it could be that really we need a set_all_memory call at some point, because that performs the wrapping in MergeMemNode.
06-09-2022

Working on it again. Now the mem node is not just "7 Param", but this chain: 7 Parm === 3 [[ ... ]] Memory 345 MemBarCPUOrder === 332 1 7 1 1 347 Proj === 345 Now, let's see where this comes from. The GraphKit is initialized with a SafePointNode, which has the same chain in the memory input. Q: can we add the MergeMem assert already to the constructor of GraphKit?
05-09-2022

Note, that in GraphKit::get_layout_helper we do not enter the if clause because we have StressReflectiveCode enabled. I am however not sure if or what is the relation to the assert.
14-07-2022

A question: the assert says "parse memory is always pre-split" But we are not really parsing, if I understand correctly. Compile::Optimize is calling some vector stuff to be optimized, and that calls the GraphKit, which then maybe wrongly assumes we are in parsing? FYI GraphKit::merged_memory can also be called through Compile::inline_incrementally Hmm, interestingly, in some of these cases, I see a MergeMem simply wrapping "7 Param". That is probably what is supposed to happen in our case too. ---------------- Stack Trace ------ #0 0x00007f4c55c2fd8b in GraphKit::merged_memory (this=0x7f4c3a332040) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/opto/graphKit.hpp:516 #1 0x00007f4c561a15fb in GraphKit::memory (this=0x7f4c3a332040, alias_idx=19) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/opto/graphKit.cpp:1488 #2 0x00007f4c561a18a0 in GraphKit::make_load (this=0x7f4c3a332040, ctl=0x0, adr=0x562887db5b38, t=0x7f4c2002b790, bt=T_INT, adr_idx=19, mo=MemNode::unordered, control_dependency=LoadNode::DependsOnlyOnTest, require_atomic_access=false, unaligned=false, mismatched=false, unsafe=false, barrier_data=0 '\000') at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/opto/graphKit.cpp:1537 #3 0x00007f4c55c2fec6 in GraphKit::make_load (this=0x7f4c3a332040, ctl=0x0, adr=0x562887db5b38, t=0x7f4c2002b790, bt=T_INT, adr_type=0x7f4c2037c4f8, mo=MemNode::unordered, control_dependency=LoadNode::DependsOnlyOnTest, require_atomic_access=false, unaligned=false, mismatched=false, unsafe=false, barrier_data=0 '\000') at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/opto/graphKit.hpp:547 #4 0x00007f4c561ae26a in GraphKit::make_load (this=0x7f4c3a332040, ctl=0x0, adr=0x562887db5b38, t=0x7f4c2002b790, bt=T_INT, mo=MemNode::unordered, control_dependency=LoadNode::DependsOnlyOnTest, require_atomic_access=false, unaligned=false, mismatched=false, unsafe=false, barrier_data=0 '\000') at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/opto/graphKit.hpp:537 #5 0x00007f4c561a9354 in GraphKit::get_layout_helper (this=0x7f4c3a332040, klass_node=0x562887db5ac8, constant_value=@0x7f4c3a331e0c: 0) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/opto/graphKit.cpp:3519 #6 0x00007f4c561aa327 in GraphKit::new_array (this=0x7f4c3a332040, klass_node=0x562887db5ac8, length=0x56288805d588, nargs=1, return_size_val=0x0, deoptimize_on_exception=false) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/opto/graphKit.cpp:3712 #7 0x00007f4c5698e8fe in PhaseVector::expand_vbox_alloc_node (this=0x7f4c3a3323e0, vbox_alloc=0x7f4c20299890, value=0x7f4c202997e0, box_type=0x7f4c201f9640, vect_type=0x7f4c200c2198) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/opto/vector.cpp:369 #8 0x00007f4c5698e642 in PhaseVector::expand_vbox_node_helper (this=0x7f4c3a3323e0, vbox=0x7f4c2029a600, vect=0x7f4c202997e0, box_type=0x7f4c201f9640, vect_type=0x7f4c200c2198) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/opto/vector.cpp:335 #9 0x00007f4c5698e259 in PhaseVector::expand_vbox_node (this=0x7f4c3a3323e0, vec_box=0x7f4c2029a670) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/opto/vector.cpp:298 #10 0x00007f4c5698d300 in PhaseVector::expand_vbox_nodes (this=0x7f4c3a3323e0) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/opto/vector.cpp:115 #11 0x00007f4c5698ce4d in PhaseVector::optimize_vector_boxes (this=0x7f4c3a3323e0) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/opto/vector.cpp:59 #12 0x00007f4c55e959bc in Compile::Optimize (this=0x7f4c3a3349c0) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/opto/compile.cpp:2282 #13 0x00007f4c55e8ef4d in Compile::Compile (this=0x7f4c3a3349c0, ci_env=0x7f4c3a335740, target=0x562887ab0e70, osr_bci=-1, options=..., directive=0x7f4c5011a7f0) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/opto/compile.cpp:823 #14 0x00007f4c55d7bc11 in C2Compiler::compile_method (this=0x7f4c501f7b80, env=0x7f4c3a335740, target=0x562887ab0e70, entry_bci=-1, install_code=true, directive=0x7f4c5011a7f0) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/opto/c2compiler.cpp:112 #15 0x00007f4c55eaddd9 in CompileBroker::invoke_compiler_on_method (task=0x7f4c5025b640) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/compiler/compileBroker.cpp:2311 #16 0x00007f4c55eac991 in CompileBroker::compiler_thread_loop () at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/compiler/compileBroker.cpp:1981 #17 0x00007f4c55ecec1e in CompilerThread::thread_entry (thread=0x7f4c501f8180, __the_thread__=0x7f4c501f8180) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/compiler/compilerThread.cpp:59 #18 0x00007f4c5626aff7 in JavaThread::thread_main_inner (this=0x7f4c501f8180) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/runtime/javaThread.cpp:698 #19 0x00007f4c5626ae84 in JavaThread::run (this=0x7f4c501f8180) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/runtime/javaThread.cpp:681 #20 0x00007f4c56946609 in Thread::call_run (this=0x7f4c501f8180) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/share/runtime/thread.cpp:224 #21 0x00007f4c56725a4f in thread_native_entry (thread=0x7f4c501f8180) at /home/emanuel/Documents/fork4-jdk/open/src/hotspot/os/linux/os_linux.cpp:708 #22 0x00007f4c57b44609 in start_thread (arg=<optimized out>) at pthread_create.c:477 #23 0x00007f4c57a61133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
14-07-2022

So we could do something about it in src/hotspot/share/opto/graphKit.cpp GraphKit::set_edges_for_java_call see that reset_memory() is MergeMem Because doing that directly in GraphKit::reset_memory is probably not the way to go. In fact I see that happen elsewhere and that is ok. Or maybe the problem comes from much earlier. going back earlier. That JVMState we generate, is based of a call, from call_node(). That has memory input "7 Param" Actually, this kinda shows that it is normal that set_edges_for_java_call can have the memory input be "7 param" and not a MergeMem.
14-07-2022

I'll start debugging this a bit before my vacation, maybe this helps. This is the failing assert: assert(mem->is_MergeMem(), "parse memory is always pre-split"); mem is not a MergeMemNode, but: 7 Parm === 3 [[...]] Memory Memory: @BotPTR *+bot, idx=Bot; !jvms: TestLoopStoreVector::testVectorCastL2I @ bci:-1 (line 60) We ask for memory of map_not_null(). This is a SafePoint, the memory is at index 2 2451 SafePoint === 2334 6 7 8 1 1 11 1 13 1 1 1 1 1 1 1 1 1 1 1 1 1 372 433 463 120 476 1 120 478 1 373 549 [[ ]] So I step backwards, watching both the map, and input 2 of SafePoint. Ok, so this is the map that jvms holds when the GraphKit is generated, in PhaseVector::expand_vbox_alloc_node But also the input 2 of the map is just copied right before the GraphKit, from a previous SafePoint. This is the previous SafePoint: 2348 VectorBoxAllocate === 2334 6 7 8 1 (1 11 1 13 1 1 1 1 1 1 1 1 1 1 1 1 1 372 433 463 120 476 1 120 478 1 373 549 ) [[ ... ]] # Static Ok, so right after the node creation of VectorBoxAllocate (is a CallJavaNode), we run GraphKit::set_edges_for_java_call. There, the memory is set from reset_memory() Before this, we have map() 2342 SafePoint === 2334 6 2343 8 1 1 1 1 1 1 1 1 1 1 1 1 1 11 1 13 1 1 1 1 1 1 1 1 1 1 1 1 1 372 433 463 120 476 1 120 478 1 373 549 1 [[ ]] SafePoint and its memory 2343 MergeMem === _ 1 7 1 [[ 2342 ]] { - } Memory: @BotPTR *+bot, idx=Bot; Then, because we are in debug mode, memory of map is set to NULL. we then return transform(mem) transform figures out that 2343 MergeMem === _ 1 7 1 [[ 2342 ]] is Identity with 7 Parm === 3 [[ ...]] Memory Because "No memory splits; ID on the one true input" Now I'd like to see how "2343 MergeMem" came about Ok, well it is specially created in callGenerator.cpp - CallGenerator::do_late_inline_helper A new JVMState is generated, and because the map's memory is not a MergeMem, we insert one, wrapping the previous memory into a MergeMem. I now see two possible reasons for the problem. Either that transform should not have happened. Or we need to make sure after jvms clone, that we have a MergeMem as input?
14-07-2022

ILW = assertion failure in incubator test code; seen only using StressReflectiveCode; run test with default configuration = MLM = P4
10-06-2022