JDK-8291025 : Jtreg compiler/loopopts/TestUnreachableInnerLoop.java fails with MaxVectorSize=8
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 11,17,18,19,20
  • Priority: P3
  • Status: Closed
  • Resolution: Duplicate
  • OS: generic
  • CPU: generic
  • Submitted: 2022-07-26
  • Updated: 2022-11-30
  • Resolved: 2022-11-21
JDK 20: 20 (Resolved)
Description
JDK-8289954 fixed an assertion failure found by a fuzzer test and added a new jtreg case, "compiler/loopopts/TestUnreachableInnerLoop.java", as a regression test. Even with that fix, the new test case still fails on the RISC-V port:

# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/home/fyang/openjdk-jdk/src/hotspot/share/opto/block.cpp:1249), pid=2130903, tid=2130918
#  assert(n->is_Root() || n->is_Region() || n->is_Phi() || n->is_MachMerge() || def_block->dominates(block)) failed: uses must be dominated by definitions

Current CompileTask:
C2:   1896   22 % !b  4       compiler.loopopts.TestUnreachableInnerLoop::fun @ 61 (233 bytes)

Stack: [0x0000003fc041a000,0x0000003fc061a000],  sp=0x0000003fc0614850,  free space=2026k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x416ade]  PhaseCFG::verify() const+0x1d4
V  [libjvm.so+0x7150d2]  Compile::Code_Gen()+0x284
V  [libjvm.so+0x7196d4]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x11d4
V  [libjvm.so+0x55c4ec]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x258
V  [libjvm.so+0x725da8]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x8c4
V  [libjvm.so+0x726972]  CompileBroker::compiler_thread_loop()+0x61c
V  [libjvm.so+0xb8e7ee]  JavaThread::thread_main_inner()+0x258
V  [libjvm.so+0x14080b2]  Thread::call_run()+0xd4
V  [libjvm.so+0x106189c]  thread_native_entry(Thread*)+0xee
C  [libc.so.6+0x66e94]
C  [libc.so.6+0xb2962]

Comments
I've had a closer look and I agree with Emanuel's suspicion - it is indeed related to skeleton predicates (JDK-8288981). In this particular case, we peel a main loop for which we can no longer update skeleton predicates. After some unrolling, a Cast node of a range check (whose type is >= 0) for which we have created a predicate in loop predication becomes top because the input type is negative. Some data and memory nodes are dying but since we are missing skeleton predicates, we are not able to remove the main loop (remove control nodes). We are left with a broken graph and we later crash when verifying the correctness of the graph. The redesign done in JDK-8288981 should fix this problem. I'll therefore close it as a dup. We should make sure to run the attached tests of this bug once a fix is ready for JDK-8288981.
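
To make the "Cast node becomes top" step above concrete, here is a minimal toy model in Java. It is not HotSpot code: the class RangeJoinSketch and its Range type are hypothetical names invented for illustration, and C2's real type lattice is far richer. It only shows that intersecting a declared range of >= 0 with an input range that is entirely negative leaves an empty range, which is the situation described as the node becoming top.

```java
// Toy model only. RangeJoinSketch and Range are hypothetical names,
// not part of HotSpot; they merely mimic joining two integer ranges.
public class RangeJoinSketch {
    /** Closed interval [lo, hi]; lo > hi encodes the empty range ("top"). */
    public static final class Range {
        public final long lo;
        public final long hi;

        public Range(long lo, long hi) {
            this.lo = lo;
            this.hi = hi;
        }

        public boolean isEmpty() {
            return lo > hi;
        }

        /** Intersection of this interval with another one. */
        public Range join(Range other) {
            return new Range(Math.max(lo, other.lo), Math.min(hi, other.hi));
        }
    }

    public static void main(String[] args) {
        // Range a cast is pinned to by a range check: index >= 0.
        Range declared = new Range(0, Integer.MAX_VALUE);
        // Range the input is assumed to have after unrolling: strictly negative.
        Range incoming = new Range(-8, -1);

        Range joined = declared.join(incoming);
        System.out.println("joined empty: " + joined.isEmpty()); // prints "joined empty: true"
    }
}
```

In this model, an empty result means the code using the value can never execute. The comment above describes the case where the data nodes die accordingly but the control nodes of the loop are not removed.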
21-11-2022

I leave this bug to [~chagedorn] for now. Feel free to assign it back to me if you disagree with my assessment, or if the bug persists after the skeleton/parse predicate work is done.
19-09-2022

I strongly suspect that this is related to Skeleton/Parse Predicates (JDK-8288981). JDK-8230382 seems to cause a ConvI2L to discover that its range is impossible (it is constrained to be positive because of the memory access below it, and the input is negative). Before, these discoveries were sometimes not made, and the node plus its outputs would not be removed. I cannot guarantee that this is the trigger for the bug, but I have seen this happen before. You can find that ConvI2LNode with this change:

    diff --git a/src/hotspot/share/opto/convertnode.cpp b/src/hotspot/share/opto/convertnode.cpp
    index 9f471c0958d..94270067271 100644
    --- a/src/hotspot/share/opto/convertnode.cpp
    +++ b/src/hotspot/share/opto/convertnode.cpp
    @@ -260,6 +260,8 @@ const Type* ConvI2LNode::Value(PhaseGVN* phase) const {
       // Join my declared type against my incoming type.
       tl = tl->filter(_type);
       if (!tl->isa_long()) {
    +    tty->print("ConvI2L::Value not long: "); tl->dump();
    +    this->dump();
         return tl;
       }
       const TypeLong* this_type = tl->is_long();

When you see this ConvI2L node, it at some point had a CastII node as input, with a range check dependency. The dependency goes up way too far: we are in a main loop, but the range check happens before the pre-loop. Since we have a combination of pre-main-post and then peel, we do not properly copy and instantiate the predicates before all copied loops. So here the dependency is wrong, which can lead to all sorts of issues below. I have seen that this rips out a memory-phi node without removing the control of that phi, which is further evidence that we are messing up the control/data flow.
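
For readers unfamiliar with the pre-main-post shape mentioned above, the following hand-written Java sketch mimics the idea at source level. It is an illustration under stated assumptions, not what C2 emits: the class PreMainPostSketch, its methods, and the way the bounds are computed are all hypothetical. The main loop runs without bounds checks only because the code before it has established that the index is in range, which is the kind of dependency that must stay attached to the correct loop copy.

```java
// Illustration only. PreMainPostSketch is a hypothetical name; the split
// below is written by hand and is not the code C2 generates.
public class PreMainPostSketch {
    /** Reference loop: every access is guarded by an explicit range check. */
    public static int sumPlain(int[] a, int start, int limit) {
        int s = 0;
        for (int i = start; i < limit; i++) {
            if (i >= 0 && i < a.length) {
                s += a[i];
            }
        }
        return s;
    }

    /** Same computation split into pre, main, and post loops. */
    public static int sumSplit(int[] a, int start, int limit) {
        int s = 0;
        int i = start;
        int mainStart = Math.max(start, 0);      // first index known to be >= 0
        int mainEnd = Math.min(limit, a.length); // first index no longer in range

        // Pre-loop: iterations before the in-range window, checks kept.
        for (; i < limit && i < mainStart; i++) {
            if (i >= 0 && i < a.length) {
                s += a[i];
            }
        }
        // Main loop: relies on 0 <= i < a.length, so no check is performed.
        for (; i < mainEnd; i++) {
            s += a[i];
        }
        // Post-loop: remaining iterations, checks kept.
        for (; i < limit; i++) {
            if (i >= 0 && i < a.length) {
                s += a[i];
            }
        }
        return s;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5};
        System.out.println(sumPlain(a, -3, 10) == sumSplit(a, -3, 10)); // prints "true"
    }
}
```

If the unchecked main loop were copied again (for example by peeling) without the condition that guarantees 0 <= i, the copy would rest on an assumption nothing enforces. This is a source-level analogue of the missing predicates described above.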
19-09-2022

[~pli] My original change already uncovered a list of bugs that are not directly related to my change. In all other cases so far, the bug was caused by something else. With my change JDK-8230382 I must have triggered some optimizations that did not happen before, which then ran into existing bugs. There is a possibility that this is related to JDK-8288981 (C2: Fix issues with skeleton predicates). Let me look into this.
19-09-2022

[~epeter] Would you like to fix this now?
19-09-2022

I can reproduce the assert / bug with either
    ./java -Xcomp -XX:CompileOnly=Test.fun -XX:-SuperWordLoopUnrollAnalysis Test.java
or
    ./java -Xcomp -XX:CompileOnly=Test.fun -XX:MaxVectorSize=8 Test.java
19-09-2022

I would say since this bug is targeted for JDK 20, it's not urgent to fix and we can leave Emanuel's changes in for now. Feel free to un- or re-assign though if you don't have time to work on it or get stuck.
05-08-2022

Hi [~thartmann], Thanks for letting me know. I verified that the issue in JDK-8291791 disappears after reverting Emanuel's code cleanup in convertnode.cpp, so these 2 issues should be the same. So far I still don't understand why moving the type-computing code into Value() can produce that assert failure. If we cannot fix this in a short time, shall we propose a patch to revert that part? Emanuel's code cleanup cannot be fully backed out because of conflicts.
04-08-2022

[~pli], Emanuel is out until September.
03-08-2022

[~epeter] The test will pass if I revert part of your code cleanup in convertnode.cpp by applying the attached revert_cleanup_in_convertnode.patch
03-08-2022

[~epeter] Git bisect finds that the assertion failure appears after JDK-8230382: Clean up ConvI2L, CastII and CastLL::Ideal methods. Would you like to look at this issue? To reproduce, use "java -Xcomp -XX:CompileOnly=Test.fun -XX:MaxVectorSize=8 Test" to run my attached Test.java
03-08-2022

This can also be reproduced with attached Test.java on some other platforms with -XX:MaxVectorSize=8
02-08-2022

[~pli] Thanks for creating this issue. I did another run with JVM option: -XX:+TraceSuperWordLoopUnrollAnalysis, but I didn't see any extra information in file TestUnreachableInnerLoop.jtr. I have also attached file TestUnreachableInnerLoop.jtr for reference.
26-07-2022

ILW = Same as JDK-8289954 = P3
26-07-2022