JDK-8189067 : SuperWord optimization crashes with "assert(out == prev || prev == __null) failed: no branches off of store slice"
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 9,10
  • Priority: P1
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2017-10-09
  • Updated: 2020-02-28
  • Resolved: 2017-10-13
Fixed in: JDK 10 (10 b31)
Description
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/scratch/mesos/slaves/5af44a71-976a-41b7-81de-5773b84ec572-S34473/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/eae0aa69-79b6-4d74-b0cf-3a1d39027b40/runs/cc5b1f70-06f3-49d8-ba9c-5379bd6bc920/workspace/open/src/hotspot/share/opto/superword.cpp:1075), pid=3075, tid=25091
#  assert(out == prev || prev == __null) failed: no branches off of store slice
#
# JRE version: Java(TM) SE Runtime Environment (10.0) (fastdebug build 10-internal+0-2017-10-07-0300098.jesper.wilhelmsson.hs)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 10-internal+0-2017-10-07-0300098.jesper.wilhelmsson.hs, mixed mode, compressed oops, g1 gc, bsd-amd64)
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

---------------  S U M M A R Y ------------

Command Line: -XX:MaxRAMPercentage=6 -XX:MaxRAMPercentage=12.5 -XX:+CreateCoredumpOnCrash -ea -esa -XX:CompileThreshold=100 -XX:+UnlockExperimentalVMOptions -XX:-TieredCompilation -Xbatch -XX:-UseCounterDecay -XX:-ShowMessageBoxOnError -XX:+UnlockDiagnosticVMOptions -DCompileTheWorldStartAt=0 -XX:+WhiteBoxAPI -Xbootclasspath/a:. --add-exports=java.base/jdk.internal.jimage=ALL-UNNAMED --add-exports=java.base/jdk.internal.misc=ALL-UNNAMED --add-exports=java.base/jdk.internal.reflect=ALL-UNNAMED -XX:+LogCompilation -XX:LogFile=hotspot_modules_0_%p.log -XX:ErrorFile=hs_err_modules_0_%p.log -XX:ReplayDataFile=replay_modules_0_%p.log -XX:CompileCommand=exclude,java/lang/invoke/MethodHandle.* sun.hotspot.tools.ctw.CompileTheWorld /scratch/mesos/jib-master/install/2017-10-07-0300098.jesper.wilhelmsson.hs/macosx-x64-debug.jdk/jdk-10/fastdebug/lib/modules

Host: scaaa985.us.oracle.com, MacPro6,1 x86_64 3700 MHz, 8 cores, 16G, Darwin 14.5.0
Time: Sat Oct  7 03:25:45 2017 GMT elapsed time: 398 seconds (0d 0h 6m 38s)

---------------  T H R E A D  ---------------

Current thread (0x00007fcbf9034800):  JavaThread "C2 CompilerThread0" daemon [_thread_in_native, id=25091, stack(0x0000000123fa0000,0x00000001240a0000)]


Current CompileTask:
C2: 398069 74084    b        com.sun.imageio.plugins.png.CRC::<clinit> (67 bytes)

Stack: [0x0000000123fa0000,0x00000001240a0000],  sp=0x000000012409a540,  free space=1001k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.dylib+0xceafee]  VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x4e0
V  [libjvm.dylib+0xceb7bc]  VMError::report_and_die(Thread*, char const*, int, char const*, char const*, __va_list_tag*)+0x4a
V  [libjvm.dylib+0x4bfc47]  report_vm_error(char const*, int, char const*, char const*, ...)+0xcd
V  [libjvm.dylib+0xc22d21]  SuperWord::mem_slice_preds(Node*, Node*, GrowableArray<Node*>&)+0x351
V  [libjvm.dylib+0xc1d25b]  SuperWord::dependence_graph()+0x17b
V  [libjvm.dylib+0xc1b532]  SuperWord::SLP_extract()+0xda
V  [libjvm.dylib+0xc1b29b]  SuperWord::transform_loop(IdealLoopTree*, bool)+0x573
V  [libjvm.dylib+0x96ca58]  PhaseIdealLoop::build_and_optimize(bool, bool)+0xc90
V  [libjvm.dylib+0x45b280]  Compile::Optimize()+0xf4e
V  [libjvm.dylib+0x45920b]  Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool, bool, DirectiveSet*)+0xbeb
V  [libjvm.dylib+0x45bef5]  Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool, bool, DirectiveSet*)+0x31
V  [libjvm.dylib+0x3552d0]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, DirectiveSet*)+0x130
V  [libjvm.dylib+0x46cc83]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x735
V  [libjvm.dylib+0x46c20b]  CompileBroker::compiler_thread_loop()+0x2dd
V  [libjvm.dylib+0xc797bf]  JavaThread::thread_main_inner()+0x1ed
V  [libjvm.dylib+0xc792c9]  JavaThread::run()+0x3c1
V  [libjvm.dylib+0xad3229]  thread_native_entry(Thread*)+0x12b
C  [libsystem_pthread.dylib+0x405a]  _pthread_body+0x83
C  [libsystem_pthread.dylib+0x3fd7]  _pthread_body+0x0
C  [libsystem_pthread.dylib+0x13ed]  thread_start+0xd

[error occurred during error reporting (printing native stack), id 0xe0000000]


Comments
Summary: C2 moves stores out of a loop by creating clones at all loop exit paths that observe the stored value. When walking up the dominator chain from observers of a store and placing clones of the store at the top (right after the loop), we may end up placing multiple stores at the same location. This confuses the SuperWord optimization and also affects performance of the generated code.

Prototype fix: Check if there is a cloned store with the same control input and re-use it.
http://cr.openjdk.java.net/~thartmann/8189067/webrev.00/

I've verified that JDK-8184995 is a duplicate. Since UseSubwordForMaxVector is not the cause but only hides the problem, I'll re-enable the optimization with this fix.
11-10-2017
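
For readers unfamiliar with the idea behind the prototype fix, here is a minimal sketch in Java of "re-use a clone that already has the same control input". It is a toy model, not the HotSpot change: ToyNode, ToyStore and StoreCloner are hypothetical names invented for illustration, and the node numbers (162 IfFalse, 186 Phi, 15 MergeMem, 300 StoreI) are borrowed from the IR dumps in this report only as sample data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for an IR node (hypothetical, not HotSpot's Node class). */
final class ToyNode {
    final int idx;
    final String name;

    ToyNode(int idx, String name) {
        this.idx = idx;
        this.name = name;
    }
}

/** Toy cloned store: remembers its control input and the users it feeds. */
final class ToyStore {
    final int idx;
    final ToyNode control;
    final List<ToyNode> users = new ArrayList<>();

    ToyStore(int idx, ToyNode control) {
        this.idx = idx;
        this.control = control;
    }
}

/** Hands out at most one cloned store per control node. */
final class StoreCloner {
    // Keyed by node identity, since ToyNode does not override equals/hashCode.
    private final Map<ToyNode, ToyStore> cloneByControl = new HashMap<>();
    private int nextIdx;

    StoreCloner(int firstFreeIdx) {
        this.nextIdx = firstFreeIdx;
    }

    /** Returns the clone placed at 'control', creating it only on first request. */
    ToyStore cloneFor(ToyNode control, ToyNode user) {
        ToyStore clone = cloneByControl.computeIfAbsent(
                control, c -> new ToyStore(nextIdx++, c));
        clone.users.add(user); // the user is rewired to the shared clone
        return clone;
    }

    int cloneCount() {
        return cloneByControl.size();
    }
}

final class StoreCloneDemo {
    public static void main(String[] args) {
        ToyNode ifFalse = new ToyNode(162, "IfFalse");
        StoreCloner cloner = new StoreCloner(300);
        ToyStore forPhi = cloner.cloneFor(ifFalse, new ToyNode(186, "Phi"));
        ToyStore forMergeMem = cloner.cloneFor(ifFalse, new ToyNode(15, "MergeMem"));
        System.out.println("same clone: " + (forPhi == forMergeMem)
                + ", clones created: " + cloner.cloneCount());
    }
}
```

In this model both users end up attached to a single store at the shared control point, so the memory slice has no second store branching off it. How the real fix locates an existing clone is described only by the webrev linked above.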

This looks like JDK-8184995 and suggests that the failure is not due to UseSubwordForMaxVector which was disabled by JDK-8185013.
10-10-2017

ILW = Impact: crash in SuperWord analysis; Likelihood: seen with CTW but very easy to reproduce; Workaround: -XX:-UseSuperWord = HHM = P1 (JDK 9 is affected as well).
10-10-2017

This is a problem with the PhaseIdealLoop::try_move_store_after_loop() optimization introduced by JDK-8080289. I was able to create a simple test that reproduces the crash:
http://cr.openjdk.java.net/~thartmann/8189067/webrev.00/test/hotspot/jtreg/compiler/loopopts/TestMoveStoresOutOfLoops.java.sdiff.html

When optimizing the loop, C2 moves the store to array4 out of the inner loop (290/201) by creating a clone for each use in the outer loop (194/193):

    154 StoreI === 290 292 152 262 [[ 292 15 186 ]] @int[int:>=0]:exact+any *, idx=6; Memory: @int[int:>=0]:NotNull:exact+any *, idx=6; !orig=[270] !jvms: Test::test @ bci:22
    292 Phi === 290 271 154 [[ 154 ]] #memory Memory: @int[int:>=0]:exact+any *, idx=6; !orig=[263],182,[180],[106] !jvms: Test::test @ bci:17
    15 MergeMem === _ 1 7 1 1 1 154 [[ 178 ]] { - - - N154:int[int:>=0]:exact+any * } Memory: @BotPTR *+bot, idx=Bot;
    186 Phi === 194 7 154 [[ 185 271 ]] #memory Memory: @int[int:>=0]:exact+any *, idx=6; !orig=[184],[63],[78],[180],[106] !jvms: Test::test @ bci:9

The control for the clones is determined by walking up the dominator chain for each use.

Control for clone for 186 Phi:

    162 IfFalse === 201 [[ 193 ]] #0 !orig=167 !jvms: Test::test @ bci:14
    193 CountedLoopEnd === 162 192 [[ 173 172 ]] [lt] P=0,998831, C=1708,000000 !orig=[171] !jvms: Test::test @ bci:6
    172 IfTrue === 193 [[ 194 ]] #1 !jvms: Test::test @ bci:6
    194 CountedLoop === 194 60 172 [[ 194 82 64 186 ]] stride: 1 !orig=[187],[61] !jvms: Test::test @ bci:9

Control for clone for 15 MergeMem:

    162 IfFalse === 201 [[ 193 300 ]] #0 !orig=167 !jvms: Test::test @ bci:14
    193 CountedLoopEnd === 162 192 [[ 173 172 ]] [lt] P=0,998831, C=1708,000000 !orig=[171] !jvms: Test::test @ bci:6
    173 IfFalse === 193 [[ 178 ]] #0 !jvms: Test::test @ bci:6

The problem is that by walking up the dominator chain from the 173 IfFalse (which is the loop exit of the outer loop), we end up moving the store into the outer loop.

As a result, both clones end up in the outer loop with the same control input:

    162 IfFalse === 201 [[ 193 300 301 ]] #0 !orig=167 !jvms: Test::test @ bci:14
    300 StoreI === 162 292 152 262 [[ 186 ]] @int[int:>=0]:exact+any *, idx=6; Memory: @int[int:>=0]:NotNull:exact+any *, idx=6; !orig=154,[270] !jvms: Test::test @ bci:22
    301 StoreI === 162 292 152 262 [[ 15 ]] @int[int:>=0]:exact+any *, idx=6; Memory: @int[int:>=0]:NotNull:exact+any *, idx=6; !orig=154,[270] !jvms: Test::test @ bci:22

This confuses the memory slice computation for the superword optimization, triggering the assert.
10-10-2017
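
To illustrate why both clones land on the same control node, the dominator walk described in the comment above can be modelled roughly as follows. This is a simplified sketch, not the code in PhaseIdealLoop: the class DominatorWalkSketch and its methods are hypothetical, the immediate-dominator map is hand-built from the control inputs shown in the dumps (each of these nodes has a single control input, which is assumed here to be its immediate dominator), and 201 is assumed to be the exit test of the inner loop.

```java
import java.util.HashMap;
import java.util.Map;

final class DominatorWalkSketch {

    /** Hand-built map (node -> immediate dominator) for the nodes in the report. */
    static Map<Integer, Integer> idoms() {
        Map<Integer, Integer> idom = new HashMap<>();
        idom.put(162, 201); // 162 IfFalse hangs off 201, assumed inner loop exit test
        idom.put(193, 162); // 193 CountedLoopEnd (outer loop)
        idom.put(172, 193); // 172 IfTrue: outer loop backedge, control of the 186 Phi use
        idom.put(173, 193); // 173 IfFalse: outer loop exit, control of the 15 MergeMem use
        return idom;
    }

    /**
     * Walks up from the control of a use and stops at the highest node that is
     * still outside the inner loop, i.e. the node directly below innerLoopEnd.
     */
    static int placeClone(Map<Integer, Integer> idom, int useControl, int innerLoopEnd) {
        int ctrl = useControl;
        while (true) {
            Integer up = idom.get(ctrl);
            if (up == null) {
                throw new IllegalStateException("no dominator recorded for node " + ctrl);
            }
            if (up == innerLoopEnd) {
                return ctrl; // one more step would re-enter the inner loop
            }
            ctrl = up;
        }
    }

    public static void main(String[] args) {
        Map<Integer, Integer> idom = idoms();
        System.out.println("clone for 186 Phi placed at " + placeClone(idom, 172, 201));
        System.out.println("clone for 15 MergeMem placed at " + placeClone(idom, 173, 201));
    }
}
```

In this model the walk from 172 and the walk from 173 both pass through 193 and stop at the same node directly below 201, which matches the two StoreI clones sharing one control input in the dump.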

attached hs_err, replay and .jtr files
09-10-2017