JDK-8172850 : Anti-dependency on membar causes crash in register allocator due to invalid instruction scheduling
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 9
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2017-01-16
  • Updated: 2019-03-19
  • Resolved: 2017-01-25
Fix versions:
  • JDK 10: 10 (Fixed)
  • JDK 8: 8u202 (Fixed)
  • JDK 9: 9 b156 (Fixed)
Description
In JDK 9 PIT, closed/java/lang/Enum/CloneEnumConstant.java crashed twice, on different hosts:

;; Using jvm: "/scratch/home/aurora/CommonData/TEST_JAVA_HOME/lib/server/libjvm.so"
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f2b77d52cf0, pid=29533, tid=29602
#
# JRE version: Java(TM) SE Runtime Environment (9.0) (fastdebug build 9-internal+0-2017-01-13-173800.jesper.dev1780-hs)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 9-internal+0-2017-01-13-173800.jesper.dev1780-hs, compiled mode, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x86acf0]  RegMask::AND(RegMask const&)+0x0
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e" (or dumping to /scratch/home/aurora/sandbox/results/workDir/closed/java/lang/Enum/CloneEnumConstant/core.29533)
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

---------------  S U M M A R Y ------------

Command Line: -Dtest.src=/scratch/home/aurora/CommonData/j2se_jdk/jdk/test/closed/java/lang/Enum -Dtest.src.path=/scratch/home/aurora/CommonData/j2se_jdk/jdk/test/closed/java/lang/Enum -Dtest.classes=/scratch/home/aurora/sandbox/results/workDir/classes/15/closed/java/lang/Enum -Dtest.class.path=/scratch/home/aurora/sandbox/results/workDir/classes/15/closed/java/lang/Enum -Dtest.vm.opts= -Dtest.tool.vm.opts= -Dtest.compiler.opts= -Dtest.java.opts=-Xcomp -Xcomp -XX:MaxRAMFraction=8 -XX:+CreateCoredumpOnCrash -ea -esa -XX:CompileThreshold=100 -XX:+UnlockExperimentalVMOptions -server -XX:-TieredCompilation -XX:+IgnoreUnrecognizedVMOptions -XX:+AggressiveOpts -XX:-UseBiasedLocking -Dtest.jdk=/export/home/aurora/CommonData/TEST_JAVA_HOME -Dcompile.jdk=/export/home/aurora/CommonData/TEST_JAVA_HOME -Dtest.timeout.factor=16.0 -Dtest.modules=java.corba/com.sun.corba.se.impl.copyobject java.corba/com.sun.corba.se.spi.copyobject -Dtest.nativepath=/export/home/aurora/sandbox/JTREG_NATIVEPATH_LIBRARY_PREPARED --add-modules=java.corba --add-exports=java.corba/com.sun.corba.se.impl.copyobject=ALL-UNNAMED --add-exports=java.corba/com.sun.corba.se.spi.copyobject=ALL-UNNAMED -Xcomp -Xcomp -XX:MaxRAMFraction=8 -XX:+CreateCoredumpOnCrash -ea -esa -XX:CompileThreshold=100 -XX:+UnlockExperimentalVMOptions -XX:-TieredCompilation -XX:+IgnoreUnrecognizedVMOptions -XX:+AggressiveOpts -XX:-UseBiasedLocking -Djava.library.path=/export/home/aurora/sandbox/JTREG_NATIVEPATH_LIBRARY_PREPARED com.sun.javatest.regtest.agent.MainWrapper /scratch/home/aurora/sandbox/results/workDir/closed/java/lang/Enum/CloneEnumConstant.d/main.1.jta

Host: scaaa603.us.oracle.com, Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz, 32 cores, 251G, Oracle Linux Server release 7.0
Time: Sat Jan 14 12:53:36 2017 PST elapsed time: 44 seconds (0d 0h 0m 44s)

---------------  T H R E A D  ---------------

Current thread (0x00007f2b7053c6b0):  JavaThread "C2 CompilerThread0" daemon [_thread_in_native, id=29602, stack(0x00007f2a60690000,0x00007f2a60791000)]


Current CompileTask:
C2:  44913 5843   !b        com.sun.corba.se.impl.io.IIOPOutputStream::simpleWriteObject (176 bytes)

Stack: [0x00007f2a60690000,0x00007f2a60791000],  sp=0x00007f2a6078c0d8,  free space=1008k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x86acf0]  RegMask::AND(RegMask const&)+0x0;;  RegMask::AND(RegMask const&)+0x0
V  [libjvm.so+0x869b66]  PhaseChaitin::Register_Allocate()+0x4c6;;  PhaseChaitin::Register_Allocate()+0x4c6
V  [libjvm.so+0x9e5f79]  Compile::Code_Gen()+0x3a9;;  Compile::Code_Gen()+0x3a9
V  [libjvm.so+0x9e9b4a]  Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool, bool, DirectiveSet*)+0x130a;;  Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool, bool, DirectiveSet*)+0x130a
V  [libjvm.so+0x819062]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, DirectiveSet*)+0x2e2;;  C2Compiler::compile_method(ciEnv*, ciMethod*, int, DirectiveSet*)+0x2e2
V  [libjvm.so+0x9f5156]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x3d6;;  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x3d6
V  [libjvm.so+0x9f5df1]  CompileBroker::compiler_thread_loop()+0x2b1;;  CompileBroker::compiler_thread_loop()+0x2b1
V  [libjvm.so+0x15f8a9e]  JavaThread::thread_main_inner()+0x22e;;  JavaThread::thread_main_inner()+0x22e
V  [libjvm.so+0x15f8d2e]  JavaThread::run()+0x1ce;;  JavaThread::run()+0x1ce
V  [libjvm.so+0x1347032]  thread_native_entry(Thread*)+0x112;;  thread_native_entry(Thread*)+0x112
C  [libpthread.so.0+0x7dc5]  start_thread+0xc5


Comments
verified by PIT
26-07-2017

I'll fix this by backing out JDK-8087341 (see JDK-8173195) which was pushed long ago but the problem only shows up with the recent fix for JDK-8172145 (which is correct and should not be backed out). I'll use this bug to push the regression test and additional verification code. Removing the integration_blocker label.
24-01-2017

[~dlong], yes, could be the same issue. I'll have a look.
23-01-2017

[~tobias] Does this explain 8173143?
20-01-2017

Yes, exactly. I don't think we can have the same problem with a "normal" volatile memory access because it sets all memory after the membar (via "set_all_memory_call(..)"), i.e. the membar has a wide memory effect in the graph. I'm discussing this with Roland. It seems we have two options:
  1) Back out JDK-8087341 and redo the fix in JDK 10.
  2) Preserve the adr_type() of the membar in the MachNode and set it to AliasIdxRaw when emitting the G1 barrier.
I'll investigate how invasive 2) would be.
19-01-2017

I agree with the analysis. So the problem is with the G1 membar, which, as I understand it, should not be anti-dependent on this Load: it should only target "write to a field and the load from the card table." It should not have a wide memory effect; that was the goal of JDK-8087341. Can we use another special StoreLoad barrier for the G1 barrier? Can you discuss it with Roland? Can we have the same problem with normal volatile memory accesses that use MemBarVolatile, rather than with the G1 barrier?
18-01-2017

I found a very simple test (attached):
    private static Test f1;
    private static Test f2;
    public void m1() { }
    public void m2() { }
    public void test1(Test obj) {
        try {
            m1();
        } catch (Exception e) {
        } finally {
            f1 = obj;
        }
        f2.m2();
    }
The NULL check of f2 is represented as CmpN(LoadN(MEM), ConN) and matched to testN_mem_reg0(MEM), where MEM is the memory immediately after the m1() call. The problem now is that with JDK-8172145, the MemBarVolatile emitted for the G1 barrier at "f1 = obj" has an anti-dependency with the LoadN(MEM). As a result, we cannot move the testN_mem_reg0 to the block where the result is needed but have to schedule it right after m1(). We then fail because the RA tries to spill the flag register to make it available in the later block.
The real problem is an inconsistency: the C2 IR connects the LoadN to the memory output of the m1() call, while GCM finds an anti-dependency with the MemBarVolatile. This changed with Roland's fix for JDK-8087341. Before, the LoadN would have been connected to the memory output of the MemBarVolatile.
Using the slightly different test2 causes a bailout:
    *********************************************************
    ** Bailout: Recompile without subsuming loads **
    *********************************************************
18-01-2017
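
For readers who want to try the reproducer from the comment above, here is a minimal stand-alone sketch of it. The fields and methods are taken from that comment. The main() driver, the iteration count, and the suggested flags are assumptions, and this is not the regression test attached to the bug. On a build that contains the fix it should simply print "done".

```java
// Hypothetical stand-alone version of the reproducer quoted in the comment.
// The driver loop below is an assumption, not part of the attached test.
class Test {
    private static Test f1;
    private static Test f2;

    public void m1() { }
    public void m2() { }

    public void test1(Test obj) {
        try {
            m1();
        } catch (Exception e) {
        } finally {
            // Oop store: with G1 this is where the barrier containing the
            // MemBarVolatile is emitted, according to the analysis.
            f1 = obj;
        }
        // The implicit null check of f2 is the CmpN(LoadN(MEM), ConN) node.
        f2.m2();
    }

    public static void main(String[] args) {
        Test t = new Test();
        f2 = t;
        // Call test1 often enough that C2 compiles it even without -Xcomp.
        for (int i = 0; i < 20_000; i++) {
            t.test1(t);
        }
        System.out.println("done");
    }
}
```

A plausible way to run it on an affected build is `java -Xcomp -XX:-TieredCompilation -XX:+UseG1GC Test`, mirroring the flags in the crash log. Whether this exact driver triggers the crash or the bailout is not confirmed by the report.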

I think there is code in the RA which tries to keep nodes that produce and consume flags in the same block, so the problem could be in that code.
17-01-2017

I compared the output of -XX:+TraceOptoPipelining before and after JDK-8172145. Before, testN_mem_reg0 is scheduled in the block where the result is used (B84):
    # --- schedule_local B84, before: ---
    # 801: Region 801 13
    # 11: testN_mem_reg0 NULL 219 12 NULL
    # 12: loadConP 1 NULL NULL NULL
    # 10: jmpCon 801 11 NULL NULL
    # 9: IfTrue 10
    # 777: IfFalse 10
With JDK-8172145, testN_mem_reg0 is moved to an earlier block (B2) although hoisting is disabled due to must_clone[CmpN] = 1. The LCA is updated from B84 to B2 in PhaseCFG::insert_anti_dependences():
    # --- schedule_local B2, before: ---
    # 201: Start 201 1
    # 200: MachProj 201
    # 202: MachProj 201
    # 204: MachProj 201
    # 206: MachProj 201
    # 207: MachProj 201
    # 210: MachProj 201
    # 211: MachProj 201
    # 769: MachProj 201
    # 11: testN_mem_reg0 NULL 219 12 NULL
    # 12: loadConP 1 NULL NULL NULL
    # 240: loadConP 201 NULL NULL NULL
    # 220: tlsLoadP 201 NULL NULL NULL
    [...]
    # --- schedule_local B84, before: ---
    # 801: Region 801 13
    # 10: jmpCon 801 11 NULL NULL
    # 9: IfTrue 10
    # 777: IfFalse 10
This makes the live range of testN_mem_reg0 extremely long, causing spilling later.
17-01-2017
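
The LCA update described in the comment above can be pictured with a toy dominator tree. This is only a sketch of the general idea: an anti-dependence forces the load's latest legal block up to the lowest common ancestor of its current placement and the block of the conflicting memory writer. The names Block, idom, depth and domLca, and the block "Bmembar", are hypothetical. This is not HotSpot's implementation of PhaseCFG::insert_anti_dependences().

```java
// Toy model of raising a load's latest legal block because of an
// anti-dependence. All names here are illustrative assumptions.
class AntiDepSketch {
    static final class Block {
        final String name;
        final Block idom;  // immediate dominator, null for the root block
        final int depth;   // depth in the dominator tree

        Block(String name, Block idom) {
            this.name = name;
            this.idom = idom;
            this.depth = (idom == null) ? 0 : idom.depth + 1;
        }
    }

    // Lowest common ancestor of two blocks in the dominator tree.
    static Block domLca(Block a, Block b) {
        while (a.depth > b.depth) a = a.idom;
        while (b.depth > a.depth) b = b.idom;
        while (a != b) {
            a = a.idom;
            b = b.idom;
        }
        return a;
    }

    public static void main(String[] args) {
        Block b2 = new Block("B2", null);           // early block after the call
        Block membar = new Block("Bmembar", b2);    // hypothetical block of the membar
        Block b84 = new Block("B84", b2);           // block that consumes the flags

        Block lca = b84;                 // without the anti-dependence: stay at the use
        lca = domLca(lca, membar);       // with it: must dominate the membar's block too
        System.out.println(lca.name);    // prints "B2"
    }
}
```

In this invented layout the use block and the membar's block only share B2 as a dominator, so the load is pulled all the way up to B2, which matches the long live range seen in the trace.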

We fail in PhaseChaitin::gather_lrg_masks() because n->out_RegMask() is an invalid address ('n' is a MachSpillCopyNode). The out_RegMask() is set earlier in PhaseAggressiveCoalesce::insert_copies():
    const RegMask *rm = C->matcher()->idealreg2spillmask[m->ideal_reg()];
    copy = new MachSpillCopyNode(MachSpillCopyNode::PhiInput, m, *rm, *rm);
    // Find a good place to insert. Kinda tricky, use a subroutine
    insert_copy_with_overlap(pred,copy,phi_name,src_name);
Where m is a testN_mem_reg0 and m->ideal_reg() is Op_RegFlags:
    (gdb) print n->dump(2)
    12 loadConP === 1 [[ 1079 1074 ]] java/lang/Class:exact * Oop:java/lang/Class:exact *
    219 MachProj === 198 [[ 1080 184 221 222 173 237 238 231 230 244 245 256 267 288 289 290 291 1078 295 315 336 337 443 445 446 432 460 461 454 453 466 467 468 469 489 510 511 544 1079 554 672 674 675 661 689 690 683 682 695 696 697 698 718 739 740 751 778 786 1073 1074 1075 ]] #2/unmatched Memory: @BotPTR *+bot, idx=Bot; !jvms: IIOPOutputStream::simpleWriteObject @ bci:34
    814 jmpDir === 346 [[ 36 ]] !orig=796
    815 jmpDir === 37 [[ 36 ]] !orig=796
    1079 testN_mem_reg0 === _ 219 12 [[ 1081 ]] #112/0x0000000000000070narrowoop: NULL !orig=[11]
    1074 testN_mem_reg0 === _ 219 12 [[ 1081 ]] #112/0x0000000000000070narrowoop: NULL !orig=[11]
    36 Region === 36 815 814 [[ 36 34 520 544 552 565 566 567 568 1081 1082 ]] !jvms: IIOPOutputStream::simpleWriteObject @ bci:144
    1081 Phi === 36 1074 1079 [[ 10 ]] #int:-1..1
idealreg2spillmask does not contain a spillmask for Op_RegFlags because the flags register is non-spillable.
I did a binary search on the builds and this problem was introduced/triggered by JDK-8172145.
16-01-2017
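
The failure mode in the comment above is a table lookup that has no entry for the flags register, followed by an unchecked dereference. The sketch below models that in Java with a fail-fast guard. The enum, the table contents, and spillMaskFor are invented for illustration and do not reflect the verification code that was actually pushed.

```java
import java.util.EnumMap;
import java.util.Map;

// Toy model of a spill-mask table with no entry for the flags register.
// All names and values are hypothetical, not HotSpot's data structures.
class SpillMaskSketch {
    enum IdealReg { REG_I, REG_P, REG_FLAGS }

    static final Map<IdealReg, Long> SPILL_MASK = new EnumMap<>(IdealReg.class);
    static {
        SPILL_MASK.put(IdealReg.REG_I, 0xFFFFL);
        SPILL_MASK.put(IdealReg.REG_P, 0xFFFFL);
        // Deliberately no entry for REG_FLAGS: flags are treated as non-spillable.
    }

    // Fail with a clear message instead of handing a missing mask to the caller.
    static long spillMaskFor(IdealReg reg) {
        Long mask = SPILL_MASK.get(reg);
        if (mask == null) {
            throw new IllegalStateException("no spill mask for " + reg + ": value is not spillable");
        }
        return mask;
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(spillMaskFor(IdealReg.REG_I)));
        try {
            spillMaskFor(IdealReg.REG_FLAGS);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A check of this kind turns a crash deep inside the register allocator into an immediate, descriptive failure at the point where the copy is created.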

Steps to reproduce:
    bin/java -XX:+ReplayCompiles -XX:+ReplayIgnoreInitErrors -XX:ReplayDataFile=replay_pid29533.log -XX:MaxRAMFraction=8 --add-modules java.corba
16-01-2017

ILW = Crash in the register allocator, able to reproduce, exclude method from compilation = HLM = P2
16-01-2017