JDK-6776584 : Escape analysis on sparcv9: Error: before block local scheduling
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: hs14,hs24,hs25,7,8,9
  • Priority: P3
  • Status: Resolved
  • Resolution: Duplicate
  • OS: generic,solaris_10
  • CPU: generic,sparc
  • Submitted: 2008-11-26
  • Updated: 2016-07-04
  • Resolved: 2015-01-22
JDK 9: Resolved
Related Reports
Duplicate :  
Duplicate :  
Duplicate :  
Duplicate :  
Relates :  
Relates :  
Description
Nightly test fails on sparcV9 with -XX:+DoEscapeAnalysis

nsk/stress/jck60/jck60017

# 
# A fatal error has been detected by the Java Runtime Environment: 
# 
#  Internal Error (/tmp/jprt/P1/B/182532.kvn/source/src/share/vm/opto/output.cpp:2407), pid=25811, tid=15 
#  Error: before block local scheduling 
# 
# Java VM: Java HotSpot(TM) 64-Bit Server VM (14.0-b07-2008-11-21-182532.kvn.hs-merge-fastdebug compiled mode solaris-sparc ) 
# If you would like to submit a bug report, please visit: 
#   http://java.sun.com/webapps/bugreport/crash.jsp 
# 
 
---------------  T H R E A D  --------------- 
 
Current thread (0x0000000100262800):  JavaThread "CompilerThread0" daemon [_thread_in_native, id=15, stack(0xffffffff2f600000,0xffffffff2f700000)] 
 
Stack: [0xffffffff2f600000,0xffffffff2f700000],  sp=0xffffffff2f6fb540,  free space=1005k 
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) 
V  [libjvm.so+0x11c1bc0] void VMError::report_and_die() + 0x6a8
V  [libjvm.so+0x57eca4] void report_assertion_failure(const char*,int,const char*) + 0x74
V  [libjvm.so+0xe56e90] void Scheduling::verify_good_schedule(Block*,const char*) + 0x7e0
V  [libjvm.so+0xe579e4] void Scheduling::ComputeRegisterAntidependencies(Block*) + 0x34
V  [libjvm.so+0xe55d98] void Scheduling::DoScheduling() + 0x5e8
V  [libjvm.so+0xe4b658] void Compile::Output() + 0xc50
V  [libjvm.so+0x4b0404] void Compile::Code_Gen() + 0x9ac
V  [libjvm.so+0x4a8dd0] Compile::Compile(ciEnv*,C2Compiler*,ciMethod*,int,bool,bool) + 0x1bc8
V  [libjvm.so+0x30b3d0] void C2Compiler::compile_method(ciEnv*,ciMethod*,int) + 0x90
V  [libjvm.so+0x4c42d4] void CompileBroker::invoke_compiler_on_method(CompileTask*) + 0x16ac
V  [libjvm.so+0x4c20cc] void CompileBroker::compiler_thread_loop() + 0x148c
V  [libjvm.so+0x10a965c] void JavaThread::thread_main_inner() + 0x214
V  [libjvm.so+0x10a9428] void JavaThread::run() + 0x518
V  [libjvm.so+0xe344c8] java_start + 0x180
 
 
Current CompileTask: 
C2: 10%  b  javasoft.sqe.tests.api.java.awt.geom.Line2DFloat.ConstructorTest.testCase4()Ljavasoft/sqe/javatest/Status; @ 131 (286 bytes) 
 
jvm_args: -Xcomp -XX:-PrintVMOptions -XX:CompileThreshold=100 -XX:-UseCompressedOops -XX:DefaultMaxRAMFraction=8 -XX:+DoEscapeAnalysis -Xverify:all  
java_command: nsk.stress.share.StressTestRunner -testList /export/local/common/testbase/6/vm/vm/src/nsk/stress/jck60//jck60017/tests -stress:indulgent

Comments
Should be gone after the fix for JDK-8068881. The values would be hooked to MachMerge and the asserts should pass.
22-01-2015

Happens very seldom and in different tests.
11-11-2014

ILW=HLM=P3
Impact: Crash
Likelihood: happens very rarely, like once in a year
Workaround: Disable EA; evaluation suggests that disabling EA might not always resolve this problem
27-06-2014

Interesting.. I'm seeing the register assigned to the constant base being trashed in JDK-8044729. Haven't found the exact path where it happens but it might be related.
27-06-2014

Release team: Approved for deferral.
14-01-2014

ILW=HML=P2
Impact: Crash
Likelihood: infrequent reproducible case
Workaround: Disable EA
Defer justification: This bug has been in the system for quite some time and is not a regression. We've been aware of this issue but have not found a good fix for this.
Risk: Low since this is a known issue that has been in the system for some time.
Target release: 9
13-01-2014

Problem reproduction with current sources:

java -d64 -Xverify:all -XX:+CompileTheWorld -XX:CompileTheWorldStartAt=8100 -XX:CompileTheWorldStopAt=8200 -XX:CICompilerCount=1 -Xmx512M -XX:ParallelGCThreads=2 -XX:+UseCompressedOops -Xbootclasspath/p:${JAVA_HOME}/jre/lib/rt.jar

CompileTheWorld : Compiling all classes in /tmp/kvn/6776584/jdk8b92/jre/lib/rt.jar
Preloading failed for (2373) com/sun/management/OSMBeanFactory
May 31, 2013 12:05:27 PM com.sun.org.apache.xml.internal.security.utils.CachedXPathFuncHereAPI fixupFunctionTable
INFO: Registering Here function
Preloading failed for (4558) com/sun/org/apache/xml/internal/serialize/HTMLSerializer
Preloading failed for (4880) com/sun/org/apache/xpath/internal/objects/XNodeSet
Preloading failed for (6697) com/sun/xml/internal/ws/binding/BindingImpl
Preloading failed for (6734) com/sun/xml/internal/ws/client/WSServiceDelegate
Preloading failed for (7048) com/sun/xml/internal/ws/model/RuntimeModeler
Preloading failed for (7064) com/sun/xml/internal/ws/model/wsdl/WSDLBoundPortTypeImpl
Preloading failed for (7558) com/sun/xml/internal/ws/wsdl/parser/RuntimeWSDLParser
CompileTheWorld (8100) : java/beans/VetoableChangeSupport$VetoableChangeListenerMap
CompileTheWorld (8101) : java/beans/VetoableChangeSupport
CompileTheWorld (8102) : java/beans/Visibility
CompileTheWorld (8103) : java/beans/WeakIdentityMap$Entry
CompileTheWorld (8104) : java/beans/WeakIdentityMap
CompileTheWorld (8105) : java/beans/XMLDecoder$1
CompileTheWorld (8106) : java/beans/XMLDecoder
CompileTheWorld (8107) : java/beans/XMLEncoder$1
CompileTheWorld (8108) : java/beans/XMLEncoder$ValueData
CompileTheWorld (8109) : java/beans/XMLEncoder
# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc:  SuppressErrorAt=/output.cpp:2613
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (6776584/src/share/vm/opto/output.cpp:2613), pid=109, tid=9
#  assert(!_reg_node[reg_lo] || edge_from_to(_reg_node[reg_lo],def)) failed: before block local scheduling
#
# JRE version: Java(TM) SE Runtime Environment (8.0-b92) (build 1.8.0-ea-fastdebug-b92)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b35-internal-debug mixed mode solaris-sparc compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /tmp/kvn/6776584/solaris_sparcv9_compiler2/debug/hs_err_pid109.log
#
# Compiler replay data is saved as:
# /tmp/kvn/6776584/solaris_sparcv9_compiler2/debug/replay_pid109.log
#
03-06-2013
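
The failing condition in the report above can be illustrated with a small stand-alone C++ sketch. It assumes, from the assert text alone, that `_reg_node` maps a register to a node already seen using it while the block is walked backwards, and that `edge_from_to(a, b)` means `b` is an input of `a`. `ToyNode` and `first_bad_use` are hypothetical names invented for this illustration and are not HotSpot code; the register name "XMM0" is likewise an assumption.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy node for illustration only; not a HotSpot class.
struct ToyNode {
    int id;                          // node number as printed in a dump
    std::string reg;                 // register holding the result ("" if none)
    std::vector<const ToyNode*> in;  // input (def) edges
};

// True if `to` is one of the inputs of `from`.
bool edge_from_to(const ToyNode* from, const ToyNode* to) {
    for (const ToyNode* input : from->in) {
        if (input == to) return true;
    }
    return false;
}

// Walk the block backwards. For every use of a def living in register R,
// require that any later node already recorded as a user of R also takes
// that same def as an input. Returns the id of the first node that breaks
// the rule, or -1 if the block passes.
int first_bad_use(const std::vector<const ToyNode*>& block) {
    std::map<std::string, const ToyNode*> reg_node;  // register -> later user
    for (auto it = block.rbegin(); it != block.rend(); ++it) {
        const ToyNode* n = *it;
        if (!n->reg.empty()) reg_node.erase(n->reg);  // a def ends the live use
        for (const ToyNode* def : n->in) {
            if (def->reg.empty()) continue;
            auto later = reg_node.find(def->reg);
            if (later != reg_node.end() && !edge_from_to(later->second, def)) {
                return n->id;  // two different nodes live in one register
            }
            reg_node[def->reg] = n;
        }
    }
    return -1;
}
```

With a Phi and one of its inputs assigned to the same register, a compare reading the Phi and a move reading the input directly, the sketch flags the compare; rewiring the move to read the Phi makes the block pass. This is the def/use versus register-assignment mismatch the thread below calls benign.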

Failure with symptoms of 8005531 seen with bigapps/Weblogic+medrec/stability. Adding it here since 8005531 was closed as a duplicate of this CR.
17-01-2013

EVALUATION

On Dec 8, 2008, at 10:49 AM, Vladimir Kozlov wrote:

> Tom,
>
> in cases I see it is the same value at the same register.
> It starts from next code before RA, for example:
>
> [t@8 l@8]: print C->root()->find(806)->dump(2)
> 515 Phi === 91 64 516 [[ ... ]] #float !jvms: GetTest::testCase10 @ bci:103
> 673 loadConI === 18 [[ 669 806 ]] #2143289344
> 809 MoveF2I_reg_reg_sse === _ 515 [[ 806 ]]
> 807 cmpX_cc === _ 515 515 [[ 808 806 ]]
> 806 cmovI_regU === _ 807 809 673 [[ 302 815 816 ]] ne !jvms: FloatingDecimal::<init> @ bci:20 AbstractStringBuilder::append @ bci:5 StringBuilder::append @ bci:2 GetTest::testCase10 @ bci:201
>
> And then RA inserts a lot of spill phis and copies. It is benign as you said.
> So do you agree that we only need to fix the check in verify_good_schedule()?

I'm not sure for this case. I definitely want to relax that assert for the int/oop problem I'm seeing but it disturbs me that this case has recently shown up. I think you're right that it's connected to the changes I made in reg_split.cpp but I have to believe that there's nothing besides chance keeping it from happening more. I played with making a change like this:

diff --git a/src/share/vm/opto/reg_split.cpp b/src/share/vm/opto/reg_split.cpp
--- a/src/share/vm/opto/reg_split.cpp
+++ b/src/share/vm/opto/reg_split.cpp
@@ -283,6 +283,13 @@ Node *PhaseChaitin::split_Rematerialize(
     Node *in = def->in(i);
     // Check for single-def (LRG cannot redefined)
     uint lidx = n2lidx(in);
+    if (walkThru) {
+      while ( in->is_SpillCopy() && lidx >= _maxlrg ) {
+        in = in->in(1);
+        lidx = Find_id(in);
+      }
+    }
+
     if( lidx >= _maxlrg ) continue; // Value is a recent spill-copy
     if (lrgs(lidx).is_singledef()) continue;

to force a new copy to be made when the multi def problem shows up.

It fixes this problem though I think we end up with more copies than we need since it will force a new MSC for the use and I don't see why we couldn't just end up assigning the new copy to the same register and end up with the same problem. I can see that the reason post allocate copy removal can't eliminate these copies is that the skip_copies code doesn't have a very sophisticated notion of copies. It gives up as soon as it hits the phi instead of figuring out that all the phis and copies together just refer back to one value. For example we have this:

2335 Phi === 306 2815 2314 [[ 817 2417 2357 2816 2417 817 ]] #float
817 cmpX_cc === _ 2335 2335 [[ 818 816 ]]
818 MachProj === 817 [[]] #1
2337 MoveF2I_reg_reg_sse === _ 2314 [[ 816 ]] !orig=[819]

where 2314 and 2335 are in the same register. Since 2314 isn't related to 2335 the logic that determines the value of that input decides it's 2189 instead of 2335 when if you search through the inputs to 2335 they are both copies of 2189. If post allocate considered their value to be the same then it would have switched 2337 to using 2335 and everything would be ok. I think part of the problem is that the live ranges don't make a distinction between multiple valued LRGs and ones where there's only one actual value but multiple MSCs and Phis to move that value around.

tom

>
>
> Thanks,
> Vladimir
>
> Tom Rodriguez wrote:
>> That looks like a hack. I believe that post allocate copy removal should already be fixing this up and I'm wondering why it's missing this case. Maybe the post allocate copy removal change you got from me is somehow causing it. I've been looking into a case that the Itanium folks reported where we were hitting this same assert. I tried to reproduce it by turning on OptoScheduling on x86 and running a full CTW. I hit a couple failures which all tie back to problems with handling of int to oop conversions and coalescing.
>> The thing I learned that is new and relates to what you are seeing is that the register allocator doesn't consider a copy to interfere with it's input, so it's okay to assign them to the same register. This means that right after allocation it's very likely that the def/use edges and the register assignments won't match in exactly the way you are seeing. Post allocate copy removal is doing an abstract interpretation and so it should be finding the earliest def and replacing uses with that def. What happens in the case I'm seeing is that skip_copies won't skip through copies that change the oopness of a value so it doesn't fix up a use, resulting in simultaneous live uses of two different values in the same register. It's benign since the execution will work correctly.
>> tom
>> On Dec 5, 2008, at 8:41 PM, Vladimir Kozlov wrote:
>>>
>>> I fixed it by patching inputs in post_allocate_copy_removal()
>>>
>>> @@ -595,5 +595,29 @@ void PhaseChaitin::post_allocate_copy_re
>>>
>>>      } // End of for all instructions in the block
>>>
>>> +    for( j = 1; j < phi_dex; j++ ) {
>>> +      uint k;
>>> +      Node *phi = b->_nodes[j];
>>> +      for( k=1; k<phi->req(); k++ ) {
>>> +        Node *x = phi->in(k);
>>> +        // Look for usage of this input by other nodes in this block
>>> +        // and replace it with this phi.
>>> +        if (x != phi && x->outcnt() > 1 && _cfg._bbs[x->_idx] != b) {
>>> +          for ( uint m = 0; m < x->outcnt(); ) {
>>> +            bool get_next = true;
>>> +            Node* use = x->raw_out(m);
>>> +            if (use != phi && _cfg._bbs[use->_idx] == b) {
>>> +              for (uint l = 0; l < use->req(); l++) {
>>> +                if (use->in(l) == x) {
>>> +                  use->set_req(l, phi);
>>> +                  get_next = false;
>>> +                }
>>> +              }
>>> +            }
>>> +            if (get_next) m++;
>>> +          }
>>> +        }
>>> +      }
>>> +    }
>>>    } // End for all blocks
>>>  }
>>>
>>>
>>> And here the fix for the problem when Op_CreateEx is not first instruction:
>>>
>>> +++ b/src/share/vm/opto/reg_split.cpp
>>> @@ -94,17 +94,19 @@ void PhaseChaitin::insert_proj( Block *b
>>>  void PhaseChaitin::insert_proj( Block *b, uint i, Node *spill, uint maxlrg ) {
>>>    // Skip intervening ProjNodes. Do not insert between a ProjNode and
>>>    // its definer.
>>>    while( i < b->_nodes.size() &&
>>>           (b->_nodes[i]->is_Proj() ||
>>> -          b->_nodes[i]->is_Phi() ) )
>>> +          b->_nodes[i]->is_Phi() ||
>>> +          (b->_nodes[i]->is_Mach() &&
>>> +           b->_nodes[i]->as_Mach()->ideal_Opcode() == Op_CreateEx)) )
>>>      i++;
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> Vladimir Kozlov wrote:
>>>> Tom,
>>>> The problem is caused by your fix for 6732194 in
>>>> split_Rematerialize():
>>>>   if (lidx < _maxlrg && lrgs(lidx).is_multidef()) {
>>>>     // walkThru found a multidef LRG, which is unsafe to use, so
>>>>     // just keep the original def used in the clone.
>>>>     in = spill->in(i);
>>>>     lidx = Find_id(in);
>>>>   }
>>>> This code prevents usage of the new spill Phi in the current
>>>> block as input for the new spill node (next set_req() is not called
>>>> since the original def is SpillCopy):
>>>>   if( lidx < _maxlrg && lrgs(lidx).reg() >= LRG::SPILL_REG ) {
>>>>     Node *rdef = Reachblock[lrg2reach[lidx]];
>>>>     if( rdef ) spill->set_req(i,rdef);
>>>> As result spill node keep SpillCopy node input which is also used by new
>>>> spill Phi.
>>>> L634/N2194 Phi === L0/N337 L634/N2757 L634/N2173 #float
>>>> L127[EFLAGS]/N664 cmpX_cc === _ L634[XMM0a,XMM1a,XMM2a,XMM3a,XMM4a,XMM5a,XMM6a,XMM7a]/N2194 L634[XMM0a,XMM1a,XMM2a,XMM3a,XMM4a,XMM5a,XMM6a,XMM7a]/N2194
>>>> L656[ECX-ESI]/N2196 MoveF2I_reg_reg_sse === _ L634[XMM0a,XMM1a,XMM2a,XMM3a,XMM4a,XMM5a,XMM6a,XMM7a]/N2173 Spill_1 Spill_2
>>>> Vladimir
07-01-2012
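
The observation in the thread above, that copy skipping gives up at a phi even when every phi input traces back to one value, can be sketched as follows. This is a hedged toy model, not the HotSpot `skip_copies` code: `VNode`, `naive_skip_copies` and `underlying_value` are hypothetical names, and the graph shape (2815 and 2314 as direct copies of 2189) is an assumption made for the example.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical toy node kinds for illustration; not HotSpot classes.
enum class Kind { Value, Copy, Phi };

struct VNode {
    int id;
    Kind kind;
    std::vector<const VNode*> in;  // data inputs only (no control input)
};

// Follows copies only and stops at the first phi it meets.
const VNode* naive_skip_copies(const VNode* n) {
    while (n->kind == Kind::Copy) n = n->in[0];
    return n;
}

// Collects every real value reachable through copies and phis.
// The visited set keeps the walk finite on loop phis.
void collect_roots(const VNode* n,
                   std::set<const VNode*>& visited,
                   std::set<const VNode*>& roots) {
    if (!visited.insert(n).second) return;
    if (n->kind == Kind::Value) {
        roots.insert(n);
        return;
    }
    for (const VNode* input : n->in) collect_roots(input, visited, roots);
}

// If all copies and phis behind `n` lead to exactly one value, return it;
// otherwise `n` is a real merge and stands for itself.
const VNode* underlying_value(const VNode* n) {
    std::set<const VNode*> visited;
    std::set<const VNode*> roots;
    collect_roots(n, visited, roots);
    return roots.size() == 1 ? *roots.begin() : n;
}
```

For a phi 2335 whose inputs 2815 and 2314 are both copies of 2189, `naive_skip_copies` reports 2335 for the phi but 2189 for 2314, so the two look unrelated; `underlying_value` reports 2189 for both. Whether such a walk would be cheap enough or safe inside post allocate copy removal is not something this sketch addresses.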