JDK-6840775 : Multiple JVM crashes seen with 1.6.0_10 through early access of 1.6.0_14 - possibly related to GC
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 1.0,6u7-rev,6u13,6u14
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: solaris_10
  • CPU: generic,sparc
  • Submitted: 2009-05-13
  • Updated: 2011-03-10
  • Resolved: 2010-05-25
JDK 6             JDK 7      Other
6u18 b01 Fixed    7 Fixed    hs16 Fixed
Description
Customer has a J2EE application which they are migrating from Java 1.5 to Java 1.6.

After migration, they're experiencing all kinds of crashes of the JVM, with essentially the same code as they had before.

While analyzing the problem, they created a few stand-alone Java programs that reproduce some of the crashes without the entire platform code (in excess of 1.4 million lines of code) or the external libraries of the application servers.  Please see the attached jar file for the test cases.

Details on the crashes they are experiencing are posted at http://forums.sun.com/thread.jspa?threadID=5384686

The test cases show various SIGBUS crashes.  The crashes mostly occur when using CMS, but problems can also be seen when using the default GC as chosen by ergonomics.  So it is possible we have a C2 issue here as well as an issue with GC.

Customer is running the application on a Solaris 10 (non zoned) Sun Sparc T5220.

Please see comments section of the bug for the steps to reproduce these crashes.

I noticed that the crashes from the test cases occur fairly quickly with 1.6.0_10 through early access of 1.6.0_14.  In my tests, I have not yet reproduced the crash with 1.6.0_05 or 1.6.0_07, although the _05 test had to be terminated due to a system resource issue.  The _07 test has been running for about 18 hours now.  It is interesting to note that 1.6.0_05 and 1.6.0_07 use hotspot 10, while 1.6.0_10 and higher use hotspot 11.
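When comparing runs across update releases as described above, it can help to record which HotSpot build each run actually used. The following is a minimal sketch, not part of the attached test cases: the class name VmVersionReport is hypothetical, and it only reads standard system properties.

```java
// Hypothetical helper; not part of the customer's attached test cases.
final class VmVersionReport {

    /** Builds a one-line summary of the JDK and HotSpot versions of the running JVM. */
    static String describe() {
        return "java.version=" + System.getProperty("java.version")
             + " java.vm.name=" + System.getProperty("java.vm.name")
             + " java.vm.version=" + System.getProperty("java.vm.version");
    }

    public static void main(String[] args) {
        // Log this line at the start of each test run so crash logs can be
        // matched to the JDK update and HotSpot build that produced them.
        System.out.println(describe());
    }
}
```

The java.vm.version property carries the HotSpot build string, which is what distinguishes the hotspot 10 based updates from the hotspot 11 based ones.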

Comments
EVALUATION fixed in 6u18b01.
02-04-2010

EVALUATION http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/acba6af809c8
11-07-2009

EVALUATION http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/acba6af809c8
02-07-2009

EVALUATION The deoptimization problem in Arrays.copyOf(), Arrays.copyOfRange(), Object.clone() will be fixed in a separate bug: 6833129: specjvm98 fails with NullPointerException in the compiler with -XX:DeoptimizeALot
10-06-2009

EVALUATION The problem is that for the slow_arraycopy call we create a separate value for the allocation result, which lives only until the call:

    // Promote from rawptr to oop, so it looks right in the call's GC map.
    dest = _gvn.transform( new(C,2) CheckCastPPNode(control(), dest,
                                                    TypeInstPtr::NOTNULL) );

    // Edit the call's debug-info to avoid referring to original_dest.
    // (The problem with original_dest is that it isn't ready until
    // after the InitializeNode completes, but this stuff is before.)
    // Substitute in the locally valid dest_oop.
    replace_in_map(original_dest, dest);

    generate_slow_arraycopy(adr_type,
                            src, src_offset, dest, dest_offset,
                            copy_length, nargs);

And the original CheckCastPP is pinned below the call, so it lives only after the call:

    InitializeNode* init = insert_mem_bar_volatile(Op_Initialize,
                                                   Compile::AliasIdxRaw,
                                                   raw_dest)->as_Initialize();
    _gvn.hash_delete(original_dest);
    original_dest->set_req(0, control());
    _gvn.hash_find_insert(original_dest);  // put back into GVN table

So there is no live oop value (only raw) across the call that should be put into the OopMap. As a result, the newly allocated array in the copyOf() intrinsic is not put into the OopMap, and we are screwed when GC happens on the exit from the call to slow_arraycopy:

    c55   MOV    [ESP + #96],EBX     <<<<<<<<<<<<< newly allocated objarray
    c59   #checkcastPP of EBX
    c6f   MOV    ECX,EDX
    c71   XOR    EDX,EDX
    c73   MOV    [ESP + #0],EBX
    c76   XOR    EBX,EBX
    c78   MOV    [ESP + #4],EBX
    c7c   MOV    [ESP + #8],ESI
    c80   NOP    # 3 bytes pad for loops and calls
    c83   CALL,static  wrapper for: slow_arraycopy
          # java.util.ArrayList::toArray @ bci:21  L[0]=_ L[1]=_
          # com.sigma.samp.vframe.entityutil.XmlObjectSerializerBase::assocListToXmlObject @ bci:162  L[0]=_ L[1]=_ L[2]=_ L[3]=esp + #16 L[4]=_ L[5]=_ L[6]=_ L[7]=_ L[8]=_ STK[0]=esp + #16
          # OopMap{[16]=Oop off=3208}     <<<<<<<<<<<<<<<< [esp + #96] not on oopMap
    c88
    c88   B170: # B112 <- B169  Freq: 0.0162912
          # Block is sole successor of call
    c88   MOV    EBX,[ESP + #96]     <<<<<<<<<<<<<<<< restored old value
    c8c   JMP    B112
10-06-2009
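
For orientation, the Java-level calls that reach these intrinsics look roughly like the sketch below. This is not the customer's test case (that is in the attached jar) and it is not claimed to reproduce the crash. The class name CopyOfStress, the element count, and the garbage buffer size are all made up for illustration. It only exercises ArrayList.toArray(T[]), Arrays.copyOf(), Arrays.copyOfRange() and Object.clone() on object arrays while allocating short-lived garbage, so that collections can occur around the copies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stress loop; NOT the customer's attached test case.
final class CopyOfStress {

    /** Runs the copy calls `iterations` times and returns the number of elements copied. */
    static long run(int iterations) {
        List<Object> list = new ArrayList<Object>();
        for (int i = 0; i < 64; i++) {
            list.add("item-" + i);
        }
        long copied = 0;
        for (int i = 0; i < iterations; i++) {
            // toArray(T[]) with a too-small argument allocates a new array
            // through Arrays.copyOf with a runtime-supplied array class.
            Object[] viaToArray = list.toArray(new String[0]);
            Object[] viaCopyOf = Arrays.copyOf(viaToArray, viaToArray.length, Object[].class);
            Object[] viaRange = Arrays.copyOfRange(viaCopyOf, 0, viaCopyOf.length);
            Object[] viaClone = viaRange.clone();

            // Short-lived garbage so that young collections happen often.
            byte[] garbage = new byte[16 * 1024];
            garbage[0] = (byte) i;

            // A corrupted copy would show up here (if the VM survives at all).
            if (!Arrays.equals(viaToArray, viaClone)) {
                throw new IllegalStateException("copy mismatch at iteration " + i);
            }
            copied += viaClone.length;
        }
        return copied;
    }

    public static void main(String[] args) {
        System.out.println("elements copied: " + run(1000000));
    }
}
```

Whether the compiled code takes the slow_arraycopy path depends on what C2 can prove about the array types at the call site, so a loop like this is only a starting point for experiments.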

SUGGESTED FIX Disable the intrinsics Arrays.copyOf(), Arrays.copyOfRange(), Object.clone() in 1.6.0_10 through 1.6.0_14:

    diff -r 6af0a709d52b src/share/vm/runtime/globals.hpp
    --- a/src/share/vm/runtime/globals.hpp Wed Mar 11 14:16:13 2009 -0700
    +++ b/src/share/vm/runtime/globals.hpp Mon Jun 08 17:17:06 2009 -0700
    @@ -450,7 +450,7 @@ class CommandLineFlags {
               "inline Object::hashCode() native that is known to be part "  \
               "of base library DLL")                                        \
                                                                             \
    -  develop(bool, InlineObjectCopy, true,                                 \
    +  develop(bool, InlineObjectCopy, false,                                \
               "inline Object.clone and Arrays.copyOf[Range] intrinsics")    \
                                                                             \
       develop(bool, InlineNatives, true,                                    \

It will be fixed in JDK 7.
09-06-2009

EVALUATION The following intrinsics, introduced in HS11/1.6.0_10, are broken: Arrays.copyOf(), Arrays.copyOfRange(), Object.clone(). They fail on deoptimization and GC events because they have incomplete JVM state.
09-06-2009

WORK AROUND -XX:-ReduceFieldZeroing -XX:-ReduceInitialCardMarks -XX:-ReduceBulkZeroing
29-05-2009
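
Because the workaround depends on all three flags reaching the JVM, a deployment may want to confirm at startup that they were actually passed. The following is a minimal sketch under that assumption: the class name WorkaroundFlagCheck and the method missingFlags are hypothetical, and the check only compares the literal argument strings reported by the standard RuntimeMXBean.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical startup check; the flag strings are the ones from the workaround.
final class WorkaroundFlagCheck {

    static final List<String> WORKAROUND_FLAGS = Arrays.asList(
            "-XX:-ReduceFieldZeroing",
            "-XX:-ReduceInitialCardMarks",
            "-XX:-ReduceBulkZeroing");

    /** Returns the workaround flags that do not appear in the given JVM arguments. */
    static List<String> missingFlags(List<String> jvmArgs) {
        List<String> missing = new ArrayList<String>(WORKAROUND_FLAGS);
        missing.removeAll(jvmArgs);
        return missing;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the application's main method.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        List<String> missing = missingFlags(jvmArgs);
        if (missing.isEmpty()) {
            System.out.println("All workaround flags are present.");
        } else {
            System.out.println("Missing workaround flags: " + missing);
        }
    }
}
```

The comparison is purely textual, so it would not notice a flag that was set through some other mechanism than the command line.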

EVALUATION -XX:-ReduceFieldZeroing -XX:-ReduceInitialCardMarks -XX:-ReduceBulkZeroing
29-05-2009