JDK-4753265 : [1.4.0_02] Crash in 64 bit HotSpot Server JVM, backport 4746263
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 1.4.0_03
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • OS: solaris_8
  • CPU: sparc
  • Submitted: 2002-09-25
  • Updated: 2009-06-25
  • Resolved: 2003-01-07
Other
1.4.0_04    04    Fixed
Related Reports
Relates :  
Description
The customer is trying to benchmark their application using 64-bit J2SE 1.4
on E4500 and E10K systems, but is running into crashes.  They have tried the
64-bit versions of 1.4.0_02, early access 1.4.0_03 and 1.4.1, and the problem
occurs in all of them with the server HotSpot compiler.  With 1.4.0_02
-client using the 32-bit binary the crashes do not occur, but performance is
only 40% of what the 64-bit -server VM can provide.

The first crash message below is seen consistently using 64-Bit Server VM
1.4.0_03-ea-b01. The customer also tested with a 64-bit fastdebug build of
1.4.0_03 and hit the assertion failure
assert(top <= end, "pointers out of order"), shown in the second message.
The stack trace is located in the attachments, and the actual core files can
be provided upon request.

Unexpected Signal : 10 occurred at PC=0xFFFFFFFF39A3001C
Function=printString (compiled Java code)
Library=(N/A)

Current Java thread:

Dynamic libraries:
0x100000000 	/c1t1/tal/jre/14003/bin/sparcv9/java
0xffffffff7f300000 	/usr/lib/64/libthread.so.1
0xffffffff7f500000 	/usr/lib/64/libdl.so.1
0xffffffff7ef00000 	/usr/lib/64/libc.so.1
0xffffffff7ee00000 	/usr/platform/SUNW,Ultra-Enterprise-10000/lib/sparcv9/libc_psr.so.1
0xffffffff7d000000 	/c1t1/tal/jre/14003/lib/sparcv9/server/libjvm.so
0xffffffff7ce00000 	/usr/lib/64/libCrun.so.1
0xffffffff7cc00000 	/usr/lib/64/libsocket.so.1
0xffffffff7ca00000 	/usr/lib/64/libnsl.so.1
0xffffffff7c800000 	/usr/lib/64/libm.so.1
0xffffffff7db00000 	/usr/lib/64/libw.so.1
0xffffffff7c500000 	/usr/lib/64/libmp.so.2
0xffffffff7c200000 	/c1t1/tal/jre/14003/lib/sparcv9/native_threads/libhpi.so
0xffffffff7c000000 	/c1t1/tal/jre/14003/lib/sparcv9/libverify.so
0xffffffff7be00000 	/c1t1/tal/jre/14003/lib/sparcv9/libjava.so
0xffffffff7bb00000 	/c1t1/tal/jre/14003/lib/sparcv9/libzip.so
0xfffffffee3700000 	/usr/lib/locale/en_US.ISO8859-1/sparcv9/en_US.ISO8859-1.so.2
0xfffffffee1a00000 	/t3-6/gatherer/g20/Shared/libxacct_native_sparcv9_SunOS.so
0xfffffffee1800000 	/c1t1/tal/jre/14003/lib/sparcv9/libnet.so
0xfffffffee0400000 	/t3-6/gatherer/g20/Shared/libdb_java-3.3-sparcv9-SunOS.so
0xfffffffedf400000 	/c1t1/tal/jre/14003/lib/sparcv9/libawt.so
0xfffffffedf200000 	/c1t1/tal/jre/14003/lib/sparcv9/libmlib_image.so
0xfffffffedf000000 	/c1t1/tal/jre/14003/lib/sparcv9/motif21/libmawt.so
0xfffffffedeb00000 	/usr/dt/lib/sparcv9/libXm.so.4
0xfffffffede900000 	/usr/openwin/lib/sparcv9/libXt.so.4
0xfffffffede700000 	/usr/openwin/lib/sparcv9/libXext.so.0
0xfffffffede500000 	/usr/openwin/lib/sparcv9/libXtst.so.1
0xfffffffede200000 	/usr/openwin/lib/sparcv9/libX11.so.4
0xfffffffede000000 	/usr/openwin/lib/sparcv9/libdps.so.5
0xfffffffedde00000 	/usr/openwin/lib/sparcv9/libSM.so.6
0xfffffffeddb00000 	/usr/openwin/lib/sparcv9/libICE.so.6
0xfffffffedd900000 	/usr/openwin/lib/sparcv9/libdga.so.1

Local Time = Wed Sep 18 00:34:50 2002
Elapsed Time = 577
#
# HotSpot Virtual Machine Error : 10
# Error ID : 4F530E43505002D5 01
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.4.0_03-ea-b01 mixed mode)
#
# An error report file has been saved as hs_err_pid22037.log.
# Please refer to the file for further information.
#

#
# HotSpot Virtual Machine Error, assertion failure
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.4.0_03-internal-20020821-debug mixed mode)
#
# assert(top <= end, "pointers out of order")
#
# Error ID: /net/jdk/export/jpse01/hshen/J2SE/140/hotspot/src/share/vm/memory/collectedHeap.inline.hpp, 121 [ Patched ]
#
# Problematic Thread: prio=5 tid=0x10573c2b8 nid=0x5c runnable 
#
Dumping core....


Comments
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX: 1.4.0_04
FIXED IN: 1.4.0_04
INTEGRATED IN: 1.4.0_04
14-06-2004

SUGGESTED FIX

Diffs applied to 1.4.0_03 to fix:

------- memnode.hpp -------
*** sccs.WLaiwq Fri Oct 4 14:06:18 2002
--- memnode.hpp Thu Oct 3 11:34:16 2002
***************
*** 1,5 ****
  #ifdef USE_PRAGMA_IDENT_HDR
! #pragma ident "%W% %E% %U% JVM"
  #endif
  /*
   * Copyright 1991-2002 Sun Microsystems, Inc. All rights reserved.
--- 1,5 ----
  #ifdef USE_PRAGMA_IDENT_HDR
! #pragma ident "@(#)memnode.hpp 1.88 02/10/03 11:32:42 JVM"
  #endif
  /*
   * Copyright 1991-2002 Sun Microsystems, Inc. All rights reserved.
***************
*** 192,197 ****
--- 192,207 ----
    virtual int Opcode() const;
    virtual uint ideal_reg() const { return Op_RegP; }
    virtual int store_Opcode() const { return Op_StoreP; }
+   // depends_only_on_test is almost always true, and needs to be almost always
+   // true to enable key hoisting & commoning optimizations. However, for the
+   // special case of RawPtr loads from TLS top & end, the control edge carries
+   // the dependence preventing hoisting past a Safepoint instead of the memory
+   // edge. (An unfortunate consequence of having Safepoints not set Raw
+   // Memory; itself an unfortunate consequence of having Nodes which produce
+   // results (new raw memory state) inside of loops preventing all manner of
+   // other optimizations). Basically, it's ugly but so is the alternative.
+   // See comment in graphkit.cpp, around line 1923 GraphKit::allocate_heap.
+   virtual bool depends_only_on_test() const { return adr_type() != TypeRawPtr::BOTTOM; }
  };
  //------------------------------LoadKlassNode----------------------------------
***************
*** 202,207 ****
--- 212,219 ----
    : LoadPNode(c,mem,adr,at,tk) {}
    virtual int Opcode() const;
    virtual const Type *Value( PhaseTransform *phase ) const;
+   virtual bool depends_only_on_test() const { return true; }
+
  };
  //------------------------------LoadSNode--------------------------------------
***************
*** 322,327 ****
--- 334,340 ----
    : LoadPNode(c,mem,adr,TypeRawPtr::BOTTOM, TypeRawPtr::BOTTOM) {}
    virtual int Opcode() const;
    virtual int store_Opcode() const { return Op_StorePConditional; }
+   virtual bool depends_only_on_test() const { return true; }
  };
  //------------------------------LoadLLockedNode---------------------------------

------- loopopts.cpp -------
*** sccs.Kuaqxq Fri Oct 4 14:06:58 2002
--- loopopts.cpp Mon Sep 30 12:33:33 2002
***************
*** 1,5 ****
  #ifdef USE_PRAGMA_IDENT_SRC
! #pragma ident "%W% %E% %U% JVM"
  #endif
  //
  // Copyright 1997-2002 Sun Microsystems, Inc. All rights reserved.
--- 1,5 ----
  #ifdef USE_PRAGMA_IDENT_SRC
! #pragma ident "@(#)loopopts.cpp 1.162 02/09/30 12:31:22 JVM"
  #endif
  //
  // Copyright 1997-2002 Sun Microsystems, Inc. All rights reserved.
***************
*** 629,638 ****
    // If trying to do a 'Split-If' at the loop head, it is only
    // profitable if the cmp folds up on BOTH paths. Otherwise we
    // risk peeling a loop forever.
-   // CNC - Disabled for now.
-   if( n_ctrl->is_Loop() )
-     policy = 999; // Policy requires BOTH paths to win
    // Split compare 'n' through the merge point if it is profitable
    Node *phi = split_thru_phi( n, n_ctrl, policy );
    if( !phi ) return;
--- 629,647 ----
    // If trying to do a 'Split-If' at the loop head, it is only
    // profitable if the cmp folds up on BOTH paths. Otherwise we
    // risk peeling a loop forever.
+   // CNC - Disabled for now. Requires careful handling of loop
+   // body selection for the cloned code. Also, make sure we check
+   // for any input path not being in the same loop as n_ctrl. For
+   // irreducible loops we cannot check for 'n_ctrl->is_Loop()'
+   // because the alternative loop entry points won't be converted
+   // into LoopNodes.
+   IdealLoopTree *n_loop = get_loop(n_ctrl);
+   for( uint j = 1; j < n_ctrl->req(); j++ )
+     if( get_loop(n_ctrl->in(j)) != n_loop )
+       return;
+
+   // Split compare 'n' through the merge point if it is profitable
    Node *phi = split_thru_phi( n, n_ctrl, policy );
    if( !phi ) return;

=================
###@###.### 2002-10-04
04-10-2002
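
The override pattern in the memnode.hpp diff can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `AdrType`, the stripped-down `LoadNode` hierarchy, `LockedRawLoadNode` (standing in for the raw-pointer subclass touched by the last memnode.hpp hunk, whose name the diff does not show) and `may_hoist` are hypothetical stand-ins, not HotSpot's real classes. It shows only that a pointer load answers "not hoistable" when its address type is raw, while the two subclasses opt back in.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for C2's load nodes.
// Only the depends_only_on_test() override pattern mirrors the diff.
enum class AdrType { Raw, Oop, Klass };  // stand-in for the real address types

struct LoadNode {
  AdrType adr;
  explicit LoadNode(AdrType a) : adr(a) {}
  virtual ~LoadNode() = default;
  AdrType adr_type() const { return adr; }
  // Default: the load depends only on its controlling test, so an
  // optimizer is free to hoist or common it.
  virtual bool depends_only_on_test() const { return true; }
};

struct LoadPNode : LoadNode {
  using LoadNode::LoadNode;
  // Raw-pointer loads (such as TLS top and end) are pinned by control.
  bool depends_only_on_test() const override {
    return adr_type() != AdrType::Raw;
  }
};

struct LoadKlassNode : LoadPNode {
  LoadKlassNode() : LoadPNode(AdrType::Klass) {}
  bool depends_only_on_test() const override { return true; }
};

// Hypothetical name for the raw-pointer subclass in the last hunk.
struct LockedRawLoadNode : LoadPNode {
  LockedRawLoadNode() : LoadPNode(AdrType::Raw) {}
  bool depends_only_on_test() const override { return true; }
};

// Hypothetical query: may this load be moved above a safepoint?
inline bool may_hoist(const LoadNode& n) { return n.depends_only_on_test(); }
```

In this model `may_hoist(LoadPNode(AdrType::Raw))` is false while the other three cases are true, which matches the intent stated in the diff's comment: only raw-pointer loads such as TLS top and end stay pinned.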

EVALUATION

This is the basic stuff :

HeapWord*CollectedHeap::common_mem_allocate_noinit

ffffffff7c4bb978 HeapWord*CollectedHeap::common_mem_allocate_noinit(unsigned long,Thread*) (5, 10573c2b8, fffffffed5c00c40, ffffffff7ce31112, ffffffff7c8688e4,0) + 80
ffffffff7c4badb8 oopDesc*CollectedHeap::obj_allocate(KlassHandle,int,Thread*) (10573c5b8, 5, 10573c2b8, 1f, ffffffff7c8688e4, 0) + c0
ffffffff7c4b30c8 instanceOopDesc*instanceKlass::allocate_instance(Thread*) (ffffffff3300bfb8, 10573c2b8, 0, 0, 2, 1001ced58) + 4b8
ffffffff7cd07f40 void OptoRuntime::new_C(klassOopDesc*,JavaThread*) (ffffffff3300bfa8, 10573c2b8, 0, 0, 30, fffffffefaeba558) + 1a0
ffffffff38998bbc ???????? (ffffffff3300bfa8, fffffffefaeba558, 2, fffffffefaeba540, 2, 86)

# assert(top <= end, "pointers out of order")
#
# Error ID: /net/jdk/export/jpse01/hshen/J2SE/140/hotspot/src/share/vm/memory/collectedHeap.inline.hpp, 121 [ Patched ]
#

115 HeapWord* CollectedHeap::allocate_from_tlab(Thread* thread, size_t size) {
116   assert(UseTLAB, "should use UseTLAB");
117
118   HeapWord* top = thread->tlab().top();
119   HeapWord* end = thread->tlab().end();
120
121   assert(top <= end, "pointers out of order");
122
123   if (pointer_delta(end, top) >= size) {
124     // successful thread-local allocation
125     if (!ZeroTLAB) {
126       // need to clear individual objects
127       Memory::set_words(top, size);
128     }
129     // This addition is safe because we know that top is
130     // at least size below end, so the add can't wrap.
131     thread->tlab().set_top(top + size);
132
133     assert(thread->tlab().invariants(), "TLAB integrity violated");
134     return top;
135   }
136   // Otherwise...
137   return allocate_from_tlab_slow(thread, size);
138 }

This is the basic stuff in the bug report: Awaiting core file /machine access... cycling home now..

###@###.### 2002-09-25
-----
-----
###@###.### 2002-09-25

Hitting this assert in 1.4.1 is probably an occurrence of the following bug

4746263 JDK 1.4.1 dumps core during ECperf; fails debug assert top <= end

A fix for this is going into 1.4.1_01 and is available in HotSpot's current main/baseline

/net/balvenie/export/imgr_home/archive/main/baseline/20020920073822.mpal.c2_merge_20020919/product.tgz

-----
-----
###@###.### 2002-09-27

Customer has confirmed that crashes no longer occur with the fix for 4746263 in 1.4.1_01 and would like the fix backported to the next 1.4.0_x as well.

===============================
###@###.### 2002-10-03

Unfortunately the following does not appear to be sufficient to fix this in 1.4.0 :

------- memnode.hpp -------
*** sccs.HmayZJ Thu Oct 3 10:35:43 2002
--- memnode.hpp Mon Sep 30 12:33:33 2002
***************
*** 1,5 ****
  #ifdef USE_PRAGMA_IDENT_HDR
! #pragma ident "%W% %E% %U% JVM"
  #endif
  /*
   * Copyright 1991-2002 Sun Microsystems, Inc. All rights reserved.
--- 1,5 ----
  #ifdef USE_PRAGMA_IDENT_HDR
! #pragma ident "@(#)memnode.hpp 1.87 02/09/30 12:31:36 JVM"
  #endif
  /*
   * Copyright 1991-2002 Sun Microsystems, Inc. All rights reserved.
***************
*** 115,120 ****
--- 115,130 ----
    virtual uint ideal_reg() const { return Op_RegI; }
    virtual Node *Ideal(PhaseGVN *phase, bool can_reshape);
    virtual int store_Opcode() const { return Op_StoreB; }
+   // depends_only_on_test is almost always true, and needs to be almost always
+   // true to enable key hoisting & commoning optimizations. However, for the
+   // special case of RawPtr loads from TLS top & end, the control edge carries
+   // the dependence preventing hoisting past a Safepoint instead of the memory
+   // edge. (An unfortunate consequence of having Safepoints not set Raw
+   // Memory; itself an unfortunate consequence of having Nodes which produce
+   // results (new raw memory state) inside of loops preventing all manner of
+   // other optimizations). Basically, it's ugly but so is the alternative.
+   // See comment in graphkit.cpp, around line 1923 GraphKit::allocate_heap.
+   virtual bool depends_only_on_test() const { return adr_type() != TypeRawPtr::BOTTOM; }
  };
  //------------------------------LoadCNode--------------------------------------
***************
*** 148,153 ****
--- 158,164 ----
    : LoadINode(c,mem,adr,TypeAryPtr::RANGE,ti) {}
    virtual int Opcode() const;
    virtual const Type *Value( PhaseTransform *phase ) const;
+   virtual bool depends_only_on_test() const { return true; }
  };
  //------------------------------LoadLNode--------------------------------------
***************
*** 322,327 ****
--- 333,339 ----
    : LoadPNode(c,mem,adr,TypeRawPtr::BOTTOM, TypeRawPtr::BOTTOM) {}
    virtual int Opcode() const;
    virtual int store_Opcode() const { return Op_StorePConditional; }
+   virtual bool depends_only_on_test() const { return true; }
  };
  //------------------------------LoadLLockedNode---------------------------------

------- loopopts.cpp -------
*** sccs.pcai0J Thu Oct 3 10:36:19 2002
--- loopopts.cpp Mon Sep 30 12:33:33 2002
***************
*** 1,5 ****
  #ifdef USE_PRAGMA_IDENT_SRC
! #pragma ident "%W% %E% %U% JVM"
  #endif
  //
  // Copyright 1997-2002 Sun Microsystems, Inc. All rights reserved.
--- 1,5 ----
  #ifdef USE_PRAGMA_IDENT_SRC
! #pragma ident "@(#)loopopts.cpp 1.162 02/09/30 12:31:22 JVM"
  #endif
  //
  // Copyright 1997-2002 Sun Microsystems, Inc. All rights reserved.
***************
*** 629,638 ****
    // If trying to do a 'Split-If' at the loop head, it is only
    // profitable if the cmp folds up on BOTH paths. Otherwise we
    // risk peeling a loop forever.
-   // CNC - Disabled for now.
-   if( n_ctrl->is_Loop() )
-     policy = 999; // Policy requires BOTH paths to win
    // Split compare 'n' through the merge point if it is profitable
    Node *phi = split_thru_phi( n, n_ctrl, policy );
    if( !phi ) return;
--- 629,647 ----
    // If trying to do a 'Split-If' at the loop head, it is only
    // profitable if the cmp folds up on BOTH paths. Otherwise we
    // risk peeling a loop forever.
+   // CNC - Disabled for now. Requires careful handling of loop
+   // body selection for the cloned code. Also, make sure we check
+   // for any input path not being in the same loop as n_ctrl. For
+   // irreducible loops we cannot check for 'n_ctrl->is_Loop()'
+   // because the alternative loop entry points won't be converted
+   // into LoopNodes.
+   IdealLoopTree *n_loop = get_loop(n_ctrl);
+   for( uint j = 1; j < n_ctrl->req(); j++ )
+     if( get_loop(n_ctrl->in(j)) != n_loop )
+       return;
+
+   // Split compare 'n' through the merge point if it is profitable
    Node *phi = split_thru_phi( n, n_ctrl, policy );
    if( !phi ) return;

Any suggestions? Here's the stack/info after the above with the same assert:

# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.4.0+4753265-TEST+20020930.152537 +chrisph-debug mixed mode)
#
# assert(top <= end, "pointers out of order")
#
# Error ID: /net/altair.east/terra/space5/chrisph/4746263/build/src/share/vm/memory/collectedHeap.inline.hpp, 121 [ Patched ]
#
# Problematic Thread: prio=5 tid=0x1059d8228 nid=0x38 runnable
#
Dumping core....

core '/t3-1/gatherer/g29/Gatherer/core' of 15088: /c1t1/tal/jre/14003e/bin/sparcv9/java -server -showversion -Xms1280m -
----------------- lwp# 57 / thread# 56 --------------------
...
ffffffff7c4bd800 HeapWord*CollectedHeap::common_mem_allocate_noinit(unsigned long,Thread*) (5, 1059d8228, fffffffeba000c40, ffffffff7ce3c692, ffffffff7c86b164, 0) + 80
ffffffff7c4bcc40 oopDesc*CollectedHeap::obj_allocate(KlassHandle,int,Thread*) (1004dd608, 5, 1059d8228, 1f, ffffffff7c86b164, 0) + c0
ffffffff7c4b4f5c instanceOopDesc*instanceKlass::allocate_instance(Thread*) (ffffffff3300bfb8, 1059d8228, 0, 0, 2, 1001ced98) + 4c4
ffffffff7cd0ac38 void OptoRuntime::new_C(klassOopDesc*,JavaThread*) (ffffffff3300bfa8, 1059d8228, 0, 0, 37, fffffffeda724390) + 1a0

###@###.### 2002-10-03

Hmm, I seem to have botched the backport... I am moving the code from LoadBNode to LoadPNode in memnode.hpp; we'll see if that makes the fix better.

###@###.### 2002-10-03

OK... The customer confirms the fix [See Suggested Fix.]

###@###.### 2002-10-04
03-10-2002
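
The invariant behind assert(top <= end, "pointers out of order") can be shown with a toy bump-pointer buffer. This is a minimal sketch, not HotSpot code: `Tlab`, `allocate`, `refill` and `stale_top_after_refill` are hypothetical names, addresses are plain word indices, and the failure scenario (a `top` value read before a safepoint and reused after the buffer was replaced) is one assumed way a load hoisted past a safepoint could break the invariant. The report itself does not spell out the exact sequence.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a thread-local allocation buffer.
// Addresses are word indices, not real pointers.
struct Tlab {
  std::size_t top;  // next free word
  std::size_t end;  // one past the last usable word
  bool invariant() const { return top <= end; }
};

// Bump-pointer allocation with the same shape as the fast path in the
// listing: succeeds and advances top only when `size` words fit.
inline bool allocate(Tlab& t, std::size_t size, std::size_t* out) {
  if (!t.invariant()) return false;  // a debug VM would assert here
  if (t.end - t.top >= size) {
    *out = t.top;
    t.top += size;
    return true;
  }
  return false;                      // a real VM would take a slow path
}

// Stand-in for a safepoint at which the buffer is replaced.
inline void refill(Tlab& t, std::size_t new_start, std::size_t words) {
  t.top = new_start;
  t.end = new_start + words;
}

// Assumed failure mode: top is read before the safepoint and written
// back afterwards, next to the refreshed end.
inline Tlab stale_top_after_refill(Tlab t, std::size_t size) {
  std::size_t cached_top = t.top;  // load moved above the safepoint
  refill(t, 100, 50);              // buffer replaced at the safepoint
  t.top = cached_top + size;       // store based on the stale value
  return t;
}
```

Starting from `Tlab{1000, 1100}`, `stale_top_after_refill` with a size of 5 leaves top at 1005 and end at 150, so `invariant()` is false, which is the condition the debug build reports.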