JDK-4979639 : JVM Crash in runtime_resolve_interface_method (CMS)
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 1.4.2_01
  • Priority: P2
  • Status: Closed
  • Resolution: Duplicate
  • OS: generic
  • CPU: sparc
  • Submitted: 2004-01-16
  • Updated: 2005-05-04
  • Resolved: 2005-05-04
Related Reports
Duplicate :  
Relates :  
A customer using BEA WebLogic with JVM 1.4.2_01 is seeing a crash on
Solaris (hotspot log attached).

   #Error ID : 4F530E43505002EF 01
   # Please report this error at
   # http://java.sun.com/cgi-bin/bugreport.cgi
   # Java VM: Java HotSpot(TM) Server VM (1.4.2_01-b06 mixed mode)
The demangled native stack trace is as below:

_lwp_kill (6, 0, 0, ffffffff, ff3403c4, 0) + 8
abort    (ff33c008, 9757d980, 0, 4, 0, 9757d9a1) + 100
void os::abort(int) (1, ff152335, 9757da30, ff17e000, ff1c5acc, 3e7d2c) + 80
void os::handle_unexpected_exception(Thread*,int,unsigned char*,void*) (11889a8, 
a, fee10dac, 9757e798, fedda410, 0) + 2d4
JVM_handle_solaris_signal (fee10dac, 9757e798, 9757e4e0, 3400, 35a8, 0) + 91c
__sighndlr (a, 9757e798, 9757e4e0, fedda394, 0, 0) + c
call_user_handler (fa396200, 53, ff389680, 9757e4e0, 9757e798, a) + 254
sigacthandler (fa396200, 9757e798, 9757e4e0, ff388000, 9757e798, a) + 64

 --- called from signal handler with signal -96902656 (SIG Unknown) ---
e,Handle,KlassHandle,int,Thread*) (9757f
3cc, 9757e980, 9757e97c, 9757e978, 9757e974, 1) + a4
read*) (9757f3cc, 9757eaa4, 9757eaa0, a,
 11889a8, ff186c68) + 14c
Code,Thread*) (9757f3cc, 9757eb24, 9757e
b20, a, b9, 11889a8) + 210
CallInfo&,Thread*) (0, 14d5090, 9757ee
38, 9757f3c4, 9757f3cc, 11889a8) + 3b8
(9757f3c8, 11889a8, 9757f3c4, 9757f3cc,
11889a8, 0) + 48
unsigned char*OptoRuntime::handle_wrong_method_ic_miss(JavaThread*) (11889a8, 0, 
0, 0, 0, 0) + 64


EVALUATION

###@###.### 2004-02-26

I believe the object reference in %l5 in frame 19 is to an object that's
been swept and now is a free block causing us to treat as a klass pointer
the "prev" pointer in a CMS free block which has a 1 in the lowest bit.

> This application is using the CMS collector.
> Looks like 1.4.2_03
>
> /usr/java1.4.2_03_SDK/jre/lib/sparc/server/libjvm.so
>
> I don't know if the problem is a bad stackmap from c2
> or an issue with CMS. Any suggestions on what to look
> for next?
>
> To look at the core file:
> rlogin merlin64
> cd /net/jpsesvr/jpse-US4/xj111267/paulj/coredump/instinet
> dbx
> where
>
> -ross
>
> In the following, the receiver object is 0xbd528f60 which
> has a bad klass pointer.
>
> 0xbd528f60: 0xbd548598 0xbd528f49 0x00000006 0xbd528f78    CMS free list "previous" pointer
>                        ^^^^^^^^^^                          (1 in lowest bit) in 2nd word,
>                        recvr_klass (bus error)             next pointer in first word, and
>                                                            a length in words in 3rd word
> 0xbd528f70: 0x00000000 0x00000000 0x00000001 0xf615ded0
> 0xbd528f80: 0x00000000 0xbd528f78 0xbd528f78 0x00000000
> 0xbd528f90: 0x00000001 0xf6009018 0xbd50c518 0x00000000
> 0xbd528fa0: 0x0000004c 0x00000000 0x00000001 0xf60d24d8
> 0xbd528fb0: 0x00000000 0xbd523688 0x00000000 0x00000000
>
> object in CMS generation:
>
> par new generation total 24512K, used 15456K [0xb6000000, 0xb7800000, 0xb7800000)
>  eden space 24448K, 63% used [0xb6000000, 0xb6f183c0, 0xb77e0000)
>  from space 64K, 0% used [0xb77e0000, 0xb77e0000, 0xb77f0000)
>  to space 64K, 0% used [0xb77f0000, 0xb77f0000, 0xb7800000)
> concurrent mark-sweep generation total 1024000K, used 589738K [0xb7800000, 0xf6000000, 0xf6000000)
> concurrent-mark-sweep perm gen total 56840K, used 43533K [0xf6000000, 0xf9782000, 0xfa000000)
>
> Current Java thread:
>   at java.util.Collections$SynchronizedCollection.contains(Collections.java:1543)
>   - locked <0xbd5217c8> (a java.util.Collections$SynchronizedCollection)
>   at gpt.infra.server.locking.TransactionalLock.allowReader(TransactionalLock.java:40)
>   at gpt.infra.server.locking.ReadWrite.beforeRead(ReadWrite.java:61)
>   - locked <0xbd521b18> (a gpt.infra.server.locking.TransactionalLock)
>   at gpt.infra.server.locking.TransactionalLock.beforeRead(TransactionalLock.java:145)
>   - locked <0xbd521b18> (a gpt.infra.server.locking.TransactionalLock)
>   at gpt.infra.server.locking.ReadWrite$1.acquire(ReadWrite.java:143)
>   at gpt.persistence.TransactionalObject.read(TransactionalObject.java:42)
>   at gpt.ucr.execution.sr.itfiInterface.handler.impl.ITFIClientImpl.getSettlementTermId(IT
> ...
> ---- called from signal handler with signal 10 (SIGBUS) ------
> =>[10] LinkResolver::runtime_resolve_interface_method(0x9797e254, 0x9797d7d8, 0x9797d7d4, 0x9797d7d0, 0x9797d7cc, 0x1), at 0xfee0f378
>   [11] LinkResolver::resolve_invokeinterface(0x9797e254, 0x9797d8fc, 0x9797d8f8, 0xa, 0xcfbf90, 0xff18acac), at 0xfee0f22c
>   [12] LinkResolver::resolve_invoke(0x9797e254, 0x9797d97c, 0x9797d978, 0xa, 0xb9, 0xcfbf90), at 0xfed39424
>   [13] OptoRuntime::find_callee_info_helper(0x0, 0xadf050, 0x9797dc90, 0x9797e24c, 0x9797e254, 0xcfbf90), at 0xfeda49a8
>   [14] OptoRuntime::find_callee_info(0x9797e250, 0xcfbf90, 0x9797e24c, 0x9797e254, 0xcfbf90, 0x3), at 0xfedb0bdc
>   [15] OptoRuntime::inner_resolve_helper(0xcfbf90, 0x1, 0x0, 0x9797e2d4, 0xcfbf90, 0xfedb170c), at 0xfedb12e8
>   [16] OptoRuntime::resolve_helper(0xcfbf90, 0x1, 0x0, 0xcfbf90, 0xb8, 0x9797e650), at 0xfedb17a4
>   [17] OptoRuntime::resolve_virtual_call_C(0xcfbf90, 0xafa61ec6, 0xefa42e, 0x4710, 0x4000, 0x9797e690), at 0xfee74f28
>   [18] 0xfa43492c(0xbd528f60, 0xb68187f0, 0x1, 0x1, 0x0, 0x0), at 0xfa43492b
> =>[19] 0xfa7ebb54(0x1, 0xb68187f0, 0xcfbf90, 0xbd44abe8, 0x4800, 0x0), at 0xfa7ebb53
>   [20] 0xfafd0e40(0xbd521b18, 0xb68187f0, 0xa, 0xff182000, 0xc, 0x9797e6f8), at 0xfafd0e3f
> ...
>
> 0xfa7eba90: ld [%l6 + 0x8], %l5
> 0xfa7eba94: ld [%l5 + 0x4], %l1
> 0xfa7eba98: sethi %hi(0xf615d400), %l0
> 0xfa7eba9c: or %l0, 0x370, %l0
> 0xfa7ebaa0: cmp %l1, %l0              <=== interesting check (%l1 bad klass (CMS free list))
> 0xfa7ebaa4: bne,pn %icc,0xfa7ebb40    -----------------+
> 0xfa7ebaa8: nop                                        |
> ....
> 0xfa7ebb40: mov %l5, %o0              <----------------+
> 0xfa7ebb44: mov %i1, %o1
> 0xfa7ebb48: mov %l4, %l0
> 0xfa7ebb4c: sethi %hi(0x0), %g5
> 0xfa7ebb50: add %g5, 0x1, %g5
> 0xfa7ebb54: call 0xfa434900           <=== OptoRuntime::resolve_virtual_call_Java()
> 0xfa7ebb58: nop
>
> current thread: t@null
> current frame: [19]
> g0-g3 0x00000000 0x00000000 0xbd528f49 0xf6031e50
> g4-g7 0xbd528f51 0x00000010 0x00000000 0xfc415a00
> o0-o3 0xbd528f60 0xb68187f0 0x00000001 0x00000001
> o4-o7 0x00000000 0x00000000 0x9797e668 0xfa7ebb54
> l0-l3 0xbd521b18 0xbd528f49 0x00000000 0xbd5217c8
> l4-l7 0xbd521b18 0xbd528f60 0xbd5217c8 0xf6046c38
> i0-i3 0x00000001 0xb68187f0 0x00cfbf90 0xbd44abe8
> i4-i7 0x00004800 0x00000000 0x9797e6d0 0xfafd0e40
> y     0x00000000
> ccr   0x00000000
> pc    0xfa7ebb54:0xfa7ebb54 call 0xfa434900
> npc   0xfee0f37c:runtime_resolve_interface_method+0xa8 cmp %g2, %g3
>
> (dbx) x 0xbd528f60/8X    %l5
> 0xbd528f60: 0xbd548598 0xbd528f49 0x00000006 0xbd528f78
> 0xbd528f70: 0x00000000 0x00000000 0x00000001 0xf615ded0
>
> 0xbd5217a8: 0x00000001 0xf8241be8 0xbd521b18 0x00000000
> 0xbd5217b8: 0x00000001 0xf8241d00 0xbd521b18 0x00000000
>
> (dbx) x 0xbd5217c8/8    %l6
> 0xbd5217c8: 0x9797e6c8 0xf63465a0 0xbd528f60 0xbd5217c8
> 0xbd5217d8: 0x00000001 0xf8241be8 0xbd521c58 0x00000000
>
> (dbx) x 0xf615d400+0x370/8X    %l0 -- inline klass pointer --
> 0xf615d770: 0x00000001 0xf6000138 0xff1c30e8 0x00000028
> 0xf615d780: 0xf6031e50 0xf615c3c8 0xf60046e8 0xf6032800

###@###.### 2004-02-26

-- any luck running with fastdebug and/or with heap verification?

Also _03 had a silent parallel task queue overflow problem (CMS only,
fixed in _05 i think); so you should turn off Parallel Remark while
debugging this problem.

   -XX:-CMSParallelRemarkEnabled

(You might also want to turn off perm gen precleaning, but that does not
appear implicated in any way here..)

   -XX:-CMSPermGenPrecleaningEnabled

###@###.### 2004-02-27

The SynchronizedCollection object which contains the bad reference "c" to
the Collection object appears to be okay. If the problem is a bad oopmap
in compiled code, then the problem may be between the allocation of the
Collection and the store into "c" of the SynchronizedCollection object
since the SynchronizedCollection object is probably not updated after
it's created. However, the problem could be a missed reference by CMS,
which could have happened at anytime from the mutator's perspective.

###@###.### 2004-03-16:

Current experiments appear to have narrowed this down to either of two
previously reported bugs: 4975054 (less likely) and/or 4985197 (more
likely). The customer is keeping the current process with the CMS
workarounds above under observation for a few more days.

-------

This bug has not seen any updates or activities now for over a year.
Meanwhile, bugs 4975054 and 4985197 have both been fixed in 1.4.2_05,
and the escalation associated with this bug has long since been closed.
As such, this bug is ready to be garbage-collected, as it were, and
closed as a duplicate of one of these earlier bugs. If there are any
issues, please reopen or open a new bug. Note that tao.ma, the RE for
this bug no longer works in the CTE organization, so if you happen to
file a new bug (or reopen this one), please leave the RE field blank so
this can be looked at by someone appropriate.

###@###.### 2005-05-04 01:47:36 GMT
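
The reasoning in the 2004-02-26 evaluation (receiver address inside the CMS generation, 2nd word carrying a 1 in the lowest bit instead of a perm gen klass address) can be replayed against the dumped words. The following is only an illustrative sketch, not HotSpot code: the class and method names are hypothetical, the heap bounds are copied from the heap summary above, and the check that a valid klass word should fall inside the perm gen range is an assumption drawn from the other klass words visible in the dump.

```java
// Hypothetical helper for reading the dump in this report; not part of HotSpot.
final class FreeBlockCheck {
    // Bounds copied from the heap summary in the report (low and reserved high).
    static final long CMS_START  = 0xb7800000L;
    static final long CMS_END    = 0xf6000000L;
    static final long PERM_START = 0xf6000000L;
    static final long PERM_END   = 0xfa000000L;

    /** True if addr lies in the half-open range [start, end). */
    public static boolean inRange(long addr, long start, long end) {
        return addr >= start && addr < end;
    }

    /** Per the evaluation, a CMS free block's 2nd word is a "prev" pointer with a 1 in the lowest bit. */
    public static boolean hasFreeBlockTag(long secondWord) {
        return (secondWord & 1L) != 0;
    }

    /** Classifies the first three words found at addr, using the layout described in the evaluation. */
    public static String classify(long addr, long[] words) {
        if (!inRange(addr, CMS_START, CMS_END)) {
            return "address is not in the CMS generation";
        }
        long second = words[1];
        if (hasFreeBlockTag(second)) {
            // Word 1 = next pointer, word 2 = tagged prev pointer, word 3 = length in words.
            return String.format("free block: next=0x%x prev=0x%x size=%d words",
                                 words[0], second & ~1L, words[2]);
        }
        if (inRange(second, PERM_START, PERM_END)) {
            return String.format("live object: klass=0x%x", second);
        }
        return "unrecognized second word";
    }

    public static void main(String[] args) {
        // Receiver address and the words at 0xbd528f60, taken from the dbx dump.
        long receiver = 0xbd528f60L;
        long[] words = {0xbd548598L, 0xbd528f49L, 0x00000006L, 0xbd528f78L};
        System.out.println(classify(receiver, words));
        // prints "free block: next=0xbd548598 prev=0xbd528f48 size=6 words"
    }
}
```

Applied to the neighbouring object at 0xbd528f78, whose 2nd word is 0xf615ded0, the same check reports a live object, which matches the other klass words in the dump all pointing into the perm gen range.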