JDK-4979639 : JVM Crash in runtime_resolve_interface_method (CMS)

A customer using BEA WebLogic with JVM 1.4.2_01 is seeing a crash on
Solaris; the attached HotSpot log reports:

   #Error ID : 4F530E43505002EF 01
   # Please report this error at
   # http://java.sun.com/cgi-bin/bugreport.cgi
   # Java VM: Java HotSpot(TM) Server VM (1.4.2_01-b06 mixed mode)
The demangled native stack trace is as below:

_lwp_kill (6, 0, 0, ffffffff, ff3403c4, 0) + 8
abort    (ff33c008, 9757d980, 0, 4, 0, 9757d9a1) + 100
void os::abort(int) (1, ff152335, 9757da30, ff17e000, ff1c5acc, 3e7d2c) + 80
void os::handle_unexpected_exception(Thread*,int,unsigned char*,void*) (11889a8, 
a, fee10dac, 9757e798, fedda410, 0) + 2d4
JVM_handle_solaris_signal (fee10dac, 9757e798, 9757e4e0, 3400, 35a8, 0) + 91c
__sighndlr (a, 9757e798, 9757e4e0, fedda394, 0, 0) + c
call_user_handler (fa396200, 53, ff389680, 9757e4e0, 9757e798, a) + 254
sigacthandler (fa396200, 9757e798, 9757e4e0, ff388000, 9757e798, a) + 64

 --- called from signal handler with signal -96902656 (SIG Unknown) ---
e,Handle,KlassHandle,int,Thread*) (9757f
3cc, 9757e980, 9757e97c, 9757e978, 9757e974, 1) + a4
read*) (9757f3cc, 9757eaa4, 9757eaa0, a,
 11889a8, ff186c68) + 14c
Code,Thread*) (9757f3cc, 9757eb24, 9757e
b20, a, b9, 11889a8) + 210
CallInfo&,Thread*) (0, 14d5090, 9757ee
38, 9757f3c4, 9757f3cc, 11889a8) + 3b8
(9757f3c8, 11889a8, 9757f3c4, 9757f3cc,
11889a8, 0) + 48
unsigned char*OptoRuntime::handle_wrong_method_ic_miss(JavaThread*) (11889a8, 0, 
0, 0, 0, 0) + 64




###@###.### 2004-02-26

I believe the object reference in %l5 in frame 19 points to an
object that has been swept and is now a free block. As a result,
we treat the "prev" pointer of a CMS free block, which has a 1
in its lowest bit, as a klass pointer.
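
To make the reading above concrete, here is a minimal sketch that decodes the first words at the receiver address under the layout annotated in the dump quoted in this report (next pointer in the first word, "previous" pointer with a 1 in the lowest bit in the second word, length in words in the third word). The function name and the 4-byte word size are assumptions made for illustration; this is not taken from the HotSpot sources.

```python
WORD_BYTES = 4  # assumption: 32-bit VM, as the 32-bit addresses in the dump suggest


def decode_free_block(addr, words):
    """Interpret the words at `addr` using the free-block layout annotated in the dump.

    Hypothetical helper: word 0 = next pointer, word 1 = previous pointer
    tagged with a 1 in the lowest bit, word 2 = block length in words.
    """
    next_ptr, tagged_prev, size_words = words[0], words[1], words[2]
    return {
        "low_bit_set": bool(tagged_prev & 1),        # the tag that marks a free block
        "word_aligned": tagged_prev % WORD_BYTES == 0,  # a real klass pointer would be aligned
        "next": next_ptr,
        "prev": tagged_prev & ~1,                    # previous pointer with the tag removed
        "size_words": size_words,
        "end": addr + size_words * WORD_BYTES,       # first address past the block
    }


# Words taken from the dbx dump of the receiver at 0xbd528f60.
block = decode_free_block(
    0xBD528F60, [0xBD548598, 0xBD528F49, 0x00000006, 0xBD528F78]
)

for key, value in block.items():
    shown = hex(value) if isinstance(value, int) and not isinstance(value, bool) else value
    print(f"{key:12s} {shown}")
```

The second word, 0xbd528f49, has its lowest bit set and is not word-aligned, which is consistent with the bus error seen when it is dereferenced as a klass pointer. The length of 6 words also puts the end of the block at 0xbd528f78, the value that appears in the fourth word of the dump.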

> This application is using the CMS collector.
> Looks like 1.4.2_03
>    /usr/java1.4.2_03_SDK/jre/lib/sparc/server/libjvm.so
> I don't know if the problem is a bad stackmap from c2
> or an issue with CMS.  Any suggestions on what to look
> for next?
> To look at the core file:
>    rlogin merlin64
>    cd /net/jpsesvr/jpse-US4/xj111267/paulj/coredump/instinet
>    dbx
>      where
> -ross
> In the following, the receiver object is 0xbd528f60 which
> has a bad klass pointer.
> 0xbd528f60:      0xbd548598 0xbd528f49 0x00000006 0xbd528f78    CMS free list "previous" pointer
>                             ^^^^^^^^^^                          (1 in lowest bit) in 2nd word,
>                             recvr_klass (bus error)             next pointer in first word, and
>                                                                 a length in words in 3rd word
> 0xbd528f70:      0x00000000 0x00000000 0x00000001 0xf615ded0
> 0xbd528f80:      0x00000000 0xbd528f78 0xbd528f78 0x00000000
> 0xbd528f90:      0x00000001 0xf6009018 0xbd50c518 0x00000000
> 0xbd528fa0:      0x0000004c 0x00000000 0x00000001 0xf60d24d8
> 0xbd528fb0:      0x00000000 0xbd523688 0x00000000 0x00000000
> object in CMS generation:
>  par new generation   total 24512K, used 15456K [0xb6000000, 0xb7800000, 0xb7800000)
>   eden space 24448K,  63% used [0xb6000000, 0xb6f183c0, 0xb77e0000)
>   from space 64K,   0% used [0xb77e0000, 0xb77e0000, 0xb77f0000)
>   to   space 64K,   0% used [0xb77f0000, 0xb77f0000, 0xb7800000)
>  concurrent mark-sweep generation total 1024000K, used 589738K [0xb7800000, 0xf6000000,
>  concurrent-mark-sweep perm gen total 56840K, used 43533K [0xf6000000, 0xf9782000, 0xfa000000)
> Current Java thread:
>         at java.util.Collections$SynchronizedCollection.contains(Collections.java:1543)
>         - locked <0xbd5217c8> (a java.util.Collections$SynchronizedCollection)
>         at gpt.infra.server.locking.TransactionalLock.allowReader(TransactionalLock.java:40)
>         at gpt.infra.server.locking.ReadWrite.beforeRead(ReadWrite.java:61)
>         - locked <0xbd521b18> (a gpt.infra.server.locking.TransactionalLock)
>         at gpt.infra.server.locking.TransactionalLock.beforeRead(TransactionalLock.java:145)
>         - locked <0xbd521b18> (a gpt.infra.server.locking.TransactionalLock)
>         at gpt.infra.server.locking.ReadWrite$1.acquire(ReadWrite.java:143)
>         at gpt.persistence.TransactionalObject.read(TransactionalObject.java:42)
>         at gpt.ucr.execution.sr.itfiInterface.handler.impl.ITFIClientImpl.getSettlementTermId(IT
> ...
>   ---- called from signal handler with signal 10 (SIGBUS) ------
> =>[10] LinkResolver::runtime_resolve_interface_method(0x9797e254, 0x9797d7d8, 0x9797d7d4,
0x9797d7d0, 0x9797d7cc, 0x1), at 0xfee0f378
>   [11] LinkResolver::resolve_invokeinterface(0x9797e254, 0x9797d8fc, 0x9797d8f8, 0xa, 0xcfbf90,
0xff18acac), at 0xfee0f22c
>   [12] LinkResolver::resolve_invoke(0x9797e254, 0x9797d97c, 0x9797d978, 0xa, 0xb9, 0xcfbf90), at
>   [13] OptoRuntime::find_callee_info_helper(0x0, 0xadf050, 0x9797dc90, 0x9797e24c, 0x9797e254,
0xcfbf90), at 0xfeda49a8
>   [14] OptoRuntime::find_callee_info(0x9797e250, 0xcfbf90, 0x9797e24c, 0x9797e254, 0xcfbf90,
0x3), at 0xfedb0bdc
>   [15] OptoRuntime::inner_resolve_helper(0xcfbf90, 0x1, 0x0, 0x9797e2d4, 0xcfbf90, 0xfedb170c),
at 0xfedb12e8
>   [16] OptoRuntime::resolve_helper(0xcfbf90, 0x1, 0x0, 0xcfbf90, 0xb8, 0x9797e650), at 0xfedb17a4
>   [17] OptoRuntime::resolve_virtual_call_C(0xcfbf90, 0xafa61ec6, 0xefa42e, 0x4710, 0x4000,
0x9797e690), at 0xfee74f28
>   [18] 0xfa43492c(0xbd528f60, 0xb68187f0, 0x1, 0x1, 0x0, 0x0), at 0xfa43492b
> =>[19] 0xfa7ebb54(0x1, 0xb68187f0, 0xcfbf90, 0xbd44abe8, 0x4800, 0x0), at 0xfa7ebb53
>   [20] 0xfafd0e40(0xbd521b18, 0xb68187f0, 0xa, 0xff182000, 0xc, 0x9797e6f8), at 0xfafd0e3f
> ...
> 0xfa7eba90:     ld      [%l6 + 0x8], %l5
> 0xfa7eba94:     ld      [%l5 + 0x4], %l1
> 0xfa7eba98:     sethi   %hi(0xf615d400), %l0
> 0xfa7eba9c:     or      %l0, 0x370, %l0
> 0xfa7ebaa0:     cmp     %l1, %l0            <=== interesting check (%l1 bad klass (CMS free
> 0xfa7ebaa4:     bne,pn %icc,0xfa7ebb40   -----------------+
> 0xfa7ebaa8:     nop                                       |
> ....
> 0xfa7ebb40:     mov     %l5, %o0   <----------------------+
> 0xfa7ebb44:     mov     %i1, %o1
> 0xfa7ebb48:     mov     %l4, %l0
> 0xfa7ebb4c:     sethi   %hi(0x0), %g5
> 0xfa7ebb50:     add     %g5, 0x1, %g5
> 0xfa7ebb54:     call    0xfa434900      <=== OptoRuntime::resolve_virtual_call_Java()
> 0xfa7ebb58:     nop
> current thread: t@null
> current frame:  [19]
> g0-g3    0x00000000 0x00000000 0xbd528f49 0xf6031e50
> g4-g7    0xbd528f51 0x00000010 0x00000000 0xfc415a00
> o0-o3    0xbd528f60 0xb68187f0 0x00000001 0x00000001
> o4-o7    0x00000000 0x00000000 0x9797e668 0xfa7ebb54
> l0-l3    0xbd521b18 0xbd528f49 0x00000000 0xbd5217c8
> l4-l7    0xbd521b18 0xbd528f60 0xbd5217c8 0xf6046c38
> i0-i3    0x00000001 0xb68187f0 0x00cfbf90 0xbd44abe8
> i4-i7    0x00004800 0x00000000 0x9797e6d0 0xfafd0e40
> y        0x00000000
> ccr      0x00000000
> pc       0xfa7ebb54:0xfa7ebb54  call    0xfa434900
> npc      0xfee0f37c:runtime_resolve_interface_method+0xa8       cmp     %g2, %g3
> (dbx) x 0xbd528f60/8X   %l5
> 0xbd528f60:      0xbd548598 0xbd528f49 0x00000006 0xbd528f78
> 0xbd528f70:      0x00000000 0x00000000 0x00000001 0xf615ded0
> 0xbd5217a8:      0x00000001 0xf8241be8 0xbd521b18 0x00000000
> 0xbd5217b8:      0x00000001 0xf8241d00 0xbd521b18 0x00000000
> (dbx) x 0xbd5217c8/8    %l6
> 0xbd5217c8:      0x9797e6c8 0xf63465a0 0xbd528f60 0xbd5217c8
> 0xbd5217d8:      0x00000001 0xf8241be8 0xbd521c58 0x00000000
> (dbx) x 0xf615d400+0x370/8X  %l0 -- inline klass pointer --
> 0xf615d770:      0x00000001 0xf6000138 0xff1c30e8 0x00000028
> 0xf615d780:      0xf6031e50 0xf615c3c8 0xf60046e8 0xf6032800
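
The register and heap data quoted above can be cross-checked with a short sketch. It rebuilds the klass constant that the compiled code compares against (the sethi/or pair) and classifies the addresses involved by the generation bounds printed in the heap summary. The helper names are hypothetical, and reading each bracket as [low, committed end, reserved end) is an assumption about the print format; the CMS generation line is truncated in the report, so only its first two values are used.

```python
# Bounds copied from the heap summary in the report; each entry is (low, end).
GENERATIONS = {
    "par new generation": (0xB6000000, 0xB7800000),
    "concurrent mark-sweep generation": (0xB7800000, 0xF6000000),
    "concurrent-mark-sweep perm gen": (0xF6000000, 0xFA000000),
}


def sethi_or(hi_operand, low_bits):
    """Value built by 'sethi %hi(x), r' followed by 'or r, imm, r' on SPARC.

    sethi keeps the upper 22 bits of x and clears the low 10 bits;
    the or instruction then fills in the low bits.
    """
    return (hi_operand & 0xFFFFFC00) | low_bits


def region_of(addr):
    """Return the name of the generation whose address range contains addr."""
    for name, (low, end) in GENERATIONS.items():
        if low <= addr < end:
            return name
    return "outside the Java heap"


expected_klass = sethi_or(0xF615D400, 0x370)  # constant loaded into %l0
loaded_klass = 0xBD528F49                     # value found in %l1
receiver = 0xBD528F60                         # value found in %l5

print("expected klass", hex(expected_klass), "->", region_of(expected_klass))
print("loaded klass  ", hex(loaded_klass), "->", region_of(loaded_klass))
print("receiver      ", hex(receiver), "->", region_of(receiver))
print("inline check passes:", loaded_klass == expected_klass)
```

Under these assumptions the expected klass, 0xf615d770, lies in the perm gen, while the value actually loaded from the receiver's second word lies in the CMS generation next to the receiver itself. The compare at 0xfa7ebaa0 therefore fails and the code branches to the resolve call at 0xfa7ebb54.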

###@###.### 2004-02-26

-- any luck running with fastdebug and/or with heap

Also, _03 had a silent parallel task queue overflow problem (CMS only,
fixed in _05, I think), so you should turn off Parallel Remark while
debugging this problem.


(You might also want to turn off perm gen precleaning, but that
does not appear implicated in any way here..)


###@###.### 2004-02-27

The SynchronizedCollection object which contains the bad reference
"c" to the Collection object appears to be okay.  If the problem
is a bad oopmap in compiled code, then the problem may be between
the allocation of the Collection and the store into "c" of the
SynchronizedCollection object since the SynchronizedCollection object
is probably not updated after it's created.  However, the problem
could be a missed reference by CMS, which could have happened at
any time from the mutator's perspective.

###@###.### 2004-03-16:
Current experiments appear to have narrowed this down to either
of two previously reported bugs: 4975054 (less likely) and/or
4985197 (more likely). The customer is keeping the current
process with the CMS workarounds above under observation for a
few more days.


This bug has not seen any updates or activity for over a year.
Meanwhile, bugs 4975054 and 4985197 have both been fixed in 1.4.2_05,
and the escalation associated with this bug has long since been
closed. As such, this bug is ready to be garbage-collected, as it
were, and closed as a duplicate of one of these earlier bugs.
If there are any issues, please reopen or open a new bug. Note that
tao.ma, the RE for this bug, no longer works in the CTE organization,
so if you happen to file a new bug (or reopen this one), please
leave the RE field blank so this can be looked at by someone
###@###.### 2005-05-04 01:47:36 GMT
