JDK-4965984 : 1.4.2: Server VM crashes during a compile
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 1.4.2_02,1.4.2_03
  • Priority: P3
  • Status: Closed
  • Resolution: Cannot Reproduce
  • OS: solaris_8
  • CPU: sparc
  • Submitted: 2003-12-10
  • Updated: 2004-04-02
  • Resolved: 2004-04-02
Description
After upgrading from Java 1.3.1 to 1.4.1 or 1.4.2, an application run with -server runs for 8-10 hours and then crashes. The call stack from the core file shows that the server VM crashed in the middle of a compile:

 ./dbx /usr/java1.4.2_02/bin/java /data/oss/etc/bin/core 
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.1' in your .dbxrc
Reading java
dbx: internal warning: writable memory segment 0xfeab0000[16384] of size 0 in core
core file header read successfully
Reading ld.so.1
Reading libthread.so.1
Reading libdl.so.1
Reading libc.so.1
Reading libc_psr.so.1
Reading libjvm.so
Reading libCrun.so.1
Reading libsocket.so.1
Reading libnsl.so.1
Reading libm.so.1
Reading libsched.so.1
Reading libw.so.1
Reading libmp.so.2
Reading libhpi.so
Reading libverify.so
Reading libjava.so
Reading libzip.so
Reading libnet.so
Reading libioser12.so
detected a multithreaded program
t@14 (l@14) terminated by signal ABRT (Abort)
0xff31f020: _lwp_kill+0x0008:   bgeu,a  _lwp_kill+0x1c
(dbx) where
current thread: t@14
=>[1] _lwp_kill(0x0, 0xe, 0x0, 0xff33c000, 0xff398000, 0x0), at 0xff31f020 
  [2] raise(0x6, 0x0, 0x0, 0xffffffff, 0xff3403bc, 0x0), at 0xff2cba74 
  [3] abort(0xff33c000, 0xd6b7d9a0, 0x0, 0x4, 0x0, 0xd6b7d9c1), at 0xff2b595c 
  [4] os::abort(0x1, 0xff15323a, 0xd6b7da50, 0xff17e000, 0xff1c58bc, 0x3e93f4), at 0xff098260 
  [5] os::handle_unexpected_exception(0x12bf58, 0xb, 0xfedbbbec, 0xd6b7e7b8, 0xfedd86c8, 0x0), at 0xff096574 
  [6] JVM_handle_solaris_signal(0xfedbbbec, 0xd6b7e7b8, 0xd6b7e500, 0x3400, 0x35ec, 0x0), at 0xfedd8f9c 
  [7] __sighndlr(0xb, 0xd6b7e7b8, 0xd6b7e500, 0xfedd864c, 0x0, 0x0), at 0xff385c18 
  [8] call_user_handler(0xff261a00, 0xe, 0xff3996a0, 0xd6b7e500, 0xd6b7e7b8, 0xb), at 0xff37f8b0 
  [9] sigacthandler(0xff261a00, 0xd6b7e7b8, 0xd6b7e500, 0xff398000, 0xd6b7e7b8, 0xb), at 0xff37fa7c 
  ---- called from signal handler with signal 11 (SIGSEGV) ------
  [10] PhaseChaitin::Split(0x6, 0x12abb8c, 0x0, 0x12abcec, 0x0, 0x12aa84c), at 0xfedbbbec 
  [11] PhaseChaitin::Register_Allocate(0xff1c6ae8, 0xff1c9bcc, 0xff1bebf4, 0xd6b7ed2c, 0x4800, 0x4ac8), at 0xfedc9d38 
  [12] Compile::Code_Gen(0xd6b7f500, 0xff1330f0, 0xd6b7f414, 0xff17e000, 0x0, 0x0), at 0xfedd2bfc 
  [13] Compile::Compile(0xff132f25, 0xb7b1b4c, 0x8ddee64, 0x90ca804, 0xffffffff, 0x1), at 0xfee007b8 
  [14] C2Compiler::compile_method(0x2b8b8, 0xd6b7fd1c, 0x0, 0xc586e8, 0xffffffff, 0x0), at 0xfedfcf7c 
  [15] CompileBroker::invoke_compiler_on_method(0x32a, 0x0, 0xffffffff, 0xff1bce50, 0xff1c9bcc, 0x12bf58), at 0xfedfc740 
  [16] CompileBroker::compiler_thread_loop(0xff13372d, 0xff1bd218, 0x12bf58, 0x12c508, 0x314e6c, 0xfee690f8), at 0xfeeac098 
  [17] JavaThread::run(0x12bf58, 0xe, 0x40, 0x0, 0x40, 0x0), at 0xfee69120 
  [18] _start(0x12bf58, 0xff261a00, 0x0, 0x0, 0x0, 0x0), at 0xfee65600 
(dbx) threads
      t@1  a  l@1   ?()   LWP suspended in  ___lwp_cond_wait() 
      t@2  b  l@2   _start()   LWP suspended in  ___lwp_cond_wait() 
      t@3  b  l@3   _start()   LWP suspended in  ___lwp_cond_wait() 
      t@4  b  l@4   _start()   LWP suspended in  ___lwp_cond_wait() 
      t@5  b  l@5   _start()   LWP suspended in  ___lwp_cond_wait() 
      t@6  b  l@6   _start()   LWP suspended in  ___lwp_cond_wait() 
      t@7  b  l@7   _start()   LWP suspended in  ___lwp_cond_wait() 
      t@8  b  l@8   _start()   LWP suspended in  ___lwp_cond_wait() 
      t@9  b  l@9   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@10  b l@10   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@11  b l@11   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@12  b l@12   _start()   sleep on 0xff1d1418  in  __lwp_park() 
     t@13  b l@13   _start()   LWP suspended in  ___lwp_cond_wait() 
o>   t@14  b l@14   _start()   signal SIGABRT in  _lwp_kill() 
     t@15  b l@15   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@16  b l@16   _start()   LWP suspended in  _libc_poll() 
     t@17  b l@17   _start()   LWP suspended in  _so_accept() 
     t@18  b l@18   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@19  b l@19   _start()   LWP suspended in  _so_accept() 
     t@20  b l@20   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@21  b l@21   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@22  b l@22   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@23  b l@23   _start()   LWP suspended in  _libc_poll() 
     t@25  b l@25   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@27  b l@27   _start()   LWP suspended in  _so_accept() 
     t@28  b l@28   _start()   LWP suspended in  _so_accept() 
     t@29  b l@29   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@30  b l@30   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@31  b l@31   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@32  b l@32   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@33  b l@33   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@34  b l@34   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@35  b l@35   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@37  b l@37   _start()   LWP suspended in  _libc_poll() 
     t@38  b l@38   _start()   LWP suspended in  _libc_poll() 
     t@39  b l@39   _start()   LWP suspended in  _libc_poll() 
     t@40  b l@40   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@41  b l@41   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@42  b l@42   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@43  b l@43   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@44  b l@44   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@45  b l@45   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@46  b l@46   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@47  b l@47   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@48  b l@48   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@49  b l@49   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@50  b l@50   _start()   LWP suspended in  _so_accept() 
     t@51  b l@51   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@52  b l@52   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@53  b l@53   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@54  b l@54   _start()   LWP suspended in  _so_accept() 
     t@56  b l@56   _start()   LWP suspended in  _libc_poll() 
     t@57  b l@57   _start()   LWP suspended in  _libc_poll() 
     t@79  b l@79   _start()   LWP suspended in  _libc_poll() 
     t@85  b l@85   _start()   LWP suspended in  _so_accept() 
     t@86  b l@86   _start()   LWP suspended in  _so_accept() 
     t@87  b l@87   _start()   LWP suspended in  _libc_read() 
     t@88  b l@88   _start()   LWP suspended in  _libc_read() 
     t@89  b l@89   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@90  b l@90   _start()   LWP suspended in  _libc_read() 
     t@91  b l@91   _start()   LWP suspended in  _libc_read() 
     t@92  b l@92   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@93  b l@93   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@94  b l@94   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@95  b l@95   _start()   LWP suspended in  _so_accept() 
     t@96  b l@96   _start()   LWP suspended in  _libc_read() 
     t@97  b l@97   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@98  b l@98   _start()   LWP suspended in  ___lwp_cond_wait() 
     t@99  b l@99   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@100  b l@100   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@101  b l@101   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@102  b l@102   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@103  b l@103   _start()   LWP suspended in  _so_accept() 
    t@104  b l@104   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@105  b l@105   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@106  b l@106   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@107  b l@107   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@108  b l@108   _start()   LWP suspended in  _libc_poll() 
    t@109  b l@109   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@110  b l@110   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@111  b l@111   _start()   LWP suspended in  _libc_poll() 
    t@113  b l@113   _start()   LWP suspended in  _libc_poll() 
    t@114  b l@114   _start()   LWP suspended in  _libc_poll() 
    t@115  b l@115   _start()   LWP suspended in  _libc_poll() 
    t@116  b l@116   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@117  b l@117   _start()   LWP suspended in  _libc_poll() 
    t@118  b l@118   _start()   LWP suspended in  _libc_poll() 
    t@132  b l@132   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@133  b l@133   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@139  b l@139   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@140  b l@140   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@141  b l@141   _start()   LWP suspended in  _libc_read() 
    t@142  b l@142   _start()   LWP suspended in  _libc_read() 
    t@143  b l@143   _start()   LWP suspended in  _libc_read() 
    t@144  b l@144   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@145  b l@145   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@146  b l@146   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@147  b l@147   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@148  b l@148   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@158  b l@158   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@162  b l@162   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@163  b l@163   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@164  b l@164   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@165  b l@165   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@166  b l@166   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@167  b l@167   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@168  b l@168   _start()   LWP suspended in  _libc_read() 
    t@169  b l@169   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@170  b l@170   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@173  b l@173   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@181  b l@181   _start()   LWP suspended in  ___lwp_cond_wait() 
    t@182  b l@182   _start()   LWP suspended in  ___lwp_cond_wait() 
(dbx) thread -blockedby
Thread t@14 is not asleep
(dbx) exit



The hs_err file shows this:

Unexpected Signal : 11 occurred at PC=0xFE1BBBEC
Function=[Unknown. Nearest: JVM_NativePath+0x8AE0]
Library=/usr/java1.4.2_02/jre/lib/sparc/server/libjvm.so

Current Java thread:

Dynamic libraries:
0x10000         java
0xff360000      /usr/lib/libthread.so.1
0xff3a0000      /usr/lib/libdl.so.1
0xff200000      /usr/lib/libc.so.1
0xff330000      /usr/platform/SUNW,Ultra-4/lib/libc_psr.so.1
0xfe000000      /usr/java1.4.2_02/jre/lib/sparc/server/libjvm.so
0xff2e0000      /usr/lib/libCrun.so.1
0xff1e0000      /usr/lib/libsocket.so.1
0xff100000      /usr/lib/libnsl.so.1
0xff0d0000      /usr/lib/libm.so.1
0xff1c0000      /usr/lib/libsched.so.1
0xff310000      /usr/lib/libw.so.1
0xff0a0000      /usr/lib/libmp.so.2
0xff070000      /usr/java1.4.2_02/jre/lib/sparc/native_threads/libhpi.so
0xff020000      /usr/java1.4.2_02/jre/lib/sparc/libverify.so
0xfe7b0000      /usr/java1.4.2_02/jre/lib/sparc/libjava.so
0xfe790000      /usr/java1.4.2_02/jre/lib/sparc/libzip.so
0xcf520000      /usr/java1.4.2_02/jre/lib/sparc/libnet.so
0xceed0000      /usr/java1.4.2_02/jre/lib/sparc/libioser12.so

Heap at VM Abort:
Heap
 par new generation   total 131008K, used 94264K [0xd7400000, 0xdf400000, 0xdf400000)
  eden space 130944K,  71% used [0xd7400000, 0xdd00e108, 0xdf3e0000)
  from space 64K,   0% used [0xdf3f0000, 0xdf3f0000, 0xdf400000)
  to   space 64K,   0% used [0xdf3e0000, 0xdf3e0000, 0xdf3f0000)
 concurrent mark-sweep generation total 393216K, used 94104K [0xdf400000, 0xf7400000, 0xf7400000)
 concurrent-mark-sweep perm gen total 19840K, used 19792K [0xf7400000, 0xf8760000, 0xf9400000)

Local Time = Fri Nov 21 05:44:28 2003
Elapsed Time = 3162
#
# HotSpot Virtual Machine Error : 11
# Error ID : 4F530E43505002EF 01
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) Server VM (1.4.2_02-b03 mixed mode)
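
The faulting PC in the hs_err file above can be attributed to a library by comparing it with the load addresses in the "Dynamic libraries" list. The following is a minimal Python sketch of that arithmetic, not a tool used in this report. The helper name `locate` is hypothetical, and because the list gives no library sizes, the sketch assumes the PC belongs to the library with the highest base address at or below it. It also compares the hs_err PC with the address of frame [10] in the dbx trace: the two differ by exactly 0xC00000, which would be consistent with the same code offset under a different load base (an inference, not something the report states).

```python
# Load addresses copied from the "Dynamic libraries" list in the hs_err file
# (only a subset of the entries is reproduced here).
LIBRARIES = {
    0xfe000000: "/usr/java1.4.2_02/jre/lib/sparc/server/libjvm.so",
    0xfe790000: "/usr/java1.4.2_02/jre/lib/sparc/libzip.so",
    0xfe7b0000: "/usr/java1.4.2_02/jre/lib/sparc/libjava.so",
    0xff200000: "/usr/lib/libc.so.1",
}


def locate(pc, libraries):
    """Return (path, offset) for the library with the highest base <= pc.

    Assumption: library sizes are unknown, so the nearest lower base wins.
    """
    candidates = [base for base in libraries if base <= pc]
    if not candidates:
        raise ValueError(f"no library mapped at or below {pc:#x}")
    base = max(candidates)
    return libraries[base], pc - base


HS_ERR_PC = 0xFE1BBBEC  # "occurred at PC=" in the hs_err file
DBX_PC = 0xFEDBBBEC     # frame [10], PhaseChaitin::Split, in the dbx trace

path, offset = locate(HS_ERR_PC, LIBRARIES)
print(path, hex(offset))        # libjvm.so path, 0x1bbbec
print(hex(DBX_PC - HS_ERR_PC))  # 0xc00000
```

The result agrees with the `Library=` line of the hs_err file, which already names the server `libjvm.so`.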

Comments
EVALUATION

It appears that during phi node splitting, we cannot find the reaching def matching a particular input to a Phi node. Specifically, at line 1220 of reg_split.cpp, Reaches[pidx][slidx] results in a NULL Node. The cause of this is not clear, and we are without a workable core file or a reproducible test case. I believe that the problem is detectable before the SEGV occurs. Once a friendly and willing customer steps forward, we can make runs with an alternate VM, dumping info when the error condition is detected. Also suspicious is presplitting around loops. We may want to make the -XX:-SplitLRGsAroundLoops option visible to the customer.
###@###.### 2004-01-08

I have offered an "optimized" VM augmented with dumps of data structures at the failure site to the customers. Using this VM, if Reaches[pidx][slidx] is NULL, voluminous debugging output will occur, and a guarantee failure will trigger. If Reaches[pidx][slidx] is bad, then we will fail in the same way as before, and indications point to a memory stomp.
###@###.### 2004-01-15

We are without a test case, and from the escalation, the customer is happy just using the workaround. Am closing the bug as not reproducible, with the hope that if the bug rears its ugly head again, it will be in a more tractable application.
###@###.### 2004-04-02
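
The detect-before-crash idea from the evaluation (check Reaches[pidx][slidx] for NULL, dump state, and stop with a guarantee failure instead of dereferencing it) can be illustrated with a toy model. This is a Python sketch only, not the HotSpot C++ code: `reaching_def`, `ReachingDefMissing`, and the sample table are hypothetical names and data, `None` stands in for a NULL Node, and the two indices are treated as opaque.

```python
class ReachingDefMissing(Exception):
    """Stands in for a VM guarantee failure (hypothetical name)."""


def reaching_def(reaches, pidx, slidx):
    """Return reaches[pidx][slidx], failing loudly if the entry is missing.

    A missing entry is reported with a dump of the surrounding row rather
    than being handed to a caller that would dereference it.
    """
    node = reaches[pidx][slidx]
    if node is None:
        dump = f"Reaches[{pidx}][{slidx}] is None; row = {reaches[pidx]!r}"
        raise ReachingDefMissing(dump)
    return node


# Illustrative table: strings stand in for Node pointers.
reaches = [
    ["def_a", None],
    ["def_b", "def_c"],
]

print(reaching_def(reaches, 1, 0))  # def_b
try:
    reaching_def(reaches, 0, 1)
except ReachingDefMissing as err:
    print("guarantee failed:", err)
```

A check of this kind only catches the missing-entry case. A present but corrupted entry, the "bad" case in the evaluation, would pass it and fail later as before.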
02-04-2004