United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
JDK-6317989 : Abort (core dumped) occurs related to deoptimization in 1.4.2_07

Details
Type:
Bug
Submit Date:
2005-08-31
Status:
Closed
Updated Date:
2014-02-27
Project Name:
JDK
Resolved Date:
2006-11-20
Component:
hotspot
OS:
linux_redhat_3.0
Sub-Component:
runtime
CPU:
x86
Priority:
P4
Resolution:
Fixed
Affected Versions:
1.4.2_07
Fixed Versions:
1.4.2_14 (b01)

Related Reports
Relates:

Sub Tasks

Description
JVM crashes something related to deoptimization.

MESSAGES :
[kronos test5]$ java -XX:+PrintGC -XX:+PrintCompilation -XX:CompileOnly=Su
spendOnDeopt.deoptee -XX:CompileOnly=SuspendOnDeopt.compileTarget -Xcomp -XX:-In
line  SuspendOnDeopt
  1   b   SuspendOnDeopt::deoptee (147 bytes)
load deopter
suspend done
[Full GC 133K->83K(1984K), 0.0090780 secs]
[Full GC 83K->83K(1984K), 0.0084180 secs]
[Full GC 83K->83K(1984K), 0.0083840 secs]
[Full GC 83K->82K(1984K), 0.0185780 secs]
  2   b   SuspendOnDeopt::compileTarget (64 bytes)

Unexpected Signal : 11 occurred at PC=0xB2FDA35A
Function=SuspendOnDeopt.compileTarget()V (compiled Java code)
Library=(N/A)

Current Java thread:

Dynamic libraries:
08048000-08056000 r-xp 00000000 03:02 5030014    /java/jdk/jdk1.4.2_07/bin/java
08056000-08059000 rw-p 0000d000 03:02 5030014    /java/jdk/jdk1.4.2_07/bin/java
aa9c1000-aa9ce000 r--s 00000000 03:02 7438459    /java/jdk/jdk1.4.2_07/jre/lib/e
xt/ldapsec.jar
aa9ce000-aaa8a000 r--s 00000000 03:02 7438458    /java/jdk/jdk1.4.2_07/jre/lib/e
xt/localedata.jar
aaa8a000-aaa8d000 r--s 00000000 03:02 7438457    /java/jdk/jdk1.4.2_07/jre/lib/e
xt/dnsns.jar
aaa8d000-aaaa9000 r--s 00000000 03:02 7438421    /java/jdk/jdk1.4.2_07/jre/lib/e
xt/sunjce_provider.jar
b4f5c000-b54b5000 r--s 00000000 03:02 6013214    /java/jdk/jdk1.4.2_07/jre/lib/c
harsets.jar
b54b5000-b54c6000 r--s 00000000 03:02 6013176    /java/jdk/jdk1.4.2_07/jre/lib/j
ce.jar
b54c6000-b55a3000 r--s 00000000 03:02 6013177    /java/jdk/jdk1.4.2_07/jre/lib/j
sse.jar
b55a3000-b55b9000 r--s 00000000 03:02 6013175    /java/jdk/jdk1.4.2_07/jre/lib/s
unrsasign.jar
b5603000-b6fac000 r--s 00000000 03:02 6013221    /java/jdk/jdk1.4.2_07/jre/lib/r
t.jar
b6fac000-b6fc0000 r-xp 00000000 03:02 6013128    /java/jdk/jdk1.4.2_07/jre/lib/i
386/libzip.so
b6fc0000-b6fc3000 rw-p 00013000 03:02 6013128    /java/jdk/jdk1.4.2_07/jre/lib/i
386/libzip.so
b6fc3000-b6fe3000 r-xp 00000000 03:02 6013126    /java/jdk/jdk1.4.2_07/jre/lib/i
386/libjava.so
b6fe3000-b6fe5000 rw-p 0001f000 03:02 6013126    /java/jdk/jdk1.4.2_07/jre/lib/i
386/libjava.so
b6fe5000-b6ff5000 r-xp 00000000 03:02 6013125    /java/jdk/jdk1.4.2_07/jre/lib/i
386/libverify.so
b6ff5000-b6ff7000 rw-p 0000f000 03:02 6013125    /java/jdk/jdk1.4.2_07/jre/lib/i
386/libverify.so
b6ff7000-b6fff000 r-xp 00000000 03:02 524330     /lib/libnss_nis-2.3.2.so
b6fff000-b7000000 rw-p 00008000 03:02 524330     /lib/libnss_nis-2.3.2.so
b7000000-b700b000 r-xp 00000000 03:02 524325     /lib/libnss_files-2.3.2.so
b700b000-b700c000 rw-p 0000a000 03:02 524325     /lib/libnss_files-2.3.2.so
b7018000-b701c000 rw-s 00000000 03:02 9748586    /tmp/hsperfdata_tbaba/9710
b701c000-b703d000 r-xp 00000000 03:02 14106628   /lib/tls/libm-2.3.2.so
b703d000-b703e000 rw-p 00020000 03:02 14106628   /lib/tls/libm-2.3.2.so
b703e000-b7050000 r-xp 00000000 03:02 524309     /lib/libnsl-2.3.2.so
b7050000-b7051000 rw-p 00011000 03:02 524309     /lib/libnsl-2.3.2.so
b705a000-b7062000 r-xp 00000000 03:02 6013121    /java/jdk/jdk1.4.2_07/jre/lib/i
386/native_threads/libhpi.so
b7062000-b7063000 rw-p 00007000 03:02 6013121    /java/jdk/jdk1.4.2_07/jre/lib/i
386/native_threads/libhpi.so
b7063000-b7461000 r-xp 00000000 03:02 14631103   /java/jdk/jdk1.4.2_07/jre/lib/i
386/client/libjvm.so
b7461000-b747d000 rw-p 003fd000 03:02 14631103   /java/jdk/jdk1.4.2_07/jre/lib/i
386/client/libjvm.so
b7490000-b75c1000 r-xp 00000000 03:02 14106626   /lib/tls/libc-2.3.2.so
b75c1000-b75c4000 rw-p 00130000 03:02 14106626   /lib/tls/libc-2.3.2.so
b75c7000-b75c9000 r-xp 00000000 03:02 524305     /lib/libdl-2.3.2.so
b75c9000-b75ca000 rw-p 00001000 03:02 524305     /lib/libdl-2.3.2.so
b75ca000-b75d7000 r-xp 00000000 03:02 14106630   /lib/tls/libpthread-0.60.so
b75d7000-b75d8000 rw-p 0000c000 03:02 14106630   /lib/tls/libpthread-0.60.so
b75eb000-b7600000 r-xp 00000000 03:02 524292     /lib/ld-2.3.2.so
b7600000-b7601000 rw-p 00015000 03:02 524292     /lib/ld-2.3.2.so

Heap at VM Abort:
Heap
 def new generation   total 576K, used 0K [0xaaeb0000, 0xaaf50000, 0xab390000)
  eden space 512K,   0% used [0xaaeb0000, 0xaaeb0000, 0xaaf30000)
  from space 64K,   0% used [0xaaf30000, 0xaaf30000, 0xaaf40000)
  to   space 64K,   0% used [0xaaf40000, 0xaaf40000, 0xaaf50000)
 tenured generation   total 1408K, used 82K [0xab390000, 0xab4f0000, 0xaeeb0000)
   the space 1408K,   5% used [0xab390000, 0xab3a4b08, 0xab3a4c00, 0xab4f0000)
 compacting perm gen  total 4096K, used 974K [0xaeeb0000, 0xaf2b0000, 0xb2eb0000
)
   the space 4096K,  23% used [0xaeeb0000, 0xaefa3810, 0xaefa3a00, 0xaf2b0000)

Local Time = Wed Aug 31 14:32:25 2005
Elapsed Time = 1
#
# HotSpot Virtual Machine Error : 11
# Error ID : 4F530E43505002EF
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) Client VM (1.4.2_07-b05 compiled mode)
#
# An error report file has been saved as hs_err_pid9710.log.
# Please refer to the file for further information.
#
Abort (core dumped)

CONFIGURATION :
  OS : RHEL AS3
     Linux kronos 2.4.21-4.ELsmp #1 SMP Fri Oct 3 17:52:56 EDT 2003 i686 i686 i386 GNU/Linux
  MPU : Xeon 2[GHz] * 2

REPRODUCE :
 1) Compile the attached test program, SuspenOnDeopt.java
 2) Launch the command line,
   java -XX:+PrintGC -XX:+PrintCompilation -XX:CompileOnly=SuspendOnDeopt.deoptee  
  -XX:CompileOnly=SuspendOnDeopt.compileTarget -Xcomp -XX:-Inline  SuspendOnDeopt

   NOTE: 
     The above -XX options are specified in order to reproduce the issue easily.
     Actual licensee's customer does not specify and the issue occurs without those -XX options.

OTHERS:
 When system is heavy-loaded, licensee reports this issue is a little bit easer to occur.

                                    

Comments
SUGGESTED FIX

*** /home/chrisphi/east/1.4.2-sust-east//hotspot/src/share/vm/runtime/safepoint.cpp     Tue Jun 27 04:21:12 2006
--- safepoint.cpp       Tue Sep 19 07:22:38 2006
***************
*** 1,5 ****
  #ifdef USE_PRAGMA_IDENT_SRC
! #pragma ident "@(#)safepoint.cpp      1.256 06/06/21 14:02:27 JVM"
  #endif
  /*
   * Copyright 2006 Sun Microsystems, Inc.  All rights reserved.
--- 1,5 ----
  #ifdef USE_PRAGMA_IDENT_SRC
! #pragma ident "%W% %E% %U% JVM"
  #endif
  /*
   * Copyright 2006 Sun Microsystems, Inc.  All rights reserved.
***************
*** 1133,1138 ****
--- 1133,1139 ----
    ThreadCodeBuffer     *cbuf   = state->code_buffer();  
    assert(cbuf != NULL, "for pc to be in code buffer the codebuffer must exist");
    nmethod* nm = get_nmethod();
+   nmethodLocker nml(nm);
  
    // Step 2: Find frame of caller  
    frame stub_fr = thread()->last_frame();
                                     
2006-09-19
EVALUATION

Evaluation from Tom Rodriguez:
What's happening is that in one safepoint we've been suspended and so we sleep in JavaThread::java_suspend_self like this:

=>[1] JavaThread::java_suspend_self(0x4fcc8, 0x4fcc8, 0x4fcc8, 0xf94fc854, 0xf94fc855, 0x0), at 0xfed42590
  [2] JavaThread::handle_special_runtime_exit_condition(0x4fcc8, 0x1, 0xf94fc854, 0xf94fc855, 0x0, 0x0), at 0xfed42000
  [3] SafepointSynchronize::block(0x4fcc8, 0x4fcc8, 0xf94fc85c, 0x0, 0x0, 0x0), at 0xfec88de8
  [4] CompiledCodeSafepointHandler::handle_illegal_instruction_exception(0x4fc28, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfec8c914
  [5] SafepointSynchronize::handle_illegal_instruction_exception(0x4fcc8, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfec89c5c
  [6] 0xf94ec270(0xf101d9a0, 0xb8, 0xffbfd870, 0xffffffff, 0xb8, 0x4fcc8), at 0xf94ec270
  [7] 0xf94e9cf8(0x4fcc8, 0xb8, 0x7, 0x4, 0xf50f81c8, 0xffbfd8c8), at 0xf94e9cf8

This thread will wait in here until it's been unsuspended.  During that time other safepoints occur and the caller frame is deoptimized.  All the other threads will resume and possibly safepoint again.  One of the rules of deoptimization in 1.4.2 is that when deoptimization completes the deoptimized nmethod is no longer live on the stack, though there's a tiny window where it's used for reexecuting the call stack, but the sweeper is required to keep the nmethod live through at least one more safepoint so it's safe to do that.  The problem here is that we can end up sleeping inside SafepointSynchronize::block for more than one safepoint so attempting to use the nmethod can fail.  I think the needed is to put an nmethodLocker on nm in the CompiledCodeSafepointHandler::handle_illegal_instruction_exception.  I'd probably put it at the beginning as soon as the nmethod is picked up, as in:

  nmethod* nm = get_nmethod();
  nmethodLocker nml(nm);

It was violating something I thought was an invariant of the old safepointing mechanism.  I think it was always true at the time the code was written that you'd enter and leave the handle_illegal_instruction code all within the same safepoint but I think changes for thread suspension made that no longer be true which is why the nmethodLocker is needed.
                                     
2006-09-19



Hardware and Software, Engineered to Work Together