JDK-6247825 : Crash occurs at safepoint on deoptimization in 1.4.2_07

Details
Type: Bug
Submit Date: 2005-03-30
Status: Closed
Updated Date: 2014-02-27
Project Name: JDK
Resolved Date: 2006-06-26
Component: hotspot
OS: linux_redhat_3.0
Sub-Component: runtime
CPU: x86
Priority: P2
Resolution: Fixed
Affected Versions: 1.4.2_07
Fixed Versions: 1.4.2_13 (b01)

Related Reports

Sub Tasks

Description
A crash occurs during the deoptimization phase in 1.4.2_07.

CONFIGURATION :
Linux kronos 2.4.21-4.ELsmp #1 SMP Fri Oct 3 17:52:56 EDT 2003 i686 i686 i386 GNU/Linux
(RHEL AS3.0)

REPRODUCE :
  1) Compile all the java files included in src.zip.
  2) Launch "java J200394 intf"


MESSAGES :
[tbaba@kronos crash-at-deoptimization]$ java -showversion J200394 intf
java version "1.4.2_07"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_07-b05)
Java HotSpot(TM) Client VM (build 1.4.2_07-b05, mixed mode)

Test of interface

Unexpected Signal : 11 occurred at PC=0xB2FE0FC0
Function=[Unknown.]
Library=(N/A)

NOTE: We are unable to locate the function name symbol for the error
      just occurred. Please refer to release documentation for possible
      reason and solutions.


Current Java thread:

Dynamic libraries:
08048000-08056000 r-xp 00000000 00:1b 2036970    /java/bin/jdk1.4.2_07/linux/bin/java
08056000-08059000 rw-p 0000d000 00:1b 2036970    /java/bin/jdk1.4.2_07/linux/bin/java
aa644000-aa700000 r--s 00000000 00:1b 1198578    /java/bin/jdk1.4.2_07/linux/jre/lib/ext/localedata.jar
aa87d000-aa88a000 r--s 00000000 00:1b 1198579    /java/bin/jdk1.4.2_07/linux/jre/lib/ext/ldapsec.jar
aa88a000-aa88d000 r--s 00000000 00:1b 1198577    /java/bin/jdk1.4.2_07/linux/jre/lib/ext/dnsns.jar
aa88d000-aa8a9000 r--s 00000000 00:1b 1198576    /java/bin/jdk1.4.2_07/linux/jre/lib/ext/sunjce_provider.jar
aaaad000-aacad000 r--p 00000000 03:02 3276804    /usr/lib/locale/locale-archive
b4f5c000-b54b5000 r--s 00000000 00:1b 1813248    /java/bin/jdk1.4.2_07/linux/jre/lib/charsets.jar
b54b5000-b54c6000 r--s 00000000 00:1b 1812917    /java/bin/jdk1.4.2_07/linux/jre/lib/jce.jar
b54c6000-b55a3000 r--s 00000000 00:1b 1813129    /java/bin/jdk1.4.2_07/linux/jre/lib/jsse.jar
b55a3000-b55b9000 r--s 00000000 00:1b 1812916    /java/bin/jdk1.4.2_07/linux/jre/lib/sunrsasign.jar
b5603000-b6fac000 r--s 00000000 00:1b 1813263    /java/bin/jdk1.4.2_07/linux/jre/lib/rt.jar
b6fac000-b6fc0000 r-xp 00000000 00:1b 130792     /java/bin/jdk1.4.2_07/linux/jre/lib/i386/libzip.so
b6fc0000-b6fc3000 rw-p 00013000 00:1b 130792     /java/bin/jdk1.4.2_07/linux/jre/lib/i386/libzip.so
b6fc3000-b6fe3000 r-xp 00000000 00:1b 130790     /java/bin/jdk1.4.2_07/linux/jre/lib/i386/libjava.so
b6fe3000-b6fe5000 rw-p 0001f000 00:1b 130790     /java/bin/jdk1.4.2_07/linux/jre/lib/i386/libjava.so
b6fe5000-b6ff5000 r-xp 00000000 00:1b 130789     /java/bin/jdk1.4.2_07/linux/jre/lib/i386/libverify.so
b6ff5000-b6ff7000 rw-p 0000f000 00:1b 130789     /java/bin/jdk1.4.2_07/linux/jre/lib/i386/libverify.so
b6ff7000-b6fff000 r-xp 00000000 03:02 524330     /lib/libnss_nis-2.3.2.so
b6fff000-b7000000 rw-p 00008000 03:02 524330     /lib/libnss_nis-2.3.2.so
b7000000-b700b000 r-xp 00000000 03:02 524325     /lib/libnss_files-2.3.2.so
b700b000-b700c000 rw-p 0000a000 03:02 524325     /lib/libnss_files-2.3.2.so
b7018000-b701c000 rw-s 00000000 03:02 9420972    /tmp/hsperfdata_tbaba/5570
b701c000-b703d000 r-xp 00000000 03:02 14106628   /lib/tls/libm-2.3.2.so
b703d000-b703e000 rw-p 00020000 03:02 14106628   /lib/tls/libm-2.3.2.so
b703e000-b7050000 r-xp 00000000 03:02 524309     /lib/libnsl-2.3.2.so
b7050000-b7051000 rw-p 00011000 03:02 524309     /lib/libnsl-2.3.2.so
b705a000-b7062000 r-xp 00000000 00:1b 385964     /java/bin/jdk1.4.2_07/linux/jre/lib/i386/native_threads/libhpi.so
b7062000-b7063000 rw-p 00007000 00:1b 385964     /java/bin/jdk1.4.2_07/linux/jre/lib/i386/native_threads/libhpi.so
b7063000-b7461000 r-xp 00000000 00:1b 1186035    /java/bin/jdk1.4.2_07/linux/jre/lib/i386/client/libjvm.so
b7461000-b747d000 rw-p 003fd000 00:1b 1186035    /java/bin/jdk1.4.2_07/linux/jre/lib/i386/client/libjvm.so
b7490000-b75c1000 r-xp 00000000 03:02 14106626   /lib/tls/libc-2.3.2.so
b75c1000-b75c4000 rw-p 00130000 03:02 14106626   /lib/tls/libc-2.3.2.so
b75c7000-b75c9000 r-xp 00000000 03:02 524305     /lib/libdl-2.3.2.so
b75c9000-b75ca000 rw-p 00001000 03:02 524305     /lib/libdl-2.3.2.so
b75ca000-b75d7000 r-xp 00000000 03:02 14106630   /lib/tls/libpthread-0.60.so
b75d7000-b75d8000 rw-p 0000c000 03:02 14106630   /lib/tls/libpthread-0.60.so
b75eb000-b7600000 r-xp 00000000 03:02 524292     /lib/ld-2.3.2.so
b7600000-b7601000 rw-p 00015000 03:02 524292     /lib/ld-2.3.2.so

Heap at VM Abort:
Heap
 def new generation   total 576K, used 164K [0xaaeb0000, 0xaaf50000, 0xab390000)
  eden space 512K,  32% used [0xaaeb0000, 0xaaed93c0, 0xaaf30000)
  from space 64K,   0% used [0xaaf30000, 0xaaf30000, 0xaaf40000)
  to   space 64K,   0% used [0xaaf40000, 0xaaf40000, 0xaaf50000)
 tenured generation   total 1408K, used 0K [0xab390000, 0xab4f0000, 0xaeeb0000)
   the space 1408K,   0% used [0xab390000, 0xab390000, 0xab390200, 0xab4f0000)
 compacting perm gen  total 4096K, used 977K [0xaeeb0000, 0xaf2b0000, 0xb2eb0000)
   the space 4096K,  23% used [0xaeeb0000, 0xaefa45b0, 0xaefa4600, 0xaf2b0000)

Local Time = Wed Mar 30 11:30:40 2005
Elapsed Time = 0
#
# HotSpot Virtual Machine Error : 11
# Error ID : 4F530E43505002EF
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) Client VM (1.4.2_07-b05 mixed mode)
#
# An error report file has been saved as hs_err_pid5570.log.
# Please refer to the file for further information.
#
Abort (core dumped)
[tbaba@kronos crash-at-deoptimization]$
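
The Function=[Unknown.] and Library=(N/A) fields above are consistent with where the faulting PC sits: 0xB2FE0FC0 falls outside every mapping in the "Dynamic libraries" listing and above the reserved end of the Java heap (0xb2eb0000). The sketch below checks this against a few ranges copied from the log. `Range`, `covered`, and `sample_ranges` are illustrative names only, and the sample is not the complete listing. That the address belongs to VM-generated code is an inference; the log does not say so.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One half-open address range [start, end), as printed in the crash log.
struct Range {
    std::uint32_t start;
    std::uint32_t end;
};

// Returns true if pc lies inside any of the given ranges.
bool covered(std::uint32_t pc, const std::vector<Range>& ranges) {
    for (const Range& r : ranges) {
        if (pc >= r.start && pc < r.end) {
            return true;
        }
    }
    return false;
}

// A few ranges copied from the report (assumption: a sample, not the full map).
std::vector<Range> sample_ranges() {
    return {
        {0xaaaad000u, 0xaacad000u},  // /usr/lib/locale/locale-archive
        {0xaaeb0000u, 0xb2eb0000u},  // Java heap: new gen start to perm gen reserved end
        {0xb4f5c000u, 0xb54b5000u},  // charsets.jar
        {0xb7063000u, 0xb7461000u},  // libjvm.so (r-xp segment)
    };
}
```

Calling `covered(0xB2FE0FC0u, sample_ranges())` returns false, while an address inside the libjvm.so text segment such as 0xb7063000 returns true.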


NOTE:
 The customer has also sent their own fix and asked us to review it.
 Please look into the report.txt attached to this bug file.

###@###.### 2005-03-30 05:40:25 GMT

                                    

Comments
SUGGESTED FIX

--- src/share/vm/runtime/safepoint.cpp- Mon Feb 27 20:13:44 2006
+++ src/share/vm/runtime/safepoint.cpp  Tue Jun 20 16:55:18 2006
@@ -1,7 +1,7 @@
 #ifdef USE_PRAGMA_IDENT_SRC
-#pragma ident "@(#)safepoint.cpp       1.252 06/01/20 08:30:10 JVM"
+#pragma ident "@(#)safepoint.cpp       1.255 06/06/20 16:53:26 JVM"
 #endif
 /*
  * Copyright 2006 Sun Microsystems, Inc.  All rights reserved.
  * SUN PROPRIETARY/CONFIDENTIAL.  Use is subject to license terms.
  */
@@ -1343,11 +1343,22 @@
 
           NativeCall* call = nativeCall_at(at_call);
           // replace return address with the destination of the call,
           // so that the stub can use it; the stub will replace the
           // return address with the entry into the deoptimization blob.
-          caller_fr.patch_pc(thread(), call->destination());
+         address dest = call->destination();
+         if (type == relocInfo::virtual_call_type) {
+           // We need to reexecute the whole compiled ic but we can't
+           // since the nmethod is deoptimized, so treat this as an IC
+           // and allow it to reresolve.
+#ifdef COMPILER1
+           dest = Runtime1::entry_for(Runtime1::handle_ic_miss_id);
+#else
+           dest = OptoRuntime::handle_ic_miss_stub();
+#endif
+         }
+         caller_fr.patch_pc(thread(), dest);
 
           arg_size_in_bytes = -1; // notify that compilation must be reexecuted
         } else {
           if (ShowSafepointMsgs) {
             tty->print_cr("Has pending exception");
                                     
2006-06-21
EVALUATION

Thanks to Tom Rodriguez:
"So I've looked at this some and I think there really is a bug here.  I don't think their fix is right, in particular since it won't work on sparc.  The basic problem is that if we safepoint in compiled code at a virtual call site, when we resume we need to reexecute the compiled ic so that the values are set up correctly for the call.  The call site can point at a couple different things and some of these stubs expect eax to contain various kinds of oops.  Some of the paths use these oops in ways that would cause a crash if the eax contains an invalid oop but some of them are resilient in the face of invalid oops.  In fact in some cases it's expected that the oop will be invalid but our code for compiled IC patching understands all these cases and makes sure it can be done in a safe fashion.  I'm not entirely clear whether their code is dying because the value is immediately bad or because we end up getting into the wrong code but I can imagine a couple different ways we might die.

The problem occurs when we safepoint at a call and end up deoptimizing the nmethod containing the call site.  In this case the illegal instruction handler has to execute the call before we return to the deopt blob.  However if this is a virtual call site we can't reexecute the whole compiled ic the way we want to because the nmethod has been deoptimized.  What we end up doing is dispatching to the call destination but using an old value for eax that might not be safe for that call stub.  We really need the value that's actually in the compiled IC.  One reason their fix isn't correct is that if the method has been deoptimized the nmethod could contain a stale copy of the oop we need so we could still die.  The other reason is that this code won't work for sparc since updating save_oop_result doesn't update the saved copy of the value from the compiled IC.

I think the proper fix is to return to the handle_ic blob when we deopt at a virtual call site.  The code would look something like this:

address dest = call->destination();
if (type == relocInfo::virtual_call_type) {
  // We need to reexecute the whole compiled ic but we can't
  // since the nmethod is deoptimized, so treat this as an IC
  // and allow it to reresolve.
  dest = Runtime1::entry_for(Runtime1::handle_ic_miss_id);
}
caller_fr.patch_pc(thread(), dest);

This goes in the same place they put their fix in safepoint.cpp, though this code replaces the existing patch_pc instead of coming after it, as theirs did.

The same fix is needed for C2 I think since both compilers should be affected by this bug.  C2's miss handler goes by a different name so you'll need an ifdef for that.  Anyway, I think this is the problem they're seeing.

tom"
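
The decision described in the evaluation can be modeled outside the VM. The following is a minimal standalone sketch, not HotSpot code: `CallKind`, `CallSite`, `kIcMissHandler`, and `resume_destination` are hypothetical stand-ins, and destinations are represented as strings rather than real addresses. It only shows the branch: a virtual call site in a deoptimized nmethod resumes at the IC miss handler so the call can re-resolve, while any other call site resumes at the call's existing destination.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for HotSpot concepts; none of these are real VM types.
enum class CallKind { Static, Virtual };

struct CallSite {
    CallKind kind;
    std::string destination;  // where the call instruction currently points
};

// Placeholder name for the IC miss handler entry (the evaluation refers to
// Runtime1::handle_ic_miss_id for C1; C2 uses a different name).
const std::string kIcMissHandler = "handle_ic_miss";

// Pick where to resume when the caller's nmethod was deoptimized while the
// thread was stopped at this call site.
std::string resume_destination(const CallSite& site) {
    if (site.kind == CallKind::Virtual) {
        // The compiled inline cache setup cannot be reexecuted, so route
        // through the miss handler and let the call re-resolve.
        return kIcMissHandler;
    }
    // Non-virtual calls do not depend on the inline cache register value.
    return site.destination;
}
```

With this model, `resume_destination(CallSite{CallKind::Virtual, "some_stub"})` yields the miss handler name, so a stale register value is never handed to the original stub.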
                                     
2006-06-16


