JDK-4959717 : JVM crash with error "Fatal: null exception in compiled code"
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 1.3.1,1.4.2_01,1.4.2_04
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • OS: solaris_7,solaris_8,windows_2000
  • CPU: x86,sparc
  • Submitted: 2003-11-25
  • Updated: 2004-06-11
  • Resolved: 2004-03-23
Other: 1.3.1_12 12 Fixed
Other: 1.4.2_05 Fixed
Related Reports
Duplicate :  
Relates :  
Relates :  
Description
The customer is running an Intel server farm on Solaris 8 x86.
They started experiencing this in May, and it has progressively gotten worse: up to 80 outages a month at one point.

Initially every crash ended with a truncated core file of only about 10% of what it should have been.

It was not until we used -XX:+ShowMessageBoxOnError that we were able to collect a pstack, pmap and gcore of the failing process.
What I see in the log file:
#
# HotSpot Virtual Machine Error, Internal Error
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) Server VM (1.4.2_01-b06 mixed mode)
#
# Error ID: 53484152454432554E54494D450E435050018D
#
# Problematic Thread: prio=5 tid=0x08396a80 nid=0x25 runnable
#
Internal Error
Fatal: null exception in compiled code

Do you want to debug the problem?
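
The Error ID above looks like a hex-encoded byte string. The following is a minimal sketch for reading it, assuming (this is not stated in the report) that the leading bytes are ASCII text and the last two bytes are a big-endian number. The class and method names (ErrorIdDecoder, printablePrefix, trailingValue) are hypothetical helpers, not part of any JDK tool.

```java
// Hypothetical helper: decodes a hex Error ID under the ASSUMPTION that it is
// "ASCII-ish bytes followed by a 2-byte big-endian number". Not a JDK API.
final class ErrorIdDecoder {

    // Convert a hex string such as "0E43" into raw bytes.
    static byte[] toBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Render every byte except the last two; non-printable bytes become '.'.
    static String printablePrefix(String hex) {
        byte[] b = toBytes(hex);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < b.length - 2; i++) {
            int v = b[i] & 0xFF;
            sb.append(v >= 0x20 && v < 0x7F ? (char) v : '.');
        }
        return sb.toString();
    }

    // Interpret the last two bytes as an unsigned big-endian integer.
    static int trailingValue(String hex) {
        byte[] b = toBytes(hex);
        return ((b[b.length - 2] & 0xFF) << 8) | (b[b.length - 1] & 0xFF);
    }

    public static void main(String[] args) {
        String id = "53484152454432554E54494D450E435050018D";
        System.out.println(printablePrefix(id));
        System.out.println(trailingValue(id));
    }
}
```

Under these assumptions the ID reads as SHARED2UNTIME.CPP followed by 397 (0x18d). That is consistent with the pstack frames in this report, where report_fatal is called from SharedRuntime::compute_exception_return_address with 18d as its second argument.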

All collected data is in /net/cores.ebay/cores/63833818.
The cores are in the 24nov and 25nov subdirectories.

Several thread signatures in the pstacks are interesting (this one is from the 25nov crash):

-----------------  lwp# 29 / thread# 28  --------------------
 dfac892c nanosleep (95a6e9d0, 95a6e9c8)
 dfb87f90 __nanosleep (64) + 58
 debced77 __1cCosOinfinite_sleep6F_v_ () + 17
 dead5697 __1cMreport_error6Fipkci11E_v_ (1, dec90008, 18d, dec647fc, dec647f0, 95a6f35c) + 423
 dead4fc9 __1cMreport_fatal6Fpkci1E_v_ (dec90008, 18d, dec8ffe8) + 49
 de8fa3af __1cNSharedRuntimebGcompute_exception_return_address6Fi_pC_ (1) + 22d
 da034111 ???????? ()
-----------------  lwp# 30 / thread# 29  --------------------

-----------------  lwp# 45 / thread# 44  --------------------
 dfac892c nanosleep (9574e494, 9574e48c)
 dfb87f90 __nanosleep (64) + 58
 debcf817 __1cCosLmessage_box6Fpkc2_i_ (dec647fc, 9574ea00) + 56
 dead5638 __1cMreport_error6Fipkci11E_v_ (1, dec90008, 18d, dec647fc, dec647f0, 9574f22c) + 3c4
 dead4fc9 __1cMreport_fatal6Fpkci1E_v_ (dec90008, 18d, dec8ffe8) + 49
 de8fa3af __1cNSharedRuntimebGcompute_exception_return_address6Fi_pC_ (1) + 22d
 da034111 ???????? ()
-----------------  lwp# 47 / thread# 46  --------------------


-----------------  lwp# 56 / thread# 55  --------------------
 dfac892c nanosleep (9501e890, 9501e888)
 dfb87f90 __nanosleep (64) + 58
 debced77 __1cCosOinfinite_sleep6F_v_ () + 17
 dead52af __1cMreport_error6Fipkci11E_v_ (1, dec90008, 18d, dec647fc, dec647f0, 9501f21c) + 3b
 dead4fc9 __1cMreport_fatal6Fpkci1E_v_ (dec90008, 18d, dec8ffe8) + 49
 de8fa3af __1cNSharedRuntimebGcompute_exception_return_address6Fi_pC_ (1) + 22d
 da034111 ???????? ()
-----------------  lwp# 59 / thread# 58  --------------------

Comments
CONVERTED DATA
BugTraq+ Release Management Values
COMMIT TO FIX: 1.3.1_12 1.4.2_05 generic tiger-beta2
FIXED IN: 1.3.1_12 1.4.2_05 tiger-beta2
INTEGRATED IN: 1.3.1_12 1.4.2_05 tiger-b44 tiger-beta2
VERIFIED IN: 1.4.2_05
08-07-2004

SUGGESTED FIX
See the webrev for changes to memnode.cpp:
/net/prt-archiver.sfbay/export2/archived_workspaces/main/c2_baseline/2004/20040317100205.rasbold.c2_baseline/workspace/webrevs/webrev-2004.03.17/index.html
###@###.### 2004-03-18
18-03-2004

EVALUATION
In the generated code, it appears that C2 has moved a LoadKlassNode for a dynamic object above the object's null check. The program fails because there is no exception handling for the resulting null exception.
###@###.### 2004-03-11

The test program also fails on x86 and sparc under 1.5.0, but with a different symptom. It is not yet clear whether the base problem is the same.
###@###.### 2004-03-11

With the test program, the problem also appears on 1.5.0. It results from the combination of an inlined function returning an interface type and a null check test that covers the returned value. A CastPPNode is created during the parse of the ifnull bytecode. A child CheckCastPPNode is then created by return_current() with no control edge. Finally, a LoadKlassNode is created using the CheckCastPPNode as input.

When the CastPP nodes are eliminated after the CCP pass, memory nodes (such as the LoadKlassNode) may need to inherit a control edge from their parent CastPP or CheckCastPP node. In this bug's case, the LoadKlassNode inherited a null control edge input, which is wrong. Under the current scheme, we need to fix MemNode::Ideal_DU_postCCP to look further up the parent chain if the CheckCastPP node's control edge is null.
###@###.### 2004-03-12
12-03-2004
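
The idea in the evaluation above (look further up the parent chain when a cast node's control edge is null) can be sketched with a toy model. This is only an illustration under stated assumptions: Node, ControlEdgeSketch and inheritedControl are hypothetical names, the node fields are a simplification, and none of this is the actual HotSpot code in memnode.cpp.

```java
// Toy model of the control-edge lookup described in the evaluation.
// All types and names here are HYPOTHETICAL; this is not HotSpot source.
final class Node {
    final String name;    // label such as "CastPP" or "CheckCastPP"
    final Node control;   // control edge; may be null
    final Node input;     // value input (the parent in the cast chain); may be null
    final boolean isCast; // true for CastPP / CheckCastPP style nodes

    Node(String name, Node control, Node input, boolean isCast) {
        this.name = name;
        this.control = control;
        this.input = input;
        this.isCast = isCast;
    }
}

final class ControlEdgeSketch {

    // Walk up the chain of cast nodes and return the first non-null control
    // edge, instead of taking the nearest cast's (possibly null) control.
    static Node inheritedControl(Node start) {
        for (Node n = start; n != null && n.isCast; n = n.input) {
            if (n.control != null) {
                return n.control;
            }
        }
        return null; // no cast in the chain carries a control edge
    }

    public static void main(String[] args) {
        Node nullCheck = new Node("IfTrue(null check)", null, null, false);
        Node object = new Node("object", null, null, false);
        // CastPP created while parsing ifnull: pinned under the null check.
        Node castPP = new Node("CastPP", nullCheck, object, true);
        // CheckCastPP created on return from the inlined call: no control edge.
        Node checkCastPP = new Node("CheckCastPP", null, castPP, true);

        Node control = inheritedControl(checkCastPP);
        System.out.println(control == null ? "null" : control.name);
    }
}
```

In this model, taking only the CheckCastPP node's own control edge would yield null, which corresponds to the failure described; walking one level further up reaches the CastPP node pinned below the null check.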