Bug ID: JDK-4959717 JVM crash with error "Fatal: null exception in compiled code"

Details
Type: Bug
Submit Date: 2003-11-25
Status: Resolved
Updated Date: 2004-06-11
Project Name: JDK
Resolved Date: 2004-03-23
Component: hotspot
OS: solaris_8, solaris_7, windows_2000
Sub-Component: compiler
CPU: x86, sparc
Priority: P2
Resolution: Fixed
Affected Versions: 1.3.1, 1.4.2_01, 1.4.2_04
Fixed Versions: 1.3.1_12 (12)

Description
The customer is running an Intel server farm on Solaris 8 x86.
They started experiencing this crash in May, and it has progressively gotten worse: up to 80 outages a month at one point.

Initially, every crash ended with a truncated core file only about 10% of the size it should have been.

It was not until we used -XX:+ShowMessageBoxOnError that we were able to collect a pstack, pmap, and gcore of the failing process.
What I see in the log file:
#
# HotSpot Virtual Machine Error, Internal Error
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) Server VM (1.4.2_01-b06 mixed mode)
#
# Error ID: 53484152454432554E54494D450E435050018D
#
# Problematic Thread: prio=5 tid=0x08396a80 nid=0x25 runnable
#
Internal Error
Fatal: null exception in compiled code

Do you want to debug the problem?
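
A note on the Error ID in the log above: it looks like hex-encoded bytes, and its last two bytes (018D) match the 18d argument passed to report_fatal in the pstacks below. The following is a minimal sketch for decoding such a string, assuming the ID is plain hex pairs. The class and method names are hypothetical, and reading the printable part as a source file name (plausibly sharedRuntime.cpp) is an interpretation, not something this report states.

```java
// Hypothetical helper; not part of any JDK tool.
final class ErrorIdSketch {

    /** Decodes hex pairs to text, rendering non-printable bytes as <HH>. */
    static String decode(String hex) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i + 1 < hex.length(); i += 2) {
            int b = Integer.parseInt(hex.substring(i, i + 2), 16);
            if (b >= 0x20 && b < 0x7f) {
                out.append((char) b);                   // printable ASCII
            } else {
                out.append(String.format("<%02X>", b)); // keep raw byte visible
            }
        }
        return out.toString();
    }

    /** Reads the last four hex digits as a number (assumed to be a line number). */
    static int trailingNumber(String hex) {
        return Integer.parseInt(hex.substring(hex.length() - 4), 16);
    }

    public static void main(String[] args) {
        String id = "53484152454432554E54494D450E435050018D";
        System.out.println(decode(id));
        System.out.println(trailingNumber(id));
    }
}
```

If the interpretation holds, the trailing number would be the source line at which report_fatal was called.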

All collected data is in /net/cores.ebay/cores/63833818;
the cores are in the 24nov and 25nov subdirectories.

Of interest in the pstacks (this one is from the 25nov crash):
there are several thread signatures.

-----------------  lwp# 29 / thread# 28  --------------------
 dfac892c nanosleep (95a6e9d0, 95a6e9c8)
 dfb87f90 __nanosleep (64) + 58
 debced77 __1cCosOinfinite_sleep6F_v_ () + 17
 dead5697 __1cMreport_error6Fipkci11E_v_ (1, dec90008, 18d, dec647fc, dec647f0, 95a6f35c) + 423
 dead4fc9 __1cMreport_fatal6Fpkci1E_v_ (dec90008, 18d, dec8ffe8) + 49
 de8fa3af __1cNSharedRuntimebGcompute_exception_return_address6Fi_pC_ (1) + 22d
 da034111 ???????? ()
-----------------  lwp# 30 / thread# 29  --------------------

-----------------  lwp# 45 / thread# 44  --------------------
 dfac892c nanosleep (9574e494, 9574e48c)
 dfb87f90 __nanosleep (64) + 58
 debcf817 __1cCosLmessage_box6Fpkc2_i_ (dec647fc, 9574ea00) + 56
 dead5638 __1cMreport_error6Fipkci11E_v_ (1, dec90008, 18d, dec647fc, dec647f0, 9574f22c) + 3c4
 dead4fc9 __1cMreport_fatal6Fpkci1E_v_ (dec90008, 18d, dec8ffe8) + 49
 de8fa3af __1cNSharedRuntimebGcompute_exception_return_address6Fi_pC_ (1) + 22d
 da034111 ???????? ()
-----------------  lwp# 47 / thread# 46  --------------------


-----------------  lwp# 56 / thread# 55  --------------------
 dfac892c nanosleep (9501e890, 9501e888)
 dfb87f90 __nanosleep (64) + 58
 debced77 __1cCosOinfinite_sleep6F_v_ () + 17
 dead52af __1cMreport_error6Fipkci11E_v_ (1, dec90008, 18d, dec647fc, dec647f0, 9501f21c) + 3b
 dead4fc9 __1cMreport_fatal6Fpkci1E_v_ (dec90008, 18d, dec8ffe8) + 49
 de8fa3af __1cNSharedRuntimebGcompute_exception_return_address6Fi_pC_ (1) + 22d
 da034111 ???????? ()
-----------------  lwp# 59 / thread# 58  --------------------

                                    

Comments
EVALUATION

In the generated code, it appears that C2 has moved a LoadKlassNode for a dynamic object above the object's null check. The program fails because there is no exception handling for the resulting null exception.

###@###.### 2004-03-11

The test program also fails on x86 and sparc under 1.5.0, but with a different symptom. It is not yet clear whether the base problem is the same.

###@###.### 2004-03-11

With the test program, the problem also appears on 1.5.0.

The problem results from the combination of an inlined function returning an interface type and a null check test that covers the returned value.

A CastPPNode is created during the parse of the ifnull bytecode.  A child CheckCastPPNode is then created by return_current() with no control edge. Finally, a LoadKlassNode is created using the CheckCastPPNode as input.
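
For orientation, the source-level shape described here might look roughly like the sketch below: a small method that returns an interface type, an explicit null test on the result, and then a use that needs the receiver's class. This is a hypothetical illustration only. It is not the test program referred to in this report, all names are invented, and it is not claimed to reproduce the crash.

```java
// Hypothetical illustration; not the reporter's test program.
final class NullCheckShapeSketch {

    interface Shape {
        int area();
    }

    static final class Square implements Shape {
        private final int side;

        Square(int side) {
            this.side = side;
        }

        @Override
        public int area() {
            return side * side;
        }
    }

    // Small enough that a JIT compiler may inline it; returns an interface type.
    static Shape lookup(Shape[] table, int index) {
        return table[index];
    }

    static int areaOrZero(Shape[] table, int index) {
        Shape s = lookup(table, index);
        if (s != null) {        // explicit null test on the returned reference
            return s.area();    // interface call: needs the receiver's class
        }
        return 0;
    }

    public static void main(String[] args) {
        Shape[] table = { new Square(3), null };
        System.out.println(areaOrZero(table, 0));
        System.out.println(areaOrZero(table, 1));
    }
}
```

The point of the shape is that the class load feeding the interface call must stay below the null test. If the load floats above it, the null entry is dereferenced before the test runs.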

When the CastPP nodes are eliminated after the CCP pass, memory nodes (such as the LoadKlassNode) may need to inherit a control edge from their parent CastPP or CheckCastPP node.  In this bug's case, the LoadKlassNode inherited a null control edge input, which is wrong.  Under the current scheme, we need to fix MemNode::Ideal_DU_postCCP to look further up the parent chain if the CheckCastPP node's control edge is null.
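
The idea of looking further up the parent chain can be sketched with a toy model. This is an assumption-laden illustration written in Java for brevity: the real change is in HotSpot's C++ (memnode.cpp), and the Node class and inheritedControl method below are hypothetical stand-ins, not HotSpot's classes or the actual fix.

```java
// Toy model only; these types are hypothetical and are not HotSpot's node classes.
final class ControlEdgeSketch {

    /** A simplified IR node with an optional control edge and one data input. */
    static final class Node {
        final String name;
        final boolean isCast;  // stands in for CastPP / CheckCastPP
        final Node control;    // null models a missing control edge
        final Node input;      // data input, i.e. the parent in the chain

        Node(String name, boolean isCast, Node control, Node input) {
            this.name = name;
            this.isCast = isCast;
            this.control = control;
            this.input = input;
        }
    }

    /**
     * Walks up through consecutive cast nodes and returns the first non-null
     * control edge, or null if the chain of casts ends without one.
     */
    static Node inheritedControl(Node firstCast) {
        for (Node n = firstCast; n != null && n.isCast; n = n.input) {
            if (n.control != null) {
                return n.control;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Node nullCheck = new Node("null check", false, null, null);
        Node object = new Node("object", false, null, null);
        Node castPP = new Node("CastPP", true, nullCheck, object);
        Node checkCastPP = new Node("CheckCastPP", true, null, castPP);

        // Using only the immediate parent's control edge yields null here.
        System.out.println(checkCastPP.control == null);
        // Walking further up finds the control edge that the CastPP carries.
        System.out.println(inheritedControl(checkCastPP).name);
    }
}
```

In this model, a memory node fed by checkCastPP would be pinned under the null check rather than left without control, which is the property the evaluation says was lost.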

###@###.### 2004-03-12
                                     
2004-03-12
SUGGESTED FIX

See the webrev for changes to memnode.cpp:

/net/prt-archiver.sfbay/export2/archived_workspaces/main/c2_baseline/2004/20040317100205.rasbold.c2_baseline/workspace/webrevs/webrev-2004.03.17/index.html

###@###.### 2004-03-18
                                     
2004-03-18
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX:
1.3.1_12
1.4.2_05
generic
tiger-beta2

FIXED IN:
1.3.1_12
1.4.2_05
tiger-beta2

INTEGRATED IN:
1.3.1_12
1.4.2_05
tiger-b44
tiger-beta2

VERIFIED IN:
1.4.2_05


                                     
2004-07-08


