JDK-4695869 : stress test intermittently crashes VM

Details
Type:
Bug
Submit Date:
2002-05-31
Status:
Closed
Updated Date:
2002-11-20
Project Name:
JDK
Resolved Date:
2002-11-13
Component:
hotspot
OS:
solaris_8
Sub-Component:
runtime
CPU:
sparc
Priority:
P3
Resolution:
Fixed
Affected Versions:
1.4.1
Fixed Versions:
1.4.2 (mantis)

Related Reports
Relates:

Sub Tasks

Description

Name: ipR10196			Date: 05/31/2002


--------------------------------------
Tests           : 
nsk/stress/jck12a/jck12a012
nsk/stress/jck12a/jck12a014

TestBase        : testbase_nsk 
VM              : server 64-bit
Mode            : mixed
Platform        : sparc
OS              : 5.8

----------------------------------------
Steps to reproduce 
================
1. cd /net/sqesvr.eng/export/vsn/GammaBase/Bugs/{BugID}
2. sh doit.sh $JAVA_HOME -d64 [JAVA_OPTS]

The crash is intermittent, so the test may have to run several times
in a loop before the crash occurs.

The test stresses the VM by running a large bundle of JCK tests
simultaneously. This produces a number of OutOfMemoryError
exceptions, so the test effectively runs in a limited memory space.
The crash shows up in one of two variations:

-----------------------------------------------------
Unexpected Signal : 10 occurred at PC=0xFFFFFFFF7E6D1698
Function=[Unknown. Nearest: JVM_SetClassSigners+0xC0F0]
Library=/net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/server/libjvm.so


Dynamic libraries:
0x100000000 	/net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/bin/sparcv9/java
0xffffffff7f200000 	/usr/lib/64/libthread.so.1
0xffffffff7f400000 	/usr/lib/64/libdl.so.1
0xffffffff7ef00000 	/usr/lib/64/libc.so.1
0xffffffff7ed00000 	/usr/platform/SUNW,Ultra-60/lib/sparcv9/libc_psr.so.1
0xffffffff7e400000 	/net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/server/libjvm.so
0xffffffff7e200000 	/usr/lib/64/libCrun.so.1
0xffffffff7e000000 	/usr/lib/64/libsocket.so.1
0xffffffff7dd00000 	/usr/lib/64/libnsl.so.1
0xffffffff7db00000 	/usr/lib/64/libm.so.1
0xffffffff7f100000 	/usr/lib/64/libw.so.1
0xffffffff7d800000 	/usr/lib/64/libmp.so.2
0xffffffff7d500000 	/usr/lib/64/librt.so.1
0xffffffff7d300000 	/usr/lib/64/libaio.so.1
0xffffffff7d100000 	/usr/lib/64/libmd5.so.1
0xffffffff7ce00000 	/usr/platform/SUNW,Ultra-60/lib/sparcv9/libmd5_psr.so.1
0xffffffff7cc00000 	/net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/native_threads/libhpi.so
0xffffffff7c800000 	/net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/libverify.so
0xffffffff7c600000 	/net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/libjava.so
0xffffffff7c300000 	/net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/libzip.so
0xffffffff2f500000 	/net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/libawt.so
0xffffffff2f200000 	/net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/libmlib_image.so
0xffffffff2f000000 	/net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/motif21/libmawt.so
0xffffffff2eb00000 	/usr/dt/lib/sparcv9/libXm.so.4
0xffffffff2e900000 	/usr/openwin/lib/sparcv9/libXt.so.4
0xffffffff2e700000 	/usr/openwin/lib/sparcv9/libXext.so.0
0xffffffff2e500000 	/usr/openwin/lib/sparcv9/libXtst.so.1
0xffffffff2e200000 	/usr/openwin/lib/sparcv9/libX11.so.4
0xffffffff2e000000 	/usr/openwin/lib/sparcv9/libdps.so.5
0xffffffff2de00000 	/usr/openwin/lib/sparcv9/libSM.so.6
0xffffffff2db00000 	/usr/openwin/lib/sparcv9/libICE.so.6
0xffffffff2d900000 	/net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/libfontmanager.so
0xffffffff2d200000 	/usr/lib/sparcv9/liblayout.so
0xffffffff25e00000 	/net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/libdcpr.so

Local Time = Sat May 25 01:55:08 2002
Elapsed Time = 193
#
# HotSpot Virtual Machine Error : 10
# Error ID : 4F530E43505002E6 01
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.4.1-beta-b13 mixed mode)
#
# An error report file has been saved as hs_err_pid19169.log.
# Please refer to the file for further information.
#
-----------------------------------------------------

or

-----------------------------------------------------
An unexpected exception has been detected in native code outside the VM.
Unexpected Signal : 11 occurred at PC=0xFFFFFFFF2DF20AD4
Function=[Unknown. Nearest: DPSInitCommonTextContextProcs+0x24]
Library=/usr/openwin/lib/sparcv9/libdps.so.5

Current Java thread:
        at sun.awt.font.NativeFontWrapper.getGlyphInfo(Native Method)
        - locked <ffffffff3492fb28> (a java.lang.Class)
        at sun.awt.font.StandardGlyphVector.getGlyphInfo(StandardGlyphVector.java:1245)
        at sun.awt.font.AdvanceCache.initLatinAdvances(AdvanceCache.java:172)
        at sun.awt.font.AdvanceCache.<init>(AdvanceCache.java:206)
        at sun.awt.font.AdvanceCache.get(AdvanceCache.java:162)
        - locked <ffffffff318843f0> (a [Ljava.lang.ref.SoftReference;)
        at java.awt.font.TextLayout$OptInfo.getAdvance(TextLayout.java:248)
        at java.awt.font.TextLayout.getAdvance(TextLayout.java:1016)
        at javasoft.sqe.tests.api.java.awt.java2d.font.TextLayout.GetTest.testCase1(GetTest.java:99)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:324)
        at javasoft.sqe.jck.lib.MultiTest.run(MultiTest.java:137)
        at javasoft.sqe.stresstest.StressTest$TestThread.run(StressTest.java:829)

Dynamic libraries:
0x100000000     /net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/bin/sparcv9/java
0xffffffff7f300000      /usr/lib/64/libthread.so.1
0xffffffff7f500000      /usr/lib/64/libdl.so.1
0xffffffff7ef00000      /usr/lib/64/libc.so.1
0xffffffff7ee00000      /usr/platform/SUNW,Ultra-4/lib/sparcv9/libc_psr.so.1
0xffffffff7d000000      /net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/server/libjvm.so
0xffffffff7ce00000      /usr/lib/64/libCrun.so.1
0xffffffff7cc00000      /usr/lib/64/libsocket.so.1
0xffffffff7ca00000      /usr/lib/64/libnsl.so.1
0xffffffff7c800000      /usr/lib/64/libm.so.1
0xffffffff7d900000      /usr/lib/64/libw.so.1
0xffffffff7c500000      /usr/lib/64/libmp.so.2
0xffffffff7c200000      /net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/native_threads/libhpi.so
0xffffffff7c000000      /net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/libverify.so
0xffffffff7bd00000      /net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/libjava.so
0xffffffff7bb00000      /net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/libzip.so
0xffffffff2f300000      /net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/libawt.so
0xffffffff2f100000      /net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/libmlib_image.so
0xffffffff2ef00000      /net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/motif21/libmawt.so
0xffffffff2ea00000      /usr/dt/lib/sparcv9/libXm.so.4
0xffffffff2e800000      /usr/openwin/lib/sparcv9/libXt.so.4
0xffffffff2e600000      /usr/openwin/lib/sparcv9/libXext.so.0
0xffffffff2e400000      /usr/openwin/lib/sparcv9/libXtst.so.1
0xffffffff2e200000      /usr/openwin/lib/sparcv9/libX11.so.4
0xffffffff2df00000      /usr/openwin/lib/sparcv9/libdps.so.5
0xffffffff2dd00000      /usr/openwin/lib/sparcv9/libSM.so.6
0xffffffff2db00000      /usr/openwin/lib/sparcv9/libICE.so.6
0xffffffff2d800000      /usr/openwin/lib/sparcv9/libdga.so.1
0xffffffff2d600000      /net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/libfontmanager.so
0xffffffff2d000000      /usr/lib/sparcv9/liblayout.so
0xffffffff2b500000      /net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/libdcpr.so
0xffffffff2c700000      /net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/libcmm.so
0xffffffff1ce00000      /net/alpheridies/export/VM/hopper/weekly/JDK/b13/solaris-sparcv9/jre/lib/sparcv9/libmlib_image_v.so

Local Time = Fri May 31 16:01:13 2002
Elapsed Time = 135
#
# The exception above was detected in native code outside the VM
#
# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.4.1-beta-b13 mixed mode)
#
# An error report file has been saved as hs_err_pid22116.log.
# Please refer to the file for further information.
#
Abort
Exit status = 134
-----------------------------------------------------

The latter looks like the known integrated bug from java/classes_2D:

  4463818 Stress test jck12a012 crashes or hangs HotSpot VM

======================================================================

                                    

Comments
EVALUATION

#
# HotSpot Virtual Machine Error, assertion failure
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) Client VM (1.4.1-beta-b14-debug mixed mode)
#
# assert(!thread->has_pending_exception(), "shouldn't be allocating with pending exception")
#
# Error ID: /BUILD_AREA/jdk1.4.1/hotspot/src/share/vm/gc_interface/collectedHeap.cpp, 42 [ Patched ]
#
# Problematic Thread: prio=5 tid=0x44ac68 nid=0x20f runnable 
#
Dumping core....


###@###.### 2002-06-06

Got the following after running for approx 10 minutes with 1.4.1-b14
(product build):

#
# HotSpot Virtual Machine Error, Internal Error
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.4.1-beta-b14 mixed mode)
#
# Error happened during: generation collection for allocation
#
# Error ID: 53595354454D24494354494F4E4152590E4350500373 01
#
# Problematic Thread: prio=5 tid=0x1006bf070 nid=0xd6 waiting on condition 
#
Internal Error
Fatal: defining loader should not be marked if klass is not

Do you want to debug the problem?

(dbx) thread t@4
t@4 (l@5) stopped in _read at 0xffffffff7f3a48ac
0xffffffff7f3a48ac: _read+0x0008:	ta      %icc,%g0 + 64
(dbx) thread -info
	Thread t@4 (0xffffffff3ab81c90) at priority 64
	state: bound to l@5   
	base function: 0xffffffff7dadfdd0: _start() stack: 0xffffffff3ab82000[1048576]
	flags: BOUND|DETACHED|SUSPENDED 
	masked signals: (none)
	Currently active in _read
(dbx) where
current thread: t@4
=>[1] _read(0x0, 0xffffffff7f4c01dc, 0x400, 0xffffffff7f6267e8, 0xffffffff7f4bb768, 0x0), at 0xffffffff7f3a48ac
  [2] _filbuf(0xffffffff7f4bad68, 0xffffffff7f4c05dc, 0x1, 0x400, 0x0, 0x0), at 0xffffffff7f396414
  [3] fgets(0xffffffff7f4bad68, 0xffffffff7f4c05f8, 0xffffffff7f4bad68, 0xffffffff7f4c05dc, 0xffffffff7f4b25e0, 0x3ff), at 0xffffffff7f398e84
  [4] os::message_box(0xffffffff7dd822da, 0xffffffff3ab7fe2c, 0xffffffff3ab8062c, 0xffffffff7dde53f0, 0x0, 0xffffffff7dd824d9), at 0xffffffff7dcea56c
  [5] report_error(0x1, 0xffffffff7ddd535a, 0x373, 0xffffffff7dd822da, 0xffffffff7dd822e9, 0xffffffff3ab807ec), at 0xffffffff7dbefe38
  [6] report_fatal(0xffffffff7ddd535a, 0x373, 0xffffffff7ddd53a0, 0x0, 0x0, 0xffffffff7df717a0), at 0xffffffff7dbef5e4
  [7] SystemDictionary::do_unloading(0xffffffff7df98524, 0xffffffff7df7c9d0, 0x0, 0xffffffff32a17f58, 0xffffffff7df71720, 0xffffffff7df716f0), at 0xffffffff7daf7dcc
  [8] GenMarkSweep::mark_sweep_phase1(0x1, 0xffffffff3ab81284, 0x0, 0xffffffff7dc00fb0, 0x100109e40, 0x9800), at 0xffffffff7dc015b4
...

The fatal error was triggered by the following guarantee() in
SystemDictionary::do_unloading():
        ...
        if (!class_loader->is_gc_marked()) {
          // If the loader is not reachable this entry should always be removed 
          // (will never be looked up again). Note that this is not the same as
          // unloading the referred class.
          if (k_def_class_loader == class_loader) {
            // This is the defining entry, so the referred class is about
            // to be unloaded.
            // Notify the debugger and jvmpi, and clean up the class.
------>     guarantee(!e->is_gc_marked(), 
                      "klass should not be marked if defining loader is not");
            class_was_unloaded = true;

###@###.### 2002-06-07

Same failure (Fatal Error in SystemDictionary::do_unloading) also happens
with c1 on sparc.

###@###.### 2002-06-11

The test provokes multiple problems, for one example see 4700761.
Email from Jane indicates the cause of the problem when the guarantee
fails:  an OutOfMemoryError happens when the vm calls into java to run
ClassLoader.addClass(), and a stale entry is left in the system dictionary.

From: Jane Loizeaux - Sun Microsystems <###@###.###>
Reply-To: Jane Loizeaux - Sun Microsystems <###@###.###>
Subject: classes vector
To: ###@###.###

Hi John

I just looked at ClassLoader.java in the libraries, and classes
are added to its internal list of classes via addClass:

    /*
     * The classes loaded by this class loader. The only purpose of this
     * table is to keep the classes from being GC'ed until the loader
     * is GC'ed.
     */
    private Vector classes = new Vector();


    /*
     * Called by the VM to record every loaded class with this loader.
     */
    void addClass(Class c) {
        classes.addElement(c);
    }


In the VM, we seem to call this in SystemDictionary::define_instance_klass,
which has TRAPS in its arguments.  I think we update the SystemDictionary
in the update_dictionary call, and then we call addClass after that.

I don't understand how we handle exceptions as well as I could, but I
don't see how we'd escape between update_dictionary (which has no CHECK
in the arguments) and the call out to addClass. So, I'd have to guess
that we're failing during the addClass method call (maybe we have to grow
the ClassLoader's classes vector to accommodate the addition?).

Yikes.

I think that Vectors can be given a capacity increment, so maybe changing
ClassLoader to use a really large increment would tickle the bug?

I just talked to Fred about this one. The problem might well be that
in SystemDictionary::define_instance_class, we do something like:

   update_dictionary
   eager_initialize   ?? not sure what this does, maybe runs initializers?
       the impl method has an EXCEPTION_MARK, so no exceptions should be
       thrown;  I'm not sure what that means in this context, really.
   call addClass
   
If the addClass method fails, we definitely have left junk in the dictionary,
as we've already updated it. Maybe we should be treating the class as
though it had never been loaded. 

On the other hand, Fred wondered if this is a bug worth spending a lot
of time trying to solve. Users don't have a lot of recourse when they get
an OutOfMemory error (they probably can't execute more code to try to
clean things up). 

But I'm wondering if just
changing that guarantee to an if statement really would be the best 
solution. I'm a little uncomfortable about leaving bad stuff in the
system dictionary, but I don't see a clean way to get around that, and
changing that code always seems risky.

Jane
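
The email's remark about capacity increments can be illustrated with a
small sketch. java.util.Vector does take a capacity increment through its
two-argument constructor; the class name VectorGrowthDemo and the helper
capacityAfterOverflow are hypothetical, and the sizes are chosen only for
illustration. It shows that one addElement() past the current capacity
forces a reallocation, which is the allocation that could fail with an
OutOfMemoryError inside addClass():

```java
import java.util.Vector;

public class VectorGrowthDemo {
    /**
     * Fills a Vector one element past its initial capacity and returns
     * the capacity after the forced growth (hypothetical helper).
     */
    static int capacityAfterOverflow(int initialCapacity, int capacityIncrement) {
        Vector<Object> classes = new Vector<>(initialCapacity, capacityIncrement);
        for (int i = 0; i <= initialCapacity; i++) {
            // The last iteration exceeds the capacity and reallocates the backing array.
            classes.addElement(new Object());
        }
        return classes.capacity();
    }

    public static void main(String[] args) {
        // An increment of 0 means the capacity doubles when it must grow.
        System.out.println(capacityAfterOverflow(10, 0));
        // A large increment makes each growth step allocate a much bigger array.
        System.out.println(capacityAfterOverflow(2, 1000));
    }
}
```

A larger increment makes each reallocation bigger but less frequent, so
whether it would actually make the failure easier to hit is a guess, as the
email itself says.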


###@###.### 2002-06-12

There are several bugs provoked by this test case.

One is a bug in classes 2D code:  native code can call into JNI with a
pending exception, which is forbidden by the jni spec.  Created bug 4700761
for this classes_2D problem; the bug causes the assert mentioned at the
very beginning of the evaluation to fail.

Another bug is the fatal error in the system dictionary.  Using this
bugid to track that problem.

The 2 VM failures mentioned in the description

a) Unexpected Signal : 10 occurred at PC=0xFFFFFFFF7E6D1698
   Function=[Unknown. Nearest: JVM_SetClassSigners+0xC0F0]
and
b) Unexpected Signal : 11 occurred at PC=0xFFFFFFFF2DF20AD4
   Function=[Unknown. Nearest: DPSInitCommonTextContextProcs+0x24]

may be yet more bugs provoked by this test.  I have not reproduced them
yet.

###@###.### 2002-06-12

One possible fix for the fatal error in system dictionary is to detect
the OOMError that occurs when SystemDictionary::define_instance_class()
calls out to ClassLoader.addClass() and delete the entry that was added to
the dictionary.  Not sure if this will work because of side effects from
the earlier call to add_to_hierarchy().  eager_initialize() was also called
on the class, but that should have no side effects on anything but the
class itself.


-------------------------------
Looking at the code, I think the best thing to do is not create the
full-fledged system dictionary entry unless we know it's legal to do so.

So I'd change the actions of define_instance from
   check_constraints      can throw a Java exception
   update compiler info
   add to system dictionary
   eager initialize the class if asked to
   add to ClassLoader vector   can throw java exception
to
   check_constraints           can throw a Java exception
   add to ClassLoader vector   can throw java exception
   update compiler info
   add to system dictionary
   eager initialize the class if asked to

so we don't consider the class OK until we've done everything that
could throw a java-level exception.  Throwing an exception means we
return from define_instance immediately (and the caller will do some
cleanups), so actions after that won't occur.
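
The effect of the reordering can be modeled outside the VM. The following
is a minimal Java sketch, not the HotSpot code: DefineOrderDemo, its
dictionary map, and the LoaderRecord interface are hypothetical stand-ins
for the system dictionary and ClassLoader.addClass(), and the
OutOfMemoryError is simulated rather than provoked.

```java
import java.util.HashMap;
import java.util.Map;

public class DefineOrderDemo {
    /** Hypothetical stand-in for the VM's system dictionary. */
    static final Map<String, Object> dictionary = new HashMap<>();

    /** Hypothetical stand-in for ClassLoader.addClass(), which may throw. */
    interface LoaderRecord {
        void addClass(String name);
    }

    /** Old order: publish the entry first, then run the step that can throw. */
    static void defineOldOrder(String name, LoaderRecord loader) {
        dictionary.put(name, new Object());
        loader.addClass(name); // if this throws, the entry above stays behind
    }

    /** New order: run the step that can throw first, publish only on success. */
    static void defineNewOrder(String name, LoaderRecord loader) {
        loader.addClass(name); // a throw here returns before the dictionary changes
        dictionary.put(name, new Object());
    }

    /** Returns true if a failed definition leaves an entry in the dictionary. */
    static boolean leavesStaleEntry(boolean newOrder) {
        dictionary.clear();
        LoaderRecord failing = n -> { throw new OutOfMemoryError("simulated"); };
        try {
            if (newOrder) {
                defineNewOrder("Foo", failing);
            } else {
                defineOldOrder("Foo", failing);
            }
        } catch (OutOfMemoryError expected) {
            // The caller sees the error in both orderings.
        }
        return dictionary.containsKey("Foo");
    }

    public static void main(String[] args) {
        System.out.println("old order leaves stale entry: " + leavesStaleEntry(false));
        System.out.println("new order leaves stale entry: " + leavesStaleEntry(true));
    }
}
```

In this model the old order leaves the entry behind after the failure and
the new order does not, which is the property the proposed reordering
relies on.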


###@###.### 2002-10-24
                                     
2002-10-24
SUGGESTED FIX

from the putback comments:

  This stress test causes several failures. The one this bug
  addresses is hitting a guarantee in class unloading that a
  class should only be marked for GC if its defining loader is
  also marked.

  The problem was that we'd load a class and put it in the system
  dictionary, but the attempt to add it to a ClassLoader's classes
  vector (which is what keeps us from GC'ing classes before the
  defining loader is GC'd) could fail with an OutOfMemory error.

  Three options were considered:
    - weaken the guarantee (go back to pre-Hopper behavior)
      That's just a bad idea. This is a corner case, but we need
      to keep loaded classes alive as long as the ClassLoader is live.
    - remove the class from the system dictionary if we get OOM
      Difficult, as we do some other actions (update class hierarchy)
      at the same time.
    - Ensure we can add the class to the ClassLoader.classes vector
      before updating the system dictionary
      This is the fix I went with.

Files:
update: src/share/vm/memory/systemDictionary.cpp


NOTE that this stress test can still provoke failures;  I think
some of them might be flushed out with -Xcheck:jni. I'll submit
new bugs if I can reproduce anything.

###@###.### 2002-11-01
                                     
2002-11-01
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX:
mantis

FIXED IN:
mantis

INTEGRATED IN:
mantis

VERIFIED IN:
mantis


                                     
2004-06-14


