JDK-5090967 : SIGSEGV in ContiguousSpace::prepare_for_compaction(CompactPoint*)

Details
Type: Bug
Submit Date: 2004-08-24
Status: Resolved
Updated Date: 2012-10-09
Project Name: JDK
Resolved Date: 2004-09-24
Component: hotspot
OS: solaris_9,solaris_8,solaris_1
Sub-Component: gc
CPU: sparc
Priority: P1
Resolution: Fixed
Affected Versions: 1.4.2_04,1.4.2_05,1.4.2_06
Fixed Versions: 5.0u1 (01)

Description
Cu is running 1.4.2_04 on Solaris 8 with the default libthread and is hitting SIGSEGVs. The pstack data from multiple crashes shows the same stack.

Couple of notes:

1.) The GC Logs (and all other logs) from -Xloggc:[gcstats.log] are available at:  /net/cores.east/cores/dir6/64225774/082004 .

2.) They are running -client, but another cu (Univ of Wisc) hit the exact same bug this morning with -server.

3.) They are not willing to test additional VM switches; they want a patch asap.

4.) The other cu (Univ of Wisc) is on the same VM and got relief by increasing -Xmx from 512m to 1024m. BUT since VISA already runs with -Xmx1024m, we think this bug will hit Univ of Wisc again in a matter of time.

Here are the switches/arguments used to start the VM:

exec /opt/JREntServer/j2sdk1.4.2_04/jre/bin/java -Dengine.name=jrentserver -Dserver.name=SS55PVISREP01CCDR -cp $CLASSPATH -Dfile.encoding=UTF-8 -Dinstall.root="/opt/JREntServer"  -Dreporthome="/opt/JREntServer" -Xms512m -Xmx1024m -Xverify:none -Xnoclassgc -XX:NewSize=256m -XX:MaxNewSize=512m -XX:MaxPermSize=64m -XX:SurvivorRatio=8 -XX:TargetSurvivorRatio=50 -XX:+MaxFDLimit jet.server.JREntServer -vDebug -vError -logall "$@" >> /opt/JREntServer/logs/SystemOut.log 2>&1 



core 'core' of 10450:   /opt/JREntServer/j2sdk1.4.2_04/jre/bin/java -Dengine.name=jrentserver
-----------------  lwp# 4 / thread# 4  --------------------
 ff3791f0 __sigprocmask (ff37b7f4, 0, 0, fe501d98, ff38c000, 0) + 8
 ff36dd0c _sigon   (fe501d98, ff3938a8, 6, fe500684, fe501d98, ff39321c) + d0
 ff370d4c _thrp_kill (0, 4, 6, ff38c000, 4, ff2c0458) + f8
 ff24bce0 raise    (6, 0, 0, ffffffff, ff2c03c4, ff3931fc) + 40
 ff235984 abort    (ff2bc008, fe5007d8, 0, fffffff8, 4, fe5007f9) + 100
 fe33987c void os::abort(int) (1, fe3d15ca, fe500888, fe3fa000, fe40f908, 35e820) + 80
 fe337974 void os::handle_unexpected_exception(Thread*,int,unsigned char*,void*) (0, b, fe1a832c, fe5015c0, b, 0) + 2cc
 fe33d7e8 JVM_handle_solaris_signal (fe1a832c, fe5015c0, fe501308, 3400, 3558, 0) + 8a4
 ff37b118 __sighndlr (b, fe5015c0, fe501308, fe33b2f4, fe501e40, fe501e30) + c
 ff37811c sigacthandler (b, fe501d98, 0, 0, 0, ff38c000) + 708
 --- called from signal handler with signal 11 (SIGSEGV) ---
 fe1a832c void ContiguousSpace::prepare_for_compaction(CompactPoint*) (b5416ed0, b5400000, fe3fa000, 0, f9800000, 6) + 1e4
 fe1a8104 void Generation::prepare_for_compaction(CompactPoint*) (98898, fe50176c, f5583800, 97cbc, fe421904, 138a8) + 2c
 fe1a809c void GenCollectedHeap::prepare_for_compaction() (98778, fe3bf331, 0, 1, 2e1b8, 0) + 3c
 fe1a4e74 void GenMarkSweep::invoke_at_safepoint(int,ReferenceProcessor*,int) (fe426b1c, 494ff48, 0, 4000, 4c00, 4ec8) + 28c
 fe1a4bb8 void OneContigSpaceCardGeneration::collect(int,int,unsigned,int,int) (9a1c0, 1, 0, 0, 0, 0) + 34
 fe18afa4 void GenCollectedHeap::do_collection(int,int,unsigned,int,int,int,int&) (0, fe3fa000, 0, fe40b1f8, 0, 1) + 4f0
 fe1b2400 void GenCollectedHeap::do_full_collection(int,int,int&) (98778, 0, 1, b2f815f4, fe3fa000, 6) + 20
 fe16c978 void VM_Operation::evaluate() (b2f815d8, 28d888, fe3fa000, 2e908, 3a7230, fe1697d0) + 8c
 fe16c7f8 void VMThread::evaluate_operation(VM_Operation*) (bef90, b2f815d8, 4a04, 4800, 4b20, 0) + 84
 fe0c6c28 void VMThread::loop() (4000, 3c00, 3f28, 3c00, 3ed0, 3800) + 3e0
 fe0c65f0 void VMThread::run() (bef90, 0, fe4193e0, ffff8000, 0, ff38c000) + 8c
 fe0c64dc _start   (bef90, ff38d658, 1, 1, ff38c000, 0) + 134
 ff37b01c _thread_start (bef90, 0, 0, 0, 0, 0) + 40


Heap at VM Abort:
Heap
 def new generation   total 235968K, used 3791K [0xb5400000, 0xc5400000, 0xc5400000)
  eden space 209792K,   1% used [0xb5400000, 0xb57b3e88, 0xc20e0000)
  from space 26176K,   0% used [0xc3a70000, 0xc3a70000, 0xc5400000)
  to   space 26176K,   0% used [0xc20e0000, 0xc20e0000, 0xc3a70000)
 tenured generation   total 262144K, used 71280K [0xc5400000, 0xd5400000, 0xf5400000)
   the space 262144K,  27% used [0xc5400000, 0xc999c0c0, 0xc98f0a00, 0xd5400000)
 compacting perm gen  total 15872K, used 15858K [0xf5400000, 0xf6380000, 0xf9400000)
   the space 15872K,  99% used [0xf5400000, 0xf637c9a8, 0xf637ca00, 0xf6380000)
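
The young-generation sizes in this printout can be reproduced from the -XX:NewSize=256m and -XX:SurvivorRatio=8 switches above. The following is a rough sketch of that arithmetic only, not the VM's sizing code. The function name and the 64K survivor alignment are assumptions; the alignment was picked because it reproduces the printed numbers, and the report itself does not state it.

```python
def young_gen_layout_kb(new_size_kb, survivor_ratio, align_kb=64):
    """Return (eden, survivor, reported_total) in KB.

    Assumes each survivor space is new_size / (ratio + 2), rounded down
    to align_kb, and that eden takes whatever is left. The reported
    total counts eden plus ONE survivor space.
    """
    raw = new_size_kb // (survivor_ratio + 2)
    survivor = raw - raw % align_kb
    eden = new_size_kb - 2 * survivor
    return eden, survivor, eden + survivor


if __name__ == "__main__":
    new_size_kb = 256 * 1024  # -XX:NewSize=256m
    eden, survivor, total = young_gen_layout_kb(new_size_kb, 8)
    print(eden, survivor, total)
    # Initial tenured size is what remains of -Xms512m after the new generation.
    print(512 * 1024 - new_size_kb)
```

Under those assumptions the result matches the eden, from/to, and "def new generation total" figures printed above, and the remainder of -Xms512m matches the tenured generation total.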

###@###.### 2004-08-23
###@###.### 2004-08-23

                                    

Comments
SUGGESTED FIX

http://analemma.sfbay/net/spot/archive02/ysr/dragon_work/webrev/index.html

You may use the following regression test for this bug, as well
as for 5101288, which Tom found for the Visa case:

% pwd
/net/smite.sfbay/never/5090967

% java -test_server -XX:+PrintGCDetails -XX:+ShowMessageBoxOnError -Djava.library.path="/net/smite.sfbay/never/5090967" -Xmx8m -Xms8m -XX:NewSize=6m -XX:+UseSerialGC jnialloc

--------------------------------------------------------------------
Fix putback to 1.5.0_01 on 9/16 (and to gc_baseline so it'll go to
1.6.0/Mustang in due course):

Event:            putback-to
Parent workspace: /net/jano.sfbay/export/disk05/hotspot/ws/1.5/tiger_update1_baseline
                  (jano.sfbay:/export/disk05/hotspot/ws/1.5/tiger_update1_baseline)
Child workspace:  /prt-workspaces/20040916084211.ysr.tiger_update/workspace
                  (prt-web:/prt-workspaces/20040916084211.ysr.tiger_update/workspace)
User:             ysr

Comment:

---------------------------------------------------------

Original workspace:     neeraja:/net/spot/scratch/ysr/tiger_update
Submitter:              ysr
Archived data:          /net/prt-archiver.sfbay/data/archived_workspaces/1.5/tiger_update1_baseline/2004/20040916084211.ysr.tiger_update/
Webrev:                 http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/1.5/tiger_update1_baseline/2004/20040916084211.ysr.tiger_update/workspace/webrevs/webrev-2004.09.16/index.html

Fixed 5090967: SIGSEGV in ContiguousSpace::prepare_for_compaction(CompactPoint*)

http://analemma.sfbay/net/spot/archive02/ysr/dragon_work/webrev

The problem was a pending exception in the context of the
allocating thread when calling in from JNI for an allocation.
This in turn would cause storage to be allocated, but a
subsequent exception check to fail to initialize such
storage, setting the stage for a possible subsequent
mark-compact collection to crash when trying to
navigate the uninitialized space. Although in this case,
the caller was violating the spec (but see also bug
5101288, found by Tom and fixed by Mingyao), it turns out
that we can harden the JVM's memory system against such
user error by simply checking for the presence of such
exceptions before allocating storage.
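
The failure mode and the hardening described above can be illustrated with a toy model. This is a minimal sketch in Python, not HotSpot code: ToySpace, allocate, walk and demo are hypothetical names, None stands in for uninitialized storage, and a single header word holding the object size stands in for a real object header.

```python
class PendingException(Exception):
    """Toy stand-in for 'the allocating thread has an exception pending'."""


class ToySpace:
    """Toy contiguous space: a word array plus a bump pointer (top)."""

    def __init__(self, words):
        self.mem = [None] * words  # None models uninitialized storage
        self.top = 0

    def bump(self, size):
        """Reserve `size` words and return the index of the first one."""
        if self.top + size > len(self.mem):
            raise MemoryError("toy space exhausted")
        start = self.top
        self.top += size
        return start


def allocate(space, size, exception_pending, check_before):
    """Allocate an object whose first word (the header) records its size."""
    if check_before and exception_pending:
        # Hardened order: refuse before any storage is reserved.
        raise PendingException("pending exception, nothing allocated")
    start = space.bump(size)
    if exception_pending:
        # Buggy order: storage is reserved but never initialized.
        raise PendingException("pending exception, storage left uninitialized")
    space.mem[start] = size
    return start


def walk(space):
    """Hop from header to header, as a compaction prepare phase would."""
    sizes, cur = [], 0
    while cur < space.top:
        size = space.mem[cur]
        if not isinstance(size, int) or size <= 0:
            raise RuntimeError(f"uninitialized header at word {cur}")
        sizes.append(size)
        cur += size
    return sizes


def demo(check_before):
    """Allocate, try one allocation with an exception pending, allocate again, walk."""
    space = ToySpace(16)
    allocate(space, 4, exception_pending=False, check_before=check_before)
    try:
        allocate(space, 3, exception_pending=True, check_before=check_before)
    except PendingException:
        pass  # the toy caller carries on regardless
    allocate(space, 2, exception_pending=False, check_before=check_before)
    return walk(space)
```

In this model, demo(check_before=True) walks the space cleanly, while demo(check_before=False) fails at the first word of the reserved-but-uninitialized gap. The toy walk raises a clean error there; the real collector had only garbage to follow.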

There was no performance impact from the extra check,
presumably because it's only in the slow path of the
allocation. Another possible alternative, suggested by
Tao Ma, is to not do the pre-allocation check but,
in the post-allocation check, convert the allocated
storage into a filler object if an exception is pending
but space has been allocated. That would avoid the extra
check that this change introduces (at the expense of
allocating the extra storage). Since there was
no performance impact from our current fix, we
chose to stay with it.
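
The filler-object alternative can be sketched in the same toy style. This is an illustration under stated assumptions, not the code that was considered: the names allocate_with_filler and walk, the tuple headers, and the word-array heap are all hypothetical.

```python
FILLER = "filler"
OBJECT = "object"


def allocate_with_filler(mem, top, size, exception_pending):
    """Bump-allocate `size` words at `top`.

    Returns (new_top, start). If an exception is pending, the storage
    is still consumed, but it is stamped as a filler so a later linear
    walk can step over it; start is then None.
    """
    if top + size > len(mem):
        raise MemoryError("toy space exhausted")
    new_top = top + size
    if exception_pending:
        mem[top] = (FILLER, size)  # dead but walkable
        return new_top, None
    mem[top] = (OBJECT, size)
    return new_top, top


def walk(mem, top):
    """Hop from header to header and report what was found."""
    found, cur = [], 0
    while cur < top:
        kind, size = mem[cur]
        found.append((kind, size))
        cur += size
    return found


if __name__ == "__main__":
    mem, top = [None] * 16, 0
    top, a = allocate_with_filler(mem, top, 4, exception_pending=False)
    top, b = allocate_with_filler(mem, top, 3, exception_pending=True)
    top, c = allocate_with_filler(mem, top, 2, exception_pending=False)
    print(walk(mem, top), "words wasted on fillers:", 3 if b is None else 0)
```

The trade-off is visible in the model: the space stays walkable without a pre-allocation check, but the words handed to the failed request are wasted until the next collection.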

Thanks to Poonam Bajaj, Dave Detlefs, Peter Kessler,
Tao Ma and Tom Rodriguez for debugging help and suggestions,
core file analysis and discussion.

Reviewed by: Tom Rodriguez, Tao Ma, Peter Kessler(*)
Approved for Tiger Update 1 by: Jerry Driscoll / Debra Chapatte of
                                TU1 bug team

Fix Verified: yes and no (see below)

Verification Testing:
   . regression test from Tom (thanks!), run with the params below to
     force a mark-compact while Eden holds uninitialized storage,
     fails immediately without the fix and passes with it:

     java -Djava.library.path="/net/smite.sfbay/never/5090967"
          -Xmx8m -Xms8m -XX:NewSize=6m -XX:+UseSerialGC jnialloc

   . field verification not yet done (and may not be possible
     since offending S1WS code may already have been fixed)
     
Testing:
  . refWorkload, PRT, runThese

Performance (no change): refworkload, alacrity

(*) some clean-ups suggested for Mustang

Files:
update: src/share/vm/gc_interface/collectedHeap.inline.hpp

Examined files: 3130

Contents Summary:
       1   update
    3129   no action (unchanged)
----------------------------------------------------------------------

here's the requisite putback info for gc_baseline:

Event:            putback-to
Parent workspace: /net/jano.sfbay/export/disk05/hotspot/ws/main/gc_baseline
                  (jano.sfbay:/export/disk05/hotspot/ws/main/gc_baseline)
Child workspace:  /prt-workspaces/20040916004420.ysr.dragon_work/workspace
                  (prt-web:/prt-workspaces/20040916004420.ysr.dragon_work/workspace)
User:             ysr

Comment:

---------------------------------------------------------

Original workspace:     sr1-unwk-17:/net/spot/archive02/ysr/dragon_work
Submitter:              ysr
Archived data:          /net/prt-archiver.sfbay/data/archived_workspaces/main/gc_baseline/2004/20040916004420.ysr.dragon_work/
Webrev:                 http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/main/gc_baseline/2004/20040916004420.ysr.dragon_work/workspace/webrevs/webrev-2004.09.16/index.html

Fixed 5090967: SIGSEGV in ContiguousSpace::prepare_for_compaction(CompactPoint*)

...
-------------------------------------------------------------------------
                                     
2004-09-24
WORK AROUND

Statutory warning: The surgeon-general has determined that making JNI calls
with exceptions pending in the thread context is injurious to the
application and can cause the application to die.
                                     
2004-09-24
EVALUATION

Two core files from the first customer show a corrupted object in Eden,
which confounds the phase 2 scan during mark-compact. Possible
allocation gotchas are being investigated.
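
The heap printout and the crashing frame in the Description are at least consistent with an Eden problem. The sketch below checks which of the six values pstack printed for the ContiguousSpace::prepare_for_compaction frame fall inside the used part of eden. The addresses are copied from the Description; the helper name is made up for illustration. The values pstack prints are not guaranteed to be the function's actual arguments, so this is a sanity check and not proof.

```python
# eden space [bottom, top, end) from "Heap at VM Abort"
EDEN_BOTTOM, EDEN_TOP, EDEN_END = 0xB5400000, 0xB57B3E88, 0xC20E0000

# Values printed for the ContiguousSpace::prepare_for_compaction frame
FRAME_VALUES = [0xB5416ED0, 0xB5400000, 0xFE3FA000, 0x0, 0xF9800000, 0x6]


def in_used_eden(addr):
    """True if addr lies in the allocated part of eden, [bottom, top)."""
    return EDEN_BOTTOM <= addr < EDEN_TOP


if __name__ == "__main__":
    for value in FRAME_VALUES:
        if in_used_eden(value):
            print(hex(value), "is", value - EDEN_BOTTOM, "bytes into eden")
    print("eden used:", (EDEN_TOP - EDEN_BOTTOM) // 1024, "K")
```

Two of the six values land in used eden: the eden bottom itself and an address a short way above it. The used size computed from the addresses also agrees with the "used 3791K" figure in the printout.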


xxx@xxx 2004-09-07: The problem was that an allocation
was attempted via the JNI interface while an exception was pending in
the allocating thread's context. The JNI spec requires that such pending
exceptions in the JNI environment be cleared before calling any JNI method
(other than those listed below):

-----------------------------------------------------------------------------
Exception Handling

There are two ways to handle an exception in native code:

    * The native method can choose to return immediately, causing the exception to be thrown in the Java code that initiated the native method call.
    * The native code can clear the exception by calling ExceptionClear(), and then execute its own exception-handling code.

After an exception has been raised, the native code must first clear the exception before making other JNI calls. When there is a pending exception, the only JNI functions that are safe to call are ExceptionOccurred(), ExceptionDescribe(), and ExceptionClear(). The ExceptionDescribe() function prints a debugging message about the pending exception.

-----------------------------------------------------------------------------
For the original, see:
http://java.sun.com/docs/books/jni/html/exceptions.html#26383
http://java.sun.com/docs/books/jni/html/design.html#2193

For the latest, see:
http://java.sun.com/j2se/{1.4.2,1.5.0}/docs/guide/jni/spec/design.html#wp770
                                     
2004-09-24
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX:
1.5.0_01
mustang

FIXED IN:
1.5.0_01
mustang

INTEGRATED IN:
1.5.0_01


                                     
2004-09-24


