JDK-5090967 : SIGSEGV in ContiguousSpace::prepare_for_compaction(CompactPoint*)
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 1.4.2_04,1.4.2_05,1.4.2_06
  • Priority: P1
  • Status: Resolved
  • Resolution: Fixed
  • OS: solaris_1,solaris_8,solaris_9
  • CPU: sparc
  • Submitted: 2004-08-24
  • Updated: 2012-10-09
  • Resolved: 2004-09-24
  • Other: 1.4.2_07 (Fixed)
  • JDK 6: 6 (Fixed)
Description
The customer (cu) is running 1.4.2_04 on Solaris 8 with the default libthread and is hitting SIGSEGVs. The pstack data from multiple crashes shows the same stack.

A couple of notes:

1.) The GC logs (and all other logs) from -Xloggc:[gcstats.log] are available at: /net/cores.east/cores/dir6/64225774/082004 .

2.) They are running -client, but we have seen another cu (Univ of Wisc) hit the exact same bug this morning with -server.

3.) They are not willing to test out additional VM switches; they want a patch asap.

4.) The other cu (Univ of Wisc) saw the *exact* same problem with the same VM but running -server, and was able to get relief by increasing -Xmx from 512 MB to 1024 MB. However, since VISA already runs with -Xmx1024m, we think this bug will still hit Univ of Wisc in a matter of time.

Here are the switches/arguments used to start the VM:

exec /opt/JREntServer/j2sdk1.4.2_04/jre/bin/java -Dengine.name=jrentserver -Dserver.name=SS55PVISREP01CCDR -cp $CLASSPATH -Dfile.encoding=UTF-8 -Dinstall.root="/opt/JREntServer"  -Dreporthome="/opt/JREntServer" -Xms512m -Xmx1024m -Xverify:none -Xnoclassgc -XX:NewSize=256m -XX:MaxNewSize=512m -XX:MaxPermSize=64m -XX:SurvivorRatio=8 -XX:TargetSurvivorRatio=50 -XX:+MaxFDLimit jet.server.JREntServer -vDebug -vError -logall "$@" >> /opt/JREntServer/logs/SystemOut.log 2>&1 
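
The young-generation flags above can be checked against the heap summary later in this report. The following is a minimal sketch, assuming the usual reading of -XX:SurvivorRatio (eden is `ratio` times one survivor space) and assuming a 64 KB round-down alignment; the alignment value and the names `YoungLayout` and `young_layout` are illustrative assumptions, not taken from the report or from HotSpot source.

```cpp
#include <cstdint>

// Sizes are in KB. Hypothetical helper types for illustration only.
struct YoungLayout {
    std::uint64_t eden_kb;
    std::uint64_t survivor_kb;
};

// SurvivorRatio = eden / one survivor space, so the young generation holds
// (ratio + 2) survivor-sized units: eden plus the "from" and "to" spaces.
// align_kb = 64 is an assumption chosen because it reproduces the logged sizes.
YoungLayout young_layout(std::uint64_t new_size_kb,
                         std::uint64_t survivor_ratio,
                         std::uint64_t align_kb = 64) {
    std::uint64_t survivor = new_size_kb / (survivor_ratio + 2);
    survivor -= survivor % align_kb;              // round down to the alignment
    return YoungLayout{new_size_kb - 2 * survivor, survivor};
}
```

With -XX:NewSize=256m and -XX:SurvivorRatio=8, `young_layout(256 * 1024, 8)` gives a survivor space of 26176K and an eden of 209792K, which are the sizes printed in the heap summary. The "total 235968K" reported for the new generation equals eden plus one survivor space.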



core 'core' of 10450:   /opt/JREntServer/j2sdk1.4.2_04/jre/bin/java -Dengine.name=jrentserver
-----------------  lwp# 4 / thread# 4  --------------------
 ff3791f0 __sigprocmask (ff37b7f4, 0, 0, fe501d98, ff38c000, 0) + 8
 ff36dd0c _sigon   (fe501d98, ff3938a8, 6, fe500684, fe501d98, ff39321c) + d0
 ff370d4c _thrp_kill (0, 4, 6, ff38c000, 4, ff2c0458) + f8
 ff24bce0 raise    (6, 0, 0, ffffffff, ff2c03c4, ff3931fc) + 40
 ff235984 abort    (ff2bc008, fe5007d8, 0, fffffff8, 4, fe5007f9) + 100
 fe33987c void os::abort(int) (1, fe3d15ca, fe500888, fe3fa000, fe40f908, 35e820) + 80
 fe337974 void os::handle_unexpected_exception(Thread*,int,unsigned char*,void*) (0, b, fe1a832c, fe5015c0, b, 0) + 2cc
 fe33d7e8 JVM_handle_solaris_signal (fe1a832c, fe5015c0, fe501308, 3400, 3558, 0) + 8a4
 ff37b118 __sighndlr (b, fe5015c0, fe501308, fe33b2f4, fe501e40, fe501e30) + c
 ff37811c sigacthandler (b, fe501d98, 0, 0, 0, ff38c000) + 708
 --- called from signal handler with signal 11 (SIGSEGV) ---
 fe1a832c void ContiguousSpace::prepare_for_compaction(CompactPoint*) (b5416ed0, b5400000, fe3fa000, 0, f9800000, 6) + 1e4
 fe1a8104 void Generation::prepare_for_compaction(CompactPoint*) (98898, fe50176c, f5583800, 97cbc, fe421904, 138a8) + 2c
 fe1a809c void GenCollectedHeap::prepare_for_compaction() (98778, fe3bf331, 0, 1, 2e1b8, 0) + 3c
 fe1a4e74 void GenMarkSweep::invoke_at_safepoint(int,ReferenceProcessor*,int) (fe426b1c, 494ff48, 0, 4000, 4c00, 4ec8) + 28c
 fe1a4bb8 void OneContigSpaceCardGeneration::collect(int,int,unsigned,int,int) (9a1c0, 1, 0, 0, 0, 0) + 34
 fe18afa4 void GenCollectedHeap::do_collection(int,int,unsigned,int,int,int,int&) (0, fe3fa000, 0, fe40b1f8, 0, 1) + 4f0
 fe1b2400 void GenCollectedHeap::do_full_collection(int,int,int&) (98778, 0, 1, b2f815f4, fe3fa000, 6) + 20
 fe16c978 void VM_Operation::evaluate() (b2f815d8, 28d888, fe3fa000, 2e908, 3a7230, fe1697d0) + 8c
 fe16c7f8 void VMThread::evaluate_operation(VM_Operation*) (bef90, b2f815d8, 4a04, 4800, 4b20, 0) + 84
 fe0c6c28 void VMThread::loop() (4000, 3c00, 3f28, 3c00, 3ed0, 3800) + 3e0
 fe0c65f0 void VMThread::run() (bef90, 0, fe4193e0, ffff8000, 0, ff38c000) + 8c
 fe0c64dc _start   (bef90, ff38d658, 1, 1, ff38c000, 0) + 134
 ff37b01c _thread_start (bef90, 0, 0, 0, 0, 0) + 40


Heap at VM Abort:
Heap
 def new generation   total 235968K, used 3791K [0xb5400000, 0xc5400000, 0xc5400000)
  eden space 209792K,   1% used [0xb5400000, 0xb57b3e88, 0xc20e0000)
  from space 26176K,   0% used [0xc3a70000, 0xc3a70000, 0xc5400000)
  to   space 26176K,   0% used [0xc20e0000, 0xc20e0000, 0xc3a70000)
 tenured generation   total 262144K, used 71280K [0xc5400000, 0xd5400000, 0xf5400000)
   the space 262144K,  27% used [0xc5400000, 0xc999c0c0, 0xc98f0a00, 0xd5400000)
 compacting perm gen  total 15872K, used 15858K [0xf5400000, 0xf6380000, 0xf9400000)
   the space 15872K,  99% used [0xf5400000, 0xf637c9a8, 0xf637ca00, 0xf6380000)
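
The address ranges in the heap summary can be cross-checked against the crashing frame. The sketch below is illustrative only: it treats each range as [bottom, top, end), ignores the extra third value printed on the "the space" lines, and makes no claim about which variable the register value `b5416ed0` held at the time of the crash. The names `Space`, `kEden`, `kTenured` and `kFrameValue` are hypothetical.

```cpp
#include <cstdint>

// A space as printed in the heap summary: used part is [bottom, top),
// committed capacity ends at `end`. 32-bit addresses, as on this SPARC VM.
struct Space {
    std::uint32_t bottom, top, end;
    constexpr std::uint32_t used_kb() const     { return (top - bottom) / 1024; }
    constexpr std::uint32_t capacity_kb() const { return (end - bottom) / 1024; }
    constexpr bool in_used_part(std::uint32_t addr) const {
        return addr >= bottom && addr < top;
    }
};

// Values copied from the heap summary above.
constexpr Space kEden   {0xb5400000u, 0xb57b3e88u, 0xc20e0000u};
constexpr Space kTenured{0xc5400000u, 0xc999c0c0u, 0xd5400000u};

// First value shown by pstack for the ContiguousSpace::prepare_for_compaction frame.
constexpr std::uint32_t kFrameValue = 0xb5416ed0u;
```

The arithmetic reproduces the logged figures: eden is 3791K used of 209792K, and tenured is 71280K used of 262144K. The frame value lies inside the used part of eden, 93904 bytes above its bottom (which is also the second value in that frame), which is at least consistent with the evaluation's finding of a corrupted object in Eden.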

###@###.### 2004-08-23
###@###.### 2004-08-23

Comments
CONVERTED DATA
BugTraq+ Release Management Values
  • COMMIT TO FIX: 1.5.0_01, mustang
  • FIXED IN: 1.5.0_01, mustang
  • INTEGRATED IN: 1.5.0_01
24-09-2004

EVALUATION
2 cores from first customer show corrupted object in Eden which confounds phase2 scan during mark-compact. Possibility of allocation gotchas being investigated.

xxx@xxx 2004-09-07: The problem was that an allocation was attempted via the JNI interface while an exception was pending in the allocating thread's context. The JNI spec requires that such pending exceptions in the JNI environment be cleared before calling any JNI method (other than those listed below):

-----------------------------------------------------------------------------
Exception Handling

There are two ways to handle an exception in native code:

* The native method can choose to return immediately, causing the exception to be thrown in the Java code that initiated the native method call.
* The native code can clear the exception by calling ExceptionClear(), and then execute its own exception-handling code.

After an exception has been raised, the native code must first clear the exception before making other JNI calls. When there is a pending exception, the only JNI functions that are safe to call are ExceptionOccurred(), ExceptionDescribe(), and ExceptionClear(). The ExceptionDescribe() function prints a debugging message about the pending exception.
-----------------------------------------------------------------------------

For the original, see:
http://java.sun.com/docs/books/jni/html/exceptions.html#26383
http://java.sun.com/docs/books/jni/html/design.html#2193

For the latest, see:
http://java.sun.com/j2se/{1.4.2,1.5.0}/docs/guide/jni/spec/design.html#wp770
24-09-2004
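
The caller-side rule quoted in the evaluation can be sketched with a toy model. `ToyEnv` below is a hypothetical stand-in and is not the real `jni.h` interface: its `ExceptionOccurred()` returns a plain flag rather than a `jthrowable`, and `Throw()` and `NewThing()` are invented for illustration. It only shows the ordering that native code is expected to follow: check for a pending exception, clear (and handle) it, and only then make further calls.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a JNI environment; NOT the real JNIEnv API.
class ToyEnv {
public:
    void Throw()                   { pending_ = true; }   // models a raised Java exception
    bool ExceptionOccurred() const { return pending_; }
    void ExceptionClear()          { pending_ = false; }

    // Stands in for any other call, such as an allocation. The toy reports
    // the spec violation; the affected VM silently corrupted the heap instead.
    int NewThing() {
        if (pending_) {
            throw std::logic_error("call made with an exception pending");
        }
        return ++allocated_;
    }

private:
    bool pending_ = false;
    int allocated_ = 0;
};

// Pattern from the quoted text: clear the pending exception first,
// run the native code's own handling, then continue making calls.
int handle_then_allocate(ToyEnv& env) {
    if (env.ExceptionOccurred()) {
        env.ExceptionClear();
        // native exception-handling code would go here
    }
    return env.NewThing();
}
```

The other option named in the quoted text, returning immediately from the native method so the exception propagates to Java, needs no extra calls at all and is therefore always safe.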

WORK AROUND
Statutory warning: The surgeon-general has determined that making JNI calls with exceptions pending in the thread context is injurious to the application and can cause the application to die.
24-09-2004

SUGGESTED FIX
http://analemma.sfbay/net/spot/archive02/ysr/dragon_work/webrev/index.html

You may use the following regression test for this bug, as well as for 5101288, which Tom found for the Visa case:

% pwd
/net/smite.sfbay/never/5090967
% java -test_server -XX:+PrintGCDetails -XX:+ShowMessageBoxOnError -Djava.library.path="/net/smite.sfbay/never/5090967" -Xmx8m -Xms8m -XX:NewSize=6m -XX:+UseSerialGC jnialloc

--------------------------------------------------------------------
Fix putback to 1.5.0_01 on 9/16 (and to gc_baseline so it'll go to 1.6.0/Mustang in due course):

Event: putback-to
Parent workspace: /net/jano.sfbay/export/disk05/hotspot/ws/1.5/tiger_update1_baseline
  (jano.sfbay:/export/disk05/hotspot/ws/1.5/tiger_update1_baseline)
Child workspace: /prt-workspaces/20040916084211.ysr.tiger_update/workspace
  (prt-web:/prt-workspaces/20040916084211.ysr.tiger_update/workspace)
User: ysr

Comment:
---------------------------------------------------------
Original workspace: neeraja:/net/spot/scratch/ysr/tiger_update
Submitter: ysr
Archived data: /net/prt-archiver.sfbay/data/archived_workspaces/1.5/tiger_update1_baseline/2004/20040916084211.ysr.tiger_update/
Webrev: http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/1.5/tiger_update1_baseline/2004/20040916084211.ysr.tiger_update/workspace/webrevs/webrev-2004.09.16/index.html

Fixed 5090967: SIGSEGV in ContiguousSpace::prepare_for_compaction(CompactPoint*)

http://analemma.sfbay/net/spot/archive02/ysr/dragon_work/webrev

The problem was a pending exception in the context of the allocating thread when calling in from JNI for an allocation. This in turn would cause storage to be allocated, but a subsequent exception check to fail to initialize such storage, setting the stage for a possible subsequent mark-compact collection to crash when trying to navigate the uninitialized space.

Although in this case, the caller was violating the spec (but see also bug 5101288, found by Tom and fixed by Mingyao), it turns out that we can harden the JVM's memory system against such user error by simply checking for the presence of such exceptions before allocating storage. There was no performance impact from the extra check, presumably because it's only in the slow path of the allocation.

Another possible alternative, suggested by Tao Ma, is to not do the pre-allocation check but, in the post-allocation check, convert the allocated storage into a filler object if an exception is pending but space has been allocated. That would avoid the extra check that this change introduces (at the expense of allocating the extra storage). Since there was no performance impact from our current fix, we chose to stay with it.

Thanks to Poonam Bajaj, Dave Detlefs, Peter Kessler, Tao Ma and Tom Rodriguez for debugging help and suggestions, core file analysis and discussion.

Reviewed by: Tom Rodriguez, Tao Ma, Peter Kessler(*)
Approved for Tiger Update 1 by: Jerry Driscoll / Debra Chapatte of TU1 bug team

Fix Verified: yes and no (see below)

Verification Testing:
  . regression test from Tom (thanks!) run with the following params to cause mark-compact with uninit storage in Eden (before fix) fails immediately without fix and passes with fix:
    java -Djava.library.path="/net/smite.sfbay/never/5090967" -Xmx8m -Xms8m -XX:NewSize=6m -XX:+UseSerialGC jnialloc
  . field verification not yet done (and may not be possible since offending S1WS code may already have been fixed)

Testing:
  . refWorkload, PRT, runThese

Performance (no change): refworkload, alacrity

(*) some clean-ups suggested for Mustang

Files:
update: src/share/vm/gc_interface/collectedHeap.inline.hpp

Examined files: 3130

Contents Summary:
      1 update
   3129 no action (unchanged)

----------------------------------------------------------------------
here's the requisite putback info for gc_baseline:

Event: putback-to
Parent workspace: /net/jano.sfbay/export/disk05/hotspot/ws/main/gc_baseline
  (jano.sfbay:/export/disk05/hotspot/ws/main/gc_baseline)
Child workspace: /prt-workspaces/20040916004420.ysr.dragon_work/workspace
  (prt-web:/prt-workspaces/20040916004420.ysr.dragon_work/workspace)
User: ysr

Comment:
---------------------------------------------------------
Original workspace: sr1-unwk-17:/net/spot/archive02/ysr/dragon_work
Submitter: ysr
Archived data: /net/prt-archiver.sfbay/data/archived_workspaces/main/gc_baseline/2004/20040916004420.ysr.dragon_work/
Webrev: http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/main/gc_baseline/2004/20040916004420.ysr.dragon_work/workspace/webrevs/webrev-2004.09.16/index.html

Fixed 5090967: SIGSEGV in ContiguousSpace::prepare_for_compaction(CompactPoint*)
...
-------------------------------------------------------------------------
24-09-2004
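
The failure mode and the two remedies discussed in the suggested fix can be illustrated with a toy bump-pointer space. This is a sketch under stated assumptions, not the code in collectedHeap.inline.hpp: `ToyEden`, `Policy`, and the one-word "size header" object layout are all hypothetical, and the toy reports an unparsable heap by returning false where the real VM took a SIGSEGV.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical policies: the buggy behaviour, the fix that was putback,
// and the filler-object alternative mentioned in the putback comment.
enum class Policy { Unchecked, PreCheck, FillerOnFailure };

class ToyEden {
public:
    explicit ToyEden(std::size_t words) : mem_(words, 0) {}

    // Returns the start index of the new object, or -1 if the caller gets no
    // object. `pending` models an exception pending in the allocating thread.
    long allocate(std::size_t size_words, bool pending, Policy policy) {
        if (policy == Policy::PreCheck && pending) {
            return -1;                    // chosen fix: bail out before taking storage
        }
        if (size_words == 0 || top_ + size_words > mem_.size()) {
            return -1;                    // toy: no GC, just fail when full
        }
        const std::size_t start = top_;
        top_ += size_words;               // storage now belongs to the used region
        if (pending) {                    // post-allocation exception check
            if (policy == Policy::FillerOnFailure) {
                // alternative: turn the gap into a dummy object so scans can skip it
                mem_[start] = static_cast<std::uint32_t>(size_words);
            }
            return -1;                    // Unchecked: header is left uninitialized
        }
        mem_[start] = static_cast<std::uint32_t>(size_words);  // normal initialization
        return static_cast<long>(start);
    }

    // Linear scan in the style of a compaction prepare phase: hop from one
    // object header to the next until the top of the used region.
    bool parsable() const {
        std::size_t q = 0;
        while (q < top_) {
            const std::uint32_t size = mem_[q];
            if (size == 0 || q + size > top_) {
                return false;             // cannot navigate past this point
            }
            q += size;
        }
        return true;
    }

    std::size_t used_words() const { return top_; }

private:
    std::vector<std::uint32_t> mem_;
    std::size_t top_ = 0;
};
```

In this model, an allocation of 8 words with an exception pending leaves an unparsable hole under `Policy::Unchecked`, takes no storage at all under `Policy::PreCheck`, and consumes 8 words of filler under `Policy::FillerOnFailure`. That mirrors the trade-off described above: the pre-check adds one test on the slow path, while the filler approach avoids the test at the cost of the wasted storage.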

PUBLIC COMMENTS
Same text as the Description above (customer report, VM arguments, pstack trace, and heap summary).
23-08-2004