JDK-6372906 : JVM crashes when classes.jsa file is corrupted
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 6
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: solaris
  • CPU: sparc
  • Submitted: 2006-01-16
  • Updated: 2012-02-01
  • Resolved: 2006-02-15
Fix Version (JDK 6): 6 b72 (Fixed)
Description
During nightly testing, many tests sometimes fail with SIGBUS.
An example log is available at:

http://vmsqe.sfbay/nightly/mantis/DTWS/results/01-13-06/ServerVM/64BITSOLSPARC/mixed/Main_Baseline/vm.gc-NIGHTLY-Main_Baseline-ServerVM-mixed-64BITSOLSPARC-2006-01-14-06-52-56/ResultDir/allocate001/allocate001.tlog

#
# An unexpected error has been detected by Java Runtime Environment:
#
#  SIGBUS (0xa) at pc=0xfec6658c, pid=13791, tid=2
#
# Java VM: Java HotSpot(TM) Client VM (20060113030552.ap159146.gc_merge mixed mode, sharing)
# Problematic frame:
# Segmentation Fault (core dumped)

Here is the pstack output:

-----------------  lwp# 2 / thread# 2  --------------------
 fee59944 void frame::print_on_error(outputStream*,char*,int,bool)const (f788, fe9fb608, ff1b4d18, 7d0, fe9fb558, ff1a7ca8) + 30
 ff06b08c void VMError::report(outputStream*) (fe9fb608, ff1b4d18, fe9fb6a4, 20d18, ff13f016, ff194000) + 350
 ff06bf24 void VMError::report_and_die() (fe9fb6a4, c9fe, 13c00, 0, 214e8, 2) + 474
 fed647a4 JVM_handle_solaris_signal (a, fe9fbb88, fe9fb8d0, 97400, 30000, fec6658c) + 9bc
 ff385fec __sighndlr (a, fe9fbb88, fe9fb8d0, fed63dc8, 0, 0) + c
 ff37fdd8 call_user_handler (a, fe9fbb88, fe9fb8d0, 0, 0, 0) + 234
 ff37ff88 sigacthandler (a, fe9fbb88, fe9fb8d0, 54fe04, ff194000, 34e20) + 64
 --- called from signal handler with signal 10 (SIGBUS) ---
 fec6658c void CompactingPermGenGen::initialize_oops() (32b70, ff19fac0, bac0, ff194000, 52daa4, b800) + 34
 fec5f7d4 int universe_init() (16800, 16800, 32b70, cc00, 32d78, ff1aaa88) + 388
 fec4e51c int init_globals() (16000, 16170, 0, ff1aa178, fe9fbd5c, ff1232b0) + 44
 ff031d40 int Threads::create_vm(JavaVMInitArgs*,bool*) (12db8, fe9fbf1b, 30000, 16400, ff1aa504, ff194000) + 290
 fec45938 JNI_CreateJavaVM (fe9fbf94, fe9fbf90, fe9fbf80, 10002, 54e794, ff194000) + d0
 00012664 JavaMain (fec45868, 2b0cc, 0, 0, 0, 0) + 188
 ff385c94 _lwp_start (0, 0, 0, 0, 0, 0)
-----------------  lwp# 1 / thread# 1  --------------------
 ff31cb30 _lwp_wait (2, ffbff234, 110a0, ff371d18, 5, 0) + 8
 ff379844 _thrp_join (2, 0, ffbff2f8, 1, 0, ffbff2fc) + 44
 ff3799b8 thr_join (2, 0, ffbff2f8, ffbff388, 0, ffbff2fc) + 10
 000188c0 ContinueInNewThread (124dc, 0, 0, ffbff388, fffe7e0c, 0) + 30
 0001249c main     (18000, 2ab28, 10000, 2b0cc, 458, 10001) + eac
 000111c0 _start   (0, 0, 0, 0, 0, 0) + 108

Investigation shows that on some machines, for example starwars.sfbay.sun.com, the Main_baseline java/javac cannot be started at all. The JDK distribution is located on starwars in /var/tmp/Work/Work/JDK/NIGHTLY/Main_Baseline/solaris-sparc/bin; I have also copied it to /net/sqesvr-nfs.sfbay/global/nfs/vm1/users/nh161220/jdk-bad in case it gets overwritten.

The problem seems to be a corrupted classes.jsa file. Removing it solves the problem. The problematic file is attached.

The crash is reproducible only when java is started with default options or with -client -XX:+UseSerialGC. The crash also seems to be hardware dependent: for example, there is no crash on gtee.sfbay.sun.com.

This bug severely impacts testing.

Comments
EVALUATION The corrupted classes.jsa is only 8192 bytes, so it is most likely the result of an incomplete write during java -Xshare:dump. Strictly speaking, this evaluation is only an observation: I cannot reproduce the bug, and it occurs only intermittently on particular machines.

To be more defensive, if the code that generates classes.jsa cannot finish correctly, it should remove the file rather than leave it in place, because a partially written archive can crash Java applications during startup.

There is at least one significant problem in the dumping code: the VM exits right away if the write system call fails while writing data to classes.jsa, so the partial file stays on disk. Here is the problematic code, from src/share/vm/memory/filemap.cpp:

    void FileMapInfo::write_bytes(const void* buffer, int nbytes) {
      if (_file_open) {
        int n = ::write(_fd, buffer, nbytes);
        if (n != nbytes) {
          fail_stop("Unable to write to shared archive file.", NULL);
        }
      }
      _file_offset += nbytes;
    }

So one possible fix is to remove classes.jsa before calling fail_stop.
20-01-2006
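
The fix suggested in the evaluation above (remove the archive when a write fails) could look roughly like the following. This is a minimal standalone sketch using POSIX calls, not the actual HotSpot change: write_fully and write_or_discard are hypothetical helper names, and the real code would go through FileMapInfo and fail_stop instead.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not HotSpot code): write the whole buffer to fd,
// retrying after short writes and EINTR. Returns true only if every byte
// was written.
static bool write_fully(int fd, const void* buffer, size_t nbytes) {
  assert(buffer != nullptr || nbytes == 0);
  const char* p = static_cast<const char*>(buffer);
  size_t remaining = nbytes;
  while (remaining > 0) {
    ssize_t n = ::write(fd, p, remaining);
    if (n < 0) {
      if (errno == EINTR) continue;  // interrupted before writing: retry
      return false;                  // real I/O error
    }
    if (n == 0) return false;        // no progress, give up
    p += n;
    remaining -= static_cast<size_t>(n);
  }
  return true;
}

// Hypothetical helper: write to the archive and, if the write fails,
// close the descriptor and delete the partial file so that a later JVM
// start never maps a truncated archive. On failure fd is already closed.
static bool write_or_discard(int fd, const char* path,
                             const void* buffer, size_t nbytes) {
  if (write_fully(fd, buffer, nbytes)) {
    return true;
  }
  ::close(fd);
  ::unlink(path);  // remove the incomplete archive before reporting failure
  return false;
}
```

In this sketch the caller reports the error (the equivalent of fail_stop) only after write_or_discard returns false, so the partial file is gone before the process exits.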

WORK AROUND Re-create the shared archive (java -Xshare:dump).
17-01-2006

WORK AROUND Use -Xshare:off or remove the corrupted classes.jsa file.
16-01-2006