JDK-8028497 : SIGSEGV at ClassLoaderData::oops_do(OopClosure*, KlassClosure*, bool)
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: hs25
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2013-11-18
  • Updated: 2014-07-29
  • Resolved: 2014-04-08
JDK 8         JDK 9
8u20 Fixed    9 b10 Fixed
Description
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0xf6ea0881, pid=8148, tid=4104649584
#
# JRE version: Java(TM) SE Runtime Environment (8.0-b116) (build 1.8.0-ea-b116)
# Java VM: Java HotSpot(TM) Client VM (25.0-b58 compiled mode, sharing linux-x86 )
# Problematic frame:
# V  [libjvm.so+0x18f881]  ClassLoaderData::oops_do(OopClosure*, KlassClosure*, bool)+0x21
#
# Core dump written. Default location: /export/local/aurora/sandbox/results/ResultDir/except007/core or core.8148
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
#

---------------  T H R E A D  ---------------

Current thread (0xf6b0ec00):  GCTaskThread [stack: 0xf49ff000,0xf4a80000] [id=8153]

siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR), si_addr=0x00000014;; 

Registers:
EAX=0x00000000, EBX=0xf731b13c, ECX=0xf6ef3f40, EDX=0x00000000
ESP=0xf4a7ef20, EBP=0xf4a7ef38, ESI=0xf4a7f07c, EDI=0xf4a7f098
EIP=0xf6ea0881, EFLAGS=0x00010202, CR2=0x00000014

Top of Stack: (sp=0xf4a7ef20)
0xf4a7ef20:   f6db22be e92cff5c f4a7ef58 e92cff20
0xf4a7ef30:   f4a7f07c f4a7f07c f4a7ef58 f6ef3f5c
0xf4a7ef40:   00000000 f4a7f07c f4a7f098 00000001
0xf4a7ef50:   f4a7f07c f6fdb3b6 f4a7ef88 f6fb3cae
0xf4a7ef60:   f4a7f07c 00000000 80ea43c8 80ea43c8
0xf4a7ef70:   f4a7f07c e92cff20 80c06dd8 e92cff20
0xf4a7ef80:   e92d0000 f4a7f07c f4a7efb8 f71e692f
0xf4a7ef90:   80c06dd8 e92cff20 f4a7f07c e3bee930 

Instructions: (pc=0xf6ea0881)
0xf6ea0861:   89 e5 57 56 53 83 ec 0c e8 00 00 00 00 5b 81 c3
0xf6ea0871:   ce a8 47 00 8b 7d 10 80 7d 14 00 74 46 8b 55 08
0xf6ea0881:   8b 42 14 48 0f 84 8f 00 00 00 8b 83 80 f0 ff ff
0xf6ea0891:   83 c2 14 89 55 f0 c7 45 ec 01 00 00 00 83 38 01 
;; f6ea0871 ce                      into   
;; f6ea0872 a8 47                   test   $0x47,%al
;; f6ea0874 00 8b 7d 10 80 7d       add    %cl,0x7d80107d(%ebx)
;; f6ea087a 14 00                   adc    $0x0,%al
;; f6ea087c 74 46                   je     0xf6ea08c4
;; f6ea087e 8b 55 08                mov    0x8(%ebp),%edx
;; ---------------
;; f6ea0881 8b 42 14                mov    0x14(%edx),%eax
;; f6ea0884 48                      dec    %eax
;; f6ea0885 0f 84 8f 00 00 00       je     0xf6ea091a
;; f6ea088b 8b 83 80 f0 ff ff       mov    0xfffff080(%ebx),%eax
;; f6ea0891 83 c2 14                add    $0x14,%edx
;; f6ea0894 89 55 f0                mov    %edx,0xfffffff0(%ebp)
;; f6ea0897 c7 45 ec 01 00 00 00    movl   $0x1,0xffffffec(%ebp)
;; f6ea089e 83 38 01                cmpl   $0x1,(%eax)
;; 
Register to memory mapping:

EAX=0x00000000 is an unknown value
EBX=0xf731b13c: <offset 0x60a13c> in /export/local/aurora/sandbox/java/re/jdk/8/promoted/all/b116/binaries/linux-i586/jre/lib/i386/client/libjvm.so at 0xf6d11000
ECX=0xf6ef3f40: <offset 0x1e2f40> in /export/local/aurora/sandbox/java/re/jdk/8/promoted/all/b116/binaries/linux-i586/jre/lib/i386/client/libjvm.so at 0xf6d11000
EDX=0x00000000 is an unknown value
ESP=0xf4a7ef20 is an unknown value
EBP=0xf4a7ef38 is an unknown value
ESI=0xf4a7f07c is an unknown value
EDI=0xf4a7f098 is an unknown value


Stack: [0xf49ff000,0xf4a80000],  sp=0xf4a7ef20,  free space=511k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x18f881]  ClassLoaderData::oops_do(OopClosure*, KlassClosure*, bool)+0x21
V  [libjvm.so+0x1e2f5c]  CMSOopsInGenClosure::do_class_loader_data(ClassLoaderData*)+0x1c
V  [libjvm.so+0x2a2cae]  InstanceMirrorKlass::oop_oop_iterate_v(oopDesc*, ExtendedOopClosure*)+0x9e
V  [libjvm.so+0x4d592f]  ContiguousSpace::par_oop_iterate(MemRegion, ExtendedOopClosure*)+0x2f
V  [libjvm.so+0x1d5103]  CMSParMarkTask::do_young_space_rescan(unsigned int, OopsInGenClosure*, ContiguousSpace*, HeapWord**, unsigned int)+0xf3
V  [libjvm.so+0x1d53c8]  CMSParInitialMarkTask::work(unsigned int)+0x178
V  [libjvm.so+0x58bbe9]  GangWorker::loop()+0xe9
V  [libjvm.so+0x58b7d8]  GangWorker::run()+0x18
V  [libjvm.so+0x461b99]  java_start(Thread*)+0x119
C  [libpthread.so.0+0x69e9]  abort@@GLIBC_2.0+0x69e
Comments
I have a fix that restores the unshareable information as follows:
  if (Klass->class_loader_data() != NULL) {
    CDS Klass->set_class_loader_data(CLD)
    CLD->add_class(Klass)
  }
  if (Klass->java_mirror() != NULL) {
    create mirror: allocate_mirror(CHECK)
    mirror->set_klass(Klass)
    allocate fields and static fields(CHECK)
    Klass->set_mirror(mirror)
  }
  Klass->constants() = allocate_resolved_references array(CHECK)
  Klass->array_klasses() = allocate mirrors for array classes(CHECK)
  { these are conditional on whether java-mirror is null or not }
If we get OOMs at the CHECKs above, we won't remove the unshareable information that was created before the OOM. The oop pointers inside the mirror will be either NULL or set, but the mirror will point back to the class. If any of the field allocations OOM, the Klass won't point to the mirror and so won't keep the mirror alive. So in the case where we retry loading the class that got the OOM (which I haven't seen any test other than Stefan's try to do), the restore_unshareable_info code will restart where it left off. This seems to work for Stefan's test anyway.
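The restart-where-it-left-off behaviour described in this comment can be illustrated with a small stand-alone model. This is only a sketch, not the HotSpot change: the `Klass`, `Mirror` and `Allocator` types and the allocation budget are hypothetical stand-ins, and it assumes that each step runs only when its result is still missing, which is one reading of the comment.

```cpp
#include <memory>
#include <utility>

struct Klass;

// Hypothetical stand-in for a java mirror; it points back to its class.
struct Mirror {
    Klass* klass = nullptr;
};

// Hypothetical stand-in for the unshareable state of a shared class.
struct Klass {
    bool has_class_loader_data = false;
    std::unique_ptr<Mirror> java_mirror;
    bool has_resolved_references = false;
};

// Hypothetical allocator: each try_allocate() models one CHECK point
// and fails (simulated OOM) once the budget is used up.
struct Allocator {
    int budget;
    bool try_allocate() {
        if (budget <= 0) return false;
        --budget;
        return true;
    }
};

// Returns true when everything is restored, false on a simulated OOM.
// State created before the OOM is kept, so a retry resumes from there.
bool restore_unshareable_info(Klass& k, Allocator& heap) {
    if (!k.has_class_loader_data) {
        k.has_class_loader_data = true;            // no allocation needed
    }
    if (!k.java_mirror) {
        if (!heap.try_allocate()) return false;    // allocate_mirror(CHECK)
        auto mirror = std::make_unique<Mirror>();
        mirror->klass = &k;                        // mirror points back to the class
        if (!heap.try_allocate()) return false;    // fields(CHECK): mirror is dropped
        k.java_mirror = std::move(mirror);         // published only when complete
    }
    if (!k.has_resolved_references) {
        if (!heap.try_allocate()) return false;    // resolved references(CHECK)
        k.has_resolved_references = true;
    }
    return true;
}
```

In this model a mirror whose fields could not be allocated is never published to the `Klass`, so a later retry allocates a fresh one, while steps that already completed are skipped.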
18-03-2014

A third way to solve this bug was suggested by Stefan Karlsson: Do not set the _class_loader_data to NULL in the Klass, just let it be. This will work with the classloading and also work for the GC, since all classes from the CDS archive will have _the_null_class_loader as their class loader.
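As a rough illustration of this third approach, the sketch below clears other unshareable state but leaves the loader-data pointer in place, so a later GC read through it stays valid. The types, the `null_loader_data()` accessor and the `klass_count` field are hypothetical simplifications, not the HotSpot definitions.

```cpp
// Hypothetical stand-in for ClassLoaderData.
struct ClassLoaderData {
    int klass_count = 0;
};

// Hypothetical stand-in for a Klass loaded from the CDS archive.
struct Klass {
    ClassLoaderData* class_loader_data = nullptr;
    bool has_mirror = false;
};

// Assumed single loader-data instance shared by all archived classes.
ClassLoaderData& null_loader_data() {
    static ClassLoaderData cld;
    return cld;
}

// Clean-up variant that clears other unshareable state but
// deliberately leaves class_loader_data untouched.
void remove_unshareable_info_keeping_cld(Klass& k) {
    k.has_mirror = false;
}

// Models a GC read through the Klass; safe as long as the pointer is non-null.
int gc_read_loader_data(const Klass& k) {
    return k.class_loader_data->klass_count;
}
```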
07-01-2014

RULE nsk/stress/except/except001 Crash SIGSEGV
07-01-2014

RULE nsk/stress/except/except012 Crash EXCEPTION_ACCESS_VIOLATION
27-12-2013

Release team: Approved for deferral.
12-12-2013

8-defer-request justification: This bug should be deferred because the crash is very rare (see the ILW) and the conditions that trigger it are not officially supported: Class Data Sharing combined with ConcurrentMarkSweepGC. Furthermore, class data sharing is only officially supported with -client, and not many people run -client with -XX:+UseConcMarkSweepGC. The cost of the workaround is also quite low: turning off CDS will make the problem go away. What is the cost of turning off CDS? A slightly longer startup time (on the order of 0.1s) and a little more memory usage (since the CDS archive is shared between multiple clients). We have also discussed the proposed fixes above, including a third possible fix of *not* setting _class_loader_data to NULL, but they are all deemed too risky at this point in time. Given all this, I think this bug should be deferred. Defer to: 8-pool
12-12-2013

After discussions with Erik and Coleen we have decided to do a new ILW for this bug. ILW => HLM => P3
Impact: High, the VM is crashing.
Likelihood: Low, we need two rare conditions to be met for the bug to occur: 1) We need to get an exception while loading a shared class. 2) The next GC needs to be a CMS background collection.
Workaround: Medium, turn off CDS or use another GC.
11-12-2013

The first solution sounds reasonable to me.
10-12-2013

One possible solution is to follow the _java_mirror field in the instanceKlass in remove_unshareable_info and write NULL at _klass_offset. All the code in the GC is prepared to handle a NULL Klass pointer for java mirrors, since java mirrors to native types always have a NULL Klass pointer injected. Another solution is to overwrite the java mirror with some kind of filler array, which the VM then will treat as an ordinary oop.
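The first option can be sketched as follows. This is a simplified model with hypothetical types, not the actual HotSpot code: the clean-up follows the mirror pointer and clears the mirror's back pointer to the Klass before the Klass itself is reset, so a later scan sees a mirror with a NULL Klass and skips it.

```cpp
// Hypothetical, simplified stand-ins for the HotSpot types.
struct ClassLoaderData {};
struct Klass;

// A java mirror with its injected Klass pointer.
struct Mirror {
    Klass* klass = nullptr;
};

struct Klass {
    ClassLoaderData* class_loader_data = nullptr;
    Mirror* java_mirror = nullptr;
};

// Variant of the clean-up that first severs the mirror's back pointer.
void remove_unshareable_info_clearing_mirror(Klass& k) {
    if (k.java_mirror != nullptr) {
        k.java_mirror->klass = nullptr;   // write NULL at the Klass offset
    }
    k.class_loader_data = nullptr;
    k.java_mirror = nullptr;
}

// Models the GC's view: a mirror is safe to scan if it has no Klass
// or if its Klass still has loader data to follow.
bool gc_can_scan(const Mirror& m) {
    return m.klass == nullptr || m.klass->class_loader_data != nullptr;
}
```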
10-12-2013

Ok, so here is what happens: We are trying to load a class, sun/misc/FloatingDecimal, from the class data sharing archive, classes.jsa (see "Events" in the hs_err file). During the call to load_shared_class, we get an OutOfMemory exception (see "Exceptions" in the hs_err file). The code handling the exception in SystemDictionary::resolve_instance_class_or_null will, for a partially loaded shared class, call clean_up_shared_class. clean_up_shared_class will call remove_unshareable_info on the instance klass (the Klass for sun/misc/FloatingDecimal). This will set _class_loader_data and _java_mirror to NULL in the Klass (amongst other things).
Next up, CMS decides to do a background collection. During the initial marking, CMS encounters the java mirror for sun/misc/FloatingDecimal. How can CMS possibly find the java mirror when there can't be any references to it? Because CMS treats all objects, living or dead, in young gen as roots.
When CMS finds the java mirror, it reads the Klass pointer that is injected as a field in the mirror and that points to the Klass of the Java class being mirrored. The problem is that this is sun/misc/FloatingDecimal, whose _class_loader_data we NULLed out above in the call to clean_up_shared_class. CMS will therefore read a NULL value for the _class_loader_data and crash with a SIGSEGV.
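The sequence above can be reproduced in a small stand-alone model. The types and function bodies here are hypothetical simplifications of the HotSpot ones, and the scan only counts the mirrors on which the real VM would fault instead of actually dereferencing NULL.

```cpp
#include <vector>

// Hypothetical, heavily simplified stand-ins for the HotSpot types.
struct ClassLoaderData {};
struct Klass;

// A java mirror with its injected Klass pointer.
struct Mirror {
    Klass* klass = nullptr;
};

struct Klass {
    ClassLoaderData* class_loader_data = nullptr;
    Mirror* java_mirror = nullptr;
};

// Models what the analysis says remove_unshareable_info does to the Klass.
// The mirror's back pointer to the Klass is left untouched.
void remove_unshareable_info(Klass& k) {
    k.class_loader_data = nullptr;
    k.java_mirror = nullptr;
}

// Models the CMS young-gen rescan: every object in the space is visited,
// whether or not anything still references it. Returns the number of
// mirrors whose Klass has a NULL class_loader_data.
int count_faulting_mirrors(const std::vector<Mirror*>& young_gen) {
    int faults = 0;
    for (const Mirror* m : young_gen) {
        if (m->klass == nullptr) continue;                     // nothing to follow
        if (m->klass->class_loader_data == nullptr) ++faults;  // real VM: SIGSEGV
    }
    return faults;
}
```

The unreachable mirror is still in the young generation after the clean-up, which is why the scan reaches a Klass with no loader data.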
22-11-2013

This bug was hidden by JDK-8020517.
18-11-2013

ILW=HLH=>P2
18-11-2013

test: nsk/stress/except/except007
vm_options: -client -Xcomp -XX:+UnlockCommercialFeatures -XX:+FlightRecorder -XX:FlightRecorderOptions=defaultrecording=true,disk=true,dumponexit=true -XX:+UseConcMarkSweepGC
version: 1.8.0-ea-b116 / 25.0-b58
18-11-2013