JDK-8010463 : G1: Crashes with -UseTLAB and heap verification
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: hs25
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2013-03-21
  • Updated: 2013-09-18
  • Resolved: 2013-03-30
JDK 7         JDK 8      Other
7u40 Fixed    8 Fixed    hs24 Fixed
Description
G1 always crashes when run with -XX:-UseTLAB and with heap verification (-XX:+VerifyBeforeGC) enabled. The crash reproduces with the eclipse benchmark from dacapo-bach (www.dacapobench.org).

Command line

-XX:+UseG1GC -XX:-UseTLAB -XX:+UnlockDiagnosticVMOptions -XX:+VerifyBeforeGC -jar dacapo.jar eclipse -n 1

Crash (with product vm):
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f78cabf52df, pid=27087, tid=140156792600320
#
# JRE version:  (8.0) (build )
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b23-internal mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x8c32df]  Threads::possibly_parallel_oops_do(OopClosure*, CLDToOopClosure*, CodeBlobClosure*)+0x7f
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# .../hs_err_pid27087.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
#

Crash (run with debug vm):
# after -XX: or in .hotspotrc:  SuppressErrorAt=/g1CollectedHeap.cpp:3278
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (.../hs8005857/src-8005857/src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp:3278), pid=22290, tid=139667374491392
#  assert(Thread::current()->is_VM_thread()) failed: Expected to be executed serially by the VM thread at this point
#
# JRE version:  (8.0) (build )
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b23-internal-jvmg mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# [...]/hs_err_pid22290.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
#

hs_err logs attached

(Originally reported by C. Kotselidis)

Expected behavior:

The VM does not crash when running with the given options.
Comments
A possible solution is either to skip verifying the roots, heap regions, and RemSet until the VM is at a safepoint or the JVM is completely initialized, or to allow a NULL VM thread in Threads::oops_do and Threads::possibly_parallel_oops_do.
21-03-2013

Relaxing the assert in G1's verify code is not sufficient and results in a SEGV:

----> universe2_init
[Verifying threads Roots
==============================================================================
Unexpected Error
------------------------------------------------------------------------------
SIGSEGV (0xb) at pc=0xfe48925c, pid=16857, tid=2

Do you want to debug the problem?

To debug, run 'dbx - 16857'; then switch to thread 2
Enter 'yes' to launch dbx automatically (PATH must include dbx)
Otherwise, press RETURN to abort...
==============================================================================

The segv is coming from Threads::possibly_parallel_oops_do:

void Threads::possibly_parallel_oops_do(OopClosure* f, CLDToOopClosure* cld_f, CodeBlobClosure* cf) {
  // Introduce a mechanism allowing parallel threads to claim threads as
  // root groups. Overhead should be small enough to use all the time,
  // even in sequential code.
  SharedHeap* sh = SharedHeap::heap();
  // Cannot yet substitute active_workers for n_par_threads
  // because of G1CollectedHeap::verify() use of
  // SharedHeap::process_strong_roots(). n_par_threads == 0 will
  // turn off parallelism in process_strong_roots while active_workers
  // is being used for parallelism elsewhere.
  bool is_par = sh->n_par_threads() > 0;
  assert(!is_par ||
         (SharedHeap::heap()->n_par_threads() ==
          SharedHeap::heap()->workers()->active_workers()), "Mismatch");
  int cp = SharedHeap::heap()->strong_roots_parity();
  ALL_JAVA_THREADS(p) {
    if (p->claim_oops_do(is_par, cp)) {
      p->oops_do(f, cld_f, cf);
    }
  }
  VMThread* vmt = VMThread::vm_thread();
  if (vmt->claim_oops_do(is_par, cp)) {
    vmt->oops_do(f, cld_f, cf);
  }
}

The VMThread has not been created:

(dbx) p vmt
vmt = (nil)

This segv happens with ParallelGCThreads=0 too:

cairnapple{jcuthber}:286> sh t.sh
VM option '+UseG1GC'
VM option 'ParallelGCThreads=0'
VM option '-UseTLAB'
VM option '+UnlockDiagnosticVMOptions'
VM option '+VerifyBeforeGC'
VM option '+PrintGCTimeStamps'
VM option '-PrintGCCause'
VM option '+PrintGCDetails'
VM option '+ShowMessageBoxOnError'
VM option '+PrintVMOptions'
----> universe2_init
[Verifying threads Roots
==============================================================================
Unexpected Error
------------------------------------------------------------------------------
SIGSEGV (0xb) at pc=0xfe5558b7, pid=17112, tid=2

Do you want to debug the problem?

To debug, run 'dbx - 17112'; then switch to thread 2
Enter 'yes' to launch dbx automatically (PATH must include dbx)
Otherwise, press RETURN to abort...
==============================================================================

Backtrace:

[10] sigacthandler(0xb, 0xfd55ea70, 0xfd55e870, 0xf, 0x0, 0x4), at 0xfeebd5b1
---- called from signal handler with signal 11 (SIGSEGV) ------
[11] Threads::oops_do(f = 0xfd55ec34, cld_f = 0xfd55eb68, cf = 0xfd55ec24), line 4171 in "thread.cpp"
[12] SharedHeap::process_strong_roots(this = 0x8095588, activate_scope = true, is_scavenging = false, so = 13, roots = 0xfd55ec34, code_roots = 0xfd55ec24, klass_closure = 0xfd55ec08), line 162 in "sharedHeap.cpp"
[13] G1CollectedHeap::verify(this = 0x8095588, silent = false, vo = VerifyOption_Default), line 3292 in "g1CollectedHeap.cpp"
[14] Universe::verify(silent = false, option = VerifyOption_Default), line 1304 in "universe.cpp"
[15] Universe::verify(silent = false), line 450 in "universe.hpp"
[16] Universe::verify(), line 453 in "universe.hpp"
[17] universe2_init(), line 964 in "universe.cpp"
[18] init_globals(), line 115 in "init.cpp"

4163 // is held by some other thread. (Note: the Safepoint abstraction also
4164 // uses the Threads_lock to gurantee this property. It also makes sure that
4165 // all threads gets blocked when exiting or starting).
4166
4167 void Threads::oops_do(OopClosure* f, CLDToOopClosure* cld_f, CodeBlobClosure* cf) {
4168   ALL_JAVA_THREADS(p) {
4169     p->oops_do(f, cld_f, cf);
4170   }
4171   VMThread::vm_thread()->oops_do(f, cld_f, cf);
4172 }

How do the other collectors handle this?
21-03-2013
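
One of the options raised in the comments is to allow a NULL VM thread in Threads::oops_do. The sketch below is a minimal standalone model of that idea, not HotSpot code and not the committed fix: OopClosure, Thread, and Threads are simplified stand-ins for the real classes, and the vm_thread field and the count_walked helper are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for a root closure: it only counts visited threads.
struct OopClosure {
  int visited = 0;
};

// Simplified stand-in for a thread whose roots can be walked.
struct Thread {
  void oops_do(OopClosure* f) { ++f->visited; }
};

// Simplified stand-in for the thread list.
struct Threads {
  std::vector<Thread*> java_threads;  // models ALL_JAVA_THREADS
  Thread* vm_thread = nullptr;        // models VMThread::vm_thread(); null early in startup

  // Walk every Java thread, then the VM thread only if it already exists.
  void oops_do(OopClosure* f) {
    assert(f != nullptr);
    for (Thread* p : java_threads) {
      p->oops_do(f);
    }
    if (vm_thread != nullptr) {       // hypothetical guard against a missing VM thread
      vm_thread->oops_do(f);
    }
  }
};

// Hypothetical helper: returns how many threads a root walk visited.
int count_walked(Threads& threads) {
  OopClosure f;
  threads.oops_do(&f);
  return f.visited;
}
```

In this model, a walk performed before the VM thread exists visits only the registered Java threads instead of dereferencing a null pointer. Whether skipping the VM thread's roots is acceptable during startup verification is a separate question that the model does not answer.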

ParallelGC skips its young and tenured generations until the first GC, and it does not walk the roots either:

----> universe2_init
[Verifying threads syms strs zone dict cldg metaspace chunks hand C-heap code cache ]
<---- universe2_init
---->init::verify
[Verifying threads syms strs zone dict cldg metaspace chunks hand C-heap code cache ]
<----init::verify
--->create_vm:verify
[Verifying threads syms strs zone dict cldg metaspace chunks hand C-heap code cache ]
<---create_vm:verify
VerifyBeforeGC:[Verifying threads tenured eden syms strs zone dict cldg metaspace chunks hand C-heap code cache ]
1.696: [GC [PSYoungGen: 16895K->324K(19712K)] 16895K->328K(62720K), 0.0207072 secs] [Times: user=0.03 sys=0.00, real=0.02 secs]
VerifyBeforeGC:[Verifying threads tenured eden syms strs zone dict cldg metaspace chunks hand C-heap code cache ]
21-03-2013

CMS doesn't skip its generations, but it does not walk the roots either:

----> universe2_init
[Verifying threads concurrent mark-sweep generation par new generation remset syms strs zone dict cldg metaspace chunks hand C-heap code cache ]
<---- universe2_init
---->init::verify
[Verifying threads concurrent mark-sweep generation par new generation remset syms strs zone dict cldg metaspace chunks hand C-heap code cache ]
<----init::verify
--->create_vm:verify
[Verifying threads concurrent mark-sweep generation par new generation remset syms strs zone dict cldg metaspace chunks hand C-heap code cache ]
<---create_vm:verify
1.588: [GC 1.588: [ParNew VerifyBeforeGC:[Verifying threads concurrent mark-sweep generation par new generation remset syms strs zone dict cldg metaspace chunks hand C-heap code cache ] : 17471K->291K(19648K), 0.2126589 secs] 17471K->291K(63360K), 0.2129296 secs] [Times: user=0.22 sys=0.00, real=0.21 secs]
21-03-2013

There are actually several verification passes during JVM startup:

cairnapple{jcuthber}:279> sh t.sh
----> universe2_init
<---- universe2_init
---->init::verify
<----init::verify
--->create_vm:verify
[Verifying threads (SKIPPING roots, heapRegions, remset) syms strs zone dict cldg metaspace chunks hand C-heap code cache ]
<---create_vm:verify

When UseTLAB is enabled, the verifications in universe2_init() and init_globals() are skipped. The verification in Threads::create_vm() is not skipped, but its G1 heap portion is (since the JVM is not at a safepoint and UseTLAB is true).
21-03-2013

The assertion failure occurs because the verification call comes during VM initialization and is invoked by the main thread (rather than the VM thread). The verification call comes from:

void universe2_init() {
  EXCEPTION_MARK;
  Universe::genesis(CATCH);
  // Although we'd like to verify here that the state of the heap
  // is good, we can't because the main thread has not yet added
  // itself to the threads list (so, using current interfaces
  // we can't "fill" its TLAB), unless TLABs are disabled.
  gclog_or_tty->print_cr("----> universe2_init");
  if (VerifyBeforeGC && !UseTLAB &&
      Universe::heap()->total_collections() >= VerifyGCStartAt) {
    Universe::heap()->prepare_for_verify();
    Universe::verify(); // make sure we're starting with a clean slate
  }
  gclog_or_tty->print_cr("<---- universe2_init");
}

so the verification is only attempted if UseTLAB is disabled. In G1's verification code we have the following check:

void G1CollectedHeap::verify(bool silent, VerifyOption vo) {
  if (SafepointSynchronize::is_at_safepoint() || ! UseTLAB) {
    ...
  } else {
    if (!silent) gclog_or_tty->print("(SKIPPING roots, heapRegions, remset) ");
  }
}

which means that when the VM is not at a safepoint, checking the roots, heap regions, and RemSet is skipped only if UseTLAB is true. In this case the JVM is still starting up and UseTLAB is disabled, so we enter the code that verifies the roots, heap regions, and RemSet and trigger the assert.
21-03-2013
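
The condition quoted from G1CollectedHeap::verify can be modelled as a small predicate to show exactly which combination of state reaches the failing code. This is a standalone sketch, not HotSpot code: VmState and both function names are hypothetical, and the second predicate is only one possible reading of the suggestion to skip verification until a safepoint or until the JVM is completely initialized. It is an assumption, not the committed fix.

```cpp
#include <cassert>

// Hypothetical snapshot of the state the check consults.
struct VmState {
  bool at_safepoint;    // models SafepointSynchronize::is_at_safepoint()
  bool use_tlab;        // models the UseTLAB flag
  bool init_completed;  // assumed flag: true once the JVM has finished starting up
};

// The condition as quoted: verify roots, heap regions, and RemSet
// whenever the VM is at a safepoint or TLABs are disabled.
bool verifies_heap_as_quoted(const VmState& s) {
  assert(!s.at_safepoint || s.init_completed);  // model assumption: no safepoints before init completes
  return s.at_safepoint || !s.use_tlab;
}

// Assumed alternative: outside a safepoint, additionally require that
// initialization has completed before walking the heap without TLABs.
bool verifies_heap_guarded(const VmState& s) {
  assert(!s.at_safepoint || s.init_completed);  // same model assumption as above
  return s.at_safepoint || (!s.use_tlab && s.init_completed);
}
```

With at_safepoint = false, use_tlab = false, and init_completed = false (the startup case from this report), the quoted condition returns true and the root walk is attempted by the main thread, while the guarded variant returns false and the heap portion is skipped.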

Occurs with GC basher as well.
21-03-2013