JDK-8049304 : race between VM_Exit and _sync_FutileWakeups->inc()
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 8u40,9
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2014-07-03
  • Updated: 2025-08-06
  • Resolved: 2015-09-03
JDK 9
9 b83 (Fixed)
Description
While chasing the following bug, I ran into a SIGSEGV crash at the end of a test run:

    JDK-8047212 runtime/ParallelClassLoading/bootstrap/random/inner-complex
        assert(ObjectSynchronizer::verify_objmon_isinpool(inf)) failed: monitor is invalid

Here's my gory analysis of the SIGSEGV:

(gdb) info threads
  17 Thread 0x7f315d865700 (LWP 9161)  0x0000003189a0e1e5 in __lll_unlock_wake () from /lib64/libpthread.so.0
  16 Thread 0x7f31a05db700 (LWP 8885)  0x0000003188e17d75 in memset () from /lib64/ld-linux-x86-64.so.2
  15 Thread 0x7f31b6c6d700 (LWP 8820)  0x0000003189a0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  14 Thread 0x7f31a03d9700 (LWP 8988)  0x0000003189a0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  13 Thread 0x7f317c28b700 (LWP 9009)  0x0000003189a0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  12 Thread 0x7f317c38c700 (LWP 9007)  0x0000003189a0b7bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  11 Thread 0x7f31b6034700 (LWP 8826)  0x0000003189a0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  10 Thread 0x7f31b4115700 (LWP 8888)  0x0000003189a0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  9 Thread 0x7f31a04da700 (LWP 8891)  0x0000003189a0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  8 Thread 0x7f31b6236700 (LWP 8823)  0x0000003189a0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  7 Thread 0x7f31b498b700 (LWP 8847)  0x0000003189a0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  6 Thread 0x7f31a01d7700 (LWP 9002)  0x0000003189a0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  5 Thread 0x7f31b6135700 (LWP 8825)  0x0000003189a0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  4 Thread 0x7f31b6337700 (LWP 8821)  0x0000003189a0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  3 Thread 0x7f317c48d700 (LWP 9003)  0x0000003189a0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  2 Thread 0x7f31b832d700 (LWP 8815)  0x0000003189a080ad in pthread_join () from /lib64/libpthread.so.0
* 1 Thread 0x7f31715f5700 (LWP 9084)  VMError::report_and_die (this=0x7f31714f8160) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/utilities/vmError.cpp:899


Thread 1 (Thread 0x7f31715f5700 (LWP 9084)):

<infinite recursion on report_and_die() calls>

#2729 0x00007f31b7bf21bc in VMError::report_and_die (this=0x7f31715f2360) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/utilities/vmError.cpp:960
#2730 0x00007f31b7bf24fe in crash_handler (sig=11, info=0x7f31715f2530, ucVoid=0x7f31715f2400) at /work/shared/bugs/8047212/src/8047212_exp/src/os/linux/vm/vmError_linux.cpp:106
#2731 <signal handler called>
#2732 0x0000000000000000 in ?? ()
#2733 0x00007f31b799abbe in outputStream::print_cr (this=0x7f31715f3900, format=<value optimized out>) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/utilities/ostream.cpp:139
#2734 0x00007f31b7bf027f in VMError::report (this=0x7f31715f3a50, st=0x7f31715f3900) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/utilities/vmError.cpp:345
#2735 0x00007f31b7bf1d94 in VMError::report_and_die (this=0x7f31715f3a50) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/utilities/vmError.cpp:972
#2736 0x00007f31b79946a2 in JVM_handle_linux_signal (sig=11, info=0x7f31715f3cb0, ucVoid=0x7f31715f3b80, abort_if_unrecognized=1902066128) at /work/shared/bugs/8047212/src/8047212_exp/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:542
#2737 0x00007f31b798694d in signalHandler (sig=11, info=0x7f31715f3cb0, uc=0x7f31715f3b80) at /work/shared/bugs/8047212/src/8047212_exp/src/os/linux/vm/os_linux.cpp:4126
#2738 <signal handler called>
#2739 0x00007f31b796dcce in inc (this=0x7f3134003e18, __the_thread__=0x7f31b02ed000) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/perfData.hpp:425
#2740 ObjectMonitor::EnterI (this=0x7f3134003e18, __the_thread__=0x7f31b02ed000) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/objectMonitor.cpp:609
#2741 0x00007f31b796ec91 in ObjectMonitor::enter (this=0x7f3134003e18, __the_thread__=0x7f31b02ed000) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/objectMonitor.cpp:371
#2742 0x00007f31b7b2bd7f in ObjectSynchronizer::slow_enter (obj=<value optimized out>, lock=0x7f31715f43a8, __the_thread__=0x7f31b02ed000) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/synchronizer.cpp:233
#2743 0x00007f31b7b2c07b in ObjectSynchronizer::fast_enter (obj=..., lock=0x7f31715f43a8, attempt_rebias=<value optimized out>, __the_thread__=0x7f31b02ed000) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/synchronizer.cpp:155
#2744 0x00007f31b75cdf7b in InterpreterRuntime::monitorenter (thread=0x7f31b02ed000, elem=0x7f31715f43a8) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/interpreter/interpreterRuntime.cpp:602
#2745 0x00007f31a1042b1c in ?? ()
#2746 0x00007f31a10429cd in ?? ()
#2747 0x0000000000000003 in ?? ()
#2748 0x0000000097f049d0 in ?? ()
#2749 0x00007f31715f43a8 in ?? ()
#2750 0x00007f31a08322f4 in ?? ()
#2751 0x00007f31715f4420 in ?? ()
#2752 0x00007f31a08a34e8 in ?? ()
#2753 0x0000000000000000 in ?? ()

This frame inflated an ObjectMonitor, associated it with 'obj'
and called ObjectMonitor::enter():

(gdb) frame
#2742 0x00007f31b7b2bd7f in ObjectSynchronizer::slow_enter (
    obj=<value optimized out>, lock=0x7f31715f43a8, 
    __the_thread__=0x7f31b02ed000)
    at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/synchronizer.cpp:233
233	  ObjectSynchronizer::inflate(THREAD, obj())->enter(THREAD);

Unfortunately, the return value from ObjectSynchronizer::inflate()
isn't stored in a local.
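
As a debugging aside, keeping the inflated monitor in a local would have let gdb print it directly; a hypothetical rewrite of synchronizer.cpp:233 (illustration only, not the actual change):

  // store the ObjectMonitor* so it survives in the frame for debugging
  ObjectMonitor* monitor = ObjectSynchronizer::inflate(THREAD, obj());
  monitor->enter(THREAD);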

Interesting locals from ObjectSynchronizer::slow_enter:

(gdb) print obj
$15 = <value optimized out>

(gdb) print mark
$13 = (markOopDesc *) 0x7f3134003e1a
(gdb) print *mark
$14 = {<oopDesc> = {_mark = 0x49d0000000000000, _metadata = {_klass = 
    0xf1f10000000097f0, _compressed_klass = 38896}, static _bs = 
    0x7f31b0038200}, static cms_free_chunk_pattern = 1, 
  static size_mask = 34359738367, static size_mask_in_place = 8796093021952}


Let's go down into the ObjectMonitor::enter() call:

(gdb) frame
#2741 0x00007f31b796ec91 in ObjectMonitor::enter (this=0x7f3134003e18, 
    __the_thread__=0x7f31b02ed000)
    at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/objectMonitor.cpp:371
371	      EnterI(THREAD);

(gdb) print this
$16 = (ObjectMonitor * const) 0x7f3134003e18

Check out the 'mark' value and the ObjectMonitor *:

mark == 0x7f3134003e1a
this == 0x7f3134003e18

so the 'mark' value is 2 bytes into the ObjectMonitor.
The 'mark' was fetched out of 'obj' in ObjectSynchronizer::slow_enter:

200  markOop mark = obj->mark();
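
That is an inflated mark word: the low two bits are the monitor tag and the rest is the ObjectMonitor address, which is why mark (0x...3e1a) equals this (0x...3e18) plus 2. A small decoding sketch, based on the usual markOop.hpp encoding (constants shown for illustration, not copied from this source tree):

  // an inflated mark is the ObjectMonitor* with low bits 0b10
  ObjectMonitor* monitor_from_mark(uintptr_t mark) {
    const uintptr_t monitor_value = 2;  // cf. markOopDesc::monitor_value
    assert((mark & 3) == monitor_value, "must be an inflated mark word");
    return (ObjectMonitor*)(mark - monitor_value);
  }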

Here are some field offsets in ObjectMonitor:

(gdb) print &_header
$21 = (volatile markOop *) 0x7f3134003e18

(gdb) print &_object
$20 = (void * volatile *) 0x7f3134003e20

so the 'mark' points 2 bytes into the _header field.

Here's the rest of the ObjectMonitor:

(gdb) print *this
$22 = {static SpinCallbackFunction = 0, static SpinCallbackArgument = 0, 
  _header = 0x1, _object = 0x97f049d0, SharingPad = {
    -7.4786357953083842e+240}, _owner = 0x7f31b0331800, 
  _previous_owner_tid = 0, _recursions = 0, OwnerIsThread = 1, 
  _cxq = 0x7f31715f4000, _EntryList = 0x0, _succ = 0x0, 
  _Responsible = 0x7f31b02ed000, _PromptDrain = -235802127, 
  _Spinner = -235802127, _SpinFreq = 0, _SpinClock = 0, _SpinDuration = 4800, 
  _SpinState = -1012762419733073423, _count = 1, _waiters = 0, _WaitSet = 0x0, 
  _WaitSetLock = 0, _QMix = -235802127, FreeNext = 0x0, 
  StatA = -1012762419733073423, StatsB = -1012762419733073423, 
  static _sync_ContendedLockAttempts = 0x7f31b000dc10, 
  static _sync_FutileWakeups = 0x7f31b000dcf0, 
  static _sync_Parks = 0x7f31b000ddd0, 
  static _sync_EmptyNotifications = 0x7f31b000dea0, 
  static _sync_Notifications = 0x7f31b000df80, 
  static _sync_SlowEnter = 0x7f31b000e060, 
  static _sync_SlowExit = 0x7f31b000e130, 
  static _sync_SlowNotify = 0x7f31b000e200, 
  static _sync_SlowNotifyAll = 0x7f31b000e2d0, static _sync_FailedSpins = 0x0, 
  static _sync_SuccessfulSpins = 0x7f31b000e490, 
  static _sync_PrivateA = 0x7f31b000e570, 
  static _sync_PrivateB = 0x7f31b000e640, 
  static _sync_MonInCirculation = 0x7f31b000e710, 
  static _sync_MonScavenged = 0x7f31b000e7f0, 
  static _sync_Inflations = 0x7f31b000d650, 
  static _sync_Deflations = 0x7f31b000db40, 
  static _sync_MonExtant = 0x7f31b000e8d0, static Knob_Verbose = 0, 
  static Knob_SpinLimit = 5000}


Here's the frame that starts crashing:

(gdb) frame
#2740 ObjectMonitor::EnterI (this=0x7f3134003e18, 
    __the_thread__=0x7f31b02ed000)
    at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/objectMonitor.cpp:609
609	           ObjectMonitor::_sync_FutileWakeups->inc();
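
The crash site is the futile-wakeup bookkeeping in EnterI's park loop. Condensed and paraphrased (not the verbatim source), the pattern is: park, wake up, fail to retake the lock, and bump the counter, with nothing guarding against the PerfData objects having been freed:

  // paraphrased shape of ObjectMonitor::EnterI around line 609
  for (;;) {
    if (TryLock(Self) > 0) break;
    // ...
    Self->_ParkEvent->park();   // thread is _thread_blocked here, so a
                                // safepoint (and thus VM_Exit) can proceed
    if (TryLock(Self) > 0) break;
    // the wakeup was futile; count it
    ObjectMonitor::_sync_FutileWakeups->inc();  // use-after-free race
  }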

Here's info about the _sync_FutileWakeups field:

(gdb) print _sync_FutileWakeups
$23 = (PerfCounter *) 0x7f31b000dcf0

(gdb) print *_sync_FutileWakeups
$24 = {
    <PerfLongVariant> = {<PerfLong> = {<PerfData> = {<CHeapObj<1792u>> = {<AllocatedObj> = {_vptr.AllocatedObj = 0xbabababababababa}, <No data fields>}, 
        _name = 0xbabababababababa <Address 0xbabababababababa out of bounds>, 
        _v = 3132799674, _u = 3132799674, _on_c_heap = 186, 
        _flags = 3132799674, _pdep = 0xbabababababababa, 
        _valuep = 0xbabababababababa}, <No data fields>}, 
    _sampled = 0xbabababababababa, 
    _sample_helper = 0xbabababababababa}, <No data fields>}

The _sync_FutileWakeups PerfCounter has been freed.
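
The 0xba fill pattern is itself diagnostic: in debug builds, HotSpot pads freed C-heap blocks with a known byte value, so every field of a freed object reads back as 0xba bytes. From memory, the relevant constants in globalDefinitions.hpp look like this (shown for illustration):

  const juint uninitBlockPad = 0xF1;  // newly allocated, not yet initialized
  const juint freeBlockPad   = 0xBA;  // block has been freed by os::free()

This matches the dumps here: the freed PerfData fields read 0xbaba..., while the never-initialized _unhandled_oops field in the JavaThread dump below reads 0xf1f1....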

_sync_Parks is the PerfCounter after it:

(gdb) print *_sync_Parks
$25 = {<PerfLongVariant> = {<PerfLong> = {<PerfData> = {<CHeapObj<1792u>> = {<AllocatedObj> = {_vptr.AllocatedObj = 0xbabababababababa}, <No data fields>}, 
        _name = 0xbabababababababa <Address 0xbabababababababa out of bounds>, 
        _v = 3132799674, _u = 3132799674, _on_c_heap = 186, 
        _flags = 3132799674, _pdep = 0xbabababababababa, 
        _valuep = 0xbabababababababa}, <No data fields>}, 
    _sampled = 0xbabababababababa, 
    _sample_helper = 0xbabababababababa}, <No data fields>}

_sync_ContendedLockAttempts is the PerfCounter before it:

(gdb) print *_sync_ContendedLockAttempts
$26 = {<PerfLongVariant> = {<PerfLong> = {<PerfData> = {<CHeapObj<1792u>> = {<AllocatedObj> = {_vptr.AllocatedObj = 0xbabababababababa}, <No data fields>}, 
        _name = 0xbabababababababa <Address 0xbabababababababa out of bounds>, 
        _v = 3132799674, _u = 3132799674, _on_c_heap = 186, 
        _flags = 3132799674, _pdep = 0xbabababababababa, 
        _valuep = 0xbabababababababa}, <No data fields>}, 
    _sampled = 0xbabababababababa, 
    _sample_helper = 0xbabababababababa}, <No data fields>}

Looks like all the PerfCounters have been freed, which happens
at VM shutdown.

Here's the current JavaThread's info:

(gdb) print *(JavaThread*)Self
$33 = {<Thread> = {<ThreadShadow> = {<CHeapObj<512u>> = {<AllocatedObj> = {
          _vptr.AllocatedObj = 0x7f31b8251c50}, <No data fields>}, 
      _pending_exception = {_o = 0x0}, _exception_file = 0x0, 
      _exception_line = 0}, _real_malloc_address = 0x7f31b02ec9f0, 
    _SR_lock = 0x7f31b02ee090, _suspend_flags = 0, _num_nested_signal = 1, 
    _active_handles = 0x7f311c000d90, _free_handle_block = 0x7f3120000910, 
    _last_handle_mark = 0x7f31715f4590, _oops_do_parity = 0, 
    _allow_safepoint_count = 0, _allow_allocation_count = 0, 
    _skip_gcalot = false, _tlab = {<CHeapObj<512u>> = {<AllocatedObj> = {
          _vptr.AllocatedObj = 0x7f31b8252090}, <No data fields>}, 
      _start = 0x973d78e8, _top = 0x973d8780, _pf_top = 0x973d7900, 
      _end = 0x973ffff0, _desired_size = 36700, _refill_waste_limit = 573, 
      _allocated_before_last_gc = 0, static _max_size = 65535, 
      static _target_refills = 50, _number_of_refills = 2, 
      _fast_refill_waste = 0, _slow_refill_waste = 4, _gc_waste = 0, 
      _slow_allocations = 0, 
      _allocation_fraction = {<CHeapObj<1280u>> = {<AllocatedObj> = {
            _vptr.AllocatedObj = 0x7f31b823d630}, <No data fields>}, 
        _average = 0.999995649, _sample_count = 1, _weight = 35, 
        _is_old = false, static OLD_THRESHOLD = 100, 
        _last_sample = 0.999995649}, static _global_stats = 0x7f31b0169d60}, 
    _allocated_bytes = 2016, _trace_data = {_buffer = 0x0, _stackdepth = 0, 
      _stackframes = 0x0}, _vm_operation_started_count = 0, 
    _vm_operation_completed_count = 0, 
    _current_pending_monitor = 0x7f3134003e18, 
    _current_pending_monitor_is_from_java = true, 
    _current_waiting_monitor = 0x0, omFreeList = 0x0, omFreeCount = 0, 
    omFreeProvision = 32, omInUseList = 0x0, omInUseCount = 0, 
    _visited_for_critical_count = true, _unhandled_oops = 0xf1f1f1f1f1f1f1f1, 
    _osthread = 0x7f31b02eecc0, _resource_area = 0x7f31b02ed830, 
    _current_resource_mark = 0x0, _handle_area = 0x7f31b02edcd0, 
    _metadata_handles = 0x7f31b02ede80, 
    _stack_base = 0x7f31715f6000 <Address 0x7f31715f6000 out of bounds>, 
    _stack_size = 1052672, _self_raw_id = 0, _lgrp_id = -1, 
    _owned_locks = 0x0, _jvmti_env_iteration_count = 0, 
    _Stalled = 139849302556184, _TypeTag = 11181, _ParkEvent = 0x7f31b02ee200, 
    _SleepEvent = 0x7f31b02ee400, _MutexEvent = 0x7f31b02ee600, 
    _MuxEvent = 0x7f31b02ee800, NativeSyncRecursion = -235802127, _OnTrap = 0, 
    _hashStateW = 345268945, _hashStateX = 842502087, _hashStateY = 34663, 
    _hashStateZ = 273326509, _schedctl = 0x0, rng = {1237846220, -235802127, 
      -235802127, -235802127}}, _next = 0x7f31b0294800, _threadObj = {
    _o = 0x97ff8a30}, _java_call_counter = 1, _anchor = {
    _last_Java_sp = 0x7f31715f43a8, _last_Java_pc = 0x0, 
    _last_Java_fp = 0x7f31715f43f8}, _entry_point = 0x7f31b76bccc0
     <thread_entry(JavaThread*, Thread*)>, _jni_environment = {
    functions = 0x7f31b8271b00}, _deopt_mark = 0x0, _must_deopt_id = 0x0, 
  _deopt_nmethod = 0x0, _vframe_array_head = 0x0, _vframe_array_last = 0x0, 
  _deferred_locals_updates = 0x0, _callee_target = 0x7f31a09399e0, 
  _vm_result = {_o = 0x0}, _vm_result_2 = 0x0, _deferred_card_mark = {
    _start = 0x0, _word_size = 0}, _monitor_chunks = 0x0, 
  _special_runtime_exit_condition = JavaThread::_no_async_condition, 
  _pending_async_exception = {_o = 0x0}, _thread_state = _thread_blocked, 
  _safepoint_state = 0x7f31b02eec60, _saved_exception_pc = 0x0, 
  _terminated = JavaThread::_not_terminated, _suspend_equivalent = true, 
  _in_deopt_handler = 0, _doing_unsafe_access = false, 
  _do_not_unlock_if_synchronized = false, 
  _jni_attach_state = JavaThread::_not_attaching_via_jni, 
  _stack_guard_state = JavaThread::stack_guard_enabled, 
  _stack_overflow_limit = 0x7f317150e000 "", _exception_oop = {_o = 0x0}, 
  _exception_pc = 0x0, _exception_handler_pc = 0x0, 
  _is_method_handle_return = 0, _jni_active_critical = 0, 
  _pending_jni_exception_check_fn = 0x0, _depth_first_number = -235802127, 
  _popframe_condition = 0, _jmp_ring_index = 0, _jmp_ring = {{_target = 0, 
      _instruction = 0, _file = 0x0, _line = 0} <repeats 16 times>}, 
  _satb_mark_queue = {<PtrQueue> = {_vptr.PtrQueue = 0x7f31b824fa90, 
      _qset = 0x7f31b8312d20, _active = false, _buf = 0x0, _index = 0, 
      _sz = 17433981653976478193, _perm = false, 
      _lock = 0x0}, <No data fields>}, 
  static _satb_mark_queue_set = {<PtrQueueSet> = {
      _vptr.PtrQueueSet = 0x7f31b824fab0, _cbl_mon = 0x7f31b0007140, 
      _completed_buffers_head = 0x0, _completed_buffers_tail = 0x0, 
      _n_completed_buffers = 0, _process_completed_threshold = 20, 
      _process_completed = false, _fl_lock = 0x7f31b0007070, 
      _buf_free_list = 0x0, _buf_free_list_sz = 0, _fl_owner = 0x7f31b8312d20, 
      _sz = 8192, _all_active = false, _notify_when_complete = false, 
      _max_completed_queue = -1, _completed_queue_padding = 0}, 
    _closure = 0x0, _par_closures = 0x7f31b0165cc0, 
    _shared_satb_queue = {<PtrQueue> = {_vptr.PtrQueue = 0x7f31b824ed90, 
        _qset = 0x7f31b8312d20, _active = false, _buf = 0x0, _index = 0, 
        _sz = 0, _perm = true, _lock = 0x7f31b0007210}, <No data fields>}}, 
  _dirty_card_queue = {<PtrQueue> = {_vptr.PtrQueue = 0x7f31b8238b90, 
      _qset = 0x7f31b8312de0, _active = true, _buf = 0x0, _index = 0, 
      _sz = 17433981653976478193, _perm = false, 
      _lock = 0x0}, <No data fields>}, static _dirty_card_queue_set = 
    {<PtrQueueSet> = {_vptr.PtrQueueSet = 0x7f31b8238b70, 
      _cbl_mon = 0x7f31b00073b0, _completed_buffers_head = 0x0, 
      _completed_buffers_tail = 0x0, _n_completed_buffers = 0, 
      _process_completed_threshold = 12, _process_completed = false, 
      _fl_lock = 0x7f31b00072e0, _buf_free_list = 0x0, _buf_free_list_sz = 0, 
      _fl_owner = 0x7f31b8312de0, _sz = 2048, _all_active = true, 
      _notify_when_complete = true, _max_completed_queue = 24, 
      _completed_queue_padding = 0}, _mut_process_closure = 0x7f31b002dbc0, 
    _shared_dirty_card_queue = {<PtrQueue> = {_vptr.PtrQueue = 0x7f31b824ed90, 
        _qset = 0x7f31b8312de0, _active = true, _buf = 0x0, _index = 0, 
        _sz = 0, _perm = true, _lock = 0x7f31b0007480}, <No data fields>}, 
    _free_ids = 0x7f31b0165d20, _processed_buffers_mut = 0, 
    _processed_buffers_rs_thread = 0, _cur_par_buffer_node = 0x0}, _recorder = 
    0x0, _thread_profiler = 0x0, _safepoint_visible = true, 
  _privileged_stack_top = 0x0, _array_for_gc = 0x0, 
  _popframe_preserved_args = 0x0, _popframe_preserved_args_size = 0, 
  _jvmti_thread_state = 0x0, _jvmti_get_loaded_classes_closure = 0x0, 
  _interp_only_mode = 0, _should_post_on_exceptions_flag = 0, _thread_stat = 
    0x7f31b02eea20, static _stack_size_at_create = 1048576, 
  _blocked_on_compilation = false, _parker = 0x7f31b02eeb70, 
  _cached_monitor_info = 0x0, _claimed_par_id = 4294967295}

And here's the JavaThread's safepoint info:

(gdb) print *((JavaThread*)Self)->_safepoint_state
$36 = {<CHeapObj<1792u>> = {<AllocatedObj> = {
      _vptr.AllocatedObj = 0x7f31b824fa70}, <No data fields>}, 
  _at_poll_safepoint = false, _has_called_back = false, 
  _thread = 0x7f31b02ed000, _type = ThreadSafepointState::_at_safepoint, 
  _orig_thread_state = _thread_blocked}


Here's the JavaThread that called JVM_Halt with code == 95
which called vm_exit() with code == 95:

src/share/vm/prims/jvm.cpp:

 421 JVM_ENTRY_NO_ENV(void, JVM_Halt(jint code))
 422   before_exit(thread);
 423   vm_exit(code);
 424 JVM_END

which means that before_exit() has already been called.


Thread 15 (Thread 0x7f31b6c6d700 (LWP 8820)):
#0  0x0000003189a0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f31b798bb32 in os::PlatformEvent::park (this=0x7f31b000cd00) at /work/shared/bugs/8047212/src/8047212_exp/src/os/linux/vm/os_linux.cpp:5487
#2  0x00007f31b792bcc0 in ParkCommon (this=0x7f31b0009360, Self=0x7f31b000b800) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/mutex.cpp:414
#3  Monitor::ILock (this=0x7f31b0009360, Self=0x7f31b000b800) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/mutex.cpp:481
#4  0x00007f31b792cab4 in lock_without_safepoint_check (this=0x7f31b0009360) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/mutex.cpp:949
#5  Monitor::lock_without_safepoint_check (this=0x7f31b0009360) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/mutex.cpp:955
#6  0x00007f31b7a83c64 in SafepointSynchronize::block (thread=0x7f31b000b800) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/safepoint.cpp:706
#7  0x00007f31b792dfb1 in transition_and_fence (this=0x7f31b0009500, no_safepoint_check=false, timeout=0, as_suspend_equivalent=false) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/interfaceSupport.hpp:184
#8  trans_and_fence (this=0x7f31b0009500, no_safepoint_check=false, timeout=0, as_suspend_equivalent=false) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/interfaceSupport.hpp:232
#9  ThreadBlockInVM (this=0x7f31b0009500, no_safepoint_check=false, timeout=0, as_suspend_equivalent=false) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/interfaceSupport.hpp:311
#10 Monitor::wait (this=0x7f31b0009500, no_safepoint_check=false, timeout=0, as_suspend_equivalent=false) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/mutex.cpp:1107
#11 0x00007f31b7c1b0f5 in VMThread::execute (op=0x7f31b6c6c3f0) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/vmThread.cpp:615
#12 0x00007f31b75dda45 in vm_exit (code=95) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/java.cpp:597
#13 0x00007f31b76c5c51 in JVM_Halt (code=95) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/prims/jvm.cpp:423
#14 0x00007f31a10258ba in ?? ()
#15 0x00007f31a0adcd88 in ?? ()
#16 0x00007f31a1024ea9 in ?? ()
#17 0x00007f31b6c6c4c0 in ?? ()
#18 0x00007f31a0adc1c0 in ?? ()
#19 0x00007f31b6c6c520 in ?? ()
#20 0x00007f31a0adc708 in ?? ()
#21 0x0000000000000000 in ?? ()




Here's the VMThread and it's exiting with code==95.
We're definitely on our way out the door.

VM_Exit::doit() called exit_globals() which is what freed up
the PerfMemory resources.
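
For orientation, here is the shutdown chain implicated above as a condensed sketch; it is paraphrased from java.cpp and perfMemory.cpp, not verbatim source:

  // VMThread side: the VM_Exit operation tears down globals, then exits.
  void VM_Exit::doit() {
    // ... wait until the exiting thread is the only active thread ...
    exit_globals();              // calls perfMemory_exit(), among others
    vm_direct_exit(_exit_code);  // reaches ::exit(), as in frame #3 above
  }

  void perfMemory_exit() {
    if (!UsePerfData) return;
    // ...
    PerfMemory::destroy();       // frees the resources backing every
                                 // PerfData counter, including
                                 // _sync_FutileWakeups
  }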

Thread 16 (Thread 0x7f31a05db700 (LWP 8885)):
#0  0x0000003188e17d75 in memset () from /lib64/ld-linux-x86-64.so.2
#1  0x0000003188e0e7dd in _dl_sort_fini () from /lib64/ld-linux-x86-64.so.2
#2  0x0000003188e0ead3 in _dl_fini () from /lib64/ld-linux-x86-64.so.2
#3  0x0000003189635da2 in exit () from /lib64/libc.so.6
#4  0x00007f31b75dd907 in vm_direct_exit (code=95) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/java.cpp:616
#5  0x00007f31b7c1dcd5 in VM_Operation::evaluate (this=0x7f31b6c6c3f0) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/vm_operations.cpp:62
#6  0x00007f31b7c1b799 in VMThread::evaluate_operation (this=<value optimized out>, op=0x7f31b6c6c3f0) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/vmThread.cpp:355
#7  0x00007f31b7c1c26e in VMThread::loop (this=0x7f31b022d000) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/vmThread.cpp:475
#8  0x00007f31b7c1c481 in VMThread::run (this=0x7f31b022d000) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/vmThread.cpp:265
#9  0x00007f31b798d7c2 in java_start (thread=0x7f31b022d000) at /work/shared/bugs/8047212/src/8047212_exp/src/os/linux/vm/os_linux.cpp:828
#10 0x0000003189a07851 in start_thread () from /lib64/libpthread.so.0
#11 0x00000031896e890d in clone () from /lib64/libc.so.6
Comments
We saw the same crash in jdk1.8.0_112; do we have any plan to backport it to jdk8?
18-08-2022

At Twitter we have seen crashes that look like this one in JDK8. Any chance of a backport?
18-03-2016

URL: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/f44314157fcb User: lana Date: 2015-09-23 23:04:19 +0000
23-09-2015

URL: http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/f44314157fcb User: dcubed Date: 2015-09-03 19:03:46 +0000
03-09-2015

Code review round 0 has motivated more debug experiments for this bug. The first two options in this experiment are from JDK-8129978. The '-XX:+UseNewCode' option disables the fix and the '-XX:+UseNewCode2' option enables the "XXX perfdata destroyed" message.

$ $JAVA_HOME/bin/java -XX:+UseConcMarkSweepGC -XX:CMSYoungGenPerWorker=1 -server_8049304_bld_dcubed-jvmg -XX:+UseNewCode -XX:+UseNewCode2 -version
Error occurred during initialization of VM
GC triggered before VM initialization completed. Try increasing NewSize, current value 192K.
XXX perfdata destroyed - is NOT at a safepoint
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0xfffffd7ffda115e4, pid=13993, tid=0x0000000000000023
#
# JRE version: (9.0) (build )
# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.9.0-8049304_bld_dcubed-product, mixed mode, tiered, compressed oops, concurrent mark sweep gc, solaris-amd64)
# Problematic frame:
# V [libjvm.so+0xe115e4]

Our PerfData objects are being destroyed while not safepoint-safe in this case so we're just racing with every PerfData object.

The CrashTestDummy class is from this bug. The '-XX:+UseNewCode' option disables the fix and the '-XX:+UseNewCode2' option enables the "XXX perfdata destroyed" message.

$ $JAVA_HOME/bin/java -server_8049304_bld_dcubed-jvmg -XX:+UseNewCode -XX:+UseNewCode2 CrashTestDummy
Doing it
about to block forever
main 0
main 1
main 2
main 3
main 4
main 5
^CXXX perfdata destroyed - is at a safepoint
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0xfffffd7ffda115e4, pid=14051, tid=0x0000000000000050
#
# JRE version: Java(TM) SE Runtime Environment (9.0) (build 1.9.0-internal-20150821221801.ddaugher.8049304_for_jdk9-b00)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.9.0-8049304_bld_dcubed-product, mixed mode, tiered, compressed oops, g1 gc, solaris-amd64)
# Problematic frame:
# V [libjvm.so+0xe115e4] void PerfLongVariant::inc()+0x14
#
# Core dump will be written. Default location: /work/shared/bug_hunt/8049304/core or core.14051
#
# An error report file with more information is saved as:
# /work/shared/bug_hunt/8049304/hs_err_pid14051.log

The CTRL-C handler exit code path destroys the PerfData objects while we are at a safepoint so PerfData usage is safe for code paths that only manipulate PerfData while not in a safepoint-safe state. The two FutileWakeup counter updates happen while the thread is in a safepoint-safe state so we still have two races with the shutdown cleanup of the PerfData objects.

Here's a hello world test program that falls off the end of main():

$ $JAVA_HOME/bin/java -server_8049304_bld_dcubed-jvmg -XX:+UseNewCode -XX:+UseNewCode2 HelloFallOffMain
Hello World!
XXX perfdata destroyed - is at a safepoint

PerfData cleanup happens at a safepoint.

Here's a hello world test program that calls System.exit(0):

$ $JAVA_HOME/bin/java -server_8049304_bld_dcubed-jvmg -XX:+UseNewCode -XX:+UseNewCode2 HelloSystemExit
Hello World!
XXX perfdata destroyed - is at a safepoint

PerfData cleanup happens at a safepoint.
28-08-2015

Continued investigating VM shutdown race:

    JDK-8049304 race between VM_Exit and _sync_FutileWakeups->inc()
    JDK-8129978 SIGSEGV when parsing command line options

- Thanks to Kim for providing easy reproduction instructions for both bugs; I've tweaked the repro code a bit.
- The "correct" solution is to add a locking/memory ordering mechanism to ensure that PerfData is only used when valid. The locking/memory ordering would slow down the PerfData mechanism for every update. Ouch!
- The "fast but safe" solution is to leak the PerfData memory and not clean it up at VM shutdown. We're trying to clean up the code base, so the idea of intentionally leaking memory makes me cringe.
- The solution I'm investigating is between "fast but safe" and "correct". I'm adding a PerfDataManager::has_PerfData() function that returns true when PerfDataManager is holding PerfData objects and false when none have been allocated or when they have been freed at VM shutdown. The flag holding the state is volatile and I use release_store() to change it so that publication is visible more quickly. On the VM shutdown path, I also do a 1ms sleep after setting the flag and before freeing the memory (see the sketch after this comment).
- The idea is that the 1ms sleep will give any threads that saw PerfDataManager::has_PerfData() == true a chance to do their operation on the PerfData object before the VM shutdown thread frees it.
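
A minimal sketch of the guard described in the comment above, using the names it mentions; a hypothetical condensation, not the shipped patch:

  volatile jint PerfDataManager::_has_PerfData = 0;

  bool PerfDataManager::has_PerfData() {
    // acquire pairs with the release_store() in destroy()
    return OrderAccess::load_acquire(&_has_PerfData) != 0;
  }

  void PerfDataManager::destroy() {
    // publish "no more PerfData" first, then give racing updaters a
    // moment to finish before the objects are actually freed
    OrderAccess::release_store(&_has_PerfData, 0);
    os::naked_short_sleep(1);  // the 1ms grace period described above
    // ... free the PerfData objects ...
  }

  // Update sites then become conditional, e.g. in ObjectMonitor::EnterI:
  //   if (PerfDataManager::has_PerfData()) {
  //     ObjectMonitor::_sync_FutileWakeups->inc();
  //   }
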
22-08-2015

Slightly different debug code for making CrashTestDummy.java reproduce the crash:

$ hg diff
diff -r d9bc27b8ef1e src/share/vm/runtime/perfMemory.cpp
--- a/src/share/vm/runtime/perfMemory.cpp	Wed Aug 19 11:11:06 2015 -0700
+++ b/src/share/vm/runtime/perfMemory.cpp	Wed Aug 19 17:06:46 2015 -0600
@@ -78,6 +78,9 @@ void perfMemory_exit() {
   // initialization.
   //
   PerfMemory::destroy();
+tty->print_cr("XXX perfdata destroyed");
+tty->flush();
+os::naked_short_sleep(999);
 }

 void PerfMemory::initialize() {

With the 'flush()' call, the "XXX perfdata destroyed" message is not repeated. With full debug (jvmg) bits, I get a better stack trace:

--------------- T H R E A D ---------------

Current thread (0x0000000000c4c800): JavaThread "Thread-0" [_thread_blocked, id=80, stack(0xfffffd7fc07e8000,0xfffffd7fc08e8000)]

Stack: [0xfffffd7fc07e8000,0xfffffd7fc08e8000], sp=0xfffffd7fc08e7110, free space=1020k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xe0e334] void PerfLongVariant::inc()+0x14
V [libjvm.so+0x15ac09e] void ObjectMonitor::EnterI(Thread*)+0x65e
V [libjvm.so+0x15ab4f7] void ObjectMonitor::enter(Thread*)+0x6c7
V [libjvm.so+0x178752c] void ObjectSynchronizer::slow_enter(Handle,BasicLock*,Thread*)+0x27c
V [libjvm.so+0x1786f55] void ObjectSynchronizer::fast_enter(Handle,BasicLock*,bool,Thread*)+0x145
V [libjvm.so+0x11e60e2] void InterpreterRuntime::monitorenter(JavaThread*,BasicObjectLock*)+0x2d2
j CrashTestDummy.blocked()V+14
j CrashTestDummy.access$000()V+0
j CrashTestDummy$1.run()V+0
j java.lang.Thread.run()V+11
v ~StubRoutines::call_stub
V [libjvm.so+0x1200b22] void JavaCalls::call_helper(JavaValue*,methodHandle*,JavaCallArguments*,Thread*)+0x882
V [libjvm.so+0x15d85b8] void os::os_exception_wrapper(void(*)(JavaValue*,methodHandle*,JavaCallArguments*,Thread*),JavaValue*,methodHandle*,JavaCallArguments*,Thread*)+0x38
V [libjvm.so+0x1200291] void JavaCalls::call(JavaValue*,methodHandle,JavaCallArguments*,Thread*)+0x91
V [libjvm.so+0x11ff7ab] void JavaCalls::call_virtual(JavaValue*,KlassHandle,Symbol*,Symbol*,JavaCallArguments*,Thread*)+0x23b
V [libjvm.so+0x11ff896] void JavaCalls::call_virtual(JavaValue*,Handle,KlassHandle,Symbol*,Symbol*,Thread*)+0x66
V [libjvm.so+0x130a857] void thread_entry(JavaThread*,Thread*)+0xc7
V [libjvm.so+0x17d3687] void JavaThread::thread_main_inner()+0x177
V [libjvm.so+0x17d34de] void JavaThread::run()+0x2de
V [libjvm.so+0x15cf034] java_start+0x234
C [libc.so.1+0xdd9db] _thr_setup+0x5b
C [libc.so.1+0xddc10] _lwp_start+0x0
[error occurred during error reporting (printing native stack), id 0xe0000000]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j CrashTestDummy.blocked()V+14
j CrashTestDummy.access$000()V+0
j CrashTestDummy$1.run()V+0
j java.lang.Thread.run()V+11
v ~StubRoutines::call_stub
19-08-2015

That suggestion of volatile / acquire / release is what I was assuming would be required. I'm guessing we don't want to add an acquire barrier when doing a PerfData update.
08-07-2015

I suppose changing the static PerfCounter * fields into volatiles, an acquire() to fetch and use and a release() to clear it is too much memory barrier?
07-07-2015

Drat! I seem to have gotten lost in the call chains. Unfortunately, exit_globals is called from multiple places, and some of those places are on their way to process exit and also don't seem to be guaranteeing threads are stopped.

It's Threads::destroy_vm that attempts to ensure threads are stopped before calling exit_globals. There are two callers of Threads::destroy_vm:

- jni_DestroyJavaVM
- ciReplay::replay, which then calls vm_exit. Since it is after destroy_vm, that call to vm_exit should end up going direct to vm_direct_exit rather than VM_Exit, since the VMThread was destroyed by destroy_vm. It's not clear why this doesn't just call os::exit after the destroy_vm, rather than vm_exit / vm_direct_exit.

The other callers of exit_globals are:

- VM_Exit::doit - on its way to process exit.
- CompileBroker::handle_full_code_cache - on its way to process exit. (Not obvious why this calls exit_globals at all, and isn't just calling vm_direct_exit.)

Maybe exit_globals should have a "terminating_process" argument, as sketched below? Pass it to perfMemory_exit (to control whether to call PerfDataManager::destroy), and use it to decide whether to call ostream_exit or ostream_abort. Then make Threads::destroy_vm call exit_globals(false) and the others call exit_globals(true).

Regarding leak check tools, all that I've used have mechanisms for whitelisting some allocations. And do we care? By which I mean: do we run any such tools, look at the results, and act upon those results? The reason I ask is that I'm pretty sure I've seen code that wasn't being careful about deleting data structures created during startup. Unfortunately, I didn't look carefully or make any notes or file bugs, because I didn't know about jni_DestroyJavaVM until very recently, and so just assumed we didn't care.
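
A sketch of the "terminating_process" shape suggested above; this is hypothetical, one reading of the proposal rather than what was ultimately committed:

  // exit_globals(true) when the process is about to exit anyway: skip
  // PerfDataManager::destroy() and deliberately "leak" the PerfData
  // objects; exit_globals(false) for jni_DestroyJavaVM's full cleanup.
  void exit_globals(bool terminating_process) {
    // ...
    perfMemory_exit(terminating_process);
    if (terminating_process) {
      ostream_abort();  // minimal teardown; process exit reclaims memory
    } else {
      ostream_exit();   // full cleanup
    }
  }
  // Threads::destroy_vm would call exit_globals(false); VM_Exit::doit and
  // CompileBroker::handle_full_code_cache would call exit_globals(true).
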
07-07-2015

In my analysis of the original crash back on 2014-07-03, there is this note:

> VM_Exit::doit() called exit_globals() which is what freed up
> the PerfMemory resources.

along with a stack trace showing how the VMThread went through (and past) that code path:

Thread 16 (Thread 0x7f31a05db700 (LWP 8885)):
#0  0x0000003188e17d75 in memset () from /lib64/ld-linux-x86-64.so.2
#1  0x0000003188e0e7dd in _dl_sort_fini () from /lib64/ld-linux-x86-64.so.2
#2  0x0000003188e0ead3 in _dl_fini () from /lib64/ld-linux-x86-64.so.2
#3  0x0000003189635da2 in exit () from /lib64/libc.so.6
#4  0x00007f31b75dd907 in vm_direct_exit (code=95) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/java.cpp:616
#5  0x00007f31b7c1dcd5 in VM_Operation::evaluate (this=0x7f31b6c6c3f0) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/vm_operations.cpp:62
#6  0x00007f31b7c1b799 in VMThread::evaluate_operation (this=<value optimized out>, op=0x7f31b6c6c3f0) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/vmThread.cpp:355
#7  0x00007f31b7c1c26e in VMThread::loop (this=0x7f31b022d000) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/vmThread.cpp:475
#8  0x00007f31b7c1c481 in VMThread::run (this=0x7f31b022d000) at /work/shared/bugs/8047212/src/8047212_exp/src/share/vm/runtime/vmThread.cpp:265
#9  0x00007f31b798d7c2 in java_start (thread=0x7f31b022d000) at /work/shared/bugs/8047212/src/8047212_exp/src/os/linux/vm/os_linux.cpp:828
#10 0x0000003189a07851 in start_thread () from /lib64/libpthread.so.0
#11 0x00000031896e890d in clone () from /lib64/libc.so.6

So the proposal is to not call PerfDataManager::destroy() on the code path that I identified previously and live with the memory leak. Since we're about to exit the process, the memory leak is not an issue... what about memory leak check tools? There's probably a way to shut them up.

As for the jni_DestroyJavaVM() code path, you're saying that this one is safe because we wait for all threads to terminate first? Do I have this right?
07-07-2015

I suspect such a swap of function pointers, or even a test of the data pointers to check for cleared during exit, can't be done correctly without introducing memory barriers that we probably don't want in the code in question. I still think the thing to do is to simply "leak" the perfdata memory when we're on the way to process termination. It's ok to destroy the persistent external resources, just don't unmap the associated memory or anything like that. perfMemory_exit is already structured to support that kind of cleanup separation, and already uses that separation sometimes. I'm suggesting it doesn't use that separation in cases where it should. Specifically, I currently think perfDataManager::destroy should only be called from within jni_DestroyJavaVM -> exit_globals -> ..., and not from places where we're about to exit the process anyway.
07-07-2015

My guess is that the blocked thread leads to an "interesting" code path from a JVM stat counter point of view. The blocked thread tries to update the counters after getting free of the block and boom! Is there some way to safely swap out the jvmstat counter function pointers for a benign block of do nothing functions? That way if a racing thread makes it past the "is this enabled?" check, it won't do anything with the soon to be freed memory.
07-07-2015

I can at will provoke a crash that looks very much like this one, using a technique similar to what I described for JDK-8129978. In this case, I made the following changes:

1. Patched perfMemory_exit by adding the following at the end:

   tty->print_cr("perfdata destroyed");
   struct timespec ts;
   ts.tv_sec = 1;
   ts.tv_nsec = 0;
   ::nanosleep(&ts, NULL);

2. Compiled and executed the attached CrashTestDummy.java program. That program loops, printing "main <N>" where N is the iteration counter, with brief sleeps for each iteration.

3. Once the program is running and has printed a few lines, Ctrl-C the program to terminate it.

I get output like:

--------------------
Doing it
about to block forever
main 0
main 1
[... elided ...]
main 13
main 14
^Cperfdata destroyed
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007fe3e50f2ae5, pid=1112, tid=0x00007fe30090d700
#
# JRE version: Java(TM) SE Runtime Environment (9.0) (build 1.9.0-internal-fastdebug-kab_2015_06_29_16_25-b00)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.9.0-internal-fastdebug-kab_2015_06_29_16_25-b00 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V [libjvm.so+0xe82ae5] ObjectMonitor::EnterI(Thread*)+0x585
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %P" (or dumping to /home/kab/sandboxes/cla-segv-8129978/test-attempt/core.1112)
#
# An error report file with more information is saved as:
# /home/kab/sandboxes/cla-segv-8129978/test-attempt/hs_err_pid1112.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
perfdata destroyed
--------------------

[It might be interesting that the "perfdata destroyed" message appears twice.]

This can be run under gdb, to look at a live run. The gdb command

   handle SIGINT noprint pass

allows the Ctrl-C signal to pass through gdb to the application.

The existence of the blocked thread is important to provoking the failure. I'm not sure why a thread being blocked attempting to synchronize on an object that some other thread has locked might lead to this error, but it does...
07-07-2015

RULE nsk/monitoring/ThreadMXBean/ThreadInfo/Deadlock/MixedDeadlock001 Crash SIGSEGV

Looks like the same issue:

V [libjvm.so+0xdbfabb] ObjectMonitor::EnterI(Thread*)+0x52b
V [libjvm.so+0xdc1771] ObjectMonitor::enter(Thread*)+0x5a1
V [libjvm.so+0xfb1b33] ObjectSynchronizer::fast_enter(Handle, BasicLock*, bool, Thread*)+0x1d3
V [libjvm.so+0x9dbf48] InterpreterRuntime::monitorenter(JavaThread*, BasicObjectLock*)+0x1c8
j nsk.share.monitoring.thread.Deadlock$JavaLocker.lock()V+7
v ~StubRoutines::call_stub
V [libjvm.so+0x9f5327] JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x15c7
V [libjvm.so+0xa5d4ae] jni_invoke_nonstatic(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .constprop.266]+0x73e
V [libjvm.so+0xa6d942] jni_CallVoidMethod+0x1d2
C [libDeadlock.so+0x3c67] Java_nsk_share_monitoring_thread_Deadlock_00024NativeLocker_lock+0x8e7
30-09-2014

---- called from signal handler with signal 10 (SIGBUS) ------
[7] PerfLongVariant::inc(this = 0x100115e80), line 425 in "perfData.hpp"
[8] ObjectMonitor::EnterI(this = 0x1018bf978, __the_thread__ = 0x100aad000), line 596 in "objectMonitor.cpp"
[9] ObjectMonitor::enter(this = 0x1018bf978, __the_thread__ = 0x100aad000), line 360 in "objectMonitor.cpp"
[10] ObjectSynchronizer::slow_enter(obj = CLASS, lock = 0xffffffff132ad0e0, __the_thread__ = 0x100aad000), line 214 in "synchronizer.cpp"
[11] ObjectSynchronizer::fast_enter(obj = CLASS, lock = 0xffffffff132ad0e0, attempt_rebias = true, __the_thread__ = 0x100aad000), line 144 in "synchronizer.cpp"
[12] InterpreterRuntime::monitorenter(thread = 0x100aad000, elem = 0xffffffff132ad0e0), line 598 in "interpreterRuntime.cpp"

RULE nsk/monitoring/ThreadMXBean/ThreadInfo/Deadlock/JavaDeadlock003 Crash none
RULE nsk/monitoring/ThreadMXBean/ThreadInfo/Deadlock/JavaDeadlock003 ExitCode 11
RULE nsk/monitoring/ThreadMXBean/ThreadInfo/Deadlock/JavaDeadlock004 Crash SIGBUS
21-08-2014

This seems like a mistake in thinking that "at a safepoint" means not executing any code. The thread has presumably already performed the thread-state transition that marks it as safepoint-safe but then continues to touch the resource that will be freed by the VM op. Ouch. There's probably a chance that a similar thing could happen for a thread that doesn't participate in the safepoint protocol. It takes so much effort to die with grace. ;-)
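
Concretely, condensing ObjectMonitor::enter/EnterI (a paraphrase, not verbatim source): the thread transitions to _thread_blocked, which tells safepoints (and thus VM_Exit) not to wait for it, yet it keeps executing VM code afterwards:

  {
    ThreadBlockInVM tbivm(jt);  // now _thread_blocked: the safepoint
                                // protocol treats this thread as parked
    EnterI(THREAD);             // but it still runs VM code, including
                                // the PerfData counter updates
  }

That matches the crashing thread's state captured above: _thread_state = _thread_blocked and a safepoint state of ThreadSafepointState::_at_safepoint.
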
04-07-2014

In the "you learn something new everyday department": src/share/vm/runtime/globals.hpp: product(bool, UsePerfData, falseInEmbedded, \ "Flag to disable jvmstat instrumentation for performance testing "\ "and problem isolation purposes") \ So UsePerfData is enabled by default unless we have an Embedded VM. I probably did know that at some point during my Serviceability years...
03-07-2014

Attached the complete thread dump in thread.8815.log. The crashing thread has 2754 frames and no hs_err_pid file was generated.
03-07-2014

The test runs here are for chasing 8047212:

$ grep -v PASS doit_loop.*.log; uptime; pwd
doit_loop.fast_1.log:Copy fast_1: loop #930808...
doit_loop.fast_2.log:Copy fast_2: loop #880544...FAILED.
doit_loop.fast_2.log:status=11
doit_loop.jvmg_1.log:Copy jvmg_1: loop #455100...
doit_loop.jvmg_2.log:Copy jvmg_2: loop #455076...
doit_loop.prod_1.log:Copy prod_1: loop #1450529...
doit_loop.prod_2.log:Copy prod_2: loop #1450320...
16:14:56 up 159 days, 3:44, 3 users, load average: 8.28, 8.65, 8.82
/work/shared/bugs/8047212/inner-complex

This illustrates how rare the failure is: it took 880544 runs to hit it.
03-07-2014

Looks like the perf monitor shutdown code presumes that PerfCounters cannot be used when the system is at a safepoint. Not a good assumption.
03-07-2014