JDK-8247560 : Shenandoah: heap iteration holds root locks all the time
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 8-shenandoah,11-shenandoah,14,15
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2020-06-15
  • Updated: 2024-10-17
  • Resolved: 2020-06-15
JDK 14          JDK 15          JDK 16
14.0.2 Fixed    15 b28 Fixed    16 Fixed
Description
(synopsis is provisional until the root cause is found)

$ CONF=linux-x86_64-server-fastdebug make images run-test TEST=serviceability/dcmd/gc/HeapDumpCompressedTest.java

Attempting to wait on monitor HProf Compression Backend/11 while holding lock CodeCache_lock/6 -- possible deadlock

#  Internal Error (/home/shade/trunks/jdk-jdk/src/hotspot/share/runtime/mutex.cpp:183), pid=80261, tid=80275
#  assert(false) failed: Shouldn't block(wait) while holding a lock of rank special
#
# JRE version: OpenJDK Runtime Environment (15.0) (fastdebug build 15-internal+0-adhoc.shade.jdk-jdk)
# Problematic frame:
# V  [libjvm.so+0x12004bb]  Monitor::assert_wait_lock_state(Thread*)+0x15b

Host: shade-desktop, AMD Ryzen Threadripper 3970X 32-Core Processor, 64 cores, 125G, Ubuntu 18.04.4 LTS
Time: Mon Jun 15 08:51:51 2020 CEST elapsed time: 1.326935 seconds (0d 0h 0m 1s)

Current thread (0x00007f065c2233a0):  VMThread "VM Thread" [stack: 0x00007f0640a63000,0x00007f0640b63000] [id=80275]

Stack: [0x00007f0640a63000,0x00007f0640b63000],  sp=0x00007f0640b60d40,  free space=1015k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x12004bb]  Monitor::assert_wait_lock_state(Thread*)+0x15b
V  [libjvm.so+0x1201877]  Monitor::wait_without_safepoint_check(long)+0x97
V  [libjvm.so+0xb91d7a]  CompressionBackend::get_new_buffer(char**, unsigned long*, unsigned long*)+0xfa
V  [libjvm.so+0xb850a8]  DumpWriter::finish_dump_segment() [clone .part.87]+0x58
V  [libjvm.so+0xb8522d]  DumpWriter::start_sub_record(unsigned char, unsigned int)+0x9d
V  [libjvm.so+0xb8a1bf]  DumperSupport::dump_object_array(DumpWriter*, objArrayOop)+0x6f
V  [libjvm.so+0xb8abd0]  HeapObjectDumper::do_object(oop)+0x300
V  [libjvm.so+0x14b8554]  ShenandoahHeap::object_iterate(ObjectClosure*)+0x244
V  [libjvm.so+0xb8635e]  VM_HeapDumper::work(unsigned int)+0x13e
V  [libjvm.so+0x17b7e83]  SemaphoreGangTaskDispatcher::coordinator_execute_on_workers(AbstractGangTask*, unsigned int, bool)+0x93
V  [libjvm.so+0x17b6fe6]  WorkGang::run_task(AbstractGangTask*, unsigned int, bool)+0xf6
V  [libjvm.so+0xb8274c]  VM_HeapDumper::doit()+0x11c
V  [libjvm.so+0x1744b5d]  VM_Operation::evaluate()+0x1cd
V  [libjvm.so+0x1773a5b]  VMThread::evaluate_operation(VM_Operation*) [clone .constprop.73]+0x13b
V  [libjvm.so+0x177449d]  VMThread::loop()+0x7bd
V  [libjvm.so+0x17748ea]  VMThread::run()+0xca
V  [libjvm.so+0x1681c46]  Thread::call_run()+0xf6
V  [libjvm.so+0x12ab52e]  thread_native_entry(Thread*)+0x10e

VM Mutex/Monitor currently owned by a thread:  ([mutex/lock_event])
[0x00007f065c025c90] CodeCache_lock - owner thread: 0x00007f065c2233a0
[0x00007f065c027b90] Threads_lock - owner thread: 0x00007f065c2233a0
[0x00007f065c028850] Heap_lock - owner thread: 0x00007f0628000ec0
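The assertion in the log above can be read as a lock-rank rule: a thread must not block in wait() while it still holds a low-ranked lock. The following is a minimal sketch of that rule, not HotSpot's implementation. RankedLock, kSpecialRank, and may_wait_while_holding are hypothetical names. The rank 6 is taken from the "CodeCache_lock/6" line in the log, and treating it as the "special" threshold is an assumption based on the assertion message.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a ranked VM lock; not a HotSpot type.
struct RankedLock {
    std::string name;
    int rank;
};

// Assumption: rank 6 plays the role of "special", inferred from the log line
// "CodeCache_lock/6" together with the assertion message.
constexpr int kSpecialRank = 6;

// Returns true when a thread holding the locks in `held` may block in wait().
// Model: waiting is refused while any held lock has rank <= kSpecialRank.
bool may_wait_while_holding(const std::vector<RankedLock>& held) {
    return std::none_of(held.begin(), held.end(),
                        [](const RankedLock& l) { return l.rank <= kSpecialRank; });
}
```

In this model, a thread that holds CodeCache_lock at rank 6 is refused the wait, which corresponds to the failure reported in the log.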
Comments
[~shade], could you please confirm that the defect is fixed in JDK 15 with this fix? If so, please close this bug as 'verified'.
15-07-2020

Changeset: e3b04bc1 Author: Aleksey Shipilev <shade@openjdk.org> Date: 2020-06-15 14:11:43 +0000 URL: https://git.openjdk.java.net/lanai/commit/e3b04bc1
02-07-2020

Changeset: e3b04bc1 Author: Aleksey Shipilev <shade@openjdk.org> Date: 2020-06-15 14:11:43 +0000 URL: https://git.openjdk.java.net/panama-foreign/commit/e3b04bc1
02-07-2020

Changeset: e3b04bc1 Author: Aleksey Shipilev <shade@openjdk.org> Date: 2020-06-15 14:11:43 +0000 URL: https://git.openjdk.java.net/amber/commit/e3b04bc1
02-07-2020

Fix Request (14u): This fixes a Shenandoah bug and provides the groundwork for subsequent backports. The patch applies cleanly to 14u and passes hotspot_gc_shenandoah and tier{1,2} with Shenandoah enabled. The patch is completely isolated in Shenandoah code.
16-06-2020

RFR: https://mail.openjdk.java.net/pipermail/shenandoah-dev/2020-June/012500.html
15-06-2020

I believe this is caused by ShenandoahHeap::object_iterate, which keeps a scoped root processor alive for the entire heap walk, so the locks associated with it stay held while the heap dumper runs. This might be the fix:

diff -r a39eb5a4f1c1 src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp
--- a/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp  Thu Jun 11 18:16:32 2020 +0200
+++ b/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp  Mon Jun 15 09:04:12 2020 +0200
@@ -1295,15 +1295,20 @@
   // Reset bitmap
   _aux_bit_map.clear();

   Stack<oop,mtGC> oop_stack;

-  // First, we process GC roots according to current GC cycle. This populates the work stack with initial objects.
-  ShenandoahHeapIterationRootScanner rp;
   ObjectIterateScanRootClosure oops(&_aux_bit_map, &oop_stack);

-  rp.roots_do(&oops);
+  {
+    // First, we process GC roots according to current GC cycle.
+    // This populates the work stack with initial objects.
+    // It is important to relinquish the associated locks before diving
+    // into heap dumper.
+    ShenandoahHeapIterationRootScanner rp;
+    rp.roots_do(&oops);
+  }

   // Work through the oop stack to traverse heap.
   while (! oop_stack.is_empty()) {
     oop obj = oop_stack.pop();
     assert(oopDesc::is_oop(obj), "must be a valid oop");
15-06-2020
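
The scoping idea in the proposed patch can be illustrated outside HotSpot. The following is a minimal sketch using std::mutex as a stand-in for the root locks. ScopedRootScanner, iterate, and the integer "roots" are hypothetical and only mirror the shape of the change: the scanner lives in an inner block, so its lock is released before the long-running walk starts.

```cpp
#include <functional>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a scoped root scanner: it holds a lock for its
// whole lifetime, as the comment above describes for the scoped root processor.
class ScopedRootScanner {
public:
    explicit ScopedRootScanner(std::mutex& roots_lock) : guard_(roots_lock) {}

    // Pushes two made-up root values onto the work stack.
    void roots_do(std::vector<int>& work_stack) {
        work_stack.push_back(1);
        work_stack.push_back(2);
    }

private:
    std::lock_guard<std::mutex> guard_;
};

// Walks the work stack and calls visit() for each item. The scanner is
// confined to an inner block, so roots_lock is released before any visit().
void iterate(std::mutex& roots_lock, const std::function<void(int)>& visit) {
    std::vector<int> work_stack;
    {
        ScopedRootScanner rp(roots_lock);
        rp.roots_do(work_stack);
    }  // rp is destroyed here and roots_lock is released

    while (!work_stack.empty()) {
        int item = work_stack.back();
        work_stack.pop_back();
        visit(item);  // may block; roots_lock is no longer held
    }
}
```

If the scanner were declared at function scope instead, the lock would stay held across every visit() call, which is the pattern the patch removes.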

URL: https://hg.openjdk.java.net/jdk/jdk15/rev/86a603d04e54 User: shade Date: 2020-06-15 12:12:22 +0000
15-06-2020