JDK-8231198 : Shenandoah: heap walking should visit all roots most of the time
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 8-shenandoah,11-shenandoah,13,14
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2019-09-18
  • Updated: 2019-09-26
  • Resolved: 2019-09-19
Fixed in:
  • JDK 14: 14 b16 (Fixed)
Description
$ CONF=linux-x86_64-server-fastdebug make images run-test TEST=vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t001/ TEST_VM_OPTS="-XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC"

Fails with:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/home/shade/trunks/shenandoah-jdk11/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.inline.hpp:262), pid=29055, tid=29081
#  Error: Shenandoah assert_correct failed; Forwardee must point to a heap address

Referenced from:
  interior location: 0x000000070edabb80
  inside Java heap
    not in collection set
  region: |    0|R  |BTE    70ed00000,    70ee00000,    70ee00000|TAMS    70ee00000|U  1024K|T  1024K|G     0B|S     0B|L     0B|CP   0|SN            1,        1,        0,        0

Object:
  0x000000070edc7718 - klass 0x0000000800001868 java.lang.String
    not allocated after mark start
    not marked 
    not in collection set
  mark: marked(0x0000000000000003)
  region: |    0|R  |BTE    70ed00000,    70ee00000,    70ee00000|TAMS    70ee00000|U  1024K|T  1024K|G     0B|S     0B|L     0B|CP   0|SN            1,        1,        0,        0

Forwardee:
  0x0000000000000000 - safe print, no details

---------------  T H R E A D  ---------------

Current thread (0x00007fbbfc003000):  GCTaskThread "Shenandoah GC Threads#2" [stack: 0x00007fbc001b2000,0x00007fbc002b2000] [id=29081]

Stack: [0x00007fbc001b2000,0x00007fbc002b2000],  sp=0x00007fbc002ae570,  free space=1009k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x194eec4]  VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x314
V  [libjvm.so+0x194fd5f]  VMError::report_and_die(Thread*, void*, char const*, int, char const*, char const*, __va_list_tag*)+0x2f
V  [libjvm.so+0xadb6f0]  report_vm_error(char const*, int, char const*, char const*, ...)+0x100
V  [libjvm.so+0x164d38e]  ShenandoahAsserts::print_failure(ShenandoahAsserts::SafeLevel, oop, void*, oop, char const*, char const*, char const*, int)+0x45e
V  [libjvm.so+0x164dc9a]  ShenandoahAsserts::assert_correct(void*, oop, char const*, int)+0x1da
V  [libjvm.so+0x164f717]  ShenandoahAsserts::assert_not_forwarded(void*, oop, char const*, int)+0x47
V  [libjvm.so+0x16892a5]  void ShenandoahConcurrentMark::mark_through_ref<unsigned int, (UpdateRefsMode)0, (StringDedupMode)0>(unsigned int*, ShenandoahHeap*, Padded<BufferedOverflowTaskQueue<ObjArrayChunkedTask, (MemoryType)5, 131072u>, 128ul>*, ShenandoahMarkingContext*)+0x75
V  [libjvm.so+0x1696773]  void objArrayOopDesc::oop_iterate_range<ShenandoahMarkRefsMetadataClosure>(ShenandoahMarkRefsMetadataClosure*, int, int)+0xf3
V  [libjvm.so+0x1696c82]  void ShenandoahConcurrentMark::do_chunked_array_start<ShenandoahMarkRefsMetadataClosure>(Padded<BufferedOverflowTaskQueue<ObjArrayChunkedTask, (MemoryType)5, 131072u>, 128ul>*, ShenandoahMarkRefsMetadataClosure*, oop)+0x2e2
V  [libjvm.so+0x16973a6]  void ShenandoahConcurrentMark::do_task<ShenandoahMarkRefsMetadataClosure>(Padded<BufferedOverflowTaskQueue<ObjArrayChunkedTask, (MemoryType)5, 131072u>, 128ul>*, ShenandoahMarkRefsMetadataClosure*, unsigned short*, ObjArrayChunkedTask*)+0x606
V  [libjvm.so+0x16978a3]  void ShenandoahConcurrentMark::mark_loop_work<ShenandoahMarkRefsMetadataClosure, true>(ShenandoahMarkRefsMetadataClosure*, unsigned short*, unsigned int, ShenandoahTaskTerminator*)+0x373
V  [libjvm.so+0x16a44f4]  void ShenandoahConcurrentMark::mark_loop_prework<true>(unsigned int, ShenandoahTaskTerminator*, ReferenceProcessor*, bool)+0x234
V  [libjvm.so+0x16a539c]  ShenandoahConcurrentMarkingTask::work(unsigned int)+0xbc
V  [libjvm.so+0x19cde80]  GangWorker::loop()+0xe0
V  [libjvm.so+0x187de6d]  Thread::call_run()+0x6d
V  [libjvm.so+0x147b856]  thread_native_entry(Thread*)+0x106

...

Event: 0.443 Executing VM operation: HeapWalkOperation
Event: 0.443 Protecting memory [0x00007fbc33366000,0x00007fbc33367000] with protection modes 0
Event: 0.443 Concurrent reset
Event: 0.443 Concurrent reset done
Event: 0.509 Pause Init Mark (process weakrefs) (unload classes)
Event: 0.510 Pause Init Mark (process weakrefs) (unload classes) done
Event: 0.510 Protecting memory [0x00007fbc33366000,0x00007fbc33367000] with protection modes 1
Event: 0.510 Executing VM operation: HeapWalkOperation done
Event: 0.510 Concurrent marking (process weakrefs) (unload classes)
<crash>
Comments
URL: https://hg.openjdk.java.net/jdk/jdk/rev/de9d23469c68 User: shade Date: 2019-09-19 18:27:59 +0000
19-09-2019

This is caused by JDK-8225550. JVMTI walks the heap, marking the objects it visits, and then calls object_iterate to fix up the mark words. Because of JDK-8225550, object_iterate does not visit weak roots, so objects reachable only through weak roots keep their marked mark words. The next GC cycle then discovers these already-marked objects when dealing with weak roots, and crashes.
19-09-2019
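
The failure mode described in the comment above can be illustrated with a toy model. This is a sketch under stated assumptions, not HotSpot code: ToyObject, walk, kUnmarked, kMarked and stale_marks_after_heap_walk are hypothetical names, and the 0x1/0x3 header values are chosen only to echo the "mark: marked(0x0000000000000003)" line in the crash log. The walk marks everything reachable from all roots; the fix-up pass either visits weak roots or skips them.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Toy header values (assumption): 0x1 = normal header, 0x3 = "marked" header.
constexpr std::uintptr_t kUnmarked = 0x1;
constexpr std::uintptr_t kMarked = 0x3;

// Hypothetical stand-in for a heap object; not a HotSpot type.
struct ToyObject {
    std::uintptr_t mark = kUnmarked;
    std::vector<ToyObject*> fields;  // outgoing references
};

// Visit every object reachable from `roots` exactly once and apply `fn` to it.
template <typename Fn>
void walk(const std::vector<ToyObject*>& roots, Fn fn) {
    std::vector<ToyObject*> stack(roots.begin(), roots.end());
    std::unordered_set<ToyObject*> seen;
    while (!stack.empty()) {
        ToyObject* obj = stack.back();
        stack.pop_back();
        if (!seen.insert(obj).second) continue;  // already visited
        fn(obj);
        for (ToyObject* field : obj->fields) stack.push_back(field);
    }
}

// Returns how many objects still carry the "marked" header after the
// heap walk and the subsequent fix-up pass.
int stale_marks_after_heap_walk(bool fixup_visits_weak_roots) {
    ToyObject strong, child, weak_only;
    strong.fields.push_back(&child);

    const std::vector<ToyObject*> strong_roots{&strong};
    const std::vector<ToyObject*> all_roots{&strong, &weak_only};
    const std::vector<ToyObject*> all_objects{&strong, &child, &weak_only};

    // Step 1: the heap walk marks every object it reaches from all roots.
    walk(all_roots, [](ToyObject* o) { o->mark = kMarked; });

    // Step 2: the fix-up pass restores headers, but only for objects it visits.
    walk(fixup_visits_weak_roots ? all_roots : strong_roots,
         [](ToyObject* o) { o->mark = kUnmarked; });

    int stale = 0;
    for (const ToyObject* o : all_objects) {
        if (o->mark == kMarked) ++stale;
    }
    return stale;
}

// Small demonstration of both cases.
void demo_stale_marks() {
    assert(stale_marks_after_heap_walk(false) == 1);  // weak_only keeps a stale mark
    assert(stale_marks_after_heap_walk(true) == 0);   // all headers restored
}
```

In this model, skipping weak roots during fix-up leaves exactly one object with a stale marked header, which is the kind of object the next marking cycle would trip over.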

It seems that VMThread does coalesced VM operations: first HeapWalkOperation, and then ShenandoahInitMark (not shown due to events logging bug, JDK-8231201). Something goes wrong in between. This helps to pass the test:

diff -r 99e3477e99b1 src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp
--- a/src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp	Wed Sep 18 20:26:25 2019 +0200
+++ b/src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp	Wed Sep 18 20:53:21 2019 +0200
@@ -99,7 +99,11 @@
     GCCause::Cause cause = GCCause::_last_gc_cause;
     ShenandoahHeap::ShenandoahDegenPoint degen_point = ShenandoahHeap::_degenerated_unset;
-    if (alloc_failure_pending) {
+    if (VMThread::vm_operation() != NULL) {
+      // VM operation is currently pending, avoid starting the cycle.
+      // This alleviates races against VM operations that walk the heap.
+      mode = none;
+    } else if (alloc_failure_pending) {
       // Allocation failure takes precedence: we have to deal with it first thing
       log_info(gc)("Trigger: Handle Allocation Failure");
18-09-2019

Exposing this failure requires the fix for JDK-8231197. Without that fix, the test fails earlier, with the failure that JDK-8231197 addresses.
18-09-2019