JDK-8256641 : CDS VM operations do not lock the heap
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 16
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2020-11-19
  • Updated: 2021-01-19
  • Resolved: 2020-12-11
JDK 16: 16 b29, Fixed
JDK 17: 17, Fixed
Description
The CDS VM operations VM_PopulateDumpSharedSpace and VM_PopulateDynamicDumpSharedSpace, as well as VM_Verify (which CDS code executes directly in one place), optionally perform heap verification, but they do not take the Heap_lock when they start.

A GC VM operation that allocates memory may temporarily expose, as its result, a raw pointer to heap memory that still contains garbage. Any VM operation that iterates over the whole heap must therefore also be guarded by the Heap_lock to get a consistent view.

Otherwise the VM operation that iterates over the heap can come across uninitialized memory and crash as described in this bug.

Since that verification is only enabled in debug builds, product builds are not affected.

Original description:
----------------------------

Happened in gh actions when testing for JDK-8255978: "[windows] os::release_memory may not release the full range" (https://github.com/openjdk/jdk/pull/1143, see https://github.com/tstuefe/jdk/runs/1423207838?check_suite_focus=true)

I am quite sure this has nothing to do with my change.

```
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fd88f32719d, pid=21277, tid=21302
#
# JRE version: OpenJDK Runtime Environment (16.0) (fastdebug build 16-internal+0-tstuefe-412da0658c96b4924aa425c3cabbe90543ee5d63)
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 16-internal+0-tstuefe-412da0658c96b4924aa425c3cabbe90543ee5d63, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xaf519d]  HeapRegion::block_size(HeapWordImpl* const*) const+0x16d
#
# CreateCoredumpOnCrash turned off, no core file dumped
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#

---------------  S U M M A R Y ------------

Command Line: -Dtest.vm.opts=-XX:MaxRAMPercentage=25 -Djava.io.tmpdir=/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/tmp -Dtest.tool.vm.opts=-J-XX:MaxRAMPercentage=25 -J-Djava.io.tmpdir=/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/tmp -Dtest.compiler.opts= -Dtest.java.opts=-XX:-CreateCoredumpOnCrash -Dtest.jdk=/home/runner/jdk-linux-x64-debug/jdk-16-internal+0_linux-x64_bin-debug/jdk-16/fastdebug -Dcompile.jdk=/home/runner/jdk-linux-x64-debug/jdk-16-internal+0_linux-x64_bin-debug/jdk-16/fastdebug -Dtest.timeout.factor=4.0 -Dtest.nativepath=/home/runner/jdk-linux-x64-debug/jdk-16-internal+0_linux-x64_bin-tests-debug/hotspot/jtreg/native -Dtest.root=/home/runner/work/jdk/jdk/test/hotspot/jtreg -Dtest.name=runtime/handshake/AsyncHandshakeWalkStackTest.java -Dtest.file=/home/runner/work/jdk/jdk/test/hotspot/jtreg/runtime/handshake/AsyncHandshakeWalkStackTest.java -Dtest.src=/home/runner/work/jdk/jdk/test/hotspot/jtreg/runtime/handshake -Dtest.src.path=/home/runner/work/jdk/jdk/test/hotspot/jtreg/runtime/handshake:/home/runner/work/jdk/jdk/test/hotspot/jtreg/testlibrary:/home/runner/work/jdk/jdk/test/lib -Dtest.classes=/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/classes/runtime/handshake/AsyncHandshakeWalkStackTest.d -Dtest.class.path=/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/classes/runtime/handshake/AsyncHandshakeWalkStackTest.d:/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/classes/testlibrary:/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/classes/test/lib 
-Dtest.class.path.prefix=/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/classes/runtime/handshake/AsyncHandshakeWalkStackTest.d:/home/runner/work/jdk/jdk/test/hotspot/jtreg/runtime/handshake:/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/classes/testlibrary:/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/classes/test/lib -XX:MaxRAMPercentage=25 -Djava.io.tmpdir=/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/tmp -XX:-CreateCoredumpOnCrash -Djava.library.path=/home/runner/jdk-linux-x64-debug/jdk-16-internal+0_linux-x64_bin-tests-debug/hotspot/jtreg/native -Xbootclasspath/a:. -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI com.sun.javatest.regtest.agent.MainWrapper /home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/runtime/handshake/AsyncHandshakeWalkStackTest.d/main.0.jta

Host: fv-az58-519, Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz, 2 cores, 6G, Ubuntu 18.04.5 LTS
Time: Thu Nov 19 09:28:28 2020 UTC elapsed time: 0.376932 seconds (0d 0h 0m 0s)

---------------  T H R E A D  ---------------

Current thread (0x00007fd854006600):  GCTaskThread "GC Thread#1" [stack: 0x00007fd85cfb8000,0x00007fd85d0b8000] [id=21302]

Stack: [0x00007fd85cfb8000,0x00007fd85d0b8000],  sp=0x00007fd85d0b6ad0,  free space=1018k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xaf519d]  HeapRegion::block_size(HeapWordImpl* const*) const+0x16d
V  [libjvm.so+0xc761bb]  HeapRegion::verify(VerifyOption, bool*) const+0x13b
V  [libjvm.so+0xb854b8]  VerifyRegionClosure::do_heap_region(HeapRegion*)+0xc8
V  [libjvm.so+0xc83c72]  HeapRegionManager::par_iterate(HeapRegionClosure*, HeapRegionClaimer*, unsigned int) const+0x72
V  [libjvm.so+0xb8199c]  G1ParVerifyTask::work(unsigned int)+0x3c
V  [libjvm.so+0x191368c]  GangWorker::loop()+0xac
V  [libjvm.so+0x17c61a8]  Thread::call_run()+0xf8
V  [libjvm.so+0x13b1b6e]  thread_native_entry(Thread*)+0x10e


siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000dd56dd5fc
```

The following test failed in the JDK16 CI:

runtime/handshake/AsyncHandshakeWalkStackTest.java
Comments
Changeset: bacf22b9
Author: Thomas Schatzl <tschatzl@openjdk.org>
Date: 2020-12-11 18:14:37 +0000
URL: https://git.openjdk.java.net/jdk16/commit/bacf22b9
11-12-2020

Here are snippets from the hs_err_pid for the jdk-16+28-1937-tier4 sighting:
# SIGSEGV (0xb) at pc=0x00007f4216795017, pid=12426, tid=12636
#
# JRE version: Java(TM) SE Runtime Environment (16.0+28) (fastdebug build 16-ea+28-1937)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 16-ea+28-1937, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0x870017] oopDesc::size_given_klass(Klass*)+0x17
<snip>
--------------- T H R E A D ---------------
Current thread (0x00007f41dc00c310): GCTaskThread "GC Thread#6" [stack: 0x00007f41e50ec000,0x00007f41e51ec000] [id=12636]
Stack: [0x00007f41e50ec000,0x00007f41e51ec000], sp=0x00007f41e51eaae0, free space=1018k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x870017] oopDesc::size_given_klass(Klass*)+0x17
V [libjvm.so+0xd8e707] HeapRegion::verify(VerifyOption, bool*) const+0x187
V [libjvm.so+0xc8d42b] VerifyRegionClosure::do_heap_region(HeapRegion*)+0xdb
V [libjvm.so+0xd9d7fa] HeapRegionManager::par_iterate(HeapRegionClosure*, HeapRegionClaimer*, unsigned int) const+0x6a
V [libjvm.so+0xc8726d] G1ParVerifyTask::work(unsigned int)+0x3d
V [libjvm.so+0x1a1c0a4] GangWorker::run_task(WorkData)+0x84
V [libjvm.so+0x1a1c1e4] GangWorker::loop()+0x44
V [libjvm.so+0x18b1f80] Thread::call_run()+0x100
V [libjvm.so+0x15952e6] thread_native_entry(Thread*)+0x116
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000dd56dd5fc
04-12-2020

Here's a snippet from the hs_err_pid for the jdk-16+28-1945-tier3 sighting:
# SIGSEGV (0xb) at pc=0x00007f152e08e017, pid=29840, tid=29844
#
# JRE version: Java(TM) SE Runtime Environment (16.0+28) (fastdebug build 16-ea+28-1945)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 16-ea+28-1945, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0x870017] oopDesc::size_given_klass(Klass*)+0x17
<snip>
--------------- T H R E A D ---------------
Current thread (0x00007f1528075f60): GCTaskThread "GC Thread#0" [stack: 0x00007f152c26f000,0x00007f152c36f000] [id=29844]
Stack: [0x00007f152c26f000,0x00007f152c36f000], sp=0x00007f152c36dae0, free space=1018k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x870017] oopDesc::size_given_klass(Klass*)+0x17
V [libjvm.so+0xd8e707] HeapRegion::verify(VerifyOption, bool*) const+0x187
V [libjvm.so+0xc8d42b] VerifyRegionClosure::do_heap_region(HeapRegion*)+0xdb
V [libjvm.so+0xd9d7fa] HeapRegionManager::par_iterate(HeapRegionClosure*, HeapRegionClaimer*, unsigned int) const+0x6a
V [libjvm.so+0xc8726d] G1ParVerifyTask::work(unsigned int)+0x3d
V [libjvm.so+0x1a1c134] GangWorker::run_task(WorkData)+0x84
V [libjvm.so+0x1a1c274] GangWorker::loop()+0x44
V [libjvm.so+0x18b2010] Thread::call_run()+0x100
V [libjvm.so+0x1595376] thread_native_entry(Thread*)+0x116
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000dd56dd5fc
04-12-2020

In addition to VM_PopulateDumpSharedSpace and VM_Verify, VM_PopulateDynamicDumpSharedSpace also calls verification and so needs to synchronize with GCs.
02-12-2020

I've created an easy reproducer. See stressme.sh in the attachment of this bug. I got about 7 crashes after running for a couple of minutes.
02-12-2020

I looked at the latest crash. The VM_Verify is called from CDS (HeapShared::init_archived_fields_for() -> verify_the_heap()). This code was added recently (JDK-8253081), and it runs a few VM_Verify ops during VM bootstrap. This happens while another Java thread is doing an anewarray, which is probably its first allocation so it tries to initialize the TLAB, using a VM_G1CollectForAllocation. So the chance of these two kinds of VMOps colliding has been greatly increased since JDK-8253081.
01-12-2020

Adding a Heap_lock to VM_Verify and VM_Populate... (actually making it unconditional for the latter) removes the crashes on the reproducer after ~1000 iterations of above test.
01-12-2020

From [~stefank]: VM_GC_operations achieve mutual exclusion by acquiring the Heap_lock. Neither VM_Verify nor VM_PopulateDynamicDumpSharedSpaces does that. I.e. for the latter, this path:
G1HeapVerifier::verify
G1CollectedHeap::verify
Universe::verify
Universe::verify
DynamicArchiveBuilder::verify_universe
DynamicArchiveBuilder::doit
VM_PopulateDynamicDumpSharedSpaces::doit
01-12-2020

In the crashes there is always one thread that just executed a VM_G1CollectForAllocation and is waiting to return from VMThread::execute().
Thread 23 (LWP 14707):
#0 0x00007ffa724ebc89 in syscall () from /home/tschatzl/Downloads/x/x/4/lib64/libc.so.6
#1 0x00007ffa71cb62f3 in futex (op_arg=79, futex_op=128, addr=0x7ffa48000af0) at /src/hotspot/os/linux/waitBarrier_linux.cpp:69
#2 LinuxWaitBarrier::wait (this=this@entry=0x7ffa48000ae8, barrier_tag=barrier_tag@entry=79) at /src/hotspot/os/linux/waitBarrier_linux.cpp:69
#3 0x00007ffa719aa927 in WaitBarrierType<LinuxWaitBarrier>::wait (barrier_tag=79, this=0x7ffa48000ae0) at /src/hotspot/share/runtime/thread.hpp:854
#4 SafepointSynchronize::block (thread=0x7ffa04000c60) at /src/hotspot/share/runtime/safepoint.cpp:721
#5 0x00007ffa719b6965 in SafepointMechanism::process (thread=0x7ffa04000c60) at /src/hotspot/share/runtime/safepointMechanism.cpp:83
#6 SafepointMechanism::process_if_requested_slow (thread=0x7ffa04000c60) at /src/hotspot/share/runtime/safepointMechanism.cpp:136
#7 0x00007ffa71790c20 in SafepointMechanism::process_if_requested (thread=<optimized out>) at /src/hotspot/share/runtime/safepointMechanism.inline.hpp:80
#8 ThreadBlockInVMWithDeadlockCheck::~ThreadBlockInVMWithDeadlockCheck (this=0x7ffa13bf9ef0, __in_chrg=<optimized out>) at /src/hotspot/share/runtime/interfaceSupport.inline.hpp:279
#9 Monitor::wait (this=0x7ffa68025dd0, timeout=timeout@entry=0, as_suspend_equivalent=as_suspend_equivalent@entry=false) at /src/hotspot/share/runtime/mutex.cpp:230
#10 0x00007ffa71c9f756 in MonitorLocker::wait (this=0x7ffa13bf9f50, this=0x7ffa13bf9f50, timeout=0, as_suspend_equivalent=false) at /src/hotspot/share/runtime/mutexLocker.hpp:259
#11 VMThread::wait_until_executed (op=op@entry=0x7ffa13bfa030) at /src/hotspot/share/runtime/vmThread.cpp:362
#12 0x00007ffa71ca0a09 in VMThread::execute (op=op@entry=0x7ffa13bfa030) at /src/hotspot/share/runtime/vmThread.cpp:531
#13 0x00007ffa70e52493 in G1CollectedHeap::do_collection_pause (this=this@entry=0x7ffa6803b520, word_size=word_size@entry=20002, gc_count_before=gc_count_before@entry=22, succeeded=succeeded@entry=0x7ffa13bfa120, gc_cause=gc_cause@entry=GCCause::_g1_inc_collection_pause) at
Meanwhile another thread issued a VM_Verify operation that crashes with that error.
01-12-2020

Can be fairly easily reproduced (around 1 in 10) with the runtime/valhalla/inlinetypes/InlineOops.java test in the valhalla repo. Best to run with the whole directory (runtime/valhalla/inlinetypes/) as test to get some additional load.
01-12-2020

I think the use of VM_Verify there is okay. Since the size of that chunk of dead data equals MinTLABSize, one possibility would be some non-Java thread having a TLAB that is not properly handled in CollectedHeap::prepare_for_verify(). This would result in a non-parseable chunk of memory. This would be my best-fitting theory so far, but it does not look that way digging through the core file.
24-11-2020

FDBigInteger is initialized when an app uses it (directly or indirectly), after the VM bootstrap has completed, so it can happen at an arbitrary point of program execution. Maybe we are in a situation where this is not safe to do?
static void verify_the_heap(Klass* k, const char* which) {
  if (VerifyArchivedFields) { <--- change to if (VerifyArchivedFields && VerifyDuringStartup) ???
    ResourceMark rm;
    log_info(cds, heap)("Verify heap %s initializing static field(s) in %s", which, k->external_name());
    VM_Verify verify_op;
    VMThread::execute(&verify_op);
I searched the VM code, and VM_Verify seems to be used under restricted situations only:
thread.cpp:
  if (VerifyDuringStartup) {
    // Make sure we're starting with a clean slate.
    VM_Verify verify_op;
    VMThread::execute(&verify_op);
  }
zDriver.cpp:
void ZDriver::pause_verify() {
  if (VerifyBeforeGC || VerifyDuringGC || VerifyAfterGC) {
    // Full verification
    VM_Verify op;
    VMThread::execute(&op);
  }
23-11-2020

The memory of the eden region is filled until address bottom()+0x800 (=MinTLABSize) with 0xbaadbabe, i.e. no kind of header/filler at that location.
23-11-2020

JDK-8253081 introduced the code that causes this verification failure, so linking it. The situation looks like the following:
- java code calls jdk.internal.misc.CDS.initializeFromArchive
  java.lang.Thread.State: RUNNABLE
  JavaThread state: _thread_blocked
  - jdk.internal.misc.CDS.initializeFromArchive(java.lang.Class) @bci=0 (Interpreted frame)
  - jdk.internal.math.FDBigInteger.<clinit>() @bci=18, line=87 (Interpreted frame)
  - jdk.internal.math.FloatingDecimal$BinaryToASCIIBuffer.dtoa(int, long, int, boolean) @bci=93, line=448 (Interpreted frame)
  - jdk.internal.math.FloatingDecimal.getBinaryToASCIIConverter(double, boolean) @bci=177, line=1785 (Interpreted frame)
HeapShared::init_archived_fields_for(Klass* k, const ArchivedKlassSubGraphInfoRecord* record, TRAPS) {
does heap verification that fails at the "before" verification:
verify_the_heap(k, "before");
Failure occurs always in the first object in the (only) eden region. At least one GC happened before this; the region has apparently been cleared (contains 0xbaadbabe), but top() > bottom() for that region. Also a GC operation is pending at the same time.
23-11-2020

Here's the crashing stack from the jdk-16+26-1702-tier6 sighting:
--------------- T H R E A D ---------------
Current thread (0x00007fb45c008640): GCTaskThread "GC Thread#2" [stack: 0x00007fb4078b2000,0x00007fb4079b2000] [id=2874]
Stack: [0x00007fb4078b2000,0x00007fb4079b2000], sp=0x00007fb4079b0ae0, free space=1018k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x85a717] oopDesc::size_given_klass(Klass*)+0x17
V [libjvm.so+0xd71047] HeapRegion::verify(VerifyOption, bool*) const+0x187
V [libjvm.so+0xc7161b] VerifyRegionClosure::do_heap_region(HeapRegion*)+0xdb
V [libjvm.so+0xd802fa] HeapRegionManager::par_iterate(HeapRegionClosure*, HeapRegionClaimer*, unsigned int) const+0x6a
V [libjvm.so+0xc6b45d] G1ParVerifyTask::work(unsigned int)+0x3d
V [libjvm.so+0x19f7204] GangWorker::run_task(WorkData)+0x84
V [libjvm.so+0x19f7344] GangWorker::loop()+0x44
V [libjvm.so+0x1893db0] Thread::call_run()+0x100
V [libjvm.so+0x1574796] thread_native_entry(Thread*)+0x116
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000dd56dd5fc
22-11-2020

Here's the crashing stack from the jdk-16+26-1690-tier6 sighting:
--------------- T H R E A D ---------------
Current thread (0x00007f04c400ffa0): GCTaskThread "GC Thread#3" [stack: 0x00007f04dc8c2000,0x00007f04dc9c2000] [id=31492]
Stack: [0x00007f04dc8c2000,0x00007f04dc9c2000], sp=0x00007f04dc9c0ae0, free space=1018k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x85a717] oopDesc::size_given_klass(Klass*)+0x17
V [libjvm.so+0xd71047] HeapRegion::verify(VerifyOption, bool*) const+0x187
V [libjvm.so+0xc7161b] VerifyRegionClosure::do_heap_region(HeapRegion*)+0xdb
V [libjvm.so+0xd802fa] HeapRegionManager::par_iterate(HeapRegionClosure*, HeapRegionClaimer*, unsigned int) const+0x6a
V [libjvm.so+0xc6b45d] G1ParVerifyTask::work(unsigned int)+0x3d
V [libjvm.so+0x19f8464] GangWorker::run_task(WorkData)+0x84
V [libjvm.so+0x19f85a4] GangWorker::loop()+0x44
V [libjvm.so+0x1893d30] Thread::call_run()+0x100
V [libjvm.so+0x1574796] thread_native_entry(Thread*)+0x116
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000dd56dd5fc
21-11-2020

Here's the crashing stack from the jdk-16+26-1690-tier5 sighting:
--------------- T H R E A D ---------------
Current thread (0x00007fcbe4031b90): GCTaskThread "GC Thread#3" [stack: 0x00007fcbedbfb000,0x00007fcbedcfb000] [id=19878]
Stack: [0x00007fcbedbfb000,0x00007fcbedcfb000], sp=0x00007fcbedcf9ae0, free space=1018k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x85a717] oopDesc::size_given_klass(Klass*)+0x17
V [libjvm.so+0xd71047] HeapRegion::verify(VerifyOption, bool*) const+0x187
V [libjvm.so+0xc7161b] VerifyRegionClosure::do_heap_region(HeapRegion*)+0xdb
V [libjvm.so+0xd802fa] HeapRegionManager::par_iterate(HeapRegionClosure*, HeapRegionClaimer*, unsigned int) const+0x6a
V [libjvm.so+0xc6b45d] G1ParVerifyTask::work(unsigned int)+0x3d
V [libjvm.so+0x19f8464] GangWorker::run_task(WorkData)+0x84
V [libjvm.so+0x19f85a4] GangWorker::loop()+0x44
V [libjvm.so+0x1893d30] Thread::call_run()+0x100
V [libjvm.so+0x1574796] thread_native_entry(Thread*)+0x116
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000dd56dd5fc
21-11-2020

Here's the crashing stack from the jdk-16+26-1683-tier1 sighting:
--------------- T H R E A D ---------------
Current thread (0x00007f316c00b2c0): GCTaskThread "GC Thread#5" [stack: 0x00007f31757e8000,0x00007f31758e8000] [id=20933]
Stack: [0x00007f31757e8000,0x00007f31758e8000], sp=0x00007f31758e6ae0, free space=1018k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x85a767] oopDesc::size_given_klass(Klass*)+0x17
V [libjvm.so+0xd71207] HeapRegion::verify(VerifyOption, bool*) const+0x187
V [libjvm.so+0xc717db] VerifyRegionClosure::do_heap_region(HeapRegion*)+0xdb
V [libjvm.so+0xd804ba] HeapRegionManager::par_iterate(HeapRegionClosure*, HeapRegionClaimer*, unsigned int) const+0x6a
V [libjvm.so+0xc6b61d] G1ParVerifyTask::work(unsigned int)+0x3d
V [libjvm.so+0x19f8744] GangWorker::run_task(WorkData)+0x84
V [libjvm.so+0x19f8884] GangWorker::loop()+0x44
V [libjvm.so+0x1894010] Thread::call_run()+0x100
V [libjvm.so+0x1574916] thread_native_entry(Thread*)+0x116
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000dd56dd5fc
20-11-2020

Most recent larger change to G1 included in this tier1 run (I think) has been JDK-8253081.
20-11-2020

Here's the crashing stack from the jdk-16+26-1668-tier3 sighting:
--------------- T H R E A D ---------------
Current thread (0x00007fbe90073220): GCTaskThread "GC Thread#0" [stack: 0x00007fbe74ca9000,0x00007fbe74da9000] [id=16516]
Stack: [0x00007fbe74ca9000,0x00007fbe74da9000], sp=0x00007fbe74da7ae0, free space=1018k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x85a7d7] oopDesc::size_given_klass(Klass*)+0x17
V [libjvm.so+0xd71437] HeapRegion::verify(VerifyOption, bool*) const+0x187
V [libjvm.so+0xc71a0b] VerifyRegionClosure::do_heap_region(HeapRegion*)+0xdb
V [libjvm.so+0xd806ea] HeapRegionManager::par_iterate(HeapRegionClosure*, HeapRegionClaimer*, unsigned int) const+0x6a
V [libjvm.so+0xc6b84d] G1ParVerifyTask::work(unsigned int)+0x3d
V [libjvm.so+0x19f89a4] GangWorker::run_task(WorkData)+0x84
V [libjvm.so+0x19f8ae4] GangWorker::loop()+0x44
V [libjvm.so+0x1894210] Thread::call_run()+0x100
V [libjvm.so+0x1574a26] thread_native_entry(Thread*)+0x116
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000dd56dd5fc
19-11-2020

I also got this running GHA for PR:
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f998894e90d, pid=21426, tid=21451
#
# JRE version: OpenJDK Runtime Environment (16.0) (fastdebug build 16-internal+0-dholmes-ora-50ae08472f72c9e92a578670e4411575f0529ef3)
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 16-internal+0-dholmes-ora-50ae08472f72c9e92a578670e4411575f0529ef3, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0xaf490d] HeapRegion::block_size(HeapWordImpl* const*) const+0x16d
--------------- S U M M A R Y ------------
Command Line: -Dtest.vm.opts=-XX:MaxRAMPercentage=25 -Djava.io.tmpdir=/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/tmp -Dtest.tool.vm.opts=-J-XX:MaxRAMPercentage=25 -J-Djava.io.tmpdir=/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/tmp -Dtest.compiler.opts= -Dtest.java.opts=-XX:-CreateCoredumpOnCrash -Dtest.jdk=/home/runner/jdk-linux-x64-debug/jdk-16-internal+0_linux-x64_bin-debug/jdk-16/fastdebug -Dcompile.jdk=/home/runner/jdk-linux-x64-debug/jdk-16-internal+0_linux-x64_bin-debug/jdk-16/fastdebug -Dtest.timeout.factor=4.0 -Dtest.nativepath=/home/runner/jdk-linux-x64-debug/jdk-16-internal+0_linux-x64_bin-tests-debug/hotspot/jtreg/native -Dtest.root=/home/runner/work/jdk/jdk/test/hotspot/jtreg -Dtest.name=runtime/handshake/AsyncHandshakeWalkStackTest.java -Dtest.file=/home/runner/work/jdk/jdk/test/hotspot/jtreg/runtime/handshake/AsyncHandshakeWalkStackTest.java -Dtest.src=/home/runner/work/jdk/jdk/test/hotspot/jtreg/runtime/handshake -Dtest.src.path=/home/runner/work/jdk/jdk/test/hotspot/jtreg/runtime/handshake:/home/runner/work/jdk/jdk/test/hotspot/jtreg/testlibrary:/home/runner/work/jdk/jdk/test/lib -Dtest.classes=/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/classes/runtime/handshake/AsyncHandshakeWalkStackTest.d -Dtest.class.path=/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/classes/runtime/handshake/AsyncHandshakeWalkStackTest.d:/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/classes/testlibrary:/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/classes/test/lib -Dtest.class.path.prefix=/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/classes/runtime/handshake/AsyncHandshakeWalkStackTest.d:/home/runner/work/jdk/jdk/test/hotspot/jtreg/runtime/handshake:/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/classes/testlibrary:/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/classes/test/lib -XX:MaxRAMPercentage=25 -Djava.io.tmpdir=/home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/tmp -XX:-CreateCoredumpOnCrash -Djava.library.path=/home/runner/jdk-linux-x64-debug/jdk-16-internal+0_linux-x64_bin-tests-debug/hotspot/jtreg/native -Xbootclasspath/a:. -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI com.sun.javatest.regtest.agent.MainWrapper /home/runner/work/jdk/jdk/build/run-test-prebuilt/test-support/jtreg_test_hotspot_jtreg_tier1_runtime/runtime/handshake/AsyncHandshakeWalkStackTest.d/main.0.jta
Host: fv-az70-540, Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz, 2 cores, 6G, Ubuntu 18.04.5 LTS
Time: Thu Nov 19 08:45:15 2020 UTC elapsed time: 0.411586 seconds (0d 0h 0m 0s)
--------------- T H R E A D ---------------
Current thread (0x00007f9948006600): GCTaskThread "GC Thread#1" [stack: 0x00007f994e5bb000,0x00007f994e6bb000] [id=21451]
Stack: [0x00007f994e5bb000,0x00007f994e6bb000], sp=0x00007f994e6b9ad0, free space=1018k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xaf490d] HeapRegion::block_size(HeapWordImpl* const*) const+0x16d
V [libjvm.so+0xc7592b] HeapRegion::verify(VerifyOption, bool*) const+0x13b
V [libjvm.so+0xb84c28] VerifyRegionClosure::do_heap_region(HeapRegion*)+0xc8
V [libjvm.so+0xc833e2] HeapRegionManager::par_iterate(HeapRegionClosure*, HeapRegionClaimer*, unsigned int) const+0x72
V [libjvm.so+0xb8110c] G1ParVerifyTask::work(unsigned int)+0x3c
V [libjvm.so+0x1912ebc] GangWorker::loop()+0xac
V [libjvm.so+0x17c59d8] Thread::call_run()+0xf8
V [libjvm.so+0x13b118e] thread_native_entry(Thread*)+0x10e
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000dd56dd5fc
Register to memory mapping:
RAX=0x00007f9989f7b1a0: <offset 0x00000000021211a0> in /home/runner/jdk-linux-x64-debug/jdk-16-internal+0_linux-x64_bin-debug/jdk-16/fastdebug/lib/server/libjvm.so at 0x00007f9987e5a000
RBX= [error occurred during error reporting (printing register info), id 0xb, SIGSEGV (0xb) at pc=0x00007f9988397057]
Registers:
RAX=0x00007f9989f7b1a0, RBX=0x000000009a100000, RCX=0x0000000000000003, RDX=0x000003000001300d
RSP=0x00007f994e6b9ad0, RBP=0x00007f994e6b9b90, RSI=0x000000009a100000, RDI=0x00007f9980041ff0
R8 =0x000000009a100000, R9 =0x00007f99800da150, R10=0x0000000dd56dd5f0, R11=0x0000000000001907
R12=0x00007f998a007970, R13=0x00000000baadbabe, R14=0x00007f99800da150, R15=0x0000000dd56dd5f0
RIP=0x00007f998894e90d, EFLAGS=0x0000000000010246, CSGSFS=0x002b000000000033, ERR=0x0000000000000004
TRAPNO=0x000000000000000e
19-11-2020

0x0000000dd56dd5fc and variants are 0xBAADBABE in a (compressed) pointer. Since it is in the context of block_size(), most likely a bad compressed klass pointer.
19-11-2020

The jdk-16+25-1652-tier5 sighting in sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java crashed with the following thread stack:
--------------- T H R E A D ---------------
Current thread (0x00007fcac94b9410): GCTaskThread "GC Thread#1" [stack: 0x000070000d1f5000,0x000070000d2f5000] [id=41483]
Stack: [0x000070000d1f5000,0x000070000d2f5000], sp=0x000070000d2f4b70, free space=1022k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.dylib+0x4bea37] oopDesc::size_given_klass(Klass*)+0x17
V [libjvm.dylib+0x709be1] HeapRegion::block_size(HeapWordImpl* const*) const+0x201
V [libjvm.dylib+0x841c80] HeapRegion::verify(VerifyOption, bool*) const+0x220
V [libjvm.dylib+0x77bb94] VerifyRegionClosure::do_heap_region(HeapRegion*)+0x404
V [libjvm.dylib+0x8507f5] HeapRegionManager::par_iterate(HeapRegionClosure*, HeapRegionClaimer*, unsigned int) const+0x135
V [libjvm.dylib+0x77b77a] G1ParVerifyTask::work(unsigned int)+0x3a
V [libjvm.dylib+0x11a88fd] GangWorker::run_task(WorkData)+0x5d
V [libjvm.dylib+0x11a89e3] GangWorker::loop()+0x53
V [libjvm.dylib+0x109bcb7] Thread::call_run()+0x1b7
V [libjvm.dylib+0xe5c8cf] thread_native_entry(Thread*)+0x15f
C [libsystem_pthread.dylib+0x32eb] _pthread_body+0x7e
C [libsystem_pthread.dylib+0x6249] _pthread_start+0x42
C [libsystem_pthread.dylib+0x240d] thread_start+0xd
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000dd56dd5fc
19-11-2020

This failure mode has been seen on linux-X64 and macOSX-X64 machines.
19-11-2020

Bumping this from P4 -> P2 since this failure mode has shown up in a Mach5 Tier1 job set.
19-11-2020

Here's a snippet from the log file for the jdk-16+26-1664-tier1 sighting:
#section:main
----------messages:(4/319)----------
command: main -Xbootclasspath/a:. -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI AsyncHandshakeWalkStackTest
reason: User specified action: run main/othervm -Xbootclasspath/a:. -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI AsyncHandshakeWalkStackTest
Mode: othervm [/othervm specified]
elapsed time (seconds): 5.27
----------configuration:(0/0)----------
----------System.out:(20/1065)----------
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x000000010debebb7, pid=83756, tid=38915
#
# JRE version: Java(TM) SE Runtime Environment (16.0+26) (fastdebug build 16-ea+26-1664)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 16-ea+26-1664, mixed mode, sharing, tiered, compressed oops, g1 gc, bsd-amd64)
# Problematic frame:
# V [libjvm.dylib+0x4bebb7] oopDesc::size_given_klass(Klass*)+0x17
#
# Core dump will be written. Default location: core.83756
# Unsupported internal testing APIs have been used.
# An error report file with more information is saved as:
# /System/Volumes/Data/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S435659/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/839ad194-6889-402a-8a67-84beae92e26d/runs/9ff450f7-b278-4f64-b1c5-b9a73a2b4768/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_runtime/scratch/0/hs_err_pid83756.log
#
# If you would like to submit a bug report, please visit:
# https://bugreport.java.com/bugreport/crash.jsp
#
----------System.err:(0/0)----------
----------rerun:(36/8552)*----------
Here's the crashing thread's stack:
--------------- T H R E A D ---------------
Current thread (0x00007ff1b9549180): GCTaskThread "GC Thread#5" [stack: 0x0000700005c83000,0x0000700005d83000] [id=38915]
Stack: [0x0000700005c83000,0x0000700005d83000], sp=0x0000700005d82c10, free space=1023k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.dylib+0x4bebb7] oopDesc::size_given_klass(Klass*)+0x17
V [libjvm.dylib+0x709d11] HeapRegion::block_size(HeapWordImpl* const*) const+0x201
V [libjvm.dylib+0x843210] HeapRegion::verify(VerifyOption, bool*) const+0x220
V [libjvm.dylib+0x77b904] VerifyRegionClosure::do_heap_region(HeapRegion*)+0x404
V [libjvm.dylib+0x851d85] HeapRegionManager::par_iterate(HeapRegionClosure*, HeapRegionClaimer*, unsigned int) const+0x135
V [libjvm.dylib+0x77b4ea] G1ParVerifyTask::work(unsigned int)+0x3a
V [libjvm.dylib+0x11ab0cd] GangWorker::run_task(WorkData)+0x5d
V [libjvm.dylib+0x11ab1b3] GangWorker::loop()+0x53
V [libjvm.dylib+0x109e847] Thread::call_run()+0x1b7
V [libjvm.dylib+0xe5f51f] thread_native_entry(Thread*)+0x15f
C [libsystem_pthread.dylib+0x6109] _pthread_start+0x94
C [libsystem_pthread.dylib+0x1b8b] thread_start+0xf
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000dd56dd5fc
19-11-2020