JDK-8288970 : G1 does not keep weak nmethod oops alive
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 8,11,17,18,19,20
  • Priority: P2
  • Status: Resolved
  • Resolution: Not an Issue
  • Submitted: 2022-06-22
  • Updated: 2022-11-28
  • Resolved: 2022-11-01
JDK 20: 20 (Resolved)
Description
Summary after investigation (details can be found in the comment section):
This is an issue with the implementation of the @Stable annotation in scenarios where the field is initialized multiple times due to a race condition between threads. In this case, indy string concat code lazily initializes the stable NEW_STRING field holding a MethodHandle but multiple threads can succeed in writing to the field, see (https://github.com/openjdk/jdk/blob/5cdb4b196047d4f2d69df0fc73102c102bf042f7/src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java#L855). Now according to the comments in Stable.java, this could simply be treated as a user error:

 * A field may be annotated as stable if all of its component variables
 * changes value at most once.
[...]
 * It is (currently) undefined what happens if a field annotated as stable
 * is given a third value (by explicitly updating a stable field, a component of
 * a stable array, or a final stable field via reflection or other means).
 * Since the HotSpot VM promotes a non-null component value to constant, it may
 * be that the Java memory model would appear to be broken, if such a constant
 * (the second value of the field) is used as the value of the field even after
 * the field value has changed (to a third value).

However, in this particular scenario, C2's FoldStableValues optimization will constant fold the load at compile time and embed the current oop value in compiled code. If the field is then overwritten, the compiled code is the only (weak) reference to the MethodHandle object. G1 does not keep such references alive, leading to a dead oop and corresponding asserts/crashes.
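
The "changes value at most once" contract quoted above can be illustrated with a small sketch. It is not the code used in StringConcatFactory and not necessarily how the JDK addressed the race; the class `OnceOnlyInit`, the field `CACHE` and the method `getOrInit` are hypothetical names chosen for illustration. The idea is that a compare-and-set from null lets exactly one thread publish its candidate, so the field never receives a second non-null value.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public class OnceOnlyInit {
    // Hypothetical stand-in for a lazily initialized cache field such as NEW_STRING.
    private static final AtomicReference<Object> CACHE = new AtomicReference<>();

    // Returns the single published value; threads that lose the race discard their candidate.
    static Object getOrInit(Supplier<Object> factory) {
        Object current = CACHE.get();
        if (current != null) {
            return current;
        }
        Object candidate = factory.get();
        // Only the first writer moves the field from null to non-null.
        return CACHE.compareAndSet(null, candidate) ? candidate : CACHE.get();
    }

    // Calls getOrInit from several threads and collects the distinct results (by identity).
    static Set<Object> race(int threads) throws InterruptedException {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                seen.add(getOrInit(Object::new));
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            t.join();
        }
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("distinct values observed: " + race(8).size());
    }
}
```

With this pattern every caller ends up with the same object, so a value that a compiler has folded into code can never differ from the value in the field.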


Running runtime/cds/appcds/dynamicArchive/LotsUnloadTest with -XX:+VerifyBeforeGC -XX:+VerifyAfterGC -XX:+VerifyDuringGC shows some references to dead oops (java.lang.invoke.BoundMethodHandle$Species_L) in some stack frames.

Use the attached patch to improve reproducibility. The failure does not reproduce with C1 (-XX:TieredStopAtLevel=1) or with the interpreter (-Xint).

The change to use a single bitmap in G1 (JDK-8210708) made this occur a bit more often, but it has been reproduced in master too.

This issue started with JDK 17 b13, specifically JDK-8219555, which "increased the aggressiveness of C2 compilation".

# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (workspace/open/src/hotspot/share/gc/g1/g1HeapVerifier.cpp:500), pid=3070816, tid=3070823
#  fatal error: there should not have been any failures
#
# JRE version: Java(TM) SE Runtime Environment (20.0+9) (fastdebug build 20-ea+9-487)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 20-ea+9-487, compiled mode, tiered, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xe1413f]  G1HeapVerifier::verify(VerifyOption)+0x4cf
#
# Core dump will be written. Default location: Core dumps may be processed with "/opt/core.sh %p" (or dumping to /opt/mach5/mesos/work_dir/slaves/0c72054a-24ab-4dbb-944f-97f9341a1b96-S45803/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5b60ce54-d431-4916-ac28-d78c20d6446b/runs/099177ec-b020-48a9-af28-e9f618d83cec/testoutput/test-support/jtreg_open_test_hotspot_jtreg_runtime_cds_appcds_dynamicArchive_LotsUnloadTest3_java/scratch/0/core.3070816)
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#

Stack: [0x00007f8b7da2a000,0x00007f8b7db2a000],  sp=0x00007f8b7db27db0,  free space=1015k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xe1413f]  G1HeapVerifier::verify(VerifyOption)+0x4cf
V  [libjvm.so+0x1adb2cb]  Universe::verify(VerifyOption, char const*)+0x74b
V  [libjvm.so+0xda2863]  G1ConcurrentMark::verify_during_pause(G1HeapVerifier::G1VerifyType, G1ConcurrentMark::VerifyLocation)+0xf3
V  [libjvm.so+0xda860d]  G1ConcurrentMark::remark()+0x45d
V  [libjvm.so+0xe90dee]  VM_G1PauseConcurrent::doit()+0x18e
V  [libjvm.so+0x1b848a2]  VM_Operation::evaluate()+0x182
V  [libjvm.so+0x1ba969a]  VMThread::evaluate_operation(VM_Operation*)+0x18a
V  [libjvm.so+0x1baaf3f]  VMThread::inner_execute(VM_Operation*)+0x40f
V  [libjvm.so+0x1bab115]  VMThread::loop()+0xc5
V  [libjvm.so+0x1bab230]  VMThread::run()+0xb0
V  [libjvm.so+0x1a8ebf0]  Thread::call_run()+0x100
V  [libjvm.so+0x174c663]  thread_native_entry(Thread*)+0x103

[7.924s][error][gc,verify ] GC(25) Root location 0x00007f8b7cf19b20 points to dead obj 0x00007f8b8ce00c90 in region 2:(O)[0x00007f8b8ce00000,0x00007f8b8ceebb00,0x00007f8b8cf00000]
[7.924s][error][gc,verify ] GC(25) java.lang.invoke.BoundMethodHandle$Species_L 
[7.924s][error][gc,verify ] GC(25) {0x00007f8b8ce00c90} - klass: 'java/lang/invoke/BoundMethodHandle$Species_L'
[7.924s][error][gc,verify ] GC(25)  - ---- fields (total size 7 words):
[7.924s][error][gc,verify ] GC(25)  - private 'customizationCount' 'B' @12  0
[7.924s][error][gc,verify ] GC(25)  - private volatile 'updateInProgress' 'Z' @13  false
[7.924s][error][gc,verify ] GC(25)  - private final 'type' 'Ljava/lang/invoke/MethodType;' @16  a 'java/lang/invoke/MethodType'{0x00007f8b8ce0ac28} = ([BJ)Ljava/lang/String; (8ce0ac28 7f8b)
[7.924s][error][gc,verify ] GC(25)  - final 'form' 'Ljava/lang/invoke/LambdaForm;' @24  a 'java/lang/invoke/LambdaForm'{0x00007f8b8ce009c8} => a 'java/lang/invoke/MemberName'{0x00007f8b8ce220b0} = {method} {0x00007f8b745cf2e0} 'reinvoke_L' '(Ljava/lang/Object;Ljava/lang/Object;J)Ljava/lang/Object;' in 'java/lang/invoke/DelegatingMethodHandle$Holder' (8ce009c8 7f8b)
[7.924s][error][gc,verify ] GC(25)  - private 'asTypeCache' 'Ljava/lang/invoke/MethodHandle;' @32  NULL (0 0)
[7.924s][error][gc,verify ] GC(25)  - private 'asTypeSoftCache' 'Ljava/lang/ref/SoftReference;' @40  NULL (0 0)
[7.925s][error][gc,verify ] GC(25)  - final 'argL0' 'Ljava/lang/Object;' @48  a 'java/lang/invoke/DirectMethodHandle'{0x00007f8b8ce13940} (8ce13940 7f8b)
[7.938s][error][gc,verify ] GC(25) Heap after failed verification (kind 0):
[7.938s][error][gc,verify ] GC(25)  garbage-first heap   total 65536K, used 46035K [0x00007f8b8cc00000, 0x00007f8b90c00000)
[7.938s][error][gc,verify ] GC(25)   region size 1024K, 2 young (2048K), 1 survivors (1024K)
[7.938s][error][gc,verify ] GC(25)  Metaspace       used 7479K, committed 7680K, reserved 1114112K
[7.938s][error][gc,verify ] GC(25)   class space    used 518K, committed 640K, reserved 1048576K
Comments
This is no longer an issue in JDK 20. So I'm closing this as not an issue.
01-11-2022

[~lujaniuk] reproducer without stable (on 11, and if repeated several times, on 8):

compile_and_test.sh:

$BASE/jdk-11.0.17/bin/javac Example.java \
&& \
$BASE/jdk-11.0.17/bin/java \
  -XX:+UnlockDiagnosticVMOptions \
  -XX:CompileCommand=quiet \
  -XX:CompileCommand=compileonly,Example::test \
  -XX:-TieredCompilation \
  -Xbootclasspath/a:. \
  -XX:+VerifyDuringGC \
  -XX:+VerifyBeforeGC \
  -XX:+VerifyAfterGC \
  -Xmx10m \
  Example

Example.java:

import java.lang.reflect.*;

public class Example {
    static Object[] garbageObjs = new Object[100_000];
    final static Object finalField = null;

    static Object test() {
        return finalField; // Constant folded (always?)
    }

    public static void main(String[] args) throws Exception {
        final Field ourField = Example.class.getDeclaredField("finalField");
        ourField.setAccessible(true);
        Field modifiersField = Field.class.getDeclaredField("modifiers");
        modifiersField.setAccessible(true);
        modifiersField.setInt(ourField, ourField.getModifiers() & ~Modifier.FINAL);
        for (int i = 0; i < 10_000_000; ++i) {
            garbageObjs[i % 100_000] = new Object();
            test();
            ourField.set(finalField, i);
        }
    }
}

Code adapted from [~thartmann] idea by [~eosterlund]. Hartmann's original confirmed reproducing on 17, 11.
19-09-2022

It would be good to have a few other reproducers:
* a reproducer that doesn't use `@Stable` (instead, perhaps `static final`)
* a reproducer that works on jdk8.
This would help clarify the extent and severity of the problem.
29-08-2022

The new nmethod liveness model doesn't have this problem at all, due to always using nmethod entry barriers that keep the oops alive during concurrent marking.
26-08-2022

After JDK-8290025 this has been fixed in mainline.
26-08-2022

Thanks Erik. I'm moving this back to hotspot/gc.
09-08-2022

Fixed with --enable-preview and this patch:

diff --git a/src/hotspot/share/gc/shared/barrierSetNMethod.cpp b/src/hotspot/share/gc/shared/barrierSetNMethod.cpp
index f8540d2115f..3d790f0d499 100644
--- a/src/hotspot/share/gc/shared/barrierSetNMethod.cpp
+++ b/src/hotspot/share/gc/shared/barrierSetNMethod.cpp
@@ -41,7 +41,7 @@ class LoadPhantomOopClosure : public OopClosure {
 public:
   virtual void do_oop(oop* p) {
-    NativeAccess<ON_PHANTOM_OOP_REF>::oop_load(p);
+    oop val = NativeAccess<ON_PHANTOM_OOP_REF>::oop_load(p);
   }
   virtual void do_oop(narrowOop* p) { ShouldNotReachHere(); }
 };

This confirms my suspicion that since G1 lets weak nmethod oops violate SATB invariants normally, this crashes, and that with appropriate nmethod entry barriers, it no longer breaks. However, the nmethod entry barrier code needs a fix to actually keep things alive as intended with G1.
08-08-2022

Without GC verification, we crash:

# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f874a1d0154, pid=298891, tid=298909
#
# JRE version: Java(TM) SE Runtime Environment (20.0) (fastdebug build 20-internal-2022-07-28-1256564.tobias...)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 20-internal-2022-07-28-1256564.tobias..., mixed mode, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0x688154] oopDesc::size_given_klass(Klass*)+0x14

Stack: [0x00007f87319fa000,0x00007f8731afa000], sp=0x00007f8731af86a0, free space=1017k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x688154] oopDesc::size_given_klass(Klass*)+0x14
V [libjvm.so+0xe628f3] G1ParScanThreadState::do_copy_to_survivor_space(G1HeapRegionAttr, oop, markWord)+0x63
V [libjvm.so+0xe63a82] G1ParScanThreadState::copy_to_survivor_space(G1HeapRegionAttr, oop, markWord)+0x32
V [libjvm.so+0xea52f4] void G1ParCopyClosure<(G1Barrier)2, false>::do_oop_work<oop>(oop*)+0x114
V [libjvm.so+0xda80e1] void G1CodeBlobClosure::HeapRegionGatheringOopClosure::do_oop_work<oop>(oop*)+0x21
V [libjvm.so+0x170312f] nmethod::oops_do(OopClosure*, bool)+0x22f
V [libjvm.so+0xda7c80] G1NmethodProcessor::do_regular_processing(nmethod*)+0x20
V [libjvm.so+0x17041bc] nmethod::oops_do_process_weak(nmethod::OopsDoProcessor*)+0x2c
V [libjvm.so+0xda63f5] G1CodeBlobClosure::do_code_blob(CodeBlob*)+0x55
V [libjvm.so+0xdaa6de] G1CodeRootSet::nmethods_do(CodeBlobClosure*) const+0x4e
V [libjvm.so+0xe95db7] G1ScanCollectionSetRegionClosure::do_heap_region(HeapRegion*)+0xf7
V [libjvm.so+0xdb0f79] G1CollectedHeap::par_iterate_regions_array(HeapRegionClosure*, HeapRegionClaimer*, unsigned int const*, unsigned long, unsigned int) const+0x189
V [libjvm.so+0xe7b5c9] G1RemSet::scan_collection_set_regions(G1ParScanThreadState*, unsigned int, G1GCPhaseTimes::GCParPhases, G1GCPhaseTimes::GCParPhases, G1GCPhaseTimes::GCParPhases)+0x99
V [libjvm.so+0xec1e1d] G1EvacuateRegionsBaseTask::work(unsigned int)+0x10d
V [libjvm.so+0x1c2c4e1] WorkerThread::run()+0x81
V [libjvm.so+0x1ad2820] Thread::call_run()+0x100
V [libjvm.so+0x178fab4] thread_native_entry(Thread*)+0x104
08-08-2022

I was able to write a simple reproducer that triggers the issue reliably on my machine (see attached Test.java). Run with:

javac --add-exports java.base/jdk.internal.vm.annotation=ALL-UNNAMED Test.java
java -XX:CompileCommand=quiet -XX:CompileCommand=compileonly,Test::test -XX:-TieredCompilation -Xbootclasspath/a:. -XX:+VerifyDuringGC -XX:+VerifyBeforeGC -XX:+VerifyAfterGC -Xmx10m Test

[5,646s][error][gc,verify] GC(65) Root location 0x00007f7068029fc8 points to dead obj 0x00000000ff90a840 in region 3:(O)[0x00000000ff900000,0x00000000ffa00000,0x00000000ffa00000]
[5,646s][error][gc,verify] GC(65) java.lang.Integer
[5,646s][error][gc,verify] GC(65) {0x00000000ff90a840} - klass: 'java/lang/Integer'
[5,646s][error][gc,verify] GC(65) Hash: 0x0000000056fb9f75
[5,646s][error][gc,verify] GC(65) - ---- fields (total size 2 words):
[5,646s][error][gc,verify] GC(65) - private final 'value' 'I' @12 1662372 (195da4)
[5,667s][error][gc,verify] GC(65) Heap after failed verification (kind 0):
[5,667s][error][gc,verify] GC(65) garbage-first heap total 10240K, used 5714K [0x00000000ff600000, 0x0000000100000000)
[5,667s][error][gc,verify] GC(65) region size 1024K, 2 young (2048K), 1 survivors (1024K)
[5,667s][error][gc,verify] GC(65) Metaspace used 6723K, committed 6848K, reserved 1114112K
[5,667s][error][gc,verify] GC(65) class space used 547K, committed 640K, reserved 1048576K
[5,667s][error][gc,verify] GC(65)
[5,667s][error][gc,verify] GC(65) Heap Regions: E=young(eden), S=young(survivor), O=old, HS=humongous(starts), HC=humongous(continues), CS=collection set, F=free, OA=open archive, CA=closed archive, TAMS=top-at-mark-start, PB=parsable bottom
[5,667s][error][gc,verify] GC(65) | 0|0x00000000ff600000, 0x00000000ff600000, 0x00000000ff700000| 0%| F| |TAMS 0x00000000ff600000| PB 0x00000000ff600000| Untracked
[5,667s][error][gc,verify] GC(65) | 1|0x00000000ff700000, 0x00000000ff800000, 0x00000000ff800000|100%| O| |TAMS 0x00000000ff800000| PB 0x00000000ff800000| Updating
[5,667s][error][gc,verify] GC(65) | 2|0x00000000ff800000, 0x00000000ff900000, 0x00000000ff900000|100%| O| |TAMS 0x00000000ff900000| PB 0x00000000ff900000| Updating
[5,667s][error][gc,verify] GC(65) | 3|0x00000000ff900000, 0x00000000ffa00000, 0x00000000ffa00000|100%| O| |TAMS 0x00000000ffa00000| PB 0x00000000ffa00000| Updating
[5,667s][error][gc,verify] GC(65) | 4|0x00000000ffa00000, 0x00000000ffb00000, 0x00000000ffb00000|100%| O| |TAMS 0x00000000ffb00000| PB 0x00000000ffb00000| Updating
[5,667s][error][gc,verify] GC(65) | 5|0x00000000ffb00000, 0x00000000ffb94a00, 0x00000000ffc00000| 58%| O| |TAMS 0x00000000ffb0c600| PB 0x00000000ffb0c600| Updating
[5,667s][error][gc,verify] GC(65) | 6|0x00000000ffc00000, 0x00000000ffc00000, 0x00000000ffd00000| 0%| F| |TAMS 0x00000000ffc00000| PB 0x00000000ffc00000| Untracked
[5,667s][error][gc,verify] GC(65) | 7|0x00000000ffd00000, 0x00000000ffe00000, 0x00000000ffe00000|100%| S|CS|TAMS 0x00000000ffd00000| PB 0x00000000ffd00000| Complete
[5,667s][error][gc,verify] GC(65) | 8|0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000| 0%| F| |TAMS 0x00000000ffe00000| PB 0x00000000ffe00000| Untracked
[5,667s][error][gc,verify] GC(65) | 9|0x00000000fff00000, 0x00000000fff968f0, 0x0000000100000000| 58%| E| |TAMS 0x00000000fff00000| PB 0x00000000fff00000| Complete
# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc: SuppressErrorAt=/g1HeapVerifier.cpp:500
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (/oracle/jdk3/open/src/hotspot/share/gc/g1/g1HeapVerifier.cpp:500), pid=297761, tid=297768
# fatal error: there should not have been any failures
#
# JRE version: Java(TM) SE Runtime Environment (20.0) (fastdebug build 20-internal-2022-07-28-1256564.tobias...)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 20-internal-2022-07-28-1256564.tobias..., mixed mode, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0xe3f68b] G1HeapVerifier::verify(VerifyOption)+0x4fb

I noticed that class unloading often detects the nmethod with the dead oop and unloads it (see nmethod::is_unloading), preventing the verification failure. With -XX:-ClassUnloading, the issue triggers faster. After some recompilations of the test method, we still assert at some point though. I assume that's because the unloading code is not executed during all GCs.
08-08-2022

I checked the core file and verified that the @Stable static NEW_STRING field is indeed overwritten with a different oop after the C2 compiled code is installed. I.e., the oop embedded in the C2 compiled code is the only reference to the object. Now that embedded oop is updated by the GC once from here:

Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xda7c40] G1NmethodProcessor::do_regular_processing(nmethod*)+0x20
V [libjvm.so+0x17041dc] nmethod::oops_do_process_weak(nmethod::OopsDoProcessor*)+0x2c
V [libjvm.so+0xda63b5] G1CodeBlobClosure::do_code_blob(CodeBlob*)+0x55
V [libjvm.so+0xdaa69e] G1CodeRootSet::nmethods_do(CodeBlobClosure*) const+0x4e
V [libjvm.so+0xe95d77] G1ScanCollectionSetRegionClosure::do_heap_region(HeapRegion*)+0xf7
V [libjvm.so+0xdb0f39] G1CollectedHeap::par_iterate_regions_array(HeapRegionClosure*, HeapRegionClaimer*, unsigned int const*, unsigned long, unsigned int) const+0x189
V [libjvm.so+0xe7b589] G1RemSet::scan_collection_set_regions(G1ParScanThreadState*, unsigned int, G1GCPhaseTimes::GCParPhases, G1GCPhaseTimes::GCParPhases, G1GCPhaseTimes::GCParPhases)+0x99
V [libjvm.so+0xec1ddd] G1EvacuateRegionsBaseTask::work(unsigned int)+0x10d
V [libjvm.so+0x1c2c861] WorkerThread::run()+0x81
V [libjvm.so+0x1ad2ba0] Thread::call_run()+0x100
V [libjvm.so+0x178fdc4] thread_native_entry(Thread*)+0x104

And again from here:

V [libjvm.so+0xda7c40] G1NmethodProcessor::do_regular_processing(nmethod*)+0x20
V [libjvm.so+0x17041dc] nmethod::oops_do_process_weak(nmethod::OopsDoProcessor*)+0x2c
V [libjvm.so+0xda63b5] G1CodeBlobClosure::do_code_blob(CodeBlob*)+0x55
V [libjvm.so+0xdaa69e] G1CodeRootSet::nmethods_do(CodeBlobClosure*) const+0x4e
V [libjvm.so+0xe95d77] G1ScanCollectionSetRegionClosure::do_heap_region(HeapRegion*)+0xf7
V [libjvm.so+0xdb0f39] G1CollectedHeap::par_iterate_regions_array(HeapRegionClosure*, HeapRegionClaimer*, unsigned int const*, unsigned long, unsigned int) const+0x189
V [libjvm.so+0xe7b589] G1RemSet::scan_collection_set_regions(G1ParScanThreadState*, unsigned int, G1GCPhaseTimes::GCParPhases, G1GCPhaseTimes::GCParPhases, G1GCPhaseTimes::GCParPhases)+0x99
V [libjvm.so+0xec1ddd] G1EvacuateRegionsBaseTask::work(unsigned int)+0x10d
V [libjvm.so+0x1c2c861] WorkerThread::run()+0x81
V [libjvm.so+0x1ad2ba0] Thread::call_run()+0x100
V [libjvm.so+0x178fdc4] thread_native_entry(Thread*)+0x104

But the corresponding object still dies. As a quick hack, I wrapped the embedded oop in a global JNI handle via JNIHandles::make_global to keep it alive and that worked just fine.

All this seems to support my hypothesis and your analysis [~eosterlund]. Unfortunately, I don't know the G1 code well enough to tell why your proposed solution does not work. I'll try to come up with a simpler reproducer next week to be able to properly debug this locally.
05-08-2022

Thanks for the background, [~eosterlund]. Unfortunately, the issue still reproduces both with --enable-preview as well as with PR9741. I can reproduce it quite reliably but only on Mach5 and only with many runs. I'll keep digging.
05-08-2022

Nice digging [~thartmann]. Is the reproducer easy? If so I wonder if it still happens with --enable-preview, or after https://github.com/openjdk/jdk/pull/9741 ?

Why, you wonder? Great question. The bug seems to only happen on G1, right? Well, it has bugged me for a long time that the treatment of oops in nmethods violates G1 SATB invariants. Nmethods on-stack are strong roots, but nmethods not on-stack are weak (phantom strength). In order to use such a weak oop, you have to keep it alive first. We do that for all non-strong references in the JVM, except nmethod oops. When, with G1, we call into an nmethod for the first time during concurrent marking, it becomes on-stack, but the oops are not kept alive. The implication is that for G1, and G1 alone, whether a use of an nmethod's constant oops is valid depends on how they are used. If the oop is somehow kept alive in a different way, e.g. through the constant pool holder, then it is okay. I suspect that in the scenario you describe it is in fact not okay.

Now why does this not happen for ZGC? Because we wanted a more robust solution, and have nmethod entry barriers force the non-strong oops to be kept alive when we call an nmethod during marking. For a while I have wanted G1 to do the same thing, so that the model is more robust. So I have snuck it into the Loom changes. When G1 runs with nmethod entry barriers, which it needs now with --enable-preview, it also keeps the non-strong oops of an nmethod alive when it first gets called during concurrent marking. With JDK-8290025 I am making that the default behaviour. That might fix this bug. Not sure it does, but it would certainly be interesting to try out.
04-08-2022
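
The rule in the comment above, that a weak reference has to be made strong before it is used, has a rough Java-level analogy in java.lang.ref.WeakReference. The sketch below is only that analogy: nmethod oops are handled inside HotSpot with GC barriers, not with WeakReference, and the class `WeakUse` and method `describe` are hypothetical names.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

public class WeakUse {
    // Strengthens the weak reference first, then uses only the strong local.
    static String describe(WeakReference<StringBuilder> ref) {
        StringBuilder strong = ref.get(); // may be null once the referent has been cleared
        if (strong == null) {
            return "cleared";
        }
        return "alive:" + strong;
    }

    public static void main(String[] args) {
        StringBuilder payload = new StringBuilder("mh");
        WeakReference<StringBuilder> weak = new WeakReference<>(payload);

        // The referent is still strongly reachable through 'payload' here.
        System.out.println(describe(weak));

        // Simulate the collector clearing the reference.
        weak.clear();
        System.out.println(describe(weak));

        // Keeps 'payload' reachable up to this point, regardless of JIT liveness analysis.
        Reference.reachabilityFence(payload);
    }
}
```

Using the referent without the null check and the strong local would correspond to the situation described above, where an nmethod's constant oop is used without first being kept alive.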

The caller StringConcatFactory::generateMHInlineCopy() creates the BoundMethodHandle via a call to newString() and directly passes it to dropArgumentsTrusted():
https://github.com/openjdk/jdk/blob/5cdb4b196047d4f2d69df0fc73102c102bf042f7/src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java#L524

The newString() method caches the BoundMethodHandle in the @Stable static field NEW_STRING:
https://github.com/openjdk/jdk/blob/5cdb4b196047d4f2d69df0fc73102c102bf042f7/src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java#L855

Disabling FoldStableValues selectively for just that field makes the issue disappear.

I added additional logging to print oop->identity_hash() for the folded load of the constant oop and the hash for the one in the C2 compiled newString() method is equivalent to the dead obj (0x000000005c570a66):

{method}
 - this oop: 0x00007faebc39fb08
 - method holder: public final synchronized 'java/lang/invoke/StringConcatFactory'
 - constants: 0x00007faebc39c028 constant pool [623] {0x00007faebc39c028} for public final synchronized 'java/lang/invoke/StringConcatFactory' cache=0x00007faebc3a0930
 - access: 0x8100000a private static
 - name: 'newString'
[...]
------------------------ OptoAssembly for Compile_id = 1122 -----------------------
#
# java/lang/invoke/MethodHandle:BotPTR * ( )
#
000 N1: # out( B1 ) <- in( B1 ) Freq: 1 IDom: 0/#1 RegPressure: 0 IHRP Index: 1 FRegPressure: 0 FHRP Index: 1
000 B1: # out( N1 ) <- BLOCK HEAD IS JUNK Freq: 1 IDom: 0/#2 RegPressure: 1 IHRP Index: 9 FRegPressure: 0 FHRP Index: 9
000 # stack bang (96 bytes)
    pushq rbp # Save rbp
    subq rsp, #16 # Create frame
00c movq RAX, java/lang/invoke/BoundMethodHandle$Species_L java.lang.invoke.BoundMethodHandle$Species_L {0x00007faed654b2c8} Hash: 0x000000005c570a66
[...]
016 addq rsp, 16 # Destroy frame
    popq rbp
    cmpq rsp, poll_offset[r15_thread]
    ja #safepoint_stub # Safepoint: poll for GC
028 ret

[5.966s][error ][gc,verify ] GC(9) Root location 0x00007faec47e19f0 points to dead obj 0x00007faed58017e8 in region 16:(O)[0x00007faed5800000,0x00007faed58f5000,0x00007faed5900000]
[5.966s][error ][gc,verify ] GC(9) java.lang.invoke.BoundMethodHandle$Species_L
[5.966s][error ][gc,verify ] GC(9) {0x00007faed58017e8} - klass: final synchronized 'java/lang/invoke/BoundMethodHandle$Species_L'
[5.966s][error ][gc,verify ] GC(9) Hash: 0x000000005c570a66

The oop in compiled code (0x00007faed654b2c8) is not the same as the one reported by GC verification (0x00007faed58017e8) but in the core file it is. That means that the oop has been updated after the code was installed (during compilation, constant oops are kept alive via ciObject::_handle).

My current working hypothesis is that during startup, there is a race between multiple threads that find the NEW_STRING field to be null, create a new BoundMethodHandle and write it to the field. In parallel, C2 compilation of the newString() method happens and one of these objects is used as constant in compiled code. However, another thread overwrites the field just after and as a result, the only reference to that object is the constant in compiled code. For some reason, the GC now misses that reference and treats the object as dead.

Also, no deoptimization happened for the 'newString' method. It's still alive once we hit the GC verification failure.
04-08-2022
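
The race hypothesized in the comment above (several threads see the field as null, each creates an object and each writes it) can be made deterministic in a small sketch. This is an illustration only, not the StringConcatFactory code: the class `RacyLazyInit`, the field `cache` and the method `raceWrites` are hypothetical, and a CyclicBarrier is used to hold every thread between its null check and its write so that the check-then-write window is always hit.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

public class RacyLazyInit {
    // Hypothetical plain cache field, lazily initialized without synchronization.
    static Object cache;
    static final AtomicInteger writes = new AtomicInteger();

    // Returns how many threads ended up writing to the field.
    static int raceWrites(int threads) throws InterruptedException {
        cache = null;
        writes.set(0);
        CyclicBarrier afterCheck = new CyclicBarrier(threads);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                boolean sawNull = (cache == null); // the unsynchronized check
                try {
                    afterCheck.await(); // hold every thread between check and write
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (sawNull) {
                    cache = new Object(); // each thread overwrites the previous value
                    writes.incrementAndGet();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        return writes.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("writes to the field: " + raceWrites(4));
    }
}
```

Because no thread writes before all of them have performed the null check, every thread writes its own object, and all but the last written value are reachable only from whoever captured them in between. In the scenario above, that capturer is the C2 compiled code.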

Thanks Erik, I don't see how this issue could be related to JDK-8242115 though.

The problematic oop (0x00007fd2b1e02d58) is on the stack of C2 compiled "dropArgumentsTrusted":

0x00007fd2b0dfda20: 0x00007fd2b423abc8 #6 nmethod 0x00007fd2bcb47310 for method J java.lang.invoke.MethodHandles.dropArgumentsTrusted(Ljava/lang/invoke/MethodHandle;I[Ljava/lang/Class;)Ljava/lang/invoke/MethodHandle;
                                       unextended_sp for #7
                                       sp for #7
0x00007fd2b0dfda18: 0x00007fd2bcbae660 return address
0x00007fd2b0dfda10: 0x0000000000000001 saved fp
0x00007fd2b0dfda08: 0x00007fd2b423aad0
0x00007fd2b0dfda00: 0x0000000100000002
0x00007fd2b0dfd9f8: 0x00007fd2b423ab10
0x00007fd2b0dfd9f0: 0x0000000000000001
0x00007fd2b0dfd9e8: 0x00007fd2bcb5655d
0x00007fd2b0dfd9e0: 0x00007fd2b1e352e8
0x00007fd2b0dfd9d8: 0x00007fd2b423ab90
0x00007fd2b0dfd9d0: 0x00007fd2b1e02d58 <<<----- Points to dead object
0x00007fd2b0dfd9c8: 0x00007fd2b423abc8
0x00007fd2b0dfd9c0: 0x0000000100000002 #5 nmethod 0x00007fd2bcae0510 for method J java.lang.invoke.MethodType.insertParameterTypes(I[Ljava/lang/Class;)Ljava/lang/invoke/MethodType;
                                       unextended_sp for #6
                                       sp for #6
0x00007fd2b0dfd9b8: 0x00007fd2bcb47588 return address
0x00007fd2b0dfd9b0: 0x00007fd2b1e13520 saved fp

[7.092s][error][gc,verify ] GC(11) Root location 0x00007fd2b0dfd9d0 points to dead obj 0x00007fd2b1e02d58 in region 16:(O)[0x00007fd2b1e00000,0x00007fd2b1ef5000,0x00007fd2b1f00000]
[7.092s][error][gc,verify ] GC(11) java.lang.invoke.BoundMethodHandle$Species_L

It comes from the MethodHandle argument being spilled to 0x10(%rsp) and the corresponding code looks sane (spill happens at 0x00007f8dd0ad12f4 and we are at 0x00007f8dd0ad1327):

[Verified Entry Point]
  # {method} {0x00007f8bb42ad458} 'dropArgumentsTrusted' '(Ljava/lang/invoke/MethodHandle;I[Ljava/lang/Class;)Ljava/lang/invoke/MethodHandle;' in 'java/lang/invoke/MethodHandles'
  # parm0: rsi:rsi = 'java/lang/invoke/MethodHandle'
  # parm1: rdx = int
  # parm2: rcx:rcx = '[Ljava/lang/Class;'
  # [sp+0x60] (sp of caller)
  ;; N1: # out( B1 ) <- in( B50 B46 B51 B52 B47 B32 B49 B63 B44 B48 ) Freq: 1
  ;; B1: # out( B50 B2 ) <- BLOCK HEAD IS JUNK Freq: 1
  0x00007f8dd0ad12e0: mov %eax,-0x18000(%rsp)
  0x00007f8dd0ad12e7: push %rbp
  0x00007f8dd0ad12e8: sub $0x50,%rsp ;*synchronization entry
                                     ; - java.lang.invoke.MethodHandles::dropArgumentsTrusted@-1 (line 5269)
  0x00007f8dd0ad12ec: mov %rcx,0x8(%rsp)
  0x00007f8dd0ad12f1: mov %edx,(%rsp)
  0x00007f8dd0ad12f4: mov %rsi,0x10(%rsp)
  0x00007f8dd0ad12f9: nop
  0x00007f8dd0ad12fa: nop
  0x00007f8dd0ad12fb: nop
  0x00007f8dd0ad12fc: nop
  0x00007f8dd0ad12fd: nop
  0x00007f8dd0ad12fe: nop
  0x00007f8dd0ad12ff: nop
  0x00007f8dd0ad1300: mov 0x10(%rsi),%rbp ; implicit exception: dispatches to 0x00007f8dd0ad16d4
                                     ;*getfield type {reexecute=0 rethrow=0 return_oop=0}
                                     ; - java.lang.invoke.MethodHandle::type@1 (line 468)
                                     ; - java.lang.invoke.MethodHandles::dropArgumentsTrusted@1 (line 5269)
  ;; B2: # out( B57 B3 ) <- in( B1 ) Freq: 0,999999
  0x00007f8dd0ad1304: mov %rbp,%rsi
  0x00007f8dd0ad1307: callq 0x00007f8dd06625a0 ; ImmutableOopMap {rbp=Oop [8]=Oop [16]=Oop }
                                     ;*invokestatic dropArgumentChecks {reexecute=0 rethrow=0 return_oop=0}
                                     ; - java.lang.invoke.MethodHandles::dropArgumentsTrusted@8 (line 5270)
                                     ; {static_call}
  ;; B3: # out( B46 B4 ) <- in( B2 ) Freq: 0,999979
  0x00007f8dd0ad130c: mov %eax,0x4(%rsp)
  0x00007f8dd0ad1310: test %rbp,%rbp
  0x00007f8dd0ad1313: je 0x00007f8dd0ad1632
  ;; B4: # out( B56 B5 ) <- in( B3 ) Freq: 0,999978
  0x00007f8dd0ad1319: mov %rbp,%rsi
  0x00007f8dd0ad131c: mov (%rsp),%edx
  0x00007f8dd0ad131f: mov 0x8(%rsp),%rcx
  0x00007f8dd0ad1324: nop
  0x00007f8dd0ad1325: nop
  0x00007f8dd0ad1326: nop
  0x00007f8dd0ad1327: callq 0x00007f8dd0661fa0 ; ImmutableOopMap {[8]=Oop [16]=Oop }
                                     ;*invokevirtual insertParameterTypes {reexecute=0 rethrow=0 return_oop=1}
                                     ; - java.lang.invoke.MethodHandles::dropArgumentsTrusted@16 (line 5271)
                                     ; {optimized virtual_call}

The call to "insertParameterTypes" has an oop map that refers to the stack slot containing the oop ([16]=Oop) and therefore the GC should know about it:

ImmutableOopMap {[8]=Oop [16]=Oop }
04-08-2022

[~dlong], yes, I'm trying to narrow it down.
04-08-2022

[~thartmann], isn't it possible that the oop became dead before it was passed to dropArgumentsTrusted()? So the problem could be in the caller.
03-08-2022

If I wrote the verification code again, I think I would only follow the data dependencies of the nodes without numbering nodes. Not sure how reliable it is. But I’m almost fully certain that the actual code is incorrect, just not entirely sure how best to demonstrate that.
02-08-2022

[~eosterlund] your verification code already fails for me at build time. The offending node is:

109 CallStaticJavaDirect === 111 0 148 19 0 119 935 0 0 927 887 0 930 0 0 0 0 0 927 0 0 931 933 151 [[ 110 108 153 158 383 831 ]] Static wrapper for: _new_array_Java # rawptr:NotNull ( java/lang/Object:NotNull *, int ) C=0.000100 Arrays::copyOf @ bci:6 (line 3481) reexecute ArrayList::grow @ bci:37 (line 237) ArrayList::grow @ bci:7 (line 244) ArrayList::add @ bci:7 (line 454) ArrayList::add @ bci:20 (line 467) !jvms: Arrays::copyOf @ bci:6 (line 3481) ArrayList::grow @ bci:37 (line 237) ArrayList::grow @ bci:7 (line 244) ArrayList::add @ bci:7 (line 454) ArrayList::add @ bci:20 (line 467)

When disabling it during the build, I'm hitting this:

# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (/workspace/open/src/hotspot/share/opto/node.hpp:871), pid=959377, tid=959401
# assert(is_DecodeN()) failed: invalid node class: LoadP

Current CompileTask:
C2: 84 4 b 4 java.lang.invoke.MethodHandleStatics::<clinit> (224 bytes)

Stack: [0x00007f05a1a00000,0x00007f05a1b00000], sp=0x00007f05a1afc240, free space=1008k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xd90682] G1BarrierSetC2::pre_barrier(GraphKit*, bool, Node*, Node*, Node*, unsigned int, Node*, TypeOopPtr const*, Node*, BasicType) const+0xee2
V [libjvm.so+0xd952a4] G1BarrierSetC2::store_at_resolved(C2Access&, C2AccessValue&) const+0x124
V [libjvm.so+0x75547d] BarrierSetC2::store_at(C2Access&, C2AccessValue&) const+0xed
V [libjvm.so+0xf3504d] GraphKit::access_store_at(Node*, Node*, TypePtr const*, Node*, Type const*, BasicType, unsigned long)+0x10d
V [libjvm.so+0x17f30fc] Parse::do_put_xxx(Node*, ciField*, bool)+0x40c
V [libjvm.so+0x17f3ecd] Parse::do_field_access(bool, bool)+0x61d
V [libjvm.so+0x17edfcc] Parse::do_one_bytecode()+0xa6c
V [libjvm.so+0x17dab44] Parse::do_one_block()+0x864
V [libjvm.so+0x17dbaa7] Parse::do_all_blocks()+0x137
V [libjvm.so+0x17e0ba6] Parse::Parse(JVMState*, ciMethod*, float)+0xbb6
V [libjvm.so+0x927200] ParseGenerator::generate(JVMState*)+0x110
V [libjvm.so+0xb1a4fa] Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x138a
V [libjvm.so+0x924af3] C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x683
V [libjvm.so+0xb29848] CompileBroker::invoke_compiler_on_method(CompileTask*)+0xbb8
V [libjvm.so+0xb2a778] CompileBroker::compiler_thread_loop()+0x638
V [libjvm.so+0x10a1ac8] JavaThread::thread_main_inner()+0x238
V [libjvm.so+0x1ad5b80] Thread::call_run()+0x100
V [libjvm.so+0x1792de4] thread_native_entry(Thread*)+0x104
02-08-2022

Deferral Request:
The issue is hard to reproduce and therefore requires more time to investigate. Although it first showed up with JDK 17, the offending change (JDK-8219555, which changed the behavior with -Xcomp) only triggered an existing problem, and earlier versions are likely affected as well. If the root cause turns out to be JDK-8242115 or similar, the fix will be complex.
02-08-2022

I tried to reproduce this with rr's chaos mode but instead hit another issue and filed JDK-8291496. [~tschatzl] do you think that might be related?
28-07-2022

Here's the crashing stack trace from the jdk-20+6-261-tier4 sighting:
runtime/cds/appcds/dynamicArchive/LotsUnloadTest.java
--------------- T H R E A D ---------------
Current thread (0x0000fffed42dfab0): JavaThread "Thread-0" [_thread_in_vm, id=733782, stack(0x0000fffeac5c0000,0x0000fffeac7c0000)]
Stack: [0x0000fffeac5c0000,0x0000fffeac7c0000], sp=0x0000fffeac7bac20, free space=2027k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xa53d70] CollectedHeap::is_oop(oop) const+0xe0
V [libjvm.so+0x15a55f8] oopDesc::is_oop(oop, bool)+0x58
V [libjvm.so+0xe6d138] HandleArea::allocate_handle(oop)+0x58
V [libjvm.so+0x174bed4] SharedRuntime::find_callee_info_helper(vframeStream&, Bytecodes::Code&, CallInfo&, JavaThread*)+0x670
V [libjvm.so+0x174f148] SharedRuntime::handle_ic_miss_helper(JavaThread*)+0xb8
V [libjvm.so+0x174f890] SharedRuntime::handle_wrong_method_ic_miss(JavaThread*)+0x23c
v ~RuntimeStub::ic_miss_stub 0x0000fffec45de4d0
[error occurred during error reporting (printing native stack), id 0xe0000000, Internal Error (/opt/mach5/mesos/work_dir/slaves/0c72054a-24ab-4dbb-944f-97f9341a1b96-S10205/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/9f741ea7-2c45-460f-b174-c12aeae93a41/runs/c1e7313a-261d-48ea-b92b-2895901c4c5e/workspace/open/src/hotspot/share/code/codeCache.inline.hpp:49)]
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
v ~RuntimeStub::ic_miss_stub 0x0000fffec45de4d0
J 4616 c2 java.lang.invoke.MethodHandles.dropArgumentsTrusted(Ljava/lang/invoke/MethodHandle;I[Ljava/lang/Class;)Ljava/lang/invoke/MethodHandle; java.base@20-ea (111 bytes) @ 0x0000fffec4be3780 [0x0000fffec4be3700+0x0000000000000080]
J 2570 c2 java.lang.invoke.StringConcatFactory.generateMHInlineCopy(Ljava/lang/invoke/MethodType;[Ljava/lang/String;)Ljava/lang/invoke/MethodHandle; java.base@20-ea (511 bytes) @ 0x0000fffec4d4b6b0 [0x0000fffec4d4b500+0x00000000000001b0]
J 5018 c2 java.lang.invoke.StringConcatFactory.makeConcatWithConstants(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite; java.base@20-ea (251 bytes) @ 0x0000fffec4f6d858 [0x0000fffec4f6d640+0x0000000000000218]
J 2559 c2 java.lang.invoke.DirectMethodHandle$Holder.invokeStatic(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; java.base@20-ea (22 bytes) @ 0x0000fffec4d487f4 [0x0000fffec4d48780+0x0000000000000074]
J 2558 c2 java.lang.invoke.DelegatingMethodHandle$Holder.delegate(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; java.base@20-ea (25 bytes) @ 0x0000fffec4d48278 [0x0000fffec4d48200+0x0000000000000078]
J 2553 c2 java.lang.invoke.Invokers$Holder.invokeExact_MT(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; java.base@20-ea (30 bytes) @ 0x0000fffec4d47204 [0x0000fffec4d47140+0x00000000000000c4]
J 4601 c2 java.lang.invoke.BootstrapMethodInvoker.invoke(Ljava/lang/Class;Ljava/lang/invoke/MethodHandle;Ljava/lang/String;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Class;)Ljava/lang/Object; java.base@20-ea (974 bytes) @ 0x0000fffec4edc318 [0x0000fffec4edb680+0x0000000000000c98]
J 2541 c2 java.lang.invoke.CallSite.makeSite(Ljava/lang/invoke/MethodHandle;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/Object;Ljava/lang/Class;)Ljava/lang/invoke/CallSite; java.base@20-ea (91 bytes) @ 0x0000fffec4d2fbe4 [0x0000fffec4d2fb80+0x0000000000000064]
J 2539 c2 java.lang.invoke.MethodHandleNatives.linkCallSiteImpl(Ljava/lang/Class;Ljava/lang/invoke/MethodHandle;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/invoke/MemberName; java.base@20-ea (44 bytes) @ 0x0000fffec4d2ed54 [0x0000fffec4d2ed00+0x0000000000000054]
J 2537 c2 java.lang.invoke.MethodHandleNatives.linkCallSite(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/invoke/MemberName; java.base@20-ea (65 bytes) @ 0x0000fffec4d2e508 [0x0000fffec4d2e440+0x00000000000000c8]
v ~StubRoutines::call_stub 0x0000fffec45001bc
j DefinedAsHiddenKlass+0x0000000801007c00.<init>()V+8
J 4560 c2 java.lang.invoke.DirectMethodHandle$Holder.newInvokeSpecial(Ljava/lang/Object;)Ljava/lang/Object; java.base@20-ea (20 bytes) @ 0x0000fffec4e37bf8 [0x0000fffec4e37b00+0x00000000000000f8]
J 4553 c2 java.lang.invoke.Invokers$Holder.invokeExact_MT(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; java.base@20-ea (22 bytes) @ 0x0000fffec4cc4ff0 [0x0000fffec4cc4f40+0x00000000000000b0]
J 5016 c2 jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance([Ljava/lang/Object;)Ljava/lang/Object; java.base@20-ea (148 bytes) @ 0x0000fffec4f6b4f8 [0x0000fffec4f6b440+0x00000000000000b8]
J 4439 c2 java.lang.reflect.Constructor.newInstanceWithCaller([Ljava/lang/Object;ZLjava/lang/Class;)Ljava/lang/Object; java.base@20-ea (51 bytes) @ 0x0000fffec4ec9d94 [0x0000fffec4ec9d00+0x0000000000000094]
J 4438 c2 java.lang.reflect.ReflectAccess.newInstance(Ljava/lang/reflect/Constructor;[Ljava/lang/Object;Ljava/lang/Class;)Ljava/lang/Object; java.base@20-ea (8 bytes) @ 0x0000fffec4ec95a8 [0x0000fffec4ec9540+0x0000000000000068]
J 4437 c2 jdk.internal.reflect.ReflectionFactory.newInstance(Ljava/lang/reflect/Constructor;[Ljava/lang/Object;Ljava/lang/Class;)Ljava/lang/Object; java.base@20-ea (13 bytes) @ 0x0000fffec4ec90f0 [0x0000fffec4ec9080+0x0000000000000070]
j java.lang.Class.newInstance()Ljava/lang/Object;+117 java.base@20-ea
j LotsUnloadApp.run()V+27
J 4290 c1 java.lang.Thread.run()V java.base@20-ea (19 bytes) @ 0x0000fffebd27947c [0x0000fffebd279300+0x000000000000017c]
v ~StubRoutines::call_stub 0x0000fffec45001bc
08-07-2022

Last time I wrote some verification code for this, it looked something like this: https://github.com/fisk/jdk/commit/90eb5067eea29604123c48459b24e0a2bdad5e53 I have not yet verified that the verification itself is accurate, so your mileage may vary, but if it helps, then that's great.
05-07-2022

ILW = potential crash or heap corruption; rare; no workaround = HLH = P2
01-07-2022

If you want to reproduce on master, you need to adapt the import statement for ClassFileInstaller and the @run tag using it appropriately.
01-07-2022

I strongly suspect this is related to JDK-8242115 (C2 only, G1 only, no CDS, test started failing with more aggressive -Xcomp options), so I am linking the issue and assigning to the compiler team for further investigation. At least this CR provides a good reproducer (around 1%).
01-07-2022

Note that the test fails in both of the two sub-processes it launches; each needs to have the options passed separately.
01-07-2022

This issue starts occurring with JDK-8219555: compiler/jvmci/compilerToVM/IsMatureTest.java fails with Une… which makes C2 compilation with -Xcomp "[..] much more aggressive in tiered mode with -Xcomp and exercise C2 more". I can't explain why this patch causes this issue.
01-07-2022

Attached a test case made more independent of helper classes: basically a copy of runtime/cds/appcds/dynamicArchive/LotsUnloadTest.java with external dependencies mostly copy&pasted in. For better reproducibility the test is run ~10 times per invocation. It may be useful to temporarily turn the flags VerifyBeforeGC/VerifyDuringGC/VerifyAfterGC into product flags...
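A minimal sketch of how the repeated runs with verification enabled could be driven, assuming an unmodified build in which the three Verify*GC flags are diagnostic and therefore need -XX:+UnlockDiagnosticVMOptions. The class name VerifyRunSketch, the helper names and the main class placeholder are hypothetical and not taken from the attached test; the actual process launch is replaced by a stand-in predicate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

class VerifyRunSketch {
    // GC verification flags named in the comment above. Assumption: they are
    // diagnostic flags, so they are unlocked explicitly here.
    static final List<String> VERIFY_FLAGS = List.of(
            "-XX:+UnlockDiagnosticVMOptions",
            "-XX:+VerifyBeforeGC",
            "-XX:+VerifyDuringGC",
            "-XX:+VerifyAfterGC");

    // Builds the command line for one sub-process. Every sub-process needs
    // its own copy of the options; they are not inherited from the parent JVM.
    static List<String> commandFor(String javaBinary, List<String> extraOptions, String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBinary);
        cmd.addAll(VERIFY_FLAGS);
        cmd.addAll(extraOptions);
        cmd.add(mainClass);
        return cmd;
    }

    // Runs the attempt up to maxRuns times and returns the 1-based number of
    // the first failing run, or -1 if every run succeeded.
    static int firstFailure(int maxRuns, IntPredicate attemptSucceeded) {
        for (int run = 1; run <= maxRuns; run++) {
            if (!attemptSucceeded.test(run)) {
                return run;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // "MainClassPlaceholder" is a hypothetical name, not the real test class.
        List<String> cmd = commandFor("java", List.of("-XX:+UseG1GC", "-Xcomp"), "MainClassPlaceholder");
        System.out.println(String.join(" ", cmd));
        // Stand-in for launching the process: pretend that run 7 fails.
        System.out.println("first failing run: " + firstFailure(10, run -> run != 7));
    }
}
```

In a real driver the predicate would start the command with ProcessBuilder and check the exit code and the log for gc,verify errors.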
01-07-2022

There are two types of failures. We can fail in the first part of the test when the CDS archive is about to be created (i.e. no archive loaded). I could only ever reproduce this on Linux. In this case there is a "Root location ... points to dead obj ..." error:
Root location 0x00007fa554c16c40 points to dead obj 0x00007fa564d01c90 in region 3:(O)[0x00007fa564d00000,0x00007fa564d44e58,0x00007fa564e00000]
[25.994s][error][gc,verify ] GC(24) java.lang.invoke.BoundMethodHandle$Species_L
[25.994s][error][gc,verify ] GC(24) {0x00007fa564d01c90} - klass: 'java/lang/invoke/BoundMethodHandle$Species_L'
[25.994s][error][gc,verify ] GC(24) - ---- fields (total size 7 words):
[25.994s][error][gc,verify ] GC(24) - private 'customizationCount' 'B' @12 0
[25.994s][error][gc,verify ] GC(24) - private volatile 'updateInProgress' 'Z' @13 false
[25.994s][error][gc,verify ] GC(24) - private final 'type' 'Ljava/lang/invoke/MethodType;' @16 a 'java/lang/invoke/MethodType'{0x00007fa564d17b68} = ([BJ)Ljava/lang/String; (64d17b68 7fa5)
[25.994s][error][gc,verify ] GC(24) - final 'form' 'Ljava/lang/invoke/LambdaForm;' @24 a 'java/lang/invoke/LambdaForm'{0x00007fa564d1dae8} => a 'java/lang/invoke/MemberName'{0x00007fa564d28af8} = {method} {0x00000008000fc558} 'reinvoke_L' '(Ljava/lang/Object;Ljava/lang/Object;J)Ljava/lang/Object;' in 'java/lang/invoke/DelegatingMethodHandle$Holder' (64d1dae8 7fa5)
[25.994s][error][gc,verify ] GC(24) - private 'asTypeCache' 'Ljava/lang/invoke/MethodHandle;' @32 NULL (0 0)
[25.994s][error][gc,verify ] GC(24) - private 'asTypeSoftCache' 'Ljava/lang/ref/SoftReference;' @40 NULL (0 0)
[25.995s][error][gc,verify ] GC(24) - final 'argL0' 'Ljava/lang/Object;' @48 a 'java/lang/invoke/DirectMethodHandle'{0x00007fa564d17b30} (64d17b30 7fa5)
That root location is from the stack.
On Windows/macOS/Linux, we fail with a reference from eden to a dead object in old gen:
[17.663s][error][gc,verify ] GC(5) Field 0x00000127cb9002d8 of live obj 0x00000127cb9002a8 in region 35:(E)[0x00000127cb900000,0x00000127cb93d760,0x00000127cba00000]
[17.663s][error][gc,verify ] GC(5) java.lang.invoke.BoundMethodHandle$Species_LL
[17.663s][error][gc,verify ] GC(5) {0x00000127cb9002a8} - klass: 'java/lang/invoke/BoundMethodHandle$Species_LL'
[17.663s][error][gc,verify ] GC(5) - ---- fields (total size 8 words):
[17.664s][error][gc,verify ] GC(5) - private 'customizationCount' 'B' @12 0
[17.664s][error][gc,verify ] GC(5) - private volatile 'updateInProgress' 'Z' @13 false
[17.664s][error][gc,verify ] GC(5) - private final 'type' 'Ljava/lang/invoke/MethodType;' @16 a 'java/lang/invoke/MethodType'{0x00000127ca6521b0} = (J)[B (ca6521b0 127)
[17.664s][error][gc,verify ] GC(5) - final 'form' 'Ljava/lang/invoke/LambdaForm;' @24 a 'java/lang/invoke/LambdaForm'{0x00000127ca632848} => a 'java/lang/invoke/MemberName'{0x00000127ca636dc0} = {method} {0x00000127d100e2e8} 'invoke' '(Ljava/lang/Object;J)Ljava/lang/Object;' in 'java/lang/invoke/LambdaForm$MH+0x0000000801005000' (ca632848 127)
[17.664s][error][gc,verify ] GC(5) - private 'asTypeCache' 'Ljava/lang/invoke/MethodHandle;' @32 NULL (0 0)
[17.664s][error][gc,verify ] GC(5) - private 'asTypeSoftCache' 'Ljava/lang/ref/SoftReference;' @40 NULL (0 0)
[17.664s][error][gc,verify ] GC(5) - final 'argL0' 'Ljava/lang/Object;' @48 a 'java/lang/invoke/DirectMethodHandle'{0x00000127ca6521f0} (ca6521f0 127)
[17.664s][error][gc,verify ] GC(5) - final 'argL1' 'Ljava/lang/Object;' @56 " bytes "{0x00000127cba001b8} (cba001b8 127)
[17.664s][error][gc,verify ] GC(5) points to dead obj 0x00000127ca6521f0 in region 16:(O)[0x00000127ca600000,0x00000127ca700000,0x00000127ca700000]
[17.664s][error][gc,verify ] GC(5) java.lang.invoke.DirectMethodHandle
[17.664s][error][gc,verify ] GC(5) {0x00000127ca6521f0} - klass: 'java/lang/invoke/DirectMethodHandle'
[17.664s][error][gc,verify ] GC(5) - ---- fields (total size 7 words):
[17.664s][error][gc,verify ] GC(5) - private 'customizationCount' 'B' @12 0
[17.664s][error][gc,verify ] GC(5) - private volatile 'updateInProgress' 'Z' @13 false
[17.664s][error][gc,verify ] GC(5) - private final 'type' 'Ljava/lang/invoke/MethodType;' @16 a 'java/lang/invoke/MethodType'{0x00000127ca64b3c8} = (Ljava/lang/String;J)[B (ca64b3c8 127)
[17.664s][error][gc,verify ] GC(5) - final 'form' 'Ljava/lang/invoke/LambdaForm;' @24 a 'java/lang/invoke/LambdaForm'{0x00000127ca6372e8} => a 'java/lang/invoke/MemberName'{0x00000127ca637490} = {method} {0x0000000800cd01d8} 'invokeStatic' '(Ljava/lang/Object;Ljava/lang/Object;J)Ljava/lang/Object;' in 'java/lang/invoke/DirectMethodHandle$Holder' (ca6372e8 127)
[17.664s][error][gc,verify ] GC(5) - private 'asTypeCache' 'Ljava/lang/invoke/MethodHandle;' @32 NULL (0 0)
[17.664s][error][gc,verify ] GC(5) - private 'asTypeSoftCache' 'Ljava/lang/ref/SoftReference;' @40 NULL (0 0)
[17.664s][error][gc,verify ] GC(5) - final 'crackable' 'Z' @14 true
[17.664s][error][gc,verify ] GC(5) - final 'member' 'Ljava/lang/invoke/MemberName;' @48 a 'java/lang/invoke/MemberName'{0x00000127ca652228} = {method} {0x00000008004a20c8} 'newArrayWithSuffix' '(Ljava/lang/String;J)[B' in 'java/lang/StringConcatHelper' (ca652228 127)
It's always argL0 that's problematic, i.e. the reference to the DirectMethodHandle. The argL1 reference is always correct.
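To make the second kind of message easier to read, here is a toy model of the check behind "Field ... of live obj ... points to dead obj ...". It is only a sketch under simplifying assumptions and is not HotSpot's verification code: objects are plain Java objects with named reference fields, "live" means "marked", and the newly allocated object is simply treated as live, mirroring the eden object in the log. The class and method names (LivenessCheckSketch, mark, verify, demo) are made up for the illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LivenessCheckSketch {
    static final class Obj {
        final String name;
        final Map<String, Obj> fields = new LinkedHashMap<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Marks everything transitively reachable from the given roots.
    static void mark(Collection<Obj> roots) {
        Deque<Obj> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Obj o = stack.pop();
            if (o.marked) continue;
            o.marked = true;
            for (Obj target : o.fields.values()) {
                if (target != null) stack.push(target);
            }
        }
    }

    // Reports every field of a live (marked) object that refers to a dead
    // (unmarked) object, in the spirit of the log message above.
    static List<String> verify(Collection<Obj> heap) {
        List<String> errors = new ArrayList<>();
        for (Obj o : heap) {
            if (!o.marked) continue;
            for (Map.Entry<String, Obj> e : o.fields.entrySet()) {
                Obj target = e.getValue();
                if (target != null && !target.marked) {
                    errors.add("Field " + e.getKey() + " of live obj " + o.name
                            + " points to dead obj " + target.name);
                }
            }
        }
        return errors;
    }

    // Scenario: marking never sees the DirectMethodHandle, but a later
    // allocated object (treated as live) stores a reference to it in argL0.
    static List<String> demo() {
        Obj root = new Obj("root");
        Obj dmh = new Obj("DirectMethodHandle");   // reachable only via a reference marking does not scan
        Obj bytes = new Obj("bytes");
        root.fields.put("data", bytes);
        mark(List.of(root));

        Obj bmh = new Obj("BoundMethodHandle$Species_LL");
        bmh.marked = true;                         // newly allocated, so considered live
        bmh.fields.put("argL0", dmh);
        bmh.fields.put("argL1", bytes);
        return verify(List.of(root, dmh, bytes, bmh));
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this model only argL0 is reported while argL1 is fine, which matches the shape of the failure described above; it says nothing about why the real marking missed the object.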
28-06-2022

No relation to CDS: also occurs running with -Xshare:off.
28-06-2022

In the test there are no intermediate GCs between the concurrent start and the remark pause, which means that the marking itself is missing some references here.
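For readers less familiar with concurrent marking, the following toy model shows one generic way a marking cycle can miss a reference between its start and the remark: a mutator reads a reference, the field holding it is overwritten, and nothing records the old value. This is a simplified snapshot-at-the-beginning illustration and an assumption made for exposition only; it is not G1's implementation and not a diagnosis of this bug. All names (SatbSketch, store, remark, run) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

class SatbSketch {
    static final class Obj {
        final String name;
        Obj ref;            // single reference field, enough for the example
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    private final Deque<Obj> markStack = new ArrayDeque<>();
    private boolean markingActive;

    // Marks an object and queues it so that its field is scanned later.
    private void push(Obj o) {
        if (o != null && !o.marked) {
            o.marked = true;
            markStack.push(o);
        }
    }

    // Start of the cycle: only the roots are marked, tracing happens later.
    void concurrentStart(List<Obj> roots) {
        markingActive = true;
        roots.forEach(this::push);
    }

    // Mutator store. With the pre-barrier the overwritten value is recorded
    // for marking; without it the old referent can be lost.
    void store(Obj holder, Obj newValue, boolean withPreBarrier) {
        if (markingActive && withPreBarrier) {
            push(holder.ref);
        }
        holder.ref = newValue;
    }

    // End of the cycle: finish tracing everything that was queued.
    void remark() {
        while (!markStack.isEmpty()) {
            push(markStack.pop().ref);
        }
        markingActive = false;
    }

    // Returns whether the object still held by the mutator ends up marked.
    static boolean run(boolean withPreBarrier) {
        Obj root = new Obj("root");
        Obj a = new Obj("a");
        Obj b = new Obj("b");
        root.ref = a;
        a.ref = b;

        SatbSketch gc = new SatbSketch();
        gc.concurrentStart(List.of(root));
        Obj heldByMutator = a.ref;          // mutator keeps b somewhere marking does not scan
        gc.store(a, null, withPreBarrier);  // the only heap reference to b disappears
        gc.remark();
        return heldByMutator.marked;
    }

    public static void main(String[] args) {
        System.out.println("with pre-barrier:    b marked = " + run(true));
        System.out.println("without pre-barrier: b marked = " + run(false));
    }
}
```

With the barrier the object stays marked; without it the mutator is left holding a reference to an object the cycle considers dead, which is the kind of state the verification output in this thread complains about.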
28-06-2022

Only reproduces with G1 afaict, so adding label.
28-06-2022