JDK-8278602 : CDS dynamic dump may access unloaded classes
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 18,19
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: os_x
  • CPU: aarch64
  • Submitted: 2021-12-13
  • Updated: 2022-01-04
  • Resolved: 2022-01-04
JDK 19: 19 (build master), Fixed
Description
I am still seeing the assertion "assert(ZAddress::is_marked(addr)) failed: Should be marked" in current jdk/jdk, this time on macOS (darwinaarch64). It appeared in the test runtime/cds/appcds/loaderConstraints/DynamicLoaderConstraintsTest.java#custom-cl-zgc

# Internal Error (/openjdk/nb/darwinaarch64/jdk-dev/src/hotspot/share/gc/z/zBarrier.cpp:41), pid=84433, tid=42755
# assert(ZAddress::is_marked(addr)) failed: Should be marked

Current thread (0x0000000125efafd0): VMThread "VM Thread" [stack: 0x000000016f3f4000,0x000000016f5f7000] [id=42755]

Stack: [0x000000016f3f4000,0x000000016f5f7000], sp=0x000000016f5f6520, free space=2057k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.dylib+0x115135c] VMError::report_and_die(int, char const*, char const*, char*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x5d4
V [libjvm.dylib+0x1151a9c] VMError::report_and_die(Thread*, void*, char const*, int, char const*, char const*, char*)+0x40
V [libjvm.dylib+0x5a245c] report_vm_error(char const*, int, char const*, char const*, ...)+0x80
V [libjvm.dylib+0x11aa71c] unsigned long ZBarrier::mark<false, true, false, true>(unsigned long)+0x98
V [libjvm.dylib+0x81fac0] oop ZBarrier::barrier<&(ZBarrier::is_good_or_null_fast_path(unsigned long)), &(ZBarrier::load_barrier_on_oop_slow_path(unsigned long))>(oop volatile*, oop)+0xa0
V [libjvm.dylib+0x996eb4] AccessInternal::PostRuntimeDispatch<ZBarrierSet::AccessBarrier<548932ull, ZBarrierSet>, (AccessInternal::BarrierType)2, 548932ull>::oop_access_barrier(void*)+0x94
V [libjvm.dylib+0x84fb2c] InstanceKlass::signers() const+0x34
V [libjvm.dylib+0x1073b0c] SystemDictionaryShared::check_for_exclusion_impl(InstanceKlass*)+0x28c
V [libjvm.dylib+0x1073834] SystemDictionaryShared::check_for_exclusion(InstanceKlass*, DumpTimeClassInfo*)+0x188
V [libjvm.dylib+0x1075b8c] SystemDictionaryShared::check_excluded_classes()+0x250
V [libjvm.dylib+0x668bd0] DynamicArchiveBuilder::doit()+0xd8
V [libjvm.dylib+0x668a64] VM_PopulateDynamicDumpSharedSpace::doit()+0xac
V [libjvm.dylib+0x1157b40] VM_Operation::evaluate()+0x104
V [libjvm.dylib+0x1173880] VMThread::evaluate_operation(VM_Operation*)+0x12c
V [libjvm.dylib+0x1174440] VMThread::inner_execute(VM_Operation*)+0x33c
V [libjvm.dylib+0x1173558] VMThread::loop()+0xb4
V [libjvm.dylib+0x1173340] VMThread::run()+0xc0
V [libjvm.dylib+0x10a5fd0] Thread::call_run()+0x21c
V [libjvm.dylib+0xdaa074] thread_native_entry(Thread*)+0x160
C [libsystem_pthread.dylib+0x7878] _pthread_start+0x140
Comments
Changeset: 09cf5f19
Author: Ioi Lam <iklam@openjdk.org>
Date: 2022-01-04 04:52:49 +0000
URL: https://git.openjdk.java.net/jdk/commit/09cf5f19d76b17790ffb899aad247f821a27d46b
04-01-2022

> I think Coleen meant "you can think of the mirror as the 'loader' in this special case." ?

Not sure of the distinction here, but for non-strong hidden classes the mirror == loader, or more accurately mirror == _holder_:

    // Unloading support
    bool ClassLoaderData::is_alive() const {
      bool alive = keep_alive()         // null class loader and incomplete non-strong hidden class.
          || (_holder.peek() != NULL);  // and not cleaned by the GC weak handle processing.
      return alive;
    }

If the holder (class loader or mirror) reference has been cleared by GC, it is no longer alive and this ClassLoaderData can be unloaded. The criteria for clearing the holder differ per GC, but effectively, if there are no references outside of the ClassLoaderData object itself, and GC has reached the phase where it has cleared WeakHandles in OopStorage, the ClassLoaderData is dead and can be unloaded.

Does this help?
17-12-2021
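
The liveness rule in ClassLoaderData::is_alive() quoted in the comment above can be modeled in a few lines of Java. This is only a sketch and not HotSpot code: LoaderDataModel, clearHolderForTest and HolderLivenessDemo are hypothetical names, a java.lang.ref.WeakReference stands in for the _holder WeakHandle, and clearing it by hand stands in for GC weak handle processing.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

// Hypothetical model of the is_alive() rule; all names here are illustrative.
final class LoaderDataModel {
    private final boolean keepAlive;            // models keep_alive()
    private final WeakReference<Object> holder; // models the _holder WeakHandle

    LoaderDataModel(Object holder, boolean keepAlive) {
        this.holder = new WeakReference<>(holder);
        this.keepAlive = keepAlive;
    }

    // Same shape as the quoted C++: alive if pinned, or if the holder is not cleared.
    boolean isAlive() {
        return keepAlive || holder.get() != null;
    }

    // Stands in for the GC clearing the weak holder (assumption for the sketch).
    void clearHolderForTest() {
        holder.clear();
    }
}

final class HolderLivenessDemo {
    public static void main(String[] args) {
        Object mirror = new Object(); // plays the role of the hidden class mirror
        LoaderDataModel cld = new LoaderDataModel(mirror, false);
        System.out.println("before clear: " + cld.isAlive());
        Reference.reachabilityFence(mirror); // keep the holder strongly reachable up to here
        cld.clearHolderForTest();
        System.out.println("after clear:  " + cld.isAlive());
    }
}
```

In this model the holder can be either a class loader object or a mirror object. The loader data does not care which, and that is the point made about non-strong hidden classes.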

> You can think of the 'loader' as the mirror for this special case.

I think Coleen meant "you can think of the mirror as the 'loader' in this special case." ?

But I'm still unclear on the actual process here. What marks a class_loader_data as not alive and based on what criteria? And when does that happen in relation to the unloading of the class? Which is the chicken and which is the egg? :)
17-12-2021

> That's true for regular classes but non-STRONG hidden classes can be unloaded independent of their classloader.

David, I think you are confused by the terminology. That's the reason I think Klass::is_loader_alive() should be renamed to something more precise.

ClassLoaderData is not the same as java.lang.ClassLoader, and there's no one-to-one relationship between them. Consider this test:

========
import java.lang.invoke.MethodType;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup;
import static java.lang.invoke.MethodHandles.Lookup.ClassOption.*;

public class Foo {
    public static void main(String args[]) throws Throwable {
        System.out.println("Foo.getClassLoader() = " + Foo.class.getClassLoader());
        String resname = "FooHidden.class";
        byte[] classdata = Foo.class.getClassLoader().getResourceAsStream(resname).readAllBytes();
        Lookup lookup = MethodHandles.lookup();
        Class<?> cl = lookup.defineHiddenClass(classdata, false, NESTMATE).lookupClass();
        cl.newInstance();
    }
}

class FooHidden {
    public FooHidden() {
        System.out.println("FooHidden.getClassLoader() = " + FooHidden.class.getClassLoader());
    }
}
========
$ java -cp . -Xlog:class+load=debug::none Foo
Foo source: file:/test/ klass: 0x0000000800c00800 super: 0x0000000800000d08 loader: [loader data: 0x00007efdc4147c70 for instance a 'jdk/internal/loader/ClassLoaders$AppClassLoader'{0x00000007ff756d40}] bytes: 1686 checksum: 574fe204
Foo.getClassLoader() = jdk.internal.loader.ClassLoaders$AppClassLoader@46fbb2c1
FooHidden/0x0000000800c01000 source: Foo klass: 0x0000000800c01000 super: 0x0000000800000d08 loader: [loader data: 0x00007efdc415b0b0 for instance a 'jdk/internal/loader/ClassLoaders$AppClassLoader'{0x00000007ff756d40} has a
FooHidden.getClassLoader() = jdk.internal.loader.ClassLoaders$AppClassLoader@46fbb2c1
========

Both classes share the same java.lang.ClassLoader instance. However, their ClassLoaderDatas are different (0x00007efdc4147c70 vs 0x00007efdc415b0b0).

When the hidden class is unloaded, its ClassLoaderData (0x00007efdc415b0b0) is no longer alive. That's the condition tested by my patch.
16-12-2021

> That's true for regular classes but non-STRONG hidden classes can be unloaded independent of their classloader.

is_loader_alive() handles this because it looks at the class "holder", which is either the class loader OR the mirror of the non-strong hidden classes. This name purposely does not expose this implementation detail. You can think of the 'loader' as the mirror for this special case.
16-12-2021

> As far as I know, a class can become unloaded (e.g., Klass::java_mirror() no longer works) only sometime after class_loader_data()->is_alive() becomes false.

That's true for regular classes but non-STRONG hidden classes can be unloaded independent of their classloader.
16-12-2021

As far as I know, a class can become unloaded (e.g., Klass::java_mirror() no longer works) only sometime after class_loader_data()->is_alive() becomes false.

I believe the current patch is safe because, for any class K seen by DumpTimeSharedClassTable::IterationHelper::do_entry():

- The iteration loop holds the DumpTimeTable_lock. Before an InstanceKlass is deallocated, it must grab the DumpTimeTable_lock and remove itself from the _dumptime_table. Therefore, if we can get K from the _dumptime_table, we know that K is still valid.
- We also know that K->class_loader_data() is valid -- it will remain valid until all of its classes have been deallocated. But K has not been deallocated yet.
- The iteration happens only inside a safepoint, where ClassLoaderData::is_alive() will never transition from true to false, for any ClassLoaderData (according to [~coleenp] and [~stefank]).

Therefore, we know that if K->class_loader_data()->is_alive() is true, this class has not been unloaded, and will not be unloaded as long as we are in the safepoint (i.e., VM_PopulateDynamicDumpSharedSpace::doit()).
16-12-2021
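
The safety argument in the comment above can be illustrated with a small Java model. This is a sketch under stated assumptions, not the actual patch: DumpTimeTableModel and its methods are hypothetical names, a Java monitor stands in for DumpTimeTable_lock, and a BooleanSupplier stands in for K->class_loader_data()->is_alive(). The safepoint itself is not modeled. The sketch simply assumes the liveness answer cannot change during the walk.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical model of the dump-time table walk; not HotSpot code.
final class DumpTimeTableModel {
    private final Object lock = new Object(); // stands in for DumpTimeTable_lock
    private final Map<String, BooleanSupplier> table = new LinkedHashMap<>();

    // Register a class together with a liveness check for its loader data.
    void add(String klassName, BooleanSupplier loaderAlive) {
        synchronized (lock) {
            table.put(klassName, loaderAlive);
        }
    }

    // A class about to be deallocated must take the lock and remove itself first,
    // so anything still found in the table is a valid entry.
    void removeBeforeDeallocation(String klassName) {
        synchronized (lock) {
            table.remove(klassName);
        }
    }

    // Walk the table under the lock and keep only classes whose loader data is alive.
    List<String> classesSafeToInspect() {
        List<String> result = new ArrayList<>();
        synchronized (lock) {
            for (Map.Entry<String, BooleanSupplier> e : table.entrySet()) {
                if (e.getValue().getAsBoolean()) {
                    result.add(e.getKey());
                }
            }
        }
        return result;
    }
}
```

For example, after adding "Foo" with a live loader data and "FooHidden" with a dead one, classesSafeToInspect() returns only "Foo". The walk never touches the unloaded hidden class.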

That is confusing. So `is_loader_alive()` has nothing to do with the loader at all? At what point during class unloading does an actual klass instance become unusable? Are we guaranteed that if we find the klass instance then we can ask it `is_loader_alive()` and be certain that that state can't change? (I see you've put in assertions to catch that).
16-12-2021

Reproducer:

(1) Download LotsUnload.java and Makefile from the attachments

(2) Apply the following patch to JDK to increase the likelihood of the crash

diff --git a/src/hotspot/share/gc/z/zDriver.cpp b/src/hotspot/share/gc/z/zDriver.cpp
index 30fdeb6804f..18b5a013399 100644
--- a/src/hotspot/share/gc/z/zDriver.cpp
+++ b/src/hotspot/share/gc/z/zDriver.cpp
@@ -457,6 +457,8 @@ void ZDriver::gc(const ZDriverRequest& request) {
     concurrent(mark_continue);
   }
+  os::naked_sleep(NewCodeParameter);
+
   // Phase 4: Concurrent Mark Free
   concurrent(mark_free);

(3) make run (read the Makefile for details)

(4) Run a command like the following. You may need to adjust the

i=0; while /my/build/fastdebug/images/jdk/bin/java -cp LotsUnload.jar -XX:ArchiveClassesAtExit=dyn.jsa -Xmx64m -Xms32m -XX:+UseZGC -XX:SharedArchiveFile=ZGCBaseArchive.jsa -XX:NewCodeParameter=90 LotsUnload 4 150; do i=$(expr $i + 1); echo -n "$i "; done

For me, I got a crash in less than a minute.

======

I marked the bug as noreg-hard. The test case requires a patch in ZGC to add arbitrary delays. Therefore, it's not suitable to be integrated into the regression test suite.

I am unable to write a reproducer for an unpatched JVM. [~stefank] also tried manipulating the ZGC command-line flags but was unsuccessful.
16-12-2021

It looks like the fix in JDK-8277998 was not effective. The class that caused the crash was probably a hidden class that got unloaded while we were in the middle of the VM_PopulateDynamicDumpSharedSpace operation.

I can't reproduce the problem even with modifying the VM code. However, I think we can do this to make the CDS code safe from class unloading:

*** EDIT - the following proposal is withdrawn. Please see the PR for the actual fix https://github.com/openjdk/jdk/pull/6859 ***

Proposed fix:

1. When we start to dump a CDS archive, mark all currently alive class loaders. Use OopHandles to keep these loaders alive during the entire CDS dumping process.
2. When walking the SystemDictionaryShared::_dumptime_table, skip all classes whose loader is not marked in step 1.
3. Clear the OopHandles after the CDS dump is finished.

As a result, whenever we look at a class during the CDS dump process, we can be guaranteed that this class is not unloaded.
16-12-2021

[~dholmes] In my latest version, we no longer mark the class loader oop. Instead, we call Klass::is_loader_alive() to check if a class has been unloaded. See https://github.com/openjdk/jdk/pull/6859

=======

BTW, this function probably should be renamed in a separate RFE. It checks if the class_loader_data() is alive. If a non-strong hidden class is unloaded, Klass::is_loader_alive() will report false.

    inline bool Klass::is_loader_alive() const {
      return class_loader_data()->is_alive();
    }
16-12-2021

[~iklam] Marking the classloader as alive won't help for hidden classes that are not created as STRONG as they do not depend on their classloader for liveness.
16-12-2021

ILW = HLM = P3
14-12-2021

Sorry, I have no core file.

Regarding
> The class that caused the crash was probably a hidden class that got unloaded when we are in the middle of the VM_PopulateDynamicDumpSharedSpace operation

Indeed, I see in the hs_err:

VM_Operation (0x000000016db5a788): PopulateDumpSharedSpace, mode: safepoint, requested by thread 0x000000012700e020

A couple of class load/unload events are logged as well (not sure if those hidden classes would be logged?).

Classes loaded (20 events):
Event: 0.211 Loading class sun/net/www/protocol/jrt/Handler
Event: 0.211 Loading class sun/net/www/protocol/jrt/Handler done
Event: 0.216 Loading class sun/net/www/protocol/jrt/JavaRuntimeURLConnection
Event: 0.216 Loading class sun/net/www/protocol/jrt/JavaRuntimeURLConnection done
Event: 0.217 Loading class sun/net/www/protocol/jrt/JavaRuntimeURLConnection$1
Event: 0.217 Loading class sun/net/www/protocol/jrt/JavaRuntimeURLConnection$1 done
Event: 0.219 Loading class jdk/internal/jimage/ImageBufferCache
Event: 0.219 Loading class jdk/internal/jimage/ImageBufferCache done
Event: 0.219 Loading class jdk/internal/jimage/ImageBufferCache$1
Event: 0.219 Loading class jdk/internal/jimage/ImageBufferCache$1 done
Event: 0.219 Loading class jdk/internal/jimage/ImageBufferCache$2
Event: 0.219 Loading class jdk/internal/jimage/ImageBufferCache$2 done
Event: 0.219 Loading class java/util/AbstractMap$SimpleEntry
Event: 0.219 Loading class java/util/AbstractMap$SimpleEntry done
Event: 0.296 Loading class java/lang/Throwable$WrappedPrintStream
Event: 0.296 Loading class java/lang/Throwable$PrintStreamOrWriter
Event: 0.296 Loading class java/lang/Throwable$PrintStreamOrWriter done
Event: 0.296 Loading class java/lang/Throwable$WrappedPrintStream done
Event: 0.296 Loading class java/lang/StackTraceElement$HashedModules
Event: 0.296 Loading class java/lang/StackTraceElement$HashedModules done

Classes unloaded (4 events):
Event: 0.381 Thread 0x000000014600ea20 Unloading class 0x0000000800c05800 'jdk/test/lib/Asserts'
Event: 0.381 Thread 0x000000014600ea20 Unloading class 0x0000000800c01cd8 'MyHttpHandler'
Event: 0.381 Thread 0x000000014600ea20 Unloading class 0x0000000800c01a00 'com/sun/net/httpserver/HttpExchange'
Event: 0.381 Thread 0x000000014600ea20 Unloading class 0x0000000800c01800 'LoaderConstraintsApp'

Classes redefined (0 events):
No events
14-12-2021

[~mbaesken] do you have a core file that shows what the offending class is?
13-12-2021

A similar issue was discussed here: https://bugs.openjdk.java.net/browse/JDK-8277998
13-12-2021