JDK-8291456 : com/sun/jdi/ClassUnloadEventTest.java failed with: Wrong number of class unload events: expected 10 got 4
  • Type: Bug
  • Component: hotspot
  • Sub-Component: jvmti
  • Affected Version: 20
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2022-07-28
  • Updated: 2022-12-24
  • Resolved: 2022-10-20
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 17 JDK 20
17.0.6Fixed 20 b21Fixed
Related Reports
Relates :  
Relates :  
Relates :  
Sub Tasks
JDK-8291650 :  
Description
com/sun/jdi/ClassUnloadEventTest.java is a new test that was added by JDK-8256811.

java.lang.RuntimeException: Wrong number of class unload events: expected 10 got 4
	at ClassUnloadEventTest.runDebugger(ClassUnloadEventTest.java:166)
	at ClassUnloadEventTest.main(ClassUnloadEventTest.java:59)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:578)
	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
	at java.base/java.lang.Thread.run(Thread.java:1589)

This was with a linux-x64-debug build using the following flags: -XX:+CreateCoredumpOnCrash -XX:+UseParallelGC -XX:+UseNUMA

However, I don't think the flags are related to the failure. I see another failure in an ad-hoc job that only used the -server flag. It also was on linux-x64-debug.
Comments
Fix request [17u] on behalf of Andrew Dinn Required follow up of JDK-8256811.
28-10-2022

A pull request was submitted for review. URL: https://git.openjdk.org/jdk17u-dev/pull/832 Date: 2022-10-25 11:06:21 +0000
25-10-2022

Changeset: c5e04640 Author: Serguei Spitsyn <sspitsyn@openjdk.org> Date: 2022-10-20 01:18:39 +0000 URL: https://git.openjdk.org/jdk/commit/c5e0464098f8f7cd9c568c7b1c3a06139453eaab
20-10-2022

I don't know why but auto-resolve of this issue did not happen after the fix was integrated.
20-10-2022

This is changeset: https://github.com/openjdk/jdk/commit/c5e0464098f8f7cd9c568c7b1c3a06139453eaab
20-10-2022

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/10736 Date: 2022-10-18 01:20:15 +0000
18-10-2022

Will check if JVMTI can force flushing all pending ObjectFree events at the VM shutdown. I'm not sure yet it can be deadlock prone.
14-10-2022

The ObjectFree events are flushed by the SetEventNotificationMode when disabled. The ClassLoad events were initially implemented as normal events but then reimplemented as an extension. They are enabled and disabled with the JVMTI SetExtensionEventCallback. The SetExtensionEventCallback spec says: "Sets the callback function for an extension event and enables the event. Or, if the callback is NULL, disables the event. Note that unlike standard events, setting the callback and enabling the event are a single operation." We still can flush the ClassUnload events when they are disabled with the SetExtensionEventCallback.
14-10-2022

> The ObjectFree events are flushed by the SetEventNotificationMode when disabled. I'm not sure if this will help the test, or in general help the JDI user. The JVMTI ObjectFree events trigger the JDWP ClassUnload events. The JDI user expects to get all the ClassUnload events before the debuggee exits. We already know that doesn't happen unless the debuggee has a short delay before exiting (the current workaround in the test implemented by JDK-8291650). What you are suggesting is that the JDI user somehow trigger the JVMTI SetEventNotificationMode to disable ObjectFree events so they will be flushed. There is no way for the JDI user to do this. The ObjectFree events are always enabled so the debug agent can track loaded classes. Even if the JDI user disables the ClassUnloadRequest, ObjectFree events will still be enabled by the debug agent.
13-10-2022

Oh, that's right. It's ObjectFree events that we need to make sure get flushed since they trigger the JDWP CLASS_UNLOAD events. Is ClassUnload actually an extension event, or just something with remnants in the JVMTI implementation, but not actually functional.
13-10-2022

Forgot that the ClassUnload is an extension event. In fact, the extension events implementation is very confusing. I'm even not sure if SetEventNotificationMode can really enable these events. Extension events enabling/disabling is triggered by the SetExtensionEventCallback. We may have useless calls to SetEventNotificationMode to enable extension events in our vthread tests.
13-10-2022

Agreed, it is a good suggestion to document it in the JVMTI spec. The only events that need this kind of flushing are those that are not posted on the current thread.
13-10-2022

I'm thinking if disabling the ClassUnload events with the JVMTI SetEventNotificationMode should flush the pending ClassUnload events. This approach has to be less dead lock prone than flushing the ClassUnload events in the JVMTI VMDeath event. Then the test will need to explicitly disable ClassUnload events with the JVMTI SetEventNotificationMode function. There will be a symmetry between flushing ObjectFree and ClassUnload events as pending events for both event types will be flushed with the SetEventNotificationMode. Does this looks reasonable?
13-10-2022

I think so. We need to make sure is documented in the JVMTI spec.
13-10-2022

Since JDK-8291456 added some useful debuggee logging to this test, I decided to remove the delay that was added to the debuggee to work around the problem and see what the logging output showed. It was as expected based on previous conclusions of the root cause. It shows all 10 of the classes being unloaded, but in this case only 5 classes have ClassUnloadEvents generated for them. ----------System.out:(25/1931)---------- Running debugger EventSet: event set, policy:2, count:1 = {VMStartEvent in thread main} EventSet: event set, policy:0, count:2 = {ClassUnloadEvent, ClassUnloadEvent} EventSet: event set, policy:0, count:1 = {ClassUnloadEvent} EventSet: event set, policy:0, count:1 = {ClassUnloadEvent} EventSet: event set, policy:0, count:1 = {ClassUnloadEvent} EventSet: event set, policy:0, count:2 = {ClassUnloadEvent, ClassUnloadEvent} EventSet: event set, policy:0, count:1 = {VMDeathEvent} stdout: [0.013s][info][gc] Using G1 stdout: Running debuggee stdout: [0.432s][info][class,unload] unloading class SampleClass__9 0x0000000801003c30 stdout: [0.432s][info][class,unload] unloading class SampleClass__8 0x0000000801003a28 stdout: [0.432s][info][class,unload] unloading class SampleClass__7 0x0000000801003820 stdout: [0.432s][info][class,unload] unloading class SampleClass__6 0x0000000801003618 stdout: [0.432s][info][class,unload] unloading class SampleClass__5 0x0000000801003410 stdout: [0.432s][info][class,unload] unloading class SampleClass__Alt__4 0x0000000801003208 stdout: [0.432s][info][class,unload] unloading class SampleClass__Alt__3 0x0000000801003000 stdout: [0.432s][info][class,unload] unloading class SampleClass__Alt__2 0x0000000801002c10 stdout: [0.432s][info][class,unload] unloading class SampleClass__Alt__1 0x0000000801002a08 stdout: [0.432s][info][class,unload] unloading class SampleClass__Alt__0 0x0000000801002800 stdout: [0.432s][info][class,unload] unloading class java.lang.invoke.LambdaForm$MH/0x0000000801001c00 0x0000000801001c00 stdout: [0.432s][info][class,unload] unloading class java.lang.invoke.LambdaForm$MH/0x0000000801001800 0x0000000801001800 stdout: [0.432s][info][class,unload] unloading class java.lang.invoke.LambdaForm$MH/0x0000000801001000 0x0000000801001000 stdout: [0.506s][info][gc ] GC(0) Pause Full (WhiteBox Initiated Full GC) 6M->4M(40M) 88.199ms stdout: Exiting debuggee ----------System.err:(12/800)---------- java.lang.RuntimeException: Wrong number of class unload events: expected 10 got 5 at ClassUnloadEventTest.runDebugger(ClassUnloadEventTest.java:191) at ClassUnloadEventTest.main(ClassUnloadEventTest.java:64) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) at java.base/java.lang.reflect.Method.invoke(Method.java:578) at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) at java.base/java.lang.Thread.run(Thread.java:1589)
25-08-2022

The following failures: > java.lang.RuntimeException: Wrong number of class unload events: expected 10 got 0 Appear to be a different issue. Note the "expected 10 got 0" instead of "expected 10 got 4", and always occurring on windows. Please see JDK-8292879.
24-08-2022

The delay at the end of the test I added in JDK-8256811 does seem to have quieted down the failures quite a bit, but I see we are still seeing them. Seems they are now "expected 10 got 0" instead of "expected 10 got 4". Also, both are tier6 with -Xcomp on Windows. Possibly we just need a longer delay at the end of the test due to the -Xcomp, although the fact that we got 0 ClassUnloadEvents make me suspicious that unloading of the classes is just not being triggered.
18-08-2022

Assigned to myself for more analysis in 20. Some investigation can be needed.
10-08-2022

[~sspitsyn] JVMTI should be generating ObjectFree events for each unloaded classes. Each of these ObjectFree events causes the debug agent to immediately synthesize and enqueue a ClassUnload event. If SUSPEND_ALL or SUSPEND_EVENT_THREAD are used (which the test is using), the debug agent will not return to JVMTI until after the event has been delivered and the debugger has resumed all threads. So there should never be any ClassUnload events for the debug agent to flush. When the VMDeath comes in (on a different thread) the debug agent disables all callbacks, so no more ObjectFree events will arrive. Note however that before processing the VMDeath, cbVMDeath first waits for any pending callbacks to complete. So all "in process" ClassUnload events will be sent and their thread resumed before the VMDeath shuts down any further event processing by the debug agent. Note this is what Zyngu said: > Jvmti posts VMDeathEvent [1] event much earlier than I expected during JVM exits. Upon receiving VMDeathEvent, event handler discards subsequent events. So the subsequent events would be ObjectFree events that arrive after the VMDeathEvent (or more accurately, would have arrived if not for the debug agent disabling callbacks due to the VMDeathEvent) So I believe the issue is that when the app exits, JVMTI is still in the process of posting ObjectFree events for each unloaded class. This gets shut down due to the VMDeathEvent, leaving some ObjectFree events never being posted (because the debug agent has disabled callbacks). So the solution would be for JVMTI to defer posting the VMDeath until after it is done processing the unloaded classes and posting ObjectFree events for all of them. I don't see how to solve this in the debug agent.
05-08-2022

[~cjplummer] wrote: > It sounds like this is a test bug. The debugger and debuggee need some sort of synchronization > so the debuggee doesn't exit until the debugger has received the expected events. > However, I don't think the debugger really knows when it is done, so it relies on the VMThreadDeath event. > Perhaps we should just add a short delay in the debuggee before it exits. There is a way for debug agent to flush ObjectFree events. I'm not sure it will result in flushing ClassUnload events though. It is possible by disabling the ObjectFree events with the JVMTI SetEventNotificationMode in the VMThreadDeath event handler. There is a risk however that this can potentially cause some deadlocks, so that this needs to be done carefully.
05-08-2022

About the assert mentioned by Chris above: # assert(env->is_enabled(JVMTI_EVENT_OBJECT_FREE)) failed: checking The function JvmtiTagMap::flush_object_free_events() checks for the `is_enabled(JVMTI_EVENT_OBJECT_FREE)` condition before call to the `remove_and_post_dead_objects()`: void JvmtiTagMap::flush_object_free_events() { assert_not_at_safepoint(); if (env()->is_enabled(JVMTI_EVENT_OBJECT_FREE)) { { MonitorLocker ml(lock(), Mutex::_no_safepoint_check_flag); // If another thread is posting events, let it finish while (_posting_events) { ml.wait(); } if (!_needs_cleaning || is_empty()) { _needs_cleaning = false; return; } _posting_events = true; } // Drop the lock so we can do the cleaning on the VM thread. // Needs both cleaning and event posting (up to some other thread // getting there first after we dropped the lock). remove_and_post_dead_objects(); However, it is not protected by any lock, so it is racy with the JVMTI `SetEventNotificationMode()`. So that the `is_enabled(JVMTI_EVENT_OBJECT_FREE)` condition can be disabled before the `JvmtiExport::post_object_free()` is called. I think, this assert should be either removed or replaced with the double check of this condition.
04-08-2022

At the moment, I agree with Chris that the fix of JDK-8256811 is still valuable. I kind of expected some regressions because it was hard to ensure there are no new intermittent test failures. So, it is okay to address this issue separately. Please, note we are still in the early release stage. I'm still thinking about the best way to address this.
03-08-2022

Not M1. X64. Your changes for JDK-8256811 are fixing the issue of delayed ClassUnload events, meaning ClassUnload events that won't arrive until some other event is processed, or even worse won't arrive until after the next GC and another event is processed. I think you are pointing out that your fix does not guarantee all ClassUnload events will be delivered before VMDeath. I'm not so sure this is a bug, and even if it is, that doesn't mean there isn't value in the JDK-8256811 fix as it currently stands (although the asserts need to be fixed, and at the very least we need to modify this test so it doesn't fail).
29-07-2022

Are they M1 machines? could be the problem of unsynchronized setting/getting the flag? [~cjplummer] [~sspitsyn] I am pretty sure that original reported failure is *not* a test bug, as JVM shutdown can race ServiceThread to post VMDeadEvent, so JDK-8256811 did not fix missing events. Given the incomplete fix and a couple of issues seen in tests, e.g. JDK-8290908, probably should consider backout JDK-8256811 for now and reconsider the solution.
29-07-2022

I'm not sure if this is related at all, but in someone's adhoc job and a CI job (both on macosx) I see the following assert in the debuggee, causing this test to fail: # Internal Error (/System/Volumes/Data/mesos/work_dir/slaves/0c72054a-24ab-4dbb-944f-97f9341a1b96-S26308/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/616729ac-37b4-4df2-a696-d00537a3d62a/runs/7a6bcbc7-a6e2-4c6e-b352-8e64a14c814c/workspace/open/src/hotspot/share/prims/jvmtiExport.cpp:1700), pid=64461, tid=42499 # assert(env->is_enabled(JVMTI_EVENT_OBJECT_FREE)) failed: checking Current thread (0x00007f86bc81ba10): JavaThread "Service Thread" daemon [_thread_in_vm, id=42499, stack(0x000070000c131000,0x000070000c231000)] Stack: [0x000070000c131000,0x000070000c231000], sp=0x000070000c230820, free space=1022k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.dylib+0x12b9169] VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x6e9 V [libjvm.dylib+0x12b97eb] VMError::report_and_die(Thread*, void*, char const*, int, char const*, char const*, __va_list_tag*)+0x3b V [libjvm.dylib+0x6c841d] report_vm_error(char const*, int, char const*, char const*, ...)+0xdd V [libjvm.dylib+0xccd81f] JvmtiExport::post_object_free(JvmtiEnv*, GrowableArray<long>*)+0x17f V [libjvm.dylib+0xcf7be2] JvmtiTagMap::post_dead_objects(GrowableArray<long>*)+0x92 V [libjvm.dylib+0xcf8123] JvmtiTagMap::remove_and_post_dead_objects()+0x103 V [libjvm.dylib+0xcf837f] JvmtiTagMap::flush_object_free_events()+0x16f V [libjvm.dylib+0xcfaeaa] JvmtiTagMap::flush_all_object_free_events()+0x12a V [libjvm.dylib+0x10af97d] ServiceThread::service_thread_entry(JavaThread*, JavaThread*)+0x57d V [libjvm.dylib+0xa247e5] JavaThread::thread_main_inner()+0x235 V [libjvm.dylib+0x1209f07] Thread::call_run()+0x177 V [libjvm.dylib+0xfafbe0] thread_native_entry(Thread*)+0x150 C [libsystem_pthread.dylib+0x6109] _pthread_start+0x94 C [libsystem_pthread.dylib+0x1b8b] thread_start+0xf
29-07-2022

It sounds like this is a test bug. The debugger and debuggee need some sort of synchronization so the debuggee doesn't exit until the debugger has received the expected events. However, I don't think the debugger really knows when it is done, so it relies on the VMThreadDeath event. Perhaps we should just add a short delay in the debuggee before it exits.
29-07-2022

After digging a bit, I believe JDK-8256811 did not fix missing events completely. Jvmti posts VMDeathEvent [1] event much earlier than I expected during JVM exits. Upon receiving VMDeathEvent, event handler discards subsequent events. I wonder why VMDeathEvent has to be delivered so early, can it wait until all JavaThreads reach vm_exit safepoint? or after ServiceThread exits (similar to WatcherThread and StringDedupThread, etc.) [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/java.cpp#L499
29-07-2022

I see we have a 2nd CI failure with the same flags, but I've been unable to reproduce it. If done 700 runs of just the one test, and 240 runs of all of :jdk_jdi, which is how the test task the reproduces the two failures was run. I have made changes to run the debuggee with -Xlog:class+unload, but since it's not failing, it's not really helping to track down the problem. If we continue to see CI failures, perhaps I'll push that change.
29-07-2022

I were not able to reproduce locally (AMD Ryzen 7 5800X, 8 cores with Ubuntu). I wonder we should add "-Xlog:class+unload" to debuggee's launch options, that should help us to diagnose if the failure is due to class unloading or event posting. arguments.get("options").setValue("--add-exports java.base/jdk.internal.org.objectweb.asm=ALL-UNNAMED -Xbootclasspath/a:. -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI -Xlog:class+unload"); Also, could you attach ClassUnloadEventTest.jtr file. Thanks
28-07-2022