JDK-8371889 : JFR: Deadlock in ThrowableTracer
  • Type: Bug
  • Component: hotspot
  • Sub-Component: jfr
  • Affected Version: 17,21
  • Priority: P3
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2025-11-14
  • Updated: 2025-11-24
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other
tbdUnresolved
Related Reports
Relates :  
Description
We've seen a deadlock that involves ThrowableTracer, and it looks like this:

There is one thread that is holding the class-lock for jdk.jfr.internal.Utils because it is trying to register an Event class, and is then triggering a ClassNotFoundException (which looks like it might be an expected situation - that CNFE is caught and handled in the call path), which triggers ThrowableTracer, which in turn is calling into Handlers which then wants to init that class and waits to acquire the class-init lock for jdk.jfr.events.Handlers.

There is another thread, which is also throwing an exception and triggering ThrowableTracer, which is also calling into handlers and successfully acquired the class-init lock for that class, but then waits to acquire the class-lock for Utils, which the other thread holds.

I am trying to build a reproducer, but so far I failed because I could not get the thread that registers the Event to throw a ClassNotFoundException. I think for the other thread is is irrelevant what exactly it does, as long as it is throwing Exceptions. My attempt so far:

class TestJFRDeadLock {

    @Name("Test")
    @Label("Test")
    public static class TestEvent extends Event {
        static {
            System.out.println("static block");
        }
        private static final EventType TYPE = EventType.getEventType(TestEvent.class);
    }
    public static void main(String[] args) {
        new Thread(() -> {
            while (true) {
                try {
                    throw new Exception("test");
                } catch (Exception e) {
                    System.out.println("caught exception");
                }
            }
        }).start();
        System.out.println(new TestEvent().getClass().getName());
        System.exit(0);
    }
}

Run with: java -XX:StartFlightRecording:exceptions=all,filename=test.jfr TestJFRDeadLock

This bug happened with JDK17. I believe it is likely fixed (unintentionally) by   https://bugs.openjdk.org/browse/JDK-8319374 because it removed much of the locking that is involved in this deadlock.

 Those are the stacktraces of the relevant threads:

"dd-task-scheduler" #11 daemon prio=5 os_prio=0 cpu=423.02ms elapsed=76350.72s tid=0x00007ec274be3340 nid=0x103 in Object.wait()  [0x00007ec2353fc000]
   java.lang.Thread.State: RUNNABLE
	at jdk.jfr.internal.instrument.ThrowableTracer.traceThrowable(jdk.jfr@17.0.16/ThrowableTracer.java:55)
	- waiting on the Class initialization monitor for jdk.jfr.events.Handlers
	at java.lang.Throwable.<init>(java.base@17.0.16/Throwable.java:296)
	at java.lang.Exception.<init>(java.base@17.0.16/Exception.java:85)
	at java.lang.ReflectiveOperationException.<init>(java.base@17.0.16/ReflectiveOperationException.java:76)
	at java.lang.ClassNotFoundException.<init>(java.base@17.0.16/ClassNotFoundException.java:71)
	at java.lang.Class.forName0(java.base@17.0.16/Native Method)
	at java.lang.Class.forName(java.base@17.0.16/Class.java:467)
	at jdk.jfr.internal.MetadataRepository.makeHandler(jdk.jfr@17.0.16/MetadataRepository.java:187)
	at jdk.jfr.internal.MetadataRepository.register(jdk.jfr@17.0.16/MetadataRepository.java:142)
	- locked <0x0000000403792e18> (a jdk.jfr.internal.MetadataRepository)
	at jdk.jfr.internal.MetadataRepository.register(jdk.jfr@17.0.16/MetadataRepository.java:130)
	- locked <0x0000000403792e18> (a jdk.jfr.internal.MetadataRepository)
	at jdk.jfr.FlightRecorder.register(jdk.jfr@17.0.16/FlightRecorder.java:136)
	at com.datadog.profiling.controller.openjdk.events.SmapEntryEvent.<clinit>(SmapEntryEvent.java)
	at jdk.jfr.internal.JVM.getHandler(jdk.jfr@17.0.16/Native Method)
	at jdk.jfr.internal.Utils.getHandler(jdk.jfr@17.0.16/Utils.java:451)
	- locked <0x0000000403739760> (a java.lang.Class for jdk.jfr.internal.Utils)
	at jdk.jfr.internal.MetadataRepository.getHandler(jdk.jfr@17.0.16/MetadataRepository.java:174)
	at jdk.jfr.internal.MetadataRepository.getEventType(jdk.jfr@17.0.16/MetadataRepository.java:114)
	- locked <0x0000000403792e18> (a jdk.jfr.internal.MetadataRepositorywai)
	at jdk.jfr.EventType.getEventType(jdk.jfr@17.0.16/EventType.java:204)
	at com.datadog.profiling.controller.openjdk.events.SmapEntryFactory.<clinit>(SmapEntryFactory.java:31)
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@17.0.16/Native Method)
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@17.0.16/NativeMethodAccessorImpl.java:77)
	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@17.0.16/DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(java.base@17.0.16/Method.java:569)
	at datadog.trace.bootstrap.Agent.registerSmapEntryEvent(Agent.java:893)
	- locked <0x0000000402617e58> (a java.lang.Class for datadog.trace.bootstrap.Agent)
	at datadog.trace.bootstrap.Agent.startJmx(Agent.java:843)
	- locked <0x0000000402617e58> (a java.lang.Class for datadog.trace.bootstrap.Agent)
	at datadog.trace.bootstrap.Agent.access$1700(Agent.java:82)
	at datadog.trace.bootstrap.Agent$JmxStartTask.run(Agent.java:819)
	at datadog.trace.util.AgentTaskScheduler$RunnableTask.run(AgentTaskScheduler.java:59)
	at datadog.trace.util.AgentTaskScheduler$RunnableTask.run(AgentTaskScheduler.java:54)
	at datadog.trace.util.AgentTaskScheduler$PeriodicTask.run(AgentTaskScheduler.java:329)
	at datadog.trace.util.AgentTaskScheduler$Worker.run(AgentTaskScheduler.java:284)
	at java.lang.Thread.run(java.base@17.0.16/Thread.java:840) 


"dd-profiler-recording-scheduler" #20 daemon prio=5 os_prio=0 cpu=1370.37ms elapsed=76349.15s tid=0x00007ec2750f3b50 nid=0x106 waiting for monitor entry  [0x00007ec1d9ffb000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at jdk.jfr.internal.Utils.getHandler(jdk.jfr@17.0.16/Utils.java:450)
	- waiting to lock <0x0000000403739760> (a java.lang.Class for jdk.jfr.internal.Utils)
	at jdk.jfr.events.Handlers.<clinit>(jdk.jfr@17.0.16/Handlers.java:31)
	at jdk.jfr.internal.instrument.ThrowableTracer.traceThrowable(jdk.jfr@17.0.16/ThrowableTracer.java:55)
	at java.lang.Throwable.<init>(java.base@17.0.16/Throwable.java:273)
	at java.lang.Error.<init>(java.base@17.0.16/Error.java:71)
	at java.lang.LinkageError.<init>(java.base@17.0.16/LinkageError.java:55)
	at java.lang.IncompatibleClassChangeError.<init>(java.base@17.0.16/IncompatibleClassChangeError.java:54)
	at java.lang.NoSuchMethodError.<init>(java.base@17.0.16/NoSuchMethodError.java:57)
	at java.lang.invoke.MethodHandleNatives.resolve(java.base@17.0.16/Native Method)
	at java.lang.invoke.MemberName$Factory.resolve(java.base@17.0.16/MemberName.java:1085)
	at java.lang.invoke.MemberName$Factory.resolveOrNull(java.base@17.0.16/MemberName.java:1129)
	at java.lang.invoke.InvokerBytecodeGenerator.resolveFrom(java.base@17.0.16/InvokerBytecodeGenerator.java:687)
	at java.lang.invoke.InvokerBytecodeGenerator.lookupPregenerated(java.base@17.0.16/InvokerBytecodeGenerator.java:739)
	at java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCode(java.base@17.0.16/InvokerBytecodeGenerator.java:748)
	at java.lang.invoke.LambdaForm.compileToBytecode(java.base@17.0.16/LambdaForm.java:854)
	at java.lang.invoke.DirectMethodHandle.makePreparedLambdaForm(java.base@17.0.16/DirectMethodHandle.java:305)
	at java.lang.invoke.DirectMethodHandle.preparedLambdaForm(java.base@17.0.16/DirectMethodHandle.java:233)
	at java.lang.invoke.DirectMethodHandle.preparedLambdaForm(java.base@17.0.16/DirectMethodHandle.java:218)
	at java.lang.invoke.DirectMethodHandle.preparedLambdaForm(java.base@17.0.16/DirectMethodHandle.java:227)
	at java.lang.invoke.DirectMethodHandle.make(java.base@17.0.16/DirectMethodHandle.java:108)
	at java.lang.invoke.MethodHandles$Lookup.getDirectMethodCommon(java.base@17.0.16/MethodHandles.java:4004)
	at java.lang.invoke.MethodHandles$Lookup.getDirectMethodNoSecurityManager(java.base@17.0.16/MethodHandles.java:3960)
	at java.lang.invoke.MethodHandles$Lookup.getDirectMethodForConstant(java.base@17.0.16/MethodHandles.java:4204)
	at java.lang.invoke.MethodHandles$Lookup.linkMethodHandleConstant(java.base@17.0.16/MethodHandles.java:4152)
	at java.lang.invoke.MethodHandleNatives.linkMethodHandleConstant(java.base@17.0.16/MethodHandleNatives.java:615)
	at jdk.jfr.internal.SecuritySupport.setDaemonThread(jdk.jfr@17.0.16/SecuritySupport.java:483)
	at jdk.jfr.internal.PlatformRecorder.startDiskMonitor(jdk.jfr@17.0.16/PlatformRecorder.java:450)
	at jdk.jfr.internal.PlatformRecorder.<init>(jdk.jfr@17.0.16/PlatformRecorder.java:89)
	at jdk.jfr.FlightRecorder.getFlightRecorder(jdk.jfr@17.0.16/FlightRecorder.java:179)
	- locked <0x0000000408003a38> (a java.lang.Class for jdk.jfr.internal.PlatformRecorder)
	at jdk.jfr.Recording.<init>(jdk.jfr@17.0.16/Recording.java:114)
	at jdk.jfr.Recording.<init>(jdk.jfr@17.0.16/Recording.java:138)
	at com.datadog.profiling.controller.openjdk.OpenJdkOngoingRecording.<init>(OpenJdkOngoingRecording.java:52)
	at com.datadog.profiling.controller.openjdk.OpenJdkController.createRecording(OpenJdkController.java:312)
	at com.datadog.profiling.controller.openjdk.OpenJdkController.createRecording(OpenJdkController.java:59)
	at com.datadog.profiling.controller.ProfilingSystem.startProfilingRecording(ProfilingSystem.java:158)
	at com.datadog.profiling.controller.ProfilingSystem$$Lambda$513/0x00007ec1e0336ad0.run(Unknown Source)
	at datadog.trace.util.AgentTaskScheduler$PeriodicTask.run(AgentTaskScheduler.java:329)
	at datadog.trace.util.AgentTaskScheduler$Worker.run(AgentTaskScheduler.java:284)
	at java.lang.Thread.run(java.base@17.0.16/Thread.java:840)

Comments
Fix-request (jdk21u): this change fixes a potential deadlock in JFR between the class-lock for jdk.jfr.internal.Utils and the class-initialization-lock for jdk.jfr.events.EventConfigurations (see description, note: Handlers has been renamed to EventConfigurations). The fix is to simply initialize the EventConfigurations class early, and doing so in PlatformRecorder.<init>() is the safe place to do so (that is before any instrumentation - which generates the code which would use EventConfigurations - is done). JDK21 is the last JDK which has that problem, hence the original fix here.
24-11-2025

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk21u-dev/pull/2483 Date: 2025-11-19 13:58:35 +0000
20-11-2025

I have not looked into it, but my first stab at it would be to initialize the handlers in the PlatformRecorder constructor so they are set up before instrumentation for ThrowableTracer is added.
18-11-2025

I can't seem to reproduce the issue with a standalone program. I believe it has to do with some ClassLoader weirdness. This can be seen in the stacktraces: 1. In the first stacktrace, it calls into jdk.jfr.EventType.getEventType() with a Class (of type SmapEntryEvent), and that class is passed to JVM.getHandler(), which then goes on an initializes that class. How can we pass a Class that is not initialized around? 2. Similar, we get a ClassNotFoundException when trying to find the handler class - and that happens even though that Event class has @Registered(true) on it (or rather, it doesn't have anything on it, and inherits that annotation from its superclass, which is just the JFR Event class. I know that our code does some ClassLoader twists, and I kinda suspect that we are passing the Event class from a different ClassLoader around, but I am not sure. Anyhow, I think the deadlock there is pretty clear and should not happen - if any exception is thrown, we should get the exception somewhere rather than deadlock. The exception happens because Utils.getHandler() is synchronized and can end up calling into Handlers class, and thus attempt to acquire the class-init lock on that class, while the Handlers class-initializer of Handlers calls into Utils.getHandler(). I see two ways out of this (short of backporting JDK-8319374): 1. Initialize the Handlers class earlier, e.g. from the class-initializer of Utils - this way we would turn around the locking (go Handlers -> Utils on all paths, instead of Utils -> Handlers on some of them). I guess the risk is that we inadvertantly trip up another locking issue elsewhere. 2. Utils.get/setHandler() could synchronize on the Event-class that is passed into them, instead of the whole Utils class. This makes sense because they are presumably accessing a field in the Event class. Or do you guys see a better/preferred way to fix that? (Or no reason to fix at all?)
17-11-2025

[~egahlin] Thank you, that gets me a little bit further. BTW, I stumbled over that code in jfrEventClassTransformer.cpp, and wonder if it is correct: untypedEventHandler = !(annotation_value(ik, registered_symbol, value) || java_base_can_read_jdk_jfr()); Do we really want to use the return value of annotation_value() there - which tells us if the annotation is present or not, but not its actual value, or do we want to know the actual value of the annotation?
17-11-2025

[~rkennke] Could be that bug was fixed in JDK 25 (JDK-8356224).
17-11-2025

You could try to create a user-defined event with the annotation @Registered(false). Then it will not be registered during class initialization.
15-11-2025

[~mgronlun] looks similar, but I think it is a different issue, there are other locks involved. Do you know under which circumstances an Event class would not get its handler eagerly generated on class-load of the Event? I think this is the missing piece for my reproducer.
14-11-2025

Take a look at JDK-8368670 to see if there are similarities.
14-11-2025