JDK-8316469 : Java 21 class initialization deadlock with Jetty
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 21
  • Priority: P4
  • Status: Closed
  • Resolution: Not an Issue
  • Submitted: 2023-09-19
  • Updated: 2023-09-19
  • Resolved: 2023-09-19
Related Reports
Relates :  
Relates :  
Description
On 8/09/2023 12:38 am, Simone Bordet wrote:> Hello,
> 
> We switched the Jetty builds to Java 21 a while ago, and they randomly
> fail with a hard deadlock during class initialization.
> 
> We tried to understand if we were doing something wrong, but the code
> compiles fine, and at first glance there seems to be nothing wrong
> with it, but it may well be a popular Java puzzler that I am not aware
> of :)
> 
> We did not see these failures with Java 20, and we do not see them with Java 17.
> 
> I was wondering if something changed in Java 21 to cause this deadlock?

Taking a thread dump we see one thread in this state:

java.lang.Thread.State: RUNNABLE
at org.eclipse.jetty.http.HttpFields.build(org.eclipse.jetty.http@12.0.2-SNAPSHOT/HttpFields.java:76)
- waiting on the Class initialization monitor for
org.eclipse.jetty.http.MutableHttpFields
at org.eclipse.jetty.http.HttpFields.<clinit>(org.eclipse.jetty.http@12.0.2-SNAPSHOT/HttpFields.java:67)
"ForkJoinPool-1-worker-1" #23 [1457161] daemon prio=5 os_prio=0
cpu=320.77ms elapsed=8870.98s allocated=18414K defined_classes=928
tid=0x00007f6ec0c6a6f0 nid=1457161 waiting on condition
[0x00007f6e91ffb000]

And other threads are in this state (or similar -- the bottom frame is
different but still triggering the clinit of HttpFields):

java.lang.Thread.State: RUNNABLE
at org.eclipse.jetty.http.HttpTester.parseResponse(org.eclipse.jetty.http@12.0.2-SNAPSHOT/HttpTester.java:274)
- waiting on the Class initialization monitor for
org.eclipse.jetty.http.HttpFields
at org.eclipse.jetty.http.HttpTester.parseResponse(org.eclipse.jetty.http@12.0.2-SNAPSHOT/HttpTester.java:261)
"ForkJoinPool-1-worker-2" #24 [1457162] daemon prio=5 os_prio=0
cpu=136.15ms elapsed=8870.96s allocated=3303K defined_classes=340
tid=0x00007f6e4c01ddb0 nid=1457162 waiting on condition
[0x00007f6e91efc000]

Class hierarchy:

MutableHttpFields implements HttpFields.Mutable
HttpFields.Mutable extends HttpFields

Full thread dump:
https://gist.github.com/olamy/7e883adcf51b2410337abfa775a79a0b

Code:
https://github.com/eclipse/jetty.project/blob/jetty-12.0.1/jetty-core/jetty-http/src/main/java/org/eclipse/jetty/http/HttpFields.java

-- 
Simone Bordet
---
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless.   Victoria Livschitz


Comments
AFAICS there is no product bug here. The Jetty code has a fragile structure, but perhaps intended to be used in a way that ensures correct behaviour. But the fact HttpTester.Message extends MutableHttpFields seems to violate any such expectations and introduces the inverted initialization/linking order which leads to deadlock I initially thought JDK-8288064 may have been the root cause of the change in observed behaviour, but it was fixed in JDK 20 and the problem was not observed there. Closing as "Not an Issue". If any further information comes to light please update this issue. Thanks.
19-09-2023

I used this code to try and reproduce the complex structure of the Jetty code, but distilled to the minimum set of types. It does not deadlock without the presence of the Sleeper to block whilst holding one of the init-locks. But with that it deadlocks on all Java versions I tried from 8 through 22. File: MutableHttpFieldsDeadlock.java class Sleeper { static { try { Thread.sleep(5); } catch (InterruptedException ex) {} } } // Presence of default methods forces interface initialization when initializing // the class that implements the interface. interface HttpFields { // Block whilst holding the init lock Sleeper s = new Sleeper(); interface Mutable extends HttpFields { default Mutable get() { return null; } } HttpFields Mutable = new MutableHttpFields(); default HttpFields get() { return this; } } class MutableHttpFields implements HttpFields.Mutable { MutableHttpFields() {} } class Message extends MutableHttpFields { Message() {} } public class MutableHttpFieldsDeadlock { public static void main(String[] args) throws Throwable { Thread t = new Thread() { public void run() { Message m = new Message(); } }; t.start(); HttpFields h = new HttpFields() { }; } }
19-09-2023

dholmes-ora commented Sep 19, 2023 It is a fragile design where initialization of a supertype depends on initialization of its own subtypes. That is setting things up for class initialization deadlock, unless the only "entry path" is via initialization of the supertype. As soon as you have code that starts initialization from a subtype - as your HttpTester.Message does - then deadlock is just a matter of threads and timing. That said, there may have been a change in JDK 21 that exposed this, or made it more likely. I will look into that. It should be easy enough to create a simpler reproducer using the pattern shown in the Jetty code.
19-09-2023

sbordet commented Sep 19, 2023 @jaikiran it happens most often in jetty-ee9-servlets and jetty-ee10-servlets because in those modules we use the utility class HttpTester. HttpTester.Message extends MutableHttpFields, so we have an entry point from the concrete class MutableHttpFields; we have another entry point from the initialization of HttpFields.EMPTY. The two entry points navigate the class hierarchy in opposite directions, hence the deadlock. See eclipse/jetty.project#10540. I read the Java and JVM language specification, and indeed this seems allowed behavior, although I guess implementations were somehow stricter before Java 21. If your analysis is that this is expected behavior and per-spec, we can close the issue. If you think would be useful to investigate what changes led to this behavior (to be more evident) in 21 we can try to code a simpler reproducible case and see how it behaves in the various Java version. Let us know.
19-09-2023

jaikiran commented Sep 18, 2023 Hello Joakim, Thank you for those additional details. And these are just the numbers from our CI. From our point of view this JVM lockup is new and unique to JDK 21. I am not denying that you are seeing an issue with class initialization only in Java 21. What I was looking for was some (debug) data which would help us understand why/how Java 21 "woke up" an issue with that class hierarchy. Usage of -verbose:class and comparing that result between the JDK versions is just a start to trying to get that data to try and narrow it down. Initial analysis which David explained through those logs so far hasn't suggested a JDK issue - this case where threads can deadlock while doing static initialization of "cyclic" class usages is a known thing (I haven't checked if it is documented anywhere) and isn't specific to Java 21. If this is something that can be reproduced in a simpler application with specific steps, that would help too.
19-09-2023

joakime commented Sep 18, 2023 • Hello Jaikiran, This codebase (The jetty-12.0.x branch), is built (build from scratch + unit testing + integration testing) on our CI against Java 17 and Java 21. (a matrix build) Eg: https://jenkins.webtide.net/blue/organizations/jenkins/jetty.project/detail/jetty-12.0.x/1540/pipeline (green build) https://jenkins.webtide.net/blue/organizations/jenkins/jetty.project/detail/jetty-12.0.x/1535/pipeline (lockup build) This means every branch from jetty-12.0.x, all PRs, and close to every commit as well. The jetty-12.0.x branch has 1540 builds so far, and has had 2379 merges into it. So including builds of branches from jetty-12.0.x, and PRs. We should be well over 5000 builds with this codebase on JDK 17. We switched from jdk19 to jdk21 as our "other jdk" build on July 28. So that means 5000-ish builds on JDK 17, with no issues related to JVM lockup. About 3710 builds on JDK 19, with no issues related to JVM lockup. And finally about 272 builds on JDK 21, with 33 JVM lockups. And these are just the numbers from our CI. From our point of view this JVM lockup is new and unique to JDK 21.
19-09-2023

jaikiran commented Sep 18, 2023 Having looked at that class hierarchy I think this issue should be reproduced in older JDKs too, when multiple threads are involved. Are you sure the class initialization sequence and the number of threads involved when testing this with older JDKs hasn't changed? Perhaps you could use -verbose:class both in older and this patched JDK (along with these above logs) to get hold of the loading logs and compare them to see if there's some difference?
19-09-2023

sbordet commented Sep 18, 2023 @dholmes-ora, the reason we were thinking about a fix is that we have valid Java code that can deadlock class initialization, and this does not seem optimal. Other class initialization gotchas are known Java puzzlers that lead to NullPointerException or ClassInitializationError, but at least these hint at what's wrong with the exception and don't deadlock the JVM. We did not see this problem until Java 21, so thought this problem to be a regression in Java 21. We will work around the problem in our code, but wanted to report the JVM deadlock as something that might be unexpected new behavior in Java 21.
19-09-2023

On 16/09/2023 7:50 am, David Holmes wrote: > On 15/09/2023 5:00 pm, Simone Bordet wrote: >> Perhaps we are using a pattern in the code that makes it more likely >> to happen? > > Perhaps. I don't know the jetty code at all so you would need to go back > to them with these findings. If two classes have a dependency on each > other this way then you would normally expect there to be a preferred > "entry point" so that initialization will always occur in the same > order. But it is a fragile design where a superclass/interface uses its > own subclass in its static initialization. > > Glad this helped expose the problem.
19-09-2023

On 15/09/2023 5:00 pm, Simone Bordet wrote: > We have attached the logs to the pull request above. > > Seems like a classic deadlock: > > '[8.061s][debug][class,init] Thread ForkJoinPool-1-worker-1 is > initializing org.eclipse.jetty.http.HttpFields'. > ... > '[8.113s][debug][class,init] Thread ForkJoinPool-1-worker-2 is > initializing org.eclipse.jetty.http.MutableHttpFields'. > '[8.113s][debug][class,init] Thread ForkJoinPool-1-worker-2 waiting > for initialization of org.eclipse.jetty.http.HttpFields by thread > ForkJoinPool-1-worker-1'. > ... > '[8.117s][debug][class,init] Thread ForkJoinPool-1-worker-1 > recursively initializing org.eclipse.jetty.http.HttpFields'. > '[8.117s][debug][class,init] Thread ForkJoinPool-1-worker-1 waiting > for initialization of org.eclipse.jetty.http.MutableHttpFields by > thread ForkJoinPool-1-worker-2'. > > It is peculiar that we already get this deadlock for the same classes. > Perhaps we are using a pattern in the code that makes it more likely to happen?
19-09-2023

On 13/09/2023 7:44 am, David Holmes wrote: > Hi Simone, > > Thanks for the logs. Unfortunately they shed no light on what is > happening. There is no sign of the MutableHttpFields class starting > actual initialization, though we can see that it was linked (verified). > And there are no exceptions indicated relevant to the class loading and > initialization process that I can discern. > > The class initialization logging is unfortunately rather sparse and I > think we would need to add some additional logging to shed further light > on this. Are you in a position to patch the sources, create a custom > build and test again with that build?
19-09-2023

On 12/09/2023 6:35 pm, Simone Bordet wrote: Here is another failed run: JVM threads dump: https://gist.github.com/olamy/b3e20a76f0fc77232882b9be95db47e1 JVM output (gist is truncated, download full file): https://gist.githubusercontent.com/olamy/86f0a1215c722e5e9acf96cae597422e/raw/4404693c39fd767122d34a7a3dde1d797afa6f25/gistfile1.txt The JVM stops outputting at 2.360s, then we have a Jetty idle timeout after 30 seconds, then we see: [51.472s][debug][exceptions] Thread::clear_pending_exception: cleared exception:java.lang.VirtualMachineError '. And then more log lines follow after another 5 minutes. Let us know your findings.
19-09-2023

On 11/09/2023 3:22 pm, David Holmes wrote: > I've looked at the dump and there is no sign that the MutableHttpFields > is actively in use. It suggests to me that class initialization has > failed but the class state has not been correctly updated and the > waiters released. There were some changes in JDK 21 about how failures > in this area were handled, so it is possible I/we got something wrong. > Is it possible to try running this with additional logging enabled e.g. > > -Xlog:class+init=debug -Xlog:exceptions=debug
19-09-2023