JDK-8337753 : Target class of upcall stub may be unloaded
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.lang.foreign
  • Affected Version: 23,24
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2024-08-02
  • Updated: 2024-11-25
  • Resolved: 2024-10-03
JDK 23          JDK 24
23.0.2 (Fixed)  24 b19 (Fixed)
Description
FFM upcall stubs embed a Method* of the target method in the stub. This Method* is read from the LambdaForm::vmentry field associated with the target MethodHandle at the time when the upcall stub is generated. The MH instance itself is stashed in a global JNI ref. So, there should be a reachability chain to the holder class: MH (receiver) -> LF (form) -> MemberName (vmentry) -> ResolvedMethodName (method) -> Class<?> (vmholder).

However, it appears that, due to multiple threads racing to initialize the vmentry field of the LambdaForm of the target method handle of an upcall stub, it is possible that the vmentry is updated _after_ we embed the corresponding Method* into an upcall stub. Technically, it is fine to keep using a 'stale' vmentry, but the problem is that now the reachability chain is broken, since the upcall stub only extracts the target Method*, and doesn't keep the stale vmentry reachable. The holder class can then be unloaded, resulting in a crash.
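
The broken reachability chain can be illustrated with a toy model. The sketch below is not JVM code: it represents the heap as a map of strong references between named objects (all names are hypothetical labels for the objects listed above) and checks whether the holder class whose Method* was embedded in the stub is still reachable from the MH after vmentry has been overwritten.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of strong references; object names are illustrative labels only.
final class ReachabilitySketch {
    // Object name -> names of the objects it strongly references.
    private final Map<String, List<String>> edges = new HashMap<>();

    void setRefs(String from, String... to) {
        edges.put(from, new ArrayList<>(List.of(to)));
    }

    // Depth-first search over strong references starting at root.
    boolean reachable(String root, String target) {
        Deque<String> work = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        work.push(root);
        while (!work.isEmpty()) {
            String cur = work.pop();
            if (!seen.add(cur)) continue;
            if (cur.equals(target)) return true;
            for (String next : edges.getOrDefault(cur, List.of())) work.push(next);
        }
        return false;
    }

    // The chain as it looks when the stub reads vmentry and embeds Method* #1.
    static ReachabilitySketch chainAtStubCreation() {
        ReachabilitySketch heap = new ReachabilitySketch();
        heap.setRefs("MH", "LF");
        heap.setRefs("LF", "MemberName#1");
        heap.setRefs("MemberName#1", "ResolvedMethodName#1");
        heap.setRefs("ResolvedMethodName#1", "HolderClass#1");
        return heap;
    }

    // A racing thread stores a different vmentry into the shared LF.
    void overwriteVmentry() {
        setRefs("LF", "MemberName#2");
        setRefs("MemberName#2", "ResolvedMethodName#2");
        setRefs("ResolvedMethodName#2", "HolderClass#2");
    }

    public static void main(String[] args) {
        ReachabilitySketch heap = chainAtStubCreation();
        System.out.println("before: " + heap.reachable("MH", "HolderClass#1"));
        heap.overwriteVmentry();
        System.out.println("after:  " + heap.reachable("MH", "HolderClass#1"));
    }
}
```

In this model the stub holds only a raw pointer, which is not an edge in the graph, so once the MH no longer reaches HolderClass#1 nothing keeps that class alive.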
Comments
I was under the mistaken impression that a regular fix request after RDP2 would be sufficient, but it appears not to be. I'd like to elevate the previous fix request to a critical fix request, with the same reasoning. Copied here for convenience: I'd like to backport this to 23u. The issue results in JVM crashes and potential heap corruption, so it is quite severe in consequences. We initially thought that it was very rare, but we've now had a report from a user that they are seeing this crash frequently in their application (https://github.com/openjdk/jdk23u/pull/163#issuecomment-2438415048), and they would like to have this fix. It is not really possible to work around the issue either.
04-11-2024

Fix request (23u): I'd like to backport this to 23u. The issue results in JVM crashes and potential heap corruption, so it is quite severe in consequences. We initially thought that it was very rare, but we've now had a report from a user that they are seeing this crash frequently in their application (https://github.com/openjdk/jdk23u/pull/163#issuecomment-2438415048), and they would like to have this fix. It is not really possible to work around the issue either.
25-10-2024

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk23u/pull/163 Date: 2024-10-14 12:34:18 +0000
14-10-2024

Changeset: 6af13580 Branch: master Author: Jorn Vernee <jvernee@openjdk.org> Date: 2024-10-03 12:02:24 +0000 URL: https://git.openjdk.org/jdk/commit/6af13580c2086afefde489275bc2353c2320ff3f
03-10-2024

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk/pull/20479 Date: 2024-08-06 17:26:55 +0000
03-09-2024

New theory: there can be multiple racing writes of the LambdaForm::vmentry field. LambdaForms are shared between method handles, but their vmentry field is only initialized in the constructor of MethodHandle (through a call to LF::prepare). So, it is theoretically possible for 2 MethodHandles to be created with the same LambdaForm instance. Then, in each MH's constructor, the LF is prepared, resulting in 2 plain writes to the vmentry field, in 2 different threads, pointing at 2 different LF classes. Fast-forward to where we are generating an upcall stub: we read back a 'stale' vmentry field, and embed the Method* that it holds in the upcall stub. It is 'stale' because we have not yet seen the update of the vmentry field from the thread that actually won the race (updated the field last). This means that there is no reachability chain to the holder class of our target method at all, and the class will be unloaded at some point. To test this theory, I ran the test with `-Djava.lang.invoke.MethodHandle.TRACE_INTERPRETER=true`, which shows the same LambdaForm being compiled to bytecode multiple times (each time resulting in a store to the vmentry field). In the worst case I'm seeing the same LF instance being compiled 512 different times (revealing that my 16 core VM is backed by a 512 core machine).
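
The theory above can be replayed as a minimal sketch. This is not the java.lang.invoke implementation: `Form`, `Handle`, `Stub` and `Entry` are hypothetical stand-ins for LambdaForm, MethodHandle, the upcall stub and the vmentry MemberName. It also runs one possible interleaving on a single thread rather than reproducing the actual race.

```java
// Hypothetical stand-ins for LambdaForm, MethodHandle and the upcall stub.
final class RacyPrepareSketch {
    // Stands in for a vmentry pointing into one generated LF class.
    static final class Entry {
        final int id;
        Entry(int id) { this.id = id; }
    }

    // Shared between handles; vmentry is a plain (non-volatile) field.
    static final class Form {
        Entry vmentry;
        int writes;

        // Assumption for illustration: every prepare() stores a fresh entry.
        void prepare() {
            writes++;
            vmentry = new Entry(writes);
        }
    }

    // Each handle prepares its form in the constructor.
    static final class Handle {
        final Form form;
        Handle(Form form) {
            this.form = form;
            form.prepare();
        }
    }

    // The stub copies the raw target once and keeps no reference to the entry.
    static final class Stub {
        final int embeddedId;
        Stub(Handle target) {
            this.embeddedId = target.form.vmentry.id;
        }
    }

    public static void main(String[] args) {
        Form shared = new Form();
        Handle first = new Handle(shared);   // write #1 to vmentry
        Stub stub = new Stub(first);         // stub embeds entry #1
        Handle second = new Handle(shared);  // write #2 replaces entry #1

        System.out.println("stub embeds entry " + stub.embeddedId);
        System.out.println("form now holds entry " + first.form.vmentry.id);
        System.out.println("writes to vmentry: " + shared.writes);
    }
}
```

After the second write, entry #1 is referenced only by the number copied into the stub, which mirrors how the embedded Method* does not keep the stale vmentry alive.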
06-08-2024

I've attached another hs_err log + combined log of -Xlog:class+load and -Xlog:class+unload, where we can see the target class of the upcall stub being unloaded. In the hs_err log, the upcall stub references the Method* 0x0000ffff4d199f20:
    0x0000ffff7425469c: mov x12, #0x9f20 // #40736
    0x0000ffff742546a0: movk x12, #0x4d19, lsl #16
    0x0000ffff742546a4: movk x12, #0xffff, lsl #32
This value is also present in R12:
    R12={method} {0x0000ffff4d199f20} 'invoke' '(Ljava/lang/Object;DDDD)V' in 'java/lang/invoke/LambdaForm$MH+0x00003800015cf000'
If we look up this class in xlog_unloaded.txt, we find that it is first loaded, and then unloaded (and there are no other occurrences of a hidden class with the same ID):
    [182.669s][info][class,load ] java.lang.invoke.LambdaForm$MH/0x00003800015cf000 source: __JVM_LookupDefineClass__
    ...
    [184.391s][info][class,unload] unloading class java.lang.invoke.LambdaForm$MH/0x00003800015cf000 0x00003800015cf000
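
For readers unfamiliar with the AArch64 mov/movk idiom, the following sketch shows how the three 16-bit immediates from the disassembly in the comment above combine into the Method* value. The class and method names are made up for illustration; only the immediates and shifts come from the log.

```java
// Recomputes the 64-bit constant built by a mov + movk + movk sequence.
final class MovkDecodeSketch {
    // movk semantics: replace one 16-bit field, keep all other bits.
    static long insert16(long value, int imm, int shift) {
        long mask = 0xFFFFL << shift;
        return (value & ~mask) | ((imm & 0xFFFFL) << shift);
    }

    static long compose(int imm0, int imm16, int imm32) {
        long value = imm0 & 0xFFFFL;          // mov  x12, #imm0 (clears upper bits)
        value = insert16(value, imm16, 16);   // movk x12, #imm16, lsl #16
        value = insert16(value, imm32, 32);   // movk x12, #imm32, lsl #32
        return value;
    }

    public static void main(String[] args) {
        // Immediates taken from the upcall stub disassembly.
        long methodPtr = compose(0x9f20, 0x4d19, 0xffff);
        System.out.printf("0x%016x%n", methodPtr); // prints 0x0000ffff4d199f20
    }
}
```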
05-08-2024

See the attached hs_err_1.log, where the method pointed to by the receiver in R1 (see 'form' field) doesn't match the method embedded in the upcall stub, loaded in R12:
    {method} {0x0000ffff50fc3318} 'invoke' '(Ljava/lang/Object;FFD)V' in 'java/lang/invoke/LambdaForm$MH+0x0000000201478800' (0xd86d4270)
    {method} {0x0000ffff50eb9f18} 'invoke' '(Ljava/lang/Object;FFD)V' in 'java/lang/invoke/LambdaForm$BMH+0x00000002012f0000'
Interestingly though, the 'customizationCount' of the receiver is still 0, so the Method* in the receiver seems to be updated through other means.
02-08-2024

ILW = Crash, intermittent, Workaround is to manually keep LF reachable (which requires reflection/is not obvious) = HLM = P3
02-08-2024