JDK-8294947 : Use 64bit atomics in patch_verified_entry on x86_64
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 8,11,17,18,19,20
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • CPU: x86_64
  • Submitted: 2022-10-07
  • Updated: 2023-01-04
  • Resolved: 2022-11-15
  • JDK 11: 11.0.19-oracle (Fixed)
  • JDK 17: 17.0.7-oracle (Fixed)
  • JDK 20: 20 b24 (Fixed)
Description
I'm working on a crash that seems to be related to cross-modifying code (CMC) [1]: the JVM crashes when a method becomes not re-entrant, because a JavaThread executing a compiled method reaches an instruction that was only partially assembled while the verified entry point was being patched.

Unfortunately, no simple reproducer is available.

In NativeJump::patch_verified_entry() we atomically patch the first 4 bytes, then atomically patch the 5th byte, then atomically patch the first 4 bytes again.

From a CMC point of view, however, it is better to patch all 8 bytes atomically at once.
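
The idea of replacing the three partial stores with one 8-byte store can be sketched as follows. This is only an illustration, not the HotSpot change itself: make_patched_word and patch_entry_8bytes are hypothetical names, the entry point is modelled as a std::atomic<uint64_t> rather than real code memory, and a little-endian host (as on x86_64) is assumed.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a HotSpot function): take the 8 bytes currently at
// the entry point and return an image whose bytes 0..4 hold `jmp rel32`,
// while bytes 5..7 keep their old contents.
std::uint64_t make_patched_word(std::uint64_t old_word, std::int32_t rel32) {
    unsigned char bytes[8];
    std::memcpy(bytes, &old_word, sizeof bytes);
    bytes[0] = 0xE9;                               // x86 opcode: jmp rel32
    std::memcpy(bytes + 1, &rel32, sizeof rel32);  // displacement, host byte order
    std::uint64_t new_word;
    std::memcpy(&new_word, bytes, sizeof new_word);
    return new_word;
}

// Hypothetical stand-in for the patching step: the whole new image is
// published with a single 64-bit store, so a concurrent reader of this word
// sees either the old 8 bytes or the new 8 bytes, never a mixture.
void patch_entry_8bytes(std::atomic<std::uint64_t>& entry, std::int32_t rel32) {
    assert(entry.is_lock_free());  // the store must not fall back to a lock
    std::uint64_t old_word = entry.load(std::memory_order_relaxed);
    entry.store(make_patched_word(old_word, rel32), std::memory_order_release);
}
```

With this shape there is exactly one transition of the entry point, instead of the three intermediate states that the 4-byte, 1-byte, 4-byte sequence exposes. The sketch assumes the 8 bytes are naturally aligned, which a real patching routine would have to guarantee.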

1. http://cr.openjdk.java.net/~jrose/jvm/hotspot-cmc.html
Comments
Fix request [17u] I am backporting this for parity with 17.0.7-oracle. Low risk: a simple change to how a field is updated. Only one platform, but the most important one. Clean backport. SAP nightly testing passed.
04-01-2023

A pull request was submitted for review. URL: https://git.openjdk.org/jdk17u-dev/pull/1025 Date: 2023-01-03 13:51:46 +0000
03-01-2023

Changeset: d0fae43e Author: Dmitry Samersoff <dsamersoff@openjdk.org> Date: 2022-11-15 10:43:05 +0000 URL: https://git.openjdk.org/jdk/commit/d0fae43e89a73e9d73b074fa12276c43ba629278
15-11-2022

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/11059 Date: 2022-11-09 12:41:59 +0000
09-11-2022

Thanks [~dsamersoff]. We have been seeing some strange crashes in the OCI virtualized environment, all on AMD EPYC hardware (see JDK-8258825). I have been wondering if something in the hypervisor could be contributing. Your crash sounds strange if RIP is at entry-point + 2 or entry-point + 4, because the old instruction is supposed to be at least 5 bytes. However, if the hypervisor is fetching instructions differently than the CPU, maybe that could cause problems. Looking around in the Linux KVM code, I see places where it fetches instruction bytes when emulating instructions, or when computing the next RIP. If the hypervisor does not read the bytes atomically, then it seems like it could see strange transient decodings, even if the JVM does patch the code atomically.
28-10-2022

The problem is reproducible in a virtualized environment on a multi-core Xeon Gold machine. The crash always happens when the compiled method is being replaced, immediately after patching, and RIP always points to the byte right after the inserted jmp-to-self.
28-10-2022

ILW = 3-step patching allows too many intermediate states, possible cause of crashes; hard to reproduce; no workaround = MLH = P4
26-10-2022

[~dsamersoff], I agree making this atomic is a good idea. But I'm curious, what kind of hardware is the crash happening on, and what are the symptoms that point to patch_verified_entry()? I tried inserting delays in patch_verified_entry() between the 3 patching steps to make the race condition more evident, but I still couldn't demonstrate a problem with patch_verified_entry().
26-10-2022