JDK-8297487 : G1 Remark: no need to keep alive oop constants of nmethods on stack
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 20
  • Priority: P4
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2022-11-23
  • Updated: 2023-01-12
  • Resolved: 2023-01-10
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 21
21 b05Fixed
Related Reports
Blocks :  
Relates :  
Relates :  
Relates :  
Description
The stackwalks to keep alive oops of nmethods found on stack during G1 remark seem redundant as nmethod entry barriers do the job already: https://github.com/openjdk/jdk/blob/f26bd4e0e8b68de297a9ff93526cd7fac8668320/src/hotspot/share/gc/shared/barrierSetNMethod.cpp#L85

Objects in the constant pool of a nmethod are weakly referenced. So G1 requires nmethod entry barriers to conservatively keep alive the constant objects for SATB during concurrent marking (see https://bugs.openjdk.org/browse/JDK-8288970).
Comments
Method handle / continuation intrinsics are not unloaded. You are right though, that CodeCache::arm_all_nmethods() is called after weak_refs_work(). And it also does walk the stacks to disarm all on-stack nmethods, but without marking the nmethods as on-stack. So there is some kind of discrepancy there, but I'm not sure yet how it manifests exactly.
12-01-2023

Erik, thanks for having a look into the issues. Sounds plausible what you are saying. Isn't the nmethod unloading done during `weak_refs_work()`, though, before `CodeCache::arm_all_nmethods()` is called? Method handle / continuation intrinsics are nmethods without entry barriers (if I'm not mistaken). If so, could this be a problem too? Thanks, Richard.
12-01-2023

FTR, I replied to Richard's questions in JDK-8299956.
12-01-2023

Hmm. [~rrich] I think I see what happened here. The MarkingCodeBlobClosure would also nm->mark_as_maybe_on_stack for the CodeCache unloading support. And G1 remark pauses bump the global epoch. Therefore, without the MarkingCodeBlobClosure, we end up in a situation where the on-stack nmethods are armed and haven't been labelled as on-stack. This could mean that the GC ends up nuking nmethods because it doesn't realize they are on-stack. I forgot about this because ZGC doesn't re-arm nmethods when marking terminates, the way that G1 does.
12-01-2023

Hi Tobias, I don't see a connection between the crash logs attached to the BACKOUT (JDK-8299956) and JDK-8297487. Also this change was very thoroughly tested on all platforms, fastdebug/release without any failures. I tried to reproduce with `time make test TEST=test/langtools/tools/javac` but so far without success. I checked out the original commit of JDK-8297487 for this. Can you help and explain how you reproduced the issue? Did the BACKOUT actually help at all? Thanks, Richard.
11-01-2023

This caused a regression. I will back the change out with JDK-8299956. Please file a REDO according to https://openjdk.org/guide/#backing-out-a-change if appropriate.
11-01-2023

Changeset: eab1e626 Author: Richard Reingruber <rrich@openjdk.org> Date: 2023-01-10 10:32:32 +0000 URL: https://git.openjdk.org/jdk/commit/eab1e6260d5622722b3e930b8457a64c6da28ccc
10-01-2023

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/11314 Date: 2022-11-23 10:05:56 +0000
23-11-2022