JDK-8300915 : G1: incomplete SATB because nmethod entry barriers don't get armed
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2023-01-23
  • Updated: 2023-02-14
  • Resolved: 2023-01-30
Fix version: JDK 21, build b08 (Fixed)
Description
Symptom:
--------

Crashes as described in JDK-8299956, caused by class/nmethod unloading even though an nmethod is on the stack.

The crashes reproduce with the release build within 15 to 180 minutes when running test/langtools:tier1 repeatedly with a concurrency of 6.

Analysis:
--------

Debugging code after G1ConcurrentMark::finalize_marking() shows that there are
nmethods with dead oops (mostly class loaders) on the stack if MarkingCodeBlobClosure
is changed not to mark oops during G1 remark.

The following steps lead to a G1 concurrent marking cycle without arming nmethod entry barriers.
This could cause the symptom because nmethod barriers should be armed to keep oop constants
of nmethods alive.

Step 1

CodeCache::on_gc_marking_cycle_start() is called and nmethods are armed in
G1CollectedHeap::start_codecache_marking_cycle_if_inactive() before the young GC.

  Stack:
    CodeCache::on_gc_marking_cycle_start() : void
    G1CollectedHeap::start_codecache_marking_cycle_if_inactive() : void
    G1ConcurrentMark::pre_concurrent_start(enum GCCause::Cause) : void
    G1YoungCollector::pre_evacuate_collection_set(G1EvacInfo *) : void
    G1YoungCollector::collect() : void
    G1CollectedHeap::do_collection_pause_at_safepoint_helper() : void

Step 2

The concurrent marking start is undone at the same safepoint.

  Stack:
    G1ConcurrentMarkThread::start_undo_mark() : void
    G1CollectedHeap::start_concurrent_cycle(bool) : void
    G1CollectedHeap::do_collection_pause_at_safepoint_helper() : void  

Step 3

Because of the undo, the call to CodeCache::on_gc_marking_cycle_finish() in G1ConcurrentMark::remark() is never reached, so the code cache marking cycle is still considered active.

Step 4

The next concurrent cycle starts, with the same stack as in Step 1. Nmethods are not armed
because CodeCache::is_gc_marking_cycle_active() returns true in
G1CollectedHeap::start_codecache_marking_cycle_if_inactive().
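
The four steps above can be condensed into a small state model. The following is a minimal sketch, not HotSpot code: ToyCodeCache, its members, and second_cycle_arms_barriers() are hypothetical names invented for illustration. They only mimic the roles of the CodeCache and G1CollectedHeap functions named in the steps.

```cpp
#include <cassert>

// Toy model only: this type mimics the roles described in the steps above.
// It is not a HotSpot class, and all names here are hypothetical.
struct ToyCodeCache {
    bool marking_cycle_active = false;  // role of CodeCache::is_gc_marking_cycle_active()
    int  arm_count = 0;                 // how often nmethod entry barriers were armed

    // Role of G1CollectedHeap::start_codecache_marking_cycle_if_inactive().
    // Returns true if this call armed the nmethod entry barriers.
    bool start_marking_cycle_if_inactive() {
        if (marking_cycle_active) {
            return false;               // Step 4: cycle looks active, arming is skipped
        }
        marking_cycle_active = true;    // role of CodeCache::on_gc_marking_cycle_start()
        ++arm_count;
        return true;
    }

    // Role of CodeCache::on_gc_marking_cycle_finish(), normally reached from remark.
    void finish_marking_cycle() {
        assert(marking_cycle_active);
        marking_cycle_active = false;
    }
};

// Simulates two consecutive concurrent-start pauses.
// If first_cycle_undone is true, the first cycle is undone and never reaches remark.
// Returns whether the barriers get armed at the start of the second cycle.
bool second_cycle_arms_barriers(bool first_cycle_undone) {
    ToyCodeCache cc;
    bool armed_first = cc.start_marking_cycle_if_inactive();  // Step 1
    assert(armed_first);
    (void)armed_first;
    if (!first_cycle_undone) {
        cc.finish_marking_cycle();      // normal path: remark finishes the cycle
    }
    // Steps 2 and 3: with an undo, finish_marking_cycle() is never called.
    return cc.start_marking_cycle_if_inactive();              // Step 4
}
```

In this model, second_cycle_arms_barriers(false) returns true and second_cycle_arms_barriers(true) returns false: only the undone first cycle leaves the second cycle without armed barriers.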

This can cause the issues given in JDK-8299956. The dead loaders are most
probably loaders of optimized (possibly inlined) virtual calls that are no longer
reachable. Nevertheless, the referencing nmethods must not be unloaded while
they are on the stack. The backout done with JDK-8299956 prevents this by iterating
over all frames and marking the oops of nmethods on the stack.
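
The idea behind that backout, walking every frame and marking the oops of each nmethod found there, can be illustrated with a toy model. This is a sketch under stated assumptions: ToyNMethod, ToyFrame, and mark_oops_of_nmethods_on_stack() are hypothetical names, oops are reduced to integer ids, and none of this reflects the actual HotSpot data structures.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy model only: hypothetical types, not HotSpot classes.
struct ToyNMethod {
    std::vector<int> oop_constants;  // ids of objects embedded in the compiled code
};

struct ToyFrame {
    const ToyNMethod* nm;            // nullptr stands for a frame without an nmethod
};

// Walks all frames of all thread stacks and marks the oop constants of every
// nmethod that is currently on a stack. Returns the set of marked object ids.
std::set<int> mark_oops_of_nmethods_on_stack(
        const std::vector<std::vector<ToyFrame>>& thread_stacks) {
    std::set<int> marked;
    for (const auto& stack : thread_stacks) {
        for (const ToyFrame& frame : stack) {
            if (frame.nm != nullptr) {
                marked.insert(frame.nm->oop_constants.begin(),
                              frame.nm->oop_constants.end());
            }
        }
    }
    return marked;
}
```

Objects referenced only by nmethods that are not on any stack stay unmarked in this model, which matches the intent that only on-stack nmethods need to be kept from unloading.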

A better fix would be to make sure that nmethod entry barriers are armed whenever G1 marking starts.

Comments
Changeset: 3db558b6
Author: Richard Reingruber <rrich@openjdk.org>
Date: 2023-01-30 08:43:15 +0000
URL: https://git.openjdk.org/jdk/commit/3db558b67bebfe559833331475f481c588147084

A pull request was submitted for review.
URL: https://git.openjdk.org/jdk/pull/12194
Date: 2023-01-25 13:32:27 +0000