Symptom:
--------
Crashes as described in JDK-8299956, caused by class/nmethod unloading even though the nmethod is on stack.
The crashes are reproducible with the release build by running test/langtools:tier1 repeatedly with a concurrency of 6; a crash occurs within 15 to 180 minutes.
Analysis:
--------
Debugging code after G1ConcurrentMark::finalize_marking() shows there are
nmethods with dead oops (mostly classloaders) on stack if MarkingCodeBlobClosure
is changed not to mark oops during G1 remark.
The following steps lead to a G1 concurrent marking cycle that starts without arming the nmethod entry barriers.
This could cause the symptom because nmethod entry barriers must be armed to keep the oop constants
of nmethods alive.
Step 1
CodeCache::on_gc_marking_cycle_start() is called and nmethods are armed in
G1CollectedHeap::start_codecache_marking_cycle_if_inactive() before the young GC.
Stack:
CodeCache::on_gc_marking_cycle_start() : void
G1CollectedHeap::start_codecache_marking_cycle_if_inactive() : void
G1ConcurrentMark::pre_concurrent_start(enum GCCause::Cause) : void
G1YoungCollector::pre_evacuate_collection_set(G1EvacInfo *) : void
G1YoungCollector::collect() : void
G1CollectedHeap::do_collection_pause_at_safepoint_helper() : void
Step 2
The concurrent marking start is undone at the same safepoint.
Stack:
G1ConcurrentMarkThread::start_undo_mark() : void
G1CollectedHeap::start_concurrent_cycle(bool) : void
G1CollectedHeap::do_collection_pause_at_safepoint_helper() : void
Step 3
Because of the undo, the call to CodeCache::on_gc_marking_cycle_finish() in G1ConcurrentMark::remark() is not reached.
Step 4
The next concurrent cycle starts, with the same stack as in Step 1. Nmethods are not armed
because CodeCache::is_gc_marking_cycle_active() still returns true in
G1CollectedHeap::start_codecache_marking_cycle_if_inactive().
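The four steps can be sketched as a small state model. This is not HotSpot code: CodeCacheModel, mutator_runs() and armed_at_second_cycle_start() are hypothetical names introduced for illustration, and the model assumes a single global "armed" flag that is cleared once mutators have entered the nmethods.

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant code cache state; not HotSpot code.
struct CodeCacheModel {
  bool marking_cycle_active = false;  // models CodeCache::is_gc_marking_cycle_active()
  bool nmethods_armed = false;        // models "nmethod entry barriers are armed"

  // Models CodeCache::on_gc_marking_cycle_finish(), normally reached from remark.
  void on_gc_marking_cycle_finish() { marking_cycle_active = false; }

  // Assumption of this model: entering an armed nmethod disarms it, so after
  // the mutators have run for a while the barriers are no longer armed.
  void mutator_runs() { nmethods_armed = false; }
};

// Models the check described for
// G1CollectedHeap::start_codecache_marking_cycle_if_inactive():
// the cycle is started and nmethods are armed only if no cycle is active.
void start_codecache_marking_cycle_if_inactive(CodeCacheModel& cc) {
  if (!cc.marking_cycle_active) {
    cc.marking_cycle_active = true;
    cc.nmethods_armed = true;
  }
}

// Runs Steps 1 to 4 and reports whether nmethods are armed when the
// second concurrent cycle starts.
bool armed_at_second_cycle_start(bool first_cycle_undone) {
  CodeCacheModel cc;
  start_codecache_marking_cycle_if_inactive(cc);  // Step 1: armed, cycle active
  if (!first_cycle_undone) {
    cc.on_gc_marking_cycle_finish();              // normal path: remark is reached
  }
  // Steps 2 and 3: with an undo, the finish call above is skipped,
  // so the "cycle active" flag stays set.
  cc.mutator_runs();                              // barriers get disarmed over time
  start_codecache_marking_cycle_if_inactive(cc);  // Step 4: same entry point again
  return cc.nmethods_armed;
}
```

In this model, armed_at_second_cycle_start(false) yields true, while armed_at_second_cycle_start(true) yields false: the stale "active" flag left behind by the undone cycle suppresses the arming for the following cycle.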
The missed arming in Step 4 can cause the issues given in JDK-8299956. The dead loaders are most
probably loaders of (maybe inlined) optimized virtual calls that aren't
reachable anymore. Nevertheless, the referencing nmethods must not be unloaded if
they are on stack. The backout done with JDK-8299956 prevents this by iterating
over all frames and marking the oops of nmethods on stack.
A better fix would be to make sure nmethod entry barriers are armed whenever G1 marking starts.
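One conceivable shape of such a fix, shown on a simplified model, is to keep the cycle bookkeeping conditional but to arm unconditionally whenever a concurrent mark is about to start. This is only a sketch under that assumption, not the actual change: MarkingState, start_codecache_marking_cycle() and its concurrent_mark_start parameter are hypothetical names that do not come from the report.

```cpp
#include <cassert>

// Hypothetical model state; not HotSpot code.
struct MarkingState {
  bool marking_cycle_active = false;  // models CodeCache::is_gc_marking_cycle_active()
  bool nmethods_armed = false;        // models "nmethod entry barriers are armed"
};

// Assumed variant of the start routine: starting the code cache cycle is
// still skipped when one is active, but arming no longer depends on it.
void start_codecache_marking_cycle(MarkingState& s, bool concurrent_mark_start) {
  if (!s.marking_cycle_active) {
    s.marking_cycle_active = true;
  }
  if (concurrent_mark_start) {
    s.nmethods_armed = true;  // always arm when G1 marking starts
  }
}

// Replays the undo scenario: the first cycle is undone, its finish call is
// never reached, the barriers get disarmed, then a second cycle starts.
bool armed_after_undone_cycle() {
  MarkingState s;
  start_codecache_marking_cycle(s, true);  // first concurrent start
  s.nmethods_armed = false;                // undo; mutators disarm nmethods over time
  start_codecache_marking_cycle(s, true);  // next concurrent start
  return s.nmethods_armed;
}
```

With this variant, armed_after_undone_cycle() returns true in the model even though the "cycle active" flag was never reset.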