Bug ID: JDK-7045751 G1: +ExplicitGCInvokesConcurrent causes excessive single region evacuation pauses

Details
Type:
Bug
Submit Date:
2011-05-17
Status:
Closed
Updated Date:
2011-11-25
Project Name:
JDK
Resolved Date:
2011-09-30
Component:
hotspot
OS:
generic
Sub-Component:
gc
CPU:
generic
Priority:
P4
Resolution:
Fixed
Affected Versions:
8-pool
Fixed Versions:
hs22 (b01)

Related Reports
Backport:
Backport:

Sub Tasks

Description
Triaging 7041440 brought to light the following:

The test program for 7041440 consisted of a number of Java threads that each perform a single System.gc() call. When run normally, these System.gc() calls result in back-to-back full GCs (requested by different threads). When one thread succeeds in starting a full GC, the threads that have not yet done a full GC are blocked, waiting to start their own.

In the test case for 7041440, the test program was run with +ExplicitGCInvokesConcurrent.

ExplicitGCInvokesConcurrent is supposed to convert the full GC to an evacuation pause that starts a concurrent marking cycle. The requesting thread then blocks until the concurrent marking is complete.

With the test case for 7041440 we see the following (perhaps silly) behavior:

Each thread that requests a System.gc() creates an instance of the VM_G1IncCollectionPause VM operation and enqueues it on the VM operation queue (using VMThread::execute).

The VM thread then starts executing these enqueued VM operations...

The VM thread executes the VM_G1IncCollectionPause for thread A as an initial mark pause, and concurrent marking is started during this evacuation pause. The pause leaves only one survivor region in the collection set. Thread A then waits in VM_G1IncCollectionPause::doit_epilogue until the concurrent mark completes.

The VM thread then processes the VM operation enqueued by Thread B and executes VM_G1IncCollectionPause::doit. The VM thread first reads the # of _full_collections_completed, sees that a concurrent mark is already in progress, and so does not force an initial mark. It then executes an evacuation pause where the collection set is a single region (the survivor region from the pause requested by thread A). This evacuation pause completes and leaves a single survivor region in the collection set. Thread B waits in VM_G1IncCollectionPause::doit_epilogue until the # of _full_collections_completed is incremented at the end of the marking cycle.

The VM thread then processes the VM operation enqueued by Thread C and executes VM_G1IncCollectionPause::doit....

And so on.

We see a bunch of evacuation pauses where the collection set is only one heap region as a result of the enqueued VM_G1IncCollectionPause instances. At some point the surviving data is promoted and the collection set for the evacuation pauses is empty.

Eventually the marking cycle completes and a new initial mark pause is performed - starting the process over again.
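
The sequence above can be reproduced in a toy model. This is only a sketch: ToyHeap, PauseLog, doit_before_fix and drain_requests are hypothetical names, not HotSpot code, and the model assumes the marking cycle does not finish while the queue is being drained.

```cpp
#include <cstddef>

// Hypothetical counters for the pauses the "VM thread" ends up running.
struct PauseLog {
    std::size_t initial_mark_pauses = 0;
    std::size_t extra_evacuation_pauses = 0;
};

// Hypothetical stand-in for the heap state that matters here.
struct ToyHeap {
    bool marking_in_progress = false;

    // Same contract as described in this report: returns true only when
    // no marking cycle is active, in which case the next pause becomes
    // an initial mark pause.
    bool force_initial_mark_if_outside_cycle() {
        if (marking_in_progress) {
            return false;
        }
        marking_in_progress = true;
        return true;
    }
};

// One queued System.gc() request as processed in the reported behavior:
// the pause runs whether or not an initial mark was forced.
void doit_before_fix(ToyHeap& heap, PauseLog& log) {
    const bool forced = heap.force_initial_mark_if_outside_cycle();
    if (forced) {
        ++log.initial_mark_pauses;       // thread A's pause
    } else {
        ++log.extra_evacuation_pauses;   // thread B, C, ... (tiny collection set)
    }
}

// Drains a queue of `requests` operations enqueued by different threads.
PauseLog drain_requests(std::size_t requests) {
    ToyHeap heap;
    PauseLog log;
    for (std::size_t i = 0; i < requests; ++i) {
        doit_before_fix(heap, log);
    }
    return log;
}
```

In this model, draining five requests yields one initial mark pause and four further evacuation pauses, which corresponds to the run of single-region pauses seen in the logs.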

This behavior is obviously wrong. While marking is in progress, we probably should not be doing these pauses. We should either:

  * Wait until the marking completes before reading _full_collections_completed and before the pause.
    When marking completes we would execute another initial mark pause (and concurrent mark).

  * Alternatively, if marking is already active we should skip the pause completely.

In both cases the requesting Java thread will be waiting in VM_G1IncCollectionPause::doit_epilogue() until _full_collections_completed is incremented at the end of the marking cycle.

Skipping the pauses if marking is active (instead of waiting before the pause) is easier.
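
The wait in doit_epilogue() that both options rely on can be modeled with a counter and a condition variable. This is a sketch under assumptions: FullCollectionCounter and release_waiters are hypothetical names, and HotSpot uses its own monitor and lock types rather than the C++ standard library.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for _full_collections_completed and its lock.
class FullCollectionCounter {
public:
    unsigned read() {
        std::lock_guard<std::mutex> guard(lock_);
        return completed_;
    }

    // Called once at the end of a marking cycle.
    void increment() {
        {
            std::lock_guard<std::mutex> guard(lock_);
            ++completed_;
        }
        changed_.notify_all();
    }

    // Models the epilogue wait: block until the counter has moved past
    // the value that was read before the request was issued.
    void wait_until_past(unsigned before) {
        std::unique_lock<std::mutex> guard(lock_);
        changed_.wait(guard, [&] { return completed_ > before; });
    }

private:
    std::mutex lock_;
    std::condition_variable changed_;
    unsigned completed_ = 0;
};

// Starts `requesters` threads that all wait on the same marking cycle,
// completes the cycle once, and returns how many threads were released.
unsigned release_waiters(unsigned requesters) {
    FullCollectionCounter counter;
    const unsigned before = counter.read();
    std::atomic<unsigned> released{0};
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < requesters; ++i) {
        threads.emplace_back([&] {
            counter.wait_until_past(before);
            ++released;
        });
    }
    counter.increment();  // the single marking cycle finishes
    for (std::thread& t : threads) {
        t.join();
    }
    return released.load();
}
```

The point of the model is that a single increment releases every waiting requester, so skipping the extra pauses does not leave any System.gc() caller blocked.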

                                    

Comments
SUGGESTED FIX

In VM_G1IncCollectionPause::doit():

  GCCauseSetter x(g1h, _gc_cause);
  if (_should_initiate_conc_mark) {
    // It's safer to read full_collections_completed() here, given
    // that noone else will be updating it concurrently. Since we'll
    // only need it if we're initiating a marking cycle, no point in
    // setting it earlier.
    _full_collections_completed_before = g1h->full_collections_completed();

    // At this point we are supposed to start a concurrent cycle. We
    // will do so if one is not already in progress.
    bool res = g1h->g1_policy()->force_initial_mark_if_outside_cycle();
  }

  _pause_succeeded =
    g1h->do_collection_pause_at_safepoint(_target_pause_time_ms);
  if (_pause_succeeded && _word_size > 0) {
    // An allocation had been requested.
    _result = g1h->attempt_allocation_at_safepoint(_word_size,
                                      true /* expect_null_cur_alloc_region */);
  } else {
    assert(_result == NULL, "invariant");
  }

"res" is true if we are not in a marking cycle and the next evacuation pause should initiate a concurrent mark. Therefore if res is false we should skip the evacuation pause.
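
A minimal sketch of that suggestion, written against mock types so that it compiles on its own. MockG1Heap, MockIncCollectionPause and pauses_run_with_skip are hypothetical; the early return on !res is an illustration of the idea, not the committed changeset, and the allocation path (_word_size) is left out.

```cpp
// Hypothetical mock of the pieces of G1 that doit() touches.
struct MockG1Heap {
    bool marking_in_progress = false;
    unsigned full_collections_completed = 0;
    unsigned pauses_run = 0;

    // Returns true only when outside a marking cycle. The model treats
    // the cycle as started as soon as the initial mark is forced.
    bool force_initial_mark_if_outside_cycle() {
        if (marking_in_progress) {
            return false;
        }
        marking_in_progress = true;
        return true;
    }

    bool do_collection_pause_at_safepoint(double /* target_pause_time_ms */) {
        ++pauses_run;
        return true;
    }
};

// Hypothetical mock of the VM operation, reduced to the fields used here.
struct MockIncCollectionPause {
    bool should_initiate_conc_mark = true;
    double target_pause_time_ms = 200.0;  // arbitrary value for the model
    unsigned full_collections_completed_before = 0;
    bool pause_succeeded = false;

    void doit(MockG1Heap& g1h) {
        if (should_initiate_conc_mark) {
            full_collections_completed_before = g1h.full_collections_completed;
            const bool res = g1h.force_initial_mark_if_outside_cycle();
            if (!res) {
                // Marking is already active: skip the evacuation pause.
                // The requester still waits in the epilogue for the
                // cycle that is in progress to finish.
                return;
            }
        }
        pause_succeeded =
            g1h.do_collection_pause_at_safepoint(target_pause_time_ms);
    }
};

// Runs `requests` queued operations against one heap while a single
// marking cycle stays active, and returns the number of pauses executed.
unsigned pauses_run_with_skip(unsigned requests) {
    MockG1Heap g1h;
    for (unsigned i = 0; i < requests; ++i) {
        MockIncCollectionPause op;
        op.doit(g1h);
    }
    return g1h.pauses_run;
}
```

With the early return in place, five queued requests in this model produce a single (initial mark) pause instead of five.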
                                     
2011-05-17
EVALUATION

Incorrect behavior with +ExplicitGCInvokesConcurrent.
                                     
2011-05-17
EVALUATION

http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/5130fa1b24f1
                                     
2011-06-15
EVALUATION

http://hg.openjdk.java.net/hsx/hotspot-comp/hotspot/rev/5130fa1b24f1
                                     
2011-07-08
EVALUATION

http://hg.openjdk.java.net/hsx/hotspot-rt/hotspot/rev/5130fa1b24f1
                                     
2011-07-08


