United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
Bug ID: JDK-7185699 G1: Prediction model discrepancies
JDK-7185699 : G1: Prediction model discrepancies

Details
Type:
Bug
Submit Date:
2012-07-20
Status:
Resolved
Updated Date:
2013-04-30
Project Name:
JDK
Resolved Date:
2012-08-27
Component:
hotspot
OS:
generic
Sub-Component:
gc
CPU:
generic
Priority:
P4
Resolution:
Fixed
Affected Versions:
7u4
Fixed Versions:
hs24 (b21)

Related Reports
Backport:
Backport:
Backport:
Backport:

Sub Tasks

Description
While investigating 7173711 and 7041879 I discovered the following inconsistencies and discrepancies in G1's prediction model code.

The first issue is in G1CollectedHeap::pending_card_num(), which is used to set G1CollectorPolicy::_pending_cards. The value of this field is used to predict the base elapsed time of the GC when finalizing the collection set.

The code for this routine:

size_t G1CollectedHeap::pending_card_num() {
  size_t extra_cards = 0;
  JavaThread *curr = Threads::first();
  while (curr != NULL) {
    DirtyCardQueue& dcq = curr->dirty_card_queue();
    extra_cards += dcq.size();
    curr = curr->next();
  }

  DirtyCardQueueSet& dcqs = JavaThread::dirty_card_queue_set();
  size_t buffer_size = dcqs.buffer_size();
  size_t buffer_num = dcqs.completed_buffers_num();
  return buffer_size * buffer_num + extra_cards;
}

basically cycles through the current list of Java threads and summing up the sizes of the incomplete update buffers and then adds that to the total size of the unprocessed update buffers. This is supposed to yield the number of un-processed (i.e. dirty) cards.

But DirtyCardQueue::size() and DirtyCardQueueSet::buffer_size() return sizes in bytes and not the number of entries (or entry capacity).

The second issue is:

After the cleanup operation we sort the old regions based upon GC efficiency. The actual GC efficiency of a region is calculated when we add the region to the collectionSetChooser. If we look at the code for the gc_efficiency routine:

void HeapRegion::calc_gc_efficiency() {
  G1CollectedHeap* g1h = G1CollectedHeap::heap();
  G1CollectorPolicy* g1p = g1h->g1_policy();
  _gc_efficiency = (double) reclaimable_bytes() /
                            g1p->predict_region_elapsed_time_ms(this, false);
}

The code for reclaimable_bytes() looks OK, I believe that predict_region_elapsed_time(...) has an issue:

double
G1CollectorPolicy::predict_region_elapsed_time_ms(HeapRegion* hr, bool young) {
  size_t rs_length = hr->rem_set()->occupied();
  size_t card_num;
  if (gcs_are_young()) {
    card_num = predict_young_card_num(rs_length);
  } else {
    card_num = predict_non_young_card_num(rs_length);
  }
  size_t bytes_to_copy = predict_bytes_to_copy(hr);

  double region_elapsed_time_ms =
    predict_rs_scan_time_ms(card_num) +
    predict_object_copy_time_ms(bytes_to_copy);

  if (young)
    region_elapsed_time_ms += predict_young_other_time_ms(1);
  else
    region_elapsed_time_ms += predict_non_young_other_time_ms(1);

  return region_elapsed_time_ms;
}

The code that sets "card_num" looks suspicious. We're calculating the efficiency of an old region which has some live data and we want to predict how long it would take to collected during the upcoming mixed gc phase. When we are calculating the efficency of an old heap region, however, the value of gcs_are_young() is true and, as a result, the predicted number of cards is calculated from the numberSeq that is updated during young GCs.

The value of gcs_are_young() will change at the end of the first evacuation pause 
after marking completes - which will be a young GC since the size of the 
eden was calculated before the end of marking.

I believe that this particular call path should be using predict_non_young_card_num(); the other call paths should be using gcs_are_young().

Also on the call path from calc_gc_efficiency() - we know exactly how many bytes to copy - we don't need to predict it.

                                    

Comments
EVALUATION

Incorrect code in G1's prediction model.
                                     
2012-07-20
EVALUATION

http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/7383557659bd
                                     
2012-08-21



Hardware and Software, Engineered to Work Together