JDK-8003237 : G1: Reduce unnecessary (and failing) allocation attempts when handling an evacuation failure
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 8,9
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2012-11-09
  • Updated: 2021-08-04
  • Resolved: 2015-08-20
Fixed in: JDK 9 (9 b81)
Description
If an evacuation failure is seen during object copying, we still have to proceed with the scanning of the remaining roots and RSets. Objects for which we fail to allocate space are self-forwarded (i.e. they are forwarded to themselves). This means that for many, many object copies we are going to execute the allocation slow path (attempting to allocate a PLAB, and hence attempting to allocate a new GC alloc region) many, many times. Most (if not all) of these attempts to allocate GC alloc regions are doomed to failure.

We should be able to reduce the time of pauses that see an evacuation failure if we reduce the number of times we execute the slow path when an evacuation failure is seen.
Comments
We should be able to reduce the pause times of an evacuation failure if we reduce the number of times we execute the slow path when the evacuation failure is seen. Basically if you look at copy_to_survivor_space:

> HeapWord* obj_ptr = _par_scan_state->allocate(alloc_purpose, word_sz);
> #ifndef PRODUCT
> // Should this evacuation fail?
> if (_g1->evacuation_should_fail()) {
>   if (obj_ptr != NULL) {
>     _par_scan_state->undo_allocation(alloc_purpose, obj_ptr, word_sz);
>     obj_ptr = NULL;
>   }
> }
> #endif // !PRODUCT

we try to allocate space from the G1ParScanThreadState:

> HeapWord* allocate(GCAllocPurpose purpose, size_t word_sz) {
>   HeapWord* obj = alloc_buffer(purpose)->allocate(word_sz);
>   if (obj != NULL) return obj;
>   return allocate_slow(purpose, word_sz);
> }

which calls:

> HeapWord* allocate_slow(GCAllocPurpose purpose, size_t word_sz) {
>   HeapWord* obj = NULL;
>   size_t gclab_word_size = _g1h->desired_plab_sz(purpose);
>   if (word_sz * 100 < gclab_word_size * ParallelGCBufferWastePct) {
>     G1ParGCAllocBuffer* alloc_buf = alloc_buffer(purpose);
>     add_to_alloc_buffer_waste(alloc_buf->words_remaining());
>     alloc_buf->retire(false /* end_of_gc */, false /* retain */);
>
>     HeapWord* buf = _g1h->par_allocate_during_gc(purpose, gclab_word_size);
>     if (buf == NULL) return NULL; // Let caller handle allocation failure.
>     // Otherwise.
>     alloc_buf->set_word_size(gclab_word_size);
>     alloc_buf->set_buf(buf);
>
>     obj = alloc_buf->allocate(word_sz);
>     assert(obj != NULL, "buffer was definitely big enough...");
>   } else {
>     obj = _g1h->par_allocate_during_gc(purpose, word_sz);
>   }
>   return obj;
> }

This will either retire the current LAB and allocate a new LAB buffer, or allocate space for the object directly from G1CollectedHeap::par_allocate_during_gc(). If you follow this routine you will see that we take the free list lock before reattempting the allocation (this time we'll also try to get a new region).
Notice also that we do this independently for survivor regions and for promotion regions. Once an evacuation failure has occurred, going down this path will fail most of the time, so we will have taken the lock for nothing. I believe you could short-circuit this code path in the event of an evacuation failure (remember we have a global flag that records that) and only go down the slow path if we have a good idea that new_alloc_region_and_allocate() might actually return a region. I don't think we've measured the overhead of taking the slow path, but once we hit an evacuation failure for a particular allocation purpose (survivor or tenured), we'll go through the path for every object we attempt to copy.
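The short-circuit idea can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not HotSpot code and not the fix that was eventually pushed: ToyHeap, AllocPurpose, the per-purpose _exhausted flag and count_lock_acquisitions() are all hypothetical names made up for the example, and the "heap" is just a word budget per purpose. It shows a failed locked allocation being recorded so that later attempts for the same purpose return early without taking the lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stand-ins for the two allocation purposes (survivor, tenured).
enum AllocPurpose { Survivor = 0, Tenured = 1, PurposeCount = 2 };

// Toy model of the shared heap: a fixed budget of free words per purpose,
// handed out under a lock (standing in for the free list lock).
class ToyHeap {
 public:
  ToyHeap(size_t survivor_words, size_t tenured_words) {
    _free_words[Survivor] = survivor_words;
    _free_words[Tenured] = tenured_words;
    for (auto& flag : _exhausted) flag.store(false);
  }

  // If short_circuit is set and an earlier attempt for this purpose already
  // failed, give up immediately instead of taking the lock again.
  bool allocate(AllocPurpose purpose, size_t word_sz, bool short_circuit) {
    if (short_circuit && _exhausted[purpose].load(std::memory_order_relaxed)) {
      return false;  // Caller treats this as an evacuation failure.
    }
    return allocate_locked(purpose, word_sz);
  }

  size_t lock_acquisitions() const { return _lock_acquisitions; }

 private:
  // Slow path: always takes the lock, records a failure for this purpose.
  bool allocate_locked(AllocPurpose purpose, size_t word_sz) {
    std::lock_guard<std::mutex> guard(_lock);
    ++_lock_acquisitions;
    if (_free_words[purpose] >= word_sz) {
      _free_words[purpose] -= word_sz;
      return true;
    }
    _exhausted[purpose].store(true, std::memory_order_relaxed);
    return false;
  }

  std::mutex _lock;
  size_t _free_words[PurposeCount];
  std::atomic<bool> _exhausted[PurposeCount];
  size_t _lock_acquisitions = 0;
};

// Tries to copy `objects` objects of `word_sz` words each into tenured space
// and reports how many times the lock was taken.
size_t count_lock_acquisitions(size_t objects, size_t word_sz,
                               size_t budget_words, bool short_circuit) {
  ToyHeap heap(budget_words, budget_words);
  for (size_t i = 0; i < objects; ++i) {
    heap.allocate(Tenured, word_sz, short_circuit);
  }
  return heap.lock_acquisitions();
}
```

With a budget of 10 words and 1000 two-word objects, the unguarded variant takes the lock 1000 times, while the guarded variant takes it 6 times (5 successful allocations plus the first failure). Note that in this model one failed request blocks all later requests for that purpose, including smaller ones that might still fit; a real change would have to decide how conservative the "might actually return a region" check should be.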
14-11-2012