JDK-8253238 : [REDO] Improve object array chunking test in G1's copy_to_survivor_space
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: hs25,8
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2020-09-16
  • Updated: 2020-09-29
  • Resolved: 2020-09-22
JDK 16 : 16 b17 (Fixed)
Description
All collectors, including G1, treat object arrays specially if they exceed a size of ParGCArrayScanChunk. Instead of pushing all elements of the object array onto the work stack, the collectors push only part of it, plus the object array itself again with an indication that part of it still needs to be scanned.
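The chunking idea can be sketched outside HotSpot as follows. This is a minimal standalone illustration; the names (`Task`, `process`, `kArrayScanChunk`) and the plain `std::vector` work stack are invented for this sketch and are not G1's actual code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for HotSpot's tunable.
static const std::size_t kArrayScanChunk = 50;  // plays the role of ParGCArrayScanChunk

struct Task {
    const std::vector<int>* array;  // the object array being scanned
    std::size_t start;              // next element index still to be scanned
};

// Process one task: scan at most kArrayScanChunk elements; if the array
// is not finished, push the array itself back (with an updated start
// index) instead of pushing each remaining element individually.
void process(std::vector<Task>& work_stack, std::size_t& scanned) {
    Task t = work_stack.back();
    work_stack.pop_back();
    std::size_t end = t.start + kArrayScanChunk;
    if (end < t.array->size()) {
        // Partial scan: re-push the array with the next chunk's start index.
        work_stack.push_back({t.array, end});
    } else {
        end = t.array->size();
    }
    scanned += end - t.start;  // "scan" this chunk of elements
}
```

This keeps the work-stack footprint bounded by the number of live arrays rather than the number of array elements.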

In the respective copy_to_survivor_space() methods there is always a check whether this chunking needs to be done. In the case of G1 this check is relatively costly, while other collectors have optimized it better.

E.g. in g1CollectedHeap.cpp, copy_to_survivor_space()

    if (obj->is_objArray() && arrayOop(obj)->length() >= ParGCArrayScanChunk) {
      [... do chunking... ]
    } else {
      [ ... regular processing... ]
    }

A slightly faster variant swaps the two predicates and replaces arrayOop(obj)->length() with the "word_sz" variable that has already been retrieved anyway.

I.e.

    if (word_sz >= ParGCArrayScanChunk && obj->is_objArray()) {

Additionally, it is better to have the size-check predicate first, as it excludes far more objects than the type check.
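The effect of the reordering can be shown with a small standalone sketch. The counter and function names here are invented for illustration; the point is that C++'s short-circuit evaluation skips the (in G1, comparatively expensive) type check entirely for the common small object:

```cpp
#include <cassert>
#include <cstddef>

static const std::size_t kChunkThreshold = 50;  // stand-in for ParGCArrayScanChunk

static int g_type_checks = 0;  // counts how often the "expensive" check runs

// Stand-in for obj->is_objArray(); instrumented to count invocations.
bool is_obj_array(bool truth) {
    ++g_type_checks;
    return truth;
}

// Reordered check: the cheap integer comparison runs first, so for the
// common small object the type check is never evaluated at all.
bool needs_chunking(std::size_t word_sz, bool array_truth) {
    return word_sz >= kChunkThreshold && is_obj_array(array_truth);
}
```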

Another particularity that should be fixed is the use of ParGCArrayScanChunk: it is a global that is typically located far away from the ParCopyClosure code, requiring extra pages etc. to be loaded. VTune runs indicate that this is a relatively big issue (many cycles are spent on this and the surrounding instructions), as the G1 collector code already touches too many places.

The recommendation here is to cache a copy of ParGCArrayScanChunk in a member variable of G1ParCopyClosure, which fixes this issue.

I.e.

      if (word_sz >= _local_pargc_array_scan_chunk_sz && obj->is_objArray()) {
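The caching idea can be sketched in standalone form. The class and global names below are stand-ins chosen for this sketch (only `_local_pargc_array_scan_chunk_sz` follows the report's suggestion); the real closure lives inside HotSpot:

```cpp
#include <cassert>
#include <cstddef>

// Global tunable, standing in for ParGCArrayScanChunk. In HotSpot it
// lives far from the closure's hot code, so every load may touch a
// cold cache line or page.
std::size_t g_par_gc_array_scan_chunk = 50;

// Sketch of the recommendation: the closure copies the global into a
// member at construction time, so the hot path only reads closure
// state that is likely already in cache.
class CopyClosureSketch {
    const std::size_t _local_pargc_array_scan_chunk_sz;
public:
    CopyClosureSketch()
        : _local_pargc_array_scan_chunk_sz(g_par_gc_array_scan_chunk) {}

    // The reordered check from above, now against the cached member.
    bool needs_chunking(std::size_t word_sz, bool is_obj_array) const {
        return word_sz >= _local_pargc_array_scan_chunk_sz && is_obj_array;
    }
};
```

Since the closure is constructed once per GC worker and used for many objects, the one-time copy is cheap relative to the repeated global loads it replaces.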


There is a minor wrinkle with putting just ParGCArrayScanChunk into _local_pargc_array_scan_chunk_sz and comparing it with word_sz: word_sz contains a slightly larger value than the original arrayOop(obj)->length(), because word_sz is the size of the entire object in words, while the arrayOop::length() call returns the number of elements in the array.

Using the local _local_pargc_array_scan_chunk_sz, this difference can be accounted for if needed.
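One way to account for the difference is to store a word-size threshold rather than the raw element count. The layout constants below are assumptions made for this sketch only (8-byte heap words, a 16-byte array header, 8-byte reference elements); the real values depend on the platform and on compressed-oops settings:

```cpp
#include <cassert>
#include <cstddef>

// Illustrative layout constants (assumptions for this sketch only).
static const std::size_t kWordBytes        = 8;   // one heap word
static const std::size_t kArrayHeaderBytes = 16;  // header incl. length field
static const std::size_t kElemBytes        = 8;   // one reference element

// Word size of an object array with `length` elements under the layout above.
std::size_t obj_array_word_sz(std::size_t length) {
    std::size_t bytes = kArrayHeaderBytes + length * kElemBytes;
    return (bytes + kWordBytes - 1) / kWordBytes;  // round up to whole words
}

// Convert the element-count threshold into a word-size threshold, so that
// comparing word_sz against it selects the same arrays as the original
// length >= ParGCArrayScanChunk check.
std::size_t chunk_threshold_in_words(std::size_t par_gc_array_scan_chunk) {
    return obj_array_word_sz(par_gc_array_scan_chunk);
}
```

The conversion is done once when the member is initialized, so the hot path still performs only a single integer comparison.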

Comments
Changeset: 0e98fc1c Author: Kim Barrett <kbarrett@openjdk.org> Date: 2020-09-22 05:14:06 +0000 URL: https://git.openjdk.java.net/jdk/commit/0e98fc1c
22-09-2020

Also, parallel scavenge seems to use the same code, in psPromotionManager.inline.hpp, line 188:

    if (new_obj_size > _min_array_size_for_chunking && new_obj->is_objArray() && PSChunkLargeArrays) {
16-09-2020