The policy that picks which old regions to collect during mixed GCs, introduced with:
7132029: G1: mixed GC phase lasts for longer than it should
is as follows:
1) Any old region whose live bytes exceed X% of the region size is never considered for collection (X is set with the develop parameter G1OldCSetRegionLiveThresholdPercent, default == 95%). The remaining regions are added to the CSet chooser and sorted by GC efficiency.
2) We consider all old regions in the CSet chooser for collection until the total reclaimable bytes in the remaining regions drops below Y% of the heap size (Y is set with the develop parameter G1OldReclaimableThresholdPercent, default == 1%). At that point we stop doing mixed GCs and revert to young GCs.
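As a rough illustration of steps 1) and 2), here is a minimal Python sketch (not the HotSpot code). The OldRegion class, its gc_cost field, and both function names are hypothetical; GC efficiency is assumed here to be reclaimable bytes divided by a predicted collection cost.

```python
from dataclasses import dataclass


@dataclass
class OldRegion:
    # Hypothetical model of an old region, for illustration only.
    name: str
    capacity: int     # region size in bytes
    live_bytes: int   # live data found by marking
    gc_cost: float    # assumed predicted cost of collecting the region

    @property
    def reclaimable_bytes(self) -> int:
        return self.capacity - self.live_bytes

    @property
    def gc_efficiency(self) -> float:
        # Assumption: efficiency = bytes reclaimed per unit of predicted cost.
        return self.reclaimable_bytes / self.gc_cost


def build_candidates(regions, live_threshold_percent=95):
    """Step 1): drop regions above the live threshold, sort best first."""
    candidates = [
        r for r in regions
        if r.live_bytes * 100 <= r.capacity * live_threshold_percent
    ]
    candidates.sort(key=lambda r: r.gc_efficiency, reverse=True)
    return candidates


def should_continue_mixed_gcs(remaining, heap_size,
                              reclaimable_threshold_percent=1):
    """Step 2): keep going while enough reclaimable space is left."""
    reclaimable = sum(r.reclaimable_bytes for r in remaining)
    return reclaimable * 100 >= heap_size * reclaimable_threshold_percent
```

The defaults mirror the 95% and 1% values quoted above; everything else is an assumption made for the sake of the example.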
The above could perhaps be improved in the following way:
After we add all candidate old regions to the CSet chooser we do a short additional processing step. Let's call it 1.5) given that it slots in between 1) and 2) above:
1.5) Scan the CSet chooser array backwards (i.e., from the worst region to the best) and keep removing regions until the total reclaimable bytes of the removed regions reaches a certain limit. This limit (ideally user-settable) expresses how much free space we are willing to "sacrifice" (i.e., not collect) in order to avoid collecting some potentially expensive regions.
Given the above, 2) becomes much simpler: since we've already removed the expensive regions from the array, we just carry on doing mixed GCs until the array is empty.
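Step 1.5) could look roughly like the Python sketch below. The function name and the (name, reclaimable_bytes) tuple layout are hypothetical. The sketch also assumes that we stop before the sacrificed total would exceed the limit; the description above leaves open whether the last removed region may overshoot it.

```python
def prune_expensive_tail(candidates, sacrifice_limit_bytes):
    """Remove the worst regions from the end of a best-first candidate list.

    candidates: list of (name, reclaimable_bytes), sorted best to worst.
    Returns (kept_candidates, sacrificed_bytes).
    """
    kept = list(candidates)  # leave the caller's list untouched
    sacrificed = 0
    # Walk backwards from the worst region; stop as soon as removing the
    # next one would push the sacrificed space past the limit (assumption).
    while kept and sacrificed + kept[-1][1] <= sacrifice_limit_bytes:
        _, reclaimable = kept.pop()
        sacrificed += reclaimable
    return kept, sacrificed


# Mixed GCs would then simply run until `kept` is empty.
kept, sacrificed = prune_expensive_tail(
    [("a", 80), ("b", 50), ("c", 10), ("d", 5)], sacrifice_limit_bytes=20
)
```

In this example the two worst regions ("d" and "c") are dropped, giving up 15 bytes of reclaimable space, and mixed GCs would continue until "a" and "b" have been collected.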
One question is whether to keep the extra filtering described in 1), or to add all old regions to the CSet chooser and rely only on step 1.5) to filter out unwanted regions. I'd suggest keeping the filtering, if nothing else to cut down the work needed during 1.5) (the array will be shorter).