Bug ID: JDK-8031381 Investigate allocation into tail of humongous objects

Type: Enhancement
Component: hotspot
Sub-Component: gc
Affected Version: 8u20,9

Priority: P4
Status: Closed
Resolution: Won't Fix

Submitted: 2014-01-08
Updated: 2020-01-22
Resolved: 2016-03-01

Recently we found some applications (CRM Fuse, CRM StressBPM) that use a lot of large objects as buffers and perform poorly with G1.
They all use buffers that are slightly too large to fit into a whole number of regions, so another, almost completely empty region is used, wasting lots of precious memory. Memory is relatively tight in these setups too.

In some cases this can be somewhat worked around by increasing the region size; in others we cannot, since the total heap is already quite small and increasing the region size even more would lead to other problems.
Also, increasing the region size poses other problems, e.g. a higher copy cost for large objects.

Output of -XX:+G1PrintRegionLivenessInfo contains many lines like the following:

###   HUMS 0x0000000750100000-0x0000000750200000    1048576    1048576    1048576           
###   HUMC 0x0000000750200000-0x0000000750300000         16         16         168

or 

###   HUMS 0x000000076ed00000-0x000000076ee00000    1048576    1048576    1048576
###   HUMC 0x000000076ee00000-0x000000076ef00000    1048576    1048576    1048576
###   HUMC 0x000000076ef00000-0x000000076f000000    1048576    1048576    1048576
###   HUMC 0x000000076f000000-0x000000076f100000    1048576    1048576    1048576
###   HUMC 0x000000076f100000-0x000000076f200000         16         16         16           

or even worse (4M region size)

###   HUMS 0x000000073d400000-0x000000073d800000    4194304    4194304    4194304  
###   HUMC 0x000000073d800000-0x000000073dc00000         16         16         16             
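
To put numbers on the listings above, here is a minimal arithmetic sketch. It assumes that the first size column is the used bytes of each region and that each HUMS/HUMC group holds exactly one object; the class and method names are made up for illustration and are not HotSpot code.

```java
// Toy arithmetic only; not HotSpot code. Sizes are read off the listings above,
// assuming the first size column is "used bytes" per region.
final class HumongousWaste {
    /** Number of whole regions needed to hold an object of objectBytes. */
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes; // ceiling division
    }

    /** Bytes left unused after the end of the object in its last region. */
    static long tailWaste(long objectBytes, long regionBytes) {
        return regionsNeeded(objectBytes, regionBytes) * regionBytes - objectBytes;
    }

    public static void main(String[] args) {
        long[][] cases = {
            {1048576L + 16, 1048576L},     // 1M regions: HUMS + 16-byte HUMC
            {4 * 1048576L + 16, 1048576L}, // 1M regions: HUMS + 3 full HUMC + 16-byte HUMC
            {4194304L + 16, 4194304L},     // 4M regions: HUMS + 16-byte HUMC
        };
        for (long[] c : cases) {
            long regions = regionsNeeded(c[0], c[1]);
            long waste = tailWaste(c[0], c[1]);
            System.out.printf("object=%d regions=%d wasted=%d (%.1f%% of footprint)%n",
                    c[0], regions, waste, 100.0 * waste / (regions * c[1]));
        }
    }
}
```

Under these assumptions the unused tail is just under 1 MB in the first two cases and just under 4 MB in the third, which is why a larger region size can make the waste per object worse rather than better.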

One option is to make the application do more GC-aware allocations, which seems possible in the above cases, but in others this might not be easy.

The proposed solution to investigate for this case is whether it is useful to allow allocation into the tail portion of such humongous objects.

Changes should be roughly limited to:
- start allocation after the end of the last HUMC block (it might be useful to introduce new HUMT, i.e. Humongous Tail regions)
- when reclaiming large objects, instead of freeing them immediately, check whether their tail region has been allocated into (one potentially simple check is whether the top pointer of that tail region is after the end of the object). If so, relabel that region as old and format the space of the large object in it as an array object.
- evacuation: allow evacuation of everything but the large object itself(?); the allocation into the PLAB should fail if we come across the object anyway, so we can check that in this path.
We need to clean up the remembered set for these regions though (if we allow allocation into single-region large objects - humongous tail regions cannot have a remembered set for the humongous object itself). In that case we should not free that region later either.
In a first prototype we can probably ignore humongous objects taking only a single region.
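
The first two points above can be sketched as a toy model. This is only an illustration under stated assumptions: the class, field and method names are hypothetical, addresses are plain numbers, and nothing here mirrors the actual HotSpot region code or the attached patch. Evacuation and remembered set handling are left out.

```java
// Toy model only; all names are hypothetical and do not mirror HotSpot's region classes.
final class TailRegionSketch {
    enum Kind { HUMONGOUS_TAIL, OLD, FREE }

    static final class Region {
        final long bottom, end; // region bounds [bottom, end)
        final long objectEnd;   // where the humongous object ends inside this region
        long top;               // bump pointer; starts right after the object
        Kind kind = Kind.HUMONGOUS_TAIL;

        Region(long bottom, long end, long objectEnd) {
            this.bottom = bottom;
            this.end = end;
            this.objectEnd = objectEnd;
            this.top = objectEnd;
        }

        /** Bump-allocate behind the humongous object; returns the address, or -1 if it does not fit. */
        long allocate(long bytes) {
            if (kind == Kind.FREE || top + bytes > end) {
                return -1;
            }
            long addr = top;
            top += bytes;
            return addr;
        }

        /** Called when the humongous object itself is found dead. */
        void reclaimHumongous() {
            if (top > objectEnd) {
                // Something was allocated into the tail: keep the region and relabel it as old.
                // A real implementation would also format [bottom, objectEnd) as an array object.
                kind = Kind.OLD;
            } else {
                // Nothing else lives here, so the region can be freed as before.
                kind = Kind.FREE;
                top = bottom;
            }
        }
    }

    public static void main(String[] args) {
        Region used = new Region(0L, 1048576L, 16L);
        used.allocate(64);
        used.reclaimHumongous();
        Region untouched = new Region(0L, 1048576L, 16L);
        untouched.reclaimHumongous();
        System.out.println(used.kind + " " + untouched.kind);
    }
}
```

The check in reclaimHumongous is the "top pointer after the end of the object" test mentioned in the list; a region whose tail was never used is freed exactly as it would be without the change.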

Added a patch file called hum-alloc-tail with a patch based on top of: http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/6bd5c687f11a
This patch worked for the benchmarks we ran, but we did not find a benchmark where it gave a significant improvement. Given that it would make the code more complex, it was decided not to integrate it. Will close this bug now.

01-03-2016

Patch based on http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/6bd5c687f11a

01-03-2016

Relates :	JDK-8027918 - Investigate increasing large object size threshold for G1
Relates :	JDK-8172713 - Allow allocation into the tail end of humongous objects