Bug ID: JDK-8158045 Improve large object handling during evacuation

Versions (Unresolved/Resolved/Fixed)

JDK 16: Fixed in 16 b16

On large heaps we use 32M regions, which means that the maximum size of an object that is not allocated directly in the old generation is 16M.
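The arithmetic behind that figure can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and the rule that the limit is half the region size is an assumption inferred from the 32M/16M numbers above.

```java
// Hypothetical helper, not HotSpot code: relates region size to the largest
// object that is still evacuated rather than allocated directly in old.
class RegionThreshold {
    // Assumption: the limit is half of the region size (32M -> 16M).
    static long maxEvacuatedObjectBytes(long regionBytes) {
        return regionBytes / 2;
    }

    public static void main(String[] args) {
        long regionBytes = 32L * 1024 * 1024;
        long limitBytes = maxEvacuatedObjectBytes(regionBytes);
        System.out.println("limit = " + (limitBytes / (1024 * 1024)) + "M");
    }
}
```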

This is already quite large, so if there are several of these objects to copy and scan and many threads are available, we effectively serialize the GC.

Further, the current way of splitting the scanning of large objects is itself a form of serialization: we only create a head and a single tail, where the head is immediately processed by the current thread and the tail is pushed onto the object stack.

There are a few problems with this mechanism:
- the amount of achievable parallelism is very small, as we only generate one tail object. I.e. only one other thread can steal it, and then do the same.
- the split threshold is too small, i.e. we split at ~100 words into parts of 50 words or so. Because of this, the original thread may immediately claim the tail itself (particularly if it is a value type array), so effectively nothing is parallelized.
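
The head/tail mechanism described above can be modelled with a small single-threaded sketch. It is not the HotSpot implementation (which is C++ and uses per-thread task queues with stealing); the class, constants and method names are hypothetical, and the 100-word threshold and 50-word chunk are taken from the approximate figures in the text. The model only counts how many tails for one array are queued at a time, which bounds how many other threads could steal work from it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of head/tail splitting for one large array.
class HeadTailSplit {
    static final int SPLIT_THRESHOLD_WORDS = 100; // "~100 words" in the text
    static final int CHUNK_WORDS = 50;            // "50 words or so"

    // A range [from, to) of the array that still has to be scanned.
    static final class Range {
        final int from;
        final int to;
        Range(int from, int to) { this.from = from; this.to = to; }
    }

    // Returns {tasks processed, maximum number of tails queued at once}.
    static int[] scan(int lengthWords) {
        Deque<Range> stack = new ArrayDeque<>();
        stack.push(new Range(0, lengthWords));
        int processed = 0;
        int maxQueued = 0;
        while (!stack.isEmpty()) {
            Range r = stack.pop();
            if (r.to - r.from > SPLIT_THRESHOLD_WORDS) {
                // Keep a head of CHUNK_WORDS for this thread and push one
                // tail covering everything else.
                stack.push(new Range(r.from + CHUNK_WORDS, r.to));
            }
            // The head (or the whole small remainder) would be scanned here.
            processed++;
            maxQueued = Math.max(maxQueued, stack.size());
        }
        return new int[] { processed, maxQueued };
    }

    public static void main(String[] args) {
        int[] result = scan(1000);
        System.out.println("tasks=" + result[0] + " maxQueued=" + result[1]);
    }
}
```

For a 1000-word array this model processes 19 tasks in sequence and never has more than one tail queued, so at most one other thread could pick up work for that array at any moment.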

We have seen serious serialization (termination times in the seconds per thread) on large young gens (many GBs) on systems with many threads.

Cloners :	JDK-8253237 - [REDO] Improve large object handling during evacuation
Relates :	JDK-8252330 - [lworld] GC array task splitting doesn't split arrays of inline types
Relates :	JDK-8253169 - [BACKOUT] Improve large object handling during evacuation