Currently during marking cycles, self-forwarded objects in regions that experience an evacuation failure (i.e. objects that are not evacuated during a pause due to space limitations) are not explicitly marked in the NEXT marking bitmap. Instead the value of NTAMS for the region is set to the region bottom. This has the effect of making everything in the region implicitly live instead of just the self-forwarded objects.
The current strategy has a potentially negative side effect: at the end of marking, regions with an evacuation failure will look to be fairly full of live objects and, as such, may not be considered good candidates for collection during the upcoming mixed-GC phase.
If we changed the strategy so that the marking information is refined (i.e. the self-forwarded objects are explicitly marked - similar to what happens during an initial mark pause) then the region(s) will no longer look to be full of live objects. As a result a region could look like a good candidate for collection during the mixed GC phase and we might free up more space.
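The difference between the two strategies can be illustrated with a toy model. This is only a sketch and not HotSpot code: the `Region` class, the byte sizes, and the choice to leave NTAMS at the region top in the refined case are all assumptions made for illustration. Liveness is estimated as the explicitly marked bytes below NTAMS plus everything at or above NTAMS.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical stand-in for a heap region (not the HotSpot type)."""
    bottom: int
    top: int
    ntams: int
    # Explicitly marked objects: start address -> size in bytes.
    marked: dict = field(default_factory=dict)

    def live_bytes(self):
        # Objects below NTAMS are live only if explicitly marked.
        explicit = sum(size for addr, size in self.marked.items()
                       if addr < self.ntams)
        # Everything at or above NTAMS is implicitly live.
        implicit = self.top - self.ntams
        return explicit + implicit


def current_strategy(bottom, top, self_forwarded):
    # NTAMS is set to the region bottom, nothing is marked explicitly.
    return Region(bottom, top, ntams=bottom)


def refined_strategy(bottom, top, self_forwarded):
    # Assumption: NTAMS stays at top and self-forwarded objects are marked.
    return Region(bottom, top, ntams=top, marked=dict(self_forwarded))


# Two self-forwarded objects (64 and 32 bytes) in a 1024-byte region.
self_forwarded = {0: 64, 512: 32}
cur = current_strategy(0, 1024, self_forwarded)
ref = refined_strategy(0, 1024, self_forwarded)
```

In this model the current strategy reports the whole region as live, while the refined strategy reports only the bytes of the self-forwarded objects, which is what would make the region look like a better mixed-GC candidate.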
There are a couple of issues this change in strategy raises:
* Do we need to trace self-forwarded objects after they have been explicitly marked?
We do not need to trace/scan the reference fields in these explicitly marked self-forwarded objects. In the current strategy they are implicitly marked and we don't scan implicitly marked objects. When an object A is self-forwarded during an evacuation pause - its reference fields are scanned and the referenced objects are either successfully evacuated or are self-forwarded themselves. In either case, object A and the objects to which A refers are above NTAMS of their respective regions and, thus, are implicitly live.
A similar argument exists if the self-forwarded object A is explicitly marked. The objects to which A refers will either be successfully evacuated (in which case they will be implicitly live) or will be self-forwarded themselves (in which case they will be explicitly marked).
If the concurrent marking global finger is below a region that gets an evacuation failure then, when a marking thread claims the region, it will needlessly scan and trace the marked objects in that region (i.e. those objects that were self-forwarded), for the reason given above. So the marking thread(s) will end up doing some wasted work - unless the region is tagged in some way so that the marking threads do not scan the region.
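A minimal sketch of the tagging idea, under stated assumptions: the `evac_failed` flag, the `MarkRegion` class and the `scan_from_finger` helper are hypothetical names used for illustration only and do not correspond to anything in the attached patch. Regions are modelled as a list in address order and the global finger as an index into it.

```python
from dataclasses import dataclass, field


@dataclass
class MarkRegion:
    """Hypothetical region as seen by a marking thread."""
    marked_objects: list = field(default_factory=list)
    evac_failed: bool = False  # hypothetical tag set on evacuation failure


def scan_from_finger(regions, finger, honor_tag):
    """Return the objects a marking thread would scan at or above the finger."""
    scanned = []
    for region in regions[finger:]:
        if honor_tag and region.evac_failed:
            # Self-forwarded objects were already scanned during the pause,
            # so the tagged region is skipped entirely.
            continue
        scanned.extend(region.marked_objects)
    return scanned


regions = [
    MarkRegion(["a"]),                        # below the finger, already claimed
    MarkRegion(["b", "c"], evac_failed=True), # evacuation failure, above finger
    MarkRegion(["d"]),
]
wasted = scan_from_finger(regions, finger=1, honor_tag=False)
skipped = scan_from_finger(regions, finger=1, honor_tag=True)
```

Without the tag the marking thread rescans the self-forwarded objects `b` and `c`; with the tag it only scans `d`.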
One other issue came out of some experiments that were done with the strategy change listed above (code changes in the attached patch). It occurs when an evacuation failure is seen between remark and cleanup - the liveness counting data can become inconsistent (and in a potentially bad way).
Currently when an object is marked - it is also "counted" (i.e. included in some per-worker liveness counting data structures). During the remark pause the per-worker liveness data is aggregated (and the per-worker data is cleared). The 'global' liveness counting data is then finalized during the cleanup pause.
If we explicitly mark self-forwarded objects between the remark and cleanup pauses, any additions to the per-worker counting data are not currently included in the global counting data. To resolve this we need another aggregation phase during cleanup, or we relocate the current aggregation phase from remark to the cleanup pause (perhaps combine it with finalization of the counting data).
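The inconsistency and the proposed fix can be modelled in a few lines. This is a simplified sketch, not the HotSpot counting code: the `LivenessCounts` class, its method names, and the region labels are assumptions chosen for illustration.

```python
class LivenessCounts:
    """Toy model of per-worker and global liveness counting data."""

    def __init__(self, workers):
        self.per_worker = [dict() for _ in range(workers)]
        self.global_counts = {}

    def mark_and_count(self, worker, region, size):
        # Marking an object also adds its size to the worker's data.
        counts = self.per_worker[worker]
        counts[region] = counts.get(region, 0) + size

    def aggregate(self):
        # Fold the per-worker data into the global data, then clear it.
        for counts in self.per_worker:
            for region, size in counts.items():
                self.global_counts[region] = (
                    self.global_counts.get(region, 0) + size)
            counts.clear()

    def finalize(self, aggregate_first):
        # Cleanup pause: optionally aggregate again before finalizing.
        if aggregate_first:
            self.aggregate()
        return dict(self.global_counts)


def run(aggregate_at_cleanup):
    counts = LivenessCounts(workers=2)
    counts.mark_and_count(0, "r1", 100)
    counts.mark_and_count(1, "r1", 50)
    counts.aggregate()                    # remark pause
    counts.mark_and_count(0, "r2", 64)    # evacuation failure after remark
    return counts.finalize(aggregate_at_cleanup)


without_fix = run(aggregate_at_cleanup=False)
with_fix = run(aggregate_at_cleanup=True)
```

Without a second aggregation the 64 bytes marked in `r2` after remark never reach the global data; aggregating at cleanup (or moving the aggregation there) picks them up.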