JDK-8167077 : Limit deferred card marking for (large) objArrays with G1
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 9,10
  • Priority: P3
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2016-10-04
  • Updated: 2019-01-15
Description
Deferred card marking is an optimization that batches the barrier application for initial stores to objects to increase performance.
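
Roughly illustrated with a Java-flavored sketch (this is not HotSpot code; the two variants only mark, in comments, where the barriers conceptually end up):

// Without the optimization, every reference store into the new object gets its
// own post-barrier (card mark):
static Object[] withoutDeferral(Object a, Object b) {
  Object[] x = new Object[2];
  x[0] = a;   // card mark for the card containing x[0]
  x[1] = b;   // card mark for the card containing x[1]
  return x;
}

// With deferred card marking, the per-store barriers for the freshly allocated
// object are omitted; instead the new object is recorded once, and the GC later
// rescans it to find the references stored during initialization:
static Object[] withDeferral(Object a, Object b) {
  Object[] x = new Object[2];
  x[0] = a;   // no barrier emitted
  x[1] = b;   // no barrier emitted
  // one deferred card mark covering the entire new object
  return x;
}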

In conjunction with large objArrays, this optimization can cause unacceptable pause times, as scanning these arrays takes a significant amount of time. This by itself breaks pause time requirements.

The applicability of this optimization to objArrays also seems to be very limited (but that may be another issue).

E.g.

public class Test {
  private static Object foo() {
    Object i = new Integer(1);
    Object[] x = new Object[123];
    x[0] = i;
    return x;
  }
  public static void main(String ...args) {
    for (int i = 0; i < 1_0000_0000; i++) {
      foo();
    }
  }
}

works, but something like

private static Object b = new Object();

private static Object foo() {
  Object[] x = new Object[123];
  x[0] = b;
  return x;
}

does not.
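
For reference, a complete, compilable form of this second case might look like the following (the class name is made up here; otherwise it mirrors the first example):

public class Test2 {
  private static Object b = new Object();

  private static Object foo() {
    Object[] x = new Object[123];
    x[0] = b;   // stores a pre-existing object instead of one allocated in the same method
    return x;
  }

  public static void main(String ...args) {
    for (int i = 0; i < 1_0000_0000; i++) {
      foo();
    }
  }
}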

This example also shows the limitations of this optimization in another way: it requires each element to be assigned explicitly, because any other form of initialization (like a loop) will not omit the barriers either (and in the general case cannot, because loops typically have forced safepoint locations at their back-edges). Initializing large objArrays using individual assignments seems very unlikely.

This means that this optimization may not be too useful for (large) objArrays, but may still be very effective for member initialization of regular objects.

Measurements on some microbenchmarks that do nothing but

private static void foo() {
  for (int i = 0; i < 5000; i++) {
    // someValue is a placeholder for the objArray length being measured
    Object temp = new Object[someValue];
  }
}

show that pause time targets can easily be missed. (Note: it is not clear why the compiler emits any deferred card mark in this case, as there is no initializing store at all here. That may be another issue.)

E.g. an objArray with ~1.5M entries (~13 MB in size) results in 22ms avg/36ms max *additional* pause.
Even smaller objArrays with ~650k entries (~5 MB in size) cause 6ms avg/10ms max of additional pause.
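
For reference, a self-contained form of such a microbenchmark might look like the following sketch (class name, iteration counts and the default array length are illustrative assumptions, not the original measurement setup):

public class ObjArrayAllocBench {
  // keep a reference around so the allocations are not trivially optimized away
  private static volatile Object sink;

  private static void foo(int length) {
    for (int i = 0; i < 5000; i++) {
      sink = new Object[length];
    }
  }

  public static void main(String ...args) {
    int length = args.length > 0 ? Integer.parseInt(args[0]) : 1_500_000;
    for (int i = 0; i < 1_000; i++) {
      foo(length);
    }
  }
}

Running this with e.g. -XX:+UseG1GC -XX:MaxGCPauseMillis=30 -Xlog:gc,gc+phases=debug should make the additional pause component visible; comparing against a run with -XX:-ReduceInitialCardMarks (the flag controlling this deferred card marking, possibly requiring -XX:+UnlockDiagnosticVMOptions depending on the build) is presumably the easiest way to isolate it.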

In workloads where a small, deterministic pause time is expected (e.g. 30ms), this optimization already eats up a large part of the pause that should preferably be spent on actual space reclamation (or actually causes pause time goals to be missed).

Another option would be to be a bit more precise about the initializing stores, i.e. only pass the area(s) where initializing stores actually occur to the GC. E.g. in this case:

  private static Object foo() {
    Object i = new Integer(1);
    Object[] x = new Object[123];
    x[0] = i;
    // ... stores to the elements in between ...
    x[10] = i;
    return x;
  }

only pass an area covering at least the first eleven elements of the array to the GC for later rescanning, instead of the entire object.
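
As a rough sketch of the difference in the area that would have to be rescanned, counted in cards (all constants are assumptions for illustration only: 512-byte cards, a 16-byte array header, 8-byte elements; actual HotSpot card geometry and interfaces differ):

static final long CARD_SIZE = 512;  // assumed card size in bytes
static final long HEADER    = 16;   // assumed objArray header size in bytes
static final long ELEM      = 8;    // assumed element size (uncompressed oops)

// cards covering the whole array (what gets rescanned today)
static long cardsForWholeArray(long arrayBase, long length) {
  long end = arrayBase + HEADER + length * ELEM - 1;
  return end / CARD_SIZE - arrayBase / CARD_SIZE + 1;
}

// cards covering only elements [0, lastInitialized] (the proposed narrower area)
static long cardsForInitializedPrefix(long arrayBase, long lastInitialized) {
  long start = arrayBase + HEADER;
  long end   = start + (lastInitialized + 1) * ELEM - 1;
  return end / CARD_SIZE - start / CARD_SIZE + 1;
}

// For a ~1.5M-entry array this is tens of thousands of cards for the whole object,
// but typically a single card for the x[0]..x[10] prefix of the example above.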