JDK-8294206 : Concurrent refinement thread adjustment and (de-)activation suboptimal
  • Type: CSR
  • Component: hotspot
  • Sub-Component: gc
  • Priority: P3
  • Status: Draft
  • Resolution: Unresolved
  • Fix Versions: tbd
  • Submitted: 2022-09-22
  • Updated: 2022-10-15
Related Reports
CSR :  
Description
Summary
-------

Obsolete the following VM product options:

`-XX:-G1UseAdaptiveConcRefinement`<br>
`-XX:G1ConcRefinementGreenZone=`_buffer-count_<br>
`-XX:G1ConcRefinementYellowZone=`_buffer-count_<br>
`-XX:G1ConcRefinementRedZone=`_buffer-count_<br>
`-XX:G1ConcRefinementThresholdStep=`_buffer-count_

Add the following VM diagnostic option:

`-XX:-G1UseConcRefinement`

Problem
-------
The control system for G1 concurrent refinement is being replaced.  There are several command line options that affected the old control system.

By default, all of these options were initialized ergonomically.  Unless adaptive control was disabled (via `-XX:-G1UseAdaptiveConcRefinement`), values from the command line were only used as initial values, subject to adjustment by the adaptive controller.  Determining good values with the adaptive controller disabled was generally very hard, and likely impossible for some applications.

Oracle's Garbage Collection Tuning Guide mentions these options, with the admonishment "Change with caution because this may cause extremely long pauses."  Even the suggested configuration approach for improving throughput described in that document is quite risky and prone to harming performance.

These options are meaningless for the new controller.

Solution
--------

Obsolete the following VM product options, without any deprecation period:

`-XX:-G1UseAdaptiveConcRefinement`<br>
`-XX:G1ConcRefinementGreenZone=`_buffer-count_<br>
`-XX:G1ConcRefinementYellowZone=`_buffer-count_<br>
`-XX:G1ConcRefinementRedZone=`_buffer-count_<br>
`-XX:G1ConcRefinementThresholdStep=`_buffer-count_

As these options are meaningless for the new controller, the only way to provide a deprecation period would have been to continue to provide the old controller as an option. That isn't being done.

The new controller *could* have used `G1ConcRefinementGreenZone` to provide a fixed value for the target number of pending cards for GC refinement. That is the value that it controlled under the old controller, though it is poorly named for that purpose. That same value exists in the new controller. But using a fixed value instead of allowing the controller (old or new) to dynamically determine the value is unlikely to give good performance and behavior.

A configuration that was useful for some kinds of debugging and testing was to disable `G1UseAdaptiveConcRefinement` and set `G1ConcRefinementGreenZone` to a very large value, effectively disabling concurrent refinement.  To support this use case with the new controller, the `-XX:-G1UseConcRefinement` diagnostic option has been added (see JDK-8155996).
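
As a rough illustration of that debugging configuration, the command lines below contrast the old and new ways of turning off concurrent refinement. This is a sketch, not a verified recipe: `MyApp` is a hypothetical main class, the old green-zone value is left as a placeholder shell variable because the text only says "very large", and the second command assumes a JDK build that includes this change. Diagnostic options are only accepted when `-XX:+UnlockDiagnosticVMOptions` is also given.

```shell
# Old controller: disable adaptive control and make the green zone very large.
# LARGE_BUFFER_COUNT is a placeholder; no specific value is given in this CSR.
java -XX:+UseG1GC -XX:-G1UseAdaptiveConcRefinement \
     -XX:G1ConcRefinementGreenZone="${LARGE_BUFFER_COUNT}" MyApp

# New controller: G1UseConcRefinement is a diagnostic flag, so diagnostic
# options are unlocked first.
java -XX:+UseG1GC -XX:+UnlockDiagnosticVMOptions -XX:-G1UseConcRefinement MyApp
```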

The `-XX:-G1UseConcRefinement` option should not be used to (attempt to) improve throughput performance, even though the Tuning Guide suggested using `G1UseAdaptiveConcRefinement` and `G1ConcRefinementGreenZone` for that purpose.  The problems with the old controller that could make such a configuration helpful have been fixed by the new controller.  Users should instead follow the existing recommendation to consider adjusting `-XX:G1RSetUpdatingPauseTimePercent`.  That option remains unchanged.
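
For reference, setting that remaining option might look like the command below. The value `5` is purely illustrative (the default is `10`) and `MyApp` is a hypothetical main class; any real setting should come from measuring the application.

```shell
# Illustrative value only: adjusts the percentage of the pause-time goal that
# G1 budgets for processing remembered set updates during a collection pause.
java -XX:+UseG1GC -XX:G1RSetUpdatingPauseTimePercent=5 MyApp
```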

Specification
-------------

https://github.com/openjdk/jdk/pull/10256

Use of any of `-XX:+/-G1UseAdaptiveConcRefinement`,
`-XX:G1ConcRefinementGreenZone`, `-XX:G1ConcRefinementYellowZone`,
`-XX:G1ConcRefinementRedZone`, or `-XX:G1ConcRefinementThresholdStep` will no
longer affect concurrent refinement.  Instead, using any of these options will
print the usual obsolete option warning.
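
One way to check a launch script for leftover uses of these options is to pass one of them together with `-version`, as sketched below. This assumes a JDK build that includes this change; the value `8` is arbitrary, and the exact warning text is not reproduced here because it depends on the release in which the options are obsoleted.

```shell
# On a JDK with this change, the option is expected to be accepted but ignored,
# with the obsolete option warning printed alongside the version banner.
java -XX:+UseG1GC -XX:G1ConcRefinementGreenZone=8 -version
```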

New diagnostic option:

```
+  product(bool, G1UseConcRefinement, true, DIAGNOSTIC,                      \
+          "Control whether concurrent refinement is performed. "            \
+          "Disabling effectively ignores G1RSetUpdatingPauseTimePercent")   \
```


Comments
Moved back to draft. As part of dealing with an issue found during additional perf testing, we'll also end up removing G1ConcRefinementServiceIntervalMillis.
15-10-2022