JDK-8330847 : G1 accesses uninitialized memory when predicting eden copy time
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 17,21,22,23
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2024-04-22
  • Updated: 2024-06-03
  • Resolved: 2024-05-27
JDK 23: 23 b25 (Fixed)
Related Reports
Duplicate :  
Relates :  
Relates :  
Description
This is reproducible using `tomcat`, `jython` and other benchmarks from the DaCapo suite. It seems to happen in the first few GC cycles, but I have also seen it happen as late as the 7th cycle. It shows up in the logs as:
```
[189.667s][trace][gc,ergo,cset   ] GC(7) Added young regions to CSet. Eden: 64 regions, Survivors: 0 regions, predicted eden time: 342093961900175.06ms, predicted base time: 181.72ms, target pause time: 200.00ms, remaining time: 0.00ms
```

The unrealistically high time prediction causes G1 to not select any old regions for mixed collections.
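To illustrate why a bogus eden prediction starves the old-region selection, here is a minimal sketch of the budget arithmetic implied by the log line above. The function name and the clamped-subtraction formula are assumptions for illustration, not a quote of the G1 source; the 10.0 ms "healthy" eden time in the usage note is likewise made up.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not a HotSpot function): the pause-time budget, in
// milliseconds, that is left over for old regions once the predicted base
// time and the predicted eden copy time have been subtracted from the target.
// Assumption: the budget is clamped at zero rather than going negative.
double remaining_time_ms(double target_pause_ms,
                         double predicted_base_ms,
                         double predicted_eden_ms) {
    return std::max(target_pause_ms - predicted_base_ms - predicted_eden_ms, 0.0);
}
```

With the values from the log (`remaining_time_ms(200.00, 181.72, 342093961900175.06)`) the result clamps to 0.0, matching `remaining time: 0.00ms`. With an illustrative eden time of 10.0 ms the same call would leave roughly 8.28 ms for old regions.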

Possibly caused by this change: https://github.com/openjdk/jdk/pull/16344/files#diff-c9ffd02739befc1333f3d79c5b9dd13dd443b28db1a0172081c41e5750c97dd8R1102

Which seems to subvert the boundary checking here:
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/g1/g1SurvRateGroup.hpp#L78

In the cases I've seen, `count` has been equal to `G1SurvRateGroup::_stats_arrays_length`, so `count - 1` passes the check that `age < _stats_arrays_length` even though that slot has not been initialized yet.
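The following toy model sketches how an index can pass a length check and still point at a slot that was never written. It is not the HotSpot class: `ToySurvRateGroup`, `expand_unseeded`, `passes_bounds_check` and `is_seeded` are hypothetical names, and NaN is used as a stand-in for uninitialized memory so the sketch stays deterministic.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Marker for "allocated but never seeded". Real uninitialized memory holds
// arbitrary bits; NaN is only a deterministic stand-in for this sketch.
inline double unseeded_marker() {
    return std::numeric_limits<double>::quiet_NaN();
}

// Toy stand-in for a survivor rate group; NOT the HotSpot implementation.
struct ToySurvRateGroup {
    // size() plays the role of _stats_arrays_length.
    std::vector<double> accum_surv_rate_pred;

    // Grow the array so it can hold `count` entries, without seeding new slots.
    void expand_unseeded(std::size_t count) {
        if (count > accum_surv_rate_pred.size()) {
            accum_surv_rate_pred.resize(count, unseeded_marker());
        }
    }

    // The bounds check only asks whether the slot exists,
    // not whether anything meaningful was ever stored in it.
    bool passes_bounds_check(std::size_t age) const {
        return age < accum_surv_rate_pred.size();
    }

    // True if the slot holds a real prediction rather than the marker.
    bool is_seeded(std::size_t age) const {
        return !std::isnan(accum_surv_rate_pred[age]);
    }
};
```

In this model, growing a group with three seeded entries to `count == 4` makes index `count - 1` pass `passes_bounds_check` while `is_seeded(count - 1)` is still false, which is the shape of the problem described above.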
Comments
Changeset: f3d6fbf5 Author: Thomas Schatzl <tschatzl@openjdk.org> Date: 2024-05-27 11:20:10 +0000 URL: https://git.openjdk.org/jdk/commit/f3d6fbf52eac44734695935f73c5cfc0fb9ba167
27-05-2024

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/19364 Date: 2024-05-23 11:03:41 +0000
24-05-2024

I.e. currently the flow is like:

// Before GC
SurvRateGroup::stop_adding_regions() // expands predictor arrays if necessary
use predictors when predicting eden copy time
// After GC
SurvRateGroup::finalize_predictions() // updates predictions

Previously it has been:

// Before GC
SurvRateGroup::stop_adding_regions() // expands predictor arrays if necessary
use incrementally determined eden copy time based on region zero's prediction // that was the fix for JDK-8231579
// After GC
SurvRateGroup::finalize_predictions() // updates predictions
23-05-2024

The issue seems to have been introduced with JDK-8231579 where G1 started to use predictions for eden copy times when finalizing the young gen collection set. Previously G1 incrementally calculated eden copy time during mutator retirement (well, wrongly).
23-05-2024

When expanding the prediction arrays, the new entries of the `_accum_surv_rate_pred` array (and the actual predictors) are not seeded, so they contain uninitialized (random) data.
23-05-2024
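One way to picture a remedy for the unseeded entries described in the comments above is to seed every new slot at expansion time, so that any read between "before GC" and "after GC" sees a defined value. The sketch below is an assumption-laden model, not the code from the changeset: `SeededSurvRateModel`, its fields, the choice to copy the predecessor's rate, and the `initial_surv_rate` default are all hypothetical. Only the method name `stop_adding_regions` is borrowed from the flow described in the comments.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a survivor rate group; NOT HotSpot code.
struct SeededSurvRateModel {
    std::vector<double> surv_rate_pred;        // per-age survival rate prediction
    std::vector<double> accum_surv_rate_pred;  // running sum of surv_rate_pred
    double initial_surv_rate;                  // assumed default for the first slot

    explicit SeededSurvRateModel(double initial) : initial_surv_rate(initial) {}

    // "Before GC": grow the arrays if needed and seed each new slot from its
    // predecessor (assumption), so no slot is ever left undefined.
    void stop_adding_regions(std::size_t num_added_regions) {
        for (std::size_t i = surv_rate_pred.size(); i < num_added_regions; ++i) {
            double rate = (i == 0) ? initial_surv_rate : surv_rate_pred[i - 1];
            double prev_accum = (i == 0) ? 0.0 : accum_surv_rate_pred[i - 1];
            surv_rate_pred.push_back(rate);
            accum_surv_rate_pred.push_back(prev_accum + rate);
        }
    }

    // Read used while predicting eden copy time: the slot is always seeded.
    double accum_surv_rate(std::size_t age) const {
        assert(age < accum_surv_rate_pred.size());
        return accum_surv_rate_pred[age];
    }
};
```

In this model, calling `stop_adding_regions(2)` and later `stop_adding_regions(4)` on a group created with an initial rate of 0.5 leaves `accum_surv_rate(3)` at 2.0, because each new slot inherits the previous rate instead of whatever happened to be in memory.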