JDK-8295669 : Regression ~8% in DaCapo-pmd-large in 20-b12
Type: Bug
Component: hotspot
Sub-Component: gc
Affected Version: 20
Priority: P3
Status: Resolved
Resolution: Other
Submitted: 2022-10-19
Updated: 2022-11-10
Resolved: 2022-11-10
This seems to be some unexpected interaction between the frequent System.gc()s, the changed (overall improved) young gen sizing, and the very short application runtime, as described in the comments. Since testing under more realistic conditions (e.g. disabling System.gc(), fixing the heap size) does not show any impact, closing this out.
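For reference, such a "more realistic" run roughly corresponds to pinning the heap size and disabling explicit GCs on the command line. This is only a hedged sketch; the DaCapo jar name, heap size and harness options are assumptions, not the exact configuration used:

  java -Xms2g -Xmx2g -XX:+DisableExplicitGC \
       -jar dacapo.jar -s large pmd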
10-11-2022
The current heap sizing policies for young-only collections and for full GCs constantly fight each other, expanding and shrinking the heap all the time with both the baseline and the changes. For the baseline it works like this:
baseline:
mutator phase: expand to 1.2G
full gc: notices that there is a lot of free space, shrinks back to 430M
repeat
changes:
full gc: shrink 450M -> 223M
mutator phase: pause time ratio is not met; expand 223M -> 1.2G; this is the only gc in this mutator phase because, due to the large expansion, the young gen gets fairly large (larger than before)
full gc: shrink 1.2G -> 450M
mutator phase: (do nothing, happy with the heap size)
repeat
Due to timing, it seems that with the changes a full gc can sometimes shrink the heap down to 2 old gen regions (vs. constantly 6 without the changes), so every second full gc G1 is able to shrink to 223M.
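The expand/shrink cycling can be observed directly with unified GC logging; a hedged sketch (these log selectors exist, but the exact selectors used for this analysis are not recorded here, and the benchmark invocation is again illustrative):

  java -Xlog:gc*,gc+heap=debug:file=gc.log \
       -jar dacapo.jar -s large pmd

The heap transitions printed at each pause then show the expansion during the mutator phase and the shrink at each full gc.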
24-10-2022
"Mess up prediction" = after that change, young generation heap sizes is much larger; g1 does 45% less gcs with the changes. See attached 20221024-pmd-pauses.png.
This seems to be detrimental to performance (i.e. not doing compacting pauses). With the changes G1 regularly does only one pause (due to the large young gen) between iterations (the System.gc()s, i.e. the spikes in the graph), while without the changes there is a very regular pattern of 4 young GCs per iteration.
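A minimal standalone sketch (not the DaCapo harness; the class name, allocation sizes and iteration count are made up) of the workload shape that triggers this interaction: allocation-heavy iterations separated by an explicit System.gc() roughly every 500ms.

// Sketch only: allocation-heavy iterations separated by explicit full GCs,
// mimicking a harness that calls System.gc() between benchmark iterations.
public class ExplicitGcPattern {
    public static void main(String[] args) throws InterruptedException {
        for (int iter = 0; iter < 50; iter++) {
            // Mutator phase: allocate plenty of short-lived data so G1 sizes
            // the young generation based on its pause-time predictions.
            byte[][] junk = new byte[4096][];
            for (int i = 0; i < junk.length; i++) {
                junk[i] = new byte[64 * 1024]; // roughly 256M of garbage per iteration
            }
            // Explicit full GC at the iteration boundary; per the analysis above,
            // each of these re-evaluates and typically shrinks the heap again.
            System.gc();
            Thread.sleep(500); // iteration cadence, illustrative only
        }
    }
}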
24-10-2022
The issue seems to be that the explicit GCs every iteration (500ms) mess up the prediction, i.e. with -XX:+DisableExplicitGC the scores are back. Reducing the heap region size only helps a little.
24-10-2022
Some local observations (between that jdk-20+11 build and latest), which need to be confirmed:
- total pause times are equal
- young gen (eden and survivor) are (on average) larger with latest
- after setting region size to 1M there is no difference
- with -XX:+DisableExplicitGC the difference is gone
So far it looks like the change increased the eden/young gen size, which in turn increased survivor space (the benchmark is run on a very large machine, so the increase is very large, from 1 to 2 regions; due to the very large region size resulting from the default max heap size this significantly increases survivor space). That in turn has negative effects on benchmark performance.
Limiting the increase in survivor size (i.e. with a 1M region size) seems to have a positive effect.
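The region-size experiments above correspond to a flag like the following; a hedged example, with the rest of the command line again illustrative:

  java -XX:G1HeapRegionSize=1m -jar dacapo.jar -s large pmd

With 1M regions, growing the survivor space from 1 to 2 regions adds only 1M instead of the much larger default region size chosen for the default max heap.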