JDK-8243668 : DaCapo-fop and SPECjvm2008.lu-large regress using G1 in Windows after JDK-8241670
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 15
  • Priority: P2
  • Status: Closed
  • Resolution: Won't Fix
  • Submitted: 2020-04-27
  • Updated: 2020-05-18
  • Resolved: 2020-05-18
Description
Some regressions on Windows made it all the way to retriage in 15-b18.
Comments
Since the issues we are seeing are not really product problems, but rather caused by the way the tests are run or by environmental problems, we close this as Won't Fix.
18-05-2020

Yet another update. I now understand what's causing the long system time: it seems to be touching pages for the first time. If I run with -XX:+AlwaysPreTouch the long system times go away. The reason the long system times go away after a while in the original runs is that after the first concurrent cycle the Remark pause will free old regions, and G1 then starts re-using regions that have already been touched. The reason for the regression is that after build 18 we are no longer lucky enough to do the concurrent cycle before the measurements start.
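For reference, the first-touch cost can be reproduced outside the JVM with a small Windows-only C++ sketch along the following lines. This is only an illustration of the page-fault/zeroing effect that hits committed-but-untouched memory; it is not HotSpot code, and the 1 GB size and 4 KB stride are arbitrary choices.

// Standalone illustration (not HotSpot code): commit a large range with
// VirtualAlloc and time the first write pass, which takes the soft page
// faults and kernel zeroing (system time), against a second pass over
// already-touched pages. -XX:+AlwaysPreTouch moves this cost to startup.
#include <windows.h>
#include <cstdio>

int main() {
    const size_t size = 1ull << 30; // 1 GB, arbitrary for the demo
    char* mem = static_cast<char*>(
        VirtualAlloc(nullptr, size, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE));
    if (mem == nullptr) {
        return 1;
    }

    LARGE_INTEGER freq, t0, t1, t2;
    QueryPerformanceFrequency(&freq);

    QueryPerformanceCounter(&t0);
    for (size_t i = 0; i < size; i += 4096) {
        mem[i] = 1;                 // first touch: page fault + zeroing
    }
    QueryPerformanceCounter(&t1);
    for (size_t i = 0; i < size; i += 4096) {
        mem[i] = 2;                 // pages already backed: much cheaper
    }
    QueryPerformanceCounter(&t2);

    printf("first touch:  %.1f ms\n",
           1000.0 * (t1.QuadPart - t0.QuadPart) / freq.QuadPart);
    printf("second touch: %.1f ms\n",
           1000.0 * (t2.QuadPart - t1.QuadPart) / freq.QuadPart);

    VirtualFree(mem, 0, MEM_RELEASE);
    return 0;
}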
05-05-2020

An update about the SPECjvm2008 test. Looking at the logs of a run with extra logging shows that both runs have many GCs where the system time is high. I currently don't know what's causing this, maybe swapping or something. Another strange thing is that the b17 run stops having this high system time around the time when the actual measurements start, possibly because the heap has just been shrunk down to around 10g. With b18 this happens a bit later, and for a while that run also stops having high system times. Later in the run the high system time comes back even though the heap is still below 10g. If we could understand what's causing the system time and avoid it, it would be easier to determine whether the actual change in sizing and ergonomics causes a real regression. Will do some more runs to try to narrow it down.
04-05-2020

The lower number of workers in the young collections is caused by using fewer workers in the Full collections. In the Full collection the region size is taken into consideration, since there will on average be half a region of waste per worker. Our policy is then to never shrink the number of workers by more than half of the previous value, so in the smaller-region case we got more workers during the test:

b17:
[2.353s][info ][gc,task ] GC(12) Using 25 workers of 35 for full compaction
[2.529s][info ][gc,task ] GC(13) Using 14 workers of 35 for evacuation
[2.618s][info ][gc,task ] GC(14) Using 8 workers of 35 for evacuation
[2.650s][info ][gc,task ] GC(15) Using 2 workers of 35 for full compaction
[2.713s][info ][gc,task ] GC(16) Using 2 workers of 35 for evacuation
[2.726s][info ][gc,task ] GC(17) Using 2 workers of 35 for evacuation
[2.969s][info ][gc,task ] GC(18) Using 25 workers of 35 for full compaction
[3.132s][info ][gc,task ] GC(19) Using 13 workers of 35 for evacuation
[3.207s][info ][gc,task ] GC(20) Using 7 workers of 35 for evacuation
[3.263s][info ][gc,task ] GC(21) Using 2 workers of 35 for full compaction
[3.314s][info ][gc,task ] GC(22) Using 2 workers of 35 for evacuation
[3.327s][info ][gc,task ] GC(23) Using 2 workers of 35 for evacuation
[3.551s][info ][gc,task ] GC(24) Using 25 workers of 35 for full compaction
[3.652s][info ][gc,task ] GC(25) Using 13 workers of 35 for evacuation
[3.690s][info ][gc,task ] GC(26) Using 7 workers of 35 for evacuation
[3.717s][info ][gc,task ] GC(27) Using 4 workers of 35 for evacuation
[3.742s][info ][gc,task ] GC(28) Using 3 workers of 35 for evacuation
[3.853s][info ][gc,task ] GC(29) Using 25 workers of 35 for full compaction
[3.955s][info ][gc,task ] GC(30) Using 13 workers of 35 for evacuation

b18:
[2.355s][info ][gc,task ] GC(12) Using 6 workers of 35 for full compaction
[2.569s][info ][gc,task ] GC(13) Using 5 workers of 35 for evacuation
[2.652s][info ][gc,task ] GC(14) Using 1 workers of 35 for full compaction
[2.740s][info ][gc,task ] GC(15) Using 2 workers of 35 for evacuation
[2.780s][info ][gc,task ] GC(16) Using 2 workers of 35 for evacuation
[2.826s][info ][gc,task ] GC(17) Using 2 workers of 35 for evacuation
[2.969s][info ][gc,task ] GC(18) Using 6 workers of 35 for full compaction
[3.167s][info ][gc,task ] GC(19) Using 5 workers of 35 for evacuation
[3.246s][info ][gc,task ] GC(20) Using 1 workers of 35 for full compaction
[3.325s][info ][gc,task ] GC(21) Using 2 workers of 35 for evacuation
[3.363s][info ][gc,task ] GC(22) Using 2 workers of 35 for evacuation
[3.406s][info ][gc,task ] GC(23) Using 2 workers of 35 for evacuation
[3.548s][info ][gc,task ] GC(24) Using 6 workers of 35 for full compaction
[3.780s][info ][gc,task ] GC(25) Using 5 workers of 35 for evacuation
[3.828s][info ][gc,task ] GC(26) Using 1 workers of 35 for full compaction
[3.909s][info ][gc,task ] GC(27) Using 2 workers of 35 for evacuation
[3.947s][info ][gc,task ] GC(28) Using 2 workers of 35 for evacuation
[3.989s][info ][gc,task ] GC(29) Using 2 workers of 35 for evacuation
[4.123s][info ][gc,task ] GC(30) Using 6 workers of 35 for full compaction

I'm not sure the problems we see here warrant any change to avoid the regression in the DaCapo benchmark.
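To make the interaction concrete, below is a toy C++ model of the two effects described above. The formula and constants are made up for illustration and are not the actual HotSpot worker-policy code: larger regions mean more waste per worker and therefore fewer workers in the full compaction, and because the worker count is never shrunk by more than half per collection, a low full-GC count also drags down the following evacuations.

// Toy model of the interaction described above; the real logic lives in
// HotSpot's worker policy and G1 sizing code, and the formula and constants
// below are invented purely to show the direction of the two effects.
#include <algorithm>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: workers wanted for a full collection, assuming each
// worker costs a fixed amount of work plus half a region of waste.
static unsigned full_gc_workers(size_t used_bytes, size_t region_size,
                                unsigned max_workers) {
    const size_t per_worker_min_work = 2 * 1024 * 1024;   // assumed constant
    size_t per_worker = per_worker_min_work + region_size / 2;
    size_t wanted = std::max<size_t>(1, used_bytes / per_worker);
    return static_cast<unsigned>(std::min<size_t>(wanted, max_workers));
}

// Hypothetical helper: clamp a newly calculated worker count so it never
// drops below half of the previously used count.
static unsigned limit_shrinking(unsigned wanted, unsigned prev_workers) {
    return std::max(wanted, prev_workers / 2);
}

int main() {
    const unsigned max_workers = 35;
    const size_t used = 64 * 1024 * 1024;    // assumed live data after full GC

    for (size_t region_mb : {1, 32}) {       // smaller vs larger region size
        unsigned prev = full_gc_workers(used, region_mb * 1024 * 1024, max_workers);
        printf("region %zuM: full GC uses %u workers", region_mb, prev);
        // The young collections after the full GC want very few workers
        // (small young gen), but shrinking is limited to half the previous value.
        const unsigned young_wanted = 2;
        for (int i = 0; i < 3; i++) {
            prev = limit_shrinking(young_wanted, prev);
            printf(", then %u", prev);
        }
        printf(" for evacuation\n");
    }
    return 0;
}

With these arbitrary constants the smaller-region run keeps double-digit worker counts for a couple of evacuations after each full compaction, while the larger-region run stays at two or three, which is the same shape as the logs above.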
30-04-2020

The runs with more logs provide some more insight. For the DaCapo benchmark the GC pause times from b17 and b18 show a regression:

b17 average young pause: 8ms, total young pause time: 460ms
b18 average young pause: 13ms, total young pause time: 588ms

Longer pauses mean a worse score in this benchmark, and looking at the log it looks like the longer pauses are caused by using fewer workers. This benchmark does a System.gc() between each iteration and doesn't allocate that much, so the heap gets pretty small. I still don't understand why fewer regions would yield fewer workers for the evacuation. Reading the code, it should depend on the size, not the number, of regions. Need to look closer at this.
29-04-2020

A quick look at the log for lu-large shows that we manage to keep the heap smaller after JDK-8241670. Haven't done any more in-depth analysis, but I will start a run to get some more logs.
29-04-2020

lu-large regression: ~3.55%
DaCapo-fop regression: ~5.71%
28-04-2020