JDK-6819098 : G1: reduce RSet scanning times
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: hs15
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2009-03-18
  • Updated: 2013-09-18
  • Resolved: 2011-03-08
Release   Version   Status
JDK 6     6u14      Fixed
JDK 7     7         Fixed
Other     hs14      Fixed
We recently tried G1 on a 32GB heap with a 4GB and an 8GB young gen. Even though not much was copied during GC (so we can assume that the RSets of the collection set were relatively empty), the RSet scanning times were higher than we would have expected: around 5ms on average for the 4GB young gen and 13ms on average for the 8GB young gen. We should check whether there is a bottleneck somewhere that we can remove to speed up the RSet scanning code.
Modified the previous naive work-stealing algorithm by introducing feedback-driven exponential skipping.
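
The report does not spell out how the skipping works, so the following is only a minimal sketch of one plausible reading, not the HotSpot implementation. It assumes that a worker walking the collection set doubles its skip distance each time it lands on a region another worker has already claimed (the feedback), and resets the distance to 1 after a successful claim. The function name `walk_collection_set`, the list-of-flags representation, and the scenario are all hypothetical.

```python
def walk_collection_set(claimed, exponential_skipping):
    """Walk the region claim flags once, claiming every free region reached.

    claimed: list of bools, True where another worker already owns the region.
    Returns (indices claimed by this worker, number of failed claim attempts).
    Hypothetical model only; the real code uses atomic claims on RSets.
    """
    mine = []
    failed = 0
    pos = 0
    stride = 1  # assumed skip distance, in regions
    while pos < len(claimed):
        if claimed[pos]:
            # Failed claim: jump ahead, and (optionally) jump further next time.
            failed += 1
            pos += stride
            if exponential_skipping:
                stride *= 2
        else:
            # Successful claim: take the region and fall back to stepping by 1.
            claimed[pos] = True
            mine.append(pos)
            stride = 1
            pos += 1
    return mine, failed


# A late-starting worker finds the first 1000 of 2048 regions already claimed.
flags = [True] * 1000 + [False] * 1048
linear_mine, linear_failed = walk_collection_set(list(flags), False)
skip_mine, skip_failed = walk_collection_set(list(flags), True)
print(linear_failed, skip_failed)  # 1000 10
```

In this model the skipping walk overshoots and leaves regions 1000 to 1022 unclaimed, so a real scheme would need the other workers or a later pass to pick those up. The sketch only illustrates why far fewer claim attempts hit already-claimed regions.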

Testing: JBB2005 on a 16-core Intel Core 2 box with a 30G heap (25G young gen) and 13 GC threads. The RSet scanning times improved by ~600%.

EVALUATION Approved for JDK 7 M3 build 59.

EVALUATION http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/b803b1b9e206

EVALUATION One extra data point: using larger regions (8MB instead of 1MB) decreases the RSet scanning time dramatically, given that larger regions mean fewer regions in the CSet. See also CR 6819085.
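
As a rough illustration of the region-count effect, the arithmetic below assumes that the whole young gen ends up in the CSet and reuses the 4GB and 8GB young gen sizes from the description. The report does not say which configuration this data point came from, so treat the inputs as assumptions; `region_count` is a hypothetical helper.

```python
MB = 1024 * 1024
GB = 1024 * MB


def region_count(young_gen_bytes, region_bytes):
    """Number of fixed-size regions needed to cover the young gen."""
    return young_gen_bytes // region_bytes


for young_gb in (4, 8):
    for region_mb in (1, 8):
        n = region_count(young_gb * GB, region_mb * MB)
        print(f"{young_gb}GB young gen, {region_mb}MB regions: {n} regions")
```

Under these assumptions, moving from 1MB to 8MB regions cuts the number of CSet regions each GC thread has to iterate over and try to claim by a factor of 8.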

EVALUATION Initially I thought that the RSet scanning time might be dominated by the iteration over the collection set, which would be hard to speed up. However, I'm no longer so sure. First, the scanning time is roughly 2.5 times longer for a young gen twice the size. Second, because of another bug (see 6819077), thread 0 starts late into the GC and doesn't actually scan any RSets (but it does iterate over the CSet trying to find RSets to claim). Its times are 1.4ms for the 4GB and 2.4ms for the 8GB young gen, so the iteration itself seems reasonably short. The bottleneck therefore appears to be contention (somehow) between the GC threads during the RSet scanning operation.
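
The comparison above can be checked with a few lines of arithmetic. This sketch uses only the averages quoted in this report; the variable names are illustrative, and since the figures are approximate the ratios are too.

```python
# Averages quoted in the report, in milliseconds, keyed by young gen size in GB.
rset_scan_ms = {4: 5.0, 8: 13.0}
thread0_iteration_ms = {4: 1.4, 8: 2.4}

# Growth when the young gen doubles from 4GB to 8GB.
scan_growth = rset_scan_ms[8] / rset_scan_ms[4]
iteration_growth = thread0_iteration_ms[8] / thread0_iteration_ms[4]

# Share of the average scan time that the bare CSet iteration accounts for.
iteration_share = {gb: thread0_iteration_ms[gb] / rset_scan_ms[gb] for gb in (4, 8)}

print(f"scan time growth:      {scan_growth:.2f}x")
print(f"iteration time growth: {iteration_growth:.2f}x")
for gb, share in iteration_share.items():
    print(f"{gb}GB young gen: iteration is {share:.0%} of scan time")
```

The scan time grows by about 2.6x while the bare iteration grows by about 1.7x and accounts for well under a third of the scan time in both configurations, which is consistent with the iteration not being the dominant cost.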