JDK-8153176 : Long pause in ParOldGC, because ParallelTaskTerminator peeks wrong TaskQueueSet
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 9
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2016-03-31
  • Updated: 2017-11-29
  • Resolved: 2016-04-05
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 7 JDK 8 JDK 9
7u131Fixed 8u152Fixed 9 b115Fixed
Description
Symptom: sporadic long marking phases (up to n times longer, where n is the number of HW threads) with parallel old GC, if the object graph has many linear subgraphs.

Analysis: ParallelTaskTerminator::offer_termination() does not discover objects to scan on the marking stacks of co-worker threads, because it looks at the wrong TaskQueueSet. So instead of participating in the work, a gc thread that reaches offer_termination() will do SpinPause until the marking has been finished - potentially by just one co-worker thread.

Fix: let ParallelTaskTerminator check ParCompactionManager::stack_array() instead of ParCompactionManager::region_array().

A specialized microbenchmark, TestStealMarkingTask (source attached and results below), shows about 3x shorter parallel old full GCs with the patch.

Benchmark run without patch
---------------------------
$ jdk9/bin/java -Xms2g -Xmx2g -XX:+UseParallelGC -cp /usr/work/d038402/jtests/QuickJavaTests/bin test.gc.TestStealMarkingTask 10 10000 10000 2
Building list of length 10000 holding sub-lists of length 10000
Heap stats: free:1921 MB (97.9 %)  max:1963 MB
done
Heap stats: free:808 MB (41.2 %)  max:1963 MB
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 3359 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 3087 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 3024 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 3535 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 2831 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 3313 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 4919 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 3624 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 2689 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 3418 ms


Benchmark run without patch: about 3x shorter full GC pauses
------------------------------------------------------------
$ jdk9_patched/bin/java -Xms2g -Xmx2g -XX:+UseParallelGC -cp /usr/work/d038402/jtests/QuickJavaTests/bin test.gc.TestStealMarkingTask 10 10000 10000 2
Building list of length 10000 holding sub-lists of length 10000
Heap stats: free:1921 MB (97.9 %)  max:1963 MB
done
Heap stats: free:808 MB (41.2 %)  max:1963 MB
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 1536 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 1255 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 1351 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 1302 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 1281 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 1259 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 1257 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 1292 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 1305 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %)  max:1963 MB
Full GC duration: 1282 ms

Comments
Wow. I noticed that the same bug exists in RefProcTaskExecutor::execute
31-03-2016

I created the bug, but it was analyzed and reported by Richard Reingruber.
31-03-2016