JDK-8066387 : G1GC induces 3s GC pause when JVM is asked to exit
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 8u20,9
  • Priority: P5
  • Status: Closed
  • Resolution: Duplicate
  • OS: linux_ubuntu
  • CPU: x86_64
  • Submitted: 2014-11-04
  • Updated: 2018-12-05
  • Resolved: 2018-12-05
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other
tbd_majorResolved
Related Reports
Duplicate :  
Relates :  
Description
FULL PRODUCT VERSION :
java version "1.8.0_20"
Java(TM) SE Runtime Environment (build 1.8.0_20-b26)
Java HotSpot(TM) 64-Bit Server VM (build 25.20-b23, mixed mode)


FULL OS VERSION :
Linux theta 3.13.0-35-generic #62-Ubuntu SMP Fri Aug 15 01:58:42 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

A DESCRIPTION OF THE PROBLEM :

When I call System.exit(), it takes 3 seconds for the JVM to exit.

The essential portion of my code are:

    long start = System.currentTimeMillis();
    /* graph building code elided */
    long end1 = System.currentTimeMillis();
    /* maxflow solver code elided */
    long end2 = System.currentTimeMillis();
    System.out.printf("graph building %.2f%n", (end1 - start)/1000.0);
    System.out.printf("maxflow solver %.2f%n", (end2 - end1)/1000.0);
    System.exit(0);

Observe the output:

$ time java -verbosegc -XX:+UseG1GC maxflow.TightKnight < tightknight1a.in 
[GC pause (G1 Evacuation Pause) (young) 14M->6026K(124M), 0.0141284 secs]
[GC pause (G1 Evacuation Pause) (young) 20M->25M(124M), 0.0321413 secs]
[GC pause (G1 Evacuation Pause) (young) 42M->42M(248M), 0.0600765 secs]
[GC pause (G1 Evacuation Pause) (young) 59M->61M(496M), 0.0547508 secs]
[GC pause (G1 Evacuation Pause) (young) 84M->86M(793M), 0.0714937 secs]
[GC pause (G1 Evacuation Pause) (young) 126M->128M(1031M), 0.1089157 secs]
[GC pause (G1 Evacuation Pause) (young) 179M->180M(1221M), 0.1346149 secs]
[GC pause (G1 Evacuation Pause) (young) 241M->243M(1373M), 0.1493734 secs]
[GC pause (G1 Evacuation Pause) (young) 314M->316M(1495M), 0.1725528 secs]
[GC pause (G1 Evacuation Pause) (young) 381M->381M(1592M), 0.1751415 secs]
[GC pause (G1 Evacuation Pause) (young) 450M->453M(1670M), 0.2076116 secs]
[GC pause (G1 Evacuation Pause) (young) 526M->528M(1732M), 0.1729304 secs]
[GC pause (G1 Evacuation Pause) (young) 603M->605M(1782M), 0.1947178 secs]
[GC pause (G1 Evacuation Pause) (young) 683M->686M(1822M), 0.1777260 secs]
[GC pause (G1 Evacuation Pause) (young) 765M->768M(1854M), 0.1886087 secs]
[GC pause (G1 Evacuation Pause) (young) 848M->850M(1880M), 0.2351417 secs]
[GC pause (G1 Evacuation Pause) (young) 932M->934M(1900M), 0.2211004 secs]
[GC pause (G1 Humongous Allocation) (young) (initial-mark) 954M->956M(1916M), 0.1463504 secs]
[GC concurrent-root-region-scan-start]
[GC concurrent-root-region-scan-end, 0.0282788 secs]
[GC concurrent-mark-start]
No
graph building 4.43
maxflow solver 1.74
[GC concurrent-mark-end, 3.2637473 secs]
[GC remark, 0.0083629 secs]
[GC cleanup 1050M->1024M(1916M), 0.0062458 secs]
[GC concurrent-cleanup-start]
[GC concurrent-cleanup-end, 0.0000735 secs]

real    0m8.068s
user    0m19.219s
sys     0m1.710s

After it prints "maxflow solver 1.74" the JVM is asked to exit.

Instead of exiting, it starts a "GC concurrent-mark-end" which takes 3.2 seconds.

The result is an increase in the runtime from 5 to 8 seconds, or a 60% slowdown.
When the JVM is asked to exit, IMO, it should cease all GC activities and exit().


THE PROBLEM WAS REPRODUCIBLE WITH -Xint FLAG: Did not try

THE PROBLEM WAS REPRODUCIBLE WITH -server FLAG: Yes

EXPECTED VERSUS ACTUAL BEHAVIOR :
Expected: It should run 5 seconds, then exit.
Actual: It runs 8 seconds.

BTW, this is a programming competition problem. We're working with tight time outs here; a 60% slowdown can mean the difference between pass and fail.


REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
I can't share the code I am using, but if you want to look at this seriously, could probably produce a self-contained bug.
---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Don't use G1GC.

The problem does not occur with -XX:+UseConcMarkSweepGC or -XX:+UseParallelGC


Comments
The only way I can imagine this could happen is that there is a very large humongous array of j.l.O. that is currently being marked which in JDK8 was not interruptible. In JDK9+ this should have been fixed with JDK-8057003. Closing this as duplicate of that for now lacking a reproducer.
05-12-2018

ILW => LMH => P5 Impact: Low, slow exit times due to synchronization with concurrent gc threads. Likelihood: Medium, need to exit during a concurrent cycle. Workaround: High, switch GC.
02-12-2014