JDK-6569768 : CMS: System.gc() may hang with -XX:+ExplicitGCInvokesConcurrent upon concurrent mode failure
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 7
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2007-06-14
  • Updated: 2012-02-01
  • Resolved: 2011-03-07
Release   Fix version   Status
JDK 6     6u10          Fixed
JDK 7     7             Fixed
Other     hs11          Fixed
Description
A stress test that allocates garbage and invokes System.gc() from different threads intermittently hangs with -XX:+ExplicitGCInvokesConcurrent because System.gc() does not return.

See comments for more details.
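
The scenario in the description can be approximated with a small stand-alone program. The sketch below is not the Juggle1_gc or Juggle2_gc test; the class name ExplicitGcStress, the thread counts, and the allocation sizes are assumptions chosen for illustration. It starts threads that churn through short-lived arrays alongside threads that call System.gc(), and reports whether every thread finished within a timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical reproducer sketch; names and sizes are illustrative assumptions.
class ExplicitGcStress {

    // Starts allocator threads and System.gc() caller threads.
    // Returns true if all of them finished before the timeout expired.
    static boolean runStress(int allocators, int gcCallers,
                             final int iterations, long timeoutSeconds) {
        final CountDownLatch done = new CountDownLatch(allocators + gcCallers);

        for (int i = 0; i < allocators; i++) {
            start(new Runnable() {
                public void run() {
                    byte[][] slots = new byte[64][];
                    for (int n = 0; n < iterations; n++) {
                        // Overwriting a slot turns its previous array into garbage.
                        slots[n % slots.length] = new byte[64 * 1024];
                    }
                    done.countDown();
                }
            });
        }

        for (int i = 0; i < gcCallers; i++) {
            start(new Runnable() {
                public void run() {
                    for (int n = 0; n < iterations; n++) {
                        System.gc(); // the call that fails to return when the bug hits
                    }
                    done.countDown();
                }
            });
        }

        try {
            return done.await(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Daemon threads, so a stuck System.gc() caller cannot keep the JVM alive.
    private static void start(Runnable r) {
        Thread t = new Thread(r);
        t.setDaemon(true);
        t.start();
    }

    public static void main(String[] args) {
        boolean finished = runStress(4, 2, 1000, 300);
        System.out.println(finished
                ? "all threads finished"
                : "timeout: a thread may be stuck in System.gc()");
    }
}
```

On an affected JVM this would presumably be run with -XX:+UseConcMarkSweepGC -XX:+ExplicitGCInvokesConcurrent and a small heap to make a concurrent mode failure more likely. Whether the hang actually reproduces depends on timing.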
Changed synopsis field from:

     System.gc() may hang with -XX:+ExplicitGCInvokesConcurrent

to:

     CMS: System.gc() may hang with -XX:+ExplicitGCInvokesConcurrent upon concurrent mode failure
http://gtee.sfbay/gtee/results/JDK7/NIGHTLY/VM/latest/GC_Baseline-Xconc/vm/64BITSOLARIS-AMD64/server/mixed/vm-64BITSOLARIS-AMD64_server_mixed_vm.gc.testlist2007-06-15-06-58-28/ResultDir/Juggle1_gc/


gc/memory/Array/ArrayJuggle/Juggle1_gc
gc/memory/Array/ArrayJuggle/Juggle2_gc

Comments
SUGGESTED FIX Event: putback-to
Parent workspace: /net/jano.sfbay/export/disk05/hotspot/ws/main/gc_baseline
  (jano.sfbay:/export/disk05/hotspot/ws/main/gc_baseline)
Child workspace: /net/prt-web.sfbay/prt-workspaces/20070618121351.ysr.mustang/workspace
  (prt-web:/net/prt-web.sfbay/prt-workspaces/20070618121351.ysr.mustang/workspace)
User: ysr

Comment:
---------------------------------------------------------
Job ID: 20070618121351.ysr.mustang
Original workspace: neeraja:/net/jano.sfbay/export/hotspot/users1/ysr/mustang
Submitter: ysr
Archived data: /net/prt-archiver.sfbay/data/archived_workspaces/main/gc_baseline/2007/20070618121351.ysr.mustang/
Webrev: http://prt-web.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/main/gc_baseline/2007/20070618121351.ysr.mustang/workspace/webrevs/webrev-2007.06.18/index.html

Fixed 6569768: CMS: System.gc() may hang with -XX:+ExplicitGCInvokesConcurrent upon concurrent mode failure

webrev: http://analemma.SFBay.Sun.COM/net/jano/export/disk05/hotspot/users/ysr/mustang/webrev

There was a bug in the implementation of +ExplicitGCInvokesConcurrent where, upon a mark-compact collection that follows a concurrent mode failure, we were not taking care to update the "full collections" counter and inform any threads that had requested a System.gc(). If the program never allocated enough after that point in the old generation or no other thread made another System.gc() call, no further concurrent full collections would occur and the threads that had made the System.gc() call would remain stranded. Even in cases where a concurrent collection eventually occurs that notifies and resumes the stranded callers, the apparent latency of the call can be unbounded.

The fix is to ensure that when a partial collection of the heap escalates into a full collection (potentially resulting in a concurrent mode failure), we update the counter and post the relevant notification.

As a side-effect we also fixed a general bug in the framework collectors where such a collection was also neglecting to resize perm gen following such a collection (or report the change in its size with PrintGCDetails).

Fix Verified: y
Verification Testing: Juggle1_gc and Juggle2_gc as in bug report
  (More testing details / stress flags and workarounds in bug record)
Other Testing: PRT, refworkload, runThese quick (w/ appropriate flags; see above)
Reviewed by: Tony Printezis, Jon Masamitsu
  (Note: certain related clean-ups suggested by Jon to be rolled in with a pair of related bug fixes coming up soon.)

Files:
  update: src/share/vm/memory/genCollectedHeap.cpp

Examined files: 3991

Contents Summary:
  1 update
  3990 no action (unchanged)
18-06-2007
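
The counter-and-notification handshake described in the putback comment can be modelled in a few lines. This is a toy sketch in Java, not the HotSpot code (which is C++ in genCollectedHeap.cpp); the class FullGcTracker and its method names are hypothetical. A requester records the counter value when it asks for a collection, then waits until the counter moves past that value. The bug corresponds to a collection path that finishes without calling fullCollectionCompleted().

```java
// Toy model of the "full collections" counter; all names are hypothetical
// and do not correspond to HotSpot identifiers.
class FullGcTracker {
    private final Object lock = new Object();
    private long fullCollectionsCompleted = 0;

    // A requester snapshots the counter before asking for a collection.
    long requestCollection() {
        synchronized (lock) {
            return fullCollectionsCompleted;
        }
    }

    // Blocks until a full collection has completed since the snapshot was taken.
    // Returns false if the timeout expires first (or the thread is interrupted).
    boolean awaitCompletion(long snapshot, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1000000L;
        synchronized (lock) {
            while (fullCollectionsCompleted == snapshot) {
                long remainingMillis = (deadline - System.nanoTime()) / 1000000L;
                if (remainingMillis <= 0) {
                    return false;
                }
                try {
                    lock.wait(remainingMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }

    // Must run at the end of every kind of full collection, including a
    // mark-compact collection that follows a concurrent mode failure.
    // Skipping this call on one path leaves waiters stranded.
    void fullCollectionCompleted() {
        synchronized (lock) {
            fullCollectionsCompleted++;
            lock.notifyAll();
        }
    }
}
```

In this model a waiter is released by any later completion, which mirrors the observation that a stranded caller resumes only if some other full collection eventually occurs and posts the notification.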

WORK AROUND -XX:-UseCMSCompactAtFullCollection would avoid this problem, at the cost of exposing you to irreversible fragmentation. However, you would then also need a fix for 6483690, an arguably more serious bug. In any system in which the old generation eventually fills up, the hung thread would eventually resume, but this would show up as an extremely long System.gc() latency from the viewpoint of the caller.
15-06-2007
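
The long System.gc() latency seen by the caller can be observed from application code by issuing the call on a helper thread and waiting with a timeout. This is only a measurement sketch, not a fix or a documented workaround; the class BoundedGc and its method name are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for observing System.gc() latency from the caller's side.
class BoundedGc {
    // Single daemon thread: if System.gc() never returns, only this thread is
    // stuck, and later requests queue up behind it.
    private static final ExecutorService GC_CALLER =
            Executors.newSingleThreadExecutor(new ThreadFactory() {
                public Thread newThread(Runnable r) {
                    Thread t = new Thread(r, "explicit-gc-caller");
                    t.setDaemon(true);
                    return t;
                }
            });

    // Returns the elapsed time in milliseconds, or -1 if System.gc() had not
    // returned when the timeout expired (or the waiting thread was interrupted).
    static long gcWithTimeout(long timeoutMillis) {
        long start = System.nanoTime();
        Future<?> call = GC_CALLER.submit(new Runnable() {
            public void run() {
                System.gc();
            }
        });
        try {
            call.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return (System.nanoTime() - start) / 1000000L;
        } catch (TimeoutException e) {
            return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        }
    }
}
```

A return value of -1 does not cancel the collection request; it only tells the caller that the call is still outstanding.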

SUGGESTED FIX see PRT webrev link embedded in next entry.
15-06-2007

EVALUATION Concurrent mode failure appears to miss sending a relevant notification to the requesting thread.
14-06-2007