JDK-6662086 : 6u4+, 7b11+: CMS never clears referents when -XX:+ParallelRefProcEnabled
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 6u4
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: solaris_10
  • CPU: x86,sparc
  • Submitted: 2008-02-12
  • Updated: 2010-12-03
  • Resolved: 2008-05-21

Release   Version   Status
JDK 6     6u10      Fixed
JDK 7     7         Fixed
Other     hs10      Fixed
Related Reports
Duplicate :  
Relates :  
Relates :  
Description
It has been observed that CMS is far less efficient with 6u4 than
with 6u2. With 6u2 heap usage follows a regular sawtooth curve, but
with 6u4 CMS seems, for some reason, unable to collect all dead
objects. Over time, CMS cycles become more frequent and each one
collects fewer objects, so both used memory and CPU consumption
increase. At some later point CMS is suddenly able to collect large
chunks again and memory usage drops. The scenario then repeats from
the beginning.
Changed synopsis to reflect evaluation of root cause:

6u4+, 7b11+: CMS never clears referents when -XX:+ParallelRefProcEnabled
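
The symptom in the synopsis (referents that are never cleared) can be observed from Java code with a small probe such as the one below. This is only a sketch and not a verified reproducer for this bug: the class name WeakRefProbe and the helper countCleared are hypothetical, and whether the run exercises the affected CMS code path depends on the JVM flags in use, on whether the referents have been promoted to the CMS generation, and on whether System.gc() leads to a concurrent cycle at all.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical probe, not taken from the bug report.
class WeakRefProbe {
    // Counts the references whose referent is no longer reachable through them.
    static int countCleared(List<WeakReference<Object>> refs) {
        int cleared = 0;
        for (WeakReference<Object> ref : refs) {
            if (ref.get() == null) {
                cleared++;
            }
        }
        return cleared;
    }

    public static void main(String[] args) throws InterruptedException {
        List<WeakReference<Object>> refs = new ArrayList<WeakReference<Object>>();
        for (int i = 0; i < 10000; i++) {
            // No strong reference to the referent survives this statement.
            refs.add(new WeakReference<Object>(new Object()));
        }
        System.gc();        // only a request; the JVM is free to ignore it
        Thread.sleep(500);  // leave some time for a collection to complete
        System.out.println(countCleared(refs) + " of " + refs.size()
                + " weak referents cleared");
    }
}
```

On an unaffected JVM most or all of the referents are normally reported as cleared. A count that stays at zero across several collections would be consistent with the behaviour described here, but it is not proof of it.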

Comments
EVALUATION The regression was introduced in 6417901 (which was integrated in 6u4 and 7b11). We have verified that the fix in the Suggested Fix section fixes the problem at both customers who reported the issue.
06-05-2008

EVALUATION Verified that only CMS was affected by the issue. Reworked some of the code and added asserts to reduce the possibility of such inadvertent regressions in the future.
01-05-2008

SUGGESTED FIX
JPRT: [hotspotwest] job notification - success with job 2008-05-06-224147.ysr.hg_gc_fixes

JPRT Job ID: 2008-05-06-224147.ysr.hg_gc_fixes
JPRT System Used: hotspotwest
JPRT Version Used: 1.0: (2008-04-29) Case of the Misguided Missile [c2c0735e7f00]
Job URL: http://prt-web.sfbay.sun.com/archive/2008/05/2008-05-06-224147.ysr.hg_gc_fixes
Job ARCHIVE: /net/prt-archiver.sfbay/data/jprt/archive/2008/05/2008-05-06-224147.ysr.hg_gc_fixes
User: ysr
Email: ###@###.###
Release: jdk7
Job Source: Mercurial: /net/spot/workspaces/ysr/hg_gc_fixes/{make,src,agent}
Parent: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot
Push Parent: ssh://###@###.###/jdk7/hotspot-gc-gate/hotspot
CR List: 6662086
Changeset: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/b5489bb705c9
File List: {.}
Exclude List: {build}
Command Line: jprt submit -m jprt.txt -noforest
Job submitted at: Tuesday May 6, 2008 15:41:49 PDT
Total time in queue: 2h 02m 35s
Job started at: Tuesday May 6, 2008 15:43:48 PDT
Job integrated at: Tuesday May 6, 2008 17:44:00 PDT
Job finished at: Tuesday May 6, 2008 17:44:24 PDT
Job run time: 2h 35s
Job state: success
Job flags: SYNC INTEGRATE PRECIOUS
Bundles: USE: jprt install 2008-05-06-224147.ysr.hg_gc_fixes
NOTE: Zip files containing exe or dll files on windows have had problems with execute permissions. You may need to 'chmod a+x' the windows exe and dll files.

User Comments:
6662086: 6u4+, 7b11+: CMS never clears referents when -XX:+ParallelRefProcEnabled
Summary: Construct the relevant CMSIsAliveClosure used by CMS during parallel reference processing with the correct span. It had incorrectly been constructed with an empty span, a regression introduced in 6417901.
Reviewed-by: jcoomes
01-05-2008

EVALUATION Based on data collected by ###@###.### using an instrumented JVM that he built, and on visual inspection of the code, the problem appears to be that the CMSIsAliveClosure passed into the work method does not have a correctly initialized _span. With +ParallelRefProcEnabled, this caused the CMS _is_alive closure to declare all referents strongly reachable even when they were not. The fix is to pass the _span to the CMSIsAliveClosure at construction time so that it is correctly initialized; see the Suggested Fix section. We also need to check whether the same or a similar problem exists in the other collectors.
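
The effect of an empty span can be illustrated with a much simplified model. The sketch below is not HotSpot code: Span, IsAliveModel and IsAliveDemo are hypothetical names, addresses are plain long values, and the rule that anything outside the span is treated as alive is an assumption chosen to match the behaviour described in this evaluation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Simplified, hypothetical model of the is-alive check; not HotSpot code.
class IsAliveDemo {
    public static void main(String[] args) {
        // One object at 0x1000 was marked live; the one at 0x1800 was not.
        Set<Long> marked = new HashSet<Long>(Arrays.asList(0x1000L));
        long unmarked = 0x1800L;

        IsAliveModel fixed = new IsAliveModel(new Span(0x1000L, 0x2000L), marked);
        IsAliveModel buggy = new IsAliveModel(new Span(0L, 0L), marked); // empty span

        System.out.println("correct span: alive=" + fixed.isAlive(unmarked));
        System.out.println("empty span:   alive=" + buggy.isAlive(unmarked));
    }
}

// Half-open address range [start, end); start == end means the span is empty.
class Span {
    final long start;
    final long end;

    Span(long start, long end) {
        this.start = start;
        this.end = end;
    }

    boolean contains(long addr) {
        return start <= addr && addr < end;
    }
}

class IsAliveModel {
    private final Span span;
    private final Set<Long> marked;

    IsAliveModel(Span span, Set<Long> marked) {
        this.span = span;
        this.marked = marked;
    }

    // Assumption: an address outside the span is not this collector's to
    // judge, so it is reported as alive; inside the span the mark decides.
    boolean isAlive(long addr) {
        return !span.contains(addr) || marked.contains(addr);
    }
}
```

With the correct span the unmarked address is reported as not alive, so a reference to it could be cleared. With the empty span no address is ever inside the span, so every referent is reported as alive and nothing is cleared.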
30-04-2008

WORK AROUND Do not use -XX:+ParallelRefProcEnabled (i.e., revert to the default, which disables parallel reference processing). However, this might adversely affect CMS remark pause times in applications that make heavy use of Reference objects (including, for instance, Finalizers) and run on large MP boxes.
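
To check whether a running application was started with the flag, the JVM input arguments can be inspected at startup. The following is a minimal sketch: the class RefProcFlagCheck and its helper method are hypothetical, it only sees flags reported by RuntimeMXBean.getInputArguments() (so options set through other mechanisms may be missed), and it assumes that the last occurrence of the flag on the command line is the one that takes effect.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical startup check, not part of the JDK or of the fix.
class RefProcFlagCheck {
    // Returns true if the last matching argument enables the flag.
    // The initial value false reflects the default stated in the workaround.
    static boolean parallelRefProcRequested(List<String> jvmArgs) {
        boolean enabled = false;
        for (String arg : jvmArgs) {
            if (arg.equals("-XX:+ParallelRefProcEnabled")) {
                enabled = true;
            } else if (arg.equals("-XX:-ParallelRefProcEnabled")) {
                enabled = false;
            }
        }
        return enabled;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (parallelRefProcRequested(jvmArgs)) {
            System.err.println("warning: -XX:+ParallelRefProcEnabled is set; "
                    + "see 6662086 if running CMS on an affected build");
        } else {
            System.out.println("parallel reference processing not requested");
        }
    }
}
```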
14-02-2008

EVALUATION Added "when -XX:+ParallelRefProcEnabled" to the synopsis, based on further tests at the customer. The regression appears to have started with 6417901, in which the parallel reference processing code was extensively reworked to extend it to collectors other than CMS. We are now investigating the root cause of the problem.
14-02-2008