JDK-6722113 : CMS: Incorrect overflow handling during precleaning of Reference lists
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: hs14,1.4.2,6u3,6u4,6u7-rev
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic,solaris_9,solaris_10
  • CPU: generic,x86,sparc
  • Submitted: 2008-07-03
  • Updated: 2011-03-08
  • Resolved: 2011-03-07
The Version table lists the releases in which this issue/RFE is addressed.
Other                    JDK 6           JDK 7    Other
5.0u18-rev,hs11.3 Fixed  6u13-rev Fixed  7 Fixed  hs11.3 Fixed
Related Reports
Duplicate :  
Duplicate :  
Relates :  
Relates :  
Relates :  
Description
Here's a description from the Evaluation field of 6578335, during whose
investigation this bug was first diagnosed:

There was a second bug in the overflow handling encountered during
the precleaning of reference lists. Because the same closure was used
for this work during the remark and during the preclean stage, we ended
up with incorrect overflow handling (correct for the remark phase,
incorrect for preclean phase). This needs to be fixed. The temporary
workaround is to disable CMSPrecleanRefLists{1,2}.

Comments
EVALUATION http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/00b023ae2d78
20-11-2008

EVALUATION 6722113 CMS: Incorrect overflow handling during precleaning of Reference lists
http://bugs.sun.com/view_bug.do?bug_id=6722113
webrev: http://webrev.invokedynamic.info/ysr/6722113/

One problem was that we had reused a CMS reference processing closure to do concurrent precleaning of the discovered reference list, but had not extended it to deal correctly with marking stack overflow that might occur during a concurrent phase. The fix is to do so in the usual way for CMS, by redirtying the MUT-card containing the overflowed object, taking care to deal with reference array objects appropriately.

A further problem was that the marking done during the precleaning pass was not updating the discovered list lengths correctly because of the interleaved manner in which discovery and preclean-processing proceed in this case. This could lead to issues during the rebalancing of the per-thread discovered lists when doing reference processing multi-threaded. Adjusted related comments.

(Note that this is the first case where we are explicitly doing reference processing without disabling discovery; I'd like to call this out explicitly so that reviewers pay special attention to it. This will require looking at the reference processing code in detail, not just the portions that changed in the webrev. Note that this will need to change if concurrent precleaning were ever to become parallel/multi-threaded.)

Testing: jprt, refworkload, with and without +ParallelRefProcEnabled

Thanks for your reviews.
-- ramki
14-11-2008
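
The card-redirtying idea described in the evaluation above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the HotSpot code: `ToyHeap`, `push_or_dirty`, `dirty_cards`, the 512-byte `CARD_SIZE`, and the choice to dirty every card spanned by a reference array (while a plain object dirties only the card holding its start address) are all hypothetical illustrations of "redirtying the card containing the overflowed object, taking care to deal with reference array objects appropriately".

```python
CARD_SIZE = 512  # bytes per card; illustrative value, not taken from the report


class ToyHeap:
    """Toy model: a bounded marking stack backed by a dirty-card table."""

    def __init__(self, heap_bytes, stack_capacity):
        n_cards = (heap_bytes + CARD_SIZE - 1) // CARD_SIZE
        self.dirty = [False] * n_cards
        self.stack = []
        self.capacity = stack_capacity

    def push_or_dirty(self, addr, size, is_ref_array=False):
        """Push a marked object for later scanning.

        On marking stack overflow, record the object by dirtying its
        card(s) instead, so that a later pass over dirty cards finds it.
        Returns True if the object was pushed, False if it overflowed.
        """
        if len(self.stack) < self.capacity:
            self.stack.append(addr)
            return True
        first = addr // CARD_SIZE
        # Assumption: a reference array is rescanned card by card, so every
        # card it spans is dirtied; other objects only need their first card.
        last = (addr + size - 1) // CARD_SIZE if is_ref_array else first
        for card in range(first, last + 1):
            self.dirty[card] = True
        return False

    def dirty_cards(self):
        """Indices of cards that must be rescanned."""
        return [i for i, is_dirty in enumerate(self.dirty) if is_dirty]
```

With a stack capacity of 1, a second object at address 600 overflows and dirties only card 1, whereas an overflowing reference array that starts at address 1000 and is 1100 bytes long dirties cards 1 through 4.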

SUGGESTED FIX Workspace: /net/spot/workspaces/ysr/cms_ref_preclean_ovflw
04-11-2008

EVALUATION I checked, and this code is present at least in the latest 5uXX train as well, but not in 1.4.2_XX. So we will need to fix this in 5uXX as well. I will file a subCR for 5uXX once we have this fixed in hs14.
16-08-2008

SUGGESTED FIX Either keep state in the shared closure so as to distinguish the stop-world case from the concurrent case and apply the right kind of overflow handling, or, perhaps more simply, use a different closure for the concurrent precleaning phase, one that uses the overflow_stack, not the overflow_list, for overflow handling. Also see whether assertions could have been strengthened to catch this bug earlier (i.e. when an attempt was made to use the wrong overflow handling mechanism).
03-07-2008
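
The first option in the suggested fix above (state in the shared closure, plus a stronger assertion) might look roughly like the following toy model. It is a sketch only: `Phase`, `KeepAliveSketch`, `do_object`, and the `concurrent_overflow` container are hypothetical names, and the second container merely stands in for whichever mechanism is appropriate in the concurrent phase.

```python
from enum import Enum


class Phase(Enum):
    REMARK = "stop-world remark"
    PRECLEAN = "concurrent preclean"


class KeepAliveSketch:
    """Toy closure shared by both phases; the phase selects overflow handling."""

    def __init__(self, phase, stack_capacity):
        self.phase = phase
        self.capacity = stack_capacity
        self.mark_stack = []
        self.overflow_list = []        # used only in the stop-world case
        self.concurrent_overflow = []  # stand-in for the concurrent mechanism

    def do_object(self, obj):
        if len(self.mark_stack) < self.capacity:
            self.mark_stack.append(obj)
        elif self.phase is Phase.REMARK:
            self.overflow_list.append(obj)
        else:
            self.concurrent_overflow.append(obj)
        # Strengthened check: the stop-world mechanism must never be
        # populated while running in the concurrent phase.
        assert self.phase is Phase.REMARK or not self.overflow_list
```

The same closure class then serves both phases, and the assertion fails immediately if the concurrent phase ever reaches the stop-world overflow path.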

EVALUATION This bug has existed since 6.0, when precleaning of Reference lists was first introduced. RE should check whether the feature may have been backported to earlier releases (unlikely), in which case other subCRs may also need to be filed.
03-07-2008

WORK AROUND -XX:-CMSPrecleanRefLists{1,2} (that is, -XX:-CMSPrecleanRefLists1 -XX:-CMSPrecleanRefLists2)
03-07-2008