JDK-4615723 : CMS: deal with CMS marking stack overflow
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 1.4.1,5.0
  • Priority: P2
  • Status: Closed
  • Resolution: Other
  • OS: generic
  • CPU: generic
  • Submitted: 2001-12-19
  • Updated: 2012-10-03
  • Resolved: 2012-10-03
Other
5.0 b22 Resolved
Related Reports
Duplicate :  
Relates :  
Relates :  
Relates :  
Relates :  
Description
See comments section.

Comments
Been in resolved state for more than ten years. Closing.
03-10-2012

CONVERTED DATA
BugTraq+ Release Management Values
COMMIT TO FIX: tiger
FIXED IN: tiger
INTEGRATED IN: tiger-b22 tiger-b25 tiger-beta
14-06-2004

EVALUATION See comments section.
###@###.### 2003-08-05: This becomes more important if CMS is expected to "replace" the train collector (in the sense of -Xincgc), because client applications wouldn't want to pay the cost of the increased footprint from the marking stack. I am therefore raising the priority to P2 for Tiger and committing this to 1.5.
11-06-2004

WORK AROUND ###@###.### 2003-08-11: -XX:-CMSParallelRemarkEnabled works around the silent work-list overflow. However, there is no workaround for stack overflow during concurrent marking or precleaning other than running with a larger marking stack.
11-06-2004

SUGGESTED FIX The first part has been putback:

Event: putback-to
Parent workspace: /net/jano/export/disk05/hotspot/ws/main/gc_baseline
  (jano:/export/disk05/hotspot/ws/main/gc_baseline)
Child workspace: /export/imgr_home/ws/20030923170909.ysr.ovflw2
  (balvenie:/export/imgr_home/ws/20030923170909.ysr.ovflw2)
User: ysr

Comment:
Original workspace: neeraja:/net/spot/archive02/ysr/ovflw2
Parent workspace: /net/jano/export/disk05/hotspot/ws/main/gc_baseline
Submitter: ysr
imgr data: /net/balvenie.sfbay/export/imgr_home/archive/main/gc_baseline/2003/20030923170909.ysr.ovflw2

Partial: 4615723 CMS: deal with CMS marking stack overflow

webrev: http://analemma.sfbay/net/spot/archive02/ysr/ovflw/webrev

This putback addresses stack overflow during the concurrent phases: marking and precleaning. Overflow during the remark phase will be addressed in a second putback.

To recover from overflow during marking, we discard the stack contents, remembering the least address thus discarded. Upon completion of the forward traversal, during which the restart address may need to be updated, we do another traversal starting at the remembered address, doing more iterative retraversals as necessary.

In case of overflow during precleaning, we remember to revisit the object by marking the page in the mod union table on which the discarded object lives.

In the absence of mutation, both are guaranteed to terminate even with a bounded (non-zero) marking stack size. In the presence of mutation, termination is guaranteed because of (for example) losing the race to a foreground collection.

The current stack setting of 8K (down from the original 8M) prevents overflow with all programs in refWorkload. In order to let customers assess the impact of frequent stack overflow, which can kill performance, we currently emit a warning upon each such event.

Reviewed by: Ross Knippel, Jon Masamitsu

Verified fix: y

Verification testing:
. run with artificially low default and/or max size to induce frequent stack overflow

Other testing: (CMS, ? artificially small stack size/max)
. imgr all platforms
. refworkload
. volanotest, atg

passed linux i486 product SPECjvm98 GeoMean 37.98 59.37
passed linux i486 product1 SPECjvm98 GeoMean 44.71 49.51
passed linux i486 productcore SPECjvm98 GeoMean 94.63 94.63
passed solaris i486 product SPECjvm98 GeoMean 36.34 56.81
passed solaris i486 product1 SPECjvm98 GeoMean 44.63 49.34
passed solaris i486 productcore SPECjvm98 GeoMean 104.42 104.42
passed solaris sparc product SPECjvm98 GeoMean 22.95 33.62
passed solaris sparc product1 SPECjvm98 GeoMean 22.71 24.90
passed solaris sparc productcore SPECjvm98 GeoMean 40.68 40.68
passed solaris sparcv9 product SPECjvm98 GeoMean 24.57 35.18
passed solaris sparcv9 productcore SPECjvm98 GeoMean 41.15 41.15
passed windows i486 compiler2 SPECjvm98 GeoMean 49.64 96.37
passed windows i486 compiler1 SPECjvm98 GeoMean 56.12 72.18
passed windows i486 core SPECjvm98 GeoMean 118.22 118.22
passed windows ia64 core SPECjvm98 GeoMean 15.55 15.55

Files:
update: src/share/vm/memory/concurrentMarkSweepGeneration.cpp
update: src/share/vm/memory/concurrentMarkSweepGeneration.hpp
update: src/share/vm/memory/concurrentMarkSweepGeneration.inline.hpp
update: src/share/vm/memory/genOopClosures.hpp
update: src/share/vm/runtime/globals.hpp

Examined files: 2882

Contents Summary:
  5 update
  2877 no action (unchanged)

###@###.### 2003-10-10: The following putback to gc_baseline completes work on this bug.
Event: putback-to
Parent workspace: /net/jano.sfbay/export/disk05/hotspot/ws/main/gc_baseline
  (jano.sfbay:/export/disk05/hotspot/ws/main/gc_baseline)
Child workspace: /prt-workspaces/20031010014636.ysr.ovflw2/workspace
  (prt-web:/prt-workspaces/20031010014636.ysr.ovflw2/workspace)
User: ysr

Comment:
---------------------------------------------------------
Original workspace: neeraja:/net/spot/archive02/ysr/ovflw2
Submitter: ysr
Archived data: /net/prt-archiver.sfbay/export2/archived_workspaces/main/gc_baseline/2003/20031010014636.ysr.ovflw2/
Webrev: http://analemma.sfbay.sun.com/net/prt-web.sfbay/prt-workspaces/20031010014636.ysr.ovflw2/workspace/webrevs/webrev-2003.10.10/index.html

Fixed: 4615723 CMS: deal with CMS marking stack overflow

webrev: http://analemma.sfbay/net/spot/archive02/ysr/ovflw2/webrev

This putback completes the handling of marking-stack/work-queue overflow during the stop-world remark phase (including overflow encountered during reference processing).

Overflow objects are linked on a global (per-collector) overflow list, via the mark-word. Non-prototypical mark-words are spooled into a C-heap growable array (this will be revisited in the future), and restored at the end of a phase.

The main change was that, in the parallel case, we needed to make sure that each grey object was handled by a unique thread, so that in the event of an overflow an oop would not be linked multiple times into the overflow list.

Work queue overflow handling allowed us to reduce the work queue size to 8K from the former 32K. (We increased the single marking-stack size from the former 8K to 32K to avoid the occasional overflow in _209_db.)

Reviewed by: Jon Masamitsu

Verified fix: y

Verification testing:
. run with CMSMarkStackOverflowALot (as well as small CMSMarkStackOverflowInterval) to induce frequent marking-stack/work-queue overflow

Other testing: (CMS, ? simulated overflow)
. spec
. refworkload
. volanotest

Files:
update: src/share/vm/memory/concurrentMarkSweepGeneration.cpp
update: src/share/vm/memory/concurrentMarkSweepGeneration.hpp
update: src/share/vm/memory/concurrentMarkSweepGeneration.inline.hpp
update: src/share/vm/memory/genOopClosures.hpp
update: src/share/vm/memory/referenceProcessor.cpp
update: src/share/vm/oops/oop.hpp
update: src/share/vm/oops/oop.inline.hpp
update: src/share/vm/runtime/globals.hpp
update: src/share/vm/utilities/bitMap.cpp
update: src/share/vm/utilities/bitMap.hpp
update: src/share/vm/utilities/taskqueue.hpp

Examined files: 2977

Contents Summary:
  11 update
  2966 no action (unchanged)
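
The overflow-list scheme used for the remark phase can be illustrated with a toy model. The Python sketch below is not HotSpot code: the `Obj` and `Collector` classes, the `PROTOTYPE` constant, and the integer header values are hypothetical stand-ins chosen for the example. The model is single-threaded, so it sidesteps the parallel-case requirement that each grey object be handled by a unique thread; it only shows how an object that does not fit on the bounded stack is linked through its mark-word, and how non-prototypical mark-words are spooled and then restored at the end of the phase.

```python
PROTOTYPE = 1  # hypothetical stand-in for the default (prototypical) mark-word


class Obj:
    def __init__(self, name, refs=(), mark=PROTOTYPE):
        self.name = name
        self.refs = list(refs)
        self.mark = mark      # header word; reused as the overflow-list link
        self.marked = False   # stand-in for the external mark bitmap


class Collector:
    def __init__(self, stack_limit):
        self.stack_limit = stack_limit
        self.stack = []
        self.overflow_head = None  # global (per-collector) overflow list
        self.preserved = []        # spooled (object, original mark-word) pairs

    def push(self, obj):
        if len(self.stack) < self.stack_limit:
            self.stack.append(obj)
            return
        # Overflow: save an interesting header, then link through the mark-word.
        if obj.mark != PROTOTYPE:
            self.preserved.append((obj, obj.mark))
        obj.mark = self.overflow_head
        self.overflow_head = obj

    def pop(self):
        if self.stack:
            return self.stack.pop()
        obj = self.overflow_head
        if obj is not None:
            self.overflow_head = obj.mark  # unlink from the overflow list
            obj.mark = PROTOTYPE
        return obj

    def remark(self, roots):
        for root in roots:
            if not root.marked:
                root.marked = True
                self.push(root)
        obj = self.pop()
        while obj is not None:
            for child in obj.refs:
                # An object is pushed only when first marked, so it can be
                # linked into the overflow list at most once.
                if not child.marked:
                    child.marked = True
                    self.push(child)
            obj = self.pop()
        # End of phase: restore the spooled mark-words.
        for saved_obj, saved_mark in self.preserved:
            saved_obj.mark = saved_mark
        self.preserved.clear()
```

Running `Collector(stack_limit=1).remark([root])` on a root with several children forces most of them onto the overflow list; once the phase ends, every reachable object is marked and any object created with a non-default `mark` has that value back.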
11-06-2004
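
The two recovery rules for the concurrent phases (retraversal from a remembered address during marking, and the mod union table during precleaning) can be sketched as a toy model. This is an illustrative Python sketch under stated assumptions, not the HotSpot implementation: the heap is a list indexed by address, `CARD_SHIFT` is an arbitrary toy page size, and the function names are invented for the example. There is no mutator in the model, so it corresponds to the "absence of mutation" case, where termination holds for any non-zero stack size.

```python
CARD_SHIFT = 2  # toy value (assumption): 4 addresses per mod-union page


def mark_with_bounded_stack(refs, roots, stack_limit):
    """Mark all objects reachable from roots with a bounded marking stack.

    refs[a] lists the addresses referenced by the object at address a.
    Returns (marked, passes): the mark bitmap and the number of traversals.
    """
    assert stack_limit >= 1
    marked = [False] * len(refs)
    for r in roots:
        marked[r] = True
    restart, passes = 0, 0
    while restart is not None:
        start, restart = restart, None
        passes += 1
        for finger in range(start, len(refs)):  # forward traversal of the bitmap
            if not marked[finger]:
                continue
            stack, obj = [], finger
            while obj is not None:
                for child in refs[obj]:
                    if marked[child]:
                        continue
                    marked[child] = True
                    if child > finger:
                        continue  # the forward traversal will reach it
                    if len(stack) == stack_limit:
                        # Overflow: discard the stack, remember the least address.
                        least = min(stack + [child])
                        stack.clear()
                        restart = least if restart is None else min(restart, least)
                    else:
                        stack.append(child)
                obj = stack.pop() if stack else None
    return marked, passes


def preclean(dirty_cards, refs, marked, stack_limit):
    """Rescan marked objects on dirty pages; return the pages to revisit."""
    mod_union = set()
    for card in sorted(dirty_cards):
        lo = card << CARD_SHIFT
        for addr in range(lo, min(lo + (1 << CARD_SHIFT), len(refs))):
            if not marked[addr]:
                continue
            stack, obj = [], addr
            while obj is not None:
                for child in refs[obj]:
                    if marked[child]:
                        continue
                    marked[child] = True
                    if len(stack) == stack_limit:
                        # Overflow: drop the object, dirty its page for a later visit.
                        mod_union.add(child >> CARD_SHIFT)
                    else:
                        stack.append(child)
                obj = stack.pop() if stack else None
    return mod_union
```

For a heap where the object at address 9 references addresses 0 through 8 (`refs = [[]] * 9 + [list(range(9))]`, `roots = [9]`), a stack limit of 2 overflows and the model needs two traversals, while a limit of 16 finishes in one. Each extra traversal is triggered by an overflow that has just marked a new object, which is why the loop terminates in this mutation-free model.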