United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
JDK-5053106 : A lot of warnings "Mark stack overflow in CMSKeepAliveClosure" is printed to con

Details
Type:
Bug
Submit Date:
2004-05-26
Status:
Resolved
Updated Date:
2004-06-15
Project Name:
JDK
Resolved Date:
2004-06-15
Component:
hotspot
OS:
solaris_8,linux,linux_redhat_8.0
Sub-Component:
gc
CPU:
x86,sparc
Priority:
P3
Resolution:
Fixed
Affected Versions:
5.0
Fixed Versions:
5.0 (b56)

Related Reports

Sub Tasks

Description
Warning:"Java HotSpot(TM) Client VM warning: Mark stack overflow in CMSKeepAliveClosure" is printed many times on console when running Netbeans IDE. Both on Linux RH 9 and Win XP. Memory settings:-J-Xms192m -J-Xmx192m -J-XX:PermSize=25m -J-XX:+UseConcMarkSweepGC -J-XX:+CMSClassUnloadingEnabled. JDK 1.5.0 beta 2 build 51.
###@###.### 2004-05-26

                                    

Comments
EVALUATION


###@###.### 2004-06-02: fixed; testing in progress; review
and approval (for Tiger RC) pending.

###@###.### 2004-06-07: put back to Tiger-RC (target b56).
                                     
2004-06-02
SUGGESTED FIX

Should make a single warning per CMS cycle.
Note that the messages themselves are harmless in the
sense that the GC proceeds correctly despite the
reported resoure constraints (but possibly with some
minor performance impact in terms of slightly longer pause
times or slightly longer collection cycles during such events).

Also, in the case of such overflow during remark (1-thread
rescan or 1-thread weak ref tracing), should expand
marking stack so as to reduce possibility of overflow
in the future, thus reducing the performance impact.

Will try to target this fix for Tiger-RC, if possible,
since the current warning appears to be causing some
customer confusion.

------
Here are the relevant diffs in a Tiger-based workspace:

------- concurrentMarkSweepGeneration.hpp -------
429a430,439
>   // Counters in support of marking stack / work queue overflow handling:
>   // a non-zero value indicates certain types of overflow event during
>   // the most recent previous CMS cycle and could lead to stack
>   // resizing efforts at an opprotune future time.
>   size_t        _ser_pmc_preclean_ovflw;
>   size_t        _ser_pmc_remark_ovflw;
>   size_t        _par_pmc_remark_ovflw;
>   size_t        _ser_kac_ovflw;
>   size_t        _par_kac_ovflw;
> 

------- concurrentMarkSweepGeneration.cpp -------
438c438,443
<   _stats(cmsGen)
---
>   _stats(cmsGen),
>   _ser_pmc_preclean_ovflw(0),
>   _ser_pmc_remark_ovflw(0),
>   _par_pmc_remark_ovflw(0),
>   _ser_kac_ovflw(0),
>   _par_kac_ovflw(0)
3439a3445,3467
> 
>   // If we encountered any (marking stack / work queue) overflow
>   // events during the current CMS cycle, take appropriate
>   // remedial measures, where possible, so as to try and avoid
>   // recurrence of that condition.
>   assert(_markStack.isEmpty(), "Zero grey objects");
>   size_t ser_ovflw = _ser_pmc_remark_ovflw + _ser_pmc_preclean_ovflw +
>                      _ser_kac_ovflw;
>   if (ser_ovflw > 0) {
>     if (PrintGCDetails) {
>       gclog_or_tty->print_cr("Marking stack overflow (benign) "
>         "(pmc_pc=%d, pmc_rm=%d, kac=%d)",
>         _ser_pmc_preclean_ovflw, _ser_pmc_remark_ovflw,
>         _ser_kac_ovflw);
>     }
>     _markStack.expand();
>   }
>   if (PrintGCDetails && (_par_pmc_remark_ovflw > 0 || _par_kac_ovflw > 0)) {
>     gclog_or_tty->print_cr("Work queue overflow (benign) "
>       "(pmc_rm=%d, kac=%d)",
>       _par_pmc_remark_ovflw, _par_kac_ovflw);
>   }
> 
4735c4763
<     warning("hit CMSMarkStackMax limit");
---
>     warning("Hit CMSMarkStackMax limit");
4756c4784
<     warning("failed to expand marking stack from %d K to % d K",
---
>     warning("Failed to expand marking stack from %d K to % d K",
5244c5272
<          warning("Marking stack overflow during preclean");
---
>          _collector->_ser_pmc_preclean_ovflw++;
5249c5277
<          warning("Marking stack overflow during re-mark");
---
>          _collector->_ser_pmc_remark_ovflw++;
5281c5309
<         warning("Work queue overflow during re-mark");
---
>         _collector->_par_pmc_remark_ovflw++; //  imprecise OK: no need to CAS
5936d5963
<       warning("Mark stack overflow in CMSKeepAliveClosure");
5937a5965
>       _collector->_ser_kac_ovflw++;
5994c6022
<         warning("Work queue overflow in CMSInnerParMarkAndPushClosure");
---
>         _collector->_par_kac_ovflw++;

###@###.### 2004-06-07: Final fix can be obtained from
the following location; put back to Tiger b56:

Event:            putback-to
Parent workspace: /net/jano.sfbay/export/disk05/hotspot/ws/main/gc_baseline
                  (jano.sfbay:/export/disk05/hotspot/ws/main/gc_baseline)
Child workspace:  /prt-workspaces/20040604170326.ysr.tiger/workspace
                  (prt-web:/prt-workspaces/20040604170326.ysr.tiger/workspace)
User:             ysr

Comment:

---------------------------------------------------------

Original workspace:     neeraja:/net/jano.sfbay/export/hotspot/users1/ysr/tiger
Submitter:              ysr
Archived data:          /net/prt-archiver.sfbay/export2/archived_workspaces/main/gc_baseline/2004/20040604170326.ysr.tiger/
Webrev:                 http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/export2/archived_workspaces/main/gc_baseline/2004/20040604170326.ysr.tiger/workspace/webrevs/webrev-2004.06.04/index.html

Fixed 5053106: A lot of warnings "Mark stack overflow in CMSKeepAliveClosure" is printed to console

  http://analemma.sfbay/net/jano/export/disk05/hotspot/users/ysr/tiger/webrev

Previously each overflow event generated a message, even without any
remedial action to prevent its future recurrence. In some cases,
this would generate a lot of output which was both off-putting
as well as confusing to customers (several external customers
reported this confusion in the past several weeks; this bug is new
in Tiger).

The changes are:
. an event (overflow, stack expansion/failure) is logged to a counter
. per each cycle, the counters are checked and as
  necessary a single message issued and remedial action
  taken where appropriate
. the message is issued only when PrintGCDetails or
  PrintCMSStatictis (as appropriate) are set
. the message goes to the GC log
. the message is prefixed with "benign" so as not to cause
  customers to panic upon seeing such a message

We requested this fix in Tiger RC because:
. it is very low risk (risk & change limited to CMS)
. it fixes a performance anomaly (albeit a rare case)
. it reduces the possibility of customer confusion

Fixed some print format control strings (%d to SIZE_FORMAT
where appropriate).

Reviewed by: John Coomes, Paul Hohensee, Jon Masamitsu
Approved by: (low risk) core team

Fix verified (y/n): y

Verification testing:
  . product/debug CMS with ridiculously small marking stack size
  . debug CMS with CMSMarkStackOverflowALot

Other testing:
  . PRT
  . spec
  . refworkload

Files:
update: src/share/vm/memory/concurrentMarkSweepGeneration.cpp
update: src/share/vm/memory/concurrentMarkSweepGeneration.hpp

Examined files: 3215

Contents Summary:
       2   update
    3213   no action (unchanged)
                                     
2004-09-19
WORK AROUND

Try increaing the cms marking stack size by means of
-XX:CMSMarkStackSize=4M

This will reduce the number of sucvh messages, perhs down to zero if
you are lucky, but for sufficiently bushy and deep object graphs
(especially those rooted off of weak reference articulation points)
may not be sufficient to get rid of the messages entirely.

Otherwise try even larger marking stack sizes:

-XX:CMSMarkStackSize=32M -XX:CMSMarkStackSizeMax=32M
                                     
2004-09-19
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX:
tiger-rc

FIXED IN:
tiger-rc

INTEGRATED IN:
tiger-b56
tiger-rc


                                     
2004-09-19



Hardware and Software, Engineered to Work Together