United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
JDK-6214543 : CMS: more helpful message for concurrent mode failures

Details
Type:
Enhancement
Submit Date:
2005-01-06
Status:
Resolved
Updated Date:
2010-12-08
Project Name:
JDK
Resolved Date:
2005-06-10
Component:
hotspot
OS:
solaris_8,generic
Sub-Component:
gc
CPU:
sparc,generic
Priority:
P4
Resolution:
Fixed
Affected Versions:
1.4.2_02,1.4.2_04,1.4.2_07
Fixed Versions:

Related Reports
Backport:
Backport:

Sub Tasks

Description
Noticed that there are no "CMS-concurrent-sweep" statements after that [03/Jan/2005:15:56:57]. 



[03/Jan/2005:14:20:32] WARNING (24945): CORE3283: stderr: 252263.030: [CMS-concurrent-sweep-start]
[03/Jan/2005:14:20:36] WARNING (24945): CORE3283: stderr: 252267.261: [CMS-concurrent-sweep: 4.177/4.231 secs]
[03/Jan/2005:14:20:36] WARNING (24945): CORE3283: stderr: 252267.262: [CMS-concurrent-reset-start]
[03/Jan/2005:14:20:36] WARNING (24945): CORE3283: stderr: 252267.422: [CMS-concurrent-reset: 0.160/0.160 secs]
[03/Jan/2005:15:24:03] WARNING (24945): CORE3283: stderr: 256075.228: [GC [1 CMS-initial-mark: 705420K(1036288K)] 705421K(1048512K), 0.0136227 secs]
[03/Jan/2005:15:24:03] WARNING (24945): CORE3283: stderr: 256075.242: [CMS-concurrent-mark-start]
[03/Jan/2005:15:24:07] WARNING (24945): CORE3283: stderr: 256078.572: [CMS-concurrent-mark: 3.330/3.330 secs]
[03/Jan/2005:15:24:07] WARNING (24945): CORE3283: stderr: 256078.573: [CMS-concurrent-preclean-start]^M
[03/Jan/2005:15:24:07] WARNING (24945): CORE3283: stderr: 256078.596: [CMS-concurrent-preclean: 0.023/0.023 secs]
[03/Jan/2005:15:24:07] WARNING (24945): CORE3283: stderr: 256078.599: [GC256078.599: [Rescan (parallel) , 0.0553983 secs]256078.654: [weak refs processing, 0.0211087 secs] [1 CMS-remark: 705420K(1036288K)] 711280K(1048512K), 0.0776254 secs]
[03/Jan/2005:15:24:07] WARNING (24945): CORE3283: stderr: 256078.677: [CMS-concurrent-sweep-start]
[03/Jan/2005:15:24:11] WARNING (24945): CORE3283: stderr: 256082.672: [CMS-concurrent-sweep: 3.899/3.995 secs]
[03/Jan/2005:15:24:11] WARNING (24945): CORE3283: stderr: 256082.672: [CMS-concurrent-reset-start]
[03/Jan/2005:15:24:11] WARNING (24945): CORE3283: stderr: 256082.815: [CMS-concurrent-reset: 0.143/0.143 secs]
[03/Jan/2005:15:56:57] WARNING (24945): CORE3283: stderr: 258049.153: [GC [1 CMS-initial-mark: 713636K(1036288K)] 713844K(1048512K), 0.0097374 secs]
[03/Jan/2005:15:56:57] WARNING (24945): CORE3283: stderr: 258049.163: [CMS-concurrent-mark-start]
[03/Jan/2005:15:57:03] WARNING (24945): CORE3283: stderr: : 1068K->0K(12224K), 0.1910539 secs]258052.944: [CMS258054.610: [CMS-concurrent-mark: 3.493/5.447 secs]
[03/Jan/2005:15:57:22] WARNING (24945): CORE3283: stderr: 258073.595: [GC [1 CMS-initial-mark: 733171K(1036288K)] 733491K(1048512K), 0.0085612 secs]
[03/Jan/2005:15:57:22] WARNING (24945): CORE3283: stderr: 258073.604: [CMS-concurrent-mark-start]
[03/Jan/2005:15:57:39] WARNING (24945): CORE3283: stderr: : 6865K->0K(12224K), 0.9945913 secs]258090.758: [CMS258091.096: [CMS-concurrent-mark: 3.559/17.492 secs]
[03/Jan/2005:15:57:58] WARNING (24945): CORE3283: stderr: 258110.178: [GC [1 CMS-initial-mark: 768101K(1036288K)] 768549K(1048512K), 0.0091307 secs]
[03/Jan/2005:15:57:58] WARNING (24945): CORE3283: stderr: 258110.187: [CMS-concurrent-mark-start]
[03/Jan/2005:15:58:16] WARNING (24945): CORE3283: stderr: 258127.313: [GC 258127.313: [ParNew: 11954K->11954K(12224K), 0.0000652 secs]258127.313: [CMS258128.415: [CMS-concurrent-mark: 3.557/18.228 secs]
[03/Jan/2005:15:58:33] WARNING (24945): CORE3283: stderr: 258145.334: [GC [1 CMS-initial-mark: 717506K(1036288K)] 718066K(1048512K), 0.0118304 secs]
[03/Jan/2005:15:58:33] WARNING (24945): CORE3283: stderr: 258145.346: [CMS-concurrent-mark-start]
[03/Jan/2005:15:58:55] WARNING (24945): CORE3283: stderr: : 6532K->0K(12224K), 1.8597741 secs]258165.643: [CMS258166.575: [CMS-concurrent-mark: 3.562/21.228 secs]
[03/Jan/2005:15:59:13] WARNING (24945): CORE3283: stderr: 258184.975: [GC [1 CMS-initial-mark: 770501K(1036288K)] 770741K(1048512K), 0.0096138 secs]
[03/Jan/2005:15:59:13] WARNING (24945): CORE3283: stderr: 258184.985: [CMS-concurrent-mark-start]
[03/Jan/2005:15:59:32] WARNING (24945): CORE3283: stderr: : 11947K->0K(12224K), 2.3358861 secs]258202.439: [CMS258203.892: [CMS-concurrent-mark: 3.506/18.907 secs]
[03/Jan/2005:15:59:48] WARNING (24945): CORE3283: stderr: 258219.937: [GC [1 CMS-initial-mark: 716085K(1036288K)] 716470K(1048512K), 0.0076492 secs]
[03/Jan/2005:15:59:48] WARNING (24945): CORE3283: stderr: 258219.945: [CMS-concurrent-mark-start]
[03/Jan/2005:16:00:14] WARNING (24945): CORE3283: stderr: : 5185K->0K(12224K), 2.4303478 secs]258244.898: [CMS258246.097: [CMS-concurrent-mark: 3.515/26.152 secs]
[03/Jan/2005:16:00:36] WARNING (24945): CORE3283: stderr: 258267.877: [GC [1 CMS-initial-mark: 868463K(1036288K)] 868719K(1048512K), 0.0083395 secs]
[03/Jan/2005:16:00:36] WARNING (24945): CORE3283: stderr: 258267.886: [CMS-concurrent-mark-start]
[03/Jan/2005:16:00:53] WARNING (24945): CORE3283: stderr: : 11605K->0K(12224K), 2.5810415 secs]258283.306: [CMS258285.568: [CMS-concurrent-mark: 3.445/17.682 secs]
[03/Jan/2005:16:01:12] WARNING (24945): CORE3283: stderr: 258304.297: [GC [1 CMS-initial-mark: 782716K(1036288K)] 783165K(1048512K), 0.0086518 secs]
[03/Jan/2005:16:01:12] WARNING (24945): CORE3283: stderr: 258304.306: [CMS-concurrent-mark-start]
[03/Jan/2005:16:01:29] WARNING (24945): CORE3283: stderr: : 1683K->0K(12224K), 2.4806704 secs]258319.408: [CMS258321.570: [CMS-concurrent-mark: 3.420/17.264 secs]
[03/Jan/2005:16:01:50] WARNING (24945): CORE3283: stderr: 258341.680: [GC [1 CMS-initial-mark: 813600K(1036288K)] 813856K(1048512K), 0.0125876 secs]
[03/Jan/2005:16:01:50] WARNING (24945): CORE3283: stderr: 258341.693: [CMS-concurrent-mark-start]

###@###.### 2005-1-06 03:25:59 GMT

I am changing this bug's synopsis from:
CMS-concurrent-sweep messages missing from a "normal" concurrent CMS cycle
to:
CMS: concurrent mode failures running S1AS at NCS installation
to track the main nature of the current investigation.
The secondary issue of explicating the verbose gc messages
regarding occurance of a concurrent mode failure will also
be tackled as part of this investigation, but is not the
main thrust of thos bug.
###@###.### 2005-1-12 23:25:10 GMT

I am changing this bug's synopsis (yet again),
based on the current status of this engagement, from:

CMS: concurrent mode failures running S1AS at NCS installation

to

CMS: more helpful message for concurrent mode failures
###@###.### 2005-03-04 21:14:58 GMT

                                    

Comments
SUGGESTED FIX

Event:            putback-to
Parent workspace: /net/jano.sfbay/export/disk05/hotspot/ws/main/gc_baseline
                  (jano.sfbay:/export/disk05/hotspot/ws/main/gc_baseline)
Child workspace:  /net/prt-web.sfbay/prt-workspaces/20050310092810.ysr.bugs/workspace
                  (prt-web:/net/prt-web.sfbay/prt-workspaces/20050310092810.ysr.bugs/workspace)
User:             ysr

Comment:

---------------------------------------------------------

Original workspace:     neeraja:/net/spot/scratch/ysr/bugs
Submitter:              ysr
Archived data:          /net/prt-archiver.sfbay/data/archived_workspaces/main/gc_baseline/2005/20050310092810.ysr.bugs/
Webrev:                 http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/main/gc_baseline/2005/20050310092810.ysr.bugs/workspace/webrevs/webrev-2005.03.10/index.html

Fixed 6214543: CMS: more helpful message for concurrent mode failure

   http://analemma.sfbay/net/spot/scratch/ysr/bugs/webrev

[If you happen to look at the bug report, this started out as
an investigation of conc mode failures at a customer site
in Singapore, but that issue was resolved as the customer
had overlooked switching on class unloading with CMS.
The customer-facing engineer however wanted the concurrent
mode failure messages to be more explicit. A fortnight ago,
a similar request came in from Nortel, which is the second
SR in this CR.]

We declare "(concurrent mode failure)" whenever
an ongoing CMS collection is interrupted by an
explicit GC request or, irrespective of an ongoing
CMS cycle, the VM needed to bail into a stop-world
collection of the CMS heap.

For the case where we collect because of explicit user
request but do not interrupt an ongoing concurrent cycle,
we do not print a message declaring concurrent
mode failure.

The final form of the code gained from a simplification
suggested by Jon.

Reviewed by: Paul Hohensee, Jon Masamitsu

Fix verified: yes

Verification testing: run with CMS and artificially small
  heap to induce frequent concurrent mode failures; note
  that explicit requests do not print the message
  unless they collide with an ongoing concurrent
  collection. Also tested with -XX:+ExplicitGCInvokesConcurrent
  and with CMSInitiatingOccupancyFraction=100 UseCMSInitiatingOccupancyOnly

Files:
update: src/share/vm/memory/concurrentMarkSweepGeneration.cpp

Examined files: 3217

Contents Summary:
       1   update
    3216   no action (unchanged)

###@###.### 2005-03-10 23:30:48 GMT
                                     
2005-01-06
WORK AROUND

In verbose gc logs +PrintGCDetails look for
a pattern that matches:

  [GC: [ParNew: x^*][CMS x^*] x^*]

where x is any character other than "[" or "]".

The above match indicates a concurrent mode failure.
###@###.### 2005-03-04 21:14:59 GMT
                                     
2005-01-06
EVALUATION

The most recent update from the customer-facing engineer
appears to indicate that the problem was a pilot error
because of not enabling class unloading with CMS
(where it is off by default).

Since we have not received any further updates on this issue,
we will (unless informed otherwise) use this bug to affect
update of the verbose gc messages to explicate the
occurance of a concurrent mode failure. Those changes
will be in 1.6.0 and will be available on an as-needed
basis to back port to older releases as an EOU issue.


###@###.### 2005-03-01 17:34:07 GMT

Explicated in the verbose gc logs, the occurance of a concurrent
mode failure; look for "concurrent mode failure". See suggested
fix field. The fix is simple and may be backported to 1.4.2
at the discretion of JPSE.
###@###.### 2005-03-10 23:30:48 GMT

This has long been integrated into Mustang. Why hasn't this been
marked "Integrated" yet? Check with ###@###.###; this
sounds like a process snafu.
###@###.### 2005-05-27 23:07:48 GMT

I am marking this integrated since I have not heard back from Erik
about this.
###@###.### 2005-06-10 18:05:55 GMT
                                     
2005-01-06



Hardware and Software, Engineered to Work Together