Bug ID: JDK-6668573 CMS: reference processing crash if ParallelCMSThreads > ParallelGCThreads

Details
Type: Bug
Submit Date: 2008-02-27
Status: Closed
Updated Date: 2013-09-12
Project Name: JDK
Resolved Date: 2011-04-24
Component: hotspot
OS: solaris,generic,solaris_10
Sub-Component: gc
CPU: x86,sparc,generic
Priority: P3
Resolution: Fixed
Affected Versions: hs20,hs21,6u25,6u27,7
Fixed Versions: hs21 (b05)

Related Reports
Backport:
Duplicate:
Relates:

Sub Tasks

Description
Run with, for example:

-XX:+PrintGCDetails -XX:+ShowMessageBoxOnError -XX:+UseConcMarkSweepGC -XX:ParallelGCThreads=2 -XX:ParallelCMSThreads=10 -XX:+CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=1 -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses


# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc:  SuppressErrorAt=/referenceProcessor.cpp:925
==============================================================================
Unexpected Error
------------------------------------------------------------------------------
Internal Error at referenceProcessor.cpp:925, pid=11636, tid=13 
Error: assert(0 <= id && id < _num_q,"Id is out-of-bounds (call Freud?)")

Do you want to debug the problem?

To debug, run 'dbx - 11636'; then switch to thread 13
Enter 'yes' to launch dbx automatically (PATH must include dbx)
Otherwise, press RETURN to abort...
==============================================================================
current thread: t@13
  [1] _waitid(0x0, 0x2d75, 0xf257f198, 0x3, 0x0, 0xfdae2800), at 0xff2c19e4 
  [2] _waitpid(0x2d75, 0xf257f2e0, 0x0, 0x2d75, 0xd, 0x0), at 0xff268404 
  [3] waitpid(0x2d75, 0xf257f2e0, 0x0, 0x0, 0x0, 0xfdae2800), at 0xff2b48dc 
=>[4] os::fork_and_exec(cmd = 0xff12ffd8 "dbx - 11636"), line 5761 in "os_solaris.cpp"
  [5] VMError::show_message_box(this = 0xf257f520, buf = 0xff12ffd8 "dbx - 11636", buflen = 2000), line 53 in "vmError_solaris.cpp"
  [6] VMError::report_and_die(this = 0xf257f520), line 661 in "vmError.cpp"
  [7] report_assertion_failure(file_name = 0xfef8262c "/net/neeraja/export/ysr/hg-gc/src/share/vm/memory/referenceProcessor.cpp", line_no = 925, message = 0xfef82675 "assert(0 <= id && id < _num_q,"Id is out-of-bounds (call Freud?)")"), line 173 in "debug.cpp"
  [8] ReferenceProcessor::get_discovered_list(this = 0x1f0e70, rt = REF_FINAL), line 925 in "referenceProcessor.cpp"
  [9] ReferenceProcessor::discover_reference(this = 0x1f0e70, obj = 0xf44182d0, rt = REF_FINAL), line 1081 in "referenceProcessor.cpp"
  [10] instanceRefKlass::oop_oop_iterate_nv(this = 0xf742ca68, obj = 0xf44182d0, closure = 0xf257f7d8), line 208 in "instanceRefKlass.cpp"
  [11] oopDesc::oop_iterate(this = 0xf44182d0, blk = 0xf257f7d8), line 470 in "oop.inline.hpp"
  [12] Par_MarkFromRootsClosure::scan_oops_in_oop(this = 0xf257f9e4, ptr = 0xf44182d0), line 7063 in "concurrentMarkSweepGeneration.cpp"
  [13] Par_MarkFromRootsClosure::do_bit(this = 0xf257f9e4, offset = 24756U), line 6980 in "concurrentMarkSweepGeneration.cpp"
  [14] BitMap::iterate(this = 0xa6448, blk = 0xf257f9e4, leftOffset = 0, rightOffset = 131072U), line 450 in "bitMap.cpp"
  [15] CMSBitMap::iterate(this = 0xa6404, cl = 0xf257f9e4, left = 0xf4400000, right = 0xf4480000), line 223 in "concurrentMarkSweepGeneration.inline.hpp"
  [16] CMSConcMarkingTask::do_scan_and_mark(this = 0xf237fb94, i = 0, sp = 0x73e80), line 3849 in "concurrentMarkSweepGeneration.cpp"
  [17] CMSConcMarkingTask::work(this = 0xf237fb94, i = 0), line 3710 in "concurrentMarkSweepGeneration.cpp"
  [18] YieldingFlexibleGangWorker::loop(this = 0xb2400), line 342 in "yieldingWorkgroup.cpp"
  [19] GangWorker::run(this = 0xb2400), line 186 in "workgroup.cpp"
  [20] java_start(thread_addr = 0xb2400), line 1010 in "os_solaris.cpp"

Current function is ReferenceProcessor::get_discovered_list
  925     assert(0 <= id && id < _num_q, "Id is out-of-bounds (call Freud?)");
(dbx) print id
id = 8
(dbx) print _num_q
_num_q = 2
(dbx) print ParallelGCThreads
ParallelGCThreads = 2U
(dbx) print ParallelCMSThreads
ParallelCMSThreads = 10U
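
The dbx values above (id = 8, _num_q = 2, ParallelGCThreads = 2, ParallelCMSThreads = 10) suggest that the discovered lists are sized from ParallelGCThreads while the id used to index them can come from the larger concurrent worker gang. The following is a minimal sketch of that mismatch, not HotSpot code: `DiscoveredLists`, `id_in_bounds` and `first_out_of_bounds_id` are hypothetical names, and the `int` slots are placeholders for list heads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-queue discovered lists (not the HotSpot type).
struct DiscoveredLists {
    std::vector<int> lists;  // one placeholder slot per queue

    explicit DiscoveredLists(std::size_t num_q) : lists(num_q, 0) {}

    // Same condition as the failing assert: 0 <= id && id < _num_q.
    bool id_in_bounds(std::size_t id) const { return id < lists.size(); }
};

// Returns the first worker id that would fail the bounds check,
// or -1 if every id in [0, num_workers) is in range.
long first_out_of_bounds_id(std::size_t num_q, std::size_t num_workers) {
    DiscoveredLists d(num_q);
    for (std::size_t id = 0; id < num_workers; ++id) {
        if (!d.id_in_bounds(id)) {
            return static_cast<long>(id);
        }
    }
    return -1;
}
```

With 2 queues and 10 workers, every id from 2 upward is out of range, which is consistent with the reported id = 8.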

As of JDK 7, ParallelCMSThreads is called ConcGCThreads, so the synopsis
(in the context of JDK 7) might read:

   CMS: reference processing crash if ConcGCThreads > ParallelGCThreads

                                    

Comments
SUGGESTED FIX

Webrev under development, test and review:-

  /net/neeraja/export/ysr/refproc/webrev

Also contains the following hack to fix a CMS ref proc breakage
that came in with 6984287 and might need to be independently
fixed (backported to 6uXX?):-

--- old/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp	Thu Mar 10 02:15:20 2011
+++ new/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp	Thu Mar 10 02:15:20 2011
@@ -5710,6 +5714,7 @@
     {
       assert(_collector->_span.equals(_span) && !_span.is_empty(),
              "Inconsistency in _span");
+      set_for_termination(workers->active_workers());
     }
 
   OopTaskQueueSet* task_queues() { return queues(); }
                                     
2011-03-10
SUGGESTED FIX

Use Andrey's suggestion:

     Could we use ReferenceProcessor::balance_queues to transform
     discovery_queues[ParallelCMSThreads] ->
     processing_queues[ParallelGCThreads] ?
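
A rough sketch of the idea in this suggestion, assuming discovery fills one queue per concurrent worker and processing wants one queue per parallel GC worker. This is not the actual ReferenceProcessor::balance_queues implementation; `Queue`, `rebalance` and the `int` entries are hypothetical placeholders, and round-robin is just one possible redistribution policy.

```cpp
#include <cstddef>
#include <vector>

// Placeholder for a discovered-reference list; ints stand in for references.
using Queue = std::vector<int>;

// Redistribute every entry of the discovery queues (any count) into exactly
// num_processing queues, round-robin, so the result is evenly loaded.
std::vector<Queue> rebalance(const std::vector<Queue>& discovery,
                             std::size_t num_processing) {
    std::vector<Queue> processing(num_processing);
    if (num_processing == 0) {
        return processing;  // nothing to distribute into
    }
    std::size_t next = 0;
    for (const Queue& q : discovery) {
        for (int ref : q) {
            processing[next].push_back(ref);
            next = (next + 1) % num_processing;
        }
    }
    return processing;
}
```

Under these assumptions, 10 discovery queues collapse into 2 processing queues without losing entries, so processing never has to index beyond ParallelGCThreads.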
                                     
2011-03-10
EVALUATION

http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/92da084fefc9
                                     
2011-03-17
EVALUATION

http://hg.openjdk.java.net/jdk7/hotspot-rt/hotspot/rev/92da084fefc9
                                     
2011-03-21
EVALUATION

http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/92da084fefc9
                                     
2011-03-25
EVALUATION

The relevant code in the reference processor needs to allow for the
possibility that users may run with ParallelCMSThreads > ParallelGCThreads.

(We might also want to add a warning that such a setting may be
sub-optimal with respect to performance, but that is orthogonal to the correctness fix above.)
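
The optional warning mentioned above could look roughly like the following. This is only an illustration: `check_cms_thread_flags` is a hypothetical helper, not a HotSpot function, and the message text is invented for the example.

```cpp
#include <string>

// Hypothetical helper: returns an empty string when the setting is fine,
// otherwise a warning message describing the mismatch.
std::string check_cms_thread_flags(unsigned conc_threads,
                                   unsigned parallel_gc_threads) {
    if (conc_threads <= parallel_gc_threads) {
        return std::string();
    }
    return "warning: ParallelCMSThreads (" + std::to_string(conc_threads) +
           ") exceeds ParallelGCThreads (" +
           std::to_string(parallel_gc_threads) +
           "); this may be sub-optimal for performance";
}
```

Such a check only reports the configuration; it does not replace the correctness fix in the reference processor.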
                                     
2008-02-27
WORK AROUND

Always use ParallelCMSThreads <= ParallelGCThreads (if modifying them
via the command line).
                                     
2008-02-27
Because ConcGCThreads became an alias for ParallelCMSThreads,
the issue was also observed with the 64-bit Java SE 6u35 Server VM
when
ConcGCThreads > ParallelGCThreads

So, as a workaround, use:
ParallelCMSThreads <= ParallelGCThreads
or
ConcGCThreads <= ParallelGCThreads 

                                     
2012-10-12


