JDK-6668573 : CMS: reference processing crash if ParallelCMSThreads > ParallelGCThreads
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: hs20,hs21,6u25,6u27,7
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic,solaris,solaris_10
  • CPU: generic,x86,sparc
  • Submitted: 2008-02-27
  • Updated: 2015-06-22
  • Resolved: 2011-04-24
Fixed versions:
  JDK 7: 7 (Fixed)
  Other: hs21 (Fixed)
Related Reports
Duplicate :  
Duplicate :  
Relates :  
Description
Run with, for example:

-XX:+PrintGCDetails -XX:+ShowMessageBoxOnError -XX:+UseConcMarkSweepGC -XX:ParallelGCThreads=2 -XX:ParallelCMSThreads=10 -XX:+CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=1 -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses


# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc:  SuppressErrorAt=/referenceProcessor.cpp:925
==============================================================================
Unexpected Error
------------------------------------------------------------------------------
Internal Error at referenceProcessor.cpp:925, pid=11636, tid=13 
Error: assert(0 <= id && id < _num_q,"Id is out-of-bounds (call Freud?)")

Do you want to debug the problem?

To debug, run 'dbx - 11636'; then switch to thread 13
Enter 'yes' to launch dbx automatically (PATH must include dbx)
Otherwise, press RETURN to abort...
==============================================================================
current thread: t@13
  [1] _waitid(0x0, 0x2d75, 0xf257f198, 0x3, 0x0, 0xfdae2800), at 0xff2c19e4 
  [2] _waitpid(0x2d75, 0xf257f2e0, 0x0, 0x2d75, 0xd, 0x0), at 0xff268404 
  [3] waitpid(0x2d75, 0xf257f2e0, 0x0, 0x0, 0x0, 0xfdae2800), at 0xff2b48dc 
=>[4] os::fork_and_exec(cmd = 0xff12ffd8 "dbx - 11636"), line 5761 in "os_solaris.cpp"
  [5] VMError::show_message_box(this = 0xf257f520, buf = 0xff12ffd8 "dbx - 11636", buflen = 2000), line 53 in "vmError_solaris.cpp"
  [6] VMError::report_and_die(this = 0xf257f520), line 661 in "vmError.cpp"
  [7] report_assertion_failure(file_name = 0xfef8262c "/net/neeraja/export/ysr/hg-gc/src/share/vm/memory/referenceProcessor.cpp", line_no = 925, message = 0xfef82675 "assert(0 <= id && id < _num_q,"Id is out-of-bounds (call Freud?)")"), line 173 in "debug.cpp"
  [8] ReferenceProcessor::get_discovered_list(this = 0x1f0e70, rt = REF_FINAL), line 925 in "referenceProcessor.cpp"
  [9] ReferenceProcessor::discover_reference(this = 0x1f0e70, obj = 0xf44182d0, rt = REF_FINAL), line 1081 in "referenceProcessor.cpp"
  [10] instanceRefKlass::oop_oop_iterate_nv(this = 0xf742ca68, obj = 0xf44182d0, closure = 0xf257f7d8), line 208 in "instanceRefKlass.cpp"
  [11] oopDesc::oop_iterate(this = 0xf44182d0, blk = 0xf257f7d8), line 470 in "oop.inline.hpp"
  [12] Par_MarkFromRootsClosure::scan_oops_in_oop(this = 0xf257f9e4, ptr = 0xf44182d0), line 7063 in "concurrentMarkSweepGeneration.cpp"
  [13] Par_MarkFromRootsClosure::do_bit(this = 0xf257f9e4, offset = 24756U), line 6980 in "concurrentMarkSweepGeneration.cpp"
  [14] BitMap::iterate(this = 0xa6448, blk = 0xf257f9e4, leftOffset = 0, rightOffset = 131072U), line 450 in "bitMap.cpp"
  [15] CMSBitMap::iterate(this = 0xa6404, cl = 0xf257f9e4, left = 0xf4400000, right = 0xf4480000), line 223 in "concurrentMarkSweepGeneration.inline.hpp"
  [16] CMSConcMarkingTask::do_scan_and_mark(this = 0xf237fb94, i = 0, sp = 0x73e80), line 3849 in "concurrentMarkSweepGeneration.cpp"
  [17] CMSConcMarkingTask::work(this = 0xf237fb94, i = 0), line 3710 in "concurrentMarkSweepGeneration.cpp"
  [18] YieldingFlexibleGangWorker::loop(this = 0xb2400), line 342 in "yieldingWorkgroup.cpp"
  [19] GangWorker::run(this = 0xb2400), line 186 in "workgroup.cpp"
  [20] java_start(thread_addr = 0xb2400), line 1010 in "os_solaris.cpp"

Current function is ReferenceProcessor::get_discovered_list
  925     assert(0 <= id && id < _num_q, "Id is out-of-bounds (call Freud?)");
(dbx) print id
id = 8
(dbx) print _num_q
_num_q = 2
(dbx) print ParallelGCThreads
ParallelGCThreads = 2U
(dbx) print ParallelCMSThreads
ParallelCMSThreads = 10U
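The values printed above (id = 8, _num_q = 2, with 2 parallel GC threads and 10 concurrent CMS threads) can be reproduced with a small model. This is only a sketch: it assumes the worker id is used directly as an index into a set of _num_q lists sized from ParallelGCThreads, which the dbx output suggests but the report does not state outright. DiscoveredLists and out_of_bounds_workers are hypothetical names, not HotSpot declarations.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified model of per-worker discovered lists.
// The names are illustrative and are not HotSpot's actual declarations.
struct DiscoveredLists {
    std::size_t num_q;  // number of lists, analogous to _num_q in the report

    explicit DiscoveredLists(std::size_t n) : num_q(n) {}

    // Mirrors the failing bounds check: an id is valid only if id < num_q.
    bool id_in_bounds(std::size_t id) const { return id < num_q; }
};

// Counts how many worker ids in [0, workers) would fail the bounds check.
std::size_t out_of_bounds_workers(const DiscoveredLists& lists,
                                  std::size_t workers) {
    std::size_t bad = 0;
    for (std::size_t id = 0; id < workers; ++id) {
        if (!lists.id_in_bounds(id)) {
            ++bad;
        }
    }
    return bad;
}
```

Under these assumptions, DiscoveredLists(2) rejects id 8, and 8 of the 10 concurrent worker ids fall outside the lists, so any of them reaching discovery could trip the assertion.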
As of JDK 7, ParallelCMSThreads is now called ConcGCThreads, so the synopsis
(in the context of JDK 7) might read:-

   CMS: reference processing crash if ConcGCThreads > ParallelGCThreads

Comments
As ConcGCThreads became an alias for ParallelCMSThreads, the issue was also observed with the 64-bit Java SE 6u35 Server VM when ConcGCThreads > ParallelGCThreads. As a workaround, use ParallelCMSThreads <= ParallelGCThreads or ConcGCThreads <= ParallelGCThreads.
12-10-2012

EVALUATION http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/92da084fefc9
25-03-2011

EVALUATION http://hg.openjdk.java.net/jdk7/hotspot-rt/hotspot/rev/92da084fefc9
21-03-2011

EVALUATION http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/92da084fefc9
17-03-2011

SUGGESTED FIX Webrev under development, test and review:-
/net/neeraja/export/ysr/refproc/webrev
Also contains the following hack to fix a CMS ref proc breakage that came in with 6984287 and might need to be independently fixed (backported to 6uXX?):-
--- old/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp Thu Mar 10 02:15:20 2011
+++ new/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp Thu Mar 10 02:15:20 2011
@@ -5710,6 +5714,7 @@
 {
 assert(_collector->_span.equals(_span) && !_span.is_empty(), "Inconsistency in _span");
+ set_for_termination(workers->active_workers());
 }
 OopTaskQueueSet* task_queues() { return queues(); }
10-03-2011

SUGGESTED FIX Use Andrey's suggestion: Could we use ReferenceProcessor::balance_queues to transform discovery_queues[ParallelCMSThreads] -> processing_queues[ParallelGCThreads] ?
10-03-2011
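The idea in the suggestion above, mapping a larger set of discovery queues onto a smaller set of processing queues, can be illustrated roughly as follows. This is a hypothetical sketch, not ReferenceProcessor::balance_queues: the function name redistribute, the use of int entries in place of discovered references, and the round-robin policy are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: spread the entries of any number of discovery
// queues round-robin over `processing_count` processing queues.
// This is not HotSpot's balance_queues; ints stand in for references.
std::vector<std::vector<int>> redistribute(
        const std::vector<std::vector<int>>& discovery,
        std::size_t processing_count) {
    assert(processing_count > 0);
    std::vector<std::vector<int>> processing(processing_count);
    std::size_t next = 0;  // index of the processing queue to fill next
    for (const auto& queue : discovery) {
        for (int ref : queue) {
            processing[next].push_back(ref);
            next = (next + 1) % processing_count;
        }
    }
    return processing;
}
```

With 10 discovery queues (the ParallelCMSThreads value from the report) and 2 processing queues (ParallelGCThreads), every entry ends up in one of the two processing queues and the load is split evenly, regardless of which discovery queue it came from.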

EVALUATION The relevant code in the reference processor needs to allow for the possibility that users may run with ParallelCMSThreads > ParallelGCThreads. (We might also want to add a warning that such a setting might potentially be sub-optimal wrt performance, but that is orthogonal to the correctness fix above.)
27-02-2008
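One way to read the evaluation above is that the number of discovery lists should cover the larger of the two thread counts, with an optional warning when the concurrent count exceeds the parallel one. The sketch below illustrates that reading only; QueueSizing, size_queues and the warning text are hypothetical and are not taken from the actual changeset.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type; the field names are illustrative only.
struct QueueSizing {
    std::size_t processing_queues;  // queues used by the parallel GC workers
    std::size_t total_queues;       // enough queues for any discovering worker id
    std::string warning;            // empty when no warning applies
};

// Sizes the queues so that a worker id below either thread count is in bounds.
QueueSizing size_queues(std::size_t parallel_gc_threads,
                        std::size_t conc_threads) {
    QueueSizing s;
    s.processing_queues = std::max<std::size_t>(1, parallel_gc_threads);
    s.total_queues = std::max(s.processing_queues, conc_threads);
    if (conc_threads > parallel_gc_threads) {
        s.warning = "ParallelCMSThreads > ParallelGCThreads may be sub-optimal";
    }
    return s;
}
```

For the settings in this report, size_queues(2, 10) yields 2 processing queues, 10 queues in total and a non-empty warning, so the id = 8 seen in the dbx session would be within bounds.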

WORK AROUND Always use ParallelCMSThreads <= ParallelGCThreads (if modifying them via the command line).
27-02-2008