JDK-6988458 : G1: assert(mr.end() <= _cm->finger()) failed: otherwise the region shouldn't be on the stack
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 7
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2010-09-29
  • Updated: 2013-09-18
  • Resolved: 2010-11-12
JDK 6         JDK 7       Other
6u25 Fixed    7 Fixed     hs20 Fixed
Description
The following nightly test:

vm/gc/containers/LinkedList_Arrays

failed with the following crash:

;; Using jvm: "/export/local/common/jdk/baseline/linux-i586/jre/lib/i386/server/libjvm.so"
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/tmp/jprt/P1/B/230657.jcoomes/source/src/share/vm/gc_implementation/g1/concurrentMark.cpp:3508), pid=24653, tid=2882767760
#  assert(mr.end() <= _cm->finger()) failed: otherwise the region shouldn't be on the stack
#
# JRE version: 7.0
# Java VM: OpenJDK Server VM (20.0-b01-201009282306.jcoomes.gc-stack-fastdebug mixed mode linux-x86 )
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
#

---------------  T H R E A D  ---------------

Current thread (0x09862800):  ConcurrentGCThread [stack: 0x00000000,0x00000000] [id=24676]

Stack: 
[error occurred during error reporting (printing stack bounds), id 0xe0000000]

Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xa21687];;  _ZN7VMError6reportEP12outputStream+0x1207
V  [libjvm.so+0xa2190d];;  _ZN7VMError14report_and_dieEv+0x18d
V  [libjvm.so+0x4917d8];;  _Z15report_vm_errorPKciS0_S0_+0x68
V  [libjvm.so+0xa2173f];;  _ZN7VMError6reportEP12outputStream+0x12bf
V  [libjvm.so+0xa2190d];;  _ZN7VMError14report_and_dieEv+0x18d
V  [libjvm.so+0x4917d8];;  _Z15report_vm_errorPKciS0_S0_+0x68
V  [libjvm.so+0x447861];;  _ZN6CMTask18drain_region_stackEP13BitMapClosure+0x251
V  [libjvm.so+0x44cb61];;  _ZN6CMTask15do_marking_stepEd+0x2e1
V  [libjvm.so+0x451f55];;  _ZN23CMConcurrentMarkingTask4workEi+0x125
V  [libjvm.so+0xa4b8f0];;  _ZN10GangWorker4loopEv+0x130
V  [libjvm.so+0xa4a368];;  _ZN10GangWorker3runEv+0x18
V  [libjvm.so+0x85e839];;  _ZL10java_startP6Thread+0xf9
C  [libpthread.so.0+0x5832]


Links to failure:

http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2010-09-28/G1_GC_Baseline/vm/linux-i586/server/mixed/linux-i586_vm_server_mixed_vm.gc.testlist/analysis.html

http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2010-09-28/G1_GC_Baseline/vm/linux-i586/server/mixed/linux-i586_vm_server_mixed_vm.gc.testlist/ResultDir/LinkedList_Arrays/

Culprit _must_ be the changes for 6941395 which were pushed yesterday.

Comments
EVALUATION http://hg.openjdk.java.net/jdk7/build/hotspot/rev/a5c514e74487
04-12-2010

EVALUATION For some reason, jprt did not update the CR with the changeset. Here it is:

Changeset: a5c514e74487
Author: johnc
Date: 2010-10-18 15:01 -0700
URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/a5c514e74487

6988458: G1: assert(mr.end() <= _cm->finger()) failed: otherwise the region shouldn't be on the stack
Summary: The changes from 6941395 did not clear the CMTask::_aborted_region fields when concurrent marking aborted because of overflow. As a result, the next time around we could see a memory region whose start address was above the global finger and the assertion tripped. Moved the clearing of the aborted regions to ConcurrentMark::clear_marking_state, which is executed on all of the exit paths.
Reviewed-by: tonyp, ysr, jmasa

! src/share/vm/gc_implementation/g1/concurrentMark.cpp
19-10-2010

SUGGESTED FIX Clear the CMTask::_aborted_region field in the routine ConcurrentMark::clear_marking_state(). This routine is called to effectively clear the marking data structures in the event of an abort and when marking completes.
01-10-2010

EVALUATION The code that clears the recorded partial region in the CMTask was not being called on one of the paths where concurrent marking is completely aborted (the abort for overflow). As a result, when marking was restarted, the MemRegion cached in the _aborted_region field was above the value of the finger, and that fired the assert. The fix is to move the clearing of the CMTask::_aborted_region field into the routine that clears the marking state in the event of an abort (for any reason).
01-10-2010
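
The failure mode described in the evaluation can be illustrated with a small standalone model. This is only a sketch and not the HotSpot code: the types MemRegion, Task and Marker, the clear_in_common_path flag, and the function stale_region_trips_check are simplified, hypothetical stand-ins, and the model assumes that restarting marking resets the finger to the bottom of the heap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a memory region: a half-open address range.
struct MemRegion {
    std::uintptr_t start;
    std::uintptr_t end;
    MemRegion(std::uintptr_t s = 0, std::uintptr_t e = 0) : start(s), end(e) {}
    bool is_empty() const { return start == end; }
};

// Hypothetical stand-in for a marking task: it only models the cached,
// partially scanned region (the _aborted_region field in the report).
struct Task {
    MemRegion aborted_region;
    void clear_aborted_region() { aborted_region = MemRegion(); }
};

// Hypothetical stand-in for the concurrent marker.
struct Marker {
    std::uintptr_t heap_bottom;
    std::uintptr_t finger;
    std::vector<Task> tasks;
    bool clear_in_common_path;  // true models the fix, false the original bug

    Marker(std::uintptr_t bottom, std::size_t n_tasks, bool fixed)
        : heap_bottom(bottom), finger(bottom), tasks(n_tasks),
          clear_in_common_path(fixed) {}

    // Runs on every exit path. Assumption of this model: the finger goes
    // back to the heap bottom so the next marking cycle starts over.
    void clear_marking_state() {
        finger = heap_bottom;
        if (clear_in_common_path) {
            for (Task& t : tasks) t.clear_aborted_region();
        }
    }

    // The overflow abort path, which originally skipped the clearing.
    void abort_for_overflow() { clear_marking_state(); }

    // The condition checked by the failing assertion.
    bool region_is_below_finger(const MemRegion& mr) const {
        return mr.end <= finger;
    }
};

// Returns true if a stale cached region would trip the check after an
// overflow abort followed by a restart.
bool stale_region_trips_check(bool fixed) {
    Marker cm(0x1000, 1, fixed);
    cm.finger = 0x5000;                                    // marking advanced
    cm.tasks[0].aborted_region = MemRegion(0x3000, 0x4000);
    assert(cm.region_is_below_finger(cm.tasks[0].aborted_region));

    cm.abort_for_overflow();                               // finger is reset

    const MemRegion& mr = cm.tasks[0].aborted_region;
    return !mr.is_empty() && !cm.region_is_below_finger(mr);
}
```

In this model, stale_region_trips_check(false) reports the violation because the cached region survives the abort while the finger moves back below it. With the clearing done in the common routine, stale_region_trips_check(true) finds an empty region and nothing to check.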