Bug ID: JDK-6988458 G1: assert(mr.end() <= _cm->finger()) failed: otherwise the region shouldn't be on the stack

Details
Type: Bug
Submit Date: 2010-09-29
Status: Resolved
Updated Date: 2011-01-28
Project Name: JDK
Resolved Date: 2010-11-12
Component: hotspot
OS: generic
Sub-Component: gc
CPU: generic
Priority: P3
Resolution: Fixed
Affected Versions: 7
Fixed Versions: hs20 (b02)

Description
The following nightly test:

vm/gc/containers/LinkedList_Arrays

failed with the following crash:

;; Using jvm: "/export/local/common/jdk/baseline/linux-i586/jre/lib/i386/server/libjvm.so"
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/tmp/jprt/P1/B/230657.jcoomes/source/src/share/vm/gc_implementation/g1/concurrentMark.cpp:3508), pid=24653, tid=2882767760
#  assert(mr.end() <= _cm->finger()) failed: otherwise the region shouldn't be on the stack
#
# JRE version: 7.0
# Java VM: OpenJDK Server VM (20.0-b01-201009282306.jcoomes.gc-stack-fastdebug mixed mode linux-x86 )
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
#

---------------  T H R E A D  ---------------

Current thread (0x09862800):  ConcurrentGCThread [stack: 0x00000000,0x00000000] [id=24676]

Stack: 
[error occurred during error reporting (printing stack bounds), id 0xe0000000]

Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xa21687];;  _ZN7VMError6reportEP12outputStream+0x1207
V  [libjvm.so+0xa2190d];;  _ZN7VMError14report_and_dieEv+0x18d
V  [libjvm.so+0x4917d8];;  _Z15report_vm_errorPKciS0_S0_+0x68
V  [libjvm.so+0xa2173f];;  _ZN7VMError6reportEP12outputStream+0x12bf
V  [libjvm.so+0xa2190d];;  _ZN7VMError14report_and_dieEv+0x18d
V  [libjvm.so+0x4917d8];;  _Z15report_vm_errorPKciS0_S0_+0x68
V  [libjvm.so+0x447861];;  _ZN6CMTask18drain_region_stackEP13BitMapClosure+0x251
V  [libjvm.so+0x44cb61];;  _ZN6CMTask15do_marking_stepEd+0x2e1
V  [libjvm.so+0x451f55];;  _ZN23CMConcurrentMarkingTask4workEi+0x125
V  [libjvm.so+0xa4b8f0];;  _ZN10GangWorker4loopEv+0x130
V  [libjvm.so+0xa4a368];;  _ZN10GangWorker3runEv+0x18
V  [libjvm.so+0x85e839];;  _ZL10java_startP6Thread+0xf9
C  [libpthread.so.0+0x5832]


Links to failure:

http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2010-09-28/G1_GC_Baseline/vm/linux-i586/server/mixed/linux-i586_vm_server_mixed_vm.gc.testlist/analysis.html

http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2010-09-28/G1_GC_Baseline/vm/linux-i586/server/mixed/linux-i586_vm_server_mixed_vm.gc.testlist/ResultDir/LinkedList_Arrays/

Culprit _must_ be the changes for 6941395 which were pushed yesterday.

                                    

Comments
SUGGESTED FIX

Clear the CMTask::_aborted_region field in the routine ConcurrentMark::clear_marking_state(). This routine resets the marking data structures both when marking is aborted and when marking completes.
                                     
2010-10-01
EVALUATION

The code to clear the recorded partial region in the CMTask was not being called from one of the paths where Concurrent Marking is completely aborted (when we abort for overflow). As a result, when marking was restarted, the MemRegion cached in the _aborted_region field was above the value of the finger, and that fired the assert.

The fix is to move the clearing of the CMTask::_aborted_region field into the routine that clears the marking state in the event of an abort (for any reason).
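
The failure mode and the fix can be illustrated with a small standalone model. This is only a sketch and not the HotSpot code: Region, Task, Marker, invariant_holds and restart_is_safe are hypothetical names invented for the illustration, addresses are plain integers, and a boolean flag stands in for "the abort path does or does not clear the cached region".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a MemRegion: a half-open [start, end) range.
struct Region {
    std::size_t start;
    std::size_t end;
    bool is_empty() const { return start == end; }
};

// Toy marking task; only the cached partially scanned region is modelled.
struct Task {
    Region aborted_region = Region{0, 0};  // plays the role of CMTask::_aborted_region
};

// Toy marker with a global finger and a set of worker tasks.
struct Marker {
    std::size_t finger = 0;
    std::vector<Task> tasks;

    explicit Marker(std::size_t n) : tasks(n) {}

    // Reset used on the abort path. clear_aborted == false models the
    // behaviour described before the fix, true the behaviour after it.
    void clear_marking_state(bool clear_aborted) {
        finger = 0;
        if (clear_aborted) {
            for (Task& t : tasks) t.aborted_region = Region{0, 0};
        }
    }

    // The condition the failing assert checks: any region a task still
    // holds must end at or below the global finger.
    bool invariant_holds() const {
        for (const Task& t : tasks) {
            if (!t.aborted_region.is_empty() && t.aborted_region.end > finger) {
                return false;
            }
        }
        return true;
    }
};

// Runs one "abort for overflow, then restart" cycle and reports whether
// the invariant still holds after marking restarts.
bool restart_is_safe(bool clear_aborted) {
    Marker m(2);
    m.finger = 4096;
    m.tasks[0].aborted_region = Region{1024, 2048};  // recorded mid-scan
    assert(m.invariant_holds());                     // fine before the abort
    m.clear_marking_state(clear_aborted);            // abort for overflow
    m.finger = 512;                                  // restarted marking, finger still low
    return m.invariant_holds();
}
```

In this model, restart_is_safe(false) reports a violation because the stale region ends above the restarted finger, while restart_is_safe(true) does not, since the reset that every exit path shares also drops the cached region.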
                                     
2010-10-01
EVALUATION

For some reason, jprt did not update the CR with the changeset. Here it is:

Changeset: a5c514e74487
Author:    johnc
Date:      2010-10-18 15:01 -0700
URL:       http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/a5c514e74487

6988458: G1: assert(mr.end() <= _cm->finger()) failed: otherwise the region shouldn't be on the stack
Summary: The changes from 6941395 did not clear the CMTask::_aborted_region fields when concurrent marking aborted because of overflow. As a result, the next time around we could see a memory region whose start address was above the global finger and the assertion tripped. Moved the clearing of the aborted regions to ConcurrentMark::clear_marking_state, which is executed on all of the exit paths.
Reviewed-by: tonyp, ysr, jmasa

! src/share/vm/gc_implementation/g1/concurrentMark.cpp
                                     
2010-10-19
EVALUATION

http://hg.openjdk.java.net/jdk7/build/hotspot/rev/a5c514e74487
                                     
2010-12-04


