JDK-8054808 : Bitmap verification sometimes fails after Full GC aborts concurrent marking
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 9
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2014-08-11
  • Updated: 2018-09-18
  • Resolved: 2014-08-28
Fixed in: 8u40, 9 b30
JDK-8048085 optimized the clearing of the next mark bitmap during concurrent marking: if concurrent marking has been interrupted by a Full GC, G1 does not reset the next mark bitmap again, because the Full GC already does.

That change added an assertion that checks that the next mark bitmap is clear if marking has been aborted, i.e.

      if (!cm()->has_aborted()) {
        SuspendibleThreadSetJoiner sts;
      } else {
        assert(!G1VerifyBitmaps || _cm->nextMarkBitmapIsClear(), "Next mark bitmap must be clear");

This code sometimes fails after Full GC:

# A fatal error has been detected by the Java Runtime Environment:
#  Internal Error (/home/tschatzl/Downloads/vmshare/cmm-decommit/src.9/src/share/vm/gc_implementation/g1/concurrentMarkThread.cpp:286), pid=39649, tid=140275789788928
#  assert(!G1VerifyBitmaps || _cm->nextMarkBitmapIsClear()) failed: Next mark bitmap must be clear
# JRE version: Java(TM) SE Runtime Environment (9.0-b13) (build 1.9.0-ea-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.9.0-fastdebug-internal mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp

Can be reproduced fairly reliably (roughly one in 30 runs?) with code that stresses concurrent marking heavily.

These valid humongous objects were not actually marked. The problem was the condition used for the check:

      failure = _bitmap->getNextMarkedWordAddress(r->bottom(), r->end()) != r->end();

Between the getNextMarkedWordAddress call and the re-read of r->end() for the comparison, humongous objects may have been allocated into the same region. The two reads of r->end() can then differ, so the comparison reports a failure.
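The double read described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not HotSpot code: Region, MarkBitmap, next_marked and the interleave callback are hypothetical stand-ins, and snapshotting the end value once is one plausible way to make the comparison stable, not necessarily the fix that was pushed.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical model of a heap region; `end` may be advanced by another
// thread (for example by a concurrent humongous allocation).
struct Region {
    std::size_t bottom;
    std::atomic<std::size_t> end;
    Region(std::size_t b, std::size_t e) : bottom(b), end(e) {}
};

// Hypothetical model of the next mark bitmap: one bit per heap word.
struct MarkBitmap {
    std::vector<bool> bits;

    // First marked index in [from, limit), or limit if the range is clear.
    std::size_t next_marked(std::size_t from, std::size_t limit) const {
        assert(from <= limit && limit <= bits.size());
        for (std::size_t i = from; i < limit; ++i) {
            if (bits[i]) return i;
        }
        return limit;
    }
};

// Racy form: `end` is read once for the scan and again for the comparison.
// `interleave` stands in for whatever another thread does between the reads.
bool has_stray_mark_racy(const MarkBitmap& bm, const Region& r,
                         const std::function<void()>& interleave) {
    std::size_t found = bm.next_marked(r.bottom, r.end.load());
    interleave();
    return found != r.end.load();
}

// Stable form: snapshot `end` once and use the same value for both steps.
bool has_stray_mark_stable(const MarkBitmap& bm, const Region& r,
                           const std::function<void()>& interleave) {
    const std::size_t limit = r.end.load();
    std::size_t found = bm.next_marked(r.bottom, limit);
    interleave();
    return found != limit;
}
```

With a clear bitmap and an interleave callback that advances end, the racy form reports a stray mark while the stable form does not. Both forms still report a genuinely set bit inside the scanned range.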

A different reproducer yielded the same crash:

      bin/ute -jdk <jdk> -test "vm/gc/compact/*" -component vm -env VM_FLAVOR=server -vmoptions "-XX:+UseG1GC -Xmx1G -XX:+VerifyBeforeGC -XX:+VerifyAfterGC -XX:+UnlockExperimentalVMOptions -XX:+PrintGC -XX:+PrintAdaptiveSizePolicy -XX:+G1VerifyBitmaps -XX:-ReclaimDeadHumongousObjectsAtYoungGC" -env TEST_CONCURRENCY=10

Investigation so far showed that the objects with the stray marks are (live) humongous objects. The object starts are marked.

Hence suggested ILW:
  • I: medium -> marks valid humongous objects only, annoying during debugging
  • L: low -> it takes maybe 40 runs of tests that do a lot of humongous object allocation to show the error
  • W: medium -> disable G1VerifyBitmaps, or avoid applications with many Full GCs that abort marking
  • -> P4

Cannot reproduce on clean (promoted) builds after running the tests for a week. So possibly a problem with some local VM version. Closing the issue for now.

This needs more investigation before we can do an ILW. Thomas is currently trying to determine the cause (we have a guess, but we need to confirm it).