JDK-6280558 : CMS: assert "For error-free division", concurrentMarkSweepGeneration.cpp, 1789
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 6
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • CPU: x86
  • Submitted: 2005-06-04
  • Updated: 2012-02-01
  • Resolved: 2005-07-13
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 6
6 b43Fixed
Related Reports
Duplicate :  
Relates :  
Relates :  
Description
See:
http://vmsqe.sfbay/nightly/mantis/DTWS/results/05-28-05/ServerVM/64BITLINUX-AMD64/mixed/Gc_Baseline-Xconc/vm.gc-NIGHTLY-Gc_Baseline-Xconc-ServerVM-mixed-64BITLINUX-AMD64-2005-05-29-01-51-06/analysis.html

The following nightly tests on 05/05/28 failed the above assertion
on account of this bug:

   gc.gctests.Compact.compact020
   gc.gctests.Compact.compact009
   gc.gctests.Compact.compact006
   gc.gctests.Compact.compact005
   gc.gctests.Compact.compact004
   gc.gctests.Compact.compact003
   gc.gctests.Compact.compact002 
   gc.gctests.Compact.compact001
   gc/64bit/quicklook/largeheap/GC32BitRange
   gc.gctests.Compact.compact019
   gc.gctests.Compact.compact018
   gc.gctests.Compact.compact017
   gc.gctests.Steal.steal002
   gc.gctests.Compact.compact016 
   gc.gctests.Compact.compact013
   gc.gctests.Compact.compact011
   gc.gctests.Compact.compact010
   gc.gctests.SoftReference.soft001

Here is a typical rerun.sh file for one of the tests:
======================================================
HOME=/home/fhsu
DISPLAY=vmsqe.sfbay:0.0
CLASSPATH=.:/var/tmp/Work/exec/vm.gc-NIGHTLY-Gc_Baseline-Xconc-ServerVM-mixed-64BITLINUX-AMD64-2005-05-29-01-51-06/run2/fhsu.Linux.amd64/FillingStation:/net/vmsqe.sfbay/export/backup/UNIFIED-DTF/DTWS/suites/vm.gc/testbase/bin/classes:/var/tmp/fhsu/Work/JDK/NIGHTLY/Gc_Baseline-Xconc/linux-amd64/lib/tools.jar
LD_LIBRARY_PATH=/net/vmsqe.sfbay/export/backup/UNIFIED-DTF/DTWS/suites/vm.gc/testbase/src/nsk/share/lib/linuxamd64:/var/tmp/fhsu/Work/JDK/NIGHTLY/Gc_Baseline-Xconc/linux-amd64/jre/lib/amd64:/var/tmp/fhsu/Work/JDK/NIGHTLY/Gc_Baseline-Xconc/linux-amd64/jre/lib/amd64/server:/net/vmsqe.sfbay/export/backup/UNIFIED-DTF/DTWS/suites/vm.gc/testbase/src/misc/gc/utils/lib/linuxamd64
RAS_OPTIONS=
SHELL=/bin/sh
PATH=.:/var/tmp/fhsu/Work/JDK/NIGHTLY/Gc_Baseline-Xconc/linux-amd64/bin:/bin:/usr/bin:/var/tmp/fhsu/Work/JDK/NIGHTLY/Gc_Baseline-Xconc/linux-amd64/jre/bin

while [ $# -gt 0 ];
do
  if [ $1 = "-jdk" ]; then
        shift 1
        PATH=${1}/bin:${PATH}
        shift 1
  else
        if [ $1 = "-d" ]; then
          shift 1
          if [ $# -gt 0  ]; then
                DISPLAY=$1
                shift 1
          else
                DISPLAY=:0.0
          fi
        fi
  fi
done

export HOME
export DISPLAY
export CLASSPATH
export LD_LIBRARY_PATH
export RAS_OPTIONS
export SHELL
export PATH

#annotate TEST javaopt=-server -Xmixed -XX:-PrintVMOptions -XX:+UseConcMarkSweepGC -XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled -XX:+ExplicitGCInvokesConcurrent
#/net/jano/export/disk05/hotspot/users/ysr/testing/linux-amd64/jdk1.6.0/fastdebug/bin/java -server -Xmixed -DHANGINGJAVA20624 -XX:-PrintVMOptions -XX:+UseConcMarkSweepGC -XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled -XX:+ExplicitGCInvokesConcurrent -XX:+PrintGCDetails -XX:+ShowMessageBoxOnError FillingStation
/net/jano/export/disk05/hotspot/users/ysr/testing/linux-amd64/jdk1.6.0/fastdebug/bin/java -server -Xmixed -DHANGINGJAVA20624 -XX:-PrintVMOptions -XX:+UseConcMarkSweepGC -XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled -XX:+ExplicitGCInvokesConcurrent -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+ShowMessageBoxOnError gc.gctests.Compact.compact020 -iterations=1 -gcTimeout=5

# /java/re/jdk/1.6.0/promoted/latest/binaries/linux-amd64/bin/java -server -Xmixed -DHANGINGJAVA20624 -XX:-PrintVMOptions -XX:+UseConcMarkSweepGC -XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled -XX:+ExplicitGCInvokesConcurrent -XX:+PrintGCDetails -XX:+ShowMessageBoxOnError FillingStation
##Exit status of execution step=134
##!checkExitCode

## To suppress the following error report, specify this argument
## after -XX: or in .hotspotrc:  SuppressErrorAt=/cmsAdaptiveSizePolicy.cpp:368]
##
## An unexpected error has been detected by HotSpot Virtual Machine:
##
##  Internal Error (/PrtBuildDir/workspace/src/share/vm/gc_implementation/concurrentMarkSweep/cmsAdaptiveSizePolicy.cpp, 368), pid=25226, tid=1079327088
##
## Java VM: Java HotSpot(TM) 64-Bit Server VM (20050525150425.ysr.bugs-debug mixed mode)
##
## Error: assert(_latest_cms_sum_concurrent_phases_time_secs == 0.0,"There should not be any concurrent time")
## An error report file with more information is saved as hs_err_pid25226.log
##
## If you would like to submit a bug report, please visit:
##   http://java.sun.com/webapps/bugreport/crash.jsp
##
#Current thread is 1079327088
#Dumping core ...

###@###.### 2005-06-04 02:05:30 GMT

This bug also affects the following JTREG test: 

java/lang/management/MemoryMXBean/CollectionUsageThresholdConcMarkSweepGC.sh
###@###.### 2005-06-10 18:00:54 GMT

Comments
SUGGESTED FIX ------- allocationStats.hpp ------- 62,66c62,79 < ssize_t demand = prevSweep() - count + splitBirths() - splitDeaths(); < float rate = ((float)demand)/inter_sweep_current; < _demand_rate_estimate.sample(rate); < _desired = (ssize_t)(_demand_rate_estimate.padded_average() < *inter_sweep_estimate); --- > // If the latest inter-sweep time is below our granularity > // of measurement, we may call in here with > // inter_sweep_current == 0. However, even for suitably small > // inter-sweep durations, we may not trust the accuracy of > // accumulated data, since it has not been "integrated" > // (read "low-pass-filtered") long enough, and would be > // vulnerable to noisy glitches. In such cases, we > // ignore the current sample and use currently available > // historical estimates. > // Technically this should be derived from machine speed > float threshold = ((float)CMS_SweepTimerThresholdMillis)/1000; > if (inter_sweep_current > threshold) { > ssize_t demand = prevSweep() - count + splitBirths() - splitDeaths(); > float rate = ((float)demand)/inter_sweep_current; > _demand_rate_estimate.sample(rate); > _desired = (ssize_t)(_demand_rate_estimate.padded_average() > *inter_sweep_estimate); > } ------- concurrentMarkSweepGeneration.cpp ------- 1784d1783 < assert(_sweep_timer.seconds() > 0, "For error-free division"); 4901,4902c4900,4901 < assert(!_sweep_timer.is_active(), "Switched off in an outer context?"); < assert(_sweep_timer.seconds() > 0, "For error-free division"); --- > > assert(!_sweep_timer.is_active(), "Was switched off in an outer context"); ------- globals.hpp ------- 916a917,920 > product(uintx, CMS_SweepTimerThresholdMillis, 10, \ > "Skip block flux-rate sampling for an epoch if inter-sweep " \ > " duration exceeds this threhold in milliseconds") \ > ###@###.### 2005-06-09 20:38:00 GMT Event: putback-to Parent workspace: /net/jano.sfbay/export/disk05/hotspot/ws/main/gc_baseline (jano.sfbay:/export/disk05/hotspot/ws/main/gc_baseline) Child workspace: /net/prt-web.sfbay/prt-workspaces/20050611131241.ysr.bugs/workspace (prt-web:/net/prt-web.sfbay/prt-workspaces/20050611131241.ysr.bugs/workspace) User: ysr Comment: --------------------------------------------------------- Original workspace: sr-unwk-04:/net/spot/scratch/ysr/bugs Submitter: ysr Archived data: /net/prt-archiver.sfbay/data/archived_workspaces/main/gc_baseline/2005/20050611131241.ysr.bugs/ Webrev: http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/main/gc_baseline/2005/20050611131241.ysr.bugs/workspace/webrevs/webrev-2005.06.11/index.html Fixed 6280558: CMS: assert "For error-free division", concurrentMarkSweepGeneration.cpp, 1789 http://analemma.sfbay/net/spot/scratch/ysr/bugs/webrev The problem exhibits when we run on certain vintages of Linux where the elapsedTimer resolution is very coarse (on the order of 10 ms). The assert that fires is one checking that the time from the end of the previous sweep to the start of the current sweep is strictly positive. (Alas, real timers have finite resolution.) When there is an explicit gc request or a concurrent mode failure in a tight memory situation, as it does in the current test case, immediately following the most recent sweep, we may sometimes find that coarse timers will not record any passage of time. The assert was guarding against the use of that time duration as the divisor in a flow rate computation. Our fix is to avoid taking the sample when the interval being timed is below a certain threshold, thus avoiding the division problem, as well as protecting against use of time durations that may not be long enough to accumulate sufficient information to be accurate. The assert can then go away. The choice of the accumulation interval is somewhat arbitrary, and can do with some (dynamic?) tuning, as well as possibly platform awareness (but that is deferred for whenever we get the time to work on that). Thanks to Hui and Dice for useful information about the resolution of gettimeofday() on Linux (where it's used for elapsedTimer). Reviewed by: John Coomes, Andrey Petrusenko, Jon Masamitsu Fix verified: yes Verification test: the failing nightly tests (some of these tests are still vulnerable to the sister bug 6280548 being investigated by Jon) Other testing: PRT, refworkload Files: update: src/share/vm/includeDB_core update: src/share/vm/memory/allocationStats.hpp update: src/share/vm/memory/concurrentMarkSweepGeneration.cpp update: src/share/vm/runtime/globals.hpp create: src/share/vm/memory/allocationStats.cpp Examined files: 3278 Contents Summary: 1 create 4 update 3273 no action (unchanged) ###@###.### 2005-06-17 20:51:39 GMT
09-06-2005

EVALUATION see comments section. ###@###.### 2005-06-08 23:50:29 GMT
08-06-2005