JDK-6902692 : G1: g1CollectorPolicy.cpp:1533, assert(false,"Debugging data for CR 6898948 has been dumped above")
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: hs17
  • Priority: P4
  • Status: Closed
  • Resolution: Other
  • OS: generic
  • CPU: generic
  • Submitted: 2009-11-18
  • Updated: 2019-11-05
  • Resolved: 2019-11-05
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other
tbdResolved
Related Reports
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Description
http://sqeweb.sfbay.sun.com/nfs/results/vm/gtee/JDK7/NIGHTLY/VM/2009-11-16/G1_GC_Baseline/vm/linux-amd64/server/mixed/linux-amd64_server_mixed_nsk.stack.testlist/ResultDir/stack017/stack017.log

[2009-11-17T06:08:24.35] Maximal recursion depth: 10435
[2009-11-17T06:08:26.30] recent_avg_pause_time_ratio() out of bounds
[2009-11-17T06:08:26.30] -------------------------------------------
[2009-11-17T06:08:26.30] Recent GC Times (ms):
[2009-11-17T06:08:26.30] 	 _num = 3, _sum = 1954.384, _sum_of_squares = 1910809.870
[2009-11-17T06:08:26.30] 	 _davg = 776.233, _dvariance = 112378.745, _alpha =   0.700
[2009-11-17T06:08:26.30] 		 _length = 10, _next = 3
[2009-11-17T06:08:26.30] 
[2009-11-17T06:08:26.30] 		[0]=999.569	[1]=  0.000	[2]=954.815	[3]=  0.000	[4]=  0.000
[2009-11-17T06:08:26.30] 		[5]=  0.000	[6]=  0.000	[7]=  0.000	[8]=  0.000	[9]=  0.000
[2009-11-17T06:08:26.30] (End Time=1.954) Recent GC End Times (s):
[2009-11-17T06:08:26.30] 	 _num = 4, _sum =   3.954, _sum_of_squares =   5.818
[2009-11-17T06:08:26.30] 	 _davg =   0.943, _dvariance =   0.429, _alpha =   0.700
[2009-11-17T06:08:26.30] 		 _length = 10, _next = 4
[2009-11-17T06:08:26.30] 
[2009-11-17T06:08:26.30] 		[0]=  0.000	[1]=  1.000	[2]=  1.000	[3]=  1.954	[4]=  0.000
[2009-11-17T06:08:26.30] 		[5]=  0.000	[6]=  0.000	[7]=  0.000	[8]=  0.000	[9]=  0.000
[2009-11-17T06:08:26.30] GC = 1954.384, Interval = 1954.384, Ratio = 1.000
[2009-11-17T06:08:26.30] # To suppress the following error report, specify this argument
[2009-11-17T06:08:26.30] # after -XX: or in .hotspotrc:  SuppressErrorAt=/g1CollectorPolicy.cpp:1533
[2009-11-17T06:08:26.30] #
[2009-11-17T06:08:26.30] # A fatal error has been detected by the Java Runtime Environment:
[2009-11-17T06:08:26.30] #
[2009-11-17T06:08:26.30] #  Internal Error (/tmp/jprt/P1/B/195911.ysr/source/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp:1533), pid=15217, tid=1093196096
[2009-11-17T06:08:26.35] #  Error: assert(false,"Debugging data for CR 6898948 has been dumped above")
[2009-11-17T06:08:26.35] #
[2009-11-17T06:08:26.35] # JRE version: 7.0-b76
[2009-11-17T06:08:26.35] # Java VM: OpenJDK 64-Bit Server VM (17.0-b04-2009-11-13-195911.ysr.fpdump-fastdebug mixed mode linux-amd64 compressed oops)
[2009-11-17T06:08:26.35] # An error report file with more information is saved as:
[2009-11-17T06:08:26.35] # /export/local/25677.JDK7.NIGHTLY.VM+linux-amd64_server_mixed_nsk.stack.testlist/results/ResultDir/stack017/hs_err_pid15217.log
[2009-11-17T06:08:26.35] #
[2009-11-17T06:08:26.35] # If you would like to submit a bug report, please visit:
[2009-11-17T06:08:26.35] #   http://java.sun.com/webapps/bugreport/crash.jsp
[2009-11-17T06:08:26.35] #
The computation of recent_avg_pause_time() is susceptible to unbounded error
as the JVM ages because of the way it's currently computed. (See comments
section for a description of the problem and a solution.) The susceptibility to
the error depends on the occurrence of specific patterns of very short GC's
alternating with long periods of no GC. It turns out, though, that in the
cases where this test fails in the nightlies, the timestamps generated by
the JVM (on Linux/amd64) themselves are "bursty" (time presumably does not progress
smoothly -- it jumps forward then stalls for periods of time with no
forward progress; this defect should be independently and separately
investigated to see if the JVM time-service or the undelying os/platform/hardware
might be defective here) -- this turns out to simulate precisely the condition
of a JVM that has lived for a very long time and then sees a burst of very
short GC pauses. Computing the rcent_avg_pause_time_ratio() in a slightly
different way should protect us from this increasing coarseness/error
as the JVM ages. See the comments section for more details.

http://sqeweb.sfbay.sun.com/nfs/results/vm/gtee/JDK7/NIGHTLY/VM/2009-11-17/G1_GC_Baseline/vm/linux-amd64/server/mixed/linux-amd64_server_mixed_vm.gc.testlist/ResultDir/InterruptGC//InterruptGC.log

[2009-11-18T11:16:28.40] recent_avg_pause_time_ratio() out of bounds
[2009-11-18T11:16:28.40] -------------------------------------------
[2009-11-18T11:16:28.40] Recent GC Times (ms):
[2009-11-18T11:16:28.40] 	 _num = 7, _sum = 1957.514, _sum_of_squares = 1916791.489
[2009-11-18T11:16:28.40] 	 _davg = 497.299, _dvariance = 166474.335, _alpha =   0.700
[2009-11-18T11:16:28.40] 		 _length = 10, _next = 7
[2009-11-18T11:16:28.40] 
[2009-11-18T11:16:28.40] 		[0]=  0.000	[1]=  0.000	[2]=  0.000	[3]=  0.000	[4]=  0.000
[2009-11-18T11:16:28.40] 		[5]=999.505	[6]=958.009	[7]=  0.000	[8]=  0.000	[9]=  0.000
[2009-11-18T11:16:28.40] (End Time=1.958) Recent GC End Times (s):
[2009-11-18T11:16:28.40] 	 _num = 8, _sum =   2.957, _sum_of_squares =   4.831
[2009-11-18T11:16:28.40] 	 _davg =   0.797, _dvariance =   0.507, _alpha =   0.700
[2009-11-18T11:16:28.40] 		 _length = 10, _next = 8
[2009-11-18T11:16:28.40] 
[2009-11-18T11:16:28.40] 		[0]=  0.000	[1]=  0.000	[2]=  0.000	[3]=  0.000	[4]=  0.000
[2009-11-18T11:16:28.40] 		[5]=  0.000	[6]=  1.000	[7]=  1.958	[8]=  0.000	[9]=  0.000
[2009-11-18T11:16:28.40] GC = 1957.514, Interval = 1957.514, Ratio = 1.000
[2009-11-18T11:16:28.40] # To suppress the following error report, specify this argument
[2009-11-18T11:16:28.40] # after -XX: or in .hotspotrc:  SuppressErrorAt=/g1CollectorPolicy.cpp:1533
[2009-11-18T11:16:28.40] #
[2009-11-18T11:16:28.40] # A fatal error has been detected by the Java Runtime Environment:
[2009-11-18T11:16:28.40] #
[2009-11-18T11:16:28.40] #  Internal Error (/tmp/jprt/P1/B/195911.ysr/source/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp:1533), pid=25021, tid=1089169728
[2009-11-18T11:16:28.40] #  Error: assert(false,"Debugging data for CR 6898948 has been dumped above")
[2009-11-18T11:16:28.40] #
[2009-11-18T11:16:28.40] # JRE version: 7.0-b76
[2009-11-18T11:16:28.40] # Java VM: OpenJDK 64-Bit Server VM (17.0-b04-2009-11-13-195911.ysr.fpdump-fastdebug mixed mode linux-amd64 )
[2009-11-18T11:16:28.40] # An error report file with more information is saved as:
[2009-11-18T11:16:28.40] # /export/local/25758.JDK7.NIGHTLY.VM+linux-amd64_server_mixed_vm.gc.testlist/results/ResultDir/InterruptGC/hs_err_pid25021.log
[2009-11-18T11:16:28.40] #
[2009-11-18T11:16:28.40] # If you would like to submit a bug report, please visit:
[2009-11-18T11:16:28.44] #   http://java.sun.com/webapps/bugreport/crash.jsp
[2009-11-18T11:16:28.44] #
http://sqeweb.sfbay.sun.com/nfs/results/vm/gtee/JDK7/NIGHTLY/VM/2009-11-16/G1_GC_Baseline/vm/linux-amd64/server/mixed/linux-amd64_server_mixed_nsk.stack.testlist/ResultDir/stack018/stack018.log

[2009-11-17T06:08:28.30] Maximal recursion depth: 1800
[2009-11-17T06:11:24.27] recent_avg_pause_time_ratio() out of bounds
[2009-11-17T06:11:24.27] -------------------------------------------
[2009-11-17T06:11:24.27] Recent GC Times (ms):
[2009-11-17T06:11:24.27] 	 _num = 10, _sum = 98957.411, _sum_of_squares = 1093972265.883
[2009-11-17T06:11:24.27] 	 _davg = 7471.264, _dvariance = 17696958.204, _alpha =   0.700
[2009-11-17T06:11:24.27] 		 _length = 10, _next = 1
[2009-11-17T06:11:24.27] 
[2009-11-17T06:11:24.27] 		[0]=  0.000	[1]=10995.279	[2]=10995.260	[3]=10995.262	[4]=12994.433
[2009-11-17T06:11:24.27] 		[5]=10950.218	[6]=10040.721	[7]=10995.265	[8]=9995.682	[9]=10995.291
[2009-11-17T06:11:24.27] (End Time=175.969) Recent GC End Times (s):
[2009-11-17T06:11:24.27] 	 _num = 10, _sum = 1369.813, _sum_of_squares = 196609.242
[2009-11-17T06:11:24.27] 	 _davg = 158.450, _dvariance = 524.830, _alpha =   0.700
[2009-11-17T06:11:24.27] 		 _length = 10, _next = 2
[2009-11-17T06:11:24.27] 
[2009-11-17T06:11:24.27] 		[0]=175.969	[1]=175.969	[2]= 88.007	[3]= 99.002	[4]=109.997
[2009-11-17T06:11:24.27] 		[5]=122.992	[6]=133.942	[7]=143.983	[8]=154.978	[9]=164.974
[2009-11-17T06:11:24.27] GC = 98957.411, Interval = 98957.411, Ratio = 1.000
[2009-11-17T06:11:24.27] # To suppress the following error report, specify this argument
[2009-11-17T06:11:24.27] # after -XX: or in .hotspotrc:  SuppressErrorAt=/g1CollectorPolicy.cpp:1533
[2009-11-17T06:11:24.27] #
[2009-11-17T06:11:24.27] # A fatal error has been detected by the Java Runtime Environment:
[2009-11-17T06:11:24.27] #
[2009-11-17T06:11:24.27] #  Internal Error (/tmp/jprt/P1/B/195911.ysr/source/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp:1533), pid=15253, tid=1082833216
[2009-11-17T06:11:25.27] #  Error: assert(false,"Debugging data for CR 6898948 has been dumped above")
[2009-11-17T06:11:25.27] #
[2009-11-17T06:11:25.27] # JRE version: 7.0-b76
[2009-11-17T06:11:25.27] # Java VM: OpenJDK 64-Bit Server VM (17.0-b04-2009-11-13-195911.ysr.fpdump-fastdebug mixed mode linux-amd64 compressed oops)
[2009-11-17T06:11:25.27] # An error report file with more information is saved as:
[2009-11-17T06:11:25.27] # /export/local/25677.JDK7.NIGHTLY.VM+linux-amd64_server_mixed_nsk.stack.testlist/results/ResultDir/stack018/hs_err_pid15253.log
[2009-11-17T06:11:25.27] #
[2009-11-17T06:11:25.27] # If you would like to submit a bug report, please visit:
[2009-11-17T06:11:25.27] #   http://java.sun.com/webapps/bugreport/crash.jsp
[2009-11-17T06:11:25.27] #

Comments
This looks like a clock skew/setup issue that can't be fixed in the VM. Closing.
05-11-2019

Here is the patch with the instrumentation that JDK-6898948 added and JDK-8141394 removed. diff --git a/src/share/vm/gc/g1/g1CollectorPolicy.cpp b/src/share/vm/gc/g1/g1CollectorPolicy.cpp --- a/src/share/vm/gc/g1/g1CollectorPolicy.cpp +++ b/src/share/vm/gc/g1/g1CollectorPolicy.cpp @@ -976,6 +976,19 @@ _recent_avg_pause_time_ratio = _recent_gc_times_ms->sum()/interval_ms; if (recent_avg_pause_time_ratio() < 0.0 || (recent_avg_pause_time_ratio() - 1.0 > 0.0)) { +#ifndef PRODUCT + // Dump info to allow post-facto debugging + gclog_or_tty->print_cr("recent_avg_pause_time_ratio() out of bounds"); + gclog_or_tty->print_cr("-------------------------------------------"); + gclog_or_tty->print_cr("Recent GC Times (ms):"); + _recent_gc_times_ms->dump(); + gclog_or_tty->print_cr("(End Time=%3.3f) Recent GC End Times (s):", end_time_sec); + _recent_prev_end_times_for_all_gcs_sec->dump(); + gclog_or_tty->print_cr("GC = %3.3f, Interval = %3.3f, Ratio = %3.3f", + _recent_gc_times_ms->sum(), interval_ms, recent_avg_pause_time_ratio()); + // In debug mode, terminate the JVM if the user wants to debug at this point. + assert(!G1FailOnFPError, "Debugging data for CR 6898948 has been dumped above"); +#endif // !PRODUCT // Clip ratio between 0.0 and 1.0, and continue. This will be fixed in // CR 6902692 by redoing the manner in which the ratio is incrementally computed. if (_recent_avg_pause_time_ratio < 0.0) { diff --git a/src/share/vm/gc/g1/g1_globals.hpp b/src/share/vm/gc/g1/g1_globals.hpp --- a/src/share/vm/gc/g1/g1_globals.hpp +++ b/src/share/vm/gc/g1/g1_globals.hpp @@ -208,6 +208,10 @@ develop(bool, G1HRRSFlushLogBuffersOnVerify, false, \ "Forces flushing of log buffers before verification.") \ \ + develop(bool, G1FailOnFPError, false, \ + "When set, G1 will fail when it encounters an FP 'error', " \ + "so as to allow debugging") \ + \ product(size_t, G1HeapRegionSize, 0, \ "Size of the G1 regions.") \ range(0, 32*M) \
04-11-2015

targeting this for hs25 for now. I think we should do some more investigation before we defer it.
03-05-2013

EVALUATION I am marking this a P4 because the effect has not been seen in practice except on Linux/x64 machines that display the "jerky" time progress behaviour described in the "Description" section. (It's a day-zero bug.)
18-11-2009