JDK-8025564 : gc/memory/UniThread/Linear1 times out during heap verification
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: hs25,8u91,9
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2013-09-27
  • Updated: 2016-03-23
  • Resolved: 2014-09-08
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 9
9 b32Fixed
Related Reports
Duplicate :  
Relates :  
Relates :  
Description
The mentioned test timed out during nightlies. pstack output indicates that the GC (G1) seems to be busy verifying the heap at VM exit.

I.e. the stack for the thread exiting:

 fffffff7503052c8 __1cCosNPlatformEventEpark6M_v_ (100114900, 179000, 100114938, 100114950, fffffff750b2faa1, fffffff750e27860) + 140
 fffffff7502731dc __1cHMonitorFIWait6MpnGThread_l_i_ (100b399b8, 103d63000, 0, b7018, 103d63000, 100114900) + d4
 fffffff750274578 __1cHMonitorEwait6Mblb_b_ (100b399b8, 103d63000, 0, 0, fffffff750aec5bb, bb348c) + 1a8
 fffffff75063b814 __1cIVMThreadXwait_for_vm_thread_exit6F_v_ (fffffff750f2d360, 156800, fffffff750f7e100, fffffff750e27860, fffffff750f7e0f9, 105b00) + 94
 fffffff750565db0 __1cHThreadsKdestroy_vm6F_b_ (fffffff750efbee4, ffffffff7dcffd75, ffffffff7dcffd38, fffffff750e27860, 103d63000, 100111ab8) + 218
 fffffff74fee5d78 jni_DestroyJavaVM (10c000, 123800, 103d63000, fffffff7509effff, fffffff750e27860, 0) + 2b0


and many other threads are busy verifying the heap, the stack trace looking something like

 fffffff74fbe52d4 __1cSG1BlockOffsetArraybMforward_to_block_containing_addr_const6kMpnIHeapWord_2pkv_2_ (0, fffffffde0ea74f8, b6e01, fffffffde0eb0a00, 28, fffffffde0ea7520) + 1f4
 fffffff74fbe3da8 __1cSG1BlockOffsetArrayRverify_for_object6kMpnIHeapWord_L_b_ (100d8e680, fffffffde0eb09e0, c, 0, 1707585, fffffff7508bfe80) + 150
 fffffff74fcca91c __1cKHeapRegionGverify6kMnMVerifyOption_pb_v_ (100d8e608, 0, ffffffff7caff9bf, ffffffff2101fd70, fffffff750931840, fffffff750edefda) + 22c
 fffffff74fc052e4 __1cTVerifyRegionClosureMdoHeapRegion6MpnKHeapRegion__b_ (ffffffff7caffb38, 100d8e608, 0, fffffff750ed2048, 0, 0) + 34

It may also be that the machine is just too loaded/slow.

ILW: H (hang?), L (only occurrence), M (workaround: do not do heap verification at end of gc)
Comments
Running the gc/vector/ObjectArrayHigh test mentioned above with a fastdebug build takes 9 minutes and 21 seconds on my workstation. A little over 1 minute is spent running the test. The rest of the time (8 minutes) is spent verifying the BOT. The reason the BOT verification takes such a long time is that for each object on the heap G1 looks up the corresponding value in the BOT and iterates forward to find the same object again. This can get very expensive since the BOT is lazily refined and may be in a state where you more or less have to iterate from the start of each heap region to find the correct object. By replacing this expensive verification with a single pass over the BOT that just verifies that the BOT has consistent values the execution time is reduced significantly. With this more explicit BOT verification the test finishes in 1 minute and 3 seconds on my workstation. Note that the requirements on the BOT are very relaxed. Basically a BOT entry can point arbitrarily far back in a HeapRegion as long as it points to an object. From that object you can then iterate forward to find the last object that you are interested in. This means that there are few values that can be considered wrong in the BOT.
02-09-2014

From description of JDK-8028425: The G1 verification can be very slow when running some tests. Compared to Parallel Scavenge, G1 runs DoubleArrayLow, FloatArrayHigh and FloatArrayLow in 21m14s compared to 1m8.7s for PS. After adding some logging, the main issue seems to be: // If it returns false, verify_for_object() will output the // appropriate messasge. if (do_bot_verify && !_offsets.verify_for_object(p, obj_size)) { *failures = true; return; } in HeapRegion::verify.
18-11-2013

Changing ILW to MLM -> P4: Impact M (test timeout), L (only occurrence), M (workaround very simple: no verification or no debug build)
17-10-2013

RULE gc/memory/UniThread/Circular1 Timeout none http://aurora.ru.oracle.com/slot-gw/300510.JAVASE.NIGHTLY.VM.GC_Baseline-G1-Stable.2013-10-02-80/results/ResultDir/Circular1/Circular1.log
04-10-2013

Bisection shows the problem started occurring with the fix for JDK-8019902, which changed the way the G1 region size is chosen, usually making it larger (on one machine where I can reproduce this, the default region size went from 1M to 4M). Verification of the larger regions takes an inordinate amount of time.
01-10-2013

Reproducible starting with jdk8-b107, but only on certain sparc machines. So far not reproducible with jdk8-b106.
28-09-2013