JDK-7005799 : G1: nsk/regression/b6186200 fails with OOME
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: hs20
  • Priority: P4
  • Status: Closed
  • Resolution: Cannot Reproduce
  • OS: generic
  • CPU: generic
  • Submitted: 2010-12-09
  • Updated: 2013-09-18
  • Resolved: 2011-03-08
JDK 7 : Resolved
Description
It seems that when the Java heap is exhausted, G1 does not wait for native code holding a JNI critical region to release it (so that a GC can run) before throwing an OutOfMemoryError from Java code. This is a regression from JDK 6.

Please see comments for more details.
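For context, the scenario behind 6186200 is native code that pins a Java array through a JNI critical region while other threads allocate until the heap is exhausted. Below is a minimal, hypothetical JNI sketch of that pattern; the class and function names are illustrative and are not taken from nsk/regression/b6186200.

#include <jni.h>

// Hypothetical native method: pin an int[] with a JNI critical region and
// keep the region held for a while. While a thread is between
// GetPrimitiveArrayCritical and ReleasePrimitiveArrayCritical, the GC locker
// is active and a garbage collection cannot start.
extern "C" JNIEXPORT jlong JNICALL
Java_CriticalHolder_touchCritical(JNIEnv* env, jclass, jintArray arr, jint passes) {
  jsize len = env->GetArrayLength(arr);

  // Entering the critical region pins the array and activates the GC locker.
  jint* data = static_cast<jint*>(env->GetPrimitiveArrayCritical(arr, NULL));
  if (data == NULL) {
    return 0;  // the JNI layer itself could not pin the array
  }

  // Hold the region for a while by repeatedly walking the array while other
  // Java threads allocate and exhaust the heap.
  jlong sum = 0;
  for (jint p = 0; p < passes; p++) {
    for (jsize i = 0; i < len; i++) {
      sum += data[i];
    }
  }

  // Releasing the region deactivates the GC locker, allowing a pending GC to run.
  env->ReleasePrimitiveArrayCritical(arr, data, 0);
  return sum;
}

The expected behavior (and, per the description, the JDK 6 behavior) is that a Java thread whose allocation fails while the critical region is held stalls and retries once the region is released and a GC has run, rather than receiving an OutOfMemoryError; the report here is that G1 throws the OOME without waiting.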
The OOM can also cause a crash with the following message:
#  Out of Memory Error (allocation.cpp:211), pid=17209, tid=2364230560
#
# JRE version: 7.0-b126
# Java VM: Java HotSpot(TM) Server VM (20.0-b06 compiled mode linux-x86 )

---------------  T H R E A D  ---------------

Current thread (0x08492400):  JavaThread "Thread-0" [_thread_in_vm, id=17233, stack(0x8ce64000,0x8ceb5000)]

Stack: [0x8ce64000,0x8ceb5000],  sp=0x8ceb3a20,  free space=318k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x686e1f]

# Host info: Linux vmsqe-p4-14 2.6.16.60-0.42.10-smp #1 SMP Tue Apr 27 05:11:27 UTC 2010 i686 i686 i386 GNU/Linux

Comments
EVALUATION I'm going to close this as non-reproducible. The two different OOMs that were reported do not occur any more with the latest HotSpot. But here's a bit more on why we were running out of swap. While allocating a humongous object, G1 has a pathology that causes it to constantly make unsuccessful collection attempts when the GC locker is active for a long period of time. These repeated collection attempts first try to take the pending list lock, an operation that creates a Handle. Those Handles are only collected after we exit the mem_allocate() method, so during the unsuccessful collection attempts the Handle area for that thread grows and grows and eventually fills up the swap. Even though this is unlikely to happen in practice, it will be fixed by 7018286. Also, I'm not 100% sure why I cannot reproduce this OOM with the latest workspace, since it still exhibits this pathology; in any case 7018286 will eliminate it.
08-03-2011
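The pathology described in the evaluation above boils down to a retry loop in which every failed collection attempt takes the pending list lock and thereby creates one more Handle that is only released when mem_allocate() returns. The following is a simplified, self-contained model of that loop shape, not the actual HotSpot source; all of the helper names (attempt_allocation, gc_locker_active, pending_list_lock, handle_area) are stand-ins introduced for illustration.

#include <cstddef>
#include <vector>

// Stand-ins for VM state: in HotSpot these correspond to the GC locker, the
// per-thread HandleArea, and the allocation / collection entry points.
static bool gc_locker_active = true;              // a JNI critical region is held
static std::vector<const void*> handle_area;      // freed only after mem_allocate() exits

static void* attempt_allocation(std::size_t /*words*/) { return NULL; }  // heap is full
static void  attempt_collection() { /* skipped while the GC locker is active */ }
static const void* pending_list_lock() { static int lock; return &lock; }

// Shape of the humongous-allocation retry loop as described above (a model,
// not the real G1CollectedHeap::mem_allocate()). With the locker permanently
// active this loop never terminates, which is exactly the pathology: each
// pass adds a Handle that nothing releases, so the footprint grows until
// swap is exhausted.
void* mem_allocate_model(std::size_t word_size) {
  for (;;) {
    void* result = attempt_allocation(word_size);
    if (result != NULL) {
      return result;                 // only here would the handle area be freed
    }
    if (gc_locker_active) {
      // Taking the pending list lock creates one more Handle in the thread's
      // handle area (modeled as a push_back that is never popped).
      handle_area.push_back(pending_list_lock());
      attempt_collection();          // cannot succeed: the GC locker is active
      continue;                      // retry instead of stalling (see 7018286)
    }
    return NULL;                     // genuine out-of-memory
  }
}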

EVALUATION Re Eval Comment #1: Indeed, the fix for 6700062 (G1: Extend 6186200 to G1) was done as part of the changes for 6974966 (G1: unnecessary direct-to-old allocations), which is why the former was closed as a duplicate of the latter around 11/2010.
08-02-2011

EVALUATION I went back and tried older JDKs. I can reproduce the "OOM: Java heap space" on a linux-amd64 and a solaris-i586 machine with JDK 7 b119, b120, b121, and b122. The issue seems to disappear in b123 (and the latest HotSpot, which is one of the reasons why I couldn't reproduce it before). b122 has hs20 b03, b123 has hs20 b04. A big change in G1 between these two releases was the new slow allocation path (see 6974966: G1: unnecessary direct-to-old allocations) that went into b04. It was known that the old allocation path had some corner cases, especially involving the GC locker, and it might have returned a NULL value (causing an OOM) even though either another GC or another allocation should have been attempted. I'm pretty sure that this explains these OOMs and the issue seems to have already been resolved.
08-02-2011

EVALUATION Regarding the second type of OOM that was reported (the one that seemed to report the JVM running out of swap). I reproduced it on a solaris-i586 machine with b123. The stack trace is the following:

=>[1] __read(0x0, 0xb34fc350, 0x10, 0xfee8c309), at 0xfeea1e75
  [2] read(0x0, 0xb34fc350, 0x10, 0xfe9ab795), at 0xfee8c3cd
  [3] os::message_box(0xfec1eff5, 0xfecd4aac, 0xb34fc3b8, 0xfeb10b77), at 0xfe9aba68
  [4] VMError::show_message_box(0xb34fc480, 0xfecd4aac, 0x7d0, 0xfeb106ed), at 0xfeb10ba7
  [5] VMError::report_and_die(0xb34fc480, 0xfeb4c89c, 0xb34fc4c8, 0xfe5ff724), at 0xfeb1070f
  [6] report_vm_out_of_memory(0xfeb4c8b0, 0xd3, 0x7ff4, 0xfeb4c89c), at 0xfe5ff749
  [7] Arena::grow(0x9223bb0, 0x4, 0xb34fc558, 0xfe1c7d8a), at 0xfe108cfd
  [8] instanceRefKlass::acquire_pending_list_lock(0xb34fc848), at 0xfe1c7db1
  [9] VM_GC_Operation::doit_prologue(0xb34fc830, 0x0, 0x0, 0xfe1c7364), at 0xfeb10e09
  [10] VMThread::execute(0xb34fc830, 0x40690000, 0x13, 0x1), at 0xfe1c74e9
  [11] G1CollectedHeap::mem_allocate(0x806cb28, 0xf4244, 0x1, 0x0, 0xb34fc8cc, 0xf4244, 0x0, 0x2), at 0xfe66294c
  [12] typeArrayKlass::allocate(0xf6600ad8, 0xf4240, 0x925cc00, 0xfe0da9ad), at 0xfe0d9447
  [13] InterpreterRuntime::newarray(0x925cc00, 0xa, 0xf4240), at 0xfe0da9d0
  [14] 0xfaa129b3(0x0, 0xba5408b8, 0xfec74000, 0x1f80, 0xfec74000, 0x925cc00), at 0xfaa129b3
  [15] 0xfaa003de(0xb34fc9e0, 0xb34fcbb4, 0xa, 0xf67d2468, 0xfaa09160, 0xb34fcad4, 0x1, 0x925cc00), at 0xfaa003de
  [16] JavaCalls::call_helper(0xb34fcbb0, 0xb34fcac8, 0xb34fcad0, 0x925cc00, 0xb34fcb20, 0x925ce98), at 0xfe15fae6
  [17] os::os_exception_wrapper(0xfe15f5a8, 0xb34fcbb0, 0xb34fcac8, 0xb34fcad0, 0x925cc00, 0x925cc00, 0x0, 0xfe21e718), at 0xfe15fea4
  [18] JavaCalls::call_virtual(0xb34fcbb0, 0x925ce98, 0x925ce9c, 0xfecd7320, 0xfecd7658), at 0xfe21e814
  [19] thread_entry(0x925cc00), at 0xfe23b5d6
  [20] JavaThread::run(0x925cc00, 0xfecd2c40, 0x0, 0xfe9a3fc6), at 0xfe235f8d
  [21] java_start(0x925cc00, 0xfef20000, 0xb34fcfec, 0xfee9cd2e), at 0xfe9a4438
  [22] _thrp_setup(0xfdcfc200), at 0xfee9cd66
  [23] _lwp_start(0x0, 0xb34fc350, 0x10, 0xfee8c309, 0x4e, 0xfec74000), at 0xfee9cff0

Interestingly, it looks as if the JVM is trying to exit when this happens:

=>[1] ___lwp_cond_wait(0x8067248, 0x8067230, 0x0, 0x0), at 0xfeea23e5
  [2] _lwp_cond_wait(0x8067248, 0x8067230, 0x0, 0x1), at 0xfee77f68
  [3] os::PlatformEvent::park(0x8067200, 0xfebea3b6, 0x50, 0xfe977f09), at 0xfe9ae689
  [4] Monitor::IWait(0x80655f0, 0x8066400, 0x0, 0x0), at 0xfe9781ab
  [5] Monitor::wait(0x80655f0, 0x0, 0x0, 0x0), at 0xfe97b2af
  [6] VMThread::execute(0xfdd5eaf0, 0xfec74000, 0xfdd5eb28, 0xfe2cffe0), at 0xfe1c76cb
  [7] vm_exit(0x5f, 0x8066400, 0xf67d3fb0, 0xfe2c7781), at 0xfe2d004f
  [8] JVM_Halt(0x5f, 0xf67d3fb0, 0xf67d3fb0, 0xfdc837db), at 0xfe2c7848
  [9] Java_java_lang_Shutdown_halt0(), at 0xfdc837ed
  [10] 0xfaa0a1d2(0xf67d46a8, 0xfaa08179, 0x5f, 0x5, 0xba54a658, 0xfdd5ebdc), at 0xfaa0a1d2
  [11] 0xfaa0310d(0x0, 0xba54a658, 0x5f, 0x5, 0xf67d46a8, 0xfdd5ec18), at 0xfaa0310d
  [12] 0xfaa0310d(0x0, 0x0, 0xf67d46a8, 0x0, 0x5f, 0xfdd5ec5c), at 0xfaa0310d
  ...

I can reproduce this with JDK 7 b123 to b127. In fact, when I look at the resident size of the JVM it's around 1400M half-way through the run and 1652M at the end of the run, but the JVM always hangs at the end, while its resident size keeps increasing up to the point where it runs out of swap. The issue disappears with JDK 7 b128.
With b128, keeping an eye on the JVM resident size, it's around 1000M half-way through the run and 1008M at the end of the run, and the JVM does not hang but exits properly. b128 was the release that got the first hs20 b07, which has the zero-filling thread removal (6977804: G1: remove the zero-filling thread). But I can't immediately see how the zero-filling thread removal would resolve this.
08-02-2011