JDK-8048556 : Unnecessary GCLocker-initiated young GCs
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 7u60,8,8u20,8u40,9,11,13
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2014-06-27
  • Updated: 2022-06-27
  • Resolved: 2019-07-31
Fixed in:
  • JDK 11: 11.0.6-oracle (Fixed)
  • JDK 13: 13.0.4 (Fixed)
  • JDK 14: 14 b08 (Fixed)
  • JDK 8: 8u221 (Fixed)
  • Other: openjdk8u232 (Fixed)
Description
We are seeing very strange behavior in some GC logs (and sometimes said strange behavior happens quite frequently). The common pattern is that a GC locker-initiated young GC happens immediately after another young GC. The GC locker-initiated one seems completely unnecessary given that the eden is either empty or has very low occupancy (<10%).

Here's an example of this (with ParNew):

{Heap before GC invocations=2 (full 0):
 par new generation   total 471872K, used 433003K [0x00000007bae00000, 0x00000007dae00000, 0x00000007dae00000)
  eden space 419456K, 100% used [0x00000007bae00000, 0x00000007d47a0000, 0x00000007d47a0000)
  from space 52416K,  25% used [0x00000007d47a0000, 0x00000007d54dacb0, 0x00000007d7ad0000)
  to   space 52416K,   0% used [0x00000007d7ad0000, 0x00000007d7ad0000, 0x00000007dae00000)
 tenured generation   total 524288K, used 0K [0x00000007dae00000, 0x00000007fae00000, 0x00000007fae00000)
   the space 524288K,   0% used [0x00000007dae00000, 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000)
 compacting perm gen  total 21248K, used 2549K [0x00000007fae00000, 0x00000007fc2c0000, 0x0000000800000000)
   the space 21248K,  12% used [0x00000007fae00000, 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000)
No shared spaces configured.
1.119: [GC (Allocation Failure)[ParNew: 433003K->15843K(471872K), 0.0103090 secs] 433003K->15843K(996160K), 0.0103320 secs] [Times: user=0.03 sys=0.00, real=0.01 secs] 
Heap after GC invocations=3 (full 0):
 par new generation   total 471872K, used 15843K [0x00000007bae00000, 0x00000007dae00000, 0x00000007dae00000)
  eden space 419456K,   0% used [0x00000007bae00000, 0x00000007bae00000, 0x00000007d47a0000)
  from space 52416K,  30% used [0x00000007d7ad0000, 0x00000007d8a48c88, 0x00000007dae00000)
  to   space 52416K,   0% used [0x00000007d47a0000, 0x00000007d47a0000, 0x00000007d7ad0000)
 tenured generation   total 524288K, used 0K [0x00000007dae00000, 0x00000007fae00000, 0x00000007fae00000)
   the space 524288K,   0% used [0x00000007dae00000, 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000)
 compacting perm gen  total 21248K, used 2549K [0x00000007fae00000, 0x00000007fc2c0000, 0x0000000800000000)
   the space 21248K,  12% used [0x00000007fae00000, 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000)
No shared spaces configured.
}
{Heap before GC invocations=3 (full 0):
 par new generation   total 471872K, used 24002K [0x00000007bae00000, 0x00000007dae00000, 0x00000007dae00000)
  eden space 419456K,   1% used [0x00000007bae00000, 0x00000007bb5f7c50, 0x00000007d47a0000)
  from space 52416K,  30% used [0x00000007d7ad0000, 0x00000007d8a48c88, 0x00000007dae00000)
  to   space 52416K,   0% used [0x00000007d47a0000, 0x00000007d47a0000, 0x00000007d7ad0000)
 tenured generation   total 524288K, used 0K [0x00000007dae00000, 0x00000007fae00000, 0x00000007fae00000)
   the space 524288K,   0% used [0x00000007dae00000, 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000)
 compacting perm gen  total 21248K, used 2549K [0x00000007fae00000, 0x00000007fc2c0000, 0x0000000800000000)
   the space 21248K,  12% used [0x00000007fae00000, 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000)
No shared spaces configured.
1.130: [GC (GCLocker Initiated GC)[ParNew: 24002K->12748K(471872K), 0.0123930 secs] 24002K->12748K(996160K), 0.0124130 secs] [Times: user=0.04 sys=0.01, real=0.01 secs] 
Heap after GC invocations=4 (full 0):
 par new generation   total 471872K, used 12748K [0x00000007bae00000, 0x00000007dae00000, 0x00000007dae00000)
  eden space 419456K,   0% used [0x00000007bae00000, 0x00000007bae00000, 0x00000007d47a0000)
  from space 52416K,  24% used [0x00000007d47a0000, 0x00000007d5413320, 0x00000007d7ad0000)
  to   space 52416K,   0% used [0x00000007d7ad0000, 0x00000007d7ad0000, 0x00000007dae00000)
 tenured generation   total 524288K, used 0K [0x00000007dae00000, 0x00000007fae00000, 0x00000007fae00000)
   the space 524288K,   0% used [0x00000007dae00000, 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000)
 compacting perm gen  total 21248K, used 2549K [0x00000007fae00000, 0x00000007fc2c0000, 0x0000000800000000)
   the space 21248K,  12% used [0x00000007fae00000, 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000)
No shared spaces configured.
}

Notice that:

* The timestamp of the second GC (1.130) is almost equal to the timestamp of the first GC plus the duration of the first GC (1.119 + 0.0103320 = 1.1293). In this test young GCs normally happen at a frequency of one every 100ms-110ms or so.
* The eden at the start of the second GC is almost empty (1% occupancy). We've also seen it very often with a completely empty eden.
* (the big hint) The second GC is GClocker-initiated.

This happens most often with ParNew (in some cases, more than 30% of the GCs are these unnecessary ones). It also happens with ParallelGC, but less frequently (maybe 1%-1.5% of the GCs are unnecessary). I was unable to reproduce it with G1.

I can reproduce it with the latest JDK 7, JDK 8, and also the latest hotspot-gc/hotspot workspace. I haven't tried with JDK 6.
Comments
Fix-request (13u): This fixes a problem in several GCs that can lead to unnecessary young GCs being triggered (see discussions above). The patch applies cleanly. The test passes with the patch applied. Other testing: tier1 without regressions.
19-05-2020

Fix request (8u), on behalf of John Cuthbertson (johnc@azul.com). See the 11u request above for comments. 8u email thread: https://mail.openjdk.java.net/pipermail/jdk8u-dev/2019-October/010447.html
15-10-2019

Hi Christoph, ok you're right: https://mail.openjdk.java.net/pipermail/jdk-updates-dev/2019-October/001997.html
09-10-2019

Hi Roman, in that case, I think you should post a webrev on the mailing list and get a review. Thanks Christoph
09-10-2019

Fix-request: This fixes a problem in several GCs that can lead to unnecessary young GCs being triggered (see discussions above). The patch applies mostly cleanly. Some hunks target files that have since been renamed and failed to apply, and in one case I needed to make a small adjustment to account for the changed method signature of CH::collect(). Without the patch, the supplied test case fails; with the patch applied, it passes. Other testing: tier1 and tier2 without regressions.
08-10-2019

URL: https://hg.openjdk.java.net/jdk/jdk/rev/8d3886985964 User: kbarrett Date: 2019-07-31 18:41:57 +0000
31-07-2019

This problem does arise with G1, contrary to previous discussion, and despite JDK-7143858. Using JNICriticalStressTest2 as described (300 secs, 4 allocators, 2 critical threads; a very helpful test, by the way), applying the following regex to the log output typically finds about 20 hits on my machine:

  egrep "GCLocker Initiated GC\) [1-9][0-9]?M" <logfile>

that look something like this:

  [68.978s][info ][gc ] GC(475) Pause Young (Normal) (GCLocker Initiated GC) 13M->11M(1024M) 4.102ms

I think I see how this can happen:

* gclocker is active
* thread 1 enqueues normal collection 1
* thread 2 is about to enqueue a normal collect; it has passed the GCLocker::needs_gc check
* thread 1's collection tries to run; it is blocked by check_active_before_gc, which sets GCLocker::_needs_gc to true
* thread 3 unlocks the critical section; GCLocker::jni_unlock sets active => inactive (continued later)
* thread 2 enqueues its normal collection, *not* blocked by check_active_before_gc because the gclocker is no longer active (but _needs_gc is still true), so it performs the collection
* unlock_jni_critical (continued) enqueues the gclocker collection, which runs immediately with a nearly empty younggen

So I think JDK-7143858 narrowed the race window, but did not close it.
28-07-2019

Same issue as (G1-specific) JDK-7143858, but in other collectors.
24-07-2019

Sorry for my earlier comment; this issue is not reproducible with G1. The log in my earlier comment does contain "GCLocker Initiated GC" entries, but those are not the unnecessary ones.
19-07-2019

This issue is reproducible with the latest code (jdk/jdk), verified with 13 ea b24 with G1. Below is the output; complete logs attached (gc_g1.log).

[0.063s][info][gc] Using G1
[0.122s][info][gc] Periodic GC disabled

/scratch/fairoz/JAVA/jdk13/jdk-13-ea+24/bin/java -Xms1g -Xmx1g -Xmn256m -Xlog:gc JNICriticalStressTest2 300 4 2
Running for 300 secs
Starting 4 allocating threads
Starting 2 jni critical threads
[0.534s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 257M->10M(1024M) 28.404ms
[0.594s][info][gc] GC(1) Pause Young (Normal) (GCLocker Initiated GC) 258M->11M(1024M) 13.452ms
[0.670s][info][gc] GC(2) Pause Young (Normal) (G1 Evacuation Pause) 257M->11M(1024M) 19.507ms
[0.739s][info][gc] GC(3) Pause Young (Normal) (GCLocker Initiated GC) 270M->12M(1024M) 14.990ms
[0.807s][info][gc] GC(4) Pause Young (Normal) (GCLocker Initiated GC) 270M->11M(1024M) 13.842ms
[0.858s][info][gc] GC(5) Pause Young (Normal) (GCLocker Initiated GC) 261M->12M(1024M) 13.124ms
[0.915s][info][gc] GC(6) Pause Young (Normal) (GCLocker Initiated GC) 261M->11M(1024M) 11.016ms
[0.960s][info][gc] GC(7) Pause Young (Normal) (GCLocker Initiated GC) 270M->11M(1024M) 10.185ms
[1.007s][info][gc] GC(8) Pause Young (Normal) (GCLocker Initiated GC) 259M->11M(1024M) 11.054ms
[1.058s][info][gc] GC(9) Pause Young (Normal) (G1 Evacuation Pause) 257M->11M(1024M) 12.424ms
[1.109s][info][gc] GC(10) Pause Young (Normal) (GCLocker Initiated GC) 257M->11M(1024M) 11.027ms
[1.161s][info][gc] GC(11) Pause Young (Normal) (GCLocker Initiated GC) 259M->12M(1024M) 10.671ms
[1.212s][info][gc] GC(12) Pause Young (Normal) (GCLocker Initiated GC) 261M->12M(1024M) 10.815ms
[1.262s][info][gc] GC(13) Pause Young (Normal) (GCLocker Initiated GC) 262M->12M(1024M) 12.258ms
[1.312s][info][gc] GC(14) Pause Young (Normal) (GCLocker Initiated GC) 262M->12M(1024M) 11.176ms
[1.362s][info][gc] GC(15) Pause Young (Normal) (GCLocker Initiated GC) 262M->12M(1024M) 14.546ms
[1.400s][info][gc] GC(16) Pause Young (Normal) (GCLocker Initiated GC) 266M->11M(1024M) 3.544ms
[1.439s][info][gc] GC(17) Pause Young (Normal) (GCLocker Initiated GC) 269M->11M(1024M) 3.647ms
[1.478s][info][gc] GC(18) Pause Young (Normal) (GCLocker Initiated GC) 273M->11M(1024M) 3.401ms
[1.517s][info][gc] GC(19) Pause Young (Normal) (GCLocker Initiated GC) 266M->11M(1024M) 3.405ms
[1.570s][info][gc] GC(20) Pause Young (Normal) (GCLocker Initiated GC) 266M->11M(1024M) 3.597ms
[1.620s][info][gc] GC(21) Pause Young (Normal) (GCLocker Initiated GC) 265M->11M(1024M) 3.368ms
[1.666s][info][gc] GC(22) Pause Young (Normal) (GCLocker Initiated GC) 264M->11M(1024M) 3.296ms
[1.709s][info][gc] GC(23) Pause Young (Normal) (GCLocker Initiated GC) 266M->11M(1024M) 3.519ms
[1.763s][info][gc] GC(24) Pause Young (Normal) (GCLocker Initiated GC) 265M->11M(1024M) 3.416ms
[1.819s][info][gc] GC(25) Pause Young (Normal) (GCLocker Initiated GC) 265M->11M(1024M) 3.524ms
[1.870s][info][gc] GC(26) Pause Young (Normal) (GCLocker Initiated GC) 267M->11M(1024M) 3.331ms
[1.917s][info][gc] GC(27) Pause Young (Normal) (GCLocker Initiated GC) 276M->11M(1024M) 3.280ms
[1.961s][info][gc] GC(28) Pause Young (Normal) (GCLocker Initiated GC) 267M->11M(1024M) 3.311ms
[2.004s][info][gc] GC(29) Pause Young (Normal) (GCLocker Initiated GC) 264M->11M(1024M) 3.321ms
[2.054s][info][gc] GC(30) Pause Young (Normal) (GCLocker Initiated GC) 264M->11M(1024M) 3.361ms
[2.105s][info][gc] GC(31) Pause Young (Normal) (GCLocker Initiated GC) 266M->11M(1024M) 3.160ms
[2.155s][info][gc] GC(32) Pause Young (Normal) (GCLocker Initiated GC) 266M->11M(1024M) 3.167ms
[2.200s][info][gc] GC(33) Pause Young (Normal) (GCLocker Initiated GC) 268M->11M(1024M) 3.159ms
[2.235s][info][gc] GC(34) Pause Young (Normal) (GCLocker Initiated GC) 264M->11M(1024M) 3.359ms
[2.274s][info][gc] GC(35) Pause Young (Normal) (GCLocker Initiated GC) 264M->11M(1024M) 3.347ms
[2.314s][info][gc] GC(36) Pause Young (Normal) (GCLocker Initiated GC) 264M->11M(1024M) 3.397ms
[2.356s][info][gc] GC(37) Pause Young (Normal) (GCLocker Initiated GC) 264M->11M(1024M) 3.221ms
[2.399s][info][gc] GC(38) Pause Young (Normal) (GCLocker Initiated GC) 268M->11M(1024M) 3.214ms
[2.446s][info][gc] GC(39) Pause Young (Normal) (GCLocker Initiated GC) 277M->11M(1024M) 3.395ms
[2.485s][info][gc] GC(40) Pause Young (Normal) (GCLocker Initiated GC) 263M->11M(1024M) 3.379ms

This issue is very much applicable to G1 also.
01-07-2019

Attached a second version of the test that reproduces this: JNICriticalStressTest2. It doesn't need extra JNI code; instead it uses java.util.zip.Deflater, whose deflate() method uses JNI critical sections. So it's a bit easier to use / run.
28-08-2014

It looks like Ramki's suspicions are correct. In general it seems like GCH is confused about the semantics of "full" and "complete". The boolean "full" is used in do_collection to force a collection. The boolean "complete" is used to signal that all generations will be collected. This confusion peaks at the call to increment_total_collections:

  // Increment total number of GC collections (started)
  void increment_total_collections(bool full = false);

where GCH passes the boolean "complete" as the parameter. So the _total_full_collections value is only updated when we collect the old gen, but the VM op treats the "full" flag as meaning an actual full GC.
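To make the mismatch concrete, here is a minimal standalone model of the behavior described above (invented names such as CountsModel; not the actual HotSpot sources): the "complete" value is passed where the parameter is named "full", so the full-collection counter only advances when the old gen is collected, even though the VM op later interprets "full" as a real full GC.

  // Standalone model of the counting behavior described above
  // (invented names, not HotSpot code).
  #include <iostream>

  struct CountsModel {
    unsigned total_collections = 0;
    unsigned total_full_collections = 0;
    // Mirrors the signature quoted above.
    void increment_total_collections(bool full = false) {
      ++total_collections;
      if (full) {
        ++total_full_collections;
      }
    }
  };

  int main() {
    CountsModel counts;
    counts.increment_total_collections(/* "complete" == */ false); // young-only GC
    // Prints "1 0": the total count advanced, the full count did not,
    // which is what later confuses the skip logic in the VM op.
    std::cout << counts.total_collections << ' '
              << counts.total_full_collections << std::endl;
  }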
02-07-2014

FWIW, I agree that sending the right GC counts and fixing the VM GC op not to signal "full" would be the right fix (your modified Fix 2 above). I have the vague feeling that somewhere in the hoary past "full" was used in the code because the requested size was 0, but I can't recollect the details. Did you try turning off "full" for the GC-locker-induced GCs and see if it had any unintended repercussions? (I don't have the code in front of me and haven't looked at it in a while, but I vaguely recall something fishy in the GCH policy code that did some goofy stuff with "full".) I'll try and find some time to switch in the old context to see what that history might have been, but I am sure it would be correctable and we could get rid of the "full" signal here.
02-07-2014

Ramki and Mikael, thanks for the comments. I think the source of the confusion is probably that the GCLocker initiates a young GC using collect(cause), which was really intended for full GCs triggered by System.gc() (at least AFAIK). Ramki, re: "Did you try turning off "full" for the GC-locker-induced GCs and see if it had any unintended repercussions?": the small change I mentioned above (i.e., only look at the full GC count if _gc_cause != GCCause::_gc_locker) should have a similar effect. But, yes, I'll expand the change so that the GCLocker GC VM op has full == false. I think full == true for that op is simply incorrect.
02-07-2014

On top of what I mentioned earlier, there's a second cause for this issue. A lot of the time the GC VM op for the GCLocker-initiated GC does have the correct GC count. However, at least for the GCH, that GC VM op is also tagged as "full" (why?!?!?!?!). Here's the code for skip_operation:

  POINT_1: bool skip = (_gc_count_before != Universe::heap()->total_collections());
           if (_full && skip) {
  POINT_2:   skip = (_full_gc_count_before != Universe::heap()->total_full_collections());
           }

skip is indeed set to true at POINT_1. However, because _full is true and the GC before this op was not a full GC and didn't update the total full collection count, POINT_2 turns skip back to false. This actually explains why it happens so often in the case of ParNew (even when the GC VM op has the right GC counts, skip_operation() is doing the wrong thing). *SIGH*

FWIW, out of curiosity I changed skip_operation() to:

  bool skip = (_gc_count_before != Universe::heap()->total_collections());
  if (_full && skip && _gc_cause != GCCause::_gc_locker) {
    skip = (_full_gc_count_before != Universe::heap()->total_full_collections());
  }

and the number of occurrences of the issue dropped from over 30% to under 5%. The remaining instances are caused by the race I mentioned earlier. The point where the safepoint sneaks in while jni_unlock() is running seems to be when the collect() method tries to take the Heap_lock in order to read the before / after GC counts. Because there's a safepoint in progress, the lock operation waits for it, and by the time the lock is taken and the GC counts are read, the GC has already happened and the GC counts are up-to-date, so the op is not skipped. (I hope this makes sense!!!)

Note that both fixes I suggested above still stand, with the only difference that Fix 2 will also need to ensure that the scheduled operation has full == false. Any thoughts on what the preferred fix would be? Any other ideas?
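As an illustration, here is a small self-contained model of the skip_operation() logic quoted above (invented type names; not HotSpot code). A young GC runs between the count sample and the op; because the op is tagged "full" and no full GC has happened, POINT_2 flips skip back to false and the redundant young GC proceeds.

  // Self-contained model of the skip_operation() logic quoted above
  // (invented types, not HotSpot code).
  #include <iostream>

  struct HeapCountsModel {
    unsigned total_collections = 0;
    unsigned total_full_collections = 0;
  };

  bool skip_operation(bool full, unsigned gc_count_before,
                      unsigned full_gc_count_before,
                      const HeapCountsModel& heap) {
    bool skip = (gc_count_before != heap.total_collections);          // POINT_1
    if (full && skip) {
      skip = (full_gc_count_before != heap.total_full_collections);   // POINT_2
    }
    return skip;
  }

  int main() {
    HeapCountsModel heap;
    unsigned gc_count_before      = heap.total_collections;   // sampled by the op
    unsigned full_gc_count_before = heap.total_full_collections;
    heap.total_collections++;  // a young GC runs before the GCLocker op executes
    // The op is tagged "full" and no full GC has run, so skip ends up false
    // and the unnecessary young GC goes ahead. Prints "skipped: false".
    std::cout << std::boolalpha << "skipped: "
              << skip_operation(true, gc_count_before, full_gc_count_before, heap)
              << std::endl;
  }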
30-06-2014

Fix 2: Attach the correct GC counts to the GC VM op for the GCLocker-initiated GC. We'll have to read the counts here, before the update that makes is_active() return false and allows another GC to happen:

  void GC_locker::jni_unlock(JavaThread* thread) {
    assert(thread->in_last_critical(), "should be exiting critical region");
    MutexLocker mu(JNICritical_lock);
    <<=== READ GC COUNTS HERE
    _jni_lock_count--;
    ...

The problem is that, currently, there's no collect() method on CollectedHeap that accepts GC counts. So, we'd have to add one, something like:

  collect_for_gc_locker(uint gc_count_before, uint full_gc_count_before);

and provide implementations for GCH, PS, and G1. I like this approach given that, when a GC is about to happen and it's safe for it to do so, it will go ahead, and it's the following GCLocker-initiated GC that will abort because a GC has already happened. I have a prototype of this fix and it's not really complicated, but it's a bit more extensive than Fix 1 for the reasons I've already covered.
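To illustrate the intent of Fix 2, here is a rough standalone sketch (invented names; collect_for_gc_locker is only the method proposed above, and the full-GC count is omitted for brevity): the count is sampled while the locker is still held, and the GCLocker-initiated GC aborts if any GC has run since.

  // Rough standalone sketch of the idea behind Fix 2 (not HotSpot code;
  // only the young-GC count is modeled, the full-GC count is omitted).
  #include <cstdio>

  static unsigned total_collections = 0;
  static unsigned eden_used_pct     = 100;

  static void young_gc(const char* cause) {
    std::printf("GC (%s), eden %u%% used\n", cause, eden_used_pct);
    eden_used_pct = 0;
    ++total_collections;
  }

  // Proposed entry point: skip the GC if another one already ran since
  // the counts were sampled (while JNICritical_lock was still held).
  static void collect_for_gc_locker(unsigned gc_count_before) {
    if (gc_count_before != total_collections) {
      std::printf("GCLocker-initiated GC skipped: a GC already ran\n");
      return;
    }
    young_gc("GCLocker Initiated GC");
  }

  int main() {
    // Sampled at the "READ GC COUNTS HERE" point, before _jni_lock_count--.
    unsigned gc_count_before = total_collections;
    // A racing allocation-failure GC sneaks in between POINT-A and POINT-B.
    young_gc("Allocation Failure");
    // With the pre-sampled count, the redundant GC is now skipped.
    collect_for_gc_locker(gc_count_before);
  }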
30-06-2014

I have a couple of proposed fixes for this.

Fix 1: check_active_during_gc() returns true when the GCLocker-initiated young GC is about to happen:

  - return is_active();
  + return is_active() || _doing_gc;

This will eliminate the possibility of other young GCs being successful while the GCLocker-initiated young GC is pending. This is the simplest / least disruptive (to the codebase) fix. However, it's not 100% satisfactory. Given that the young GC can safely go ahead, it's a shame to artificially abort it (especially given that it has just brought all threads to a safepoint too). And who knows when the GCLocker-initiated one will actually happen; maybe that thread gets de-scheduled for a while. That could result in a lot of artificially aborted young GCs.
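For illustration only, a minimal model of Fix 1 (invented type name, not HotSpot code) showing how a racing young GC would now bail out while the GCLocker-initiated GC is pending:

  // Minimal model of Fix 1 (invented names, not HotSpot code): while a
  // GCLocker-initiated GC is pending (_doing_gc), other GCs still see the
  // locker as "active" and bail out, at the cost of artificially aborted GCs.
  #include <cstdio>

  struct GCLockerModel {
    int  jni_lock_count = 0;
    bool doing_gc       = false;
    bool is_active() const { return jni_lock_count > 0; }
    // Proposed change: also report "active" while the pending GC hasn't run.
    bool check_active_during_gc() const { return is_active() || doing_gc; }
  };

  int main() {
    GCLockerModel locker;
    locker.doing_gc = true;  // last thread left the critical region, GC is pending
    // A racing allocation-failure GC checks the locker and bails out ("yes").
    std::printf("racing young GC bails out: %s\n",
                locker.check_active_during_gc() ? "yes" : "no");
  }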
30-06-2014

Impact = Medium, runs an unnecessary young GC
Likelihood = Medium, happens more than 1/3 of the time with the attached test case
Workaround = High, there is no known workaround
MMH => P3
30-06-2014

I attached the test (.java and .c files) I can reproduce the problem with:

  java -Xms1g -Xmx1g -Xmn256m [ -XX:+UseParNewGC | -XX:+UseParallelGC ] -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+PrintHeapAtGC JNICriticalStressTest 300 4 2

The test arguments are:

* 300 - 300 sec duration
* 4 - 4 threads doing allocations
* 2 - 2 threads doing critical sections

I've been running it on my 4-way MacBook. If you have more CPUs, try increasing the number of workers.
27-06-2014

Quick description of what I think is happening. I added a couple of extra comments with ##### to the code below:

  void GC_locker::jni_unlock(JavaThread* thread) {
    ##### ==> we assume _jni_lock_count == 1 here, hence is_active() returns true
    assert(thread->in_last_critical(), "should be exiting critical region");
    MutexLocker mu(JNICritical_lock);
    _jni_lock_count--;
    ##### ==> POINT-A : after this update is_active() will return false
    decrement_debug_jni_lock_count();
    thread->exit_critical();
    if (needs_gc() && !is_jni_active()) {
      // We're the last thread out. Cause a GC to occur.
      // GC will also check is_active, so this check is not
      // strictly needed. It's added here to make it clear that
      // the GC will NOT be performed if any other caller
      // of GC_locker::lock() still needs GC locked.
      if (!is_active()) {
        _doing_gc = true;
        {
          // Must give up the lock while at a safepoint
          MutexUnlocker munlock(JNICritical_lock);
          if (PrintJNIGCStalls && PrintGCDetails) {
            ResourceMark rm; // JavaThread::name() allocates to convert to UTF8
            gclog_or_tty->print_cr("%.3f: Thread \"%s\" is performing GC after exiting critical section, %d locked",
              gclog_or_tty->time_stamp().seconds(), Thread::current()->name(), _jni_lock_count);
          }
          ##### ==> POINT-B
          Universe::heap()->collect(GCCause::_gc_locker);
        }
        _doing_gc = false;
      }
      _needs_gc = false;
      JNICritical_lock->notify_all();
    }
  }

Scenario:

* Thread T1 is executing jni_unlock() above and it's the last thread to exit the GC locker, so it will schedule a GC VM op (call it OP1) to do a GC locker-initiated young GC.
* Thread T2 is simply allocating and, after a failed allocation due to the eden being full, it will schedule a GC VM op (call it OP2) to do a young GC to satisfy the allocation request.

If OP2 runs before T1 executes jni_unlock(), the GC that starts will first check check_active_before_gc(), notice that GC_locker::is_active() is true, and bail. So, we're good.

If OP2 runs while T1 is between POINT-A and POINT-B, before it schedules OP1, then the GC will notice that GC_locker::is_active() is false (see why in the code above) and carry on. This will block T1 from scheduling OP1. When it is eventually scheduled, OP1 will end up doing an additional GC even though another one has just happened and the eden is almost empty.

The before-GC counts added to the VM ops are not going to help here. OP2 can complete before T1 takes the Heap_lock to read the GC counts (in the collect() method), so OP1 will have current GC counts when it eventually runs.

Notice, however, that an interleaving between a safepoint and jni_unlock() is not actually supposed to happen. jni_unlock() is called from unlock_critical(), which in turn is called from jni_ReleasePrimitiveArrayCritical(). The entry point for the latter does a native -> VM transition, and while a thread is in VM mode safepoints are normally excluded (right?). However, what I think happens here is that there are implicit safepoint checks on some of the mutex lock / unlock operations (of which we do a couple). My current theory is that that's what allows OP2 to happen.
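As a minimal sketch of this scenario (a deterministic, single-threaded replay with invented names; not HotSpot code and no real concurrency), the following shows how OP2 slips into the POINT-A / POINT-B window and the GCLocker-initiated OP1 then runs against an almost empty eden:

  // Deterministic replay of the T1/T2 interleaving described above
  // (invented names, single-threaded; not HotSpot code).
  #include <cstdio>

  static int      jni_lock_count = 1;   // T1 is the last thread in a critical region
  static bool     needs_gc       = true;
  static unsigned eden_used_pct  = 100;

  static bool is_active() { return jni_lock_count > 0; }

  static void young_gc(const char* cause) {
    std::printf("GC (%s), eden %u%% used before\n", cause, eden_used_pct);
    eden_used_pct = 0;
  }

  int main() {
    // T1 reaches POINT-A: after the decrement, is_active() returns false.
    --jni_lock_count;
    // T2's OP2 runs in the window: check_active_before_gc() sees an
    // inactive locker, so the allocation-failure GC goes ahead.
    if (!is_active()) young_gc("Allocation Failure");
    // T1 reaches POINT-B and schedules OP1, which now runs against a
    // ~0% eden: the unnecessary GCLocker-initiated GC from the report.
    if (needs_gc) young_gc("GCLocker Initiated GC");
  }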
27-06-2014