JDK-8205499 : C1 temporary code buffers are not removed with -XX:+UseDynamicNumberOfCompilerThreads
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 11
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2018-06-22
  • Updated: 2018-08-20
  • Resolved: 2018-06-29
Fixed in:
  • JDK 11: 11 b20 (Fixed)
  • JDK 12: 12 (Fixed)
Description
The stress test started failing with 

>Java HotSpot(TM) 64-Bit Server VM warning: CodeHeap 'non-nmethods' is full. Compiler has been disabled.

Kitchensink last passed with JDK 11 b9. Starting with JDK 11 b13, it fails with the following output:

[glue.process.err] [stress.process.err] Java HotSpot(TM) 64-Bit Server VM warning: CodeHeap 'non-nmethods' is full. Compiler has been disabled.
[glue.process.err] [stress.process.err] Java HotSpot(TM) 64-Bit Server VM warning: Try increasing the code heap size using -XX:NonNMethodCodeHeapSize=
[glue.process.err] [stress.process.err] Java HotSpot(TM) 64-Bit Server VM warning: CodeHeap 'non-nmethods' is full. Compiler has been disabled.
[glue.process.err] [stress.process.err] Java HotSpot(TM) 64-Bit Server VM warning: Try increasing the code heap size using -XX:NonNMethodCodeHeapSize=
[glue.process.err] [stress.process.err] CodeHeap 'non-profiled nmethods': size=119172Kb used=110943Kb max_used=113455Kb free=8228Kb
[glue.process.err] [stress.process.err]  bounds [0x00007fb2787d1000, 0x00007fb27fc32000, 0x00007fb27fc32000]
[glue.process.err] [stress.process.err] CodeHeap 'profiled nmethods': size=119172Kb used=105068Kb max_used=105068Kb free=14103Kb
[glue.process.err] [stress.process.err]  bounds [0x00007fb271370000, 0x00007fb2787d1000, 0x00007fb2787d1000]
[glue.process.err] [stress.process.err] CodeHeap 'non-nmethods': size=7416Kb used=7032Kb max_used=7222Kb free=384Kb
[glue.process.err] [stress.process.err]  bounds [0x00007fb270c32000, 0x00007fb271370000, 0x00007fb271370000]
[glue.process.err] [stress.process.err]  total_blobs=690681 nmethods=7090 adapters=773
[glue.process.err] [stress.process.err]  compilation: disabled (not enough contiguous free space left)
[glue.process.err] [stress.process.err]               stopped_count=1, restarted_count=0
[glue.process.err] [stress.process.err]  full_count=0
[glue.process.err] [stress.process.err] Java HotSpot(TM) 64-Bit Server VM warning: Initialization of C1 CompilerThread2 thread failed (no space to run compilers)
[glue.process.err] [stress.process.err] CodeHeap 'non-profiled nmethods': size=119172Kb used=110944Kb max_used=113455Kb free=8227Kb
[glue.process.err] [stress.process.err]  bounds [0x00007fb2787d1000, 0x00007fb27fc32000, 0x00007fb27fc32000]
[glue.process.err] [stress.process.err] CodeHeap 'profiled nmethods': size=119172Kb used=105068Kb max_used=105068Kb free=14103Kb
[glue.process.err] [stress.process.err]  bounds [0x00007fb271370000, 0x00007fb2787d1000, 0x00007fb2787d1000]
[glue.process.err] [stress.process.err] CodeHeap 'non-nmethods': size=7416Kb used=7032Kb max_used=7222Kb free=384Kb
[glue.process.err] [stress.process.err]  bounds [0x00007fb270c32000, 0x00007fb271370000, 0x00007fb271370000]
[glue.process.err] [stress.process.err]  total_blobs=690682 nmethods=7091 adapters=773
[glue.process.err] [stress.process.err]  compilation: disabled (not enough contiguous free space left)
[glue.process.err] [stress.process.err]               stopped_count=1, restarted_count=0
[glue.process.err] [stress.process.err]  full_count=1
[glue.process.err] [stress.process.err] Java HotSpot(TM) 64-Bit Server VM warning: Initialization of C1 CompilerThread3 thread failed (no space to run compilers)
Comments
I was able to reproduce this by slightly modifying the code such that compiler threads are aggressively added and removed. The code cache quickly fills up because temporary buffers are not removed and we get a warning:

Java HotSpot(TM) 64-Bit Server VM warning: CodeCache is full. Compiler has been disabled.
Java HotSpot(TM) 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize=
CodeCache: size=51200Kb used=50978Kb max_used=50990Kb free=221Kb
 bounds [0x00007fd068e00000, 0x00007fd06c000000, 0x00007fd06c000000]
 total_blobs=1284 nmethods=237 adapters=787
 compilation: disabled (not enough contiguous free space left)
              stopped_count=1, restarted_count=0
 full_count=0
Java HotSpot(TM) 64-Bit Server VM warning: Initialization of C1 CompilerThread3 thread failed (no space to run compilers)
Java HotSpot(TM) 64-Bit Server VM warning: Initialization of C1 CompilerThread5 thread failed (no space to run compilers)
Java HotSpot(TM) 64-Bit Server VM warning: Initialization of C1 CompilerThread4 thread failed (no space to run compilers)

I've verified that the following fix solves this:
http://cr.openjdk.java.net/~thartmann/8205499/webrev.00/

I was not able to create a stable regression test.
28-06-2018

Yes, that could work as well. I'll have a look (assigning to me for now).
26-06-2018

Should it be in the destructor ~CompilerThread()? Can we hold the lock there?
26-06-2018

I think this should fix the problem:

diff -r 00c4edaf2017 src/hotspot/share/compiler/compileBroker.cpp
--- a/src/hotspot/share/compiler/compileBroker.cpp Mon Jun 25 10:34:46 2018 -0400
+++ b/src/hotspot/share/compiler/compileBroker.cpp Tue Jun 26 18:15:06 2018 +0200
@@ -1774,6 +1774,11 @@
           tty->print_cr("Removing compiler thread %s after " JLONG_FORMAT " ms idle time",
                         thread->name(), thread->idle_time_millis());
         }
+        // Free buffer blob, if allocated
+        if (thread->get_buffer_blob() != NULL) {
+          MutexLockerEx mu(CodeCache_lock, Mutex::_no_safepoint_check_flag);
+          CodeCache::free(thread->get_buffer_blob());
+        }
         return; // Stop this thread.
       }
     }
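To illustrate why the missing free exhausts the code cache, here is a minimal, self-contained sketch. It is not HotSpot code: ToyCodeCache, run_thread_cycles, and all sizes are hypothetical names and values chosen for illustration. It only models the bookkeeping: every thread start reserves a temporary buffer, and the buffer is returned on thread removal only when free_on_exit is set.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for a code cache with a fixed capacity in KB.
// Hypothetical model for illustration; not the HotSpot CodeCache.
class ToyCodeCache {
 public:
  explicit ToyCodeCache(std::size_t capacity_kb)
      : capacity_kb_(capacity_kb), free_kb_(capacity_kb) {}

  // Reserves size_kb and returns true, or returns false if it does not fit.
  bool allocate(std::size_t size_kb) {
    if (size_kb > free_kb_) return false;
    free_kb_ -= size_kb;
    return true;
  }

  // Returns a previously reserved buffer to the cache.
  void release(std::size_t size_kb) {
    assert(free_kb_ + size_kb <= capacity_kb_);
    free_kb_ += size_kb;
  }

  std::size_t free_kb() const { return free_kb_; }

 private:
  std::size_t capacity_kb_;
  std::size_t free_kb_;
};

// Starts and removes one compiler thread up to `cycles` times. Each start
// reserves a temporary buffer of buffer_kb; the buffer is released on
// removal only when free_on_exit is true. Stops at the first failed
// reservation and returns the number of successful thread starts.
int run_thread_cycles(ToyCodeCache& cache, int cycles,
                      std::size_t buffer_kb, bool free_on_exit) {
  int started = 0;
  for (int i = 0; i < cycles; ++i) {
    if (!cache.allocate(buffer_kb)) break;  // models "no space to run compilers"
    ++started;
    if (free_on_exit) cache.release(buffer_kb);
  }
  return started;
}
```

With a 1000 KB toy cache and 100 KB buffers, the leaking variant stops after ten thread starts, while the variant that frees on exit can add and remove threads indefinitely.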
26-06-2018

Yes, that could very well be. At least I don't see any code that de-allocates the CodeBuffers (usually, we only do it in CompileBroker::shutdown_compiler_runtime()).
26-06-2018

JDK-8198756 should not create more threads than before. The upper limit is the same. I am wondering if we forgot to free the C1 temporary CodeBuffer when unused C1 compiler threads are removed.
26-06-2018

Martin, you've implemented JDK-8198756. Could you please have a look? Thanks.
26-06-2018

I've checked the code: creation of new compiler threads after startup was introduced (and can only happen) with UseDynamicNumberOfCompilerThreads (JDK-8198756). Probably we are creating too many compiler threads and then run out of code cache space for the corresponding temporary buffers.

[~lmesnik], could you please run with -XX:+TraceCompilerThreads to get more debug information?

Updated ILW = Compilation is disabled due to code cache exhaustion (not enough space for compiler thread scratch buffers), with long running stress test on server with many cores, -XX:-UseDynamicNumberOfCompilerThreads = HMM = P2

We could also make this a P3, assuming that we will recover as soon as the sweeper frees up space in the code cache, but let's treat it as P2 for now because the Kitchensink test is part of the Release Criteria for JDK 11.
26-06-2018

The problem seems to be that there is not enough space for a C1 temporary CodeBuffer in the code cache. A full NonNMethod code cache segment alone cannot be the cause, because the runtime would fall back to the other segments without issuing a warning (so increasing NonNMethodCodeHeapSize does not help, and that's expected; you would need to increase ReservedCodeCacheSize). In the failing case there must have been not enough space in *any* segment. Looking at the log, there should have been enough space in one of the other segments (for example, 14103Kb in the profiled nmethods segment), so it's weird that we fail to allocate.
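The fallback behavior described in this comment can be sketched as follows. This is a simplified model under stated assumptions, not the HotSpot allocator: ToySegment and allocate_with_fallback are hypothetical names, the fallback order (preferred segment first, then the others in list order) is assumed, and the model tracks only total free space, whereas the log reports a lack of contiguous free space.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one code heap segment.
struct ToySegment {
  std::string name;
  std::size_t free_kb;
};

// Tries the preferred segment first, then every other segment in list
// order (an assumed order, for illustration only). Returns the index of
// the segment that served the request, or -1 if no segment had room.
int allocate_with_fallback(std::vector<ToySegment>& segments,
                           std::size_t preferred, std::size_t size_kb) {
  if (preferred >= segments.size()) return -1;
  if (segments[preferred].free_kb >= size_kb) {
    segments[preferred].free_kb -= size_kb;
    return static_cast<int>(preferred);
  }
  for (std::size_t i = 0; i < segments.size(); ++i) {
    if (i == preferred) continue;
    if (segments[i].free_kb >= size_kb) {
      segments[i].free_kb -= size_kb;
      return static_cast<int>(i);
    }
  }
  return -1;  // only now would the allocation fail
}
```

Seeded with the free sizes from the log (384Kb, 14103Kb, 8227Kb) and a hypothetical 1024 KB request against the non-nmethods segment, this model serves the request from the profiled segment. That matches the comment's point that the reported failure is unexpected if only total free space mattered.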
26-06-2018

Updated ILW = HMH = P1 Can this be a non-compiler, runtime issue!?
26-06-2018

The bug is consistently reproduced with JDK 11 b18. Increasing -XX:NonNMethodCodeHeapSize does not help. The link to the latest passing build is here: http://sqeweb.us.oracle.com//net/scaab055//space/aurora/sca/data/vmsqe/results/kitchensink/9/linux-x64/results/run_1/JTwork/applications/kitchensink/Kitchensink14D/ I think the priority of this issue should be raised, since it is a regression and impacts the release criteria.
25-06-2018

I tried to increase codemethods size to 50M but it still fails. I ran Kitchensink (KS) outside of Mach5. I uploaded the results of running Kitchensink here: http://sqeweb.us.oracle.com//net/scaab055//space/aurora/sca/data/vmsqe/results/kitchensink/18/linux-x64/results/run_1/tier1/JTwork/applications/kitchensink/Kitchensink14D/ Passing the Kitchensink 14-day test is part of the Release Criteria for JDK 11, so I believe this bug should be fixed in 11.
25-06-2018

initial ILW = kitchensink failure with CodeHeap 'non-nmethods' full; with kitchensink, rare - 'only on the 32core / 64G JavaHeap while on small typical VM it still works'; increase code heap size using -XX:NonNMethodCodeHeapSize! = HLM = P3 (For now targeting as tbd_feature. Please edit if required.)
25-06-2018