JDK-8066199 : C2 escape analysis prevents VM from exiting quickly
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 8u40,9
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2014-11-30
  • Updated: 2015-06-03
  • Resolved: 2014-12-02
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
8u40Fixed 9 b43Fixed
Related Reports
Relates :  
This issue may be observed with running a simple Maven command:

 $ mvn dependency:help

Then, you may observe the significant stall after Maven completes (prints "BUILD SUCCESS" message).
A simple profiling shows we spend most of the time after the main application thread exits in C2 compiler thread doing:

ConnectionGraph::add_fields_to_worklist(FieldNode*,PointsToNode*) + 0x000000B4
ConnectionGraph::add_field_uses_to_worklist(FieldNode*) + 0x00000192
ConnectionGraph::complete_connection_graph(GrowableArray<PointsToNode*>&,GrowableArray<JavaObjectNode*>&,GrowableArray<JavaObjectNode*>&,GrowableArray<FieldNode*>&) + 0x00000488
ConnectionGraph::compute_escape() + 0x000006EF
ConnectionGraph::do_analysis(Compile*,PhaseIterGVN*) + 0x0000010C
Compile::Optimize() + 0x000010BA
Compile::Compile(ciEnv*,C2Compiler*,ciMethod*,int,bool,bool,bool) + 0x00001088
C2Compiler::compile_method(ciEnv*,ciMethod*,int) + 0x000001A3
CompileBroker::invoke_compiler_on_method(CompileTask*) + 0x000008AF
CompileBroker::compiler_thread_loop() + 0x000004E3
JavaThread::thread_main_inner() + 0x000000E9
java_start(Thread*) + 0x000000EF
collector_root + 0x000001B6, line 1221 in "dispatcher.c"
start_thread + 0x000000BA

There are other simple commands that are affected by this, probably those long enough to kick C2 in.
I've checked this reproduces with current jdk8u and jdk9/dev.
I agree with P3 priority. The Impact should not be L, it affects a surprisingly large audience. My guess is ILW: MMM => P3

I want to rise priority to P3 to be able to backport into 8u40.

Excellent, thanks! I also wonder if we want to make CompileBroker::maybe_block "inline", given it is a simple flag check, and now is used on hot path.

Okay, I will put it there. Thank you for suggestion and testing.

Putting the blocking check within the inner loop helps to remove those remaining 0.2 second stalls: diff -r 515e86623962 src/share/vm/opto/escape.cpp --- a/src/share/vm/opto/escape.cpp Fri Nov 21 08:00:31 2014 -0800 +++ b/src/share/vm/opto/escape.cpp Tue Dec 02 12:20:09 2014 +0300 @@ -1112,6 +1112,7 @@ #define SAMPLE_SIZE 4 if ((next % SAMPLE_SIZE) == 0) { + CompileBroker::maybe_block(); // Each 4 iterations calculate how much time it will take // to complete graph construction. time.stop(); $ for I in `seq 1 10`; do MAVEN_OPTS="-XX:+DoEscapeAnalysis" /usr/bin/time -f %E mvn dependency:help > /dev/null; done 0:01.08 0:01.04 0:01.03 0:01.15 0:01.02 0:01.19 0:01.11 0:01.04 0:01.13 0:01.05

This is a worst timeline I could gather with Vladimir's fix: http://cr.openjdk.java.net/~shade/8066199/timeline-fix-1.png It reproduces in around 30% of the cases, and C2 EA now stalls VM exit for "only" ~0.2 seconds.

I use "Apache Maven 3.2.1 (ea8b2b07643dbb1b84b6d16e1f08391b666bc1e9; 2014-02-14T21:37:52+03:00)". The previous run was on some early 8u40-EA release that probably had pre- JDK-8041984 (slower) EA. On a clean jdk9/dev the slowdown is less visible: $ for I in `seq 1 10`; do MAVEN_OPTS="-XX:+DoEscapeAnalysis" /usr/bin/time -f %E mvn dependency:help > /dev/null; done 0:01.22 0:01.20 0:02.49 0:01.85 0:02.25 0:02.44 0:02.35 0:02.25 0:02.20 0:02.02 This is what I have with your patch: $ for I in `seq 1 10`; do MAVEN_OPTS="-XX:+DoEscapeAnalysis" /usr/bin/time -f %E mvn dependency:help > /dev/null; done 0:01.07 0:01.06 0:01.21 0:01.53 0:01.01 0:01.16 0:01.23 0:01.41 0:01.34 0:01.29 So, it gets better. I'll play around to see if the issue reproduces in some other configuration.

Which maven version you use? I have old version which does not show this problem: Apache Maven 3.0.3 (r1075438; 2011-02-28 09:31:09-0800)

Propose fix: src/share/vm/opto/escape.cpp @@ -1100,6 +1100,9 @@ do { while ((new_edges > 0) && (iterations++ < CG_BUILD_ITER_LIMIT)) { + // Poll for requests from shutdown mechanism to quiesce compiler + // because Connection graph construction may take long time. + CompileBroker::maybe_block(); double start_time = time.seconds(); time.start(); new_edges = 0;

I don't think we can do that since Compiler threads access VM structures.

I have a broader question: can we instead make compiler threads "daemon"-like, so that process can exit without joining them?

I will add the check to exit EA when application is exiting.

AFter 8041984 changes you can specify time limit to abort EA: -XX:EscapeAnalysisTimeout=1. Defaults: product(double, EscapeAnalysisTimeout, 20. DEBUG_ONLY(+40.), "Abort EA when it reaches time limit (in sec)")

At this point, I would argue the impact is higher than L, because quite some power users have reported the same experience on Twitter. Forcing them to use a workaround that disables EA completely will have a detrimental impact on field testing.

ILW=Slow VM Exit, seen by customers, -XX:-DoEscapeAnalysis=LMM=P5

This seems to be a quickest way to test: make a few runs, printing the elapsed wall time. $ for I in `seq 1 10`; do MAVEN_OPTS="-XX:-DoEscapeAnalysis" /usr/bin/time -f %E mvn dependency:help > /dev/null; done 0:01.10 0:01.13 0:01.13 0:01.27 0:01.16 0:01.09 0:01.09 0:01.11 0:01.27 0:01.15 $ for I in `seq 1 10`; do MAVEN_OPTS="-XX:+DoEscapeAnalysis" /usr/bin/time -f %E mvn dependency:help > /dev/null; done 0:01.11 0:11.24 0:01.08 0:11.16 0:01.08 0:11.23 0:11.17 0:01.10 0:11.31 0:11.19 In other words, we are wasting 10 (!!!) seconds half of the time with escape analysis turned on.

These guys seems to have faced the same issue: http://jira.codehaus.org/browse/MEXEC-145 (notice their workaround: -XX:CICompilerCount=1 -XX:TieredStopAtLevel=1)

A simple timeline for "mvn dependency:help". Here, we have ~2s stall between application termination and C2 termination, followed by Java process termination. http://cr.openjdk.java.net/~shade/8066199/timeline.png There are cases when we stall for >10s in the same fashion, but that requires a non-trivial reproducer.