Bug ID: JDK-6316156 C2 method-size tuning parameters need update

Details
Type:
Enhancement
Submit Date:
2005-08-25
Status:
Open
Updated Date:
2014-02-26
Project Name:
JDK
Resolved Date:
Component:
hotspot
OS:
generic
Sub-Component:
compiler
CPU:
generic
Priority:
P3
Resolution:
Unresolved
Affected Versions:
6
Targeted Versions:
9

Description
Certain tunable parameters are sensitive to the size of compiled methods.
They need revisiting, since machines are larger than when the parameters
were last tuned (Tiger or before).

In particular, certain newer optimizations (bimorphic inlining) create larger
methods which in turn fall foul of the restrictively tuned parameters.

Parameters which may need inflation include:
InlineSmallCode
MaxInlineSize
MaxTrivialSize
NodeCountInliningCutoff
MaxNodeLimit
InlineThrowMaxSize

Vladimir reports that changes optimizing JVM98 run into InlineSmallCode limits below 2200.

                                    

Comments
EVALUATION

See description.
                                     
2007-06-12
MaxInlineSize (default = 35 bytecode bytes) is used to detect bytecoded methods which are small enough to inline almost always.  It should be compared against a better metric than Method::code_size, which is the textual size of the bytecodes.

A better metric would be a weighted instruction count, with low or zero weight given to data movement instructions and heavier weight given to invocations and control transfers.  Unreached instructions (including those never reached so far according to the profile) should not contribute to the weight.  This will allow methods to contain unused slow paths (e.g., for exception throws) that do not interfere with the inlining of fast paths.  A similar metric would be a weighted IR node count (immediately after parsing), but that would be more difficult to derive.
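
As a rough illustration of the weighted instruction count described above, here is a minimal sketch. The categories, the weights (0, 1, 2, 4) and the names WeightedSizeSketch, Kind and Insn are all made up for illustration and are not taken from HotSpot; a real implementation would read the bytecodes and the profile rather than a hand-built list.

```java
import java.util.Arrays;
import java.util.List;

/** Toy weighted-size metric. Categories and weights are illustrative assumptions, not HotSpot values. */
final class WeightedSizeSketch {

    /** Coarse instruction categories with hypothetical weights. */
    enum Kind {
        DATA_MOVE(0),   // loads, stores, stack shuffling: free
        ARITHMETIC(1),
        BRANCH(2),      // control transfers cost more
        INVOKE(4);      // invocations cost the most

        final int weight;
        Kind(int weight) { this.weight = weight; }
    }

    /** One bytecode instruction plus whether the profile has seen it execute. */
    static final class Insn {
        final Kind kind;
        final boolean reached;
        Insn(Kind kind, boolean reached) { this.kind = kind; this.reached = reached; }
    }

    /** Sums the weights of reached instructions; unreached ones contribute nothing. */
    static int weightedSize(List<Insn> method) {
        int total = 0;
        for (Insn insn : method) {
            if (insn.reached) total += insn.kind.weight;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Insn> method = Arrays.asList(
            new Insn(Kind.DATA_MOVE, true),
            new Insn(Kind.ARITHMETIC, true),
            new Insn(Kind.BRANCH, true),
            new Insn(Kind.INVOKE, true),
            new Insn(Kind.INVOKE, false),     // cold slow path, never reached
            new Insn(Kind.DATA_MOVE, false)); // cold slow path, never reached
        System.out.println(weightedSize(method)); // prints 7
    }
}
```

With these made-up weights the two unreached instructions add nothing, so a method carrying a cold exception path scores the same as one without it.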

InlineSmallCode (default = a few thousand native instruction bytes) is similarly used to detect native-compiled n-methods which are too large to inline.  The problem with it is that the metric is compared against native code bytes, and many of those bytes are "cold" slow paths which are never executed (and thus never hit the instruction cache).  Slow paths (to uncommon traps) tend to be numerous and burdened with trivial data motion code.  Also, this metric is deeply machine-dependent, and so needs complete re-tuning for each CPU architecture and (even worse) for each change in the JIT back end or middle end.

A better metric would be something related to a machine-independent view of the hot path, such as the weighted bytecode instruction count (see above) or IR size.

Background:  Given a strongly interconnected control flow between inline-able methods A->B->C->...->A, the InlineSmallCode limit tends to reduce the number of native copies of A, B, etc.  The pathology it interrupts occurs if an application makes hot entries into the graph at multiple points A, B, C, ..., which (except for InlineSmallCode) would tend to create inlined n-methods containing A->B->C->...->A, B->C->...->A->B, C->...->A->B->C, etc., which triggers compilation work quadratic in the size of the graph cycle, and instruction cache traffic potentially quadratic.


                                     
2013-12-24
For better visibility into generated code decisions, see https://wiki.openjdk.java.net/display/HotSpot/PrintAssembly and search the web for PrintAssembly.  See also LogCompilation and "jitwatch".
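
A small probe program can make the output of these tools easier to read. The sketch below is only an example: the class InlineProbe and the log file name are made up here, and PrintInlining is an additional diagnostic flag not mentioned above. PrintAssembly also needs the hsdis disassembler plugin to be installed.

```java
// Possible invocations (diagnostic flags must be unlocked first):
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InlineProbe
//   java -XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation -XX:LogFile=compilation.log InlineProbe
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly InlineProbe
final class InlineProbe {
    private final int value = 1;

    // Small accessor: a typical inlining candidate.
    int value() { return value; }

    // Calls the accessor in a loop that is hot enough to trigger JIT compilation
    // when the iteration count is large.
    static long run(int iterations) {
        InlineProbe probe = new InlineProbe();
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += probe.value();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(run(5_000_000));
    }
}
```

The log written by LogCompilation is the kind of file that jitwatch reads.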
                                     
2014-02-25
Untaken paths should not contribute to inline-limiting metrics.
                                     
2014-02-25
Surprising inline failures are a common problem.

This is a deepening issue for dynamic languages like Nashorn, since they consist of small bits of simple "plumbing" joined together.  If 5% of the joints in the plumbing fail to inline, they can dominate performance.

Such surprises may also crop up in new releases when Java programmers adopt new language features.  For example, this happened when the "assert" keyword was introduced.  Seemingly simple assertions in hot code can disturb inlining and performance, even when assertions are not enabled.  A fix to this, and other fast/slow idiom performance, should discount unreached code, in both inline heuristics and parsing (IR generation, JDK-8030976).
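
To illustrate the "assert" case: javac compiles an assert statement into a guarded check plus the code that builds and throws the AssertionError, and all of that counts toward the method's bytecode size whether or not assertions are enabled. The sketch below uses made-up names (AssertSizeSketch, getChecked, getLean, inRange); whether the leaner variant actually changes an inlining decision depends on the JVM version and flags. The two bytecode sizes can be compared with javap -c.

```java
final class AssertSizeSketch {

    // Inline assertion: the range check, the message concatenation and the
    // AssertionError construction are all compiled into this method's
    // bytecode, even when assertions are disabled at run time.
    static int getChecked(int[] data, int index) {
        assert index >= 0 && index < data.length
            : "index " + index + " out of range for length " + data.length;
        return data[index];
    }

    // Leaner variant: the hot method keeps only the assert guard and one call;
    // the check and the message building live in a helper.
    static int getLean(int[] data, int index) {
        assert inRange(data, index);
        return data[index];
    }

    // Called only when assertions are enabled.
    private static boolean inRange(int[] data, int index) {
        if (index >= 0 && index < data.length) {
            return true;
        }
        throw new AssertionError("index " + index + " out of range for length " + data.length);
    }
}
```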
                                     
2014-02-25
ILW = {Impact: Med, Likelihood: Med, Workaround: Med}
Impact: Users report sudden unacceptable performance loss.  Difficult to analyze.
Likelihood: Recurring reports.  Probably also underreported.
Workaround: When recognized, split methods into smaller methods.  Time-consuming and error-prone.
Severity = 3
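
A sketch of the workaround, for illustration only (the class and method names are made up): the rarely taken error path is moved out of the hot method so that the hot method's bytecode stays small.

```java
final class SplitSlowPath {

    // Before: the cold error path, including the message building, is part
    // of the hot method's bytecode.
    static int parseDigitMonolithic(char c) {
        if (c >= '0' && c <= '9') {
            return c - '0';
        }
        throw new IllegalArgumentException(
            "not a decimal digit: '" + c + "' (code " + (int) c + ")");
    }

    // After: the hot method contains only the fast path and a call.
    static int parseDigit(char c) {
        if (c >= '0' && c <= '9') {
            return c - '0';
        }
        throw badDigit(c);
    }

    // Cold helper: builds the exception out of line.
    private static IllegalArgumentException badDigit(char c) {
        return new IllegalArgumentException(
            "not a decimal digit: '" + c + "' (code " + (int) c + ")");
    }
}
```

Both versions behave the same for callers. The split has to be redone by hand for every affected method, which is why it is time-consuming and error-prone.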
                                     
2014-02-25
List of rt.jar methods that are above the byte code inlining limit: https://groups.google.com/forum/#!topic/jitwatch/KJKEgVLTGg8


                                     
2014-02-25
Also: assertions count as code weight, which is very bad.

So instead of using byte code size at all - can't HotSpot generate the inline candidate and throw it away if it's too large or something? Ideally it'd do something like this

do {
   generate IR for hot child
   if child IR too big (trivial check)
     break
   splice IR into graph
   apply shrinking transforms
} while (parent IR not too big && enough time left for optimisation)
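
For illustration, the loop above could be modelled like this. It is a toy sketch, not HotSpot code: IR sizes are plain node counts, the "shrinking transforms" are approximated by a fixed percentage, the time budget is left out, and every name and number is made up.

```java
final class IterativeInlineSketch {

    /**
     * Inlines hot children into the parent, in order, until a child is too
     * big or the parent exceeds its limit. Returns the final parent size.
     * Unlike the do/while above, the parent limit is also checked before
     * the first child, so an empty child list is handled.
     */
    static int inlineHotChildren(int parentNodes, int[] hotChildNodes,
                                 int childLimit, int parentLimit, int shrinkPercent) {
        for (int i = 0; i < hotChildNodes.length && parentNodes <= parentLimit; i++) {
            int childNodes = hotChildNodes[i];                 // "generate IR for hot child"
            if (childNodes > childLimit) {
                break;                                         // child IR too big
            }
            parentNodes += childNodes;                         // splice IR into graph
            parentNodes -= parentNodes * shrinkPercent / 100;  // stand-in for shrinking transforms
        }
        return parentNodes;
    }

    public static void main(String[] args) {
        // Parent of 100 nodes; the third child (500 nodes) exceeds the child limit of 200.
        System.out.println(inlineHotChildren(100, new int[] {40, 30, 500, 10}, 200, 1000, 10)); // prints 141
    }
}
```

The decision is made on the size of the generated IR after shrinking, not on the byte code size of the candidate.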

I've spent significant time tuning Nashorn hot methods to be as small as possible so that they will be considered for inlining. Sometimes, by splitting a method into two, with half the logic in each, I reach my goal - and we should try to think about how to abstract this away from the Java programmer

Regards
Marcus

                                     
2014-02-25


