JDK-8016581 : JSR 292: invokeExact (invokehandle) operations should profile method handle behavior forms
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: hs25,8,9,10
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2013-06-13
  • Updated: 2025-10-06
Other: tbd (Unresolved)
Description
Method handles, like inner classes, are encapsulated behavior and values, and can be type-profiled for optimistic compilation.  The required profiling is different, however, since it needs to look at the MethodHandle.form.vmentry field, rather than the Object._klass field.  Since method handle invocation is done with a special hidden bytecode "invokehandle", the profiling machinery should be modified to capture that extra data when executing "invokehandle" instructions.

Background:

A method handle's meaning or behavior breaks into three parts:
1. the compiled bytecode of its LambdaForm (MethodHandle.form.vmentry)
2. the constants embedded in the LambdaForm expressions
3. bound values stored in the method handle itself (as a BoundMethodHandle subclass)

Parts 1 and 2 correspond to the (often anonymous) class of an inner class object which adapts some functional interface.  Part 3 corresponds to the instance variables of that object.  Normal class-based type profiling in the JVM allows the JIT to produce good code for inner class objects in many cases.
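
The analogy can be made concrete with the public java.lang.invoke API. The sketch below is illustrative only: the class name BehaviorVsData and its helpers are made up for this example, and the assumption that insertArguments yields a bound method handle internally is an implementation detail that may vary across JDK releases.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.IntUnaryOperator;

class BehaviorVsData {
    // Inner-class style: the class supplies the behavior, the field supplies the data.
    static final class Adder implements IntUnaryOperator {
        private final int delta;
        Adder(int delta) { this.delta = delta; }
        @Override public int applyAsInt(int x) { return x + delta; }
    }

    static int add(int delta, int x) { return x + delta; }

    // Method-handle style: binding the first argument gives a handle that carries
    // 'delta' as a bound value, playing the role of the instance field above.
    static MethodHandle boundAdder(int delta) {
        try {
            MethodHandle add = MethodHandles.lookup().findStatic(
                    BehaviorVsData.class, "add",
                    MethodType.methodType(int.class, int.class, int.class));
            return MethodHandles.insertArguments(add, 0, delta);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Wraps invokeExact so callers do not have to handle Throwable.
    static int callHandle(MethodHandle mh, int x) {
        try {
            return (int) mh.invokeExact(x);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        IntUnaryOperator viaClass = new Adder(3);
        MethodHandle viaHandle = boundAdder(3);
        System.out.println(viaClass.applyAsInt(4));
        System.out.println(callHandle(viaHandle, 4));
    }
}
```

Both calls compute the same sum. The JIT can type-profile the Adder receiver by its class, whereas the handle's behavior is only visible through its form.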

Currently, part 2 is merged into part 1, because bytecodes are customized to contain all LF constants.  JDK-8001106 aims to share bytecodes (part 1) among many similar LFs, where the only differences are constant arguments stored in the LF expressions.

The profiling of method handles therefore has three possible levels:
A. (Coarse) Detect call sites with one or two distinct bytecodes (vmentry values) and optimistically compile code that loads parts 2 and 3 as non-constants.
B. (Medium) Detect call sites with one or two distinct lambda forms (form values) and optimistically compile code that loads part 3 as a non-constant.  (Most similar to inner classes.)
C. (Fine) Detect call sites with one or two distinct method handles (receiver values) and optimistically compile code that inlines the whole method handle.

Case C would be subsumed by instance profiling, as in JDK-8016580.
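
The three levels differ only in which key the profile compares. A toy model can show this, assuming made-up stand-in classes (Form and Handle here are hypothetical and are not the JDK's LambdaForm or MethodHandle) and assuming bytecode sharing along the lines of JDK-8001106, so that two forms can share one vmentry:

```java
import java.util.List;
import java.util.function.Function;

class ProfilingLevels {
    // Toy stand-ins for illustration only; not the JDK's LambdaForm or MethodHandle.
    static final class Form {
        final String vmentry;   // part 1: name of the (possibly shared) compiled bytecode
        final int constant;     // part 2: a constant embedded in the form
        Form(String vmentry, int constant) { this.vmentry = vmentry; this.constant = constant; }
    }

    static final class Handle {
        final Form form;
        final int bound;        // part 3: a value bound into the handle
        Handle(Form form, int bound) { this.form = form; this.bound = bound; }
    }

    /** Number of distinct profile keys observed at one call site. */
    static int distinct(List<Handle> observed, Function<Handle, Object> key) {
        return (int) observed.stream().map(key).distinct().count();
    }

    /** The "one or two distinct values" test used by levels A, B and C. */
    static boolean worthSpeculating(int distinctKeys) {
        return distinctKeys == 1 || distinctKeys == 2;
    }

    /** Receivers seen at a hypothetical call site: two forms sharing one vmentry, three handles. */
    static List<Handle> sampleSite() {
        Form addOne = new Form("addConst", 1);
        Form addTen = new Form("addConst", 10);
        return List.of(new Handle(addOne, 5), new Handle(addOne, 6), new Handle(addTen, 7));
    }

    public static void main(String[] args) {
        List<Handle> site = sampleSite();
        int a = distinct(site, h -> h.form.vmentry); // level A: keyed by vmentry
        int b = distinct(site, h -> h.form);         // level B: keyed by form identity
        int c = distinct(site, h -> h);              // level C: keyed by receiver identity
        System.out.println("A=" + a + " B=" + b + " C=" + c);
    }
}
```

For this sample site the counts are 1, 2 and 3, so levels A and B would pass the one-or-two test while level C would not.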


Addendum: some additional notes on a possible implementation of Case A or Case B.

We could consider adding profiling to the “linker methods” (bytecode callsite adapters) the VM installs at MH invoke and/or indy bytecode sites.  (Some or all of them; we can choose.)  Those guys are responsible for the appendix argument.  The JDK code in invokeHandleForm and callSiteForm spins these linker methods.

Linker methods are defined by JDK code, but are responsible for the fine semantic details of any user-visible MH or indy call site.  For example, the logic of invokeExact that throws an exception if the MH type does not match, and the more complex logic for generic invoke that attempts to adjust argument types (and varargs) on a mismatch, is all inside the JDK-spun code of the linker method.  Thus, linker methods can be a potent tool for optimizing (or even enhancing) MH invocation.
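
The user-visible difference between the two invocation modes can be observed from ordinary Java code. The following is a small sketch using only the public java.lang.invoke API; the class name ExactVersusGeneric and its helper methods are made up for this example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

class ExactVersusGeneric {
    static int twice(int x) { return 2 * x; }

    /** A handle of type (int)int. */
    static MethodHandle twiceHandle() {
        try {
            return MethodHandles.lookup().findStatic(
                    ExactVersusGeneric.class, "twice",
                    MethodType.methodType(int.class, int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** invokeExact with call-site type (Integer)Object, which does not match (int)int. */
    static boolean exactRejectsMismatch() {
        MethodHandle mh = twiceHandle();
        try {
            Object ignored = mh.invokeExact(Integer.valueOf(21));
            return false;
        } catch (WrongMethodTypeException expected) {
            return true;   // the type check failed, as the text describes
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    /** Generic invoke with the same call-site type: the argument is unboxed, the result boxed. */
    static Object genericAdapts() {
        try {
            return twiceHandle().invoke(Integer.valueOf(21));
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(exactRejectsMismatch());
        System.out.println(genericAdapts());
    }
}
```

The type check and the argument adaptation seen here are the behavior that, per the description above, lives in the JDK-spun linker method for each call site.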

With a little thought, it should be possible to incorporate extra profiling logic into these linker methods.  The VM does some profiling at the bytecode level (using interpreter assembly code).  By changing the JDK code that spins linker methods, we can add more profiling to complement the existing profiling.  We can add MH-specific profiling of the MH shape, such as its LF identity.  If, by means of such profiling, we could capture an effectively-constant LF, we would be able to inline that LF, even for an invoke of a non-constant MH value.

The appendix argument associated with the linker method is normally a MemberName (or indy CallSite).  If we were to add profiling to the linker method logic, it would have to accumulate the data somewhere.  That’s easy; just add an indirection to the appendix argument.  Let’s call that a “profiling appendix”.  It contains a blank (at first) profile, plus a constant pointer to the MemberName (or CallSite, if we are doing this trick to indy).

The linker method logic is adjusted to reach through the extra indirection, and also update the profile (so it is no longer blank).
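
As a rough model of that data structure, the sketch below uses plain Java objects. Every name in it (ProfilingAppendix, reachThrough, the mismatches counter) is hypothetical; the real appendix would wrap a MemberName or CallSite, and the profile field would be a stable field rather than an ordinary one.

```java
class ProfilingAppendixSketch {
    /** Hypothetical "profiling appendix": a blank profile plus a constant pointer to the real appendix. */
    static final class ProfilingAppendix {
        final Object target;   // stands in for the MemberName (or CallSite)
        Object lfProfile;      // blank (null) until the first call
        int mismatches;        // assumption: count calls whose form differs from the profile

        ProfilingAppendix(Object target) { this.target = target; }

        /** Called by the (modelled) linker-method logic on every invocation. */
        Object reachThrough(Object observedForm) {
            if (lfProfile == null) {
                lfProfile = observedForm;   // first observation fills the blank profile
            } else if (lfProfile != observedForm) {
                mismatches++;               // profile stays set-once; only count the miss
            }
            return target;                  // the extra indirection to the real appendix
        }
    }

    public static void main(String[] args) {
        Object memberName = new Object();   // placeholder for the real appendix
        Object formA = new Object();
        Object formB = new Object();
        ProfilingAppendix appendix = new ProfilingAppendix(memberName);
        appendix.reachThrough(formA);
        appendix.reachThrough(formA);
        appendix.reachThrough(formB);
        System.out.println(appendix.lfProfile == formA);
        System.out.println(appendix.mismatches);
    }
}
```

The profile is written only once in this model, which matches the set-once nature of a stable field; how a real implementation would react to mismatches is left open here.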

What’s harder, but perhaps doable without C2 changes, is to coax the C2 optimizer to take action based on the profile data accumulated in the appendix argument.  One idea for this:  Use stable fields to hold the profile bits, so C2 already knows how to constant fold them (when they are not blank).  Add logic like this to the invoker:

```
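LambdaForm lfexp = appendix.lfprofile;  // stable field: constant-folds once it is non-null
LambdaForm lfact = mh.form;
if (lfexp != null && lfact == lfexp) {
  // fast path: do what invokeBasic would do with this LF
  return linkToStatic(mh … , lfexp.vmentry);
} else {
  // blank profile or a different LF: fall back to the normal path
  return mh.invokeBasic(…);
}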
```

What I’m proposing here is not just an optional profiling “widget” to wrap around a MH.  Those are subject to profile pollution if the (wrapped) MH is shared, called from several places with disparate profiled activities.  Instead, adding logic to the invoker methods is a systematic play to inject profiling around MH calls (not just free-floating MHs), by hacking into the VM infrastructure that binds them to specific bytecode call sites.
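
The pollution argument can be illustrated with a toy model. The classes and shape names below are hypothetical and only mimic how a shared profile and per-call-site profiles would diverge when two call sites each see a single, different form.

```java
import java.util.HashSet;
import java.util.Set;

class ProfilePollution {
    /** Toy profile that remembers which handle shapes it has seen. */
    static final class Profile {
        private final Set<String> shapes = new HashSet<>();
        void record(String shape) { shapes.add(shape); }
        boolean monomorphic() { return shapes.size() == 1; }
    }

    /** One profile attached to a wrapper that is shared by both call sites. */
    static Profile sharedWrapperProfile() {
        Profile shared = new Profile();
        for (int i = 0; i < 100; i++) {
            shared.record("formX");   // call site 1 only ever passes formX
            shared.record("formY");   // call site 2 only ever passes formY
        }
        return shared;
    }

    /** One profile per call site, as with a profile kept in each site's appendix. */
    static Profile[] perSiteProfiles() {
        Profile site1 = new Profile();
        Profile site2 = new Profile();
        for (int i = 0; i < 100; i++) {
            site1.record("formX");
            site2.record("formY");
        }
        return new Profile[] { site1, site2 };
    }

    public static void main(String[] args) {
        System.out.println(sharedWrapperProfile().monomorphic());
        Profile[] sites = perSiteProfiles();
        System.out.println(sites[0].monomorphic() && sites[1].monomorphic());
    }
}
```

In this model the shared profile sees two shapes and is no longer monomorphic, while each per-site profile stays monomorphic.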

Comments
Internally to method handle behaviors (MH.form.vmentry), calls to unknown (varying data dependent) methods are performed through linkToStatic, etc. The "pointer" to the target method is a MemberName reference (a "glue" object internal to method handle infrastructure). This pointer is stacked at the *end* of the argument list, *after* all normal arguments to the target method. The linkToStatic method is invoked by an "invokestatic" instruction. It may be worthwhile assigning just these "linker methods" their own "invokelinkermethod" internal VM instruction, whose behavior is identical to "invokestatic", but which performs type-C profiling on their *last* argument, the MemberName. That may allow inlining of more methods which are reached by these linkage methods. On the other hand, normal constant propagation will also present constant MemberName pointers to these linkage methods, and C2's optimizer (CallGenerator::for_method_handle_inline) looks for these and will inline them. Type-C profiling would provide another source of constant MemberName pointers to this optimization.
26-03-2014

The receiver class (Klass*) profiling done at invokevirtual is useless for monomorphic types; is it useless for MHs? For MHs this profile has some significance, since each bound MH species has its own data layout and (hence) its own Klass*. But, each distinct LF (lambda form) "knows" which bound MH species it applies to. (This "knowing" could be made more explicit than it currently is.) So the LF has (or ought to have) a strict superset of the information normally collected in Klass* profiling. Therefore, we could reuse the profile slots used to profile Klass* values. There's another option, though, which may be better: The internal bytecode "invokehandle" is used to invoke MHs (both invokeExact and plain invoke). We could, if we want, allocate a larger or different MDO record for invokehandle, so that both Klass* and vmentry (profile type A) could be captured. The different layout for the MDO record of an invokehandle could also accommodate constant detection (profile type C), as a later add-on. See comment on JDK-8016580.
26-03-2014

Note that in the current code, part 2 of the behavior (embedded constants) is directly "ldc"-ed from the constant pool of the bytecodes, as soon as the vmentry is replaced by the custom-compiled bytecode. This means that part 2 is mainly relevant to LF interpreter functions like LambdaForm.interpret_L. For warm MHs, the MH content splits cleanly between code (the form.vmentry, part 1) and data (the BoundMethodHandle fields, if any). Therefore, profiling options A and B are equivalent, except for cold MHs that are still being interpreted. I think A and C are the two sweet spots for profiling, with A being most broadly applicable. C is a "golden object" optimization (per call site) which detects and favors a single, unvarying receiver reference, as noted in JDK-8016580. Profiling type C is probably better applied to a wider range of receiver types, not just MHs. But A seems to be the best one to try first, because in many cases where the receiver is non-varying, just inlining its (non-varying) behavior (whether Klass* or vmentry) is enough to make the code fold up.
26-03-2014

8-pool is an invalid affects version. Set 8 and hs25.
12-11-2013