JDK-8130227 : JEP 274: Enhanced Method Handles
  • Type: JEP
  • Component: core-libs
  • Sub-Component: java.lang.invoke
  • Priority: P3
  • Status: Closed
  • Resolution: Delivered
  • Fix Versions: 9
  • Submitted: 2015-07-01
  • Updated: 2017-05-17
  • Resolved: 2016-07-13
Description
Summary
-------

Enhance the `MethodHandle`, `MethodHandles`, and `MethodHandles.Lookup`
classes of the `java.lang.invoke` package to ease common use cases and
enable better compiler optimizations by means of new `MethodHandle`
combinators and lookup refinement.

Goals
-----

*   In the `MethodHandles` class in the `java.lang.invoke` package, provide new
    `MethodHandle` combinators for loops and try/finally blocks.

*   Enhance the `MethodHandle` and `MethodHandles` classes with new
    `MethodHandle` combinators for argument handling.

*   Implement new lookups for interface methods and, optionally, super
    constructors in the `MethodHandles.Lookup` class.

Non-Goals
---------

*   With the exception of possibly-required native functionality, VM-level
    extensions and enhancements, specifically compiler optimizations, are a
    non-goal.

*   Extensions at the Java language level are explicitly out of scope.

Motivation
----------

In a thread on the `mlvm-dev` mailing list
([part 1](http://mail.openjdk.java.net/pipermail/mlvm-dev/2015-February/006288.html),
[part 2](http://mail.openjdk.java.net/pipermail/mlvm-dev/2015-March/006300.html))
developers have discussed possible extensions to the `MethodHandle`,
`MethodHandles`, and `MethodHandles.Lookup` classes in the
`java.lang.invoke` package to make the realization of common use cases
easier, and also to allow for use cases that are deemed important but are
currently not supported.

The extensions proposed below not only allow for more concise usage of
the `MethodHandle` API, but they also reduce the number of `MethodHandle`
instances created in some cases. This, in turn, will facilitate better
optimizations by the VM's compiler.

### Combinators for More Statements

**Loops.** The `MethodHandles` class provides no abstractions for loop
construction from `MethodHandle` instances. There should be a means for
constructing loops from `MethodHandle`s representing the loop's body, as well as
initialization and condition, or count.

**Try/finally blocks.** `MethodHandles` also provides no abstraction for
try/finally blocks. A method to construct such blocks from method handles
representing the `try` and `finally` parts should be provided.

### Better Argument Handling

**Argument spreading.** With `MethodHandle.asSpreader(Class<?> arrayType, int
arrayLength)`, there exists an operation to create a method handle that will
spread the contents of a *trailing* array argument to a number of arguments. An
additional `asSpreader` method should be provided that expands an array
argument located anywhere in a method signature into a number of distinct
arguments.
    
**Argument collection.** The method `MethodHandle.asCollector(Class<?>
arrayType, int arrayLength)` produces a handle that collects the *trailing*
`arrayLength` arguments into an array. There is no means for achieving the same
for a number of arguments elsewhere in a method signature. There should be an
additional `asCollector` method that supports this.

**Argument folding.** The folding combinator, `foldArguments(MethodHandle
target, MethodHandle combiner)`, does not allow control over the position in the
argument list at which folding should start. A position argument should be
added; the number of arguments to fold is implicitly given as the number of
arguments the `combiner` accepts.

### More Lookup Functions

**Non-abstract methods in interfaces.** Currently, a use case such as this one
will fail at run-time at the indicated position:

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.MethodType;

    interface I1 {
        default void m() { System.err.println("I1.m"); }
    }

    interface I2 {
        default void m() { System.err.println("I2.m"); }
    }

    class C implements I1, I2 {
        public void m() { I2.super.m(); System.err.println("C.m"); }
    }

    public class IfcSuper {
        public static void main(String[] args) throws Throwable {
            C c = new C();
            MethodHandles.Lookup l = MethodHandles.lookup();
            MethodType t = MethodType.methodType(void.class);
            // This lookup will fail with an IllegalAccessException.
            MethodHandle di1m = l.findSpecial(I1.class, "m", t, C.class);
            di1m.invoke(c);
        }
    }

It should, however, be possible to construct `MethodHandle`s that bind to
non-abstract methods in interfaces.

**Class lookup.** Finally, the lookup API should allow for looking up *classes*
from different contexts, which is currently not possible. In the `MethodHandles`
area, all required access checks are done at lookup time (as opposed to
run-time, as is the case with reflection). Classes are passed in terms of their
`.class` instance. To facilitate lookups with a certain control over the
context, e.g., across module boundaries, there should be a lookup method that
delivers a `Class` instance with the right restrictions for further use in
`MethodHandle` combinators.

Description
-----------

### Combinators for Loops

**Most Generic Loop Abstraction**

The core abstractions for loops include an initialization of the loop, a
predicate to check, and a body to evaluate. The most generic `MethodHandle`
combinator for creating a loop, to be added to `MethodHandles`, is as follows:

    MethodHandle loop(MethodHandle[]... clauses)

Constructs a method handle representing a loop with several loop variables that
are updated and checked upon each iteration. Upon termination of the loop due to
one of the predicates, a corresponding finalizer is run and delivers the loop's
result, which is the return value of the resulting handle.

Intuitively, every loop is formed by one or more "clauses", each specifying a
local iteration value and/or a loop exit. Each iteration of the loop executes
each clause in order. A clause can optionally update its iteration variable; it
can also optionally perform a test and conditional loop exit. In order to
express this logic in terms of method handles, each clause will determine four
actions:

*   Before the loop executes, the initialization of an iteration variable or
    loop invariant local.

*   When a clause executes, an update step for the iteration variable.

*   When a clause executes, a predicate execution to test for loop exit.

*   If a clause causes a loop exit, a finalizer execution to compute the loop's
    return value.

Some of these clause parts may be omitted according to certain rules, and useful
default behavior is provided in this case. See below for a detailed description.

Each clause function, with the exception of clause initializers, is able to
observe the entire loop state, because it will be passed *all* current
iteration variable values, as well as all incoming loop parameters. Most clause
functions will not need all of this information, but they will be formally
connected as if by `dropArguments`.

Given a set of clauses, a number of checks and adjustments are performed to
connect all the parts of the loop. They are spelled out in detail in the steps
below. In these steps, every occurrence of the word "must" corresponds to a
place where `IllegalArgumentException` may be thrown if the required
constraint is not met by the inputs to the loop combinator. The term
"effectively identical", applied to parameter type lists, means that they must
be identical, or else one list must be a proper prefix of the other.

*Step 0: Determine clause structure.*

*   The clause array (of type `MethodHandle[][]`) must be non-`null` and contain
    at least one element.

*   The clause array may not contain `null`s or sub-arrays longer than four
    elements.

*   Clauses shorter than four elements are treated as if they were padded by
    `null` elements to length four. Padding takes place by appending elements to
    the array.

*   Clauses with all `null`s are disregarded.

*   Each clause is treated as a four-tuple of functions, called "init", "step",
    "pred", and "fini".

*Step 1A: Determine iteration variables.*

*   Examine init and step function return types, pairwise, to determine each
    clause's iteration variable type.

*   If both functions are omitted, use `void`; else if one is omitted, use the
    other's return type; else use the common return type (they must be
    identical).

*   Form the list of return types (in clause order), omitting all occurrences of
    `void`.

*   This list of types is called the "common prefix".

*Step 1B: Determine loop parameters.*

*   Examine init function parameter lists.

*   Omitted init functions are deemed to have `null` parameter lists.

*   All init function parameter lists must be effectively identical.

*   The longest parameter list (which is necessarily unique) is called the
    "common suffix".

*Step 1C: Determine loop return type.*

*   Examine fini function return types, disregarding omitted fini functions.

*   If there are no fini functions, use `void` as the loop return type.

*   Otherwise, use the common return type of the fini functions; they must all
    be identical.

*Step 1D: Check other types.*

*   There must be at least one non-omitted pred function.

*   Every non-omitted pred function must have a `boolean` return type.

(Implementation Note: Steps 1A, 1B, 1C, 1D are logically independent of each
other, and may be performed in any order.)
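
The structural checks of Steps 0 and 1D can be observed directly. The following is a minimal sketch, assuming the `MethodHandles.loop` combinator as shipped in JDK 9 and later; the class and method names (`LoopValidationDemo`, `rejected`, and so on) are illustrative only and not part of the proposal.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

class LoopValidationDemo {
    // Returns true if the loop combinator rejects the given clauses
    // with an IllegalArgumentException.
    static boolean rejected(MethodHandle[]... clauses) {
        try {
            MethodHandles.loop(clauses);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    // Step 0: the clause array must contain at least one element.
    static boolean rejectsEmptyClauseList() {
        return rejected();
    }

    // Step 0: no clause may be longer than four elements.
    static boolean rejectsOverlongClause() {
        MethodHandle t = MethodHandles.constant(boolean.class, true);
        return rejected(new MethodHandle[] {t, t, t, t, t});
    }

    // Step 1D: there must be at least one non-omitted pred function.
    static boolean rejectsMissingPredicate() {
        MethodHandle zero = MethodHandles.constant(int.class, 0);
        return rejected(new MethodHandle[] {zero}); // init only, no pred
    }
}
```

Each of the three helper methods is expected to return `true`, since the corresponding constraint is marked with "must" in the steps above.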

*Step 2: Determine parameter lists.*

*   The parameter list for the resulting loop handle will be the "common
    suffix".

*   The parameter list for init functions will be adjusted to the "common
    suffix". (Note that their parameter lists are already effectively identical
    to the common suffix.)

*   The parameter list for non-init (step, pred, and fini) functions will be
    adjusted to the common prefix followed by the common suffix, called the
    "common parameter sequence".

*   Every non-init, non-omitted function parameter list must be effectively
    identical to the common parameter sequence.

*Step 3: Fill in omitted functions.*

*   If an init function is omitted, use a constant function of the appropriate
    `null`/zero/`false`/`void` type. (For this purpose, a constant `void` is
    simply a function which does nothing and returns `void`; it can be obtained
    from another constant function by type conversion via
    `MethodHandle.asType`.)

*   If a step function is omitted, use an identity function of the clause's
    iteration variable type; insert dropped argument parameters before the
    identity function parameter for the non-`void` iteration variables of
    preceding clauses. (This will turn the loop variable into a local loop
    invariant.)

*   If a pred function is omitted, the corresponding fini function must also be
    omitted.

*   If a pred function is omitted, use a constant `true` function. (This will
    keep the loop going, as far as this clause is concerned.)

*   If a fini function is omitted, use a constant `null`/zero/`false`/`void`
    function of the loop return type.

*Step 4: Fill in missing parameter types.*

*   At this point, every init function parameter list is effectively identical
    to the common suffix, but some lists may be shorter. For every init function
    with a short parameter list, pad out the end of the list by dropping
    arguments.

*   At this point, every non-init function parameter list is effectively
    identical to the common parameter sequence, but some lists may be shorter.
    For every non-init function with a short parameter list, pad out the end of
    the list by dropping arguments.
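
Steps 3 and 4 can be illustrated with a loop whose clauses omit several functions and whose functions declare fewer parameters than the common parameter sequence. The sketch below assumes the `MethodHandles.loop` combinator as shipped in JDK 9 and later; the class `CollatzLoop` and its methods are hypothetical names chosen for this example. It counts the steps a Collatz sequence takes to reach 1.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class CollatzLoop {
    // Common parameter sequence is (v, count, n); each function below
    // declares only a prefix of it and is padded by dropping arguments.
    static int next(int v) { return v % 2 == 0 ? v / 2 : 3 * v + 1; }
    static int inc(int v, int count) { return count + 1; }
    static boolean notOne(int v) { return v != 1; }
    static int result(int v, int count) { return count; }

    static MethodHandle find(String name, Class<?> rtype, Class<?>... ptypes)
            throws ReflectiveOperationException {
        return MethodHandles.lookup().findStatic(
                CollatzLoop.class, name, MethodType.methodType(rtype, ptypes));
    }

    static int steps(int n) {
        try {
            // Clause 1: v starts at n; pred and fini are omitted.
            MethodHandle[] value = {
                MethodHandles.identity(int.class),
                find("next", int.class, int.class)
            };
            // Clause 2: init is omitted, so count starts at zero.
            MethodHandle[] count = {
                null,
                find("inc", int.class, int.class, int.class),
                find("notOne", boolean.class, int.class),
                find("result", int.class, int.class, int.class)
            };
            return (int) MethodHandles.loop(value, count).invokeExact(n);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Because steps run before predicates, the loop behaves like a do-while: `steps(6)` follows 6, 3, 10, 5, 16, 8, 4, 2, 1 and yields 8.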

*Final observations.*

*   After these steps, all clauses have been adjusted by supplying omitted
    functions and arguments.

*   All init functions have a common parameter type list, which the final loop
    handle will also have.

*   All fini functions have a common return type, which the final loop handle
    will also have.

*   All non-init functions have a common parameter type list, which is the
    common parameter sequence, of (non-`void`) iteration variables followed by
    loop parameters.

*   Each pair of init and step functions agrees in their return types.

*   Each non-init function will be able to observe the current values of all
    iteration variables, by means of the common prefix.

*Loop execution.*

*   When the loop is called, the loop input values are saved in locals, to be
    passed (as the common suffix) to every clause function. These locals are
    loop invariant.

*   Each init function is executed in clause order (passing the common suffix)
    and the non-`void` values are saved (as the common prefix) into locals.
    These locals are loop varying (unless their steps are identity functions, as
    noted above).

*   All function executions (except init functions) will be passed the common
    parameter sequence, consisting of the non-`void` iteration values (in clause
    order) and then the loop inputs (in argument order).

*   The step and pred functions are then executed, in clause order (step before
    pred), until a pred function returns `false`.

*   The non-`void` result from a step function call is used to update the
    corresponding loop variable. The updated value is immediately visible to all
    subsequent function calls.

*   If a pred function returns `false`, the corresponding fini function is
    called, and the resulting value is returned from the loop as a whole.

The semantics of a `MethodHandle` `l` returned from `loop` are as follows:

    l(arg*) =>
    {
        let v* = init*(arg*);
        for (;;) {
            for ((v, s, p, f) in (v*, step*, pred*, fini*)) {
                v = s(v*, arg*);
                if (!p(v*, arg*)) {
                    return f(v*, arg*);
                }
            }
        }
    }
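
As a concrete reading of these semantics, the following sketch builds an iterative factorial from two clauses. It assumes the `MethodHandles.loop` combinator as shipped in JDK 9 and later; the class name `LoopFactorial` and the clause function names are illustrative and not part of the proposal.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class LoopFactorial {
    // Clause functions. Non-init functions receive (i, acc, k): both
    // iteration variables first, then the loop parameter.
    static int one(int k) { return 1; }
    static int inc(int i, int acc, int k) { return i + 1; }
    static int mult(int i, int acc, int k) { return i * acc; }
    static boolean pred(int i, int acc, int k) { return i < k; }
    static int fin(int i, int acc, int k) { return acc; }

    static MethodHandle buildLoop() throws ReflectiveOperationException {
        MethodHandles.Lookup l = MethodHandles.lookup();
        MethodType stepType =
                MethodType.methodType(int.class, int.class, int.class, int.class);
        MethodHandle one = l.findStatic(LoopFactorial.class, "one",
                MethodType.methodType(int.class, int.class));
        MethodHandle inc = l.findStatic(LoopFactorial.class, "inc", stepType);
        MethodHandle mult = l.findStatic(LoopFactorial.class, "mult", stepType);
        MethodHandle pred = l.findStatic(LoopFactorial.class, "pred",
                stepType.changeReturnType(boolean.class));
        MethodHandle fin = l.findStatic(LoopFactorial.class, "fin", stepType);

        MethodHandle[] counter = {null, inc};                // init omitted: i starts at 0
        MethodHandle[] accumulator = {one, mult, pred, fin}; // exits once i >= k
        return MethodHandles.loop(counter, accumulator);
    }

    static int factorial(int k) {
        try {
            return (int) buildLoop().invokeExact(k);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

In each iteration the counter clause updates `i` first, so the accumulator's step and predicate already observe the new value; `factorial(5)` therefore multiplies by 1 through 5 before the predicate fails.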

Based on this most generic abstraction of loops, several convenient combinators
should be added to `MethodHandles`. They are discussed in the following.

**Simple while and do-while Loops**

These combinators will be added to `MethodHandles`:

    MethodHandle whileLoop(MethodHandle init, MethodHandle pred, MethodHandle body)

    MethodHandle doWhileLoop(MethodHandle init, MethodHandle body, MethodHandle pred)

The semantics of invoking the `MethodHandle` object `wl` returned from
`whileLoop` are as follows:

    wl(arg*) =>
    {
        let r = init(arg*);
        while (pred(r, arg*)) { r = body(r, arg*); }
        return r;
    }

For a `MethodHandle` `dwl` returned from `doWhileLoop`, the semantics are as
follows:

    dwl(arg*) =>
    {
        let r = init(arg*);
        do { r = body(r, arg*); } while (pred(r, arg*));
        return r;
    }

This scheme imposes some restrictions on the signatures that the three
constituent `MethodHandle`s can have:

1.  The return type of the initializer `init` is also the return type of the
    body `body` and of the entire loop, as well as the type of the first
    argument of the predicate `pred` and the body `body`.

2.  The return type of the predicate `pred` must be `boolean`.
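
The difference between the two combinators shows up when the predicate is already false on entry. The sketch below assumes `MethodHandles.whileLoop` and `MethodHandles.doWhileLoop` as shipped in JDK 9 and later; `WhileLoopDemo` and its methods are hypothetical names. Both loops double a value starting at 1 while it is below `n`.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class WhileLoopDemo {
    static int one(int n) { return 1; }                    // init
    static boolean below(int r, int n) { return r < n; }   // pred
    static int twice(int r, int n) { return r * 2; }       // body

    static MethodHandle find(String name, Class<?> rtype, Class<?>... ptypes)
            throws ReflectiveOperationException {
        return MethodHandles.lookup().findStatic(
                WhileLoopDemo.class, name, MethodType.methodType(rtype, ptypes));
    }

    static int doubleWhile(int n) {
        try {
            MethodHandle loop = MethodHandles.whileLoop(
                    find("one", int.class, int.class),
                    find("below", boolean.class, int.class, int.class),
                    find("twice", int.class, int.class, int.class));
            return (int) loop.invokeExact(n);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    static int doubleDoWhile(int n) {
        try {
            // Note the argument order: init, body, pred.
            MethodHandle loop = MethodHandles.doWhileLoop(
                    find("one", int.class, int.class),
                    find("twice", int.class, int.class, int.class),
                    find("below", boolean.class, int.class, int.class));
            return (int) loop.invokeExact(n);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

For `n = 5` both variants return 8. For `n = 1` the while loop returns 1 without running the body, whereas the do-while loop runs the body once and returns 2.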

**Counting Loops**

For convenience, the following loop combinators will also be provided:

*   `MethodHandle countedLoop(MethodHandle iterations, MethodHandle init, MethodHandle body)`

    A `MethodHandle` `cl` returned from `countedLoop` has the following
    semantics:

        cl(arg*) =>
        {
            let end = iterations(arg*);
            let r = init(arg*);
            for (int i = 0; i < end; i++) {
                r = body(i, r, arg*);
            }
            return r;
        }

*   `MethodHandle countedLoop(MethodHandle start, MethodHandle end, MethodHandle init, MethodHandle body)`

    A `MethodHandle` `cl` returned from this variant of `countedLoop` has the
    following semantics:

        cl(arg*) =>
        {
            let s = start(arg*);
            let e = end(arg*);
            let r = init(arg*);
            for (int i = s; i < e; i++) {
                r = body(i, r, arg*);
            }
            return r;
        }

In these two cases, the type of the first argument of `body` must be `int`, and
the return types of `init` and `body` as well as the second argument of `body`
must be the same.
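
Both counting variants can be sketched as follows, assuming the `MethodHandles.countedLoop` overloads as shipped in JDK 9 and later. The class `CountedLoopDemo` and its methods are illustrative names. Note that in the shipped API the body receives the running value before the counter (`body(r, i, arg*)`), which differs from the pseudocode above; the body used here is symmetric in those two arguments, so the sketch does not depend on that order.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class CountedLoopDemo {
    // Bodies: running sum and counter, followed by the loop parameters.
    static int add(int sum, int i, int n) { return sum + i; }
    static int add(int sum, int i, int lo, int hi) { return sum + i; }

    // Sums 0 + 1 + ... + (n - 1).
    static int sumBelow(int n) {
        try {
            MethodHandle body = MethodHandles.lookup().findStatic(
                    CountedLoopDemo.class, "add",
                    MethodType.methodType(int.class, int.class, int.class, int.class));
            MethodHandle iterations = MethodHandles.identity(int.class); // iterations(n) = n
            MethodHandle init = MethodHandles.constant(int.class, 0);    // sum starts at 0
            MethodHandle loop = MethodHandles.countedLoop(iterations, init, body);
            return (int) loop.invokeExact(n);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Sums lo + (lo + 1) + ... + (hi - 1).
    static int sumRange(int lo, int hi) {
        try {
            MethodHandle body = MethodHandles.lookup().findStatic(
                    CountedLoopDemo.class, "add",
                    MethodType.methodType(int.class, int.class, int.class,
                            int.class, int.class));
            MethodHandle id = MethodHandles.identity(int.class);
            MethodHandle start = MethodHandles.dropArguments(id, 1, int.class); // (lo, hi) -> lo
            MethodHandle end = MethodHandles.dropArguments(id, 0, int.class);   // (lo, hi) -> hi
            MethodHandle init = MethodHandles.constant(int.class, 0);
            MethodHandle loop = MethodHandles.countedLoop(start, end, init, body);
            return (int) loop.invokeExact(lo, hi);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

With these definitions, `sumBelow(5)` yields 10 and `sumRange(3, 6)` yields 12.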

**Iteration Over Data Structures**

Furthermore, a loop combinator for iteration is helpful:

*   `MethodHandle iteratedLoop(MethodHandle iterator, MethodHandle init, MethodHandle body)`

    A `MethodHandle` `il` returned from `iteratedLoop` has the following
    semantics:

        il(arg*) =>
        {
            let it = iterator(arg*);
            let v = init(arg*);
            for (T t : it) {
                v = body(t, v, arg*);
            }
            return v;
        }
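
A minimal sketch of iteration over a `List`, assuming `MethodHandles.iteratedLoop` as shipped in JDK 9 and later. The names `IteratedLoopDemo`, `addLength`, and `totalLength` are hypothetical. Two details of the shipped API are assumed here: the body receives the running value before the current element (`body(v, t, arg*)`, unlike the pseudocode above), and a `null` iterator handle defaults to calling `Iterable.iterator()` on the first loop argument.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.List;

class IteratedLoopDemo {
    // body(total, element, list): adds the element's length to the total.
    static int addLength(int total, String s, List<String> all) {
        return total + s.length();
    }

    static int totalLength(List<String> words) {
        try {
            MethodHandle body = MethodHandles.lookup().findStatic(
                    IteratedLoopDemo.class, "addLength",
                    MethodType.methodType(int.class, int.class, String.class, List.class));
            MethodHandle init = MethodHandles.constant(int.class, 0); // total starts at 0
            // null iterator: assumed to default to Iterable::iterator.
            MethodHandle loop = MethodHandles.iteratedLoop(null, init, body);
            return (int) loop.invokeExact(words);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

For example, `totalLength(List.of("ab", "cde"))` yields 5, and an empty list yields the initial value 0.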

**Remarks**

More convenience loop combinators are conceivable.

While the semantics of `continue` can easily be emulated by returning from the
body, it is an open question how the semantics of `break` can be emulated. This
could be achieved by using a dedicated exception (e.g.,
`LoopMethodHandle.BreakException`).

### Combinator for `try`/`finally` Blocks

To facilitate the construction of functionality with try/finally semantics from
`MethodHandle`s, the following new combinator will be introduced to
`MethodHandles`:

`MethodHandle tryFinally(MethodHandle target, MethodHandle cleanup)`

The semantics of invoking a `MethodHandle` `tf` returned from `tryFinally` are
as follows:

    tf(arg*) =>
    {
        Throwable t;
        Object r;
        try {
            r = target(arg*);
        } catch (Throwable x) {
            t = x;
            throw x;
        } finally {
            r = cleanup(t, r, arg*);
        }
        return r;
    }

That is, the return type of the resulting `MethodHandle` will be that of the
`target` handle. The `target` and `cleanup` handles must have matching argument
lists, except that `cleanup` additionally accepts two leading arguments: a
`Throwable` and the (possibly intermediate) result. If an exception was thrown
during the execution of `target`, the `Throwable` argument will hold that
exception.
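
The following sketch wraps `Integer.parseInt` in a cleanup handle that records whether parsing succeeded. It assumes `MethodHandles.tryFinally` as shipped in JDK 9 and later; `TryFinallyDemo`, `LOG`, and the method names are illustrative only.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

class TryFinallyDemo {
    static final List<String> LOG = new ArrayList<>();

    // cleanup(t, r, arg*): t is null if the target completed normally.
    static int cleanup(Throwable t, int result, String input) {
        LOG.add(input + (t == null ? " ok" : " failed"));
        return result;
    }

    static MethodHandle guardedParse() throws ReflectiveOperationException {
        MethodHandles.Lookup l = MethodHandles.lookup();
        MethodHandle target = l.findStatic(Integer.class, "parseInt",
                MethodType.methodType(int.class, String.class));
        MethodHandle cleanup = l.findStatic(TryFinallyDemo.class, "cleanup",
                MethodType.methodType(int.class, Throwable.class, int.class, String.class));
        return MethodHandles.tryFinally(target, cleanup);
    }

    static int parse(String s) {
        try {
            return (int) guardedParse().invokeExact(s);
        } catch (RuntimeException e) {
            throw e; // e.g. NumberFormatException from the target
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Calling `parse("42")` returns 42 and logs `42 ok`. Calling `parse("x")` logs `x failed` and then propagates the `NumberFormatException`, as in the `catch`/`throw` path of the pseudocode.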

### Combinators for Argument Handling

As additions to the existing API in `MethodHandle` and `MethodHandles`, the
following methods will be introduced:

*   Addition to the class `MethodHandle` - new instance method:

        MethodHandle asSpreader(int pos, Class<?> arrayType, int arrayLength)

    In the signature of `this`, at position `pos`, expect `arrayLength`
    arguments of the component type of `arrayType`. In the signature of the
    result, these arguments are replaced by a single array argument of type
    `arrayType` at position `pos`; upon invocation, the array's elements are
    spread to the corresponding arguments of `this` `MethodHandle`. If the
    signature of `this` does not have enough arguments at that position, or if
    the position does not exist in the signature, raise an appropriate
    exception. It is expected that the arguments to be spread are available in
    the array at run-time; in case they are not, an
    `ArrayIndexOutOfBoundsException` is thrown.

    For example, if the signature of `this` is
    `(Ljava/lang/String;IIILjava/lang/Object;)V`, calling
    `asSpreader(1, int[].class, 3)` will lead to the resulting signature
    `(Ljava/lang/String;[ILjava/lang/Object;)V`.

*   Addition to the class `MethodHandle` - new instance method:

        MethodHandle asCollector(int pos, Class<?> arrayType, int arrayLength)

    In the signature of `this`, at position `pos`, expect an array argument. In
    the signature of the result, at position `pos`, there will be `arrayLength`
    arguments of the component type of that array. All arguments before `pos`
    are not affected. All arguments after `pos` are shifted to the right by
    `arrayLength - 1`.

    For example, if the signature of `this` is
    `(Ljava/lang/String;[ILjava/lang/Object;)V`, calling
    `asCollector(1, int[].class, 3)` will lead to the resulting signature
    `(Ljava/lang/String;IIILjava/lang/Object;)V`.

*   Addition to the class `MethodHandles` - new static method:

        MethodHandle foldArguments(MethodHandle target, int pos, MethodHandle combiner)

    The resulting `MethodHandle` will, when invoked, act like the existing
    method `foldArguments(MethodHandle target, MethodHandle combiner)` with the
    difference that the already existing method implies a folding position of
    `0`, while the proposed new method allows for specifying a folding position
    other than `0`.

    For example, if the `target` signature is `(ZLjava/lang/String;ZI)I`, and
    the `combiner` signature is `(ZI)Ljava/lang/String;`, calling
    `foldArguments(target, 1, combiner)` will lead to the resulting signature
    `(ZZI)I`, and the second and third (`boolean` and `int`) arguments will be
    folded into a `String` upon each invocation.
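
The positional `asSpreader` and `asCollector` variants can be sketched as a round trip. This assumes the `MethodHandle.asSpreader(int, Class, int)` and `MethodHandle.asCollector(int, Class, int)` methods as shipped in JDK 9 and later; the class `SpreadCollectDemo` and the target method `show` are hypothetical, and `show` returns a `String` rather than `void` only to make the result observable.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class SpreadCollectDemo {
    static String show(String label, int a, int b, int c, Object tail) {
        return label + ":" + (a + b + c) + ":" + tail;
    }

    static MethodHandle target() throws ReflectiveOperationException {
        return MethodHandles.lookup().findStatic(SpreadCollectDemo.class, "show",
                MethodType.methodType(String.class, String.class,
                        int.class, int.class, int.class, Object.class));
    }

    // Three ints at position 1 are supplied as one int[] argument.
    static String viaSpreader(String label, int[] values, Object tail) {
        try {
            MethodHandle spreader = target().asSpreader(1, int[].class, 3);
            return (String) spreader.invokeExact(label, values, tail);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Collecting the array again restores the original signature.
    static String viaRoundTrip(String label, int a, int b, int c, Object tail) {
        try {
            MethodHandle roundTrip = target()
                    .asSpreader(1, int[].class, 3)
                    .asCollector(1, int[].class, 3);
            return (String) roundTrip.invokeExact(label, a, b, c, tail);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    static String spreaderDescriptor() {
        try {
            return target().asSpreader(1, int[].class, 3)
                    .type().toMethodDescriptorString();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Both `viaSpreader("sum", new int[] {1, 2, 3}, "end")` and `viaRoundTrip("sum", 1, 2, 3, "end")` produce `sum:6:end`; the arguments before and after the array position are passed through unchanged.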

These new combinators will be implemented using existing abstractions and API.
If required, non-public API will be modified.
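
The `foldArguments` example from the list above can be sketched in code, assuming the three-argument `MethodHandles.foldArguments(MethodHandle, int, MethodHandle)` as shipped in JDK 9 and later. The class `FoldAtPositionDemo` and the bodies of `target` and `combiner` are invented for illustration; only their signatures follow the example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class FoldAtPositionDemo {
    // Signature (ZLjava/lang/String;ZI)I
    static int target(boolean scale, String s, boolean b, int i) {
        return (scale ? 10 : 1) * s.length();
    }

    // Signature (ZI)Ljava/lang/String;
    static String combiner(boolean b, int i) {
        return b + ":" + i;
    }

    static MethodHandle folded() throws ReflectiveOperationException {
        MethodHandles.Lookup l = MethodHandles.lookup();
        MethodHandle target = l.findStatic(FoldAtPositionDemo.class, "target",
                MethodType.methodType(int.class, boolean.class, String.class,
                        boolean.class, int.class));
        MethodHandle combiner = l.findStatic(FoldAtPositionDemo.class, "combiner",
                MethodType.methodType(String.class, boolean.class, int.class));
        // The String at position 1 is computed from the two arguments after it.
        return MethodHandles.foldArguments(target, 1, combiner);
    }

    static int call(boolean scale, boolean b, int i) {
        try {
            return (int) folded().invokeExact(scale, b, i);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    static String descriptor() {
        try {
            return folded().type().toMethodDescriptorString();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Here `descriptor()` returns `(ZZI)I`, and `call(true, false, 42)` passes the combiner result `"false:42"` as the `String` argument, giving 80.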

### Lookups

The implementation of the method `MethodHandles.Lookup.findSpecial(Class<?>
refc, String name, MethodType type, Class<?> specialCaller)` will be modified to
allow for finding `super`-callable methods on interfaces. While this is not a
change of the API as such, its documented behaviour changes significantly.
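
A minimal sketch of such a lookup, assuming the `findSpecial` behaviour of JDK 9 and later. To keep the access conditions simple, the lookup is performed inside the implementing class itself, so that the lookup class and `specialCaller` coincide; the type names `Greeter`, `Shouter`, and `Both` are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

interface Greeter {
    default String greet() { return "Greeter.greet"; }
}

interface Shouter {
    default String greet() { return "Shouter.greet"; }
}

class Both implements Greeter, Shouter {
    @Override
    public String greet() { return "Both.greet"; }

    // Invokes Greeter's default implementation, bypassing the override,
    // much like the expression Greeter.super.greet() would.
    String greetViaGreeter() {
        try {
            MethodHandle mh = MethodHandles.lookup().findSpecial(
                    Greeter.class, "greet",
                    MethodType.methodType(String.class), Both.class);
            return (String) mh.invokeExact(this);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

With these definitions, `new Both().greet()` returns `Both.greet`, while `new Both().greetViaGreeter()` returns `Greeter.greet`.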

Also, the `MethodHandles.Lookup` class will be extended with the following two
methods:

*   `Class<?> findClass(String targetName)`

    This retrieves an instance of `Class<?>` representing the desired target
    class identified by the `targetName`. The lookup applies the restrictions
    defined by the implicit access context. In case the access is not possible,
    the method raises an appropriate exception.

*   `Class<?> accessClass(Class<?> targetClass)`

    This attempts to access the given class, applying the restrictions defined
    by the implicit access context. In case the access is not possible, the
    method raises an appropriate exception.
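
Both methods can be sketched as follows, assuming `Lookup.findClass` and `Lookup.accessClass` as shipped in JDK 9 and later, where an inaccessible class is reported with an `IllegalAccessException`. The class `ClassLookupDemo`, its nested class `Hidden`, and the helper methods are illustrative names.

```java
import java.lang.invoke.MethodHandles;

class ClassLookupDemo {
    // Not public, so not accessible from a public lookup context.
    static class Hidden { }

    // Resolves a class by name in the context of this class's lookup.
    static Class<?> find(String name) {
        try {
            return MethodHandles.lookup().findClass(name);
        } catch (ClassNotFoundException | IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    // Checks access using only the minimal trust of publicLookup().
    static boolean publiclyAccessible(Class<?> c) {
        try {
            MethodHandles.publicLookup().accessClass(c);
            return true;
        } catch (IllegalAccessException e) {
            return false;
        }
    }

    // Checks access using this class's own full-strength lookup.
    static boolean accessibleFromHere(Class<?> c) {
        try {
            MethodHandles.lookup().accessClass(c);
            return true;
        } catch (IllegalAccessException e) {
            return false;
        }
    }
}
```

The same class can thus be accessible from one lookup context and not from another: `Hidden` passes `accessibleFromHere` but fails `publiclyAccessible`, while `String` passes both.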

Risks and Assumptions
---------------------

As this is a *purely additive API extension*, existing clients of the
`MethodHandle` API will not be negatively affected. The proposed extensions
also do not rely on any other ongoing development.

Unit tests for all of the above API extensions will be provided.

Dependences
-----------

This JEP is related to
[JEP 193 (Variable Handles)](http://openjdk.java.net/jeps/193),
and a certain amount of overlap is possible since
`VarHandle`s depend on the `MethodHandle` API. This will be addressed in
collaboration with the owner of JEP 193.

The [JBS issue on JSR 292 enhancements for maintenance releases](https://bugs.openjdk.java.net/browse/JDK-8075779)
can be considered a starting point for this JEP, which distills from that
issue those points upon which agreement has been reached.

Comments
[~alanb] asked: "I assume the proposed findClass(String) will use the loader of the caller as the initiating loader, is that right? I could imagine this needing variants like findClass(ClassLoader, String) or the proposed findClass(Module, String) to be widely useful."

The short answer is "no". The long answer follows...

The Lookup object itself contains a securely bound class (Lookup.lookupClass), which is used not only for access checking but also for scoping of names. Therefore, Lookup.findClass(String) will derive the initiating loader from the lookup class. If the Lookup object has private access (is full-strength) then there is no stack walking and no security manager call, beyond the natural actions performed by resolution of a constant pool reference to a CONSTANT_Class. This is a corollary of the basic design principle, that Lookup operations are competent to emulate any bytecode behavior.

If the Lookup object does not have private access (is a weak lookup), then there may be a security manager check, to see if the class loader for the lookup-class can (in fact) be accessed. This provides a way to get at the functionality of ternary Class.forName, if you have a Class object already in ClassLoader from which you are trying to initiate a load.

From another perspective, Lookup.findClass on a full-strength lookup has exactly the same power as Class.forName (the unary version). The difference between the two calls is that Class.forName takes its caller class as a fixed parameter (via the CallerSensitive convention), whereas Lookup.findClass takes the corresponding parameter from the lookup class of the Lookup object. Both designs allow securable, authenticated lookups, but only the newer API supports delegation, and presents the scope parameter as an explicit value (instead of an indirect stack walk). The newer API is both more powerful and easier to reason about.

N.B. It would be a grave error to introduce Lookup API points which are CallerSensitive. The only CallerSensitive API point related to Lookup is MethodHandles.lookup(), which is a factory that converts the current caller into a securely bound Lookup object, which can then be delegated to any trusted party (such as a bootstrap method or stack walker).
09-11-2015

Thanks for the clarification, I think it's clear now.
09-11-2015

[ I see lookupClass(Class<?>) has been renamed to findClass(String) in the latest revision so updating a comment on that ] I assume the proposed findClass(String) will use the loader of the caller as the initiating loader, is that right? I could imagine this needing variants like findClass(ClassLoader, String) or the proposed findClass(Module, String) to be widely useful. In any case, it would be good for the lookup to have at least a prototype user, ServiceLoader is one possible candidate, it is currently using Class.findClass(Module, String) in the jake forest.
17-09-2015