JDK-8265981 : Compiler implementation for Pattern Matching for switch (Preview)
  • Type: CSR
  • Component: tools
  • Sub-Component: javac
  • Priority: P4
  • Status: Closed
  • Resolution: Approved
  • Fix Versions: 17
  • Submitted: 2021-04-26
  • Updated: 2021-10-26
  • Resolved: 2021-06-05
Related Reports
CSR :  
Relates :  
Description
Summary
-------

Enhance the Java language with pattern matching for `switch` statements and `switch` expressions. 
This allows patterns to appear in `case` labels, and uses pattern matching to determine which 
`case` label matches the selector expression. A new kind of `case` label, `case null`, is introduced,
such that a `switch` with `case null` does not automatically throw `NullPointerException` when the
selector expression is `null`. The meaning of existing `switch` statements and `switch` expressions, 
without `case null`, is unchanged. Pattern matching for `switch` complements pattern matching for 
`instanceof`, as finalized in Java 16.


Problem
-------

We often want to compare an variable against multiple alternatives. Java
supports multi-way comparisons with `switch` statements and, since Java 14,
`switch` expressions, but unfortunately `switch` is very limited. You can only
switch on values of a few types — numeric types, enum types, and `String` — and
you can only test for exact equality against constants. We might like to use
patterns to test the same variable against a number of possibilities, taking a
specific action on each, but since the existing `switch` does not support that,
we end up with a chain of `if`...`else` tests such as:

```
static String formatter(Object o) {
    String formatted = "unknown";
    if (o instanceof Integer i) {
        formatted = String.format("int %d", i);
    } else if (o instanceof Long l) {
        formatted = String.format("long %d", l);
    } else if (o instanceof Double d) {
        formatted = String.format("double %f", d);
    } else if (o instanceof String s) {
        formatted = String.format("String %s", s);
    }
    return formatted;
}
```

This code benefits from pattern matching for `instanceof` expressions, but it is far from
perfect. First and foremost, this approach allows coding errors to remain hidden
because we have used an overly general control construct. The intent is to
assign something to `formatted` in each arm of the `if`...`else` chain, but there is
nothing that enables the compiler to identify and verify this invariant. If some
block — perhaps one that is executed rarely — does not assign to `formatted`, we
have a bug. In addition, the above code is not optimizable; absent
compiler heroics it will have _O(n)_ time complexity, even though the underlying
problem is often _O(1)_. 

If we extend `switch` statements and `switch` expressions to work on any type,
and allow `case` labels with patterns rather than just constants, then we could
rewrite the above code more clearly and reliably:

```
static String formatterPatternSwitch(Object o) {
    return switch (o) {
        case Integer i -> String.format("int %d", i);
        case Long l    -> String.format("long %d", l);
        case Double d  -> String.format("double %f", d);
        case String s  -> String.format("String %s", s);
        default        -> o.toString();
    };
}
```

The intent of this code is clearer because we are using the right control
construct: we are saying, "the parameter `o` matches at most one of the following
conditions, so figure it out and evaluate the corresponding arm." As a bonus, this code 
is optimizable; we are likely to be able to perform the dispatch in _O(1)_ time. 


Solution
--------

The Java language is enhanced as follows, in both `switch` statements and `switch` expressions:

-   Allow patterns as `case` labels for switch blocks.

-   Allow `null` as a `case` label for switch blocks.

-   Introduce a new semantics for `switch` that uses pattern matching to select a matching
    switch label when the switch label uses a pattern.

-   The semantics of `switch` become "`null`-sensitive": if the value of
    the selector expression is `null` and the switch block contains a `null`
    label, then this label is said to match. If the value of the selector
    expression is `null` and the switch block does not contain the `null` label,
    then `NullPointerException` is thrown as usual. This is a conservative
    extension to the semantics of `switch`: all existing `switch` statements and
    `switch` expressions will continue to execute as they do in Java 16.

-   Allow the selector expression of a `switch` statement or
    `switch` expression to be of any type, not just the traditional set of
    `char`, `byte`, `short`, `int`, `Character`, `Byte`, `Short`, `Integer`,
    `String`, or enum type. This change is conservative in that an existing
    `switch` statement or `switch` expression that uses `case` labels with
    constants will continue to require that the type of the selector
    expression be one of `char`, `byte`, `short`, `int`, `Character`, `Byte`,
    `Short`, `Integer`, `String`, or an enum type.  

-   Introduce the notion of dominance for switch labels, to prevent switch
    blocks containing switch labels that are "shadowed" by other labels
    appearing earlier in the block. For example:

    ```
    switch (x) {
        case Object o -> ...
        case String s -> ...
    }
    ```
    This switch block is erroneous because any value that matches `String s` 
    will always match `Object o` -- we say the switch label `case Object o` 
    dominates the switch label `case String s`. Dominance is determined
    with regards to the order that switch labels appear in the switch block.
    This means that the following `switch` statement is permitted:

    ```
    switch (x) {
        case String s -> ... 
        case Object o -> ...
    }
    ```
    Here, a string value for `x` will match the first switch label `case String s`, and any other non-`null`
    value will match the second switch label `case Object o`.

-   Enhance the existing notion of completeness for `switch` expressions
    to deal with pattern labels by employing a new concept of pattern totality.
    `switch` expressions continue to be restricted so it is not possible
    for no switch label to match at runtime.

-   Introduce a new kind of pattern, guarded patterns, as a means of refining a pattern
    with a boolean expression. For example, a value matches the guarded pattern
    `List l && l.length()>2` if it first matches the type pattern `List l`, and
    (only if it matches) it additionally contains at least two elements. 

-   Introduce a new kind of pattern, parenthesized patterns, to help address some parsing
    issues with guarded patterns containing nested && expressions


The standard libraries will be enhanced with `java.lang.runtime.SwitchBootstrap` (a preview class) with a bootstrap method (`typeSwitch), that will
categorize switch selector values to case numbers.

`javac` will be enhanced to parse the enhanced Java language, and its Trees API
will be enhanced to cover it.


Specification
-------------

Update on June 4: the specification for the bootstrap method has been changed as per review feedback. The updated full specdiff is attached as specdiff.01-4.zip, diff from the previously approved version is attached as specdiff.00-01-4.zip. These are also available for convenience here, respectivelly:

 * http://cr.openjdk.java.net/~jlahoda/8262891/specdiff.01/overview-summary.html
 * http://cr.openjdk.java.net/~jlahoda/8262891/specdiff.00-01/overview-summary.html

Original text:
The current draft of the specification is attached here as jep406-20210519-2.zip.


The specdiff of the proposed API changes is attached as specdiff-20210519.zip and is also available here for convenience:
http://cr.openjdk.java.net/~jlahoda/8262891/specdiff.00/overview-summary.html

Comments
Re-approving updated request.
05-06-2021

Moving to Approved. The TreeScanner and SimpleTreeVisitor should use implSpec tags to describe the operational behavior of the new methods; it doesn't look like they are.
26-05-2021

The updated specdiff should have an improved javadoc for SwitchBootstrap.typeSwitch, more precisely saying what is returned and how it works.
19-05-2021

[~darcy] Following discussions on the EG list, two new design changes have been made to this feature. Attached is the latest spec, and all a diff so you can see what has changed since the previous draft of the spec. See my email to the EG for more details: https://mail.openjdk.java.net/pipermail/amber-spec-experts/2021-May/002995.html
17-05-2021

[~gbierman], in other cases where properties that were true at compile time are not true at runtime, some kind of runtime exception occurs. It is true the same hazard exists with instanceof chains, but we often try to do stronger checking for platform features.
12-05-2021

[~darcy] I'm not sure that I follow you. Dominance is a compile-time check about the well-formedness of switch blocks with patterns appearing as switch labels. I don't see the issue regarding binary compatibility. More specifically, why is it any different from say a program that uses instanceof? Sure we can compile, then change the hierarchy, and get a confusing answer if we forget to recompile. What's new about pattern switch?
12-05-2021

Please review the javadoc of SwitchBootstraps.typeSwitch. Is the @return information describing the CallSite or what the CallSite would do? If dominance is based on the class hierarchy, then shouldn't there be a binary compatibility update if classes are moved up and down the inheritance chain? Moving to Provisional.
10-05-2021