Summary
-------

Enhance the Java language with _unnamed patterns_, which match a record component without stating the component's name or type, and with _unnamed variables_, which can be initialized but not used. Both are denoted with an underscore: `_`.

Problem
-------

Java developers use [record patterns](https://openjdk.org/jeps/432) to disaggregate a record instance into its components. In the following code, one part of a program creates a `ColoredPoint` instance, while another part of the program uses pattern matching with `instanceof` to test whether a variable is a `ColoredPoint` and, if so, to extract its two components:

```
record Point(int x, int y) {}
enum Color { RED, GREEN, BLUE }
record ColoredPoint(Point p, Color c) {}

... new ColoredPoint(new Point(3,4), Color.GREEN) ...

if (r instanceof ColoredPoint(Point p, Color c)) {
    ... p.x() ... p.y() ...
}
```

The code above needs only `p` in the `if` block, not `c`; however, today developers have to spell out all the components of a record class every time they perform pattern matching. Furthermore, it is not visually clear that the `Color` component is irrelevant. This is especially evident when record patterns are _nested_ to extract data within components, such as:

```
if (r instanceof ColoredPoint(Point(int x, int y), Color c)) {
    ... x ... y ...
}
```

As a result, omitting unnecessary components such as `Color c` in both of the previous examples would make the code clearer.

On other occasions, developers may not need to initialize any pattern variables during pattern matching, but they still need to explore the shape of the structure at run time.
As a highly simplified example, consider the following `Box` and `Ball` classes, and a `switch` that explores the content of a `Box`:

```
record Box<T extends Ball>(T content) {}

sealed abstract class Ball permits RedBall, BlueBall, GreenBall {}
final class RedBall extends Ball {}
final class BlueBall extends Ball {}
final class GreenBall extends Ball {}

Box<? extends Ball> b = ...
switch (b) {
    case Box(RedBall red) -> processBox(b);
    case Box(BlueBall blue) -> processBox(b);
    case Box(GreenBall green) -> stopProcessing();
}
```

Since the variables are unused, it would be ideal if the developer could elide their names while keeping the explicit type for shape analysis. Furthermore, if the `switch` were hypothetically refactored to group the first two patterns in one `case` (something that is not allowed in Pattern Matching for Switch):

```
case Box(RedBall red), Box(BlueBall blue) -> processBox(b);
```

then it would be erroneous to name the components: neither of the names is usable on the right-hand side because either of the patterns on the left-hand side could have matched. Since the names are unusable, it would be ideal to elide them.

Turning to traditional imperative code, most developers will have encountered the situation of having to declare a variable that they did not intend to use. This typically occurs when the side effect of a statement is more important than its result. For example, the following code uses an enhanced-`for` statement to step through a collection, calculating `total` as a side effect, without using the loop variable `order`:

```
int total = 0;
for (Order order : orders) {
    if (total < LIMIT) {
        ... total++ ...
    }
}
```

The prominence of `order`'s declaration is unfortunate given that `order` is not used.

Here is another example where the side effect of an expression is more important than its result, leading to an unused variable.
The following code dequeues data but only needs two out of every three elements:

```
Queue<Integer> q = ... // x1, y1, z1, x2, y2, z2 ...
while (q.size() >= 3) {
    int x = q.remove();
    int y = q.remove();
    int z = q.remove(); // z is unused
    ... new Point(x, y) ...
}
```

The third call to `remove()` has the desired side effect -- dequeuing an element -- regardless of whether its result is assigned to a variable, so the declaration of `z` could be elided, while still satisfying the desire to show that `remove()` does return a value.

Unused variables occur frequently in two other kinds of statement that focus on side effects:

- The `try`-with-resources statement is always used for its side effect: the automatic closing of resources. For example, the following code acquires and (automatically) releases a context; the name `acquiredContext` is merely clutter:

    ```
    try (var acquiredContext = ScopedContext.acquire()) {
        ... acquiredContext not used ...
    }
    ```

- Exceptions are the ultimate side effect, and handling one often gives rise to an unused variable. For example, most Java developers will have written `catch` blocks as shown below, where the name of the exception parameter is irrelevant:

    ```
    String s = ...;
    try {
        int i = Integer.parseInt(s);
        ... i ...
    } catch (NumberFormatException ex) {
        System.out.println("Bad number: " + s);
    }
    ```

Even code without side effects is sometimes forced to declare unused variables. For example, the following code generates a map where each key is mapped to the same placeholder value; since the lambda parameter `v` is not used, its name is irrelevant:

```
...stream.collect(Collectors.toMap(String::toUpperCase, v -> "NODATA"));
```

In all these scenarios where variables are unused and their names are irrelevant, it would be ideal if developers could declare variables with no name.

Solution
--------

The Java language is enhanced as follows:

- Allow the underscore `_` to denote an _unnamed pattern_ in place of a whole type pattern or record pattern.
- Allow the underscore `_` to denote an _unnamed pattern variable_ in a type pattern.
- Allow the underscore `_` to denote an _unnamed variable_ when a local variable in a local variable declaration statement, an exception parameter in a `catch` clause, or a lambda parameter in a lambda expression is unused.

    The following kinds of declaration can introduce either a named variable (denoted by an identifier) or an unnamed variable (denoted by an underscore):

    - a local variable declaration statement in a block (JLS 14.4.2)
    - a resource specification of a `try`-with-resources statement (JLS 14.20.3)
    - the header of a basic `for` statement (JLS 14.14.1)
    - the header of an enhanced `for` loop (JLS 14.14.2)
    - an exception parameter of a `catch` block (JLS 14.20)
    - a formal parameter of a lambda expression (JLS 15.27.1)

- Allow unnamed pattern variables in a `switch` that needs to execute the same action for multiple cases. The grammar of switch labels is enhanced to allow multiple patterns. Such a label is semantically correct only when all of its patterns use unnamed pattern variables and no binding variables are introduced.
- Neither the unnamed pattern nor `var _` may be used at the top level of a pattern: both `... instanceof _` and `... instanceof var _` are prohibited, as are `case _` and `case var _`.
- For a `try`-with-resources (TWR) resource declared as `_`, the linter needs to mute the warning about the resource not being referenced. That warning is no longer applicable to unnamed variables.
- Update `javax.lang.model` for unnamed variables. This is tracked in a separate CSR: [8307577: Implementation for javax.lang.model for unnamed variables (Preview)](https://bugs.openjdk.org/browse/JDK-8307577).

Specification
-------------

The updated JLS draft for unnamed patterns and variables is attached as jep443-20230322.zip. It is also available at https://cr.openjdk.org/~abimpoudis/unnamed/jep443-20230322/specs/unnamed-jls.html

The proposed API enhancements are attached as specdiff.preliminary.00.zip.
These mostly reflect the introduction of a new tree kind to support an `AnyPatternTree`. Changes in `javax.lang.model` are included in [8307577: Implementation for javax.lang.model for unnamed variables (Preview)](https://bugs.openjdk.org/browse/JDK-8307577).

The changes to the specification and API are subject to change until the CSR is finalized.
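
As an illustration only, the enhancements listed under Solution might be applied to the examples from the Problem section roughly as sketched below. This is not taken from the specification draft. It assumes a JDK 21 compiler run with `--enable-preview` (or a later release in which the feature is available). The class name `UnnamedSketch`, the method names, and the use of `String` in place of the `Order` type are hypothetical choices made to keep the sketch self-contained.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.stream.Collectors;

// Hypothetical demo class; all method names below are illustrative assumptions.
class UnnamedSketch {
    record Point(int x, int y) {}
    enum Color { RED, GREEN, BLUE }
    record ColoredPoint(Point p, Color c) {}

    record Box<T extends Ball>(T content) {}
    sealed abstract static class Ball permits RedBall, BlueBall, GreenBall {}
    static final class RedBall extends Ball {}
    static final class BlueBall extends Ball {}
    static final class GreenBall extends Ball {}

    // Unnamed pattern: the Color component is matched without a name or type.
    static int sumOfCoordinates(Object r) {
        if (r instanceof ColoredPoint(Point(int x, int y), _)) {
            return x + y;
        }
        return 0;
    }

    // Unnamed pattern variables allow two patterns to share one case label,
    // because no binding variables are introduced.
    static String process(Box<? extends Ball> b) {
        return switch (b) {
            case Box(RedBall _), Box(BlueBall _) -> "process";
            case Box(GreenBall _) -> "stop";
        };
    }

    // Unnamed variable in an enhanced for header: only the side effect matters.
    static int countUpTo(List<String> orders, int limit) {
        int total = 0;
        for (String _ : orders) {
            if (total < limit) {
                total++;
            }
        }
        return total;
    }

    // Unnamed local variable: every third element is dequeued and discarded.
    static int sumOfPairs(Queue<Integer> q) {
        int sum = 0;
        while (q.size() >= 3) {
            int x = q.remove();
            int y = q.remove();
            int _ = q.remove();
            Point p = new Point(x, y);
            sum += p.x() + p.y();
        }
        return sum;
    }

    // Unnamed exception parameter in a catch clause.
    static int parseOrDefault(String s, int fallback) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException _) {
            return fallback;
        }
    }

    // Unnamed lambda parameter: every key maps to the same placeholder value.
    static Map<String, String> placeholders(List<String> keys) {
        return keys.stream().collect(Collectors.toMap(String::toUpperCase, _ -> "NODATA"));
    }

    public static void main(String[] args) {
        System.out.println(sumOfCoordinates(new ColoredPoint(new Point(3, 4), Color.GREEN))); // 7
        System.out.println(process(new Box<>(new RedBall())));                                // process
        System.out.println(countUpTo(List.of("a", "b", "c"), 2));                             // 2
        System.out.println(sumOfPairs(new ArrayDeque<>(List.of(1, 2, 3, 4, 5, 6))));          // 12
        System.out.println(parseOrDefault("abc", -1));                                        // -1
        System.out.println(placeholders(List.of("a")));                                       // {A=NODATA}
    }
}
```

In each method the underscore replaces a name that the original example declared but never used, while the explicit types (`RedBall`, `String`, `int`, `NumberFormatException`) remain visible where they carry information.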