JDK-8304487 : Compiler Implementation for Primitive types in patterns, instanceof, and switch (Preview)
  • Type: CSR
  • Component: tools
  • Sub-Component: javac
  • Priority: P4
  • Status: Closed
  • Resolution: Approved
  • Fix Versions: 23
  • Submitted: 2023-03-20
  • Updated: 2024-02-09
  • Resolved: 2024-01-22
Description
Summary
-------

Enhance pattern matching by allowing primitive type patterns to be used in all
pattern contexts, align the semantics of primitive type patterns with
instanceof, and extend switch to allow primitive constants as case labels. This
is a preview language feature.

Problem
-------

### Nested primitive type patterns are limited

Disaggregation with type patterns that involve primitive types is currently
invariant: a nested primitive type pattern must name exactly the type of the
corresponding record component. For example, assume a JSON model can be encoded
with a _sealed_ hierarchy according to its specification as follows:

    sealed interface JsonValue { 
      record JsonString(String s) implements JsonValue { }
      record JsonNumber(double d) implements JsonValue { } 
      record JsonNull() implements JsonValue { }
      record JsonBoolean(boolean b) implements JsonValue { }
      record JsonArray(List<JsonValue> values) implements JsonValue { }
      record JsonObject(Map<String, JsonValue> pairs) implements JsonValue { }
    }

With respect to numbers, JSON does not distinguish integers from non-integers, so
in `JsonNumber` we represent all numbers with `double` values, as
[recommended by the specification](https://www.rfc-editor.org/rfc/rfc8259#section-6).

Given a JSON payload of

    { "name" : "John", "age" : 30 }

we can construct a corresponding `JsonValue` via

    var json = new JsonObject(Map.of("name", new JsonString("John"),
                                     "age", new JsonNumber(30)));

For each key in the map, this code instantiates an appropriate record for the
corresponding value. For the first, the value `"John"` has the same type as the
record's component, namely `String`. For the second, however, the Java compiler
applies a widening primitive conversion to convert the `int` value, 30, to a
`double`. 

What we would really like to do is use `int` directly in the `JsonNumber`
pattern such that the pattern matches only when the `double` value inside the
`JsonNumber` object can be converted to an `int` without loss of information,
and when it does match it automatically narrows the `double` value to an `int`:

    if (json instanceof JsonObject(var map)
        && map.get("name") instanceof JsonString(String name)
        && map.get("age") instanceof JsonNumber(int age))
    {
        return new Customer(name, age);
    }
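
The test that a pattern such as `JsonNumber(int age)` implies can be sketched in ordinary Java, without preview features. This is only an illustration: the class and method names are hypothetical, and the choice to treat negative zero, NaN, and infinities as inexact is taken from the review feedback recorded in the comments on this CSR, not from the JDK implementation itself.

```java
// Hypothetical helper illustrating a double -> int exactness test.
// It is a sketch, not the implementation proposed by this CSR.
public class DoubleToIntExactness {

    // True when d can be narrowed to int and widened back without losing information.
    static boolean isDoubleToIntExact(double d) {
        int i = (int) d; // NaN becomes 0, out-of-range values saturate
        // Double.compare distinguishes -0.0 from 0.0 and never treats NaN as
        // equal to an ordinary number, so those inputs are rejected here.
        return Double.compare((double) i, d) == 0;
    }

    public static void main(String[] args) {
        System.out.println(isDoubleToIntExact(30.0));
        System.out.println(isDoubleToIntExact(30.5));
        System.out.println(isDoubleToIntExact(-0.0));
        System.out.println(isDoubleToIntExact(Double.NaN));
    }
}
```

Under this sketch, `30.0` is accepted, while `30.5`, `-0.0`, `NaN`, and values outside the `int` range are rejected.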

### Primitive type patterns are not permitted in top-level context of `instanceof`

Primitive type patterns cannot be used in top-level contexts of `instanceof` at
all; only type patterns of reference types are allowed in top-level contexts.

Since the pattern-matching `instanceof` operator safeguards cast conversions
between reference types, lifting the restriction on primitive type patterns
means that `instanceof` can safeguard _any_ cast conversion supported
by Java ([JLS 5.5](https://docs.oracle.com/javase/specs/jls/se8/html/jls-5.html#jls-5.5)), at
top level too. In the following example, `instanceof` with the primitive type
pattern `byte b` tests whether `i` can be cast to `byte` without loss of
information. If `instanceof` returns true, then `(byte) i` loses no information
about, e.g., magnitude and sign:

    int i = ...
    if (i instanceof byte b) {
        ... b ...
    }

For this particular pair of types, checking whether an `int` can be safely
cast to `byte` requires a run-time test of _exactness_. For other pairs no
run-time check is needed, because safety is guaranteed by inspecting the two
types alone. For example, a conversion from `int` to `long` is always safe;
such a conversion is _unconditionally exact_.
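
The difference between the two kinds of conversion can be sketched as follows. The class and method names are hypothetical, and the round-trip comparison is just one plausible way to write an `int`-to-`byte` exactness test; it is not taken from the proposed implementation.

```java
// Hypothetical sketch contrasting a run-time exactness test with an
// unconditionally exact conversion.
public class IntExactness {

    // int -> byte can lose information, so a run-time test is needed:
    // narrow to byte, widen back to int, and compare with the original.
    static boolean isIntToByteExact(int i) {
        return (int) (byte) i == i;
    }

    // int -> long is unconditionally exact: every int value has a long
    // representation, so no run-time test is needed at all.
    static long intToLong(int i) {
        return i;
    }

    public static void main(String[] args) {
        System.out.println(isIntToByteExact(127));
        System.out.println(isIntToByteExact(128));
        System.out.println(intToLong(Integer.MAX_VALUE));
    }
}
```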

### Primitive types are not permitted in type comparisons

Extending `instanceof` as the pattern matching operator means that the semantics
of the type comparison operator `instanceof` can be enhanced symmetrically:

    int i = ...
    if (i instanceof byte) {
      ...
    }

### Primitive type patterns are not permitted in top-level context of `switch`

At present, primitive type patterns are not allowed in a top-level context of a
`switch` either. For example, with a top-level primitive type pattern, Java users
could rewrite the `switch` expression ([JEP 361](https://openjdk.org/jeps/361))

    switch (x.getStatus()) {
        case 0 -> "okay";
        case 1 -> "warning";
        case 2 -> "error";
        default -> "unknown status: " + x.getStatus();
    }

more clearly as:

    switch (x.getStatus()) {
        case 0 -> "okay";
        case 1 -> "warning";
        case 2 -> "error";
        case int i -> "unknown status: " + i;
    }

Here, the `case int i` label matches any status value not previously matched,
making the `switch` expression exhaustive so that no `default` label is
required.

### `switch` does not support all primitive types

Prior to this JEP, `switch` expressions and `switch` statements could switch on
some primitive types, but not `boolean`, `float`, `double`, or `long`. With this
proposal, users will be able to switch on `long` values, for example:

    long v = ...;
    switch (v) {
        case 0x01            -> ...;
        case 0x02            -> ...;
        case 10_000_000_000L -> ...;
        case 20_000_000_000L -> ...;
        case long l          -> ... l ...;
    }

With `float` values in `case` labels users will be able to switch on
floating-point match candidates:

    float f = ...;
    switch (f) {
        case Float.NaN    -> ...;
        case Float.POSITIVE_INFINITY -> ...;
        case Float.NEGATIVE_INFINITY -> ...;
        case float g -> ... g ...;
    }
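
A `case Float.NaN` label can only match if constants are compared by something other than `==`, since `NaN == NaN` is false. The JLS wording quoted in the comments on this CSR describes floating-point constants as equal when `Double.compare(x, y)` reports equality. The sketch below emulates that rule with an `if` chain in ordinary Java. It uses `Float.compare`, on the assumption that it behaves the same way for `float` operands, and the class and method names are hypothetical.

```java
// Hypothetical emulation of matching float constants in case labels.
public class FloatCaseMatching {

    // A constant label matches when compare() reports equality, so NaN
    // matches NaN, while 0.0f and -0.0f are treated as different values.
    static boolean matchesConstant(float selector, float constant) {
        return Float.compare(selector, constant) == 0;
    }

    static String classify(float f) {
        if (matchesConstant(f, Float.NaN)) return "NaN";
        if (matchesConstant(f, Float.POSITIVE_INFINITY)) return "+Infinity";
        if (matchesConstant(f, Float.NEGATIVE_INFINITY)) return "-Infinity";
        return "other: " + f; // plays the role of `case float g`
    }

    public static void main(String[] args) {
        System.out.println(classify(Float.NaN));
        System.out.println(classify(1.5f));
        System.out.println(Float.NaN == Float.NaN); // == would never match NaN
    }
}
```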

With this proposal, a `switch` on a `boolean` will also be possible:

    boolean u = user.isLoggedIn();
    switch (u) {
        case true  -> user.id();
        case false -> { log("Unrecognized user"); yield -1; }
    }

Solution
--------

The Java language is enhanced as follows:

- Allow primitive type patterns to be used in nested positions, even if they do
  not spell out the same type as the corresponding record component:
    - Derive the semantics of primitive type patterns (and reference type
      patterns on targets of primitive type) from casting conversions (a
      primitive type pattern `T` is applicable to a type `U` if there is a
      casting conversion from `U` to `T`).
- Allow `switch` to support primitive types:
    - Add support for constant expressions of `long`, `float`, `double`,
      `boolean`, and their boxes.
    - Allow boolean switches to be exhaustive when both true and false cases are
      listed.
    - Incorporate primitive patterns in the exhaustiveness check: a) when a
      boolean type pattern is used and b) when a primitive type pattern is
      unconditional to the underlying type of a match candidate which has a
      boxed type.
- Allow `instanceof` to support primitive types:
    - Update the grammar for `instanceof` as a type comparison operator to
      accept any type as its RHS.
    - Add support for `instanceof` according to all supported pairs of
      conversions under a casting context. `instanceof` is the precondition test
      for safe casting in general (of an _exact conversion_). 
- Define _exactness_ of conversions:
    - Define _unconditionally exact_ conversions for some pairs of types (no
      run-time action needed).
    - Define the run-time actions to decide exactness for the rest.
- Translate `instanceof` in `TransPatterns` to invoke the exactness tests when
  needed.
- Allow the `switch` translation in `SwitchBootstraps` to call the exactness
  tests when needed.
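
As an illustration of a run-time action that decides exactness, here is a minimal sketch for the `int` to `float` conversion discussed in the comments on this CSR. The class and method names are hypothetical and the comparison strategy is an assumption chosen for clarity; it is not the implementation attached to this CSR.

```java
// Hypothetical sketch of a run-time exactness test for int -> float.
public class IntToFloatExactness {

    // Both int -> double and float -> double are always exact, so comparing
    // the two values as doubles shows whether the float kept every bit.
    static boolean isIntToFloatExact(int n) {
        return (double) (float) n == (double) n;
    }

    public static void main(String[] args) {
        System.out.println(isIntToFloatExact(8));
        System.out.println(isIntToFloatExact((1 << 24) + 1));
        System.out.println(isIntToFloatExact(Integer.MAX_VALUE));
    }
}
```

Comparing in `double` avoids casting the `float` back to `int`, which saturates at `Integer.MAX_VALUE` and would need a special case for that value.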

Specification
-------------

The updated JLS draft for primitive types in patterns, instanceof, and switch is available at
https://cr.openjdk.org/~abimpoudis/instanceof/latest/ and is attached as JLS-instanceof.pdf.

The proposed API enhancements are attached as 8304487.specdiff.<n>.zip

The changes to the specification and API are subject to change until the CSR is finalized.

Comments
Moving to Approved.
22-01-2024

CSR updated! Comments addressed.
22-01-2024

Moving back to Provisional until a few remaining small improvements are made.

Minor feedback, for the JLS changes:

"P is a floating-point type and x and y are equal according to the result obtained as if by execution of Double.compare(x, y). P is java.math.BigDecimal and x and y are equal according to the result obtained as if by execution of x.compareTo(y)."

Pedantically, the compare/compareTo method calls return an *int* value rather than a boolean; a more precise expression would be along the lines of Double.compare(x, y) **== 0** etc.

For the API changes, it would be a kindness to readers to explicitly include some notable information directly in the javadoc and not delegate to the JLS as completely. In particular, I think the following information should be present in the API javadoc in some form:

* Converting a floating-point negative zero to an integer type is considered inexact.
* Converting a floating-point NaN or infinity to an integer type is considered inexact.
* Converting a floating-point NaN or infinity or signed zero to another floating-point type is considered exact.
18-01-2024

JLS updated after several rounds of improvements:

- The spec draft is rebased on top of JLS 22 (inheriting an editorial change that introduces testing contexts (5.7), and incorporating all the changes for unnamed patterns and variables from JEP 456).
- It streamlines the definitions of exact conversions and unconditionally exact conversions.

Finalizing.
04-01-2024

Moving back to Provisional.
17-11-2023

[~darcy] I adopted Raffaello's proposal about the is* convention, e.g., isByteToCharExact. Indeed it looks good! https://github.com/openjdk/jdk/pull/15638/files#diff-99dbd969283945c2cb7e4a6028476e9dba020b4cacf31af0ae6dbea6de303eb6 (also new specdiff attached). I have included a new name for the class ExactConversionsSupport. I am planning to finalize if there are no further comments. WDYT?
16-10-2023

[~rgiulietti] +1. I agree.

> For the class "ExactnessMethods", I recommend considering a class name.

[~darcy] can you clarify what you mean by class name? e.g., a name that is not descriptive about the contents like "ExactnessMethods"?

> Are any updates needed to the javac tree API?

no changes to the tree API
04-10-2023

Work on JDK-8286139 has been suspended in favor of this one. However, I think that having library methods, in addition to this language feature, would be beneficial regardless. Finding the right place for the `ExactnessMethods` class and adding the conversions to complement the test methods there could be opened for discussion. A more conventional way to name the test methods would be something like `isByte()` + reliance on method overloading, or something more explicit like `isByteFromInt()`. A library is more flexible in that it could allow converting from `-0.0` to `0`, for example, without changing the JLS.
04-10-2023

PS What is the current relation between this work and JDK-8286139? cc [~rgiulietti]
03-10-2023

Moving to Provisional, not Approved.

"Another example is the case of int to float conversion. Since float is specified to have 23 fractional bits and 1 for sign, we can represent all successive ints with 24 bits without the use of exponent. For integers requiring more bits, we can represent only those with sufficiently low number of leading and trailing zeros. A potential runtime exactness check could be the following:"

From a certain point of view this statement could be construed as correct, but it is misleading as the int value converted to float will have nonzero exponents. For example, the integer value 8 would be stored as 0x1.0p3. In other words, 1 times 2^3.

From a presentation point of view, I strongly recommend an explicit note stating "this is how signed zero and NaN are handled" rather than relying implicitly on representation equivalence as defined in the core libraries.

For the class "ExactnessMethods", I recommend considering a class name. The method "${type1}_${type2}" do not follow typical JDK naming conventions. Something "byteToCharExact" would be more in keeping with naming conventions. I recommend declaring the host class as explicitly final, which is not strictly required as there are no public constructors, but makes the intentions of the class usage clearer.

If someone happens to come across this class, it would be a kindness for the floating-point related methods to have a link into the relevant portions of the Double documentation.

Are any updates needed to the javac tree API?
02-10-2023