Bug ID: JDK-8170351 JEP 301: Enhanced Enums


Type: JEP
Component: tools
Sub-Component: javac

Priority: P3
Status: Closed
Resolution: Withdrawn

Submitted: 2016-11-25
Updated: 2020-09-29
Resolved: 2020-09-29

Related Reports

Relates :	JDK-8049342 - 13.1: improve naming of anon classes for enum members
Relates :	JDK-8151454 - JEP 286: Local-Variable Type Inference
Relates :	JDK-8073381 - need API to get enum's values without creating a new array
Relates :	JDK-6594685 - Parameterized enums
Relates :	JDK-6408723 - Parameterized enum type

Sub Tasks

JDK-8173063 :	Enhanced enums (spec) - Closed
JDK-8177469 :	Compiler implementation for Enhanced Enums - Closed

Description

Summary
-------

Enhance the expressiveness of the `enum` construct in the Java Language by allowing type-variables in enums (_generic enums_), and performing sharper type-checking for enum constants.

Goals
-----

These two enhancements work together to enable enum constants to carry constant-specific _type information_ as well as constant-specific state and behavior.  There are many situations where developers have to refactor enums into classes in order to achieve the desired result; these enhancements should reduce this need.

The following example shows how the two enhancements work together:

```
enum Argument<X> { // declares generic enum
   STRING<String>(String.class), 
   INTEGER<Integer>(Integer.class), ... ;

   Class<X> clazz;

   Argument(Class<X> clazz) { this.clazz = clazz; }

   Class<X> getClazz() { return clazz; }
}

Class<String> cs = Argument.STRING.getClazz(); //uses sharper typing of enum constant
```
Non-Goals
---------

This JEP targets specific enhancements to how enum constants are type-checked. As such, other enum-related features such as:

* allow enum subclassing
* allow enum in non-static contexts

are *outside* the scope of this JEP.

Motivation
----------

Java enums are a powerful construct. They allow grouping of constants - where each constant is a singleton object. Each constant can optionally declare a body, which can be used to override the behavior of the base enum declaration. In the following we will try to model the set of Java primitive types using an enum. Here's a start:

```
enum Primitive {
    BYTE,
    SHORT,
    INT,
    FLOAT,
    LONG,
    DOUBLE,
    CHAR,
    BOOLEAN;
}
```

As stated above, an enum declaration is like a class, and can have constructors; we can use this feature to keep track of the boxed class and the default value of each primitive:

```
enum Primitive {
    BYTE(Byte.class, 0),
    SHORT(Short.class, 0),
    INT(Integer.class, 0),
    FLOAT(Float.class, 0f),
    LONG(Long.class, 0L),
    DOUBLE(Double.class, 0d),
    CHAR(Character.class, 0),
    BOOLEAN(Boolean.class, false);

    final Class<?> boxClass;
    final Object defaultValue;

    Primitive(Class<?> boxClass, Object defaultValue) {
       this.boxClass = boxClass;
       this.defaultValue = defaultValue;
    }

}
```

While this is rather nice, there are some limitations. The field `boxClass` is loosely typed as `Class<?>`, as the field type needs to be compatible with all the sharper types used by the enum constants. As a result, any attempt to do something like this:

```
Class<Short> cs = SHORT.boxClass; //error
```

will fail with a compile-time error. Even worse, the field `defaultValue` has a type of `Object`. This is unavoidable since the field needs to be shared across multiple constants modelling different primitive types. Hence, static safety is lost, as the compiler allows code like the following:

```
String s = (String)INT.defaultValue; //ok at compile time
```
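
The loss of static safety can be reproduced with a current compiler. The following is a minimal sketch using a trimmed-down copy of the enum above; the wrapper class `LooseTypingDemo` and the helper `castFails` are illustrative names that do not appear in the JEP. The cast is accepted by the compiler and only fails when executed.

```java
final class LooseTypingDemo {
    // Trimmed-down copy of the Primitive enum from the text.
    enum Primitive {
        INT(Integer.class, 0),
        BOOLEAN(Boolean.class, false);

        final Class<?> boxClass;
        final Object defaultValue;

        Primitive(Class<?> boxClass, Object defaultValue) {
            this.boxClass = boxClass;
            this.defaultValue = defaultValue;
        }
    }

    // Returns true if casting INT's default value to String fails at run time.
    static boolean castFails() {
        try {
            // Accepted by the compiler, because defaultValue is typed as Object.
            String s = (String) Primitive.INT.defaultValue;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("cast failed at run time: " + castFails());
    }
}
```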

Let's now try to extend the enum and add some operations to the constants modelling primitive types (for the sake of brevity, in the remainder we will only show a subset of the constants):

```
enum Primitive {
    INT(Integer.class, 0) {
       int mod(int x, int y) { return x % y; }
       int add(int x, int y) { return x + y; }
    },
    FLOAT(Float.class, 0f)  {
       float add(float x, float y) { return x + y; }
    }, ... ;

    final Class<?> boxClass;
    final Object defaultValue;

    Primitive(Class<?> boxClass, Object defaultValue) {
       this.boxClass = boxClass;
       this.defaultValue = defaultValue;
    }

}
```

Again, this results in problems, as there's no way to do something like this:

```
int seven = INT.add(3, 4); //error
```

That's because the static type of `INT` is simply `Primitive` and `Primitive` has no member named `add`. So, in order to add operations to our enum, we need to add the members to the enum declaration itself, as follows:

```
enum Primitive {
    INT(Integer.class, 0),
    FLOAT(Float.class, 0f), ... ;

    final Class<?> boxClass;
    final Object defaultValue;

    Primitive(Class<?> boxClass, Object defaultValue) {
       this.boxClass = boxClass;
       this.defaultValue = defaultValue;
    }

    int mod(int x, int y) {
       if (this == INT) {
          return x % y;
       } else {
          throw new IllegalStateException();
       }
    }

    int add(int x, int y) {
       if (this == INT) {
          return x + y;
       } else {
          throw new IllegalStateException();
       }
    }

    float add(float x, float y) {
       if (this == FLOAT) {
          return x + y;
       } else {
          throw new IllegalStateException();
       }
    }
    ...

}
```

But the code above has, again, several problems. First, this breaks encapsulation: suddenly, `Primitive` acquires a bunch of members, none of which make sense for all the constants. As a result, the implementation of each method becomes more convoluted, as the methods must check whether they have been called on the *right* enum constant. Type-safety is also lost, as the compiler will not detect bad usages such as:

```
int zero = FLOAT.mod(50, 2); //ok
```

All the problems described above can be addressed by removing specific asymmetries between enums and classes, and by refining the way in which enum constants are type-checked. More precisely:

* allow type parameters in enum declarations
* do not prematurely erase sharp type-information associated with enum constants

With these enhancements, the `Primitive` enum can be rewritten as follows:

```
enum Primitive<X> {
    INT<Integer>(Integer.class, 0) {
       int mod(int x, int y) { return x % y; }
       int add(int x, int y) { return x + y; }
    },
    FLOAT<Float>(Float.class, 0f)  {
       float add(float x, float y) { return x + y; }
    }, ... ;

    final Class<X> boxClass;
    final X defaultValue;

    Primitive(Class<X> boxClass, X defaultValue) {
       this.boxClass = boxClass;
       this.defaultValue = defaultValue;
    }
}
```

This generic declaration is clearly more expressive than the previous one - now the enum constant `Primitive.INT` has a sharper parameterized type `Primitive<Integer>` which means that its members are also sharply typed:

```
Class<Short> cs = SHORT.boxClass; //ok!
```

Also, since type information on enum constants is not prematurely erased, the compiler can reason about membership of constants - as demonstrated below:

```
int zero_int = INT.mod(50, 2); //ok
int zero_float = FLOAT.mod(50, 2); //error
```

The compiler is now able to reject the second statement as there's no member `mod` in the enum constant `FLOAT` - which guarantees extra type-safety. 

Description
-----------

### Generic enums

As discussed in JDK-6408723, an important requirement for allowing generics in enums is that type-parameters are fully bound in the enum constant declaration. This allows for a straightforward translation scheme which can augment the one we have today - for instance, given an enum declaration like the following:

```
enum Foo<X> {
   ONE<String>,
   TWO<Integer>;
}
```

The corresponding desugared code will look as follows:
```
/* enum */ class Foo<X> {
   static Foo<String> ONE = ...
   static Foo<Integer> TWO = ...

   ...
}
```

That is, it is still possible to map each constant to a static field declaration, as type bindings are all statically known.
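
Because the generic enum syntax proposed here is not accepted by any released compiler, the translation can only be approximated by hand. The following is a minimal sketch of such a class-based emulation, modelled on the `Argument` example from the Goals section. The names `ArgumentDemo` and the private `name` field are illustrative; a real translation would also extend `java.lang.Enum` and provide `values()` and `valueOf`, which this sketch omits.

```java
final class ArgumentDemo {
    // Hand-written stand-in for the proposed `enum Argument<X>`.
    static final class Argument<X> {
        // Each "constant" is a static field whose type binds X explicitly.
        static final Argument<String> STRING = new Argument<>("STRING", String.class);
        static final Argument<Integer> INTEGER = new Argument<>("INTEGER", Integer.class);

        private final String name;
        private final Class<X> clazz;

        private Argument(String name, Class<X> clazz) {
            this.name = name;
            this.clazz = clazz;
        }

        Class<X> getClazz() { return clazz; }

        @Override public String toString() { return name; }
    }

    public static void main(String[] args) {
        // No cast needed: the static type of STRING is Argument<String>.
        Class<String> cs = Argument.STRING.getClazz();
        Class<Integer> ci = Argument.INTEGER.getClazz();
        System.out.println(Argument.STRING + " -> " + cs.getName());
        System.out.println(Argument.INTEGER + " -> " + ci.getName());
    }
}
```

This is the refactoring from enum to class that the Goals section hopes to make unnecessary: the typing is sharp, but `switch` support, `EnumSet`/`EnumMap` and the built-in serialization guarantees are lost.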

It might be desirable to allow diamond on enum constant initialization - for instance:

```
enum Bar<X> {
   ONE<>(Integer.class),
   TWO<>(String.class);

   Bar(X x) { ... }
}
```

If the diamond syntax is used, special care is required if the enum constant has a body (i.e. it is translated into an anonymous class) and the inferred type is non-denotable. As in the case for diamond with anonymous inner classes, the compiler will have to reject that case.
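
For comparison, this is how diamond already behaves with ordinary anonymous classes since Java 9. The sketch below uses a hypothetical abstract class `Bar` in place of the proposed generic enum; the inferred type argument `Class<Integer>` is denotable, so the declaration is accepted.

```java
final class DiamondDemo {
    // Stand-in for the generic enum Bar<X> from the text.
    abstract static class Bar<X> {
        final X payload;

        Bar(X payload) { this.payload = payload; }

        abstract String describe();
    }

    // Diamond combined with a class body: X is inferred as Class<Integer>,
    // which can be written in source, so the compiler accepts it.
    static final Bar<Class<Integer>> ONE = new Bar<>(Integer.class) {
        @Override String describe() {
            return "ONE holds " + payload.getSimpleName();
        }
    };

    public static void main(String[] args) {
        System.out.println(ONE.describe());
    }
}
```

If inference instead produced a type argument that cannot be written in source, such as an intersection type, the same expression would be rejected; the text above suggests that enum constants with bodies would follow the same rule.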

### Sharper typing of enum constants

Under current rules, the static type of an enum constant is the enum type itself. Under such rules, the constants `Foo.ONE` and `Foo.TWO` above will both have the same type, namely `Foo`. This is undesirable for at least two reasons:

* in case of a generic enum (as `Foo`), the static type of a constant is not sharp enough to capture the full type info carried by that constant
* even in the absence of generic enums, the constant type is not sharp enough to let a client access a member that is only defined on that enum constant (see the `Primitive` example in the Motivation section)

To overcome this limitation, typing of enum constants should be redefined so that a given enum constant gets its own type. Let E be an enum declaration, and C be a (possibly generic) enum constant declaration in E. The constant C is associated with a sharper type if either of the following conditions is satisfied:

* `C` is of the kind `C<T1, T2 ... Tn>`  but declares no body; the constant sharper type is `E<T1, T2 ... Tn>`
* C has a body; the constant sharper type is an anonymous type (written `E.C`) whose supertype is either
  * `E<T1, T2, ... Tn>` if `C` is of the kind `C<T1, T2, ... Tn>` and `E` is a generic enum
  * `E`, if E is non-generic

These enhanced typing rules allow the static types of `Foo.ONE` and `Foo.TWO` to be different.

Additional Considerations
-----------

**Binary compatibility**

Let's assume we have the following enum:

```
enum Test {
   A { void a() { } },
   B { void b() { } };
}
```

As we have seen, this would be translated as follows:

```
/* enum */ class Test {
   static Test A = new Test() { void a() { } };
   static Test B = new Test() { void b() { } };
}
```

If we allow sharper type for enum constants, a naive approach would translate the code as follows:

```
/* enum */ class Test {
   static Test$1 A = new Test() { void a() { } };
   static Test$2 B = new Test() { void b() { } };
}
```

Here, the binary incompatibility is manifest: the type of the enum constant `A` just changed from `Test` to `Test$1` upon recompilation. This change is going to break non-recompiled clients using `Test`.

To overcome this problem, it is better to take an erasure-based approach: while the static type of `A` might be the sharper type `Test.A`, any reference to the type of the constant gets erased to the base enum type `Test`. This leads to code that is binary compatible with respect to what we had before. However, if everything gets erased to `Test`, how is access to members of a specific enum constant implemented?

```
Test.A.a();
```

It is easy to see that, if in the code above, symbolic references to `A` are erased to `Test`, the method call will not be well-typed (as `Test` does not have a member named `a`). To overcome this problem, the compiler has to insert a synthetic cast:

```
checkcast Test$1
invokevirtual Test$1::a
```

This is not dissimilar to what happens when accessing members of an intersection type through erasure.
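
The intersection-type analogy can be observed with existing language features. In the sketch below (the class `IntersectionDemo` and the method `max` are illustrative names), the type variable `T` erases to its first bound, `Number`, so the call to `compareTo`, which comes from the second bound, has to go through a compiler-inserted cast. Running `javap -c` on the compiled class should show the corresponding `checkcast` instruction.

```java
final class IntersectionDemo {
    // T erases to Number; compareTo is a member of the second bound only,
    // so the compiler must cast to Comparable before invoking it.
    static <T extends Number & Comparable<T>> T max(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(max(3, 7));
        System.out.println(max(2.5, 1.5));
    }
}
```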

Another, orthogonal observation is that the current naming scheme for enum constant classes is too fragile: the names `Test$1` and `Test$2` shown above are essentially order-dependent, which means that changing the order in which enum constants are declared could lead to binary compatibility issues. More specifically, if in the code above `A` is swapped with `B` and the enum is recompiled, the client bytecode above would fail to link, as `Test$1` would no longer have a member method named `a`. This is in stark contrast with what the JLS has to say about binary compatible evolution of enums:


> Adding or reordering constants in an enum will not break compatibility with pre-existing binaries.


One way to preserve binary compatible evolution would be to emit order insensitive class names, such as `Test$A` and `Test$B` instead of `Test$1` and `Test$2`. The impact of such a change with respect to reflection and serialization is discussed below.
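
The names a given compiler emits today can be checked directly. The following is a small sketch that prints the binary name of the class implementing each constant; `NamingDemo` and `binaryName` are illustrative names, and because the enum is nested here the printed names carry the `NamingDemo$` prefix. The exact suffixes are compiler output and are not guaranteed by this sketch.

```java
final class NamingDemo {
    enum Test {
        A { void a() { } },
        B { void b() { } };
    }

    // Binary name of the class that implements the given constant.
    static String binaryName(Test constant) {
        return constant.getClass().getName();
    }

    public static void main(String[] args) {
        for (Test t : Test.values()) {
            System.out.println(t + " is implemented by " + binaryName(t));
        }
        // getDeclaringClass() always yields the enum itself, regardless of
        // how the constant-specific classes are named.
        System.out.println(Test.A.getDeclaringClass().getName());
    }
}
```

Swapping the declaration order of `A` and `B` and recompiling is a quick way to see whether the names follow the constants or their positions.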

**Serialization**

In Java, all enums are implicitly serializable, as `Enum` implements `Serializable`. We would like the changes proposed here to be serialization-compatible; they should not change the serialized form.  The serialization specification:

http://docs.oracle.com/javase/6/docs/platform/serialization/spec/serial-arch.html#6469

provides special treatment for enums; the serialized form of an enum constant is its name only, and it is not possible to customize serialization/deserialization of an enum constant. (Note that all enum constants are initialized during the `<clinit>`, and the `Enum.valueOf` method that is used by deserialization calls the enum's static `values()` method, which implicitly forces initialization of the base enum class (and of all the constants)).

In other words, no compatibility problem with respect to the serialized form exists, as the serialized form already does not depend on the class name generated by the compiler.
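
The special treatment of enums can be exercised with a simple round trip. The sketch below serializes a constant that has a class body and reads it back; `EnumSerializationDemo`, `roundTrip` and `greet` are illustrative names. Deserialization is expected to return the very same constant object rather than a copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

final class EnumSerializationDemo {
    enum Test {
        A { @Override String greet() { return "a"; } },
        B { @Override String greet() { return "b"; } };

        abstract String greet();
    }

    // Serializes the constant to a byte array and reads it back.
    static Test roundTrip(Test constant) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(constant);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Test) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Test copy = roundTrip(Test.A);
        System.out.println("same instance: " + (copy == Test.A));
        System.out.println("behavior preserved: " + copy.greet());
    }
}
```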

**Reflection**

Another place where binary names come up is reflection.  The following is perfectly legal reflective code:

```
Class<?> c = Class.forName("Test$1");
System.err.println(c.getName()); //prints Test$1
```

While reflection has restrictions that prevent an enum constant from being instantiated reflectively, there is no restriction on inspecting the members of an enum constant class. Therefore, existing code using the idiom above would cease to work should we change the binary form of enum constants.
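
Reflective code does not have to hard-code the binary name. As a sketch of an alternative idiom (not something the JEP prescribes), the constant's class can be obtained from the constant itself, which keeps working whichever naming scheme the compiler uses. `ConstantReflectionDemo` and `declaredMethodNames` are illustrative names.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

final class ConstantReflectionDemo {
    enum Test {
        A { void a() { } },
        B { void b() { } };
    }

    // Names of the non-synthetic methods declared in the constant's own class body.
    static List<String> declaredMethodNames(Test constant) {
        List<String> names = new ArrayList<>();
        for (Method m : constant.getClass().getDeclaredMethods()) {
            if (!m.isSynthetic()) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("A declares " + declaredMethodNames(Test.A));
        System.out.println("B declares " + declaredMethodNames(Test.B));
    }
}
```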

**Denotability**

Currently, an enum constant is a value, not a type. So, a legitimate question is whether enum constants should also be denotable types.

The usual arguments apply here. On the one hand, having a denotable type for an enum constant makes it less magic, and allows programmers to declare variables with that type. But there are also disadvantages:

* could make the code less readable (e.g. `A a = A`) - as the same identifier could mean both value and type
* not clear as to whether all enum constants get their own type; what about an enum constant that does not declare any additional member? Is its type just an alias for the base enum type?

On the other hand, if the enum constant type is a non-denotable type, it becomes an opaque thing that programmers can only interact with indirectly (e.g. through type inference). To mitigate some of the drawbacks of a non-denotable type, it is important to note that the proposal to add local variable type inference could technically allow programmers to declare variables with the sharper enum type, even though it is non-denotable (e.g. `var a = A`).
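
Local-variable type inference has since shipped (Java 10), and its interaction with non-denotable types can be seen with anonymous classes. The following is a minimal sketch with illustrative names (`VarDemo`, `run`, `counter`); it shows `var` capturing an anonymous class type, which is the same mechanism the text suggests for sharper enum constant types.

```java
final class VarDemo {
    static int run() {
        // The type of `counter` is the anonymous class itself. It cannot be
        // written in source, but `var` still captures it, so the members
        // declared in the class body remain accessible.
        var counter = new Object() {
            int count = 0;

            int increment() { return ++count; }
        };
        counter.increment();
        return counter.increment();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Declaring the variable as `Object counter` instead would make both calls to `increment` fail to compile, which mirrors the current situation for members declared in enum constant bodies.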

**Accessibility**

There is one corner case with respect to accessibility of members through the enum sharper type. Consider the following case:

```
package a;

public enum Foo {
  A() { 
    public String s = "Hello!";
  };
}

package b;

class Client {
   public static void main(String[] args) {
      String s = Foo.A.s; //IllegalAccessError
   }
}
```

When executing this code, the VM will issue an `IllegalAccessError`. The problem is that the anonymous class for the enum constant, `Foo$A`, is package-private; as a result, an attempt to access a public field in a package-private class from another package will result in an access error. To overcome this problem, the enum constant class should have the same access modifier as the enum class in which it is defined.

**Source compatibility**

From a source compatibility perspective, there are cases in which sharper typing could leak out as a result of an interaction between this feature and type inference - consider the following code:

```
EnumSet<Test> e = EnumSet.of(Test.A);
```

The code above used to behave in a relatively straightforward fashion: the static type of `Test.A` is simply `Test`, meaning that inferring the type-variable of `EnumSet.of` was simple, as both constraints named the type `Test`. But if we change the way in which `Test.A` is type-checked, the behavior gets more interesting: the type-variable of `EnumSet.of` will get two competing constraints: it must be equal to `Test` (from the target-type) and it must be a supertype of `Test.A`. Luckily, in such a scenario, type inference is smart enough to prefer the stricter equality constraint, and ends up inferring `Test`.  All things considered, the source compatibility impact of this change is not too different from the one in JDK-8075793, where the change caused capture variables to appear in more places instead of their upper bounds.
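
The same interplay between an equality constraint from the target type and a supertype constraint from the argument can be observed today with ordinary classes. The sketch below is an analogy rather than the enum case itself: `Integer` and `Number` stand in for `Test.A` and `Test`, and `InferenceDemo`, `listOf` and `run` are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

final class InferenceDemo {
    static <T> List<T> listOf(T element) {
        List<T> list = new ArrayList<>();
        list.add(element);
        return list;
    }

    static int run() {
        // Target type present: T must equal Number and be a supertype of
        // Integer; inference settles on Number.
        List<Number> numbers = listOf(1);
        numbers.add(2.5); // compiles only because T was inferred as Number

        // No target type: T is inferred from the argument alone, as Integer.
        var integers = listOf(1);
        integers.add(2);

        return numbers.size() + integers.size();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The second half is the situation the Risks section worries about: without a target type, the sharper argument type is what ends up in the inferred result.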

Risks and Assumptions
---------------------

This proposal has two main risks outlined in the sections above:

* change in binary names of enum constants could lead to issues with core reflection
* change in typing of enum constants could result in subtle changes in method type inference, especially in the absence of a target-type

The first problem is probably nothing to be concerned about; as has been shown, binary names of enum constants are currently very fragile and prone to re-ordering issues. As a result, any code that relies on the binary name of an enum constant is inherently fragile, as it is essentially relying on the output of a specific compiler.

The second problem is more worrisome, as it could cause source incompatibilities. In order to estimate how frequent the source incompatibility scenario described above could be, we have measured how many times the `EnumSet.of` method was called with various arities; for each call we kept track of whether the call occurred in a context where a target type was available. Below are the results (the measurements have been taken against the full OpenJDK forest).

* Total calls to EnumSet.of: 150
  * calls with arity = 1 : 69
    * of which, without target-type: 0

In other words, the source incompatibility scenario described above does not seem to pose any serious threat.

Dependencies
------------

The sharper types used for enum constants are not necessarily denotable; these would constitute another category of non-denotable types.  This may interact with the treatment of non-denotable types in JEP 286 (Local-Variable Type Inference).  Depending on decisions made in JEP 286 regarding non-denotable types, one might be able to say:

    var a = Argument.STRING;

and have the type of `a` be the sharper type `Argument.STRING` rather than the coarser type `Argument`.

Comments

After conducting some real-world experiments with the feature described in this JEP, it became apparent [1] that generic enums don't play well with generic methods. The issues are especially evident with static generic methods that accept a Class<X> parameter, where X models an enum type; many such methods are defined in the Java SE API itself, like EnumSet::allOf and EnumSet::noneOf. In such cases, passing a class literal corresponding to a generic enum as a parameter would result in a compile-time error because of a failure in generic type well-formedness. A proposal attempting to rectify these issues was later formulated and discussed [2], but was also found lacking, as it essentially amounted to promoting the use of more raw types and, more broadly, raised concerns regarding the return on complexity associated with the enhanced-enums feature. For these reasons, we are now withdrawing this JEP.
[1] - http://mail.openjdk.java.net/pipermail/amber-spec-experts/2017-May/000041.html
[2] - http://mail.openjdk.java.net/pipermail/amber-spec-experts/2018-December/000876.html
29-09-2020
the current implementation does update javadoc code. Of course this is an area for which more tests are needed and could change in the future but it has been covered as part of the first iteration of the implementation
28-03-2017
Can I suggest a SubTask for "javadoc updates for Enhanced Enums" ? It's not obvious to me that the existing javadoc code will support enhanced enums without any change to the standard doclet.
28-03-2017
@John - I think extending custom interfaces is totally doable from a language/translation strategy perspective (an enum with a body is just a class), but I fear it would be messy from a programming model perspective. First, enums can already implement interfaces - so you can't really opt out from the supertypes you get from the enum declaration itself: enum Foo implements A { CONST() extends B { } } Realistically, this can only mean that CONST implements _both_ A and B, which can be a tad confusing when looking at this from a syntactic perspective. But maybe that's just a matter of finding a better syntax. Another, deeper consequence is that allowing custom _additional_ supertypes would result in a proliferation of intersection types when doing inference - for instance, cases like: EnumSet.of(BYTE, INT) would return not Primitive, but Primitive & Bitwise<Primitive>. Speaking more generally, I think that while enum constants (with body) are modelled as classes from a classfile perspective, their 'classness' was never meant to be too exposed in the programming model. Giving enum constants sharper types is mostly about not throwing away the static information the compiler knows about a given constant. Adding generic enums is really adding generics _to the enum declaration_, not to the constant part (that is, a constant can only instantiate the type-variables declared in the enum). So, if this proposal was about adding type-var declarations to the enum constants themselves, I'd agree with you that from there to adding custom supertypes would be a short hop. But that's not what this proposal is about.
27-03-2017
Since the enum subtypes are separately defined classes, is there any reason not to allow them to implement different interfaces, as well as have different type parameters? Both are refinements of the common enum type. Allowing interfaces would enable the subtypes to share code between themselves (via default methods). Example: enum Primitive { BYTE(Byte.class) extends Bitwise<BYTE> { }, INT(Integer.class) extends Bitwise<INT> { } , FLOAT(Float.class) extends Floating<FLOAT> { }, BOOLEAN(Boolean.class); // unique kind of type, no special supers interface Bitwise<T extends Primitive> { T bitwiseAnd(T x, T y); T bitwiseOr(T x, T y); ... } ... } Code sharing like this also pushes on the question of denoting the enum subtype inside the body of the subtype and elsewhere. This example assumes that the enum name is in scope as a type name in the body itself.
27-03-2017
[~jrose] I have compiled the test case you provided, and I will add it as a regression test. Using the current implementation of enhanced enums, it compiles and prints the expected output.
27-03-2017
If enum nested classes are going to be distinct static subtypes, they should also be able to carry resolvable static members. For example, an enumeration of primitive types should allow independent "public static final" constants on each enum. Such constants should be useable, e.g., as switch labels. This is a natural consequence of distinguishing enum subtypes. It is also a useful way to get more static information from enums. enum EnumCon { BYTE { @Override Class<?> type() { return byte.class; } static final int BITSIZE = 8; }, INT { @Override Class<?> type() { return int.class; } static final int BITSIZE = 32; }; abstract Class<?> type(); public static void main(String... av) { // these next two lines fail to compile under current rules, // because typeof(BYTE) = EnumCon, not EnumCon$$BYTE. System.out.println("BYTE.BITSIZE = "+BYTE.BITSIZE); // should print 8 System.out.println("INT.BITSIZE = "+INT.BITSIZE); // should print 32 } }
27-03-2017