JDK-8329079 : Compiler Implementation for Flexible Constructor Bodies (Second Preview)
  • Type: CSR
  • Component: tools
  • Sub-Component: javac
  • Priority: P4
  • Status: Provisional
  • Resolution: Unresolved
  • Fix Versions: 23
  • Submitted: 2024-03-26
  • Updated: 2024-05-10
Description
Summary
-------

In constructors in the Java programming language, allow statements that do not
read the fields of the instance being created to appear before an explicit
constructor invocation.

Problem
-------

The Java language disallows any statements in constructors prior to an explicit
constructor invocation as a way of preventing access to the current instance
prior to superclass construction. This ensures that object construction proceeds
in an orderly fashion "from the top down".

However, this rule prevents a variety of common patterns that are available to
regular methods. For example:

*   "Fail fast" validation of constructor parameters as the first order of
    business
*   Creating an object to pass to the superclass constructor in two different
    parameter positions
*   Complex preparation and/or selection of superclass constructor parameters
*   Initialization of superclass fields by a subclass
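
As a rough illustration of the first two items, the sketch below shows the kind of workaround the current rule forces, and it compiles without the new feature. All class and method names (`PositiveBigInteger`, `requirePositive`, `Pair`, `SharedPair`, `WorkaroundDemo`) are hypothetical and chosen only for this example.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Workaround 1: "fail fast" validation has to be squeezed into the super(...)
// argument through a static helper, because no statement may precede super(...).
class PositiveBigInteger extends BigInteger {
    PositiveBigInteger(long value) {
        super(Long.toString(requirePositive(value)));
    }

    private static long requirePositive(long value) {
        if (value <= 0) {
            throw new IllegalArgumentException("non-positive value: " + value);
        }
        return value;
    }
}

// A superclass whose constructor takes the same kind of object in two positions.
class Pair {
    final List<String> first;
    final List<String> second;

    Pair(List<String> first, List<String> second) {
        this.first = first;
        this.second = second;
    }
}

// Workaround 2: passing one new object in both positions needs an auxiliary
// private constructor, because the object cannot be stored in a local first.
class SharedPair extends Pair {
    SharedPair() {
        this(new ArrayList<>());
    }

    private SharedPair(List<String> shared) {
        super(shared, shared);
    }
}

class WorkaroundDemo {
    public static void main(String[] args) {
        System.out.println(new PositiveBigInteger(5));
        SharedPair pair = new SharedPair();
        System.out.println(pair.first == pair.second);
    }
}
```

With statements permitted before the explicit constructor invocation, the helper method and the auxiliary constructor would presumably be unnecessary: the check and the local variable could simply precede `super(...)`.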

This rule is more restrictive than it needs to be. All of the existing semantic
and safety guarantees relating to constructors are still preserved if code can
appear prior to an explicit constructor invocation as long as two criteria are
met:

*   The code does not refer to the current `this` instance (either explicitly or
    implicitly), except in a field access that appears on the left-hand side of
    a simple assignment.
  
*   There are no `return` statements.

Note that the first item is not new; except for the carve-out for simple
assignments, it is exactly the same criterion that applies today to
subexpressions of an explicit constructor invocation.

Solution
--------

The grammar of a constructor body is changed from:

```
ConstructorBody:
    { [ExplicitConstructorInvocation] [BlockStatements] } ;
```
to:
```
ConstructorBody:
    { [BlockStatements] } ;
    { [BlockStatements] ExplicitConstructorInvocation [BlockStatements] } ;
```


The Java Language Specification classifies code that appears in the argument
list of an explicit constructor invocation as being in a _static context_. This
means that the arguments to the explicit constructor invocation are treated as
if they were in a static method; in other words, as if no instance is available.
However, the technical restrictions of a static context are stronger than
necessary, and they prevent code that is useful and safe from appearing as
constructor arguments. Moreover, the `javac` compiler has in practice enforced
more relaxed conditions than the specification states, and many developers rely
on them.


For example, subexpressions of an explicit constructor invocation in a
constructor body of an inner class commonly refer to the outer instance:

```
public class MyClass {
    public void doSomething() {
        // ...
    }
    public class MyThread extends Thread {
        public MyThread() {
            super(MyClass.this::doSomething);
        }
    }
}
```

To fix the JLS, rather than revise the concept of a static context, we define a
new, strictly weaker concept of an _early construction context_ to cover both
the arguments to an explicit constructor invocation and any statements that
appear before it. The rules for code in an early construction context are
similar to the rules for code in an instance method, except for one restriction:
in an early construction context, code must not read the fields of the instance
under construction.

On `amber-dev` there was a good deal of discussion regarding how far to expand
the rules. The eventual consensus was to take a conservative approach, so the
above set of changes represents a minimal choice. Two other options that were
considered were:

1.  Allow multiple explicit constructor invocations, and use DA/DU analysis to
    ensure exactly one call is ever executed
2.  Allow explicit constructor invocations within `try` blocks, with the
    requirement that if any exceptions are caught the constructor must complete
    abruptly.

Aside: The first option would not require any change to the JVMS, whereas the
second would require a (straightforward) change.

It should also be noted that this feature does not change the semantics of any
existing code.

## History

This second preview contains one enhancement relative to the first preview:

> Allow a constructor body to assign to fields in the same class before making
> an explicit constructor invocation. This enables a constructor in a subclass
> to ensure that a constructor in a superclass never executes code which sees
> the _default_ value of a field in the subclass (`0`, `false`, `null`). This
> can occur when, due to overriding, the superclass constructor invokes a method
> in the subclass that uses the field. 

This extension provides an elegant solution to a long-standing issue: Java
allows constructors to invoke overridable methods. This is widely considered bad
practice, but is unfortunately legal. Consider the following example:

```
class Super {
    Super() { overriddenMethod(); }

    void overriddenMethod() { System.out.println("hello"); }
}

class Sub extends Super {
    final int x;
    Sub(int x) { this.x = x; }

    @Override
    void overriddenMethod() { System.out.println(x); }
}
```

What does `new Sub(42);` print? It might be expected to print `42`, but it
actually prints `0`. This is because the `Super` constructor is implicitly
invoked _before_ the field assignment in the `Sub` constructor body. The `Super`
constructor then invokes `overriddenMethod`, causing that method in `Sub` to run
before the `Sub` constructor body has had a chance to assign `42` to the field.
As a result, the method in `Sub` sees the default value of the field, which is
`0`. This is the source of many bugs and errors.

Whilst this is considered bad programming practice, it is not uncommon, and it
presents a conundrum for subclasses, especially when modifying the superclass
is not an option.

This JEP provides a solution to the conundrum by allowing the `Sub` constructor
to _initialize_ the field in `Sub` before the `Super` constructor is invoked.
The example can be rewritten as follows, where only the `Sub` class is changed:

```
class Super {
    Super() { overriddenMethod(); }

    void overriddenMethod() { System.out.println("hello"); }
}

class Sub extends Super {
    final int x;
    Sub(int x) {
        this.x = x;  // Initialize the field before the super call
        super();
    }

    @Override
    void overriddenMethod() { System.out.println(x); }
}
```

Now, `new Sub(42);` will print `42`, because the field in `Sub` is assigned `42`
before `overriddenMethod` is invoked.

In a constructor body, a simple assignment to a field declared in the same class
is _allowed_ in an early construction context, provided the field declaration
lacks an initializer. This means that a constructor body can initialize the
class's own fields in an early construction context, but not the fields of a
superclass.

A constructor body cannot, of course, _read_ any of the fields of the current
instance -- whether declared in the same class as the constructor, or in a
superclass -- until after the explicit constructor invocation.
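
The rules above can be sketched as follows. This is only an illustration: the names `Base`, `Derived`, `describe`, and `EarlyContextDemo` are hypothetical, and the commented-out lines mark statements that the rules as described here would be expected to reject. Compiling it assumes a JDK on which this second preview is available (for example `javac --release 23 --enable-preview`) or a later release in which the feature is no longer a preview.

```java
class Base {
    final String description;

    Base() {
        // Invokes an overridable method during construction (bad practice, but legal).
        this.description = describe();
    }

    String describe() {
        return "base";
    }
}

class Derived extends Base {
    private final int size;   // no initializer: may be assigned before super()
    private int count = 1;    // has an initializer: may not be assigned early

    Derived(int size) {
        if (size < 0) {
            throw new IllegalArgumentException("size must be non-negative: " + size);
        }
        this.size = size;         // allowed: simple assignment to a field of this class

        // count = 2;             // expected error: the field has an initializer
        // int copy = this.size;  // expected error: reads a field of the instance
        // describe();            // expected error: implicit use of 'this'

        super();

        count++;                  // after super(), the usual instance rules apply
    }

    @Override
    String describe() {
        return "size=" + size;
    }

    int count() {
        return count;
    }
}

class EarlyContextDemo {
    public static void main(String[] args) {
        Derived d = new Derived(5);
        System.out.println(d.description); // prints "size=5"
    }
}
```

Because `size` is assigned before `super()`, the `Base` constructor observes `5` rather than the default `0` when it calls `describe()`.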



Specification
-------------

Summary of JLS modifications:

*   Update the grammar to allow statements (other than `return`) to appear prior
    to any explicit constructor invocation.
*   Define the statements up to and including an explicit constructor invocation as an
    "early construction context".
*   Narrow the definition of "static context" to exclude early construction
    contexts.
*   Update restrictions on static contexts to also restrict early construction
    contexts where appropriate.

There are no explicit specification changes for record and enum classes;
constructors in these classes inherit the new rules in the natural way:

*   Enum constructors and non-canonical record constructors may invoke `this()`
    but not `super()`; as a result, these constructors may now permit statements
    before any `this()` invocation.
*   Canonical record constructors are not allowed to explicitly invoke a
    constructor, so there is no effect on them.
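
A minimal sketch of the first point, assuming a JDK on which the feature is enabled (as a preview this needs `--enable-preview`). The names `Range`, `Unit`, and `RecordEnumDemo` are hypothetical and used only for illustration.

```java
record Range(int lo, int hi) {
    // Non-canonical record constructor: it must still delegate through this(...),
    // but statements may now come first.
    Range(String text) {
        String[] parts = text.split("-");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected lo-hi, got: " + text);
        }
        this(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
    }
}

enum Unit {
    KIB("1024"),
    MIB("1048576");

    final long bytes;

    // Enum constructor: only this(...) may be invoked explicitly, never super(...).
    Unit(String text) {
        long parsed = Long.parseLong(text); // statement before this(...)
        this(parsed);
    }

    Unit(long bytes) {
        this.bytes = bytes;
    }
}

class RecordEnumDemo {
    public static void main(String[] args) {
        System.out.println(new Range("10-20"));
        System.out.println(Unit.MIB.bytes);
    }
}
```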

Attached is a copy of the JLS changes.

Notes
-----------------

Discussion on amber-dev (these are all one thread):

* https://mail.openjdk.org/pipermail/amber-dev/2023-January/007680.html
* https://mail.openjdk.org/pipermail/amber-dev/2022-October/007537.html
* https://mail.openjdk.org/pipermail/amber-dev/2022-November/007540.html
* https://mail.openjdk.org/pipermail/amber-dev/2022-December/007627.html

Discussion on compiler-dev:

* https://mail.openjdk.org/pipermail/compiler-dev/2023-February/021960.html

Comments
[~darcy]: I added an example into the "History" section. Hope that helps! Thanks.
10-05-2024

Thanks [~gbierman]; yes, an example would be helpful.
09-05-2024

[~darcy] Apologies Joe; now added. Please let me know if you need more details - I could include an example, for example.
07-05-2024

Moving to Provisional. [~gbierman], please explicitly note the changes from the analogous feature in the previous release before the CSR is Finalized.
07-05-2024