JDK-8306915 : Implementation of JEP Launch Multi-File Source-Code Programs
  • Type: CSR
  • Component: tools
  • Sub-Component: javac
  • Priority: P4
  • Status: Closed
  • Resolution: Approved
  • Fix Versions: 22
  • Submitted: 2023-04-26
  • Updated: 2024-10-02
  • Resolved: 2023-11-28
Description
Summary
-------

Enhance the `java` launcher to run a program supplied as one or more files of Java source code.

Problem
-------

Source-code programs are currently limited to a single file. This makes it hard to grow a project gradually and to introduce a build configuration when it is appropriate, rather than when the JDK tools force one.

Solution
--------

We enhance the source-code launching capabilities of `java` so that it compiles source files in memory. However, unlike JEP 330, the compilation has a computed source-path rather than an empty one.

The means of launching a source code program is unchanged. It is still `java ... Prog.java`. But instead of only compiling the file `Prog.java` in memory, other source files required by the program will also be compiled in memory.

For "shebang" files, we only compile the single file (the source-path is empty).

Specification
-------------

In keeping with JEP 330, we do not require that the launched file have the same name as its public class.

(The sections "How the launcher finds source files" and "Launch-time semantics and operation" in the JEP are reproduced here)

### How the launcher finds source files

The `java` launcher expects that the source files of a multi-file program are located in a [standard directory hierarchy](https://docs.oracle.com/en/java/javase/20/docs/specs/man/javac.html#directory-hierarchies), where the directory structure follows the package structure. This means that (1) source files in the same directory are expected to declare classes in the same package, and (2) a source file in directory `foo/bar` declares a class in package `foo.bar`.    

For example, suppose a directory contains `Prog.java`, which declares classes in the unnamed package, and a subdirectory `pkg`, where `Helper.java` declares the class `Helper` in the package `pkg`:

```
// Prog.java
class Prog {
    public static void main(String[] args) { pkg.Helper.run(); }
}

// pkg/Helper.java
package pkg;
class Helper {
    static void run() { System.out.println("Hello!"); }
}
```

Running `java Prog.java` will cause `Helper.java` to be found in the `pkg` subdirectory and compiled in memory, resulting in the class `pkg.Helper` needed by code in class `Prog`.

If `Prog.java` declared classes in a named package, or `Helper.java` declared classes in a package other than `pkg`, then `java Prog.java` would fail.

The `java` launcher computes the _root_ of the source tree from the package and the filesystem location of the initial `.java` file. For `java Prog.java`, the initial file is `Prog.java` and it declares a class in the unnamed package, so the root of the source tree is the directory containing `Prog.java`. On the other hand, if `Prog.java` declared a class in a named package `a.b.c`, then `Prog.java` must be placed in the corresponding directory hierarchy:

```
a/
  b/
    c/
      Prog.java
```

and must be launched by running `java a/b/c/Prog.java`. The root of the source tree is the directory containing the subdirectory `a`.
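The root computation described above can be sketched as follows. This is a minimal illustration, not the launcher's actual code: the class name `SourceRootSketch` and the method `computeRoot` are hypothetical, and the package name is assumed to have already been read from the initial file's `package` declaration (an empty string stands for the unnamed package).

```java
import java.nio.file.Path;

// Hypothetical sketch of the root computation; names are illustrative only.
final class SourceRootSketch {

    /**
     * Returns the root of the source tree for the initial file, given the
     * package it declares ("" for the unnamed package). Throws if the file
     * is not located in the directory hierarchy matching its package.
     */
    static Path computeRoot(Path initialFile, String packageName) {
        Path dir = initialFile.toAbsolutePath().normalize().getParent();
        if (packageName.isEmpty()) {
            return dir; // unnamed package: the root is the containing directory
        }
        String[] segments = packageName.split("\\.");
        // Walk upward one directory per package segment, innermost first.
        for (int i = segments.length - 1; i >= 0; i--) {
            Path name = (dir == null) ? null : dir.getFileName();
            if (name == null || !name.toString().equals(segments[i])) {
                throw new IllegalArgumentException(
                    "Package " + packageName + " does not match the location of " + initialFile);
            }
            dir = dir.getParent();
        }
        return dir;
    }

    public static void main(String[] args) {
        // For a/b/c/Prog.java declaring package a.b.c, the root is the directory containing "a".
        System.out.println(computeRoot(Path.of("a/b/c/Prog.java"), "a.b.c"));
    }
}
```

The exception thrown on a mismatch corresponds to the failure case described next: a `package` declaration that disagrees with the file's location.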

If `Prog.java` declared classes in a different named package, then `java a/b/c/Prog.java` would fail. *This is a change in behavior of the `java` launcher's source-file mode.* Prior to JDK 22, source-file mode was permissive about which package, if any, was declared in a `.java` file at a given location; `java a/b/c/Prog.java` would succeed as long as `Prog.java` was found in `a/b/c/`, regardless of its `package` declaration. Since it is unusual for a `.java` file to declare classes in a named package without residing in the corresponding directory hierarchy, it is unlikely that the package is important; the simple fix is to remove the `package` declaration from the file.

### Launch-time semantics and operation

Since JDK 11, the launcher's source-file mode has worked as if

```
java <other options> --class-path <path> <.java file>
```

is informally equivalent to

```
javac <other options> -d <memory> --class-path <path> <.java file>
java  <other options> --class-path <memory>:<path> <first class in .java file>
```

With the ability to launch multi-file source-code programs, source-file mode now works as if

```
java <other options> --class-path <path> <.java file>
```

is informally equivalent to

```
javac <other options> -d <memory> --class-path <path> --source-path <root> <.java file>
java <other options> --class-path <memory>:<path> <launch class in .java file>
```

where `<root>` is the computed root of the source tree [as explained earlier](#How-the-launcher-finds-source-files), and `<launch class in .java file>` is chosen as follows:

* If the first top level class in the `.java` file declares a standard `main` method (`public static void main(String[])`), that class is chosen. This preserves compatibility with JEP 330, so the same `main` method is used when a source program grows from single-file to multi-file. It is also important for launching ["shebang" files](#"Shebang"-files) whose name may not match that of any class in the file.

* If the first top level class in the `.java` file does not declare a standard `main` method, then if another top level class in the file declares a standard `main` method and has a name that matches the file, that class is chosen. This maintains an experience as close as possible to that of launching a program compiled with `javac`. That is, when a source program grows to the point that it is desirable to run `javac` explicitly and execute the `class` files, the same launch class can be used.
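The two selection rules above can be modeled as a small function. This is a sketch under stated assumptions, not the launcher's implementation: the record `TopLevelClass` and the method `chooseLaunchClass` are hypothetical names, classes are identified by simple name, and the "name matches the file" test is approximated by stripping a trailing `.java` from the file name.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical model of launch-class selection; names are illustrative only.
final class LaunchClassSketch {

    /** A top-level class in the initial file, in declaration order. */
    record TopLevelClass(String name, boolean hasStandardMain) {}

    static Optional<String> chooseLaunchClass(String fileName, List<TopLevelClass> classes) {
        if (classes.isEmpty()) {
            return Optional.empty();
        }
        // Rule 1: the first top-level class wins if it declares a standard main method.
        TopLevelClass first = classes.get(0);
        if (first.hasStandardMain()) {
            return Optional.of(first.name());
        }
        // Rule 2: otherwise, a later class wins only if it declares a standard
        // main method AND its name matches the file name.
        String expected = fileName.endsWith(".java")
                ? fileName.substring(0, fileName.length() - ".java".length())
                : fileName;
        return classes.stream()
                .skip(1)
                .filter(c -> c.hasStandardMain() && c.name().equals(expected))
                .map(TopLevelClass::name)
                .findFirst();
    }
}
```

An empty result corresponds to the case where there is no launch class and the launcher reports an error.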

(The use of `--source-path` indicates to `javac` that classes co-located in a `.java` file are preferred to classes located in other `.java` files. For example, invoking `javac --source-path dir dir/Prog.java` will not compile `Helper.java` if `Prog.java` declares the class `Helper`.)

When the `java` launcher runs in source-file mode (e.g., `java Prog.java`) it takes the following steps:

1. Compute the directory which is the root of the source tree.

2. Determine the module of the source-code program. If a `module-info.java` file exists in the root then its module declaration is used to define a named module that will contain all the classes compiled from `.java` files in the source tree. If `module-info.java` does not exist then all the classes compiled from `.java` files will reside in the unnamed module.

3. Compile all the classes in the initial `.java` file, and possibly other `.java` files which declare classes referenced by code in the initial file, and store the resulting `class` files in an in-memory cache.

4. Determine the launch class in the initial file. If the first top level class in the initial file declares a standard `main` method, that class is the launch class; otherwise, if another top level class in the initial file declares a standard `main` method and has the same name as the file, that class is the launch class; otherwise, there is no launch class, and the launcher reports an error and stops.

5. Use a custom class loader to load the launch class from the in-memory cache, then invoke the standard `main` method of that class.
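Steps 3 and 5 (compile into an in-memory cache, then load the launch class from that cache and invoke `main`) can be approximated with the standard `javax.tools` API. The sketch below is only an illustration of the idea and makes several assumptions: all class names in it (`InMemoryLaunchSketch`, `SourceFile`, `ClassFile`, `MemoryFileManager`, `MemoryClassLoader`) are hypothetical, the source is passed as a string rather than read from a file, no source path or module handling is attempted, and the loader uses ordinary parent-first delegation rather than the search order the launcher specifies.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.lang.reflect.Method;
import java.net.URI;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.tools.FileObject;
import javax.tools.ForwardingJavaFileManager;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileManager;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

// Hypothetical sketch; not the launcher's actual implementation.
final class InMemoryLaunchSketch {

    /** Source text held in memory, standing in for the initial .java file. */
    static final class SourceFile extends SimpleJavaFileObject {
        private final String code;
        SourceFile(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + Kind.SOURCE.extension), Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) { return code; }
    }

    /** Receives the bytes of one compiled class. */
    static final class ClassFile extends SimpleJavaFileObject {
        private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ClassFile(String className) {
            super(URI.create("mem:///" + className.replace('.', '/') + Kind.CLASS.extension), Kind.CLASS);
        }
        @Override public OutputStream openOutputStream() { return bytes; }
        byte[] toByteArray() { return bytes.toByteArray(); }
    }

    /** Redirects the compiler's class-file output into a map: the in-memory cache. */
    static final class MemoryFileManager extends ForwardingJavaFileManager<StandardJavaFileManager> {
        final Map<String, ClassFile> cache = new HashMap<>();
        MemoryFileManager(StandardJavaFileManager delegate) { super(delegate); }
        @Override public JavaFileObject getJavaFileForOutput(
                JavaFileManager.Location location, String className,
                JavaFileObject.Kind kind, FileObject sibling) {
            ClassFile file = new ClassFile(className);
            cache.put(className, file);
            return file;
        }
    }

    /** Defines classes from the in-memory cache when the parent loader cannot find them. */
    static final class MemoryClassLoader extends ClassLoader {
        private final Map<String, ClassFile> cache;
        MemoryClassLoader(Map<String, ClassFile> cache) {
            super(ClassLoader.getSystemClassLoader());
            this.cache = cache;
        }
        @Override protected Class<?> findClass(String name) throws ClassNotFoundException {
            ClassFile file = cache.get(name);
            if (file == null) throw new ClassNotFoundException(name);
            byte[] b = file.toByteArray();
            return defineClass(name, b, 0, b.length);
        }
    }

    /** Compiles the source in memory, loads the named class, and invokes its main method. */
    static void launch(String className, String source, String... args) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) throw new IllegalStateException("No system Java compiler available");
        MemoryFileManager files = new MemoryFileManager(compiler.getStandardFileManager(null, null, null));
        // -proc:none disables annotation processing for this compilation.
        Boolean ok = compiler.getTask(null, files, null, List.of("-proc:none"), null,
                List.of(new SourceFile(className, source))).call();
        if (!ok) throw new IllegalStateException("Compilation failed");
        try {
            Class<?> launchClass = new MemoryClassLoader(files.cache).loadClass(className);
            Method main = launchClass.getMethod("main", String[].class);
            main.setAccessible(true); // the launch class itself need not be public
            main.invoke(null, (Object) args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A call such as `InMemoryLaunchSketch.launch("Hello", sourceText, "arg")` compiles and runs the class without writing any `class` file to disk. It requires a full JDK at run time, since `ToolProvider.getSystemJavaCompiler()` returns `null` when no compiler is available.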

When the custom class loader is invoked to load a class — either the launch class or any other class that needs to be loaded while running the program — the loader performs a search that mimics the order of `javac`'s [`-Xprefer:source`](https://docs.oracle.com/en/java/javase/21/docs/specs/man/javac.html#searching-for-module-package-and-type-declarations) option at compile time. In particular, if a class exists both in the source tree (declared in a `.java` file) and on the class path (in a `.class` file) then the class in the source tree is preferred. The loader's search algorithm for a class named `C` is:

1. If a class file for `C` is found in the in-memory cache then the loader defines the cached class file to the JVM, and loading of `C` is complete.

2. Otherwise, the loader delegates to the application class loader to search for a class file for `C` that is exported by a named module which is read by the module of the source-code program and, also, is present on the module path or in the Java run-time image. (The unnamed module, in which the source-code program may reside, reads a [default set of modules](https://openjdk.org/jeps/261#Root-modules) in the Java run-time image.) If found, loading of `C` is completed by the application class loader.

3. Otherwise, the loader searches for a `.java` file whose name matches the name of the class (or the enclosing class if the requested class is a member class), i.e. `C.java`, located in the directory corresponding to the package of the class. If found, all the classes declared in the `.java` file are compiled. If compilation succeeds then the resulting class files are stored in the in-memory cache, the loader defines the class `C` to the JVM using the cached class file, and loading of `C` is complete. If compilation fails then the launcher reports the error and terminates with a non-zero exit status.

    When compiling `C.java`, the launcher may choose to eagerly compile other `.java` files that declare classes referenced by `C.java`, and store the resulting class files in the in-memory cache. This choice is based on heuristics that may change between JDK releases.

4. Otherwise, if the source-code program resides in the unnamed module, the loader delegates to the application class loader to search for a class file for `C` on the class path. If found then loading of `C` is completed by the application class loader.

5. Otherwise, a class named `C` cannot be found, and the loader throws a `ClassNotFoundException`.
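The search order in the five steps above can be modeled as a pure lookup over sets of class names. This is a simplified model, not a class loader: the names `LoaderSearchSketch`, `Origin`, `World`, and `locate` are hypothetical, module readability and exports are collapsed into a single set, and a member class is mapped to its enclosing top-level class by cutting the binary name at the first `$` (an approximation, since `$` may legally appear in ordinary class names).

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical model of the loader's search order; names are illustrative only.
final class LoaderSearchSketch {

    enum Origin { MEMORY_CACHE, MODULE, SOURCE_TREE, CLASS_PATH }

    /**
     * Where classes can be found.
     * memoryCache: binary names of classes already compiled in memory.
     * modules:     classes exported by modules that the program's module reads.
     * sourceTree:  top-level classes that have a matching .java file in the source tree.
     * classPath:   classes available as class files on the class path.
     */
    record World(Set<String> memoryCache, Set<String> modules, Set<String> sourceTree,
                 Set<String> classPath, boolean programInUnnamedModule) {}

    /** Returns where the class would be loaded from; empty models ClassNotFoundException. */
    static Optional<Origin> locate(String binaryName, World w) {
        if (w.memoryCache().contains(binaryName)) {
            return Optional.of(Origin.MEMORY_CACHE);          // step 1
        }
        if (w.modules().contains(binaryName)) {
            return Optional.of(Origin.MODULE);                // step 2
        }
        int dollar = binaryName.indexOf('$');
        String topLevel = dollar < 0 ? binaryName : binaryName.substring(0, dollar);
        if (w.sourceTree().contains(topLevel)) {
            return Optional.of(Origin.SOURCE_TREE);           // step 3: compile on demand
        }
        if (w.programInUnnamedModule() && w.classPath().contains(binaryName)) {
            return Optional.of(Origin.CLASS_PATH);            // step 4
        }
        return Optional.empty();                              // step 5
    }
}
```

The model makes the preference visible: a class present both in the source tree and on the class path resolves to `SOURCE_TREE`, and the class path is consulted only when the program resides in the unnamed module.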

Classes loaded from the class path or the module path cannot reference classes that are compiled in memory from `.java` files. That is, when class references in pre-compiled classes are encountered, the source tree is never consulted.

### Differences between compilation at compile-time and launch-time

There are some major differences between how the Java compiler compiles code on the source path when using `javac` and how it compiles code when using the `java` launcher in source-file mode:

1. In source-file mode, the classes that are referenced and found in `.java` files may be compiled *during program execution*, rather than all being compiled before execution starts. This means that a compilation error may occur, causing the launcher to terminate, after the program has already started executing. This developer experience is very different from prototyping with explicit compilation via `javac`, but it works effectively in the fast-moving "edit-run" cycle enabled by source-file mode.

2. In source-file mode, classes that are accessed via reflection are loaded in the same manner as classes that are accessed directly. For example, if the program calls `Class.forName("pkg.Helper")`, then the launcher's custom class loader will attempt to load the class `Helper` in the package `pkg`, potentially causing compilation of `pkg/Helper.java`. Similarly, if a package's annotations are queried via `Package.getAnnotations`, then an appropriately-placed `package-info.java` file in the source tree will be compiled in memory and loaded.

3. In source-file mode, annotation processing is disabled, similar to when `--proc:none` is passed to `javac`.

4. In source-file mode, it is not possible to run a source code program whose `.java` files span multiple modules.

Comments
Moving to Approved contingent on a release note describing the behavioral incompatibility being written.
28-11-2023

[~darcy]
> What experiments, if any, have been done to see if the behavioral incompatibilities have much impact in practice?
It's impossible to test the impact, because it depends on usage in the field, i.e. on how often people write single-file source programs that declare a package but aren't placed in an appropriate directory. However, the remediation is simple: either remove the package name from the single-source-file program (which must be self-contained) or place it in an appropriate directory. The latter is likely to be the case, anyway, if the program is part of the project; the former should be easy if it's not.
> Should the top-level documentation of jdk.compiler be updated to discuss the differences in the implicit compilation done here?
I don't think so. The differences are in the launcher, not in the behaviour of the compiler.
> Besides the java man page, is there any other specification or documentation that should be updated to describe what is being done?
There are changes to the online `java` help, too, but when it comes to external documentation, I believe source-code launching has always only been documented in the `java` man page.
03-11-2023

Moving to Provisional, not Approved. What experiments, if any, have been done to see if the behavioral incompatibilities have much impact in practice? The attached man page updates reference JEP 330 rather than JEP 458, which I assume is an oversight. Should the top-level documentation of jdk.compiler be updated to discuss the differences in the implicit compilation done here? Besides the java man page, is there any other specification or documentation that should be updated to describe what is being done? Before the CSR is re-Finalized, please also attach some rendered version of the full updated man page.
31-10-2023

A Finalized CSR should have a specific release set; I assume this is intended for JDK 22.
04-10-2023

Moving back to Provisional. [~rpressler], please have one or more engineers review this CSR before re-Finalizing it.
06-05-2023

Added draft diff to the `java` man page.
02-05-2023

Before this CSR is Finalized, the diffs to the java command "man page" (i.e. https://download.java.net/java/early_access/jdk21/docs/specs/man/java.html#using-source-file-mode-to-launch-single-file-source-code-programs) should be included as well. Moving to Provisional, not Approved.
02-05-2023