JDK-8051619 : JEP 201: Modular Source Code
  • Type: JEP
  • Component: not defined
  • Priority: P1
  • Status: Closed
  • Resolution: Delivered
  • Fix Versions: 9
  • Submitted: 2014-07-22
  • Updated: 2020-12-07
  • Resolved: 2014-09-24
Sub Tasks
JDK-8054834 :  
Description
Summary
-------

Reorganize the JDK source code into modules, enhance the build system to
compile modules, and enforce module boundaries at build time.


Non-Goals
---------

This JEP does not change the structure of the JRE and JDK binary images,
nor does it introduce a module system.  That work is covered by the
related JEPs [220](http://openjdk.java.net/jeps/220) and
[261](http://openjdk.java.net/jeps/261).

This JEP defines a new source-code layout for the JDK.  This layout may
be used outside of the JDK, but it is not a goal of this JEP to design a
broadly-accepted universal modular source-code layout.



Motivation
----------

[Project Jigsaw][jig] aims to design and implement a standard module
system for the Java SE Platform and to apply that system to the
Platform itself, and to the JDK.  Its primary goals are to make
implementations of the Platform more easily scalable down to small
devices, improve security and maintainability, enable improved
application performance, and provide developers with better tools for
programming in the large.

The motivations to reorganize the source code include:

  1. Give JDK developers the opportunity to become familiar with the
     modular structure of the system;

  2. Preserve that structure going forward by enforcing module boundaries
     in the build, even prior to the introduction of a module system; and

  3. Enable development of Project Jigsaw to proceed without always
     having to "shuffle" the present non-modular source code into modular
     form.


Description
-----------


### Current scheme

Most of the JDK source code is today organized, roughly, in a scheme that
dates back to 1997.  In abbreviated form:

    src/{share,$OS}/{classes,native}/$PACKAGE/*.{java,c,h,cpp,hpp}

where:

  - The `share` directory contains shared, cross-platform code;

  - The `$OS` directory contains operating-system-specific code, where
    `$OS` is one of `solaris`, `windows`, _etc._;

  - The `classes` directory contains Java source files, and possibly
    resource files;

  - The `native` directory contains C or C++ source files; and

  - `$PACKAGE` is the relevant Java API package name, with periods
    replaced by slashes.

To take a simple example, the source code for the `java.lang.Object`
class in the `jdk` repository resides in two files, one in Java and the
other in C:

    src/share/classes/java/lang/Object.java
              native/java/lang/Object.c

For a less trivial example, the source code for the package-private
`java.lang.ProcessImpl` and `ProcessEnvironment` classes is
operating-system-specific; for Unix-like systems it resides in three
files:

    src/solaris/classes/java/lang/ProcessImpl.java
                                  ProcessEnvironment.java
                native/java/lang/ProcessEnvironment_md.c

(Yes, the second-level directory is named `solaris` even though this code
is relevant to all Unix derivatives; more on this below.)

There are a handful of directories under `src/{share,$OS}` that don't
match the current structure, including:

    Directory                     Content
    --------------------------    --------------------------
    src/{share,$OS}/back          JDWP back end
                    bin           Java launcher
                    instrument    Instrumentation support
                    javavm        Exported JVM include files
                    lib           Files for $JAVA_HOME/lib
                    transport     JDWP transports


### New scheme

The modularization of the JDK presents a rare opportunity to completely
restructure the source code in order to make it easier to maintain.  We
implement the following scheme in every repository in the JDK forest
except for `hotspot`.  In abbreviated form:

    src/$MODULE/{share,$OS}/classes/$PACKAGE/*.java
                            native/include/*.{h,hpp}
                                   $LIBRARY/*.{c,cpp}
                            conf/*
                            legal/*

where:

  - `$MODULE` is a module name (_e.g._, `java.base`);

  - The `share` directory contains shared, cross-platform code, as
    before;

  - The `$OS` directory contains operating-system-specific code, as
    before, where `$OS` is one of `unix`, `windows`, _etc._;

  - The `classes` directory contains Java source files and resource files
    organized into a directory tree reflecting their API `$PACKAGE`
    hierarchy, as before;

  - The `native` directory contains C or C++ source files, as before but
    organized differently:

    - The `include` directory contains C or C++ header files intended to
      be exported for external use (_e.g._, `jni.h`);

    - C or C++ source files are placed in a `$LIBRARY` directory, whose
      name is that of the shared library or DLL into which the compiled
      code will be linked (_e.g._, `libjava` or `libawt`);

  - The `conf` directory contains configuration files meant to be edited
    by end users (_e.g._, `net.properties`); and

  - The `legal` directory contains legal notices.

To recast the previous examples, the source code for the
`java.lang.Object` class is laid out as follows:

    src/java.base/share/classes/java/lang/Object.java
                        native/libjava/Object.c

The source code for the package-private `java.lang.ProcessImpl` and
`ProcessEnvironment` classes is laid out this way:

    src/java.base/unix/classes/java/lang/ProcessImpl.java
                                         ProcessEnvironment.java
                       native/libjava/ProcessEnvironment_md.c

(We took the opportunity here, finally, to rename the `solaris` directory
to `unix`.)

The content of the directories currently under `src/{share,$OS}` that
don't match the current structure is now in appropriate modules:

    Directory                     Module
    --------------------------    --------------------------
    src/{share,$OS}/back          jdk.jdwp.agent
                    bin           java.base
                    instrument    java.instrument
                    javavm        java.base
                    lib           $MODULE/{share,$OS}/conf
                    transport     jdk.jdwp.agent

Files in the current `lib` directory that are not intended to be edited
by end users are now resource files.


### Build-system changes

The build system now compiles one module at a time rather than one
repository at a time, and it compiles modules according to a reverse
topological sort of the module graph.  Modules that do not depend on each
other, directly or indirectly, are compiled concurrently when possible.
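
The ordering can be illustrated with a minimal sketch. It assumes a
hypothetical in-memory dependency map; the actual build system does not
necessarily work this way, and the names `ModuleOrder` and `compileOrder`
are invented for this example. A depth-first, post-order walk emits each
module only after every module it depends upon.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the JDK build: orders modules so that
// every module appears after the modules it depends upon.
final class ModuleOrder {

    // deps maps a module name to the names of the modules it requires.
    static List<String> compileOrder(Map<String, List<String>> deps) {
        List<String> order = new ArrayList<>();
        Set<String> done = new LinkedHashSet<>();
        Set<String> inProgress = new LinkedHashSet<>();
        for (String module : deps.keySet()) {
            visit(module, deps, done, inProgress, order);
        }
        return order;
    }

    private static void visit(String module, Map<String, List<String>> deps,
                              Set<String> done, Set<String> inProgress,
                              List<String> order) {
        if (done.contains(module)) {
            return;
        }
        if (!inProgress.add(module)) {
            throw new IllegalStateException("Cyclic dependency at " + module);
        }
        for (String dep : deps.getOrDefault(module, List.of())) {
            visit(dep, deps, done, inProgress, order);
        }
        inProgress.remove(module);
        done.add(module);
        order.add(module); // post-order: dependencies come first
    }

    public static void main(String[] args) {
        // Illustrative dependency data, not the full JDK module graph.
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("java.desktop", List.of("java.base", "java.xml"));
        deps.put("java.xml", List.of("java.base"));
        deps.put("java.base", List.of());
        System.out.println(compileOrder(deps));
        // prints [java.base, java.xml, java.desktop]
    }
}
```

Modules with no path between them in this graph have no ordering
constraint relative to each other, which is what allows them to be
compiled concurrently.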

A side benefit of compiling modules rather than repositories is that code
in the `corba`, `jaxp`, and `jaxws` repositories can make use of new Java
language features and APIs.  This was previously forbidden, since those
repositories were compiled before the `jdk` repository.

The compiled classes in an intermediate (_i.e._, non-image) build are
divided into modules.  Where today we have:

     jdk/classes/*.class

the revised build system produces:

     jdk/modules/$MODULE/*.class

The structure of image builds, as noted, does not change; there are very
minor differences in their content.

Module boundaries are enforced at build time, insofar as possible, by the
build system.  If a module boundary is violated then the build will fail.
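
This JEP does not describe how the check is implemented. Purely as an
illustration, a boundary check might resemble the following sketch, in
which the data model (a map from package to owning module, plus a set of
declared dependences) and the names `BoundaryCheck` and `violations` are
all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: the data model and all names here are assumptions,
// not the mechanism used by the JDK build.
final class BoundaryCheck {

    // Returns the referenced packages that the given module may not use:
    // packages whose owner is unknown, or is neither the module itself
    // nor one of its declared dependences.
    static List<String> violations(String module,
                                   Set<String> declaredDeps,
                                   Map<String, String> packageOwner,
                                   List<String> referencedPackages) {
        List<String> bad = new ArrayList<>();
        for (String pkg : referencedPackages) {
            String owner = packageOwner.get(pkg);
            boolean allowed = owner != null
                && (module.equals(owner) || declaredDeps.contains(owner));
            if (!allowed) {
                bad.add(pkg);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        Map<String, String> owner = Map.of(
            "java.lang", "java.base",
            "java.util.logging", "java.logging",
            "java.sql", "java.sql");
        // Suppose code in java.logging started referring to java.sql.
        List<String> bad = violations("java.logging", Set.of("java.base"),
            owner, List.of("java.lang", "java.util.logging", "java.sql"));
        if (!bad.isEmpty()) {
            // A real build would fail here rather than just print.
            System.out.println("Module boundary violated: " + bad);
        }
    }
}
```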


Alternatives
------------

There are numerous other possible source-layout schemes, including:

  1. Keep `{share,$OS}` at the top, with a `modules` directory to contain
     each module's Java source files:

         src/{share,$OS}/modules/$MODULE/$PACKAGE/*.java
                         native/include/*.{h,hpp}
                                $LIBRARY/*.{c,cpp}
                         conf/*

  2. Put everything under the appropriate `$MODULE` directory, but keep
     `{share,$OS}` at the top:

         src/{share,$OS}/$MODULE/classes/$PACKAGE/*.java
                                 native/include/*.{h,hpp}
                                        $LIBRARY/*.{c,cpp}
                                 conf/*

  3. Push `{share,$OS}` down into the `$MODULE` directories, as in the
     present proposal, but remove the intermediate `classes` directory
     and prefix the names of the `native` and `conf` directories with an
     underscore, all so as to simplify the common case of pure Java
     modules:

         src/$MODULE/{share,$OS}/$PACKAGE/*.java
                                 _native/include/*.{h,hpp}
                                         $LIBRARY/*.{c,cpp}
                                 _conf/*

  4. A variant of scheme 3, but with `{share,$OS}` at the top:

         src/{share,$OS}/$MODULE/$PACKAGE/*.java
                                 _native/include/*.{h,hpp}
                                         $LIBRARY/*.{c,cpp}
                                 _conf/*

  5. Another variant of scheme 3, pushing `{share,$OS}` deeper down so as
     to further simplify the case of pure Java modules with no
     `$OS`-specific code:

         src/$MODULE/$PACKAGE/*.java
                     _native/include/*.{h,hpp}
                             $LIBRARY/*.{c,cpp}
                     _conf/*
                     _$OS/$PACKAGE/*.java
                         _native/include/*.{h,hpp}
                                 $LIBRARY/*.{c,cpp}
                         _conf/*

We rejected the schemes involving underscores (3–5) as too
unfamiliar and difficult to navigate.  We prefer the present proposal
over schemes 1 and 2 because it entails the least change from
the current scheme while placing all of the source code for a module
under a single directory.  Tools and scripts that depend upon the current
scheme must be revised, but at least for Java source code the structure
underneath each `$MODULE` directory is the same as before.

Additional issues which we considered:

  - _Should we define distinct directories for resource files, so that
    they would be separate from Java source files?_ — No; this
    does not seem worth the trouble.

  - _Some modules have content that spans repositories; is this a
    problem?_ — It's an annoyance, but the build system can cope
    with it via the magic of the `VPATH` mechanism.  Over time we might
    restructure the repositories to reduce or even eliminate cross-repo
    modules, but that's beyond the scope of this JEP.

  - _Some modules have multiple native libraries; should we merge them so
    that each module has at most one native library?_ — No; in
    some cases we need the flexibility of multiple native libraries per
    module, _e.g._, for "headless" _vs._ "headful" AWT.


Testing
-------

As stated, this JEP does not change the structure of the JRE and JDK
binary images, and makes only minor changes to the content.  We therefore
validated this change by comparing images built with it against images
built without it, and running tests to validate the actual minor changes.


Risks and Assumptions
---------------------

We assumed that Mercurial would be able to handle the massive number of
file-rename operations that would be necessary to implement this change,
and to preserve all historical information in the process.  Early testing
showed Mercurial to be capable of this, but there is still a minor risk
that the relationships between the new and old locations of some files
were not properly recorded.  In that case the history of the file in its
old location will still be in the repository; it will just be more
difficult to find.

It is impossible to apply a patch created against a repository using the
old scheme directly to a repository using the new scheme, and vice versa.
To mitigate this we developed a script to translate the file names in a
patch from their old locations to their new locations.
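
The script itself is not reproduced here. The core of such a translation,
for Java source files only, could look like the following sketch. It
assumes a caller-supplied map from package name to module name, and it
treats every `solaris` directory as `unix`, following the examples above;
neither assumption is taken from the actual script, and the name
`PathTranslator` is invented for this example. Native files are left
untouched because mapping them also requires knowing the target
`$LIBRARY`.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the script mentioned above.
final class PathTranslator {

    // Old layout: src/{share,$OS}/classes/$PACKAGE/<file>
    private static final Pattern OLD_CLASSES_PATH =
        Pattern.compile("src/([^/]+)/classes/(.+)/([^/]+)");

    // Returns the new-layout path, or the input unchanged if it is not
    // an old-layout class path or its package has no known module.
    static String translate(String oldPath, Map<String, String> packageToModule) {
        Matcher m = OLD_CLASSES_PATH.matcher(oldPath);
        if (!m.matches()) {
            return oldPath;
        }
        // Assumption: every "solaris" directory becomes "unix".
        String os = m.group(1).equals("solaris") ? "unix" : m.group(1);
        String packageDir = m.group(2);
        String module = packageToModule.get(packageDir.replace('/', '.'));
        if (module == null) {
            return oldPath;
        }
        return "src/" + module + "/" + os + "/classes/"
            + packageDir + "/" + m.group(3);
    }

    public static void main(String[] args) {
        Map<String, String> modules = Map.of("java.lang", "java.base");
        System.out.println(
            translate("src/share/classes/java/lang/Object.java", modules));
        // prints src/java.base/share/classes/java/lang/Object.java
        System.out.println(
            translate("src/solaris/classes/java/lang/ProcessImpl.java", modules));
        // prints src/java.base/unix/classes/java/lang/ProcessImpl.java
    }
}
```

A patch translator would apply a function of this kind to the file names
in each diff header and leave the hunks themselves unchanged.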


Dependences
-----------

This JEP is the second of several JEPs for [Project Jigsaw][jig].  It
incorporates the definition of the modular structure of the JDK from
[JEP 200][jmj], but it does not explicitly depend upon that JEP.


[jig]: http://openjdk.java.net/projects/jigsaw/
[jmj]: https://bugs.openjdk.java.net/browse/JDK-8051618
[mxml]: https://bugs.openjdk.java.net/secure/attachment/21575/modules.xml

Comments
Under Non-Goals it should be mentioned that the jtreg test source code will not be moved, reorganized, or modularized as part of this JEP.
15-08-2014