Summary
-------
Introduce autoconf (`./configure`-style) build setup, refactor the Makefiles
to remove recursion, and leverage
[JEP 139: Enhance javac to Improve Build Speed](JDK-8046129).
Goals
-----
The top-level goals that we are trying to achieve are:
1. Increase build speed radically
2. Simplify build-system source code (Makefiles, _etc._)
3. Simplify work for developers
4. Get exact and reproducible build output
5. Simplify build-machine configurations (JPRT, _etc._)
We will address these goals through four sub-projects, which are more or less
tightly intertwined.
1. Update the Makefile structure
2. Use autoconf (configure script)
3. Add parallel Java compilation support
4. Make Java builds incremental
We need to properly understand existing developer workflows so that we can
minimize the impact of this change for everyone.
This project is part of a larger effort to improve the build infrastructure of
the JDK. We expect this project to be closely followed by further steps. The
distinction between these steps is somewhat arbitrary, and is made only to
quickly benefit from the first work done on improving the JDK build
infrastructure.
Non-Goals
---------
Since we will update the Makefiles with a new structure, several issues that we
want to address in the future might be resolved as a side effect of the
update. However, we are not specifically addressing these issues during this
project, and we will neither test them nor guarantee that they will work
properly. (We will, however, try to make sure that we don't break anything that
works.) These issues include:
- Make it easy to port to new platforms
- Make it possible to do JDK development without a network connection
- Provide proper support for cross-compilation, including compilation of
32-bit binaries on 64-bit hosts
- Improve the handling of warnings
We will also not address issues that are scheduled for future steps. (However,
some of this work will lay the ground for these future improvements.) These
issues include:
- Speed up HotSpot compilation
- Upgrade compilers
- Support IDE projects
- Reconsider the source-drop mechanism
Success Metrics
---------------
### Build simplicity
Given that all prerequisites are available, building should be accomplished by:
1. Getting the source code from the Mercurial repositories
2. `./configure`
3. `make`
### Build speed
Build speed depends on hardware factors, and improvements will vary. Our target
is compiling on Linux on an 8-way machine. In this case, the time spent
building the JDK after our improvements should be at most 33% of the current
time. (Typically this means going from ~15 minutes to ~5 minutes or less.) A
stretch goal is that build time should be at most 20% (~3 minutes) for the JDK.
Note that this is just for the JDK. It does not include building HotSpot or
creating Javadoc.
### Makefile cleanup
All small (<3 kB) recursive Makefiles in the JDK (not including HotSpot) should
be removed, and the functionality collected into central Makefiles. (A small
number of Makefiles is not in itself a goal; however, having the code in one
or a few places helps with overview and understanding.)
Motivation
----------
Building the complete JDK is unnecessarily slow. This puts an extra burden on
developers and build systems. As a result, developers check out and build just
a part of the source code, since the product as a whole takes too long to
build.
The current implementation of the build system, with more than 350 minimal,
recursive Makefiles scattered all around the product, makes it hard to make
changes to the build system. The current solution also sometimes requires
updating Makefiles just to add new source files or directories; this should not
be needed.
Today the build system is configured by using several environment
variables. This is in contrast to the popular method of using `./configure` to
set up the build system. Apart from familiarity, a configure script has several
benefits over environment variables. Arguments to configure are checked -- a
misspelled argument results in an error, whereas a misspelled environment
variable is just ignored. `./configure --help` shows a list of available
arguments, whereas it is almost impossible to get a comprehensive list of all
environment variables that affect the current build system.
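
As a rough illustration of this difference, the following hand-written shell
sketch (not autoconf-generated code) rejects an unrecognized argument with an
error. The function name `parse_args`, the option names, and the variables it
sets are hypothetical and chosen only for this example.

```shell
# Illustrative only: reject unknown arguments instead of ignoring them.
# Option and variable names are hypothetical examples.
parse_args() {
  for arg in "$@"; do
    case "$arg" in
      --with-boot-jdk=*)
        BOOT_JDK=${arg#--with-boot-jdk=} ;;
      --enable-debug)
        DEBUG=yes ;;
      --help)
        echo "Usage: configure [--with-boot-jdk=DIR] [--enable-debug]" ;;
      *)
        # A misspelled argument ends up here and stops the configuration.
        echo "configure: error: unrecognized option: $arg" >&2
        return 1 ;;
    esac
  done
}
```

With this sketch, `parse_args --with-bot-jdk=/opt/jdk7` fails with an error
message instead of silently continuing with a default setting.
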
Description
-----------
These changes will not result in any changes in the built product; they only
affect the internal development process.
### Update the Makefile structure
#### Background
Updating the old Makefiles to a new, simplified architecture will be
fundamental for all other work described here.
#### Implementation
The current style of recursive makefiles with one file per directory will be
removed. Instead, the makefiles will discover files to be compiled by looking
recursively into source-code directories. Files that should not be compiled
will be listed as explicit exclusions. This approach is required in order to
use the new parallel javac compiler.
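
The discovery-with-exclusions idea can be sketched in shell as follows. The
real mechanism would live in the Makefiles; the function name `find_sources`
and its argument convention are assumptions made for this illustration only.

```shell
# Illustrative sketch: list all .java files under a source root, minus
# explicitly excluded paths. Names and conventions are hypothetical.
# usage: find_sources SRC_ROOT [EXCLUDED_PATH_PREFIX...]
find_sources() {
  src_root=$1
  shift
  find "$src_root" -type f -name '*.java' | sort |
  while IFS= read -r file; do
    keep=yes
    # Drop the file if it lives under any excluded prefix.
    for excluded in "$@"; do
      case "$file" in
        "$src_root/$excluded"*) keep=no ;;
      esac
    done
    if [ "$keep" = yes ]; then
      printf '%s\n' "$file"
    fi
  done
}
```

Adding a new source file then requires no build-system change at all; only
files that must not be compiled need to be mentioned.
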
Code common to several subsystems will be stored in a new, top-level directory
"common/make". The design idea is that these common files will provide a
library with helper functions, so that the per-subsystem Makefiles can be
written as simply and cleanly as possible. We will accept a greater code
complexity in these libraries if it allows for increased simplicity of the
per-subsystem Makefiles.
Since good coding practice is not automatically enforced by the Makefile
syntax, we will take extra care to make sure we write proper and readable code.
As part of the update, we will produce a document describing the coding
guidelines we have found useful and have followed during the rewrite process,
so as to guide future changes in the Makefiles. We will also produce a document
describing the overall architecture of the Makefiles.
The Makefiles also have targets that do things other than build the resulting
binaries, or that build unusual variants of them. Some of these targets appear
arcane and no longer used. If all stakeholders agree, we will not port such
targets to the new system. The features we are so far considering for removal
are:
- _(currently empty)_
#### Mixing old and new
It is probably possible to keep the old Makefile system around, in parallel
with the new rewritten Makefile system, so we have two ways of building the
product (new and old) for some time. This is not really desirable, since it
risks leading to code duplication and general confusion, and would make us miss
out on the benefits of removing the old system. However, keeping the old
system, or having an easy way of restoring it, would help us manage the risk
involved.
#### Transition
Most developers will not have much interaction with the actual makefiles, so
there will not be any large changes in workflow.
Previously, the Makefiles sometimes needed to be updated when source files
or directories were added or deleted. This will no longer be needed, and this
change needs to be communicated to all developers.
Developers who want to change the actual Makefiles need to understand the
overall design and coding principles used. This will be documented, but the
existence of these documents needs to be communicated.
### Use autoconf (configure script)
#### Background
The basic idea behind autoconf is that a single, simple interface will handle
the "glue" issues between a user's system's configuration and the requirements
of the Makefiles. This interface is the `./configure` shell script.
Using autoconf thus has two facets -- creating and using the `./configure`
script. The `configure` shell script is generated by the autoconf tools from
the source code in `configure.ac` (and accompanying helper files), which is
written using M4 macros. This script (even though it is generated) is checked
into the repository. Whenever the `configure.ac` source code is changed, the
`configure` script needs to be regenerated and updated in the repository. To
regenerate `configure`, the autoconf tools need to be installed on the system.
The typical user, however, will not need to do this. Since `configure` is
checked in, they only need to run `./configure`, which does not require the
autoconf tools. Running it produces a `config.spec` file in Makefile syntax,
which determines the build details and is included by the Makefile.
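
To make the role of `config.spec` concrete, here is a minimal sketch of how a
script could write its findings as Makefile variable assignments. The function
name `write_config_spec`, the variable names, and the default values are all
hypothetical; the real file contents are not specified here.

```shell
# Illustrative sketch: write settings as Makefile variable assignments.
# Variable names and defaults are hypothetical, not the real config.spec.
# usage: write_config_spec OUTPUT_FILE
write_config_spec() {
  spec_file=$1
  {
    echo "# Generated by configure; do not edit."
    echo "BOOT_JDK := ${BOOT_JDK:-/usr/lib/jvm/default}"
    echo "DEBUG_LEVEL := ${DEBUG_LEVEL:-release}"
    echo "JOBS := ${JOBS:-1}"
  } > "$spec_file"
}
```

A Makefile could then pick up these values with an `include config.spec` line,
so the decisions are made once at configure time rather than on every build.
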
#### Autoconf implementation
The configure script has three major tasks:
1. Determine that all build dependencies are present.
2. Analyze known differences between platforms and determine which apply in
the current situation.
3. Apply the arguments given by the user to specialize the build.
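
The first two tasks might look roughly like the hand-written shell sketch
below. In practice autoconf macros would generate such checks; the function
names `check_tools` and `detect_platform` and the platform labels are
assumptions made for this example.

```shell
# Task 1 (sketch): verify that required tools are on the PATH.
# usage: check_tools TOOL...
check_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  if [ -n "$missing" ]; then
    echo "configure: error: missing build dependencies:$missing" >&2
    return 1
  fi
}

# Task 2 (sketch): map a kernel name to a platform label.
# usage: detect_platform [KERNEL_NAME]   (defaults to the output of uname -s)
detect_platform() {
  case "${1:-$(uname -s)}" in
    Linux)  echo linux ;;
    Darwin) echo macosx ;;
    SunOS)  echo solaris ;;
    *)      echo unknown ;;
  esac
}
```

Failing early with a list of everything that is missing is the main point of
the first task: the user learns about all problems before the build starts.
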
Even though the autoconf framework helps with all of these tasks, they must all
be explicitly coded with knowledge about the specifics of OpenJDK. This means
that we need to be clear about what build dependencies we actually have, what
differences need to be determined, and in what ways the user can influence the
build result.
The build dependencies have previously been described in the README file.
The known differences have previously been encoded in the Makefiles, or have
been "common knowledge".
The user influences have historically been expressed through environment
variables, and the checks for these have been in the Makefiles.
The configure script can work like a "wrapper" for the old Makefiles, and set
up the same variables in `config.spec` as the Makefiles have been using. In
this case, it will be almost transparent to the Makefiles that the variables
came from the configure script instead of the user. However, in many cases a
better solution is probably to output a "cleaner" variable, and rewrite the
corresponding parts of the Makefiles.
#### Legal status
As part of using autoconf, we need to include three files from autoconf in the
JDK 8 source repository. The three files are `pkg.m4`, `config.guess` and
`config.sub`. Legal clearance for inclusion of these files in OpenJDK has
been requested. We believe this should not be a problem, since the autoconf
license is explicitly written to support this use case (basically allowing us
to distribute them any way we like, as long as they are used as part of a
configure script).
#### Transition
The current workflow when building OpenJDK is basically:
1. Retrieve source code from repository
2. Set up a slew of environment variables
3. Run make
4. Repeat 2 and 3 each time a rebuild is needed
Many team members have created personal shell scripts and similar solutions to
help with this.
The new workflow using configure scripts will instead lead to:
1. Retrieve source code from repository
2. Run `./configure`, possibly with specializing arguments
3. Run make
4. Repeat 3 each time a rebuild is needed
Since step 3 is so easy, no shell scripts will be needed to rebuild. However,
users who have heavily specialized their setup might want to create scripts to
help them run configure with the correct arguments.
We should provide a translation table from old environment variables to new
`configure` arguments.
**Discussion:** Maybe we should check for some commonly used old-style
environment variables when running configure/make, and alert the user?
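
One possible shape for such a check is sketched below. The variable names and
the suggested replacement options are deliberately fictitious placeholders; a
real table would have to come from the translation work described above.

```shell
# Illustrative sketch: warn about old-style environment variables.
# The NAME:OPTION pairs are fictitious placeholders, not a real mapping.
warn_legacy_env() {
  for pair in "EXAMPLE_BOOTDIR:--with-example-boot-dir" \
              "EXAMPLE_OUTPUTDIR:--with-example-output-dir"; do
    name=${pair%%:*}
    option=${pair#*:}
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name-}"
    if [ -n "$value" ]; then
      echo "warning: $name is set but ignored; use $option=$value" >&2
    fi
  done
}
```

The warning names the replacement argument, so the same table serves both as
documentation and as a runtime hint.
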
### Speed up javac using server mode supporting parallel compilation
For [JEP 139: Enhance javac to Improve Build Speed](JDK-8046129) we will write an
extension to javac which will support parallel compilation. To use this, we
must add support for it in the makefiles.
#### Transition
Switching the Java compilation to using the javac server will result in no
noticeable impact for the developer (apart from the major speedup, of
course). No transition plan is needed.
### Make Java builds incremental by enhancing javac with dependency output
Make has the ability to make incremental builds, that is, to recompile just a
subset of all files when a change has been made. Ideally, this subset should
be the minimal subset needed. For this to work, make needs to have dependency
information available in a format that it can use.
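
The timestamp comparison that make performs for each rule can be approximated
in shell as shown below. This is only a sketch of the principle; the function
name `needs_rebuild` is hypothetical, and the list of dependencies passed to it
stands in for the information that javac would have to provide.

```shell
# Illustrative sketch of make's out-of-date check.
# usage: needs_rebuild TARGET DEPENDENCY...
# Succeeds (exit 0) if TARGET is missing or older than any dependency.
needs_rebuild() {
  target=$1
  shift
  [ -e "$target" ] || return 0
  for dep in "$@"; do
    # find prints the dependency only if it is newer than the target.
    if [ -n "$(find "$dep" -newer "$target")" ]; then
      return 0
    fi
  done
  return 1
}
```

If the dependency list is incomplete, a changed file can go unnoticed and the
target is wrongly considered up to date, which is why accurate dependency
output matters.
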
For [JEP 139: Enhance javac to Improve Build Speed](JDK-8046129) we will write an
extension to javac which will allow for incremental builds of Java code. To use
this, we must add support for it in the makefiles.
#### Transition
The incremental build will be available for developers without any specific
action. In theory, the only noticeable difference for the developer should be
the increase in speed when doing recompilations. However, if the dependency
generation fails or gets confused, the build might be incorrect and a full
rebuild will be needed. This is very unlikely to happen; however, it will be
useful to inform developers of this potential problem and of how to do a full
rebuild.
Also, compilation speed will now be correlated with the complexity of the
source code dependencies. Informing the developers about this might add an
incentive to write good code with fewer far-fetched dependencies.
Alternatives
------------
Instead of making javac properly parallel, we could start several
single-threaded compilations of different and independent Java packages in
parallel. This would not require any changes to javac, but it would be much
harder to get the Makefiles correct, and it would not give as much speed
improvement.
We could have skipped rewriting the Makefiles, but to introduce these kinds of
changes without properly cleaning up the Makefiles first would have been a
daunting and time-consuming task.
Testing
-------
Since we will not change the resulting binary, we don't need to add or change
any tests of the product itself.
However, we should make sure we deliver on the promise of not changing the
resulting binary. As part of this project, we should create a build comparison
tool, which can compare the build result from the old system with the build
result from the new system, on all relevant aspects. This is a harder problem
than it sounds, since two subsequent builds, even with the same build system,
will not be bitwise identical, due to transient and irrelevant factors. To be
useful, such a tool needs to ignore such irrelevant aspects, and focus on what
should not change.
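
A very reduced sketch of such a comparison is shown below. It only compares
file lists and raw file contents, and it treats a file as irrelevant purely by
name; a real tool would need much finer rules (for example, looking inside
archives). The function name `list_differences` and the ignore-pattern argument
are assumptions made for this example.

```shell
# Illustrative sketch: report differences between two build output trees.
# usage: list_differences OLD_DIR NEW_DIR [IGNORED_FILE_NAME_PATTERN]
list_differences() {
  old=$1
  new=$2
  ignore=${3:-}
  # Union of the relative paths of all regular files in both trees.
  { (cd "$old" && find . -type f); (cd "$new" && find . -type f); } | sort -u |
  while IFS= read -r path; do
    # Skip files whose base name matches the ignore pattern.
    case "${path##*/}" in
      $ignore) continue ;;
    esac
    if [ ! -f "$old/$path" ]; then
      echo "only in new: $path"
    elif [ ! -f "$new/$path" ]; then
      echo "only in old: $path"
    elif ! cmp -s "$old/$path" "$new/$path"; then
      echo "differs: $path"
    fi
  done
}
```

An empty report then means the two trees agree on everything that was not
explicitly declared irrelevant.
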
This tool should be run for a variety of platforms and build types, comparing
the old and the new system.
This tool can also be used to test that incremental builds are identical to
full rebuilds.
**Discussion:** Ideally, the build system should be tested just as thoroughly
as the resulting product. Unfortunately, no such framework for testing the
build system exists, and creating a proper testing framework is most likely
outside the scope of this effort.
**Discussion:** We should examine the possibilities of adding at least some
kind of basic testing of the Makefiles. Testing incremental builds with
specially crafted, "evil" dependencies could be one kind of test to add. Is
there an existing javac test suite to add such tests to?
Risks and Assumptions
---------------------
Removing non-build items
- Risk: Mistakenly removing support for a workflow or process needed by some
groups
- Mitigation plan: Communicate with all groups, gather requirements
- Contingency plan: Immediately re-implement support for the workflow
Problems on rare platforms
- Risk: In some rare circumstances, the new build system will not work
- Mitigation plan: Test many scenarios (different hardware and software, for
different groups) before deploying; make sure we can use both new and old
system in parallel if needed
- Contingency plan: Keep old system so both systems can be used in parallel
Resulting product is incorrect
- Risk: Build changes cause incorrect bits to be built
- Mitigation plan: Test resulting build properly
- Contingency plan: Keep old system and use it instead until the problem is
solved
Dependences
-----------
As noted, this JEP depends upon
[JEP 139: Enhance javac to Improve Build Speed](JDK-8046129).
This JEP will make heavy changes to code which is also modified by the
BSD/MacOS X port. The build changes are likely to arrive in JDK 8 before that
project, so we will have to take care of the changes they introduced. However,
most changes will be related to HotSpot, which we are not considering in this
project.
Future JEPs will build upon this JEP to improve the HotSpot and Javadoc build
processes.
Impact
------
The impact of this change on the actual resulting product is minimal.
- Compatibility: The way the product is built will be different. Existing
personal or group build scripts will not work without modification.
- Portability: We must make sure that the new build system works properly on
all supported platforms. If possible, it should be written so as to
minimize porting efforts when porting to new systems.
- Documentation: Existing documentation (like the build README) needs to be
updated.