JDK-8253495 : CDS generates non-deterministic output
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 16
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2020-09-22
  • Updated: 2023-10-23
  • Resolved: 2022-03-16
JDK 19
19 b14 (Fixed)
Sub Tasks
JDK-8253499
Description
The following test is failing in the JDK16 CI:

runtime/cds/DeterministicDump.java

Here's a snippet from the log file:

----------System.err:(14/863)----------
java.lang.RuntimeException: File content different at byte #4, b0 = -62, b1 = -41
	at DeterministicDump.compare(DeterministicDump.java:117)
	at DeterministicDump.doTest(DeterministicDump.java:68)
	at DeterministicDump.main(DeterministicDump.java:43)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
	at java.base/java.lang.Thread.run(Thread.java:832)

JavaTest Message: Test threw exception: java.lang.RuntimeException
JavaTest Message: shutting down test

result: Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: File content different at byte #4, b0 = -62, b1 = -41

====
The failure seems to have happened after JDK-8253208
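
For anyone who wants to check two dump files by hand, here is a minimal sketch of a byte-by-byte comparison. It is not the code from DeterministicDump.java; the class name DumpCompare and the helper describeDifference are hypothetical, the message format is only modeled on the log above, and the byte index is assumed to be 0-based.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of the JDK test suite.
final class DumpCompare {
    /**
     * Returns a description of the first difference between two byte arrays,
     * or null when they are identical. The index is 0-based (an assumption).
     */
    static String describeDifference(byte[] a, byte[] b) {
        // Arrays.mismatch returns -1 for equal arrays, otherwise the first
        // differing index (or the shorter length if one array is a prefix).
        int i = Arrays.mismatch(a, b);
        if (i < 0) {
            return null;
        }
        if (i >= a.length || i >= b.length) {
            return "File length different: " + a.length + " vs " + b.length;
        }
        // Java bytes are signed, which is why values such as -62 show up.
        return "File content different at byte #" + i + ", b0 = " + a[i] + ", b1 = " + b[i];
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java DumpCompare.java <first.jsa> <second.jsa>");
            return;
        }
        byte[] first = Files.readAllBytes(Path.of(args[0]));
        byte[] second = Files.readAllBytes(Path.of(args[1]));
        String diff = describeDifference(first, second);
        System.out.println(diff == null ? "identical" : diff);
    }
}
```

Reading both files fully into memory is acceptable for archives of a few megabytes; Files.mismatch (JDK 12 and later) would avoid that for larger files.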
Comments
This is not a trivial fix. I don't think it should be backported to 17.
23-10-2023

Changeset: de4f04cb
Author: Ioi Lam <iklam@openjdk.org>
Date: 2022-03-16 03:12:48 +0000
URL: https://git.openjdk.java.net/jdk/commit/de4f04cb71a26ce03b96460cb8d1c1e28cd1ed38
16-03-2022

[~dholmes] David, what you are missing here is the fact that the output of CDS is in fact deeply tied to the binaries of the built JDK! We include classes.jsa in java.base.jmod, so if the content of this file changes, so does the core java.base.jmod file, which in turn changes the jimage. Also, there are no random timestamps in the documentation, or anywhere else, for that matter (with the possible exception of a few trivial bugs remaining). If timestamps are included, they are determined by SOURCE_DATE_EPOCH. As for class dumps done by end users, I agree that determinism might not seem as important (but it might be to the end users!), but the core problem here is making the entire JDK build deterministic.
10-03-2022

Hypothetically though, what if the fix for this is just a couple of lines of code? I'm not sure much investigation has been done on this yet..? I've done a binary comparison and the actual amount of difference is quite small. I'm happy to spend some time to see if that is the case...
10-03-2022

Thanks for those examples [~aleonard] but I don't quite see how the generated jsa dump file fits in with those scenarios. It is sufficient that the binaries of the built JDK are equivalent to establish those properties. I don't see that that needs to extend to an output (the dump file) produced by running that JDK. It is our choice to supply a pre-generated jsa file as part of the JDK to save the user from generating it themselves at install time. I don't see that it has to itself be deterministically generated. We can choose to supply other artifacts as part of the JDK and which themselves need not be deterministically produced - the obvious example is generated documentation that includes a timestamp. If there is concern about such generated artifacts then the end user can generate them themselves. I can see that ideally things would be easier if everything generated from the build process was deterministic in this sense, but I also want to be sure the costs of ensuring that are reasonable.
10-03-2022

[~dholmes] It's part of how security "suddenly" became such a concern. :-) While I'm not part of the hotspot team, I'd consider this "workaround" a happy solution. We're not talking about something affecting normal Hotspot usage. It's just the special CDS dump generation, where you run the JVM once on your specific platform to generate the classes.jsa dump. I'm pretty sure most users *never* do this, and probably not even those who perhaps should (performance-critical installations with a constant application on well-specified hardware) will do it. So, in the end, this is practically just a special mode for building the JDK.
10-03-2022

[~ihse] Magnus, thanks for clarifying that. I hadn't realized the flow-on effects from this in that regard. I must confess though that I'm still somewhat mystified as to why this kind of reproducibility is "suddenly" such a concern.

[~aleonard] Ioi has already done a lot of investigation on this. The PR is more a workaround than a real fix, as the real fix is very complex.
10-03-2022

I'm not at all sure why "reproducibility of the entire JDK project" is a goal. Who does this benefit? We seem to be jumping through hoops to try and achieve this goal, and it is clear we bear the cost, but not at all clear who is reaping any benefit.
09-03-2022

Project quality and a secure supply chain are key reasons. For example:
1) You're setting up a new CI infrastructure to build openjdk and you want to make sure it is correct and exactly replicates your existing CI. So you re-build a build on both your existing and new CI, then do a binary compare of the JDK.
2) Open project built JDKs can verify their supply chain security by doing a "parallel build" within a secure, locked-down environment. If the two builds are identical, then you know the public open project JDK is secure.
3) Consumer confidence: if you can state to your consumers that your provided JDKs are fully reproducible in a secure environment, they will be more willing to consume them.
09-03-2022

A pull request was submitted for review.
URL: https://git.openjdk.java.net/jdk/pull/7748
Date: 2022-03-08 19:11:02 +0000
08-03-2022

To be clear: this issue is also affecting the reproducibility of the entire JDK project. As it stands right now, classes_nocoops.jsa and classes.jsa are two of the few remaining files that cannot be reproducibly built. :-( So it's not just about a test failing. A better bug title might be something like "CDS generates non-deterministic output".
23-02-2022

On my Linux box with JDK 17, the non-determinism is much worse:

$ /jdk/official/jdk17/bin/java -version
java version "17" 2021-09-14 LTS
Java(TM) SE Runtime Environment (build 17+35-LTS-2724)
Java HotSpot(TM) 64-Bit Server VM (build 17+35-LTS-2724, mixed mode, sharing)
$ while true; do /jdk/official/jdk17/bin/java -Xlog:cds=debug -Xshare:dump -XX:SharedArchiveFile=d1.jsa | grep rw.*crc; done
[0.856s][info ][cds] Shared file region (rw ) 0: 4452392 bytes, addr 0x0000000800000000 file offset 0x00001000 crc 0x2849f60b
[0.848s][info ][cds] Shared file region (rw ) 0: 4452392 bytes, addr 0x0000000800000000 file offset 0x00001000 crc 0xf08314eb
[0.847s][info ][cds] Shared file region (rw ) 0: 4452392 bytes, addr 0x0000000800000000 file offset 0x00001000 crc 0x157da02f
[0.846s][info ][cds] Shared file region (rw ) 0: 4452392 bytes, addr 0x0000000800000000 file offset 0x00001000 crc 0x954ca7fa
[0.852s][info ][cds] Shared file region (rw ) 0: 4452392 bytes, addr 0x0000000800000000 file offset 0x00001000 crc 0xa1b0dab2
[0.851s][info ][cds] Shared file region (rw ) 0: 4452392 bytes, addr 0x0000000800000000 file offset 0x00001000 crc 0x799a010a
[0.859s][info ][cds] Shared file region (rw ) 0: 4452392 bytes, addr 0x0000000800000000 file offset 0x00001000 crc 0x945a307f
[0.863s][info ][cds] Shared file region (rw ) 0: 4452392 bytes, addr 0x0000000800000000 file offset 0x00001000 crc 0x4621051e
[0.847s][info ][cds] Shared file region (rw ) 0: 4452392 bytes, addr 0x0000000800000000 file offset 0x00001000 crc 0x7c2e1ef7
04-10-2021

I have been doing some reproducible build testing and came across this deterministic CDS dump issue as well. I'd like to add some detail from what I have seen that may be helpful. I am testing with the latest jdk-18 head.

1) I first reboot my x64 Linux Ubuntu VM to clear any shared memory.
2) Run java -Xlog:cds=debug -Xshare:dump -XX:SharedArchiveFile=d1.jsa -Xmx128M -Xms128M
3) Run java -Xlog:cds=debug -Xshare:dump -XX:SharedArchiveFile=d2.jsa -Xmx128M -Xms128M
4) Run java -Xlog:cds=debug -Xshare:dump -XX:SharedArchiveFile=d3.jsa -Xmx128M -Xms128M

I discover that d2.jsa and d3.jsa (and in fact any subsequent ones) are all identical, BUT the very first dump, d1.jsa, differs from those. It's as if running the first one changes the shared memory regions initially, but it is then stable for any subsequent dumps. I also find this applies both with and without -XX:-UseCompressedOops.
04-10-2021
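
One way to check which of the dumps from the steps above are identical is to group them by digest. The following is only a sketch: the class DumpDigests and its methods are hypothetical names, and SHA-256 is an arbitrary choice of digest (any strong hash, or a plain byte comparison, would do).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for grouping archive files with identical content.
final class DumpDigests {
    /** Returns the SHA-256 digest of the data as lowercase hex. */
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** Maps each digest to the names of the dumps that share it, in input order. */
    static Map<String, List<String>> groupByDigest(Map<String, byte[]> dumps) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> e : dumps.entrySet()) {
            groups.computeIfAbsent(sha256Hex(e.getValue()), k -> new ArrayList<>())
                  .add(e.getKey());
        }
        return groups;
    }

    public static void main(String[] args) throws IOException {
        // Example invocation: java DumpDigests.java d1.jsa d2.jsa d3.jsa
        Map<String, byte[]> dumps = new LinkedHashMap<>();
        for (String name : args) {
            dumps.put(name, Files.readAllBytes(Path.of(name)));
        }
        groupByDigest(dumps).forEach(
            (digest, names) -> System.out.println(digest + "  " + names));
    }
}
```

With the behavior described above, d1.jsa would be expected to land in one group and d2.jsa and d3.jsa in another.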

ILW = HLM = P3
29-09-2020

The current design of CDS deterministic dump assumes that when -Xshare:dump is executed, Symbols are always created in a predictable order. This turns out not to be the case. It appears that some Java threads are executed in parallel while the VM starts up, causing Symbols to be created in nondeterministic order. A proper fix would be to sort all the Symbols alphabetically. This fix is quite involved, as it requires InstanceKlass::methods() to be re-sorted, which means the itables/vtables also need to be laid out again. For the time being, we should problem-list DeterministicDump.java to avoid noise in testing.
22-09-2020
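
To illustrate why creation order matters and why sorting would remove the dependence, here is a toy model in Java. It is not HotSpot code: the class SymbolLayout, the back-to-back "layout" and the offsets are all hypothetical simplifications, and real Symbols, method arrays and itables/vtables are far more involved.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: symbols are placed back to back in the order given.
final class SymbolLayout {
    /** Assigns each (unique) symbol an offset in the order it appears. */
    static Map<String, Integer> layout(List<String> symbols) {
        Map<String, Integer> offsets = new LinkedHashMap<>();
        int offset = 0;
        for (String s : symbols) {
            offsets.put(s, offset);
            offset += s.length();
        }
        return offsets;
    }

    /** Sorts the symbols first, so the result no longer depends on creation order. */
    static Map<String, Integer> sortedLayout(Collection<String> symbols) {
        List<String> sorted = new ArrayList<>(symbols);
        Collections.sort(sorted);
        return layout(sorted);
    }

    public static void main(String[] args) {
        // Two runs that create the same symbols in a different order.
        List<String> run1 = List.of("java/lang/Object", "toString", "hashCode");
        List<String> run2 = List.of("java/lang/Object", "hashCode", "toString");

        System.out.println("creation order equal: " + layout(run1).equals(layout(run2)));
        System.out.println("sorted order equal:   " + sortedLayout(run1).equals(sortedLayout(run2)));
    }
}
```

The model skips the hard part described above: in the VM, anything already ordered by Symbol address would have to be re-sorted to match.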

The problem seems to be with -XX:-UseCompressedOops -XX:-UseCompressedClassPointers. Without these flags, the crcs are deterministic.

$ java -Xshare:dump -Xlog:cds=debug -XX:-UseCompressedOops -XX:-UseCompressedClassPointers | grep crc
[0.753s][debug][cds] Shared file region (mc ) 0: 25240 bytes, addr 0x0000000800000000 file offset 0x00001000 crc 0x462afe3a
[0.758s][debug][cds] Shared file region (rw ) 1: 4438144 bytes, addr 0x0000000800007000 file offset 0x00008000 crc 0x1811b024
[0.768s][debug][cds] Shared file region (ro ) 2: 7555736 bytes, addr 0x0000000800443000 file offset 0x00444000 crc 0x42471e98
[0.776s][debug][cds] Shared file region (bm ) 3: 187888 bytes, addr 0x0000000000000000 file offset 0x00b79000 crc 0xa9455052

$ java -Xshare:dump -Xlog:cds=debug -XX:-UseCompressedOops -XX:-UseCompressedClassPointers | grep crc
[0.754s][debug][cds] Shared file region (mc ) 0: 25240 bytes, addr 0x0000000800000000 file offset 0x00001000 crc 0x462afe3a
[0.758s][debug][cds] Shared file region (rw ) 1: 4438144 bytes, addr 0x0000000800007000 file offset 0x00008000 crc 0x81823633
[0.769s][debug][cds] Shared file region (ro ) 2: 7555736 bytes, addr 0x0000000800443000 file offset 0x00444000 crc 0xa5453035
[0.773s][debug][cds] Shared file region (bm ) 3: 187888 bytes, addr 0x0000000000000000 file offset 0x00b79000 crc 0x01e2caed
22-09-2020
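
Comparing the crc values of two such logs by eye is error-prone, so here is a small sketch that extracts the per-region crc from "Shared file region" lines and lists the regions that differ. The class CrcDiff and its regular expression are assumptions based only on the log format shown in this report; other JDK versions may print these lines differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical log helper; the pattern is modeled on the lines quoted in this report.
final class CrcDiff {
    private static final Pattern REGION = Pattern.compile(
        "Shared file region \\((\\w+)\\s*\\)\\s+\\d+:.*\\bcrc (0x[0-9a-fA-F]+)");

    /** Maps region name (mc, rw, ro, bm, ...) to its crc string, in log order. */
    static Map<String, String> parseCrcs(List<String> logLines) {
        Map<String, String> crcs = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = REGION.matcher(line);
            if (m.find()) {
                crcs.put(m.group(1), m.group(2));
            }
        }
        return crcs;
    }

    /** Returns the regions of the first run whose crc is missing or different in the second. */
    static List<String> differingRegions(Map<String, String> first, Map<String, String> second) {
        List<String> differing = new ArrayList<>();
        for (Map.Entry<String, String> e : first.entrySet()) {
            if (!e.getValue().equals(second.get(e.getKey()))) {
                differing.add(e.getKey());
            }
        }
        return differing;
    }

    public static void main(String[] args) {
        // Two lines from each of the runs quoted in this report.
        List<String> run1 = List.of(
            "[0.753s][debug][cds] Shared file region (mc ) 0: 25240 bytes, addr 0x0000000800000000 file offset 0x00001000 crc 0x462afe3a",
            "[0.758s][debug][cds] Shared file region (rw ) 1: 4438144 bytes, addr 0x0000000800007000 file offset 0x00008000 crc 0x1811b024");
        List<String> run2 = List.of(
            "[0.754s][debug][cds] Shared file region (mc ) 0: 25240 bytes, addr 0x0000000800000000 file offset 0x00001000 crc 0x462afe3a",
            "[0.758s][debug][cds] Shared file region (rw ) 1: 4438144 bytes, addr 0x0000000800007000 file offset 0x00008000 crc 0x81823633");
        System.out.println("differing regions: "
            + differingRegions(parseCrcs(run1), parseCrcs(run2)));
    }
}
```

Knowing which regions differ narrows down where to look; in the two runs quoted above, mc has the same crc in both while rw, ro and bm do not.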