JDK-8275867 : Make ImmutableCollections deterministic if running with TraceBytecodes/StopInterpreterAt
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Priority: P4
  • Status: Resolved
  • Resolution: Won't Fix
  • Submitted: 2021-10-25
  • Updated: 2023-01-09
  • Resolved: 2023-01-09
Related Reports
Relates :  
Relates :  
Description
HotSpot provides the command line options `TraceBytecodes` and `StopInterpreterAt`, which can be quite handy when debugging issues that occur very early during VM startup.

Unfortunately, JDK-8139233 randomized the iteration order of immutable collections, which are used quite early (starting around bytecode ~600). This can change the absolute execution index of subsequent bytecodes significantly (e.g. the index of the application's main method, which is executed around bytecode 130_000, can vary by up to 100 bytecodes). Obviously, this behavior also makes the `StopInterpreterAt` option almost useless.
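
The randomization can be observed from plain Java code. The following is a minimal sketch (the class and method names are made up for illustration): it records the order in which one JVM iterates a small `Set.of(...)` instance. Within a single run the order is stable, but two separate runs of the same program may print different orders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of the JDK or of any proposed patch.
class IterationOrderDemo {
    // Returns the elements of an immutable set in the order this JVM iterates them.
    static List<String> observedOrder() {
        Set<String> set = Set.of("alpha", "beta", "gamma", "delta", "epsilon");
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // Stable within one JVM run; may change from one run to the next.
        System.out.println(observedOrder());
    }
}
```

Running `java IterationOrderDemo` several times in a row is usually enough to see at least two different orders.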

I therefore propose to introduce a new diagnostic JVM flag (e.g. `-XX:+DeterministicVMStartup`) which suppresses the iteration randomization of `ImmutableCollections`. The option will be off by default so as not to change the current behavior, and will only be activated if the user sets `TraceBytecodes` or `StopInterpreterAt`.

I can imagine that apart from HotSpot debugging, this option can also be useful for debugging Java programs which use immutable collections.

The reason why I propose to fix this issue in the VM instead of fixing it in the class library (e.g. with the help of a property) is that the static initializer of `ImmutableCollections` already calls into the VM anyway in order to initialize its salt value. This is required because CDS/`-Xshare:dump` also requires a deterministic iteration order to guarantee reproducible builds (JDK-8241071). Fixing this in the VM also makes it easier to fix `TraceBytecodes`/`StopInterpreterAt` in a simple way.
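
To illustrate why a VM-provided seed is enough to pin down the iteration order, here is a simplified model of a salt initializer. It is a sketch under assumptions, not a copy of the JDK source: `SaltModel`, `chooseSeed`, `salt32` and `startIndex` are hypothetical names, and the mixing constant is arbitrary. The idea is that a non-zero seed from the VM is used as-is, while `0` falls back to a time-based seed, and the resulting salt decides where iteration over a hash table starts.

```java
// Hypothetical model of salt initialization; all names are illustrative only.
class SaltModel {
    // vmSeed stands in for the value handed over by the VM; 0 means "no fixed seed requested".
    static long chooseSeed(long vmSeed) {
        return vmSeed != 0 ? vmSeed : System.nanoTime();
    }

    // Mixes the seed into a 32-bit salt, stored in a long as an unsigned value.
    static long salt32(long seed) {
        long color = 0x243F_6A88_85A3_08D3L; // arbitrary odd mixing constant
        return ((color * seed) >> 16) & 0xFFFF_FFFFL;
    }

    // Maps the salt to a start index in the range [0, tableLength).
    static int startIndex(long salt32, int tableLength) {
        return (int) ((salt32 * tableLength) >>> 32);
    }

    public static void main(String[] args) {
        long fixed = salt32(chooseSeed(42L));   // same value on every run
        long random = salt32(chooseSeed(0L));   // differs from run to run
        System.out.println("fixed seed start index:  " + startIndex(fixed, 16));
        System.out.println("random seed start index: " + startIndex(random, 16));
    }
}
```

With a fixed seed the first printed index is identical on every run, which is the property `TraceBytecodes` and `StopInterpreterAt` need.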

Comments
Runtime Triage: This is not on our current list of priorities. We will consider this feature if we receive additional customer requirements.
09-01-2023

Based on previous comments this doesn't appear to be a change to collections itself. Moving to hotspot/runtime -- I hope that's the right category. Also moving Fix-version to "tbd" since there's no PR for this yet.
01-12-2021

[~simonis] There is no code currently in openjdk for hash table iteration order randomization. Google had a local patch for that, but it was unsuitable for upstreaming. An implementation difficulty is that the natural idea of using a system property runs into a circular dependency, since system properties are implemented using hash tables. I would also like to see zero overhead when these core classes are used in production. ImmutableCollections comes pretty close; not sure if/how we can do better.
27-10-2021

I think giving the user some control over randomization of iteration order, for both immutable collections and hash tables, would be good. A reasonable strategy for enterprise users is to run with randomized order while running tests, and deterministic when running in production.
27-10-2021

So what about using the simple solution I've proposed before for `TraceBytecodes`/`StopInterpreterAt` and introducing a new property to control the seed value? It would probably be wiser to open a new issue for the latter because that will require a CSR. [~martin] can you please point me to the code in "hash tables" which does randomized iteration and the place where the random seed is initialized?
27-10-2021

I've changed the summary to be more specific because "deterministic startup" in general is a much more complicated topic :)
26-10-2021

> Is there value in disabling this behaviour when using the TraceBytecodes or StopInterpreterAt flags?

I think that's a good idea [~dholmes]. We can treat `TraceBytecodes` and `StopInterpreterAt` like `DumpSharedSpaces`:

    JVM_ENTRY_NO_ENV(jlong, JVM_GetRandomSeedForCDSDump())
      JVMWrapper("JVM_GetRandomSeedForCDSDump");
      if (DumpSharedSpaces || TraceBytecodes || StopInterpreterAt) {
        // Derive seed from JVM/build version (i.e. make it constant per build)
        return seed;
      } else {
        // '0' means that ImmutableCollections will create a "real" random seed.
        return 0;
      }
    JVM_END

That's pretty simple. It doesn't solve the issue for Java applications which might behave differently based on the iteration order, but that's a different problem. I'll send out a PR soon.

[~iklam] you're right that the early threads introduce some noise. But from my experience the code they execute seems to be small and constant, i.e. my application's main method consistently started at a specific bytecode index after I fixed the ImmutableCollections issue.
26-10-2021
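
The seed selection proposed in the comment above can be modeled outside the VM. The following Java sketch is only an illustration of the decision logic: `SeedDecisionModel`, `seedFor` and the use of `String.hashCode()` on a version string are assumptions made for this example, not the HotSpot implementation. It returns a constant, non-zero seed per build version when any of the three flags is active, and `0` otherwise so that the library would pick a random seed.

```java
// Hypothetical model of the VM-side decision; not HotSpot code.
class SeedDecisionModel {
    // Fallback used when the derived seed happens to be 0, because 0 is reserved for "use a random seed".
    static final long FALLBACK_SEED = 0x87654321L;

    static long seedFor(boolean dumpSharedSpaces, boolean traceBytecodes,
                        long stopInterpreterAt, String buildVersion) {
        boolean deterministic = dumpSharedSpaces || traceBytecodes || stopInterpreterAt > 0;
        if (!deterministic) {
            return 0L; // caller is expected to create a "real" random seed
        }
        long seed = buildVersion.hashCode(); // constant for a given build version string
        return seed != 0 ? seed : FALLBACK_SEED;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.vm.version");
        System.out.println("default run:     " + seedFor(false, false, 0, version));
        System.out.println("TraceBytecodes:  " + seedFor(false, true, 0, version));
    }
}
```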

[~simonis] Have you prototyped this? The VM starts several Java threads before executing any application code, so `TraceBytecodes` output will start diverging pretty early due to thread context switching. For CDS, even with JDK-8241071, we still cannot produce a deterministic CDS archive because of thread context switching. See JDK-8253495.
26-10-2021

See this bug for related discussion. JDK-8241071 - Generation of classes.jsa with -Xshare:dump is not deterministic
26-10-2021

Is there value in disabling this behaviour when using the TraceBytecodes or StopInterpreterAt flags? If so, then you don't actually need to introduce yet-another-flag.
25-10-2021