JDK-8232207 : Linux os::available_memory re-reads cgroup configuration on every invocation
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 11,12,13,14
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • CPU: x86
  • Submitted: 2019-10-14
  • Updated: 2022-11-30
  • Resolved: 2019-10-16
Fixed in:
  • JDK 11: 11.0.7-oracle
  • JDK 13: 13.0.4
  • JDK 14: 14 b20
  • Other: openjdk8u372
Description
On Linux systems that are considered containerized, os::available_memory calls into OSContainer::memory_limit_in_bytes and reads the memory limit information from /proc/self/mountinfo or similar.

This can consume significant background resources, mainly because os::available_memory is called on every iteration of each compiler thread's main loop to help determine whether the system should add more compiler threads.

There are a couple of possible solutions:

- add a grace time to how often OSContainer::memory_limit_in_bytes actually reads the /proc configuration
- refactor the compiler thread loop to make possibly_add_compiler_threads a periodic task rather than something that each and every compiler thread does between every compilation
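
The first option (a grace time on re-reads) can be sketched roughly as below. This is a minimal, standalone illustration and not the HotSpot patch: the class name GraceTimeCache, the reader callback, and the reads() counter are all hypothetical names introduced for the example, and the real change would live inside the OSContainer / os::available_memory code path.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical helper (not HotSpot code): caches the result of an expensive
// reader and re-runs the reader only once the grace time has elapsed.
class GraceTimeCache {
 public:
  using Clock = std::chrono::steady_clock;
  using Reader = std::function<std::int64_t()>;

  GraceTimeCache(Reader reader, Clock::duration grace)
      : reader_(std::move(reader)), grace_(grace) {
    assert(grace_ >= Clock::duration::zero());
  }

  // `now` is a parameter so that callers and tests can control the clock.
  std::int64_t get(Clock::time_point now = Clock::now()) {
    if (!valid_ || now - last_read_ >= grace_) {
      value_ = reader_();  // the expensive part, e.g. parsing cgroup files
      last_read_ = now;
      valid_ = true;
      ++reads_;
    }
    return value_;
  }

  // Number of times the underlying reader has actually been invoked.
  int reads() const { return reads_; }

 private:
  Reader reader_;
  Clock::duration grace_;
  Clock::time_point last_read_{};
  std::int64_t value_ = 0;
  bool valid_ = false;
  int reads_ = 0;
};
```

With a 20ms grace time, any number of get() calls inside the same 20ms window triggers a single read of the underlying configuration. The sketch is not thread-safe; code shared between several compiler threads would presumably need atomic or otherwise synchronized access to the cached value and timestamp.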
Comments
A pull request was submitted for review. URL: https://git.openjdk.org/jdk8u-dev/pull/127 Date: 2022-10-03 09:57:22 +0000
31-10-2022

This doesn't seem to affect OpenJDK 8u, as JDK-8198756, which adds the function possibly_add_compiler_threads, is JDK 11 onwards.
16-11-2020

Fix request (13u): The change applies cleanly to 13u.
28-05-2020

Fix request (11u): I would like to downport this for parity with 11.0.7-oracle. Applies clean.
10-01-2020

URL: https://hg.openjdk.java.net/jdk/jdk/rev/21a92562f0c2 User: redestad Date: 2019-10-16 22:10:30 +0000
16-10-2019

Even at just a 20ms grace time I get similar improvements. I also had a look at good old Hello World, which on my system sees marked improvements:

baseline:
    126,240,357 instructions # 0.83 insns per cycle ( +- 0.09% )
     25,173,436 branches # 437.673 M/sec ( +- 0.09% )
        874,657 branch-misses # 3.47% of all branches ( +- 0.24% )
    0.038821460 seconds time elapsed ( +- 0.44% )

20ms grace:
    120,630,166 instructions # 0.83 insns per cycle ( +- 0.07% )
     23,968,581 branches # 434.086 M/sec ( +- 0.07% )
        826,396 branch-misses # 3.45% of all branches ( +- 0.20% )
    0.038206737 seconds time elapsed ( +- 0.46% )

Since 20ms is basically instant for the purpose of adapting to a change in max memory limit, I think we can move ahead with this.
15-10-2019

When it comes to the parts of the JVM that can adapt downwards, resizing decisions happen with some latency anyhow: e.g., C1 / C2 threads will only be shut down once they are done with their current compilation, at the earliest (only the last compiler thread of each type is up for elimination), and C2 compilations can take several seconds. Heap and GC thread decisions may take longer (I'm actually not sure there's any code in effect there to reduce resource use when the container limits change...). Thus adding a 1s grace time on re-reads here is unlikely to affect the policy decisions on whether to start shutting down threads at all. With that in mind I find it *very* reasonable to add some latency like this if it measurably reduces overhead for the normal case (which it does). The experimental 1s value is not set in stone, of course: any value above, say, 100ms might suffice to reduce the observed overheads to noise levels.
15-10-2019

I wish the container folk would provide better APIs that aren't so expensive and for which you know what can and can't change over time based on the container configuration. :( (A mechanism to get notified of changes would also be good.) I have no way to evaluate whether 1 second can be considered "reasonably responsive". I would not be surprised if we have to add a flag to control this once people start to encounter it - which could still be years out from now.
15-10-2019

Proof-of-concept: https://cr.openjdk.java.net/~redestad/8232207/open.00/
14-10-2019

I've prototyped a patch that adds a short (1s) grace time on re-reading the config inside of os::available_memory, which avoids the startup issue (it reduces CPU cycles spent on medium-sized startup tests by ~5%) but keeps the system reasonably responsive w.r.t. reacting to memory configuration changes. Perhaps there needs to be a flag here for environments that need to respond immediately to changes, but I think that's out of scope, since there'll always be (potentially long) latencies before the JVM can react to a new configuration and reduce the size of the heap, the number of compiler threads, etc.
14-10-2019

We have a general problem that reading information from a container environment is expensive, but the fact that it is dynamic information and can change at any time means we have to keep doing it. Something like available_memory should be cheap, so that we can use the simple and obvious algorithms instead of contorting our code.
14-10-2019