JDK-8230305 : Cgroups v2: Container awareness
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 8u221,11,13,14
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • CPU: generic
  • Submitted: 2019-08-28
  • Updated: 2022-09-14
  • Resolved: 2020-01-17
JDK 11: 11.0.16-oracle (Fixed)
JDK 15: 15 b07 (Fixed)
Sub Tasks
JDK-8230848
JDK-8231111
Description
JDK-8146115 added HotSpot runtime support for JVMs running in Docker containers. At the time, Docker used cgroups v1, so the runtime support covers only cgroups v1 controllers.

This enhancement will extend the functionality of JDK-8146115 to also detect cgroups v2. That is, the cgroups v2 backend is used if and only if the cgroups v2 unified hierarchy is the only hierarchy available. Otherwise, the runtime falls back to the existing cgroups v1 container support.
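
The selection rule can be sketched as follows. This is a minimal illustration, not HotSpot's actual detection code: it assumes the text of /proc/self/cgroup is passed in as a string, and the names `is_cgroup_v2_only` and `select_backend` are hypothetical. On a unified-hierarchy-only system that file holds only an entry of the form `0::<path>`; any entry that names a v1 controller indicates a legacy or hybrid setup.

```python
def is_cgroup_v2_only(proc_self_cgroup: str) -> bool:
    """Return True if every entry is a cgroup v2 entry (hierarchy ID 0, no controllers).

    Hypothetical helper for illustration; the input is the text of /proc/self/cgroup.
    """
    entries = [line for line in proc_self_cgroup.splitlines() if line.strip()]
    if not entries:
        return False  # no cgroup information at all: nothing to select
    for entry in entries:
        # Each entry has the form "hierarchy-ID:controller-list:cgroup-path".
        parts = entry.split(":", 2)
        if len(parts) != 3:
            return False  # malformed entry: be conservative
        hierarchy_id, controllers, _path = parts
        if hierarchy_id != "0" or controllers != "":
            return False  # a v1 hierarchy is present: legacy or hybrid
    return True


def select_backend(proc_self_cgroup: str) -> str:
    """Pick the v2 backend only on a unified-hierarchy-only system, else v1."""
    return "cgroupv2" if is_cgroup_v2_only(proc_self_cgroup) else "cgroupv1"
```

With this sketch, an input of `0::/` selects the v2 backend, while any input that also lists a line such as `12:memory:/` selects the v1 backend.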

Fedora 31 plans to switch to the unified hierarchy by default [1] and will be used as the test system for the cgroups v2 backend. At this point, podman and crun support a cgroups v2 only OS well enough for this work to start at the JVM level.

Controllers supported for cgroups v2 are: memory, cpu, cpuset

memory controller:
------------------------------------------------------------------------
https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#memory

$ podman --runtime=/usr/bin/crun run --memory=300M -ti fedora:30 /bin/bash
[root@18bac3504b35 /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/memory.max
314572800
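
A reader of memory.max has to handle two cases: a byte count, as shown above, or the literal string `max` when no limit is set (per the kernel documentation linked above). A minimal sketch, with the hypothetical name `parse_memory_max`, taking the file content as a string:

```python
from typing import Optional


def parse_memory_max(content: str) -> Optional[int]:
    """Return the memory limit in bytes, or None if the cgroup is unlimited.

    Hypothetical helper for illustration; the input is the text of memory.max.
    """
    value = content.strip()
    if value == "max":
        return None  # "max" means no limit is configured
    return int(value)
```

For the value shown above, `parse_memory_max("314572800")` returns 314572800, which is 300 * 1024 * 1024 and matches `--memory=300M`.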

cpu controller:
------------------------------------------------------------------------
https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#cpu

$ podman --runtime=/usr/bin/crun run --memory=300M --cpu-quota=10000 --cpu-period=10000 -ti fedora:30 /bin/bash
[root@c6561b50c547 /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpu.max
10000 10000
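
The two fields of cpu.max are the quota and the period, in that order; the quota is the literal `max` when no limit is set. A minimal sketch of turning that into a CPU count follows. The name `cpu_limit_from_cpu_max` is hypothetical, and rounding up to a whole CPU is an assumption made for illustration, not a statement about what the JVM does.

```python
import math
from typing import Optional


def cpu_limit_from_cpu_max(content: str) -> Optional[int]:
    """Return the CPU limit implied by cpu.max, or None if unlimited.

    Hypothetical helper for illustration; the input is the text of cpu.max,
    which has the form "<quota> <period>" or "max <period>".
    """
    quota, period = content.split()
    if quota == "max":
        return None  # no quota configured
    # Assumption: round fractional CPUs up to the next whole CPU.
    return math.ceil(int(quota) / int(period))
```

For the value shown above, `cpu_limit_from_cpu_max("10000 10000")` returns 1.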

OR using --cpu-shares:

$ podman --runtime=/usr/bin/crun run --memory=300M --cpu-shares=1024 -ti fedora:30 /bin/bash
[root@e18f01fca65a /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpu.weight
39

Note that with cgroups v2, certain transformations are applied because
the OCI spec was written with cgroups v1 in mind. See:
https://github.com/containers/crun/blob/master/crun.1.md#cpu-controller
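
As an illustration of such a transformation, the sketch below maps cgroups v1 CPU shares in [2, 262144] linearly onto the cgroups v2 cpu.weight range [1, 10000]. It reproduces the value 39 shown above for `--cpu-shares=1024`. The function name `cpu_shares_to_weight` is hypothetical, and the formula should be treated as an assumption: check the crun documentation linked above for the container runtime version in use.

```python
def cpu_shares_to_weight(shares: int) -> int:
    """Map cgroups v1 cpu.shares to a cgroups v2 cpu.weight value.

    Hypothetical helper for illustration. Assumes a linear mapping from
    [2, 262144] to [1, 10000] using integer division.
    """
    if not 2 <= shares <= 262144:
        raise ValueError("cpu shares must be in the range [2, 262144]")
    return 1 + ((shares - 2) * 9999) // 262142
```

Mapping in the other direction loses precision because of the integer division, so a weight read back from cpu.weight does not necessarily recover the original shares value.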

cpuset controller:
------------------------------------------------------------------------
https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#cpuset

$ podman --runtime=/usr/bin/crun run --memory=300M --cpuset-cpus=0,1 -ti fedora:30 /bin/bash
[root@c6e8caa8f25b /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpuset.cpus
0-1
[root@c6e8caa8f25b /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpuset.cpus.effective
0-1
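
The cpuset files use a list format of comma-separated CPU numbers and inclusive ranges, such as the `0-1` shown above. A minimal sketch of expanding that format, with the hypothetical name `parse_cpuset` and the file content passed in as a string:

```python
from typing import List


def parse_cpuset(content: str) -> List[int]:
    """Expand a cpuset list such as "0-1" or "0-2,5" into CPU numbers.

    Hypothetical helper for illustration; the input is the text of
    cpuset.cpus or cpuset.cpus.effective. An empty file yields an empty list.
    """
    cpus: List[int] = []
    text = content.strip()
    if not text:
        return cpus
    for part in text.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))  # ranges are inclusive
        else:
            cpus.append(int(part))
    return cpus
```

For the value shown above, `parse_cpuset("0-1")` returns `[0, 1]`, so two CPUs are available to the container.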

On a cgroups v2 only system, the current container detection bails out with "Required cgroup memory subsystem not found". The example below is for JDK 11, but it applies equally to JDK 14:

$ podman --runtime=/usr/bin/crun run --memory=300M -ti fedora:30 /bin/bash
[root@a81e632b9aa4 /]# dnf install java-11-openjdk
[...]
[root@a81e632b9aa4 /]# java -Xlog:os+container=trace -version
[0.002s][trace][os,container] OSContainer::init: Initializing Container Support
[0.003s][debug][os,container] Required cgroup memory subsystem not found
openjdk version "11.0.4" 2019-07-16
OpenJDK Runtime Environment 18.9 (build 11.0.4+11)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.4+11, mixed mode, sharing)


[1] https://fedoraproject.org/wiki/Changes/CGroupsV2
Comments
Thanks for handling this complex set of backports!
23-03-2022

OpenJDK 11u backports: My testing indicates these bugs should get pushed/integrated in short succession for 11u: JDK-8230305 (this one), JDK-8231111, JDK-8237479, JDK-8253714, JDK-8253727. This gives a clean test state on cgroups v2 and v1 systems. All of them have PRs out for jdk11u-dev.
17-03-2022

[~goetz] They're not yet ready for integration (JDK-8231111 in particular is still under review). No point in approving at this point. I'll remove the request label for the time being until they're ready. Sorry for the noise.
11-03-2022

[~zgu] Please propose the three changes above for backport, then I will admit all of them.
11-03-2022

Relevant 11u backports:
- JDK-8231111: https://github.com/openjdk/jdk11u-dev/pull/863
- JDK-8237479: https://github.com/openjdk/jdk11u-dev/pull/864
- JDK-8253714: https://github.com/openjdk/jdk11u-dev/pull/865
08-03-2022

[~zgu] We shouldn't squash them into one push. Separate backports, and reviews if need be, are fine. The maintainer approval should happen together for those, though, as they are largely inter-related. I suggest using the dependent pull request feature of SKARA. See: https://mail.openjdk.java.net/pipermail/jdk-dev/2021-March/005232.html It basically means that, for example, the JDK-8237479 backport should target the `pr/840` branch of jdk11u-dev, JDK-8231111 should also target `pr/840`, and the backport of JDK-8253714 should target whatever PR number JDK-8231111 gets. Once the 4 bugs are ready for integration, the approval process can start for all 4. They get approved in bulk and integrated as a quadruplet. Does that make sense?
08-03-2022

(11u)
- JDK-8237479 is a clean backport on top of this backport.
- JDK-8231111 is a biggie: http://cr.openjdk.java.net/~zgu/JDK-8231111-11u/webrev.00/
- JDK-8253714 is a clean backport on top of JDK-8231111.
Not sure how to squash them into one push.
08-03-2022

As far as approving this for 11u is concerned, we should do it together with JDK-8253714 (as it's causing a test failure), JDK-8237479 (as it might break slowdebug builds), and JDK-8231111 (which is a dependency of JDK-8253714). This should reduce testing noise.
07-03-2022

Fix Request (11u) Given increasing customer demand and the LTS status of JDK11u, it makes a lot of sense to support cgroups v2 in this release. I propose to backport this patch to jdk11u as a starting point, and the relevant backports in [1] will follow shortly. [1] https://mail.openjdk.java.net/pipermail/jdk-updates-dev/2022-February/012398.html
04-03-2022

A pull request was submitted for review. URL: https://git.openjdk.java.net/jdk11u-dev/pull/840 Date: 2022-02-24 21:18:12 +0000
24-02-2022

URL: https://hg.openjdk.java.net/jdk/jdk/rev/931354c6323d User: sgehwolf Date: 2020-01-17 13:15:47 +0000
17-01-2020

RFR for hotspot:
http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-October/039776.html
http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-November/039888.html
08-11-2019

Prototype implementation depending on JDK-8230848. It doesn't include the Metrics bits so far, but passes the hotspot container tests on cgroups v1 and cgroups v2 (on F31 beta with podman): http://cr.openjdk.java.net/~sgehwolf/webrevs/cgroupsv2-hotspot/05/webrev/
08-11-2019

Docker uses runc as its container runtime, and runc isn't fully ported to the cgroups v2 unified hierarchy: https://github.com/opencontainers/runc/issues/654 As podman allows one to use a different container runtime (--runtime option), it's currently the only option on the cgroups v2 unified hierarchy. That is, podman with crun as the container runtime.
02-09-2019

Does docker run on Fedora 31 or is podman the only choice?
28-08-2019