JDK-8230305 : Cgroups v2: Container awareness
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 8u221,11,13,14
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • CPU: generic
  • Submitted: 2019-08-28
  • Updated: 2020-02-21
  • Resolved: 2020-01-17
  • Fixed in: JDK 15 (build b07)
JDK-8146115 added Hotspot runtime support for JVMs running in Docker containers. At the time Docker used cgroups v1 and, hence, runtime support only includes cgroup v1 controllers.

This enhancement extends the functionality of JDK-8146115 to also detect cgroups v2. That is, if and only if the cgroups v2 unified hierarchy alone is available, use the cgroups v2 backend. Otherwise, fall back to the existing cgroups v1 container support.
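
The selection rule above can be sketched as follows. This is only an illustration, not the HotSpot implementation: it assumes the decision is made from the contents of /proc/self/cgroup, where each line has the form hierarchy-ID:controller-list:path and a unified (v2) entry has hierarchy ID 0 with an empty controller list. The function name and the sample strings are hypothetical.

```python
def select_cgroup_backend(proc_self_cgroup: str) -> str:
    """Return "v2" only if every entry belongs to the unified hierarchy.

    Any v1 entry (legacy or hybrid setup) makes the sketch fall back to "v1".
    An empty input yields "none" (no cgroup information at all).
    """
    entries = [
        line.split(":", 2)
        for line in proc_self_cgroup.splitlines()
        if line.strip()
    ]
    if not entries:
        return "none"
    unified_only = all(
        len(e) == 3 and e[0] == "0" and e[1] == "" for e in entries
    )
    return "v2" if unified_only else "v1"


# Hypothetical sample contents of /proc/self/cgroup.
V2_ONLY = "0::/user.slice/session-1.scope\n"
HYBRID = "1:name=systemd:/\n4:memory:/docker/abc\n0::/\n"

print(select_cgroup_backend(V2_ONLY))  # v2
print(select_cgroup_backend(HYBRID))   # v1
```

A hybrid setup still lists v1 controllers, so the sketch keeps the existing v1 path in that case, matching the "available only" condition.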

Fedora 31 plans to switch to the unified hierarchy by default[1] and will be used as the test system for the cgroups v2 backend. At this point, podman and crun support a cgroups v2 only OS well enough for this work to start at the JVM level.

Controllers supported for cgroups v2 are: memory, cpu, cpuset

memory controller:

$ podman --runtime=/usr/bin/crun run --memory=300M -ti fedora:30 /bin/bash
[root@18bac3504b35 /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/memory.max
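
For illustration, the lookup done by the shell pipeline above, and the interpretation of memory.max, might look like this in Python. It is a sketch under assumptions: the helper names are hypothetical, the mount point is taken to be /sys/fs/cgroup as in the commands above, and the sample value assumes that --memory=300M means 300 MiB. In cgroups v2, memory.max holds either a byte count or the literal string "max" (no limit).

```python
def cgroup_v2_file(proc_self_cgroup: str, filename: str,
                   mount: str = "/sys/fs/cgroup") -> str:
    """Build the interface file path the same way the shell pipeline does."""
    first_line = proc_self_cgroup.splitlines()[0]
    cgroup_path = first_line.split(":")[2]  # same as: cut -d':' -f3
    return f"{mount}{cgroup_path}/{filename}"


def parse_memory_max(raw: str):
    """Return the limit in bytes, or None when the file says "max"."""
    value = raw.strip()
    if value == "max":
        return None
    return int(value)


path = cgroup_v2_file("0::/machine.slice/demo.scope\n", "memory.max")
print(path)  # /sys/fs/cgroup/machine.slice/demo.scope/memory.max

# Assumed sample values, not captured output.
print(parse_memory_max("max\n"))                               # None
print(parse_memory_max("314572800\n") == 300 * 1024 * 1024)   # True
```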

cpu controller:

$ podman --runtime=/usr/bin/crun run --memory=300M --cpu-quota=10000 --cpu-period=10000 -ti fedora:30 /bin/bash
[root@c6561b50c547 /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpu.max
10000 10000
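
The cpu.max file holds two fields, quota and period, and the quota may be the literal "max" when no limit is set. A minimal sketch of turning that into a CPU count follows. The function names are hypothetical, and rounding the ratio up to a whole CPU is an assumption of this sketch rather than a statement about the final patch.

```python
import math


def parse_cpu_max(raw: str):
    """Split "<quota> <period>" into (quota or None, period)."""
    quota_str, period_str = raw.split()
    quota = None if quota_str == "max" else int(quota_str)
    return quota, int(period_str)


def cpu_limit(raw: str):
    """Return the CPU count implied by cpu.max, or None if unlimited."""
    quota, period = parse_cpu_max(raw)
    if quota is None:
        return None
    return math.ceil(quota / period)  # assumption: round up to whole CPUs


print(cpu_limit("10000 10000"))    # 1
print(cpu_limit("150000 100000"))  # 2
print(cpu_limit("max 100000"))     # None
```

With the values shown above (quota 10000, period 10000) the container is limited to one CPU.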

OR using --cpu-shares:

$ podman --runtime=/usr/bin/crun run --memory=300M --cpu-shares=1024 -ti fedora:30 /bin/bash
[root@e18f01fca65a /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpu.weight

Note that with cgroups v2, certain transformations are applied, because
the OCI spec was written with cgroups v1 in mind. See:
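
As a hedged illustration of one such transformation: to my knowledge, crun documents a linear mapping from the v1 cpu.shares range [2, 262144] onto the v2 cpu.weight range [1, 10000]. The sketch below assumes that mapping; the function name is hypothetical, and the runtime actually in use should be checked before relying on the numbers.

```python
def shares_to_weight(shares: int) -> int:
    """Map cgroups v1 cpu.shares to cgroups v2 cpu.weight.

    Assumed formula (as documented for crun):
        weight = 1 + ((shares - 2) * 9999) / 262142   (integer division)
    """
    return 1 + ((shares - 2) * 9999) // 262142


print(shares_to_weight(2))       # 1     (minimum)
print(shares_to_weight(1024))    # 39
print(shares_to_weight(262144))  # 10000 (maximum)
```

Under this assumption, --cpu-shares=1024 would appear as a cpu.weight of 39 rather than 1024, so the JVM cannot read the original shares value back directly.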

cpuset controller:

$ podman --runtime=/usr/bin/crun run --memory=300M --cpuset-cpus=0,1 -ti fedora:30 /bin/bash
[root@c6e8caa8f25b /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpuset.cpus
[root@c6e8caa8f25b /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpuset.cpus.effective
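
Both cpuset files use the kernel's CPU list format: comma-separated entries, each either a single CPU number or an inclusive range such as 0-3. A small parsing sketch (the function name is hypothetical, and the inputs are sample strings, not captured output):

```python
from typing import List


def parse_cpuset(raw: str) -> List[int]:
    """Expand a CPU list such as "0,1" or "0-3,8" into CPU numbers."""
    cpus: List[int] = []
    for part in raw.strip().split(","):
        if not part:
            continue  # empty file or stray comma
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


print(parse_cpuset("0,1\n"))       # [0, 1]
print(len(parse_cpuset("0-3,8")))  # 5
```

For --cpuset-cpus=0,1 this gives two usable CPUs.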

On a cgroup v2 only system the current container detection bails with "Required cgroup memory subsystem not found". The below example is for JDK 11, but it's equally valid for JDK 14:

$ podman --runtime=/usr/bin/crun run --memory=300M -ti fedora:30 /bin/bash
[root@a81e632b9aa4 /]# dnf install java-11-openjdk
[root@a81e632b9aa4 /]# java -Xlog:os+container=trace -version
[0.002s][trace][os,container] OSContainer::init: Initializing Container Support
[0.003s][debug][os,container] Required cgroup memory subsystem not found
openjdk version "11.0.4" 2019-07-16
OpenJDK Runtime Environment 18.9 (build 11.0.4+11)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.4+11, mixed mode, sharing)

[1] https://fedoraproject.org/wiki/Changes/CGroupsV2
URL: https://hg.openjdk.java.net/jdk/jdk/rev/931354c6323d
User: sgehwolf
Date: 2020-01-17 13:15:47 +0000

RFR for hotspot:
http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-October/039776.html
http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-November/039888.html

Prototype implementation, depending on JDK-8230848. It doesn't include the Metrics bits so far, but passes the hotspot container tests on cgroups v1 and cgroups v2 (on F31 beta with podman): http://cr.openjdk.java.net/~sgehwolf/webrevs/cgroupsv2-hotspot/05/webrev/

Docker uses runc as its container runtime, and runc isn't fully ported to the cgroups v2 unified hierarchy: https://github.com/opencontainers/runc/issues/654. As podman allows one to use a different container runtime (--runtime option), it's currently the only option on the cgroups v2 unified hierarchy. That is, podman + crun as the container runtime.

Does docker run on Fedora 31 or is podman the only choice?