JDK-8322420 : [Linux] cgroup v2: Limits in parent nested control groups are not detected
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 21.0.3
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • CPU: generic
  • Submitted: 2023-12-19
  • Updated: 2024-09-16
  • Resolved: 2024-09-11
JDK 24
24 b15: Fixed
Description
cgroups v2 supports hierarchical groups. Limits set on outer groups also apply to the groups nested inside them. To calculate the effective limit for the current nested group, one needs to take the minimum of all the limits set on the current group and on its parent groups, up to the root group.
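
As a rough illustration of the calculation described above (not HotSpot's actual implementation), the following Python sketch walks from a cgroup directory up to the cgroup v2 mount point and takes the minimum of every numeric memory.max value it finds. The names read_limit and effective_memory_limit are hypothetical, and treating a missing memory.max file as "no limit at this level" is an assumption made for the sketch.

```python
from pathlib import Path
from typing import Optional


def read_limit(path: Path) -> Optional[int]:
    """Return the limit in bytes, or None if the file is missing or says 'max'."""
    try:
        raw = path.read_text().strip()
    except OSError:
        # Assumption: a level without the interface file imposes no limit.
        return None
    return None if raw == "max" else int(raw)


def effective_memory_limit(cgroup_root: Path, cgroup_path: str) -> Optional[int]:
    """Minimum memory.max from cgroup_path up to cgroup_root (None = unlimited)."""
    current = cgroup_root / cgroup_path.strip("/")
    limits = []
    while True:
        limit = read_limit(current / "memory.max")
        if limit is not None:
            limits.append(limit)
        if current == cgroup_root:
            break
        current = current.parent  # step up to the parent cgroup
    return min(limits) if limits else None
```

For a layout where foo/memory.max is 104857600 and foo/bar/memory.max is max, effective_memory_limit(Path("/sys/fs/cgroup"), "foo/bar") would return 104857600, which is the value one would expect the JVM to report for a process running in foo/bar.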

Detecting resource limits with cgroups v2
https://mail.openjdk.org/pipermail/container-discuss/2023-November/000001.html

HotSpot currently does not seem to support this:
$ cgcreate -g memory:foo/bar
$ echo $[100*1024*1024] >/sys/fs/cgroup/foo/memory.max
$ grep "" /sys/fs/cgroup{,/foo{,/bar}}/memory.max
grep: /sys/fs/cgroup/memory.max: No such file or directory
/sys/fs/cgroup/foo/memory.max:104857600
/sys/fs/cgroup/foo/bar/memory.max:max
$ cgexec -g memory:foo/bar java -Xlog:os+container=trace -version|&grep 'Memory Limit'
[0.001s][trace][os,container] Memory Limit is: -1
[0.001s][trace][os,container] Memory Limit is: Unlimited
$ cgdelete -r -g memory:foo

$ cgcreate -g memory:foo/bar
$ echo $[100*1024*1024] >/sys/fs/cgroup/foo/bar/memory.max
$ grep "" /sys/fs/cgroup{,/foo{,/bar}}/memory.max
grep: /sys/fs/cgroup/memory.max: No such file or directory
/sys/fs/cgroup/foo/memory.max:max
/sys/fs/cgroup/foo/bar/memory.max:104857600
$ cgexec -g memory:foo/bar java -Xlog:os+container=trace -version|&grep 'Memory Limit'
[0.001s][trace][os,container] Memory Limit is: 104857600
[0.001s][trace][os,container] Memory Limit is: 104857600
[0.023s][trace][os,container] Memory Limit is: 104857600
[0.023s][trace][os,container] Memory Limit is: 104857600
$ cgdelete -r -g memory:foo

Comments
Changeset: 55a7cf14 Branch: master Author: Severin Gehwolf <sgehwolf@openjdk.org> Date: 2024-09-11 13:51:31 +0000 URL: https://git.openjdk.org/jdk/commit/55a7cf14453b6cd1de91362927b2fa63cba400a1
11-09-2024

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk/pull/20646 Date: 2024-08-20 14:40:30 +0000
20-08-2024

I've created JDK-8336881 to track the Metrics classes implementation for this (core libs).
22-07-2024

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk/pull/17198 Date: 2023-12-28 12:55:22 +0000
09-07-2024

JDK-8333446 is the enhancement that adds systemd tests; it fails on cgroup v2 due to this bug (it passes on cgroup v1).
03-06-2024

The test case will be tagged with @requires systemd.support (or similar) and would be skipped on those systems. I'm working on a patch to add that to the test libraries. We should have done that with JDK-8217338 so as to avoid needing to manually test it. Is there a reason to believe that if the test passes on systemd Linux systems the product code will be broken on Alpine?
03-06-2024

Then the test case will not work on non-systemd Linux systems, such as the OpenJDK binary distribution for the popular Alpine Linux with musl libc.
03-06-2024

Sure. CentOS-7 behaves very differently from Fedora 40 (although that is cgroupv1 vs. cgroupv2).
03-06-2024

Testing notes: I think regression testing of this would be best handled by using systemd slices. systemd is more likely to be available on a test system than cgexec and friends provided by libcg.

$ cat user-cg.slice
[Unit]
Description=Demo cpu/memory cgroup
Before=slices.target
[Slice]
MemoryAccounting=true
MemoryLimit=2000M
$ cat user-cg-cpu.slice
[Unit]
Description=Demo demo
Before=slices.target
[Slice]
CPUAccounting=true
# 2 CPU cores
CPUQuota=200%
$ sudo cp user-cg.slice user-cg-cpu.slice /etc/systemd/system/
$ sudo systemctl daemon-reload && sudo systemctl restart user-cg-cpu.slice && sudo systemd-run --slice user-cg-cpu.slice --scope ./jdk-23+14/bin/java -Xlog:os+container=trace --version
Running scope as unit: run-r970ecde6b9be4fb6b3cbc007b6a20011.scope
[0.001s][trace][os,container] OSContainer::init: Initializing Container Support
[0.001s][debug][os,container] Detected optional pids controller entry in /proc/cgroups
[0.001s][debug][os,container] Detected cgroups v2 unified hierarchy
[0.001s][trace][os,container] Path to /cpu.max is /sys/fs/cgroup/user.slice/user-cg.slice/user-cg-cpu.slice/run-r970ecde6b9be4fb6b3cbc007b6a20011.scope/cpu.max
[0.001s][debug][os,container] Open of file /sys/fs/cgroup/user.slice/user-cg.slice/user-cg-cpu.slice/run-r970ecde6b9be4fb6b3cbc007b6a20011.scope/cpu.max failed, No such file or directory
[0.001s][trace][os,container] CPU Quota is: -2
[0.001s][trace][os,container] Path to /cpu.max is /sys/fs/cgroup/user.slice/user-cg.slice/user-cg-cpu.slice/run-r970ecde6b9be4fb6b3cbc007b6a20011.scope/cpu.max
[0.001s][debug][os,container] Open of file /sys/fs/cgroup/user.slice/user-cg.slice/user-cg-cpu.slice/run-r970ecde6b9be4fb6b3cbc007b6a20011.scope/cpu.max failed, No such file or directory
[0.001s][trace][os,container] CPU Period is: -2
[0.001s][trace][os,container] OSContainer::active_processor_count: 4
[0.002s][trace][os,container] CgroupSubsystem::active_processor_count (cached): 4
[0.002s][trace][os,container] total physical memory: 4096413696
[0.002s][trace][os,container] Path to /memory.max is /sys/fs/cgroup/user.slice/user-cg.slice/user-cg-cpu.slice/run-r970ecde6b9be4fb6b3cbc007b6a20011.scope/memory.max
[0.002s][trace][os,container] Raw value for memory limit is: max
[0.002s][trace][os,container] Memory Limit is: Unlimited
[0.002s][debug][os,container] container memory limit unlimited: -1, using host value 4096413696
[0.003s][trace][os,container] CgroupSubsystem::active_processor_count (cached): 4
[0,023s][trace][os,container] Path to /cpu.max is /sys/fs/cgroup/user.slice/user-cg.slice/user-cg-cpu.slice/run-r970ecde6b9be4fb6b3cbc007b6a20011.scope/cpu.max
[0,023s][debug][os,container] Open of file /sys/fs/cgroup/user.slice/user-cg.slice/user-cg-cpu.slice/run-r970ecde6b9be4fb6b3cbc007b6a20011.scope/cpu.max failed, No such file or directory
[0,023s][trace][os,container] CPU Quota is: -2
[0,023s][trace][os,container] Path to /cpu.max is /sys/fs/cgroup/user.slice/user-cg.slice/user-cg-cpu.slice/run-r970ecde6b9be4fb6b3cbc007b6a20011.scope/cpu.max
[0,023s][debug][os,container] Open of file /sys/fs/cgroup/user.slice/user-cg.slice/user-cg-cpu.slice/run-r970ecde6b9be4fb6b3cbc007b6a20011.scope/cpu.max failed, No such file or directory
[0,023s][trace][os,container] CPU Period is: -2
[0,023s][trace][os,container] OSContainer::active_processor_count: 4
[0,028s][trace][os,container] total physical memory: 4096413696
[0,028s][trace][os,container] Path to /memory.max is /sys/fs/cgroup/user.slice/user-cg.slice/user-cg-cpu.slice/run-r970ecde6b9be4fb6b3cbc007b6a20011.scope/memory.max
[0,028s][trace][os,container] Raw value for memory limit is: max
[0,028s][trace][os,container] Memory Limit is: Unlimited
[0,028s][debug][os,container] container memory limit unlimited: -1, using host value 4096413696
openjdk 23-beta 2024-09-17
OpenJDK Runtime Environment Temurin-23+14-202403142003 (build 23-beta+14-ea)
OpenJDK 64-Bit Server VM Temurin-23+14-202403142003 (build 23-beta+14-ea, mixed mode, sharing)
$ nproc
4
$ free --mega
        total   used   free   shared   buff/cache   available
Mem:     4096    483   2440       14         1171        3356
Swap:    2810      0   2810

In the above case the detected memory limit should be 2GB, but it is actually detected as unlimited (4GB host memory). Similarly, the detected CPU count should be 2, but it is actually 4 (the host system value).
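
To make the expected CPU count concrete, here is a minimal Python sketch of how a CPU count could be derived from cpu.max values collected along the cgroup path. It is an illustration under assumptions, not the HotSpot code: the function names are hypothetical, rounding the quota/period ratio up is an assumption, and so is the "200000 100000" value that a CPUQuota=200% slice is assumed to produce in its cpu.max file.

```python
import math
from typing import List, Optional


def cpu_count_from_cpu_max(raw: str) -> Optional[int]:
    """Parse a cgroup v2 cpu.max value such as '200000 100000' or 'max 100000'.

    Returns None when no quota is set at this level.
    """
    fields = raw.split()
    if not fields or fields[0] == "max":
        return None
    # The kernel's default period is 100000 microseconds.
    period = int(fields[1]) if len(fields) > 1 else 100000
    # Assumption: round a fractional CPU allowance up to a whole CPU.
    return math.ceil(int(fields[0]) / period)


def effective_cpu_count(host_cpus: int, cpu_max_values: List[str]) -> int:
    """Smallest CPU count implied by the host and by every level's cpu.max."""
    counts = [host_cpus]
    for raw in cpu_max_values:
        count = cpu_count_from_cpu_max(raw)
        if count is not None:
            counts.append(count)
    return min(counts)
```

With a 4-CPU host, an unlimited leaf scope ("max 100000") and a parent slice value of "200000 100000", effective_cpu_count(4, ["max 100000", "200000 100000"]) gives 2, which matches the count expected in the case above.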
31-05-2024

Since this bug is filed for hotspot, we should limit the scope of this fix to the hotspot implementation. The JDK implementation of the cgroups code in jdk.internal.platform needs fixing too, but in a separate bug. The Java implementation is mostly used by serviceability tools anyway and should be less critical.
31-05-2024

The two preparatory bugs for this would be JDK-8331560 and JDK-8302744.
03-05-2024

Linking a related issue.
21-02-2024

Note that -XshowSettings:system is a launcher feature which uses the Java Metrics classes. For a hotspot bug, such a limit further up the hierarchy needs to show up with -Xlog:os+container=trace.
19-12-2023