JDK-8293540 : [Metrics] Incorrectly detected resource limits with additional cgroup fs mounts
  • Type: Bug
  • Component: core-svc
  • Affected Version: 11.0.16,17.0.4,20
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • CPU: generic
  • Submitted: 2022-09-08
  • Updated: 2023-01-03
  • Resolved: 2022-09-30
Fix versions:
  • JDK 11: 11.0.18 (Fixed)
  • JDK 17: 17.0.6 (Fixed)
  • JDK 20: 20 b18 (Fixed)
Description
Similar to JDK-8293472, but for the Java serviceability code. 

The Java code triggers an assertion if turned on:
$ sudo podman run --rm -ti --memory=300M --memory-swap=300M -v /sys/fs/cgroup:/cgroup-in:ro -v $(pwd)/build/linux-x86_64-server-fastdebug/images/jdk/:/opt/jdk:z fedora:36 /opt/jdk/bin/java -esa -ea -XshowSettings:system -version

Exception in thread "main" java.lang.AssertionError
	at java.base/jdk.internal.platform.CgroupSubsystemFactory.amendCgroupInfos(CgroupSubsystemFactory.java:324)
	at java.base/jdk.internal.platform.CgroupSubsystemFactory.determineType(CgroupSubsystemFactory.java:186)
	at java.base/jdk.internal.platform.CgroupSubsystemFactory.create(CgroupSubsystemFactory.java:85)
	at java.base/jdk.internal.platform.CgroupMetrics.getInstance(CgroupMetrics.java:183)
	at java.base/jdk.internal.platform.SystemMetrics.instance(SystemMetrics.java:29)
	at java.base/jdk.internal.platform.Metrics.systemMetrics(Metrics.java:58)
	at java.base/jdk.internal.platform.Container.metrics(Container.java:43)
	at java.base/sun.launcher.LauncherHelper.printSystemMetrics(LauncherHelper.java:317)
	at java.base/sun.launcher.LauncherHelper.showSettings(LauncherHelper.java:172)


Comments
A pull request was submitted for review. URL: https://git.openjdk.org/jdk8u-dev/pull/221 Date: 2023-01-03 12:32:05 +0000
03-01-2023

Fix Request (OpenJDK 11u): Please approve getting this backported to 11u. Similar fix to JDK-8293472 but for the Metrics code. It depends on that bug getting integrated as well. On some systems this leads to incorrectly detected container limits (e.g. via OperatingSystemMXBean on cg2). Patch applies cleanly. Container tests pass for me on cg1 and cg2. Risk should be low because it ignores cgroup mounts that are not under the /sys/fs/cgroup hierarchy. I'll follow up with a test-fix backport of JDK-8294740 once this is in (assuming it gets approved).
17-11-2022
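
A quick way to see which memory limit the Java side reports is to query OperatingSystemMXBean from inside the container. The following is a minimal sketch, not part of the fix: the class name ContainerMemoryCheck and the helper totalMemoryBytes() are made up for illustration. It uses getTotalPhysicalMemorySize(), which is deprecated in newer JDKs but also exists on 11u.

```java
import java.lang.management.ManagementFactory;
import com.sun.management.OperatingSystemMXBean;

// Illustrative only: ContainerMemoryCheck is a hypothetical name, not JDK code.
final class ContainerMemoryCheck {

    // Returns the total memory size in bytes as reported by the platform MXBean.
    // Deprecated in newer JDKs in favour of getTotalMemorySize(), kept here so the
    // sketch also compiles on 11u.
    @SuppressWarnings("deprecation")
    static long totalMemoryBytes() {
        OperatingSystemMXBean os =
                ManagementFactory.getPlatformMXBean(OperatingSystemMXBean.class);
        return os.getTotalPhysicalMemorySize();
    }

    public static void main(String[] args) {
        long bytes = totalMemoryBytes();
        System.out.println("Total memory seen by the JVM: "
                + (bytes / (1024 * 1024)) + " MiB");
    }
}
```

Run inside a container started with --memory=300M, the printed value should be close to 300 MiB when container detection works; a value matching the host's memory would suggest the limit was not picked up.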

A pull request was submitted for review. URL: https://git.openjdk.org/jdk11u-dev/pull/1526 Date: 2022-11-17 14:19:13 +0000
17-11-2022

[~goetz] Did you mean jdk17u-fix-yes (over jdk11u-fix-yes)?
14-10-2022

Fix Request (OpenJDK 17u): Please approve getting this backported to 17u. Similar fix to JDK-8293472 but for the Metrics code. On some systems this leads to incorrectly detected container limits (e.g. via OperatingSystemMXBean on cg2). Patch applies cleanly. Container tests pass for me on cg1 and cg2. Risk should be low because it ignores cgroup mounts that are not under the /sys/fs/cgroup hierarchy. I'll follow up with a test-fix backport of JDK-8294740 once this is in (assuming it gets approved).
13-10-2022

A pull request was submitted for review. URL: https://git.openjdk.org/jdk17u-dev/pull/785 Date: 2022-10-12 16:36:29 +0000
12-10-2022

A pull request was submitted for review. URL: https://git.openjdk.org/jdk17u-dev/pull/784 Date: 2022-10-12 12:35:28 +0000
12-10-2022

Changeset: 6d83482a Author: Severin Gehwolf <sgehwolf@openjdk.org> Date: 2022-09-30 08:44:10 +0000 URL: https://git.openjdk.org/jdk/commit/6d83482a6b5f1898514fd450d8143dbfef57e362
30-09-2022

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/10248 Date: 2022-09-13 13:06:10 +0000
14-09-2022

Review: https://github.com/openjdk/jdk/pull/10248
14-09-2022

I can reproduce wrong metrics values on cgroups v2 with additional cgroup fs mounts using the following tests:
  • test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java
  • test/jdk/jdk/internal/platform/docker/TestDockerCpuMetrics.java
  • test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java
13-09-2022

With system assertions turned off, detection of the memory limit happens to work only because of the order of the mountinfo entries. In my case the manual mount comes first, and since the last entry wins, the /sys/fs/cgroup mount is the one that gets used:
[root@2073b7e789ac /]# cat /proc/self/mountinfo | grep cgroup2
1200 1198 0:26 /../../.. /cgroup-in ro,nosuid,nodev,noexec,relatime - cgroup2 cgroup2 rw,seclabel,nsdelegate,memory_recursiveprot
1212 1199 0:26 / /sys/fs/cgroup ro,nosuid,nodev,noexec,relatime - cgroup2 cgroup2 rw,seclabel,nsdelegate,memory_recursiveprot
09-09-2022
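
The order dependence described above can be illustrated with a small sketch. This is not the code in CgroupSubsystemFactory: the class MountInfoSketch and its method cgroup2MountPoint are hypothetical, and the parsing is deliberately simplified. It contrasts a naive "last cgroup2 entry wins" scan with one that skips cgroup mounts outside /sys/fs/cgroup, which is the behaviour the fix requests describe.

```java
import java.util.List;
import java.util.Optional;

// Illustrative only: a simplified stand-in, not the JDK's mountinfo handling.
final class MountInfoSketch {

    // The two entries quoted in the comment above.
    static final String EXTRA =
            "1200 1198 0:26 /../../.. /cgroup-in ro,nosuid,nodev,noexec,relatime"
            + " - cgroup2 cgroup2 rw,seclabel,nsdelegate,memory_recursiveprot";
    static final String STANDARD =
            "1212 1199 0:26 / /sys/fs/cgroup ro,nosuid,nodev,noexec,relatime"
            + " - cgroup2 cgroup2 rw,seclabel,nsdelegate,memory_recursiveprot";

    // mountinfo layout: id parent major:minor root mountPoint options ... - fsType source superOptions
    // Returns the mount point of the last matching cgroup2 entry, if any.
    static Optional<String> cgroup2MountPoint(List<String> lines, boolean onlySysFsCgroup) {
        String result = null;
        for (String line : lines) {
            int sep = line.indexOf(" - ");
            if (sep < 0) {
                continue; // not a well-formed mountinfo line
            }
            String[] pre = line.substring(0, sep).split(" ");
            String[] post = line.substring(sep + 3).split(" ");
            if (pre.length < 5 || !post[0].equals("cgroup2")) {
                continue; // not a cgroup v2 mount
            }
            String mountPoint = pre[4];
            if (onlySysFsCgroup && !mountPoint.startsWith("/sys/fs/cgroup")) {
                continue; // ignore additional cgroup fs mounts
            }
            result = mountPoint; // last matching entry wins
        }
        return Optional.ofNullable(result);
    }

    public static void main(String[] args) {
        List<String> asReported = List.of(EXTRA, STANDARD);
        List<String> reversed = List.of(STANDARD, EXTRA);
        System.out.println("naive, as reported: " + cgroup2MountPoint(asReported, false).orElse("none"));
        System.out.println("naive, reversed:    " + cgroup2MountPoint(reversed, false).orElse("none"));
        System.out.println("filtered, reversed: " + cgroup2MountPoint(reversed, true).orElse("none"));
    }
}
```

With the entries in the reported order the naive scan ends up on /sys/fs/cgroup, but with the order reversed it picks /cgroup-in. The filtered scan returns /sys/fs/cgroup in both cases.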