JDK-8292206 : TestCgroupMetrics.java fails as getMemoryUsage() is lower than expected
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 17.0.7-oracle,19,20
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • CPU: x86_64
  • Submitted: 2022-08-10
  • Updated: 2023-01-05
  • Resolved: 2022-12-22
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 21
21 b04Fixed
Related Reports
Duplicate :  
Duplicate :  
Relates :  
Description
The following test failed in the JDK19 CI:

jdk/internal/platform/cgroup/TestCgroupMetrics.java

Here's a snippet from the log file:

#section:main
----------messages:(6/232)----------
command: main TestCgroupMetrics
reason: User specified action: run main TestCgroupMetrics 
Mode: agentvm
Agent id: 4
Additional exports to unnamed modules from @modules: java.base/jdk.internal.platform
elapsed time (seconds): 1.709
----------configuration:(17/2159)----------

<snip>

----------System.err:(14/960)----------
java.lang.RuntimeException: Test failed for - memory:getMemoryUsage, expected [23031517184], got [22767202304]
	at jdk.test.lib.containers.cgroup.CgroupMetricsTester.fail(CgroupMetricsTester.java:71)
	at jdk.test.lib.containers.cgroup.MetricsTesterCgroupV1.fail(MetricsTesterCgroupV1.java:188)
	at jdk.test.lib.containers.cgroup.MetricsTesterCgroupV1.testMemoryUsage(MetricsTesterCgroupV1.java:555)
	at jdk.test.lib.containers.cgroup.MetricsTester.testAll(MetricsTester.java:62)
	at TestCgroupMetrics.main(TestCgroupMetrics.java:47)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:578)
	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:312)
	at java.base/java.lang.Thread.run(Thread.java:1589)

JavaTest Message: Test threw exception: java.lang.RuntimeException
JavaTest Message: shutting down test

result: Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: Test failed for - memory:getMemoryUsage, expected [23031517184], got [22767202304]

A similar failure in this test was previously resolved:

JDK-8229284 [TESTBUG] jdk/internal/platform/cgroup/TestCgroupMetrics.java fails for - memory:getMemoryUsage
Comments
Changeset: 6ccee839 Author: Ioi Lam <iklam@openjdk.org> Date: 2022-12-22 20:50:20 +0000 URL: https://git.openjdk.org/jdk/commit/6ccee839580fd9dc4cd4941b44dbbe3105202561
22-12-2022

jdk.test.lib.containers.cgroup.MetricsTester is being used in two different scenarios: 1) TestCgroupMetrics (outside a container) 2) TestSystemMetrics (from within a container - with a 256mb mem limit). So the analysis isn't entirely correct.
20-12-2022

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/11734 Date: 2022-12-20 00:03:25 +0000
20-12-2022

This particular scenario is fundamentally wrong. https://github.com/openjdk/jdk/blob/ba942c24e8894f4422870fb53253f5946dc4f0d1/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java#L530-L557 The jdk.internal.platform.Metrics.getMemoryUsage() API returns the the "memory.current" metric https://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git/tree/Documentation/admin-guide/cgroup-v2.rst memory.current: The total amount of memory currently being used by the cgroup and its descendants. This test scenario is executed in a regular Java program, outside of any containers (in fact it's executed inside a jtreg agent thread) java.lang.RuntimeException: Test failed for - memory:getMemoryUsage, expected [23031517184], got [22767202304] at jdk.test.lib.containers.cgroup.CgroupMetricsTester.fail(CgroupMetricsTester.java:71) at jdk.test.lib.containers.cgroup.MetricsTesterCgroupV1.fail(MetricsTesterCgroupV1.java:188) at jdk.test.lib.containers.cgroup.MetricsTesterCgroupV1.testMemoryUsage(MetricsTesterCgroupV1.java:555) at jdk.test.lib.containers.cgroup.MetricsTester.testAll(MetricsTester.java:62) at TestCgroupMetrics.main(TestCgroupMetrics.java:47) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) at java.base/java.lang.reflect.Method.invoke(Method.java:578) at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:312) at java.base/java.lang.Thread.run(Thread.java:1589) So "memory.current" returns the total usage of all programs executed by the current Linux user, including all other jtreg test cases that may be currently allocating and freeing memory. That's why we see such a high number: 22767202304 is over 22GB! Under this circumstance, it's simple impossible to make Metrics.getMemoryUsage() to return any predictable value.
19-12-2022

We already have sanity test for the getMemoryUsage() API: https://github.com/openjdk/jdk/blob/ba942c24e8894f4422870fb53253f5946dc4f0d1/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java#L237-L241 oldVal = metrics.getMemoryUsage(); newVal = getLongValueFromFile(Controller.MEMORY, "memory.usage_in_bytes"); if (!CgroupMetricsTester.compareWithErrorMargin(oldVal, newVal)) { fail(Controller.MEMORY, "memory.usage_in_bytes", oldVal, newVal); } I don't think it's possible to write a test to cause metrics.getMemoryUsage() to reliably return a value that's larger (or smaller) than one previously observed. Java heap allocations may cause collections which might release memory back to the OS (the exact behavior depends on the collector and the OS). Even [~ibereziuk] the suggestion above may not work, as the container may overcommit and use more RAM that on the system. So I think the MetricsTesterCgroupV1::testMemoryUsage() and MetricsTesterCgroupV2::testMemoryUsage() should be removed.
19-12-2022

Indeed, part of TestCgroupMetrics.java test which fails intermittently. The test needs to be improved. There is a ticket https://bugs.openjdk.org/browse/JDK-8298030 . Better to disable the problematic part of the test until the test is improved.
02-12-2022

Here's log file snippet from the jdk-20+22-1572-tier5 sighting: jdk/internal/platform/cgroup/TestCgroupMetrics.java #section:main ----------messages:(8/309)---------- command: main TestCgroupMetrics reason: User specified action: run main TestCgroupMetrics started: Sat Oct 29 18:51:23 UTC 2022 Mode: agentvm Agent id: 6 Additional exports to unnamed modules from @modules: java.base/jdk.internal.platform finished: Sat Oct 29 18:51:25 UTC 2022 elapsed time (seconds): 1.683 ----------configuration:(15/1955)---------- <snip> ----------System.err:(14/960)---------- java.lang.RuntimeException: Test failed for - memory:getMemoryUsage, expected [23282388992], got [22777573376] at jdk.test.lib.containers.cgroup.CgroupMetricsTester.fail(CgroupMetricsTester.java:71) at jdk.test.lib.containers.cgroup.MetricsTesterCgroupV1.fail(MetricsTesterCgroupV1.java:188) at jdk.test.lib.containers.cgroup.MetricsTesterCgroupV1.testMemoryUsage(MetricsTesterCgroupV1.java:555) at jdk.test.lib.containers.cgroup.MetricsTester.testAll(MetricsTester.java:62) at TestCgroupMetrics.main(TestCgroupMetrics.java:47) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) at java.base/java.lang.reflect.Method.invoke(Method.java:578) at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:312) at java.base/java.lang.Thread.run(Thread.java:1591) JavaTest Message: Test threw exception: java.lang.RuntimeException JavaTest Message: shutting down test result: Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: Test failed for - memory:getMemoryUsage, expected [23282388992], got [22777573376]
30-10-2022