JDK-8339593 : [Linux] Not all cgroup related metrics get looked up correctly
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 24
  • Priority: P4
  • Status: New
  • Resolution: Unresolved
  • OS: linux
  • CPU: generic
  • Submitted: 2024-09-05
  • Updated: 2024-09-05
Other: tbd (Unresolved)
Description
Container detection code was originally introduced with JDK-8146115, primarily focusing on Docker. It was enhanced to support cgroups v2 in JDK-8230305. Further fixes, for example JDK-8322420 and JDK-8217338, added support for other runtimes that use cgroups to limit resources. The primary goal of this work was to prevent OOM kills and to improve how the JVM behaves in resource-limited environments such as containers and Kubernetes.

However, not all possible combinations are currently supported. In particular, when a setup deviates from common container runtimes such as Docker or Podman, the JVM might not recognize certain metrics. Those metrics are currently used mostly for diagnostics:

See os::Linux::print_container_info()

For example, these metrics might return wrong values in certain configurations. One such case is a systemd slice configuration where the memory limit and other memory metrics aren't restricted at the same level of the hierarchy.
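
To illustrate why a lookup at a single level can miss a setting, here is a minimal Python sketch that walks a cgroup v2 directory tree from a leaf up to the mount root and keeps the tightest value found for one interface file. It is not how HotSpot performs the lookup; the function name `effective_limit` and its parameters are hypothetical, and taking the minimum across ancestors is an assumption that fits limit-style files such as memory.max.

```python
from pathlib import Path
from typing import Optional


def effective_limit(leaf: Path, root: Path, file_name: str) -> Optional[int]:
    """Smallest non-"max" value of file_name found from leaf up to root.

    Returns None when no level between leaf and root sets a limit.
    Hypothetical helper for illustration only.
    """
    tightest: Optional[int] = None
    current = leaf
    while True:
        try:
            text = (current / file_name).read_text().strip()
        except OSError:
            # File absent at this level (for example in the root cgroup):
            # treat the level as unlimited.
            text = "max"
        if text and text != "max":
            value = int(text)
            tightest = value if tightest is None else min(tightest, value)
        # Stop at the cgroup mount root, or at the filesystem root as a guard.
        if current == root or current == current.parent:
            break
        current = current.parent
    return tightest
```

With the unified hierarchy mounted at /sys/fs/cgroup, a call such as `effective_limit(Path("/sys/fs/cgroup/app.slice/app.service"), Path("/sys/fs/cgroup"), "memory.max")` would consider both the service and its parent slice; the slice and service names here are made up.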

The values in question include, but are not limited to:

OSContainer::cpu_shares() => uses cpu.weight on cg v2
OSContainer::memory_and_swap_limit_in_bytes() => uses memory.swap.max on cg v2
OSContainer::memory_soft_limit_in_bytes() => uses memory.high on cg v2
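
For anyone comparing the JVM's reported values against the raw cgroup v2 files, the following Python sketch reads the three interface files named above from a single cgroup directory. The names `CG_V2_FILES`, `read_cg_v2_value` and `snapshot` are hypothetical, and mapping the literal "max" (or a missing file) to None is an assumption of this sketch, not HotSpot behavior.

```python
from pathlib import Path
from typing import Dict, Optional

# cgroup v2 interface files as listed in this report, keyed by the
# OSContainer function that is said to use them.
CG_V2_FILES = {
    "cpu_shares": "cpu.weight",
    "memory_and_swap_limit_in_bytes": "memory.swap.max",
    "memory_soft_limit_in_bytes": "memory.high",
}


def read_cg_v2_value(cgroup_dir: Path, file_name: str) -> Optional[int]:
    """Return the integer stored in the file, or None if it is missing,
    empty, or holds the literal "max" (no limit set at this level)."""
    try:
        text = (cgroup_dir / file_name).read_text().strip()
    except OSError:
        return None
    if not text or text == "max":
        return None
    return int(text)


def snapshot(cgroup_dir: Path) -> Dict[str, Optional[int]]:
    """Read every file in CG_V2_FILES from one cgroup directory."""
    return {
        metric: read_cg_v2_value(cgroup_dir, file_name)
        for metric, file_name in CG_V2_FILES.items()
    }
```

Running `snapshot` on each directory between the process's own cgroup and the mount root shows at which level, if any, each value is actually set.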
Comments
Filing this issue based on the discussion in https://github.com/openjdk/jdk/pull/20646. The test framework in JDK-8333446 could be leveraged to cover such cases if we ever decide to fix this situation. Right now, it seems too uncommon to be worth fixing.
05-09-2024