JDK-8370572 : Cgroups hierarchical memory limit is not honored after JDK-8322420
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 21.0.9,25,26
  • Priority: P4
  • Status: New
  • Resolution: Unresolved
  • Submitted: 2025-10-24
  • Updated: 2025-10-27
Related Reports
Causes :  
Description
JDK-8322420 removed the handling of `hierarchical_memory_limit`:
https://github.com/openjdk/jdk/commit/55a7cf14453b6cd1de91362927b2fa63cba400a1#diff-8910f554ed4a7bc465e01679328b3e9bd64ceaa6c85f00f0c575670e748ebba9L118-L131

It seems to be the reason for reports that ECS tasks are no longer honoring the memory limit:
 https://github.com/adoptium/adoptium-support/issues/1293
 https://github.com/corretto/corretto-21/issues/135

We initially found this in Corretto 21.0.9. I deployed the `public.ecr.aws/amazoncorretto/amazoncorretto:21` image to ECS on Fargate as a 2GB task, overrode the Docker command to `java,-XX:InitialRAMPercentage=25,-XX:MaxRAMPercentage=25,-Xlog:os+container=trace,-Xlog:gc+init,-version`, and got this:

```
[0.002s][trace][os,container] total physical memory: 4037939200
...
[0.002s][trace][os,container] Memory Limit is: 9223372036854771712
[0.002s][debug][os,container] container memory limit ignored: 9223372036854771712, using host value 4037939200
...
[0.013s][info ][gc,init ] Heap Initial Capacity: 964M
[0.013s][info ][gc,init ] Heap Max Capacity: 964M
...
openjdk version "21.0.9" 2025-10-21 LTS
OpenJDK Runtime Environment Corretto-21.0.9.10.1 (build 21.0.9+10-LTS)
OpenJDK 64-Bit Server VM Corretto-21.0.9.10.1 (build 21.0.9+10-LTS, mixed mode, sharing)
```

So even though this is a 2G task, we detect 4G as the memory limit, which gives a 25% heap of about 1G instead of 512M, which leads to OOM later.
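
To make the arithmetic explicit, here is a minimal sketch that applies the 25% setting to both the host memory from the trace and the 2G task size. The helper name `heap_from_percentage` is made up for illustration and is not a JVM API. The JVM reports 964M rather than the raw value, presumably because of heap alignment.

```python
MIB = 1024 * 1024

def heap_from_percentage(memory_limit_bytes: int, ram_percentage: float) -> int:
    """Heap size in bytes implied by a RAM percentage, before any JVM alignment."""
    return int(memory_limit_bytes * ram_percentage / 100)

host_memory = 4037939200    # "total physical memory" from the trace above
task_limit = 2 * 1024 ** 3  # the 2G task size

detected = heap_from_percentage(host_memory, 25)  # what the JVM ends up using
intended = heap_from_percentage(task_limit, 25)   # what the task limit would give

print(f"heap from host memory: {detected / MIB:.1f} MiB")
print(f"heap from task limit:  {intended / MIB:.1f} MiB")
```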

In fact, this also reproduces with JDK mainline nightly, with a similar result:

```
| [0.001s][trace][os,container] OSContainer::init: Initializing Container Support
| [0.001s][debug][os,container] Detected optional pids controller entry in /proc/cgroups
| [0.001s][debug][os,container] Detected cgroups hybrid or legacy hierarchy, using cgroups v1 controllers
...
| [0.002s][trace][os,container] Path to /memory.limit_in_bytes is /sys/fs/cgroup/memory/memory.limit_in_bytes
| [0.002s][trace][os,container] Memory Limit is: 9223372036854771712
| [0.002s][debug][os,container] container memory limit ignored: 9223372036854771712, upper bound is 4037939200
...
| [0.019s][info ][gc,init     ] Heap Initial Capacity: 964M
| [0.019s][info ][gc,init     ] Heap Max Capacity: 964M
...
| openjdk version "26-testing" 2026-03-17
| OpenJDK Runtime Environment (build 26-testing-builds.shipilev.net-openjdk-jdk-b5938-20251024-1037)
| OpenJDK 64-Bit Server VM (build 26-testing-builds.shipilev.net-openjdk-jdk-b5938-20251024-1037, mixed mode, sharing)
```

If you look at how ECS configures cgroups from inside the container, this is what you see:

```
/sys/fs/cgroup/memory/memory.use_hierarchy:1
/sys/fs/cgroup/memory/memory.stat:hierarchical_memory_limit 2147483648
...
/sys/fs/cgroup/memory/memory.limit_in_bytes:9223372036854771712
```

So there is _no_ `memory.limit_in_bytes` set, but `hierarchical_memory_limit` is still there. New JDK code leans heavily on `memory.limit_in_bytes` and therefore sees no limit. Older JDK code used to look at `hierarchical_memory_limit` and worked fine. 

My brief reading suggests that walking the hierarchy and looking for `memory.limit_in_bytes` is fine for cgroups V2, but for cgroups V1 we should still rely on `hierarchical_memory_limit`. I have not been able to see clear docs on this.
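
As a rough illustration of that idea, the sketch below picks the lowest of `memory.limit_in_bytes` and `hierarchical_memory_limit`, and treats any value at or above physical memory as "no limit". This is not the HotSpot code. The function names are hypothetical, and the "unlimited" constant assumes 4 KiB pages (it is `LONG_MAX` rounded down to a page boundary, which matches the value in the logs above).

```python
# LONG_MAX rounded down to a 4 KiB page: the "no limit" value seen in the logs.
V1_UNLIMITED = (2**63 - 1) // 4096 * 4096

def parse_hierarchical_limit(memory_stat_text):
    """Return hierarchical_memory_limit from memory.stat contents, or None if absent."""
    for line in memory_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "hierarchical_memory_limit":
            return int(parts[1])
    return None

def effective_limit(limit_in_bytes, memory_stat_text, physical_memory):
    """Return the lowest limit below physical memory, or None when unlimited."""
    candidates = [limit_in_bytes]
    hierarchical = parse_hierarchical_limit(memory_stat_text)
    if hierarchical is not None:
        candidates.append(hierarchical)
    # Anything at or above physical memory cannot constrain the process.
    bounded = [c for c in candidates if c < physical_memory]
    return min(bounded) if bounded else None

# Values taken from the ECS output above.
ecs_stat = "hierarchical_memory_limit 2147483648\n"
print(effective_limit(V1_UNLIMITED, ecs_stat, 4037939200))
```

With the ECS values this yields the 2G hierarchical limit, whereas looking at `memory.limit_in_bytes` alone yields no limit at all.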
Comments
A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk/pull/28006 Date: 2025-10-27 17:40:45 +0000
27-10-2025

I attached JDK-8370572-poc-1.patch that restores hunks of JDK-8322420 related to `hierarchical_memory_limit`. It seems to solve the problem with local Docker reproducer and on ECS as well.
24-10-2025

That's what I am saying: it looks like container got itself a child of the outside parent, and that child is now _root_ from the perspective of container. There is no "parent" from within the container anymore, only a single "child" one? So the full hierarchy is not visible to the container, but kernel still computes limits for us, because it _can_ still see it.

% find /sys/fs/cgroup 2>&1 | grep memory.stat
...
/sys/fs/cgroup/memory/memory/parent/memory.stat
/sys/fs/cgroup/memory/memory/memory.stat

% docker run --cgroup-parent=/parent --rm -it shipilev/openjdk:latest find /sys/fs/cgroup | grep memory.stat
/sys/fs/cgroup/memory/memory.stat
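
The effect can be modelled with a toy directory tree. The sketch below is not the JDK's hierarchy walk; it is a simplified stand-in that reads `memory.limit_in_bytes` from a cgroup directory up to whatever root is visible. The directory name `child` and the function names are hypothetical.

```python
import tempfile
from pathlib import Path

UNLIMITED = 9223372036854771712  # the "no limit" value seen in the logs

def write_limit(cgroup_dir: Path, limit: int) -> None:
    """Create a fake cgroup directory holding a memory.limit_in_bytes file."""
    cgroup_dir.mkdir(parents=True, exist_ok=True)
    (cgroup_dir / "memory.limit_in_bytes").write_text(f"{limit}\n")

def walk_for_limit(visible_root: Path, cgroup_dir: Path):
    """Walk from cgroup_dir up to visible_root; return the lowest limit or None."""
    lowest = None
    current = cgroup_dir
    while True:
        limit = int((current / "memory.limit_in_bytes").read_text())
        if limit < UNLIMITED and (lowest is None or limit < lowest):
            lowest = limit
        if current == visible_root:
            return lowest
        current = current.parent

with tempfile.TemporaryDirectory() as tmp:
    host_root = Path(tmp)
    child = host_root / "parent" / "child"
    write_limit(host_root, UNLIMITED)
    write_limit(host_root / "parent", 1073741824)  # limit set on the outside parent
    write_limit(child, UNLIMITED)

    host_view = walk_for_limit(host_root, child)  # full hierarchy visible
    container_view = walk_for_limit(child, child)  # child is the visible root

print(host_view, container_view)
```

From the host side the walk finds the 1G limit on `parent`. From the container side the walk starts and ends at the same directory, so it finds nothing.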
24-10-2025

Yeah. This is interesting:
[0.001s][trace][os,container] Path to /memory.limit_in_bytes is /sys/fs/cgroup/memory/memory.limit_in_bytes
it should be:
/sys/fs/cgroup/memory/parent/memory.limit_in_bytes
24-10-2025

Yes. So I am thinking we should trust `hierarchical_memory_limit`, if it is available; because that *is* what kernel computes for us, regardless of the actually visible hierarchy from inside of container? But maybe there is another solution I am not seeing.
24-10-2025

The container detection is a heuristic and it's not guaranteed to be always right. Question is how to best deal with this.
24-10-2025

Already did, same thing. My shipilev/openjdk:21 (jdk21u-dev nightly) fails on both ECS and locally in the reproducer above. Note that my reproducer above is failing even with *current mainline*, so it does not look like a missing backport. And the cgroup config that container sees seems to be similar to what container sees when running on ECS. That I think is the culprit...
24-10-2025

Please test 21.0.10 (21u-dev) as it also includes JDK-8343191.
24-10-2025

> Is there a reason why ECS sets up the cgroup subsystem the way it does - using the hierarchical limit rather than the real one?

We are still following up on that. But I see the problem even with plain Docker, see above. Seems to be a generic V1 / Docker compatibility problem to me? E.g. it looks like if part of the cgroup hierarchy is not visible (that's what cgroup-parent does, AFAIU), then our hierarchy walk does not see whatever limit we set "outside" of the container. But the kernel reports it right via hierarchical_memory_limit...
24-10-2025

Found a local reproducer without ECS, just with plain Docker:

% sudo mkdir /sys/fs/cgroup/memory/parent
% echo 1073741824 | sudo tee /sys/fs/cgroup/memory/parent/memory.limit_in_bytes

% docker run --cgroup-parent=/parent --rm -it shipilev/openjdk:latest grep -rH . /sys/fs/cgroup/ | grep -E \(memory.limit_in_bytes\|hierarch\)
/sys/fs/cgroup/memory/memory.use_hierarchy:1
/sys/fs/cgroup/memory/memory.stat:hierarchical_memory_limit 1073741824
/sys/fs/cgroup/memory/memory.limit_in_bytes:9223372036854771712

% docker run --cgroup-parent=/parent shipilev/openjdk:latest java -Xlog:os+container=trace -XX:InitialRAMPercentage=25 -XX:MaxRAMPercentage=25 -Xlog:gc+init -version
[0.001s][trace][os,container] OSContainer::init: Initializing Container Support
[0.001s][debug][os,container] Detected optional pids controller entry in /proc/cgroups
[0.001s][debug][os,container] Detected cgroups hybrid or legacy hierarchy, using cgroups v1 controllers
[0.001s][debug][os,container] OSContainer::init: is_containerized() = true because all controllers are mounted read-only (container case)
...
[0.001s][trace][os,container] Path to /memory.limit_in_bytes is /sys/fs/cgroup/memory/memory.limit_in_bytes
[0.001s][trace][os,container] Memory Limit is: 9223372036854771712
[0.001s][debug][os,container] container memory limit ignored: 9223372036854771712, upper bound is 264567476224
...
[0.166s][trace][os,container] Path to /memory.limit_in_bytes is /sys/fs/cgroup/memory/memory.limit_in_bytes
[0.166s][trace][os,container] Memory Limit is: 9223372036854771712
[0.166s][debug][os,container] container memory limit ignored: 9223372036854771712, upper bound is 264567476224
[0.166s][info ][gc,init ] Memory: 246G
...
[0.166s][info ][gc,init ] Heap Initial Capacity: 63104M
[0.166s][info ][gc,init ] Heap Max Capacity: 63104M
...
openjdk version "26-testing" 2026-03-17
OpenJDK Runtime Environment (build 26-testing-builds.shipilev.net-openjdk-jdk-b5938-20251024-1037)
OpenJDK 64-Bit Server VM (build 26-testing-builds.shipilev.net-openjdk-jdk-b5938-20251024-1037, mixed mode, sharing)
24-10-2025

> I am thinking we should reinstate `hierarchical_memory_limit` check?

It would make the container subsystem code even more complicated than it already is. So ECS with cgroup v1 relies on a systemd-related fix which was ineffective on cgroup v2: JDK-8217338. Interesting. Is there a reason why ECS sets up the cgroup subsystem the way it does - using the hierarchical limit rather than the real one? Are there ECS systems with cgroup v2? If so, what does the setup look like there for the same system? We know setting the hard memory limit fixes the issue.
24-10-2025

[~sgehwolf], any thoughts? I am thinking we should reinstate `hierarchical_memory_limit` check?
24-10-2025