This bug is similar to JDK-8217766; in fact, the fix for that bug was incomplete on the Metrics (Java) side.
On a system with joined controllers, /proc/self/cgroup might look like this:
9:pids:/user.slice/user-1000.slice/session-2.scope
8:perf_event:/
7:blkio:/user.slice
6:rdma:/
5:cpuset:/
4:devices:/user.slice
3:cpu,cpuacct,memory,net_cls,net_prio,hugetlb:/user.slice/user-1000.slice/session-2.scope
2:freezer:/
1:name=systemd:/user.slice/user-1000.slice/session-2.scope
0::/user.slice/user-1000.slice/session-2.scope
Then, the Java code to set the path for each controller reads:
try (Stream<String> lines =
     CgroupUtil.readFilePrivileged(Paths.get("/proc/self/cgroup"))) {
    lines.map(line -> line.split(":"))
         .filter(line -> (line.length >= 3))
         .forEach(line -> setSubSystemControllerPath(subsystem, line));
} catch (IOException e) {
    return null;
}
where setSubSystemControllerPath() reads:
/**
 * setSubSystemPath based on the contents of /proc/self/cgroup
 */
private static void setSubSystemControllerPath(CgroupV1Subsystem subsystem, String[] entry) {
    String controllerName;
    String base;
    CgroupV1SubsystemController controller = null;
    CgroupV1SubsystemController controller2 = null;
    controllerName = entry[1];
    base = entry[2];
    if (controllerName != null && base != null) {
        switch (controllerName) {
            case "memory":
                controller = subsystem.memoryController();
                break;
            case "cpuset":
                controller = subsystem.cpuSetController();
                break;
            case "cpu,cpuacct":
            case "cpuacct,cpu":
                controller = subsystem.cpuController();
                controller2 = subsystem.cpuAcctController();
                break;
            case "cpuacct":
                controller = subsystem.cpuAcctController();
                break;
            case "cpu":
                controller = subsystem.cpuController();
                break;
            case "blkio":
                controller = subsystem.blkIOController();
                break;
            // Ignore subsystems that we don't support
            default:
                break;
        }
    }
    if (controller != null) {
        controller.setPath(base);
        if (controller instanceof CgroupV1MemorySubSystemController) {
            CgroupV1MemorySubSystemController memorySubSystem = (CgroupV1MemorySubSystemController)controller;
            boolean isHierarchial = getHierarchical(memorySubSystem);
            memorySubSystem.setHierarchical(isHierarchial);
            boolean isSwapEnabled = getSwapEnabled(memorySubSystem);
            memorySubSystem.setSwapEnabled(isSwapEnabled);
        }
        subsystem.setActiveSubSystems();
    }
    if (controller2 != null) {
        controller2.setPath(base);
    }
}
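So for the example /proc/self/cgroup file above, only the paths for "blkio" and "cpuset" get set. The switch compares the whole controller field against fixed strings, so the joined entry "cpu,cpuacct,memory,net_cls,net_prio,hugetlb" matches no case, and the cpu, cpuacct and memory controllers never get their paths set.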
The net effect of this is that Metrics doesn't report the container limits correctly on such systems.
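For illustration only, here is a minimal sketch of one way the controller field could be handled so that joined controllers are recognized: split the field on "," and match each name individually. This is not the actual JDK code or a proposed patch; the class JoinedControllerSketch, the helper controllerPaths() and the SUPPORTED set are hypothetical names, and the sketch returns a plain map instead of touching the CgroupV1Subsystem objects.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-alone sketch; not part of the JDK sources.
final class JoinedControllerSketch {
    // The cgroup v1 controller names that the switch statement above handles.
    static final Set<String> SUPPORTED =
            Set.of("memory", "cpuset", "cpu", "cpuacct", "blkio");

    /**
     * Maps each supported controller name to its cgroup path, given the lines
     * of /proc/self/cgroup. The controller field is split on ',' so that a
     * joined entry such as "cpu,cpuacct,memory" contributes all of its names.
     */
    static Map<String, String> controllerPaths(List<String> cgroupLines) {
        Map<String, String> paths = new LinkedHashMap<>();
        for (String line : cgroupLines) {
            // Format is hierarchy-ID:controller-list:path; a limit of 3
            // keeps any ':' characters in the path inside the last field.
            String[] entry = line.split(":", 3);
            if (entry.length < 3) {
                continue;
            }
            for (String name : entry[1].split(",")) {
                if (SUPPORTED.contains(name)) {
                    paths.put(name, entry[2]);
                }
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "7:blkio:/user.slice",
            "5:cpuset:/",
            "3:cpu,cpuacct,memory,net_cls,net_prio,hugetlb:/user.slice/user-1000.slice/session-2.scope",
            "0::/user.slice/user-1000.slice/session-2.scope");
        // Print one "controller -> path" line per supported controller found.
        controllerPaths(lines).forEach(
            (name, path) -> System.out.println(name + " -> " + path));
    }
}
```

With the example file above, this sketch would also pick up cpu, cpuacct and memory from the joined entry, while still ignoring unsupported names such as net_cls and the empty controller field of the "0::" line.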