Bug ID: JDK-8270087 Manually mounted cgroups controllers introduce warning logs on stdout

JDK-8270087 : Manually mounted cgroups controllers introduce warning logs on stdout

Type: Bug
Component: hotspot
Sub-Component: runtime
Affected Version: 11.0.16,16.0.1,17,20

Priority: P3
Status: Resolved
Resolution: Duplicate
OS: linux
CPU: generic

Submitted: 2021-05-17
Updated: 2024-06-04
Resolved: 2022-09-07

Related Reports

Duplicate :	JDK-8293472 - Incorrect container resource limit detection if manual cgroup fs mounts present
Relates :	JDK-8272121 - Unified Logging always writes warnings to stdout
Relates :	JDK-8253435 - Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist

Description

ADDITIONAL SYSTEM INFORMATION :
Debian10 (running inside docker)
openjdk version "16.0.1" 2021-04-20
OpenJDK Runtime Environment AdoptOpenJDK-16.0.1+9 (build 16.0.1+9)
OpenJDK 64-Bit Server VM AdoptOpenJDK-16.0.1+9 (build 16.0.1+9, mixed mode, sharing)


A DESCRIPTION OF THE PROBLEM :
We start a regular subprocess using the java binary via a script and rely on being able to consume the java processes' stdout and parse it as JSON. (in the particular case to start up a small java server that communicates its random, OS-assigned TCP-port via stdout so we can connect to it later on, fully automated).
In our particular environment the assumption that stdout is pure JSON broke because of the following log message on stdout, which we feel belongs to stderr:

[0.000s][warning][os,container] Duplicate cpuset controllers detected. Picking /sys/fs/cgroup/cpuset, skipping /cgroup-in/cpuset.

We have found this while googling for solutions: https://openjdk.java.net/jeps/158 containing most importantly this statement:
"The default configuration is that all messages using warning and error level are output to stderr." which does not seem to be true in this case. This issues is being addressed with JDK-8272121.

The second issue, why this warning is being produced in the first place is being addressed with JDK-8293472.

REGRESSION : Last worked in version 15

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
- Create a new empty folder and cd into it
- Download and extract jdk16 or 16.01 into it
- Create a file called "Dockerfile" with the following contents
FROM debian:10
ADD jdk-16 /jdk-16
- Build a docker container using the command
   docker build -t jdk16-logging-repro -f Dockerfile .
- run it using the following command
  docker run -it --rm -v /sys/fs/cgroup:/cgroup-in:ro jdk16-logging-repro
- inside the container do a "jdk16/bin/java -version 2>/dev/null"

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
No output or only the java version information
ACTUAL -
[0.000s][warning][os,container] Duplicate cpuset controllers detected. Picking /sys/fs/cgroup/cpuset, skipping /cgroup-in/cpuset.

CUSTOMER SUBMITTED WORKAROUND :
Can be worked around by not specifying "-v /sys/fs/cgroup:/cgroup-in:ro" on the docker invocation or by using -XX:-UseContainerSupport being passed to java -version.

FREQUENCY : always

Comments

Closing as duplicate of JDK-8293472. The issue of any warning logs produced via UL is being addressed with JDK-8272121.
07-09-2022
As it turns out Tencent folks have created JDK-8293472 and a PR associated with it. I'll close it as a duplicate of JDK-8293472.
07-09-2022
[~sgehwolf] sound good. Please change the synopsis. Also, in the description where it mentions -Xlog, please add a pointer to JDK-8272121. Thanks
06-09-2022
ILW = HLM = P3
06-09-2022
[~iklam] The warning reported by this user is a symptom of a bug similar to JDK-8253435 (the warning showing up). In a way it uncovered two bugs: JDK-8272121 and that the runtime processing isn't ready to handle those duplicates. So in a way it's just as good to keep this one open and track that change here. I don't think it makes sense to close as a duplicate of JDK-8272121 just so that we create another. I intend to change the synopsis of this bug once I start work on a fix if that helps. Does that sound OK?
06-09-2022
[~mbaesken] [~sgehwolf] The bug reported by the user is that log_warning messages always goes to stdout, and cannot be disabled even if -Xlog:all=warning:stderr is specific. See also JDK-8272121. I would like to close this issue as a duplicate of JDK-8272121. Do you agree? For the container-specific part of this issue (container handling of /sys/fs/cgroups), should we track that in a separate issue?
06-09-2022
This basically is another incarnation of JDK-8253435. The core assumption is that only one mountinfo entry is present per cg v1 controller. This breaks with bind-mounting the /sys/fs/cgroups tree into the container. With a fastdebug build of JDK 20 this crashes: $ sudo podman run --rm -ti -v /sys/fs/cgroup:/cgroup-in:ro -v $(pwd)/build/linux-x86_64-server-fastdebug/images/jdk/:/opt/jdk:z fedora:36 [root@3517b9eafc25 /]# /opt/jdk/bin/java -version # To suppress the following error report, specify this argument # after -XX: or in .hotspotrc: SuppressErrorAt=/cgroupSubsystem_linux.cpp:373 # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (/disk/openjdk/upstream-sources/git/jdk-jdk/src/hotspot/os/linux/cgroupSubsystem_linux.cpp:373), pid=12, tid=13 # assert(cg_infos[4]._mount_path == __null) failed: stomping of _mount_path # # JRE version: (20.0) (fastdebug build ) # Java VM: OpenJDK 64-Bit Server VM (fastdebug 20-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing, tiered, unknown gc, linux-amd64) # Core dump will be written. Default location: Core dumps may be processed with "/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h" (or dumping to //core.12) # # An error report file with more information is saved as: # //hs_err_pid12.log # # Aborted (core dumped) [root@3517b9eafc25 /]# cat /proc/self/mountinfo \| grep cgroup 2466 2464 0:27 / /cgroup-in ro,nosuid,nodev,noexec - tmpfs tmpfs ro,seclabel,size=4096k,nr_inodes=1024,mode=755,inode64 2467 2466 0:28 / /cgroup-in/unified rw,nosuid,nodev,noexec,relatime - cgroup2 cgroup2 rw,seclabel,nsdelegate 2468 2466 0:29 / /cgroup-in/systemd rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,xattr,name=systemd 2469 2466 0:32 / /cgroup-in/net_cls,net_prio rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,net_cls,net_prio 2470 2466 0:33 / /cgroup-in/freezer rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,freezer 2471 2466 0:34 / /cgroup-in/cpu,cpuacct rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,cpu,cpuacct 2472 2466 0:35 / /cgroup-in/perf_event rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,perf_event 2473 2466 0:36 / /cgroup-in/memory rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,memory 2474 2466 0:37 / /cgroup-in/blkio rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,blkio 2475 2466 0:38 / /cgroup-in/devices rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,devices 2476 2466 0:39 / /cgroup-in/cpuset rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,cpuset 2477 2466 0:40 / /cgroup-in/hugetlb rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,hugetlb 2478 2466 0:41 / /cgroup-in/pids rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,pids 2479 2466 0:42 / /cgroup-in/misc rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,misc 2491 2465 0:101 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime - tmpfs cgroup rw,context="system_u:object_r:container_file_t:s0:c783,c861",size=1024k,inode64 2492 2491 0:42 /machine.slice/libpod-3517b9eafc258a609b90362766d67aee1490fb98c1c6cb48ff84d26039305c56.scope /sys/fs/cgroup/misc ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,misc 2493 2491 0:41 /machine.slice/libpod-3517b9eafc258a609b90362766d67aee1490fb98c1c6cb48ff84d26039305c56.scope /sys/fs/cgroup/pids ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,pids 2494 2491 0:40 /machine.slice/libpod-3517b9eafc258a609b90362766d67aee1490fb98c1c6cb48ff84d26039305c56.scope /sys/fs/cgroup/hugetlb ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,hugetlb 2495 2491 0:39 /machine.slice/libpod-3517b9eafc258a609b90362766d67aee1490fb98c1c6cb48ff84d26039305c56.scope /sys/fs/cgroup/cpuset ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,cpuset 2496 2491 0:38 /machine.slice/libpod-3517b9eafc258a609b90362766d67aee1490fb98c1c6cb48ff84d26039305c56.scope /sys/fs/cgroup/devices ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,devices 2497 2491 0:37 /machine.slice/libpod-3517b9eafc258a609b90362766d67aee1490fb98c1c6cb48ff84d26039305c56.scope /sys/fs/cgroup/blkio ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,blkio 2498 2491 0:36 /machine.slice/libpod-3517b9eafc258a609b90362766d67aee1490fb98c1c6cb48ff84d26039305c56.scope /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,memory 2499 2491 0:35 /machine.slice/libpod-3517b9eafc258a609b90362766d67aee1490fb98c1c6cb48ff84d26039305c56.scope /sys/fs/cgroup/perf_event ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,perf_event 2500 2491 0:34 /machine.slice/libpod-3517b9eafc258a609b90362766d67aee1490fb98c1c6cb48ff84d26039305c56.scope /sys/fs/cgroup/cpu,cpuacct ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,cpu,cpuacct 2501 2491 0:33 /machine.slice/libpod-3517b9eafc258a609b90362766d67aee1490fb98c1c6cb48ff84d26039305c56.scope /sys/fs/cgroup/freezer ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,freezer 2502 2491 0:32 /machine.slice/libpod-3517b9eafc258a609b90362766d67aee1490fb98c1c6cb48ff84d26039305c56.scope /sys/fs/cgroup/net_cls,net_prio ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,net_cls,net_prio 2503 2491 0:29 /machine.slice/libpod-3517b9eafc258a609b90362766d67aee1490fb98c1c6cb48ff84d26039305c56.scope /sys/fs/cgroup/systemd ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,xattr,name=systemd
06-09-2022
This is an issue seen sometimes on Linux in cgroups / docker environments. Since 11.0.16 after backporting https://bugs.openjdk.org/browse/JDK-8253435 which added the warning to cgroupSubsystem_linux.cpp it can be observed as well in JDK11 . It seems you need to have some special cpusets setup to see this . Because it is an UL warning (not an info) it is printed without setting specific additional UL flags. And unfortunately it is shown on stdout not stderr which caused trouble with some tools not expecting it. So questions are : - Would it be better to have an info instead of a warning? - having it on stderr instead of stdout might be more reasonable Btw. it does not have anything to do with AdoptOpenJDK. Any recent OpenJDK 11 , 17 will have the issue. Most likely also Oracle JDK.
06-09-2022
This incident is reported on AdoptOpenJDK-16.0.1+9, was not able to reproduce this issue with Oracle JDK 16.0.1 and JDK 17. OS: Windows 10 JDK 16.0.1 : Pass JDK 17ea29: Pass No output is generated as mentioned in the expected behavior EXPECTED - No output or only the java version information
08-07-2021