JDK-8257746 : Regression introduced with JDK-8250984 - memory might be null in some machines
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 8,11
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2020-12-04
  • Updated: 2025-01-16
  • Resolved: 2021-01-28
Release   Version        Status
JDK 11    11.0.11        Fixed
JDK 13    13.0.7         Fixed
JDK 15    15.0.4         Fixed
JDK 16    16.0.1         Fixed
JDK 17    17 b08         Fixed
JDK 8     8u301          Fixed
Other     openjdk8u292   Fixed
Related Reports
Relates :  
Relates :  
Relates :  
Comments
Fix request (15u) Requesting backport to 15u as a follow-up fix for JDK-8250984, which is already included in 15u. The patch applies cleanly. Tested with tier1 and container tests.
13-05-2021

Fix request (13u) Requesting backport to 13u for parity with 11u. The patch doesn't apply cleanly since 13u doesn't have cgroups v2 support (JDK-8231111), so it was reapplied manually to the corresponding places in cgroupv1/Metrics.java. Tested with tier1 and container tests. RFR: http://mail.openjdk.java.net/pipermail/jdk-updates-dev/2021-March/005150.html
01-03-2021

Fix Request (16u) Backporting this small low-risk fix prevents this bug from occurring in JDK-16u. The original bug fix patch applied cleanly. After applying the patch to a JDK-16u repo, the fix was regression tested by running Mach5 tiers 1 and 2 on Linux, Windows, and Mac OS, and running tiers 3-5 on Linux x64.
01-03-2021

Fix Request (OpenJDK 11u): Please approve backporting this to OpenJDK 11u. Patch doesn't apply cleanly due to the cgroups v2 patch in JDK 17. Rewritten for JDK 11u and reviewed by Matthias Baesken. Risk should be low as it's only adding null checks before actually using the controller. Matthias tested the patch and confirmed it fixes the regression on an affected system. I've also tested using the container tests (cgroups v1) which pass too. webrev: https://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8257746/jdk11/01/webrev/ RFR: https://mail.openjdk.java.net/pipermail/jdk-updates-dev/2021-February/005110.html
25-02-2021

Thanks Matthias! I'll propose for review then. RFR: https://mail.openjdk.java.net/pipermail/jdk-updates-dev/2021-February/005110.html
24-02-2021

Hi Severin, I did a quick check, and the proposed change solves the issue we noticed with openjdk11 on the SLES11 linux x86_64 machine.
24-02-2021

Candidate webrev for 11u: https://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8257746/jdk11/01/webrev/
24-02-2021

Ah, OK. Thanks! It's jtreg itself that fails, not the test. It also confirms that it's OperatingSystemMXBean which triggers that code path. The workaround is to use -XX:-UseContainerSupport on these systems.
24-02-2021

Interesting. Would you have the full stack trace of the jdk/jdk/jfr/event/sampling/TestNative.java failure? Yes, the 11u backport would need some rewrite as there is no cgroups v2 support there. I can do it, but since I've no real way of reproducing/testing it I'd rely on somebody else for confirming the fix.
24-02-2021

My colleague noticed the error on a SLES11 linux x86_64 box when running the test jdk/jdk/jfr/event/sampling/TestNative.java. SLES11 is rather old, and for some reason /proc/self/cgroup is missing the memory entry there; that entry is present on the newer SLES Linux versions I checked (e.g. SLES12). I would like to have the fix in jdk11 as well, because the issue is present there. The jdk17 change does not apply directly to jdk11 (because the affected code lives in different files in 11 and 17). A separate backport request is to be done.
24-02-2021

Here are the contents of the requested files:

/proc/self/mountinfo
------------------------
16 20 0:15 / /dev rw,relatime - tmpfs udev rw,nr_inodes=0,mode=755
17 16 0:16 / /dev/shm rw,relatime - tmpfs tmpfs rw,size=74448896k
20 1 253:3 / / rw,noatime - ext3 /dev/mapper/vg_sys_r1-root rw,errors=continue,user_xattr,acl,barrier=1,data=ordered
18 20 0:3 / /proc rw,relatime - proc proc rw
14 20 0:14 / /sys rw,relatime - sysfs sysfs rw
15 16 0:11 / /dev/pts rw,relatime - devpts devpts rw,gid=5,mode=620,ptmxmode=000
21 14 0:6 / /sys/kernel/debug rw,relatime - debugfs debugfs rw
22 20 8:1 / /boot rw,noatime - ext3 /dev/sda1 rw,errors=continue,user_xattr,acl,barrier=1,data=ordered
23 20 253:1 / /home rw,nosuid,nodev,noatime - ext3 /dev/mapper/vg_sys_r1-home rw,errors=continue,user_xattr,acl,barrier=1,data=ordered
24 20 253:2 / /opt rw,noatime - ext3 /dev/mapper/vg_sys_r1-opt rw,errors=continue,user_xattr,acl,barrier=1,data=ordered
25 24 253:0 / /opt/bmc rw,noatime - ext3 /dev/mapper/vg_sys_r1-bmc rw,errors=continue,user_xattr,acl,barrier=1,data=ordered
26 20 253:5 / /tmp rw,noatime - ext3 /dev/mapper/vg_sys_r1-tmp rw,errors=continue,user_xattr,acl,barrier=1,data=ordered
27 20 253:6 / /var rw,nodev,noexec,noatime - ext3 /dev/mapper/vg_sys_r1-var rw,errors=continue,user_xattr,acl,barrier=1,data=ordered
28 20 253:8 / /priv rw,relatime - ext3 /dev/mapper/vg00r1-priv rw,errors=continue,barrier=1,data=ordered
29 20 253:7 / /priv2 rw,relatime - ext3 /dev/mapper/vg_sys_r1-priv2 rw,errors=continue,user_xattr,acl,barrier=1,data=ordered
30 14 0:18 / /sys/fs/fuse/connections rw,relatime - fusectl fusectl rw
31 27 0:3 / /var/lib/ntp/proc ro,nosuid,nodev,relatime - proc none rw
32 20 0:19 / /cgroup/cpu rw,relatime - cgroup cgroup rw,cpu
33 20 0:20 / /net rw,relatime - autofs /etc/mount.map rw,fd=6,pgrp=6987,timeout=30,minproto=5,maxproto=5,indirect
34 18 0:21 / /proc/fs/nfsd rw,relatime - nfsd nfsd rw
35 27 0:22 / /var/lib/nfs/rpc_pipefs rw,relatime - rpc_pipefs rpc_pipefs rw

/proc/cgroups
------------------------
#subsys_name hierarchy num_cgroups enabled
cpuset 0 1 1
cpu 1 1 1
cpuacct 0 1 1
memory 0 1 1
devices 0 1 1
freezer 0 1 1
net_cls 0 1 1
blkio 0 1 1
perf_event 0 1 1

/proc/self/cgroup
------------------------
1:cpu:/
24-02-2021
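
The listing above shows only the cpu controller attached to a hierarchy (/proc/self/cgroup contains just "1:cpu:/"), which is the situation in which the memory controller object ends up null. As a rough illustration of how such a system could be recognised, here is a minimal sketch that collects controller names from /proc/self/cgroup-style lines. The class CgroupControllerCheck and its method controllersOf are hypothetical names chosen for this example; this is not the JDK's own detection code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of the JDK): collects cgroup v1 controller names
// from lines in the /proc/self/cgroup format "hierarchy-ID:controller-list:path".
class CgroupControllerCheck {

    static Set<String> controllersOf(List<String> procSelfCgroupLines) {
        Set<String> controllers = new HashSet<>();
        for (String line : procSelfCgroupLines) {
            String[] fields = line.split(":", 3);
            if (fields.length < 3 || fields[1].isEmpty()) {
                continue; // malformed line, or an entry with an empty controller list
            }
            // Several controllers can share one hierarchy, e.g. "cpu,cpuacct".
            controllers.addAll(Arrays.asList(fields[1].split(",")));
        }
        return controllers;
    }

    public static void main(String[] args) {
        // Input taken from the /proc/self/cgroup contents reported in this issue.
        Set<String> found = controllersOf(List.of("1:cpu:/"));
        System.out.println("cpu attached:    " + found.contains("cpu"));
        System.out.println("memory attached: " + found.contains("memory"));
    }
}
```

On a live system the input lines could be read with java.nio.file.Files.readAllLines from /proc/self/cgroup instead of being hard-coded.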

FWIW, I have JDK-8254001 in the pipeline, and once that is in I plan to add a regression test for this issue. See my WIP patch here: https://bugs.openjdk.java.net/browse/JDK-8254001?focusedCommentId=14399868&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14399868
24-02-2021

[~mbaesken] Do you know how to reproduce? Could you provide us with the following info:
a) What code is triggering the issue? I'm guessing OperatingSystemMXBean is involved, but would like to confirm.
b) What do the relevant cgroup files look like on the affected system: /proc/cgroups, /proc/self/cgroup and /proc/self/mountinfo?
Thanks!
24-02-2021

Looks like we are facing the issue too in OpenJDK11 on an older SLES11-based machine (where memory is null).
> Error: Unexpected exception occurred! java.lang.NullPointerException
> java.lang.NullPointerException
>     at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryAndSwapLimit(Metrics.java:484)
24-02-2021

The full stack trace reported by my colleague when running the jtreg test with OpenJDK11 was:
Error: Unexpected exception occurred! java.lang.NullPointerException
java.lang.NullPointerException
    at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryAndSwapLimit(Metrics.java:484)
    at jdk.management/com.sun.management.internal.OperatingSystemImpl.getTotalSwapSpaceSize(OperatingSystemImpl.java:57)
    at com.sun.javatest.regtest.config.OS.<init>(OS.java:160)
    at com.sun.javatest.regtest.config.OS.current(OS.java:59)
    at com.sun.javatest.regtest.config.RegressionContext.<init>(RegressionContext.java:77)
    at com.sun.javatest.regtest.config.RegressionContext.getDefault(RegressionContext.java:52)
    at com.sun.javatest.regtest.config.RegressionTestFinder.<init>(RegressionTestFinder.java:93)
    at com.sun.javatest.regtest.config.RegressionTestSuite.createTestFinder(RegressionTestSuite.java:100)
    at com.sun.javatest.regtest.config.RegressionTestSuite.<init>(RegressionTestSuite.java:82)
    at com.sun.javatest.regtest.config.RegressionTestSuite.open(RegressionTestSuite.java:65)
    at com.sun.javatest.regtest.config.TestManager.getTestSuites(TestManager.java:165)
    at com.sun.javatest.regtest.tool.Tool.run(Tool.java:1127)
    at com.sun.javatest.regtest.tool.Tool.run(Tool.java:1078)
    at com.sun.javatest.regtest.tool.Tool.main(Tool.java:147)
    at com.sun.javatest.regtest.Main.main(Main.java:58)
24-02-2021
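
The stack trace above reaches the cgroup metrics code through OperatingSystemImpl.getTotalSwapSpaceSize, so a standalone probe that calls the same public API may be a simpler way to exercise that path than running jtreg. The following is a minimal sketch; SwapSizeProbe is a hypothetical class name, and whether the call actually fails depends on the JDK build and on the cgroup setup of the machine.

```java
import java.lang.management.ManagementFactory;
import com.sun.management.OperatingSystemMXBean;

// Hypothetical probe: calls the same public API that jtreg's OS class calls
// in the stack trace (com.sun.management.OperatingSystemMXBean).
class SwapSizeProbe {

    static long totalSwap() {
        OperatingSystemMXBean os =
                ManagementFactory.getPlatformMXBean(OperatingSystemMXBean.class);
        // On an affected JDK and system this call is where the
        // NullPointerException from the cgroup v1 Metrics class surfaces.
        return os.getTotalSwapSpaceSize();
    }

    public static void main(String[] args) {
        System.out.println("Total swap space (bytes): " + totalSwap());
    }
}
```

Running the probe once normally and once with -XX:-UseContainerSupport would show whether the container support code path is the one that fails.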

JDK-8253797 has been integrated since then and adds code equivalent to JDK-8250984 for the cgroup v2 branch. Has anyone investigated whether the cgroups v2 code is free of the same issue?
28-01-2021

Changeset: abc4300d
Author: Poonam Bajaj <poonam@openjdk.org>
Date: 2021-01-28 15:07:03 +0000
URL: https://git.openjdk.java.net/jdk/commit/abc4300d
28-01-2021

After the backport of JDK-8250984, there are places where memory.isSwapEnabled() is called. For example:

    public long getMemoryAndSwapFailCount() {
        if (!memory.isSwapEnabled()) {
            return getMemoryFailCount();
        }
        return SubSystem.getLongValue(memory, "memory.memsw.failcnt");
    }

But memory can be null on machines that have cgroup entries for CPU but not for memory. In that case the call to memory.isSwapEnabled() throws a NullPointerException.
27-01-2021
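
The comments on this issue describe the fix as adding null checks before the controller is used. The following self-contained sketch illustrates that pattern on a simplified model. It is not the actual patch: the class NullGuardSketch, the nested MemoryController, its read method, and the UNAVAILABLE sentinel of -1 are all assumptions made for illustration, and the value the real code returns when the controller is absent is not stated in this report.

```java
import java.util.Map;

// Simplified, hypothetical model of a metrics class whose memory controller
// may be absent. None of these types are the JDK's internal classes.
class NullGuardSketch {

    static final long UNAVAILABLE = -1L; // assumed sentinel for "no value"

    static final class MemoryController {
        private final boolean swapEnabled;
        private final Map<String, Long> values;

        MemoryController(boolean swapEnabled, Map<String, Long> values) {
            this.swapEnabled = swapEnabled;
            this.values = values;
        }

        boolean isSwapEnabled() {
            return swapEnabled;
        }

        long read(String key) {
            return values.getOrDefault(key, UNAVAILABLE);
        }
    }

    private final MemoryController memory; // null when no memory controller is mounted

    NullGuardSketch(MemoryController memory) {
        this.memory = memory;
    }

    long getMemoryFailCount() {
        return memory == null ? UNAVAILABLE : memory.read("memory.failcnt");
    }

    long getMemoryAndSwapFailCount() {
        if (memory == null) {
            return UNAVAILABLE; // guard: never dereference a missing controller
        }
        if (!memory.isSwapEnabled()) {
            return getMemoryFailCount();
        }
        return memory.read("memory.memsw.failcnt");
    }

    public static void main(String[] args) {
        // Models the affected machines: cpu controller only, memory is null.
        System.out.println(new NullGuardSketch(null).getMemoryAndSwapFailCount());
    }
}
```

With the guard in place, a machine without a memory controller gets the sentinel value back instead of a NullPointerException.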