JDK-8314550 : [macosx-aarch64] serviceability/sa/TestJmapCore.java fails with "sun.jvm.hotspot.debugger.UnmappedAddressException: 801000800"
  • Type: Bug
  • Component: hotspot
  • Sub-Component: svc-agent
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • OS: os_x
  • CPU: aarch64
  • Submitted: 2023-08-17
  • Updated: 2023-10-12
  • Resolved: 2023-08-25
Fixed in: JDK 22 (22 b13)
Description
On OSX (and I believe only aarch64), the following failure has been entered multiple times into JDK-8270202, which was intended to just cover ZGC failures, so I'm creating this CR to cover this issue:

 sun.jvm.hotspot.debugger.UnmappedAddressException: 7001000800
at jdk.hotspot.agent/sun.jvm.hotspot.debugger.PageCache.checkPage(PageCache.java:208)
at jdk.hotspot.agent/sun.jvm.hotspot.debugger.PageCache.getLong(PageCache.java:100)
at jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readCInteger(DebuggerBase.java:356)
at jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readAddressValue(DebuggerBase.java:382)
at jdk.hotspot.agent/sun.jvm.hotspot.debugger.bsd.BsdDebuggerLocal.readAddress(BsdDebuggerLocal.java:421)
at jdk.hotspot.agent/sun.jvm.hotspot.debugger.bsd.BsdAddress.getAddressAt(BsdAddress.java:73)
at jdk.hotspot.agent/sun.jvm.hotspot.types.basic.BasicTypeDataBase.findDynamicTypeForAddress(BasicTypeDataBase.java:238)
at jdk.hotspot.agent/sun.jvm.hotspot.runtime.VirtualBaseConstructor.instantiateWrapperFor(VirtualBaseConstructor.java:104)
at jdk.hotspot.agent/sun.jvm.hotspot.oops.Metadata.instantiateWrapperFor(Metadata.java:78)
at jdk.hotspot.agent/sun.jvm.hotspot.oops.MetadataField.getValue(MetadataField.java:43)
at jdk.hotspot.agent/sun.jvm.hotspot.oops.MetadataField.getValue(MetadataField.java:40)
at jdk.hotspot.agent/sun.jvm.hotspot.classfile.ClassLoaderData.getKlasses(ClassLoaderData.java:82)
at jdk.hotspot.agent/sun.jvm.hotspot.classfile.ClassLoaderData.classesDo(ClassLoaderData.java:101)
at jdk.hotspot.agent/sun.jvm.hotspot.classfile.ClassLoaderDataGraph.classesDo(ClassLoaderDataGraph.java:84)
at jdk.hotspot.agent/sun.jvm.hotspot.utilities.HeapHprofBinWriter.writeSymbols(HeapHprofBinWriter.java:1206)
at jdk.hotspot.agent/sun.jvm.hotspot.utilities.HeapHprofBinWriter.write(HeapHprofBinWriter.java:454)
at jdk.hotspot.agent/sun.jvm.hotspot.tools.JMap.writeHeapHprofBin(JMap.java:216)
at jdk.hotspot.agent/sun.jvm.hotspot.tools.JMap.run(JMap.java:103)
at jdk.hotspot.agent/sun.jvm.hotspot.tools.Tool.startInternal(Tool.java:278)
at jdk.hotspot.agent/sun.jvm.hotspot.tools.Tool.start(Tool.java:241)
at jdk.hotspot.agent/sun.jvm.hotspot.tools.Tool.execute(Tool.java:134)
at jdk.hotspot.agent/sun.jvm.hotspot.tools.JMap.main(JMap.java:202)
at jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.runJMAP(SALauncher.java:340)
at jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.main(SALauncher.java:500) 
Comments
Changeset: d0cc0439
Author: Chris Plummer <cjplummer@openjdk.org>
Date: 2023-08-25 21:14:33 +0000
URL: https://git.openjdk.org/jdk/commit/d0cc0439c07ad0cca611e1999eda37f20c5a99d0
25-08-2023

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/15423 Date: 2023-08-24 23:31:56 +0000
24-08-2023

Hmm, I think a better fix in CDS would be: if -XX:+AlwaysPreTouch is specified, we should pre-touch every page in the mmap'ed regions to make sure they are committed. In filemap.cpp, this can be done by changing the read_only parameter to os::map_memory to false when AlwaysPreTouch is true. After os::map_memory returns, we call os::pretouch_memory() on the returned memory.
21-08-2023
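
As an illustration of the pre-touch idea in the comment above, here is a minimal Java sketch. It is only an analogy and not HotSpot code: the class `PretouchSketch`, its methods, and the scratch file are hypothetical, and the page size is passed in as a parameter because it differs between platforms. The sketch maps a file copy-on-write (roughly the counterpart of passing read_only = false) and then reads and rewrites one byte per page so that every page has to be committed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class PretouchSketch {
    // Touch one byte in every page of the region. Writing back the byte that
    // was just read leaves the contents unchanged but dirties the page.
    static int pretouch(ByteBuffer region, int pageSize) {
        int touched = 0;
        for (int off = 0; off < region.limit(); off += pageSize) {
            region.put(off, region.get(off));
            touched++;
        }
        return touched;
    }

    // Hypothetical demo: create a zero-filled scratch file, map it
    // copy-on-write, and pre-touch the mapping. Returns the page count.
    static int demo(int fileSize, int pageSize) {
        try {
            Path tmp = Files.createTempFile("pretouch-sketch", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, new byte[fileSize]);
            // MapMode.PRIVATE requires a channel opened for reading and writing.
            try (FileChannel ch = FileChannel.open(tmp,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.PRIVATE, 0, ch.size());
                return pretouch(buf, pageSize);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(10_000, 4096) + " pages touched");
    }
}
```

Whether touched pages of a file mapping end up in a macosx-aarch64 core file is exactly the open question in this report; the sketch only shows what "touching every page" means.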

[~iklam] I've tried out your fix. Although I can't prove it is actually fixing anything (since I can't reproduce the issue without the fix), it doesn't seem to be causing any issues with our testing. However, I was wondering if the fix should actually go where you've indicated, or if it should have finer-grained control. The fix is in use_windows_memory_mapping(), yet the reason for this change has nothing to do with Windows (it's actually fixing a macosx-aarch64 issue), so this doesn't seem like the right place to put it. Probably the callers of use_windows_memory_mapping() should be the ones checking the AlwaysPreTouch flag. There are 3 calls to use_windows_memory_mapping(). Do they all want this fix, or can it be isolated to just one or two of them?
21-08-2023

I never implemented Ioi's suggestion because I could never reproduce the issue, and I wanted to be sure that the fix was actually working. However, the problem has reproduced in CI a few times since then, so I think it's worth putting the fix in place and seeing if the failure ever turns up again. One other thing to note is that it only seems to happen on 11.* hosts, but we still have some of those. And lastly, this really only fixes the problem for our testing, which does the following in CoreUtils.java:

    public static String getAlwaysPretouchArg(boolean withCore) {
        // macosx-aarch64 has an issue where sometimes the java heap will not be dumped to the
        // core file. Using -XX:+AlwaysPreTouch fixes the problem.
        if (withCore && Platform.isOSX() && Platform.isAArch64()) {
            return "-XX:+AlwaysPreTouch";
        } else {
            return "-XX:-AlwaysPreTouch";
        }
    }

Users will still have this issue unless they run with AlwaysPreTouch, which normally they won't.
17-08-2023

Ioi's response: If specifying -XX:+AlwaysPreTouch is a viable workaround, this can be changed for CDS (metaspaceShared.cpp):

     static bool use_windows_memory_mapping() {
       const bool is_windows = (NOT_WINDOWS(false) WINDOWS_ONLY(true));
       //const bool is_windows = true; // enable this to allow testing the windows mmap semantics on Linux, etc.
    -  return is_windows;
    +  return is_windows || AlwaysPreTouch;
     }

This will basically avoid using mmap on the CDS regions. Instead, we use read() to copy the contents into memory. Maybe this way, macosx-aarch64 will make these regions available in the core file.
17-08-2023
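
The difference between mapping the archive and copying it with read(), as described in the response above, can be sketched in Java. This is an analogy under stated assumptions, not the CDS implementation: `MapVersusRead` and its methods are hypothetical names, and a small scratch file stands in for the archive. Both strategies expose the same bytes; what differs is whether the pages are backed by the file or by memory the process owns.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MapVersusRead {
    // Strategy 1: map the file. Pages are file-backed and faulted in lazily.
    static ByteBuffer loadByMapping(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    // Strategy 2: read() the contents into a buffer the process allocated.
    static ByteBuffer loadByReading(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocateDirect((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) != -1) {
                // keep reading until the buffer is full or end of file
            }
            buf.flip();
            return buf;
        }
    }

    // Hypothetical check: both strategies yield identical contents.
    static boolean sameContents(byte[] data) {
        try {
            Path tmp = Files.createTempFile("archive-sketch", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, data);
            return loadByMapping(tmp).equals(loadByReading(tmp));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameContents(new byte[] {1, 2, 3, 4}));
    }
}
```

The sketch does not show anything about core dump contents; it only makes the two loading strategies concrete.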

My comment from JDK-8270202: Since this test purposely generates a core file, I looked at the hs_err file which was also produced and saw this:

    garbage-first heap total 524288K, used 1070K [0x00000007e0000000, 0x0000000800000000)

So 801000800 is not in the java heap. However, I also saw:

    CDS archive(s) mapped at: [0x0000000800000000-0x0000000800cf0000-0x0000000800cf0000), size 13565952, SharedBaseAddress: 0x0000000800000000, ArchiveRelocationMode: 0.
    Compressed class space mapped at: 0x0000000801000000-0x0000000804400000, reserved size: 54525952
    Narrow klass base: 0x0000000800000000, Narrow klass shift: 0, Narrow klass range: 0x100000000

So 801000800 is in the CDS archive's "Compressed class space". When the SA page cache tries to read it in, it throws:

    sun.jvm.hotspot.debugger.UnmappedAddressException: 801000800

Our test tools also ran jstack on the core files and got SA exceptions like the following:

    sun.jvm.hotspot.debugger.UnmappedAddressException: 80001a3d8
    sun.jvm.hotspot.debugger.UnmappedAddressException: 800048590

These appear to be in the CDS archive. My guess is that these addresses are valid but are not in the core file. See JDK-8293563, which documents issues with macosx-aarch64 not including all of (most of) the java heap in the core dump. Even lldb can't read these pages. This was fixed by using -XX:+AlwaysPreTouch. However, I doubt that helps with the mapped-in CDS archive. We may need CDS to access all pages in the archive when -XX:+AlwaysPreTouch is used.
17-08-2023
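
The address reasoning in the comment above can be checked mechanically. The following Java sketch is not part of the SA or the test; `RegionLookup` is a hypothetical helper, and the region bounds are copied from the hs_err output quoted in that comment, treated as half-open ranges.

```java
final class RegionLookup {
    // A named half-open address range [start, end).
    static final class Region {
        final String name;
        final long start;
        final long end;

        Region(String name, long start, long end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }

        boolean contains(long addr) {
            return addr >= start && addr < end;
        }
    }

    // Bounds taken from the hs_err output quoted in the comment.
    static final Region[] REGIONS = {
        new Region("java heap", 0x7e0000000L, 0x800000000L),
        new Region("CDS archive", 0x800000000L, 0x800cf0000L),
        new Region("compressed class space", 0x801000000L, 0x804400000L),
    };

    // Return the name of the region holding addr, or "unmapped" if none does.
    static String classify(long addr) {
        for (Region r : REGIONS) {
            if (r.contains(addr)) {
                return r.name;
            }
        }
        return "unmapped";
    }

    public static void main(String[] args) {
        long[] failing = {0x801000800L, 0x80001a3d8L, 0x800048590L};
        for (long addr : failing) {
            System.out.printf("0x%x -> %s%n", addr, classify(addr));
        }
    }
}
```

With these bounds, 801000800 falls in the compressed class space, and 80001a3d8 and 800048590 fall in the CDS archive, which matches the comment's conclusion that none of the failing addresses are in the java heap.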