JDK-8249779 : SA fails to properly discover system libraries for OSX core files
  • Type: Bug
  • Component: hotspot
  • Sub-Component: svc-agent
  • Affected Version: 16
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • OS: os_x
  • CPU: x86_64
  • Submitted: 2020-07-21
  • Updated: 2020-07-24
Other: tbd (Unresolved)
Description
While working on JDK-8247515, which involved fixing issues that SA has with setting up symbol tables for OSX core files (and then doing lookups), I realized that the SA code that tries to discover all dylibs in the core file is broken.

The algorithm it uses is to look at each LC_SEGMENT_64 in the core file, checking the start of each to see if it represents a dylib, which is indicated by MH_MAGIC_64 at the start. Once a mach-o file is found this way, it is searched for an LC_ID_DYLIB entry, which provides the path to the dylib. Once one mach-o file has been found in a segment, SA will then search the rest of the segment page by page looking for more mach-o files in the same segment.
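
To make the scan described above concrete, here is a minimal Python sketch of the same idea. It is not the SA code (which is C, in ps_core.c); the function names and the 4096-byte page size are assumptions made for illustration, and it only handles little-endian 64-bit images.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF   # magic of a 64-bit little-endian mach-o image
LC_ID_DYLIB = 0xD          # load command carrying the dylib's own path
HEADER_SIZE = 32           # sizeof(struct mach_header_64)
PAGE_SIZE = 4096           # assumed page size for the page-by-page scan


def has_macho_magic(data, offset):
    """True if a 64-bit mach-o magic number sits at the given offset."""
    if offset + 4 > len(data):
        return False
    return struct.unpack_from("<I", data, offset)[0] == MH_MAGIC_64


def dylib_id_at(data, offset):
    """Return the LC_ID_DYLIB path of the mach-o image at offset, or None."""
    if offset + HEADER_SIZE > len(data) or not has_macho_magic(data, offset):
        return None
    # mach_header_64: magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags, reserved
    _, _, _, _, ncmds, sizeofcmds, _, _ = struct.unpack_from("<IiiIIIII", data, offset)
    pos = offset + HEADER_SIZE
    end = min(pos + sizeofcmds, len(data))
    for _ in range(ncmds):
        if pos + 8 > end:
            break
        cmd, cmdsize = struct.unpack_from("<II", data, pos)
        if cmdsize < 8 or pos + cmdsize > end:
            break  # malformed load command, stop rather than run off the end
        if cmd == LC_ID_DYLIB and cmdsize >= 12:
            # The name offset is relative to the start of the load command.
            (name_off,) = struct.unpack_from("<I", data, pos + 8)
            raw = data[pos + name_off:pos + cmdsize]
            return raw.split(b"\0", 1)[0].decode("utf-8", "replace")
        pos += cmdsize
    return None


def scan_segment(data, seg_off, seg_size):
    """Mimic the described scan: only segments that start with a mach-o
    image are searched, then every page is probed for further images."""
    if not has_macho_magic(data, seg_off):
        return []  # this is how libs in such segments end up being skipped
    found = []
    for off in range(seg_off, seg_off + seg_size, PAGE_SIZE):
        path = dylib_id_at(data, off)
        if path is not None:
            found.append((off, path))
    return found
```

Here `data` stands for the raw bytes of the core file and `seg_off`/`seg_size` for the file offset and size taken from an LC_SEGMENT_64 command. Dropping the early return in `scan_segment` gives the brute force variant that probes every page of every segment.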

This approach sort of works, and at least finds the JVM libs reliably. However, it has a few problems with system libs. The first is that some are in segments that don't start with a mach-o file, and therefore end up getting skipped. Changing the algorithm to search all segments fixed this problem, but it is very slow because every page of the core file ends up getting paged in. It also seemed to find a few false positives (things that were not mach-o files) and also found dylibs that don't seem to be relevant. This was determined by looking at the output of lldb "image list", which showed about 239 libraries, yet the brute force SA approach found about 3x this number.

The other issue is that when there are multiple system libraries in the same segment, there appears to be no way to determine where each starts and ends within the segment. So SA just registers them all as having the same start address and the same size (the address and size of the segment). This means that when an address -> symbol lookup is done, SA just ends up using the first dylib it found in the segment, which is likely the wrong one. The result of the lookup ends up being the very last symbol in the dylib, plus a very large offset.
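
The failure mode can be reproduced with a small Python sketch. The library names, symbol names and addresses below are all made up for illustration, and the lookup functions are a simplified stand-in for what SA does, not its actual code.

```python
import bisect


def find_lib(libs, addr):
    """Return the first registered (name, base, size) entry containing addr."""
    for name, base, size in libs:
        if base <= addr < base + size:
            return name
    return None


def nearest_symbol(symbols, base, addr):
    """symbols is a list of (offset, name) sorted by offset. Return the
    closest symbol at or below addr as (name, distance), or None."""
    offsets = [off for off, _ in symbols]
    i = bisect.bisect_right(offsets, addr - base) - 1
    if i < 0:
        return None
    off, name = symbols[i]
    return name, addr - base - off


# Two dylibs found in one segment, both registered with the segment's range.
SEG_BASE, SEG_SIZE = 0x7FFF20000000, 0x400000
libs = [("libA.dylib", SEG_BASE, SEG_SIZE), ("libB.dylib", SEG_BASE, SEG_SIZE)]
lib_a_symbols = [(0x1000, "_a_first"), (0x2000, "_a_last")]

# An address that really belongs to libB, far beyond libA's true extent.
addr = SEG_BASE + 0x300010
```

With these inputs `find_lib(libs, addr)` returns `"libA.dylib"`, because both entries cover the address and libA was registered first, and `nearest_symbol(lib_a_symbols, SEG_BASE, addr)` returns `("_a_last", 0x2FE010)`, i.e. the last symbol of the wrong library plus a large offset.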

Since JVM libs always seem to each be in their own segment and therefore are always found, I think we can continue with the approach for finding them. However, we need a better way for the system libs. For now as part of JDK-8247515 I am disabling the looking up of system libs. This is done by ignoring any LC_ID_DYLIB that is found that has a relative path. If you look for references to this CR in the macosx ps_core.c, you can see where the relevant changes were made.

As for how to properly fix this issue, I'm unsure. I did a lot of investigating into what is in the core file (just LC_THREAD and LC_SEGMENT_64 load commands) and what is in each LC_SEGMENT_64 (no sections, which was surprising). Basically I couldn't find anything that looked like it could be used to assemble a link map of sorts. But certainly it must be possible, as I was able to get lldb to provide a link map with the "image list" command. I therefore suggest looking at the lldb source for the answer.
Comments
The first LC_SEGMENT_64 of the core file is mapped to the mach-o file for the executable. For the "java" executable, it had the following Load Commands:

  LC_SEGMENT_64:
  LC_SEGMENT_64:
  LC_SEGMENT_64:
  LC_SEGMENT_64:
  LC_DYLD_INFO:
  LC_SYMTAB:
  LC_DYSYMTAB:
  LC_LOAD_DYLINKER:
  LC_UUID:
  LC_VERSION_MIN_MACOSX:
  LC_SOURCE_VERSION:
  LC_MAIN:
  LC_LOAD_DYLIB: /usr/lib/libz.1.dylib
  LC_LOAD_DYLIB: @rpath/libjli.dylib
  LC_LOAD_DYLIB: /System/Library/Frameworks/Cocoa.framework/Versions/A/Cocoa
  LC_LOAD_DYLIB: /System/Library/Frameworks/Security.framework/Versions/A/Security
  LC_LOAD_DYLIB: /System/Library/Frameworks/ApplicationServices.framework/Versions/A/ApplicationServices
  LC_LOAD_DYLIB: /usr/lib/libSystem.B.dylib
  LC_RPATH: @loader_path/.
  LC_RPATH: @loader_path/../lib
  LC_FUNCTION_STARTS:
  LC_DATA_IN_CODE:

So this answers a couple of questions and raises a couple. What it answers is how to deal with @rpath libraries. They are found relative to the paths given by LC_RPATH. Note that to find other user JNI libraries, I suspect you need to look in libjvm.dylib for the needed LC_RPATH. It also answers how to find the list of dylibs that are referenced. Note these are just those referenced by "java". To get the full list of 230+ libraries that lldb is seeing, you probably need to follow library dependencies. For example, libjli.dylib most likely depends on libjvm.dylib and libjava.dylib, and these on a bunch more (both JDK and system libraries).

What it raises is how to determine @loader_path. I'm not sure where that value comes from. Possibly it is embedded somewhere in the LC_DYLD_INFO, which is a complex stream of data to parse, and I'm not even certain what is in it. Also, it doesn't seem to give any indication of where the libraries are loaded into memory. I think the presence of the LC_LOAD_DYLIBs is mostly for the loader to load the dylibs, but I'm not so sure there is any indication of where they actually end up in memory. Certainly the LC_LOAD_DYLIB doesn't specify the load address. It just gives the dylib name.
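
A minimal Python sketch of the @rpath expansion discussed above follows. It rests on one assumption that this report does not confirm: that @loader_path stands for the directory of the binary containing the load command, which is how the dyld documentation describes it. The function names and the /jdk/bin/java path are hypothetical, and other tokens such as @executable_path are not handled.

```python
import posixpath


def expand_rpaths(rpaths, binary_path):
    """Expand @loader_path in LC_RPATH entries.

    Assumption: @loader_path is the directory holding the binary whose
    load commands are being read (binary_path)."""
    loader_dir = posixpath.dirname(binary_path)
    expanded = []
    for rp in rpaths:
        if rp == "@loader_path" or rp.startswith("@loader_path/"):
            rp = loader_dir + rp[len("@loader_path"):]
        expanded.append(posixpath.normpath(rp))
    return expanded


def rpath_candidates(install_name, rpaths, binary_path):
    """Return candidate full paths for an LC_LOAD_DYLIB / LC_ID_DYLIB name.

    Names that do not start with @rpath/ are returned unchanged."""
    prefix = "@rpath/"
    if not install_name.startswith(prefix):
        return [install_name]
    tail = install_name[len(prefix):]
    return [posixpath.join(d, tail) for d in expand_rpaths(rpaths, binary_path)]


# LC_RPATH values taken from the "java" load commands listed above.
JAVA_RPATHS = ["@loader_path/.", "@loader_path/../lib"]
candidates = rpath_candidates("@rpath/libjli.dylib", JAVA_RPATHS, "/jdk/bin/java")
```

Under that assumption the candidates for @rpath/libjli.dylib are /jdk/bin/libjli.dylib and /jdk/lib/libjli.dylib; a caller would take the first one that exists on disk. This only produces file paths and says nothing about where each library ends up in memory.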
24-07-2020

As mentioned above, looking at lldb will probably explain how to properly parse the core file for the needed library info. A few things to note: (1) lldb is able to find the executable path and mapping without being told the path to the executable (SA needs the path). (2) lldb deals with relative paths better. When SA finds an LC_ID_DYLIB with a relative path, such as @rpath/libjvm.dylib, it searches in various places, such as JAVA_HOME and DYLD_LIBRARY_PATH, so it can create a full path and locate the actual dylib. It also looks in the lib directory relative to the executable. This is clumsy and makes it difficult, for example, to locate user JNI libraries. lldb seems to find the full paths to all these libs in some other way. (3) lldb finds all the system libraries. SA hasn't figured out how to do this yet.
24-07-2020