JDK-8275703 : System.loadLibrary fails on Big Sur for libraries hidden from filesystem
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.lang
  • Affected Version: 17
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: os_x
  • CPU: x86_64
  • Submitted: 2021-10-19
  • Updated: 2022-09-12
  • Resolved: 2021-10-28
Fix versions (all with status Fixed):
  • JDK 11: 11.0.15-oracle
  • JDK 13: 13.0.12
  • JDK 15: 15.0.8
  • JDK 17: 17.0.2
  • JDK 18: 18 b22
  • JDK 7: 7u351
  • JDK 8: 8u331
Description
ADDITIONAL SYSTEM INFORMATION :
MacBook Pro (15-inch, 2018)
MacOS Big Sur 11.6
OpenJDK 17, OpenJDK 11, Oracle JDK 17, Oracle JDK 18-ea

A DESCRIPTION OF THE PROBLEM :
macOS Big Sur no longer ships copies of the system libraries on the filesystem [1], so attempts to load such a native library via System.loadLibrary no longer work.

Details:
- The new dynamic linker cache introduced by macOS Big Sur breaks existing code that checks for the existence of a library file on the filesystem before attempting to load it with dlopen(...). According to the Big Sur release notes in [1]:
   "Code that attempts to check for dynamic library presence by looking for a file at a path or enumerating a directory will fail. Instead, check for library presence by attempting to dlopen() the path, which will correctly check for the library in the cache."

Tracing the execution of System.loadLibrary shows that the offending check happens at [2] (tested on OpenJDK 11 and 17, Oracle JDK 17 and 18-ea) where File.exists(...) is used to check whether the library file is present on the filesystem before attempting to load it.  As a result, valid dynamic libraries that otherwise open fine via dlopen(...) fail with UnsatisfiedLinkError in current versions of the JDK.
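
The difference between the two lookup strategies can be sketched in Java as follows. This is a simplified illustration under stated assumptions, not the JDK's actual code in NativeLibraries.java: the class name, both method names, and the `loader` predicate (a stand-in for the native dlopen call) are hypothetical.

```java
import java.io.File;
import java.util.Optional;
import java.util.function.Predicate;

public class LibraryLookupSketch {

    // Check-then-load: a candidate is handed to the loader only if the file
    // is visible on the filesystem. A library that lives only in the dynamic
    // linker cache is skipped without ever being tried.
    static Optional<String> findWithExistenceCheck(String[] dirs, String fileName,
                                                   Predicate<String> loader) {
        for (String dir : dirs) {
            File candidate = new File(dir, fileName);
            if (candidate.exists() && loader.test(candidate.getPath())) {
                return Optional.of(candidate.getPath());
            }
        }
        return Optional.empty();
    }

    // Load-and-see: every candidate path is handed to the loader, which
    // decides whether the library can be opened.
    static Optional<String> findByAttemptingLoad(String[] dirs, String fileName,
                                                 Predicate<String> loader) {
        for (String dir : dirs) {
            String candidate = new File(dir, fileName).getPath();
            if (loader.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical directory that is assumed not to exist on disk. The
        // stand-in loader accepts any path, imitating a dlopen() that
        // resolves the library from the dynamic linker cache.
        String[] dirs = { "/hypothetical/linker-cache-only" };
        Predicate<String> loader = path -> true;

        System.out.println("with existence check: "
                + findWithExistenceCheck(dirs, "libBLAS.dylib", loader).isPresent());
        System.out.println("by attempting load:   "
                + findByAttemptingLoad(dirs, "libBLAS.dylib", loader).isPresent());
    }
}
```

Provided the hypothetical directory does not exist, the first strategy finds nothing while the second returns the candidate path, which mirrors the failure described above.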

Background:
- I discovered this issue while trying to get the macOS Accelerate vecLib BLAS implementation to load via Netlib in Spark; Netlib kept complaining that it could not find the native BLAS libraries. Inspecting the Netlib code [3] showed that it uses System.loadLibrary to load the appropriately configured BLAS library. Further tests calling System.loadLibrary from "jshell" showed the problem to be in its implementation. I confirmed that the dlopen(...) call works with "hidden" libraries by writing a small C application that invokes dlopen and reports success or failure. Inspection of the JDK code for System.loadLibrary eventually led to the offending file existence check in [2].

[1] https://developer.apple.com/documentation/macos-release-notes/macos-big-sur-11_0_1-release-notes/#Kernel
[2] https://github.com/openjdk/jdk/blob/895e2bd7c0bded5283eca8792fbfb287bb75016b/src/java.base/share/classes/jdk/internal/loader/NativeLibraries.java#L163
[3] https://github.com/luhenry/netlib/blob/20ecbd98425ea7baae5ebd70d392c9eb206dfb26/blas/src/main/jdk17/dev/ludovic/netlib/blas/ForeignLinkerBLAS.java#L55

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
On a macOS Big Sur installation with Xcode and the developer tools installed, install any Java version from the list in "ADDITIONAL SYSTEM INFORMATION", then:

capitanu@leo:~/Downloads/java/jdk-18.jdk/Contents/Home$ export JAVA_HOME=$(pwd)
capitanu@leo:~/Downloads/java/jdk-18.jdk/Contents/Home$ export LD_LIBRARY_PATH='/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A'
capitanu@leo:~/Downloads/java/jdk-18.jdk/Contents/Home$ export JAVA_LIBRARY_PATH='/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A'
capitanu@leo:~/Downloads/java/jdk-18.jdk/Contents/Home$ jshell
|  Welcome to JShell -- Version 18-ea
|  For an introduction type: /help intro

jshell> System.loadLibrary("blas")
|  Exception java.lang.UnsatisfiedLinkError: no blas in java.library.path: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A:/Users/capitanu/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
|        at ClassLoader.loadLibrary (ClassLoader.java:2429)
|        at Runtime.loadLibrary0 (Runtime.java:818)
|        at System.loadLibrary (System.java:1998)
|        at (#1:1)

jshell> System.load("/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib")
|  Exception java.lang.UnsatisfiedLinkError: Can't load library: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
|        at ClassLoader.loadLibrary (ClassLoader.java:2393)
|        at Runtime.load0 (Runtime.java:755)
|        at System.load (System.java:1962)
|        at (#2:1)

To prove that dlopen works with "hidden" libraries, create a file dlopen.c with contents:

#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>

int main(int argc, char** argv)
{
    void *handle;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s <lib_filename_or_full_path>\n", argv[0]);
        return EXIT_FAILURE;
    }

    printf("Attempting to load library '%s'...\n", argv[1]);

    handle = dlopen(argv[1], RTLD_LAZY);

    if (handle == NULL) {
        fprintf(stderr, "Unable to load library!\n");
        return EXIT_FAILURE;
    }

    printf("Library successfully loaded!\n");

    return dlclose(handle);
}

Then:
gcc -o dlopen dlopen.c
./dlopen libBLAS.dylib
and
./dlopen /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
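
For a side-by-side comparison, the same probe can be written against System.load. This is a minimal sketch; the class name LoadProbe and the method tryLoad are hypothetical and not part of the original report. System.load only accepts absolute paths, so unlike the C program it cannot be given a bare file name.

```java
public class LoadProbe {

    // Returns true if System.load accepts the given absolute path.
    static boolean tryLoad(String absolutePath) {
        try {
            System.load(absolutePath);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // Thrown when the library cannot be loaded, and also when the
            // path is not absolute.
            System.err.println("Unable to load library: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java LoadProbe.java <full_path_to_library>");
            return;
        }
        System.out.println("Attempting to load library '" + args[0] + "'...");
        if (tryLoad(args[0])) {
            System.out.println("Library successfully loaded!");
        }
    }
}
```

Running it with the full libBLAS.dylib path used above exercises the same System.load call as the jshell session.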



EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The dynamic library should load correctly via System.loadLibrary and System.load, even for "hidden" libraries that are not visible on the filesystem but can be successfully loaded via dlopen.

ACTUAL -
|  Exception java.lang.UnsatisfiedLinkError: no blas in java.library.path: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A:/Users/capitanu/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
|        at ClassLoader.loadLibrary (ClassLoader.java:2429)
|        at Runtime.loadLibrary0 (Runtime.java:818)
|        at System.loadLibrary (System.java:1998)
|        at (#1:1)

---------- BEGIN SOURCE ----------
System.load("/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib")
---------- END SOURCE ----------

FREQUENCY : always



Comments
Looks like the change has the undesired side effect on macOS that in case of some failures (e.g. trying to load a macOS x86_64 shared lib on macOS aarch64) the exception message got worse.
Old exception message, with details about the error from HotSpot os::dll_load:
Caused by: java.lang.ExceptionInInitializerError: Exception java.lang.ExceptionInInitializerError: JCo initialization failed with java.lang.UnsatisfiedLinkError: /testing/jco3/macOsx64/libsapjco3.dylib: dlopen(/testing/jco3/macOsx64/libsapjco3.dylib, 1): no suitable image found. Did find: /testing/jco3/macOsx64/libsapjco3.dylib: mach-o, but wrong architecture /testing/jco3/macOsx64/libsapjco3.dylib: mach-o, but wrong architecture [in thread "main"]
New exception message, with the details "thrown away":
Caused by: java.lang.ExceptionInInitializerError: Exception java.lang.ExceptionInInitializerError: JCo initialization failed with java.lang.UnsatisfiedLinkError: Can't load library: /testing/jco3/macOsx64/libsapjco3.dylib [in thread "main"]
08-09-2022

A pull request was submitted for review. URL: https://git.openjdk.org/jdk13u-dev/pull/358 Date: 2022-06-13 14:52:22 +0000
16-06-2022

A pull request was submitted for review. URL: https://git.openjdk.org/jdk15u-dev/pull/222 Date: 2022-06-13 14:38:51 +0000
16-06-2022

Fix request (13u, 15u) - will label after testing is completed.
This issue was reproduced on macOS Monterey 12.0.1. After applying the patch it is eliminated. The original patch did not apply cleanly (for jdk15u the difference is in 1 file; for jdk13u the patch is identical to jdk11u).
14-06-2022

verified
13-04-2022

Critical fix request [11u]
I backport this for parity with 11.0.15-oracle. I had to do some rework because the code touched has been reengineered. This is P3 and in 11.0.15-oracle, so I would like to get it into 11.0.15. Oracle even backported it to 7! SAP nightly testing passed.
08-03-2022

A pull request was submitted for review. URL: https://git.openjdk.java.net/jdk11u/pull/30 Date: 2022-03-08 07:59:10 +0000
08-03-2022

Fix request to backport to jdk17u:
On macOS 11.x, system libraries are loaded from the dynamic linker cache and are no longer present on the filesystem. This does not impact JNI libraries, which are not in the dynamic linker cache. However, existing code using `System::loadLibrary` to load a system library no longer works, since the implementation checks for file existence before doing dlopen. The alternative is for the existing code to change and load a system library in native code using dlopen instead. For compatibility with this long-standing behavior, request to fix this in jdk17u. The backport is the exact same change as JDK-8275703.
17-11-2021

Changeset: 309acbf0 Author: Mandy Chung <mchung@openjdk.org> Date: 2021-10-28 15:27:26 +0000 URL: https://git.openjdk.java.net/jdk/commit/309acbf0e86a0d248294503fccc7a936fa0a846e
28-10-2021

The JDK side determines the path of the given library name and checks if it exists before calling JVM_LoadLibrary. A potential fix for MacOS might be to skip the file presence check and pass it to JVM. However, we should look into the performance implication.
21-10-2021
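
The potential fix described in the comment above (skipping the file presence check on macOS) could be gated on the operating system version, roughly as sketched below. This is an illustration under assumptions, not the committed change: the class and method names are hypothetical, and treating macOS major version 11 as the first release with the dynamic linker cache follows the Big Sur release notes cited in the description. On every other platform the sketch keeps the existence check.

```java
public class PresenceCheckPolicy {

    // Returns true if checking for the file on disk before loading is still
    // meaningful, false if the library may live only in the dynamic linker
    // cache (assumed here: macOS 11 and later).
    static boolean shouldCheckFileExists(String osName, String osVersion) {
        if (osName == null || !osName.startsWith("Mac")) {
            return true;
        }
        return parseMajor(osVersion) < 11;
    }

    // Extracts the leading major number from a version string such as "11.6".
    static int parseMajor(String osVersion) {
        if (osVersion == null) {
            return 0;
        }
        int dot = osVersion.indexOf('.');
        String major = dot < 0 ? osVersion : osVersion.substring(0, dot);
        try {
            return Integer.parseInt(major);
        } catch (NumberFormatException e) {
            return 0; // unknown format: keep the file-existence check
        }
    }

    public static void main(String[] args) {
        String osName = System.getProperty("os.name");
        String osVersion = System.getProperty("os.version");
        System.out.println(osName + " " + osVersion + " -> check file first: "
                + shouldCheckFileExists(osName, osVersion));
    }
}
```

Deciding this once per JVM rather than per load attempt would keep the cost of the gate itself negligible; any extra cost would come from failed load attempts on paths that do not exist, which is the performance question raised in the comment.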

Moving to core-libs as this is handled on the JDK side, the VM just does a dlopen for the path passed from Java. There was a related discussion of this behaviour recently in regard to the native library loading API used by Panama.
21-10-2021

Issue is reproduced. An exception is thrown when trying to load the dynamic library via System.loadLibrary and System.load, but the library can be successfully loaded via dlopen.
macOS Big Sur 11.6
JDK 17.0.1: Fail
jshell> System.loadLibrary("blas")
Output:
| Exception java.lang.UnsatisfiedLinkError: no blas in java.library.path: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A:/Users/sswsharm/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
jshell> System.load("/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib")
Output:
| Exception java.lang.UnsatisfiedLinkError: Can't load library: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
The libraries are successfully loaded via dlopen:
./dlopen libBLAS.dylib
Output:
Attempting to load library 'libBLAS.dylib'...
Library successfully loaded!
./dlopen /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
Output:
Attempting to load library '/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib'...
Library successfully loaded!
21-10-2021