JDK-8287982 : Concurrent implicit attach from native threads crashes VM
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.lang
  • Affected Version: 19
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • CPU: x86_64
  • Submitted: 2022-05-25
  • Updated: 2022-07-27
  • Resolved: 2022-06-22
JDK 19: 19 b28 (Fixed)
JDK 20: 20 (Fixed)
Description
ADDITIONAL SYSTEM INFORMATION :
Linux, JDK 19-ea+23-1706

A DESCRIPTION OF THE PROBLEM :
Using the Foreign Function & Memory API to call a native C++ function that calls back into the JVM concurrently from at least two C++ threads causes the JVM to crash. The same program (modulo some changes to make it compilable with JDK 19) worked on JDK 18.

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fec8d25e842, pid=7636, tid=7655
#
# JRE version: OpenJDK Runtime Environment (19.0+23) (build 19-ea+23-1706)
# Java VM: OpenJDK 64-Bit Server VM (19-ea+23-1706, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x85e842]  java_lang_Thread::set_thread_status(oopDesc*, JavaThreadStatus)+0x22

---------------  S U M M A R Y ------------

Command Line: --enable-preview --enable-native-access=ALL-UNNAMED JavaClass

Host: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz, 8 cores, 7G, Manjaro Linux
Time: Wed May 25 13:58:14 2022 CEST elapsed time: 0.110217 seconds (0d 0h 0m 0s)

---------------  T H R E A D  ---------------

Current thread (0x00007fec18000bf0):  JavaThread "<no-name - thread is attaching>"
[error occurred during error reporting (printing current thread), id 0xb, SIGSEGV (0xb) at pc=0x00007fec8d25e78c]

Stack: [0x00007fec23801000,0x00007fec24000000],  sp=0x00007fec23ffe160,  free space=8180k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x85e842]  java_lang_Thread::set_thread_status(oopDesc*, JavaThreadStatus)+0x22
V  [libjvm.so+0xc4132b]  ObjectMonitor::INotify(JavaThread*)+0x13b
V  [libjvm.so+0xc4275f]  ObjectMonitor::notifyAll(JavaThread*)+0x8f
V  [libjvm.so+0x82dbee]  InstanceKlass::set_initialization_state_and_notify(InstanceKlass::ClassState, JavaThread*)+0xae
V  [libjvm.so+0x836ea8]  InstanceKlass::initialize_impl(JavaThread*)+0x738
V  [libjvm.so+0xad5a9a]  LinkResolver::resolve_static_call(CallInfo&, LinkInfo const&, bool, JavaThread*)+0x16a
V  [libjvm.so+0xad64cb]  LinkResolver::resolve_invoke(CallInfo&, Handle, constantPoolHandle const&, int, Bytecodes::Code, JavaThread*)+0x2bb
V  [libjvm.so+0x8538b7]  InterpreterRuntime::resolve_invoke(JavaThread*, Bytecodes::Code)+0x177
V  [libjvm.so+0x853e37]  InterpreterRuntime::resolve_from_cache(JavaThread*, Bytecodes::Code)+0x37
j  java.lang.Thread.genThreadName()Ljava/lang/String;+13 java.base@19-ea
j  java.lang.Thread.<init>(Ljava/lang/ThreadGroup;Ljava/lang/Runnable;)V+2 java.base@19-ea
v  ~StubRoutines::call_stub 0x00007fec78537cc6
V  [libjvm.so+0x8589f5]  JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, JavaThread*)+0x315
V  [libjvm.so+0x85a01a]  JavaCalls::call_special(JavaValue*, Handle, Klass*, Symbol*, Symbol*, Handle, Handle, JavaThread*)+0x1aa
V  [libjvm.so+0xe11612]  JavaThread::allocate_threadObj(Handle, char const*, bool, JavaThread*)+0xc2
V  [libjvm.so+0x9042aa]  attach_current_thread.part.0+0x19a
V  [libjvm.so+0xe3e02b]  ProgrammableUpcallHandler::on_entry(OptimizedEntryBlob::FrameData*)+0x15b
v  blob 0x00007fec78661b91
C  [libffm.so+0x3514]  f(char const* (*)())+0x1b
C  [libffm.so+0x3eb8]  void std::__invoke_impl<void, void (*)(char const* (*)()), char const* (*)()>(std::__invoke_other, void (*&&)(char const* (*)()), char const* (*&&)())+0x34
C  [libffm.so+0x3e2d]  std::__invoke_result<void (*)(char const* (*)()), char const* (*)()>::type std::__invoke<void (*)(char const* (*)()), char const* (*)()>(void (*&&)(char const* (*)()), char const* (*&&)())+0x37
C  [libffm.so+0x3d9d]  void std::thread::_Invoker<std::tuple<void (*)(char const* (*)()), char const* (*)()> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>)+0x43
C  [libffm.so+0x3d56]  std::thread::_Invoker<std::tuple<void (*)(char const* (*)()), char const* (*)()> >::operator()()+0x18
C  [libffm.so+0x3d3a]  std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(char const* (*)()), char const* (*)()> > >::_M_run()+0x1c

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  java.lang.Thread.genThreadName()Ljava/lang/String;+13 java.base@19-ea
j  java.lang.Thread.<init>(Ljava/lang/ThreadGroup;Ljava/lang/Runnable;)V+2 java.base@19-ea
v  ~StubRoutines::call_stub 0x00007fec78537cc6

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000018

[abridged -- see hs_err.txt in attachment]

REGRESSION : Last worked in version 18.0.1
FREQUENCY : often
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
$ g++ -fPIC -shared -o libffm.so  nativemethod.cpp
$ javac --enable-preview -source 19 JavaClass.java
$ LD_LIBRARY_PATH=. java --enable-preview --enable-native-access=ALL-UNNAMED JavaClass

---------- BEGIN SOURCE ----------
nativemethod.cpp:

#include <iostream>
#include <thread>

using namespace std;

extern "C" {
void nativeMethod(const char* (*) ());
}

void f(const char* getMessage()) {
  for (auto i = 0; i < 100; ++i)
    cout << getMessage() << endl;
};

void nativeMethod(const char* getMessage()) {
  auto t1 = thread{f, getMessage};
  auto t2 = thread{f, getMessage};
  t1.join();
  t2.join();
}

JavaClass.java:

import java.lang.foreign.Addressable;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySession;
import java.lang.foreign.SymbolLookup;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

import static java.lang.foreign.SegmentAllocator.implicitAllocator;

public class JavaClass {

  static {
    System.loadLibrary("ffm");
  }

  public static Addressable getMessage() {
    return implicitAllocator().allocateUtf8String("Hello World!").address();
  }

  public static void main(String... args) {
    try {
      var linker = Linker.nativeLinker();
      var symbolLookup = SymbolLookup.loaderLookup();
      var symbol = symbolLookup.lookup("nativeMethod").orElseThrow();
      var functionDesc = FunctionDescriptor.ofVoid(ValueLayout.ADDRESS);
      var methodHandle = linker.downcallHandle(symbol, functionDesc);
      var upcallMethodType = MethodType.methodType(Addressable.class);
      var upcallFunctionDesc = FunctionDescriptor.of(ValueLayout.ADDRESS);
      var upcallMethodHandle = MethodHandles.lookup().findStatic(JavaClass.class, "getMessage", upcallMethodType);
      var upcallSymbol = linker.upcallStub(upcallMethodHandle, upcallFunctionDesc, MemorySession.openImplicit());
      methodHandle.invoke(upcallSymbol);
    } catch (Throwable e) {
      System.out.println(e);
    }
  }
}
Comments
Changeset: 7cf71bc2
Author: Alan Bateman <alanb@openjdk.org>
Date: 2022-06-22 07:48:14 +0000
URL: https://git.openjdk.org/jdk19/commit/7cf71bc2d3ae3d84552f06358e70204dc65552fc
22-06-2022

A pull request was submitted for review.
URL: https://git.openjdk.org/jdk19/pull/28
Date: 2022-06-16 13:34:18 +0000
21-06-2022

I don't see how the linking/initialization locking has any relevance here? The async exception could happen anywhere in the Java code. The locking code is native. Or did virtual threads also introduce changes here? Update: the locking changes mean there is no longer a Java thread state change as we don't expose VM locking/blocking states at the Java level.
17-06-2022

Or we re-open JDK-6412693 and use the ServiceThread to do the construction on the attaching thread's behalf. This partial construction problem has bitten us in a few ways in the past.
13-06-2022

[~coleenp] Yes, that would fix this too, but I assume it is too late for JDK 19. The Thread constructor for platform threads needs to handle the primordial thread case, the JNI attach case, plus the normal case where the parent creates a child. The former cases have always been somewhat fragile in that SM permission checks, instrumentation, and more can lead to executing code in the context of a partly initialized thread (JDK-8274668 is a recent one, David has linked to others). The re-implementation of Thread tries to minimise the code executed when a Thread initialises itself, but I missed the case where several threads may JNI attach at the same time and naming them requires that ThreadNumbering be loaded/initialized. There are several ways to fix that and I will have a PR soon. There is still the issue of static or dynamic instrumentation where the Thread initialization is modified to call into arbitrary code, but I don't think we can solve that completely.
09-06-2022

Removing the Java-level lock that is used to synchronize during linking and initialization would help with this dependency.
09-06-2022

> I moved the generating of the thread name to the end of Thread constructor so if there is any class loading/init generating the name is done when the fields are set.

We should check that any code in the VM that accesses the thread name also has a null check as logically we always expect a thread name to be set. This could include JVMTI code if the thread still blocks due to the ThreadNumbering class initialization - though in the JNI attach case it may be the thread will be ignored due to the fact it is still marked as attaching. The ability to have a partially constructed thread block due to synchronization like this makes things more fragile than I would like. It may be better to preload/init those classes so that this cannot happen. ??
08-06-2022
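
The "preload/init those classes" idea in the comment above can be illustrated with a minimal sketch. Everything here is hypothetical: `PreloadSketch` and its nested `Numbering` class are stand-ins for the real ThreadNumbering helper, and this is not how the JDK itself preloads classes. It only shows that `MethodHandles.Lookup.ensureInitialized` (available since Java 15) forces class initialization up front, so a later first use no longer has to run (or block on) the static initializer.

```java
import java.lang.invoke.MethodHandles;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: eagerly initialize a helper class so that its static
// initializer cannot run (or be waited on) in the middle of a constructor.
final class PreloadSketch {
    // Set by Numbering's static initializer, so we can observe when it ran.
    static volatile boolean numberingInitialized;

    // Hypothetical stand-in for a lazily initialized numbering helper.
    static final class Numbering {
        private static final AtomicInteger NEXT = new AtomicInteger();
        static { numberingInitialized = true; }
        static int next() { return NEXT.getAndIncrement(); }
    }

    // Forces initialization of Numbering and reports whether it has happened.
    static boolean preload() {
        try {
            MethodHandles.lookup().ensureInitialized(Numbering.class);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return numberingInitialized;
    }

    public static void main(String[] args) {
        System.out.println("helper initialized eagerly: " + preload());
        System.out.println("first number: " + Numbering.next());
    }
}
```

Whether eager initialization or reordering the constructor is preferable depends on startup cost and on which code paths can reach the helper; the sketch takes no position on that.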

I moved the generation of the thread name to the end of the Thread constructor, so that any class loading/initialization triggered by generating the name happens after the fields are set. This avoids the crash with the test case. Note that the crash I observed initially is not the same as the one reported; instead, it is in the monitor enter, where JavaThreadBlockedOnMonitorEnterState changes the thread state. It might be that it happens on t1 sometimes and on t2 other times.

Stack: [0x00007fead451e000,0x00007fead4d1d000], sp=0x00007fead4d1b120, free space=8180k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x8886f2]  java_lang_Thread::set_thread_status(oopDesc*, JavaThreadStatus)+0x22
V  [libjvm.so+0xc6ef03]  ObjectMonitor::enter(JavaThread*)+0x5a3
V  [libjvm.so+0xdee4ac]  ObjectSynchronizer::enter(Handle, BasicLock*, JavaThread*)+0xdc
V  [libjvm.so+0x85bd4c]  InstanceKlass::link_class_impl(JavaThread*)+0x26c
V  [libjvm.so+0x85ec2b]  InstanceKlass::initialize_impl(JavaThread*)+0x16b
V  [libjvm.so+0xb0046a]  LinkResolver::resolve_static_call(CallInfo&, LinkInfo const&, bool, JavaThread*)+0x16a
V  [libjvm.so+0xb00e9b]  LinkResolver::resolve_invoke(CallInfo&, Handle, constantPoolHandle const&, int, Bytecodes::Code, JavaThread*)+0x2bb
V  [libjvm.so+0x87bc27]  InterpreterRuntime::resolve_invoke(JavaThread*, Bytecodes::Code)+0x177
V  [libjvm.so+0x87c1a7]  InterpreterRuntime::resolve_from_cache(JavaThread*, Bytecodes::Code)+0x37
j  java.lang.Thread.genThreadName()Ljava/lang/String;+13 java.base@19-internal
j  java.lang.Thread.<init>(Ljava/lang/ThreadGroup;Ljava/lang/Runnable;)V+2 java.base@19-internal
v  ~StubRoutines::call_stub 0x00007feb4c40acc6
V  [libjvm.so+0x8828c5]  JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, JavaThread*)+0x315
V  [libjvm.so+0x883eea]  JavaCalls::call_special(JavaValue*, Handle, Klass*, Symbol*, Symbol*, Handle, Handle, JavaThread*)+0x1aa
V  [libjvm.so+0xe3c422]  JavaThread::allocate_threadObj(Handle, char const*, bool, JavaThread*)+0xc2
V  [libjvm.so+0x92e18a]  attach_current_thread.part.0+0x19a
V  [libjvm.so+0xe7552b]  UpcallLinker::on_entry(UpcallStub::FrameData*)+0x15b
v  blob 0x00007feb4c533511
C  [libffm.so+0x8b52]  f(char const* (*)())+0x1b
C  [libffm.so+0xa1b2]  void std::_Bind_simple<void (*(char const* (*)()))(char const* (*)())>::_M_invoke<0ul>(std::_Index_tuple<0ul>)+0x40
C  [libffm.so+0xa0bd]  std::_Bind_simple<void (*(char const* (*)()))(char const* (*)())>::operator()()+0x1b
C  [libffm.so+0xa056]  std::thread::_Impl<std::_Bind_simple<void (*(char const* (*)()))(char const* (*)())> >::_M_run()+0x1c
08-06-2022
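
The reordering described in the comment above (set the fields first, generate the name last) might look roughly like the following. This is a simplified sketch, not the actual java.lang.Thread code: `NameLastSketch`, `Worker`, `Holder`, and `Numbering` are hypothetical names, and the write to `status` inside `genName` merely stands in for the status change the VM makes when a thread has to block.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of "state first, name last" constructor ordering.
final class NameLastSketch {
    // Stand-in for the holder object that stores the thread status.
    static final class Holder { volatile int status; }

    // Stand-in for the lazily initialized numbering helper.
    static final class Numbering {
        private static final AtomicInteger NEXT = new AtomicInteger();
        static int next() { return NEXT.getAndIncrement(); }
    }

    static final class Worker {
        final Holder holder;
        final String name;

        Worker() {
            this.holder = new Holder();        // state is in place first
            this.name = genName(this.holder);  // naming (and any class init) runs last
        }

        private static String genName(Holder h) {
            h.status = 1;  // stand-in for "blocked": safe, the holder already exists
            String n = "Worker-" + Numbering.next();
            h.status = 0;  // stand-in for "runnable" again
            return n;
        }
    }

    static Worker create() { return new Worker(); }

    public static void main(String[] args) {
        Worker w = create();
        System.out.println(w.name + " created, holder present: " + (w.holder != null));
    }
}
```

With this ordering, anything that updates the status while the name is being generated finds an existing holder instead of null.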

However it is not clear to me why we are changing the thread status on a notifyAll call:

V  [libjvm.so+0x85e842]  java_lang_Thread::set_thread_status(oopDesc*, JavaThreadStatus)+0x22
V  [libjvm.so+0xc4132b]  ObjectMonitor::INotify(JavaThread*)+0x13b

Which thread is it being invoked upon? The current thread shouldn't be changing status. The notified thread will change status but it must have already set a "blocking" status in the first place - so why didn't we crash then?

Update: I think we have something subtle going on in relation to thread state changes. Most of the changes are predicated on the target being "alive" and at this point for the attaching thread it is not clear this is seen to be the case. Then we have one state change:

  static void wait_reenter_end(JavaThread *java_thread, bool active) {
    if (active) {
      java_thread->get_thread_stat()->contended_enter_end();
    }
    set_thread_status(java_thread, JavaThreadStatus::RUNNABLE);
  }

which is unconditional. So that could explain the actual crash. It doesn't change the fact we shouldn't be doing this kind of synchronization whilst the thread is still in the process of attaching.
08-06-2022

Here is the problem:

  public Thread() {
    this(null, genThreadName(), 0, null, 0, null);
  }

During construction we call genThreadName, which can trigger the class loading and initialization of the ThreadNumbering helper class. If this happens concurrently from two threads, one of them will have to block. Blocking causes the VM to change the thread status, but the thread status is stored in the FieldHolder instance, which has not yet been initialized at that point. The class loading and initialization will happen when the first auto-named thread is created, which in the test case happens to be by JNI attaching threads from the "foreign" API. Moving this to core-libs.
08-06-2022
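
The ordering hazard described in the comment above can be reproduced in miniature in plain Java. The sketch below is only an analogy under stated assumptions: `PartialInitSketch`, `Worker`, `Holder`, and `Numbering` are hypothetical names, the write to `holder.status` stands in for the VM updating the thread status while the thread blocks, and a caught NullPointerException stands in for the SIGSEGV or failed assert seen in the VM.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a status update that runs during name generation,
// before the object holding the status has been assigned.
final class PartialInitSketch {
    // Stand-in for the holder object that stores the thread status.
    static final class Holder { volatile int status; }

    // Stand-in for the lazily initialized numbering helper.
    static final class Numbering {
        private static final AtomicInteger NEXT = new AtomicInteger();
        static int next() { return NEXT.getAndIncrement(); }
    }

    static final class Worker {
        Holder holder;               // still null while the name is generated
        boolean statusUpdateFailed;  // records the failed update for inspection
        final String name;

        Worker() {
            this.name = genName();       // runs before holder is assigned
            this.holder = new Holder();
        }

        private String genName() {
            try {
                holder.status = 1;       // stand-in for the VM's status change
            } catch (NullPointerException e) {
                statusUpdateFailed = true;
            }
            return "Worker-" + Numbering.next();
        }
    }

    static boolean statusUpdateFailsDuringNaming() {
        return new Worker().statusUpdateFailed;
    }

    public static void main(String[] args) {
        System.out.println("status update failed during naming: "
                + statusUpdateFailsDuringNaming());
    }
}
```

In the real VM the status update is done by native code rather than by the constructor itself, which is why the failure shows up as a crash instead of a Java exception.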

I suspect Loom has caused this. The thread status was moved to a FieldHolder class but we seem to be trying to update that status in the context of the Thread constructor, before FieldHolder has been initialized.
08-06-2022

I am able to reproduce the crash on latest jdk19 build. [~sswsharm] you need to set "LD_LIBRARY_PATH=." in the environment. In fastdebug build, the hs_err is similar to the one in the bug report. The cause is this:

# Internal Error (/jdk2/zoo/open/src/hotspot/share/classfile/javaClasses.cpp:1949), pid=3881588, tid=3881622
# assert(holder != __null) failed: Java Thread not initialized
08-06-2022

I moved the hs_err file to bug attachment to make it easier to navigate this bug report.
08-06-2022