JDK-8249004 : Reduce ThreadsListHandle overhead in relation to direct handshakes
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 16
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2020-07-08
  • Updated: 2021-11-15
  • Resolved: 2021-11-08
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 18
18 b23Fixed
Related Reports
Relates :  
Relates :  
Description
The implementation of execute_direct is as follows:

bool Handshake::execute_direct(HandshakeClosure* thread_cl, JavaThread* target) {
  JavaThread* self = JavaThread::current();
  HandshakeOperation op(thread_cl, /*is_direct*/ true);

  jlong start_time_ns = os::javaTimeNanos();

  ThreadsListHandle tlh;
  if (tlh.includes(target)) {
    target->set_handshake_operation(&op);

It needs the ThreadsListHandle to ensure the target thread is alive, and will remain alive, for the duration of the operation.

However, in some cases when dealing with direct handshakes, the higher-level operation already has to have a ThreadsListHandle active to guard the target thread for the same reasons. In such cases the overhead of the TLH in execute_direct is unnecessary.

We should provide an alternative version of execute_direct that allows us to rely on the outer TLH.
Comments
Changeset: ea23e733 Author: Daniel D. Daugherty <dcubed@openjdk.org> Date: 2021-11-08 14:45:04 +0000 URL: https://git.openjdk.java.net/jdk/commit/ea23e7333e03abb4aca3e9f3854bab418a4b70e2
08-11-2021

In the normal case of nested TLH: To promote the current list to ref-counted we execute: atomic inc, store, release store fence To grab new list: load acquire, release store fence, load acquire, CAS CAS can only fail if you are very unlucky with an exiting thread, thus need to reexecute this part. And it uses a few bytes on stack. Considering that this thread have to add the operation (CAS), wait until the target is safe, and grab mutex before executing the above should not really matter. But there is another good reason for not creating ThreadsListHandle, the caller must always have it on a ThreadsList. So I think we should remove the ThreadsListHandle and instead add an guarantee that: Thread::current()->get_threads_hazard_ptr() != NULL Thread::current()->get_threads_hazard_ptr()->find_index_of_JavaThread(target) > 0 And then plainly call: target->handshake_state()->add_operation(&op); So we do not need to pass any additional thing in.
07-07-2021

If you tell me that nested TLH are cheap then I won't have an issue. :) I am/was assuming all TLH creation is somewhat heavyweight. As for relying on a TLH in a callers context, I don't see that as any different to expecting a lock to already be held by some caller. We trust our own code, and have an assertion to catch mistakes.
07-07-2021

Thanks for the clarification. I should have read this part more closely: > the higher-level operation already has to have a ThreadsListHandle > active to guard the target thread for the same reasons. and then asked you to clarify what you meant by higher-level operation if I couldn't figure it out. Sorry about that. Hmmm... I'm not sure that I like relying on a ThreadsListHandle that's in a caller's context. That feels a bit brittle. I could see passing a ThreadsListHandle down into callers, but... I need to mull on this more... The situation that we're in here is exactly why we created nested ThreadsListHandles. Is there some reason to think that nested ThreadsListHandles are expensive?
06-07-2021

Many (most?) of the JVMTI functions require that a TLH be held and then call a handshake operation. The TLH is not in the immediate caller but exists higher up the call chain. For example: // java_thread - protected by ThreadsListHandle and pre-checked // owned_monitor_count_ptr - pre-checked for NULL // owned_monitors_ptr - pre-checked for NULL jvmtiError JvmtiEnv::GetOwnedMonitorInfo(JavaThread* java_thread, jint* owned_monitor_count_ptr, jobject** owned_monitors_ptr) { we know there is an active TLH protecting the thread, and we then have: // get owned monitors info with handshake GetOwnedMonitorInfoClosure op(calling_thread, this, owned_monitors_list); Handshake::execute(&op, java_thread); (and based on the comments this was the site of a previous "direct handshake"). These are the situations for which this RFE was create. I was thinking we utilise the knowledge that we have an existing TLH and then call a special version of execute which doesn't create its own TLH. (A debug build could create a TLH and confirm the target is contained therein.)
04-07-2021

JvmtiEnv::GetThreadListStackTraces() has a ThreadsListHandle and calls: Handshake::execute(&op, java_thread); and I can't find any other callers of Handshake::execute() that have an existing ThreadsListHandle. Perhaps there were more locations when biased locking was still in the system.
02-07-2021

The Handshake code has evolved since this RFE was file. Handshake::execute_direct() no longer exists. src/hotspot/share/runtime/handshake.hpp: class Handshake : public AllStatic { public: // Execution of handshake operation static void execute(HandshakeClosure* hs_cl); static void execute(HandshakeClosure* hs_cl, JavaThread* target); static void execute(AsyncHandshakeClosure* hs_cl, JavaThread* target); };
02-07-2021