JDK-8255384 : Remove special_runtime_exit_condition() check from SS::block()
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 16
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2020-10-26
  • Updated: 2020-11-23
  • Resolved: 2020-11-16
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 16
16 b25Fixed
Related Reports
Relates :  
Relates :  
Relates :  
Description
Method SafepointSynchronize::block() contains a call to handle_special_runtime_exit_condition() that prevents JTs transitioning back to Java from escaping after being externally suspended. This can happen when calling SafepointMechanism::process_if_requested() from handle_polling_page_exception(), java_suspend_self_with_safepoint_check() and check_safepoint_and_suspend_for_native_trans(). The check can be removed if we just make small fixes to these methods. Removing the call to handle_special_runtime_exit_condition() in SS::block() avoids having to think about recursive cases when transitioning and stopping for safepoints. The call also introduces noise while reading the code since it is not clear what exactly we are trying to prevent with it. 
Comments
Changeset: 3675653c Author: Patricio Chilano Mateo <pchilanomate@openjdk.org> Date: 2020-11-16 17:21:13 +0000 URL: https://github.com/openjdk/jdk/commit/3675653c
16-11-2020

Sure send the PR as soon as you are ready. Aside: hard to know what to discuss in the JBS issue versus the PR these days :)
28-10-2020

Right, that's why we don't need the check for the _thread_new_trans case, since we know first time the thread calls java it will have to go through the JavaCallWrapper. I have been testing the patch since last week in mach5 and saw no issues. Maybe I can send the PR to discuss it there?
27-10-2020

There are multiple transitions involved: native -> VM (via JNI_ENTRY) -> Java (via JavaCallWrapper) So yes the async exception will get installed as we call up to Java and so would be thrown as soon as we hit code that checks for a pending exception. That might not be until after the first bytecode of the method executes but that isn't a problem.
27-10-2020

Sorry, I lost track of the calls on "If the handler then invokes Java code, via JNI or JVM TI, we will do a transition_from_native...". How exactly is the transition from native to Java in that case? Because I see that JNI methods that allow to call Java (Call*Method*) use the JavaCalls class which will check for exceptions in the JavaCallWrapper.
27-10-2020

For the _thread_new case also note that if there is a JVMTI event handler installed for JvmtiExport::post_thread_start, then we will perform a ThreadToNativeFromVM transition which also calls has_special_runtime_exit_condition() but with a false argument (no async exception check). If the handler then invokes Java code, via JNI or JVM TI, we will do a transition_from_native which will only call check_safepoint_and_suspend_for_native_trans if suspended or there is an active poll. If an async exception was set via the initial safepoint (when we transitioned from _thread_new to _thread_in_vm) then this async exception will still be pending when we start to execute the Java code of the handler. But that in itself is not a problem as the delivery of Thread.stop is a race and there are no guarantees as to when the exception will actually get thrown. Actually this is even less of an issue as AFAICS no exception from the event handler can ever make it back to the thread's execution of its run() method.
27-10-2020

Getting rid of the recursive path is good. But there is still a lot of analysis to be applied to validate the proposed changes.
26-10-2020

Hi David, The change itself is small despite the analysis giving the wrong impression. This check in SS::block() adds a recursive path when stopping for safepoints that we can avoid. Also I think it is simpler to have SS::block() just block for safepoints as one would expect, rather than also checking for all the things we check for in handle_special_runtime_exit_condition, which makes reasoning about transitions and what exactly we can execute on them harder. I think these conditions should be checked in the exact places where we need them which is before going to java or after coming out of native.
26-10-2020

I'm having trouble seeing how we end up with something simpler by doing all of the above. It also requires some detailed analysis to confirm the conclusions that have been expressed above.
26-10-2020

Looking at the handle_special_runtime_exit_condition() call in SS::block() shows that the check is needed when coming out of the safepoint while in the following states: _thread_new_trans, _thread_in_native_trans and _thread_in_Java. Method handle_special_runtime_exit_condition() checks for external suspends, object deoptimization, async exceptions and JFR suspends. All these need to be checked when transitioning to java and when transitioning from native (except for async exceptions). If we analyze the _thread_new_trans, _thread_in_native_trans and _thread_in_Java cases separately we can see that the check is only useful for the latter. _thread_new_trans case: We are transitioning from _thread_new state, so we were not transitioning from native. Also, all new JTs will call JavaCallWrapper() before going to Java for the first time, where we already have a has_special_runtime_exit_condition() check. Therefore we don't need the extra check in SS::block(). _thread_in_native_trans case: When transitioning from native we already check for external suspends, object deoptimization and JFR suspends in check_safepoint_and_suspend_for_native_trans(). The latter is called when transitioning to the vm (ThreadStateTransition::transition_from_native()) or to java(check_special_condition_for_native_trans()). _thread_in_Java case: We do need the check in SS::block() to avoid JTs transitioning back to Java from escaping after being externally suspended. This can happen when calling SafepointMechanism::process_if_requested() in the following cases: - Method handle_polling_page_exception() doesn't use transition wrappers, so if JT calls SafepointMechanism::process_if_requested() and stops for a safepoint due to a VM_ThreadSuspend request, unless we have that check in SS::block() the JT will transition back to java without actually suspending. We can fix that by using the transition wrappers instead of calling SafepointMechanism::process_if_requested() directly. With this we will also avoid the need to add explicit checks for _obj_deopt and async exceptions as we did after removal of the same handle_special_runtime_exit_condition() check in ~TIVFH. - When transitiong to java (ThreadInVMfromJava, ThreadInVMfromJavaNoAsyncException, JavaCallWrapper::JavaCallWrapper(), native wrappers) a JT always calls has_special_runtime_exit_condition() to check if it needs to suspend itself. If true it calls java_suspend_self_with_safepoint_check(). After the do-while() loop the JT makes a last call to SafepointMechanism::process_if_requested() to check if it needs to process a safepoint/handshake. If in that call the JT goes to a safepoint due to a new VM_ThreadSuspend request then unless we have that check in SS::block() the JT will transition back to java without actually suspending. We can fix this by moving the SafepointMechanism::process_if_requested() call inside the do-while loop. (Notice that wait_for_object_deoptimization() is already doing this). - Method check_safepoint_and_suspend_for_native_trans() calls SafepointMechanism::process_if_requested() but it doesn't check for external suspend after that. Rather those two calls are in a conditional and we do either one or the other. Therefore we need that check in SS::block() same as with the other two cases. We can fix this by calling SafepointMechanism::process_if_requested() first and then checking for external suspends unconditionally.
26-10-2020