JDK-8296440 : Remove Method* handling from cleanup_inline_caches_impl
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 20
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2022-11-05
  • Updated: 2022-11-17
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other
tbdUnresolved
Related Reports
Relates :  
Relates :  
Relates :  
Description
JDK-8222841 has added code to reset the Method* for CompiledStaticCalls when the call target class is no longer alive.
However, this should not happen in a healthy VM. If a static call (or optimized virtual call) is reachable then the callee class can’t be dead.
Removing this code piece of JDK-8222841 and running lots of tests did not reveal any problem.

The code was implemented for x86 on which the CompiledStaticCalls embed the Method* into the code stream and don't use the oop recorder for it.
Other platforms (e.g. PPC64, AArch64) allocate a metadata slot via oop recorder. The code in question clears the slot in the metadata section, not the replicated value which is actually used by the code. Note that fix_metadata_relocation() does nothing on most platforms! This is a further indication that the code is not needed.

In addition, the Method* null checks in the c2i entry barriers should not be needed.

We should consider reverting the unneeded parts of JDK-8222841 and run all related tests.

Comments
Interesting. I did some digging too. When I introduced https://bugs.openjdk.org/browse/JDK-8222841 it was only days after https://bugs.openjdk.org/browse/JDK-8223171 which introduced ciMethod::can_be_statically_bound, which checks if the target is in a different class loader to the static receiver. It looks like that changed at least some C1 code paths to start embedding dependencies for CHA methods, where it previously wouldn't (it used to just check if the method was final or private), if class loader boundaries are crossed. Back then it was however easy to trip up on the ShouldNotReachHere change you made, but mainly because the sweeper also did inline cache cleaning, even on is_unloading() nmethods. However, in https://bugs.openjdk.org/browse/JDK-8240693 it was found out that this specific clearing was not safe to run from the sweeper anyway, so a guard was inserted checking if the caller nmethod is_unloading before clearing the metadata. And then of course I deleted the sweeper altogether (https://bugs.openjdk.org/browse/JDK-8290025), and now there is only the GC left doing inline cache cleaning. So yeah it very well might be that after the code has shuffled around quite a bit since this code was introduced, it might not be needed any longer. At least I can't trigger a case where it is needed any longer. Note that if this is true, it also means we can delete the c2i entry barriers. That would be very nice.
17-11-2022

I've added a ShouldNotReachHere() before resetting the Method* to NULL (https://github.com/openjdk/jdk/blob/4527dc67be6d238dcecd98c2aa05cbf0b8077e59/src/hotspot/share/code/compiledMethod.cpp#L661). So far it was not reach in our nightly testing, that is most JCK and JTREG tests, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite and SAP specific tests with fastdebug and release builds on the standard platforms plus PPC64.
17-11-2022

I did add the code because we had a crash back in JDK12, where callsites were calling from !is_unloading() nmethods into dead methods through c2i adapters and stale inline caches. Having said that, that was a long time ago, and perhaps things have changed. I would like it if things worked the way you are saying. So if you would like to try that out, I do support it. If the intended callee is dead, we really want the caller to be dead too.
11-11-2022

Thanks for doing the experiments. What about the following rationale? For an optimized virtual call the JIT has proven that every time the call is executed the dynamic dispatch for the current receiver (!= null) would be to the very same target method M. This also means that M cannot be unloaded. And what if it was unloaded? We cannot just re-resolve the call and load the class again. Classloading has side effects. If it was legally unloaded then there cannot be receiver for it. If there is one after unloading this means that class loading happened and the nmethod with the call was deoptimized because the dependencies were triggered. Is the code trying to make the VM more robust against JIT bugs in virtual call optimizations? In that case I think it would be better to crash. Maybe the Method* should be set to 0xdeadbeef? While I don't think that there can be a race executing an optimized virtual call and unloading the target I do think it is necessary to clean the Method* in the static stub if the Method* is unloading. The call cannot be executed anymore then but, as the comment says, code that reads Metadata in nmethods (class redefinition) should not see and dereference dead pointers.
11-11-2022

So the way I see it, the dependencies are there to guard against class loading. While I can see that they also do impact the class unloading conditions as well, I'm not sure it is safe to assume that it always will. We can get to an optimized virtual call in a large array of different ways depending on a lot of conditions. The test you wrote is very nice. Thanks for building it. I have played around with your test today to see if I can break it. I can get to an optimized virtual call through various combinations of exact types, assumed single concrete subclass, unique concrete method. Sometimes I can get rid of the unique concrete type dependency when the CHA monomorphic method is final, and C1 and C2 give me completely different dependencies in different situations. Using an interface instead and C2, I can get to a situation where an optimized virtual call is emitted, and there are zero dependencies. But it emits a dynamic type check to check if the receiver type is the CHA monomorphic method holder type. That comparison embeds metadata which means we get the holder oop in the oop section. By using a static final receiver and setting it with unsafe instead of a normal field, I can get rid of the comparison, and have the code generated with an opt_virtual_call, zero dependencies, and no dynamic type check. However, then the receiver gets embedded from the static final as we need to set rsi to the receiver, which ends up saving the day. While it is encouraging that I have been unable to figure out how to get rid of dependencies *and* metadata *and* oops that make the opt_virtual_call race with class unloading, I don't yet feel like the lack of finding such a path means it doesn't exist, as there are so many different ways in which we can embed one of these opt_virtual_calls.
10-11-2022

I tried to trigger Erik's scenario with compiler/classUnloading/methodUnloading/TestMethodUnloading.java, but I couldn't get it to work. It generated a virtual call instead of opt_virtual. I even tried having WorkerClass derive from a base class instead of Object, so there would be only one unique subclass, but I couldn't get that to work either.
10-11-2022

The dependencies do affect unloading. I've attached CallAbstractVirtualSingleConcreteReceiverInOtherLoader.java to show what I mean. Looking at the (also attached) CallAbstractVirtualSingleConcreteReceiverInOtherLoader_output.log we see L92: optimized virtual call with target ConcreteReceiverR1::testMethod_callee L239: Reference to loader of ConcreteReceiverR1 with single concrete method `testMethod_callee`. This reference comes from the dependency of the opt. virtual call. L380: Receiver of call becomes unreachable. Loader from L239 is kept alive because the Reference is treated as a strong root while the nmethod has an activation on stack. L388: Loader is unreachable when the nmethod with the opt. virt. call is not on stack because the reference is treated as weak ref. nmethod is made not entrant because of this. Opt virtual call is not reachable anymore. L390: PhantomReference of the loader gets notified > However, the oop of that class loader is never embedded into the caller oop section, as no bytecodes are inlined from it. The test shows it indeed is embedded into the caller's oop section. The test matches the situation you described: abstract static receiver, single concrete receiver method loaded by other classloader, doesn't it? > Then you can end up in a situation where you can indeed racingly call through an optimized virtual call inline cache, while it is being cleared by the concurrent unloading code So I cannot see how this would be possible.
09-11-2022

Yes that is the line. It can be reached with CHA. The dependencies trigger on class loading, not unloading. And I don’t think AFAICT that the callee method holder has to have been loader for an opt virtual call inline cache to be generated.
09-11-2022

Hi Erik, > Then you can end up in a situation where you can indeed racingly call through an optimized virtual call inline cache, while it is being cleared by the concurrent unloading code I think you are referring to this line, correct? https://github.com/openjdk/jdk/blob/dd5d4df5b68a40923987841a206fac5032d72f71/src/hotspot/share/code/compiledMethod.cpp#L661 But how can the loader be dead if the call is reached with a receiver != null? Also I think there are dependencies added to the nmethod N with the optimized virtual call that include the loader L of the class with the single concrete method. If N was on stack the reference from the dependencies would keep L alive. Otherwise (not on stack) N would hit the entry barrier and get unloaded without a race if L was dead.
08-11-2022

If you have an abstract method, then it can be that the abstract callee is indeed reachable, but it can be CHA be proven that there is a single method that can be called, warranting the use of optimized virtual call relocatinos from the compiler. But the single callee class can be in a different class loader. However, the oop of that class loader is never embedded into the caller oop section, as no bytecodes are inlined from it. Then you can end up in a situation where you can indeed racingly call through an optimized virtual call inline cache, while it is being cleared by the concurrent unloading code. When I built that code for ZGC in JDK 12, we only supported ZGC on x86_64, so that was the only platform that was enforced to deal with that race. Having said that, I think the preferred treatment would be to have the caller generate an oop map slot that is initially empty, and populate it with an oop dynamically when the callsite is resolved, so that the caller nmethod becomes is_unloading() if the speculative target method gets unloaded. The GCs can already deal with register_nmethod being called multiple times in similar situation when C1 patches in oops. That would make for a much cleaner model, where we could indeed remove this code.
08-11-2022

The crashes with JDK-8296450 and JDK-8295097 on sparc and aarch64 might indicate that the replicated metadata needs to be kept consistant, rather than removing the clearing of just one copy.
07-11-2022