Escape analysis (EA) should remain enabled, for better performance, when the VM is running with JVMTI agents loaded.

The main intent is to be able to start a production system in a mode that allows a debugging session to be initiated at any later time, if necessary or desired, without having to disable escape analysis at start-up. In most cases debugging will never be activated, and production systems should run at the best possible performance while still being ready for debugging. The enhancement also improves performance when a debugger has attached to the VM.

Another important scenario for the enhancement is heap diagnostics. Agents with that purpose need not be loaded at start-up. They can be loaded into a running system whenever necessary or desired. Unfortunately, the current JVMTI implementation does not and cannot give access to scalar replaced objects, which can hinder diagnostics. JDK-8233915 is an example of this issue and will also be fixed by this enhancement.

Currently EA is disabled if a JVMTI agent has added the capability can_access_local_variables, because an access to a local reference variable potentially changes the escape state of the referenced object and thereby invalidates optimizations based on it. There are more JVMTI capabilities that allow agents to acquire object references from stack frames:

1. can_access_local_variables
2. can_get_owned_monitor_info
3. can_get_owned_monitor_stack_depth_info
4. can_tag_objects

Capability 4 allows, for example, walking the object graph beginning at its roots, which include local variables.

JDK-8230677 switches EA off if capabilities 2. or 3. are taken. This workaround is not possible for 4., as can_tag_objects is an always capability. JDK-8233915 tracks this issue.

In addition, EA is disabled if 5. can_pop_frame is added. This is not because it gives access to local variables, but because the implementation of PopFrame interferes with object reallocation during deoptimization of compiled frames.
It is likely a bug that EA is not disabled if 6. can_force_early_return is added, as ForceEarlyReturn has the same issues with deoptimization.

This enhancement shall allow the JVM to run with escape analysis enabled even if any of the capabilities 1. to 6. is requested by a JVMTI agent.

Summary of Proposed Implementation
----------------------------------

The JVMTI implementation is changed to revert EA-based optimizations just before objects escape through JVMTI.

No per-object escape information is available at runtime. Instead, each scope is annotated with whether non-escaping objects exist in it and whether any of them are passed as parameters.

If a JVMTI agent accesses a reference on the stack, then the owning compiled frame C is deoptimized if any non-escaping object is in scope. Scalar replaced objects are reallocated on the heap and objects with eliminated locking are relocked. This is called "deoptimizing objects" for short.

If the agent accesses a reference in a callee frame of C, and C is passing any non-escaping object as an argument, then C and its objects are deoptimized as well.

Deoptimizing Objects
--------------------

Early reallocation of scalar replaced (aka virtual) objects, where reallocation is done independently of, and potentially long before, replacing the owning compiled frame with equivalent interpreter frames, is preexisting functionality that the enhancement leverages (see materializeVirtualObjects).

Reallocating and relocking objects is called "deoptimizing objects". Deoptimized objects are kept as deferred updates (the preexisting JavaThread::_deferred_locals_updates). Either all objects of a compiled frame are deoptimized or none. The corresponding deferred updates record whether this has already happened, so that it is not done twice.

EscapeBarrier
-------------

The class EscapeBarrier is the interface used to synchronize and trigger deoptimization before objects escape.
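
The frame selection rule from the summary can be sketched as a small Java model. This is an illustration only, not HotSpot code: the names EscapeBarrierSketch, Frame and framesToDeoptimize are hypothetical, the two boolean fields merely mirror the annotations described in the text, and the sketch assumes the argument rule applies to every compiled caller above the accessed frame.

```java
import java.util.ArrayList;
import java.util.List;

public class EscapeBarrierSketch {
    /** Hypothetical stand-in for a stack frame and its per-scope annotations. */
    public static final class Frame {
        final String name;
        final boolean compiled;               // false models an interpreter frame
        final boolean notGlobalEscapeInScope; // non-escaping objects exist in scope
        final boolean argEscape;              // non-escaping objects are passed as arguments

        public Frame(String name, boolean compiled,
                     boolean notGlobalEscapeInScope, boolean argEscape) {
            this.name = name;
            this.compiled = compiled;
            this.notGlobalEscapeInScope = notGlobalEscapeInScope;
            this.argEscape = argEscape;
        }
    }

    /**
     * Returns the names of the frames whose objects would have to be
     * deoptimized before an agent reads a reference from the frame at
     * accessedDepth. Index 0 of the stack is the youngest frame.
     */
    public static List<String> framesToDeoptimize(List<Frame> stack, int accessedDepth) {
        List<String> result = new ArrayList<>();
        for (int depth = accessedDepth; depth < stack.size(); depth++) {
            Frame f = stack.get(depth);
            if (!f.compiled) {
                continue; // interpreter frames carry no EA-based optimizations
            }
            // Accessed frame: any non-escaping object in scope matters.
            // Caller frames: only non-escaping objects passed as arguments matter.
            boolean needed = (depth == accessedDepth) ? f.notGlobalEscapeInScope : f.argEscape;
            if (needed) {
                result.add(f.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Frame> stack = List.of(
            new Frame("callee", true, false, false),
            new Frame("caller", true, true, true),
            new Frame("main", false, true, true));
        // The agent reads a local in "callee"; "caller" passes a non-escaping argument.
        System.out.println(framesToDeoptimize(stack, 0)); // prints [caller]
    }
}
```

In this model an access to the youngest frame leaves that frame alone, because it has no non-escaping objects in scope, but still forces its caller to be deoptimized because the caller passes a non-escaping object down the call chain.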

C2 Changes
----------

During EA, C2 annotates each safepoint if it has non-escaping objects in scope, and each Java call if it has non-escaping objects in its parameter list. This information is persisted in the CompiledMethod's debug information.

Escape Information at Runtime
-----------------------------

There is preexisting information about scalar replaced objects and eliminated locking (note that locks are not only eliminated based on EA; nested locks are omitted as well).

The implementation adds information about non-escaping objects in scope and in argument lists at call sites:

  compiledVFrame::not_global_escape_in_scope()
  compiledVFrame::arg_escape()
  ScopeDesc::not_global_escape_in_scope()
  ScopeDesc::arg_escape()

Synchronization
---------------

Competing agents use the new flag '_obj_deopt' in Thread::_suspend_flags and the new Monitor EscapeBarrier_lock to synchronize with each other and to suspend their target thread.

Deoptimization can be concurrent for different target threads. A self deoptimization cannot be concurrent with other deoptimizations. Deoptimizing everything (e.g. before heap walks) cannot be concurrent with other deoptimizations.

See EscapeBarrier::sync_and_suspend_one() and EscapeBarrier::sync_and_suspend_all().

PopFrame and ForceEarlyReturn
-----------------------------

Objects are deoptimized before the PopFrame/ForceEarlyReturn operation, and JVMTI_ERROR_OUT_OF_MEMORY is returned if reallocations fail. This avoids reallocation failures during the operation itself.

Performance
-----------

Performance should not be affected if no JVMTI agent is loaded. If a JVMTI agent is loaded that adds any of the capabilities listed above but remains inactive, then there should be a performance gain as high as the gain of EA. The performance impact is expected to remain positive during interactive debugging.

jvm2008 results are attached to the RFE.
Microbenchmark results: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/

Testing
-------

The proposed implementation comes with a significant amount of dedicated test code.

The new develop flag DeoptimizeObjectsALot allows for stress testing: internal threads are started that deoptimize frames and objects at intervals, in milliseconds, given by DeoptimizeObjectsALotInterval. The number of threads started is given by DeoptimizeObjectsALotThreadCountAll and DeoptimizeObjectsALotThreadCountSingle. The former threads target all existing threads, whereas the latter each operate on a single thread selected round robin.