The Xcheck:jni version of EnsureLocalCapacity tries to validate use of local refs by tracking an artificial limit (the "planned capacity"), which is checked on every JNI function return to ensure that the limit has not been exceeded (with some slack built in: CHECK_JNI_LOCAL_REF_CAP_WARN_THRESHOLD).
Checked JNI function exit:
  size_t planned_capacity = handles->get_planned_capacity();
  size_t live_handles = handles->get_number_of_live_handles();
  if (live_handles > planned_capacity) {
    IN_VM(
      tty->print_cr("WARNING: JNI local refs: " SIZE_FORMAT ", exceeds capacity: " SIZE_FORMAT,
                    live_handles, planned_capacity);
      thr->print_stack();
    )
  }
A well-written piece of native code that uses JNI is supposed to track the expected number of local refs it will use and call EnsureLocalCapacity beforehand to ensure it won't exceed the planned capacity.
Prior to JDK-8193222 EnsureLocalCapacity(capacity) simply did:
add_planned_handle_capacity(thr->active_handles(), capacity);
where:
  add_planned_handle_capacity(JNIHandleBlock* handles, size_t capacity) {
    handles->set_planned_capacity(capacity +
                                  handles->get_number_of_live_handles() +
                                  CHECK_JNI_LOCAL_REF_CAP_WARN_THRESHOLD);
  }
so we simply add the required amount of capacity to the current in-use amount, plus the slack.
That approach was wrong because it did not handle nested native method calls correctly. Suppose funcA will create 20 local refs and funcB will create 5, so they call EnsureLocalCapacity with 20 and 5 respectively. If funcA calls funcB then we have a problem. Let's say there are initially 0 active handles and the slack is 2. Then:
EnsureLocalCapacity(20) will do:
planned_capacity = 20 + 0 + 2 = 22
Let's say funcA creates no local refs before calling funcB, then in funcB EnsureLocalCapacity(5) will do:
planned_capacity = 5 + 0 + 2 = 7
We then return to funcA and proceed to loop 20 times, calling a JNI function that returns a local ref. On the 8th iteration the function exit check will detect that 8 active handles exceed the planned capacity of 7 and produce the warning! The code lost the fact that funcA had requested a capacity of 20.
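The arithmetic above can be reproduced with a small standalone model. This is only a sketch: HandleBlockModel, kSlack, ensure_local_capacity_old and first_warning_old are hypothetical names that mimic the two counters and the overwrite behaviour described above, not HotSpot code, and funcB is assumed to leave no local refs behind.

```cpp
#include <cstddef>

// Hypothetical stand-in for the two JNIHandleBlock counters the checker uses.
struct HandleBlockModel {
    std::size_t live = 0;     // number of live local refs
    std::size_t planned = 0;  // planned capacity tracked by the checker
};

// Stands in for CHECK_JNI_LOCAL_REF_CAP_WARN_THRESHOLD; 2 is the value
// assumed in the worked example, not the real constant.
constexpr std::size_t kSlack = 2;

// Behaviour prior to JDK-8193222: every call overwrites the planned capacity.
void ensure_local_capacity_old(HandleBlockModel& h, std::size_t capacity) {
    h.planned = capacity + h.live + kSlack;
}

// funcA requests 20, calls funcB (which requests 5), then creates 20 local
// refs. Returns the 1-based iteration on which the exit check would first
// warn, or 0 if the loop completes without a warning.
std::size_t first_warning_old() {
    HandleBlockModel h;
    ensure_local_capacity_old(h, 20);  // funcA: planned = 20 + 0 + 2 = 22
    ensure_local_capacity_old(h, 5);   // funcB: planned = 5 + 0 + 2 = 7
    for (std::size_t i = 1; i <= 20; ++i) {
        ++h.live;                          // a JNI call returns a new local ref
        if (h.live > h.planned) return i;  // the function exit check
    }
    return 0;
}
```

Calling first_warning_old() returns 8, matching the iteration identified above.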
So the fix in JDK-8193222 was to have EnsureLocalCapacity only call add_planned_handle_capacity if the requested capacity was greater than the current planned_capacity. So in funcB the call to EnsureLocalCapacity becomes a no-op and when we return to funcA we still have a planned_capacity of 22 and now our loop completes fine. So bug fixed in a simple way - all good!
Except it isn't all good. What we haven't considered is that funcB created 5 local refs. Now if funcB calls DeleteLocalRef for each local ref it created, all is well and good - we still have capacity for 22 local refs. However, funcB is not required to call DeleteLocalRef, and if it doesn't then our available capacity has been reduced to 22 - 5 = 17. So on the 18th iteration of the loop the function exit check will again fail and we get the warning!
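Both outcomes of the JDK-8193222 policy can be checked with a similar standalone model. Again this is a sketch under assumptions: HandleBlockModel, kSlack, ensure_local_capacity_fixed and first_warning_fixed are hypothetical names, the slack of 2 is the value from the example, and the parameter leaked_by_funcB is the number of local refs funcB creates without deleting.

```cpp
#include <cstddef>

// Hypothetical stand-in for the two JNIHandleBlock counters the checker uses.
struct HandleBlockModel {
    std::size_t live = 0;     // number of live local refs
    std::size_t planned = 0;  // planned capacity tracked by the checker
};

// Stands in for CHECK_JNI_LOCAL_REF_CAP_WARN_THRESHOLD (value assumed).
constexpr std::size_t kSlack = 2;

// Policy after JDK-8193222 as described in the text: a request that does not
// exceed the current planned capacity is a no-op.
void ensure_local_capacity_fixed(HandleBlockModel& h, std::size_t capacity) {
    if (capacity > h.planned) {
        h.planned = capacity + h.live + kSlack;
    }
}

// funcA requests 20 and calls funcB, which requests 5 and returns with
// leaked_by_funcB local refs still live. funcA then creates 20 local refs.
// Returns the 1-based iteration on which the exit check would first warn,
// or 0 if the loop completes without a warning.
std::size_t first_warning_fixed(std::size_t leaked_by_funcB) {
    HandleBlockModel h;
    ensure_local_capacity_fixed(h, 20);  // funcA: planned = 20 + 0 + 2 = 22
    ensure_local_capacity_fixed(h, 5);   // funcB: 5 <= 22, so no change
    h.live += leaked_by_funcB;           // refs funcB did not delete
    for (std::size_t i = 1; i <= 20; ++i) {
        ++h.live;                          // a JNI call returns a new local ref
        if (h.live > h.planned) return i;  // the function exit check
    }
    return 0;
}
```

With first_warning_fixed(0) the loop completes and the result is 0; with first_warning_fixed(5) the result is 18, because 5 + 18 = 23 live handles exceed the planned capacity of 22.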
So how to address this? The problem is that at the time EnsureLocalCapacity is called in funcB there is no knowledge of how many active handles will be created and remain when funcB returns. So there is no way to address this problem in EnsureLocalCapacity. Also, funcB is just application native code, not a JNI function itself, so we can't put an adjustment in the JNI function exit hook that Xcheck:jni installs. So in short there seems to be no way to address this!
Now you may suggest that funcA has to call EnsureLocalCapacity with a value that accounts for all the local-refs created transitively by the code in funcA, including funcB - that would certainly fix the problem. But there is no way in general to know what this value would be. We would just encourage programmers to call EnsureLocalCapacity(BigNumber) to "ensure" there is plenty of allowance made. That makes the checked EnsureLocalCapacity a useless tool.
In practice, in HotSpot, the actual functionality of EnsureLocalCapacity is a no-op: there is no inherent local ref capacity limit, so we create them until we run out of memory. So the Xcheck:jni version of EnsureLocalCapacity is just a way to encourage people to write portable JNI code, in case it runs on a VM that does have a limit. This raises the question of why we actually do this. If another VM has an inherent local ref limit (per the JNI specification) then why doesn't that other VM provide the tools to help developers write correct JNI code?
I think the checked version of EnsureLocalCapacity, whilst well-intentioned, cannot be implemented in a correct and useful way, and so has no real value and should be removed.