Bug ID: JDK-8140587 Atomic*FieldUpdaters should use Class.isInstance instead of direct class check

Type: Enhancement
Component: core-libs
Sub-Component: java.util.concurrent
Affected Version: 9

Priority: P4
Status: Resolved
Resolution: Fixed

Submitted: 2015-10-27
Updated: 2017-02-22
Resolved: 2015-11-26

JDK 8	JDK 9	Other
8u101Fixed	9 b96Fixed	openjdk7uFixed

There is an anecdotal evidence that replacing direct class checks with Class.isInstance helps when "obj" has subclasses, and "obj.getClass()  != tclass" could not be optimized. fullCheck() does Class.isInstance anyway, so it seems functionally correct to use it in the "quick" checks.

        public final int get(T obj) {
            if (obj == null || obj.getClass() != tclass || cclass != null) fullCheck(obj);
            return U.getIntVolatile(obj, offset);
        }

        private void fullCheck(T obj) {
            if (!tclass.isInstance(obj))
                throw new ClassCastException();
            if (cclass != null)
                ensureProtectedAccess(obj);
        }

This should be checked after JDK-8140483 lands. C1 and C2 may disagree which pattern is faster.

I do plan to split the integration up into many pieces. This change should be fairly self-contained. We'll see.
10-11-2015
Thanks Martin! I wonder if we should commit this patch separately, though, to ease the (possible) backport to 8u.
09-11-2015
I'll make this part of jsr166 jdk9 integration Wave 2.
09-11-2015
Based on the research above, Doug had reworked the checks in A*FU (thanks!), and we hit our performance targets. We need to pick up the changes from jsr166 repo to OpenJDK, and we are done. See the performance data here: http://cr.openjdk.java.net/~shade/8140587/perf-jsr166.txt
08-11-2015
With that, I updated the poc.patch with more aggressive peeling, and also added a sample peeling for ARFU: http://cr.openjdk.java.net/~shade/8140587/poc.patch See the sample inline trees for this patch here: http://cr.openjdk.java.net/~shade/8140587/perf.txt
04-11-2015
Okay, the performance really depends on whether we have succeeded to inline AFU.get and all subsequent callees: because then we could trust the final instance fields that have the classes we type/access-check against. C2 uses profile hotness to figure the inlinees under the FreqInlineSize (325) size threshold. C1 does not have that luxury, and it has to depend on static MaxInlineSize (35) threshold. The "natural" way to cater for this is to peel methods until they meet the MaxInlineSize limit. But there is a C1-specific trouble: C1 inline policy also* scales the MaxInlineSize threshold with NestedInliningSizeRatio (0.9) for each subsequent (n+1)-depth callee. Which means we cannot even peel the methods under the static threshold, because it will eventually fail, once the method appears deeply enough in the call tree. Once we inline everything down the A*FU methods, C1 and C2 work nicely with Class.isInstance change.
04-11-2015
Hey, look, we can make C2 to perform uniformly well, and make C1 much faster at least on ::get(). http://cr.openjdk.java.net/~shade/8140587/perf.txt http://cr.openjdk.java.net/~shade/8140587/poc.patch I think the same pattern may be replicated to other A*FU classes and access methods.
02-11-2015
I agree with the goal of making the field updaters so high-performance that there is no reason to use Unsafe. It seems like a hard goal to me, given that Unsafe has a C-style pointer interface. j.u.c. once used field updaters more than it does today. We could perhaps go back some day. I leave achieving the necessary performance to you (and Paul). I agree if you can make C1 happy as well, that's good enough.
30-10-2015
[~martin], so what you are saying is that we should cater for lesser VMs here? I generally agree, but we have to draw the line somewhere. I draw the line at C1: if we run well enough with C1 (plus some C1 tuneups), then an improvement is sound. I think the largest trouble for C1 right now is fast-pathing isInstance for a direct class hit. Current code would inevitably do isInstance in fullCheck() if we miss the direct class check, so I don't think it matters if we tune up C1 to soak that fast-path into isInstance itself. If this is fixed, then there would be no excuse to substitute A*FU with Unsafe "for performance reasons" whatsoever :)
30-10-2015
The current code should work well with stupid VMs when it is uncommon to have subclasses of classes with atomic field updaters. Optimizing isInstance seems like a standard job for a VM, but it has to handle different kinds of class objects that are not possible here (interfaces), so it's not surprising if only C2 gets it right.
30-10-2015
The preliminary data show C2 welcomes Class.isInstance very much, while C1 frowns badly. Investigating...
30-10-2015