JDK-8295066 : Folding of loads is broken in C2 after JDK-8242115
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 11,17,20
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2022-10-10
  • Updated: 2024-01-24
  • Resolved: 2022-10-26
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 11 JDK 17 JDK 20
11.0.19-oracleFixed 17.0.7-oracleFixed 20 b22Fixed
Related Reports
Duplicate :  
Relates :  
Description
When merging JDK-8242115 into Valhalla, we noticed massive failures with our IR verification tests. It turned out that (at least) folding of loads is broken. For example, with the following test (attached):

    static class MyClass {
        int x = 42;
    }

    public static int test() {
        MyClass[] array = new MyClass[1];
        array[0] = new MyClass();
        return array[0].x;
    }

Before JDK-8242115, the load from the array is constant folded to 42 and the MyClass array and object allocations are removed. This does not work anymore after the fix, leading to a significant performance regression in microbenchmarks.

To reproduce, simply run:
java -XX:+AlwaysIncrementalInline -XX:CompileCommand=quiet -XX:CompileCommand=compileonly,Test::test -Xbatch -XX:CompileCommand=print,Test::test -XX:-TieredCompilation Test.java

and check _new_array_Java and _new_instance_Java calls in compiled code.

There might be more issues but we can check that by applying a prototype fix and re-running Valhalla tests. For now, we omitted JDK-8242115 when merging master into Valhalla.
Comments
Fix request [17u] I backport this for parity with 17.0.7-oracle. Required follow-up to 8242115. Clean backport. Test passes. SAP nightly testing passed.
31-12-2022

A pull request was submitted for review. URL: https://git.openjdk.org/jdk17u-dev/pull/1001 Date: 2022-12-29 13:41:19 +0000
29-12-2022

Changeset: 58a7141a Author: Igor Veresov <iveresov@openjdk.org> Date: 2022-10-26 20:45:26 +0000 URL: https://git.openjdk.org/jdk/commit/58a7141a0dea5d1b4bfe6d56a95d860c854b3461
26-10-2022

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/10861 Date: 2022-10-25 19:50:10 +0000
25-10-2022

Okay, my bad, I screwed up the merge. Your latest fix passes all Valhalla tests.
25-10-2022

I'm now seeing new IR verification failures with the patch. You can reproduce them by applying both JDK-8242115 and your patch to the 'lworld' branch of https://github.com/openjdk/valhalla and then run test/hotspot/jtreg/compiler/valhalla/inlinetypes/TestArrays.java.
24-10-2022

So, I tweaked the scalarization logic. Now both of you tests pass. Please see if it solves all the problems. The second patch (8295066-2.patch) includes the changes from the first one.
22-10-2022

Right, it's probably not worth the trouble. But if I understand correctly, either: - JDK-8242115 is not complete yet, because LCM/GCM *could* still move the pre-val load above a safepoint, in which case we would need something similar to very late barrier expansion in ZGC, or - LCM/GCM are not an issue and we could introduce a "simple" macro node for oop loads/stores that expand the barrier code only after optimizations.
21-10-2022

Ugh, this would be very hard to do. Even if it's doable in this case, in general with loops it would mean moving the load across safepoints, which is exactly what we don't want to do. I guess the only way to approach this is to tweak the scalarization logic to ignore the barriers.
17-10-2022

I verified that this fixes most of the issues we obverse in Valhalla but there's at least one issue left. I attached a corresponding test (Test2.java): java -XX:CompileCommand=quiet -XX:CompileCommand=compileonly,Test2::test -XX:-TieredCompilation -XX:CompileCommand=print,Test2::test Test2.java | grep new_instance The problem is that your fix bails out from LoadNode::Ideal and therefore the "// Split instance field load through Phi" code is not executed. I'm worried that your fix disables more optimizations that are currently not caught by any tests. For example, all the other code that's part of LoadNode::Ideal and LoadNode::Identity.
17-10-2022

The solution is actually to allow it to constant fold in LoadNode::Identity(). I attached the suggested fix.
15-10-2022

Thanks for the update, Igor. I did not look into it in detail, so it might well be that other things are broken as well. Let me know if you have a prototype fix, so I can run it through Valhalla testing.
11-10-2022

I have a hack to make it work again, but I need to tidy it up and hide a bunch stuff behind the GC API.
11-10-2022

So, actually it broke EA. Because for folding to happen (at least in this case EA needs to happen).
10-10-2022

Igor, could you please have a look?
10-10-2022

ILW = Performance regression due to broken folding of loads, loads from arrays, no workaround = MMH = P3
10-10-2022