JDK-8349364 : C1 sometimes fails to remove useless modulo operations
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 11,17,21,25
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • OS: generic
  • CPU: generic
  • Submitted: 2025-01-31
  • Updated: 2025-09-12
Version (Other): tbd, Unresolved
Description
ADDITIONAL SYSTEM INFORMATION :
# Java version
java 23.0.1 2024-10-15
java 21.0.5 2024-10-15 LTS
java 17.0.12 2024-07-16 LTS

# Operating system details
$ cat /etc/*release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.6 LTS"
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

A DESCRIPTION OF THE PROBLEM :
Missed dead code elimination in the C1/C2 compilers. The issue disappears
when some irrelevant code is added, but '-XX:TieredStopAtLevel=3'
remains slow. This could also be related to JIT profiling.

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
The following steps show how to reproduce the bug in an Ubuntu Linux
environment on Java 21.0.5, along with the corresponding results. (It can also
be reproduced on macOS-ARM64.)

```
Compile the source code:
javac Test1.java Test2.java

Running with if statement:
java -XX:TieredStopAtLevel=1 Test1  0.04s user 0.03s system 100% cpu 0.075 total
java -XX:TieredStopAtLevel=2 Test1  0.06s user 0.02s system 101% cpu 0.078 total
java -XX:TieredStopAtLevel=3 Test1  4.78s user 0.03s system 100% cpu 4.805 total
java -XX:TieredStopAtLevel=4 Test1  4.53s user 0.05s system 100% cpu 4.569 total
Running without if statement:
java -XX:TieredStopAtLevel=1 Test2  4.71s user 0.04s system 100% cpu 4.748 total
java -XX:TieredStopAtLevel=2 Test2  4.79s user 0.03s system 100% cpu 4.820 total
java -XX:TieredStopAtLevel=3 Test2  4.80s user 0.05s system 100% cpu 4.848 total
java -XX:TieredStopAtLevel=4 Test2  4.69s user 0.04s system 100% cpu 4.723 total
```

ACTUAL -
The `test` method always returns `a2`, which is never modified inside
the method body, and the locals `x` and `y` are never used. The method
can therefore be optimized to directly return `a2`.
C1 with an extra irrelevant if statement seems to be able to do this
optimization, while all the other configurations fail. It is
interesting to see how that if statement affects the compiler
behavior. Besides, `-XX:TieredStopAtLevel=3` is slow regardless of the
presence of the if statement. To the best of my knowledge, this option
is just C1 with full profiling, so this issue could also be related to
JIT profiling.
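
To make the expected optimization concrete, here is a minimal sketch that puts the reported method body next to a hand-reduced version. The class name `DeadModuloSketch` and the method names `original` and `optimized` are hypothetical and not part of the report; `optimized` only illustrates what a compiler could reduce the method to, not what HotSpot actually emits.

```java
class DeadModuloSketch {
    // Same body as Test2.test in this report: x and y are computed but never used.
    public static double original(final double a1, final double b1, final double a2, final double b2) {
        final double x = Double.longBitsToDouble(
                Double.doubleToRawLongBits(a1) & (8411500938693120034L << -173714142));
        final double y = (x + b1) / (((x * (b1 % x)) % (x - a1)) - (b1 - x));
        return a2;
    }

    // Hypothetical hand-reduced form: floating-point arithmetic has no side
    // effects and cannot throw, so dropping x and y does not change the result.
    public static double optimized(final double a1, final double b1, final double a2, final double b2) {
        return a2;
    }

    public static void main(String[] args) {
        // Same arguments as the reproducer's main method.
        double expected = optimized('a', 100, 10L, 100.0d);
        for (int i = 0; i < 1000; ++i) {
            if (original('a', 100, 10L, 100.0d) != expected) {
                throw new AssertionError("original and optimized disagree");
            }
        }
        System.out.println("original and optimized agree: " + expected);
    }
}
```

The sketch only checks that the two forms return the same value. It says nothing about timing, which depends on the compilation tier as shown in the measurements above.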

---------- BEGIN SOURCE ----------
# Test1.java

```java
public class Test1 {
    public static double test(final double a1, final double b1, final double a2, final double b2) {
        final double x = Double.longBitsToDouble(
                Double.doubleToRawLongBits((double) a1) & (long) (8411500938693120034L << -173714142));
        final double y = (double) ((x + b1) / (((x * (b1 % x)) % (x - a1)) - (b1 - x)));
        // Adding this if statement makes C1 fast
        if (Double.isNaN(a1)) {
            System.out.println("a1 is NaN");
        }
        return (double) a2;
    }
    public static void main(String[] args) {
        int N = 10000000;
        double[] res = new double[N];
        for (int i = 0; i < N; ++i) {
            res[i] = test((double) 'a', 100, 10L, 100.0d);
        }
    }
}
```

# Test2.java

```java
public class Test2 {
    public static double test(final double a1, final double b1, final double a2, final double b2) {
        final double x = Double.longBitsToDouble(
                Double.doubleToRawLongBits((double) a1) & (long) (8411500938693120034L << -173714142));
        final double y = (double) ((x + b1) / (((x * (b1 % x)) % (x - a1)) - (b1 - x)));
        return (double) a2;
    }
    public static void main(String[] args) {
        int N = 10000000;
        double[] res = new double[N];
        for (int i = 0; i < N; ++i) {
            res[i] = test((double) 'a', 100, 10L, 100.0d);
        }
    }
}

```
---------- END SOURCE ----------
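
A note on the mask constant in the reproducer: for a `long` left operand, Java uses only the six lowest-order bits of the shift distance, so the negative distance `-173714142` is well defined. The sketch below shows the effective distance; the class name `ShiftDistanceSketch` and the helper `effectiveLongShift` are hypothetical names introduced for illustration.

```java
class ShiftDistanceSketch {
    // For long shifts, only the low six bits of the distance are used.
    public static int effectiveLongShift(int distance) {
        return distance & 0x3f;
    }

    public static void main(String[] args) {
        long value = 8411500938693120034L;
        int distance = -173714142;

        int effective = effectiveLongShift(distance);
        long viaNegative = value << distance;   // expression as written in the reproducer
        long viaMasked = value << effective;    // same shift with the masked distance

        System.out.println("effective shift distance: " + effective);
        System.out.println("same mask value: " + (viaNegative == viaMasked));
    }
}
```

In other words, the mask applied to the raw bits of `a1` is the constant shifted left by 34, which is a compile-time constant and not itself a source of the slowdown.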

CUSTOMER SUBMITTED WORKAROUND :
No workaround for HotSpot. The issue does not reproduce in GraalVM.

FREQUENCY : always


Comments
I verified that performance with C2 improves dramatically with JDK-8345766.

Before:
real 0m13.932s
user 0m14.238s
sys 0m0.084s

After:
real 0m0.491s
user 0m0.790s
sys 0m0.071s
06-02-2025

For C2, isn't this JDK-8345766 again? And for C1, I think level 3 is expected to be slow because of full profiling.
06-02-2025

This never worked, so I'm converting it to an enhancement.
06-02-2025

While looking at this, I noticed that we can still do better in C2. I filed JDK-8349523.

For C1, I attached a simplified test (Test2.java) that demonstrates that C1 would usually remove useless double modulo operations, but in this case it only works if there is a subsequent call to Double.isNaN(d1). This is a benign issue as it only affects C1 code that will be replaced by C2, but we should investigate.

java -XX:TieredStopAtLevel=1 -XX:CompileCommand=compileonly,Test2::test -Xbatch Test2.java

With "isNaN":
real 0m1.352s
user 0m1.305s
sys 0m0.051s

Without:
real 0m2.652s
user 0m2.601s
sys 0m0.058s
06-02-2025