JDK-8307683 : Loop Predication should not hoist range checks with trap on success projection by negating their condition
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 9,11,17,19,20,21
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: x86_64
  • Submitted: 2023-05-05
  • Updated: 2024-01-08
  • Resolved: 2023-06-01
Fixed versions:
  • JDK 11: 11.0.23-oracle (Fixed)
  • JDK 17: 17.0.10-oracle (Fixed)
  • JDK 21: 21 b26 (Fixed)
Description
ADDITIONAL SYSTEM INFORMATION :
Ubuntu 22.04 / JDK 11.0.19 (but also 17.0.7)

A DESCRIPTION OF THE PROBLEM :
Since the Docker image we use to build our application (maven:3.9.1-eclipse-temurin-11-focal) has been based on JDK 11.0.19, one of our unit tests consistently crashes the JVM.
With the image based on JDK 11.0.18 we do not encounter the problem.

It crashes with the following message:
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (macroAssembler_x86.cpp:864), pid=51663, tid=51664
#  fatal error: DEBUG MESSAGE: duplicated predicate failed which is impossible
#
# JRE version: OpenJDK Runtime Environment Temurin-17.0.7+7 (17.0.7+7) (build 17.0.7+7)
# Java VM: OpenJDK 64-Bit Server VM Temurin-17.0.7+7 (17.0.7+7, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xafec21]  MacroAssembler::debug64(char*, long, long*)+0x41
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/xxxxxxxx/tmp/core.51663)
#
# An error report file with more information is saved as:
#  /tmp/hs_err_pid51663.log
Compiled method (c2)     139  288       4       java.util.GregorianCalendar::computeTime (976 bytes)
 total in heap  [0x00007f6100ed7f90,0x00007f6100ed8680] = 1776
 relocation     [0x00007f6100ed80f0,0x00007f6100ed8120] = 48
 main code      [0x00007f6100ed8120,0x00007f6100ed8460] = 832
 stub code      [0x00007f6100ed8460,0x00007f6100ed8478] = 24
 metadata       [0x00007f6100ed8478,0x00007f6100ed84e0] = 104
 scopes data    [0x00007f6100ed84e0,0x00007f6100ed85a0] = 192
 scopes pcs     [0x00007f6100ed85a0,0x00007f6100ed8620] = 128
 dependencies   [0x00007f6100ed8620,0x00007f6100ed8648] = 40
 handler table  [0x00007f6100ed8648,0x00007f6100ed8660] = 24
 nul chk table  [0x00007f6100ed8660,0x00007f6100ed8680] = 32
Compiled method (c2)     141  288       4       java.util.GregorianCalendar::computeTime (976 bytes)
 total in heap  [0x00007f6100ed7f90,0x00007f6100ed8680] = 1776
 relocation     [0x00007f6100ed80f0,0x00007f6100ed8120] = 48
 main code      [0x00007f6100ed8120,0x00007f6100ed8460] = 832
 stub code      [0x00007f6100ed8460,0x00007f6100ed8478] = 24
 metadata       [0x00007f6100ed8478,0x00007f6100ed84e0] = 104
 scopes data    [0x00007f6100ed84e0,0x00007f6100ed85a0] = 192
 scopes pcs     [0x00007f6100ed85a0,0x00007f6100ed8620] = 128
 dependencies   [0x00007f6100ed8620,0x00007f6100ed8648] = 40
 handler table  [0x00007f6100ed8648,0x00007f6100ed8660] = 24
 nul chk table  [0x00007f6100ed8660,0x00007f6100ed8680] = 32
#
# If you would like to submit a bug report, please visit:
#   https://github.com/adoptium/adoptium-support/issues
#
fish: Job 1, '/home/xxxxxxxx/.sdkman/candidat…' terminated by signal SIGABRT (Abort)


We managed to reproduce it with a simple use case: it happens when we create a Calendar, set lenient to false, force the HOUR_OF_DAY and MINUTE fields to 0, and then call getTime(), which triggers computeTime().

The failure occurs only after the method has been called several thousand times, which is why we think the JIT compiler is involved.

The problem does not happen when the -Xcomp flag is set.

We also tested JDK 17: the same problem occurs with 17.0.7, while 17.0.6 works fine.

REGRESSION : Last worked in version 11.0.18

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Use the test case source code below:

javac TestCalendarJit.java
java TestCalendarJit


EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
No crash after the loop of 100000 calls
ACTUAL -
2544
2545
2546
2547
2548
2549
2550
2551
2552
2553
2554
2555
2556
=============== DEBUG MESSAGE: duplicated predicate failed which is impossible ================

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fd3b80d97a9, pid=51869, tid=51870
#
# JRE version: OpenJDK Runtime Environment Temurin-11.0.19+7 (11.0.19+7) (build 11.0.19+7)
# Java VM: OpenJDK 64-Bit Server VM Temurin-11.0.19+7 (11.0.19+7, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# J 306 c2 java.util.GregorianCalendar.computeTime()V java.base@11.0.19 (970 bytes) @ 0x00007fd3b80d97a9 [0x00007fd3b80d95c0+0x00000000000001e9]
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/xxxxxxxx/tmp/core.51869)
#
# An error report file with more information is saved as:
# /home/xxxxxxxx/tmp/hs_err_pid51869.log
Compiled method (c2)     193  306       4       java.util.GregorianCalendar::computeTime (970 bytes)
 total in heap  [0x00007fd3b80d9410,0x00007fd3b80d9a90] = 1664
 relocation     [0x00007fd3b80d9588,0x00007fd3b80d95b8] = 48
 main code      [0x00007fd3b80d95c0,0x00007fd3b80d9880] = 704
 stub code      [0x00007fd3b80d9880,0x00007fd3b80d9898] = 24
 metadata       [0x00007fd3b80d9898,0x00007fd3b80d9900] = 104
 scopes data    [0x00007fd3b80d9900,0x00007fd3b80d99c0] = 192
 scopes pcs     [0x00007fd3b80d99c0,0x00007fd3b80d9a40] = 128
 dependencies   [0x00007fd3b80d9a40,0x00007fd3b80d9a58] = 24
 handler table  [0x00007fd3b80d9a58,0x00007fd3b80d9a70] = 24
 nul chk table  [0x00007fd3b80d9a70,0x00007fd3b80d9a90] = 32
Compiled method (c1)     194  304       3       TestCalendarJit::getDate (29 bytes)
 total in heap  [0x00007fd3b0c21a10,0x00007fd3b0c22680] = 3184
 relocation     [0x00007fd3b0c21b88,0x00007fd3b0c21c40] = 184
 main code      [0x00007fd3b0c21c40,0x00007fd3b0c223c0] = 1920
 stub code      [0x00007fd3b0c223c0,0x00007fd3b0c22450] = 144
 oops           [0x00007fd3b0c22450,0x00007fd3b0c22458] = 8
 metadata       [0x00007fd3b0c22458,0x00007fd3b0c224a0] = 72
 scopes data    [0x00007fd3b0c224a0,0x00007fd3b0c22558] = 184
 scopes pcs     [0x00007fd3b0c22558,0x00007fd3b0c22668] = 272
 dependencies   [0x00007fd3b0c22668,0x00007fd3b0c22670] = 8
 nul chk table  [0x00007fd3b0c22670,0x00007fd3b0c22680] = 16
Could not load hsdis-amd64.so; library not loadable; PrintAssembly is disabled
#
# If you would like to submit a bug report, please visit:
#   https://github.com/adoptium/adoptium-support/issues
#
fish: Job 1, '/home/xxxxxxxx/.sdkman/candidat…' terminated by signal SIGABRT (Abort)


---------- BEGIN SOURCE ----------

import java.util.Calendar;

public class TestCalendarJit {

    public static void main(String[] args) {
        for (int i = 0; i < 100000; i++) {
            System.err.println(i);
            getDate();
        }
    }

    private static void getDate() {
        Calendar c = Calendar.getInstance();
        c.setLenient(false);
        c.set(Calendar.HOUR_OF_DAY, 0);
        c.set(Calendar.MINUTE, 0);
        c.getTime();
    }
}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
the problem does not occur with the -Xcomp flag set

FREQUENCY : occasionally



Comments
Fix request [11u] I am backporting this for parity with 11.0.23-oracle. The change follows the JDK 17 backport to resolve the error. The test passes. SAP nightly testing passed.
13-12-2023

A pull request was submitted for review. URL: https://git.openjdk.org/jdk11u-dev/pull/2340 Date: 2023-12-06 05:57:03 +0000
06-12-2023

A pull request was submitted for review. URL: https://git.openjdk.org/jdk17u-dev/pull/1553 Date: 2023-07-06 07:11:22 +0000
06-07-2023

Fix request [17u] This fixes a regression in 17.0.7. To work around the regression, JDK-8297951 was backed out in 17.0.8. Risk: we should fix this; the backout of 8297951 only reduces the likelihood of the bug, so with the backout we have two issues open. I will redo the backport, too. I had to make larger adaptations to the change, but the core of the fix maps to 17 in an obvious way. Test #id0 crashes without the fix, #id1 passes. Both pass with the fix. SAP nightly testing passed.
06-07-2023

Changeset: dfd3da3f Author: Christian Hagedorn <chagedorn@openjdk.org> Date: 2023-06-01 08:04:45 +0000 URL: https://git.openjdk.org/jdk/commit/dfd3da3f52480f68f653beb1e720691f8232ace7
01-06-2023

The negation of the range check conditions introduced with JDK-7173584 turned out to be wrong; removing it is now proposed as the final and complete fix.
31-05-2023
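
For readers unfamiliar with the problem, the following is a simplified model of why negating a hoisted range check can be unsound. It is not C2 code: the class HoistModel and all of its method names are hypothetical, and the model assumes that a hoisted check only tests the first and last loop index and that length is non-negative. Under those assumptions, testing both endpoints proves "every index is in range", but the negated endpoint test does not prove "every index is out of range".

```java
public class HoistModel {
    // Unsigned "index < length" comparison, the shape of an array range check.
    static boolean inRange(int index, int length) {
        return Integer.compareUnsigned(index, length) < 0;
    }

    // Hoisted form of "every i in [first, last] is in range":
    // for first <= last and length >= 0, testing both endpoints is enough.
    static boolean hoistedInRange(int first, int last, int length) {
        return inRange(first, length) && inRange(last, length);
    }

    // Naively negated hoisted form (hypothetical): it claims that
    // "every i in [first, last] is out of range" from the endpoints alone.
    static boolean hoistedOutOfRangeNaive(int first, int last, int length) {
        return !inRange(first, length) && !inRange(last, length);
    }

    // Ground truth: evaluate the check in every iteration.
    static boolean allInRange(int first, int last, int length) {
        for (int i = first; i <= last; i++) {
            if (!inRange(i, length)) return false;
        }
        return true;
    }

    static boolean allOutOfRange(int first, int last, int length) {
        for (int i = first; i <= last; i++) {
            if (inRange(i, length)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Endpoint test agrees with the per-iteration check for the in-range case.
        System.out.println(hoistedInRange(0, 9, 10));          // true
        System.out.println(allInRange(0, 9, 10));              // true

        // Both endpoints (-1 and 10) are out of range, yet 0..9 are in range.
        System.out.println(hoistedOutOfRangeNaive(-1, 10, 10)); // true
        System.out.println(allOutOfRange(-1, 10, 10));          // false
    }
}
```

In this model the last two lines disagree, which is the kind of mismatch that would let a hoisted predicate pass while a check inside the loop still fails.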

Thanks for the explanation, Christian, and also for the quick turnaround on this!
25-05-2023

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/14156 Date: 2023-05-25 16:48:35 +0000
25-05-2023

This is great news; thanks Christian!
25-05-2023

Hi Mario, hi Frederic. It's true that JDK-8297951 inserted the Halt nodes which we see failing at runtime. However, before JDK-8297951 we could just have had a silent wrong execution (I have a test case for that as well). So it looks like JDK-8297951 revealed this issue and made it much more common. The fix is straightforward. I will create a PR and target it to 21.
25-05-2023

Hi Christian, thanks for looking into this. Regarding the pattern, I'm not sure it is very rare: we had a lot of reports, and there also seem to be multiple reports over at Adoptium, although they do all look very similar.
25-05-2023

While investigating this issue we noticed that a revert of JDK-8297951 prevents the crash.
25-05-2023

This is unrelated to JDK-8305428. I've had a closer look and it seems that the code added by JDK-4809552 to allow Loop Predication if we have positive values that aren't LoadRanges is only correct if we have an actual RangeCheckNode. But we also allow normal IfNodes which is wrong. This leads to this crash and wrong executions with a simple reproducer. But this pattern is very rare in practice.
09-05-2023

ILW = Crash with debug build and possible wrong execution with release build, rare, disable compilation of affected method or use -XX:-UseLoopPredicate = HLM = P3
09-05-2023
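
The two workarounds named in the ILW note above could be applied to the reporter's test case roughly as follows. This is a sketch only: it assumes the compiled TestCalendarJit class from the description is on the classpath, and it assumes that excluding java.util.GregorianCalendar::computeTime (the method shown in the crash logs) is sufficient for this particular reproducer.

```shell
# Workaround 1: turn off C2 loop predication for the whole JVM.
java -XX:-UseLoopPredicate TestCalendarJit

# Workaround 2: keep the affected method out of the JIT compilers
# (it then runs in the interpreter only).
java -XX:CompileCommand=exclude,java.util.GregorianCalendar::computeTime TestCalendarJit
```

Both options trade some performance for stability and only avoid the faulty compilation; they do not fix the underlying bug.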

Issue is reproduced. The crash is observed on JDK 17.0.7 and 11.0.19 but is not present in JDK 17.0.6, 11.0.18, and earlier versions. However, the crash can be avoided by using -Xcomp.
OS: Windows 10
JDK 11.0.18: Pass
JDK 11.0.19: Fail
JDK 17.0.6: Pass
JDK 17.0.7: Fail
JDK 20.0.1: Fail
JDK 21ea: Fail
ILW = Regression, reproducible on GA build, use -Xcomp = HLM = P3
Moving it to dev team for further analysis.
09-05-2023