JDK-8211698 : Crash in C2 compiled code during execution of double array heavy processing code
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 11,12
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: x86_64
  • Submitted: 2018-10-02
  • Updated: 2022-09-21
  • Resolved: 2018-12-18
Versions
  • JDK 11: 11.0.2 (Fixed)
  • JDK 12: 12 b25 (Fixed)
  • JDK 13: 13 (Fixed)
Description
ADDITIONAL SYSTEM INFORMATION :
Windows 10
OpenJDK 11

A DESCRIPTION OF THE PROBLEM :
We have a method that iterates many times over a double array of 101 elements, and it is called repeatedly with similar inputs. Since upgrading to Java 11, the method crashes once it has been C2 compiled.

I have tried running with -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly enabled but the output was mangled together

I do have a mini dump file if that would help

ERROR MESSAGES/STACK TRACES THAT OCCUR :
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_ILLEGAL_INSTRUCTION (0xc000001d) at pc=0x000002223d06a0ba, pid=868, tid=11716
#
# JRE version: OpenJDK Runtime Environment (11.0+28) (build 11+28)
# Java VM: OpenJDK 64-Bit Server VM (11+28, mixed mode, tiered, compressed oops, g1 gc, windows-amd64)
# Problematic frame:
# J 14134 c2 au.com.tradefloor.option.valuation.CoxRossRubinstein.calculate(Lau/com/tradefloor/options/OptionType;Lau/com/tradefloor/options/ExerciseStyle;IDDDDD[DD)D (946 bytes) @ 0x000002223d06a0ba [0x000002223d068140+0x0000000000001f7a]
#
# Core dump will be written. Default location: C:\dev\apache-tomcat-8.5.31\bin\hs_err_pid868.mdmp
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

REGRESSION : Last worked in version 10.0.2


---------- BEGIN SOURCE ----------
The code can be shared privately if needed
---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
There are no known workarounds yet

FREQUENCY : always



Comments
Fix Request update: The fix needs to be promoted to jdk11.0.3 because Oracle took it into 11.0.3 as well. The patch from jdk11u-dev applies cleanly and was already tested there. I removed jdk11u-fix-yes label for reconsideration. We should agree on some openjdk11u-critical-request label as Andrew Hughes suggested...
13-03-2019

Fix Request Backporting this fix improves C2 reliability. Patch applies cleanly to 11u, passes tier1 tests. New regression test passes. Alas, it also seems to pass without the product patch.
28-02-2019

URL: http://hg.openjdk.java.net/jdk/jdk12/rev/b04860fd2e2c User: rraghavan Date: 2018-12-18 13:49:15 +0000
18-12-2018

Following test case:

import java.util.Arrays;

public class RangeCheckEliminationScaleNotOne {
    public static void main(String[] args) {
        {
            int[] array = new int[199];
            boolean[] flags = new boolean[100];
            Arrays.fill(flags, true);
            flags[0] = false;
            flags[1] = false;
            for (int i = 0; i < 20_000; i++) {
                test1(100, array, 0, flags);
            }
            boolean ex = false;
            try {
                test1(100, array, -5, flags);
            } catch (ArrayIndexOutOfBoundsException aie) {
                ex = true;
            }
            if (!ex) {
                throw new RuntimeException("no AIOOB exception");
            }
        }
    }

    private static int test1(int stop, int[] array, int offset, boolean[] flags) {
        if (array == null) {}
        int res = 0;
        for (int i = 0; i < stop; i++) {
            if (flags[i]) {
                res += array[2 * i + offset];
            }
        }
        return res;
    }
}

ran with:

java -XX:-BackgroundCompilation -XX:-TieredCompilation -XX:-UseOnStackReplacement -XX:CompileOnly=RangeCheckEliminationScaleNotOne::test1 -XX:LoopMaxUnroll=1 -XX:-UseLoopPredicate RangeCheckEliminationScaleNotOne

segfaults with the change suggested by Tobias because of a oob array access. So I think the root cause is a bug in range check elimination. I will file a bug for that and work on a fix. In the meantime, I think this bug should be fixed so we go back to the behaviour pre-8193130. Assigning it back to Rahul for that.
12-12-2018
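As a rough illustration of why the test case above expects an ArrayIndexOutOfBoundsException for offset = -5, the sketch below checks whether `scale * i + offset` stays inside the array for every iteration that actually touches it. The class and method names are hypothetical, and the code models only the Java-level index arithmetic, not C2's range check elimination.

```java
// Illustrative sketch only; names are hypothetical and not part of the JDK test.
final class ScaledIndexRangeSketch {
    // True if scale * i + offset is a valid index for every i in [first, stop).
    static boolean allInBounds(int scale, int offset, int first, int stop, int length) {
        for (int i = first; i < stop; i++) {
            long idx = (long) scale * i + offset; // long arithmetic avoids int overflow
            if (idx < 0 || idx >= length) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // In the test case flags[0] and flags[1] are false, so the first access is at i = 2.
        int first = 2;
        int stop = 100;
        int length = 199;
        System.out.println("offset  0 in bounds: " + allInBounds(2, 0, first, stop, length));
        System.out.println("offset -5 in bounds: " + allInBounds(2, -5, first, stop, length));
    }
}
```

With offset = -5 the first executed access is at index 2 * 2 - 5 = -1, so a correct compilation has to keep a check that throws. A compiled loop that drops this check reads out of bounds instead, which is what the comment above reports.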

[~thartmann], [~roland]
Thank you Tobias for the analysis details.
Request to review some minor notes on possible required changes (not directly related to this 8211698 error, but may be overlooked during 8193130 fix). If changes required and correct may be included as part of 8211698 fix itself else please ignore.

1. [src/hotspot/share/opto/node.cpp] is the following change required?

   if (dead->Opcode() == Op_Opaque4) {
-    igvn->C->remove_range_check_cast(dead);
+    igvn->C->remove_opaque4_node(dead);
   }

2. [src/hotspot/share/opto/loopnode.hpp] Similar to change done -

-  virtual LoopNode* skip_strip_mined(int expect_opaq = 1) { return this; }
+  virtual LoopNode* skip_strip_mined(int expect_skeleton = 1) { return this; }

Should we change for class CountedLoopNode also?

-  virtual LoopNode* skip_strip_mined(int expect_opaq = 1);
+  virtual LoopNode* skip_strip_mined(int expect_skeleton = 1);

3. [src/hotspot/share/opto/compile.cpp] For the code refactoring done, the comment with below change seems wrong (no local 'max_idx_expr' ..)

+  case Op_CmpUL: {
+    if (!Matcher::has_match_rule(Op_CmpUL)) {
+      // We don't support unsigned long comparisons. Set 'max_idx_expr'
+      // to max_julong if < 0 to make the signed comparison fail.
+      ConINode* sign_pos = new ConINode(TypeInt::make(BitsPerLong - 1));
..........
10-12-2018

Hi [~roland], Please check this task.
10-12-2018

## -XX:+TraceLoopPredicate output (with also -XX:-UseLoopPredicate !) -
rc_predicate init * 2 + offset<u range
rc_predicate init * -2 + offset<u range
10-12-2018

Very nice analysis, Rahul! To summarize your findings, the problem was introduced by the fix for JDK-8193130 in JDK 11 b07. The test can be slightly simplified (see attached Test8211698_simple.java).

Looking at the output of -XX:+TraceLoopOpts -XX:+TraceLoopPredicate -XX:+TraceRangeLimitCheck shows that we are unrolling the inner loop and add range check predicates for the iarr1[-istep] and iarr1[istep] accesses:

Loop: N0/N0 has_sfpt
Loop: N235/N233 limit_check counted [5,0),-1 (-1 iters) has_sfpt
Loop: N249/N201 limit_check counted [0,int),+1 (-1 iters) has_sfpt
PreMainPost Loop: N249/N201 limit_check counted [0,int),+1 (4 iters) has_sfpt rce
RangeCheck Loop: N249/N201 counted [int,int),+1 (4 iters) main has_sfpt rce
RC bool node:
21 ConI === 0 [[ 170 148 40 253 255 294 296 ]] #int:888
138 SubI === _ 137 93 [[ 325 170 139 ]] !jvms: Test8211698::test @ bci:25
170 CmpU === _ 138 21 [[ 171 ]] !jvms: Test8211698::test @ bci:49
171 Bool === _ 170 [[ 172 ]] [lt] !jvms: Test8211698::test @ bci:49
rc_predicate init * 2 + offset<u range
RC bool node:
21 ConI === 0 [[ 170 148 40 253 255 294 296 341 352 ]] #int:888
147 SubI === _ 93 137 [[ 326 148 ]] !jvms: Test8211698::test @ bci:39
148 CmpU === _ 147 21 [[ 149 ]] !jvms: Test8211698::test @ bci:40
149 Bool === _ 148 [[ 150 ]] [lt] !jvms: Test8211698::test @ bci:40
rc_predicate init * -2 + offset<u range
Loop: N0/N0 has_sfpt
Loop: N235/N233 limit_check counted [5,0),-1 (-1 iters) has_sfpt
Loop: N305/N313 limit_check counted [0,int),+1 (4 iters) pre has_sfpt
Loop: N249/N201 counted [int,int),+1 (4 iters) main rc has_sfpt
Loop: N264/N272 counted [int,int),+1 (4 iters) post has_sfpt

Both range checks are conditional because they are not always executed but are dependent on the 'istep < 0' check. However, both predicates are executed at runtime and we hit halt because "init" is negative.

Looking at PhaseIdealLoop::add_range_check_predicate(), I've noticed that the Opaque4Node is not correctly wired. It should be:

diff -r 0c637249d934 src/hotspot/share/opto/loopTransform.cpp
--- a/src/hotspot/share/opto/loopTransform.cpp Mon Dec 10 09:37:18 2018 +0100
+++ b/src/hotspot/share/opto/loopTransform.cpp Mon Dec 10 11:40:16 2018 +0100
@@ -2289,9 +2289,9 @@
   register_new_node(opaque_bol, predicate_proj);
   IfNode* new_iff = NULL;
   if (overflow) {
-    new_iff = new IfNode(predicate_proj, bol, PROB_MAX, COUNT_UNKNOWN);
+    new_iff = new IfNode(predicate_proj, opaque_bol, PROB_MAX, COUNT_UNKNOWN);
   } else {
-    new_iff = new RangeCheckNode(predicate_proj, bol, PROB_MAX, COUNT_UNKNOWN);
+    new_iff = new RangeCheckNode(predicate_proj, opaque_bol, PROB_MAX, COUNT_UNKNOWN);
   }
   register_control(new_iff, loop->_parent, predicate_proj);
   Node* iffalse = new IfFalseNode(new_iff);

With this change, I cannot reproduce the problem. It also fixes JDK-8215044 which seems to be a duplicate of this issue. However, I'm not sure if above fix is sufficient/correct. [~roland], what do you think?
10-12-2018
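The observation in the comment above, that both range check predicates run unconditionally although each array access is guarded by 'istep < 0', can be modelled in plain Java. This is a minimal sketch with hypothetical names. It mimics only the arithmetic of the two "index <u range" predicates for the loop bounds of Test8211698, not the C2 graph or the Opaque4Node wiring.

```java
// Illustrative model only; names are hypothetical and this is not HotSpot code.
final class ConditionalRangeCheckSketch {
    static final int RANGE = 888; // length of iarr1 in Test8211698

    // Unsigned "index <u range" test, as in the rc_predicate trace lines.
    static boolean inRange(int index, int range) {
        return Integer.compareUnsigned(index, range) < 0;
    }

    // What the Java source requires: only the branch actually taken is checked.
    static boolean guardedChecksPass(int istep) {
        return istep < 0 ? inRange(-istep, RANGE) : inRange(istep, RANGE);
    }

    // Model of evaluating both predicates regardless of the 'istep < 0' guard.
    static boolean unguardedChecksPass(int istep) {
        return inRange(-istep, RANGE) && inRange(istep, RANGE);
    }

    public static void main(String[] args) {
        int guardedFailures = 0;
        int unguardedFailures = 0;
        for (int i = 5; i > 0; i--) {
            for (int j = 0; j <= i - 1; j++) {
                int istep = 2 * j - i;
                if (!guardedChecksPass(istep)) guardedFailures++;
                if (!unguardedChecksPass(istep)) unguardedFailures++;
            }
        }
        System.out.println("guarded failures:   " + guardedFailures);
        System.out.println("unguarded failures: " + unguardedFailures);
    }
}
```

In this model the guarded form never fails, because |istep| is at most 5. The unguarded form fails whenever istep is non-zero, since either istep or -istep is then negative and compares as a huge unsigned value.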

Issue with PhaseIdealLoop::add_range_check_predicate() or in remove range check casts !?
10-12-2018

1. Found following attached failing Test8211698.java test.
--------------
public class Test8211698 {
    public static void main(String[] args) {
        Test8211698 issue = new Test8211698();
        for (int i = 0; i < 10000; i++) {
            issue.test(new int[999]);
        }
    }

    public void test(int[] iaarg) {
        int[] iarr1 = new int[888];
        for (int i = 5; i > 0; i--) {
            for (int j = 0; j <= i - 1; j++) {
                int istep = 2 * j - i;
                int iadj = 0;
                if (istep < 0) {
                    iadj = iarr1[-istep] + iaarg[i];
                } else {
                    iadj = iarr1[istep] + iaarg[i];
                }
                if (iarr1[j] < iadj) {
                    iarr1[j] = iadj;
                }
            }
        }
    }
}
--------------

2. Tried old versions
--JDK 11 b06 - Could not find any failure.
--JDK 11 b07 - SIGILL crash with explicit -XX:-UseSubwordForMaxVector. Found JDK-8193130 (Bad graph when unrolled loop bounds conflicts with range checks) fix changeset initiated the failure.
--JDK 11 b22 onwards - SIGILL crash irrespective of UseSubwordForMaxVector setting
--With Latest JDK 12 sources - same SIGILL crash
--Latest JDK 12 sources without JDK-8193130 fix - NO failure

3. Extracts from attached hs_err_pid24383.log with latest build.
$java -XX:CompileOnly=Test8211698.test Test8211698
--------------------
# SIGILL (0x4) at pc=0x00007f6554ad14ea, pid=24383, tid=24387
....
# Problematic frame:
# J 30 c2 Test8211698.test([I)V (91 bytes) @ 0x00007f6554ad14ea [0x00007f6554ad1160+0x000000000000038a]
....
Registers:
..RSI=0x0000000000000005, RDI=0xffffffffffffffff.., R10=0x0000000000000004,..
.....
;; B20: # B54 B21 <- B19 Freq: 4.5027
0x00007f6554ad130f: movslq %r8d,%r10
0x00007f6554ad1312: shl %r10
0x00007f6554ad1315: mov %r10,%rdi
0x00007f6554ad1318: sub %rsi,%rdi
0x00007f6554ad131b: cmp $0x378,%rdi
0x00007f6554ad1322: jae 0x00007f6554ad14ea /** seems wrong unsigned cmp/jmp caused the issue **/
......
;; B54: # N615 <- B20 Freq: 4.56249e-06
0x00007f6554ad14ea: ud2
--------------------

4. Test worked okay when tried following temporary source change. (this is not correct fix)
[src/hotspot/share/opto/loopPredicate.cpp]
-----------
BoolNode* PhaseIdealLoop::rc_predicate(IdealLoopTree *loop, Node* ctrl, .....
  if (overflow) {
    ...
    // Integer expressions may overflow, do long comparison
    range = new ConvI2LNode(range);
    register_new_node(range, ctrl);
-   cmp = new CmpULNode(max_idx_expr, range);
+   cmp = new CmpLNode(max_idx_expr, range);
  } else {
    cmp = new CmpUNode(max_idx_expr, range);
  }

[8193130-changeset: http://hg.openjdk.java.net/jdk/jdk/rev/bde392011cd8]
Understood the above source code location was touched for JDK-8193130 fix. Also related additions in Compile::final_graph_reshaping_impl() [compile.cpp]
(But just reverting only these selected changes in PhaseIdealLoop::rc_predicate and Compile::final_graph_reshaping_impl will not fix the issue)

- Work in progress for correct fix -
08-12-2018
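The register values and the cmp/jae pair quoted in the comment above can be replayed in plain Java. This is only a sketch with hypothetical names, assuming that RSI holds i and that R10 holds 2 * j at the point of the crash, which is how the quoted instructions read. It shows why an unsigned comparison sends a negative istep to the ud2 block while a signed comparison does not.

```java
// Illustrative sketch only; names are hypothetical and this is not HotSpot code.
final class UnsignedRangeCheckSketch {
    // Mirrors "cmp $0x378,%rdi; jae <ud2>": the check fails when index >=u range.
    static boolean failsUnsignedCheck(long index, long range) {
        return Long.compareUnsigned(index, range) >= 0;
    }

    // istep as computed in Test8211698.test: 2 * j - i, widened to long.
    static long istep(int i, int j) {
        return 2L * j - i;
    }

    public static void main(String[] args) {
        long range = 0x378;     // 888, the length of iarr1
        long idx = istep(5, 2); // assumed mapping: RSI = 5 (i), R10 = 4 (2 * j)
        System.out.println("istep = " + idx + " (0x" + Long.toHexString(idx) + ")");
        System.out.println("unsigned check fails: " + failsUnsignedCheck(idx, range));
        System.out.println("signed idx < range:   " + (idx < range));
    }
}
```

The signed result is consistent with the temporary CmpULNode to CmpLNode change making the test pass. It also suggests why that change is not a correct fix: a signed comparison accepts negative indices that a range check must reject.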

This 8211698 reported failure in JDK11b22 is triggered by JDK-8194740 fix changeset. (JDK-8194740 - UseSubwordForMaxVector causes performance regression) But found the crash started with JDK11b07 with explicit -XX:-UseSubwordForMaxVector option. Need to check if related to JDK-8210389, JDK-8211759.
08-10-2018

initial ILW = HLH = P2
05-10-2018

I can't see any obvious candidates in the hotspot fixes for b22.
05-10-2018

This is a regression, issue started appearing from 11 ea b22 onwards

10.0.2 GA - Pass
11 ea b21 - Pass
11 ea b22 - Fail //regression introduced here
11 GA - Fail
12 ea b14 - Fail

Output from 12 ea b14 fastdebug build

-sh-4.2$ /scratch/fairoz/JAVA/jdk12/jdk-12-ea+14_fastdebug/fastdebug/bin/java Issue9057482
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGILL (0x4) at pc=0x00007f6d2c4f8bb6, pid=89377, tid=89378
#
# JRE version: Java(TM) SE Runtime Environment (12.0+14) (fastdebug build 12-ea+14)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 12-ea+14, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# J 127 c2 Issue9057482.calculate(LOptionType;LExerciseStyle;IDDDDD[DD)D (946 bytes) @ 0x00007f6d2c4f8bb6 [0x00007f6d2c4f7800+0x00000000000013b6]
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e" (or dumping to /scratch/fairoz/JI/8211698/core.89377)
#
# An error report file with more information is saved as:
# /scratch/fairoz/JI/8211698/hs_err_pid89377.log

Stacktrace
==
Stack: [0x00007f6d4485c000,0x00007f6d4495d000], sp=0x00007f6d4495b5d0, free space=1021k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
J 127 c2 Issue9057482.calculate(LOptionType;LExerciseStyle;IDDDDD[DD)D (946 bytes) @ 0x00007f6d2c4f8bb6 [0x00007f6d2c4f7800+0x00000000000013b6]
J 100 c1 Issue9057482.calculate(LOptionType;LExerciseStyle;ILjava/math/BigDecimal;Ljava/math/BigDecimal;Ljava/math/BigDecimal;Ljava/math/BigDecimal;Ljava/math/BigDecimal;)Ljava/math/BigDecimal; (107 bytes) @ 0x00007f6d250052dc [0x00007f6d25004fe0+0x00000000000002fc]
J 99 c1 Issue9057482.calculate(LOptionType;LExerciseStyle;ILjava/math/BigDecimal;Ljava/math/BigDecimal;Ljava/math/BigDecimal;Ljava/math/BigDecimal;Ljava/math/BigDecimal;Ljava/math/BigDecimal;)Ljava/math/BigDecimal; (18 bytes) @ 0x00007f6d250048fc [0x00007f6d25004860+0x000000000000009c]
j Issue9057482.main([Ljava/lang/String;)V+74
v ~StubRoutines::call_stub
V [libjvm.so+0xe9e4aa] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x85a
V [libjvm.so+0xf7e7c6] jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .isra.95] [clone .constprop.241]+0x236
V [libjvm.so+0xfa383e] jni_CallStaticVoidMethod+0x1fe
C [libjli.so+0x4afa] JavaMain+0xb9a
05-10-2018

Crash occurred in C2 compiled code; requested a reproducible test case from the submitter.
From the description of the issue:
---------- BEGIN SOURCE ----------
The code can be shared privately if needed
04-10-2018