JDK-8229495 : SIGILL in C2 generated OSR compilation
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 11,13,14,15
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2019-08-13
  • Updated: 2024-10-17
  • Resolved: 2020-07-15
JDK 11          JDK 15          JDK 16
11.0.10 Fixed   15 b32 Fixed    16 Fixed
Description
(provisional synopsis, please change as you see fit)

Found with fuzzing. The testcase is attached. It fails within the first second on roughly every fifth run. There are plenty of hs_err files in the attached bundle.

$ ~/trunks/jdk-jdk/build/linux-x86_64-server-fastdebug/images/jdk/bin/java Test
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGILL (0x4) at pc=0x00007f7693cde65e, pid=12339, tid=12340
#
# JRE version: OpenJDK Runtime Environment (14.0) (fastdebug build 14-internal+0-adhoc.shade.jdk-jdk)
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 14-internal+0-adhoc.shade.jdk-jdk, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# J 63% c2 Test.vMeth(IF)V (252 bytes) @ 0x00007f7693cde65e [0x00007f7693cde020+0x000000000000063e]
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P" (or dumping to /home/shade/trunks/JavaFuzzer/tests/03934/core.12339)
#
# An error report file with more information is saved as:
# /home/shade/trunks/JavaFuzzer/tests/03934/hs_err_pid12339.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Current thread is 12340
Dumping core ...

The disassembly shows that the faulting instruction is the ud2 immediately following the call:

 4c 8b 54 24 30                   mov    r10,QWORD PTR [rsp+0x30]
 4c 89 54 24 20                   mov    QWORD PTR [rsp+0x20],r10
 89 5c 24 14                      mov    DWORD PTR [rsp+0x14],ebx
 89 5c 24 28                      mov    DWORD PTR [rsp+0x28],ebx
 e8 c4 27 46 f8                   call   0xfffffffff84627e2
 0f 0b                            ud2     ; <---- SIGILL here
 0f 0b                            ud2    
 be 8d ff ff ff                   mov    esi,0xffffff8d
 44 89 6c 24 08                   mov    DWORD PTR [rsp+0x8],r13d
 89 5c 24 0c                      mov    DWORD PTR [rsp+0xc],ebx
 44 89 74 24 14                   mov    DWORD PTR [rsp+0x14],r14d
 c5 fa 10 4c 24 20                vmovss xmm1,DWORD PTR [rsp+0x20]

...so it must be returning incorrectly on some path.
Comments
11u Fix Request: Backporting this patch fixes a bug that can cause incorrect, crashing code to be generated. The patch does not apply cleanly to 11u and requires adjustments. 11u RFR: https://mail.openjdk.java.net/pipermail/jdk-updates-dev/2020-November/004205.html
23-11-2020

URL: https://hg.openjdk.java.net/jdk/jdk15/rev/c973b5ec934d
User: roland
Date: 2020-07-15 11:09:51 +0000
15-07-2020

<RFR> - http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-July/038884.html
07-07-2020

Very little progress since the office lock-down, except for testing (11 & 14 seemingly alright, 15 still in progress).

To describe the immediate error/issue, let us use the following example. We essentially have the following source loop:

 for (int i = 10; 0 < i; i--) {
   v[i] = n;
   n += b;
 }

To simplify, let's assume 'b' is zero. (Also assume "scale" is one.)

Main-loop progression (with parts left out):

 for (int i = In; 0 < i; i += -2) {
   v[i] = n;
   v[i-1] = n;
 }

The unrolled main-loop with range-checks:

 for (int i = In; 0 < i; i += -4) {
   rc(i + 0, v.len);
   rc(i + -3, v.len); // This check is related to the "level" of unroll.
   v[i] = n;
   v[i-1] = n;
   v[i-2] = n;
   v[i-3] = n;
 }

When RCE runs on main-loop, forming templates and hoisted RCs, we get:

 @T(In + 0, v.len)
 rc(In + 0, v.len);
 rc(In + 0 + -3, v.len);
 @T(In + -3, v.len)
 rc(In + -3, v.len);
 rc(In + -3 + -3, v.len); // Wrong, no error (since In[-1..9] => LH[-7..3])
 for (int i = In; 0 < i; i += -4) {
   v[i] = n;
   v[i-1] = n;
   v[i-2] = n;
   v[i-3] = n;
 }

Unroll will then remove old RCs and insert new RCs derived from the templates, computing a new "max reach" (-7) based on the current stride.

 @T(In + 0, v.len)
 rc(In + 0, v.len);
 rc(In + 0 + -7, v.len);
 @T(In + -3, v.len)
 rc(In + -3, v.len);
 rc(In + -3 + -7, v.len); // Wrong, error since now In[-1..9] => LH[-11..-1]
 for (int i = In; 0 < i; i += -8) {
   v[i] = n;
   v[i-1] = n;
   v[i-2] = n;
   ...
   v[i-7] = n;
 }

[1] The immediate error is caused by the template/skeleton range-checks introduced during RCE and the way these are transformed during unroll.

[2] The presence of range-checks in the (unrolled) main-loop is due to the inability to establish the dominator relation to the original/pre-loop (and the associated range-checks).

Addressing [2] with a deeper dominator search hides the problem in the skeleton & RCE code [1] for the original reproducer but does not provide for a general solution (and of course, the erroneous range-check code is still there).

Addressing [1] with a new/different tautology seems promising. The solution to JDK-8240335 also alters behaviour in pre/main/post loop scenarios. The effect is not fully investigated/understood with respect to the current solution to the (original) issue.
29-04-2020
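
The interval arithmetic in the comment above can be replayed with a small sketch. This is not HotSpot code: the class RangeCheckReach, the Interval type and the hoistedIndex helper are hypothetical names introduced only for illustration. It assumes that "In[-1..9] => LH[a..b]" means "when In lies in [-1..9], the index tested by the second hoisted check lies in [a..b]", and that a check whose index range is entirely negative can never pass, whatever v.len is.

```java
public class RangeCheckReach {
    /** Closed integer interval [lo, hi] (hypothetical helper type). */
    static final class Interval {
        final int lo;
        final int hi;

        Interval(int lo, int hi) {
            this.lo = lo;
            this.hi = hi;
        }

        /** Shift both bounds by a constant. */
        Interval plus(int c) {
            return new Interval(lo + c, hi + c);
        }

        /** True if every value is negative, so an array range check on it always fails. */
        boolean alwaysNegative() {
            return hi < 0;
        }

        @Override
        public String toString() {
            return "[" + lo + ".." + hi + "]";
        }
    }

    /** Index tested by rc(In + offset + maxReach, v.len), as an interval over In. */
    static Interval hoistedIndex(Interval in, int offset, int maxReach) {
        return in.plus(offset + maxReach);
    }

    public static void main(String[] args) {
        Interval in = new Interval(-1, 9);           // In[-1..9], taken from the comment

        // After RCE, stride -4: template offset -3, max reach -3.
        Interval afterRce = hoistedIndex(in, -3, -3);
        // After the next unroll, stride -8: same template offset, max reach -7.
        Interval afterUnroll = hoistedIndex(in, -3, -7);

        System.out.println(afterRce + " always fails: " + afterRce.alwaysNegative());
        // prints "[-7..3] always fails: false"
        System.out.println(afterUnroll + " always fails: " + afterUnroll.alwaysNegative());
        // prints "[-11..-1] always fails: true"
    }
}
```

Under these assumptions the first derived check is already too strict but can still pass for some values of In, while the check produced after the second unroll can never pass. That reading is consistent with the "Wrong, no error" and "Wrong, error" annotations in the comment.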

Deferral Request: The solution is not ready, and its effect on the problem addressed in the original change-set has not been verified.
06-02-2020

The issue reported here seems to have started with, or been triggered by, the JDK 13 b20 build, which contains the fix changeset for JDK-8216137.
27-01-2020

Verified problem in OpenJDK 11.0.6.
-----8<-----
# SIGILL (0x4) at pc=0x00007ff4d086b28e, pid=9811, tid=9812
#
# JRE version: OpenJDK Runtime Environment (11.0.6) (fastdebug build 11.0.6-internal+0-2019-11-13-1123352.phedlin...)
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 11.0.6-internal+0-2019-11-13-1123352.phedlin..., mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# J 101% c2 Test.vMeth(IF)V (252 bytes) @ 0x00007ff4d086b28e [0x00007ff4d086ac40+0x000000000000064e]
#
13-11-2019

So JDK 11 is affected as well (JDK-8216135 has been backported)? Please link the issue and update the affects version of this bug accordingly.
08-11-2019

The problem was introduced by the solution in/to JDK-8216135.
08-11-2019

(sighs) Yes, I cannot reproduce this failure on most recent jdk-updates/jdk11u-dev either. So it must be relatively new.
19-08-2019

[~shade], [~phedlin], please note that the SIGILL crash for the reported testcase is reproducible from build jdk-13+20 onward. (I could not get the failure in repeated test runs with the previous jdk-13+19 build and some selected earlier builds.) The issue is not reproducible with the testcase if only the 8216137 fix changeset is reverted from jdk-13+20 (and again there is no crash with the latest sources if only the 8216137-related changes are reverted). So maybe this JDK-8229495 is related to JDK-8229499. For now, assigning this one to [~phedlin] as well. Please unassign if it is unrelated or if I missed something. Thanks.
17-08-2019

Rahul, are you sure about that? There are plenty of fuzzed tests failing with node budget asserts, they are tracked in JDK-8229499.
16-08-2019

JDK-8225653 will help with debugging such issues in the future.
14-08-2019