JDK-8275427 : RunThese24H.java failed with EXCEPTION_ACCESS_VIOLATION in java.util.ArrayList.add
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 18
  • Priority: P2
  • Status: Closed
  • Resolution: Duplicate
  • OS: windows
  • CPU: x86_64
  • Submitted: 2021-10-18
  • Updated: 2021-12-10
  • Resolved: 2021-12-10
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 18
18Resolved
Related Reports
Duplicate :  
Description
The following test failed in the JDK18 CI:

applications/runthese/RunThese24H.java

Here's a snippet from the log file:

[74693.990s][info ][class,unload] unloading class javasoft.sqe.jck.lib.javax.jce.CryptoUtil 0x0000000805775000
[74693.991s][info ][class,unload] unloading class javasoft.sqe.javatest.lib.MultiTest 0x0000000802b83000
[74693.991s][info ][class,unload] unloading class javasoft.sqe.javatest.Test 0x000000080322bc00
[74694.853s][info ][gc          ] GC(2683) Pause Remark 722M->722M(2192M) 954.812ms
[74696.006s][info ][gc          ] GC(2683) Pause Cleanup 866M->866M(2192M) 0.372ms
[74696.185s][info ][gc          ] GC(2683) Concurrent Mark Cycle 4356.182ms
[74715.271s][info ][gc          ] GC(2684) Pause Young (Prepare Mixed) (G1 Evacuation Pause) 1554M->593M(2192M) 95.247ms
[74717.433s][info ][gc          ] GC(2685) Pause Young (Mixed) (G1 Evacuation Pause) 641M->557M(2192M) 118.170ms
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x000001fd8001c81c, pid=35076, tid=40276
#
# JRE version: Java(TM) SE Runtime Environment (18.0+20) (fastdebug build 18-ea+20-1194)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 18-ea+20-1194, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, windows-amd64)
# Problematic frame:
# j  java.util.ArrayList.add(Ljava/lang/Object;[Ljava/lang/Object;I)V+0 java.base@18-ea
#
# Core dump will be written. Default location: T:\\testoutput\\test-support\\jtreg_closed_test_hotspot_jtreg_applications_runthese_RunThese24H_java\\scratch\\0\\hs_err_pid35076.mdmp
#
# JFR recording file will be written. Location: T:\\testoutput\\test-support\\jtreg_closed_test_hotspot_jtreg_applications_runthese_RunThese24H_java\\scratch\\0\\hs_err_pid35076.jfr
#
Unsupported internal testing APIs have been used.

# An error report file with more information is saved as:
# T:\\testoutput\\test-support\\jtreg_closed_test_hotspot_jtreg_applications_runthese_RunThese24H_java\\scratch\\0\\hs_err_pid35076.log
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#
Checking for JDK Version : 1.1
Creating StreamObject from spec
Creating StreamObject from Object
----------System.err:(1016/79602)*----------


Here's the crashing thread's stack:

---------------  T H R E A D  ---------------

Current thread (0x000001fe1dfa8cd0):  JavaThread "javasoft.sqe.tests.api.java.util.Base64.EncodingDecodingMimeCustom_3 " daemon [_thread_in_Java, id=40276, stack(0x00000089f6c00000,0x00000089f6d00000)]

Stack: [0x00000089f6c00000,0x00000089f6d00000],  sp=0x00000089f6cfd970,  free space=1014k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  0x000001fd8001c81c


siginfo: EXCEPTION_ACCESS_VIOLATION (0xc0000005), reading address 0x000000085656572e


Here's another snippet from the hs_err_pid:

RCX=0x00000000000000b1 is an unknown value
RDX=
[error occurred during error reporting (printing register info), id 0xe0000000, Internal Error (t:\workspace\open\src\hotspot\share\gc/g1/g1BlockOffsetTable.inline.hpp:113)]

Registers:
RAX=0x0000000856565656, RBX=0x00000008000345e0, RCX=0x00000000000000b1, RDX=0x00000002545b12d0
RSP=0x00000089f6cfd970, RBP=0x00000089f6cfd9d0, RSI=0x0000000000000001, RDI=0x0000000000000028
R8 =0x0000000000000000, R9 =0x00000002545a7088, R10=0x0000000800000000, R11=0x00000002545a7070
R12=0x0000000000000000, R13=0x0000000800724eae, R14=0x00000089f6cfd9f8, R15=0x000001fe1dfa8cd0
RIP=0x000001fd8001c81c, EFLAGS=0x0000000000010206


The info on the actual crash itself is very limited. However, the
test's log file shows a GC event just before the crash:

[74717.433s][info ][gc          ] GC(2685) Pause Young (Mixed) (G1 Evacuation Pause) 641M->557M(2192M) 118.170ms
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x000001fd8001c81c, pid=35076, tid=40276

and the hs_err_pid shows a secondary error in G1 code:

[error occurred during error reporting (printing register info), id 0xe0000000, Internal Error (t:\workspace\open\src\hotspot\share\gc/g1/g1BlockOffsetTable.inline.hpp:113)]

So I'm starting this failure in hotspot/gc for initial triage.
Comments
I've attached the hs_err_pid from the jdk-18+20-1194-tier8 sighting: test-support_jtreg_closed_test_hotspot_jtreg_applications_runthese_RunThese24H_java_applications_runthese_RunThese24H_hs_err_pid35076.log
09-12-2021

Could you please attach the hs_err file to this?
09-12-2021

ILW = heap corruption with base64 intrinsic, AVX-512 only?, disable intrinsic? = HML = P2
09-12-2021

This crash is just a different manifestation of JDK-8273108. The header of the object is being overwritten by the previous byte[]. In this case both the markword and the _metadata field are corrupted and we crash while trying to dereference the decoded class pointer: 0:705> dp 0x00000002545b12a0 00000002`545b12a0 72313056`4f56555a 5470464d`5a4e6a56 00000002`545b12b0 474e3256`61466a56 45705652`53466d54 00000002`545b12c0 565a3151`34646c56 39414452`4e56554d 00000002`545b12d0 56565656`56565656 56565656`56565656 <== markword / klass overwritten 00000002`545b12e0 4002a024`4002a024 4002a024`4002a024 00000002`545b12f0 4002a024`4002a024 4002a024`4002a024 If we look at the heap we can see that the previous contiguous object is a byte array that spans up to 0x2545b12d0: 0:705> dp 0x2545b1170 00000002`545b1170 00000000`00000001 00000150`00002070 00000002`545b1180 48563264`77306d56 70684757`54685855 00000002`545b1190 77594656`6f316d55 7a706c52`564e465a 00000002`545b11a0 745a3156`4f5a6b57 50703161`586c4865 0x2545b1180 + 0x150 = 0x2545b12d0 So seems the base64 decoder stub generated extra bytes that corrupted the next object's header. The JavaThread that crashes is javasoft.sqe.tests.api.java.util.Base64.EncodingDecodingMimeCustom_3 so we are indeed at the point in RunThese where we execute these base64 encoding/decoding tests.
08-12-2021

The linked-to bug has a pattern 0x78787878 and this one has a pattern 0x56565656. Interesting?
07-12-2021

Since that broken object lies in eden, it is very unlikely that GC has anything to do with the corruption of the Klass - GC does not touch objects in Eden for any reason I can think of. We actually actively try to exclude doing anything with object in eden.
02-11-2021

Crash happens 5 seconds after the mentioned GC. There are three intervening safepoint cleanup tasks between that GC and the crash: Event: 74717.310 Executing VM operation: G1CollectForAllocation Event: 74717.715 Executing VM operation: G1CollectForAllocation done Event: 74719.715 Executing VM operation: Cleanup Event: 74719.876 Executing VM operation: Cleanup done Event: 74720.881 Executing VM operation: Cleanup Event: 74720.996 Executing VM operation: Cleanup done Event: 74721.997 Executing VM operation: Cleanup Event: 74722.192 Executing VM operation: Cleanup done The secondary crash seems to happen when optimistically trying to decode the register/and stack slot value containing 00000002545b12d0. This address lies within the Eden: | 84|0x0000000254000000, 0x0000000254fffff8, 0x0000000255000000| 99%| E| |TAMS 0x0000000254000000, 0x0000000254000000| Complete The code crashes in this ArrayList Java method: private void add(E e, Object[] elementData, int s) { if (s == elementData.length) elementData = grow(); elementData[s] = e; size = s + 1; } This is the disassembly of the code in the hs_err file: 0x000001fd8001c7d4: mov (%rsp),%rax ; Reading e (also an object array) 0x000001fd8001c7d8: mov 0x8(%rsp),%ecx ; Reading s 0x000001fd8001c7dc: mov 0x10(%rsp),%rdx ; Reading elementData (broken array) 0x000001fd8001c7e1: movslq %ecx,%rcx ; 0x000001fd8001c7e4: cmp 0xc(%rdx),%ecx ; comparing array length 0x000001fd8001c7e7: mov %ecx,%ebx 0x000001fd8001c7e9: jb 0x1fd8001c7f3 ; jump forward a few instructions 0x000001fd8001c7eb: mov %rdx,%rdx ; nop 0x000001fd8001c7ee: jmp 0x1fd800113c0 ; jump to grow() ? 0x000001fd8001c7f3: test %rax,%rax ; Null check 0x000001fd8001c7f6: je 0x1fd8001cb32 0x000001fd8001c7fc: mov 0x8(%rax),%ebx ; Read klass of e 0x000001fd8001c7ff: movabs $0x800000000,%r10 ; Read Compressed Class Space base 0x000001fd8001c809: add %r10,%rbx ; Decode class of e 0x000001fd8001c80c: mov 0x8(%rdx),%eax ; Read klass of elementData 0x000001fd8001c80f: movabs $0x800000000,%r10 ; Read Compressed Class Space base 0x000001fd8001c819: add %r10,%rax ; Decode klass of elementData 0x000001fd8001c81c: mov 0xd8(%rax),%rax ; Crash when reading field in the klass 0x000001fd8001c823: mov -0x28(%rbp),%rcx 0x000001fd8001c827: test %rcx,%rcx This is the decoded klass pointer - RAX=0x0000000856565656: which is 0x800000000 (Compressed Class Space base) | 0x56565656 so the compressed class pointer of the object is 0x56565656. That doesn't look good. I wonder where that pattern comes from.
18-10-2021