JDK-8276064 : CheckCastPP with raw oop input floats below a safepoint
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 17,18,19,20
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • CPU: x86_64
  • Submitted: 2021-10-27
  • Updated: 2023-01-09
  • Resolved: 2022-11-17
JDK 17    17.0.7-oracle    Fixed
JDK 20    20 b25           Fixed
Description
The following test failed in the JDK18 CI:

jdk/incubator/vector/Vector512ConversionTests.java

Here's a snippet from the log file:

 1541  CallStaticJavaDirect  ===  1548  1874  1881  15  0  10671  9054  9056  9061  9053  8757  9055  0  8759  9057  0  0  9040  8823  8872  0  1312  10154  1319  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  9056  0  1697  0  0  0  0  0  0  0  0  1884  1547  0  0  0  0  10263  1546  1885  1885  0  0  0  0  0  0  0  0  0  1886  1545  1544  1543  0  0  0  10668  10668  10667  9068  10485  [[ 1542  1540  1887  1893  6726 ]] Static  java.util.Objects::requireNonNull # java/lang/Object * ( java/lang/Object * ) ByteVector::compareTemplate @ bci:1 (line 1748) Byte512Vector::compare @ bci:5 (line 351) Byte512Vector::compare @ bci:3 (line 41) AbstractShuffle::checkIndexes @ bci:24 (line 127) ByteVector::rearrangeTemplate @ bci:1 (line 2102) Byte512Vector::rearrange @ bci:7 (line 412) Byte512Vector::rearrange @ bci:2 (line 41) ByteVector::sliceTemplate @ bci:55 (line 2018) Byte512Vector::slice @ bci:2 (line 384) Byte512Vector::slice @ bci:2 (line 41) AbstractVector::convertShapeTemplate @ bci:234 (line 371) Byte512Vector::convertShape @ bci:4 (line 248) AbstractVector::castShape @ bci:31 (line 287) AbstractVectorConversionTest::conversion_kernel @ bci:374 (line 449) !jvms: AbstractVectorConversionTest::copyConversionArray @ bci:32 (line 341) AbstractVectorConversionTest::conversion_kernel @ bci:232 (line 438)
# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc:  SuppressErrorAt=/buildOopMap.cpp:360
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/opt/mach5/mesos/work_dir/slaves/ff806ead-2cac-495d-9cbc-62116f99bf14-S13789/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/ae9b8a50-11b3-4a62-a0eb-42987b013aa3/runs/28902ab3-8bff-4d06-9390-2e68644510f4/workspace/open/src/hotspot/share/opto/buildOopMap.cpp:360), pid=21626, tid=21654
#  assert(false) failed: there should be a oop in OopMap instead of a live raw oop at safepoint
#
# JRE version: Java(TM) SE Runtime Environment (18.0+21) (fastdebug build 18-ea+21-1331)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 18-ea+21-1331, mixed mode, sharing, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x6ade32]  OopFlow::build_oop_map(Node*, int, PhaseRegAlloc*, int*)+0x832
#
# Core dump will be written. Default location: Core dumps may be processed with "/opt/core.sh %p" (or dumping to /opt/mach5/mesos/work_dir/slaves/ff806ead-2cac-495d-9cbc-62116f99bf14-S13709/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/c9e82747-30a9-4a0c-9a72-f2f5a94976d9/runs/b9f1c0fc-fcac-4687-870e-f54d4e6c2394/testoutput/test-support/jtreg_open_test_jdk_jdk_vector/scratch/0/core.21626)
#
# An error report file with more information is saved as:
# /opt/mach5/mesos/work_dir/slaves/ff806ead-2cac-495d-9cbc-62116f99bf14-S13709/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/c9e82747-30a9-4a0c-9a72-f2f5a94976d9/runs/b9f1c0fc-fcac-4687-870e-f54d4e6c2394/testoutput/test-support/jtreg_open_test_jdk_jdk_vector/scratch/0/hs_err_pid21626.log
#
# Compiler replay data is saved as:
# /opt/mach5/mesos/work_dir/slaves/ff806ead-2cac-495d-9cbc-62116f99bf14-S13709/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/c9e82747-30a9-4a0c-9a72-f2f5a94976d9/runs/b9f1c0fc-fcac-4687-870e-f54d4e6c2394/testoutput/test-support/jtreg_open_test_jdk_jdk_vector/scratch/0/replay_pid21626.log
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#
----------System.err:(1/55)----------
WARNING: Using incubator modules: jdk.incubator.vector
----------rerun:(41/6340)*----------

Here's the crashing thread's stack:

---------------  T H R E A D  ---------------

Current thread (0x00007f607c1fb3c0):  JavaThread "C2 CompilerThread2" daemon [_thread_in_native, id=21654, stack(0x00007f609c6c1000,0x00007f609c7c2000)]


Current CompileTask:
C2:   4258  732             AbstractVectorConversionTest::conversion_kernel (436 bytes)

Stack: [0x00007f609c6c1000,0x00007f609c7c2000],  sp=0x00007f609c7bcb20,  free space=1006k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x6ade32]  OopFlow::build_oop_map(Node*, int, PhaseRegAlloc*, int*)+0x832
V  [libjvm.so+0x6ae819]  OopFlow::compute_reach(PhaseRegAlloc*, int, Dict*)+0x339
V  [libjvm.so+0x6b0a52]  PhaseOutput::BuildOopMaps()+0x1a82
V  [libjvm.so+0x159e64f]  PhaseOutput::Output()+0xd2f
V  [libjvm.so+0x9f8dc7]  Compile::Code_Gen()+0x427
V  [libjvm.so+0xa05148]  Compile::Compile(ciEnv*, ciMethod*, int, bool, bool, bool, bool, bool, DirectiveSet*)+0x1668
V  [libjvm.so+0x81f946]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x646
V  [libjvm.so+0xa154b9]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0xd09
V  [libjvm.so+0xa16158]  CompileBroker::compiler_thread_loop()+0x518
V  [libjvm.so+0x188e25c]  JavaThread::thread_main_inner()+0x27c
V  [libjvm.so+0x1894810]  Thread::call_run()+0x100
V  [libjvm.so+0x1574004]  thread_native_entry(Thread*)+0x104
Comments
Fix request [17u] I am backporting this for parity with 17.0.7-oracle. Typical risk of a C2 change. Small change. Clean backport. The test passes, but it also passes without the fix. SAP nightly testing passed.
05-01-2023

A pull request was submitted for review. URL: https://git.openjdk.org/jdk17u-dev/pull/1028 Date: 2023-01-04 09:27:10 +0000
04-01-2023

Changeset: cd9c688b Author: Tobias Hartmann <thartmann@openjdk.org> Date: 2022-11-17 05:58:38 +0000 URL: https://git.openjdk.org/jdk/commit/cd9c688bfce36e4b2d37dd68dd8031f197b9eddc
17-11-2022

I got this failure when running the jdk/incubator/vector/ tests with the flags -XX:+IgnoreUnrecognizedVMOptions -XX:-MonomorphicArrayCheck -XX:-UncommonNullCast on AVX512 machines. The test jdk/incubator/vector/Vector512ConversionTests.java failed in 1 of 3 runs. I attached the hs_err and replay files.
16-11-2022

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/10932 Date: 2022-11-01 11:55:04 +0000
01-11-2022

I can reproduce this reliably now with:
jtreg -jdk:... -vmoptions:"-XX:CompileCommand=compileonly,ShortMaxVectorLoadStoreTests::loadStoreMaskArray -Xcomp -XX:-TieredCompilation -XX:+StressMethodHandleLinkerInlining" test/jdk/jdk/incubator/vector/ShortMaxVectorLoadStoreTests.java
The problem seems to be similar to JDK-8271600 in that a CheckCastPP that should closely follow the Initialize is moved out of a loop and, as a result, the raw pointer crosses a safepoint. In this case, the root cause is loop unswitching, namely PhaseIdealLoop::clone_loop_handle_data_uses.
26-10-2022
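
As a rough illustration of the loop shape discussed in the 26-10-2022 comment, the sketch below shows an allocation inside a loop that also contains a loop-invariant branch, which makes the loop a candidate for C2 loop unswitching. It is not a reproducer for this bug, and the class name UnswitchShape and method name kernel are hypothetical. In C2's IR, an allocation like the one here produces a raw pointer that is turned into an oop by a CheckCastPP following the Initialize node; the bug concerns that CheckCastPP ending up below a safepoint.

```java
public class UnswitchShape {
    // Hypothetical kernel, for illustration only (assumption: not taken from the failing tests).
    // 'flag' does not change inside the loop, so the branch on it is loop-invariant
    // and C2 may unswitch the loop, cloning the body once per branch outcome.
    static int kernel(int[] src, boolean flag) {
        int sum = 0;
        for (int i = 0; i < src.length; i++) {
            // Allocation inside the loop body.
            int[] tmp = new int[2];
            tmp[0] = src[i];
            if (flag) {
                tmp[1] = 1;
            } else {
                tmp[1] = 2;
            }
            sum += tmp[0] * tmp[1];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        System.out.println(kernel(data, true));   // prints 6
        System.out.println(kernel(data, false));  // prints 12
    }
}
```

Whether C2 actually unswitches this loop depends on the build, the flags, and the profile; the sketch only shows the source-level ingredients named in the comment.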

This showed up again. Re-opening.
17-10-2022

This last reproduced in October 2021, and I was not able to reproduce it in 1000 runs with the latest JDK 18u and 19. I suspect that the root cause might be JDK-8252372 and that the issue has been fixed by one of the many follow-up fixes. Assigning to me and closing as Cannot Reproduce for now. Please re-open if it shows up again.
18-05-2022

Deferring to JDK 19 as this is a P3 and RDP 2 is starting today.
20-01-2022

Hi [~jbhateja], do you plan to get a fix in for this in JDK 18? As it is a P3, it needs to be fixed before RDP 2 starts this Thursday. Otherwise, we have to defer it to JDK 19.
17-01-2022

[~sviswanathan], [~jbhateja], could someone from Intel pick this up? We have a hard time reproducing in our infra because it only triggers in about 4/500 runs on an "Intel_R__Xeon_R__Gold_6354_CPU___3.00GHz". Thanks!
22-11-2021

[~roland] No, unfortunately I'm not able to reproduce the crash, even with a replay file.
28-10-2021

[~dlong] is there a counted loop with a long iv in this method? That seems unlikely. I tried to reproduce it but I don't have an avx512 system readily available. Do you know if it reproduces with a replay file? Maybe I can force UseAVX=3 then.
28-10-2021

ILW = Assert during C2 compilation (regression), single Vector API (incubator) test, no workaround but disable compilation of affected method = HLM = P3
27-10-2021

[~roland], could this be a regression caused by JDK-8259609?
27-10-2021

So far, the test has been failing with the following:
-XX:UseAVX=3
CPU: Intel_R__Xeon_R__Gold_6354_CPU___3.00GHz
CPU features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant tsc arch perfmon rep good nopl xtopology cpuid tsc known freq pni pclmulqdq vmx ssse3 fma cx16 pcid sse4 1 sse4 2 x2apic movbe popcnt tsc deadline timer aes xsave avx f16c rdrand hypervisor lahf lm abm 3dnowprefetch cpuid fault invpcid single ssbd ibrs ibpb stibp ibrs enhanced tpr shadow vnmi flexpriority ept vpid ept ad fsgsbase tsc adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves nt good wbnoinvd arat avx512vbmi umip pku ospke avx512 vbmi2 gfni vaes vpclmulqdq avx512 vnni avx512 bitalg avx512 vpopcntdq la57 rdpid md clear arch capabilities
27-10-2021