JDK-8361582 : AArch64: Some ConH values cannot be replicated with SVE
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 25
  • Priority: P3
  • Status: Open
  • Resolution: Unresolved
  • CPU: aarch64
  • Submitted: 2025-07-08
  • Updated: 2025-08-20
JDK 26: 26 (Unresolved)
Related Reports
Causes :  
Causes :  
Duplicate :  
Duplicate :  
Description
Seeing this reliably on a Graviton 3 instance with current mainline. Bisection points to JDK-8352635.

$ CONF=linux-aarch64-server-fastdebug make images test TEST=compiler/vectorization/TestFloat16VectorOperations.java

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/home/shipilev/shipilev-jdk/src/hotspot/cpu/aarch64/assembler_aarch64.hpp:3756), pid=6237, tid=6259
#  guarantee(false) failed: invalid immediate

Current CompileTask:
C2:1867  892 %  b  4       compiler.vectorization.TestFloat16VectorOperations::vectorDivConstantInputFloat16 @ 2 (40 bytes)

Stack: [0x0000ffff72c98000,0x0000ffff72e96000],  sp=0x0000ffff72e91390,  free space=2020k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x47fe84]  Assembler::sve_dup(FloatRegister, Assembler::SIMD_RegVariant, int)+0x124  (assembler_aarch64.hpp:3756)
V  [libjvm.so+0x1532570]  PhaseOutput::scratch_emit_size(Node const*)+0x2b0  (output.cpp:3387)
V  [libjvm.so+0x152b1d8]  PhaseOutput::shorten_branches(unsigned int*)+0x288  (output.cpp:540)
V  [libjvm.so+0x153ba70]  PhaseOutput::Output()+0xa24  (output.cpp:340)
V  [libjvm.so+0x9bd698]  Compile::Code_Gen()+0x518  (compile.cpp:3123)
V  [libjvm.so+0x9bfdec]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x17a4  (compile.cpp:892)
V  [libjvm.so+0x801eb8]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x174  (c2compiler.cpp:141)
V  [libjvm.so+0x9cd754]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x9a8  (compileBroker.cpp:2327)
V  [libjvm.so+0x9ce27c]  CompileBroker::compiler_thread_loop()+0x570  (compileBroker.cpp:1971)
V  [libjvm.so+0xed1c8c]  JavaThread::thread_main_inner()+0xec  (javaThread.cpp:773)
V  [libjvm.so+0x19d61bc]  Thread::call_run()+0xb0  (thread.cpp:243)

Comments
Yes, fixes pushed in JDK 25u will go into JDK 25.0.2 at this point.
20-08-2025

I believe 25.0.1 is already frozen. The current backports would go to 25.0.2. This is inconvenient, but not the end of the world. So, integrate the mainline patch, wait for it to accrue mainline testing, and then backport it sometime in early September.
19-08-2025

[~thartmann] [~chagedorn] Hi, I would like to push this to JDK 25u as well (as [~thartmann] suggested previously in the other duplicate bug I created for this issue). I can see that patches are already going into JDK 25u, with Oct 21st as the deadline for the earliest update, JDK 25.0.1. Could you please advise on the exact cutoff for pushing this patch? I could not find a hard deadline for patches to be accepted into JDK 25.0.1. Thanks!
19-08-2025

A pull request was submitted for review.
Branch: master
URL: https://git.openjdk.org/jdk/pull/26589
Date: 2025-08-01 09:31:40 +0000
01-08-2025

Hi [~bkilambi], as you already have a fix, it makes sense to assign the ticket to you. Thank you.
21-07-2025

Hi [~eastigeevich], I spoke to Tobias on my ticket for the same issue, and he is OK with backporting this fix to JDK 25u. Thanks!
21-07-2025

Hi [~eastigeevich], I was already looking into this and have a fix as well (bug opened here: https://bugs.openjdk.org/browse/JDK-8362594). Let me know if you'd like to continue looking into this; I can drop my ticket. Could you please target both JDK 25 and mainline for this fix?
21-07-2025

Evgeny graciously accepted this task :)
14-07-2025

Thanks for the background. I will target it to JDK 26 for now. But if you are able to fix it within the RDP 1 time frame and want to backport it, feel free to re-target.

ILW = Crash with guarantee due to invalid immediate, only seen on Graviton 3 but failing reliably there, disable compilation of affected method = HLM = P3
10-07-2025

I have not yet tried to write a direct test for it. But AFAICS, the problem is in the replicateHF_imm rule, which assumes that any immH operand is encodable; that does not look right according to the SVE spec. Therefore, I believe it is a problem in JDK-8355585, and JDK-8355235 just gives us a pathway to an inconvenient immediate. I'll try to build a local test...
09-07-2025
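
To illustrate how narrow the directly encodable set is, here is a minimal Java sketch that enumerates all 65536 half-precision bit patterns, applies the same masking as the rule (`imm = con & 0xffff`), and counts how many pass the `sve_dup` immediate check quoted elsewhere in this report. `EncodableConH` and its methods are hypothetical names; the predicate restates the quoted check for H elements and is not HotSpot code.

```java
// Hypothetical illustration, not HotSpot code.
final class EncodableConH {
    private EncodableConH() {}

    // Restates the immediate check from the quoted sve_dup for T == H.
    static boolean passesSveDupCheck(int imm) {
        if (imm >= -128 && imm <= 127) {
            return true; // unshifted signed 8-bit immediate
        }
        // shifted form: signed multiple of 256 in [-32768, 32512]
        return imm >= -32768 && imm <= 32512 && (imm & 0xff) == 0;
    }

    // Counts the 16-bit constants that pass the check after masking.
    static int countEncodable() {
        int count = 0;
        for (int con = Short.MIN_VALUE; con <= Short.MAX_VALUE; con++) {
            int imm = con & 0xffff; // same masking as in replicateHF_imm
            if (passesSveDupCheck(imm)) {
                count++;
            }
        }
        return count;
    }
}
```

Under these assumptions, `EncodableConH.countEncodable()` returns 255. No pattern with the sign bit set passes, because the masked value is never negative and anything at or above 32768 exceeds the 32512 upper bound.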

> it just becomes exposed by JDK-8355235
Do you mean JDK-8352635 instead as stated in the description?

> I think this gap is present in original JDK-8355585
Were you able to also reproduce it with JDK 25 somehow?
09-07-2025

slowdebug fails here:

Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x512184]  Assembler::sve_dup(FloatRegister, Assembler::SIMD_RegVariant, int)+0x114  (assembler_aarch64.hpp:3756)
V  [libjvm.so+0x4da5e4]  replicateHF_immNode::emit(C2_MacroAssembler*, PhaseRegAlloc*) const+0x174  (aarch64_vector.ad:4877)
V  [libjvm.so+0x13f500c]  PhaseOutput::scratch_emit_size(Node const*)+0x3ac  (output.cpp:3387)

That code calls with mode H:
https://github.com/openjdk/jdk/blob/1934bd8d2c02cdb1ba9caaef227ed073fb5e1a9d/src/hotspot/cpu/aarch64/aarch64_vector_ad.m4#L3098-L3113

    // Replicate a 16-bit half precision float value
    instruct replicateHF_imm(vReg dst, immH con) %{
      match(Set dst (Replicate con));
      format %{ "replicateHF_imm $dst, $con\t# replicate immediate half-precision float" %}
      ins_encode %{
        uint length_in_bytes = Matcher::vector_length_in_bytes(this);
        int imm = (int)($con$$constant) & 0xffff;
        if (VM_Version::use_neon_for_vector(length_in_bytes)) {
          __ mov($dst$$FloatRegister, get_arrangement(this), imm);
        } else {
          // length_in_bytes must be > 16 and SVE should be enabled
          assert(UseSVE > 0, "must be sve");
          __ sve_dup($dst$$FloatRegister, __ H, imm);
        }
      %}
      ins_pipe(pipe_slow);
    %}

Assert fails in imm:

#  guarantee(false) failed: invalid immediate: 25598

...which I think does not satisfy the ((imm8 & 0xff) == 0) condition:
https://github.com/openjdk/jdk/blob/1934bd8d2c02cdb1ba9caaef227ed073fb5e1a9d/src/hotspot/cpu/aarch64/assembler_aarch64.hpp#L3745-L3760

    // SVE broadcast signed immediate to vector elements (unpredicated)
    void sve_dup(FloatRegister Zd, SIMD_RegVariant T, int imm8) {
      starti;
      assert(T != Q, "invalid size");
      int sh = 0;
      if (imm8 <= 127 && imm8 >= -128) {
        sh = 0;
      } else if (T != B && imm8 <= 32512 && imm8 >= -32768 && (imm8 & 0xff) == 0) {
        sh = 1;
        imm8 = (imm8 >> 8);
      } else {
        guarantee(false, "invalid immediate");
      }
      f(0b00100101, 31, 24), f(T, 23, 22), f(0b11100011, 21, 14);
      f(sh, 13), sf(imm8, 12, 5), rf(Zd, 0);
    }
08-07-2025
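
To make the failing constant concrete, the sketch below decodes the 16-bit pattern 25598 by hand as an IEEE 754 binary16 value and extracts its low byte. It is an illustration only: `Fp16Bits` and its methods are hypothetical names, and the decoder is a plain-Java reading of the binary16 layout, not the code HotSpot uses.

```java
// Hypothetical illustration, not HotSpot code.
final class Fp16Bits {
    private Fp16Bits() {}

    // Decodes an IEEE 754 binary16 bit pattern (low 16 bits of 'bits') to a float.
    static float toFloat(int bits) {
        int sign = (bits >> 15) & 0x1;
        int exp = (bits >> 10) & 0x1f;
        int mant = bits & 0x3ff;
        float magnitude;
        if (exp == 0) {
            // Zero or subnormal: mant * 2^-24
            magnitude = Math.scalb((float) mant, -24);
        } else if (exp == 0x1f) {
            // Infinity or NaN
            magnitude = (mant == 0) ? Float.POSITIVE_INFINITY : Float.NaN;
        } else {
            // Normal: (1024 + mant) * 2^(exp - 15 - 10)
            magnitude = Math.scalb((float) (1024 + mant), exp - 25);
        }
        return sign == 0 ? magnitude : -magnitude;
    }

    // Low byte of the pattern; must be zero for the shifted DUP form.
    static int lowByte(int bits) {
        return bits & 0xff;
    }
}
```

With this decoder, `Fp16Bits.toFloat(25598)` evaluates to 1023.0f and `Fp16Bits.lowByte(25598)` is 0xFE. The pattern 0x63FE is therefore neither a signed 8-bit value nor a multiple of 256, which is why the guarantee fires.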

I think this gap is present in the original JDK-8355585; it just becomes exposed by JDK-8355235, which started producing more interesting (and now unencodable) ConH-s.
08-07-2025

`sve_dup` appears to match the SVE spec pretty directly:

"Unconditionally broadcast the signed integer immediate into each element of the destination vector. This instruction is unpredicated. The immediate operand is a signed value in the range -128 to +127, and for element widths of 16 bits or higher it may also be a signed multiple of 256 in the range -32768 to +32512 (excluding 0). The immediate is encoded in 8 bits with an optional left shift by 8. The preferred disassembly when the shift option is specified is "#<simm8>, LSL #8". However an assembler and disassembler may also allow use of the shifted 16-bit value unless the immediate is 0 and the shift amount is 8, which must be unambiguously described as "#0, LSL #8"."

https://developer.arm.com/documentation/ddi0596/2020-12/SVE-Instructions/DUP--immediate---Broadcast-signed-immediate-to-vector-elements--unpredicated--

So I think we should somehow disallow generating/matching code that uses 16-bit immediates that cannot be encoded because they are not multiples of 256.
08-07-2025
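
The encoding rule quoted above can be restated as a small predicate. The following is a minimal Java sketch that mirrors the range checks in `sve_dup` as quoted in this report; the class and method names (`SveDupImm`, `isEncodable`) and the `byteElements` parameter are hypothetical and not part of HotSpot.

```java
// Hypothetical helper, not HotSpot code: mirrors the immediate checks in
// Assembler::sve_dup as quoted in this report.
final class SveDupImm {
    private SveDupImm() {}

    // Returns true if imm passes the quoted DUP (immediate) checks.
    // byteElements corresponds to T == B, where the shifted form is unavailable.
    static boolean isEncodable(int imm, boolean byteElements) {
        // Unshifted form: signed 8-bit immediate.
        if (imm >= -128 && imm <= 127) {
            return true;
        }
        // Shifted form: signed 8-bit immediate with LSL #8, i.e. a multiple
        // of 256 in [-32768, 32512], for element widths of 16 bits or more.
        return !byteElements
                && imm >= -32768 && imm <= 32512
                && (imm & 0xff) == 0;
    }
}
```

Under this check, `SveDupImm.isEncodable(25598, false)` is false because the low byte is 0xFE, whereas `SveDupImm.isEncodable(25600, false)` is true. A matcher-side predicate of this shape is one possible way to keep unencodable constants away from the immediate form, but how the actual fix does it is not stated here.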