JDK-8282312 : Minor corrections to evbroadcasti32x4 intrinsic on x86
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 11,17,18,19
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • CPU: x86
  • Submitted: 2022-02-23
  • Updated: 2022-04-04
  • Resolved: 2022-03-08
  • JDK 11: 11.0.16-oracle (Fixed)
  • JDK 17: 17.0.4-oracle (Fixed)
  • JDK 18: 18.0.2 (Fixed)
  • JDK 19: 19 b13 (Fixed)
Description
The current evbroadcasti32x4 method in assembler_x86.cpp asserts AVX512DQ support and sets the address tuple type (Op/En) to T2. According to the Intel SDM, this form of vbroadcasti32x4 requires only AVX512F, so the check should change to VM_Version::supports_evex(), and the tuple type should be T4.
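To make the feature-check part of the report concrete, the sketch below tabulates the CPUID feature that the Intel SDM lists for the ZMM-destination forms of the vbroadcasti* family. The enum names and the can_emit helper are hypothetical and exist only for this illustration; they are not HotSpot's VM_Version API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature bits for this sketch (not HotSpot's VM_Version flags).
enum Feature : std::uint32_t {
  AVX512F  = 1u << 0,
  AVX512DQ = 1u << 1
};

// ZMM-destination integer broadcast forms discussed in this report.
enum class Form { I32x2, I32x4, I32x8, I64x2, I64x4 };

// Feature flag listed in the Intel SDM for each form (512-bit destination).
inline std::uint32_t required_feature(Form f) {
  switch (f) {
    case Form::I32x4:
    case Form::I64x4:
      return AVX512F;   // available on any AVX-512 Foundation CPU
    default:
      return AVX512DQ;  // 32x2, 32x8 and 64x2 need the DQ extension
  }
}

// True if the given CPU feature mask allows emitting the form.
inline bool can_emit(Form f, std::uint32_t cpu_features) {
  const std::uint32_t need = required_feature(f);
  return (cpu_features & need) == need;
}
```

Under this table, a CPU that reports only AVX512F can still execute vbroadcasti32x4, which is why an AVX512DQ assert on that method is stricter than necessary.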
Comments
Fix Request (17u): Should get backported for parity with 17.0.4-oracle. Applies cleanly. GHA tests have passed.
31-03-2022

A pull request was submitted for review. URL: https://git.openjdk.java.net/jdk18u/pull/62 Date: 2022-03-29 15:11:16 +0000
29-03-2022

Fix Request (JDK 18u): Fixes an incorrect CPU feature requirement for emitting an assembly instruction. The fix is low risk and applies cleanly. Already tested and backported to Oracle JDK 17u. Tier 1-3 testing is running for JDK 18u.
29-03-2022

A pull request was submitted for review. URL: https://git.openjdk.java.net/jdk17u-dev/pull/304 Date: 2022-03-29 13:59:50 +0000
29-03-2022

Changeset: 8b45dbda Author: Jamil Nimeh <jnimeh@openjdk.org> Date: 2022-03-08 05:50:41 +0000 URL: https://git.openjdk.java.net/jdk/commit/8b45dbdae6e5dee85ef65ce25850ce692ad3e965
08-03-2022

A pull request was submitted for review. URL: https://git.openjdk.java.net/jdk/pull/7732 Date: 2022-03-07 18:08:25 +0000
07-03-2022

It seems the "tuple_type" is not used to encode this instruction, but it would be good to set it correctly. It looks like the tuple type is used if the Address has a scaled index.
24-02-2022

One thing I did notice is that EVEX_T2 is currently coupled with EVEX_64bit in the evbroadcasti32x4 method. Is that functionally the same as EVEX_T4 with EVEX_32bit? If so, both should be broadcasting the same 128 bits from the address. It's something for me to look into, but if anyone knows, I'm all ears.
24-02-2022
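One way to reason about the T2/64-bit versus T4/32-bit question is through the disp8*N compressed-displacement rule in the Intel SDM, where N for the Tuple2, Tuple4 and Tuple8 types is the tuple count times the element size in bytes. The sketch below follows that rule only; the names are hypothetical, and it does not claim to mirror HotSpot's internal lookup table.

```cpp
#include <cassert>
#include <cstdint>

// Tuple types relevant to the vbroadcasti* family; the value is the tuple count.
enum class Tuple { T2 = 2, T4 = 4, T8 = 8 };

// disp8*N scale factor: number of elements in the tuple times bytes per element.
inline int disp8_scale(Tuple t, int element_bits) {
  return static_cast<int>(t) * (element_bits / 8);
}

// Tries to encode 'disp' as a compressed 8-bit displacement with scale 'n'.
// Returns true and stores the compressed byte in '*out' on success.
inline bool compress_disp8(std::int32_t disp, int n, std::int8_t* out) {
  if (disp % n != 0) return false;            // must be a multiple of N
  const std::int32_t q = disp / n;
  if (q < -128 || q > 127) return false;      // must fit in a signed byte
  *out = static_cast<std::int8_t>(q);
  return true;
}
```

Under this rule, T2 with 64-bit elements and T4 with 32-bit elements both give N = 16, so the two attribute pairs would compress a displacement identically; T2 with 32-bit elements would give N = 8 and would not.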

That is something I plan to look into before I integrate this, as I was wondering the same thing. The T2 Op/En is used with instructions that broadcast in groups of 2 (e.g. vbroadcasti64x2, vbroadcasti32x2). It may have an impact, since the instruction might be broadcasting only 32x2 even though it claims to be 32x4. The resulting opcode is "EVEX.512.66.0F38.W0 5A" both before and after the change, which is the 32x4 flavor of this instruction. I am new to assembly, so I admit to being a little sketchy on how the EVEX_T2 or EVEX_T4 attribute asserts itself and what the behavior is. FWIW, the above opcode with a T2 Op/En value doesn't appear to match anything in the SDM. I'm going to make the change locally and see what the regression tests show, but something focused on kernel_crc32_avx512 is certainly in order, since it's the only other consumer of this instruction.
24-02-2022
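The observation that the opcode is unchanged can be checked against the EVEX prefix layout in the Intel SDM: the prefix carries the opcode map, the SIMD prefix, W and the vector length, but no tuple-type field. The following is a minimal sketch of that layout, assuming the simplest case (registers 0-7 only, no opmask, no zeroing, no embedded broadcast); evex_prefix is a hypothetical helper, not a HotSpot function.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Builds the 4-byte EVEX prefix from SDM fields for the simplest case.
//   mm: opcode map (1 = 0F, 2 = 0F38, 3 = 0F3A)
//   pp: SIMD prefix (0 = none, 1 = 66, 2 = F3, 3 = F2)
//   w : EVEX.W
//   ll: vector length (0 = 128, 1 = 256, 2 = 512)
inline std::array<std::uint8_t, 4> evex_prefix(unsigned mm, unsigned pp,
                                               bool w, unsigned ll) {
  // P0: R, X, B, R' are stored inverted, so "no extension" is all ones.
  const std::uint8_t p0 = static_cast<std::uint8_t>(0xF0 | (mm & 0x3));
  // P1: W, inverted vvvv (1111 = unused), a fixed 1 bit, then pp.
  const std::uint8_t p1 =
      static_cast<std::uint8_t>((w ? 0x80 : 0x00) | 0x78 | 0x04 | (pp & 0x3));
  // P2: z = 0, L'L, b = 0, inverted V' = 1, aaa = 000 (no opmask).
  const std::uint8_t p2 = static_cast<std::uint8_t>(((ll & 0x3) << 5) | 0x08);
  return {{0x62, p0, p1, p2}};
}
```

For EVEX.512.66.0F38.W0, evex_prefix(2, 1, false, 2) yields 62 F2 7D 48, followed by the opcode byte 5A. None of the inputs is a tuple type, which is consistent with the bytes being the same before and after the change whenever the displacement encoding itself does not differ.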

What's the impact of using T2 instead of T4? Could it lead to a wrong result in kernel_crc32_avx512()?
23-02-2022

ILW = wrong encoding, wrong cpu feature check; unknown impact; no workaround = MMH = P3
23-02-2022