JDK-7121648 : Use 3-operands SIMD instructions on x86 with AVX
Type:Enhancement
Component:hotspot
Sub-Component:compiler
Affected Version:8-pool
Priority:P4
Status:Closed
Resolution:Fixed
OS:generic
CPU:x86
Submitted:2011-12-14
Updated:2012-03-29
Resolved:2012-03-29
EVALUATION
The VEX prefix converts legacy SSE instructions into 3-operand instructions. Use such instructions in C2-generated code on machines with AVX:
vaddsd XMM2, XMM0, [RSI + #8 + RCX << #3]
I went ahead and created an x86.ad file to collect the mach instruction definitions common to 32- and 64-bit.
I also fixed match_into_reg() so that a load is folded into an arithmetic instruction inside a loop. Previously the load was not folded because its control (the NULL check) is usually moved outside the loop, while the loop's head is a Region. So I added a check on the control of the load's memory input (the memory phi), which stays inside the loop.
Before:
090 B11: # B11 B12 <- B10 B11 Loop: B11-B11 inner main of N69 Freq: 999991
090 movsd XMM0, [R8 + #16 + RCX << #3] # double
097 movsd XMM1, [R9 + #16 + RCX << #3] # double
09e vaddsd XMM0, XMM1, XMM0
0a2 movsd [R11 + #16 + RCX << #3], XMM0 # double
After:
090 B11: # B11 B12 <- B10 B11 Loop: B11-B11 inner main of N69 Freq: 999991
090 movsd XMM0, [R8 + #16 + RCX << #3] # double
097 vaddsd XMM0, XMM0, [R9 + #16 + RCX << #3]
09e movsd [R11 + #16 + RCX << #3], XMM0 # double
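For context, the inner loop in the listings (two double loads, one add, one store, all indexed by the same register) is consistent with an element-wise array sum. The Java source is not part of this report, so the sketch below is only an assumption about the kind of loop involved; the class name VectorAddSketch, the method add, the array sizes, and the iteration count are hypothetical.

```java
public class VectorAddSketch {
    // Element-wise sum of two double arrays.
    // Assumes a and b are at least as long as c.
    // Each iteration loads a[i], adds b[i], and stores into c[i],
    // which is the load / add / store pattern seen in the listings.
    static void add(double[] a, double[] b, double[] c) {
        for (int i = 0; i < c.length; i++) {
            c[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        double[] a = new double[1024];
        double[] b = new double[1024];
        double[] c = new double[1024];
        for (int i = 0; i < a.length; i++) {
            a[i] = i;
            b[i] = 2.0 * i;
        }
        // Call the method repeatedly so the JIT has a chance to compile it;
        // the iteration count is arbitrary, not a documented threshold.
        for (int iter = 0; iter < 20_000; iter++) {
            add(a, b, c);
        }
        System.out.println(c[10]);
    }
}
```

With the load folded, the second operand of the add is read from memory directly, so the loop body needs one fewer instruction and one fewer XMM register. Whether a given JVM build emits exactly this code depends on the CPU and on the compiler flags in use.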