Bug ID: JDK-8262356 Optimize existing masked operation support for AVX-512.

Versions (Unresolved/Resolved/Fixed)

Other: tbd (Unresolved)

- Currently, a masked vector operation is performed over all vector lanes, followed by a blend operation that selectively updates the result vector under control of the mask vector.

- Prior to AVX-512, blending the newly computed result with the older value was the only way to implement masked/predicated vector operations.

- A non-AVX-512 vector blend instruction probes the most significant bit (MSB) of each mask vector lane to choose between the corresponding lanes of the two source vectors.
 
- With AVX-512 there are two ways to perform a masked operation:

Method 1:
       vmask = vector_cmp(mask, ALL_ONES)
       vres = vector_operation vsrc1, vsrc2
       vector_blend(vdst, vres, vmask)

Method 2:
       opmask = vector_cmp(mask, ALL_ONES)
       vres = vector_operation vsrc1, vsrc2, opmask

Clearly, emitting a predicated vector operation (Method 2) yields smaller code, since no separate blend instruction is needed, and it is more energy efficient, since the operation acts only on the lanes selected by the mask.
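The two methods can be illustrated with a scalar model in plain Java. This is only a sketch of the semantics, not the code that the JIT emits: lanes are modeled as array elements, the opmask register as one bit per lane in a long (so at most 64 lanes), and vector_operation is assumed to be an integer add. The class and method names (MaskedAddSketch, addThenBlend, toOpmask, predicatedAdd) are hypothetical and do not appear in the JDK.

```java
import java.util.Arrays;

// Scalar model of the two masking strategies; all names here are illustrative.
final class MaskedAddSketch {
    static final int ALL_ONES = -1;

    // Method 1: compute every lane, then blend on the MSB of each mask lane.
    static int[] addThenBlend(int[] vdst, int[] vsrc1, int[] vsrc2, int[] mask) {
        int n = vdst.length;
        int[] vres = new int[n];
        for (int i = 0; i < n; i++) {
            vres[i] = vsrc1[i] + vsrc2[i];        // unconditional work on all lanes
        }
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            boolean msbSet = mask[i] < 0;         // sign bit is the lane's MSB
            out[i] = msbSet ? vres[i] : vdst[i];  // blend new result with old value
        }
        return out;
    }

    // Compress a mask vector into one bit per lane (assumes at most 64 lanes).
    static long toOpmask(int[] mask) {
        long k = 0;
        for (int i = 0; i < mask.length; i++) {
            if (mask[i] == ALL_ONES) {
                k |= 1L << i;
            }
        }
        return k;
    }

    // Method 2: predicated operation; unselected lanes keep the old vdst value.
    static int[] predicatedAdd(int[] vdst, int[] vsrc1, int[] vsrc2, long opmask) {
        int[] out = vdst.clone();
        for (int i = 0; i < out.length; i++) {
            if (((opmask >>> i) & 1L) != 0) {
                out[i] = vsrc1[i] + vsrc2[i];     // only selected lanes are touched
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] vdst = {100, 200, 300, 400};
        int[] a = {1, 2, 3, 4};
        int[] b = {10, 20, 30, 40};
        int[] mask = {ALL_ONES, 0, ALL_ONES, 0};
        System.out.println(Arrays.toString(addThenBlend(vdst, a, b, mask)));
        System.out.println(Arrays.toString(predicatedAdd(vdst, a, b, toOpmask(mask))));
    }
}
```

Both calls in main produce the same lanes. The difference is that Method 1 needs a full-width operation plus a blend, while Method 2 folds the selection into the operation itself.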

- The Vector API has significantly extended the scope of masked operations; additionally, it offers APIs for direct mask manipulation, e.g. VectorMask.or/and/not. Operating directly on an opmask register will therefore enable more efficient code generation.

- Using opmask registers, we can further optimize the existing implementation of VectorMask query operations such as VectorMask.firstTrue/lastTrue/anyTrue/allTrue/trueCount.
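Once a mask is held as one bit per lane, the manipulation and query operations named above reduce to simple bit operations. The following is a minimal sketch in plain Java, assuming the mask is modeled as a long with at most 64 lanes and no bits set above the lane count. OpmaskBitsSketch and laneBits are hypothetical names. The return values for an empty mask (lane count for firstTrue, -1 for lastTrue) follow the VectorMask Javadoc as understood here. The sketch models semantics only and says nothing about the instructions the backend selects.

```java
// Bit-level model of VectorMask operations on an opmask; names are illustrative.
final class OpmaskBitsSketch {
    // Bits of the valid lanes for a vector with `lanes` lanes (1..64).
    static long laneBits(int lanes) {
        return lanes == 64 ? -1L : (1L << lanes) - 1;
    }

    // Direct mask manipulation.
    static long and(long a, long b) { return a & b; }
    static long or(long a, long b)  { return a | b; }
    static long not(long a, int lanes) { return ~a & laneBits(lanes); }

    // Mask queries.
    static boolean anyTrue(long k) { return k != 0; }
    static boolean allTrue(long k, int lanes) { return k == laneBits(lanes); }
    static int trueCount(long k) { return Long.bitCount(k); }

    // Index of the lowest set lane, or the lane count if none is set.
    static int firstTrue(long k, int lanes) {
        return k == 0 ? lanes : Long.numberOfTrailingZeros(k);
    }

    // Index of the highest set lane, or -1 if none is set.
    static int lastTrue(long k) {
        return 63 - Long.numberOfLeadingZeros(k);
    }

    public static void main(String[] args) {
        long k = 0b0110L;   // lanes 1 and 2 set
        int lanes = 8;
        System.out.println("firstTrue = " + firstTrue(k, lanes));
        System.out.println("lastTrue  = " + lastTrue(k));
        System.out.println("trueCount = " + trueCount(k));
        System.out.println("anyTrue   = " + anyTrue(k));
        System.out.println("allTrue   = " + allTrue(k, lanes));
        System.out.println("not       = " + Long.toBinaryString(not(k, lanes)));
    }
}
```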

Relates :	JDK-8272359 - X86: Backend support for optimizing vector masked operations over AVX512 target.
Relates :	JDK-8264954 - unified handling for VectorMask object re-materialization during de-optimization
Relates :	JDK-8262982 - [vector API] add IR and API points for synthetic multi-vectors
Relates :	JDK-8262983 - [vector API] add IR and API points for synthetic partial vectors
Relates :	JDK-8262355 - Support for AVX-512 opmask register allocation.
Relates :	JDK-8264563 - Add masked vector intrinsics for binary/store operations
Relates :	JDK-8266621 - Add masking support for unary/ternary vector intrinsics
Relates :	JDK-8270264 - Add the masking support for vector lanewiseShift
Relates :	JDK-8273406 - Optimize various masked vector operations for AVX512 target.
Relates :	JDK-8273949 - Intrinsic creation for VectorMask.toLong operation.
Relates :	JDK-8266287 - Basic mask IR implementation for the Vector API masking feature support
Relates :	JDK-8267368 - Add masking support for reduction vector intrinsics
Relates :	JDK-8271273 - Java API and IR changes for masked compare operation
Relates :	JDK-8271539 - Add masking support for load/store from/into byte array/buffer
Relates :	JDK-8272100 - VectorAPI: modify existing implementation of masked neg and not operation.
Relates :	JDK-8272479 - Java API and IR changes for masked rearrange operation
Relates :	JDK-8272971 - Intrinsification of VectorMask.cast operation for all compatible vector species
Relates :	JDK-8274569 - X86 backend related incorrectness issues in legacy store mask patterns