| Other |
|---|
| tbd Unresolved |
|
Relates :
|
|
|
JDK-8270349 :
|
- Currently, a masked vector operation performs the operation over all vector lanes, followed by a blend operation that selectively updates the result vector under the control of the mask vector.
- Prior to AVX-512, blending the newly computed result with the older value was the only way to implement masked (predicated) vector operations.
- A non-AVX-512 vector blend instruction probes the MSB of each mask vector lane to choose between the two source vector lanes.
- With AVX-512, a masked operation can be performed in two ways:
Method 1:
vmask = vector_cmp(mask, ALL_ONES)
vres = vector_operation vsrc1, vsrc2
vector_blend(vdst, vres, vmask)
Method 2:
opmask = vector_cmp(mask, ALL_ONES)
vres = vector_operation vsrc1, vsrc2, opmask
Clearly, emitting a predicated vector operation (Method 2) is preferable in terms of emitted code size, and it is more energy efficient because the vector operation conditionally operates over only a portion of the vector.
- The Vector API has significantly extended the scope of masked operations; additionally, it offers APIs to perform direct mask manipulation, e.g. VectorMask.or/and/not. Thus, operating directly on an opmask register will enable generating efficient code.
- Using opmask registers, we can further optimize the existing implementation of VectorMask query operations such as VectorMask.firstTrue/lastTrue/anyTrue/allTrue/trueCount.
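
To make the difference between the two methods concrete, here is a minimal scalar sketch in plain Java. It is only a model of the semantics, not the code the JIT emits and not the Vector API itself: the class `MaskedOpModel` and all of its method names are hypothetical, the "vector" is an `int[]`, and the opmask register is modeled as a `long` with one bit per lane. The query helpers follow the conventions that `VectorMask.firstTrue` (lane count when no lane is set) and `VectorMask.lastTrue` (-1 when no lane is set) are assumed to use.

```java
import java.util.Arrays;

// Hypothetical scalar model of masked vector operations; illustration only.
public class MaskedOpModel {

    // Method 1: operate on every lane, then blend with the old destination.
    static int[] addThenBlend(int[] dst, int[] a, int[] b, boolean[] mask) {
        int[] res = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            res[i] = a[i] + b[i];                // unconditional work on all lanes
        }
        int[] out = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = mask[i] ? res[i] : dst[i];  // separate blend step
        }
        return out;
    }

    // Method 2: a single predicated operation; unselected lanes keep dst.
    static int[] predicatedAdd(int[] dst, int[] a, int[] b, long opmask) {
        int[] out = dst.clone();
        for (int i = 0; i < a.length; i++) {
            if ((opmask & (1L << i)) != 0) {
                out[i] = a[i] + b[i];
            }
        }
        return out;
    }

    // Pack a per-lane boolean mask into one bit per lane (lane 0 = bit 0).
    static long toOpmask(boolean[] mask) {
        long bits = 0;
        for (int i = 0; i < mask.length; i++) {
            if (mask[i]) {
                bits |= 1L << i;
            }
        }
        return bits;
    }

    // Mask queries become single bit-manipulation operations on the opmask.
    static int firstTrue(long opmask, int lanes) {
        return opmask == 0 ? lanes : Long.numberOfTrailingZeros(opmask);
    }

    static int lastTrue(long opmask) {
        return 63 - Long.numberOfLeadingZeros(opmask); // -1 when no bit is set
    }

    static int trueCount(long opmask) {
        return Long.bitCount(opmask);
    }

    static boolean anyTrue(long opmask) {
        return opmask != 0;
    }

    static boolean allTrue(long opmask, int lanes) {
        long full = lanes == 64 ? -1L : (1L << lanes) - 1;
        return (opmask & full) == full;
    }

    public static void main(String[] args) {
        int[] dst = {9, 9, 9, 9};
        int[] a = {1, 2, 3, 4};
        int[] b = {10, 20, 30, 40};
        boolean[] mask = {true, false, true, false};
        long opmask = toOpmask(mask);

        System.out.println(Arrays.toString(addThenBlend(dst, a, b, mask)));
        System.out.println(Arrays.toString(predicatedAdd(dst, a, b, opmask)));
        System.out.println(firstTrue(opmask, 4) + " " + lastTrue(opmask) + " " + trueCount(opmask));
    }
}
```

Both methods produce the same result vector; the model only shows that Method 2 needs no separate blend pass, and that once the mask lives in a bit-per-lane register the query operations reduce to simple bit counts and scans.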