| JDK 26 |
|---|
| 26 b24 (Fixed) |
Linked issues: JDK-8346993, JDK-8366357, JDK-8366361, JDK-8366427, JDK-8366702, JDK-8367389, JDK-8369448

A cost model will improve how we judge the profitability of vectorizing reductions and of adding shuffle/pack/unpack operations, because vectorization is not always profitable, especially if we add more operations to the loop.
There may also be an extra cost for subword conversions, see:
https://github.com/openjdk/jdk/pull/23413
--------------------------- PLAN ----------------------
I have a proof-of-concept patch here:
https://github.com/openjdk/jdk/pull/20964
Instead of pushing it as a whole (it would be quite unreviewable), I'll split it up into subtasks.
Here is a rough schedule towards cost modeling:
0. Smaller refactorings
1. Scalar node refactoring
- Finer resolution: mem, phi, data, cfg
- These will be needed when modeling the whole loop instead of just the basic block (step 3)
2. Vector node refactoring
- Remove reliance on _nodes, so that it will be easier to model the whole loop (step 3)
- Instead, capture all relevant information in some sort of VTransformNodePrototype: opcode, vlen, basic_type, etc.
3. Model whole loop instead of only basic block (allows VTransform optimizations like moving reduction out of loop)
- Instead of using VTransformGraph::apply_memops_reordering_with_schedule, which reorders the old graph, I want to build the new loop body from the VTransform directly.
- That means we are less constrained by the old shape of the loop.
4. Optimize: e.g. move reduction out of loop
- Refactor move_unordered_reduction_out_of_loop
- Once the reduction is moved out of the loop, it is no longer counted in the loop cost, so vectorization becomes more profitable (see step 5)
5. Cost-model
- count scalar loop cost (via scalar opcodes)
- count vector loop cost (via scalar opcodes, and vector opcodes + vlen)
- keep track of live nodes (optimization might kill some)
- keep track of nodes inside loop (optimizations might float some nodes out of the loop, don't count their cost)
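The counting rules in step 5 can be sketched as follows. This is only an illustration under stated assumptions: `SketchNode`, `loop_cost`, `is_profitable` and the two example builders are hypothetical names, not HotSpot code, and the per-node costs are made-up numbers chosen so that the in-loop reduction is expensive.

```cpp
#include <vector>

// Hypothetical node record; it loosely follows the "opcode, vlen" idea from
// step 2 and is NOT the actual VTransform node class.
struct SketchNode {
  const char* name;
  int  unit_cost;  // assumed cost of executing the node once (made-up values)
  int  vlen;       // 1 for scalar nodes, number of lanes for vector nodes
  bool is_alive;   // false once an optimization has removed the node
  bool in_loop;    // false once the node has been moved out of the loop
};

// Cost of one loop iteration: only nodes that are alive and still inside
// the loop contribute.
int loop_cost(const std::vector<SketchNode>& nodes) {
  int cost = 0;
  for (const SketchNode& n : nodes) {
    if (n.is_alive && n.in_loop) { cost += n.unit_cost; }
  }
  return cost;
}

// One vector iteration replaces vlen scalar iterations, so the vector loop
// wins only if it is cheaper than vlen scalar iterations.
bool is_profitable(int scalar_cost, int vector_cost, int vlen) {
  return vector_cost < scalar_cost * vlen;
}

// Scalar loop body of a simple int sum: one load and one add.
std::vector<SketchNode> scalar_sum_loop() {
  return { {"LoadI", 1, 1, true, true},
           {"AddI",  1, 1, true, true} };
}

// Vector loop body with the reduction either inside the loop, or replaced by
// an element-wise add with the reduction placed after the loop (step 4).
std::vector<SketchNode> vector_sum_loop(bool reduction_moved_out) {
  if (!reduction_moved_out) {
    return { {"LoadVector",     1,  4, true, true},
             {"AddReductionVI", 10, 4, true, true} };
  }
  return { {"LoadVector",     1,  4, true, true},
           {"AddVI",          1,  4, true, true},
           {"AddReductionVI", 10, 4, true, false} };  // outside loop: not counted
}
```

With these assumed costs, the scalar loop costs 2 per iteration, so 4 iterations cost 8. The vector loop with the reduction inside costs 11 and is rejected. After the reduction is moved out, the in-loop cost drops to 2 and the vector loop is accepted.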
A later task could be to also do:
VTransformLongToIntVectorNode::optimize