JDK-8309647 : [Vector API] Move Reduction outside loop when possible
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 21
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2023-06-08
  • Updated: 2023-06-13
Other: tbd (Unresolved)
Related Reports
Relates :  
Description
Do the same as JDK-8302652, but for the Vector API.

[~jrose] sketched it like this:

From
int a = 0; for (...) { ... a += v.reduceLanes(ADD) ... }
to
vector<int> asplit = zeroes(); for (...) { ... asplit = asplit.add(v) ... } int a = asplit.reduceLanes(ADD);


I coded up a concrete int-add-reduction example (dot product).

./java -Xbatch -XX:CompileCommand=printcompilation,Test::* -XX:CompileCommand=exclude,Test::test00 -XX:UseAVX=2 -XX:CompileCommand=print,Test::test11 Test.java > txt1.txt

./java -Xbatch -XX:CompileCommand=printcompilation,Test::* -XX:CompileCommand=exclude,Test::test00 -XX:UseAVX=2 -XX:CompileCommand=print,Test::test12 Test.java > txt1.txt

Grepping the output for "vector_reduction_int" shows that test11 has many reductions inside the loop, whereas test12 has only 2 reductions in the whole compilation.
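For readers without the attached Test.java, here is a minimal sketch of what the two variants of an int dot product could look like with the Vector API. This is not the code from Test.java: the class and method names (DotProductSketch, reduceInLoop, reduceAfterLoop) are hypothetical and only illustrate the before and after shapes from the sketch above. Because int addition wraps and is associative, both variants return the same value.

```java
import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical illustration, not the Test.java attached to this issue.
class DotProductSketch {
    static final VectorSpecies<Integer> SPECIES = IntVector.SPECIES_PREFERRED;

    // "Before" shape: one cross-lane reduceLanes(ADD) per loop iteration.
    static int reduceInLoop(int[] a, int[] b) {
        int sum = 0;
        int i = 0;
        int bound = SPECIES.loopBound(a.length);
        for (; i < bound; i += SPECIES.length()) {
            IntVector va = IntVector.fromArray(SPECIES, a, i);
            IntVector vb = IntVector.fromArray(SPECIES, b, i);
            sum += va.mul(vb).reduceLanes(VectorOperators.ADD);
        }
        // Scalar tail for the elements that do not fill a whole vector.
        for (; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // "After" shape: lane-wise accumulation in the loop, one reduction after it.
    static int reduceAfterLoop(int[] a, int[] b) {
        IntVector acc = IntVector.zero(SPECIES);
        int i = 0;
        int bound = SPECIES.loopBound(a.length);
        for (; i < bound; i += SPECIES.length()) {
            IntVector va = IntVector.fromArray(SPECIES, a, i);
            IntVector vb = IntVector.fromArray(SPECIES, b, i);
            acc = acc.add(va.mul(vb));
        }
        int sum = acc.reduceLanes(VectorOperators.ADD);
        // Scalar tail, same as above.
        for (; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] a = new int[1003];
        int[] b = new int[1003];
        for (int i = 0; i < a.length; i++) {
            a[i] = i + 1;
            b[i] = 2 * i - 7;
        }
        System.out.println(reduceInLoop(a, b) == reduceAfterLoop(a, b));
    }
}
```

Since the Vector API is an incubator module in JDK 21, compiling and running this needs --add-modules jdk.incubator.vector. The enhancement asks C2 to turn the first shape into the second automatically, so that users do not have to restructure the loop by hand.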
Comments
[~vlivanov] mentioned that we could split the vector accumulators. This could apply here and could also be an extension of JDK-8302652. It could drive down the latency of a loop, but we should run experiments to see whether it is profitable.

vacc = vzero; for (...) { vacc += v; } acc = reduce(vacc)
==>
vacc1 = vzero; vacc2 = vzero; ...; vaccn = vzero; for (...) { vacc1 += v1; vacc2 += v2; ... vaccn += vn; } acc = reduce(vacc1 + vacc2 + ... + vaccn)
13-06-2023
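A hand-written sketch of the split-accumulator shape for n = 2, using an int sum over an array. The class and method names (SplitAccumulatorSketch, sumTwoAccumulators) are hypothetical, and this only shows the shape of the loop; whether it is actually faster is exactly what the experiments would have to show. The idea is that the two adds in the loop body do not depend on each other, which may shorten the dependency chain through the accumulator.

```java
import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical illustration of splitting one vector accumulator into two.
class SplitAccumulatorSketch {
    static final VectorSpecies<Integer> SPECIES = IntVector.SPECIES_PREFERRED;

    static int sumTwoAccumulators(int[] a) {
        IntVector vacc1 = IntVector.zero(SPECIES);
        IntVector vacc2 = IntVector.zero(SPECIES);
        int step = SPECIES.length();
        int i = 0;
        // Main loop: two vectors per iteration, each with its own accumulator.
        for (; i + 2 * step <= a.length; i += 2 * step) {
            vacc1 = vacc1.add(IntVector.fromArray(SPECIES, a, i));
            vacc2 = vacc2.add(IntVector.fromArray(SPECIES, a, i + step));
        }
        // Combine the accumulators lane-wise, then do a single cross-lane reduction.
        int acc = vacc1.add(vacc2).reduceLanes(VectorOperators.ADD);
        // Scalar tail for the remaining elements.
        for (; i < a.length; i++) {
            acc += a[i];
        }
        return acc;
    }

    public static void main(String[] args) {
        int[] a = new int[1000];
        for (int i = 0; i < a.length; i++) {
            a[i] = i;
        }
        System.out.println(sumTwoAccumulators(a));
    }
}
```

As with any Vector API code on JDK 21, this needs --add-modules jdk.incubator.vector. For int the result is identical to the single-accumulator version because wrapping addition is associative; for floating-point types the summation order changes, so the result may differ.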