JDK-8303113 : C2 SuperWord: improve packing to remove _do_vector_loop
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 21
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2023-02-23
  • Updated: 2025-01-07
Other: tbd, Unresolved
Sub Tasks
JDK-8308606 :  
Description
The CompileCommand Vectorize / _do_vector_loop uses the CloneMap to determine which op of the single-iteration loop each unrolled op was cloned from. This helps avoid cases where ops get packed wrongly.

But _do_vector_loop is off by default, and in some cases makes packing impossible (e.g. for hand-unrolled loops).
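
A minimal sketch of the hand-unrolled case, for illustration only. The class and method names are hypothetical and are not taken from any existing test. In `simple`, unrolling produces clones of a single store, which the CloneMap can relate to each other. In `handUnrolled`, the two stores are already distinct ops in the single-iteration loop, so a rule that only packs clones of the same op would presumably not pack `b[i]` with `b[i + 1]`, even though they are adjacent in memory.

```java
import java.util.Arrays;

// Hypothetical example class, not part of the JDK test suite.
final class HandUnrolledLoop {
    // Single-iteration form: one load, one add, one store per iteration.
    public static void simple(int[] a, int[] b) {
        for (int i = 0; i < a.length; i++) {
            b[i] = a[i] + 1;
        }
    }

    // Hand-unrolled by 2: the loop body already contains two distinct stores.
    // Assumes an even array length; an odd last element is left untouched.
    public static void handUnrolled(int[] a, int[] b) {
        for (int i = 0; i < a.length - 1; i += 2) {
            b[i]     = a[i]     + 1;
            b[i + 1] = a[i + 1] + 1;
        }
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4};
        int[] b = new int[a.length];
        handUnrolled(a, b);
        System.out.println(Arrays.toString(b));
    }
}
```

For even-length arrays both methods compute the same result, so ideally both would vectorize the same way.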

We should try to improve packing (creating adjacent memrefs, and extending) to be more robust, and cover all these cases. 

An alternative approach: try once with and once without _do_vector_loop, and pick the more profitable result. But that could double the SuperWord time, and would require us to always maintain the CloneMap.

Related issue: JDK-8309908

------------------ Original Description ----------

See discussion in PR:
https://github.com/openjdk/jdk/pull/12350

Also see
https://github.com/openjdk/jdk/pull/12350#issuecomment-1469539789

Check if these cases can be vectorized with it. Add IR tests for TestPickLastMemoryState.java.

Consider splitting packs when we have a conversion and one type has more elements per vector than the other.
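
A small sketch of a loop with such a conversion, as an illustration only. The class name, method name, and the 32-byte vector width mentioned here are assumptions, not taken from the issue. With a 32-byte vector, 8 `int` elements fit in one vector but only 4 `long` elements, so the load side and the store side of the conversion would want packs of different sizes.

```java
import java.util.Arrays;

// Hypothetical example class, not part of the JDK test suite.
final class ConversionLoop {
    // Each iteration loads an int (4 bytes) and stores a long (8 bytes).
    public static void intToLong(int[] src, long[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i]; // implicit int -> long widening conversion
        }
    }

    public static void main(String[] args) {
        int[] src = {1, -2, 3, -4};
        long[] dst = new long[src.length];
        intToLong(src, dst);
        System.out.println(Arrays.toString(dst));
    }
}
```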
Comments
New insights and plan: _do_vector_loop currently has 2 effects:
1. It disables the alignment requirement on memops. Hopefully we can easily disable this.
2. It takes info from the CloneMap to only allow unroll-clone nodes of the same single-iteration node to be packed. Sometimes this is helpful, sometimes not. We may eventually have to try once with and once without this, and compare the results.
I will first try to attack effect 1, and then address effect 2 separately later.
16-05-2023

I will first remove the dead code from JDK-8260943, and then tackle this here.
11-05-2023

Tests in TestPickLastMemoryState.java are complex: they use a lot of arrays and expressions. What about a simpler misaligned case that stores constants: for () { iArr[i1] += C1; iArr[i1 + 2] -= C2; }
15-03-2023
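
The misaligned store-constants case suggested in the comment above could be written out roughly as follows. This is only a sketch: the class name, the values of C1 and C2, and the loop bounds are assumptions, since the comment leaves the loop header unspecified.

```java
import java.util.Arrays;

// Hypothetical example class, not part of the JDK test suite.
final class MisalignedConstants {
    static final int C1 = 3; // assumed value
    static final int C2 = 5; // assumed value

    // Two read-modify-write accesses to the same array, offset by 2 elements.
    // The loop bounds are an assumption chosen to keep i1 + 2 in range.
    public static void kernel(int[] iArr) {
        for (int i1 = 0; i1 < iArr.length - 2; i1++) {
            iArr[i1]     += C1;
            iArr[i1 + 2] -= C2;
        }
    }

    public static void main(String[] args) {
        int[] iArr = new int[6];
        kernel(iArr);
        System.out.println(Arrays.toString(iArr));
    }
}
```

Because `iArr[i1 + 2]` written in one iteration is read again as `iArr[i1]` two iterations later, the loop carries a dependence at distance 2, which is what makes this a compact test for packing decisions.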