JDK-8252372 : Check if cloning is required to move loads out of loops in PhaseIdealLoop::split_if_with_blocks_post()
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 16,17
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2020-08-26
  • Updated: 2024-09-17
  • Resolved: 2021-05-26
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 17
17 b24Fixed
Related Reports
Duplicate :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Description
While working on JDK-8249607 it was not clear if the code starting at L1399:

 // See if a shared loop-varying computation has no loop-varying uses.
  // Happens if something is only used for JVM state in uncommon trap exits,
  // like various versions of induction variable+offset.  Clone the
  // computation per usage to allow it to sink out of the loop.
  if (has_ctrl(n) && !n->in(0)) {// n not dead and has no control edge (can float about)

is really doing what it is supposed to do. There are several things to be checked and might to be reworked:
- Revisit the entire cloning optimization:
  - Is it really required or would other loop opts move loads out of the loop anyways if there are no uses inside?
  - What if there are some uses inside and outside, is it really beneficial to do the cloning and possibly ending up with multiple loads?
If optimization is required:
- We should rethink the yanking part as it does not prevent the nodes to be put back on the worklist and might block some optimizations. Is there another way to prevent nodes without control to float back into the loop? Maybe adding a cast node with an explicit control?
- Make sure to not pin loads inside the loop if x_ctrl turns out to be a node inside the loop at the end (if that is not already guaranteed)
- Make sure not to pin loads in an outer strip mined loop if they do not have a use there (i.e. think about late_load_ctrl, maybe have a look at idea in 
http://cr.openjdk.java.net/~chagedorn/8249607/webrev.01/ which tackles that problem) 
Comments
Changeset: 9d305b9c Author: Roland Westrelin <roland@openjdk.org> Date: 2021-05-26 09:20:42 +0000 URL: https://git.openjdk.java.net/jdk/commit/9d305b9c0625d73c752724569dbb7f6c8e80931c
26-05-2021

JDK-8260709 revealed yet another problem with this code. Targeting this RFE to JDK 17 for now to keep it on the radar.
03-02-2021

For JDK-8260420, I've experimented with updating late_load_ctrl: @@ -1467,7 +1468,9 @@ void PhaseIdealLoop::split_if_with_blocks_post(Node *n) { // // Because we are setting the actual control input, factor in // the result from get_late_ctrl() so we respect any - // anti-dependences. (6233005). + // anti-dependences. (6233005). Re-compute late ctrl now that + // the load has been cloned and is less restricted by its users. + late_load_ctrl = get_late_ctrl(x, n_ctrl); x_ctrl = dom_lca(late_load_ctrl, x_ctrl); See option 3) described in the PR: https://git.openjdk.java.net/jdk/pull/2315 We would also need to adjust the initial check on late_load_ctrl: // If n is a load, and the late control is the same as the current // control, then the cloning of n is a pointless exercise, because // GVN will ensure that we end up where we started. if (!n->is_Load() || late_load_ctrl != n_ctrl) {
29-01-2021

Roland wrote in his RFR for JDK-8229483: "Unrelated to this fix, I wonder if the code at: http://hg.openjdk.java.net/jdk/jdk/file/cb836bd08d58/src/hotspot/share/opto/loopopts.cpp#l1346 really does what the comment says it does for loads (and really does anything useful actually). Late control for the load is computed with: late_load_ctrl = get_late_ctrl(n, n_ctrl); to make sure anti dependences are taken into account. But if a load is in a loop, it's because it has uses outside [Tobias: he probably meant "inside"] the loop. So late control for the load is in the loop too. When sinking the load, the restriction is that clones should not float below late control, then clones are going to stay in the loop. And that code doesn't do anything for loads. Or am I missing something? https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-August/034846.html
28-01-2021