JDK-8345262 : Loop peeling helps escape analysis
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 24
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2024-11-29
  • Updated: 2024-12-03
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other
tbdUnresolved
Related Reports
Relates :  
Description
Consider the following program (see also attached):

    public static void payload() {
        var list = new MyList();
        list.addNode();
        list.addNode();

        // ...

        list.cleanup();
    }

    static class MyList  {
        Node first;

        public void addNode() {
            Node newNode = new Node();
            newNode.next = first;
            first = newNode;
        }

        public void cleanup() {
            Node current = first;
            // uncomment to help EA
            // if (current != null) {
            //     current.cleanup();
            //     current = current.next;
            // }
            while (current != null) {
                current.cleanup();
                current = current.next;
            }
        }

        
        private static class Node {
            Node next;

            public void cleanup() {
                // ...
            }
        }

TraceEscapeAnalysis reports several objects that are marked not scalar replaceable:

    JavaObject(1) NoEscape(NoEscape) NSR [ [ ]]   171  ConP  === 0  [[ 418 105 ]]  #null

    JavaObject(5) NoEscape(NoEscape) [ [ 117 ]]   105  Allocate  === 5 6 7 8 1 (22 103 21 1 1 448 448 1 171 ) [[ 106 107 108 115 116 117 ]]  rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, top, bool ) LoopPeelingEA$MyList::addNode @ bci:0 (line 23) LoopPeelingEA::payload @ bci:9 (line 11) !jvms: LoopPeelingEA$MyList::addNode @ bci:0 (line 23) LoopPeelingEA::payload @ bci:9 (line 11)

    JavaObject(6) NoEscape(NoEscape) [ [ 225 ]]   213  Allocate  === 119 116 210 8 1 (22 103 21 1 1 450 450 1 122 ) [[ 214 215 216 223 224 225 ]]  rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, top, bool ) LoopPeelingEA$MyList::addNode @ bci:0 (line 23) LoopPeelingEA::payload @ bci:13 (line 12) !jvms: LoopPeelingEA$MyList::addNode @ bci:0 (line 23) LoopPeelingEA::payload @ bci:13 (line 12)
...
    JavaObject(5) NoEscape(NoEscape) is NSR. is merged with other object: JavaObject(1) NoEscape(NoEscape) NSR , by node: Field(8) oop +12 ( 225 213P )[ 179 105P 171P [ ]]   280  AddP  === _ 1 225 166  [[ 279 ]]  !jvms: LoopPeelingEA$MyList::addNode @ bci:13 (line 24) LoopPeelingEA::payload @ bci:13 (line 12)

    JavaObject(1) NoEscape(NoEscape) NSR is NSR. is merged with other object: JavaObject(5) NoEscape(NoEscape) NSR , by node: Field(8) oop +12 ( 225 213P )[ 179 105P 171P [ ]]   280  AddP  === _ 1 225 166  [[ 279 ]]  !jvms: LoopPeelingEA$MyList::addNode @ bci:13 (line 24) LoopPeelingEA::payload @ bci:13 (line 12)

    JavaObject(6) NoEscape(NoEscape) is NSR. is merged with other object: JavaObject(5) NoEscape(NoEscape) NSR , by node: LocalVar(20) [ 230 426 213P 105P 171P [ 409b ]]   369  Phi  === 443 230 426  [[ 409 409 ]]  #LoopPeelingEA$MyList$Node:NotNull *  Oop:LoopPeelingEA$MyList$Node:NotNull * !orig=[372],[386] !jvms: LoopPeelingEA$MyList::cleanup @ bci:9 (line 35) LoopPeelingEA::payload @ bci:17 (line 16)

    JavaObject(5) NoEscape(NoEscape) NSR is NSR. is merged with other object: JavaObject(6) NoEscape(NoEscape) NSR , by node: LocalVar(20) [ 230 426 213P 105P 171P [ 409b ]]   369  Phi  === 443 230 426  [[ 409 409 ]]  #LoopPeelingEA$MyList$Node:NotNull *  Oop:LoopPeelingEA$MyList$Node:NotNull * !orig=[372],[386] !jvms: LoopPeelingEA$MyList::cleanup @ bci:9 (line 35) LoopPeelingEA::payload @ bci:17 (line 16)

    JavaObject(6) NoEscape(NoEscape) NSR is NSR. is merged with other object: JavaObject(1) NoEscape(NoEscape) NSR , by node: LocalVar(20) [ 230 426 213P 105P 171P [ 409b ]]   369  Phi  === 443 230 426  [[ 409 409 ]]  #LoopPeelingEA$MyList$Node:NotNull *  Oop:LoopPeelingEA$MyList$Node:NotNull * !orig=[372],[386] !jvms: LoopPeelingEA$MyList::cleanup @ bci:9 (line 35) LoopPeelingEA::payload @ bci:17 (line 16)

    JavaObject(1) NoEscape(NoEscape) NSR is NSR. is merged with other object: JavaObject(6) NoEscape(NoEscape) NSR , by node: LocalVar(20) [ 230 426 213P 105P 171P [ 409b ]]   369  Phi  === 443 230 426  [[ 409 409 ]]  #LoopPeelingEA$MyList$Node:NotNull *  Oop:LoopPeelingEA$MyList$Node:NotNull * !orig=[372],[386] !jvms: LoopPeelingEA$MyList::cleanup @ bci:9 (line 35) LoopPeelingEA::payload @ bci:17 (line 16)

(Note that I've slightly modified the code to also print out the use node. See the attached IGV graph as well).

The main culprit of the merges is the Phi node generated as a result of the while loop, merging current and null. If we manually peel a single loop iteration (un-comment the code in MyList::cleanup), the allocations go away/can be scalar replaced.

It would be nice if C2 could do this peeling to help EA automatically, so that 1) user code does not have to change and 2) inlining is not affected by the extra peeled iteration being present in the bytecode (we've seen this being in issue in a real world use case).
Comments
This is an other example which shows limitation only data flow sensitive EA (which we have now). Cesar's work allows optimize some merges but not in loop. We need control flow sensitive EA to optimize allocations in loops. We discussed it few times. But when we will work on it is not clear.
03-12-2024

Thank you , Cesar. Yes, loop's Phi is constrain for current merge optimization. I rememberer now. In my opinion loop peeling before EA will be premature. It will complicate graph which EA have to analyze in general case.
03-12-2024

I didn't debug it, but I imagine the Phi isn't removed because it has a loop controlling it. This is the line with the constraint: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/escape.cpp#L622 I understand that handling such Phis are important. I think I might be able to make RAM work in these cases as well when I have some time.
02-12-2024

[~cslucas] Do you know why your EA objects merge optimization does not work in this case?
02-12-2024

[~thartmann] I mean, if we add a manually peeled iteration to the source code, this will affect the byte code size, and this then seems to effect inlining decisions. As for the example: I think I spoke too soon. I saw a slightly different code shape was changing inlining decisions, but it seems more likely that it changed the order of compilations instead, since this is causing: 'failed to inline: already compiled into a big method". FWIW, this is the patch I was playing with: https://github.com/openjdk/jdk/compare/master...JornVernee:jdk:NoEscape_Peel
02-12-2024

Most loop opts, including PartialPeelLoop, currently happen only after (iterative) Escape Analysis and therefore would not help, unfortunately. Maybe changes like JDK-8289943 will help in the future. > 2) inlining is not affected by the extra peeled iteration being present in the bytecode (we've seen this being in issue in a real world use case). That would be surprising because peeling happens long after any (incremental) inline. Could you share the example?
02-12-2024