JDK-8273914 : Indy string concat changes order of operations
  • Type: Bug
  • Component: tools
  • Sub-Component: javac
  • Affected Version: 11,13,15,17,18,19
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2021-09-16
  • Updated: 2024-11-08
  • Resolved: 2022-01-11
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 11 JDK 17 JDK 19
11-pool-oracleUnresolved 17.0.14-oracleFixed 19 b05Fixed
Related Reports
CSR :  
Duplicate :  
Relates :  
Sub Tasks
JDK-8279866 :  
Description
The invokedynamic string concatenation strategies don't preserve the order of operations when evaluating subexpressions and converting them to strings.

With the inline strategy, each subexpression is evaluated and converted to a string in order. With indy each subexpression is evaluated in order *but not converted to a string*, and later each subexpression is converted to a string in order.

Originally reported on compiler-dev@ [1]. briangoetz confirmed the original behaviour of the inline strategy [2] is correct, and sketched one possible fix [3]:

> The nature of indy leads to the structure we ended up with.  The natural 
> way to do this is to have the various arguments passed as real stack 
> arguments, causing them to all be evaluated before being pushed through 
> the MH nest (which is where the string conversion happens, using 
> MethodHandle::filterArguments.)  We'd essentially have to write a new 
> bootstrap (but leave the old one in place), and have the compiler 
> generate bytecode to do the string conversion as the values are being 
> pushed on the stack.  This requires a different bootstrap and a 
> different calling convention.  So old bytecode will stay around, and the 
> old bootstrap would have to be kept around ... pretty disruptive.

[1] https://mail.openjdk.java.net/pipermail/compiler-dev/2021-September/017911.html
[2] https://mail.openjdk.java.net/pipermail/compiler-dev/2021-September/017912.html
[3] https://mail.openjdk.java.net/pipermail/compiler-dev/2021-September/017929.html

Repro:

class T {
  static String test() {
    StringBuilder builder = new StringBuilder("foo");
    return "" + builder + builder.append("bar");
  }

  public static void main(String[] args) {
    System.err.println(test());
  }
}

$ javac -XDstringConcat=inline T.java ; java T
foofoobar

$ javac -XDstringConcat=indy T.java ; java T
foobarfoobar
Comments
[jdk11u-fix-request] Approval Request from Liam Miller-Cushon Backporting this change fixes a bug with invokedynamic-based string concatenation. The fix has received real-world testing since 2022, including in JDK 21. The backport applies cleanly. Langtools tier1 tests pass.
21-09-2024

[jdk17u-fix-request] Approval Request from Liam Miller-Cushon Backporting this change fixes a bug with invokedynamic-based string concatenation. The fix has received real-world testing since 2022, including in JDK 21. The backport applies cleanly. Langtools tier1 tests pass.
21-09-2024

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk11u-dev/pull/2933 Date: 2024-09-10 17:11:28 +0000
21-09-2024

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk17u-dev/pull/2864 Date: 2024-09-10 17:11:32 +0000
21-09-2024

Thanks, I have filed a backport issue for 17u, and filed a CSR with fix versions for 11u and 17u: https://bugs.openjdk.org/browse/JDK-8339881
10-09-2024

On a process perspective, it is fine for a backport CSR to serve multiple fix-versions is the delta is the same for all those release trains. To do this, create a backport of one of the update releases then create the CSR from that backport. Next, manually add the other release trains to the fixVersion field of the CSR. HTH
10-09-2024

I can create the backports if there's appetite for it. Is the process to create backport issues in JBS and then create new CSRs for each backport?
10-09-2024

As I mentioned in related CSR (JDK-8274863), this fix needs to go to 11 and 17 as well. [~cushon], would you like to do it?
04-09-2024

Changeset: cfee4512 Author: Liam Miller-Cushon <cushon@openjdk.org> Date: 2022-01-11 15:45:15 +0000 URL: https://git.openjdk.java.net/jdk/commit/cfee4512f7048e9cf440078e9eb87d37c7ebcdd9
11-01-2022

> The call to BSM would have these arguments as Strings, so the BSM could just skip the excess stringification (I think it would still need to perform null check, though, which would be the job for JIT to elide). The BSM can't really know whether the calling code has pre-emptively stringified or not, can it? I think it has to keep doing the same checks and stringification it's already doing.
06-10-2021

[~cushon]: that sounds like what I was envisioning. (I don't think the implementation is optimizing concatenation of boxed primitives today, but I guess it won't hurt too much to keep that particular door open)
06-10-2021

> call toString() on all non-trivial inputs to existing SCF BSM. (Trivial: primitives, String, Numbers, etc.) I think that's consensus from the compiler-dev thread as well: https://mail.openjdk.java.net/pipermail/compiler-dev/2021-September/018000.html I was planning to clean that patch up and open a PR, if that sounds like what you were envisioning.
06-10-2021

> Was it something like this you had in mind, Aleksey? Yes. The original intent with keeping the exact types and passing the objects for stringification at BSM side was largely driven by the future use for cracking open known types like Integer, or other interesting future types, e.g. records, value type boxes, etc. But this bug shows the danger of doing that, and we should instead stringify non-trivial Objects before handing them off to BSM. Since this effectively just reshuffles the stringification that BSM already does, I would expect little to none performance impact. The call to BSM would have these arguments as Strings, so the BSM could just skip the excess stringification (I think it would still need to perform null check, though, which would be the job for JIT to elide). Doing this by recasting the input expressions into MH trees or some other IR opens up more interesting opportunities for more interesting bugs, which I don't like.
06-10-2021

StringBuilder builder = new StringBuilder("foo"); System.out.println("" + builder + builder.append("bar")); translates to: 0: new #7 // class java/lang/StringBuilder 3: dup 4: ldc #9 // String foo 6: invokespecial #11 // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V 9: astore_1 10: getstatic #14 // Field java/lang/System.out:Ljava/io/PrintStream; 13: aload_1 14: aload_1 15: ldc #20 // String bar 17: invokevirtual #22 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder; 20: invokedynamic #26, 0 // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/StringBuilder;Ljava/lang/StringBuilder;)Ljava/lang/String; 25: invokevirtual #30 // Method java/io/PrintStream.println:(Ljava/lang/String;)V To be compliant with JLS, the toString() calls would have to happen before the invokedynamic, so I think we need to desugar to something like this: StringBuilder sb = new StringBuilder("foo"); System.out.println("" + String.valueOf(sb) + String.valueOf(sb.append("bar"))); ... which translates to: 0: new #7 // class java/lang/StringBuilder 3: dup 4: ldc #9 // String foo 6: invokespecial #11 // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V 9: astore_1 10: getstatic #14 // Field java/lang/System.out:Ljava/io/PrintStream; 13: aload_1 14: invokestatic #20 // Method java/lang/String.valueOf:(Ljava/lang/Object;)Ljava/lang/String; 17: aload_1 18: ldc #26 // String bar 20: invokevirtual #28 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder; 23: invokestatic #20 // Method java/lang/String.valueOf:(Ljava/lang/Object;)Ljava/lang/String; 26: invokedynamic #32, 0 // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String; 31: invokevirtual #36 // Method java/io/PrintStream.println:(Ljava/lang/String;)V Which will ensure evaluation happens in the expected order. Was it something like this you had in mind, Aleksey? In any case I agree it seems we can salvage this with a similar desugaring without the need to specify and add an entirely new bootstrap method. This should be able to retain most performance characteristics (the redundant toStrings might be possible for JITs to elide). Incidentally this would more or less resolve the concern I had when filing JDK-8245719, since all non-trivial Objects would be converted to String and the number of possible method types the BSM will see will be much reduced.
06-10-2021

It looks to me that the easier way to solve this would be to call toString() on all non-trivial inputs to existing SCF BSM. (Trivial: primitives, String, Numbers, etc.)
06-10-2021