JDK-8136757 : C1 and C2 intrinsics for StringUTF16.(get|set)Char
  • Type: Sub-task
  • Component: hotspot
  • Sub-Component: compiler
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2015-09-18
  • Updated: 2017-08-18
  • Resolved: 2015-10-23
Most of our performance work is focused on making C2 perform up to our expectations. Many key performance optimizations for Compact Strings are implemented in C2. However, some users may expect C1 to perform reasonably well with Compact Strings as well. While we cannot provide the same performance levels with a basic compiler like C1, we would still want to provide users with something that can get the performance back to pre-Compact Strings levels.

Case in point: running SPECjbb2005 with -XX:TieredStopAtLevel=1 on my dev desktop, we see a degradation from 190K to 160K, or around 15%. The profile with Compact Strings shows that the time is mostly spent compressing and decompressing stuff:

1882.33	<Total>
183.43	java.lang.StringUTF16.compress(char[], int, byte[], int, int)
39.64	java.lang.StringLatin1.inflate(byte[], int, char[], int, int)
20.79	java.lang.String.length()
15.15	java.lang.String.<init>(java.lang.String)
12.76	java.lang.String.coder()
11.83	java.lang.StringUTF16.compress(char[], int, int)
11.00	java.lang.StringLatin1.compareTo(byte[], byte[])

Disabling Compact Strings drops the score further to 150K. Compression and decompression go away, only to let the UTF16 accessors dominate:

1890.32	<Total>
100.05	java.lang.StringUTF16.putChar(byte[], int, int)
93.21	java.lang.StringUTF16.toBytes(char[], int, int)
56.73	java.lang.StringUTF16.getChars(byte[], int, int, char[], int)
51.78	java.lang.StringUTF16.getChar(byte[], int)
21.09	java.lang.String.length()

Note that getChars/toBytes are actually calling (get|put)Char.
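
To make the call chain concrete, here is a rough Java sketch of how the bulk methods can bottom out in the per-character accessors. This is an illustration only, not the actual JDK source: the class name Utf16Sketch is hypothetical, and the fixed high-byte-first order is an assumption made for the example. Only the method signatures are taken from the profile above.

```java
// Illustrative sketch only: Utf16Sketch is a hypothetical class, not java.lang.StringUTF16.
public final class Utf16Sketch {
    private Utf16Sketch() {}

    // Store one char as two bytes at char index `index`.
    // Assumption: high byte first; a real implementation may follow platform endianness.
    public static void putChar(byte[] val, int index, int c) {
        int i = index << 1;          // two bytes per char
        val[i] = (byte) (c >> 8);    // high byte
        val[i + 1] = (byte) c;       // low byte
    }

    // Read the char stored at char index `index`.
    public static char getChar(byte[] val, int index) {
        int i = index << 1;
        return (char) (((val[i] & 0xff) << 8) | (val[i + 1] & 0xff));
    }

    // Bulk char[] -> byte[] conversion: one putChar call per character.
    public static byte[] toBytes(char[] value, int off, int len) {
        byte[] val = new byte[len << 1];
        for (int i = 0; i < len; i++) {
            putChar(val, i, value[off + i]);
        }
        return val;
    }

    // Bulk byte[] -> char[] copy: one getChar call per character.
    public static void getChars(byte[] value, int srcBegin, int srcEnd, char[] dst, int dstBegin) {
        for (int i = srcBegin; i < srcEnd; i++) {
            dst[dstBegin++] = getChar(value, i);
        }
    }
}
```

With this shape, every character that goes through toBytes or getChars costs one accessor call, so a compiler that neither inlines nor intrinsifies the accessors pays that cost per character.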

Therefore, I would guess that intrinsifying StringUTF16.(get|put)Char in C1 would help to get our escape hatch working with -XX:-CompactStrings.
Excellent! That's what I was looking for.

You can find Aleksey's benchmarks (string-density-bench.jar) here: http://cr.openjdk.java.net/~shade/density/

Is it possible to attach the benchmarks?

Who would've thought...

After a few painful rounds of testing, we concluded that using Unsafe in String introduces nasty bootstrapping problems, when some primordial classes (String included) need not-yet-initialized Unsafe, e.g. to access individual String characters. This may happen easily when <clinit> of any primordial class executes before Unsafe is linked. This is potentially solvable by changing the VM Genesis sequence so that Unsafe, or some lightweight form of it, is linked before or right after String. That looks like a very intrusive change that potentially affects runtime reliability. With that, we think intrinsics should stay as our primary implementation. Future work may explore how to use Unsafe in this scenario.


Sherman pushed the Unsafe.getChar as the Java-level implementation for StringUTF16.(get|set)Char. Tobias implemented C1 intrinsics for these. C2 intrinsics are intact. Our performance runs show that both C1 and C2 intrinsics are important for performance, until Unsafe issues are resolved (if they *can* be resolved, in the first place). I renamed the issue to keep track of these intrinsics' status. See: http://cr.openjdk.java.net/~shade/8136757/charAt-intrinsics.txt

Another part of the overhead for C1 is caused by blowing the MaxInlineSize limit, see JDK-8138690.

Without doing the C1 intrinsics this sounds reasonable. But I want the C2 intrinsics to be removed before 9 GA.

I think this is what we should do: push the Unsafe.getChar change (Sherman has one in queue), to make the code cleaner, but still retain C2 intrinsics, and (optionally) add more C1 intrinsics for (get|put)Char. When Unsafe.getChar performance is back on track, remove the C1/C2 intrinsics.

I can work on fixing JDK-8074124 if we decide to go for the Unsafe solution.

Sure, but do you want to block JEP-254 work until JDK-8074124 is resolved (not sure it will be in any near future, given how nobody is assigned)? I think patching C1 to emit the specialized byte[]->char[] access code for our case is a reasonable short-term stop-gap solution.

I'm against that. We should put the effort into fixing JDK-8074124 in the first place.

JDK-8074124 outlines some of the pending issues with Unsafe. Therefore, I believe we should add the same intrinsics in C1 as the stop-gap solution, until JDK-8074124 is fixed.

I didn't follow the Compact Strings work at all but why do we have to intrinsify so much? Can't we just use Unsafe for StringUTF16.getChar/putChar and leave the compilers alone?