Bug ID: JDK-8196995 java.lang.Character should not state UTF-16 encoding is used for strings

Type: Bug
Component: core-libs
Sub-Component: java.lang
Affected Version: 9,10

Priority: P4
Status: Closed
Resolution: Not an Issue

Submitted: 2018-02-07
Updated: 2018-02-09
Resolved: 2018-02-08

Versions (Unresolved/Resolved/Fixed)

JDK 11: 11 (Resolved)

Text in java.lang.Character states that a UTF-16 character encoding is used for java.lang.String. While this was true for many years, it is not necessarily true, and it is not true in practice as of JDK 9 due to the improvements from JEP 254: Compact Strings.

The statement about the encoding should be corrected.

There is no question that the class doc for Character can be improved, but that is hard work! In particular, it reads a little as if it was written while supplementary characters were being added to the platform.

09-02-2018

Closing the issue after the review thread did not share the assessment that the current wording is confusing or misleading: http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-February/051325.html

08-02-2018

As far as I can see, the *model* presented by the APIs Character, String, CharSequence and by the programming language is UTF-16. This hasn't changed. The compact strings work changed the internal representation in some cases, but it didn't change the API model. If there is anything in the specs that talks about internal representation, that of course should be fixed.

08-02-2018
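
The distinction drawn in the comment above, between the UTF-16 API model and the internal representation, can be illustrated with a minimal sketch. The class name Utf16ModelDemo and its helper methods are hypothetical and exist only for this illustration; the sketch uses only public String and Character methods and does not inspect how the JVM stores the string internally.

```java
public class Utf16ModelDemo {

    // Number of UTF-16 code units, which is what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the whole string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // Contains only Latin-1 characters. On JDK 9+ with compact strings
        // enabled, the JVM may store this with one byte per character,
        // but that is not observable through the API used here.
        String latin1 = "abc";

        // 'a' followed by U+1F600, a supplementary character that the
        // API exposes as the surrogate pair D83D DE00.
        String supplementary = "a\uD83D\uDE00";

        System.out.println(codeUnits(latin1));          // 3 code units
        System.out.println(codeUnits(supplementary));   // 3 code units: 'a' + 2 surrogates
        System.out.println(codePoints(supplementary));  // 2 code points

        // charAt indexes UTF-16 code units, so index 1 is a high surrogate.
        System.out.println(Character.isHighSurrogate(supplementary.charAt(1)));

        // codePointAt combines the surrogate pair back into one code point.
        System.out.println(Integer.toHexString(supplementary.codePointAt(1)));
    }
}
```

The results are the same whichever internal layout the JVM chooses, which is the sense in which the API model remains UTF-16.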

Relates: JDK-8054307 - JEP 254: Compact Strings
Relates: JDK-8191410 - Unicode 10.0.0 support