JDK-8196995 : java.lang.Character should not state UTF-16 encoding is used for strings
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.lang
  • Affected Version: 9,10
  • Priority: P4
  • Status: Closed
  • Resolution: Not an Issue
  • Submitted: 2018-02-07
  • Updated: 2018-02-09
  • Resolved: 2018-02-08
JDK 11
Text in java.lang.Character states that a UTF-16 character encoding is used for java.lang.String. While this was true for many years, it is not necessarily true, and not true in practice as of JDK 9, due to the improvements from JEP 254: Compact Strings.

The statement about the encoding should be corrected.
There is no question that the class doc for Character can be improved, but that is hard work! In particular, it reads a little like it was written while supplementary characters were being added to the platform.

Closing the issue after the review thread did not share the assessment that the current wording is confusing or misleading: http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-February/051325.html

As far as I can see the *model* presented by the APIs Character, String, CharSequence and in the programming language is UTF-16. This hasn't been changed. The compact strings work changed the internal representation in some cases, but this work didn't change the API model. If there is anything in the specs that talks about internal representation, that of course should be fixed.
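
A minimal sketch of what "the API model is UTF-16" means in practice. The class and method names (Utf16ModelDemo, codeUnits, codePoints) are hypothetical and not taken from the JDK or from this issue. It uses only public String and Character methods, so it should behave the same whether or not the runtime stores a given string compactly.

```java
// Hypothetical demo class; the names are illustrative only.
final class Utf16ModelDemo {

    // Number of UTF-16 code units (char values) reported by the String API.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the whole string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // Latin-1 only content: a JDK 9+ runtime may store this one byte per char.
        String latin = "abc";
        // 'a' followed by U+1F600, written as the surrogate pair D83D DE00.
        String supplementary = "a\uD83D\uDE00";

        // The API counts UTF-16 code units in both cases.
        System.out.println(codeUnits(latin));          // 3
        System.out.println(codeUnits(supplementary));  // 3
        System.out.println(codePoints(supplementary)); // 2

        // Index 1 is the high surrogate; codePointAt(1) combines the pair.
        System.out.println(Character.isHighSurrogate(supplementary.charAt(1)));
        System.out.println(Integer.toHexString(supplementary.codePointAt(1))); // 1f600
    }
}
```

None of these calls exposes how the characters are stored, which is the distinction drawn above between the API model and the internal representation.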