JDK-8217938 : Support new Japanese era in java.lang.Character for Java SE 12
  • Type: CSR
  • Component: core-libs
  • Sub-Component: java.lang
  • Priority: P2
  • Status: Closed
  • Resolution: Approved
  • Fix Versions: 12
  • Submitted: 2019-01-28
  • Updated: 2019-02-22
  • Resolved: 2019-02-01
Description
Summary
-------

Mandate support in Java SE 12 for the new Japanese era code point.

Problem
-------

The Java SE 12 Platform supports Unicode 11.0 ([JDK-8212120][2]), but it must also support the new Japanese era code point, U+32FF, which Unicode 11.0 does not include. The Java SE 11 Platform was updated to support the new code point _at the discretion of the implementation_ ([JDK-8216594][1]); for the Java SE 12 Platform, it is appropriate to mandate support for the new code point in all implementations.

Solution
--------

Mandate support for the new Japanese era code point in all Java SE 12 implementations. Every method in the `Character` class will then recognize the code point and behave consistently, which improves the maintainability of all Java SE 12 implementations. (In contrast, the Java SE 11 Platform disallowed the code point in the `isJavaIdentifierStart` and `isJavaIdentifierPart` methods, which made the implementations of those methods harder to maintain.)
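
As an illustration of "behave consistently", the sketch below compares a handful of `Character` properties for U+32FF against the existing era character U+337E cited in the specification text. The class `EraCodePointCheck` and its methods are hypothetical helpers written for this note, not part of the JDK, and the printed values depend on the runtime the code is executed on.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of the JDK): compares Character properties of two code points. */
final class EraCodePointCheck {
    static final int NEW_ERA = 0x32FF; // new Japanese era code point named in this CSR
    static final int MEIJI = 0x337E;   // existing era character cited in the specification text

    /** Collects a few Character properties for one code point, in a stable order. */
    static Map<String, Object> properties(int codePoint) {
        Map<String, Object> p = new LinkedHashMap<>();
        p.put("isDefined", Character.isDefined(codePoint));
        p.put("getType", Character.getType(codePoint));
        p.put("isLetter", Character.isLetter(codePoint));
        p.put("isJavaIdentifierStart", Character.isJavaIdentifierStart(codePoint));
        p.put("isJavaIdentifierPart", Character.isJavaIdentifierPart(codePoint));
        p.put("isIdentifierIgnorable", Character.isIdentifierIgnorable(codePoint));
        return p;
    }

    /** True when both code points report identical values for the sampled properties. */
    static boolean sameProperties(int a, int b) {
        return properties(a).equals(properties(b));
    }

    public static void main(String[] args) {
        System.out.println("U+32FF: " + properties(NEW_ERA));
        System.out.println("U+337E: " + properties(MEIJI));
        System.out.println("consistent: " + sameProperties(NEW_ERA, MEIJI));
    }
}
```

On an implementation that follows this proposal, the two property maps would be expected to match; on a runtime that does not assign U+32FF, they would differ.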

Specification
-------------

Change the first half (before the "Unicode Character Representations" header) of the `Character` class specification from:
```
  * The {@code Character} class wraps a value of the primitive
  * type {@code char} in an object. An object of type
  * {@code Character} contains a single field whose type is
  * {@code char}.
  * <p>
  * In addition, this class provides several methods for determining
  * a character's category (lowercase letter, digit, etc.) and for converting
  * characters from uppercase to lowercase and vice versa.
  * <p>
  * Character information is based on the Unicode Standard, version 11.0.0.
  * <p>
  * The methods and data of class {@code Character} are defined by
  * the information in the <i>UnicodeData</i> file that is part of the
  * Unicode Character Database maintained by the Unicode
  * Consortium. This file specifies various properties including name
  * and general category for every defined Unicode code point or
  * character range.
  * <p>
  * The file and its description are available from the Unicode Consortium at:
  * <ul>
  * <li><a href="http://www.unicode.org">http://www.unicode.org</a>
  * </ul>
  * <p>
  * The code point, U+32FF, is reserved by the Unicode Consortium
  * to represent the Japanese square character for the new era that begins
  * May 2019. Relevant methods in the Character class return the same
  * properties as for the existing Japanese era characters (e.g., U+337E for
  * "Meizi"). For the details of the code point, refer to
  * <a href="http://blog.unicode.org/2018/09/new-japanese-era.html">
  * http://blog.unicode.org/2018/09/new-japanese-era.html</a>.
```
to:
```
 * The {@code Character} class wraps a value of the primitive
 * type {@code char} in an object. An object of class
 * {@code Character} contains a single field whose type is
 * {@code char}.
 * <p>
 * In addition, this class provides a large number of static methods for
 * determining a character's category (lowercase letter, digit, etc.)
 * and for converting characters from uppercase to lowercase and vice
 * versa.
 *
 * <h3><a id="conformance">Unicode Conformance</a></h3>
 * <p>
 * The fields and methods of class {@code Character} are defined in terms
 * of character information from the Unicode Standard, specifically the
 * <i>UnicodeData</i> file that is part of the Unicode Character Database.
 * This file specifies properties including name and category for every
 * assigned Unicode code point or character range. The file is available
 * from the Unicode Consortium at
 * <a href="http://www.unicode.org">http://www.unicode.org</a>.
 * <p> 
 * The Java SE 12 Platform uses character information from version 11.0
 * of the Unicode Standard, plus the Japanese Era code point,
 * {@code U+32FF}, from the first version of the Unicode Standard
 * after 11.0 that assigns the code point.
```
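
A minimal sketch of how the conformance statement above could be probed at run time, assuming a runtime that provides `Runtime.version().feature()` (Java 10 or later). The class `UnicodeConformanceProbe` and its method names are hypothetical and exist only for this illustration; the check covers just the U+32FF requirement, not Unicode 11.0 conformance as a whole.

```java
/** Hypothetical probe (not part of the JDK) for the U+32FF requirement described above. */
final class UnicodeConformanceProbe {
    static final int JAPANESE_ERA = 0x32FF;

    /** True when the running platform's feature release is 12 or later. */
    static boolean isJavaSe12OrLater() {
        return Runtime.version().feature() >= 12;
    }

    /** True when the runtime treats U+32FF as an assigned code point. */
    static boolean eraCodePointAssigned() {
        return Character.isDefined(JAPANESE_ERA)
                && Character.getType(JAPANESE_ERA) != Character.UNASSIGNED;
    }

    /** The requirement only applies to Java SE 12 and later; earlier releases pass vacuously. */
    static boolean conformsToEraRequirement() {
        return !isJavaSe12OrLater() || eraCodePointAssigned();
    }

    public static void main(String[] args) {
        System.out.println("feature release: " + Runtime.version().feature());
        System.out.println("U+32FF assigned: " + eraCodePointAssigned());
        System.out.println("meets requirement: " + conformsToEraRequirement());
    }
}
```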

Change the second paragraph of the `isJavaIdentifierPart(char)` and `isJavaIdentifierPart(int)` method descriptions from:
```
     * A character may be part of a Java identifier if any of the following
     * are true:
```
to:
```
     * A character may be part of a Java identifier if any of the following
     * conditions are true:
```
Change the last list item of the conditions in the `isJavaIdentifierPart(int)` method description from:
```
     * <li> {@link #isIdentifierIgnorable(int)
     * isIdentifierIgnorable(codePoint)} returns {@code true} for
     * the character
```
to:
```
     * <li> {@link #isIdentifierIgnorable(int)
     * isIdentifierIgnorable(codePoint)} returns {@code true} for
     * the code point
```
Change the second paragraph of the `isJavaLetter(char)` method description from:
```
     * A character may start a Java identifier if and only if
     * one of the following is true:
```
to:
```
     * A character may start a Java identifier if and only if
     * one of the following conditions is true:
```
Change the second paragraph of the `isJavaLetterOrDigit(char)` method description from:
```
     * A character may be part of a Java identifier if and only if any
     * of the following are true:
```
to:
```
     * A character may be part of a Java identifier if and only if one
     * of the following conditions is true:
```
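
The wording changes above concern lists of conditions that are only partly quoted in this request. As a rough illustration, the sketch below restates the conditions listed in the `isJavaIdentifierPart(int)` Javadoc in terms of general categories and compares the result with the library method for a few sample code points. The class `IdentifierPartSketch` is hypothetical, the condition list is taken from the method's Javadoc rather than from this CSR, and agreement is only checked for the samples shown, not for every code point.

```java
/** Hypothetical illustration (not the JDK implementation) of the isJavaIdentifierPart conditions. */
final class IdentifierPartSketch {

    /** Returns true if any of the documented conditions holds for the code point. */
    static boolean mayBePartOfIdentifier(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.CURRENCY_SYMBOL:        // such as '$'
            case Character.CONNECTOR_PUNCTUATION:  // such as '_'
            case Character.DECIMAL_DIGIT_NUMBER:   // a digit
            case Character.LETTER_NUMBER:          // a numeric letter, such as a Roman numeral
            case Character.COMBINING_SPACING_MARK: // a combining mark
            case Character.NON_SPACING_MARK:       // a non-spacing mark
                return true;
            default:
                // The remaining conditions: a letter, or an ignorable code point.
                return Character.isLetter(codePoint)
                        || Character.isIdentifierIgnorable(codePoint);
        }
    }

    public static void main(String[] args) {
        int[] samples = {'a', '_', '$', '7', ' ', '+', 0x32FF};
        for (int cp : samples) {
            System.out.printf("U+%04X sketch=%b library=%b%n",
                    cp, mayBePartOfIdentifier(cp), Character.isJavaIdentifierPart(cp));
        }
    }
}
```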

  [1]: https://bugs.openjdk.java.net/browse/JDK-8216594
  [2]: https://bugs.openjdk.java.net/browse/JDK-8212120

Comments
Re-approving updated request.
01-02-2019

Moving to Approved.
31-01-2019