JDK-8242541 : Small charset issues (ISO8859-16, x-eucJP-Open, x-IBM834 and x-IBM949C)
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio.charsets
  • Affected Version: 11,14,15
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2020-04-13
  • Updated: 2020-09-30
  • Resolved: 2020-05-01
Release   Fix version      Status
JDK 11    11.0.10-oracle   Fixed
JDK 13    13.0.4           Fixed
JDK 15    15 b22           Fixed
Description
Several small charset issues were found:
* The historical name alias is missing from ISO-8859-16
* The hisname of x-eucJP-Open contains a typo
* The x-IBM834 and x-IBM949C charset sources should be template style

Details are as follows:

Missing historical name alias in ISO-8859-16

java.io.InputStreamReader.getEncoding() returns the charset's historical name if the charset implements the sun.nio.cs.HistoricallyNamedCharset interface.
However, the historical name of the ISO-8859-16 charset (ISO8859_16) is not defined as one of its aliases, so Charset.forName() cannot resolve it.

======
$ cat HistNameTest.java
import java.nio.charset.*;
import java.io.*;


public class HistNameTest {
    public static void main(String[] args) throws Exception {
        for (Charset cs : Charset.availableCharsets().values()) {
            String enc = (new InputStreamReader(System.in, cs)).getEncoding();
            try {
                if (!cs.equals(Charset.forName(enc)))
                    System.err.println(cs.name()+"<>"+enc);
            } catch (Exception e) {
                System.err.println(cs.name());
                e.printStackTrace();
            }
        }
    }
}
$ ~/jdk-15.jdk/Contents/Home/bin/java HistNameTest.java
ISO-8859-16
java.nio.charset.UnsupportedCharsetException: ISO8859_16
        at java.base/java.nio.charset.Charset.forName(Charset.java:526)
        at HistNameTest.main(HistNameTest.java:9)
...
======
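
The check performed by the test above can also be written as a small reusable helper. The following is only a sketch: the class and method names (HistoricalNameCheck, reportedName, roundTrips) are made up for illustration, and an empty in-memory stream is used instead of System.in. It asks InputStreamReader for the name it reports and then verifies that Charset.forName() maps that name back to the same charset.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class HistoricalNameCheck {
    /** Returns the name that InputStreamReader.getEncoding() reports for cs. */
    public static String reportedName(Charset cs) {
        // An empty in-memory stream is enough; nothing is ever read from it.
        return new InputStreamReader(new ByteArrayInputStream(new byte[0]), cs).getEncoding();
    }

    /** True if the reported name resolves back to cs through Charset.forName(). */
    public static boolean roundTrips(Charset cs) {
        String name = reportedName(cs);
        return Charset.isSupported(name) && Charset.forName(name).equals(cs);
    }

    public static void main(String[] args) {
        Charset cs = StandardCharsets.UTF_8;
        System.out.println(cs.name() + " is reported as " + reportedName(cs)
                + ", round-trips: " + roundTrips(cs));
    }
}
```

On a JDK affected by this issue, roundTrips() would be expected to return false for ISO-8859-16, because its reported name is not registered as an alias.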

Typo in hisname of x-eucJP-Open

According to make/data/charsetmapping/charsets,
the hisname of x-eucJP-Open is misspelled as "EUC_JP_Solari"; it should be "EUC_JP_Solaris".
======
charset x-eucJP-Open EUC_JP_Open
    package sun.nio.cs.ext
    type    template
    hisname EUC_JP_Solari
    ascii   true
    alias   EUC_JP_Solaris      # JDK historical
    alias   eucJP-open
======

However, this hisname value is not actually used.
According to src/jdk.charsets/share/classes/sun/nio/cs/ext/EUC_JP_Open.java.template,
the historical name is hard-coded. The typo should be fixed anyway.
======
    public EUC_JP_Open() {
        super("x-eucJP-Open", $ALIASES$);
    }

    public String historicalName() {
        return "EUC_JP_Solaris";
    }
======
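
Whether a given name is registered as an alias can be checked through the public Charset API. The sketch below uses a hypothetical helper class (AliasLookup) and makes no assumption that x-eucJP-Open is present, since the charset belongs to the jdk.charsets module and may be missing from a trimmed runtime.

```java
import java.nio.charset.Charset;

public class AliasLookup {
    /** True if name is one of the aliases of cs (charset names are case-insensitive). */
    public static boolean hasAlias(Charset cs, String name) {
        for (String alias : cs.aliases()) {
            if (alias.equalsIgnoreCase(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String canonical = "x-eucJP-Open";
        // Guard the lookup: the charset may be absent from this runtime.
        if (!Charset.isSupported(canonical)) {
            System.out.println(canonical + " is not available in this runtime");
            return;
        }
        Charset cs = Charset.forName(canonical);
        // Compare the correct historical name with the misspelled hisname.
        for (String name : new String[] { "EUC_JP_Solaris", "EUC_JP_Solari" }) {
            System.out.println(name + " is an alias: " + hasAlias(cs, name));
        }
    }
}
```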

x-IBM834 and x-IBM949C charset sources should be template style

According to make/data/charsetmapping/charsets,
the type of both x-IBM834 and x-IBM949C is "source".
======
charset x-IBM834 IBM834 # EBCDIC DBCS-only Korean
    package sun.nio.cs.ext
    type    source
...
charset x-IBM949C IBM949C
    package sun.nio.cs.ext
    type    source
======
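
To list which entries use a given type, the file can be scanned with a few lines of code. This is a rough sketch that assumes only the layout visible in the excerpts of this report (a "charset" line followed by indented key/value lines, with "#" starting a comment). It is not the parser used by the JDK build, and the class name CharsetsFileSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CharsetsFileSketch {
    /** Maps each charset name to the value of its "type" line. */
    public static Map<String, String> typesByCharset(String text) {
        Map<String, String> types = new LinkedHashMap<>();
        String current = null;
        for (String raw : text.split("\\R")) {
            // Drop a trailing '#' comment, then surrounding whitespace.
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] tok = line.split("\\s+");
            if (tok[0].equals("charset") && tok.length >= 2) {
                current = tok[1];              // start of a new entry
            } else if (tok[0].equals("type") && tok.length >= 2 && current != null) {
                types.put(current, tok[1]);    // type of the current entry
            }
        }
        return types;
    }

    public static void main(String[] args) {
        // Sample entries copied from the excerpts quoted in this report.
        String sample = String.join("\n",
            "charset x-IBM834 IBM834 # EBCDIC DBCS-only Korean",
            "    package sun.nio.cs.ext",
            "    type    source",
            "charset x-IBM933 IBM933",
            "    package sun.nio.cs.ext",
            "    type    ebcdic");
        typesByCharset(sample).forEach((name, type) -> System.out.println(name + " : " + type));
    }
}
```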

IBM834.java refers to the IBM933 class.
src/jdk.charsets/share/classes/sun/nio/cs/ext/IBM834.java
======
    public CharsetDecoder newDecoder() {
        IBM933.initb2c();
        return new DoubleByte.Decoder_DBCSONLY(
            this, IBM933.b2c, null, 0x40, 0xfe);  // hardcode the b2min/max
    }
======

IBM949C.java refers to the IBM949 class.
src/jdk.charsets/share/classes/sun/nio/cs/ext/IBM949C.java
======
    public CharsetDecoder newDecoder() {
        return new DoubleByte.Decoder(this,
                                      IBM949.b2c,
                                      b2cSB,
                                      0xa1,
                                      0xfe);
    }
======

According to make/data/charsetmapping/charsets,
IBM933 and IBM949 are not of type "source".
======
charset x-IBM933 IBM933
    package sun.nio.cs.ext
    type    ebcdic
...
charset x-IBM949 IBM949
    package sun.nio.cs.ext
    type    dbcs
======

IBM933 and IBM949 can be moved to the sun.nio.cs package via a make/data/charsetmapping/stdcs-* file.
In that case, IBM834 and IBM949C cannot be moved to the sun.nio.cs package along with them while their type is "source".
So their source code should be converted to template style.
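
The benefit of template style can be illustrated with a toy expansion step. This is a simplified sketch and not the JDK build tool: $ALIASES$ is the placeholder shown in EUC_JP_Open.java.template above, while the $PACKAGE$ placeholder, the class name TemplateSketch and the expand() method are assumptions made for this example. Because the package name is substituted at generation time, the same template could be emitted into either sun.nio.cs.ext or sun.nio.cs.

```java
import java.util.List;
import java.util.stream.Collectors;

public class TemplateSketch {
    /** Replaces the (assumed) $PACKAGE$ and $ALIASES$ placeholders in a template. */
    public static String expand(String template, String pkg, List<String> aliases) {
        // Render the aliases as a Java array literal, e.g. new String[] { "a", "b" }
        String aliasArray = aliases.stream()
            .map(a -> "\"" + a + "\"")
            .collect(Collectors.joining(", ", "new String[] { ", " }"));
        return template.replace("$PACKAGE$", pkg).replace("$ALIASES$", aliasArray);
    }

    public static void main(String[] args) {
        // Hypothetical template fragment, loosely modelled on the excerpts in this report.
        String template = String.join("\n",
            "package $PACKAGE$;",
            "",
            "    public IBM834() {",
            "        super(\"x-IBM834\", $ALIASES$);",
            "    }");
        System.out.println(expand(template, "sun.nio.cs", List.of("ibm834", "834")));
    }
}
```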
Comments
Fix request (13u) Requesting backport to 13u for parity with 11u, applies cleanly.
08-06-2020

I think we can still admit it to 11.0.8 at this stage. jdk11u-critical approved.
29-05-2020

Fix Request (jdk11u-critical-request) I got jdk11u-fix-yes approval on May 22. I submitted merge request on May 25 before rampdown. I know it's bad timing. But I'd like to confirm the possibility of merging this changeset against 11.0.8.
29-05-2020

Fix Request It's small fixes against charsets, and we'd like to request the fix in 11u. The patches could apply cleanly with above order
22-05-2020

URL: https://hg.openjdk.java.net/jdk/jdk/rev/c824a3791866 User: itakiguchi Date: 2020-05-01 12:53:18 +0000
01-05-2020