JDK-8232161 : Align some one-way conversion in MS950 charset with Windows
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio.charsets
  • Affected Version: 14
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2019-10-11
  • Updated: 2021-01-25
  • Resolved: 2020-03-18
JDK 11: 11.0.11-oracle (Fixed)
Related Reports
CSR :  
Relates :  
Sub Tasks
JDK-8240196 :  
Description
make/data/charsetmapping/MS950.nr contains 10 one-way (non-round-trip) conversion entries.

I tried the following steps on Traditional Chinese Windows.

1. Run WriteMS950Data to create the one-way conversion data file
>java WriteMS950Data.java ms950data.txt

2. Check byte data
>java DumpText.java ms950data.txt
A2 A4 F9 F9 0D 0A
A2 A5 F9 E9 0D 0A
A2 A7 F9 EB 0D 0A
A2 A6 F9 EA 0D 0A
A2 7E F9 FA 0D 0A
A2 A1 F9 FB 0D 0A
A2 A3 F9 FD 0D 0A
A2 A2 F9 FC 0D 0A
A2 CC A4 51 0D 0A
A2 CE A4 CA 0D 0A

3. Open ms950data.txt in Notepad and save it without editing
>notepad ms950data.txt

4. Check the byte data again
>java DumpText.java ms950data.txt
F9 F9 F9 F9 0D 0A
F9 E9 F9 E9 0D 0A
F9 EB F9 EB 0D 0A
F9 EA F9 EA 0D 0A
A2 7E A2 7E 0D 0A
A2 A1 A2 A1 0D 0A
A2 A3 A2 A3 0D 0A
A2 A2 A2 A2 0D 0A
A4 51 A4 51 0D 0A
A4 CA A4 CA 0D 0A

I think the data above is what the MS950 charset is expected to produce.
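
The Notepad round trip above suggests that both byte pairs on each line of step 2 decode to the same Unicode character, and that Windows writes only one of them back. A minimal sketch of that check follows. The class name Ms950DecodeSketch and its helper are hypothetical and not part of the attached test programs; it assumes the MS950 charset is available in the running JDK (jdk.charsets module).

```java
import java.nio.charset.Charset;

class Ms950DecodeSketch {
    // Decode one double-byte MS950 sequence into a String.
    static String decode(int lead, int trail) {
        Charset ms950 = Charset.forName("MS950"); // assumption: charset is installed
        return new String(new byte[] {(byte) lead, (byte) trail}, ms950);
    }

    public static void main(String[] args) {
        // Each row holds the first and second byte pair of one line from step 2.
        int[][] rows = {
            {0xA2, 0xA4, 0xF9, 0xF9}, {0xA2, 0xA5, 0xF9, 0xE9},
            {0xA2, 0xA7, 0xF9, 0xEB}, {0xA2, 0xA6, 0xF9, 0xEA},
            {0xA2, 0x7E, 0xF9, 0xFA}, {0xA2, 0xA1, 0xF9, 0xFB},
            {0xA2, 0xA3, 0xF9, 0xFD}, {0xA2, 0xA2, 0xF9, 0xFC},
            {0xA2, 0xCC, 0xA4, 0x51}, {0xA2, 0xCE, 0xA4, 0xCA},
        };
        for (int[] r : rows) {
            String first = decode(r[0], r[1]);
            String second = decode(r[2], r[3]);
            // Print both code points and whether the two pairs decode identically.
            System.out.printf("%02X%02X -> U+%04X, %02X%02X -> U+%04X, same=%b%n",
                    r[0], r[1], (int) first.charAt(0),
                    r[2], r[3], (int) second.charAt(0),
                    first.equals(second));
        }
    }
}
```

If every line reports same=true, the only open question is which of the two byte pairs the encoder should emit, which is what the rest of this report is about.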

>java -showversion TestMS950
openjdk version "14-ea" 2020-03-17
OpenJDK Runtime Environment (build 14-ea+17-721)
OpenJDK 64-Bit Server VM (build 14-ea+17-721, mixed mode, sharing)
\u2550\u2550, expected: \xF9\xF9\xF9\xF9, result: \xA2\xA4\xA2\xA4
\u255E\u255E, expected: \xF9\xE9\xF9\xE9, result: \xA2\xA5\xA2\xA5
\u2561\u2561, expected: \xF9\xEB\xF9\xEB, result: \xA2\xA7\xA2\xA7
\u256A\u256A, expected: \xF9\xEA\xF9\xEA, result: \xA2\xA6\xA2\xA6
Exception in thread "main" java.lang.Exception: Failed
        at TestMS950.main(TestMS950.java:83)
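
The source of TestMS950 is not included in this report. The following is a minimal sketch, under that caveat, of the kind of check that would produce output like the above: encode each character with MS950 and compare the bytes with the Windows result. The class and method names (Ms950EncodeSketch, hex, encodeHex) are hypothetical.

```java
import java.nio.charset.Charset;

class Ms950EncodeSketch {
    // Format bytes in the same \xNN style used in the output above.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encode a string with MS950 and return the bytes as \xNN text.
    static String encodeHex(String s) {
        return hex(s.getBytes(Charset.forName("MS950"))); // assumption: charset is installed
    }

    public static void main(String[] args) {
        // Characters and expected byte sequences taken from the output above.
        String[][] cases = {
            {"\u2550\u2550", "\\xF9\\xF9\\xF9\\xF9"},
            {"\u255E\u255E", "\\xF9\\xE9\\xF9\\xE9"},
            {"\u2561\u2561", "\\xF9\\xEB\\xF9\\xEB"},
            {"\u256A\u256A", "\\xF9\\xEA\\xF9\\xEA"},
        };
        boolean failed = false;
        for (String[] c : cases) {
            String result = encodeHex(c[0]);
            if (!result.equals(c[1])) {
                failed = true;
                System.out.println("expected: " + c[1] + ", result: " + result);
            }
        }
        System.out.println(failed ? "Failed" : "Passed");
    }
}
```

Which of the two results is printed depends on whether the JDK in use contains the fix.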

I also checked
ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit950.txt

$ egrep '^0x255[0e]|^0x256[1a]' bestfit950.txt 
0x2550	0xf9f9	;?? Forms Double Horizontal
0x255e	0xf9e9	;?? Forms Vertical Single And Right Double
0x2561	0xf9eb	;?? Forms Vertical Single And Left Double
0x256a	0xf9ea	;?? Forms Vertical Single And Horizontal Double

I think the current Java results for the above 4 characters are unexpected.
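
For comparing more entries than these four, the egrep output can be turned into a lookup table. The sketch below parses lines of the form shown above (Unicode code point, MS950 bytes, comment) into a map. The class name BestFitLineSketch is hypothetical, and the parser only assumes the whitespace-separated layout visible in the excerpt, not the full bestfit950.txt file format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BestFitLineSketch {
    // Parse "0x2550<TAB>0xf9f9<TAB>;comment" lines into code point -> MS950 bytes.
    static Map<Integer, Integer> parse(String text) {
        Map<Integer, Integer> map = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            String[] fields = line.trim().split("\\s+");
            // Skip anything that does not start with two hexadecimal fields.
            if (fields.length < 2
                    || !fields[0].startsWith("0x")
                    || !fields[1].startsWith("0x")) {
                continue;
            }
            map.put(Integer.decode(fields[0]), Integer.decode(fields[1]));
        }
        return map;
    }

    public static void main(String[] args) {
        String excerpt = "0x2550\t0xf9f9\t;Forms Double Horizontal\n"
                + "0x255e\t0xf9e9\t;Forms Vertical Single And Right Double\n";
        // Print each mapping as U+XXXX -> XXXX.
        parse(excerpt).forEach((cp, bytes) ->
                System.out.printf("U+%04X -> %04X%n", cp, bytes));
    }
}
```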

JTreg output is as follows:
Without Fix
=========
$ jtreg -v TestMS950.java
runner starting test: sun/nio/cs/TestMS950.java
runner finished test: sun/nio/cs/TestMS950.java
Failed. Execution failed: `main' threw exception: java.lang.Exception: Failed
Test results: failed: 1

With Fix
=======
$ jtreg -v TestMS950.java
runner starting test: sun/nio/cs/TestMS950.java
runner finished test: sun/nio/cs/TestMS950.java
Passed. Execution successful
Test results: passed: 1

The fix was submitted in
https://mail.openjdk.java.net/pipermail/core-libs-dev/2019-October/062862.html

Testcase2

1. Create the test data file ms950.txt
>jdk-14\bin\java WriteMS950Data1.java ms950.txt 

>type ms950.txt 
���
���
���
���
���
���
���
���
���
���
���
���
���
���
���
���
���
���
���
���

2. Copy the character on the 1st line and pass it as the argument to the findstr command
>findstr /N ��� ms950.txt 
2:���

3. Create a backup of the data file
>copy ms950.txt ms950.txt.orig 
        1 file(s) copied.

4. Open ms950.txt in Notepad, then save it without editing
>notepad ms950.txt 

5. Run the findstr command again; now lines 1 and 2 are displayed
>findstr /N ��� ms950.txt 
1:���
2:���

6. Create the same test data via Java
>jdk-14\bin\java cat.java ms950.txt.orig ms950-java.txt 

7. Run the findstr command again; nothing is displayed <= This is the issue
>findstr /N ��� ms950-java.txt 

8. Check the differences; 4 characters are displayed
>fc ms950.txt ms950-java.txt 
Comparing files ms950.txt and MS950-JAVA.TXT
***** ms950.txt
���
���
���
���
���
���
���
���
���
***** MS950-JAVA.TXT
���
���
���
���
���
���
���
���
���
*****

9. Check the character in each file
>jdk-14\bin\java lookfor.java ��� ms950-java.txt
  1: ���
  2: ���

>jdk-14\bin\java lookfor.java ��� ms950.txt
  1: ���
  2: ���

>jdk-14\bin\java lookfor.java ��� ms950.txt.orig
  1: ���
  2: ���
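
The source of cat.java used in step 6 is not shown. Since its output differs from the Notepad-saved file only in the one-way characters, it presumably decodes the input with MS950 and encodes it again. A minimal sketch under that assumption follows; the class name Ms950CatSketch and the method copyThroughCharset are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class Ms950CatSketch {
    // Read the input file, decode it with the charset, re-encode, and write the result.
    static void copyThroughCharset(Path in, Path out, Charset cs) throws IOException {
        String text = new String(Files.readAllBytes(in), cs);
        Files.write(out, text.getBytes(cs));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java Ms950CatSketch.java <input> <output>");
            return;
        }
        // Assumption: the MS950 charset is installed in the running JDK.
        copyThroughCharset(Paths.get(args[0]), Paths.get(args[1]),
                Charset.forName("MS950"));
    }
}
```

With such a round trip, any byte pair that is decode-only in Java's MS950 table is rewritten to the other pair on output, which is why findstr no longer finds the character in ms950-java.txt.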
Comments
Fix Request: This is a small fix to the MS950 charset, and I'd like to request it for jdk11u-dev. The patches apply cleanly in the above order. CSR JDK-8248305 for 11-pool was approved (the base CSR was JDK-8233385). Backport JDK-8245689 for 11-pool was opened.
02-07-2020

I've removed the jdk11u-fix-request label. Please add it back once a backport CSR request has been filed and got approved.
04-06-2020

Hi ~itakiguchi, yes, a CSR would be required as a prerequisite to backporting. For process details please have a look at: https://wiki.openjdk.java.net/display/JDKUpdates/How+to+contribute+a+fix
23-05-2020

Fix Request: This is a small fix to the MS950 charset, and we'd like to request it for 11u. The patches apply cleanly in the above order. "CSR JDK-8233385 Align some one-way conversion in MS950 charset with Windows" already exists. Please let me know if another CSR is required.
22-05-2020

URL: https://hg.openjdk.java.net/jdk/jdk/rev/5c47c5d72003 User: itakiguchi Date: 2020-03-18 09:17:42 +0000
18-03-2020