JDK-4361835 : Mapping mistakes in JIS0201, JIS0208, JIS0212, and SHIFTJIS.
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio.charsets
  • Affected Version: 1.3.0,1.4.0
  • Priority: P3
  • Status: Closed
  • Resolution: Not an Issue
  • OS: solaris_8
  • CPU: sparc
  • Submitted: 2000-08-11
  • Updated: 2001-03-13
  • Resolved: 2001-03-12
Description
The test used four code pages from ftp://www.unicode.org/ .  The
code page for JIS0201 is for "JIS X 0201 (1976) to Unicode 1.1".  The
code page for SHIFTJIS says that
#   "This table contains the data the Unicode Consortium has on how
#       Shift-JIS (a combination of JIS 0201 and JIS 0208) maps into Unicode"
and is dated 8 March 1994.
The code page for JIS0212 is for "JIS X 0212 (1990)".
The JIS0208 test uses a code page for "JIS X 0208 (1990)".

Hex under "IN" is input to a conversion.  Hex in the "CHECK" column
is the expected output.  Hex under the "OUT" column is the actual output
using a 1.3 or 1.4 JDK.

CHECKING BYTE ARRAY TO STRING
PASS? CODE     IN    CHECK  OUT   COMMENT
FAIL  JIS0201  5C    00A5   005C  # YEN SIGN
FAIL  JIS0201  7E    203E   007E  # OVERLINE
FAIL  JIS0212  2237  007E   FF5E  # TILDE
FAIL  SJIS     5C    00A5   005C  # YEN SIGN
FAIL  SJIS     7E    203E   007E  # OVERLINE
FAIL  SJIS     815F  005C   FF3C  # REVERSE SOLIDUS
FAIL  JIS0208  2140  005C   FF3C  # REVERSE SOLIDUS
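
For readers without the attachments, a byte-array-to-string check of this kind can be sketched roughly as follows. This is not the attached test. The class and method names (DecodeCheck, decodeToHex, row) are made up for illustration, the code targets a current JDK rather than 1.3/1.4, and it assumes the runtime provides a charset under the name "Shift_JIS".

```java
import java.nio.charset.Charset;

// Hypothetical sketch, not the test attached to this report.
final class DecodeCheck {

    // Decode raw bytes with the named charset and return the resulting
    // UTF-16 code units as space-separated four-digit hex, e.g. "005C".
    static String decodeToHex(byte[] in, String charsetName) {
        String s = new String(in, Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Build one report row: verdict, charset, expected (CHECK), actual (OUT).
    static String row(String charsetName, byte[] in, String expectedHex) {
        String actual = decodeToHex(in, charsetName);
        String verdict = actual.equals(expectedHex) ? "PASS" : "FAIL";
        return verdict + "  " + charsetName + "  " + expectedHex + "  " + actual;
    }

    public static void main(String[] args) {
        // Expected values are taken from the CHECK column above.
        // Assumption: "Shift_JIS" is available in this runtime.
        System.out.println(row("Shift_JIS", new byte[] {0x5C}, "00A5"));
        System.out.println(row("Shift_JIS", new byte[] {0x7E}, "203E"));
        System.out.println(row("Shift_JIS", new byte[] {(byte) 0x81, 0x5F}, "005C"));
    }
}
```

The expected values come from the unicode.org tables, so a FAIL row here only means the converter disagrees with those tables.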

CHECKING STRING TO BYTE ARRAY
PASS? CODE     IN    CHECK  OUT   COMMENT
FAIL  JIS0212  007E  2237   7E    # TILDE
FAIL  SJIS     005C  815F   5C    # REVERSE SOLIDUS
FAIL  JIS0208  005C  2140   5C    # REVERSE SOLIDUS
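
The string-to-byte-array direction can be sketched in the same spirit. Again this is only an illustration under assumptions: EncodeCheck and encodeToHex are hypothetical names, the code targets a current JDK, and "Shift_JIS" is assumed to be available in the runtime.

```java
import java.nio.charset.Charset;

// Hypothetical sketch, not the test attached to this report.
final class EncodeCheck {

    // Encode a string with the named charset and return the bytes as
    // contiguous two-digit hex, e.g. "815F".
    static String encodeToHex(String in, String charsetName) {
        byte[] out = in.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : out) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+005C REVERSE SOLIDUS, built from its code point so the source
        // does not need an escaped backslash literal.
        String reverseSolidus = String.valueOf((char) 0x005C);
        String actual = encodeToHex(reverseSolidus, "Shift_JIS");
        // "815F" is the CHECK value from the table above.
        String verdict = actual.equals("815F") ? "PASS" : "FAIL";
        System.out.println(verdict + "  Shift_JIS  005C  815F  " + actual);
    }
}
```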

The tests and code pages are attached.  These problems were also
reported in 4296969.  However, 4296969 reports a host of problems.
Not all of the "problems" reported in 4296969 are, in my
opinion, legitimate.  This bug is submitted to keep real bugs
from getting buried.

algol% /usr/j2se/bin/java -version
java version "1.3.0"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0-b25)
Java HotSpot(TM) Client VM (build 1.3.0-b25, mixed mode)
sqesvr% /usr/local/java/jdk1.4/solaris/bin/java -version
java version "1.4.0beta"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0beta-b27)
Java HotSpot(TM) Core VM (build 1.3-internal, interpreted mode)


allan.jacobs@Eng 2000-08-11

Comments
EVALUATION

Will address for merlin.
jerry.driscoll@eng 2000-09-07

These modifications compared to the SJIS/JIS0201/JIS0208/JIS0212 mappings on www.unicode.org are intentional so as to ensure that the Yen symbol and Overline symbols are treated logically as backslash and tilde respectively. There may be an argument for an RFE to be submitted requesting provision of encodings which adhere strictly to the published mappings and which do not perform the special treatment of these specific code points.

Also see: http://www.opengroup.or.jp/jvc/cde/ucs-conv-e.html for a discussion on the Yen-sign problem within the Ja encodings.
Ian.Little@Ireland 10/19/2000

These are not mistakes. These code points have been purposefully mapped in this way to overcome issues which would otherwise cause major problems for information processing for Japanese customers and licensees.
11-06-2004
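
The evaluation mentions a possible RFE for encodings that follow the published mappings strictly. In the absence of such an encoding, an application that needs the unicode.org SHIFTJIS results could, as a rough workaround, remap the three affected characters after decoding. The sketch below assumes the decoder behaves as in the SJIS rows of the OUT column above. PublishedMappingSketch and toPublishedMapping are hypothetical names, not part of any JDK API, and the JIS0212 tilde case is not covered.

```java
// Hypothetical post-processing sketch, not a JDK facility.
final class PublishedMappingSketch {

    // Remap characters produced by an SJIS decode (per the OUT column)
    // to the values in the unicode.org SHIFTJIS table (CHECK column).
    static String toPublishedMapping(String decoded) {
        StringBuilder sb = new StringBuilder(decoded.length());
        for (int i = 0; i < decoded.length(); i++) {
            char c = decoded.charAt(i);
            switch (c) {
                case 0x005C: // from byte 5C: table says YEN SIGN
                    sb.append((char) 0x00A5);
                    break;
                case 0x007E: // from byte 7E: table says OVERLINE
                    sb.append((char) 0x203E);
                    break;
                case 0xFF3C: // from bytes 815F: table says REVERSE SOLIDUS
                    sb.append((char) 0x005C);
                    break;
                default:
                    sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

The remapping is done character by character in a single pass, so a U+FF3C that becomes U+005C is not remapped a second time to U+00A5.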