JDK-4202869 : Broken mappings in EUC_JP converters
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio.charsets
  • Affected Version: 1.2.0
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • OS: generic
  • CPU: generic
  • Submitted: 1999-01-13
  • Updated: 1999-07-06
  • Resolved: 1999-07-06
Related Reports
Duplicate :  
Description

Name: bb33257			Date: 01/13/99


Four symbol characters have different mappings in the Microsoft
EUC_JP code page and in the real industry standard.
For one of these characters the Java converter is using the MS
mapping rather than the official JIS one:
  x815C -> u2015 (HORIZONTAL BAR) is the incorrect (MS) mapping

It should be changed to this:
  x815C -> u2014 (EM DASH) 
and the reverse should be added as well:
  u2014 (EM DASH) -> x815C
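
One way to see which mapping a given Java runtime actually uses is to probe the converters directly. The sketch below is illustrative only: the class and method names (MappingProbe, decodeStrict, encodeStrict, hex) are hypothetical. It also assumes that the byte value x815C quoted above is the Shift_JIS form of the character and that A1 BD is the corresponding EUC-JP form. It prints whatever the runtime reports and makes no claim about what any particular JDK release returns.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

final class MappingProbe {
    /** Decodes strictly; returns null if the bytes are malformed or unmappable. */
    static String decodeStrict(Charset cs, byte[] bytes) {
        try {
            return cs.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    /** Encodes strictly; returns null if the text has no mapping in the charset. */
    static byte[] encodeStrict(Charset cs, String text) {
        try {
            ByteBuffer out = cs.newEncoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .encode(CharBuffer.wrap(text));
            byte[] result = new byte[out.remaining()];
            out.get(result);
            return result;
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    /** Formats bytes as upper-case hex with no separators. */
    static String hex(byte[] bytes) {
        if (bytes == null) return "(unmappable)";
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) sb.append(String.format("%02X", b & 0xFF));
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed byte forms of the disputed character in each encoding.
        String[] names = {"Shift_JIS", "EUC-JP"};
        byte[][] forms = {{(byte) 0x81, (byte) 0x5C}, {(byte) 0xA1, (byte) 0xBD}};
        for (int i = 0; i < names.length; i++) {
            if (!Charset.isSupported(names[i])) {
                System.out.println(names[i] + ": not available in this runtime");
                continue;
            }
            Charset cs = Charset.forName(names[i]);
            String decoded = decodeStrict(cs, forms[i]);
            System.out.println(names[i] + " x" + hex(forms[i]) + " -> "
                    + (decoded == null ? "(error)"
                       : String.format("u%04X", (int) decoded.charAt(0))));
            // Check the reverse direction for both candidate characters.
            for (String ch : new String[] {"\u2014", "\u2015"}) {
                System.out.println(String.format("  u%04X -> x%s",
                        (int) ch.charAt(0), hex(encodeStrict(cs, ch))));
            }
        }
    }
}
```

Using REPORT for both error actions makes a missing reverse mapping show up as "(unmappable)" rather than being silently replaced with a substitution byte.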

For IBM and MS encoding compatibility, it's also recommended to
add the following mappings, which are not formally part of the
standard but are defined in both the MS and IBM converters:
    uFFE0 (FULLWIDTH CENT SIGN) -> x8191 
    uFFE1 (FULLWIDTH POUND SIGN) -> x8192 
    uFFE3 (FULLWIDTH MACRON) -> x8150  

I'm filing this bug on behalf of Masayuki Fuse <###@###.###>
of IBM Japan's DBCS Technical Center.  Feel free to contact him
directly for more information.
======================================================================

Comments
EVALUATION The mappings for 0x213D (JIS X0208-GL) came from Unicode.org, not Microsoft. It should be changed to follow the national standard. As a matter of policy, Java follows international and national standards for non-Microsoft encodings. The changes for IBM-MS compatibility will not be supported. masayoshi.okutsu@Eng 1999-07-06
06-07-1999
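
The description quotes x815C while the evaluation refers to 0x213D (JIS X0208-GL). These appear to be two byte forms of the same JIS X 0208 character, related by the usual code conversion arithmetic. The sketch below shows that arithmetic. The class and method names (JisForms, toShiftJis, toEucJp) are hypothetical. The JIS codes 0x2171, 0x2172 and 0x2131 for the three compatibility characters are inferred here from the byte values in the description and are not stated in the report itself.

```java
final class JisForms {
    /** EUC-JP form of a two-byte JIS X 0208 code: set the high bit of each byte. */
    static int toEucJp(int jis) {
        return jis | 0x8080;
    }

    /** Shift_JIS form of a two-byte JIS X 0208 code (each byte in 0x21..0x7E). */
    static int toShiftJis(int jis) {
        int j1 = (jis >> 8) & 0xFF;
        int j2 = jis & 0xFF;
        if (j1 < 0x21 || j1 > 0x7E || j2 < 0x21 || j2 > 0x7E) {
            throw new IllegalArgumentException(
                    String.format("not a JIS X 0208 GL code: 0x%04X", jis));
        }
        // Two JIS rows share one Shift_JIS lead byte.
        int s1 = ((j1 + 1) >> 1) + (j1 < 0x5F ? 0x70 : 0xB0);
        int s2;
        if ((j1 & 1) == 1) {
            // Odd row: trail bytes 0x40..0x9E, skipping 0x7F.
            s2 = j2 + (j2 >= 0x60 ? 0x20 : 0x1F);
        } else {
            // Even row: trail bytes 0x9F..0xFC.
            s2 = j2 + 0x7E;
        }
        return (s1 << 8) | s2;
    }

    public static void main(String[] args) {
        // 0x213D is cited in the evaluation; the other three are inferred codes.
        int[] codes = {0x213D, 0x2171, 0x2172, 0x2131};
        for (int jis : codes) {
            System.out.println(String.format(
                    "JIS 0x%04X -> Shift_JIS 0x%04X, EUC-JP 0x%04X",
                    jis, toShiftJis(jis), toEucJp(jis)));
        }
    }
}
```

Under this arithmetic, 0x213D corresponds to 0x815C in Shift_JIS form and 0xA1BD in EUC-JP form, which is consistent with the description and the evaluation referring to the same character.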