JDK-4113837 : JIS0208 mapping rules are not correct in JDK1.1.6G!
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio.charsets
  • Affected Version: 1.1.6,1.2.0
  • Priority: P4
  • Status: Closed
  • Resolution: Fixed
  • OS: generic,solaris_2.5.1
  • CPU: generic,sparc
  • Submitted: 1998-02-20
  • Updated: 1998-08-13
  • Resolved: 1998-08-13
Fixed in: 1.1.6, 1.2.0
Description

Name: paC48320			Date: 02/20/98


In JDK 1.1.6G, the ByteToCharJIS0208 and CharToByteJIS0208 mapping rules
have changed from those in JDK 1.1.5 FCS.
We (Japanese licensees) can hardly accept the changed rules.
We strongly recommend that the mapping rules be reverted to those of JDK 1.1.5.

1. [JIS]YEN SIGN(0x5c) and [JIS]REVERSE SOLIDUS(0x2140) are both
   mapped to [Unicode]U+005C REVERSE SOLIDUS.

   This is a big problem.
   U+FF3C FULLWIDTH REVERSE SOLIDUS can no longer be displayed,
   and programs that use the [JIS]0x2140 character
   must be rewritten because of this change.
   [JIS]0x2140 was not an escape character in JDK 1.0 through JDK 1.1.5,
   but it becomes an escape character in JDK 1.1.6G.
   So if javac encounters a single [JIS]0x2140 character,
   an 'Invalid escape character' error occurs.

   The two characters should not be unified.
   [JIS]0x2140 should be mapped to U+FF3C FULLWIDTH REVERSE SOLIDUS,
   the same as in JDK 1.1.5.

    Solution:
       [JIS]0x5c YEN SIGN <--> U+005C REVERSE SOLIDUS
       [JIS]0x2140 REVERSE SOLIDUS <--> U+FF3C FULLWIDTH REVERSE SOLIDUS.

2. [Unicode]U+00A5 YEN SIGN is not mapped to any JIS code.

    U+00A5 is used as the 'YEN SIGN' in the currency pattern
    in java.text.resources.LocaleElements_ja.
    So, in JDK 1.1.6G, formatting with the currency pattern does not work.
    
    [Unicode]U+00A5 YEN SIGN should be mapped to [JIS]0x5c YEN SIGN,
    the same as in JDK 1.1.5.

    Solution:
       [JIS]0x5c YEN SIGN <-- U+00A5 YEN SIGN
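
    Whether a given converter maps U+00A5 at all can be probed directly.
    The following is only a minimal sketch: it uses the java.nio.charset
    API of later JDK releases rather than the sun.io converters discussed
    in this report, and the class name YenSignProbe and the list of
    charset names are assumptions made for illustration. Which Japanese
    charsets are installed, and what they answer, depends on the runtime.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of any JDK: reports whether a charset's
// encoder has a mapping for a given character.
final class YenSignProbe {

    /** Returns true if the named charset's encoder can represent c. */
    static boolean canEncode(String charsetName, char c) {
        return Charset.forName(charsetName).newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        // Assumed charset names; the Japanese ones may be absent on a
        // reduced runtime, so availability is checked first.
        String[] names = {"US-ASCII", "ISO-8859-1", "Shift_JIS", "EUC-JP"};
        char yen = (char) 0x00A5;  // U+00A5 YEN SIGN
        for (String name : names) {
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not available");
                continue;
            }
            System.out.println(name + ": U+00A5 encodable = " + canEncode(name, yen));
        }
    }
}
```

    A result of false for a Japanese charset would correspond to the
    "not mapped to any JIS code" situation described in this item.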

3. [JIS]WAVE DASH(0x2141) is mapped to [Unicode]U+FF5E FULLWIDTH TILDE.

    This is the biggest problem.
    In JIS X 0221 (the JIS version of ISO 10646),
    [JIS]WAVE DASH(0x2141) is mapped to [Unicode]U+301C WAVE DASH.
    And [Unicode]U+FF5E FULLWIDTH TILDE is already assigned to a character
    of JIS X 0212, which is one of the Unicode Standard's sources.
    So this mapping rule violates the Unicode source separation rule.

    It is true that in Microsoft's code page 932
    [JIS]WAVE DASH(0x2141) is mapped to [Unicode]U+FF5E FULLWIDTH TILDE.
    But that is NOT a correct mapping rule.
    I have heard it was a workaround for an early Windows bug.

    If JavaSoft wants to resolve the problem
    that, on Windows, a typed [JIS]WAVE DASH
    differs from a [JIS]WAVE DASH in a source file,
    I recommend a different solution:
    convert the strings returned from the Win32 API (AWT peer).
    The conversion rule is as follows.

	// s is the String returned from the Win32 API (AWT peer).
	StringBuffer sb = new StringBuffer();
	for (int i = 0; i < s.length(); i++) {
	    char c = s.charAt(i);
	    switch (c) {
	    case 0xff3c:	// FULLWIDTH REVERSE SOLIDUS ->
		c = 0x005c;	// REVERSE SOLIDUS
		break;
	    case 0xff5e:	// FULLWIDTH TILDE ->
		c = 0x301c;	// WAVE DASH
		break;
	    case 0x2225:	// PARALLEL TO ->
		c = 0x2016;	// DOUBLE VERTICAL LINE
		break;
	    case 0xff0d:	// FULLWIDTH HYPHEN-MINUS ->
		c = 0x2212;	// MINUS SIGN
		break;
	    case 0xffe0:	// FULLWIDTH CENT SIGN ->
		c = 0x00a2;	// CENT SIGN
		break;
	    case 0xffe1:	// FULLWIDTH POUND SIGN ->
		c = 0x00a3;	// POUND SIGN
		break;
	    case 0xffe2:	// FULLWIDTH NOT SIGN ->
		c = 0x00ac;	// NOT SIGN
		break;
	    }
	    sb.append(c);
	}
	return new String(sb);

    When passing a string to the Win32 API (peer), the reverse rule can be used.
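
    The reverse rule could look roughly like the sketch below. The class
    and method names (Win32PeerText, toWin32) are hypothetical. One
    deliberate difference from a literal reversal, which is my own
    assumption: the REVERSE SOLIDUS pair is left out, because mapping
    U+005C back to U+FF3C would rewrite every ordinary backslash in the
    string.

```java
// Hypothetical helper: converts JDK-side characters to the forms that
// the Windows (code page 932) mapping uses, before calling the peer.
final class Win32PeerText {

    static String toWin32(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
            case 0x301c:        // WAVE DASH ->
                c = 0xff5e;     // FULLWIDTH TILDE
                break;
            case 0x2016:        // DOUBLE VERTICAL LINE ->
                c = 0x2225;     // PARALLEL TO
                break;
            case 0x2212:        // MINUS SIGN ->
                c = 0xff0d;     // FULLWIDTH HYPHEN-MINUS
                break;
            case 0x00a2:        // CENT SIGN ->
                c = 0xffe0;     // FULLWIDTH CENT SIGN
                break;
            case 0x00a3:        // POUND SIGN ->
                c = 0xffe1;     // FULLWIDTH POUND SIGN
                break;
            case 0x00ac:        // NOT SIGN ->
                c = 0xffe2;     // FULLWIDTH NOT SIGN
                break;
            default:            // everything else, including U+005C, is unchanged
                break;
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String converted = toWin32("10\u301c20");
        System.out.printf("U+%04X%n", (int) converted.charAt(2));
    }
}
```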

    Solution:
       [JIS]0x2141 WAVE DASH <--> U+301C WAVE DASH
       [JIS] no mapping <-- U+FF5E FULLWIDTH TILDE

4. Conclusion of 1, 2, and 3

    The JDK 1.1.6G mapping rules should be reverted to the JDK 1.1.5 mapping rules.
    The JDK 1.1.5 mapping rules are better.
    The Japanese mapping problem is very difficult,
    and there is no perfect solution at present.
    So I strongly recommend:
    please do not change the mappings without asking all Japanese licensees.
    JavaSoft should also refer to JIS X 0221 (the JIS version of ISO 10646),
    which is the result of discussion among Japanese researchers.
    Please also see this report:
	http://www.opengroup.or.jp/jvc/cde/ucs-conv-e.html

    Conclusion Solution:
       [JIS]0x5c YEN SIGN <--> U+005C REVERSE SOLIDUS
       [JIS]0x5c YEN SIGN <-- U+00A5 YEN SIGN
       [JIS]0x2140 REVERSE SOLIDUS <--> U+FF3C FULLWIDTH REVERSE SOLIDUS.
       [JIS]0x2141 WAVE DASH <--> U+301C WAVE DASH
       [JIS] no mapping <-- U+FF5E FULLWIDTH TILDE
       [JIS]0x7e OVERLINE <-- U+203E OVERLINE		(addition)
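
    To make the direction of each arrow concrete, the table above can be
    written as two lookup tables: "<-->" adds an entry in both
    directions, "<--" adds an entry only for Unicode-to-JIS, and "no
    mapping" adds nothing. This is a sketch of the proposed rules only,
    not the actual sun.io converter code; the class ProposedJisMapping
    and its method names are hypothetical, and only the rows listed
    above are included.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the proposed table; not JDK code.
final class ProposedJisMapping {
    private static final Map<Integer, Character> DECODE = new HashMap<>(); // JIS -> Unicode
    private static final Map<Character, Integer> ENCODE = new HashMap<>(); // Unicode -> JIS

    /** "<-->": the pair is mapped in both directions. */
    private static void roundTrip(int jis, int ucs) {
        DECODE.put(jis, (char) ucs);
        ENCODE.put((char) ucs, jis);
    }

    /** "<--": Unicode is mapped to JIS, but JIS never decodes to it. */
    private static void encodeOnly(int ucs, int jis) {
        ENCODE.put((char) ucs, jis);
    }

    static {
        roundTrip(0x5c, 0x005C);     // YEN SIGN <--> REVERSE SOLIDUS
        encodeOnly(0x00A5, 0x5c);    // YEN SIGN <-- U+00A5 YEN SIGN
        roundTrip(0x2140, 0xFF3C);   // REVERSE SOLIDUS <--> FULLWIDTH REVERSE SOLIDUS
        roundTrip(0x2141, 0x301C);   // WAVE DASH <--> WAVE DASH
        // U+FF5E FULLWIDTH TILDE: deliberately no entry ("no mapping")
        encodeOnly(0x203E, 0x7e);    // OVERLINE <-- U+203E OVERLINE (addition)
    }

    /** Returns the Unicode character for a JIS code, or null if unmapped. */
    static Character decode(int jis) {
        return DECODE.get(jis);
    }

    /** Returns the JIS code for a Unicode character, or null if unmapped. */
    static Integer encode(char ucs) {
        return ENCODE.get(ucs);
    }

    public static void main(String[] args) {
        System.out.printf("JIS 0x2140 decodes to U+%04X%n", (int) decode(0x2140));
        System.out.println("U+FF5E encodes to " + encode((char) 0xFF5E));
    }
}
```

    Note that under these rules both U+005C and U+00A5 encode to
    [JIS]0x5c, but [JIS]0x5c always decodes to U+005C.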

5. Additional information for after the JDK 1.1.5 mapping rules are restored.

    Several other mapping rules, from Windows, MacOS, and so on,
    are in use today.
    These rules depend on their platforms.
    I hope these rules will be added to the sun.io converters.
    First, Cp932, which is Microsoft's Japanese Shift-JIS mapping rule.
    It is useful when parsing document files that were saved in Unicode
    by Windows applications.
    Second, a JIS0208 variant
    that maps [JIS]0x5c YEN SIGN to [Unicode]U+00A5 YEN SIGN.
    It is useful when parsing document files in which [JIS]0x5c YEN SIGN
    means the currency symbol.
    
    I hope we will be able to select these converters for special uses.
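
    Selecting a converter by name might look like the sketch below. It
    assumes a later JDK in which the java.nio.charset API exists and the
    Microsoft mapping is registered as "windows-31j" alongside
    "Shift_JIS"; the helper name firstSupported is hypothetical, and the
    US-ASCII entry is only a fallback so the example runs on a runtime
    without the Japanese charsets. What each charset decodes the bytes
    to is left for the program to print.

```java
import java.nio.charset.Charset;

// Hypothetical helper for choosing among platform-specific converters.
final class ConverterChoice {

    /** Returns the first charset in the preference list that this runtime supports. */
    static Charset firstSupported(String... names) {
        for (String name : names) {
            if (Charset.isSupported(name)) {
                return Charset.forName(name);
            }
        }
        throw new IllegalArgumentException("none of the requested charsets is available");
    }

    public static void main(String[] args) {
        // Prefer the Microsoft mapping, then plain Shift_JIS, then a fallback.
        Charset cs = firstSupported("windows-31j", "Shift_JIS", "US-ASCII");

        // Shift-JIS bytes 0x81 0x60 correspond to [JIS]0x2141 WAVE DASH.
        byte[] waveDash = {(byte) 0x81, (byte) 0x60};
        String s = new String(waveDash, cs);
        System.out.printf("%s decodes 0x8160 to U+%04X%n", cs.name(), (int) s.charAt(0));
    }
}
```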
(Review ID: 25409)
======================================================================

Comments
CONVERTED DATA
BugTraq+ Release Management Values
COMMIT TO FIX: generic
FIXED IN: 1.1.6
INTEGRATED IN: 1.1.6 1.2beta3
14-06-2004

EVALUATION
We have made the requested changes. The 0x5c mapping problem is tricky and needs to be addressed in 1.2. The intent was that both an SJIS and an MS932 converter would be made available and that our current SJIS converter's behavior would be moved into the MS932 converter. This is not the right time for that change, however, so it is postponed to 1.2.
brian.beck@Eng 1998-03-05
05-03-1998