JDK-4765370 : Solaris proprietary code converters are needed for Japanese locales
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio.charsets
  • Affected Version: 1.3.1_02
  • Priority: P1
  • Status: Closed
  • Resolution: Fixed
  • OS: solaris_9
  • CPU: sparc
  • Submitted: 2002-10-18
  • Updated: 2004-09-16
  • Resolved: 2002-12-17
Fixed in:
  • 1.3.1_07 (07)
  • 1.4.0_04
  • 1.4.2
Description
Some Japanese characters cannot be displayed in the Japanese locale with JRE 1.3.1_02 using the JIS0208 character encoding. The glyphs for the failing characters are available on Japanese Solaris, since the same characters display correctly in a dtterm window. It seems that the JRE is missing a mapping from these characters to the correct font/glyph.

A patched version of CharToByteJIS0208.java has been created which fixes the problem.  The problem also does not exist on 1.4.x.  

To reproduce the problem, you need the Japanese language support (EUC encoding). Next, you need to make sure that the terminal you use is set up such that the locale command will display the following:

% locale
LANG=ja
LC_CTYPE="ja"
LC_NUMERIC="ja"
LC_TIME="ja"
LC_COLLATE="ja"
LC_MONETARY="ja"
LC_MESSAGES="ja"
LC_ALL=
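
The JVM derives its default encoding from these locale settings, so it can help to confirm what the JVM actually sees before launching the demo. The following is a minimal sketch, not part of the original report; the class name LocaleCheck is hypothetical. It only uses Locale.getDefault() and the file.encoding system property, and its output depends on the platform and JRE.

```java
import java.util.Locale;

class LocaleCheck {
    // Summarises the default locale and default file encoding as seen by the JVM.
    static String summary() {
        return "locale=" + Locale.getDefault()
             + " file.encoding=" + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        // Under the ja locale described above, the encoding is expected to be
        // an EUC-JP variant, but the exact name depends on the JRE.
        System.out.println(summary());
    }
}
```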

Next, cat the attached file "characters".  On your terminal, you should see something like "Microsoft Word - VIII".  Please select the Roman numerals V and III by using the Sun copy key.   

Next, do the following:  

% cd /usr/demo/J2SE/demo/jfc/Stylepad
% java -jar Stylepad.jar &

Once the application comes up, use the Sun paste key to paste what you just copied from the characters file.  The characters will be displayed as square boxes.  

If you replace CharToByteJIS0208.java with the attached modified version, the characters are displayed properly.  The diffs between the original and modified files are also attached.

The attached fonts.dir file shows the fonts used in the application.  The font supports more Unicode glyphs than those mapped in CharToByteJIS0208.java.  The modifications to the file simply add the additional mappings that the font (HG-MonchoL.ttf-ricoh-hg mincho l-medium-r-normal--0-0-0-0-m-0-jisx0208.1983-0) is capable of handling.
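
The missing-mapping symptom can also be probed without a GUI by asking an encoder whether it can encode the characters. The sketch below is illustrative only: it uses the java.nio.charset API (available from 1.4 onward, so not on 1.3.1), it assumes the pasted characters are the single code points U+2164 (ROMAN NUMERAL FIVE) and U+2162 (ROMAN NUMERAL THREE), and the charset names "x-eucJP-Open" and "x-PCK" are assumptions about how the Solaris converters are registered. Charsets that are not installed are reported as unable to encode.

```java
import java.nio.charset.Charset;

class MappingProbe {
    // Returns true only if the named charset is installed, supports encoding,
    // and has a char->byte mapping for c.
    static boolean canEncode(String charsetName, char c) {
        if (!Charset.isSupported(charsetName)) {
            return false;
        }
        Charset cs = Charset.forName(charsetName);
        return cs.canEncode() && cs.newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        char[] probes = { '\u2164', '\u2162' };  // assumed code points, see above
        // Charset names are assumptions; adjust to what the runtime provides.
        String[] names = { "EUC-JP", "x-eucJP-Open", "x-PCK" };
        for (String name : names) {
            for (char c : probes) {
                System.out.println(name + " U+"
                        + Integer.toHexString(c).toUpperCase()
                        + " encodable=" + canEncode(name, c));
            }
        }
    }
}
```

A result of encodable=false for a charset that is installed would be consistent with the square boxes seen in Stylepad, although rendering also depends on the font configuration.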

Comments
CONVERTED DATA
BugTraq+ Release Management Values
COMMIT TO FIX: 1.3.1_07 1.4.0_04 mantis-beta
FIXED IN: 1.3.1_07 1.4.0_04 mantis-beta
INTEGRATED IN: 1.3.1_07 1.4.0_04 mantis-b14 mantis-beta
VERIFIED IN: 1.3.1_07
17-09-2004

EVALUATION

Currently, Java uses the eucJP converters, which comply with the Japanese locale standard agreed by Japanese UNIX vendors (the result of Open Group activities). However, Solaris supports its own proprietary extensions to that standard; see the eucJP(5) man page. The fix should be to provide Solaris-compatible eucJP converters and make them the default converters on Solaris for the ja locale. We should not change the JIS X 0208 converter tables. The JISAutoDetect converter needs to use the Solaris proprietary converter for "eucJP" (similar to the SJIS/MS932 setup).

New converters for the Solaris PCK encoding are also required. PCK is not SJIS; see the PCK(5) man page. This requires an additional change to JISAutoDetect for the SJIS/MS932/PCK setup.
###@###.### 2002-10-21
###@###.### 2002-10-23

The mappings of the so-called "NEC (row 13) special characters (or symbols)" and "IBM extended characters" are included in the conversion policy provided by The Open Group Japan Vendor Council (TOG JVC), which looks identical to what ###@###.### is referring to; i.e. the mapping is not a Sun or Solaris proprietary extension. See: http://www.opengroup.or.jp/jvc/cde/appendix-e.html
eucJP(5) refers to these characters as VDCs (vendor-defined characters) because they are not defined by JIS itself.

========================================================================

A fix is in progress (and under review) for 1.3.1_x. NIO coders for the modified Solaris mappings are also being prepared, i.e. JIS_X_0208_Solaris, JIS_X_0212_Solaris, EUC_JP_Solaris and PCK.
###@###.### 2002-11-05

Based on further information provided by the i18n/Japan Solaris group, some of the originally supplied round-trip mappings were not correctly implemented. I have made the following changes (<--> is a round-trip mapping, ---> is a one-way mapping from the JIS code to Unicode):

  JIS code                  Before         Now
  0x2d75 (JIS X 0208)       <--> U+221A    ---> U+221A
  0x2265 (JIS X 0208)       ---> U+221A    <--> U+221A
  0x2d77 (JIS X 0208)       <--> U+2220    ---> U+2220
  0x225c (JIS X 0208)       ---> U+2220    <--> U+2220
  0x2d7c (JIS X 0208)       <--> U+222A    ---> U+222A
  0x2240 (JIS X 0208)       ---> U+222A    <--> U+222A
  0x2d72 (JIS X 0208)       <--> U+222B    ---> U+222B
  0x2269 (JIS X 0208)       ---> U+222B    <--> U+222B
  0x2d71 (JIS X 0208)       <--> U+2261    ---> U+2261
  0x2261 (JIS X 0208)       ---> U+2261    <--> U+2261
  0x2237 (JIS X 0212)       ---> U+007E    <--> U+FF5E

The supplied converters are now implemented similarly to the Solaris iconv converters. As a result, the char->byte mappings of the Solaris EUC_JP and PCK converters added for this bug have been updated.
###@###.### 2002-12-11

###@###.### 2002-12-11
Removed 1.4.1_04 from the commit-to-fix field as part of bug clean-up.
###@###.### 2004-09-16
11-12-2002
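
The difference between a round-trip and a one-way mapping in the evaluation above can be illustrated with a pair of lookup tables. This is a minimal sketch and not the actual converter code: the class RoundTripTable and its methods are hypothetical, and only three of the revised JIS X 0208 pairs are loaded. A round-trip entry populates both the decode and encode tables, while a one-way entry populates only the decode table, so two JIS codes can decode to the same Unicode character while encoding always yields the round-trip code.

```java
import java.util.HashMap;
import java.util.Map;

class RoundTripTable {
    private final Map<Integer, Integer> decode = new HashMap<>(); // JIS -> Unicode
    private final Map<Integer, Integer> encode = new HashMap<>(); // Unicode -> JIS

    // Round-trip mapping (<-->): valid in both directions.
    void roundTrip(int jis, int unicode) {
        decode.put(jis, unicode);
        encode.put(unicode, jis);
    }

    // One-way mapping (--->): accepted when decoding, never produced when encoding.
    void decodeOnly(int jis, int unicode) {
        decode.put(jis, unicode);
    }

    Integer toUnicode(int jis) { return decode.get(jis); }
    Integer toJis(int unicode) { return encode.get(unicode); }

    // Loads a few of the revised ("Now") mappings quoted in the evaluation.
    static RoundTripTable revised() {
        RoundTripTable t = new RoundTripTable();
        t.roundTrip(0x2265, 0x221A);  t.decodeOnly(0x2d75, 0x221A);
        t.roundTrip(0x225c, 0x2220);  t.decodeOnly(0x2d77, 0x2220);
        t.roundTrip(0x2240, 0x222A);  t.decodeOnly(0x2d7c, 0x222A);
        return t;
    }

    public static void main(String[] args) {
        RoundTripTable t = revised();
        // Both JIS codes decode to U+221A, but U+221A encodes only to 0x2265.
        System.out.println(Integer.toHexString(t.toUnicode(0x2d75)));
        System.out.println(Integer.toHexString(t.toUnicode(0x2265)));
        System.out.println(Integer.toHexString(t.toJis(0x221A)));
    }
}
```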