JDK-4073498 : Motif encoding converters don't handle unknown characters as specified
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio.charsets
  • Affected Version: 1.2.0
  • Priority: P5
  • Status: Closed
  • Resolution: Duplicate
  • OS: generic
  • CPU: generic
  • Submitted: 1997-08-21
  • Updated: 2006-02-01
  • Resolved: 2006-02-01
Related Reports
Duplicate :  
Relates :  
Relates :  
Description
The encoding converter API (src/share/classes/sun/io/CharToByteConverter.java) specifies the following behavior when unknown characters are encountered in the input: If substitution mode is enabled, the unknown character is mapped to the substitution bytes for the converter. If substitution mode is disabled, an UnknownCharacterException is thrown.
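
The two modes described above can be illustrated with a small, self-contained sketch. It does not use the real sun.io classes: AsciiOnlyConverter, UnknownCharException, and the '?' substitution byte are hypothetical stand-ins chosen for illustration, and the toy converter treats only 7-bit ASCII as mappable.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch of the substitution-mode contract.
 * None of these names come from sun.io; they are illustrative only.
 */
public class SubstitutionContractSketch {

    /** Stand-in for sun.io.UnknownCharacterException (unchecked here for brevity). */
    static class UnknownCharException extends RuntimeException {
        final int offset;

        UnknownCharException(int offset) {
            super("unknown character at input offset " + offset);
            this.offset = offset;
        }
    }

    /** Toy converter: only 7-bit ASCII is considered mappable (an assumption). */
    static class AsciiOnlyConverter {
        private final boolean subMode;
        private final byte[] subBytes = { (byte) '?' }; // assumed substitution bytes

        AsciiOnlyConverter(boolean subMode) {
            this.subMode = subMode;
        }

        boolean canConvert(char ch) {
            return ch < 0x80;
        }

        byte[] convert(char[] input) {
            byte[] out = new byte[input.length * subBytes.length];
            int byteOff = 0;
            for (int charOff = 0; charOff < input.length; charOff++) {
                char ch = input[charOff];
                if (canConvert(ch)) {
                    out[byteOff++] = (byte) ch;
                } else if (subMode) {
                    // Substitution mode enabled: emit the substitution bytes.
                    for (byte b : subBytes) {
                        out[byteOff++] = b;
                    }
                } else {
                    // Substitution mode disabled: report the unknown character.
                    throw new UnknownCharException(charOff);
                }
            }
            return Arrays.copyOf(out, byteOff);
        }
    }

    public static void main(String[] args) {
        char[] input = { 'a', '\u3042', 'b' };
        System.out.println(Arrays.toString(new AsciiOnlyConverter(true).convert(input)));
        try {
            new AsciiOnlyConverter(false).convert(input);
        } catch (UnknownCharException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is that an unknown character has exactly two outcomes, well-defined substitution bytes or an exception, and never arbitrary output bytes.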

None of the Motif encoding converters implements this behavior as specified. Most map unknown characters to arbitrary, implementation-dependent bytes. The Dingbats converter throws the exception regardless of the substitution mode.

This bug makes the current implementation of multi-fonts completely unusable: text that needs multiple fonts is displayed only partially, if at all. See bug 4072782 for details.

Comments
EVALUATION All Motif converters have been re-implemented using the newly added java.nio.charset APIs as part of the work for 4954012, so they now handle unknown/unmappable characters correctly. Closed as a duplicate of 4954012.
01-02-2006
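
For comparison, the java.nio.charset API mentioned in the evaluation above exposes the same two behaviors through CodingErrorAction. The sketch below is only an illustration under stated assumptions: it uses the standard ISO-8859-1 charset as a stand-in, because the re-implemented Motif/X11 charsets are internal, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative only: ISO-8859-1 stands in for the internal Motif charsets. */
public class NioUnmappableSketch {

    /** Encodes text, replacing unmappable characters with the encoder's replacement bytes. */
    static byte[] encodeWithSubstitution(Charset cs, String text) {
        CharsetEncoder enc = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        try {
            ByteBuffer bb = enc.encode(CharBuffer.wrap(text));
            byte[] out = new byte[bb.remaining()];
            bb.get(out);
            return out;
        } catch (CharacterCodingException e) {
            // Not expected when both error actions are REPLACE.
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if encoding with the REPORT action throws a CharacterCodingException. */
    static boolean strictEncodingFails(Charset cs, String text) {
        CharsetEncoder enc = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            enc.encode(CharBuffer.wrap(text));
            return false;
        } catch (CharacterCodingException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String text = "a\u3042b"; // U+3042 has no ISO-8859-1 mapping
        System.out.println(Arrays.toString(
                encodeWithSubstitution(StandardCharsets.ISO_8859_1, text)));
        System.out.println(strictEncodingFails(StandardCharsets.ISO_8859_1, text));
    }
}
```

Here REPLACE corresponds roughly to the old substitution mode, and REPORT to the mode in which an exception is thrown for an unknown character.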

EVALUATION (See suggested fix.)
11-06-2004

SUGGESTED FIX Implement the specified behavior. [xueming.shen@Japan 1997-08-21]

Here's a suggested fix for two of the converters.

=============suggested fix for CharToByteX11JIS0208.java--------
package sun.awt.motif;

import sun.io.CharToByteEUCJIS;
import sun.io.UnknownCharacterException;

public class CharToByteX11JIS0208 extends CharToByteEUCJIS {

    public String toString() {
        return "X11JIS0208";
    }

    public boolean canConvert(char ch) {
        if (((0xFF00 & ch) != 0) && (DoubleByte(ch) != -1)) {
            return true;
        }
        return false;
    }

    public int convert(char[] input, int inOff, int inEnd,
                       byte[] output, int outOff, int outEnd)
        throws UnknownCharacterException
    {
        charOff = inOff;
        byteOff = outOff;
        while (charOff < inEnd) {
            char ch = input[charOff];
            int jishex = DoubleByte(ch);
            if (((0xFF00 & ch) == 0 || jishex == -1) && subMode == false) {
                badInputLength = 1;
                throw new UnknownCharacterException();
            }
            output[byteOff++] = (byte)((jishex / 94) + 0x21);
            output[byteOff++] = (byte)((jishex % 94) + 0x21);
            charOff++;
        }
        return byteOff - outOff;
    }
}

--------suggested fix for CharToByteX11JIS0201-----
package sun.awt.motif;

import sun.io.CharToByte8859_1;
import sun.io.UnknownCharacterException;

public class CharToByteX11JIS0201 extends CharToByte8859_1 {

    public String toString() {
        return "X11JIS0201";
    }

    public boolean canConvert(char ch) {
        if ((ch >= 0xff61 && ch <= 0xff9f) || ch == 0x203e) {
            return true;
        }
        return false;
    }

    public int convert(char[] input, int inOff, int inEnd,
                       byte[] output, int outOff, int outEnd)
        throws UnknownCharacterException
    {
        charOff = inOff;
        byteOff = outOff;
        while (charOff < inEnd && byteOff < outEnd) {
            char ch = input[charOff];
            if (canConvert(ch) == false) {
                badInputLength = 1;
                throw new UnknownCharacterException();
            }
            if (ch == 0x203e) {
                output[byteOff++] = (byte)0x7e;
            } else {
                output[byteOff++] = (byte)(ch - 0xff61 + 0xa1);
            }
            charOff++;
        }
        return byteOff - outOff;
    }
}
----------------------------------------------------------
11-06-2004