JDK-6310716 : decodeText() doesn't convert from iso-2022-jp to Unicode for some Japanese chars
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio.charsets
  • Affected Version: 1.4.2
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: windows_2000
  • CPU: x86
  • Submitted: 2005-08-13
  • Updated: 2010-04-03
  • Resolved: 2005-09-30
Versions
  • Other: 1.4.2_11 (Fixed)
  • JDK 6: 6 b55 (Fixed)
Description
The underlying problem is the same as in bug 6173388 (JavaMail 1.3.2 - Email message with Japanese characters shown as garbage). We are also filing our case here in the hope of raising the priority of this issue at Sun.


Here is how we read the Subject header from the message and decode it:

1            subject = msg.getHeader("Subject", null);
2            if (subject != null) {
3                try {
4                    subject = MimeUtility.decodeText(subject);
5                } catch (UnsupportedEncodingException ioEx) {
6                    // (rest of the handler is not shown in this report)
7                }
8            }

- From Outlook 2000, send an email with the subject:

abc①Ⅳ㈱def

to our browser-based application and to Outlook 2000.

- After line 1, we get an encoded string like:

String subject= "=?iso-2022-jp?B?YWJjGyRCLSEtOC1qGyhCZGVm?="

This MIME encoded-word (Base64, iso-2022-jp) appears to be well formed.

- After decodeText() on line 4, we get:

String subject= "abc���def"

where we expect:

abc①Ⅳ㈱def

- Here is the source of the email as received by Outlook 2000:

  Subject: =?iso-2022-jp?B?YWJjGyRCLSEtOC1qGyhCZGVm?=

Outlook 2000 decodes the received email and displays "abc①Ⅳ㈱def" correctly.
(This confirms that the encoded form "=?iso-2022-jp?B?YWJjGyRCLSEtOC1qGyhCZGVm?="
is correct.)
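
As an independent check of the encoded form, the Base64 payload of the encoded-word can be decoded and dumped as hex. This is only a sketch and not part of our application: the class and method names are made up for illustration, and it assumes a Java 8 or later runtime for java.util.Base64.

```java
import java.util.Base64;

// Hypothetical helper, for illustration only.
class SubjectPayloadDump {
    // Base64 payload copied from the encoded-word in this report.
    static final String PAYLOAD = "YWJjGyRCLSEtOC1qGyhCZGVm";

    // Returns the decoded payload as space-separated upper-case hex bytes.
    static String hexDump(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        StringBuilder sb = new StringBuilder();
        for (byte b : raw) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hexDump(PAYLOAD));
    }
}
```

The dump shows "abc" (61 62 63), ESC $ B (1B 24 42), three two-byte codes (2D 21, 2D 38, 2D 6A), ESC ( B (1B 28 42) and "def" (64 65 66), which is a well-formed ISO-2022-JP byte sequence.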

Thus, decodeText() appears to garble the characters ①, Ⅳ and ㈱ (Maru1, RomanNumber4, Kabushiki-gaisha) when it converts from iso-2022-jp to Unicode.
We have also confirmed that regular Japanese characters are converted correctly.
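
To take JavaMail out of the picture, the same bytes can be handed directly to the JDK charset decoders. This is a minimal sketch, not our production code: the class name is hypothetical, it needs Java 8 or later for java.util.Base64, and it assumes the runtime may or may not ship ISO-2022-JP and the MS932-based variant x-windows-iso2022jp that later JDKs list among their extended charsets. A charset that is missing is reported rather than treated as an error.

```java
import java.nio.charset.Charset;
import java.util.Base64;

// Hypothetical helper, for illustration only.
class Iso2022JpVariantDemo {
    // Base64 payload copied from the encoded-word in this report.
    static final String PAYLOAD = "YWJjGyRCLSEtOC1qGyhCZGVm";

    // Decodes the payload with the named charset.
    // Returns null when this runtime does not provide the charset.
    static String decodeWith(String charsetName) {
        if (!Charset.isSupported(charsetName)) {
            return null;
        }
        byte[] raw = Base64.getDecoder().decode(PAYLOAD);
        // Unmappable input is replaced with U+FFFD by this constructor.
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Assumption: the second name is only present on runtimes that
        // include the extended charsets.
        String[] names = {"ISO-2022-JP", "x-windows-iso2022jp"};
        for (String name : names) {
            String decoded = decodeWith(name);
            System.out.println(name + " -> "
                    + (decoded == null ? "(not available)" : decoded));
        }
    }
}
```

If the plain ISO-2022-JP line shows replacement characters between "abc" and "def", the problem lies in the charset mapping itself and not in MimeUtility.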

A similar issue is reported at https://javapartner.sun.com/partner/bugs/data/bugs/6173388.html (6173388, JavaMail 1.3.2).

Comments
EVALUATION Obviously this is yet another request for iso2022 to support characters that are not in standard JIS0208 but are in MS932. See 6173388.
19-08-2005
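
For reference, the row arithmetic behind "non-standard" can be sketched as follows. In 7-bit JIS, the row (ku) and cell (ten) numbers are the byte values minus 0x20, and JIS X 0208 leaves rows 9 through 15 unassigned. The three two-byte codes in the reported subject (0x2D21, 0x2D38, 0x2D6A, read from the Base64 payload) all fall in row 13, which vendors used for their own extensions. The class and method names below are illustrative only.

```java
// Hypothetical helper, for illustration only.
class Jis0208RowCheck {
    // Converts a 7-bit JIS lead byte (0x21..0x7E) to its 1-based row number.
    static int row(int leadByte) {
        return leadByte - 0x20;
    }

    // Converts a 7-bit JIS trail byte (0x21..0x7E) to its 1-based cell number.
    static int cell(int trailByte) {
        return trailByte - 0x20;
    }

    // JIS X 0208 assigns no characters to rows 9 through 15.
    static boolean inUnassignedRow(int leadByte) {
        int r = row(leadByte);
        return r >= 9 && r <= 15;
    }

    public static void main(String[] args) {
        // The three two-byte codes from the reported subject.
        int[][] pairs = { {0x2D, 0x21}, {0x2D, 0x38}, {0x2D, 0x6A} };
        for (int[] p : pairs) {
            System.out.printf("0x%02X%02X -> row %d, cell %d, unassigned row: %b%n",
                    p[0], p[1], row(p[0]), cell(p[1]), inUnassignedRow(p[0]));
        }
    }
}
```

All three codes report row 13 (cells 1, 24 and 74), so a decoder that follows JIS X 0208 strictly has no mapping for them.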