JDK-4937509 : (tz) sun.tools.javazic.Gen uses String.getBytes - native encoding incorrectly assumed
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util:i18n
  • Affected Version: 5.0u4,5.0u14,6
  • Priority: P5
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2003-10-14
  • Updated: 2005-10-10
  • Resolved: 2005-10-10
JDK 6: fixed in 6 b56
Related Reports
Duplicate :  
Relates :  
Relates :  
Description
This is the i18n-related part of bug 4927428, which reports the following
general problem in the JDK.


We are currently porting J2SE V1.4.2 to our mainframe platform,
where the native encoding is EBCDIC. Several JCK tests failed because
of the careless use of java.lang.String.getBytes() in other Java classes;
the native encoding is in effect wrongly hardcoded into these
classes.

We compiled the following list of modifications that were necessary to
pass the JCK tests. There are a lot more suspicious uses of getBytes(),
where we were not able to decide whether their usage is right or wrong.

ACTION ITEM:
Please initiate some kind of overall review of the encoding issue,
and fix all java classes where the problem occurs.

=== Start of list ===

The method getBytes() in java.lang.String implicitly converts using the
native encoding. In J2SE SDK 1.4.2, using this method on machines where
the native encoding is not ISO-8859-1 or some ASCII-compatible encoding
is wrong in the following cases.

sun/tools/javazic/Gen.java
// all occurrences of getBytes() have to be changed to 
// getBytes("US-ASCII") because in sun/util/calendar/ZoneInfoFile.java
// the file is read with US-ASCII encoding

=== end of list ===
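
The difference between getBytes() and getBytes("US-ASCII") described in the list above can be illustrated with a minimal sketch. This is not the code from Gen.java; the class name, method names, and sample zone ID are hypothetical. It uses java.nio.charset.StandardCharsets, which only exists in Java 7 and later, so a change made in the 1.4.2 or Mustang time frame would presumably have used the string-named overload getBytes("US-ASCII") instead. The EBCDIC comparison runs only if the optional IBM037 charset is installed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; not taken from sun.tools.javazic.Gen.
final class ExplicitCharsetSketch {

    // Platform-dependent: uses whatever the JVM's default charset is.
    static byte[] encodeWithDefault(String s) {
        return s.getBytes();
    }

    // Platform-independent: always produces US-ASCII bytes.
    static byte[] encodeAscii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    // Mirrors a reader that assumes US-ASCII when building the String.
    static String decodeAscii(byte[] raw) {
        return new String(raw, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String id = "Europe/Berlin"; // sample zone ID, chosen for illustration

        byte[] ascii = encodeAscii(id);
        System.out.println("ASCII round trip ok: " + decodeAscii(ascii).equals(id));

        // Simulate an EBCDIC native encoding, if that charset is available.
        if (Charset.isSupported("IBM037")) {
            byte[] ebcdic = id.getBytes(Charset.forName("IBM037"));
            System.out.println("same bytes as ASCII: " + Arrays.equals(ascii, ebcdic));
            System.out.println("EBCDIC bytes read as ASCII ok: "
                    + decodeAscii(ebcdic).equals(id));
        }
    }
}
```

On a platform whose default charset is ASCII-compatible, encodeWithDefault and encodeAscii return the same bytes for such IDs, which is why the mismatch only surfaces on ports like the EBCDIC one described here.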

Comments
EVALUATION Changed both ZoneInfoFile.java and Gen.java to specify UTF-8.
26-09-2005
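
As a rough sketch of what "specify UTF-8" on both sides means, the writer and the reader name the same charset explicitly, so the result no longer depends on the platform default. The class and method names below are hypothetical and do not reflect the actual code in Gen.java or ZoneInfoFile.java; StandardCharsets is a Java 7+ convenience used here in place of the string-named overloads.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of a writer/reader pair that agree on UTF-8.
final class Utf8RoundTripSketch {

    // Writer side (the role Gen plays): encode text with an explicit charset.
    static byte[] writeId(String id) {
        return id.getBytes(StandardCharsets.UTF_8);
    }

    // Reader side (the role ZoneInfoFile plays): decode with the same charset.
    static String readId(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String id = "America/New_York"; // sample zone ID, chosen for illustration

        byte[] raw = writeId(id);
        System.out.println("round trip ok: " + readId(raw).equals(id));

        // For ASCII-only text, UTF-8 and US-ASCII yield identical bytes.
        byte[] ascii = id.getBytes(StandardCharsets.US_ASCII);
        System.out.println("same as US-ASCII bytes: " + Arrays.equals(raw, ascii));
    }
}
```

Because UTF-8 encodes every ASCII character as the same single byte that US-ASCII does, data containing only ASCII text is byte-for-byte identical under either charset.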

EVALUATION Suggested fix from customer: see 6328471.
Original question from Sun was:
> When porting J2SE to an EBCDIC-based system, do you want to convert
> the Olson public source (text files originally encoded in ASCII) to
> EBCDIC-based encoding?
Customer answer:
> We do not convert these files to EBCDIC because they also contain
> binary data and are not edited by hand. In sun.util.calendar.ZoneInfoFile
> the encoding of the text parts is already assumed to be US-ASCII (see the
> use of the String-constructor), so we suggest that in sun.tools.javazic.Gen
> all occurrences of getBytes() must be changed to getBytes("US-ASCII").
24-09-2005

CONVERTED DATA BugTraq+ Release Management Values COMMIT TO FIX: mustang
06-08-2004

EVALUATION I intentionally left that part untouched because there's an unknown portability requirement. That is, when porting J2SE to an EBCDIC-based system, do they want to convert the Olson public source (text files originally encoded in ASCII) to EBCDIC-based encoding? If so, how can we get the encoding name of the text files?
###@###.### 2003-10-15
I still need answers to my questions above. Otherwise, I will assume UTF-8 encoding for all systems.
###@###.### 2003-11-18
Couldn't get any feedback for more than half a year. The implementation may be changed to assume UTF-8. However, we need to be consistent with the direction of the Olson public source encoding.
###@###.### 2004-06-01
01-06-2004