Name: wm7046 Date: 04/13/2004
When a byte array that contains an invalid byte sequence for EUC_CN is
converted to a String using charset "EUC_CN", the result differs depending on
whether new String(bytes, "ISO-8859-1") has been called in between, e.g.:
String s1 = new String(bytes, "EUC_CN");
String s2 = new String(bytes, "ISO-8859-1");
String s3 = new String(bytes, "EUC_CN");
Here s1 differs from s3, even though both are decoded from the same bytes with
the same charset.
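To confirm that the input really is invalid for a given charset, a strict decoder can be used instead of the String constructor, which silently substitutes replacement characters. The following is a minimal sketch, not part of the original report; the class name StrictDecodeCheck and the helper decodesCleanly are hypothetical, and it assumes the EUC_CN charset is installed in the running JVM.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

public class StrictDecodeCheck {
    // Hypothetical helper: returns true only if every byte decodes without
    // a malformed-input or unmappable-character error in the named charset.
    static boolean decodesCleanly(byte[] bytes, String charsetName) {
        try {
            Charset.forName(charsetName)
                   .newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // The decoder rejected the input instead of substituting U+FFFD.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] bytes = {-122, 72, -122, -9};
        // Assumes EUC_CN is available; Charset.forName throws otherwise.
        System.out.println("EUC_CN clean: " + decodesCleanly(bytes, "EUC_CN"));
        System.out.println("ISO-8859-1 clean: " + decodesCleanly(bytes, "ISO-8859-1"));
    }
}
```

Because the decoder is configured with CodingErrorAction.REPORT, invalid input surfaces as an exception rather than as replacement characters, which keeps the check independent of how the String constructor handles errors.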
The test case:
public class TestString
{
    public static void main(String[] args)
    {
        byte[] bytes = {-122,72,-122,-9};
        try {
            String str = null;
            int hash = 0;
            String outstr = null;
            //-------- Before iso-8859
            str = new String(bytes,"EUC_CN");
            hash = str.hashCode();
            outstr = "";
            for (int i = 0; i < str.length(); i++)
                outstr += " " + (long) str.charAt(i);
            System.out.println("String-(EUC_CN)" + outstr);
            System.out.println("hash:" + hash);
            //-------------iso-8859
            str = new String(bytes,"ISO-8859-1");
            hash = str.hashCode();
            outstr = "";
            for (int i = 0; i < str.length(); i++)
                outstr += " " + (long) str.charAt(i);
            System.out.println("String-ISO-8859-1:" + outstr);
            System.out.println("hash:" + hash);
            //-------------After iso-8859
            str = new String(bytes,"EUC_CN");
            hash = str.hashCode();
            outstr = "";
            for (int i = 0; i < str.length(); i++)
                outstr += " " + (long) str.charAt(i);
            System.out.println("String-(EUC_CN)" + outstr);
            System.out.println("hash:" + hash);
        }catch (Exception e)
        {
            e.printStackTrace();
        }
    }
}
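The test case above can be condensed into a single pass/fail check, which may be easier to use as a regression test. This is a minimal sketch under the same assumptions as the report; the class name DecodeStability and the helper isStable are hypothetical, and it uses the same name-based String constructor as the test case.

```java
import java.io.UnsupportedEncodingException;

public class DecodeStability {
    // Hypothetical helper: decodes the bytes with `charset`, decodes them once
    // with `between`, decodes with `charset` again, and reports whether the
    // first and last results are equal.
    static boolean isStable(byte[] bytes, String charset, String between) {
        try {
            String before = new String(bytes, charset);
            new String(bytes, between); // intervening decode; result unused
            String after = new String(bytes, charset);
            return before.equals(after);
        } catch (UnsupportedEncodingException e) {
            // Wrapped so callers do not need to handle a checked exception.
            throw new IllegalArgumentException("Unsupported charset", e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = {-122, 72, -122, -9};
        // Assumes EUC_CN is available in the running JVM.
        System.out.println("stable: " + isStable(bytes, "EUC_CN", "ISO-8859-1"));
    }
}
```

On a JVM affected by the reported behavior, the check would be expected to print false for these bytes; a correct decoder should give the same result on every call.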
(Incident Review ID: 244585)
======================================================================