Bug ID: JDK-4881655 Wrong conversion in JNI GetStringUTFChars() for Japanese SJIS or PCK code

Type: Bug
Component: core-libs
Sub-Component: java.nio.charsets
Affected Version: 1.2.2_10

Priority: P4
Status: Closed
Resolution: Not an Issue
OS: solaris_8
CPU: sparc

Submitted: 2003-06-20
Updated: 2003-07-08
Resolved: 2003-07-08

Fujitsu has provided a JNI testcase for Java 1.2 in which Japanese SJIS characters are not displayed correctly.
The 3rd argument of icv.convertExtended() in convertEx.java is SJIS.

A similar call appears in the Suggested Fix for 4642283:
+ classname = (char *)(*env)->GetStringUTFChars(env, mainClassName, 0);

This is how the function is used in convertEx.c, from the testcase provided by Fujitsu:
 srcString = (char *)((*env)->GetStringUTFChars(env, strSrc, NULL));
 fromCode = (char *)((*env)->GetStringUTFChars(env, strFrom, NULL));
 toCode = (char *)((*env)->GetStringUTFChars(env, strTo, NULL));
 printf("TEST convertExtended fromCode[%s] toCode[%s]\n",fromCode,toCode);
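GetStringUTFChars() hands native code the String's contents in modified UTF-8, so the bytes that reach the native side depend entirely on which chars the Java String holds. The following is a minimal Java sketch of that effect, not code from the testcase. It relies on DataOutputStream.writeUTF(), which also writes modified UTF-8, to inspect the bytes. The class name, the helper names, and the two sample characters (U+65E5 U+672C) are assumptions chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper class, not part of the Fujitsu testcase.
class ModifiedUtf8Sketch {

    // Returns the modified UTF-8 encoding of s, the same encoding that
    // JNI GetStringUTFChars() exposes to native code.
    static byte[] modifiedUtf8(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF(s);
            out.flush();
            byte[] all = buf.toByteArray();
            // writeUTF() prepends a 2-byte length; drop it.
            return Arrays.copyOfRange(all, 2, all.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Formats bytes as space-separated upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // A String holding the real Unicode characters.
        String unicode = "\u65E5\u672C";
        // A String whose chars are the raw SJIS byte values 93 FA 96 7B.
        String rawByteValues = "\u0093\u00FA\u0096\u007B";

        System.out.println(hex(modifiedUtf8(unicode)));       // E6 97 A5 E6 9C AC
        System.out.println(hex(modifiedUtf8(rawByteValues))); // C2 93 C3 BA C2 96 7B
    }
}
```

In the second case each raw byte value is encoded as its own character, so a later UTF-8 to PCK conversion on the native side would not see the intended Japanese text.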

The testcase has been checked at Sun on Solaris 8/9 with Java 1.2.2_10 and 1.4.1_02.
The Japanese characters are visible in convertEx.java, in the call to the JNI function icv.convertExtended (viewed with dtpad and gedit).

% java -version
java version "1.2.2"
Solaris VM (build Solaris_JDK_1.2.2_10, native threads, sunwjit)
% javac convertEx.java
% javah -jni convertEx
% cc -V
cc: Forte Developer 7 C 5.4 2002/03/09
% cc -G -I. -I/usr/java1.2/include -I/usr/java1.2/include/solaris convertEx.c -o libconvertEx.so
"convertEx.c", line 78: warning: argument #2 is incompatible with prototype:
        prototype: pointer to pointer to const char :  "/usr/include/iconv.h", line 20
        argument : pointer to pointer to char
% java -Djava.library.path=. convertEx UTF-8 PCK
***** start
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [abcdefg]
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [????A~^(o)U?N??]
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [?????????}]
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [?@?A?B?C?D?E?F?G?H?I?_?`?a?b?c?d?e?f?g?h?i?j?k?l?m?n]
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [???????????????????]
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [?o?p?q?r]
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [??????}?????L???A~????????????????@?A?B?C?D?E?F?G?H?I?J?K?L?M?N]
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [?@?A?B?C?D?E?F?G?H?I?J?K?L?M?N]
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [?????????????A~^(o)U]
***** end [?????????????U]

% /usr/j2se/bin/java -version
java version "1.4.1_02"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_02-b06)
Java HotSpot(TM) Client VM (build 1.4.1_02-b06, mixed mode)
% /usr/j2se/bin/java -Djava.library.path=. convertEx UTF-8 PCK
***** start
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [abcdefg]
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [????A~^(o)U?N??]
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [?????????}]
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [?@?A?B?C?D?E?F?G?H?I?_?`?a?b?c?d?e?f?g?h?i?j?k?l?m?n]
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [???????????????????]
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [?o?p?q?r]
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [??????}?????L???A~????????????????@?A?B?C?D?E?F?G?H?I?J?K?L?M?N]
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [?@?A?B?C?D?E?F?G?H?I?J?K?L?M?N]
TEST convertExtended fromCode[UTF-8] toCode[PCK]
TEST convertExtended CALLING [?????????????A~^(o)U]
***** end [??????????????????U]

###@###.### 2003-06-25
see also original sjis code in attachment (kanji.gif)

PUBLIC COMMENTS After looking at the testcase, it appears the customer is constructing their Strings from raw byte values, not Unicode. This is incorrect for multi-byte characters. If they wish to do this, they need to hold the byte values in a byte array and create the String from that. One way of doing this is as follows:

 .....
 byte[] bytesIn = new byte[1024];
 int len = 0;
 try {
     FileInputStream fin = new FileInputStream("SJIS.txt");
     len = fin.read(bytesIn);
     fin.close();
 } catch (Exception e) {
     System.err.println(e);
 }
 try {
     String str = new String(bytesIn, 0, len, "SJIS");
 .....

I have tested this and it works correctly. GetStringUTFChars() supports characters in the SJIS and PCK ranges.
###@###.### 2003-07-08
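The difference between the two ways of building the String can be sketched without a file. This is an illustrative example only: the class name, the helper names, and the sample bytes (the Shift_JIS encoding of U+65E5 U+672C) are assumptions, and it presumes the running JDK ships the Shift_JIS charset.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical illustration, not part of the customer's testcase.
class SjisDecodeSketch {

    // Sample data: the Shift_JIS bytes for U+65E5 U+672C.
    static final byte[] SJIS_BYTES = {
        (byte) 0x93, (byte) 0xFA, (byte) 0x96, (byte) 0x7B
    };

    // The incorrect approach: each byte value becomes one char.
    static String fromRawByteValues(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    // The approach from the comment above: decode the byte array
    // with the charset it was written in.
    static String decodeSjis(byte[] bytes) {
        return new String(bytes, Charset.forName("Shift_JIS"));
    }

    public static void main(String[] args) {
        String wrong = fromRawByteValues(SJIS_BYTES);
        String right = decodeSjis(SJIS_BYTES);

        System.out.println(wrong.length());                 // 4
        System.out.println(right.length());                 // 2
        System.out.println(right.equals("\u65E5\u672C"));   // true

        // Encoding the decoded String back gives the original bytes.
        byte[] roundTrip = right.getBytes(Charset.forName("Shift_JIS"));
        System.out.println(Arrays.equals(roundTrip, SJIS_BYTES)); // true
    }
}
```

Only the decoded String holds the two intended characters; the other holds four unrelated chars in the U+0000 to U+00FF range.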

08-07-2003

EVALUATION JNI GetStringUTFChars() supports characters in the SJIS and PCK ranges. This is not a bug; it is user error (see Summary). ###@###.### 2003-07-08

08-07-2003