JDK-6196407 : J2SE NIO: eucJP-open failed to be looked up.
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.lang
  • Affected Version: 1.4.2,1.4.2_06
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • OS: solaris_9
  • CPU: x86,sparc
  • Submitted: 2004-11-17
  • Updated: 2010-05-10
  • Resolved: 2004-12-10
Other
1.4.2_08 b01 Fixed
Description
Calling java.lang.String.getBytes(String) in a certain order makes
java.nio.charset.Charset.lookupViaProviders(String) appear to fail:
the lookup is slow and ends in a system call error.

For example:
    String str = "abc";
    str.getBytes("eucJP-open");
    str.getBytes("MS932");
    str.getBytes("eucJP-open");
    str.getBytes("MS932");


Each getBytes() call ends up in lookupViaProviders(String), which
appears to fail nearly every time: it is slow, and at the OS level it
results in an error from the stat64 system call.
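
To put a number on the slowdown, the call pattern above can be wrapped in a small timing loop. This is only a sketch: the class name GetBytesRepro, the method timeAlternating, and the round count are made up for illustration and are not part of the original test case. It sticks to constructs available in 1.4.2 (indexed loops, System.currentTimeMillis()). On a runtime where a name is not supported, getBytes() throws UnsupportedEncodingException, which is caught so that the loop keeps running.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical timing harness around the call pattern from the report.
final class GetBytesRepro {

    // Calls str.getBytes(name) for every name in order, `rounds` times,
    // and returns the elapsed wall-clock time in milliseconds.
    static long timeAlternating(String[] names, int rounds) {
        String str = "abc";
        long start = System.currentTimeMillis();
        for (int i = 0; i < rounds; i++) {
            for (int j = 0; j < names.length; j++) {
                try {
                    str.getBytes(names[j]);
                } catch (UnsupportedEncodingException e) {
                    // Name unknown to this runtime: keep going, the lookup
                    // attempt itself is what is being timed.
                }
            }
        }
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        int rounds = 1000; // arbitrary, chosen only for illustration
        long alternating = timeAlternating(new String[] {"eucJP-open", "MS932"}, rounds);
        long single = timeAlternating(new String[] {"MS932", "MS932"}, rounds);
        System.out.println("alternating names: " + alternating + " ms");
        System.out.println("single name:       " + single + " ms");
    }
}
```

Comparing the two figures on an affected 1.4.2 build versus a later release should show whether the repeated provider lookup is being hit; absolute numbers will depend on the machine.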

###@###.### 2004-11-17 02:44:22 GMT

Comments
EVALUATION

I am not familiar with the charset code, but I ran the test case and went through the source code for this part. It seems to me that this is the correct behavior. Every getBytes() call runs lookupViaProviders() to find the charset, and the lookup fails every time because the eucJP-open charset is not defined in our default charset names, so the caching never takes effect. If you change the charset name to euc_jp, the problem goes away. My understanding is that the customer wants to provide their own implementation for eucJP-open. I have added ###@###.### to the interest list; I think he is the original developer of the charset provider.

###@###.### 2004-11-19 02:51:56 GMT

While this is the "correct" behavior of the provider lookup mechanism, it is not acceptable performance for the j2se product.

The root cause is that we don't have eucjp-open and pck in the nio charset collection in 1.4.2_xx; there are only the ctb/btc converters. (We have them now in 1.5, see bug#4892738.) The current implementation of the StringCoding class caches only one "last used" charset/converter per thread, so using two encoding names repeatedly easily "penetrates" this cache mechanism. Worse, eucjp-open does not exist in either the StandardCharset provider or the ExtendedCharsetProvider, so the next levels of cache in Charset and AbstractCharsetProvider do not help either. We end up reaching the final lookup layer, which searches for a "new" charset provider again and again, and that is expensive.

The reason we don't see the same issue with PCK is that we have special "if PCK" code in StringCoding.java. We have the same problem with all encodings that exist only in the sun.io package, such as the IBMxyz encodings.

Two possible quick and easy solutions for this particular issue would be:
(1) Add the same "special" code for eucjp-open in StringCoding.lookupCharset(), or
(2) Backport 4892738 to 1.4.2_0x, which I think is the better approach.

The disadvantage of both solutions is that we still have the same issue with the IBMxyz encodings, if customers care about them. Since this issue does not exist in 1.5 and later, the submitter needs to escalate to CTE to get it fixed in the 1.4.2_08 release.

###@###.### 2004-11-19 05:07:16 GMT
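
The cache "penetration" described in the evaluation can be illustrated with a toy model. The code below is not the StringCoding source; OneSlotCacheDemo, OneSlotCache and expensiveLookup are hypothetical names, and the model only counts how often a single "last used" slot falls through to the slow path. It assumes nothing about the real lookup other than that it is costly.

```java
// Illustrative model only; this is NOT the JDK's StringCoding implementation.
final class OneSlotCacheDemo {

    // Hypothetical stand-in for a per-thread "last used" cache:
    // it remembers exactly one name and the value resolved for it.
    static final class OneSlotCache {
        private String lastName;
        private Object lastValue;
        int slowLookups; // number of times the expensive path was taken

        Object get(String name) {
            if (name.equals(lastName)) {
                return lastValue; // hit: same name as the previous call
            }
            slowLookups++;        // miss: the slot held a different name
            lastValue = expensiveLookup(name);
            lastName = name;
            return lastValue;
        }

        // Placeholder for the costly provider search.
        private Object expensiveLookup(String name) {
            return new Object();
        }
    }

    // Runs the given sequence of names through a fresh cache and
    // reports how many calls reached the expensive path.
    static int slowLookupsFor(String... names) {
        OneSlotCache cache = new OneSlotCache();
        for (String n : names) {
            cache.get(n);
        }
        return cache.slowLookups;
    }

    public static void main(String[] args) {
        // One name repeated: only the first call misses.
        System.out.println(slowLookupsFor("MS932", "MS932", "MS932", "MS932"));           // prints 1
        // Two names alternating: every call evicts the other name and misses.
        System.out.println(slowLookupsFor("eucJP-open", "MS932", "eucJP-open", "MS932")); // prints 4
    }
}
```

In the model every miss costs the same. In the situation described above, the misses for eucjp-open are the costly ones, because no lower-level cache can answer for a name that no installed provider knows.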
19-11-2004