A DESCRIPTION OF THE REQUEST :
String#getBytes(..) and new String(bytes..) internally use slow Charset-X-coders (encoders/decoders) that are newly instantiated on every call.

Additionally: a user might at first assume that String#getBytes(byte[] buf, Charset cs) is faster than String#getBytes(byte[] buf, String csn), because he assumes that the Charset would be created internally from csn. As this is only true for the first call, there should be a *note* in the JavaDoc comparing the cost of those methods. Don't forget the JavaDoc of the (byte[] ...) constructors too.

JUSTIFICATION :
Assuming that ASCII and ISO-8859-1 account for a high percentage of the usage of those methods, especially in CORBA applications, we should have a fast shortcut in class String.
See also:
http://cr.openjdk.java.net/~sherman/6636323_6636319/webrev
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6636319
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6636323

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Fastpath for ASCII + ISO-8859-1 for methods and constructors like:
String#getBytes(..) and new String(bytes..)
Alternatives:
String#getASCIIBytes(..)
String#getISO8859_1Bytes(..)
ACTUAL -
byte[] getBytes(Charset charset) internally instantiates a CharsetEncoder, which is much slower, especially on short strings.

---------- BEGIN SOURCE ----------
One simple example:

public class String {
    ...
    // mask: 0x7F for ASCII, 0xFF for ISO-8859-1 (int, so that 0xFF is not sign-extended)
    int getBytes(byte[] buf, int mask) {
        int j = 0;
        for (int i = 0; i < values.length; i++, j++) {
            if ((values[i] | mask) == mask) {
                buf[j] = (byte) values[i]; // char fits the charset: copy it directly
                continue;
            }
            // unmappable char: a surrogate pair produces only one replacement byte
            if (Character.isHighSurrogate(values[i]) && i + 1 < values.length
                    && Character.isLowSurrogate(values[i + 1]))
                i++;
            buf[j] = '?'; // or default replacement
        }
        return j;
    }
    ...
}
---------- END SOURCE ----------
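For experimenting with the idea outside of java.lang.String, the mask-based loop above can be sketched as a standalone helper. This is only an illustration under stated assumptions: the class name FastPathEncoder, the method name encode and the two mask constants are hypothetical and not part of any JDK API, and '?' is assumed as the replacement byte. The main method compares the result with String#getBytes(Charset) for a string containing a surrogate pair.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not a JDK class: mask-based fast path for single-byte charsets.
class FastPathEncoder {
    static final int ASCII_MASK = 0x7F;   // chars 0x00..0x7F are mappable
    static final int LATIN1_MASK = 0xFF;  // chars 0x00..0xFF are mappable

    // Encodes s into buf and returns the number of bytes written.
    // buf must hold at least s.length() bytes.
    static int encode(String s, byte[] buf, int mask) {
        int j = 0;
        for (int i = 0; i < s.length(); i++, j++) {
            char c = s.charAt(i);
            if ((c | mask) == mask) {
                buf[j] = (byte) c;          // fits the charset: copy directly
                continue;
            }
            // Unmappable: a surrogate pair yields a single replacement byte.
            if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                i++;
            }
            buf[j] = '?';                   // assumed replacement byte
        }
        return j;
    }

    // Convenience overload that allocates and trims the result array.
    static byte[] encode(String s, int mask) {
        byte[] buf = new byte[s.length()];
        return Arrays.copyOf(buf, encode(s, buf, mask));
    }

    public static void main(String[] args) {
        String s = "caf\u00e9 \uD83D\uDE00!";
        byte[] fast = encode(s, LATIN1_MASK);
        byte[] ref = s.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("matches String#getBytes: " + Arrays.equals(fast, ref));
    }
}
```

The sketch takes an int mask rather than a byte so that 0xFF is not sign-extended when it is combined with a char. Whether such a loop is actually faster than the CharsetEncoder path would still have to be measured.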