JDK-4454875 : Performance regression in String encoding conversion
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.lang
  • Affected Version: 1.1.8,1.4.0
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: solaris_2.6,windows_2000
  • CPU: x86,sparc
  • Submitted: 2001-05-04
  • Updated: 2013-11-01
  • Resolved: 2001-10-02
Other: 1.4.0 beta3, Fixed
Description

Name: nl37777			Date: 05/04/2001

The test case below shows a significant performance
regression in String encoding conversion from J2RE 1.3.1 to Merlin beta
(1.4.0-beta). The table gives running times in milliseconds on an
Ultra 1, averaged over three runs and using only the second and third
cycle times of each run:

Encoding     1.3.1 (ms)   1.4 (ms)   Increase in running time
Cp1252       4595         13497      194 %
ISO8859_1    3497         12026      244 %
MS932        5677          8367       47 %
EUC_JP       5898          8703       48 %
ISO2022JP    5883          8671       47 %
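
The last column can be rechecked from the two timing columns. The following is a minimal sketch, assuming the percentage is (new - old) / old rounded to the nearest integer; the class and method names are made up for illustration and are not part of the original report.

```java
// Recomputes the "increase in running time" column from the timings above.
// Assumption: increase = (new - old) / old * 100, rounded to the nearest integer.
class RunningTimeIncrease {

    // Percentage increase from oldMillis to newMillis.
    static long percentIncrease(long oldMillis, long newMillis) {
        return Math.round((newMillis - oldMillis) * 100.0 / oldMillis);
    }

    public static void main(String[] args) {
        String[] encodings = {"Cp1252", "ISO8859_1", "MS932", "EUC_JP", "ISO2022JP"};
        long[] jre131 = {4595, 3497, 5677, 5898, 5883};   // 1.3.1 timings from the table
        long[] jre14 = {13497, 12026, 8367, 8703, 8671};  // 1.4 timings from the table
        for (int i = 0; i < encodings.length; i++) {
            System.out.println(encodings[i] + ": "
                    + percentIncrease(jre131[i], jre14[i]) + " %");
        }
    }
}
```

Under that assumption the computed values match the reported 194, 244, 47, 48 and 47 percent.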

It is interesting to note that the Cp1252 and ISO8859_1 converters used
are the new java.nio.charset converters, while the MS932, EUC_JP, and
ISO2022JP converters used are the old sun.io converters.
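
For readers who want to see how these encoding names map onto the public java.nio.charset API, here is a small sketch. It only reports whether the running JVM resolves each name to a Charset; it does not show which converter String.getBytes selected in the 1.4.0 beta build, and the class name CharsetLookup is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Looks up the encoding names from the report through java.nio.charset.
// This is an illustration of the public API, not of the 1.4.0-beta internals.
class CharsetLookup {

    // Returns the canonical charset name, or null if this runtime
    // does not support the given encoding name.
    static String canonicalName(String encoding) {
        try {
            return Charset.isSupported(encoding)
                    ? Charset.forName(encoding).name()
                    : null;
        } catch (IllegalCharsetNameException e) {
            return null; // the name contains characters not allowed in charset names
        }
    }

    public static void main(String[] args) {
        String[] encodings = {"Cp1252", "ISO8859_1", "MS932", "EUC_JP", "ISO2022JP"};
        for (String encoding : encodings) {
            String name = canonicalName(encoding);
            System.out.println(encoding + " -> "
                    + (name != null ? name : "not supported in this runtime"));
        }
    }
}
```

Which of the extended charsets are available depends on the runtime image, so the output may differ between installations.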

The versions used are

java version "1.3.1-rc2"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1-rc2-b23)
Java HotSpot(TM) Client VM (build 1.3.1-rc2-b23, mixed mode)

java version "1.4.0-beta"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-beta-b63)
Java HotSpot(TM) Client VM (build 1.4.0-beta-b63, mixed mode)


import java.io.UnsupportedEncodingException;

public class StringTranscodingPerf {

    static String[] deStrings = {
        "German",
        // from metal_de.properties
        "Suchen in:",
        "Dateiname:",
        "Dateitypen:",
        "Eine Ebene h\u00F6her",
        "H\u00F6her",
        "Home",
        "Home",
        "Neuen Ordner erstellen",
        "Neuer Ordner",
        "Liste",
        "Liste",
        "Details",
        "Details",
        "Als Symbol darstellen",
        "Maximieren",
        "Schlie\u00DFen",
    };

    static String[] jaStrings = {
        "Japanese",
        // from metal_ja.properties
        "\u53c2\u7167:",
        "\u30d5\u30a1\u30a4\u30eb\u540d:",
        "\u30d5\u30a1\u30a4\u30eb\u30bf\u30a4\u30d7:",
        "1 \u30ec\u30d9\u30eb\u4e0a\u3078",
        "\u4e0a\u3078",
        "\u30db\u30fc\u30e0",
        "\u30db\u30fc\u30e0",
        "\u30d5\u30a9\u30eb\u30c0\u306e\u65b0\u898f\u4f5c\u6210",
        "\u65b0\u898f\u30d5\u30a9\u30eb\u30c0",
        "\u30ea\u30b9\u30c8",
        "\u30ea\u30b9\u30c8",
        "\u8a73\u7d30",
        "\u8a73\u7d30",
        "\u30a2\u30a4\u30b3\u30f3\u5316",
        "\u6700\u5927\u5316",
        "\u9589\u3058\u308b",
    };

    public static void main(String[] args)
            throws UnsupportedEncodingException {
        testConversion(deStrings, "Cp1252");
        testConversion(deStrings, "ISO8859_1");
        testConversion(jaStrings, "MS932");
        testConversion(jaStrings, "EUC_JP");
        testConversion(jaStrings, "ISO2022JP");
    }

    private static void testConversion(String[] strings, String encoding)
             throws UnsupportedEncodingException {
        byte[][] bytes = new byte[strings.length][];
        String[] newStrings = new String[strings.length];

        // do each measurement three times to verify when hotspot is done
        for (int cycle = 0; cycle < 3; cycle++) {
            long start = System.currentTimeMillis();
            for (int count = 0; count < 10000; count++) {
                for (int index = 0; index < strings.length; index++) {
                    bytes[index] = strings[index].getBytes(encoding);
                    newStrings[index] = new String(bytes[index], encoding);
                }
            }
            System.out.println(strings[0] + ", " + encoding + ": "
                    + (System.currentTimeMillis() - start) + " ms");
        }

        for (int index = 0; index < strings.length; index++) {
            if (!newStrings[index].equals(strings[index])) {
                throw new RuntimeException(strings[index]
                        + " -> " + newStrings[index]);
            }
        }
    }
}
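
The same round trip can be sketched against later Java releases, where String.getBytes and the String constructor also accept a Charset object (Java 6 and later) and System.nanoTime offers a finer timer. This is an illustrative variant under those assumptions, not the test case used for the numbers above; RoundTripTiming and timeRoundTrip are made-up names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Variant of the round-trip measurement that passes a Charset object
// instead of an encoding name. Requires Java 7+ for StandardCharsets.
class RoundTripTiming {

    // Encodes and decodes every string `iterations` times and returns the
    // elapsed time in nanoseconds. Throws if a string does not survive.
    static long timeRoundTrip(String[] strings, Charset charset, int iterations) {
        if (iterations < 1) {
            throw new IllegalArgumentException("iterations must be at least 1");
        }
        String[] decoded = new String[strings.length];
        long start = System.nanoTime();
        for (int count = 0; count < iterations; count++) {
            for (int index = 0; index < strings.length; index++) {
                byte[] bytes = strings[index].getBytes(charset);
                decoded[index] = new String(bytes, charset);
            }
        }
        long elapsed = System.nanoTime() - start;
        // Verify the conversion outside the timed region.
        for (int index = 0; index < strings.length; index++) {
            if (!decoded[index].equals(strings[index])) {
                throw new RuntimeException(strings[index] + " -> " + decoded[index]);
            }
        }
        return elapsed;
    }

    public static void main(String[] args) {
        String[] sample = {"Eine Ebene h\u00F6her", "Schlie\u00DFen"};
        // Repeat the measurement so that later cycles run compiled code.
        for (int cycle = 0; cycle < 3; cycle++) {
            long nanos = timeRoundTrip(sample, StandardCharsets.ISO_8859_1, 10000);
            System.out.println("ISO-8859-1: " + (nanos / 1000000) + " ms");
        }
    }
}
```

As in the original test, the first cycle includes warm-up time, so only the later cycles are meaningful for comparison.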



Comments
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX: generic
FIXED IN: merlin-beta3
INTEGRATED IN: merlin-beta3
VERIFIED IN: merlin-beta3
14-06-2004

EVALUATION

The given test code runs much more quickly with the revised charset API and implementation (4503732, see comments section for numbers). Routine tuning should bring further improvement, but given that the significant regression reported in the description has been fixed, I am closing this bug. -- ###@###.### 2001/10/2
10-09-0168