JDK-4811955 : Charset converter: ISO2022KR shows performance degradation from 1.3.1_04
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio.charsets
  • Affected Version: 1.4.2
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2003-02-03
  • Updated: 2003-08-18
  • Resolved: 2003-06-27

Other
5.0 tiger   Fixed
Description
On an Ultra 10, the following performance numbers were generated for ISO2022KR:

1.3.1_04   1.4.2    Result
--------   ------   ------
  26488    557651   -2005%
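
The report does not say how the Result column is derived. A minimal sketch, assuming it is the relative change (old - new) / old expressed as a percentage, which reproduces the figure in the table. The class and method names are hypothetical.

```java
public class PercentChange {
    // Relative change of newTime versus oldTime, in percent.
    // A negative value means the newer release took longer.
    // Assumption: this is the formula behind the "Result" column.
    static long percentChange(long oldTime, long newTime) {
        return Math.round((oldTime - newTime) * 100.0 / oldTime);
    }

    public static void main(String[] args) {
        // Figures taken from the table: 1.3.1_04 versus 1.4.2.
        System.out.println(percentChange(26488, 557651) + "%");
    }
}
```

Read this way, the 1.4.2 figure is roughly 21 times the 1.3.1_04 figure.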

This program is based on the program from 4752992. This problem may or may not be similar to that reported in said bug.

Use the following program to reproduce the problem:

---------------------------- Cut Here -------------------------------
import java.io.*;
import java.util.*;

public class CharsetTest {
    private int iterations = 100000;
    private ByteArrayOutputStream baos;
    private OutputStreamWriter osw;

    public static void main(String[] arg) {
        new CharsetTest();
    }

    public CharsetTest() {
        try {
            baos = new ByteArrayOutputStream();

            long start = System.currentTimeMillis();

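            // Only writer construction is timed; nothing is written, so the
            // measurement isolates the per-instance coder setup cost.
            for (int i = 0; i < iterations; i++) {
                osw = new OutputStreamWriter(baos, "ISO2022KR");
            }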

            long end = System.currentTimeMillis();
            long t = end - start;

            System.out.println("Time: " + t);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

---------------------------- Cut Here -------------------------------

###@###.### 2003-02-03

Comments
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX: tiger
FIXED IN: tiger
INTEGRATED IN: tiger tiger-b10
VERIFIED IN: tiger
14-06-2004

PUBLIC COMMENTS

Verified in Tiger b15.

###@###.### 2003-08-18
18-08-2003

EVALUATION

These extra setup costs for the ISO-2022-KR NIO coder can be avoided by using an alternative implementation. The current implementation performs an internal lookup against the charset name "EUC_KR". ISO-2022-KR, defined in RFC 1557, provides ISO-2022 escaping for KS X 1001:1992 characters in addition to US-ASCII. It is convenient for ISO-2022-KR to share the conversion tables used by EUC-KR, since code set 1 of EUC-KR (2 bytes) is also KS X 1001:1992. However, the current implementation performs an unnecessary recursive charset lookup. Avoiding this lookup restores the original setup performance of this coder. I am adding a suggested fix.

###@###.### 2003-02-05

Here are some performance numbers from the supplied test case, which magnifies the repeated-lookup performance issue that was introduced in 1.4.0 and worsened in 1.4.1 through 1.4.2:

J2SE version        Time elapsed (ms)
-----------------   -----------------
1.3.1 FCS (b24)           12163
1.4.0 FCS (b92)          109272
1.4.1 FCS (b21)          669861
1.4.2 FCS (b24)          436707
1.5 (fix applied)          3861   <----- Tiger build with fix.

These timings were produced on a single-CPU U60 running Solaris 9, using the default client HotSpot compiler.

###@###.### 2003-06-10
10-06-2003
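
The evaluation attributes the slowdown to a charset lookup repeated each time a coder is set up. The actual fix is not shown in this report. What follows is only a minimal sketch of the general pattern: resolve the delegate charset once and reuse it, instead of looking it up by name on every construction. The class name, method names, and the fallback to UTF-8 when EUC-KR is not installed are all assumptions made for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class SharedTableSketch {
    // Hypothetical stand-in for the charset whose conversion tables are shared.
    // Falls back to UTF-8 so the sketch still runs where EUC-KR is unavailable.
    static final String DELEGATE_NAME =
            Charset.isSupported("EUC-KR") ? "EUC-KR" : "UTF-8";

    // Pattern described as costly: every construction repeats a by-name lookup.
    static CharsetEncoder newEncoderWithLookup() {
        return Charset.forName(DELEGATE_NAME).newEncoder();
    }

    // Alternative: the lookup happens once, when the class is initialized.
    private static final Charset DELEGATE = Charset.forName(DELEGATE_NAME);

    static CharsetEncoder newEncoderCached() {
        return DELEGATE.newEncoder();
    }

    public static void main(String[] args) {
        int iterations = 100000;

        long start = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            newEncoderWithLookup();
        }
        System.out.println("Lookup per construction: "
                + (System.currentTimeMillis() - start) + " ms");

        start = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            newEncoderCached();
        }
        System.out.println("Cached delegate: "
                + (System.currentTimeMillis() - start) + " ms");
    }
}
```

On current JDKs by-name lookups are themselves cached, so the two timings may be close. The sketch is meant to show the structure of the change rather than to reproduce the figures above.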