JDK-4093056 : RFE: Add facilities for fast character-encoding conversion
  • Type: Enhancement
  • Component: core-libs
  • Sub-Component: java.nio.charsets
  • Affected Version: 1.1.4,1.1.5,1.2.0
  • Priority: P5
  • Status: Closed
  • Resolution: Duplicate
  • OS: generic,windows_95
  • CPU: generic,x86
  • Submitted: 1997-11-14
  • Updated: 2000-02-25
  • Resolved: 2000-02-25
Related Reports
Duplicate :  
Duplicate :  
Description

Name: joT67522			Date: 11/14/97


ByteToCharConverter and CharToByteConverter were removed from java.io
in JDK 1.1 Beta 3 and moved to sun.io.

Not allowing low-level charset conversion causes a 100%
performance hit when using String.getBytes() and String(byte[], encoding).

The low-level CharToByte and ByteToChar converters need to be moved back
into java.io, both for performance and to allow the application to be
100% pure Java.


Below is a small reproduction which does 2 tests:

	1) convert bytes to a String
	2) convert a String into bytes

The first set uses the String(byte[], encoding) constructor and the
String.getBytes(encoding) method.

The second set uses the sun.io.ByteToCharConverter and
sun.io.CharToByteConverter classes.

The output from running this app is:

   Test: String(byte[], charset)
        Elapsed time: 750
   Test: String.getBytes(charset)
        Elapsed time: 390
   Test: sun.io.ByteToCharConverter.convertAll()
        Elapsed time: 438
   Test: sun.io.CharToByteConverter.convertAll())
        Elapsed time: 110

You can see that the sun.io converters are much faster.  These converter
classes used to be in java.io before JDK 1.1 Beta 3, but were removed.

We want to have the performance (we are reading/writing to a network
socket) but we also want to be 100% Java.  Using the new "prescribed"
way of doing conversions has a major performance impact.

When can we get these (or similar) type methods put back into 
the java.io classes?

Thanks for your assistance,

-Carl

--- ConvertPerf.java ---
import java.io.*;
import sun.io.*;

public class ConvertPerf {

    static final int ARRAYSIZE = 500;

    public static void main(String args[]) 
    {
       int i;
       byte anArrayOfBytes[] = new byte[ARRAYSIZE];
       byte dummyBytes[] = null;
       String aStr = null;
       long startTime;
       long endTime;
       String testString = new String("This is a string to be converted to bytes");
       String _charsetName = new String("Cp850");

       // fill up the array of bytes with some data
       for (i = 0; i < ARRAYSIZE; i++)
       {
           anArrayOfBytes[i] = 'A';
       }

       // test using String(byte[])
       System.out.println("Test: String(byte[], charset)");
       startTime = System.currentTimeMillis();
       for(i = 0; i < ARRAYSIZE; i++)
       {
           try
           {
               aStr = new String(anArrayOfBytes, _charsetName);
           }
           catch (UnsupportedEncodingException uee)
           {
               System.out.println("Got UnsupportedEncodingException");
           }
       }
       endTime = System.currentTimeMillis();
       System.out.println("\tElapsed time: " + (endTime - startTime));
       
       // test using String().getBytes()
       System.out.println("Test: String.getBytes(charset)");
       startTime = System.currentTimeMillis();
       try
       {
          for(i = 0; i < ARRAYSIZE; i++)
          {
                 dummyBytes = testString.getBytes(_charsetName);
          }
       }
       catch (UnsupportedEncodingException uee)
       {
           System.out.println("Got UnsupportedEncodingException");
       }
       endTime = System.currentTimeMillis();
       System.out.println("\tElapsed time: " + (endTime - startTime));

       System.out.println("Test: sun.io.ByteToCharConverter.convertAll()");
       ByteToCharConverter _toUnicode = ByteToCharConverter.getDefault();
       startTime = System.currentTimeMillis();
       try
       {
          for(i = 0; i < ARRAYSIZE; i++)
          {
                aStr = new String(_toUnicode.convertAll(anArrayOfBytes));
          }
       }
       catch (MalformedInputException mie)
       {
          System.out.println("got MalformedInputException");
       }
       endTime = System.currentTimeMillis();
       System.out.println("\tElapsed time: " + (endTime - startTime));
    
       System.out.println("Test: sun.io.CharToByteConverter.convertAll())");
       CharToByteConverter _fromUnicode = CharToByteConverter.getDefault();
       startTime = System.currentTimeMillis();
       try
       {
          for(i = 0; i < ARRAYSIZE; i++)
          {
                dummyBytes = _fromUnicode.convertAll(testString.toCharArray());
          }
       }
       catch (MalformedInputException mie)
       {
          System.out.println("got MalformedInputException");
       }
       endTime = System.currentTimeMillis();
       System.out.println("\tElapsed time: " + (endTime - startTime));
    
    }

}

--- end of ConvertPerf.java ---


(Review ID: 19817)
======================================================================

Comments
WORK AROUND

Name: joT67522			Date: 11/14/97

======================================================================
11-06-2004

EVALUATION

This is an RFE, not a bug.
-- mr@eng 11/14/1997

Doing this right will require some significant additions to the java.io
package. It is too late to do this for JDK 1.2.
-- mr@eng 1/5/1998

Making character converters public is part of RFE 4287465.
norbert.lindenberg@Eng 2000-02-25
05-01-1998
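
For readers on later JDKs, public low-level converters of the kind requested
here are available in the java.nio.charset package (J2SE 1.4 and later). The
following is a minimal sketch, not part of the original report, of reusing one
CharsetDecoder and one CharsetEncoder across many conversions. The class name
NioConvertSketch and the helper names decode/encode are hypothetical, and the
fallback to ISO-8859-1 is an assumption for runtimes that do not ship Cp850.
No timing claims are made for it.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: reusable public converters from java.nio.charset.
public class NioConvertSketch {

    // Decode a whole byte array with a caller-supplied (reusable) decoder.
    // The one-argument decode(ByteBuffer) resets the decoder before use.
    static String decode(CharsetDecoder decoder, byte[] bytes) {
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Encode a whole string with a caller-supplied (reusable) encoder.
    // The one-argument encode(CharBuffer) resets the encoder before use.
    static byte[] encode(CharsetEncoder encoder, String text) {
        try {
            ByteBuffer buf = encoder.encode(CharBuffer.wrap(text));
            byte[] out = new byte[buf.remaining()];
            buf.get(out);
            return out;
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: fall back to ISO-8859-1 when Cp850 is not installed.
        Charset cs = Charset.isSupported("Cp850")
                ? Charset.forName("Cp850")
                : StandardCharsets.ISO_8859_1;

        // Create the converters once and reuse them for every call.
        CharsetDecoder decoder = cs.newDecoder();
        CharsetEncoder encoder = cs.newEncoder();

        byte[] input = new byte[500];
        java.util.Arrays.fill(input, (byte) 'A');

        String asString = decode(decoder, input);
        byte[] roundTrip = encode(encoder, asString);

        System.out.println("Charset: " + cs.name());
        System.out.println("Decoded chars: " + asString.length());
        System.out.println("Re-encoded bytes: " + roundTrip.length);
    }
}
```

The converters are created once and passed in, which mirrors how the sun.io
converters are obtained once outside the loops in ConvertPerf.java above.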