JDK-6642323 : Speeding up Single Byte Decoders
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio.charsets
  • Affected Version: 7
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2007-12-15
  • Updated: 2011-05-18
  • Resolved: 2011-05-18
Fixed in: JDK 7 b43
Description
Here is the critical decoder loop for ASCII:

	private CoderResult decodeArrayLoop(ByteBuffer src,
					    CharBuffer dst)
	{
	    byte[] sa = src.array();
	    int sp = src.arrayOffset() + src.position();
	    int sl = src.arrayOffset() + src.limit();
	    assert (sp <= sl);
	    sp = (sp <= sl ? sp : sl);
	    char[] da = dst.array();
	    int dp = dst.arrayOffset() + dst.position();
	    int dl = dst.arrayOffset() + dst.limit();
	    assert (dp <= dl);
	    dp = (dp <= dl ? dp : dl);

	    try {
		while (sp < sl) {
		    byte b = sa[sp];
		    if (b >= 0) {
			if (dp >= dl)
			    return CoderResult.OVERFLOW;
			da[dp++] = (char)b;
			sp++;
			continue;
		    }
		    return CoderResult.malformedForLength(1);
		}
		return CoderResult.UNDERFLOW;
	    } finally {
		src.position(sp - src.arrayOffset());
		dst.position(dp - dst.arrayOffset());
	    }
	}


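We can optimize the inner loop by checking for overflow and underflow at the same time: bound the loop by the smaller of the two buffers' remaining counts, then decide afterwards which limit was hit by checking whether any source bytes remain.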


        // Write sp and dp back to the buffers, then report which limit was hit.
        // updateBufferPositions is a helper that is not shown in this report.
        private static CoderResult normalResult(ByteBuffer src, int sp,
                                                CharBuffer dst, int dp) {
            updateBufferPositions(src, sp, dst, dp);
            return src.hasRemaining() ?
                CoderResult.OVERFLOW :
                CoderResult.UNDERFLOW;
        }

        private CoderResult decodeArrayLoop(ByteBuffer src,
                                            CharBuffer dst)
        {
            byte[] sa = src.array();
            int sp = src.arrayOffset() + src.position();

            char[] da = dst.array();
            int dp = dst.arrayOffset() + dst.position();
            // One bound covers both source underflow and destination overflow.
            int dl = dp + Math.min(src.remaining(),
                                   dst.remaining());
            while (dp < dl) {
                byte b = sa[sp];
                if (b >= 0) {
                    da[dp++] = (char)b;
                    sp++;
                } else {
                    // coderResult is also not shown; presumably it updates the
                    // buffer positions and returns its first argument.
                    return coderResult(CoderResult.malformedForLength(1),
                                       src, sp, dst, dp);
                }
            }
            return normalResult(src, sp, dst, dp);
        }
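
For reference, the optimized ASCII loop above can be made self-contained roughly as follows. This is only a sketch: the class name AsciiDecodeSketch is hypothetical, and the bodies of updateBufferPositions and coderResult are assumptions, since the report names those helpers without showing them. It also assumes both buffers are array-backed.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CoderResult;

// Hypothetical standalone class; in the JDK this logic lives inside a CharsetDecoder.
final class AsciiDecodeSketch {

    // Assumed helper body: convert array indices back into buffer positions.
    static void updateBufferPositions(ByteBuffer src, int sp,
                                      CharBuffer dst, int dp) {
        src.position(sp - src.arrayOffset());
        dst.position(dp - dst.arrayOffset());
    }

    // Assumed helper body: update positions, then pass the given result through.
    static CoderResult coderResult(CoderResult result,
                                   ByteBuffer src, int sp,
                                   CharBuffer dst, int dp) {
        updateBufferPositions(src, sp, dst, dp);
        return result;
    }

    // If source bytes remain after the loop, the destination must have filled up.
    static CoderResult normalResult(ByteBuffer src, int sp,
                                    CharBuffer dst, int dp) {
        updateBufferPositions(src, sp, dst, dp);
        return src.hasRemaining() ? CoderResult.OVERFLOW : CoderResult.UNDERFLOW;
    }

    static CoderResult decodeArrayLoop(ByteBuffer src, CharBuffer dst) {
        byte[] sa = src.array();
        int sp = src.arrayOffset() + src.position();
        char[] da = dst.array();
        int dp = dst.arrayOffset() + dst.position();
        // Single loop bound: stop at whichever buffer runs out first.
        int dl = dp + Math.min(src.remaining(), dst.remaining());
        while (dp < dl) {
            byte b = sa[sp];
            if (b < 0) {
                // Bytes 0x80..0xFF are negative in Java and are not ASCII.
                return coderResult(CoderResult.malformedForLength(1),
                                   src, sp, dst, dp);
            }
            da[dp++] = (char) b;
            sp++;
        }
        return normalResult(src, sp, dst, dp);
    }

    public static void main(String[] args) {
        ByteBuffer src = ByteBuffer.wrap(new byte[] {'a', 'b', 'c', 'd'});
        CharBuffer dst = CharBuffer.allocate(2);   // deliberately too small
        CoderResult r = decodeArrayLoop(src, dst);
        System.out.println(r + " src=" + src.position() + " dst=" + dst.position());
    }
}
```

With a two-char destination and four source bytes, the loop stops after two characters and the leftover source bytes turn the result into OVERFLOW, without any per-iteration overflow test.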
 

Similarly, for ISO-8859-1, we can optimize thus:

            int dl = dp + Math.min(src.remaining(),
                                   dst.remaining());
            while (dp < dl) {
                da[dp++] = (char)(sa[sp++] & 0xff);
            }
            return normalResult(src, sp, dst, dp);
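
The `& 0xff` matters because Java bytes are signed: casting a negative byte straight to char sign-extends it. A minimal sketch of the ISO-8859-1 loop follows, assuming array-backed buffers; the class name Latin1DecodeSketch is hypothetical, and the position updates are inlined rather than delegated to the report's normalResult helper.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CoderResult;

// Hypothetical standalone class illustrating the ISO-8859-1 inner loop.
final class Latin1DecodeSketch {

    static CoderResult decodeArrayLoop(ByteBuffer src, CharBuffer dst) {
        byte[] sa = src.array();
        int sp = src.arrayOffset() + src.position();
        char[] da = dst.array();
        int dp = dst.arrayOffset() + dst.position();
        int dl = dp + Math.min(src.remaining(), dst.remaining());
        while (dp < dl) {
            // Every byte is valid; masking maps 0x80..0xFF to U+0080..U+00FF.
            da[dp++] = (char) (sa[sp++] & 0xff);
        }
        // Inlined equivalent of normalResult from the report.
        src.position(sp - src.arrayOffset());
        dst.position(dp - dst.arrayOffset());
        return src.hasRemaining() ? CoderResult.OVERFLOW : CoderResult.UNDERFLOW;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xE9;                       // e-acute in ISO-8859-1
        System.out.println((int) (char) b);          // unmasked: sign-extended
        System.out.println((int) (char) (b & 0xff)); // masked: 233
    }
}
```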

For other table-driven single-byte decoders, we can index into a char array
rather than a String, which removes a level of indirection.
If every byte value is valid, we can also eliminate the check for unmappable
characters. For ISO-8859-15, for example, we are left with the tight inner loop:

            while (dp < dl) {
                da[dp++] = byteToChar[sa[sp++] & 0xff];
            }
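
As an illustration of the table-driven loop, the sketch below builds a 256-entry char array for ISO-8859-15 and decodes through it. The class name TableDecodeSketch and the way the table is built (start from the ISO-8859-1 identity mapping and patch the eight positions where ISO-8859-15 differs) are assumptions made for this example; the report does not say how the JDK's byteToChar table is generated.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CoderResult;

// Hypothetical standalone class illustrating a char[]-based single-byte decoder.
final class TableDecodeSketch {

    // One char per byte value; every byte of ISO-8859-15 is mappable.
    static final char[] BYTE_TO_CHAR = buildIso885915Table();

    static char[] buildIso885915Table() {
        char[] t = new char[256];
        for (int i = 0; i < 256; i++) {
            t[i] = (char) i;          // identical to ISO-8859-1 by default
        }
        // The eight positions where ISO-8859-15 differs from ISO-8859-1.
        t[0xA4] = '\u20AC';           // euro sign
        t[0xA6] = '\u0160';           // S with caron
        t[0xA8] = '\u0161';           // s with caron
        t[0xB4] = '\u017D';           // Z with caron
        t[0xB8] = '\u017E';           // z with caron
        t[0xBC] = '\u0152';           // OE ligature
        t[0xBD] = '\u0153';           // oe ligature
        t[0xBE] = '\u0178';           // Y with diaeresis
        return t;
    }

    static CoderResult decodeArrayLoop(ByteBuffer src, CharBuffer dst) {
        byte[] sa = src.array();
        int sp = src.arrayOffset() + src.position();
        char[] da = dst.array();
        int dp = dst.arrayOffset() + dst.position();
        int dl = dp + Math.min(src.remaining(), dst.remaining());
        char[] byteToChar = BYTE_TO_CHAR;   // local copy of the table reference
        while (dp < dl) {
            // No unmappable check is needed because the table is total.
            da[dp++] = byteToChar[sa[sp++] & 0xff];
        }
        src.position(sp - src.arrayOffset());
        dst.position(dp - dst.arrayOffset());
        return src.hasRemaining() ? CoderResult.OVERFLOW : CoderResult.UNDERFLOW;
    }

    public static void main(String[] args) {
        ByteBuffer src = ByteBuffer.wrap(new byte[] {(byte) 0xA4, '5'});
        CharBuffer dst = CharBuffer.allocate(4);
        decodeArrayLoop(src, dst);
        dst.flip();
        System.out.println(dst);   // euro sign followed by '5'
    }
}
```

A charset with unassigned byte values would still need a sentinel entry in the table and an unmappable check inside the loop.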

Simple benchmarking shows roughly a twofold speedup
when decoding long random byte sequences.

Comments
EVALUATION
Benchmark results from the latest mapping-based new sbcs implementation. Not all of the suggested "tricks" have been applied, but the performance looks much better.

                              old-client   old-server   new-client   new-server
------------------
Decoding 1b cp1252          : 4811 1.000    905 1.000   1965 1.000    746 1.000
Decoding 1b (direct) cp1252 : 9777 2.032   2717 3.001   7223 3.675   2773 3.718
Encoding 1b cp1252          : 4180 0.869   2710 2.994   3034 1.544   1544 2.070
Encoding 1b (direct) cp1252 : 9788 2.034   3521 3.889   7971 4.056   2612 3.502
------------------
Decoding 1b cp1252          : 4834 1.000    907 1.000   1964 1.000    745 1.000
Decoding 1b (direct) cp1252 : 9995 2.068   2710 2.987   7249 3.691   2775 3.724
Encoding 1b cp1252          : 4182 0.865   2707 2.984   2526 1.286   1548 2.077
Encoding 1b (direct) cp1252 : 9757 2.018   3521 3.881   7970 4.058   2609 3.502
------------------
Decoding 1b cp1252          : 4915 1.000    906 1.000   1965 1.000    765 1.000
Decoding 1b (direct) cp1252 : 9700 1.973   2707 2.985   7250 3.690   2770 3.618
Encoding 1b cp1252          : 4207 0.856   2706 2.984   3239 1.648   1545 2.019
Encoding 1b (direct) cp1252 : 9894 2.013   3521 3.884   7974 4.058   2609 3.408
10-12-2008

EVALUATION Yes.
15-12-2007