JDK-6695386 : Provide termination of CharsetDecoder by exclusive cipher
  • Type: Enhancement
  • Component: core-libs
  • Sub-Component: java.nio
  • Affected Version: 6
  • Priority: P5
  • Status: Closed
  • Resolution: Won't Fix
  • OS: windows_xp
  • CPU: x86
  • Submitted: 2008-04-29
  • Updated: 2016-06-06
  • Resolved: 2008-11-20
Related Reports
Blocks :  
Description
A DESCRIPTION OF THE REQUEST :
Class java.nio.charset.CharsetDecoder is often used to decode native zero-terminated ASCII strings.
If the ByteBuffer in use wraps a fixed-size byte array, there are probably trailing 0x00 bytes, and neither position nor limit is aligned to them. The ByteBuffer's limit must then first be set to the index of the first 0x00 to avoid invalid chars in the destination CharBuffer.

To address this, I suggest adding the following methods:
    public int terminator() {
        throw new UnsupportedOperationException();
    }
    public CharsetDecoder terminateBy(int newTerminator) {
        throw new UnsupportedOperationException();
    }
By default they should throw UnsupportedOperationException and should be overridden by the appropriate subclasses.

JUSTIFICATION :
- Zero-terminated ASCII strings are often used when accessing native APIs.
- It is more elegant than extra lines of code.
- It would improve performance, as the source bytes would not need to be scanned twice.
- It would, e.g., help to fix bug 6449421.



---------- BEGIN SOURCE ----------
Usage (e.g.):
(Possible fix for http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6449421)

  return Charset.forName(charSet).newDecoder()
      .onMalformedInput(CodingErrorAction.REPLACE)
      .onUnmappableCharacter(CodingErrorAction.REPLACE)
      .replaceWith("?")
      .terminateBy(0)
      .decode (ByteBuffer.wrap(inBytes))
      .toString();


---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
  int len = 0;
  // Stop at the first 0x00, or at the end of the array if there is none.
  while (len < inBytes.length && inBytes[len] != 0)
    len++;
  return Charset.forName(charSet).newDecoder()
      .onMalformedInput(CodingErrorAction.REPLACE)
      .onUnmappableCharacter(CodingErrorAction.REPLACE)
      .replaceWith("?")
      .decode (ByteBuffer.wrap(inBytes, 0, len))
      .toString();
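
For reference, the workaround can be packaged as a self-contained helper. This is only a sketch: the class name ZeroTerminatedDecoding and the method name decodeZeroTerminated are hypothetical and not part of any JDK API. It assumes the terminator is a single 0x00 byte, and it adds a bounds check so that arrays without a terminator are decoded in full.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper class; the names are illustrative only.
final class ZeroTerminatedDecoding {

    // Decodes the bytes of inBytes up to (not including) the first 0x00.
    static String decodeZeroTerminated(byte[] inBytes, String charSet) {
        // First pass: find the terminator, or the end of the array.
        int len = 0;
        while (len < inBytes.length && inBytes[len] != 0) {
            len++;
        }
        try {
            // Second pass: decode only the bytes before the terminator.
            return Charset.forName(charSet).newDecoder()
                    .onMalformedInput(CodingErrorAction.REPLACE)
                    .onUnmappableCharacter(CodingErrorAction.REPLACE)
                    .replaceWith("?")
                    .decode(ByteBuffer.wrap(inBytes, 0, len))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not expected, because both error actions are REPLACE.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] raw = {'a', 'b', 'c', 0, 0, 0};
        System.out.println(decodeZeroTerminated(raw, "US-ASCII"));
    }
}
```

Note that the source bytes are still visited twice here, which is the cost the request aims to avoid.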

Comments
From a semantic point of view, yes, but not from a performance point of view. If ByteBuffer had to detect a terminator byte in a given byte array, the array would need to be scanned twice: once there and once in CharsetDecoder. Additionally, many CharsetDecoder implementations already check for "0" in the inner loop, which helps to detect invalid mappings. Only the interpretation outside the loop path would need to be altered accordingly, resulting in zero performance cost. Terminators other than zero IMHO semantically belong to the CharsetDecoder.
06-06-2016
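
The single-pass argument in this comment can be illustrated with a deliberately simplified decode loop. This is a sketch under stated assumptions, not the JDK's decoder code: it handles only ISO-8859-1 (where each byte maps directly to the code point of the same value), and the class name Latin1TerminatedDecoder and its decode method are hypothetical.

```java
// Hypothetical single-pass decoder for ISO-8859-1 input; illustrative only.
final class Latin1TerminatedDecoder {

    // Decodes src as ISO-8859-1 and stops at the first byte equal to terminator.
    static String decode(byte[] src, int terminator) {
        StringBuilder out = new StringBuilder(src.length);
        for (byte b : src) {
            int unsigned = b & 0xFF;      // treat the byte as a value in 0..255
            if (unsigned == terminator) {
                break;                    // terminator check inside the decode loop
            }
            out.append((char) unsigned);  // ISO-8859-1: byte value equals code point
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] raw = {'H', 'i', 0, '!'};
        System.out.println(decode(raw, 0));
    }
}
```

Here the terminator test shares the loop that does the decoding, so each source byte is read only once.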

EVALUATION It seems more reasonable to consider addressing this "zero-terminate" request in the ByteBuffer class rather than in CharsetDecoder.
20-11-2008
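
A buffer-side approach along the lines of this evaluation might look like the following sketch. No such method exists on ByteBuffer; the class TerminatedBuffers and the helper untilTerminator are hypothetical names, and the code uses only the existing duplicate(), get(int) and limit(int) calls.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// Hypothetical utility; ByteBuffer itself offers no terminator support.
final class TerminatedBuffers {

    // Returns a view of src whose limit is set to the first terminator byte.
    // The position and limit of src itself are left unchanged.
    static ByteBuffer untilTerminator(ByteBuffer src, byte terminator) {
        ByteBuffer view = src.duplicate();
        for (int i = view.position(); i < view.limit(); i++) {
            if (view.get(i) == terminator) {
                view.limit(i);
                break;
            }
        }
        return view;
    }

    public static void main(String[] args) {
        byte[] raw = {'J', 'D', 'K', 0, 0, 0};
        ByteBuffer trimmed = untilTerminator(ByteBuffer.wrap(raw), (byte) 0);
        System.out.println(Charset.forName("US-ASCII").decode(trimmed).toString());
    }
}
```

The trimmed view can then be passed to any existing decoder unchanged, at the price of the extra scan over the source bytes.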

EVALUATION This appears to be a duplicate of 6452016 submitted by the same person.
29-04-2008