JDK-6196991 : (cs) Many character decoders fail to convert single-byte (e.g. ASCII) input
  • Type: Bug
  • Status: Resolved
  • Resolution: Fixed
  • Component: core-libs
  • Sub-Component: java.nio
  • Priority: P2
  • Affected Version: 1.4.2,5.0
  • OS: linux,windows_xp
  • CPU: x86
  • Submit Date: 2004-11-18
  • Updated Date: 2017-07-11
  • Resolved Date: 2005-03-19
Other: 5.0u35 Resolved
JDK 6: 6 b29 Resolved
Related Reports
Duplicate :  
Duplicate :  
Duplicate :  
Description
FULL PRODUCT VERSION :
java version "1.4.2_01"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_01-b06)
Java HotSpot(TM) Client VM (build 1.4.2_01-b06, mixed mode)

ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows XP [Version 5.1.2600]

EXTRA RELEVANT SYSTEM CONFIGURATION :
Using English version of Windows.
Default locale is "en"


A DESCRIPTION OF THE PROBLEM :
The enclosed program round-trips a string through SJIS: it encodes the string to SJIS bytes and then decodes those bytes back into a string.  When "ABC" is processed, it correctly comes back as "ABC".  When a single-character string such as "A" is processed, it comes back as an empty string.



STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the provided Java program.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Printed output should be:

roundTrip(A)=A
roundTrip(ABC)=ABC
ACTUAL -
Actual output is:

roundTrip(A)=
roundTrip(ABC)=ABC

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
package gcr.db.test;

import java.nio.*;
import java.nio.charset.*;

/**
 * Demonstrates apparent bug in SJIS encoding.
 */
public class SjisBug {
  public static void main(String[] args) {
    System.out.println("roundTrip(A)="+roundTrip("A")); // Gets empty string !
    System.out.println("roundTrip(ABC)="+roundTrip("ABC")); // Gets ABC, as expected.
  }
  static String roundTrip(String str) {
    Charset cs=Charset.forName("SJIS");
    return cs.decode(cs.encode(str)).toString();
  }
}



---------- END SOURCE ----------
###@###.### 2004-11-18 04:18:09 GMT

Comments
WORK AROUND
If the input is only one byte, you can "pad" it with extra data whose decoding is known (e.g. ASCII) and discard the corresponding output after decoding.
###@###.### 2005-2-10 17:47:21 GMT
2005-02-10
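The padding workaround might be sketched as follows. This is only an illustration: the class PaddedDecode, the method decodePadded, and the choice of an ASCII space as the pad byte are assumptions, not something taken from the report. It also assumes that the input is a complete byte sequence and that the charset is ASCII-compatible, so the pad decodes to exactly one char.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// Hypothetical helper illustrating the padding workaround; not part of the JDK.
class PaddedDecode {
    // Assumption: an ASCII space decodes to exactly one char in the charset used.
    private static final byte PAD = (byte) ' ';

    static String decodePadded(Charset cs, ByteBuffer in) {
        if (!in.hasRemaining())
            return "";                       // nothing to decode, nothing to pad
        // Copy the input and append one pad byte so that at least two bytes
        // are handed to the decoder.
        ByteBuffer padded = ByteBuffer.allocate(in.remaining() + 1);
        padded.put(in.duplicate());          // duplicate() leaves the caller's position alone
        padded.put(PAD);
        padded.flip();
        String s = cs.decode(padded).toString();
        // Discard the single char produced by the pad byte.
        return s.substring(0, s.length() - 1);
    }

    public static void main(String[] args) {
        Charset cs = Charset.forName("SJIS");
        System.out.println("roundTrip(A)=" + decodePadded(cs, cs.encode("A")));
        System.out.println("roundTrip(ABC)=" + decodePadded(cs, cs.encode("ABC")));
    }
}
```

Padding every non-empty input, rather than only one-byte inputs, keeps the helper simple; the extra char is always the last one and is dropped.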

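EVALUATION
The encode/decode convenience methods in Charset-X-Coder can fail when in.remaining() == 1 and average$ItypesPerOtype$ is less than 1: the initial output size n = (int)(in.remaining() * average$ItypesPerOtype$()) truncates to 0, and the method then returns the empty buffer without converting anything.
###@###.### 2005-2-08 07:14:46 GMT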
2005-02-08
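The truncation described in the evaluation can be checked with a little arithmetic. The sketch below is illustrative only: the class InitialSize and the method initialCapacity are made-up names, and the average of 0.5 is an assumed value standing in for a decoder whose average is below 1. The last line prints the value that the running JDK actually reports for SJIS.

```java
import java.nio.charset.Charset;

// Hypothetical illustration of the sizing arithmetic; not JDK code.
class InitialSize {
    // Same shape as the sizing expression in Charset-X-Coder:
    // n = (int)(in.remaining() * average$ItypesPerOtype$())
    static int initialCapacity(int remaining, float average) {
        return (int) (remaining * average);
    }

    public static void main(String[] args) {
        float assumedAverage = 0.5f;  // assumption for illustration only
        // One input byte: 1 * 0.5 = 0.5, which the cast truncates to 0.
        System.out.println("n for 1 byte  = " + initialCapacity(1, assumedAverage));
        // Three input bytes: 3 * 0.5 = 1.5, which the cast truncates to 1.
        System.out.println("n for 3 bytes = " + initialCapacity(3, assumedAverage));
        // Value reported by the running JDK for the decoder used in the report.
        System.out.println("averageCharsPerByte = "
                + Charset.forName("SJIS").newDecoder().averageCharsPerByte());
    }
}
```

With n equal to 0 the unpatched method returns immediately, which matches the empty result for "A"; with n equal to 1 it enters the conversion loop and grows the buffer as needed, which matches the correct result for "ABC".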

SUGGESTED FIX
--- /tmp/geta10520 2005-02-10 00:45:59.407012600 -0800
+++ Charset-X-Coder.java 2005-02-09 23:15:11.996877000 -0800
@@ -740,30 +740,31 @@
      * position cannot be mapped to an equivalent $otype$ sequence and
      * the current unmappable-character action is {@link
      * CodingErrorAction#REPORT}
      */
     public final $Otype$Buffer $code$($Itype$Buffer in)
         throws CharacterCodingException
     {
+        int remaining = in.remaining();
+        if (remaining == 0)
+            return $Otype$Buffer.allocate(0);
         int n = (int)(in.remaining() * average$ItypesPerOtype$());
         $Otype$Buffer out = $Otype$Buffer.allocate(n);
-        if (n == 0)
-            return out;
         reset();
         for (;;) {
             CoderResult cr;
             if (in.hasRemaining())
                 cr = $code$(in, out, true);
             else
                 cr = flush(out);
             if (cr.isUnderflow())
                 break;
             if (cr.isOverflow()) {
-                n *= 2;
+                n = 2*n + 1;    // Ensure progress; n might be 0!
                 $Otype$Buffer o = $Otype$Buffer.allocate(n);
                 out.flip();
                 o.put(out);
                 out = o;
                 continue;
             }
             cr.throwException();
###@###.### 2005-2-10 08:48:27 GMT
2005-02-08
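For readers without access to the template, the idea of the patch can be approximated with the public CharsetDecoder API. This is a sketch under assumptions, not the JDK implementation: the class GrowingDecode and the method decodeAll are hypothetical names, and unlike the patch it always calls flush once decoding underflows, which is a choice made here. It keeps the two points of the fix: return early only when the input is empty, and grow with 2*n + 1 so that a zero-sized buffer still grows.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;

// Hypothetical stand-alone version of the patched convenience method.
class GrowingDecode {
    static CharBuffer decodeAll(CharsetDecoder dec, ByteBuffer in)
            throws CharacterCodingException {
        int remaining = in.remaining();
        if (remaining == 0)
            return CharBuffer.allocate(0);      // only empty input yields empty output
        int n = (int) (remaining * dec.averageCharsPerByte());  // may legitimately be 0
        CharBuffer out = CharBuffer.allocate(n);
        dec.reset();
        for (;;) {
            CoderResult cr = in.hasRemaining()
                    ? dec.decode(in, out, true)
                    : CoderResult.UNDERFLOW;
            if (cr.isUnderflow())
                cr = dec.flush(out);            // sketch's choice: always flush at end of input
            if (cr.isUnderflow())
                break;
            if (cr.isOverflow()) {
                n = 2 * n + 1;                  // ensures progress even when n is 0
                CharBuffer bigger = CharBuffer.allocate(n);
                out.flip();
                bigger.put(out);                // carry over what was decoded so far
                out = bigger;
                continue;
            }
            cr.throwException();                // malformed or unmappable input
        }
        out.flip();
        return out;
    }

    public static void main(String[] args) throws CharacterCodingException {
        Charset cs = Charset.forName("SJIS");
        System.out.println("roundTrip(A)=" + decodeAll(cs.newDecoder(), cs.encode("A")));
        System.out.println("roundTrip(ABC)=" + decodeAll(cs.newDecoder(), cs.encode("ABC")));
    }
}
```

With the old growth step n *= 2, a buffer that starts at size 0 would stay at size 0 and the loop would never make progress, which is why the patch changes the step along with the early return.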