JDK-4464351 : (cs) Charset API does not distinguish malformed input from unmappable chars
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio
  • Affected Version: 1.4.0
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2001-05-30
  • Updated: 2001-10-03
  • Resolved: 2001-10-03
Other
1.4.0 beta3: Fixed
Related Reports
Relates :  
Description
The encode() and decode()
methods of the nio converters handle MalformedInputException and
UnmappableCharacterException identically when performing substitution.
Calling the substitute() method sets the substitution for both cases.

In the Unicode spec, MalformedInput is called "illegal code unit
sequences".  Conforming Unicode implementations are prohibited from
treating illegal code unit sequences as characters, and the only options
which Unicode allows are to remove the illegal sequence or to reject it.
This is different from undefined characters, for which it is perfectly
valid to do substitution.
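
The two permitted treatments of an illegal sequence (remove it or reject it) can be illustrated with a minimal sketch. It assumes the java.nio.charset API as found in later JDK releases (CharsetDecoder.onMalformedInput, CharsetDecoder.onUnmappableCharacter, CodingErrorAction, StandardCharsets), not the 1.4.0 beta API this report was filed against. The class and method names are hypothetical and exist only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the JDK.
class MalformedInputPolicy {
    // 'a', then 0xFF (a byte that is never valid in UTF-8), then 'b'.
    static final byte[] ILLEGAL = { 0x61, (byte) 0xFF, 0x62 };

    // Option 1: remove the illegal sequence (IGNORE), while still
    // substituting for unmappable characters (REPLACE).
    static String decodeDroppingIllegal(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.IGNORE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            // Not expected with IGNORE/REPLACE; rethrow unchecked.
            throw new IllegalStateException(e);
        }
    }

    // Option 2: reject the illegal sequence (REPORT). Returns true when
    // the decoder signals the input specifically as malformed.
    static boolean rejectsAsMalformed(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return false;
        } catch (MalformedInputException e) {
            return true;
        } catch (CharacterCodingException e) {
            return false; // some other coding error, e.g. unmappable
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeDroppingIllegal(ILLEGAL)); // prints "ab"
        System.out.println(rejectsAsMalformed(ILLEGAL));    // prints "true"
    }
}
```

Neither policy ever turns the illegal byte into a character, which is the behavior the Unicode requirement above asks for.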

Comments
CONVERTED DATA BugTraq+ Release Management Values COMMIT TO FIX: generic FIXED IN: merlin-beta3 INTEGRATED IN: merlin-beta3
14-06-2004

SUGGESTED FIX To conform to the Unicode specification, the nio APIs should be modified so that substitution can be specified independently for MalformedInput and UnmappableCharacter. One way to implement this is to set the default substitution for MalformedInput to a zero-length String or byte array. Note that for this change to be effective, it is necessary to add the size field to UnmappableCharacterException and to fix the various converters to throw the correct exception.
11-06-2004
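
For illustration, independent handling of the two error kinds on the encoding side might look like the following sketch. It uses the java.nio.charset API of later JDK releases (CharsetEncoder.onMalformedInput, CharsetEncoder.onUnmappableCharacter, CodingErrorAction), which is an assumption relative to the API under discussion here. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the JDK.
class SeparateErrorActions {
    // Malformed input is rejected, unmappable characters are substituted:
    // the two policies are configured independently of each other.
    static CharsetEncoder newAsciiEncoder() {
        return StandardCharsets.US_ASCII.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
    }

    // Encodes to US-ASCII and returns the resulting bytes as text.
    static String encodeToText(String input) {
        try {
            ByteBuffer out = newAsciiEncoder().encode(CharBuffer.wrap(input));
            return StandardCharsets.US_ASCII.decode(out).toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("input could not be encoded", e);
        }
    }

    // Returns true when the encoder reports the input as malformed.
    static boolean isRejectedAsMalformed(String input) {
        try {
            newAsciiEncoder().encode(CharBuffer.wrap(input));
            return false;
        } catch (MalformedInputException e) {
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // U+00E9 is a valid character with no US-ASCII mapping: substituted.
        System.out.println(encodeToText("caf\u00e9"));            // prints "caf?"
        // A lone high surrogate is an illegal code unit sequence: rejected.
        System.out.println(isRejectedAsMalformed("a\uD800b"));    // prints "true"
    }
}
```

The lone surrogate is never substituted, while the well-formed but unmappable character receives the encoder's default replacement.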

EVALUATION Fixed as part of the Charset API redesign (4503732). -- ###@###.### 2001/10/3
10-09-0169