Bug ID: JDK-8217097 Correct UnicodeDecoder U+FFFE handling

Type: CSR
Component: core-libs
Sub-Component: java.nio.charsets

Priority: P3
Status: Closed
Resolution: Approved
Fix Versions: 13

Submitted: 2019-01-15
Updated: 2019-01-15
Resolved: 2019-01-15

Summary
-------

Correct how UnicodeDecoder subclasses handle the U+FFFE code point when it appears in the middle of the input buffer.

Problem
-------

Currently, UnicodeDecoder treats U+FFFE in the middle of a string as "malformed" because it is a noncharacter. This was correct up until Unicode 7. However, Unicode 7 includes Corrigendum 9 (http://www.unicode.org/versions/corrigendum9.html), which changed the definition of noncharacters so that they are no longer barred from interchange. UnicodeDecoder's behavior should be modified to conform to it.

Solution
--------

Remove the code in UnicodeDecoder that detects U+FFFE in the middle of the input and returns a "malformed" CoderResult, so that the UTF-16 decoders (StandardCharsets.UTF_16, UTF_16LE and UTF_16BE) pass the code point through.
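
The change can be observed with a small program such as the sketch below. It is illustrative only and not part of the fix: the class name `NoncharacterDecodeDemo` and the helper `decodeStrict` are hypothetical, and the helper simply configures a standard `CharsetDecoder` to report malformed input instead of replacing it. On a JDK that contains this fix, the decoder is expected to return U+FFFE unchanged; on earlier releases the same input should be reported as malformed, in which case the helper returns `null`.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class NoncharacterDecodeDemo {

    // Hypothetical helper: decodes strictly and returns null when the
    // decoder reports malformed or unmappable input.
    static String decodeStrict(byte[] bytes, Charset cs) {
        try {
            return cs.newDecoder()
                     .onMalformedInput(CodingErrorAction.REPORT)
                     .onUnmappableCharacter(CodingErrorAction.REPORT)
                     .decode(ByteBuffer.wrap(bytes))
                     .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // 'A', U+FFFE, 'B' encoded as big-endian UTF-16.
        byte[] input = {0x00, 0x41, (byte) 0xFF, (byte) 0xFE, 0x00, 0x42};

        String decoded = decodeStrict(input, StandardCharsets.UTF_16BE);
        if (decoded == null) {
            System.out.println("malformed input reported (behavior before the fix)");
        } else {
            // Print each decoded UTF-16 code unit in U+XXXX form.
            for (char c : decoded.toCharArray()) {
                System.out.printf("U+%04X%n", (int) c);
            }
        }
    }
}
```

Callers that decode through `new String(bytes, charset)` never see an error in either case, because that constructor replaces malformed input with the charset's replacement string rather than reporting it.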

Specification
-------------

As required by Corrigendum 9, which is incorporated in Unicode 7, U+FFFE is passed through as a code point instead of being reported as malformed input.

I see a release note is already planned. Moving to Approved.

15-01-2019