JDK-8216140 : Correct UnicodeDecoder U+FFFE handling
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio.charsets
  • Affected Version: 11,13
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2019-01-03
  • Updated: 2019-06-18
  • Resolved: 2019-01-11
Fix Version
JDK 13: 13 b04 (Fixed)
Related Reports
CSR :  
Relates :  
Sub Tasks
JDK-8216588 :  
Discussion thread: https://mail.openjdk.java.net/pipermail/core-libs-dev/2018-December/057639.html

Currently, UnicodeDecoder treats U+FFFE in the middle of a string as "malformed" because it is a noncharacter. This was correct up to Unicode 7. However, Unicode 7 includes Corrigendum #9 (http://www.unicode.org/versions/corrigendum9.html), which changed the definition of noncharacters. UnicodeDecoder's behavior should be modified to conform to it.
As Unicode 7 is supported in JDK 11 (since JDK 9 / JEP 227, AFAIU), should it be fixed there as well?
Added "11" to the affected versions.
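
The behavior described above can be observed with a small probe. This is only a sketch: the class name `ReversedBomProbe` and the choice of UTF-16BE as the charset are assumptions made for illustration, not code from the JDK or from the fix. The outcome depends on the JDK build, so no particular output is claimed here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical probe class; not part of the JDK.
final class ReversedBomProbe {

    // Decodes "A", U+FFFE, "B" from UTF-16BE bytes and reports what the decoder did.
    static String probe() {
        byte[] bytes = {0x00, 0x41, (byte) 0xFF, (byte) 0xFE, 0x00, 0x42};

        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_16BE.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return "decoded: " + decoder.decode(ByteBuffer.wrap(bytes));
        } catch (CharacterCodingException e) {
            return "rejected: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(probe());
    }
}
```

A build that still treats U+FFFE as malformed should take the `rejected` branch, while a build containing this fix should take the `decoded` branch.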

In JDK-8150449, the suggested fix is to remove the code that treats U+FFFE as malformed. Another option would be to make it unmappable instead of malformed, but that is not only incompatible, it also still does not conform to the revised recommendation from Unicode, because the other noncharacter code points would still pass through. (From the corrigendum) --- Noncharacters consist of the values U+nFFFE and U+nFFFF (where n is from 0 to 10₁₆) and the values U+FDD0..U+FDEF. ---
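
The noncharacter definition quoted from the corrigendum can be written as a small predicate, which shows that U+FFFE is only one of 66 such code points. This is a minimal sketch; the class `Noncharacters` and the method `isNoncharacter` are hypothetical names and do not exist in the JDK.

```java
// Hypothetical helper; not part of the JDK.
final class Noncharacters {

    // True for U+FDD0..U+FDEF and for U+nFFFE / U+nFFFF in planes 0 through 0x10.
    static boolean isNoncharacter(int codePoint) {
        if (codePoint < 0 || codePoint > 0x10FFFF) {
            return false; // outside the Unicode code space
        }
        if (codePoint >= 0xFDD0 && codePoint <= 0xFDEF) {
            return true;
        }
        // The low 16 bits are FFFE or FFFF, regardless of plane.
        return (codePoint & 0xFFFE) == 0xFFFE;
    }

    // Counts the noncharacters across the whole code space.
    static int count() {
        int n = 0;
        for (int cp = 0; cp <= 0x10FFFF; cp++) {
            if (isNoncharacter(cp)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(count()); // 32 + 17 * 2 = 66
    }
}
```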

As of JDK8, this request was closed as "not an issue."