JDK-4508058 : UTF-8 encoding does not recognize initial BOM

Details
Type:
Enhancement
Submit Date:
2001-09-27
Status:
Closed
Updated Date:
2006-02-18
Project Name:
JDK
Resolved Date:
2006-02-18
Component:
core-libs
OS:
windows_nt,generic
Sub-Component:
java.nio.charsets
CPU:
other,generic
Priority:
P3
Resolution:
Won't Fix
Affected Versions:
1.4.0,1.4.2_05
Fixed Versions:

Related Reports
Relates:
Relates:

Sub Tasks

Description
A UTF-8 stream can optionally begin with a byte order mark (see, for example, http://www.unicode.org.unicode/faq/utf_bom.html).  This is the character U+FEFF, which is represented as EF BB BF in UTF-8. Java's UTF-8 encoding does not recognize this character as a BOM, though; the result of reading such a stream is a sequence of characters beginning with U+FEFF.
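
The behavior described above can be reproduced with a few bytes. The following is only a minimal sketch; the class name Utf8BomDemo and the helper decode are made up for illustration and are not part of the JDK.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK.
class Utf8BomDemo {
    // Decodes the bytes with the JDK's built-in UTF-8 charset.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // EF BB BF is the UTF-8 encoding of U+FEFF; "hi" follows it.
        byte[] data = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        String s = decode(data);
        // The BOM survives decoding as the first character.
        System.out.println(s.length());                       // 3, not 2
        System.out.println(Integer.toHexString(s.charAt(0))); // feff
    }
}
```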

                                    

Comments
EVALUATION

We implemented this RFE on the assumption that the change would not
break existing real-world applications. That assumption is obviously not
true; see #6378911. We have decided to back out the change and close this
RFE as "will not fix" for compatibility reasons.
                                     
2006-02-18
EVALUATION

For Mustang.
                                     
2005-09-27
PUBLIC COMMENTS

Java does not recognize the optional BOM which can begin a UTF-8 stream.  It treats the BOM as if it were the initial character of the stream.
                                     
2004-09-29
WORK AROUND

Application code must recognize and skip the BOM itself.
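
One possible way for application code to do this is to peek at the first decoded character and discard it if it is U+FEFF. This is a sketch under assumptions, not an official fix: the class BomSkippingReader and its methods open and readAll are hypothetical names, and the input is assumed to be UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the JDK.
class BomSkippingReader {
    // Returns a UTF-8 reader positioned after a leading U+FEFF, if one is present.
    static Reader open(InputStream in) throws IOException {
        PushbackReader reader = new PushbackReader(
                new InputStreamReader(in, StandardCharsets.UTF_8), 1);
        int first = reader.read();
        // Push back anything that is not a BOM; do nothing at end of stream.
        if (first != -1 && first != 0xFEFF) {
            reader.unread(first);
        }
        return reader;
    }

    // Convenience wrapper for in-memory data: decodes everything after the optional BOM.
    static String readAll(byte[] bytes) {
        try (Reader r = open(new ByteArrayInputStream(bytes))) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Only a U+FEFF at the very start of the stream is dropped; later occurrences are passed through unchanged.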
                                     
2004-09-29
SUGGESTED FIX

Recognize the BOM, just as UTF-16 does.
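
For comparison, the JDK's UTF-16 charset consumes a leading byte order mark while decoding. The sketch below illustrates that; the class name Utf16BomDemo and the helper decode are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK.
class Utf16BomDemo {
    // Decodes the bytes with the JDK's built-in UTF-16 charset.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        // FE FF is the big-endian BOM, followed by "hi" in UTF-16BE.
        byte[] data = {(byte) 0xFE, (byte) 0xFF, 0x00, 'h', 0x00, 'i'};
        String s = decode(data);
        // The BOM selects the byte order and is not part of the result.
        System.out.println(s.length()); // 2
        System.out.println(s);          // hi
    }
}
```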
                                     
2004-09-29


