Other |
---|
5.0 b54 Fixed |
Name: nl37777 Date: 08/29/2003

The Java VM and the various interfaces attached to it (such as the Java Native Interface) have always used a modified form of the standard UTF-8 encoding. The same encoding has been used in the java.io.DataInput and DataOutput classes, but there it has long been documented as "Java modified UTF-8". Since Java modified UTF-8 and standard UTF-8 are incompatible, it is necessary to clarify throughout the Java platform specifications which interfaces use which encoding.

Also, the descriptions in the Java Virtual Machine Specification and some other documentation make it sound as if Java modified UTF-8 could not encode supplementary characters. In fact, it appears that all parts of the J2SDK that deal with Java modified UTF-8 handle supplementary characters just fine - they simply represent the surrogate pair of the character's UTF-16 representation as two three-byte sequences.

This needs to be better documented at least in the following specifications:
- Java Virtual Machine Specification
- Java Native Interface Specification
- Object Serialization Specification
- Java Platform Debugger Architecture
- Java Virtual Machine Profiler Interface
- Java Virtual Machine Tool Interface

This is part of Tiger release driver 4533872.
======================================================================
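The difference described above can be observed with the public java.io.DataOutputStream API. The following is a minimal sketch, not part of the original report: the class name ModifiedUtf8Demo and its helper methods are hypothetical, and U+10400 is an arbitrarily chosen supplementary character. It encodes the same string once as standard UTF-8 and once through writeUTF, which emits Java modified UTF-8 preceded by a two-byte length.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; the names here are illustrative only.
class ModifiedUtf8Demo {

    // Encodes s with DataOutputStream.writeUTF (Java modified UTF-8)
    // and strips the two-byte length prefix that writeUTF prepends.
    static byte[] modifiedUtf8(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Formats bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        // U+10400 is stored in a String as the surrogate pair D801 DC00.
        String s = new String(Character.toChars(0x10400));

        // Standard UTF-8: a single four-byte sequence.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8)));
        // prints "F0 90 90 80"

        // Java modified UTF-8: each surrogate becomes a three-byte sequence.
        System.out.println(hex(modifiedUtf8(s)));
        // prints "ED A0 81 ED B0 80"
    }
}
```

The six-byte result is not valid standard UTF-8, which illustrates why the specifications need to state explicitly which of the two encodings an interface uses.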