Name: mf23781 Date: 12/08/99
Java. At least the Sun JDK 1.2 UTF-8 converter will gladly convert
0xED 0xA0 0x88 0xED 0xBD 0x85 (bad UTF-8 for U+12345: each UTF-16
surrogate is encoded as its own three-byte sequence) to
"\uD808\uDF45" (correct UTF-16 for the same).
> Are such programs considered useful or harmful?
Good question! Such a converter can "repair" some data that otherwise
wouldn't work, but security and reliability often depend on some things
*not* working when they shouldn't. The same concern applies to
non-minimal UTF-8 encodings (of the ASCII NUL character, for instance).
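
The point about things *not* working can be illustrated with a strict
decoder. This is only a minimal sketch: it assumes a JDK that has
java.nio and StandardCharsets (Java 7 or later, so not the JDK 1.2
discussed here), and the class and method names are made up for
illustration. With CodingErrorAction.REPORT the decoder throws instead
of repairing or silently dropping input, so on current JDKs both the
non-minimal NUL encoding 0xC0 0x80 and the surrogate-by-surrogate form
are rejected:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the JDK or of this report.
final class StrictUtf8 {

    // Returns true only if the decoder accepts the whole input as well-formed UTF-8.
    static boolean isWellFormed(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)      // throw, do not replace
                    .onUnmappableCharacter(CodingErrorAction.REPORT) // throw, do not replace
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] overlongNul = {(byte) 0xC0, (byte) 0x80};
        byte[] surrogateForm = {(byte) 0xED, (byte) 0xA0, (byte) 0x88,
                                (byte) 0xED, (byte) 0xBD, (byte) 0x85};
        byte[] fourByteForm = {(byte) 0xF0, (byte) 0x92, (byte) 0x8D, (byte) 0x85};

        System.out.println(isWellFormed(overlongNul));
        System.out.println(isWellFormed(surrogateForm));
        System.out.println(isWellFormed(fourByteForm));
    }
}
```

Whether rejecting such input is the right policy depends on the
application; the sketch only shows one way to ask for it.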
BTW, I had thought that Java had no support at all for UTF-16
surrogate pairs, but I just verified that JDK 1.2 will correctly
transcode "\uD808\uDF45" to UTF-8 0xF0 0x92 0x8D 0x85. Converting back,
however, yields nothing: no characters, no exception, nothing at all.
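
For anyone who wants to check the byte sequences quoted above, here is
a minimal sketch. The class and its method names are hypothetical, and
it relies on StandardCharsets and Character.toChars, which do not exist
in JDK 1.2, so it says nothing about how the 1.2 converter is
implemented. It derives the surrogate pair for U+12345, encodes it once
properly (four bytes) and once surrogate by surrogate (six bytes), and
performs the round trip that fails in the report:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; the names are not from the JDK or from this report.
final class Utf8Forms {

    // Applies the three-byte pattern 1110xxxx 10xxxxxx 10xxxxxx to one 16-bit
    // code unit (meant for 0x0800..0xFFFF), deliberately without a surrogate check.
    static byte[] threeByteForm(char unit) {
        return new byte[] {
            (byte) (0xE0 | (unit >> 12)),
            (byte) (0x80 | ((unit >> 6) & 0x3F)),
            (byte) (0x80 | (unit & 0x3F))
        };
    }

    // Ill-formed variant: each half of the surrogate pair is encoded on its own.
    static byte[] surrogatesAsUtf8(int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        char[] units = Character.toChars(codePoint); // high surrogate, low surrogate
        byte[] hi = threeByteForm(units[0]);
        byte[] lo = threeByteForm(units[1]);
        return new byte[] {hi[0], hi[1], hi[2], lo[0], lo[1], lo[2]};
    }

    // Well-formed UTF-8 as produced by the platform encoder.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Platform decoder; malformed input is replaced with U+FFFD, not reported.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = new String(Character.toChars(0x12345));
        System.out.println(hex(utf8(s)));                   // 0xF0 0x92 0x8D 0x85
        System.out.println(hex(surrogatesAsUtf8(0x12345))); // 0xED 0xA0 0x88 0xED 0xBD 0x85
        System.out.println(fromUtf8(utf8(s)).equals(s));    // round trip on the running JDK
    }
}
```

On current JDKs the round trip in the last line returns the original
string; the empty result described above is specific to JDK 1.2.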
(Review ID: 98805)
======================================================================