Character set issues have not been addressed when dealing with URLs.
In particular, Java tends to use Unicode strings, whereas protocols
such as HTTP and FTP use byte sequences. I believe that exceptions
should be thrown for out-of-range values, or the condition handled
gracefully.
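One way the "throw an exception" option could look is sketched below. It is only an illustration, not the JDK's behaviour: the class `StrictUrlBytes` and the method `toLatin1` are hypothetical names, the choice of ISO-8859-1 as the wire encoding is an assumption, and it relies on the `java.nio.charset` API found in later JDK releases.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictUrlBytes {
    // Hypothetical helper: converts a string to single bytes and fails
    // loudly if any character does not fit in ISO-8859-1 (assumed encoding).
    static byte[] toLatin1(String s) {
        CharsetEncoder enc = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            ByteBuffer buf = enc.encode(CharBuffer.wrap(s));
            byte[] out = new byte[buf.remaining()];
            buf.get(out);
            return out;
        } catch (CharacterCodingException e) {
            // Report the problem instead of dropping the top byte.
            throw new IllegalArgumentException(
                    "character outside 0x00-0xFF in: " + s, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toLatin1("/index.html").length);
        try {
            toLatin1("/\u20ac"); // euro sign does not fit in one byte
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```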
Specific code I have seen problems with:
sun/net/www/protocol/http/HttpURLConnection.java (1.21) uses a
java/io/PrintStream for HTTP GET requests. This will cause the top
byte of each 16-bit character in the URL to be silently dropped.
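The effect of dropping the top byte can be reproduced in isolation. The sketch below does not call PrintStream itself; `TopByteDemo` and `lowBytesOnly` are hypothetical names, and the narrowing cast is assumed to stand in for the truncation described above.

```java
final class TopByteDemo {
    // Hypothetical helper: keeps only the low 8 bits of each 16-bit char,
    // which is the kind of truncation the report describes.
    static byte[] lowBytesOnly(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = (byte) s.charAt(i); // narrowing cast discards the top byte
        }
        return out;
    }

    public static void main(String[] args) {
        // U+0141 (L with stroke) has low byte 0x41, the ASCII code for 'A'.
        byte[] sent = lowBytesOnly("/\u0141");
        System.out.println((char) sent[1]); // prints "A"
    }
}
```

The request would therefore go out for a different, valid-looking path, with no indication that anything was lost.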
The encoder java/net/URLEncoder.java (1.4) silently ignores the
top byte in the encode() method. This is used for POST requests.
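For comparison, an encoder can avoid the loss by first converting each character to bytes in a named encoding and then percent-escaping every byte. The sketch below assumes UTF-8 as that encoding and uses the two-argument `URLEncoder.encode(String, String)` overload available in later JDK releases, not the revision cited above; `FormEncodeDemo` and `encodeUtf8` are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class FormEncodeDemo {
    // Hypothetical wrapper: percent-encodes using UTF-8 (assumed encoding)
    // so that no part of a 16-bit character is discarded.
    static String encodeUtf8(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The euro sign (U+20AC) is three bytes in UTF-8: E2 82 AC.
        System.out.println(encodeUtf8("\u20ac")); // prints "%E2%82%AC"
    }
}
```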