JDK-4041072 : Character set issues not addressed with URL's
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.net
  • Affected Version: 1.1,1.1.3
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 1997-03-25
  • Updated: 1999-01-15
  • Resolved: 1999-01-15
Other
1.2.0  1.2beta4  Fixed
Related Reports
Duplicate :  
Relates :  
Description
Character set issues have not been addressed when dealing with URLs.
In particular, Java tends to use (Unicode) strings, whereas protocols
(HTTP, FTP) use byte sequences.  I believe that exceptions should be
thrown for out-of-range values, or the condition handled
gracefully.

Specific code I have seen problems with:
  sun/net/www/protocol/http/HttpURLConnection.java (1.21) uses a
  java/io/PrintStream for HTTP GET requests.  This will cause the top
  byte of characters in the URL to be silently dropped.

The encoder java/net/URLEncoder.java (1.4) silently ignores the 
top byte in the encode() method.  This is used for POST requests. 
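
The truncation described above can be illustrated with a minimal sketch. It does not reproduce the actual PrintStream or URLEncoder source; the class and method names (TopByteDemo, lowBytesOnly, lowBytesChecked) are hypothetical and only mimic the reported behaviour, plus the stricter "throw an exception" alternative the submitter asks for.

```java
import java.nio.charset.StandardCharsets;

class TopByteDemo {
    /** Keeps only the low 8 bits of each char, mimicking the reported behaviour. */
    static byte[] lowBytesOnly(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = (byte) s.charAt(i); // the top byte is discarded by the cast
        }
        return out;
    }

    /** Hypothetical stricter variant: rejects chars that do not fit in one byte. */
    static byte[] lowBytesChecked(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0xFF) {
                throw new IllegalArgumentException(
                        "character out of range at index " + i);
            }
            out[i] = (byte) c;
        }
        return out;
    }

    public static void main(String[] args) {
        // U+0141 loses its top byte (0x01) and becomes 0x41, i.e. 'A'.
        byte[] sent = lowBytesOnly("/\u0141");
        System.out.println(new String(sent, StandardCharsets.ISO_8859_1)); // prints "/A"
    }
}
```

With lowBytesOnly, a path containing U+0141 is sent as a different, valid-looking path; lowBytesChecked fails loudly instead.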
	


Comments
CONVERTED DATA
BugTraq+ Release Management Values
COMMIT TO FIX: generic
FIXED IN: 1.2beta4
INTEGRATED IN: 1.2beta4
14-06-2004

EVALUATION
The HTTP specification is silent on this. More investigation is needed.
benjamin.renaud@Eng 1998-02-12

The outer layer of encoding has been removed from URLEncoder. HttpURLConnection will remain unchanged for performance reasons, at least until UTF-8 becomes the officially accepted encoding for all URLs.
michael.mccloskey@eng 1998-05-04
04-05-1998
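
For comparison, here is a minimal sketch of what UTF-8 based URL encoding looks like. It assumes a later JDK in which the URLEncoder.encode(String, String) overload is available; that overload is not part of the 1.1 API this report concerns, and the class name Utf8QueryDemo and helper encodeUtf8 are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class Utf8QueryDemo {
    /** Percent-encodes a value using the UTF-8 bytes of each character. */
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // U+00E9 becomes %C3%A9 and U+0141 becomes %C5%81; no byte is dropped.
        System.out.println(encodeUtf8("caf\u00e9 \u0141")); // prints "caf%C3%A9+%C5%81"
    }
}
```

Each non-ASCII character is expanded to its full UTF-8 byte sequence before escaping, so the receiver can recover the original string if it also assumes UTF-8.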