JDK-4239597 : java.net.URLDecode does not handle double byte characters
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.net
  • Affected Version: 1.2.1
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • OS: generic
  • CPU: generic
  • Submitted: 1999-05-19
  • Updated: 2000-04-24
  • Resolved: 2000-04-24
Related Reports
Duplicate :  
Description

Name: skT88420			Date: 05/19/99


A URL is encoded with java.net.URLEncoder, which uses the client
OS's default character encoding.  On the server side,
java.net.URLDecoder parses the URL and makes one Unicode
character out of each byte.  If the client is running a Japanese OS
with the SJIS encoding and the user types the Japanese word setsume,
java.net.URLEncoder converts the word to %90%D8%96%BE.
On the server, java.net.URLDecoder parses the URL and makes a
four-character Unicode string out of the four bytes (not a
two-character string).
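
The symptom can be sketched with the two-argument URLDecoder.decode(String, String)
overload, which was added in a later release than the one this report is filed
against. This is only an illustration: ISO-8859-1 stands in for the
"one character per byte" behavior described above, and the class name
DecodeByCharset and helper decodeAs are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class DecodeByCharset {
    // Hypothetical helper: percent-decodes the input and interprets the
    // resulting bytes in the named charset.
    static String decodeAs(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("charset not available: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String encoded = "%90%D8%96%BE"; // the four bytes quoted in the report

        // ISO-8859-1 maps every byte to exactly one char, mimicking the
        // byte-per-character decoding the report complains about.
        String perByte = decodeAs(encoded, "ISO-8859-1");

        // Read as Shift_JIS, the same four bytes form two double-byte characters.
        String sjis = decodeAs(encoded, "Shift_JIS");

        System.out.println("per-byte length:  " + perByte.length());
        System.out.println("Shift_JIS length: " + sjis.length());
    }
}
```

The sketch assumes the Shift_JIS charset is available in the running JDK; the decoder still has to be told the encoding, which is the first problem listed below.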

There are two problems.
1) The URL is encoded using the OS's default encoding, but
that encoding is not sent to the server, so the server side
does not know how to decode the string properly.

2) The java.net.URLDecoder class decodes the string
as if every byte were a character.  It does not handle double- or
triple-byte encodings (it must take the encoding
of the URL into consideration).
(Review ID: 83263) 
======================================================================

Comments
WORK AROUND

Name: skT88420			Date: 05/19/99

I always encode URLs to UTF-8 and decode them from UTF-8.
See java.io.DataInputStream.readUTF and java.io.DataOutputStream.writeUTF.
======================================================================
11-06-2004
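
With the charset-aware overloads URLEncoder.encode(String, String) and
URLDecoder.decode(String, String), which appeared in later releases, the
"always use UTF-8" workaround above might look like the sketch below. The
class name Utf8UrlRoundTrip and the sample string are illustrative
assumptions, not taken from the report.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class Utf8UrlRoundTrip {
    // Percent-encodes the text after converting it to UTF-8 bytes,
    // independent of the platform default encoding.
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 must be supported", e);
        }
    }

    // Percent-decodes the input and interprets the bytes as UTF-8.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 must be supported", e);
        }
    }

    public static void main(String[] args) {
        // Illustrative two-character Japanese string (U+8AAC U+660E),
        // written with escapes so the source file encoding does not matter.
        String original = "\u8AAC\u660E";

        String encoded = encode(original);
        String decoded = decode(encoded);

        System.out.println(encoded);                  // %E8%AA%AC%E6%98%8E
        System.out.println(decoded.equals(original)); // true
    }
}
```

Because both sides agree on UTF-8 in advance, the server does not need to know the client's OS encoding.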

EVALUATION mayank.upadhyay@eng 2000-02-23 Will be fixed to do UTF-8 encoding.
23-02-2000