JDK-4616184 : java.net.URLEncoder.encode(String) doesn't encode to RFC2396 standard

Details
Type: Enhancement
Status: Closed
Resolution: Not an Issue
Priority: P4
Project Name: JDK
Component: core-libs
Sub-Component: java.net
OS: generic
CPU: generic
Affected Versions: 1.4.0
Fixed Versions:
Submit Date: 2001-12-19
Updated Date: 2002-04-17
Resolved Date: 2002-04-17

Description

Name: nt126004			Date: 12/19/2001


java version "1.4.0-beta3"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-beta3-b84)
Java HotSpot(TM) Client VM (build 1.4.0-beta3-b84, mixed mode)

Using the URLEncoder class the other day, I noticed that 'space' is encoded
to '+' instead of '%20'.  Unless I'm reading the RFC wrong, shouldn't it be
encoded to '%20'?

The relevant information is in sections 2.4.1 and 2.4.3 of RFC 2396:
http://www.ietf.org/rfc/rfc2396.txt

Here's a comment from the java.net.URLEncoder class, lines 45 and 46, that
cites an O'Reilly book as the source of the list of special characters:


    /* The list of characters that are not encoded have been determined by
       referencing O'Reilly's "HTML: The Definitive Guide" (page 164). */


  Interestingly enough, the 3rd edition of this text indicates that '+' should be
encoded to '%20' (p. 195).

Although you would probably also want to remove the other code in the class that
performs the 'space' to '+' substitution, the fix is as simple as commenting out
line 60 of the class:

	//dontNeedEncoding.set(' '); /* encoding a space to a + is done in the encode() method */


Here is a simple test program that shows this issue.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
import java.net.URLEncoder;

public class Test
{
   public static void main(String[] args)
   {
      System.out.println("Should see %20 here: "
            + URLEncoder.encode("space here"));

   }
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
(Review ID: 137273) 
======================================================================

                                    

Comments
EVALUATION

Actually, this is not a bug. This class implements the recommendations
in the HTML specifications for how to encode URLs in HTML forms.
It is not intended for other uses.

This is specified in various places, including HTML 4.01
section 17.13.4, and also RFC 1866 (which has been superseded
by the W3C HTML recommendations).

###@###.### 2002-04-17
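
The distinction drawn in the evaluation can be illustrated with a short
sketch. It assumes that the caller wants RFC 2396 style escaping of a URI
path rather than HTML form encoding, and it uses the multi-argument
java.net.URI constructor (available since 1.4) for that purpose. The class
name EncodeDemo and the helper names formEncode and pathEncode are
hypothetical and chosen only for this illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

public class EncodeDemo
{
   // HTML form encoding (application/x-www-form-urlencoded):
   // URLEncoder turns a space into '+'.
   static String formEncode(String s) throws UnsupportedEncodingException
   {
      return URLEncoder.encode(s, "UTF-8");
   }

   // URI path escaping: the multi-argument URI constructor quotes
   // illegal characters, so a space becomes "%20".
   // (Hypothetical helper; scheme, host and fragment are left null.)
   static String pathEncode(String path) throws URISyntaxException
   {
      return new URI(null, null, path, null).getRawPath();
   }

   public static void main(String[] args) throws Exception
   {
      System.out.println(formEncode("space here"));   // space+here
      System.out.println(pathEncode("/space here"));  // /space%20here
   }
}
```

URLEncoder remains the appropriate choice for form field names and values;
the URI-based helper is only one possible way to escape a path component.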
                                     


