JDK-5063160 : Class.getResource doesn't encode URLs correctly
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.net
  • Affected Version: 5.0
  • Priority: P3
  • Status: Closed
  • Resolution: Not an Issue
  • OS: solaris_8
  • CPU: sparc
  • Submitted: 2004-06-15
  • Updated: 2004-06-21
  • Resolved: 2004-06-21
Description

Name: js151677			Date: 06/15/2004


FULL PRODUCT VERSION :
java version "1.5.0-beta2"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0-beta2-b51)
Java HotSpot(TM) Server VM (build 1.5.0-beta2-b51, mixed mode)


ADDITIONAL OS VERSION INFORMATION :
SunOS okoze 5.8 Generic_108528-21 sun4us sparc


A DESCRIPTION OF THE PROBLEM :
The URL returned by Class.getResource does not encode the "+" character correctly.

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
1. Make a directory:
mkdir test+test

2. Make a test structure:
cd test+test
mkdir resourceTest
mkdir "resourceTest/dir with space"
mkdir "resourceTest/dir+with+space"
touch "resourceTest/file with space"
touch "resourceTest/file+with+space"

3. Store the program below in the directory test+test as ResourceLoaderTest.java:
public class ResourceLoaderTest {

    public static void main(String[] args) throws Exception {

        System.out.println(ResourceLoaderTest.class.getResource("resourceTest/dir with space"));
        System.out.println(ResourceLoaderTest.class.getResource("resourceTest/dir+with+space"));
        System.out.println(ResourceLoaderTest.class.getResource("resourceTest/file with space"));
        System.out.println(ResourceLoaderTest.class.getResource("resourceTest/file+with+space"));
    }
}


4. Compile and run the given test program
javac ResourceLoaderTest.java
java -classpath . ResourceLoaderTest


EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
  From step 4, I would have expected this output:

file:/myHome/test%2Btest/resourceTest/dir%20with%20space
file:/myHome/test%2Btest/resourceTest/dir%2Bwith%2Bspace
file:/myHome/test%2Btest/resourceTest/file%20with%20space
file:/myHome/test%2Btest/resourceTest/file%2Bwith%2Bspace

ACTUAL -
file:/myHome/test+test/resourceTest/dir%20with%20space
file:/myHome/test+test/resourceTest/dir+with+space
file:/myHome/test+test/resourceTest/file%20with%20space
file:/myHome/test+test/resourceTest/file+with+space

Note that the spaces are URL-encoded but the plus signs are not. If these URLs were run through URLDecoder.decode, the two directories would yield the same string, and so would the two files.
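
The collision described in the note can be sketched as follows. This is only an illustration: the class name DecodeCollision and the helper formDecode are made up for this example, and the input strings are the first two ACTUAL URLs from above. URLDecoder applies HTML form decoding rules, under which both "+" and "%20" become a space.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class DecodeCollision {

    // Hypothetical helper: decodes with form (application/x-www-form-urlencoded)
    // rules, so "+" and "%20" are both turned into a space.
    public static String formDecode(String s) throws UnsupportedEncodingException {
        return URLDecoder.decode(s, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        // The two directory URLs from the ACTUAL output in this report.
        String withSpaces = "file:/myHome/test+test/resourceTest/dir%20with%20space";
        String withPluses = "file:/myHome/test+test/resourceTest/dir+with+space";

        String a = formDecode(withSpaces);
        String b = formDecode(withPluses);

        System.out.println(a);
        System.out.println(b);
        // Both decode to "file:/myHome/test test/resourceTest/dir with space".
        System.out.println(a.equals(b)); // prints true
    }
}
```

Note that the "+" in the parent directory name test+test is also turned into a space, so neither decoded string names a path that exists on disk.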


REPRODUCIBILITY :
This bug can be reproduced always.


CUSTOMER SUBMITTED WORKAROUND :
Do not use "+" in file names or directory names.
(Incident Review ID: 279859) 
======================================================================

Comments
EVALUATION

The test class is loaded using code in the URLClassLoader. It seems that the changes for bug 4979820, for class names containing JSR 202 characters, have also changed the URLs for resources. I've verified that the difference reported in this bug began with b46 (when the fix for bug 4979820 was integrated).
-- iag@sfbay 2004-06-16

Before the fix for bug 4979820 was introduced, the characters in a resource name were never encoded by URLClassLoader. After the fix, we encode them where appropriate, including those characters that are reserved.

According to RFC 2396, ";/?:@&=+$" are characters used as delimiters of the components of a URI, as described in section 2.2. The characters "/", ";", "=" and "?" are reserved within a path segment. Section 2.2 says: "Characters in the reserved set are not reserved in all contexts. The set of characters actually reserved within any given URI component is defined by that component. In general, a character is reserved if the semantics of the URI changes if the character is replaced with its escaped US-ASCII encoding."

"+" is not reserved in the path component, so it is correct that we don't escape it.
###@###.### 2004-06-21
21-06-2004
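
The RFC 2396 path rule cited in the evaluation can be observed with java.net.URI, which follows the same RFC. This is a sketch for illustration only and is not the URLClassLoader code itself; the class name PathEscaping and the helper toFileUri are hypothetical, and the paths are taken from the report. The multi-argument URI constructor quotes characters that are illegal in a path (such as a space) and leaves "+" alone, and getPath() reverses only the percent-escapes.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PathEscaping {

    // Hypothetical helper: builds a file: URI from a raw path. The
    // multi-argument constructor percent-encodes characters that are
    // not legal in a URI path; "+" is legal there and is kept as-is.
    public static URI toFileUri(String rawPath) throws URISyntaxException {
        return new URI("file", null, rawPath, null);
    }

    public static void main(String[] args) throws Exception {
        URI spaces = toFileUri("/myHome/test+test/resourceTest/dir with space");
        URI pluses = toFileUri("/myHome/test+test/resourceTest/dir+with+space");

        // The space is escaped as %20, the plus signs are left untouched.
        System.out.println(spaces); // file:/myHome/test+test/resourceTest/dir%20with%20space
        System.out.println(pluses); // file:/myHome/test+test/resourceTest/dir+with+space

        // getPath() decodes percent-escapes only, so the two paths stay distinct.
        System.out.println(spaces.getPath()); // /myHome/test+test/resourceTest/dir with space
        System.out.println(pluses.getPath()); // /myHome/test+test/resourceTest/dir+with+space
    }
}
```

Decoding with URI.getPath() rather than URLDecoder.decode therefore keeps "dir with space" and "dir+with+space" apart, which is consistent with the resolution that the URLs are correct as returned.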