JDK-6394966 : File.toURI() does inconsistent character escaping between 1.4/1.5 and 1.6
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.io
  • Affected Version: 6
  • Priority: P3
  • Status: Closed
  • Resolution: Duplicate
  • OS: solaris_2.5.1
  • CPU: x86
  • Submitted: 2006-03-07
  • Updated: 2017-05-19
  • Resolved: 2006-06-13
Related Reports
Duplicate :  
Description
A DESCRIPTION OF THE REGRESSION :
When File.toURI() converts a file name that contains a valid percent-encoded sequence, JDK 1.4 and 1.5 escape the percent character in the URI and unescape it again in URI.getPath().

JDK 1.6 does not escape the percent character, so when getPath() or new File(URI) is called, the sequence is decoded and the result no longer corresponds to the original file.

This also affects simple File->URI->File roundtripping.

Run the supplied test case to see the results. You may supply a character sequence on the command line. For example, "%5c" is handled correctly by 5.0u6 but is converted to "\" by Mustang.

The sequence "%ff" is also handled correctly by 5.0u6, but on Mustang it comes back from the round trip as "%FF".


REPRODUCIBLE TESTCASE OR STEPS TO REPRODUCE:
import java.io.File;

public class MustangURITest {

  public static void main(String[] args) throws Exception {
    String confusion = (args.length > 0) ? args[0] : "%5c";
    String name = "asdf" + confusion + "test.xml";
    File f = new File(name).getCanonicalFile();
    File f2 = new File(f.toURI());
    // Check that roundtrip is OK
    System.out.println("URI is " + f.toURI());
    System.out.println(((f.equals(f2)) ? "PASS " : "FAIL ") + f + " vs " + f2);

    // Same thing via getPath()
    String filestr = f.toString();
    String uripath = f.toURI().getPath();
    System.out.println(((filestr.equals(uripath)) ? "PASS " : "FAIL ") + filestr + " vs " + uripath);
  }
}



RELEASE LAST WORKED:
5.0 Update 6

RELEASE TEST FAILS:
mustang-beta

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Round trip should produce the same name as used as input:

[len@noir volatile]$ /usr/local/java/jdk1.5/bin/java MustangURITest "%5c"
URI is file:/home2/len/reeltwo_sandboxes/volatile/asdf%255ctest.xml
PASS /home2/len/reeltwo_sandboxes/volatile/asdf%5ctest.xml vs /home2/len/reeltwo_sandboxes/volatile/asdf%5ctest.xml
PASS /home2/len/reeltwo_sandboxes/volatile/asdf%5ctest.xml vs /home2/len/reeltwo_sandboxes/volatile/asdf%5ctest.xml

[len@noir volatile]$ /usr/local/java/jdk1.5/bin/java MustangURITest "%ff"
URI is file:/home2/len/reeltwo_sandboxes/volatile/asdf%25fftest.xml
PASS /home2/len/reeltwo_sandboxes/volatile/asdf%fftest.xml vs /home2/len/reeltwo_sandboxes/volatile/asdf%fftest.xml
PASS /home2/len/reeltwo_sandboxes/volatile/asdf%fftest.xml vs /home2/len/reeltwo_sandboxes/volatile/asdf%fftest.xml

ACTUAL -
[len@noir volatile]$ /usr/local/java/jdk1.6/bin/java MustangURITest "%5c"
URI is file:/home2/len/reeltwo_sandboxes/volatile/asdf%5ctest.xml
FAIL /home2/len/reeltwo_sandboxes/volatile/asdf%5ctest.xml vs /home2/len/reeltwo_sandboxes/volatile/asdf\test.xml
FAIL /home2/len/reeltwo_sandboxes/volatile/asdf%5ctest.xml vs /home2/len/reeltwo_sandboxes/volatile/asdf\test.xml

[len@noir volatile]$ /usr/local/java/jdk1.6/bin/java MustangURITest "%ff"
URI is file:/home2/len/reeltwo_sandboxes/volatile/asdf%fftest.xml
FAIL /home2/len/reeltwo_sandboxes/volatile/asdf%fftest.xml vs /home2/len/reeltwo_sandboxes/volatile/asdf%FFtest.xml
FAIL /home2/len/reeltwo_sandboxes/volatile/asdf%fftest.xml vs /home2/len/reeltwo_sandboxes/volatile/asdf%FFtest.xml


OBSERVED APPLICATION IMPACT:
We have software that uses URIs internally to reference documents stored in various locations, some of which are files.  This regression breaks the ability to read such referenced files.

(FYI, I am using Linux IA32, Fedora FC4, but I doubt it's specific to this OS)

Release Regression From : 5.0u6
The above release value was the last known release where this 
bug was known to work. Since then there has been a regression.

Comments
EVALUATION Due to numerous compatibility problems introduced by the changes to URI, including the one reported here, a rollback of the URI class has been performed so that its behaviour is now the same as in Tiger (jdk5.0). Closing this bug as a duplicate of the rollback bug, 6394131.
13-06-2006

EVALUATION This concerns the 'double encoding' behavior of the multi-argument constructors of java.net.URI. Consider the following two lines quoted from the bug description:

    [len@noir volatile]$ /usr/local/java/jdk1.5/bin/java MustangURITest "%5c"
    URI is file:/home2/len/reeltwo_sandboxes/volatile/asdf%255ctest.xml

Here '%5c' is encoded as '%255c', i.e. it is encoded again. If that result is fed back to the test, the percent character is encoded once more, and the '%25' sequences accumulate. This behavior is documented. The javadoc of java.net.URI 1.5.0 reads:

    "The multi-argument constructors quote illegal characters as required by the components in which they appear. The percent character ('%') is always quoted by these constructors. Any other characters are preserved."

It causes some problems, though; 6181108 is one example. So in Mustang it has been changed, not necessarily to conform to the new RFC. The javadoc now reads:

    "The multi-argument constructors quote illegal characters as required by the components in which they appear. A character triplet, consisting of the percent character ('%') followed by two hexadecimal digits, is always considered as percent-encoded. The percent character ('%') in a percent-encoded triplet is not quoted any more; otherwise, it is always quoted by these constructors. Any other characters are preserved."

It seems that java.io.File(URI) needs to take this into account.
08-03-2006
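
The accumulation of '%25' described in the evaluation above can be reproduced with a small sketch. It assumes a JDK with the Tiger (5.0) quoting rule, which is also the rule restored by the rollback; the class name, the helper quote() and the path /tmp/asdf%5ctest.xml are hypothetical and chosen only for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

class PercentQuotingDemo {

    // Builds a file URI with the multi-argument constructor
    // URI(scheme, host, path, fragment). Under the Tiger rule this
    // constructor always quotes '%' in the path as "%25".
    static URI quote(String path) {
        try {
            return new URI("file", null, path, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI once = quote("/tmp/asdf%5ctest.xml");   // hypothetical path
        System.out.println(once.getRawPath());       // /tmp/asdf%255ctest.xml
        System.out.println(once.getPath());          // /tmp/asdf%5ctest.xml

        // Feeding the already-quoted form back in quotes the '%' again.
        URI twice = quote(once.getRawPath());
        System.out.println(twice.getRawPath());      // /tmp/asdf%25255ctest.xml
    }
}
```

Because getPath() undoes exactly one level of quoting, the first URI still decodes to the original name, which is why the round trip passes on 5.0.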

WORK AROUND There's a partial workaround for the supplied test case. To make the first half of the test, f.equals(f2), work, java.io.File(URI) most likely needs to be amended. To make the second half work, change the code from

    // Same thing via getPath()
    String filestr = f.toString();
    String uripath = f.toURI().getPath();
    System.out.println(((filestr.equals(uripath)) ? "PASS " : "FAIL ") + filestr + " vs " + uripath);

to

    // Same thing via getPath()
    String filestr = f.toString();
    String uripath = f.toURI().getRawPath();
    System.out.println(((filestr.equals(uripath)) ? "PASS " : "FAIL ") + filestr + " vs " + uripath);

This is exactly what the specifications of java.net.URI.getPath() and getRawPath() describe.
08-03-2006
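
The difference between getPath() and getRawPath() that the workaround above relies on can be seen without File at all. The sketch below parses a URI string whose path already contains an unquoted "%5c", which is the form the Mustang beta File.toURI() produced in the report; the class name, the helper parse() and the path are hypothetical.

```java
import java.net.URI;

class RawPathDemo {

    // URI.create parses the string as given; parsing a complete URI
    // string does not quote '%', unlike the multi-argument constructors.
    static URI parse(String s) {
        return URI.create(s);
    }

    public static void main(String[] args) {
        URI u = parse("file:/tmp/asdf%5ctest.xml");  // hypothetical path
        // Raw path: the percent-encoded triplet is left untouched.
        System.out.println(u.getRawPath());           // /tmp/asdf%5ctest.xml
        // Decoded path: "%5c" is decoded to a backslash.
        System.out.println(u.getPath());              // /tmp/asdf\test.xml
    }
}
```

Note that this only addresses the string comparison in the second half of the test. new File(URI) is not affected by switching to getRawPath() in caller code, which is why the workaround is described as partial.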

EVALUATION It appears that changes to java.net.URI to conform to the new RFC have invalidated a guarantee which appears in the spec. URI needs to be modified such that File.toURI() is still compliant with its spec.
07-03-2006
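
The guarantee referred to above is presumably the one in the File.toURI() javadoc, which states that new File(f.toURI()).equals(f.getAbsoluteFile()) holds for any abstract pathname f. A minimal check of that property, as a sketch with a hypothetical class name and helper, could look like this:

```java
import java.io.File;

class RoundTripCheck {

    // Returns true if File -> URI -> File yields the same abstract
    // pathname, which is what the File.toURI() javadoc promises.
    static boolean roundTrips(String name) {
        File f = new File(name).getAbsoluteFile();
        File back = new File(f.toURI());
        return back.equals(f);
    }

    public static void main(String[] args) {
        // File names containing valid percent-encoded triplets,
        // taken from the report's test inputs.
        String[] names = { "asdf%5ctest.xml", "asdf%fftest.xml" };
        for (String name : names) {
            System.out.println((roundTrips(name) ? "PASS " : "FAIL ") + name);
        }
    }
}
```

On a JDK with the Tiger behaviour both names should report PASS; the report's output suggests that the Mustang beta build would have reported FAIL for both.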