JDK-4238263 : URLEncoder specification incorrect about encoding algorithm
  • Type: Bug
  • Component: docs
  • Sub-Component: guides
  • Affected Version: 1.2.2
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • OS: solaris_2.5
  • CPU: sparc
  • Submitted: 1999-05-14
  • Updated: 2000-10-10
  • Resolved: 2000-10-10
Related Reports
Duplicate : JDK-4257115
Description

Name: mgC56079			Date: 05/14/99



The Javadoc specification for URLEncoder says:
-------------------
All other characters are converted into the 3-character string "%xy", where xy is the two-digit
      hexadecimal representation of the lower 8-bits of the character.
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
-------------------

In the current implementation, however, each character is converted to bytes using the platform
default character encoding (via CharToByteConverter), not by taking the lower 8 bits of the character.
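The difference between the two algorithms can be shown with a minimal sketch. The class and method names here (EncodingContrast, escapeLowEightBits, escapeViaCharset) are hypothetical and are not part of URLEncoder. UTF-8 is passed explicitly only as a stand-in for the platform default encoding, so that the result is the same on every machine.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingContrast {
    private static final String HEX = "0123456789ABCDEF";

    // Appends one byte value as "%XY" using uppercase hex digits.
    static void appendEscaped(StringBuilder out, int b) {
        out.append('%')
           .append(HEX.charAt((b >> 4) & 0xF))
           .append(HEX.charAt(b & 0xF));
    }

    // What the Javadoc text describes: keep only the low 8 bits of the char.
    static String escapeLowEightBits(char c) {
        StringBuilder out = new StringBuilder();
        appendEscaped(out, c & 0xFF);
        return out.toString();
    }

    // What the implementation does: convert the char to bytes in some
    // character encoding first, then escape every resulting byte.
    // The charset parameter is an assumption standing in for the platform default.
    static String escapeViaCharset(char c, Charset cs) {
        StringBuilder out = new StringBuilder();
        for (byte b : String.valueOf(c).getBytes(cs)) {
            appendEscaped(out, b);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        char hiraganaA = '\u3042'; // HIRAGANA LETTER A
        System.out.println(escapeLowEightBits(hiraganaA));                       // %42
        System.out.println(escapeViaCharset(hiraganaA, StandardCharsets.UTF_8)); // %E3%81%82
    }
}
```

Under the documented rule the character collapses to "%42", which is indistinguishable from an escaped "B"; the byte-conversion approach yields one escape per encoded byte.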

------------------- URLEncoder.java
    public static String encode(String s) {
        int maxBytesPerChar = 10;
        StringBuffer out = new StringBuffer(s.length());
        ByteArrayOutputStream buf = new ByteArrayOutputStream(maxBytesPerChar);
        OutputStreamWriter writer = new OutputStreamWriter(buf);

        for (int i = 0; i < s.length(); i++) {
            int c = (int)s.charAt(i);
            if (dontNeedEncoding.get(c)) {
                if (c == ' ') {
                    c = '+';
                }
                out.append((char)c);
            } else {
                // convert to external encoding before hex conversion
                try {
                    writer.write(c);
                    writer.flush();
                } catch(IOException e) {
                    buf.reset();
                    continue;
                }
                byte[] ba = buf.toByteArray();
                for (int j = 0; j < ba.length; j++) {
                    out.append('%');
                    char ch = Character.forDigit((ba[j] >> 4) & 0xF, 16);
                    // converting to use uppercase letter as part of
                    // the hex value if ch is a letter.
                    if (Character.isLetter(ch)) {
                        ch -= caseDiff;
                    }
                    out.append(ch);
                    ch = Character.forDigit(ba[j] & 0xF, 16);
                    if (Character.isLetter(ch)) {
                        ch -= caseDiff;
                    }
                    out.append(ch);
                }
                buf.reset();
            }
        }

        return out.toString();
    }
-------------------

The specification should be updated to reflect this behavior.
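Because the conversion goes through a character encoding, the same input can produce different escapes depending on which encoding is in effect. The sketch below illustrates this with the two-argument URLEncoder.encode(String, String) overload, which was added in later JDK releases and is not available in the 1.2.2 version discussed here. The wrapper class and method (CharsetDependence, encodeWith) are hypothetical names used only for the illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class CharsetDependence {
    // Hypothetical helper: wraps the checked exception so callers stay short.
    static String encodeWith(String s, String charsetName) {
        try {
            return URLEncoder.encode(s, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String s = "\u00E9"; // LATIN SMALL LETTER E WITH ACUTE
        System.out.println(encodeWith(s, "ISO-8859-1")); // %E9
        System.out.println(encodeWith(s, "UTF-8"));      // %C3%A9
    }
}
```

Only under ISO-8859-1 does the result happen to match the "lower 8 bits" wording of the specification.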


======================================================================

Comments
EVALUATION Closing as duplicate of 4257115 scott.hommel@eng 2000-10-10
10-10-2000