JDK-7020382 : HTMLParser adds extra semicolons in attribute values
  • Type: Bug
  • Component: client-libs
  • Sub-Component: javax.swing
  • Affected Version: 7
  • Priority: P4
  • Status: Closed
  • Resolution: Cannot Reproduce
  • OS: windows_7
  • CPU: x86
  • Submitted: 2011-02-17
  • Updated: 2012-03-20
  • Resolved: 2011-05-17
JDK 8: 8 (Resolved)
Related Reports
Relates :  
Description
FULL PRODUCT VERSION :


A DESCRIPTION OF THE PROBLEM :
This bug is a result of the "fix" for 6245596.

<img src="http://www/test?a=b&c=d&e=f">
The src attribute value will be changed to
<img src="http://www/test?a=b&c;=d&e;=f">

because of
String str = '&' + nm + ';';

and the image will not be displayed.
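
The effect can be modelled in a few lines. This is only a sketch, not the JDK parser code: the class name ExtraSemicolonSketch and the method reemit are hypothetical, and the regular expression simply stands in for "read a name after '&', optionally consume a ';', then always write the name back with a ';'".

```java
public class ExtraSemicolonSketch {
    // Hypothetical model (not the JDK parser): every "&name" reference,
    // with or without a trailing ';', is written back as "&name;".
    static String reemit(String attributeValue) {
        return attributeValue.replaceAll("&(\\w+);?", "&$1;");
    }

    public static void main(String[] args) {
        // Query-string separators are mistaken for references and gain a ';'.
        System.out.println(reemit("http://www/test?a=b&c=d&e=f"));
    }
}
```

A reference that already ends in ';' passes through this model unchanged; only the bare '&' separators of the query string are altered.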


My solution would be to add a local variable, suffix:
	
    String suffix = "";
    switch (ch) {
      case '\n':
        ln++;
        ch = readCh();
        lfCount++;
        break;

      case '\r':
        ln++;
        if ((ch = readCh()) == '\n') {
            ch = readCh();
            crlfCount++;
        }
        else {
            crCount++;
        }
        break;

      case ';':
        suffix = ";";
        ch = readCh();
        break;
    }

and later

	String str = '&' + nm + suffix;
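
The idea behind the suffix variable can be tried in isolation with a simplified scanner. This is a sketch under stated assumptions, not a patch to the JDK parser: the class SuffixSketch and the method rewrite are hypothetical, every reference is treated as unrecognized (there is no entity table), and a name is taken to be a run of letters and digits.

```java
public class SuffixSketch {
    // Hypothetical simplified scanner: copies the input, re-emitting each
    // "&name" reference with a ';' only if the input actually contained one.
    static String rewrite(String value) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < value.length()) {
            char ch = value.charAt(i);
            if (ch != '&') {
                out.append(ch);
                i++;
                continue;
            }
            // Collect the name that follows '&'.
            int start = ++i;
            while (i < value.length() && Character.isLetterOrDigit(value.charAt(i))) {
                i++;
            }
            String nm = value.substring(start, i);
            // Remember whether a terminating ';' was consumed.
            String suffix = "";
            if (i < value.length() && value.charAt(i) == ';') {
                suffix = ";";
                i++;
            }
            out.append('&').append(nm).append(suffix);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(rewrite("http://www/test?a=b&c=d&e=f"));
        System.out.println(rewrite("x&foo;y"));
    }
}
```

With the suffix recorded at the point where the ';' is consumed, both inputs come back unchanged in this sketch.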



STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the source code below; the printed attribute value is wrong (it contains additional semicolons).


EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
<img src="http://www/test?a=b&c=d&e=f">
ACTUAL -
<img src="http://www/test?a=b&c;=d&e;=f">


REPRODUCIBILITY :
This bug can always be reproduced.

---------- BEGIN SOURCE ----------
import java.io.IOException;
import java.io.StringReader;
import java.util.Enumeration;

import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit.ParserCallback;
import javax.swing.text.html.parser.ParserDelegator;

/**
 * Parses a single img tag and prints its attributes as the parser reports them.
 *
 * @author Milos
 */
public class HtmlParserTest extends ParserCallback {
    String text = "<img src=\"http://www/test?a=b&c=d&e=f\">";

    public void parse() throws IOException {
        StringReader reader = new StringReader(text);
        ParserDelegator delegator = new ParserDelegator();
        delegator.parse(reader, this, true);  // the third parameter is TRUE to ignore charset directive
    }

    @Override
    public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        StringBuilder sb = new StringBuilder();
        sb.append("<").append(t);
        for (Enumeration<?> en = a.getAttributeNames(); en.hasMoreElements(); ) {
            Object name = en.nextElement();
            String nameStr = name.toString();
            if (!nameStr.equals("_implied_") && !nameStr.equals("style")) {
                sb.append(" ").append(nameStr).append("=\"").append(a.getAttribute(name)).append("\"");
            }
        }
        sb.append(">");
        System.out.println(sb);
    }

    public static void main(String[] args) throws IOException {
        HtmlParserTest parser = new HtmlParserTest();
        parser.parse();
    }
}


---------- END SOURCE ----------