JDK-6760982 : JVM 1.6 Xerces Parser Corrupts Attribute Value
  • Type: Bug
  • Component: xml
  • Sub-Component: org.w3c.dom
  • Affected Version: 6
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • OS: linux
  • CPU: x86
  • Submitted: 2008-10-17
  • Updated: 2012-04-25
  • Resolved: 2009-05-20
Related Reports
Duplicate :  
Description
FULL PRODUCT VERSION :
[root@localhost Download]# /usr/local/jdk1.6.0_06/jre/bin/java -version
java version "1.6.0_06"
Java(TM) SE Runtime Environment (build 1.6.0_06-b02)
Java HotSpot(TM) Client VM (build 10.0-b22, mixed mode, sharing)
[root@localhost Download]# /usr/local/jdk1.6.0_10/jre/bin/java -version
java version "1.6.0_10-beta"
Java(TM) SE Runtime Environment (build 1.6.0_10-beta-b14)
Java HotSpot(TM) Client VM (build 11.0-b11, mixed mode, sharing)


ADDITIONAL OS VERSION INFORMATION :
Linux localhost.localdomain 2.6.23.8-63.fc8 #1 SMP Wed Nov 21 18:51:08 EST 2007 i686 i686 i386 GNU/Linux


A DESCRIPTION OF THE PROBLEM :
Problem
-------
Sun's JVM 1.6 (6u6-linux and 6u10-beta-linux) and its bundled
Xerces parser corrupt an attribute value of the enclosed XML
document, PrintXML.xml, once node.getChildNodes () is invoked (see
the attached code, PrintXML.java, compiled with '-source 1.4' for
testing convenience).  Instead of producing

 Test (class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl):
     mytest (class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl):
         <attribs> (class com.sun.org.apache.xerces.internal.dom.AttributeMap):
             Y = []
             Z = ZZ[]
             a = []
             b = []
             c = []
             d = []
             e = []
             f = []

it produces

 Test (class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl):
     mytest (class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl):
         <attribs> (class com.sun.org.apache.xerces.internal.dom.AttributeMap):
             Y = ZZ    <-- CORRUPTED VALUE
             Z = ZZ[]
             a = []
             b = []
             c = []
             d = []
             e = []
             f = []

This corruption does _not_ occur with Sun's JVM 1.4 or 1.5, or when
Xerces >= 1.4.4 is explicitly included.  (I surmise that Sun's JVM 1.6
introduces a bug that causes a latent Xerces < 1.4.4 bug to become
manifest; however, I don't know what version of Xerces Sun
distributes.)


Test Case Variations
---- ---- ----------
The number of attributes matters but their names do not; if attributes
are added or removed (I've only varied the count by +/- 1 and 2)
before 'Y', the bug is not manifested.  However, adding variations on
the 'Y'/'Z' attribute couplet _after_ the 'Z' attribute will
(repeatedly) manifest the bug.

The attribute values must contain '[' and ']', but it does not matter
what or how many characters precede or follow these characters.  If any
attribute prior to 'Y' does not contain both a left and a right bracket,
the bug is not manifested.

The order of the attributes matters -- if 'Z' precedes 'Y', the bug is
not manifested.
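
The variations above can be probed without editing PrintXML.xml by hand.
The following is only a sketch, not part of the original test case: the
class name AttributeProbe, the generated attribute names (a0, a1, ...)
and the range of counts tried are assumptions made for illustration, and
it has not been confirmed to trigger the corruption on an affected JVM
(the in-memory document omits the whitespace and XML declaration of the
original file, which may matter).  It uses StringBuilder, so it needs
'-source 1.5' or later.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class AttributeProbe {

    /** Builds a document with 'leading' bracketed attributes before the Y/Z couplet. */
    public static String buildXml(int leading) {
        StringBuilder sb = new StringBuilder("<Test><mytest");
        for (int i = 0; i < leading; i++) {
            sb.append(" a").append(i).append("='[]'");   // hypothetical attribute names
        }
        sb.append(" Y='[]' Z='ZZ[]'/></Test>");
        return sb.toString();
    }

    /** Parses the XML and returns the value of mytest's Y attribute. */
    public static String readY(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(new InputSource(new StringReader(xml)));

            // Same call order as _ProcessNode in PrintXML.java: root first, then child.
            Node root = doc.getDocumentElement();
            root.getAttributes();
            Node child = root.getChildNodes().item(0);
            child.getAttributes();
            child.getChildNodes();

            return child.getAttributes().getNamedItem("Y").getNodeValue();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Six leading attributes matches PrintXML.xml; the report varied the count by +/- 1 and 2.
        for (int n = 4; n <= 8; n++) {
            String y = readY(buildXml(n));
            String flag = "[]".equals(y) ? "" : "    <-- CORRUPTED VALUE";
            System.out.println(n + " leading attributes: Y = " + y + flag);
        }
    }
}
```

On a parser without the defect, every line should report Y = [].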


User Workaround
---- ----------
Distribute Xerces >= 1.4.4 jar files with the application.
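
When applying this workaround it helps to confirm which parser the
application actually picks up.  The following is a minimal sketch using
only standard JAXP and java.security calls; the class name ParserInfo is
an assumption made for illustration.  With the JDK's bundled parser the
factory class lives under com.sun.org.apache.xerces.internal, while a
standalone Xerces jar on the class path normally supplies
org.apache.xerces.jaxp.DocumentBuilderFactoryImpl.

```java
import java.security.CodeSource;

import javax.xml.parsers.DocumentBuilderFactory;

public class ParserInfo {

    /** Describes the JAXP DocumentBuilderFactory implementation in use and where it was loaded from. */
    public static String describe() {
        Class<?> impl = DocumentBuilderFactory.newInstance().getClass();
        CodeSource src = impl.getProtectionDomain().getCodeSource();

        // Classes loaded by the bootstrap loader may have no code source at all.
        String origin = (src == null || src.getLocation() == null)
                ? "bootstrap class path (no code source location)"
                : src.getLocation().toString();

        return impl.getName() + " loaded from " + origin;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Run it with the same class path as the application; if the output still
names the com.sun.org.apache.xerces.internal implementation, the bundled
parser is being used and the workaround is not in effect.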


JVM Workaround
--- ----------
Update the bundled XML parser to a more recent version of Xerces and
fix the underlying JVM bug.


STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Compile PrintXML.java (see below) with '-source 1.4':

    javac -source 1.4 PrintXML.java

Run with

    java -cp . PrintXML PrintXML.xml


EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
 Test (class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl):
     mytest (class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl):
         <attribs> (class com.sun.org.apache.xerces.internal.dom.AttributeMap):
             Y = []
             Z = ZZ[]
             a = []
             b = []
             c = []
             d = []
             e = []
             f = []

ACTUAL -
 Test (class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl):
     mytest (class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl):
         <attribs> (class com.sun.org.apache.xerces.internal.dom.AttributeMap):
             Y = ZZ
             Z = ZZ[]
             a = []
             b = []
             c = []
             d = []
             e = []
             f = []


ERROR MESSAGES/STACK TRACES THAT OCCUR :
No error message; the artifact is a corrupted XML attribute value.


REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
// PrintXML.java

import java.io.File;
import java.io.FileReader;
import java.io.Reader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;


public class PrintXML {

	private PrintXML () {
	}
		
	private static void _Flush ()
	{
		System.out.flush ();
		System.err.flush ();
	}

	private static void _Println (String str, int level)
	{
		for (int i = 0; i < level; i++)
			System.out.print ("    ");
	
		System.out.println (str);
		System.out.flush ();
	}

	private static void _ErrPrintln (String aStr)
	{
		System.out.flush ();
		System.err.println (aStr);
		System.err.flush ();
	}
	
	private static Document _Parse (File f)
	throws Exception
	{
		FileReader rd = new FileReader (f);

		try {
			return _Parse (rd);
		}
		finally {
			rd.close ();	// close the reader even if parsing fails
		}
	}

	private static Document _Parse (Reader src)
	throws Exception
	{
		DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance ();

		dbf.setValidating (false);	// to improve performance

		DocumentBuilder xmlParser = dbf.newDocumentBuilder ();
		InputSource is = new InputSource (src);

		return xmlParser.parse (is);
	}

	private static void _PrintAttributes (Node n, int level)
	{
		NamedNodeMap nnmap = n.getAttributes ();

		if (nnmap != null && nnmap.getLength () > 0) {
			_Println ("<attribs> (" + nnmap.getClass () + "):", level + 1);

			for (int i = 0; i < nnmap.getLength (); i++) {
				Node an = nnmap.item (i);
				
				String nameStr  = an.getNodeName ();
				String valueStr = an.getNodeValue ();
				
				// Compare string contents; '!=' on Strings compares references.
				if (valueStr != null && valueStr.length () > 0)
					nameStr += " = " + valueStr;

				_Println (nameStr, level + 2);
			}
		}
	}

	private static void _ProcessChildren (Node n, int level)
	throws Exception
	{
		NodeList nlist = n.getChildNodes ();

		if (nlist != null)
			for (int i = 0; i < nlist.getLength (); i++)
				_ProcessNode (nlist.item (i), level + 1);
	}

	private static void _ProcessNode (Node n, int level)
	throws Exception
	{
		n.getAttributes ();
		n.getChildNodes ();
		
		// At this point, for JVM 1.6 and Xerces <= 1.3.1, the value of PrintXML.xml::mytest:Y is (already) bad.
				
		switch (n.getNodeType()) {
	
			case Node.TEXT_NODE:
				String str = n.getNodeValue ().trim ();
				
				/*...Only print non-empty strings...*/
				if (str.length () > 0) {
					String valStr = n.getNodeValue ();
					
					_Println (valStr, level);
				}
				break;
	
			case Node.COMMENT_NODE:
				break;
	
			default: {
				String  nodeNameStr = n.getNodeName ();

				_Println (nodeNameStr + " (" + n.getClass () + "):", level);
				
				/*...Print children...*/
				_ProcessChildren (n, level);

				/*...Print optional node attributes...*/
				_PrintAttributes (n, level);
			}
		}
	}
	
	/**
	 * @param args
	 */
	public static void main (String[] args) {

		String xmlFile = null;
		
		/*...Process CLI arguments...*/
		for (int i = 0; i < args.length; i++) {
			String argStr = args[i].trim ();
			
			if (xmlFile == null)
				xmlFile = argStr;
			else
				_ErrPrintln ("Unknown argument: " + argStr);
		}

		if (xmlFile == null) {
			_ErrPrintln ("Error: missing <xml file>");
		}
		else {
			try {
				Document xmlDoc = _Parse (new File (xmlFile));
				Node     node   = xmlDoc.getDocumentElement ();
				
				_ProcessNode (node, 0);
				_Flush ();
			}
			catch (Exception e) {
				_ErrPrintln ("Exception: " + e.toString ());
				e.printStackTrace ();
			}
		}
		
		_Flush ();
	}

}


//////////////////////////  Test XML file: PrintXML.xml ///////////////////////////

<?xml version="1.0" encoding="UTF-8"?>

<Test>
  <mytest  a= '[]'
           b= '[]'
           c= '[]'
           d= '[]'
           e= '[]'
           f= '[]'
           Y= '[]'
           Z= 'ZZ[]'
  />
</Test>

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Distribute Xerces >= 1.4.4 jar files with the application, or use a Sun JVM < 1.6.

(Note that this bug does not give me confidence in JVM 1.6's reliability.)

Comments
EVALUATION Refer to the evaluation for 6690015. Thanks again for the comments and votes. I appreciate your help on trying to get the issue resolved.
20-05-2009