JDK-6794483 : Parsing of xml fails if included from one xml to another using xinclude tag
  • Type: Bug
  • Component: xml
  • Sub-Component: org.w3c.dom
  • Affected Version: 6u10
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: windows_xp
  • CPU: x86
  • Submitted: 2009-01-15
  • Updated: 2015-06-05
  • Resolved: 2009-06-15
Fixed in:
  • Other: 1.4.0 h1148
  • JDK 6: 6u18
  • JDK 7: 7
Description
FULL PRODUCT VERSION :
java version "1.6.0_10"
Java(TM) SE Runtime Environment (build 1.6.0_10-b33)
Java HotSpot(TM) Client VM (build 11.0-b15, mixed mode, sharing)

ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows XP [Version 5.1.2600]

A DESCRIPTION OF THE PROBLEM :
After migrating to Java 6, XInclude processing fails to parse my XML files that contain empty tags, which causes the whole application to fail.

The issue: if an XML file contains an empty tag (<node3/>, as shown in test2.xml below) and is included in another XML file with the <xi:include> tag (as shown in test1.xml), then once the parser encounters the empty tag (<node3/>) it fails to read the content of the remaining tags and treats them as empty tags (as shown in the output below), causing data loss.

The same works fine with Java 5.

test1.xml
=======
<?xml version="1.0" encoding="UTF-8"?>
<scenario xsi:noNamespaceSchemaLocation="..\xsd\Scenario.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xi="http://www.w3.org/2001/XInclude">

<!-- Configuration data for Device-->
<Device>
<xi:include href="test2.xml" xpointer="element(/1/1)" parse="xml"/>
</Device>

</scenario>

test2.xml
=======
<test2>
<N1>
<node1>Node1 Value</node1>
<node2>Node2 Value</node2>
<node3/>
<node4>Node4 Value</node4>
<node5>
<node6>Node6 Value</node6>
</node5>
</N1>
</test2>


After running the attached code, below is the output printed:


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<scenario xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="..\xsd\Scenario.xsd">

<!-- Configuration data for Device-->
<Device>
<N1 xml:base="test2.xml">
<node1>Node1 Value</node1>
<node2>Node2 Value</node2>
<node3/>
<node4/>
<node5>
<node6/>
</node5>
</N1>
</Device>

</scenario>

As the output above shows, every node after node3 is treated as an empty tag, so all of its data is lost. Running the same code on Java 5 works correctly and prints all the tags together with the data inside them.

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
1. Copy the following contents into test1.xml

<?xml version="1.0" encoding="UTF-8"?>
<scenario xsi:noNamespaceSchemaLocation="..\xsd\Scenario.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xi="http://www.w3.org/2001/XInclude">

<!-- Configuration data for Device-->
<Device>
<xi:include href="test2.xml" xpointer="element(/1/1)" parse="xml"/>
</Device>

</scenario>


2. Copy the following contents into test2.xml
<test2>
<N1>
<node1>Node1 Value</node1>
<node2>Node2 Value</node2>
<node3/>
<node4>Node4 Value</node4>
<node5>
<node6>Node6 Value</node6>
</node5>
</N1>
</test2>

3. Run the attached code and check the output.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The expected result is:

<?xml version="1.0" encoding="UTF-8"?>
<scenario xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="..\xsd\Scenario.xsd">
	<ScenarioName>Scenario4: Single site, multi-channel</ScenarioName>

	<!-- Configuration data for Device-->
	<Device>
		<N1>
		<node1>Node1 Value</node1>
		<node2>Node2 Value</node2>
		<node3/>
		<node4>Node4 Value</node4>
		<node5>
			<node6>Node6 Value</node6>
		</node5>
	</N1>
	</Device>

</scenario>

ACTUAL -
Actual output. Note node3 and the tags that follow it: their data is lost.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<scenario xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="..\xsd\Scenario.xsd">
	<ScenarioName>Scenario4: Single site, multi-channel</ScenarioName>

	<!-- Configuration data for Device-->
	<Device>
		<N1 xml:base="Common.xml">
		<node1>Node1 Value</node1>
		<node2>Node2 Value</node2>
		<node3/>
            <node4/>
            <node5>
                <node6/>
            </node5>
        </N1>
	</Device>

</scenario>

ERROR MESSAGES/STACK TRACES THAT OCCUR :
No error message is printed

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.io.File;
import java.io.IOException;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class XmlParserDemo {

    public static void main(String[] args) {
        XmlParserDemo xmlParserDemo = new XmlParserDemo();
        Document doc = xmlParserDemo.parseXmlFile("test1.xml");

        StringWriter sw = new StringWriter();
        StreamResult result = new StreamResult(sw);

        TransformerFactory transformerFact = TransformerFactory.newInstance();
        transformerFact.setAttribute("indent-number", Integer.valueOf(4));
        Transformer transformer;
        try {
            transformer = transformerFact.newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty(OutputKeys.MEDIA_TYPE, "text/xml");

            // Serialize the parsed DOM (with includes resolved) into the StringWriter
            transformer.transform(new DOMSource(doc), result);
            System.out.println("test" + sw);
        } catch (TransformerConfigurationException ex) {
            ex.printStackTrace();
        } catch (TransformerException ex) {
            ex.printStackTrace();
        }
    }

    // Parses the given file into a DOM with XInclude processing enabled
    public Document parseXmlFile(String fileName) {
        System.out.println("Parsing XML file... " + fileName);
        DocumentBuilder docBuilder = null;
        Document doc = null;
        DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
        docBuilderFactory.setCoalescing(true);
        docBuilderFactory.setXIncludeAware(true);
        System.out.println("Include: " + docBuilderFactory.isXIncludeAware());
        docBuilderFactory.setNamespaceAware(true);
        docBuilderFactory.setExpandEntityReferences(true);
        try {
            docBuilder = docBuilderFactory.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        }

        File sourceFile = new File(fileName);
        try {
            doc = docBuilder.parse(sourceFile);
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("XML file parsed");
        return doc;
    }
}

---------- END SOURCE ----------
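
A self-checking variant of the reproducer may make it easier to confirm whether a given release contains the fix. The sketch below is not part of the original submission: the class name XIncludeRegressionCheck and the helper textOf are hypothetical, the xsi:noNamespaceSchemaLocation attribute is omitted for brevity, and the files are written to a temporary directory with java.nio.file, so it needs Java 7 or later and cannot run on the affected 6u10 as written. It reads back the text of the elements that follow <node3/> instead of printing the whole document.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XIncludeRegressionCheck {

    // Same structure as test2.xml in the report.
    static final String INCLUDED =
          "<test2>\n<N1>\n"
        + "<node1>Node1 Value</node1>\n"
        + "<node2>Node2 Value</node2>\n"
        + "<node3/>\n"
        + "<node4>Node4 Value</node4>\n"
        + "<node5>\n<node6>Node6 Value</node6>\n</node5>\n"
        + "</N1>\n</test2>\n";

    // Reduced form of test1.xml (schema location attribute left out).
    static final String INCLUDING =
          "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<scenario xmlns:xi=\"http://www.w3.org/2001/XInclude\">\n"
        + "<Device>\n"
        + "<xi:include href=\"test2.xml\" xpointer=\"element(/1/1)\" parse=\"xml\"/>\n"
        + "</Device>\n"
        + "</scenario>\n";

    // Writes both files to a temp directory, parses test1.xml with XInclude
    // enabled, and returns the text content of the first element named tagName
    // (null if no such element exists in the resulting document).
    static String textOf(String tagName) throws Exception {
        Path dir = Files.createTempDirectory("xinclude-check");
        Files.write(dir.resolve("test2.xml"), INCLUDED.getBytes(StandardCharsets.UTF_8));
        Path main = dir.resolve("test1.xml");
        Files.write(main, INCLUDING.getBytes(StandardCharsets.UTF_8));

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);   // XInclude processing requires namespace awareness
        factory.setXIncludeAware(true);
        Document doc = factory.newDocumentBuilder().parse(main.toFile());

        NodeList nodes = doc.getElementsByTagName(tagName);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // node4 and node6 follow the empty <node3/> and lose their text when the bug is present.
        for (String tag : new String[] {"node4", "node6"}) {
            String text = textOf(tag);
            boolean preserved = text != null && !text.isEmpty();
            System.out.println(tag + ": " + (preserved ? "text preserved" : "TEXT LOST"));
        }
    }
}
```

On a release that contains the fix, both elements should report their text as preserved; on an affected release the same check would be expected to report lost text, matching the actual output shown in this report.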

CUSTOMER SUBMITTED WORKAROUND :
All empty tags need to be converted into explicit start/end tag pairs to make parsing work, e.g.
convert <node3/> to <node3></node3>

However, if the XML file is generated on the fly, the issue remains.
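
For XML that is generated on the fly, the conversion could in principle be applied to the text before it is written to the included file. The following is only a rough sketch of that idea: the class EmptyTagExpander and its expand method are hypothetical names, the regular expression is a naive approximation that does not account for comments, CDATA sections, or processing instructions, and it has not been verified against the affected release.

```java
import java.util.regex.Pattern;

public class EmptyTagExpander {

    // Matches an empty-element tag such as <node3/> or <a href="x/y" />.
    // Group 1: element name. Group 2: the attribute list (quoted values only).
    private static final Pattern EMPTY_TAG = Pattern.compile(
        "<([A-Za-z_][\\w.:-]*)((?:\\s+[\\w.:-]+\\s*=\\s*(?:\"[^\"]*\"|'[^']*'))*)\\s*/>");

    // Rewrites every empty-element tag as an explicit start/end tag pair.
    static String expand(String xml) {
        return EMPTY_TAG.matcher(xml).replaceAll("<$1$2></$1>");
    }

    public static void main(String[] args) {
        // prints: <N1><node3></node3><node4>Node4 Value</node4></N1>
        System.out.println(expand("<N1><node3/><node4>Node4 Value</node4></N1>"));
    }
}
```

A text-level rewrite like this is fragile compared with upgrading to a release that contains the fix, so it is best treated as a stopgap.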

Release Regression From : 5.0u17
The above release value was the last known release where this bug was not reproducible. Since then there has been a regression.

Comments
EVALUATION Fixed in the jaxp 1.4 repository. The fix is available in the latest build from jaxp.dev.java.net, or in a future JDK update release.
15-06-2009

EVALUATION The bug exists in the jaxp 1.4/jdk6 codebase as indicated in the report. The problem is in the xinclude code that handles the parsing of included xml where a flag is set when empty elements are encountered.
15-06-2009