JDK-8262285 : XML Transform using indent creates empty (whitespaces) lines
  • Type: Bug
  • Component: xml
  • Sub-Component: javax.xml.transform
  • Affected Version: 11,15,17
  • Priority: P3
  • Status: Resolved
  • Resolution: Not an Issue
  • Submitted: 2021-02-18
  • Updated: 2021-06-17
  • Resolved: 2021-06-17
Related Reports
Duplicate :  
Relates :  
Description
ADDITIONAL SYSTEM INFORMATION :
Tested on macOS (but I guess it applies to all OSes).

Tested with:
Oracle 1.8.0_281-b09 - OK
AdoptOpenJDK 1.8.0_282-b08 - OK
OpenJDK 9.0.4+11 - FAIL
OpenJDK 10.0.2+13 - FAIL
OpenJDK 11.0.2+9 - FAIL
AdoptOpenJDK 11.0.10+9 - FAIL
OpenJDK 12.0.2+10 - FAIL
OpenJDK 13.0.2+8 - FAIL
OpenJDK 14.0.2+12-46 - FAIL
OpenJDK 15.0.2+7-27 - FAIL
OpenJDK 16+36-2231 - FAIL
OpenJDK 17-ea+9-653 - FAIL

A DESCRIPTION OF THE PROBLEM :
There is a change in the behavior of the XML Transformer indentation that we noticed while trying to upgrade from JDK 8 to 11.

It seems that after a transformation is applied, the Text Nodes containing only whitespace are put on a separate line, producing a result where a "blank" line is interleaved between the lines holding Element Nodes.
This "blank" line contains the characters used for indentation of the node following it plus a newline.

This is noticeable even without making modifications to the input XML before the transformation.
If another transformation is executed over the output, a new "blank" line is added to the ones previously inserted between the actual Element Nodes.

There are multiple reports of this behavior online (for example https://stackoverflow.com/q/58478632), but I couldn't find anything in the bugs DB. I think the closest related ones could be:
https://bugs.openjdk.java.net/browse/JDK-8230083
https://bugs.openjdk.java.net/browse/JDK-8223291

REGRESSION : Last worked in version 8u281

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Compile and run the provided example with JDK 8 and JDK 11 (or any other 9+).
Note the new lines containing only whitespace (both the raw output and a version displaying spaces as '·' and tabs as '\t' are provided to make the issue easier to see).

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
$ rm -f *.class; ./oracle/jdk1.8.0_281/Contents/Home/bin/javac XmlBugExample.java; ./oracle/jdk1.8.0_281/Contents/Home/bin/java XmlBugExample

###### JAVA VERSION: Oracle Corporation 1.8.0_281-b09

###### RESULT:

<?xml version="1.0" encoding="UTF-8"?><users>
	<!-- pre-existing entry BEGIN -->
		    <user> <!-- a user -->
		        <name>A name</name>
		        <email>An email</email>
		    </user>
    <!-- pre-existing entry END -->
</users>


###### RESULT SHOWING WHITESPACES:

<?xml·version="1.0"·encoding="UTF-8"?><users>
\t<!--·pre-existing·entry·BEGIN·-->
\t\t····<user>·<!--·a·user·-->
\t\t········<name>A·name</name>
\t\t········<email>An·email</email>
\t\t····</user>
····<!--·pre-existing·entry·END·-->
</users>
ACTUAL -
$ rm -f *.class; ./adoptopenjdk/jdk-11.0.10+9/Contents/Home/bin/javac XmlBugExample.java; ./adoptopenjdk/jdk-11.0.10+9/Contents/Home/bin/java XmlBugExample

###### JAVA VERSION: AdoptOpenJDK 11.0.10+9

###### RESULT:

<?xml version="1.0" encoding="UTF-8"?><users>

    <!-- pre-existing entry BEGIN -->

    <user>

        <!-- a user -->

        <name>A name</name>

        <email>An email</email>

    </user>

    <!-- pre-existing entry END -->

</users>


###### RESULT SHOWING WHITESPACES:

<?xml·version="1.0"·encoding="UTF-8"?><users>
····\t
····<!--·pre-existing·entry·BEGIN·-->
····\t\t····
····<user>
·········
········<!--·a·user·-->
········\t\t········
········<name>A·name</name>
········\t\t········
········<email>An·email</email>
········\t\t····
····</user>
········
····<!--·pre-existing·entry·END·-->
····
</users>

---------- BEGIN SOURCE ----------
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.transform.OutputKeys;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XmlBugExample {

    private static final String S_KEY_INDENT_AMOUNT = "{http://xml.apache.org/xalan}indent-amount";

    public static void main(String[] args) throws Exception {
        String inputXML = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n"
                + "<users>\n"
                + "\t<!-- pre-existing entry BEGIN -->\n"
                + "\t\t    <user> <!-- a user -->\n"
                + "\t\t        <name>A name</name>\n"
                + "\t\t        <email>An email</email>\n"
                + "\t\t    </user>\n"
                + "    <!-- pre-existing entry END -->\n"
                + "</users>";

        Source in = new StreamSource(new StringReader(inputXML));
        StreamResult out = new StreamResult(new StringWriter());

        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(S_KEY_INDENT_AMOUNT, "4");
        transformer.transform(in, out);

        String result = out.getWriter().toString();

        System.out.println("\n###### JAVA VERSION: "
                + System.getProperty("java.vendor") + " "
                + System.getProperty("java.runtime.version"));
        System.out.println("\n###### RESULT:\n");
        System.out.println(result);
        System.out.println("\n###### RESULT SHOWING WHITESPACES:\n");
        System.out.println(result.replaceAll("\\t", "\\\\t").replaceAll(" ", "\u00B7"));
    }

}
---------- END SOURCE ----------

FREQUENCY : always



Comments
See the Java 9 Release Notes (https://www.oracle.com/java/technologies/javase/9-notes.html) about changes in the Pretty-Print feature. Note that whitespace is significant without a schema. The processor treats a Text node as is, that is, as a Text node, regardless of its content. If what you want is a "reformat" (removing whitespace and adding new indentation as shown in the code), consider applying a stylesheet to remove the whitespace. Workaround: use a stylesheet to remove empty Text nodes.
17-06-2021
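
A minimal sketch of the stylesheet workaround mentioned in the comment above, assuming an identity transform combined with xsl:strip-space is acceptable for the document at hand. The class name StripWhitespaceExample, the method reindent and the sample input are hypothetical and are not part of the original report; the indent-amount key is the one used in the reporter's example.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, for illustration only.
class StripWhitespaceExample {

    // Identity transform; xsl:strip-space drops whitespace-only Text nodes
    // from the input tree before the result is serialized with indentation.
    private static final String STRIP_XSLT =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"xml\" indent=\"yes\"/>"
            + "<xsl:strip-space elements=\"*\"/>"
            + "<xsl:template match=\"@*|node()\">"
            + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    // Same indent-amount key as in the reporter's example.
    private static final String S_KEY_INDENT_AMOUNT =
            "{http://xml.apache.org/xalan}indent-amount";

    static String reindent(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STRIP_XSLT)));
        transformer.setOutputProperty(S_KEY_INDENT_AMOUNT, "4");

        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input with existing, irregular indentation.
        String inputXML = "<users>\n"
                + "\t\t    <user> <!-- a user -->\n"
                + "\t\t        <name>A name</name>\n"
                + "\t\t    </user>\n"
                + "</users>";
        System.out.println(reindent(inputXML));
    }
}
```

Note that xsl:strip-space only removes Text nodes consisting entirely of whitespace, so text such as "A name" is left untouched. Documents in which whitespace-only Text nodes are meaningful (mixed content, for instance) would need a narrower elements list than "*".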

The observations on Windows 10: JDK 8: Passed. JDK 11: Failed, a "blank" line is interleaved on each line. JDK 15: Failed. JDK 17ea+6: Failed.
24-02-2021