JDK-8261970 : Reusing org.w3c.dom.ls.LSSerializer produces unexpected result in 8u271
  • Type: Bug
  • Component: xml
  • Affected Version: 8u261,17
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2021-02-18
  • Updated: 2021-05-17
  • Resolved: 2021-02-25
JDK 8: 8u281 (Fixed)
Description
A customer reported that reusing an LSSerializer object no longer works after 8u251. In the example below, setting useThreadLocalSerializer=true reuses the LSSerializer across calls. With useThreadLocalSerializer=false, 8u251 and 8u271 produce the same result; with useThreadLocalSerializer=true, their results differ.

Code Sample:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.CharConversionException;
import java.io.IOException;
import java.io.StringWriter;
import java.util.Arrays;

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSException;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class CodificationTest {

    /**
    * True to reuse the LSSerializer (and, below, the LSOutput) across calls.
    */
    private static boolean useThreadLocalSerializer=true;
    private static boolean useThreadLocalLSOutput=true;

    public static void main(final String[] args) throws Throwable {
        System.setProperty("jdk.xml.isStandalone", "true");
        final String valueXML =
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><value>A��E��I��O��U��</value></root>";
        final Document doc = parseString(valueXML);
        final byte[] arrUtf8 = serialize(doc, "UTF-8", false, false);
        final byte[] arrWin1252 = serialize(doc, "WINDOWS-1252", false, false);
        String s = new String(arrUtf8);
        String s1 = new String(arrWin1252);
        System.out.println("byte[] (UTF-8)       : " + s);
        System.out.println("byte[] (WINDOWS-1252): " + s1);
    }

   
    static Document parseString(final String input) throws SAXException, CharConversionException {
        try {
            ByteArrayInputStream bais = null;
            Document newDoc = null;

            try {
                byte[] inputBinary = input.getBytes();

                if (!new String(inputBinary).equals(input)) {
                    inputBinary = input.getBytes("WINDOWS-1252");
                }
                
                if (!new String(inputBinary).equals(input)) {
                    inputBinary = input.getBytes("UTF-8");
                }
                bais = new ByteArrayInputStream(inputBinary);

                newDoc = getDocBuilder().parse(bais);
            } finally {
                if (bais != null) {
                    bais.close();
                }
            }

            return newDoc;
        } catch (final Exception e) {
            if (e instanceof CharConversionException) {
                throw (CharConversionException) e;
            } else if (e instanceof SAXParseException) {
                throw (SAXParseException) e;
            } else {
                System.out.println(input);
                e.printStackTrace();
            }
        }
        return null;
    }

  
    private static byte[] serialize(final Node node, final String encoding,
        final boolean prettyPrint, final boolean omitHead) throws Throwable {
        final DOMImplementationLS domImplLS = (DOMImplementationLS) getDocBuilder().getDOMImplementation();
        final LSSerializer lsSerializer;
        if (useThreadLocalSerializer) {
           lsSerializer = lsSerializerGenerator.get();
        } else {
           lsSerializer = domImplLS.createLSSerializer();
        }
        final DOMConfiguration domConfig = lsSerializer.getDomConfig();

        try {
            domConfig.setParameter("format-pretty-print", Boolean.valueOf(prettyPrint));
            domConfig.setParameter("xml-declaration", Boolean.valueOf(!omitHead));
        } catch (final DOMException de) {
            throw new Throwable(
                    "Error in XML", de);
        }

        final ByteArrayOutputStream outStream = new ByteArrayOutputStream();
        final LSOutput lsOutput;
        if (useThreadLocalLSOutput) {
           lsOutput = lsOutputGenerator.get();
        }else {
           lsOutput =  domImplLS.createLSOutput();
        }
        lsOutput.setEncoding(encoding);
        lsOutput.setByteStream(outStream);

       
        try {
            final Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            final StringWriter sw = new StringWriter();
            t.transform(new DOMSource(node), new StreamResult(sw));
        } catch (final Exception e) {
            // Failures are ignored here; the transform output (sw) is not used.
        }

        try {
            lsSerializer.write(node, lsOutput);
            return outStream.toByteArray();
        } catch (final LSException lse) {
            throw new Throwable(lse.getMessage(), lse);
        }
    }

    private static DocumentBuilder getDocBuilder() {
        final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        try {
            factory.setFeature("http://xml.org/sax/features/external-general-entities", false);

            factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            final DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setEntityResolver(new EntityResolverImpl());
            return builder;
        } catch (final ParserConfigurationException pce) {
            return null;
        }
    }

    private static class EntityResolverImpl implements EntityResolver {
       
        @Override
        public InputSource resolveEntity(final String publicId, final String systemId)
                    throws SAXException, IOException {
             return new InputSource(new ByteArrayInputStream(new byte[0]));
        }
    }

    private static ThreadLocal<LSSerializer> lsSerializerGenerator =
            new ThreadLocal<LSSerializer>() {

                @Override
                protected LSSerializer initialValue() {
                    final DOMImplementationLS domImplLS = (DOMImplementationLS) getDocBuilder().getDOMImplementation();
                    return domImplLS.createLSSerializer();
                }
    };
    
 
    private static ThreadLocal<LSOutput> lsOutputGenerator = new ThreadLocal<LSOutput>() {

        @Override
        protected LSOutput initialValue() {
            final DOMImplementationLS domImplLS = (DOMImplementationLS) getDocBuilder().getDOMImplementation();
            return domImplLS.createLSOutput();
        }
    };
}



Expected:

byte[] (UTF-8)       : <?xml version="1.0" encoding="UTF-8"?>
<root><value>A��E��I��O��U��</value></root>
byte[] (WINDOWS-1252): <?xml version="1.0" encoding="WINDOWS-1252"?>
<root><value>A���E���I���O���U���</value></root>

Actual:

byte[] (UTF-8)       : <?xml version="1.0" encoding="UTF-8"?>
<root><value>A��E��I��O��U��</value></root>
byte[] (WINDOWS-1252): <?xml version="1.0" encoding="UTF-8"?>
<root><value>A��E��I��O��U��</value></root>
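
The expected and actual outputs differ first in the encoding named in the XML declaration. As a minimal sketch (not part of the original report), the helper below extracts that declared encoding from serialized bytes so the two cases can be compared programmatically. The class name DeclaredEncoding and the method of() are hypothetical, and the approach assumes an ASCII-compatible encoding such as UTF-8 or WINDOWS-1252; it would not work for UTF-16 output.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original reproducer.
public class DeclaredEncoding {

    // Matches the encoding pseudo-attribute of an XML declaration at the start of the text.
    private static final Pattern ENCODING =
            Pattern.compile("^<\\?xml[^>]*?encoding=[\"']([^\"']+)[\"']");

    /** Returns the encoding named in the XML declaration, or null if there is none. */
    static String of(final byte[] serialized) {
        // ISO-8859-1 maps every byte to one char, so the ASCII-only declaration
        // stays readable regardless of how the document body is encoded.
        final int len = Math.min(serialized.length, 200);
        final String head = new String(serialized, 0, len, StandardCharsets.ISO_8859_1);
        final Matcher m = ENCODING.matcher(head);
        return m.find() ? m.group(1) : null;
    }

    public static void main(final String[] args) {
        final byte[] expected = "<?xml version=\"1.0\" encoding=\"WINDOWS-1252\"?>\n<root/>"
                .getBytes(StandardCharsets.ISO_8859_1);
        final byte[] actual = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root/>"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(of(expected)); // prints WINDOWS-1252
        System.out.println(of(actual));   // prints UTF-8
    }
}
```

In the reproducer, comparing the value returned for the second serialize() result against the requested "WINDOWS-1252" would flag the regression without relying on how the console renders the non-ASCII characters.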
Comments
With 8u271, the serializer emits UTF-8 even when WINDOWS-1252 is requested. This is a regression from the Xerces 2.12.0 update introduced in 8u261, and it also fails with the latest JDK. If a node has already been visited, the updated code does not process the requested encoding and instead uses the default or the previously used encoding.
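
Until a fixed release is in use, one possible workaround is to avoid reusing the serializer, since the report observes that useThreadLocalSerializer=false gives the same result on both versions. The sketch below creates a fresh LSSerializer and LSOutput for every call. The class name FreshSerializerPerCall is hypothetical, and this has not been verified against every affected build.

```java
import java.io.ByteArrayOutputStream;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;

// Hypothetical workaround sketch; not taken from the original report.
public class FreshSerializerPerCall {

    /** Serializes the node with a serializer and output created for this call only. */
    static byte[] serialize(final Node node, final String encoding) throws Exception {
        final DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        final DOMImplementationLS domImplLS = (DOMImplementationLS) builder.getDOMImplementation();

        // New instances each time, so no state from an earlier write() can leak in.
        final LSSerializer serializer = domImplLS.createLSSerializer();
        final LSOutput output = domImplLS.createLSOutput();

        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        output.setEncoding(encoding);
        output.setByteStream(bytes);
        serializer.write(node, output);
        return bytes.toByteArray();
    }

    public static void main(final String[] args) throws Exception {
        final Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        doc.appendChild(doc.createElement("root"));

        // Decode each result with the charset it was written in.
        System.out.println(new String(serialize(doc, "UTF-8"), "UTF-8"));
        System.out.println(new String(serialize(doc, "WINDOWS-1252"), "windows-1252"));
    }
}
```

The trade-off is the cost of constructing a serializer per call, which is what the thread-local reuse in the reproducer was meant to avoid.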
23-02-2021