JDK-8173111 : Excessive recursion in EventFilterSupport when filtering over large number of XML events can cause StackOverflow
  • Type: Bug
  • Component: xml
  • Sub-Component: javax.xml.stream
  • Affected Version: 8,9
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2017-01-10
  • Updated: 2017-02-09
  • Resolved: 2017-01-24
  • JDK 10: fixed in 10
  • JDK 9: fixed in 9 b155
Description
FULL PRODUCT VERSION :
java version "1.8.0_102"
Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)

ADDITIONAL OS VERSION INFORMATION :
Darwin AirKrish.local 16.3.0 Darwin Kernel Version 16.3.0: Thu Nov 17 20:23:58 PST 2016; root:xnu-3789.31.2~1/RELEASE_X86_64 x86_64

A DESCRIPTION OF THE PROBLEM :
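When streaming large XML files through an event filter, the EventFilterSupport#nextEvent method throws a StackOverflowError, because it calls itself recursively once for every consecutive event that the filter rejects. The proposed solution replaces the recursion with while loops.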

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Created a project on GitHub that reproduces the issue:

https://github.com/NaanProphet/java-xml-stax-recursion-bug

The simple JUnit does the following:

1) Create a large XML file ~ 100 MB
2) Implement an EventFilter
3) Create a new event-filtering XML reader: obtain a factory with XMLInputFactory xif = XMLInputFactory.newInstance(), then call xif.createFilteredReader
4) Read the file
5) Throws StackOverflowError
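
The steps above can be approximated without the 100 MB test file. The following is a minimal sketch, not the submitter's test: the class name FilterOverflowRepro, the generated in-memory document, and the reject-everything filter are all assumptions made for illustration. On an affected JDK, a large enough item count is expected to end in a StackOverflowError inside the single nextEvent() call; on a fixed JDK the reader should simply reach the end of input.

```java
import java.io.StringReader;
import java.util.NoSuchElementException;
import javax.xml.stream.EventFilter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical class name; not part of the submitter's project.
final class FilterOverflowRepro {

    /** Builds <root><item/>...<item/></root> with the given number of children. */
    static String buildXml(int items) {
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < items; i++) {
            sb.append("<item/>");
        }
        return sb.append("</root>").toString();
    }

    /**
     * Reads the whole document through a filter that rejects every event.
     * Returns true if the reader reached the end of input, false if the
     * nextEvent() call ended in a StackOverflowError.
     */
    static boolean drainRejectingEverything(int items) {
        EventFilter rejectAll = new EventFilter() {
            @Override
            public boolean accept(XMLEvent event) {
                return false;
            }
        };
        try {
            XMLInputFactory xif = XMLInputFactory.newInstance();
            XMLEventReader plain = xif.createXMLEventReader(new StringReader(buildXml(items)));
            XMLEventReader filtered = xif.createFilteredReader(plain, rejectAll);
            // One call must skip every event, since none is ever accepted.
            filtered.nextEvent();
            throw new IllegalStateException("Filter unexpectedly accepted an event");
        } catch (NoSuchElementException endOfInput) {
            return true;
        } catch (StackOverflowError overflow) {
            return false;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read XML", e);
        }
    }

    public static void main(String[] args) {
        // 200000 items produce roughly 400000 consecutive rejected events.
        System.out.println("Reached end of input: " + drainRejectingEverything(200000));
    }
}
```

The item count of 200000 is an arbitrary choice; the depth at which the stack overflows depends on the JVM's thread stack size.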

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Expected the XML reader to skip all events successfully without throwing an exception
ACTUAL -
JVM crashes with StackOverflowError

ERROR MESSAGES/STACK TRACES THAT OCCUR :
java.lang.StackOverflowError
	at com.sun.org.apache.xerces.internal.util.SymbolTable.addSymbol(SymbolTable.java:216)
	at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanQName(XMLEntityScanner.java:853)
	at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:193)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2784)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
	at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
	at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:553)
	at com.sun.xml.internal.stream.XMLEventReaderImpl.nextEvent(XMLEventReaderImpl.java:83)
	at javax.xml.stream.util.EventReaderDelegate.nextEvent(EventReaderDelegate.java:89)
	at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:69)
	at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
	at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
	at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
	at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
	at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
	at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
	at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
	at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
	at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
	at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
	at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
	at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
	at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
	at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
	<line repeated 1000 more times>


Process finished with exit code 255

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
Full program with XML test files available here: https://github.com/NaanProphet/java-xml-stax-recursion-bug




package com.bitwiseninja.staxbug;

import com.igormaznitsa.jute.annotations.JUteTest;
import org.junit.Assert;
import org.junit.Test;

import javax.xml.stream.EventFilter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import java.io.*;
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.zip.GZIPInputStream;

/**
 * <p>
 *     Unit tests using a 100MB XML file. Uses JUte to kickoff tests in separate JVMs.
 *     Should be run from Maven so that classloader JVM options are respected
 * </p>
 * <p>
 *     See <code>src/test/resources/dblp.xsd</code> for the XML structure
 * </p>
 */
public class SimpleStaxTest {

    private static final String INPUT_XML_FILE = "target/test-classes/dblp.xml.gz";
    private static final String TAG_TO_FILTER = "dblp";
    private static final String JVM_XBOOTCLASSPATH_KEY = "-Xbootclasspath/p:";
    private static final String MAVEN_GENERATED_PATCH_LOCATION = "./target/classes/";
    private static final String JVM_ARG_PATCH = JVM_XBOOTCLASSPATH_KEY + MAVEN_GENERATED_PATCH_LOCATION;
    private static final String JVM_MAX_MEMORY_SIZE = "-Xmx256m";
    private static final String PATCH_FILE_PACKAGE = "com/sun/xml/internal/stream/";
    private static final String PATCHED_CLASS = "EventFilterSupport.class";


    /**
     * Stock JVM test for nextEvent. Throws StackOverflowError because algorithm uses recursion
     */
    @Test(expected = StackOverflowError.class)
    @JUteTest(jvmOpts = {JVM_MAX_MEMORY_SIZE})
    public void testNextEventRegularJdk() throws Exception {
        doTest(visitNextEvent);
    }

    /**
     * Patched JVM test for nextEvent. Uses the compiled .class file from Maven during the <code>compile</code>
     * phase. Recursion is replaced with a while loop
     */
    @Test
    @JUteTest(jvmOpts = {JVM_ARG_PATCH}, printConsole = true)
    public void testNextEventPatchedJdk() throws Exception {
        checkPatchedJvmSetup();
        doTest(visitNextEvent);
    }

    private void checkPatchedJvmSetup() {
        // pre-check 1
        RuntimeMXBean runtimeMxBean = ManagementFactory.getRuntimeMXBean();
        List<String> arguments = runtimeMxBean.getInputArguments();
        Assert.assertTrue("Test is not running with patched JVM argument! " +
                "Please note this JUnit cannot be run from the IDE and must be run through Maven " +
                "in order for the JUnit to run with proper JVM arguments. " +
                "Current JVM arguments: " + arguments, jvmContainsPatch(arguments));

        // pre-check 2
        File patchFolder = new File(MAVEN_GENERATED_PATCH_LOCATION + PATCH_FILE_PACKAGE);
        File patchFile = new File(patchFolder, PATCHED_CLASS);
        Assert.assertTrue("Could not find patch file! " +
                "Please note this JUnit cannot be run from the IDE and must " +
                "be run through Maven in order for the JUnit to run with proper JVM arguments. " +
                "Expected patch file location: " + patchFile.getAbsolutePath(), patchFile.exists());
    }

    private boolean jvmContainsPatch(List<String> arguments) {
        boolean jvmContainsPatch = false;
        for (String jvmArg : arguments) {
            if (JVM_ARG_PATCH.equals(jvmArg)) {
                jvmContainsPatch = true;
                break;
            }
        }
        return jvmContainsPatch;
    }

    /**
     * Stress test that attempts to filter everything below the outermost XML tag
     *
     */
    private void doTest(Consumer<XMLEventReader> xmlEventReaderConsumer) throws IOException, XMLStreamException {
        XMLEventReader filteringEventReader = createXmlReader();

        System.out.println("Attempting to read entire XML file, filtering out tag " + TAG_TO_FILTER);
        visitAllEvents(filteringEventReader, xmlEventReaderConsumer);

        System.out.println("Great success, k bye");
    }

    private Consumer<XMLEventReader> visitNextEvent = new Consumer<XMLEventReader>() {
        @Override
        public void accept(XMLEventReader xmlEventReader) {
            try {
                xmlEventReader.nextEvent();
            } catch (XMLStreamException e) {
                throw new IllegalStateException("Could not read next XML event!", e);
            }
        }
    };

    private XMLEventReader createXmlReader() throws IOException, XMLStreamException {
        File xmlFile = new File(INPUT_XML_FILE);
        System.out.println("Reading XML file for test: " + xmlFile.getAbsolutePath());
        InputStream inputStream = new GZIPInputStream(new FileInputStream(xmlFile));
        XMLInputFactory xif = XMLInputFactory.newInstance();
        XMLEventReader simpleEventReader = xif.createXMLEventReader(inputStream);
        XmlTagEventFilter filter = new XmlTagEventFilter();
        filter.setTagToFilter(TAG_TO_FILTER);
        return xif.createFilteredReader(simpleEventReader, filter);
    }

    private static void visitAllEvents(XMLEventReader filteringEventReader, Consumer<XMLEventReader> xmlEventReaderConsumer) throws XMLStreamException {
        while (true) {
            try {
                // i.e. calls either nextEvent or nextTag
                xmlEventReaderConsumer.accept(filteringEventReader);
            } catch (NoSuchElementException e) {
                break;
            }
        }
    }

    /**
     * XML Event Filter that removes a specific tag/block by name
     * <p>
     * Note: this class is stateful and must be instantiated for each unmarshaller
     */
    public static class XmlTagEventFilter implements EventFilter {

        private String tagToFilter;

        private AtomicBoolean filterOn = new AtomicBoolean();

        private AtomicLong numEventsFiltered = new AtomicLong();

        @Override
        public boolean accept(XMLEvent event) {
            if (event.isEndElement()) {
                String endTagName = event.asEndElement().getName().getLocalPart();
                if (tagToFilter.equals(endTagName)) {
                    // start accepting the next event
                    System.out.println("Turning off filter! End tag is " + endTagName);
                    System.out.println("Number of events filtered: " + numEventsFiltered.getAndSet(0L));
                    filterOn.set(false);
                    // but filter this one
                    return false;
                }
            }

            if (event.isStartElement()) {
                String startTagName = event.asStartElement().getName().getLocalPart();
                if (tagToFilter.equals(startTagName)) {
                    // exclude this tag and all tags inside
                    System.out.println("Turning on filter! Start tag is: " + startTagName);
                    filterOn.set(true);
                    return false;
                }
            }

            // if filter is off, accept the tag
            if (filterOn.get()) {
                numEventsFiltered.incrementAndGet();
            }
            return !filterOn.get();
        }

        /**
         * @param tagToFilter the name of the XML tag to filter
         */
        public void setTagToFilter(String tagToFilter) {
            this.tagToFilter = tagToFilter;
        }
    }

    /**
     * Java 8-esque static interface, since we're compiling to Java 6
     *
     * @param <T>
     */
    public interface Consumer <T>{
        void accept(T param);
    }
}
---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Manually fixed com.sun.xml.internal.stream.EventFilterSupport and used the following JVM flag on the command line to override the JDK's class with the patched class file:

-Xbootclasspath/p:
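
Where patching the boot classpath is not practical, application code could in principle do the filtering itself instead of calling createFilteredReader. The following is a minimal sketch under that assumption, not something the submitter did: LoopingFilteredReader, acceptOnlyStart and countAccepted are hypothetical names. It extends javax.xml.stream.util.EventReaderDelegate and skips rejected events in a loop. Only nextEvent() is filtered in this sketch; hasNext(), peek(), next() and nextTag() still delegate to the underlying reader unfiltered.

```java
import java.io.StringReader;
import java.util.NoSuchElementException;
import javax.xml.stream.EventFilter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import javax.xml.stream.util.EventReaderDelegate;

// Hypothetical application-level reader; not a JDK class.
final class LoopingFilteredReader extends EventReaderDelegate {

    private final EventFilter filter;

    LoopingFilteredReader(XMLEventReader reader, EventFilter filter) {
        super(reader);
        this.filter = filter;
    }

    /** Skips rejected events in a loop, so stack depth stays constant. */
    @Override
    public XMLEvent nextEvent() throws XMLStreamException {
        while (super.hasNext()) {
            XMLEvent event = super.nextEvent();
            if (filter.accept(event)) {
                return event;
            }
        }
        throw new NoSuchElementException();
    }

    /** Example filter: accepts only start elements with the given local name. */
    static EventFilter acceptOnlyStart(final String localName) {
        return new EventFilter() {
            @Override
            public boolean accept(XMLEvent event) {
                return event.isStartElement()
                        && localName.equals(event.asStartElement().getName().getLocalPart());
            }
        };
    }

    /** Counts the events that nextEvent() returns before the input is exhausted. */
    static int countAccepted(String xml, EventFilter filter) {
        try {
            XMLEventReader plain =
                    XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
            XMLEventReader filtered = new LoopingFilteredReader(plain, filter);
            int accepted = 0;
            while (true) {
                try {
                    filtered.nextEvent();
                    accepted++;
                } catch (NoSuchElementException endOfInput) {
                    return accepted;
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><skip/><skip/><keep/></root>";
        System.out.println(countAccepted(xml, acceptOnlyStart("keep")));
    }
}
```

The read loop mirrors visitAllEvents in the test source, which also relies on NoSuchElementException to detect the end of input.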


Comments
Review request sent: http://cr.openjdk.java.net/~dfuchs/webrev_8173111/webrev.00/
23-01-2017

This looks like a design issue in com.sun.xml.internal.stream.EventFilterSupport where there is a recursive call that could be avoided in several places. For instance:

    public XMLEvent nextEvent() throws XMLStreamException {
        if (super.hasNext()) {
            // get the next event by calling XMLEventReader
            XMLEvent event = super.nextEvent();
            // if this filter accepts this event then return this event
            if (fEventFilter.accept(event)) {
                return event;
            } else {
                return nextEvent();
            }
        } else {
            throw new NoSuchElementException();
        }
    } // nextEvent()

could probably be rewritten as:

    public XMLEvent nextEvent() throws XMLStreamException {
        while (super.hasNext()) {
            // get the next event by calling XMLEventReader
            XMLEvent event = super.nextEvent();
            // if this filter accepts this event then return this event
            if (fEventFilter.accept(event)) {
                return event;
            }
        }
        throw new NoSuchElementException();
    } // nextEvent()
20-01-2017
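
The difference between the two shapes in the comment above can be seen without any XML. The following sketch uses a plain Iterator as a stand-in for the event reader and "is even" as a stand-in for the filter; SkipDepthDemo and its methods are hypothetical names chosen for illustration. The recursive variant adds one stack frame per consecutive rejected value, while the loop variant runs at constant depth.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical demo class; the iterator stands in for the XML event reader.
final class SkipDepthDemo {

    /** Recursive skip: one additional stack frame per consecutive rejected value. */
    static int nextEvenRecursive(Iterator<Integer> source) {
        if (source.hasNext()) {
            int value = source.next();
            return (value % 2 == 0) ? value : nextEvenRecursive(source);
        }
        throw new NoSuchElementException();
    }

    /** Iterative skip: constant stack depth regardless of how many values are rejected. */
    static int nextEvenIterative(Iterator<Integer> source) {
        while (source.hasNext()) {
            int value = source.next();
            if (value % 2 == 0) {
                return value;
            }
        }
        throw new NoSuchElementException();
    }

    /** Yields the given number of odd values (1), followed by a single even value (42). */
    static Iterator<Integer> oddsThenEven(final int odds) {
        return new Iterator<Integer>() {
            private int produced = 0;

            @Override
            public boolean hasNext() {
                return produced <= odds;
            }

            @Override
            public Integer next() {
                if (produced > odds) {
                    throw new NoSuchElementException();
                }
                return (produced++ < odds) ? 1 : 42;
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    public static void main(String[] args) {
        // The loop version handles a long run of rejected values.
        System.out.println(nextEvenIterative(oddsThenEven(1000000)));
        // The recursive version is only exercised on a short run here;
        // a long run is expected to exhaust the thread stack.
        System.out.println(nextEvenRecursive(oddsThenEven(10)));
    }
}
```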

To reproduce the issue, extract the attached test case and run using the command mvn test. Following are the results:
JDK 8u101/102 - Fail
JDK 9ea +148 - Fail

Following is the extract of the StackOverflowError generated:
[INFO] ?Error?
[ERROR] null
[ERROR] java.lang.StackOverflowError
[ERROR] at com.sun.org.apache.xerces.internal.utils.XMLLimitAnalyzer.addValue(XMLLimitAnalyzer.java:119)
[ERROR] at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.checkLimit(XMLEntityScanner.java:976)
[ERROR] at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanQName(XMLEntityScanner.java:871)
[ERROR] at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:193)
[ERROR] at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2784)
[ERROR] at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
[ERROR] at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
[ERROR] at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:553)
[ERROR] at com.sun.xml.internal.stream.XMLEventReaderImpl.nextEvent(XMLEventReaderImpl.java:83)
[ERROR] at javax.xml.stream.util.EventReaderDelegate.nextEvent(EventReaderDelegate.java:89)
[ERROR] at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:69)
[ERROR] at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
[ERROR] at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
[ERROR] at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
[ERROR] at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
[ERROR] at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
[ERROR] at com.sun.xml.internal.stream.EventFilterSupport.nextEvent(EventFilterSupport.java:76)
20-01-2017

From submitter: Inside SimpleStaxTest.java there are actually two tests:
  • testNextEventRegularJdk expects a StackOverflowError
  • testNextEventPatchedJdk expects no error, when using the patched fix
The log shows that both tests pass, which suggests the error + fix were both recreated successfully on the Windows and Ubuntu machines you tried. Apologies if that wasn't very straightforward.
20-01-2017

To submitter: I could not reproduce the issue with the test case provided at https://github.com/NaanProphet/java-xml-stax-recursion-bug . I tried on Windows 7 and Ubuntu 14.0.4. Attached is the output when run with JDK 8u102 on Ubuntu. Do you see the issue only on Mac OS, or on all operating systems?
20-01-2017