JDK-6946312 : XML parser omits characters callback to ContentHandler since 6u18
  • Type: Bug
  • Component: xml
  • Sub-Component: org.xml.sax
  • Affected Version: 6u18
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2010-04-22
  • Updated: 2012-04-25
  • Resolved: 2010-04-28
Other              JDK 6            JDK 7
1.4.0 1.4 Fixed    6u20-rev Fixed   7 Fixed
Description
Starting with update 18 of JDK 1.6.0, the bundled XML Parser is omitting a characters callback to the content handler in the following scenario:

-Using XMLReader
-Set a javax.xml.validation.Schema on the SAXParserFactory that creates the XMLReader
-The schema contains an xs:any with processContents="skip"
-Parse an XML document in which the element matched by the xs:any contains a text node
-The characters method on the ContentHandler is not invoked for the text contained in the element

If the Schema isn't set on the SAXParserFactory, the characters method is called as expected. If processContents="skip" is removed, the characters method is also called as expected. Finally, in any JDK version before jdk1.6.0_18 the characters method is always called as expected.
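
The scenario above can be sketched roughly as follows. This is not the attached test/Test class: the class name CharactersRepro, the inline schema, and the sample document are assumptions made up for illustration, and the input is read from a String rather than an InputStream. On an affected release (6u18 through 6u20) the returned list would be expected to lack the "text" entry.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical reproduction sketch; not the testcase attached to this CR.
class CharactersRepro {

    // Assumed schema: <root> accepts one child element of any name, unvalidated.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='root'><xs:complexType><xs:sequence>"
      + "<xs:any processContents='skip'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Assumed document: <child> is the element matched by the xs:any.
    static final String XML = "<root>\n<child>text</child>\n</root>";

    /** Parses XML with the schema set and returns every characters() payload. */
    static List<String> collectCharacters() {
        final List<String> seen = new ArrayList<String>();
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));

            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setSchema(schema); // omitting this line avoids the bug

            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    seen.add(new String(ch, start, length));
                }
            });
            reader.parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        for (String s : collectCharacters()) {
            System.out.println("Characters called: " + s);
        }
    }
}
```

Whether whitespace-only entries appear alongside "text" depends on the release, so the sketch only collects the callbacks and leaves the comparison between JDK versions to the reader.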

A testcase has been attached.

How to reproduce:
Extract the attachment of this CR and go to the folder called xmlparser.characters.issue.
Run java test/Test with the JDK of your choice. 6u17 looks fine, but with 6u18 through 6u20 the issue is reproducible.

$ java -showversion test/Test
java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
Java HotSpot(TM) Server VM (build 14.3-b01, mixed mode)

Parse InputStream:
Characters called:

Characters called: text
Characters called:

$ java -showversion test/Test
java version "1.6.0_18"
Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
Java HotSpot(TM) Server VM (build 16.0-b13, mixed mode)

Parse InputStream:

$ java -showversion test/Test
java version "1.6.0_19"
Java(TM) SE Runtime Environment (build 1.6.0_19-b04)
Java HotSpot(TM) Server VM (build 16.2-b04, mixed mode)

Parse InputStream:

$ java -showversion test/Test
java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
Java HotSpot(TM) Server VM (build 16.3-b01, mixed mode)

Parse InputStream:

Comments
EVALUATION

This looks like a regression introduced by the fixes for 6564400 (6545684). To improve the performance of the 'ignore whitespace' patch, a mechanism was added to avoid the complex tests performed in handleCharacters(). However, the flag that indicates whether the characters are ignorable whitespace can sometimes fail to reset, which may cause a subsequent character event to be skipped.

Note that after the patch for 6564400 (6545684), ignorable whitespace is not reported unless the feature added in that patch is enabled, as demonstrated below:

saxParserFactory.setFeature("http://java.sun.com/xml/schema/features/report-ignored-element-content-whitespace", true);

Without that feature enabled, the testcase submitted in this bug report reports only the "text" character event, that is, it prints "Characters called: text". However, ignorableWhitespace will still be invoked.
26-04-2010
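
As a rough illustration of the evaluation above, the sketch below records both the characters and ignorableWhitespace callbacks, with and without requesting the feature. The class names WhitespaceFeatureDemo and Recorder, the inline schema, and the sample document are hypothetical. The feature URI is the one quoted in the evaluation; other parsers or JDK releases may not recognize it, so the sketch catches the resulting exceptions and parses without it.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch; names, schema, and document are illustrative only.
class WhitespaceFeatureDemo {

    // Feature URI as quoted in the evaluation.
    static final String FEATURE =
        "http://java.sun.com/xml/schema/features/report-ignored-element-content-whitespace";

    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='root'><xs:complexType><xs:sequence>"
      + "<xs:any processContents='skip'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    static final String XML = "<root>\n<child>text</child>\n</root>";

    /** Accumulates what each of the two callbacks receives. */
    static class Recorder extends DefaultHandler {
        final StringBuilder characters = new StringBuilder();
        final StringBuilder ignorable = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            characters.append(ch, start, length);
        }

        @Override
        public void ignorableWhitespace(char[] ch, int start, int length) {
            ignorable.append(ch, start, length);
        }
    }

    static Recorder parse(boolean requestFeature) {
        Recorder recorder = new Recorder();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setSchema(SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD))));
            if (requestFeature) {
                try {
                    factory.setFeature(FEATURE, true);
                } catch (SAXNotRecognizedException
                        | SAXNotSupportedException
                        | ParserConfigurationException e) {
                    // Not every parser knows this feature; continue without it.
                    System.err.println("Feature unavailable: " + e.getMessage());
                }
            }
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(recorder);
            reader.parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return recorder;
    }

    public static void main(String[] args) {
        for (boolean flag : new boolean[] {false, true}) {
            Recorder r = parse(flag);
            System.out.println("feature requested=" + flag
                + " characters length=" + r.characters.length()
                + " ignorableWhitespace length=" + r.ignorable.length());
        }
    }
}
```

Comparing the two lengths across runs shows through which callback, if any, a given release delivers the whitespace around the child element.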