JDK-8038087 : javax.xml.parsers.DocumentBuilder.parse(java.io.InputStream) reads incorrect characters at specific positions
  • Type: Bug
  • Component: xml
  • Sub-Component: javax.xml.parsers
  • Priority: P3
  • Status: Resolved
  • Resolution: Duplicate
  • Submitted: 2014-03-21
  • Updated: 2014-04-08
  • Resolved: 2014-04-08
Related Reports
Duplicate :  
Description
javax.xml.parsers.DocumentBuilder.parse(java.io.InputStream) reads incorrect characters
at specific positions in an XML input file.

The behaviour is strictly reproducible.
The behaviour is not reproducible with Java SE 5.0.

The issue can be demonstrated on record 1021 of a certain XML file:

Java SE 5.0 reads the correct data item:
% /jdk1.5.0_09/bin/java BugReproductor 1020
TITAL
%

Java SE 8 reads incorrectly:
% /jdk1.8.0/bin/java BugReproductor 1020
ZSTAL
%

% /jdk1.8.0/bin/java -version
java version "1.8.0"
Java(TM) SE Runtime Environment (build 1.8.0-b132)
Java HotSpot(TM) 64-Bit Server VM (build 25.0-b70, mixed mode)
%

The incorrect behaviour is reproducible with Java SE 7:
% /jdk1.7.0_51/bin/java BugReproductor 1020
ZSTAL
%

The incorrect behaviour is reproducible with Java SE 6:
% /jdk1.6.0_60/bin/java BugReproductor 1020
ZSTAL
%
Comments
The following bug (XML 1.1 parser) may be related to the observed failures: https://bugs.openjdk.java.net/browse/JDK-8027359. It has been fixed across all JDKs; will try to run the attached test case on the fixed binaries.
08-04-2014

There is a workaround: using the classes 'java.io.InputStreamReader' and 'org.xml.sax.InputSource' and setting the encoding explicitly prevents the issue from happening:

% diff src.1/XMLParser.java src.2/XMLParser.java
4a5,9
> // added 20140206
> import java.io.Reader;
> import java.io.InputStreamReader;
> import org.xml.sax.InputSource;
> // end added 20140206
20c25,31
< return documentBuilder.parse(data);
---
> // added 20140206
> Reader reader = new InputStreamReader(data,"UTF-8");
> InputSource is = new InputSource(reader);
> is.setEncoding("UTF-8");
> return documentBuilder.parse(is);
> // end added 20140206
> // return documentBuilder.parse(data);
%

With the workaround applied, all tested releases read the correct data item:
% /jdk1.5.0_09/bin/java BugReproductor 1020
TITAL
% /jdk1.8.0/bin/java BugReproductor 1020
TITAL
% /jdk1.7.0_51/bin/java BugReproductor 1020
TITAL
% /jdk1.6.0_60/bin/java BugReproductor 1020
TITAL
%
21-03-2014
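
For reference, the workaround from the comment dated 21-03-2014 can be written as a self-contained class. This is only a minimal sketch: the class name WorkaroundSketch, the method name parseUtf8 and the inline sample document are hypothetical and are not taken from the attached test case, and it assumes the input really is UTF-8 encoded. It decodes the stream with an explicit InputStreamReader and passes an InputSource to the parser instead of the raw InputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical illustration of the workaround; not the reporter's XMLParser class.
public class WorkaroundSketch {

    // Parses the stream through a Reader with an explicit charset, so the
    // byte-to-character decoding is done by InputStreamReader, not the parser.
    // Assumption: the input is UTF-8 encoded.
    static Document parseUtf8(InputStream data) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Reader reader = new InputStreamReader(data, StandardCharsets.UTF_8);
        InputSource source = new InputSource(reader);
        source.setEncoding("UTF-8");
        return builder.parse(source);
    }

    public static void main(String[] args) throws Exception {
        // Tiny made-up document; the real input file is not part of this report.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<records><record>TITAL</record></records>";
        InputStream in =
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        Document doc = parseUtf8(in);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

The sample document is far too small to trigger the reported misread; the sketch only shows the shape of the call sequence that avoided the problem in the tests above.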