JDK-7014220 : UTF lexical presentation of some new digits accepted by XML document validator with JAXP 1.4.5
  • Type: Bug
  • Component: xml
  • Sub-Component: javax.xml.validation
  • Affected Version: 1.4.0,7
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2011-01-24
  • Updated: 2012-04-25
  • Resolved: 2011-01-27
Fix versions:
  • Other: 1.4.0 1.4 Fixed
  • JDK 7: 7 Fixed
Description
See CR 6971190, where a similar problem is described.
With the same schema as in CR 6971190, the following XML document is accepted as valid since JDK 7 b126:

<?xml version="1.0"?>
<doc  xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
    xsi:noNamespaceSchemaLocation='reS51.xsd' >

<!--
base='string', pattern='\d', value='#x0BE6;', type='invalid', RULE='37'
-->

      <elem att='&#x0BE6;'/>

 </doc>
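
A minimal sketch of how the check could be reproduced with javax.xml.validation. The schema reS51.xsd is not included in this report, so the inline XSD below is a hypothetical stand-in built only from the test comment above (base='string', pattern='\d'); the class and method names are likewise illustrative. Whether '&#x0BE6;' is accepted depends on the JAXP build in use, so no result is claimed for it here.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class DigitPatternCheck {
    // Hypothetical stand-in for reS51.xsd (an assumption, not the real test schema):
    // attribute 'att' is an xs:string restricted by the pattern \d.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='doc'><xs:complexType><xs:sequence>"
      + "<xs:element name='elem'><xs:complexType>"
      + "<xs:attribute name='att'><xs:simpleType>"
      + "<xs:restriction base='xs:string'><xs:pattern value='\\d'/></xs:restriction>"
      + "</xs:simpleType></xs:attribute>"
      + "</xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns true if <elem att='attValue'/> validates against the schema above.
    static boolean isValid(String attValue) throws Exception {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        String xml = "<doc><elem att='" + attValue + "'/></doc>";
        try {
            // With no ErrorHandler set, validation errors surface as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // The last four values are the character references named in this report.
        String[] values = {"5", "x", "&#x0BE6;", "&#x0BF0;", "&#x0F2A;", "&#x1372;"};
        for (String value : values) {
            System.out.println(value + " -> " + (isValid(value) ? "accepted" : "rejected"));
        }
    }
}
```

The ASCII digit and the letter serve as controls: a validator that handles \d at all should accept the first and reject the second, regardless of how it treats the newer digits.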

The character "&#x0BE6;" (U+0BE6) is TAMIL DIGIT ZERO according to Unicode 6.0 (http://www.unicode.org/Public/6.0.0/ucd/UnicodeData.txt), support for which was integrated into JDK 7 in b121 (see CR 6959267).

The issue also exists for the following symbols:
x0BF0 - TAMIL NUMBER TEN
x0F2A - TIBETAN DIGIT HALF ONE
x1372 - ETHIOPIC NUMBER TEN
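
As a rough illustration, the general category that the running JDK assigns to each of these code points can be inspected with java.lang.Character. In XML Schema regular expressions, \d is shorthand for \p{Nd}, so only characters the validator treats as decimal digits should match. The class and method names below are illustrative, and the output reflects the Unicode tables of whichever JDK runs the code, not the tables used internally by the JAXP validator.

```java
public class NumericCategories {
    // Maps Character.getType() to the short Unicode general-category name
    // for the categories relevant to this report.
    static String category(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.DECIMAL_DIGIT_NUMBER: return "Nd";
            case Character.LETTER_NUMBER:        return "Nl";
            case Character.OTHER_NUMBER:         return "No";
            case Character.UNASSIGNED:           return "Cn";
            default:                             return "other";
        }
    }

    public static void main(String[] args) {
        // Code points named in this report.
        int[] codePoints = {0x0BE6, 0x0BF0, 0x0F2A, 0x1372};
        for (int cp : codePoints) {
            // Character.getName requires Java 7 or later.
            System.out.printf("U+%04X %s %s%n", cp, Character.getName(cp), category(cp));
        }
    }
}
```

In UnicodeData.txt, U+0BE6 is listed as Nd while U+0BF0, U+0F2A and U+1372 are listed as No, so the last three are numeric characters but not decimal digits.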

Regression since JDK 7 b126. It appears to have been caused by the integration of JAXP 1.4.5 (see CR 7007257).

Comments
EVALUATION While fixing 6971190, more characters were added than the JCK/W3C tests require, for example TAMIL DIGIT ZERO (U+0BE6), which was added as of Unicode 4.1. The current JCK/W3C tests contain negative tests for these characters.
27-01-2011