Bug ID: JDK-7014220 UTF lexical presentation of some new digits accepted by XML document validator with JAXP 1.4.5

Type: Bug
Component: xml
Sub-Component: javax.xml.validation
Affected Version: 1.4.0,7

Priority: P3
Status: Closed
Resolution: Fixed
OS: generic
CPU: generic

Submitted: 2011-01-24
Updated: 2012-04-25
Resolved: 2011-01-27

Versions (Unresolved/Resolved/Fixed)

Other	JDK 7
1.4.0 1.4 Fixed	7 Fixed

See CR 6971190, where a similar problem is described.
With the same schema as in CR 6971190, the following XML document has been accepted since JDK 7 b126, although the test expects it to be invalid:

<?xml version="1.0"?>
<doc  xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
    xsi:noNamespaceSchemaLocation='reS51.xsd' >

<!--
base='string', pattern='\d', value='#x0BE6;', type='invalid', RULE='37'
-->

      <elem att='&#x0BE6;'/>

 </doc>
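
The check can be reproduced with the javax.xml.validation API along the following lines. This is only a sketch: reS51.xsd is not reproduced in this report, so the inline schema below is a hypothetical stand-in that restricts the att attribute to the pattern '\d', and the class and method names are made up for illustration. Whether the Tamil digit is accepted depends on the JDK build in use, so no result is asserted for it here.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class DigitPatternCheck {
    // Hypothetical stand-in for reS51.xsd (the real schema is not shown in this report):
    // attribute 'att' is a string restricted by the pattern \d.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='doc'><xs:complexType><xs:sequence>"
      + "<xs:element name='elem'><xs:complexType>"
      + "<xs:attribute name='att'><xs:simpleType>"
      + "<xs:restriction base='xs:string'><xs:pattern value='\\d'/></xs:restriction>"
      + "</xs:simpleType></xs:attribute>"
      + "</xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns true if a document whose 'att' attribute holds attValue validates.
    static boolean accepts(String attValue) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        String xml = "<doc><elem att='" + attValue + "'/></doc>";
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // With no ErrorHandler set, a validation error is thrown as an exception.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("ASCII digit 5 accepted: " + accepts("5"));
        System.out.println("letter x accepted: " + accepts("x"));
        // The case from this report; the outcome varies with the JDK build.
        System.out.println("&#x0BE6; accepted: " + accepts("&#x0BE6;"));
    }
}
```

Passing the value as a character reference keeps the source file ASCII-only and mirrors the document above.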

The character U+0BE6 (written above as the reference "&#x0BE6;") is "TAMIL DIGIT ZERO" according to Unicode 6.0 ( http://www.unicode.org/Public/6.0.0/ucd/UnicodeData.txt ), which was integrated into JDK 7 in b121 (see CR 6959267).

The issue also exists for the following symbols:
x0BF0 - TAMIL NUMBER TEN
x0F2A - TIBETAN DIGIT HALF ONE
x1372 - ETHIOPIC NUMBER TEN
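
For orientation, XML Schema 1.0 defines \d as \p{Nd}, so the Unicode general category of each code point is what matters here. The sketch below prints that category using java.lang.Character on JDK 7 or later; the class and method names are illustrative only, and the result reflects the Unicode version of the running JDK, not the tables used by the validator.

```java
class UnicodeNumberCategories {
    // Formats a code point as "U+XXXX <category> <name>" using the running JDK's Unicode data.
    static String describe(int cp) {
        String category;
        switch (Character.getType(cp)) {
            case Character.DECIMAL_DIGIT_NUMBER: category = "Nd"; break;
            case Character.LETTER_NUMBER:        category = "Nl"; break;
            case Character.OTHER_NUMBER:         category = "No"; break;
            case Character.UNASSIGNED:           category = "Cn"; break;
            default:                             category = "other";
        }
        return String.format("U+%04X %s %s", cp, category, Character.getName(cp));
    }

    public static void main(String[] args) {
        // The four code points named in this report.
        int[] codePoints = {0x0BE6, 0x0BF0, 0x0F2A, 0x1372};
        for (int cp : codePoints) {
            System.out.println(describe(cp));
        }
    }
}
```

On a JDK with Unicode 6.0 data, only U+0BE6 is reported as Nd; the other three are No (other number), so they fall outside \p{Nd} altogether.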

Regression since JDK 7 b126. It appears to have been caused by the integration of JAXP 1.4.5 (see CR 7007257).

EVALUATION

While fixing 6971190, more characters were added than the JCK/W3C tests require, for example TAMIL DIGIT ZERO. U+0BE6 (Tamil zero) was added as of Unicode 4.1. The current JCK/W3C tests contain negative tests for these characters.

27-01-2011