JDK-7038254 : XML document validator accepts a value that is not in the set of characters identified by the PrivateUse block
  • Type: Bug
  • Component: xml
  • Sub-Component: javax.xml.validation
  • Affected Version: 1.4.0
  • Priority: P3
  • Status: Closed
  • Resolution: Not an Issue
  • OS: generic
  • CPU: generic
  • Submitted: 2011-04-20
  • Updated: 2012-04-25
  • Resolved: 2011-04-21
Description
The validator uses the following schema:

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

<xsd:element name="doc">
    <xsd:complexType>
            <xsd:choice>
                <xsd:element name="elem" type="Regex" minOccurs="1" maxOccurs="unbounded"/>
            </xsd:choice>
    </xsd:complexType>
</xsd:element>

<xsd:simpleType name="Regex">
       <xsd:restriction base="xsd:string">
           <xsd:pattern value="\p{IsPrivateUse}?"/>   
       </xsd:restriction>
</xsd:simpleType>

</xsd:schema>


and the XML document:

<?xml version="1.0"?>
<doc  xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
    xsi:noNamespaceSchemaLocation='reM98.xsd' >

     <elem>&#x100000;</elem>

 </doc>

According to the XML Schema Part 2 specification
http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/datatypes.html#regexs
the block escape \p{IsPrivateUse} matches the set of characters from code point #xE000 through #xF8FF.

So the value #x100000 should not be accepted by the validator.
However, if the value of the element in the XML document is greater than or equal to #xF0000, the validator accepts the document as valid. The highest value for which the validator behaves as expected (rejects the document) is #xEFFFF.
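
A minimal reproduction sketch, assuming the standard javax.xml.validation API bundled with the JDK. The class name PrivateUseRepro and the helper isValid are hypothetical names chosen for illustration, and the schema is inlined as a string instead of being loaded from reM98.xsd. The program prints, for each probed code point, whether the validator accepts a document whose elem content is that single character.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class PrivateUseRepro {
    // Same schema as in the report, inlined (illustrative; the report loads reM98.xsd).
    static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='doc'><xsd:complexType><xsd:choice>"
      + "<xsd:element name='elem' type='Regex' minOccurs='1' maxOccurs='unbounded'/>"
      + "</xsd:choice></xsd:complexType></xsd:element>"
      + "<xsd:simpleType name='Regex'><xsd:restriction base='xsd:string'>"
      + "<xsd:pattern value='\\p{IsPrivateUse}?'/>"
      + "</xsd:restriction></xsd:simpleType>"
      + "</xsd:schema>";

    // Returns true if a document containing the given code point in <elem> validates.
    static boolean isValid(int codePoint) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            String xml = "<doc><elem>&#x" + Integer.toHexString(codePoint)
                       + ";</elem></doc>";
            try {
                // With no ErrorHandler set, validation errors surface as SAXException.
                validator.validate(new StreamSource(new StringReader(xml)));
                return true;
            } catch (SAXException e) {
                return false;
            }
        } catch (SAXException | IOException e) {
            // Schema compilation or I/O failure: not a validation verdict.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Boundary values taken from the report.
        int[] probes = {0xE000, 0xF8FF, 0xEFFFF, 0xF0000, 0x100000};
        for (int cp : probes) {
            System.out.printf("U+%X accepted=%b%n", cp, isValid(cp));
        }
    }
}
```

Per the description above, the values to watch are U+F0000 and U+100000, which the report says are accepted, and U+EFFFF, which it says is rejected. The actual output may differ between JDK releases.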