JDK-6970904 : Character sequence \w in a regex pattern is narrower than defined in the specification
  • Type: Bug
  • Component: xml
  • Sub-Component: org.xml.sax
  • Affected Version: 7
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2010-07-21
  • Updated: 2019-05-13
  • Resolved: 2011-08-26
Other              JDK 7
1.4.0 1.4 Fixed    7u4 Fixed
Description
The enclosed test case RegexTest_234 contains the XML document RegexTest_234.xml, which is valid against the valid schema RegexTest_234.xsd.

The specification (http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/datatypes.html#regexs) states: 
Character sequence:      Equivalent character class:

\w                       [#x0000-#x10FFFF]-[\p{P}\p{Z}\p{C}] 
                         (all characters except the set of "punctuation", "separator" and "other" characters)
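The definition above can be approximated with java.util.regex, which supports the one-letter Unicode general categories \p{P}, \p{Z} and \p{C}. The following is only a sketch: the class and method names are made up for illustration, and java.util.regex is not the engine the schema validator uses. It spells out the XSD meaning of \w as a negated character class and the XSD meaning of \s as space, tab, newline and carriage return.

```java
import java.util.regex.Pattern;

// Illustrative sketch only: XsdWordCharSketch and its members are hypothetical names.
class XsdWordCharSketch {
    // XSD \w: every character except punctuation (P), separators (Z) and "other" (C).
    static final String XSD_W = "[^\\p{P}\\p{Z}\\p{C}]";
    // XSD \s: space, tab, newline, carriage return.
    static final String XSD_S = "[ \\t\\n\\r]";

    // Equivalent of the schema pattern (\w+)\s+(\w+) under the XSD definitions.
    static final Pattern WORDS =
        Pattern.compile("(" + XSD_W + "+)" + XSD_S + "+(" + XSD_W + "+)");

    // XSD patterns are implicitly anchored, so use matches() rather than find().
    static boolean matchesXsdStyle(String value) {
        return WORDS.matcher(value).matches();
    }

    // True when the code point has general category Lo (other letter).
    static boolean isOtherLetter(int codePoint) {
        return Character.getType(codePoint) == Character.OTHER_LETTER;
    }

    public static void main(String[] args) {
        System.out.println(isOtherLetter(0xCAB1));
        System.out.println(matchesXsdStyle("foo\uCAB1 bar\uCAB1"));
        System.out.println(matchesXsdStyle("foo- bar"));
    }
}
```

U+CAB1 is a Hangul syllable with general category Lo, so it is outside P, Z and C and belongs to \w under the specification's definition. A hyphen is punctuation and does not.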

The character sequence in the XML document is foo#xcab1 bar#xcab1 and the regex pattern is (\w+)\s+(\w+). U+CAB1 is a Hangul syllable (general category Lo), so it falls within \w and the document is valid. Nevertheless, validation of the XML document against the schema fails with the exception:
SAX error: file:~/devel/analysis/RegexTest_234.xml(1,129): cvc-pattern-valid: Value 'foo쪱 bar쪱' is not facet-valid with respect to pattern '(\w+)\s+(\w+)' for type '#AnonType_valuedoc'.
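A check of this kind can be sketched with the standard javax.xml.validation API. The schema and document below are hypothetical stand-ins, not the original RegexTest_234.xsd and RegexTest_234.xml, which are not reproduced in this report. The element name doc and attribute name value are guesses based on the type name '#AnonType_valuedoc' in the error message.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Illustrative sketch only: the schema is an assumed reconstruction, not the test's own file.
class PatternFacetSketch {
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='doc'><xs:complexType>"
      + "<xs:attribute name='value'><xs:simpleType>"
      + "<xs:restriction base='xs:string'>"
      + "<xs:pattern value='(\\w+)\\s+(\\w+)'/>"
      + "</xs:restriction></xs:simpleType></xs:attribute>"
      + "</xs:complexType></xs:element></xs:schema>";

    // Returns true if the document validates, false on any validation or I/O error.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            // With no ErrorHandler set, validation errors are thrown as SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Character reference for U+CAB1, as in the reported value.
        System.out.println(isValid("<doc value='foo&#xcab1; bar&#xcab1;'/>"));
    }
}
```

On a build affected by this bug the call would be expected to report the document as invalid. On a build that contains the fix it should report it as valid.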

Comments
EVALUATION This test now passes with the current jaxp build: xml_schema/msData/regex/jaxp/RegexTest_234.html#RegexTest_234.v
26-08-2011