JDK-4706545 : Provide (or document) regex character classes for Java character classes

Details
Type: Enhancement
Submit Date: 2002-06-21
Status: Resolved
Updated Date: 2003-07-07
Project Name: JDK
Resolved Date: 2003-07-07
Component: core-libs
OS: generic
Sub-Component: java.util.regex
CPU: generic
Priority: P4
Resolution: Fixed
Affected Versions: 5.0
Fixed Versions: 5.0 (tiger)

Related Reports

Sub Tasks

Description
The regular expression API has a number of convenient predefined character classes, e.g. \p{Lower} for lowercase ASCII, \p{InGreek} for Greek letters, etc.  However, for some classes there are differences between the Unicode/regex notion of the class and the Java notion of the class.  For example, the JLS notion of white space is *not* the same as the \p{Space} set, since the JLS does not include vertical tab (\v a.k.a. \x0B).  Additionally, the Character class has many methods to help identify certain classes of characters, including 3 methods with different definitions of whitespace.  It would be useful if there were documented regex character classes for each of the is* methods in Character.  Beyond documenting the corresponding regular expressions, new character classes for the sets defined in Character could be defined.

Having regular expressions for the character sets in Character would ease writing regular expressions that precisely recognize Java constructs.
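
The mismatch described above can be illustrated with a small sketch. It compares the default (ASCII-only) \p{Space} class with the non-deprecated Character.isWhitespace and Character.isSpaceChar methods for a few sample code points. The class name WhitespaceNotions and the helper regexSpace are hypothetical names chosen for this illustration; they are not part of the JDK or of this report.

```java
import java.util.regex.Pattern;

public class WhitespaceNotions {
    // Hypothetical helper: true if the single code point matches \p{Space}
    // under the default flags, where \p{Space} is [ \t\n\x0B\f\r].
    static boolean regexSpace(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return Pattern.matches("\\p{Space}", s);
    }

    public static void main(String[] args) {
        // Vertical tab, file separator, no-break space, ordinary space.
        int[] samples = {0x0B, 0x1C, 0xA0, 0x20};
        for (int cp : samples) {
            System.out.printf(
                "U+%04X  \\p{Space}=%b  isWhitespace=%b  isSpaceChar=%b%n",
                cp,
                regexSpace(cp),
                Character.isWhitespace(cp),
                Character.isSpaceChar(cp));
        }
    }
}
```

For example, U+001C (file separator) is whitespace according to Character.isWhitespace but is not in \p{Space}, and U+00A0 (no-break space) satisfies Character.isSpaceChar but neither of the other two tests.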

                                    

Comments
EVALUATION

Sounds like a reasonable suggestion.
###@###.### 2002-06-24

Character classes have been added to match the isXXX methods in Character that are not deprecated.
###@###.### 2003-06-05
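
As a rough sketch of how the added classes can be used: the java.util.regex.Pattern documentation for 5.0 and later lists classes of the form \p{javaXxx} that correspond to Character.isXxx, such as \p{javaWhitespace} and \p{javaLowerCase}. The class name JavaCharClasses and the helper looksLikeIdentifier below are hypothetical names for this illustration, and the identifier pattern is a simplification that does not exclude keywords or literals such as "true".

```java
import java.util.regex.Pattern;

public class JavaCharClasses {
    // Simplified identifier shape built from the Character-based classes:
    // one identifier-start character followed by any number of part characters.
    static final Pattern IDENTIFIER = Pattern.compile(
        "\\p{javaJavaIdentifierStart}\\p{javaJavaIdentifierPart}*");

    // Hypothetical helper: true if the whole string has the identifier shape.
    static boolean looksLikeIdentifier(String s) {
        return IDENTIFIER.matcher(s).matches();
    }

    public static void main(String[] args) {
        String fileSeparator = String.valueOf((char) 0x1C);
        String eAcute = String.valueOf((char) 0xE9);

        // \p{javaWhitespace} follows Character.isWhitespace, unlike \p{Space}.
        System.out.println(Pattern.matches("\\p{javaWhitespace}", fileSeparator));
        // \p{javaLowerCase} follows Character.isLowerCase; \p{Lower} is ASCII-only by default.
        System.out.println(Pattern.matches("\\p{javaLowerCase}", eAcute));
        System.out.println(Pattern.matches("\\p{Lower}", eAcute));

        System.out.println(looksLikeIdentifier("_foo1"));
        System.out.println(looksLikeIdentifier("1abc"));
    }
}
```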
                                     
2002-06-24
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX:
tiger

FIXED IN:
tiger

INTEGRATED IN:
tiger
tiger-b10


                                     
2004-06-14


