The regular expression API has a number of convenient predefined character classes, e.g. \p{Lower} for lowercase ASCII letters, \p{InGreek} for characters in the Greek block, etc. However, for some classes the Unicode/regex notion of the class differs from the Java notion. For example, the JLS notion of white space is *not* the same as the \p{Space} set, since the JLS definition does not include vertical tab (\v, a.k.a. \x0B). Additionally, the Character class has many methods to help identify certain classes of characters, including three methods with different definitions of whitespace. It would be useful if there were a documented regex character class for each of the is* methods in Character. Beyond documenting the corresponding regular expressions, new predefined character classes could be added for the sets defined in Character.
Having regular expressions for the character sets in Character would make it easier to write regular expressions that precisely recognize Java constructs.
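
The whitespace mismatch described above can be checked with a short program. This is only an illustrative sketch: the class name WhitespaceClasses, its helper methods, and the hand-written JLS_SPACE pattern are assumptions made for this example, not part of any existing API. JLS_SPACE is one reading of the JLS white space definition (space, horizontal tab, form feed, plus the CR and LF line terminators).

```java
import java.util.regex.Pattern;

public class WhitespaceClasses {
    // Predefined regex class; by default equivalent to [ \t\n\x0B\f\r].
    static final Pattern REGEX_SPACE = Pattern.compile("\\p{Space}");

    // Hypothetical hand-written class approximating JLS white space:
    // space, horizontal tab, form feed, and the line terminators CR and LF.
    static final Pattern JLS_SPACE = Pattern.compile("[ \\t\\f\\r\\n]");

    static boolean matchesRegexSpace(char c) {
        return REGEX_SPACE.matcher(String.valueOf(c)).matches();
    }

    static boolean matchesJlsSpace(char c) {
        return JLS_SPACE.matcher(String.valueOf(c)).matches();
    }

    public static void main(String[] args) {
        char verticalTab = (char) 0x0B;
        char noBreakSpace = (char) 0xA0;

        // Vertical tab is where \p{Space} and the JLS definition disagree.
        System.out.println("VT   \\p{Space}:   " + matchesRegexSpace(verticalTab));
        System.out.println("VT   JLS class:   " + matchesJlsSpace(verticalTab));

        // Two of the Character methods also disagree with each other.
        System.out.println("VT   isWhitespace: " + Character.isWhitespace(verticalTab));
        System.out.println("VT   isSpaceChar:  " + Character.isSpaceChar(verticalTab));
        System.out.println("NBSP isWhitespace: " + Character.isWhitespace(noBreakSpace));
        System.out.println("NBSP isSpaceChar:  " + Character.isSpaceChar(noBreakSpace));
    }
}
```

Vertical tab matches \p{Space} but not the hand-written JLS class, and Character.isWhitespace and Character.isSpaceChar give opposite answers for both vertical tab and the no-break space. Each of these sets currently has to be spelled out by hand when it is needed inside a pattern.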