Name: rmT116609 Date: 05/21/2002
FULL PRODUCT VERSION :
java version "1.4.0"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-b92)
Java HotSpot(TM) Client VM (build 1.4.0-b92, mixed mode)
FULL OPERATING SYSTEM VERSION :
Linux 2.4.17, SuSE 7.3, Windows 2000, Solaris 2.8
A DESCRIPTION OF THE PROBLEM :
Pattern.compile("[\\p{InLatinExtended-B}]*");
results in:
java.util.regex.PatternSyntaxException: Unknown character family {LatinExtended-B} near index 21
It can be worked around by spelling out the block's code-point range:
Pattern.compile("[\u0180-\u024F]*");
but rejecting a valid Unicode block name is still a bug.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
1) Compile the test case (test.java).
2) Run it.
ERROR MESSAGES/STACK TRACES THAT OCCUR :
Exception in thread "main" java.util.regex.PatternSyntaxException: Unknown character family {LatinExtended-B} near index 21
[\p{InLatinExtended-B}]*
                     ^
    at java.util.regex.Pattern.error(Pattern.java:1472)
    at java.util.regex.Pattern.familyError(Pattern.java:2137)
    at java.util.regex.Pattern.retrieveFamilyNode(Pattern.java:2114)
    at java.util.regex.Pattern.family(Pattern.java:2096)
    at java.util.regex.Pattern.range(Pattern.java:2024)
    at java.util.regex.Pattern.clazz(Pattern.java:1991)
    at java.util.regex.Pattern.sequence(Pattern.java:1529)
    at java.util.regex.Pattern.expr(Pattern.java:1489)
    at java.util.regex.Pattern.compile(Pattern.java:1257)
    at java.util.regex.Pattern.<init>(Pattern.java:1013)
    at java.util.regex.Pattern.compile(Pattern.java:760)
    at test.main(test.java:38)
This bug can always be reproduced.
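When checking whether a given runtime is affected, it may help to catch the exception and print its parts rather than read the full stack trace. The following is only a sketch: the class name PatternDiagnostics and the helper describeFailure are hypothetical names introduced here, not part of the report. It relies only on the standard PatternSyntaxException accessors getDescription(), getIndex() and getPattern().

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class PatternDiagnostics {
    // Hypothetical helper: returns null if the pattern compiles, otherwise
    // a one-line summary built from the exception's standard accessors.
    static String describeFailure(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription() + " at index " + e.getIndex()
                    + " in " + e.getPattern();
        }
    }

    public static void main(String[] args) {
        // The block-name form is rejected on the 1.4.0 build in this report;
        // other releases may behave differently, so no output is assumed here.
        System.out.println(describeFailure("[\\p{InLatinExtended-B}]*"));
        // The explicit-range form from the workaround is expected to compile.
        System.out.println(describeFailure("[\\u0180-\\u024F]*"));
    }
}
```

A null result for the first pattern would suggest the runtime in use accepts the block name.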
---------- BEGIN SOURCE ----------
import java.util.regex.*;
public class test {
    public static void main(String[] args) throws Throwable {
        Pattern.compile("[\\p{InLatinExtended-B}]*");
    }
}
---------- END SOURCE ----------
CUSTOMER WORKAROUND :
Pattern.compile("[\u0180-\u024F]*");
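The workaround above can be sketched as a small self-contained class. The class name LatinExtendedBWorkaround and its method names are made up for illustration. The range U+0180..U+024F is the one given in the workaround, and the cross-check against Character.UnicodeBlock.LATIN_EXTENDED_B is an added sanity test, not something the report itself performs.

```java
import java.util.regex.Pattern;

public class LatinExtendedBWorkaround {
    // Explicit range for the Latin Extended-B block (U+0180 to U+024F).
    // The backslashes are doubled so that the regex engine, rather than
    // the Java compiler, interprets the escapes.
    static final Pattern LATIN_EXTENDED_B =
            Pattern.compile("[\\u0180-\\u024F]*");

    // True if every char of s lies in the range (also true for "").
    static boolean isLatinExtendedB(String s) {
        return LATIN_EXTENDED_B.matcher(s).matches();
    }

    // Sanity check: each char in the explicit range should be reported
    // by Character.UnicodeBlock as belonging to LATIN_EXTENDED_B.
    static boolean rangeMatchesBlock() {
        for (char c = '\u0180'; c <= '\u024F'; c++) {
            if (Character.UnicodeBlock.of(c)
                    != Character.UnicodeBlock.LATIN_EXTENDED_B) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isLatinExtendedB("\u0180\u01C5\u024F")); // true
        System.out.println(isLatinExtendedB("abc"));                // false
        System.out.println(rangeMatchesBlock());                    // true
    }
}
```

Keeping the range in one named constant makes it easy to switch back to the block-name form once the runtime accepts it.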
(Review ID: 146813)
======================================================================