ADDITIONAL SYSTEM INFORMATION :
$ java -version
openjdk version "11.0.1" 2018-10-16
OpenJDK Runtime Environment 18.9 (build 11.0.1+13)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.1+13, mixed mode)
A DESCRIPTION OF THE PROBLEM :
When the CASE_INSENSITIVE flag is set, a POSIX character class and a literal character class that describe the same set of characters match differently.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
See test program.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The pattern "[a-z]" should behave the same as "\\p{Lower}", which the Pattern documentation describes as US-ASCII only and equivalent to "[a-z]".
ACTUAL -
With the CASE_INSENSITIVE flag set, "[a-z]" matches an uppercase letter, but "\\p{Lower}" does not.
---------- BEGIN SOURCE ----------
// $ javac Test.java
// $ java -ea Test
// Exception in thread "main" java.lang.AssertionError
// at Test.main(Test.java:8)
import java.util.regex.Pattern;
public class Test {
    public static void main(String[] args) {
        Pattern p1 = Pattern.compile("[a-z]", Pattern.CASE_INSENSITIVE);
        Pattern p2 = Pattern.compile("\\p{Lower}", Pattern.CASE_INSENSITIVE);
        assert(p1.matcher("A").find() == p2.matcher("A").find());
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
Avoid using POSIX character classes together with the CASE_INSENSITIVE flag; use an explicit range such as "[a-z]" instead.
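A minimal sketch of that workaround, assuming the goal is to find ASCII letters regardless of case. The class name AsciiLetterWorkaround and the method containsAsciiLetter are hypothetical and do not appear in the test program above; only Pattern.compile, Pattern.CASE_INSENSITIVE, and Matcher.find are real API.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class AsciiLetterWorkaround {
    // Explicit range instead of \p{Lower}; CASE_INSENSITIVE lets it match A-Z as well.
    static final Pattern ASCII_LETTER =
            Pattern.compile("[a-z]", Pattern.CASE_INSENSITIVE);

    // Returns true if the input contains at least one ASCII letter of either case.
    static boolean containsAsciiLetter(String input) {
        return ASCII_LETTER.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsAsciiLetter("A")); // true
        System.out.println(containsAsciiLetter("a")); // true
        System.out.println(containsAsciiLetter("1")); // false
    }
}
```

Because the explicit range does not rely on how POSIX classes interact with the flag, it should give the same result for "A" and "a".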
FREQUENCY : always