Name: gm110360 Date: 06/02/2003
FULL PRODUCT VERSION :
java version "1.4.2-beta"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2-beta-b19)
Java HotSpot(TM) Client VM (build 1.4.2-beta-b19, mixed mode)
FULL OS VERSION :
Windows XP
A DESCRIPTION OF THE PROBLEM :
I wanted to match every character in a string except '>', so I used the regex "[^>]". With this build, however, the pattern also fails to match the character "\u203A" (the HTML character ›).
The same applies to '<' and '\u2039', the HTML ‹ character.
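The program attached to this report only exercises '>' and '\u203A'. A minimal sketch of the analogous check for '<' and '\u2039' could look like the following. The class name PatternLTTest and the helpers matches and hex are hypothetical and were not part of the submitted test. The input is printed as a hex code point so the output does not depend on the console encoding.

```java
import java.util.regex.Pattern;

public class PatternLTTest {
    // Returns true when the entire input matches the pattern.
    public static boolean matches(String pat, String in) {
        return Pattern.compile(pat).matcher(in).matches();
    }

    // Renders the first character of the input as a hex code point,
    // so the result is readable regardless of console encoding.
    public static String hex(String in) {
        return "U+" + Integer.toHexString(in.charAt(0)).toUpperCase();
    }

    public static void main(String[] args) {
        // Same four cases as the '>' test, mirrored for '<'.
        String[][] cases = {
            {"<", "<"},
            {"<", "\u2039"},
            {"[^<]", "<"},
            {"[^<]", "\u2039"}
        };
        for (int i = 0; i < cases.length; i++) {
            String pat = cases[i][0];
            String in = cases[i][1];
            System.out.println("Pattern '" + pat + "'"
                + (matches(pat, in) ? " matches " : " does not match ")
                + hex(in));
        }
    }
}
```

On a build without the regression, the last case should report a match.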
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the program below.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Using JRE 1.4.1, you get the correct result (the last line is the important one):
C:\> c:\Programme\Java\j2re1.4.1_02\bin\java -classpath classes PatternGTTest
Pattern '>' matches '>'
Pattern '>' does not match '?'
Pattern '[^>]' does not match '>'
Pattern '[^>]' matches '?'
ACTUAL -
Using JRE 1.4.2-beta, you get an incorrect result (see the last line):
C:\> c:\Programme\Java\j2re1.4.2\bin\java -classpath classes PatternGTTest
Pattern '>' matches '>'
Pattern '>' does not match '?'
Pattern '[^>]' does not match '>'
Pattern '[^>]' does not match '?'
REPRODUCIBILITY :
This bug can always be reproduced.
---------- BEGIN SOURCE ----------
import java.util.regex.*;

public class PatternGTTest {
    public static void main(String[] args) throws Exception {
        checkMatch(">", ">");
        checkMatch(">", "\u203A"); // ›
        checkMatch("[^>]", ">");
        checkMatch("[^>]", "\u203A"); // expected to match; fails on 1.4.2-beta
    }

    // Prints whether the entire input string matches the given pattern.
    public static void checkMatch(String pat, String in) {
        System.out.print("Pattern '" + pat + "'");
        Pattern p = Pattern.compile(pat);
        if (!p.matcher(in).matches()) System.out.print(" does not match ");
        else System.out.print(" matches ");
        System.out.println("'" + in + "'");
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
Use Java 1.4.1.
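If downgrading is not an option, the simple case described above (deciding whether a string is free of '>') can be checked without java.util.regex at all. The following is only a sketch and has not been verified on the 1.4.2-beta build; the class name NoGtCheck and the method containsNone are hypothetical.

```java
public class NoGtCheck {
    // Returns true when the input contains no occurrence of the excluded
    // character. Uses String.indexOf, so no regex engine is involved.
    public static boolean containsNone(String in, char excluded) {
        return in.indexOf(excluded) < 0;
    }

    public static void main(String[] args) {
        // U+203A is a different character from '>', so it is accepted.
        System.out.println(containsNone("\u203A", '>'));
        // A literal '>' is rejected.
        System.out.println(containsNone(">", '>'));
    }
}
```

This only covers whole-string exclusion of a single character. Patterns that combine the negated class with other constructs would still need the regex engine.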
Release Regression From : 1.4.1_03
The above release value is the last release in which this
functionality was known to work. Since then there has been a regression.
(Review ID: 186810)
======================================================================