JDK-8169056 : StringIndexOutOfBoundsException in Pattern.compile with CANON_EQ flag
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.regex
  • Affected Version: 8
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2016-10-30
  • Updated: 2017-11-29
  • Resolved: 2017-03-23
JDK 8
8u152 b02 Fixed
Description
FULL PRODUCT VERSION :
java version "1.8.0_112"
Java(TM) SE Runtime Environment (build 1.8.0_112-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.112-b15, mixed mode)

ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows [version 10.0.14393]

A DESCRIPTION OF THE PROBLEM :
Pattern.compile throws StringIndexOutOfBoundsException instead of PatternSyntaxException for the pattern string "[" with canonical equivalence (CANON_EQ) flag set.

See https://josm.openstreetmap.de/ticket/13870 for a real-life scenario where a pattern is validated during user input.

As per Quality Outreach, please add the "josm-found" label to this bug report.

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the following code:

Pattern.compile("[", Pattern.CANON_EQ);

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
PatternSyntaxException
ACTUAL -
StringIndexOutOfBoundsException

ERROR MESSAGES/STACK TRACES THAT OCCUR :
java.lang.StringIndexOutOfBoundsException: String index out of range: 1
	at java.lang.String.codePointAt(String.java:687)
	at java.util.regex.Pattern.normalizeCharClass(Pattern.java:1416)
	at java.util.regex.Pattern.normalize(Pattern.java:1392)
	at java.util.regex.Pattern.compile(Pattern.java:1659)
	at java.util.regex.Pattern.<init>(Pattern.java:1351)
	at java.util.regex.Pattern.compile(Pattern.java:1054)


REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.util.regex.Pattern;

public class PatternBug {
    public static void main(String[] args) throws Exception {
        Pattern.compile("[", Pattern.CANON_EQ);
    }
}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Catch StringIndexOutOfBoundsException in the caller's code.
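
The workaround might look roughly like the sketch below. The class name SafePatternCompiler and the methods compileCanonEq and isValid are hypothetical, introduced here only for illustration; they are not part of the JDK or of JOSM. The sketch assumes the caller wants a single exception type, so it converts the stray StringIndexOutOfBoundsException into a PatternSyntaxException with an unknown error index (-1). On a fixed JDK the catch block is simply never reached.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper illustrating the submitted workaround; not JDK or JOSM code.
public class SafePatternCompiler {

    // Compiles with CANON_EQ and reports malformed input uniformly
    // as PatternSyntaxException, even on affected JDK 8 builds.
    public static Pattern compileCanonEq(String regex) {
        try {
            return Pattern.compile(regex, Pattern.CANON_EQ);
        } catch (StringIndexOutOfBoundsException e) {
            // Affected builds fail while normalizing the pattern, before
            // syntax checking. The error index is unknown here, hence -1.
            PatternSyntaxException pse = new PatternSyntaxException(
                    "Malformed pattern (CANON_EQ normalization failed)", regex, -1);
            pse.initCause(e);
            throw pse;
        }
    }

    // Convenience check for validating a pattern during user input.
    public static boolean isValid(String regex) {
        try {
            compileCanonEq(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("\"[\" valid: " + isValid("["));
        System.out.println("\"abc+\" valid: " + isValid("abc+"));
    }
}
```

With such a wrapper, input validation code only has to handle PatternSyntaxException, whether or not the running JDK contains the fix.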


Comments
To reproduce the issue, run the attached test case. Following are the results on various JDK versions:
JDK 8 - Fail
JDK 8u112 - Fail
JDK 8u122 ea - Fail
JDK 9-ea + 141 - Pass

Following is the output on JDK 8u versions:
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 1
	at java.lang.String.codePointAt(String.java:675)
	at java.util.regex.Pattern.normalizeCharClass(Pattern.java:1416)
	at java.util.regex.Pattern.normalize(Pattern.java:1392)
	at java.util.regex.Pattern.compile(Pattern.java:1659)
	at java.util.regex.Pattern.<init>(Pattern.java:1351)
	at java.util.regex.Pattern.compile(Pattern.java:1054)
	at JI9044959.main(JI9044959.java:6)

Following is the output on JDK 9-ea + 141:
Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 0
[
^
	at java.util.regex.Pattern.error(java.base@9-ea/Pattern.java:1990)
	at java.util.regex.Pattern.clazz(java.base@9-ea/Pattern.java:2658)
	at java.util.regex.Pattern.sequence(java.base@9-ea/Pattern.java:2099)
	at java.util.regex.Pattern.expr(java.base@9-ea/Pattern.java:2031)
	at java.util.regex.Pattern.compile(java.base@9-ea/Pattern.java:1753)
	at java.util.regex.Pattern.<init>(java.base@9-ea/Pattern.java:1402)
	at java.util.regex.Pattern.compile(java.base@9-ea/Pattern.java:1096)
	at JI9044959.main(JI9044959.java:6)
02-11-2016

This bug is already fixed in JDK 9 from build 119 onwards.
01-11-2016