United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
Bug ID: JDK-6337993 java.util.regex.Pattern StackOverflowError's on (x|y)*
JDK-6337993 : java.util.regex.Pattern StackOverflowError's on (x|y)*

Details
Type:
Bug
Submit Date:
2005-10-18
Status:
Open
Updated Date:
2011-06-06
Project Name:
JDK
Resolved Date:
Component:
core-libs
OS:
windows_7,windows_2000
Sub-Component:
java.util.regex
CPU:
x86
Priority:
P4
Resolution:
Unresolved
Affected Versions:
1.4.2,5.0,6u10
Targeted Versions:

Related Reports
Duplicate:

Sub Tasks

Description
FULL PRODUCT VERSION :
java version "1.5.0_05"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_05-b05)
Java HotSpot(TM) Client VM (build 1.5.0_05-b05, mixed mode, sharing)


ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows 2000 [Version 5.00.2195]

A DESCRIPTION OF THE PROBLEM :
any pattern like (x|y)* (an alternative wrapped in a star) is implemented recursively by Pattern. Each matched alternative uses about 5 method calls on the stack.

This makes that kind of pattern useless on any reasonably-sized input. Is there any chance you can make the Pattern implementation non-recursive in this case?

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
see example code

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
matched? true
ACTUAL -
java.lang.StackOverflowError

ERROR MESSAGES/STACK TRACES THAT OCCUR :
java.lang.StackOverflowError
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
	at java.util.regex.Pattern$Single.match(Pattern.java:3004)
	at java.util.regex.Pattern$Branch.match(Pattern.java:3933)
	at java.util.regex.Pattern$GroupHead.match(Pattern.java:3973)
	at java.util.regex.Pattern$Loop.match(Pattern.java:4100)
	at java.util.regex.Pattern$GroupTail.match(Pattern.java:4032)
Exception in thread "main"


REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class RegexpFun {
    public static void main(String[] args) {

        CharSequence largeString = makeLargeString();
        Pattern p = Pattern.compile("(x|y)*");
        Matcher m = p.matcher(largeString);
        System.out.println("matches? " + m.matches()); // throws StackOverflowError
    }

    private static CharSequence makeLargeString() {
        final int size = 1000;
        StringBuffer largeString = new StringBuffer(size);
        for (int i = 0; i < size/2; i++) {
            largeString.append("xy");
        }
        return largeString;
    }
}
---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
no workaround

                                    

Comments
EVALUATION

Yep, the limit of current implementation, need find a solution.
                                     
2005-10-22
SUGGESTED FIX

With the help of members of the regular expression team (and of course the UWC development team), we 
were able to optimize the regular expression used in parsing filters with \ and ". The results are quite 
encouraging: we are able to get to create reasonably complex filters with \ and " (please see below).

The patch with this fix is available at the following location:

    /net/brandy.red.iplanet.com/wspace/sv102973/JAN-31-2006/patches-uwc-6.2p3-export-full-SPARC-20060131-118540-28.tar

Please feel free to use the patch at the above location in order to test in your environment and let me know 
how is goes...

Regards,
Srinivas.

=====================================================================================
Following are some of the filters that were successfully created, read and edited:

$ ldapsearch -b "ou=people,o=india.sun.com,dc=india,dc=sun,dc=com" -h varali.india.sun.com -D "cn=directory manager" -w password -T -x -S mailSieveRuleSource -B "uid=admin"
.
.
.
mailSieveRuleSource=#RULE: $Name="quotedFilter"   $Order=11   $Type="DEFAULT_TYPE"
require "fileinto";
#BEGINFILTER
if allof(header :contains ["Subject"," Comments"," Keywords"] "output from \"cron\" or \"mon\" or \"bon\" or \"john\" or \"this\" or \"that\""){
fileinto "Inbox";
stop;
}
#ENDFILTER

mailSieveRuleSource=#RULE: $Name="slashedFilter"   $Order=12   $Type="DEFAULT_TYPE"
require "notify";
require "variables";
#BEGINFILTER
if allof(allof(header :contains ["Subject"," Comments"," Keywords"] "this is a \\slashed\\ string containing \\abc\\\, \\def\\\, \\ghi\\\, \\jkl\\\, \\mno\\\, \\pqr\\\, \\stu\\\, \\vwx\\\, and \\yz\\"),anyof(allof(not header :matches ["Subject"," Comments"," Keywords"] ["POSTMASTER-AUTO-FW:*","postmaster-auto-fw:*"],header :matches ["Subject"," Comments"," Keywords"] "*"),true)){
notify :method "email" :echo :options "###@###.###" "POSTMASTER-AUTO-FW:${1}" "This is an auto-forwarded message";
stop;
}
#ENDFILTER

mailSieveRuleSource=#RULE: $Name="quotedAndSlashed"   $Order=13   $Type="DEFAULT_TYPE"
require "fileinto";
require "notify";
require "variables";
#BEGINFILTER
if allof(allof(header :contains ["Subject"," Comments"," Keywords"] "output from \"cron\" or \"mon\" or \"bon\" or \"john\" or \"this\" or \"that\"",header :contains ["Subject"," Comments"," Keywords"] "this is a \\slashed\\ string containing \\abc\\\, \\def\\\, \\ghi\\\, \\jkl\\\, \\mno\\\, \\pqr\\\, \\stu\\\, \\vwx\\\, and \\yz\\"),anyof(allof(not header :matches ["Subject"," Comments"," Keywords"] ["POSTMASTER-AUTO-FW:*","postmaster-auto-fw:*"],header :matches ["Subject"," Comments"," Keywords"] "*"),true)){
fileinto "Inbox";
notify :method "email" :echo :options "###@###.###" "POSTMASTER-AUTO-FW:${1}" "This is an auto-forwarded message";
stop;
}
#ENDFILTER
                                     
2006-02-22



Hardware and Software, Engineered to Work Together