JDK-5078293 : Pattern.compile hangs up on a particular regexp template
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.regex
  • Affected Version: 1.4.2
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • OS: solaris_2.5.1
  • CPU: x86
  • Submitted: 2004-07-26
  • Updated: 2006-02-04
  • Resolved: 2006-02-04
Related Reports
Duplicate :  
Description
Name: js151677			Date: 07/26/2004


FULL PRODUCT VERSION :
On Linux:

java version "1.4.2_02"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_02-b03)
Java HotSpot(TM) Client VM (build 1.4.2_02-b03, mixed mode)

On Win32:
java version "1.4.2_01"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.
Java HotSpot(TM) Client VM (build 1.4.2_01-b06, mixed mode)


ADDITIONAL OS VERSION INFORMATION :
Linux 2.4.18-3 #1 Thu Apr 18 07:37:53 EDT 2002 i686 unknown

Microsoft Windows XP [Version 5.1.2600]


A DESCRIPTION OF THE PROBLEM :
This problem takes place on both win32 and Linux platforms.

When I use a particular regexp, Pattern.compile hangs. The regexp is valid: it works with the C PCRE library and with Perl.

Code extract:

        String sTemplate1= "[A-Z\\d]+[\\s]*TUPC102 (\\w\\w\\w\\d\\d) (\\d\\d:\\d\\d:\\d\\d) (\\d\\d\\d\\d) INFO TUPLE CHANGED FROM\\s*\\s*TABLE NAME:\\s*(.*)\\s*\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()$()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?";

        String sLog1="UKGSM30     TUPC102 FEB18 11:01:19 6513 INFO TUPLE CHANGED FROM         TABLE NAME: GHLRPLMN        HPLMN (HPLMN) $ ";

         Pattern xPattern;
         Matcher xMatcher;

         xPattern = Pattern.compile(sTemplate1.toString());


If the beginning of the regexp is changed to
"UKGSM\\d\\d+[\\s]*TUPC102 (\\w\\w\\w\\d\\d) and so on

it compiles successfully.


STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the test program below.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The regexp compiles and the log line matches.
ACTUAL -
Regexp compilation hangs.

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.io.*;
import java.util.*;
import java.util.regex.*;

public class MatchLog
{
   public static void main(String args[])
   {
        String sTemplate1= "[A-Z\\d]+[\\s]*TUPC102 (\\w\\w\\w\\d\\d) (\\d\\d:\\d\\d:\\d\\d) (\\d\\d\\d\\d) INFO TUPLE CHANGED FROM\\s*\\s*TABLE NAME:\\s*(.*)\\s*\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()$()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?\\s*(?:([\\w $()]+)\\n)?";
        String sLog1="UKGSM30     TUPC102 FEB18 11:01:19 6513 INFO TUPLE CHANGED FROM         TABLE NAME: GHLRPLMN        HPLMN (HPLMN) $ ";

         System.out.println("Template: \n"+sTemplate1+"\n");
         System.out.println("Log: \n"+sLog1+"\n");

         
         Pattern xPattern;
         Matcher xMatcher;

         xPattern = Pattern.compile(sTemplate1.toString());
         System.out.println("COMPILED!!!");
         xMatcher = xPattern.matcher(sLog1.toString());
         System.out.println("MATCHER CREATED!!!");

         if(xMatcher.find())
         {
            System.out.println("***************** MATCHED ******************\n");
            for(int i= 1; i<=xMatcher.groupCount() ; i++) // groups are numbered 1..groupCount()
            {
               System.out.print(i);
               System.out.println(": "+xMatcher.group(i)+"\n");
            }
         }
         else
         {
            System.out.println("********************* NOT MATCHED ********************\n");
         }


   }
}

---------- END SOURCE ----------
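
Because the test above never returns on an affected runtime, it can help to bound the call to Pattern.compile when checking other JDK builds. The following is only a sketch: the class name CompileWithTimeout, the method compile(String, long) and the 5000 ms limit are illustrative assumptions, not part of the original report. Pattern.compile does not respond to interruption, so the worker is a daemon thread that is simply abandoned on timeout. The code avoids java.util.concurrent so that it should also build on 1.4.2.

```java
import java.util.regex.Pattern;

class CompileWithTimeout
{
   // Hypothetical helper: returns the compiled pattern, or null if
   // compilation did not finish within timeoutMillis (must be > 0,
   // because Thread.join(0) waits forever).
   static Pattern compile(final String regex, long timeoutMillis)
   {
      final Pattern[] result = new Pattern[1];
      final RuntimeException[] error = new RuntimeException[1];

      Thread worker = new Thread(new Runnable() {
         public void run()
         {
            try {
               result[0] = Pattern.compile(regex);
            } catch (RuntimeException e) {
               error[0] = e; // e.g. PatternSyntaxException
            }
         }
      });
      worker.setDaemon(true); // a stuck compile must not keep the JVM alive
      worker.start();

      try {
         worker.join(timeoutMillis);
      } catch (InterruptedException e) {
         Thread.currentThread().interrupt(); // preserve the interrupt status
         return null;
      }

      if (worker.isAlive()) {
         return null; // still compiling: treat as a hang
      }
      if (error[0] != null) {
         throw error[0]; // report syntax errors to the caller
      }
      return result[0];
   }

   public static void main(String[] args)
   {
      // Pass the regexp to test as the first argument; the default is
      // only the leading part of the template from this report.
      String regex = args.length > 0 ? args[0] : "[A-Z\\d]+[\\s]*TUPC102";
      Pattern p = compile(regex, 5000L);
      System.out.println(p == null ? "TIMED OUT" : "COMPILED");
   }
}
```

A null return only means the limit was exceeded; it does not by itself distinguish a true hang from a very slow compilation.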

CUSTOMER SUBMITTED WORKAROUND :
Avoid a leading character class ([...]) in the regexp.
(Incident Review ID: 289825) 
======================================================================

Comments
EVALUATION See 5013651, fixed in Mustang. Closed as a duplicate.
04-02-2006

EVALUATION Possibly "not a bug". It is common to report a "hang" when the regular expression requires an exponentially increasing amount of backtracking. We need to investigate whether this is occurring here. -- iag@sfbay 2004-06-26
26-06-2004
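
The earlier evaluation asks whether the time is being spent in backtracking. Backtracking occurs while a Matcher runs, whereas the report places the hang inside Pattern.compile, so one way to investigate is to time compilation alone while the number of trailing optional groups grows. The sketch below does that. The class name CompileTiming and its methods are illustrative assumptions; the prefix and the repeated unit are copied from the template in the description, except that every unit uses the same character class (the second unit in the original repeats "$()"). No timings are claimed here; on an affected runtime the later iterations may take very long.

```java
import java.util.regex.Pattern;

class CompileTiming
{
   // Fixed leading part of the template from the report.
   static final String PREFIX =
      "[A-Z\\d]+[\\s]*TUPC102 (\\w\\w\\w\\d\\d) (\\d\\d:\\d\\d:\\d\\d) (\\d\\d\\d\\d)"
      + " INFO TUPLE CHANGED FROM\\s*\\s*TABLE NAME:\\s*(.*)\\s*";

   // One optional line group; the report's template repeats this many times.
   static final String UNIT = "\\s*(?:([\\w $()]+)\\n)?";

   // Builds PREFIX followed by the given number of optional groups.
   static String build(int groups)
   {
      StringBuffer sb = new StringBuffer(PREFIX);
      for (int i = 0; i < groups; i++) {
         sb.append(UNIT);
      }
      return sb.toString();
   }

   // Measures only Pattern.compile; no Matcher is created.
   static long compileMillis(String regex)
   {
      long start = System.currentTimeMillis();
      Pattern.compile(regex);
      return System.currentTimeMillis() - start;
   }

   public static void main(String[] args)
   {
      // Raise the upper bound gradually when testing an affected runtime.
      for (int n = 1; n <= 12; n++) {
         System.out.println(n + " optional groups: "
                            + compileMillis(build(n)) + " ms");
      }
   }
}
```

If the reported times grow sharply with each added group, the cost lies in compilation rather than in matching, which would be consistent with the behavior described in this report.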