JDK-4771934 : Matcher.find() hangs for no apparent reason

Details
Type: Bug
Submit Date: 2002-10-31
Status: Closed
Updated Date: 2002-10-31
Project Name: JDK
Resolved Date: 2002-10-31
Component: core-libs
OS: solaris_9
Sub-Component: java.util.regex
CPU: generic
Priority: P3
Resolution: Not an Issue
Affected Versions: 1.4.1
Fixed Versions:

Related Reports

Sub Tasks

Description
###@###.### 2002-10-31
I wrote a small program that replaces the "<meta ... charset=" HTML tag in any input file with a tag containing a defined codeset, and found that it hangs on two files (both samples attached).

The pattern that it hangs on looks like this:

// looking for a html meta tag like :
// <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-5"> 
        
Pattern mypattern = Pattern.compile("<\\s*"+      
                                "(meta|META)"+
                                "(\\s|[^>])+"+
                                "(CHARSET|charset)="+
                                "(\\s|[^>])+>");

My test program (attached) can be run on any HTML input file, and it should print out the text it replaced, if any. I can reproduce this on java full version "1.4.0_02-20020711" and "1.4.1_01-b01".

Both HTML attachments cause the hang, though many other HTML files (both with and without matches for the above regex) work fine.

Running truss on the java process while it is hung shows many calls like these:

11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/2:	lwp_cond_wait(0x0002BD10, 0x0002BCF8, 0xFADFFD60) (sleeping...)
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/2:	lwp_cond_wait(0x0002BD10, 0x0002BCF8, 0xFADFFD60) Err#62 ETIME
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0
11130/5:	poll(0xF2A7FD88, 0, 10)				= 0


                                    

Comments
EVALUATION

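The construct (\\s|[^>]) causes an exponentially increasing amount of backtracking: a whitespace character matches both alternatives, so when the rest of the pattern fails to match, the matcher retries every combination of the two. The matcher is not hung, but each whitespace character added to the meta tag can double the time it takes to evaluate the match.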
###@###.### 2002-10-31
                                     
2002-10-31
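
The growth described in the evaluation can be observed with a small timing sketch. It reuses the pattern from the description unchanged. The class name BacktrackingDemo and the helpers failingInput and finds are hypothetical names introduced for illustration, and the input (a meta tag padded with spaces and carrying no charset attribute) is an assumption chosen so that the match must fail. Absolute timings depend on the JVM, and newer releases may handle this case faster, so no expected output is shown.

```java
import java.util.regex.Pattern;

final class BacktrackingDemo {
    // Pattern from the description, reproduced unchanged.
    static final Pattern ORIGINAL = Pattern.compile(
            "<\\s*(meta|META)(\\s|[^>])+(CHARSET|charset)=(\\s|[^>])+>");

    // Hypothetical helper: builds "<meta", then `spaces` blanks, then ">".
    // There is no charset attribute, so the overall match has to fail.
    static String failingInput(int spaces) {
        StringBuilder sb = new StringBuilder("<meta");
        for (int i = 0; i < spaces; i++) {
            sb.append(' ');
        }
        return sb.append('>').toString();
    }

    // Hypothetical helper: true if the pattern is found anywhere in the input.
    static boolean finds(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        // Small space counts only, so the loop finishes quickly.
        for (int n = 10; n <= 20; n += 2) {
            String input = failingInput(n);
            long start = System.nanoTime();
            boolean found = finds(ORIGINAL, input);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println(n + " spaces: found=" + found + ", " + micros + " us");
        }
    }
}
```

Every blank in the padded tag can be consumed by either \\s or [^>], which is the ambiguity the evaluation points at. A tag that does contain "charset=" is matched on the first attempt and does not trigger the retries.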
WORK AROUND

###@###.### 2002-10-31
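The following pattern works, because [^>] already matches whitespace and the redundant \\s alternative is gone (see the evaluation in the comments):
Pattern mypattern = Pattern.compile ("<(\\s)*"+      
                                "(meta|META)"+
                                "([^>])+"+
                                "(CHARSET|charset)="+
                                "([^>])+>");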
                                     
2004-06-11
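
As a rough illustration of how the workaround pattern could be used for the replacement task in the description, here is a minimal sketch. The attached test program is not shown in this report, so the class name CharsetTagRewriter, the rewrite helper, and the shape of the replacement tag are all assumptions, not the reporter's code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CharsetTagRewriter {
    // Workaround pattern from the report, reproduced unchanged.
    static final Pattern META_CHARSET = Pattern.compile(
            "<(\\s)*(meta|META)([^>])+(CHARSET|charset)=([^>])+>");

    // Hypothetical helper: replaces every matching meta tag with one
    // that declares the given charset. The replacement tag layout is assumed.
    static String rewrite(String html, String charset) {
        String replacement =
                "<META http-equiv=\"Content-Type\" content=\"text/html; charset="
                + charset + "\">";
        Matcher m = META_CHARSET.matcher(html);
        // quoteReplacement keeps '$' and '\' in the replacement literal.
        return m.replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String html = "<html><head>"
                + "<META http-equiv=\"Content-Type\" content=\"text/html; charset=ISO-8859-5\">"
                + "</head></html>";
        System.out.println(rewrite(html, "UTF-8"));
    }
}
```

Input without a matching meta tag is returned unchanged.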


