JDK-8220708 : Slow matching of specific regular expressions
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 9,11,12,13
  • Priority: P3
  • Status: Open
  • Resolution: Unresolved
  • OS: linux
  • CPU: x86_64
  • Submitted: 2019-03-07
  • Updated: 2020-11-24
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other
tbdUnresolved
Related Reports
Relates :  
Description
ADDITIONAL SYSTEM INFORMATION :
Running on Linux 64 bit

A DESCRIPTION OF THE PROBLEM :
While upgrading from Java 1.8 to 11 we've noticed that regular expressions matching of specific values is much slower in JDK 11. This doesn't happen for all expression, We were able to reproduce it with a specific expression that we're using. On 4 cores machine this was x10 times slower on JDK 11

REGRESSION : Last worked in version 8u202

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the attached source code on the two java versions (compiled with Java 1.8, run with both)

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
# ./1.8.0.151/bin/java TestRegex
Starting benchmark...
........................................................................................................................................................................................................Finished all threads
Finished running benchmark - took 1573 millis
# ./jdk-11.0.2+9/bin/java TestRegex
Starting benchmark...
........................................................................................................................................................................................................Finished all threads
Finished running benchmark - took 14767 millis

ACTUAL -
times should be similar in JDK 1.8 and JDK 11

---------- BEGIN SOURCE ----------
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TestRegex {

    public static void main(String[] args) throws Exception {

        long startTime = System.currentTimeMillis();
        int numOfThreads = 4;
        ExecutorService executor = Executors.newFixedThreadPool(numOfThreads);
        System.out.println("Starting benchmark...");

        String s1 = "( wtime = 1h:4m:51s , utime = 3814.79s , stime = 36.29s , RM = 3539 , VM=4218 , MaxRSS=4912316 , MaxVSZ=5154912 , DiskIOReadKB=0 , DiskIOWriteKB=0 , avgovertime(30s;VM)= 4206.166666666667 , avgovertime(30s;desiredCpu)= 1.0 , avgovertime(5m;desiredCpu)= 1.0 , avgovertime(60m;VM)= 3586.9467787114845 , deltaovertime(60m;utime+stime)= 3583.0 , avgovertime(2m;desiredCpu)= 1.0 , avgovertime(1m;desiredCpu)= 1.0 , avgovertime(60m;RM)= 3054.817415730337 , deltaovertime(15m;utime+stime)= 896.0 , avgovertime(5m;VM)= 4108.1 , avgovertime(30s;RM)= 3527.0 , deltaovertime(5m;utime+stime)= 295.0 , avgovertime(5m;RM)= 3430.483333333333 , avgovertime(2m;RM)= 3495.9166666666665 , avgovertime(15m;VM)= 3864.866666666667 , avgovertime(2m;VM)= 4174.833333333333 , avgovertime(15m;RM)= 3194.822222222222 , deltaovertime(2m;utime+stime)= 114.0 , Max(avgovertime(5m;RM))= (3508;2536) , Max(avgovertime(30s;VM))= (4252;2536) , Max(avgovertime(5m;VM))= (4153;2536) , Max(avgovertime(2m;desiredCpu))= (1;144) , Max(avgovertime(15m;RM))= (3222;2536) , Max(avgovertime(60m;RM))= (3054;3883) , Max(avgovertime(2m;VM))= (4221;2536) , Max(deltaovertime(2m;utime+stime))= (117;2526) , Max(avgovertime(60m;VM))= (3586;3888) , Max(avgovertime(2m;RM))= (3573;2536) , Max(avgovertime(30s;desiredCpu))= (1;69) , Max(avgovertime(15m;VM))= (3864;3888) , Max(avgovertime(1m;desiredCpu))= (1;84) , Max(avgovertime(30s;RM))= (3735;459) , Max(avgovertime(5m;desiredCpu))= (1;339) , Max(deltaovertime(5m;utime+stime))= (297;2706) , Max(deltaovertime(15m;utime+stime))= (897;3808) , Max(deltaovertime(60m;utime+stime))= (3584;3747) , trmGB= 755.96 , freeCoresCap= 0.5 , freeMemCap= 356 , LocalDirsTotalMB= 827.36 , frmPer= 64.45 , time= 03/07/19-10:22:05 , frmGB= 487.23 , memoryCons= 3.3145182291666666 , coresCons= 0.9966666666666667 )";
        String s2 = " Max\\(avgovertime\\(5m;RM\\)\\)";

        for (int j=0; j<numOfThreads; j++)
        {
            executor.execute(new Runnable() {
                                 @Override
                                 public void run() {
                                     for(int i=0; i<50000; i++) {
                                         if (i % 1000 == 0)
                                         {
                                             System.out.print(".");
                                         }
                                         boolean result = Boolean.valueOf(s1.matches("[\\d\\D]*" + s2 + "[\\d\\D]*"));
                                     }
                                 }
                             }
            );
        }

        executor.shutdown();
        while (!executor.isTerminated()) {
        }
        System.out.println("Finished all threads");

        long finishTime = System.currentTimeMillis();
        System.out.println("Finished running benchmark - took " + (finishTime-startTime) + " millis");
    }
}



---------- END SOURCE ----------

FREQUENCY : always



Comments
The underlying problem is fixed in 15. The bug is open specifically for 11.
24-11-2020

> I'll keep this bug mainly for jdk11/12 I'm setting the fix version to 'tbd' for now.
12-06-2019

[~rraghavan] Removed c2 label since JDK-8222075 already tracks C2-specific part of the fix.
30-04-2019

Filed JDK-8222075 to track loop predication enhancements. I'll keep this bug mainly for jdk11/12. I expect JDK-8222075 to fix the regression in the latest release, but it may be too complex to be backported. In that case, other alternatives should be explored.
06-04-2019

I experimented with enabling loop predication for subtype checks and got promising results with a prototype: $ java -showversion TestRegex java version "1.8.0_201" Java(TM) SE Runtime Environment (build 1.8.0_201-b09) Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode) Starting benchmark Finished running benchmark - took 533 millis openjdk version "12" 2019-03-19 OpenJDK Runtime Environment (build 12+33) OpenJDK 64-Bit Server VM (build 12+33, mixed mode, sharing) Starting benchmark Finished running benchmark - took 2359 millis openjdk version "13-internal" 2019-09-17 OpenJDK Runtime Environment (build 13-internal+0-adhoc.vlivanov.jdk) OpenJDK 64-Bit Server VM (build 13-internal+0-adhoc.vlivanov.jdk, mixed mode, sharing) Starting benchmark Finished running benchmark - took 307 millis
06-04-2019

There are some inefficiencies in the generated code: * Subtype checks aren't hoisted out of the loop, though receivers are constant (both predicate and captured "this" are loaded from final instance fields). * Interface subtype check is immediately followed by a more specific exact type check. It follows from peculiarity of code shape: invokespecial is immediately followed by an interface call on the same receiver which is inlined based on the profile. C2 could strength-reduce the whole sequence to exact type check, but it doesn't. But I don't see a general solution to reliably eliminate the check. IMO it's worth to investigate possible solutions on library level as well.
28-03-2019

The following code suffers the most from receiver subtype check: java.util.regex.Pattern$CharPropertyGreedy::match: ... while (i < to) { int ch = Character.codePointAt(seq, i); if (!predicate.is(ch)) break; i += Character.charCount(ch); n++; } static interface BmpCharPredicate extends CharPredicate { ... default CharPredicate union(CharPredicate p) { if (p instanceof BmpCharPredicate) return (BmpCharPredicate)(ch -> is(ch) || p.is(ch)); return ch -> is(ch) || p.is(ch); } The following lambda captures (BmpCharPredicate this) and (BmpCharPredicate p): (BmpCharPredicate)(ch -> is(ch) || p.is(ch)) and its body is represented as an instance private method on BmpCharPredicate interface: private boolean lambda$union$3(java.util.regex.Pattern$CharPredicate, int); descriptor: (Ljava/util/regex/Pattern$CharPredicate;I)Z flags: (0x1002) ACC_PRIVATE, ACC_SYNTHETIC It's invoked from java.util.regex.Pattern$BmpCharPredicate$$Lambda$5 (VMAC with java.util.regex.Pattern$BmpCharPredicate as a host) on captured "this" instance (...$$Lambda$5.arg$1) using invokespecial and it requires receiver subtype check since it has interface type (java.util.regex.Pattern$BmpCharPredicate). java.util.regex.Pattern$CharPropertyGreedy::match (122 bytes) ... @ 28 java.util.regex.Pattern$BmpCharPredicate$$Lambda$5/...::is (13 bytes) inline (hot) \-> TypeProfile (4958/4958 counts) = java/util/regex/Pattern$BmpCharPredicate$$Lambda$5 @ 9 java.util.regex.Pattern$BmpCharPredicate::lambda$union$3 (26 bytes) inline (hot) ... java.util.regex.Pattern$CharPropertyGreedy::match (122 bytes) ... 22 nofast_aload_0 23 nofast_getfield 4 <java/util/regex/Pattern$CharPropertyGreedy.predicate/Ljava/util/regex/Pattern$CharPredicate;> 26 nofast_iload #6 28 invokeinterface 8 <java/util/regex/Pattern$CharPredicate.is(I)Z> 48 bci: 28 VirtualCallData count(0) nonprofiled_count(0) entries(1) 'java/util/regex/Pattern$BmpCharPredicate$$Lambda$5'(4958 1.00) method_entries(0) ... java.util.regex.Pattern$BmpCharPredicate$$Lambda$5/...::is (13 bytes) 0 fast_aaccess_0 1 fast_agetfield 17 <java/util/regex/Pattern$BmpCharPredicate$$Lambda$5.arg$1/Ljava/util/regex/Pattern$BmpCharPredicate;> 4 fast_aaccess_0 5 fast_agetfield 19 <java/util/regex/Pattern$BmpCharPredicate$$Lambda$5.arg$2/Ljava/util/regex/Pattern$CharPredicate;> 8 iload_1 9 invokespecial 32 <java/util/regex/Pattern$BmpCharPredicate.lambda$union$3(Ljava/util/regex/Pattern$CharPredicate;I)Z> 0 bci: 9 CounterData count(3988) 12 ireturn
27-03-2019

$ /Library/Java/JavaVirtualMachines/openjdk-11.0.1.jdk/Contents/Home/bin/java -showversion -Xbatch -XX:-TieredCompilation -XX:CICompilerCount=1 -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining -XX:CompileCommand=quiet -XX:CompileCommand=compileonly,*::match -XX:CompileCommand=compileonly,*::charAt -XX:CompileCommand=compileonly,*::codePointBefore -XX:CompileCommand=compileonly,*::isSurrogate TestRegex openjdk version "11.0.1" 2018-10-16 OpenJDK Runtime Environment 18.9 (build 11.0.1+13) OpenJDK 64-Bit Server VM 18.9 (build 11.0.1+13, mixed mode) Starting benchmark 192 17 b java.lang.String::charAt (25 bytes) @ 1 java.lang.String::isLatin1 (19 bytes) inline (hot) @ 12 java.lang.StringLatin1::charAt (28 bytes) inline (hot) 194 18 b java.lang.StringLatin1::charAt (28 bytes) 204 19 b java.util.regex.Pattern$CharPropertyGreedy::match (122 bytes) @ 17 java.lang.Character::codePointAt (51 bytes) inline (hot) @ 2 java.lang.String::charAt (25 bytes) inline (hot) \-> TypeProfile (3988/3988 counts) = java/lang/String @ 1 java.lang.String::isLatin1 (19 bytes) inline (hot) @ 12 java.lang.StringLatin1::charAt (28 bytes) inline (hot) @ 9 java.lang.Character::isHighSurrogate (18 bytes) inline (hot) @ 28 java.util.regex.Pattern$BmpCharPredicate$$Lambda$5/0x0000000800067040::is (13 bytes) inline (hot) \-> TypeProfile (4958/4958 counts) = java/util/regex/Pattern$BmpCharPredicate$$Lambda$5 @ 9 java.util.regex.Pattern$BmpCharPredicate::lambda$union$3 (26 bytes) inline (hot) @ 2 java.util.regex.CharPredicates$$Lambda$3/0x0000000800061040::is (5 bytes) inline (hot) \-> TypeProfile (3988/3988 counts) = java/util/regex/CharPredicates$$Lambda$3 @ 1 java.util.regex.CharPredicates::lambda$ASCII_DIGIT$15 (20 bytes) inline (hot) @ 8 java.util.regex.ASCII::isDigit (18 bytes) inline (hot) @ 12 java.util.regex.Pattern$CharPredicate$$Lambda$4/0x0000000800067c40::is (9 bytes) inline (hot) \-> TypeProfile (3004/3004 counts) = java/util/regex/Pattern$CharPredicate$$Lambda$4 @ 5 java.util.regex.Pattern$CharPredicate::lambda$negate$3 (16 bytes) inline (hot) @ 2 java.util.regex.CharPredicates$$Lambda$3/0x0000000800061040::is (5 bytes) inline (hot) \-> TypeProfile (2205/2205 counts) = java/util/regex/CharPredicates$$Lambda$3 @ 1 java.util.regex.CharPredicates::lambda$ASCII_DIGIT$15 (20 bytes) inline (hot) @ 8 java.util.regex.ASCII::isDigit (18 bytes) inline (hot) @ 42 java.lang.Character::charCount (12 bytes) inline (hot) @ 80 java.util.regex.Pattern$Slice::match (79 bytes) inline (hot) @ 80 java.util.regex.Pattern$LastNode::match (45 bytes) executed < MinInliningThreshold times \-> TypeProfile (1940/1942 counts) = java/util/regex/Pattern$Slice @ 49 java.lang.CharSequence::charAt (0 bytes) virtual call @ 75 java.util.regex.Pattern$Node::match (27 bytes) virtual call @ 101 java.lang.Character::codePointBefore (48 bytes) inline (hot) @ 5 java.lang.CharSequence::charAt (0 bytes) virtual call @ 12 java.lang.Character::isLowSurrogate (18 bytes) inline (hot) @ 27 java.lang.CharSequence::charAt (0 bytes) virtual call @ 34 java.lang.Character::isHighSurrogate (18 bytes) inline (hot) @ 109 java.lang.Character::charCount (12 bytes) inline (hot) 252 20 b java.util.regex.Matcher::match (154 bytes) @ 121 java.util.regex.Pattern$CharPropertyGreedy::match (122 bytes) already compiled into a big method \-> TypeProfile (320/320 counts) = java/util/regex/Pattern$CharPropertyGreedy 747 21 b java.util.regex.Pattern$LastNode::match (45 bytes) Finished running benchmark - took 2619 millis
26-03-2019

$.../java -showversion -Xbatch -XX:-TieredCompilation -XX:CICompilerCount=1 -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining -XX:CompileCommand=quiet -XX:CompileCommand=compileonly,*::match* -XX:CompileCommand=compileonly,*::codePointAt -XX:CompileCommand=compileonly,*::isSatisfiedBy -XX:CompileCommand=compileonly,*::isSatisfiedBy -XX:CompileCommand=compileonly,*::charCount -XX:CompileCommand=compileonly,*::isType -XX:CompileCommand=compileonly,*::getType -XX:CompileCommand=compileonly,*::charAt -XX:CompileCommand=compileonly,*::isHighSurrogate TestRegex java version "1.8.0_201" Java(TM) SE Runtime Environment (build 1.8.0_201-b09) Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode) Starting benchmark 260 1 b java.lang.String::charAt (29 bytes) 263 2 b java.util.regex.Pattern$Ctype::isSatisfiedBy (24 bytes) @ 12 java.util.regex.ASCII::isType (15 bytes) inline (hot) @ 1 java.util.regex.ASCII::getType (17 bytes) inline (hot) 265 3 b java.util.regex.ASCII::isType (15 bytes) @ 1 java.util.regex.ASCII::getType (17 bytes) inline (hot) 265 4 b java.util.regex.ASCII::getType (17 bytes) 266 5 b java.util.regex.Pattern$Curly::match0 (174 bytes) @ 31 java.util.regex.Pattern$CharProperty::match (56 bytes) too big \-> TypeProfile (3/3 counts) = java/util/regex/Pattern$5 @ 77 java.util.regex.Pattern$CharProperty::match (56 bytes) inline (hot) \-> TypeProfile (4957/4957 counts) = java/util/regex/Pattern$5 @ 10 java.lang.Character::codePointAt (51 bytes) inline (hot) @ 2 java.lang.String::charAt (29 bytes) inline (hot) \-> TypeProfile (3988/3988 counts) = java/lang/String @ 9 java.lang.Character::isHighSurrogate (18 bytes) inline (hot) @ 18 java.util.regex.Pattern$5::isSatisfiedBy (28 bytes) inline (hot) @ 5 java.util.regex.Pattern$Ctype::isSatisfiedBy (24 bytes) inline (hot) \-> TypeProfile (3988/3988 counts) = java/util/regex/Pattern$Ctype @ 12 java.util.regex.ASCII::isType (15 bytes) inline (hot) @ 1 java.util.regex.ASCII::getType (17 bytes) inline (hot) @ 16 java.util.regex.Pattern$CharProperty$1::isSatisfiedBy (17 bytes) inline (hot) \-> TypeProfile (3004/3004 counts) = java/util/regex/Pattern$CharProperty$1 @ 5 java.util.regex.Pattern$Ctype::isSatisfiedBy (24 bytes) inline (hot) \-> TypeProfile (2205/2205 counts) = java/util/regex/Pattern$Ctype @ 12 java.util.regex.ASCII::isType (15 bytes) inline (hot) @ 1 java.util.regex.ASCII::getType (17 bytes) inline (hot) @ 32 java.lang.Character::charCount (12 bytes) inline (hot) @ 37 java.util.regex.Pattern$Node::match (27 bytes) inline (hot) \-> TypeProfile (3990/3990 counts) = java/util/regex/Pattern$Node @ 141 java.util.regex.Pattern$Slice::match (79 bytes) inline (hot) @ 141 java.util.regex.Pattern$LastNode::match (45 bytes) executed < MinInliningThreshold times \-> TypeProfile (1940/1942 counts) = java/util/regex/Pattern$Slice 286 6 b java.util.regex.Matcher::match (109 bytes) @ 86 java.util.regex.Pattern$Curly::match (86 bytes) inline (hot) \-> TypeProfile (320/320 counts) = java/util/regex/Pattern$Curly @ 54 java.util.regex.Pattern$Curly::match0 (174 bytes) already compiled into a medium method @ 72 java.util.regex.Pattern$Curly::match1 (63 bytes) too big @ 82 java.util.regex.Pattern$Curly::match2 (59 bytes) too big 296 7 b java.lang.Character::charCount (12 bytes) 296 8 b java.lang.Character::isHighSurrogate (18 bytes) 297 9 b java.util.regex.Pattern$CharProperty::match (56 bytes) @ 10 java.lang.Character::codePointAt (51 bytes) inline (hot) @ 2 java.lang.String::charAt (29 bytes) inline (hot) \-> TypeProfile (6694/6694 counts) = java/lang/String @ 9 java.lang.Character::isHighSurrogate (18 bytes) inline (hot) @ 18 java.util.regex.Pattern$5::isSatisfiedBy (28 bytes) inline (hot) \-> TypeProfile (6696/6696 counts) = java/util/regex/Pattern$5 @ 5 java.util.regex.Pattern$Ctype::isSatisfiedBy (24 bytes) inline (hot) \-> TypeProfile (6694/6694 counts) = java/util/regex/Pattern$Ctype @ 12 java.util.regex.ASCII::isType (15 bytes) inline (hot) @ 1 java.util.regex.ASCII::getType (17 bytes) inline (hot) @ 16 java.util.regex.Pattern$CharProperty$1::isSatisfiedBy (17 bytes) inline (hot) \-> TypeProfile (5492/5492 counts) = java/util/regex/Pattern$CharProperty$1 @ 5 java.util.regex.Pattern$Ctype::isSatisfiedBy (24 bytes) inline (hot) \-> TypeProfile (4693/4693 counts) = java/util/regex/Pattern$Ctype @ 12 java.util.regex.ASCII::isType (15 bytes) inline (hot) @ 1 java.util.regex.ASCII::getType (17 bytes) inline (hot) @ 32 java.lang.Character::charCount (12 bytes) inline (hot) @ 37 java.util.regex.Pattern$Node::match (27 bytes) inline (hot) \-> TypeProfile (6696/6696 counts) = java/util/regex/Pattern$Node 425 10 b java.util.regex.Pattern$Curly::match (86 bytes) @ 54 java.util.regex.Pattern$Curly::match0 (174 bytes) already compiled into a big method 433 11 b java.util.regex.Matcher::matches (10 bytes) @ 6 java.util.regex.Matcher::match (109 bytes) inline (hot) @ 86 java.util.regex.Pattern$Curly::match (86 bytes) inline (hot) \-> TypeProfile (321/321 counts) = java/util/regex/Pattern$Curly @ 54 java.util.regex.Pattern$Curly::match0 (174 bytes) already compiled into a big method 435 12 b java.util.regex.Pattern$LastNode::match (45 bytes) Finished running benchmark - took 733 millis
26-03-2019

This was introduced in JDK 9 b167. AFAICT this build did not contain library changes relevant to this issue. FWIW, a big regex performance change[1] did happen in 9b119. I tried swapping VMs between b166 and b167. Build 166 run with the b167 VM shows the slower performance (and vice versa). Here are some snippets from a couple -Xprof runs I did with the stock builds (test modified to use a single thread): JDK9b166: --------- Flat profile of 0.96 secs (73 total ticks): pool-1-thread-1 ... Compiled + native Method 60.3% 42 + 2 java.util.regex.Pattern$CharPropertyGreedy.match ... JDK9b167: --------- Flat profile of 7.13 secs (487 total ticks): pool-1-thread-1 ... Compiled + native Method 90.3% 440 + 0 java.util.regex.Pattern$CharPropertyGreedy.match ... So the hot method is spending 10x the ticks with b167. Running -Xprof with -Xint shows this: JDK9b166, -Xprof: ----------------- Flat profile of 24.74 secs (1928 total ticks): pool-1-thread-1 Interpreted + native Method 19.1% 369 + 0 java.util.regex.CharPredicates.lambda$ASCII_DIGIT$15 17.0% 328 + 0 java.lang.StringLatin1.charAt 10.8% 209 + 0 java.lang.String.isLatin1 9.4% 182 + 0 java.util.regex.ASCII.isDigit ... 1.3% 26 + 0 java.util.regex.Pattern$BmpCharPredicate$$Lambda$5.is 1.3% 26 + 0 java.lang.Character.codePointAt 0.8% 16 + 0 java.util.regex.Pattern$CharPredicate$$Lambda$4.is JDK9b167, -Xprof: ----------------- Flat profile of 48.21 secs (4137 total ticks): pool-1-thread-1 Interpreted + native Method 58.8% 2433 + 0 java.util.regex.Pattern$BmpCharPredicate$$Lambda$5.is 32.2% 1332 + 0 java.util.regex.Pattern$CharPredicate$$Lambda$4.is 2.7% 110 + 0 java.lang.StringLatin1.charAt 2.3% 97 + 0 java.lang.String.isLatin1 ... So something with lambdas (or a couple particular lambdas) changed (as far as -Xint is concerned, anyway), maybe? Moving to hotspot for investigation. 1. http://hg.openjdk.java.net/jdk9/dev/jdk/rev/d0c319c32334
18-03-2019

Ran the testcase on a Windows 7 machine 64 bit i5-5300U CPU@2.30 GHz. Below are the results : JDK 8u201 - 3745 millis JDK 9.0.4 - 21899 millis JDK 10.0.2 - 24640 millis JDK 11.0.2 - 21917 millis JDK 12+33 - 22321 millis JDK 13+12 - 22587 millis
15-03-2019