JDK-8169235 : Java REGEX match error
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.regex
  • Affected Version: 7
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • OS: generic
  • CPU: generic
  • Submitted: 2016-10-29
  • Updated: 2017-05-19
  • Resolved: 2017-04-07
Related Reports
Duplicate :  
Description
FULL PRODUCT VERSION :
java version "1.7.0_85"
Java(TM) SE Runtime Environment (build 1.7.0_85-b15)


ADDITIONAL OS VERSION INFORMATION :
Linux 2.6.39-400.211.1.el6uek.x86_64 #1 SMP Fri Nov 15 13:39:16 PST 2013 x86_64 x86_64 x86_64 GNU/Linux


A DESCRIPTION OF THE PROBLEM :
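On Java 7, Matcher.find() does not find the last match in the test formula: only three of the four expected function calls are reported, and the final F(four) is missing.

The problem does not occur on Java 8, which seems to have it fixed.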



STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Compile and run the test program listed between the BEGIN SOURCE and END SOURCE markers below.

Java 8 produces:

OK I found           F()  with the argument 'one'
OK I found   as.factor()  with the argument 'two'
OK I found      factor()  with the argument 'three'
OK I found           F()  with the argument 'four'

Java 7 produces:

OK I found           F()  with the argument 'one'
OK I found   as.factor()  with the argument 'two'
OK I found      factor()  with the argument 'three'



EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
All four function calls in the formula should be matched, as they are on Java 8:

OK I found           F()  with the argument 'one'
OK I found   as.factor()  with the argument 'two'
OK I found      factor()  with the argument 'three'
OK I found           F()  with the argument 'four'

ACTUAL -
Java 7 matches only the first three calls; F(four) is never reported:

OK I found           F()  with the argument 'one'
OK I found   as.factor()  with the argument 'two'
OK I found      factor()  with the argument 'three'

REPRODUCIBILITY :
This bug can always be reproduced.

---------- BEGIN SOURCE ----------
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import static java.lang.System.out;

/**
 * @author Dmitry Golovashkin. Created on 10/28/16.
 *        dmitry.golovashkin@oracle.com
 */
public class Main {

  private static void asFactor(final String formula) {
    // Group 1: function name (factor, as.factor, or F). Group 2: the optional "as." prefix.
    final String functionName = "\\b((as\\.)?factor|F)\\b"; // factor(ID), as.factor(ID), or F(ID).
    final String space = "\\s*";
    final String leftParenthesis = "\\(";
    final String id = "([^)]*)"; // Group 3: the argument. May include leading/trailing space.
    final String rightParenthesis = "\\)";
    final String asFactorRegex = functionName + space + leftParenthesis + id + rightParenthesis;

    final Pattern asFactorPattern = Pattern.compile(asFactorRegex);
    final Matcher matcher = asFactorPattern.matcher(formula);

    // Print one line per match; four lines are expected for the formula in main().
    while (matcher.find()) {
      out.printf("OK I found  %10s()  with the argument '%s'%n", matcher.group(1), matcher.group(3));
    }
  }

  public static void main(String[] args) {
    final String formula = "y ~ F(one) + as.factor(two) + factor(three) + F(four)";
    out.println("input formula " + formula);
    out.println();

    asFactor(formula);
  }
}

---------- END SOURCE ----------
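
The test case reads the function name from group 1 and the argument from group 3, skipping group 2. The following is a minimal sketch of how the capture groups of the same pattern are numbered. The class name GroupBreakdown and the helper firstMatchGroups are hypothetical names introduced for illustration and are not part of the original report.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original test case.
class GroupBreakdown {
  // Same regex as the test case, written as a single literal.
  static final Pattern AS_FACTOR =
      Pattern.compile("\\b((as\\.)?factor|F)\\b\\s*\\(([^)]*)\\)");

  // Returns {group 1, group 2, group 3} for the first match, or null if nothing matches.
  static String[] firstMatchGroups(String input) {
    Matcher m = AS_FACTOR.matcher(input);
    if (!m.find()) {
      return null;
    }
    return new String[] { m.group(1), m.group(2), m.group(3) };
  }

  public static void main(String[] args) {
    for (String call : new String[] { "as.factor( two )", "F(one)" }) {
      String[] g = firstMatchGroups(call);
      // Group 2 is null when the optional "as." prefix did not take part in the match.
      System.out.println(call + " -> name=" + g[0] + ", prefix=" + g[1] + ", arg='" + g[2] + "'");
    }
  }
}
```

Group 3 keeps any whitespace inside the parentheses, consistent with the "May include leading/trailing space" comment in the test case.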


Comments
The issue is not reproducible on the latest Oracle binaries (JDK 8, JDK 9), so it is being closed as cannot reproduce.
04-11-2016

Checked this issue against 8, 8u112, 8u122ea, and 9ea on Windows. The issue could be reproduced on the JDK 7 family, but not on the JDK 8 and 9ea families.

Steps to reproduce:
- Run the attached test application (Main.java) with the JDK.

Result:
OS: Windows 7 64 bit [Version 6.1.7601]
JDK:
7 b147      : Fail
7u80 b07    : Fail <<
8 b132      : Pass
8u102 b14   : Pass
8u112 b15   : Pass
8u122ea b02 : Pass
9ea+133     : Pass
04-11-2016
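
The Pass/Fail results above were presumably judged by reading the program output. The following is a rough sketch of a self-checking variant that counts the matches and prints one result row for the running JVM. The class name PassFailRow, the helpers countMatches and resultRow, and the row format are assumptions made for illustration and were not used in the original evaluation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical self-checking variant of the test case.
class PassFailRow {
  // Same regex as the test case, written as a single literal.
  static final Pattern AS_FACTOR =
      Pattern.compile("\\b((as\\.)?factor|F)\\b\\s*\\(([^)]*)\\)");

  // Counts how many times the pattern is found in the formula.
  static int countMatches(String formula) {
    Matcher m = AS_FACTOR.matcher(formula);
    int count = 0;
    while (m.find()) {
      count++;
    }
    return count;
  }

  // Formats one result row: Pass only if the match count equals the expected count.
  static String resultRow(String version, int matches, int expected) {
    return version + " : " + (matches == expected ? "Pass" : "Fail");
  }

  public static void main(String[] args) {
    String formula = "y ~ F(one) + as.factor(two) + factor(three) + F(four)";
    // The formula contains four calls, so four matches are expected.
    System.out.println(
        resultRow(System.getProperty("java.version"), countMatches(formula), 4));
  }
}
```

Running this class on each JDK build under test would give one row per build, without comparing printed lines by eye.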