JDK-8292872 : MatchResult should provide values of named-capturing groups
  • Type: CSR
  • Component: core-libs
  • Sub-Component: java.util.regex
  • Priority: P4
  • Status: Provisional
  • Resolution: Unresolved
  • Fix Versions: 20
  • Submitted: 2022-08-24
  • Updated: 2022-09-19
Description
Summary
-------

`java.util.regex.MatchResult` can be queried only by group number, not by group name.
This proposal adds `start(String)`, `end(String)` and `group(String)` to close that gap. It also adds `namedGroups()`, which returns a mapping from group names to group numbers, and `hasMatch()`, which reports whether a match was successful.

`java.util.regex.Pattern` also gains a public `namedGroups()` method.

Problem
-------

Interface `java.util.regex.MatchResult` currently lets callers query match details only by group number, although a regex pattern can also declare named-capturing groups.

While `java.util.regex.Matcher` exposes methods `start(String)`, `end(String)` and `group(String)` that accept group names, the superinterface `java.util.regex.MatchResult` does not.
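
The gap can be illustrated with APIs that exist before this proposal. In the sketch below, the class name `NamedGroupGap` and the date pattern are illustrative assumptions. It reads a named group through `Matcher`, then takes a `MatchResult` snapshot via `toMatchResult()`, where only the numbered accessors are available.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupGap {
    // Illustrative pattern with two named-capturing groups.
    static final Pattern DATE = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})");

    // Matcher accepts group names directly.
    static String yearViaMatcher(String input) {
        Matcher m = DATE.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("no match: " + input);
        }
        return m.group("year");
    }

    // Without the proposed methods, a MatchResult snapshot offers only
    // numbered access, so the caller must hard-code that "year" is group 1.
    static String yearViaMatchResult(String input) {
        Matcher m = DATE.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("no match: " + input);
        }
        MatchResult r = m.toMatchResult();
        return r.group(1);
    }

    public static void main(String[] args) {
        System.out.println(yearViaMatcher("2022-08"));
        System.out.println(yearViaMatchResult("2022-08"));
    }
}
```

Both calls return the same text, but the second one breaks silently if the pattern is edited and the group order changes.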

Solution
--------

Default methods `start(String)`, `end(String)` and `group(String)` are added to `java.util.regex.MatchResult`. They rely on the presence of default method `namedGroups()`, which must be overridden to return a mapping from group names to the corresponding group number.
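
The delegation described above can be sketched without the real interface. The following is a minimal model, not the JDK implementation. `NamedResult`, `Legacy` and `Named` are hypothetical types, reduced to `start(int)`, `start(String)` and `namedGroups()`. It shows how overriding only `namedGroups()` makes the name-based default work, and how an implementation that overrides neither method ends up throwing `UnsupportedOperationException`.

```java
import java.util.Collections;
import java.util.Map;

class DefaultDelegationSketch {

    // Hypothetical stand-in for MatchResult, reduced to what the sketch needs.
    interface NamedResult {
        int start(int group);

        // Like the proposed default: throws unless an implementation overrides it.
        default Map<String, Integer> namedGroups() {
            throw new UnsupportedOperationException("namedGroups() is not overridden");
        }

        // Like the proposed default: resolve the name, then delegate to start(int).
        default int start(String name) {
            Integer number = namedGroups().get(name);
            if (number == null) {
                throw new IllegalArgumentException("No group with name <" + name + ">");
            }
            return start(number.intValue());
        }
    }

    // An implementation written before the new methods existed.
    static final class Legacy implements NamedResult {
        @Override
        public int start(int group) {
            return 0;
        }
    }

    // An implementation that opts in by overriding only namedGroups().
    static final class Named implements NamedResult {
        @Override
        public int start(int group) {
            return group == 1 ? 5 : 0; // fixed offsets, for illustration only
        }

        @Override
        public Map<String, Integer> namedGroups() {
            return Collections.singletonMap("year", 1);
        }
    }

    // Reports whether the name-based lookup fails with UnsupportedOperationException.
    static boolean throwsUnsupported(NamedResult r, String name) {
        try {
            r.start(name);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(new Named().start("year"));
        System.out.println(throwsUnsupported(new Legacy(), "year"));
        System.out.println(throwsUnsupported(new Named(), "year"));
    }
}
```

Existing implementations such as `Legacy` keep compiling unchanged, which is the point of using default methods here. They fail only when a caller uses the name-based lookup.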

In addition, the interface adds `hasMatch()`, which must be overridden to return whether a previous match operation was successful.
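
For context, one way to answer the same question without `hasMatch()` is to probe an accessor and catch the `IllegalStateException` that `MatchResult.start()` is specified to throw when no match is available. The sketch below shows that workaround. The class and method names are illustrative, and it is not taken from the proposal.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HasMatchWorkaround {
    // Probes start() and treats IllegalStateException as "no match".
    static boolean holdsMatch(MatchResult r) {
        try {
            r.start();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    // Runs a single find() and snapshots the matcher state, matched or not.
    static MatchResult snapshot(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        m.find();
        return m.toMatchResult();
    }

    public static void main(String[] args) {
        System.out.println(holdsMatch(snapshot("\\d+", "abc 42")));
        System.out.println(holdsMatch(snapshot("\\d+", "abc")));
    }
}
```

A dedicated query method avoids using an exception for control flow.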

The implementation of `namedGroups()` in class `java.util.regex.Matcher` also relies on the additional, equally named method in class `java.util.regex.Pattern`.


Specification
-------------

```
     package java.util.regex;
    
    + * @implNote
    + * Support for named groups is implemented by the default methods
    + * {@link #start(String)}, {@link #end(String)} and {@link #group(String)}.
    + * They all make use of the map returned by {@link #namedGroups()}, whose
    + * default implementation simply throws {@link UnsupportedOperationException}.
    + * It is thus sufficient to override {@link #namedGroups()} for these methods
    + * to work. However, overriding them directly might be preferable for
    + * performance or other reasons.
    + *
      * @author  Michael McCloskey
      * @see Matcher
      * @since 1.5
      */
     public interface MatchResult
    
    
    
    +    /**
    +     * Returns the start index of the subsequence captured by the given
    +     * <a href="Pattern.html#groupname">named-capturing group</a> during the
    +     * previous match operation.
    +     *
    +     * @param  name
    +     *         The name of a named-capturing group in this matcher's pattern
    +     *
    +     * @return  The index of the first character captured by the group,
    +     *          or {@code -1} if the match was successful but the group
    +     *          itself did not match anything
    +     *
    +     * @throws  IllegalStateException
    +     *          If no match has yet been attempted,
    +     *          or if the previous match operation failed
    +     *
    +     * @throws  IllegalArgumentException
    +     *          If there is no capturing group in the pattern
    +     *          with the given name
    +     *
    +     * @throws UnsupportedOperationException
    +     *          If the default implementation of {@link #namedGroups()}
    +     *          is not overridden.
    +     *
    +     * @implSpec
    +     * The default implementation of this method invokes {@link #namedGroups()}
    +     * to obtain the group number from the {@code name} argument, and uses this
    +     * number as argument to an invocation of {@link #start(int)}.
    +     *
    +     * @since 20
    +     */
    +    default int start(String name) {
    
    
    
    +    /**
    +     * Returns the offset after the last character of the subsequence
    +     * captured by the given <a href="Pattern.html#groupname">named-capturing
    +     * group</a> during the previous match operation.
    +     *
    +     * @param  name
    +     *         The name of a named-capturing group in this matcher's pattern
    +     *
    +     * @return  The offset after the last character captured by the group,
    +     *          or {@code -1} if the match was successful
    +     *          but the group itself did not match anything
    +     *
    +     * @throws  IllegalStateException
    +     *          If no match has yet been attempted,
    +     *          or if the previous match operation failed
    +     *
    +     * @throws  IllegalArgumentException
    +     *          If there is no capturing group in the pattern
    +     *          with the given name
    +     *
    +     * @throws UnsupportedOperationException
    +     *          If the default implementation of {@link #namedGroups()}
    +     *          is not overridden.
    +     *
    +     * @implSpec
    +     * The default implementation of this method invokes {@link #namedGroups()}
    +     * to obtain the group number from the {@code name} argument, and uses this
    +     * number as argument to an invocation of {@link #end(int)}.
    +     *
    +     * @since 20
    +     */
    +    default int end(String name) {
    
    
    
    +    /**
    +     * Returns the input subsequence captured by the given
    +     * <a href="Pattern.html#groupname">named-capturing group</a> during the
    +     * previous match operation.
    +     *
    +     * <p> If the match was successful but the group specified failed to match
    +     * any part of the input sequence, then {@code null} is returned. Note
    +     * that some groups, for example {@code (a*)}, match the empty string.
    +     * This method will return the empty string when such a group successfully
    +     * matches the empty string in the input.  </p>
    +     *
    +     * @param  name
    +     *         The name of a named-capturing group in this matcher's pattern
    +     *
    +     * @return  The (possibly empty) subsequence captured by the named group
    +     *          during the previous match, or {@code null} if the group
    +     *          failed to match part of the input
    +     *
    +     * @throws  IllegalStateException
    +     *          If no match has yet been attempted,
    +     *          or if the previous match operation failed
    +     *
    +     * @throws  IllegalArgumentException
    +     *          If there is no capturing group in the pattern
    +     *          with the given name
    +     *
    +     * @throws UnsupportedOperationException
    +     *          If the default implementation of {@link #namedGroups()}
    +     *          is not overridden.
    +     *
    +     * @implSpec
    +     * The default implementation of this method invokes {@link #namedGroups()}
    +     * to obtain the group number from the {@code name} argument, and uses this
    +     * number as argument to an invocation of {@link #group(int)}.
    +     *
    +     * @since 20
    +     */
    +    default String group(String name) {
    
    
    
    +    /**
    +     * Returns an unmodifiable map from capturing group names to group numbers.
    +     * If there are no named groups, returns an empty map.
    +     *
    +     * @return an unmodifiable map from capturing group names to group numbers
    +     *
    +     * @throws UnsupportedOperationException if the implementation does not
    +     *          support named groups.
    +     *
    +     * @implSpec The default implementation of this method always throws
    +     *          {@link UnsupportedOperationException}
    +     *
    +     * @apiNote
    +     * This method must be overridden by an implementation that supports
    +     * named groups.
    +     *
    +     * @since 20
    +     */
    +    default Map<String,Integer> namedGroups() {
    
    
    
    +    /**
    +     * Returns whether {@code this} contains a valid match from
    +     * a previous match or find operation.
    +     *
    +     * @return whether {@code this} contains a valid match
    +     *
    +     * @throws UnsupportedOperationException if the implementation cannot report
    +     *          whether it has a match
    +     *
    +     * @implSpec The default implementation of this method always throws
    +     *          {@link UnsupportedOperationException}
    +     *
    +     * @since 20
    +     */
    +    default boolean hasMatch() {
```

```
    package java.util.regex;
    public final class Matcher implements MatchResult
    
    
    
    +    /**
    +     * {@inheritDoc}
    +     *
    +     * @return {@inheritDoc}
    +     *
    +     * @since {@inheritDoc}
    +     */
    +    @Override
    +    public Map<String, Integer> namedGroups() {
    
    
    
    +    /**
    +     * {@inheritDoc}
    +     *
    +     * @return {@inheritDoc}
    +     *
    +     * @since {@inheritDoc}
    +     */
    +    @Override
    +    public boolean hasMatch() {

```

```
    package java.util.regex;
    public final class Pattern
    
    
    
    +    /**
    +     * Returns an unmodifiable map from capturing group names to group numbers.
    +     * If there are no named groups, returns an empty map.
    +     *
    +     * @return an unmodifiable map from capturing group names to group numbers
    +     *
    +     * @since 20
    +     */
    +    public Map<String, Integer> namedGroups() {
```

Comments
`@implSpec` states that the default implementation invokes `namedGroups()`, and `@throws UnsupportedOperationException` specifies that the exception is thrown if `namedGroups()` is not overridden. Thus, if neither `start(String)` nor `namedGroups()` is overridden, the former throws. The same holds for `end(String)` and `group(String)`. I think there's no contradiction, unless I'm missing some point about the semantics of `@implSpec`.
19-09-2022

Moving to Provisional, not Approved.

There appear to be some contradictions in the implSpec tags and UnsupportedOperationException tags. In both

    +     * @throws UnsupportedOperationException
    +     *          If the default implementation of {@link #namedGroups()}
    +     *          is not overridden.
    +     *
    +     * @implSpec
    +     * The default implementation of this method invokes {@link #namedGroups()}
    +     * to obtain the group number from the {@code name} argument, and uses this
    +     * number as argument to an invocation of {@link #start(int)}.
    +     *
    +     * @since 20
    +     */
    +    default int start(String name) {

and

    +     * @throws UnsupportedOperationException
    +     *          If the default implementation of {@link #namedGroups()}
    +     *          is not overridden.
    +     *
    +     * @implSpec
    +     * The default implementation of this method invokes {@link #namedGroups()}
    +     * to obtain the group number from the {@code name} argument, and uses this
    +     * number as argument to an invocation of {@link #group(int)}.
    +     *
    +     * @since 20
    +     */
    +    default String group(String name) {

it seems the UnsupportedOperationException would not be thrown.
13-09-2022