JDK-8200378 : String::strip, String::stripLeading, String::stripTrailing
  • Type: CSR
  • Component: core-libs
  • Sub-Component: java.lang
  • Priority: P3
  • Status: Closed
  • Resolution: Approved
  • Fix Versions: 11
  • Submitted: 2018-03-28
  • Updated: 2018-05-09
  • Resolved: 2018-05-08
Description
Summary
-------

This feature introduces three String instance methods for removal
of Unicode white space from the beginning and end of a string.

Problem
-------

String::trim has existed since the early days of Java, when Unicode had not
yet evolved into the standard that is widely used today.

The definition of space used by String::trim is any code point less than
or equal to the space code point (\u0020); that is, the space character
itself plus the characters commonly referred to as ASCII or ISO control
characters.

Unicode-aware trimming routines should use Character::isWhitespace(int).
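
As a rough illustration of the gap between the two definitions, the sketch
below compares the String::trim rule with Character::isWhitespace(int) for a
few sample characters. The class name and the `isTrimSpace` helper are
hypothetical and exist only for this example; the helper restates the
"less than or equal to \u0020" rule described above and is not code taken
from the JDK.

```java
public class TrimVersusWhitespace {
    // Hypothetical helper restating the rule String::trim applies:
    // a char counts as "space" when it is <= U+0020.
    static boolean isTrimSpace(char c) {
        return c <= '\u0020';
    }

    public static void main(String[] args) {
        char emSpace = '\u2003';  // EM SPACE, a Unicode space separator
        char startOfHeading = '\u0001';  // ASCII control character

        // EM SPACE is Unicode white space, but trim does not treat it as space.
        System.out.println(isTrimSpace(emSpace));             // false
        System.out.println(Character.isWhitespace(emSpace));  // true

        // U+0001 is removed by trim, yet it is not Unicode white space.
        System.out.println(isTrimSpace(startOfHeading));             // true
        System.out.println(Character.isWhitespace(startOfHeading));  // false

        // trim leaves the EM SPACE padding in place.
        String padded = emSpace + "text" + emSpace;
        System.out.println(padded.trim().length());  // 6
    }
}
```

The two definitions differ in both directions: trim removes control
characters that are not white space, and it keeps Unicode space characters
above \u0020.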

Additionally, developers have had no way to remove only leading
(indentation) white space or only trailing white space.

Solution
--------

Introduce trimming methods that are Unicode white space aware
and that give separate control over leading-only and trailing-only removal.
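
A short usage sketch, assuming a Java 11 or later runtime where the three
methods are available. The class name `StripDemo` and the `SAMPLE` constant
are made up for illustration.

```java
public class StripDemo {
    // Hypothetical sample: EM SPACE and a space before "hello",
    // a space and a tab after it.
    static final String SAMPLE = "\u2003 hello \t";

    public static void main(String[] args) {
        // Both ends stripped: only "hello" remains.
        System.out.println(SAMPLE.strip().length());          // 5

        // Leading white space removed, trailing space and tab kept.
        System.out.println(SAMPLE.stripLeading().length());   // 7

        // Trailing white space removed, leading EM SPACE and space kept.
        System.out.println(SAMPLE.stripTrailing().length());  // 7

        // trim removes the trailing space and tab but keeps the leading
        // EM SPACE (and the space after it), since EM SPACE is above U+0020.
        System.out.println(SAMPLE.trim().length());           // 7
    }
}
```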

Specification
-------------

```
    /**
     * Returns a string whose value is this string, with all leading
     * and trailing {@link Character#isWhitespace(int) white space}
     * removed.
     * <p>
     * If this {@code String} object represents an empty string,
     * or if all code points in this string are
     * {@link Character#isWhitespace(int) white space}, then an empty string
     * is returned.
     * <p>
     * Otherwise, returns a substring of this string beginning with the first
     * code point that is not a {@link Character#isWhitespace(int) white space}
     * up to and including the last code point that is not a
     * {@link Character#isWhitespace(int) white space}.
     * <p>
     * This method may be used to strip
     * {@link Character#isWhitespace(int) white space} from
     * the beginning and end of a string.
     *
     * @return  a string whose value is this string, with all leading
     *          and trailing white space removed
     *
     * @see Character#isWhitespace(int)
     *
     * @since 11
     */
    public String strip() {

    /**
     * Returns a string whose value is this string, with all leading
     * {@link Character#isWhitespace(int) white space} removed.
     * <p>
     * If this {@code String} object represents an empty string,
     * or if all code points in this string are
     * {@link Character#isWhitespace(int) white space}, then an empty string
     * is returned.
     * <p>
     * Otherwise, returns a substring of this string beginning with the first
     * code point that is not a {@link Character#isWhitespace(int) white space}
     * up to and including the last code point of this string.
     * <p>
     * This method may be used to trim
     * {@link Character#isWhitespace(int) white space} from
     * the beginning of a string.
     *
     * @return  a string whose value is this string, with all leading white
     *          space removed
     *
     * @see Character#isWhitespace(int)
     *
     * @since 11
     */
    public String stripLeading() {

    /**
     * Returns a string whose value is this string, with all trailing
     * {@link Character#isWhitespace(int) white space} removed.
     * <p>
     * If this {@code String} object represents an empty string,
     * or if all characters in this string are
     * {@link Character#isWhitespace(int) white space}, then an empty string
     * is returned.
     * <p>
     * Otherwise, returns a substring of this string beginning with the first
     * code point of this string up to and including the last code point
     * that is not a {@link Character#isWhitespace(int) white space}.
     * <p>
     * This method may be used to trim
     * {@link Character#isWhitespace(int) white space} from
     * the end of a string.
     *
     * @return  a string whose value is this string, with all trailing white
     *          space removed
     *
     * @see Character#isWhitespace(int)
     *
     * @since 11
     */
    public String stripTrailing() {

```
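
The behavior specified above can be approximated with existing String and
Character methods. The following is a minimal sketch under that assumption,
not the JDK implementation; the class `StripSketch` and its helper names are
hypothetical. It walks the string by code point, as the specification is
worded, and returns an empty string when every code point is white space.

```java
public final class StripSketch {
    private StripSketch() {}

    // Index of the first code point that is not white space,
    // or s.length() if there is none.
    static int firstNonWhitespace(String s) {
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            if (!Character.isWhitespace(cp)) {
                break;
            }
            i += Character.charCount(cp);
        }
        return i;
    }

    // Index just past the last code point that is not white space,
    // or 0 if there is none.
    static int endOfLastNonWhitespace(String s) {
        int i = s.length();
        while (i > 0) {
            int cp = s.codePointBefore(i);
            if (!Character.isWhitespace(cp)) {
                break;
            }
            i -= Character.charCount(cp);
        }
        return i;
    }

    static String strip(String s) {
        int start = firstNonWhitespace(s);
        if (start == s.length()) {
            return "";  // empty, or white space only
        }
        return s.substring(start, endOfLastNonWhitespace(s));
    }

    static String stripLeading(String s) {
        return s.substring(firstNonWhitespace(s));
    }

    static String stripTrailing(String s) {
        return s.substring(0, endOfLastNonWhitespace(s));
    }
}
```

For example, `StripSketch.strip("\u2003 abc \n")` yields `"abc"`, and
`StripSketch.strip(" \t ")` yields the empty string.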
Comments
The methods do indeed return {@code this} when no change has occurred.
08-05-2018

Regarding the issue of returning `this`, I agree that the specification should **not** specify circumstances under which this same string instance is returned. The rest of the String class isn't consistent about this. For example, replace(oldChar, newChar) says that if oldChar doesn't occur, then this String object is returned. However, the substring() methods make no mention of this special case. Of note is that various methods, including the substring() overloads, **do** implement the optimization of returning `this` where appropriate. I think the strip() family should implement this, but they shouldn't specify it.
08-05-2018

Perhaps not strictly necessary to promise that "this" be returned if there are no white space characters to be removed, but moving the request to Approved in its current form.
08-05-2018

Comments: "upto" => "up to"; "...a {@code String} object representing an empty string is returned." => "...an empty string is returned." Please refinalize once these and any other edits have been made.
06-05-2018

Methinks that the math-speak is unnecessary.
03-05-2018

From my reading of the spec, String.substring(int, int) works on char values and not code points, implying text like "the result of {@code this.substring(k, m + n)}" is consistent with the rest of the description of the methods' behavior. Marking the request as pended until this is sorted out.
02-05-2018

Is it "white space" or "whitespace"? Other than that, reviewed. -Sundar
27-04-2018

(1) I know the "wording" is kind of copy/paste from the existing trim() method, but I personally feel it might be better to replace all "white space" with {@link Character#isWhitespace(int) whitespace}. For example, "Returns a string whose value is this string, with any leading|trailing {@link Character#isWhitespace(int) whitespace} character removed." and maybe with a @see Character#isWhitespace(int). (2) "then a {@code String} object representing an empty string is returned." --> simply "then an empty string is returned"?
25-04-2018

Reviewed. -Sundar
29-03-2018