JDK-8223781 : String::translateEscapes (Preview)
  • Type: CSR
  • Component: core-libs
  • Sub-Component: java.lang
  • Priority: P3
  • Status: Closed
  • Resolution: Approved
  • Fix Versions: 13
  • Submitted: 2019-05-13
  • Updated: 2019-11-09
  • Resolved: 2019-05-30
Description
Summary
-------

This feature introduces a new String instance method to translate escape sequences, such as `\n`, `\t`, `\'`, `\"`, and `\\`, as described in full in section 3.10.6 of The Java™ Language Specification.

This will be a [preview language feature](http://openjdk.java.net/jeps/12) as part of [Text Blocks](https://bugs.openjdk.java.net/browse/JDK-8222530).

Problem
-------

The specification of Text Blocks requires that the Java compiler defer processing of escape sequences until after line terminator translation and re-indentation. To provide consistency with the Java Language Specification and long-term maintainability, escape translation will be provided by a library method.
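The ordering described above can be illustrated with a rough sketch. It is not the compiler's implementation: the class `TextBlockOrderSketch` and its helpers `process` and `escapesFirst` are hypothetical names, and `String::stripIndent` is assumed to stand in for the re-indentation step. It needs a JDK that provides both methods (13 as deprecated preview API, 15 and later as standard API).

```java
public class TextBlockOrderSketch {

    // Hypothetical helper following the required order:
    // line terminators, then re-indentation, then escapes.
    static String process(String raw) {
        String normalized = raw.replace("\r\n", "\n").replace('\r', '\n');
        String reindented = normalized.stripIndent();
        return reindented.translateEscapes();
    }

    // Hypothetical helper with the wrong order, for contrast:
    // the escaped line feed becomes a real line before re-indentation.
    static String escapesFirst(String raw) {
        String translated = raw.translateEscapes();
        String normalized = translated.replace("\r\n", "\n").replace('\r', '\n');
        return normalized.stripIndent();
    }

    public static void main(String[] args) {
        // Two physical lines; the first contains a backslash followed by 'n'.
        String raw = "    first\\n  second\r\n    third";

        // Escapes last: the two spaces before "second" are content, not indentation.
        System.out.println(process(raw).equals("first\n  second\nthird"));        // true

        // Escapes first: "  second" starts its own line and sets the common indentation.
        System.out.println(escapesFirst(raw).equals("  first\nsecond\n  third")); // true
    }
}
```

Translating escapes last means an escaped line feed cannot influence how much incidental whitespace is stripped, which is the consistency the paragraph above asks for.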

Solution
--------

The solution is to provide a new String instance method which adheres to Java™ Language Specification section 3.10.7. This method takes the receiver String and replaces escape sequences with their character equivalents. Attempts to translate strings containing invalid escape sequences throw an IllegalArgumentException.
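A minimal usage sketch of the method as described, assuming a JDK that ships `String::translateEscapes` (13 or later). The class name `TranslateEscapesDemo` and the helper `isMalformed` are made up for illustration.

```java
public class TranslateEscapesDemo {

    // Hypothetical helper: reports whether translation rejects the input.
    static boolean isMalformed(String s) {
        try {
            s.translateEscapes();
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // In Java source, "\\t" is two characters: a backslash followed by 't'.
        String raw = "name\\tvalue\\n";
        String cooked = raw.translateEscapes();

        System.out.println(raw.length());                    // 13
        System.out.println(cooked.length());                 // 11
        System.out.println(cooked.equals("name\tvalue\n"));  // true

        // A backslash followed by 'q' is not a valid escape sequence.
        System.out.println(isMalformed("bad \\q escape"));   // true
    }
}
```

The receiver is left unchanged; the translated text is returned as a new String.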

Specification
-------------

```
    /**
     * Returns a string whose value is this string, with escape sequences
     * translated as if in a string literal.
     * <p>
     * Escape sequences are translated as follows:
     * <table class="plain">
     *   <caption style="display:none">Translation</caption>
     *   <thead>
     *   <tr>
     *     <th scope="col">Escape</th>
     *     <th scope="col">Name</th>
     *     <th scope="col">Translation</th>
     *   </tr>
     *   </thead>
     *   <tr>
     *     <td>{@code \u005Cb}</td>
     *     <td>backspace</td>
     *     <td>{@code U+0008}</td>
     *   </tr>
     *   <tr>
     *     <td>{@code \u005Ct}</td>
     *     <td>horizontal tab</td>
     *     <td>{@code U+0009}</td>
     *   </tr>
     *   <tr>
     *     <td>{@code \u005Cn}</td>
     *     <td>line feed</td>
     *     <td>{@code U+000A}</td>
     *   </tr>
     *   <tr>
     *     <td>{@code \u005Cf}</td>
     *     <td>form feed</td>
     *     <td>{@code U+000C}</td>
     *   </tr>
     *   <tr>
     *     <td>{@code \u005Cr}</td>
     *     <td>carriage return</td>
     *     <td>{@code U+000D}</td>
     *   </tr>
     *   <tr>
     *     <td>{@code \u005C"}</td>
     *     <td>double quote</td>
     *     <td>{@code U+0022}</td>
     *   </tr>
     *   <tr>
     *     <td>{@code \u005C'}</td>
     *     <td>single quote</td>
     *     <td>{@code U+0027}</td>
     *   </tr>
     *   <tr>
     *     <td>{@code \u005C\u005C}</td>
     *     <td>backslash</td>
     *     <td>{@code U+005C}</td>
     *   </tr>
     *   <tr>
     *     <td>{@code \u005C0 - \u005C377}</td>
     *     <td>octal escape</td>
     *     <td>code point equivalents</td>
     *   </tr>
     * </table>
     *
     * @implNote
     * This method does <em>not</em> translate Unicode escapes such as "{@code \u005cu2022}".
     * Unicode escapes are translated by the Java compiler when reading input characters and
     * are not part of the string literal specification.
     *
     * @throws IllegalArgumentException when an escape sequence is malformed.
     *
     * @return String with escape sequences translated.
     *
     * @jls 3.10.7 Escape Sequences
     *
     * @since 13
     *
     * @deprecated  This method is associated with text blocks, a preview language feature.
     *              Text blocks and/or this method may be changed or removed in a future release.
     */
    @Deprecated(forRemoval=true, since="13")
    public String translateEscapes() {
 
```
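The translation table and the note on Unicode escapes can be exercised with a short sketch. `EscapeTableDemo`, `show`, and `rejects` are hypothetical names. The assumption that a backslash followed by `u` is reported as a malformed escape reflects the `@throws` clause above and the behaviour of JDK builds I am aware of, rather than an explicit statement in the specification text.

```java
public class EscapeTableDemo {

    // Hypothetical helper: renders each code point of s in U+XXXX form.
    static String show(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    // Hypothetical helper: true if translation throws IllegalArgumentException.
    static boolean rejects(String s) {
        try {
            s.translateEscapes();
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Each entry is a backslash followed by the escape characters.
        String[] escapes = {
            "\\b", "\\t", "\\n", "\\f", "\\r",
            "\\\"", "\\'", "\\\\", "\\0", "\\101", "\\377"
        };
        for (String e : escapes) {
            System.out.println(e + " -> " + show(e.translateEscapes()));
        }

        // A Unicode escape spelled out as six characters is not translated;
        // it is assumed here to be rejected as malformed.
        System.out.println(rejects("\\" + "u2022"));
    }
}
```

Octal escapes cover U+0000 through U+00FF only, so `\101` yields U+0041 and `\377` yields U+00FF.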
Comments
Thanks for the clarification of the table's contents. Moving to Approve subject to including some example/explanation of what a Unicode escape is. Sample text (although perhaps not text that would make it through the toolchain as intended) edited in above.
30-05-2019

After trying different sequences, \u005C? was the combination that made it through the javac/HTML/markdown gauntlet.
```
Escape      Name             Unicode/action
\b          backspace        U+0008
\t          horizontal tab   U+0009
\n          line feed        U+000A
\f          form feed        U+000C
\r          carriage return  U+000D
\"          double quote     U+0022
\'          single quote     U+0027
\\          backslash        U+005C
\0 - \377   octal escape     code point equivalents
```
Added implnote in regard to Unicode escapes.
30-05-2019

I may be missing something in the multiple levels of javac + HTML escaping, but I don't see what the intended values in the "Escape" column are meant to be. I believe this method's spec should discuss why JLS 3.3 Unicode Escapes are *not* processed. Moving to Provisional.
29-05-2019

http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-May/060411.html
24-05-2019