JDK-8236208 : DocTrees should provide getCharacters(EntityTree)
  • Type: CSR
  • Component: tools
  • Sub-Component: javadoc(tool)
  • Priority: P3
  • Status: Closed
  • Resolution: Approved
  • Fix Versions: 16
  • Submitted: 2019-12-18
  • Updated: 2020-09-10
  • Resolved: 2020-09-10
Description
Summary
-------

In the Compiler Tree API (`com.sun.source.**`), a new method is added to the [DocTrees] class
to obtain the characters represented by an instance of [EntityTree].

[DocTrees]: https://docs.oracle.com/en/java/javase/13/docs/api/jdk.compiler/com/sun/source/util/DocTrees.html
[EntityTree]: https://docs.oracle.com/en/java/javase/13/docs/api/jdk.compiler/com/sun/source/doctree/EntityTree.html

Problem
-------

The `com.sun.source.doctree.EntityTree` interface provides
a representation for the entities that may be found while parsing
a documentation comment, but there is no public way to 
determine the characters represented by the entity.
There is support internally within the `doclint` utility, but this
support is not generally available, and it would be convenient if it 
were.

Solution
--------

Add a new method to the `DocTrees` utility class to return
the characters represented by an entity.

Note that a single entity may represent more than one Java `char`,
because of the need for either surrogate pairs or combining characters.
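
The point about multiple characters can be illustrated with plain strings. The following is a minimal sketch that does not call the new method; it only builds the strings that three well-known HTML entities stand for and reports their lengths in `char` units. The class name `EntityLengthSketch` and the helper `fromCodePoints` are hypothetical names introduced for this illustration.

```java
// Sketch: how many Java chars a single HTML entity can expand to.
// EntityLengthSketch and fromCodePoints are hypothetical, for illustration only.
class EntityLengthSketch {

    // Builds a String from the Unicode code points an entity stands for.
    static String fromCodePoints(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args) {
        // &amp; is U+0026: one code point, one char.
        String amp = fromCodePoints(0x26);
        // &#x1F600; lies outside the Basic Multilingual Plane,
        // so it is stored as a surrogate pair.
        String grin = fromCodePoints(0x1F600);
        // &NotEqualTilde; is defined by HTML as U+2242 followed by
        // the combining character U+0338.
        String notEqualTilde = fromCodePoints(0x2242, 0x0338);

        System.out.println(amp.length());            // 1
        System.out.println(grin.length());           // 2
        System.out.println(notEqualTilde.length());  // 2
    }
}
```

This is why a `String` return type is a better fit than `char` or a single code point.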

Specification
-------------

The following method is added to the `DocTrees` utility class:

    /**
     * Returns a string containing the characters for the entity represented in a given entity tree,
     * or {@code null} if the entity does not represent a series of characters.
     *
     * <p>The interpretation of entities is based on section
     * <a href="https://www.w3.org/TR/html52/syntax.html#character-references">8.1.4. Character references</a>
     * in the HTML 5.2 specification.</p>
     *
     * @param tree the tree containing the entity
     * @return a string containing the characters
     */
    public abstract String getCharacters(EntityTree tree);

The definition of the method specifically references a section within the HTML 5.2 specification. It is expected that this reference may be updated in future releases to newer versions of the HTML specification.  This is similar to the way that the specification for `java.lang.Character` may be updated to refer to newer versions of the Unicode specification.
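
As a rough sketch of how a client might call the method specified above, the following parses an in-memory source file, walks the documentation comment of each class, and records the name of every entity together with the result of `getCharacters`. It assumes JDK 16 or later and a full JDK (so that `ToolProvider.getSystemJavaCompiler()` is non-null). The class name `EntityCharactersSketch`, the helper `entityCharacters`, and the sample source `Demo` are hypothetical and only serve the illustration.

```java
import com.sun.source.doctree.DocCommentTree;
import com.sun.source.doctree.EntityTree;
import com.sun.source.tree.ClassTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.util.DocTreeScanner;
import com.sun.source.util.DocTrees;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreePathScanner;
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

// Hypothetical helper class, not part of the JDK.
class EntityCharactersSketch {

    // Returns "name=characters" for each entity found in class-level doc comments.
    static List<String> entityCharacters(String source) throws IOException {
        // Wrap the source text in an in-memory file object.
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///Demo.java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(file));
        DocTrees trees = DocTrees.instance(task);
        List<String> found = new ArrayList<>();

        // Parsing alone is enough to reach the doc comments.
        for (CompilationUnitTree unit : task.parse()) {
            new TreePathScanner<Void, Void>() {
                @Override
                public Void visitClass(ClassTree node, Void unused) {
                    DocCommentTree comment = trees.getDocCommentTree(getCurrentPath());
                    if (comment != null) {
                        new DocTreeScanner<Void, Void>() {
                            @Override
                            public Void visitEntity(EntityTree entity, Void ignored) {
                                // getCharacters may return null for an unknown entity.
                                found.add(entity.getName() + "=" + trees.getCharacters(entity));
                                return null;
                            }
                        }.scan(comment, null);
                    }
                    return super.visitClass(node, unused);
                }
            }.scan(unit, null);
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        String source = "/** Uses &lt;T&gt; and &amp;. */\nclass Demo {}\n";
        System.out.println(entityCharacters(source));
    }
}
```

Callers should be prepared for a `null` result, since the specification allows it when the entity does not represent a series of characters.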
Comments
In other circumstances, we'd add the moral equivalent of a default method in a case like this, but I don't think that is necessary here given the limited usage of this class. Moving to Approved.
10-09-2020