JDK-8256441 : Add Stream.toList() method
  • Type: CSR
  • Component: core-libs
  • Sub-Component: java.util.stream
  • Priority: P3
  • Status: Closed
  • Resolution: Approved
  • Fix Versions: 16
  • Submitted: 2020-11-17
  • Updated: 2020-11-19
  • Resolved: 2020-11-19
Description
Summary
-------

There needs to be a convenient and optimized way to create a `List` from the contents of a `Stream`.

Problem
-------

Given a stream, it's possible to collect the elements into a `List` using a `Collector`:

    stream.collect(Collectors.toList())

This works, but it goes through the `Collector` interface, which accumulates elements into intermediate containers and then merges them. There is no way for a stream of known size (particularly a parallel stream) to deposit its results directly into a `List` without extra allocation and copying.

Solution
--------

Add a new method `toList()` directly to the `Stream` interface. This allows the streams machinery to deposit the results directly into a destination array, which can then be wrapped safely in an unmodifiable `List`. This avoids the extra copying and allocation, and it is also more convenient to call.
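As a rough illustration of the difference at the call site, the sketch below collects the same elements both ways and checks that the list returned by `toList()` rejects mutation. The class and method names (`ToListUsage`, `viaCollector`, `viaToList`, `rejectsMutation`) are made up for this example and are not part of the proposal; it assumes JDK 16 or later.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ToListUsage {
    // Existing approach: route the elements through a Collector.
    static List<String> viaCollector() {
        return Stream.of("a", "b", "c").collect(Collectors.toList());
    }

    // Proposed approach: ask the stream for the list directly (JDK 16+).
    static List<String> viaToList() {
        return Stream.of("a", "b", "c").toList();
    }

    // Returns true if the list throws UnsupportedOperationException on add().
    static boolean rejectsMutation(List<String> list) {
        try {
            list.add("d");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Both approaches yield the same elements in encounter order.
        System.out.println(viaCollector().equals(viaToList()));
        // The list from toList() is specified to be unmodifiable.
        System.out.println(rejectsMutation(viaToList()));
    }
}
```

Note that `Collectors.toList()` makes no guarantee about the mutability of its result, so the sketch only checks mutation on the `toList()` result.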

Although it's not specified explicitly, the streams implementation tolerates null elements in most cases. The default and optimized implementations of `toList()` tolerate nulls, but this is also not specified explicitly. A general statement about null handling in streams may be the subject of a future specification revision.
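A minimal sketch of the null tolerance described above, assuming JDK 16 or later and the built-in stream implementation. As the text says, this behavior is not specified explicitly, so the example shows what the implementation does rather than a guarantee. The class name `ToListNulls` and its methods are hypothetical.

```java
import java.util.List;
import java.util.stream.Stream;

public class ToListNulls {
    // A stream containing a null element, collected with toList().
    static List<String> withNull() {
        return Stream.of("a", null, "c").toList();
    }

    // For contrast: List.of is specified to reject null elements.
    static boolean listOfRejectsNull() {
        try {
            List.of("a", null, "c");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = withNull();
        System.out.println(list.size());         // three elements, null included
        System.out.println(list.get(1) == null); // the null is preserved in place
        System.out.println(listOfRejectsNull());
    }
}
```

The contrast with `List.of` is included only to show that an unmodifiable list is not necessarily null-hostile.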

Specification
-------------
Add the following method to the `java.util.stream.Stream` interface:

    /**
     * Accumulates the elements of this stream into a {@code List}. The elements in
     * the list will be in this stream's encounter order, if one exists. The returned List
     * is unmodifiable; calls to any mutator method will always cause
     * {@code UnsupportedOperationException} to be thrown. There are no
     * guarantees on the implementation type or serializability of the returned List.
     * 
     * <p>The returned instance may be <a href="../lang/doc-files/ValueBased.html">value-based</a>.
     * Callers should make no assumptions about the identity of the returned instances.
     * Identity-sensitive operations on these instances (reference equality ({@code ==}),
     * identity hash code, and synchronization) are unreliable and should be avoided.
     * 
     * <p>This is a <a href="package-summary.html#StreamOps">terminal operation</a>.
     * 
     * @apiNote If more control over the returned object is required, use
     * {@link Collectors#toCollection(Supplier)}.
     * 
     * @implSpec The implementation in this interface returns a List produced as if by the following:
     * <pre>{@code
     * Collections.unmodifiableList(new ArrayList<>(Arrays.asList(this.toArray())))
     * }</pre>
     * 
     * @implNote Most instances of Stream will override this method and provide an implementation
     * that is highly optimized compared to the implementation in this interface.
     * 
     * @return a List containing the stream elements
     * 
     * @since 16
     */
    default List<T> toList() { ... }
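The `@implSpec` expression can be tried outside the interface. The sketch below wraps it in a hypothetical static helper, `toListAsIfByImplSpec`, that takes the stream as a parameter instead of using `this`. It only mirrors the "as if by" expression from the specification; it is not the optimized implementation that most streams are expected to use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

public class DefaultToListSketch {
    // Hypothetical helper: applies the @implSpec expression to the given stream.
    // toArray() returns Object[], so an unchecked cast back to List<T> is needed.
    @SuppressWarnings("unchecked")
    static <T> List<T> toListAsIfByImplSpec(Stream<T> stream) {
        return (List<T>) Collections.unmodifiableList(
                new ArrayList<>(Arrays.asList(stream.toArray())));
    }

    // Returns true if the list throws UnsupportedOperationException on add().
    static <T> boolean rejectsMutation(List<T> list, T element) {
        try {
            list.add(element);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> list = toListAsIfByImplSpec(Stream.of(1, 2, 3));
        System.out.println(list);                     // elements in encounter order
        System.out.println(rejectsMutation(list, 4)); // wrapper blocks mutation
    }
}
```

The copy into a new `ArrayList` matters here: `Arrays.asList` returns a view of the array, so without the copy the list would stay tied to an array that the stream implementation could still hold.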

Comments
Thanks. Regarding the typical wording "the default implementation...": an earlier draft of this proposal had that wording, but it drew complaints because it implied that a typical stream pipeline such as Stream.of("a", "b", "c").toList() would invoke the default implementation. This was considered misleading. The problem is that "default method" implies that it's invoked unless the caller does something special. That's not the case. What will almost always get invoked is an optimized internal JDK implementation. The actual "default method" (that is, the implementation defined in this interface) will be invoked only if the caller is using a third-party stream extension that doesn't override this method; this should be quite rare. It would be good to come up with standardized wording for this concept, such as "the implementation in this interface" or "the implementation of this method defined in this interface", and to use it consistently throughout the JDK. Typical wording such as "the default implementation" or "this implementation" is confusing or ambiguous.
19-11-2020

Moving to Approved. A code review comment for consideration: for starting the implSpec text consider "The default implementation..." That is one of the stock phrases used elsewhere in the base module docs, but that usage isn't entirely consistent. No need to re-review the CSR if updated to that or similar wording.
19-11-2020