JDK-8368232 : Improve robustness of String constructors with mutable array inputs
  • Type: CSR
  • Component: core-libs
  • Sub-Component: java.lang
  • Priority: P4
  • Status: Draft
  • Resolution: Unresolved
  • Fix Versions: 21-pool
  • Submitted: 2025-09-22
  • Updated: 2025-10-03
Description
Summary
-------
JDK implementation-specific change that hardens the String constructors against input data that is modified while the String is being constructed.

Problem
-------
Strings, after construction, are immutable but may be constructed from mutable arrays of bytes, characters, or integers.
The String constructors should guard against mutation of those arrays during construction, because such mutation can invalidate internal invariants that operations on the resulting strings rely on. In particular, a number of operations have optimized paths for pairs of Latin1 strings and for pairs of non-Latin1 strings, while operations that mix Latin1 and non-Latin1 strings use a more general implementation.
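
As a minimal illustration of the immutability guarantee described above, the sketch below builds a String from a char array and then mutates the array. The class and method names (`MutableInputDemo`, `buildThenMutate`, `isLatin1`) are hypothetical and are not part of the JDK. The mutation here happens after construction, so the result is deterministic; the race this CSR addresses is a mutation during construction, which cannot be reproduced reliably in a short example.

```java
final class MutableInputDemo {

    /** Returns true if every char of s fits in Latin1 (value <= 0xFF). */
    static boolean isLatin1(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    /** Builds a String from the array, then overwrites the first element. */
    static String buildThenMutate(char[] input) {
        String s = new String(input); // the constructor copies the array contents
        input[0] = 'X';               // a later change to the array does not reach s
        return s;
    }
}
```

The returned string keeps the characters it was constructed with, while the caller's array reflects the overwrite.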


Solution
--------
This is a JDK implementation-specific change that hardens the String constructor implementation against input data that is modified during construction. It ensures that any string identified as non-Latin1 contains at least one non-Latin1 character.

For Latin1 inputs (whether the arrays are encoded in ASCII, ISO-8859-1, UTF-8, or any other encoding that decodes to Latin1), the scanning and compression processes remain unchanged.

If a non-Latin1 character is detected, the string is flagged as non-Latin1, with the added verification that a non-Latin1 character exists at the same index. If that character turns out to be Latin1, the input array has been modified and the scan result may be incorrect. A ConcurrentModificationException could be thrown in that case, but introducing the risk of an unexpected exception into existing applications is undesirable. Instead, the non-Latin1 version of the input is re-scanned and compressed, and the outcome of that scan determines whether the string is returned in Latin1 or non-Latin1 form.
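
The verify-and-re-scan idea described above can be sketched roughly as follows. This is not the JDK implementation: `HardenedCompressSketch`, `Result`, `compress`, `inflate`, `construct`, and `resolve` are hypothetical names, and the sketch assumes that the "non-Latin1 version of the input" is a private copy of the char array taken after the first scan stops.

```java
final class HardenedCompressSketch {

    /** Outcome of the sketch: the chosen form plus the characters stored. */
    static final class Result {
        final boolean latin1;
        final char[] chars;

        Result(boolean latin1, char[] chars) {
            this.latin1 = latin1;
            this.chars = chars;
        }
    }

    /** Copies chars into dst as single bytes; returns the index of the first char above 0xFF, or src.length. */
    static int compress(char[] src, byte[] dst) {
        for (int i = 0; i < src.length; i++) {
            char c = src[i]; // each element is read exactly once
            if (c > 0xFF) {
                return i;
            }
            dst[i] = (byte) c;
        }
        return src.length;
    }

    /** Expands Latin1 bytes back to chars. */
    static char[] inflate(byte[] src) {
        char[] out = new char[src.length];
        for (int i = 0; i < src.length; i++) {
            out[i] = (char) (src[i] & 0xFF);
        }
        return out;
    }

    /** Scans the (possibly shared) input, falling back to a private copy if needed. */
    static Result construct(char[] input) {
        byte[] latin1 = new byte[input.length];
        int stop = compress(input, latin1);
        if (stop == input.length) {
            return new Result(true, inflate(latin1)); // unchanged Latin1 path
        }
        // Assumption: take a private snapshot so later mutation cannot affect it.
        return resolve(input.clone(), stop);
    }

    /** Decides the final form from a private snapshot and the index where the first scan stopped. */
    static Result resolve(char[] snapshot, int stop) {
        if (snapshot[stop] > 0xFF) {
            return new Result(false, snapshot); // verified: a non-Latin1 char is at that index
        }
        // The char at 'stop' is now Latin1, so the input was modified: re-scan the snapshot.
        byte[] latin1 = new byte[snapshot.length];
        int again = compress(snapshot, latin1);
        return again == snapshot.length
                ? new Result(true, inflate(latin1))
                : new Result(false, snapshot);
    }
}
```

Calling `resolve` directly with an all-Latin1 snapshot and a stop index simulates the modified-input case without needing a second thread. Because the re-scan runs on a private copy, its result cannot be invalidated by further changes to the caller's array.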

Specification
-------------
No Specification changes
Comments
[~rriggs], no; that is not how the process works by default: "...Afterward, if a backport of the main bug covering JDK (N-1) does not already exist, a backport of the main bug covering JDK (N-1) should be created. Then, a CSR can be created from that backport. The CSR for the backport should explicitly state how the interface change for the backport relates to the interface change for the main release: either the interface change is the same or, if it differs, what the difference is." https://wiki.openjdk.org/display/csr/CSR+FAQs
25-09-2025

Note: This is the same as the original CSR (JDK-8319228).
22-09-2025