| JDK 11 | JDK 17 | JDK 21 | JDK 22 | JDK 8 |
|---|---|---|---|---|
| 11.0.31-oracle (Unresolved) | 17.0.19-oracle (Unresolved) | 21.0.10-oracle (Unresolved) | 22 b27 (Fixed) | 8u481 (Resolved) |
A DESCRIPTION OF THE PROBLEM :
A race condition in the String constructor taking a char[] (and probably in other constructors too) allows creating a String with an incorrect coder: a String that contains only Latin-1 characters but is still encoded as UTF-16.
This happens because the content of the passed-in array may change between the constructor checking whether the content can be encoded as Latin-1 and the constructor encoding it as UTF-16.
See https://wouter.coekaerts.be/2023/breaking-string
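The check-then-encode window described above can be sketched deterministically. The following is a simplified model, not the JDK source: the class CoderRaceSketch, the Encoded holder, and the raceWindow hook are all hypothetical names introduced for illustration, and the hook stands in for a second thread that modifies the array at the worst possible moment. The safeEncode variant shows one possible defensive approach (snapshot the array once, then decide); it is an assumption for comparison and not necessarily how the JDK fix works.

```java
public class CoderRaceSketch {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    /** Hypothetical stand-in for a String's internal coder and content. */
    static final class Encoded {
        final byte coder;
        final char[] content;

        Encoded(byte coder, char[] content) {
            this.coder = coder;
            this.content = content;
        }

        /** The coder must be LATIN1 exactly when every char fits in one byte. */
        boolean isConsistent() {
            return (coder == LATIN1) == fitsLatin1(content);
        }
    }

    static boolean fitsLatin1(char[] chars) {
        for (char c : chars) {
            if (c > 0xFF) {
                return false;
            }
        }
        return true;
    }

    /** Racy model: reads the shared array once to check and again to copy. */
    static Encoded racyEncode(char[] source, Runnable raceWindow) {
        if (fitsLatin1(source)) {                     // first read: the check
            return new Encoded(LATIN1, source.clone());
        }
        raceWindow.run();                             // another thread may write here
        return new Encoded(UTF16, source.clone());    // second read: the copy
    }

    /** Defensive model (an assumption): take one snapshot, then decide on it. */
    static Encoded safeEncode(char[] source, Runnable raceWindow) {
        char[] snapshot = source.clone();             // the only read of the shared array
        raceWindow.run();                             // later writes cannot affect snapshot
        byte coder = fitsLatin1(snapshot) ? LATIN1 : UTF16;
        return new Encoded(coder, snapshot);
    }

    public static void main(String[] args) {
        // 'f' with bit 8 set (0x166) is not encodable as Latin-1.
        char[] chars = {'\u0166', 'o', 'o'};
        Runnable flip = () -> chars[0] ^= 256;        // simulated concurrent write

        Encoded racy = racyEncode(chars, flip);
        System.out.println("racy consistent: " + racy.isConsistent());
        // => racy consistent: false

        chars[0] = '\u0166';                          // reset for the second run
        Encoded safe = safeEncode(chars, flip);
        System.out.println("safe consistent: " + safe.isConsistent());
        // => safe consistent: true
    }
}
```

In the racy model the result holds "foo" but is tagged UTF16, which mirrors the inconsistent state this report describes.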
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Concurrently modify the char[] passed into the String constructor. See example code.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
A String where .equals and other methods behave correctly
ACTUAL -
A String where .equals and other methods are inconsistent with its contents
---------- BEGIN SOURCE ----------
import java.util.Arrays;

// The class name is arbitrary; it only makes the example compile as one file.
public class BreakString {
  /**
   * Given a latin-1 String, creates a copy that is
   * incorrectly encoded as UTF-16.
   */
  static String breakIt(String original) {
    // latin-1 covers char values 0..255 only
    if (original.chars().max().orElseThrow() > 255) {
      throw new IllegalArgumentException(
          "Can only break latin-1 Strings");
    }
    char[] chars = original.toCharArray();
    // in another thread, flip the first character back
    // and forth between being encodable as latin-1 or not
    Thread thread = new Thread(() -> {
      while (!Thread.interrupted()) {
        chars[0] ^= 256;
      }
    });
    thread.start();
    // at the same time call the String constructor,
    // until we hit the race condition
    while (true) {
      String s = new String(chars);
      if (s.charAt(0) < 256 && !original.equals(s)) {
        thread.interrupt();
        return s;
      }
    }
  }

  public static void main(String[] args) {
    String a = "foo";
    String b = breakIt(a);
    // they are not equal to each other
    System.out.println(a.equals(b));
    // => false
    // they do contain the same series of characters
    System.out.println(Arrays.equals(a.toCharArray(),
        b.toCharArray()));
    // => true
  }
}
---------- END SOURCE ----------
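The inconsistency shown in the example can also be detected from the outside. The sketch below is a hypothetical helper (the names CoderCheck and looksBroken are not part of any JDK API): it rebuilds a String from a private copy of its own characters, which no other thread can modify, and compares the two. For a String in the state described in this report, equals would be expected to return false against the rebuilt copy, since the two hold the same characters under different coders. For ordinary Strings the helper returns false.

```java
public class CoderCheck {
    /**
     * Hypothetical helper: returns true if s is not equal to a String
     * rebuilt from a private copy of its own characters.
     */
    static boolean looksBroken(String s) {
        // toCharArray() returns a fresh array that only this method can see,
        // so the constructor call below cannot race with another writer.
        String rebuilt = new String(s.toCharArray());
        return !s.equals(rebuilt);
    }

    public static void main(String[] args) {
        System.out.println(looksBroken("foo"));
        // => false
        System.out.println(looksBroken("f\u0166o"));
        // => false
    }
}
```

Passing the result of breakIt from the example to looksBroken should report true on an affected JDK.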
FREQUENCY : always