JDK 22 |
---|
22 b27 Fixed |
A DESCRIPTION OF THE PROBLEM :
A race condition in the String constructor taking a char[] (and probably other constructors too) allows creating a String with an incorrect coder: a String containing only Latin-1 characters, but still encoded as UTF-16. This happens because the content of the passed-in array may change between the constructor checking whether the content can be encoded as Latin-1 and the constructor encoding it as UTF-16.

See https://wouter.coekaerts.be/2023/breaking-string

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Concurrently modify the char[] passed into the String constructor. See example code.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
A String where .equals and other methods behave correctly
ACTUAL -
A String where .equals and other methods are inconsistent with its contents

---------- BEGIN SOURCE ----------
import java.util.Arrays;

/**
 * Given a latin-1 String, creates a copy that is
 * incorrectly encoded as UTF-16.
 */
static String breakIt(String original) {
  // latin-1 covers code points 0 to 255 inclusive
  if (original.chars().max().orElseThrow() > 255) {
    throw new IllegalArgumentException(
        "Can only break latin-1 Strings");
  }

  char[] chars = original.toCharArray();

  // in another thread, flip the first character back
  // and forth between being encodable as latin-1 or not
  Thread thread = new Thread(() -> {
    while (!Thread.interrupted()) {
      chars[0] ^= 256;
    }
  });
  thread.start();

  // at the same time call the String constructor,
  // until we hit the race condition
  while (true) {
    String s = new String(chars);
    // a latin-1 first character means the content matches the
    // original; if equals still fails, the coder must be wrong
    if (s.charAt(0) < 256 && !original.equals(s)) {
      thread.interrupt();
      return s;
    }
  }
}

String a = "foo";
String b = breakIt(a);

// they are not equal to each other
System.out.println(a.equals(b));
// => false

// they do contain the same series of characters
System.out.println(Arrays.equals(a.toCharArray(), b.toCharArray()));
// => true
---------- END SOURCE ----------

FREQUENCY : always
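The window between the check and the encoding can be shown deterministically with a simplified model. The sketch below is not the JDK's implementation. `CoderRaceModel`, `Encoded`, `encode`, and the `Runnable` hook standing in for the concurrent writer are hypothetical names introduced for illustration. The model also assumes that equality compares the coder flag before comparing the bytes.

```java
import java.util.Arrays;

/** Simplified model of a compact string; all names here are illustrative. */
public class CoderRaceModel {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    /** A coder flag plus the encoded bytes. */
    record Encoded(byte coder, byte[] value) {
        /** Equal only if both the coder and the bytes match (assumed model). */
        boolean sameAs(Encoded other) {
            return coder == other.coder && Arrays.equals(value, other.value);
        }

        /** Decodes the bytes back into chars according to the coder. */
        char[] toChars() {
            if (coder == LATIN1) {
                char[] out = new char[value.length];
                for (int i = 0; i < out.length; i++) {
                    out[i] = (char) (value[i] & 0xFF);
                }
                return out;
            }
            char[] out = new char[value.length / 2];
            for (int i = 0; i < out.length; i++) {
                out[i] = (char) (((value[2 * i] & 0xFF) << 8) | (value[2 * i + 1] & 0xFF));
            }
            return out;
        }
    }

    static boolean fitsLatin1(char[] chars) {
        for (char c : chars) {
            if (c > 0xFF) {
                return false;
            }
        }
        return true;
    }

    /** Check-then-encode; the hook runs in the gap where a concurrent writer could act. */
    static Encoded encode(char[] chars, Runnable betweenCheckAndEncode) {
        boolean latin1 = fitsLatin1(chars);   // step 1: pick a coder from the current content
        betweenCheckAndEncode.run();          // the array may change here
        if (latin1) {                         // step 2: encode whatever the array holds now
            byte[] value = new byte[chars.length];
            for (int i = 0; i < chars.length; i++) {
                value[i] = (byte) chars[i];
            }
            return new Encoded(LATIN1, value);
        }
        byte[] value = new byte[chars.length * 2];
        for (int i = 0; i < chars.length; i++) {
            value[2 * i] = (byte) (chars[i] >> 8);
            value[2 * i + 1] = (byte) chars[i];
        }
        return new Encoded(UTF16, value);
    }

    public static void main(String[] args) {
        char[] chars = "foo".toCharArray();
        chars[0] ^= 256;                                        // 'f' becomes U+0166, not Latin-1
        Encoded broken = encode(chars, () -> chars[0] ^= 256);  // flipped back before encoding
        Encoded normal = encode("foo".toCharArray(), () -> { });

        System.out.println(broken.coder() == UTF16);                            // true
        System.out.println(Arrays.equals(normal.toChars(), broken.toChars()));  // true
        System.out.println(normal.sameAs(broken));                              // false
    }
}
```

In this model, the broken value decodes to the same characters as the normal one but carries the UTF-16 coder, so a comparison that looks at the coder first reports the two as different. This mirrors the behavior reported in the example code.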