JDK-8306927 : Collator treats "v" and "w" as the same letter for Swedish language locale.
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util:i18n
  • Affected Version: 8,11,17,20,21
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • OS: generic
  • CPU: generic
  • Submitted: 2023-04-18
  • Updated: 2023-04-26
Description
A DESCRIPTION OF THE PROBLEM :
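The Swedish sorting rule for "v" and "w" was changed in 2006, but the JDK's Swedish collation rules have not been updated, so Collator still orders the two letters as if they were the same.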
From https://en.wikipedia.org/wiki/Swedish_alphabet : "The two letters were often combined in the collating sequence as if they were all V or all W, until 2006 when the 13th edition of Svenska Akademiens ordlista (The Swedish Academy's Orthographic Dictionary) declared a change. W was given its own section in the dictionary, and the W = V sorting rule was deprecated."

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Compile and run the code provided below.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
List sorted in correct order: va, vc, wb.
ACTUAL -
List sorted in incorrect order: va, wb, vc.

---------- BEGIN SOURCE ----------
import java.text.Collator;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class Main {
    public static void main(String[] args) {
        // Explicit type instead of 'var' so the reproducer also compiles on JDK 8.
        List<String> list = Arrays.asList("wb", "va", "vc");
        // Sort in place with the collator for Swedish (Sweden).
        list.sort(Collator.getInstance(new Locale("sv", "SE")));
        // Incorrect order: va, wb, vc. It should be: va, vc, wb.
        System.out.println("Sorted: " + list);
    }
}
---------- END SOURCE ----------

FREQUENCY : always
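
POSSIBLE WORKAROUND (sketch, not part of the original report) :
Until the JDK rules are updated, an application could build its own collator by appending a tailoring to the rules of the Swedish collator. The sketch below assumes that Collator.getInstance returns a RuleBasedCollator for this locale (it falls back to the unmodified collator otherwise) and that the rule "& V < w , W" is an acceptable tailoring; the class name SwedishVwWorkaround and its method names are hypothetical.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class SwedishVwWorkaround {
    // Assumed tailoring: place "w" after "V" as a primary (letter-level)
    // difference, with "W" differing from "w" only by case.
    static final String W_AFTER_V = "& V < w , W";

    // Returns a Swedish collator with the tailoring appended to its rules.
    static Collator tailoredSwedishCollator() {
        Collator base = Collator.getInstance(new Locale("sv", "SE"));
        if (!(base instanceof RuleBasedCollator)) {
            // Assumption does not hold on this runtime: return the collator unchanged.
            return base;
        }
        try {
            String rules = ((RuleBasedCollator) base).getRules() + W_AFTER_V;
            RuleBasedCollator tailored = new RuleBasedCollator(rules);
            // Carry over the settings of the original collator.
            tailored.setStrength(base.getStrength());
            tailored.setDecomposition(base.getDecomposition());
            return tailored;
        } catch (ParseException e) {
            throw new IllegalStateException("Tailored rules could not be parsed", e);
        }
    }

    // Sorts a copy of the input so the caller's list is left untouched.
    static List<String> sortSwedish(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(tailoredSwedishCollator());
        return copy;
    }

    public static void main(String[] args) {
        System.out.println("Sorted: " + sortSwedish(Arrays.asList("wb", "va", "vc")));
    }
}
```

Appending a rule relies on later rules overriding earlier ones for the same character. This only changes ordering for collators created through this helper; code that calls Collator.getInstance directly is unaffected.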



Comments
Observations on Windows 10 (Failed = the issue is reproduced): JDK 8: Failed, va, wb, vc returned. JDK 11: Failed. JDK 17: Failed. JDK 20: Failed. JDK 21ea+5: Failed.
26-04-2023
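
To narrow down observations like the ones above, it can help to check at which collation strength a collator starts to distinguish two strings. The following is a minimal sketch; the class name CollationStrengthProbe and the method weakestDistinguishingStrength are hypothetical helpers, not JDK API. If the description of this issue is accurate, an affected JDK would be expected to report something weaker than PRIMARY for "v" and "w" in the Swedish locale; that expectation is inferred from the report and is not asserted by the code.

```java
import java.text.Collator;
import java.util.Locale;

class CollationStrengthProbe {
    // Returns the name of the weakest strength at which the collator
    // reports a difference between a and b, or "NONE" if it never does.
    static String weakestDistinguishingStrength(Collator base, String a, String b) {
        // Work on a clone so the caller's collator keeps its strength setting.
        Collator probe = (Collator) base.clone();
        int[] strengths = {
            Collator.PRIMARY, Collator.SECONDARY, Collator.TERTIARY, Collator.IDENTICAL
        };
        String[] names = {"PRIMARY", "SECONDARY", "TERTIARY", "IDENTICAL"};
        for (int i = 0; i < strengths.length; i++) {
            probe.setStrength(strengths[i]);
            if (probe.compare(a, b) != 0) {
                return names[i];
            }
        }
        return "NONE";
    }

    public static void main(String[] args) {
        Collator swedish = Collator.getInstance(new Locale("sv", "SE"));
        System.out.println("v vs w first differ at: "
                + weakestDistinguishingStrength(swedish, "v", "w"));
        System.out.println("v vs x first differ at: "
                + weakestDistinguishingStrength(swedish, "v", "x"));
    }
}
```

A result of PRIMARY means the two strings are treated as different letters; anything weaker means the difference only acts as a tie-breaker, which matches the ordering va, wb, vc reported in this issue.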