JDK-8267069 : Update Hebrew/Indonesian/Yiddish ISO 639 language codes to current
  • Type: CSR
  • Component: core-libs
  • Sub-Component: java.util:i18n
  • Priority: P4
  • Status: Closed
  • Resolution: Approved
  • Fix Versions: 17
  • Submitted: 2021-05-12
  • Updated: 2025-07-23
  • Resolved: 2021-05-26
Description
Summary
-------

Change the `Locale` class so that it maps the obsolete ISO 639 language codes to their current codes, rather than the other way around.

Problem
-------

Historically, the constructors in the `java.util.Locale` class have mapped three current ISO 639 language codes, "he", "yi", and "id", to their obsolete forms, "iw", "ji", and "in", for backward compatibility. Although this lets the constructors accept both obsolete and current ISO 639 codes, the constructed `Locale` object represents the obsolete language code (i.e. `Locale.getLanguage()` and `Locale.toString()` return the obsolete codes), which makes it look as if the current language codes were not supported.

Solution
--------

Flip the mapping from current -> obsolete to obsolete -> current. For example, the mapping for Hebrew changes from "he" -> "iw" to "iw" -> "he". To preserve the previous behavior for applications that depend on it, a new system property, `java.locale.useOldISOCodes`, will be introduced. If the value of the system property is `true`, the `Locale` class behaves in the backward-compatible manner, mapping the current codes to the obsolete ones. `java.util.ResourceBundle.Control#newBundle()` is also modified to try both the obsolete and the current bundle names when needed, giving priority to the requested name.
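
The flipped mapping can be illustrated with a small model. This is only a sketch of the behavior described in this CSR, not the JDK's internal code: the class `LegacyCodeMapping` and the helper `mapLanguage` are hypothetical names, and reading the property with `Boolean.getBoolean` is an assumption about how `true` is interpreted. The `main` method prints the model's answer next to what the running JDK reports, so the two can be compared on a given release.

```java
import java.util.Locale;

final class LegacyCodeMapping {
    // Obsolete -> current pairs named in this CSR.
    private static final String[][] PAIRS = {{"iw", "he"}, {"ji", "yi"}, {"in", "id"}};

    /**
     * Hypothetical model of the mapping after this change.
     * With useOldISOCodes == false, obsolete codes map to current ones;
     * with useOldISOCodes == true, current codes map to obsolete ones.
     */
    static String mapLanguage(String code, boolean useOldISOCodes) {
        for (String[] pair : PAIRS) {
            String obsolete = pair[0];
            String current = pair[1];
            if (code.equals(obsolete) || code.equals(current)) {
                return useOldISOCodes ? obsolete : current;
            }
        }
        return code; // all other language codes are unaffected
    }

    public static void main(String[] args) {
        // Assumption: the property is treated as a boolean flag.
        boolean useOld = Boolean.getBoolean("java.locale.useOldISOCodes");
        for (String code : new String[] {"he", "iw", "yi", "ji", "id", "in"}) {
            String runtime = Locale.forLanguageTag(code).getLanguage();
            System.out.println(code + " -> model: " + mapLanguage(code, useOld)
                    + ", runtime: " + runtime);
        }
    }
}
```

Running it with `-Djava.locale.useOldISOCodes=true` on the command line should show the backward-compatible codes. On releases before 17 the runtime column is expected to show the obsolete codes regardless of the property, since the model reflects only the post-change behavior.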

Specification
-------------

Change the relevant part of the class description of `java.util.Locale` as follows:

       *
       * <p>During deserialization, readResolve adds extensions as described
       * in <a href="#special_cases_constructor">Special Cases</a>, only
       * for the two cases th_TH_TH and ja_JP_JP.
       *
    -  * <h4>Legacy language codes</h4>
    +  * <h4><a id="legacy_language_codes">Legacy language codes</a></h4>
       *
       * <p>Locale's constructor has always converted three language codes to
       * their earlier, obsoleted forms: {@code he} maps to {@code iw},
       * {@code yi} maps to {@code ji}, and {@code id} maps to
    -  * {@code in}.  This continues to be the case, in order to not break
    -  * backwards compatibility.
    +  * {@code in}. Since Java SE 17, this is no longer the case. Each
    +  * language maps to its new form; {@code iw} maps to {@code he}, {@code ji}
    +  * maps to {@code yi}, and {@code in} maps to {@code id}.
    +  *
    +  * <p>For the backward compatible behavior, the system property
    +  * {@systemProperty java.locale.useOldISOCodes} reverts the behavior
    +  * back to prior to Java SE 17 one. If the system property is set
    +  * to {@code true}, those three current language codes are mapped to their
    +  * backward compatible forms.
       *
       * <p>The APIs added in 1.7 map between the old and new language codes,
    -  * maintaining the old codes internal to Locale (so that
    -  * {@code getLanguage} and {@code toString} reflect the old
    -  * code), but using the new codes in the BCP 47 language tag APIs (so
    +  * maintaining the mapped codes internal to Locale (so that
    +  * {@code getLanguage} and {@code toString} reflect the mapped
    +  * code, which depends on the {@code java.locale.useOldISOCodes} system
    +  * property), but using the new codes in the BCP 47 language tag APIs (so
       * that {@code toLanguageTag} reflects the new one). This


Change the method description of each constructor in the `Locale` class as follows:

          /**
           * Construct a locale from language and country.
           * This constructor normalizes the language value to lowercase and
           * the country value to uppercase.
    -      * <p>
    -      * <b>Note:</b>
    +      * @implNote
           * <ul>
    -      * <li>ISO 639 is not a stable standard; some of the language codes it defines
    -      * (specifically "iw", "ji", and "in") have changed.  This constructor accepts both the
    -      * old codes ("iw", "ji", and "in") and the new codes ("he", "yi", and "id"), but all other
    -      * API on Locale will return only the OLD codes.
    +      * <li>Obsolete ISO 639 codes ("iw", "ji", and "in") are mapped to
    +      * their current forms. See <a href="#legacy_language_codes">Legacy language
    +      * codes</a> for more information.
           * <li>For backward compatibility reasons, this constructor does not make
           * any syntactic checks on the input.
           * </ul>

Change the method description of `Locale#getLanguage()` as follows:

      
         /**
          * Returns the language code of this Locale.
          *
    -     * <p><b>Note:</b> ISO 639 is not a stable standard&mdash; some languages' codes have changed.
    -     * Locale's constructor recognizes both the new and the old codes for the languages
    -     * whose codes have changed, but this function always returns the old code.  If you
    -     * want to check for a specific language whose code has changed, don't do
    -     * <pre>
    -     * if (locale.getLanguage().equals("he")) // BAD!
    -     *    ...
    -     * </pre>
    -     * Instead, do
    -     * <pre>
    -     * if (locale.getLanguage().equals(new Locale("he").getLanguage()))
    -     *    ...
    -     * </pre>
    +     * @implNote This method returns the new forms for the obsolete ISO 639
    +     * codes ("iw", "ji", and "in"). See <a href="#legacy_language_codes">
    +     * Legacy language codes</a> for more information.
    +     *
          * @return The language code, or the empty string if none is defined.
          * @see #getDisplayLanguage
          */
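
Code that compares `getLanguage()` against a hard-coded string is sensitive to which form the runtime returns. A comparison that lets the runtime normalize both sides, in the spirit of the idiom from the removed note, should keep working before and after this change and with either setting of `java.locale.useOldISOCodes`. The sketch below uses `Locale.forLanguageTag`, which, per the `forLanguageTag` specification, performs the same canonicalization as the constructors. The class `LanguageCheck` and the helper `sameLanguage` are hypothetical names.

```java
import java.util.Locale;

final class LanguageCheck {
    /**
     * Returns true if the locale's language equals the given code after
     * the runtime has normalized both to the same (old or new) form.
     */
    static boolean sameLanguage(Locale locale, String languageCode) {
        String normalized = Locale.forLanguageTag(languageCode).getLanguage();
        return locale.getLanguage().equals(normalized);
    }

    public static void main(String[] args) {
        Locale hebrewIsrael = Locale.forLanguageTag("iw-IL");
        // Both spellings of the code identify the same language.
        System.out.println(sameLanguage(hebrewIsrael, "he"));
        System.out.println(sameLanguage(hebrewIsrael, "iw"));
        System.out.println(sameLanguage(Locale.FRENCH, "he"));
    }
}
```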

Change the method description of `Locale#forLanguageTag()` as follows:

           *
           * <p>The following <b>conversions</b> are performed:<ul>
           *
           * <li>The language code "und" is mapped to language "".
           *
    -      * <li>The language codes "he", "yi", and "id" are mapped to "iw",
    -      * "ji", and "in" respectively. (This is the same canonicalization
    -      * that's done in Locale's constructors.)
    +      * <li>The language codes "iw", "ji", and "in" are mapped to "he",
    +      * "yi", and "id" respectively. (This is the same canonicalization
    +      * that's done in Locale's constructors.) See
    +      * <a href="#legacy_language_codes">Legacy language codes</a>
    +      * for more information.
           *
           * <li>The portion of a private use subtag prefixed by "lvariant",
           * if any, is removed and appended to the variant field in the
           * result locale (without case normalization).  If it is then
           * empty, the private use subtag is discarded:
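
Because the class description states that `toLanguageTag` reflects the new codes, a round trip through the BCP 47 APIs is one way to obtain the current code without depending on the internal form. The following is a minimal sketch. The class `LanguageTagDemo` and the helper `canonicalTag` are hypothetical names.

```java
import java.util.Locale;

final class LanguageTagDemo {
    /** Returns the BCP 47 tag the runtime produces for the given input tag. */
    static String canonicalTag(String tag) {
        return Locale.forLanguageTag(tag).toLanguageTag();
    }

    public static void main(String[] args) {
        // toLanguageTag reports the current codes for all three languages.
        System.out.println(canonicalTag("iw"));    // he
        System.out.println(canonicalTag("ji"));    // yi
        System.out.println(canonicalTag("in-ID")); // id-ID

        // getLanguage, by contrast, reports the mapped internal form, which
        // depends on the release and on java.locale.useOldISOCodes.
        System.out.println(Locale.forLanguageTag("iw").getLanguage());
    }
}
```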

Add the following list item to the method description of `java.util.ResourceBundle.Control#newBundle()`:

               *
               * <li>If {@code format} is neither {@code "java.class"}
               * nor {@code "java.properties"}, an
               * {@code IllegalArgumentException} is thrown.</li>
               *
    +          * <li>If the {@code locale}'s language is one of the
    +          * <a href="./Locale.html#legacy_language_codes">Legacy language
    +          * codes</a>, either old or new, then repeat the loading process
    +          * if needed, with the bundle name with the other language.
    +          * For example, "iw" for "he" and vice versa.
               * </ul>
               *
               * @param baseName
               *        the base bundle name of the resource bundle, a fully
               *        qualified class name
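
The fallback described in the new list item can be pictured as an ordered list of candidate bundle names. The sketch below is a simplified model under stated assumptions, not the `ResourceBundle.Control` implementation: the class `BundleNameFallback` and both helpers are hypothetical, only the language part of the locale is considered, and the `baseName_language` naming is assumed for the candidates.

```java
import java.util.ArrayList;
import java.util.List;

final class BundleNameFallback {
    // Obsolete/current pairs named in this CSR.
    private static final String[][] PAIRS = {{"iw", "he"}, {"ji", "yi"}, {"in", "id"}};

    /** Returns the counterpart of a legacy or current code, or null if there is none. */
    static String otherLanguage(String language) {
        for (String[] pair : PAIRS) {
            if (language.equals(pair[0])) return pair[1];
            if (language.equals(pair[1])) return pair[0];
        }
        return null;
    }

    /** Lists the bundle names to try, with the requested language first. */
    static List<String> candidateBundleNames(String baseName, String language) {
        List<String> names = new ArrayList<>();
        names.add(baseName + "_" + language);
        String other = otherLanguage(language);
        if (other != null) {
            names.add(baseName + "_" + other); // tried only if the first is missing
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(candidateBundleNames("Messages", "he")); // [Messages_he, Messages_iw]
        System.out.println(candidateBundleNames("Messages", "fr")); // [Messages_fr]
    }
}
```

Under this model, an application that ships only `Messages_iw.properties` would still have its Hebrew bundle found when the locale's language is "he".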
Comments
Yes, we noticed the deprecation in JDK 25. [~cushon] and our i18n team have made progress on switching to -Djava.locale.useOldISOCodes=false, although the process is a bit risky.
23-07-2025

[~manc], thanks for letting us know. We plan to deprecate this system property due to the unnecessary complexity it introduces in locale comparison and resource loading. Starting with JDK 25, specifying this property triggers a warning message (see JDK-8353118). It is scheduled for removal in a future release (see JDK-8355522). Please begin refactoring your codebase accordingly.
23-07-2025

We (Google) highly appreciate the support for -Djava.locale.useOldISOCodes=true to keep the legacy behavior. We found it highly disruptive to change the return values of Locale.getLanguage() and Locale.toString(). For other languages (C++, Go), our i18n team has taken approaches similar to the JDK's legacy behavior to handle the ISO language code deprecation. For now, most of our applications have to set -Djava.locale.useOldISOCodes=true by default. I'd like to emphasize that the JDK probably needs to support -Djava.locale.useOldISOCodes=true indefinitely.
16-03-2023

I see a release note is already planned; moving to Approved.
26-05-2021