JDK-8177568 : JEP 314: Additional Unicode Language-Tag Extensions
  • Type: JEP
  • Component: core-libs
  • Sub-Component: java.util:i18n
  • Priority: P2
  • Status: Closed
  • Resolution: Delivered
  • Fix Versions: 10
  • Submitted: 2017-03-24
  • Updated: 2018-03-06
  • Resolved: 2018-03-06
Sub Tasks
JDK-8185647 :  
Description
Summary
-------

Enhance `java.util.Locale` and related APIs to implement additional Unicode extensions of BCP 47 language tags.

Goals
-----

Support for [BCP 47][bcp47] language tags was initially [added in Java SE 7][1], with support for the Unicode locale extension limited to calendars and numbers. This JEP will implement more of the extensions specified in the latest [LDML specification][2] in the relevant JDK classes.

Non-Goals
---------

Unicode language-tag extensions other than those described below will be ignored.

Description
-----------

As of Java SE 9, the [supported BCP 47 U language-tag extensions][3] are `ca` (calendar) and `nu` (numbering system). This JEP will add support for the following additional extensions:

   - `cu` (currency type)
   - `fw` (first day of week)
   - `rg` (region override)
   - `tz` (time zone)
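
As a minimal sketch, the following shows how a language tag carrying these keys can be parsed and read back with the existing `java.util.Locale` API. The class name `UnicodeExtensionTags`, the helper `typeOf`, and the sample tag values are illustrative assumptions and do not come from the JEP itself.

```java
import java.util.Locale;

// Hypothetical helper class, for illustration only.
final class UnicodeExtensionTags {

    // Returns the type stored for a Unicode extension key (for example "cu"),
    // or null when the tag does not carry that key.
    static String typeOf(String languageTag, String key) {
        return Locale.forLanguageTag(languageTag).getUnicodeLocaleType(key);
    }

    public static void main(String[] args) {
        // Sample tag (an assumption): US English with Euro currency, Monday as
        // first day of week, a GB region override, and a Tokyo time zone.
        Locale locale = Locale.forLanguageTag("en-US-u-cu-eur-fw-mon-rg-gbzzzz-tz-jptyo");

        // List every Unicode extension key present in the locale with its type.
        for (String key : locale.getUnicodeLocaleKeys()) {
            System.out.println(key + " = " + locale.getUnicodeLocaleType(key));
        }
    }
}
```

Parsing and storing these keys already works in earlier releases; what this JEP changes is whether the APIs listed below act on them.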

In order to support these additional extensions, changes will be made to the following APIs:

  - `java.text.DateFormat::get*Instance` will return instances based on the extensions `ca`, `rg` and/or `tz`
  - `java.text.DateFormatSymbols::getInstance` will return instances based on the extension `rg`
  - `java.text.DecimalFormatSymbols::getInstance` will return instances based on the extension `rg`
  - `java.text.NumberFormat::get*Instance` will return instances based on the extensions `nu` and/or `rg`
  - `java.time.format.DateTimeFormatter::localizedBy` will return `DateTimeFormatter` instances based on the extensions `ca`, `rg`, and/or `tz`
  - `java.time.format.DateTimeFormatterBuilder::getLocalizedDateTimePattern` will return a pattern string based on the extension `rg`
  - `java.time.format.DecimalStyle::of` will return `DecimalStyle` instances based on the extensions `nu` and/or `rg`
  - `java.time.temporal.WeekFields::of` will return `WeekFields` instances based on the extensions `fw` and/or `rg`
  - `java.util.Calendar::{getFirstDayOfWeek,getMinimalDaysInWeek}` will return values based on the extensions `fw` and/or `rg`
  - `java.util.Currency::getInstance` will return `Currency` instances based on the extensions `cu` and/or `rg`
  - `java.util.Locale::getDisplayName` will return a string that includes display names for these U extensions
  - `java.util.spi.LocaleNameProvider` will have new SPIs for the keys and types of these U extensions
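
A few of the lookups above can be sketched as follows, assuming a runtime of JDK 10 or later (the fix version of this JEP). The class name `ExtensionAwareLookups`, its helper methods, and the sample tags are hypothetical and serve only as illustration.

```java
import java.time.DayOfWeek;
import java.time.temporal.WeekFields;
import java.util.Calendar;
import java.util.Currency;
import java.util.Locale;

// Hypothetical helper class, for illustration only. Assumes JDK 10 or later.
final class ExtensionAwareLookups {

    // Currency code chosen for the tag; honors "cu" on JDK 10 or later.
    static String currencyCode(String tag) {
        return Currency.getInstance(Locale.forLanguageTag(tag)).getCurrencyCode();
    }

    // First day of week as a java.util.Calendar constant; honors "fw".
    static int calendarFirstDay(String tag) {
        return Calendar.getInstance(Locale.forLanguageTag(tag)).getFirstDayOfWeek();
    }

    // First day of week as seen by java.time; honors "fw".
    static DayOfWeek weekFieldsFirstDay(String tag) {
        return WeekFields.of(Locale.forLanguageTag(tag)).getFirstDayOfWeek();
    }

    public static void main(String[] args) {
        // With the "cu" extension honored, this is the Euro rather than the US dollar.
        System.out.println(currencyCode("en-US-u-cu-eur"));

        // With the "fw" extension honored, both APIs report Monday.
        System.out.println(calendarFirstDay("en-US-u-fw-mon") == Calendar.MONDAY);
        System.out.println(weekFieldsFirstDay("en-US-u-fw-mon"));
    }
}
```

On releases before JDK 10 the same calls compile and run, but the extensions are ignored and the results follow the plain `en-US` locale.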

Risks and Assumptions
---------------------

The display names returned from `Locale::getDisplayName` depend on the localized data provided by each locale provider.
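
One way to observe this dependency is to print the display name of the same tag in several display locales, as in the sketch below. The class name `ExtensionDisplayNames` and the sample tag are assumptions for illustration. No expected output is shown because the exact strings depend on the locale data available at run time.

```java
import java.util.Locale;

// Hypothetical helper class, for illustration only.
final class ExtensionDisplayNames {

    // Display name of the tagged locale, localized for the given display locale.
    static String displayName(String tag, Locale inLocale) {
        return Locale.forLanguageTag(tag).getDisplayName(inLocale);
    }

    public static void main(String[] args) {
        String tag = "en-US-u-cu-eur-fw-mon"; // sample tag (an assumption)

        // The wording of the extension part comes from the locale provider's data.
        System.out.println(displayName(tag, Locale.ENGLISH));
        System.out.println(displayName(tag, Locale.JAPANESE));
    }
}
```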

[1]: http://openjdk.java.net/projects/locale-enhancement/
[2]: http://www.unicode.org/reports/tr35/tr35.html#Locale_Extension_Key_and_Type_Data
[3]: http://www.oracle.com/technetwork/java/javase/documentation/java9locales-3559485.html
[bcp47]: http://www.rfc-editor.org/rfc/bcp/bcp47.txt

Comments
[~agarciar] I don't think that expanding "BCP" to "Best Current Practices" is helpful here. People who already know what BCP 47 is don't need the expansion, and people who don't are likely to be confused by it ("Best Current Practices" according to whom?). In this case, it's better to leave it as an acronym. Reverted.
03-11-2017

Thanks, Mark. It looks good to me.
23-10-2017

Okay, thanks for the clarification. I've updated the text to clarify things a bit and add a little more background for non-experts. If this looks okay to you then assign the issue back to me and I'll move it to Candidate, with a proposed new title of "Additional Unicode Language-Tag Extensions" unless you have a better suggestion.
23-10-2017

The intention of this JEP is to make the JDK's relevant classes aware of those particular extensions that are currently not supported. For example, if an application specifies a Locale of "en-US-u-cu-EUR", which means US English with Euro currency, Currency.getInstance(theLoc) ignores the "u-cu-EUR" part and instantiates a US dollar Currency in JDK 9. With this JEP, it would return Euro. This should work no matter what the data source is, CLDR or JRE Legacy.
23-10-2017

It's not entirely clear to me what you're proposing here. JEP 127 (http://openjdk.java.net/jeps/127) introduced support for Unicode CLDR data. JEP 252 (http://openjdk.java.net/jeps/252) changed that data to be the default source of locale information in the JDK. From reading your Description I suspect that you mean you're going to revise a bunch of existing APIs so that they use CLDR data, and in particular the "cu", "fw", "rg", and "tz" BCP 47 U extensions, rather than whatever data source they're using now. Is that correct? If so, what data source are they using now?
23-10-2017