Name: bb33257 Date: 09/17/98
We need to modify our collation orders for Norwegian and Danish
to correspond with the official national standards for those
countries. According to the ISO representatives for those
countries, the correct rules are as follows:
Some official wording on Danish, taken from the new ISO/IEC 15897 cultural
register, is:
Ordering in Danish is defined in Danish Standard DS 377,
3rd edition (1980) and the Danish Orthography Dictionary
("Retskrivningsordbogen", 2. udgave, Aschehoug, København
1996. ISBN 87-11-10000-1).
Normal <a> to <z> ordering is used on the Latin script, except
for the following letters: The letters <æ> <ø> <å> are
ordered as 3 separate letters after <z>. <ü> is ordered as <y>,
<ä> as <æ>, <ö> as <ø>, <ð> as <d>, <þ> as <t><h>, French <œ>
as <o><e>. Two <a>s are ordered as <å>, except when denoting two
sounds (which is normally the case only in combined words).
Nonaccented letters come before accented letters, and capital
letters come before small letters, when words otherwise compare
equally. There is no explicit ordering of accents specified
in "Retskrivningsordbogen", and whether case or accents
are the most important is not specified.
Data from the ISO/IEC (and CEN) cultural register is available
at http://www.dkuug.dk/cultreg/
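As a rough illustration, the rule that <æ> <ø> <å> follow <z> could be expressed with java.text.RuleBasedCollator along the lines below. This is only a sketch: the class name DanishTailoringSketch, the TAILORING string, and the choice to start from the English collator's rules are assumptions made for this example, not the rules shipped with the JDK. It also treats every "aa" as <å>, ignoring the two-sound exception, and it keeps whatever case order the base rules use.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Locale;

final class DanishTailoringSketch {

    // Hypothetical tailoring. In the rule syntax '<' starts a new primary
    // weight, ';' a secondary variant and ',' a tertiary (case) variant.
    // The reset is on upper-case Z so the new letters follow both z and Z.
    static final String TAILORING =
          "& Z < \u00E6 , \u00C6"    // ae ligature, lower then upper case
        + " < \u00F8 , \u00D8"       // o with stroke
        + " < a\u030A , A\u030A"     // a with ring above, in decomposed form
        + " ; aa , Aa , AA";         // double a as a secondary variant of a-ring

    static RuleBasedCollator danishSketch() {
        // Assumption: the default English collator is a RuleBasedCollator
        // whose rules already contain the plain letters a to z.
        RuleBasedCollator base =
            (RuleBasedCollator) Collator.getInstance(Locale.ENGLISH);
        try {
            return new RuleBasedCollator(base.getRules() + TAILORING);
        } catch (ParseException e) {
            throw new IllegalStateException("tailoring rejected", e);
        }
    }

    public static void main(String[] args) {
        RuleBasedCollator c = danishSketch();
        // Each line prints true when the left word sorts before the right one.
        System.out.println("zebra < aeble (ae ligature): " + (c.compare("zebra", "\u00E6ble") < 0));
        System.out.println("ae < o-stroke:               " + (c.compare("\u00E6ble", "\u00F8l") < 0));
        System.out.println("o-stroke < a-ring:           " + (c.compare("\u00F8l", "\u00E5ben") < 0));
        System.out.println("Zebra < Aalborg:             " + (c.compare("Zebra", "Aalborg") < 0));
    }
}
```

The remaining mappings in the quoted text (for example the letters ordered as <y> or as <t><h>) would need further rules and are left out here.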
-----------------------------------------------------------
Norwegian ordering is as follows:
Aa, Bb, Cc....,Yy:Üü, Vv..., Zz,Ææ:Ää, Øø:Öö,Åå:<Aa><aa>.
Æ has the name LATIN LETTER AE ...(Ash) in 10 646, by the way.
Notice that a double a (Aa or aa) is ordered as Å and å. Å replaced Aa in
the 1917 Norwegian writing reform. The same happened in Danish in 1948.
Æ can be displayed as AE, Ø can be displayed as OE and Å can be displayed as AA
in 7-bit ASCII when there are no alternative ways of displaying them. AE and
OE are not legitimate variants of Æ and Ø as such and should therefore be
regarded as separate characters when written as separate characters.
I have been representing Norway in the JTC1/SC2 in the ISO/IEC 10 646 work.
Kolbjørn Aambø,
University of Oslo Library.
======================================================================
Name: rlT66838 Date: 07/20/99
For Danish (da_DK) and Norwegian (no_NO, no_NO_B) users, Java sorts improperly:
When sorting catalog products by description, v and w are treated as the same character (they have the same weighting).
Example: 1. waffle
2. verkehrt
3. Victor
4. wood
5. vox
6. wrench
The correct result is for v and w to have different primary weightings. This desired behavior has been confirmed by a native Norwegian in our company.
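To make the effect on the word list concrete, the following sketch sorts the six example words twice: once with a collator in which w is only a tertiary variant of v, and once with one in which v and w are separate letters. The class VwSortSketch and the rule string "& V , w , W" are hypothetical and only imitate the reported behavior; they are not taken from the JDK's Danish or Norwegian rules. The sketch also assumes that the default English collator is a RuleBasedCollator that already gives v and w different primary weights.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class VwSortSketch {

    // The six words from the report, in no particular order.
    static final List<String> WORDS =
        Arrays.asList("wood", "vox", "waffle", "Victor", "wrench", "verkehrt");

    // Assumption: in the English rules v and w are distinct primary letters.
    static RuleBasedCollator distinctVw() {
        return (RuleBasedCollator) Collator.getInstance(Locale.ENGLISH);
    }

    // Hypothetical tailoring imitating the report: w and W are moved so that
    // they differ from V only at the tertiary level.
    static RuleBasedCollator mergedVw() {
        try {
            return new RuleBasedCollator(distinctVw().getRules() + "& V , w , W");
        } catch (ParseException e) {
            throw new IllegalStateException("tailoring rejected", e);
        }
    }

    // Returns a sorted copy of WORDS; Collator is a Comparator<Object>.
    static List<String> sortedWith(Collator collator) {
        List<String> copy = new ArrayList<>(WORDS);
        copy.sort(collator);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println("w as variant of v: " + sortedWith(mergedVw()));
        System.out.println("v and w distinct:  " + sortedWith(distinctVw()));
    }
}
```

If the assumptions hold, the first list comes out in the interleaved order shown in the example (waffle, verkehrt, Victor, wood, vox, wrench), while the second groups all v words before the w words.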
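import java.text.Collator;
import java.util.Locale;

// Get the Collator for no_NO (or da_DK or no_NO_B) and set its strength to PRIMARY.
// Locale has no no_NO constant, so the locale is built from its language and country codes.
Collator no_NO_Collator = Collator.getInstance(new Locale("no", "NO"));
no_NO_Collator.setStrength(Collator.PRIMARY);
if (no_NO_Collator.compare("waffle", "vaffle") == 0)
{
    System.out.println("Strings are equivalent");
}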
The (incorrect) result is that "Strings are equivalent" is printed. This is
the case for the locales no_NO, no_NO_B, and da_DK, at both
Collator.PRIMARY and Collator.SECONDARY strength. The difference only shows
up at Collator.TERTIARY.
The correct behavior is for the letters "w" and "v" to compare as different
characters at Collator.PRIMARY and Collator.SECONDARY strength. This is the
correct behavior in Danish and Norwegian, as described by our native
Norwegian employee.
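The strength dependence described above can be checked with a small loop over the three strengths. The sketch below is not the JDK's own rule set: StrengthCheckSketch and the rule string "& V , w , W" are hypothetical and merely imitate the reported weighting, again assuming that the default English collator is a RuleBasedCollator. Passing a collator obtained for no_NO or da_DK to compareAt instead would show what a given Java release actually does.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Locale;

final class StrengthCheckSketch {

    // Hypothetical collator in which w differs from v only at tertiary level.
    static RuleBasedCollator mergedVw() {
        RuleBasedCollator base =
            (RuleBasedCollator) Collator.getInstance(Locale.ENGLISH);
        try {
            return new RuleBasedCollator(base.getRules() + "& V , w , W");
        } catch (ParseException e) {
            throw new IllegalStateException("tailoring rejected", e);
        }
    }

    // Compares a and b at the given strength without modifying the collator
    // that was passed in.
    static int compareAt(Collator collator, int strength, String a, String b) {
        Collator copy = (Collator) collator.clone();
        copy.setStrength(strength);
        return copy.compare(a, b);
    }

    public static void main(String[] args) {
        Collator c = mergedVw();
        int[] strengths = {Collator.PRIMARY, Collator.SECONDARY, Collator.TERTIARY};
        String[] names = {"PRIMARY", "SECONDARY", "TERTIARY"};
        for (int i = 0; i < strengths.length; i++) {
            int result = compareAt(c, strengths[i], "waffle", "vaffle");
            System.out.println(names[i] + ": "
                + (result == 0 ? "equivalent" : "different"));
        }
    }
}
```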
(Review ID: 85477)
======================================================================