JDK-4422049 : (cs) Extended set of character converters not available through Charset API
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio
  • Affected Version: 1.4.0,1.4.1
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • OS: generic,linux
  • CPU: generic,x86
  • Submitted: 2001-03-06
  • Updated: 2005-01-25
  • Resolved: 2005-01-25
Related Reports
Duplicate :  
Relates :  
Relates :  
Description
The current implementation of java.nio.Charset provides access to only a small subset of the character encodings that were supported by the java.io APIs. The test case:

import java.nio.Charset;
import java.util.Iterator;

public class GetAvailableCharsets {

    public static void main (String[] args) {
    
        Iterator charsetIterator = Charset.availableCharsets().keySet().iterator();
        while (charsetIterator.hasNext()) {
            String charsetName = (String) charsetIterator.next();
            Charset charset = Charset.forName(charsetName);
            System.out.println(charset.name() + " - " + charset.displayName());
        }
        // Big5 is both preferred MIME name and our old converter name
        Charset charset = Charset.forName("Big5");
        System.out.println(charset.name() + " - " + charset.displayName());
    }
}

produces the following output:

ISO-8859-1 - ISO-8859-1
ISO-8859-15 - ISO-8859-15
US-ASCII - US-ASCII
UTF-16 - UTF-16
UTF-16BE - UTF-16BE
UTF-16LE - UTF-16LE
UTF-8 - UTF-8
windows-1252 - windows-1252
Exception in thread "main" java.nio.UnsupportedCharsetException: Big5
        at java.nio.Charset.forName(Charset.java:329)
        at GetAvailableCharsets.main(GetAvailableCharsets.java:15)


The expected output would include all character encodings listed at
http://java.sun.com/j2se/1.3/docs/guide/intl/encoding.doc.html
and then repeat Big5.
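
For readers running the test on a released JDK, the following is a minimal sketch of the same check, not the reporter's code. It assumes the final package name java.nio.charset (the stack trace above shows the pre-release location java.nio.Charset) and Java 5 or later for generics. The class name CharsetListing and its helper methods are hypothetical. It uses Charset.isSupported so that a missing charset such as Big5 is reported rather than thrown.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names are illustrative only.
class CharsetListing {

    // Returns "name - displayName" for every charset the runtime reports.
    static List<String> describeAvailable() {
        List<String> lines = new ArrayList<String>();
        for (Charset cs : Charset.availableCharsets().values()) {
            lines.add(cs.name() + " - " + cs.displayName());
        }
        return lines;
    }

    // Describes one charset by name, without throwing
    // UnsupportedCharsetException when the runtime lacks it.
    static String describeOrMissing(String name) {
        if (!Charset.isSupported(name)) {
            return name + " - not supported";
        }
        Charset cs = Charset.forName(name);
        return cs.name() + " - " + cs.displayName();
    }

    public static void main(String[] args) {
        for (String line : describeAvailable()) {
            System.out.println(line);
        }
        // Big5 is both the preferred MIME name and the old converter name.
        System.out.println(describeOrMissing("Big5"));
    }
}
```

Whether Big5 is reported as supported depends on the runtime: extended charsets may be absent from reduced installations.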


ingrid.yao@Eng 2001-05-29

A Merlin CAP member reports the same complaint.

Comments
EVALUATION

The charset encoder/decoder APIs are fundamentally incompatible with the existing internal converter interfaces. It is impossible to make the old converters available via the new APIs in a way that would be efficient enough to be usable. We plan to completely replace the old converters in the Tiger timeframe.

-- mr@eng 2001/4/4

Full internationalization is a product requirement as per Internationalization / localization Big Rules (http://global.eng/sunteam/bigrules.html). Please do not close this bug unless you obtain a Big Rules exception.

norbert.lindenberg@Eng 2001-04-09

This will not be fixed in Merlin. This bug will remain open until management resolves the Big Rules issue.

-- ###@###.### 2001/10/3

Checked b42 of 1.5.0: the Big5_Solaris implementation (the Solaris variant of the de facto Big5 standard) has not yet been ported as part of the bundled set of J2SE NIO-based charsets. Will add this as part of bug ID 5015668.

###@###.### 2004-03-22

Bug 5015668 was fixed in tiger-beta. All of the old encoders in the JDK are now available through the java.nio.charset API beginning with tiger-beta.

As of tiger-fcs, this is the output for the provided test case:

$ java GetAvailableCharsets
Big5 - Big5
Big5-HKSCS - Big5-HKSCS
EUC-JP - EUC-JP
EUC-KR - EUC-KR
GB18030 - GB18030
GB2312 - GB2312
GBK - GBK
IBM-Thai - IBM-Thai
IBM00858 - IBM00858
IBM01140 - IBM01140
IBM01141 - IBM01141
IBM01142 - IBM01142
IBM01143 - IBM01143
IBM01144 - IBM01144
IBM01145 - IBM01145
IBM01146 - IBM01146
IBM01147 - IBM01147
IBM01148 - IBM01148
IBM01149 - IBM01149
IBM037 - IBM037
IBM1026 - IBM1026
IBM1047 - IBM1047
IBM273 - IBM273
IBM277 - IBM277
IBM278 - IBM278
IBM280 - IBM280
IBM284 - IBM284
IBM285 - IBM285
IBM297 - IBM297
IBM420 - IBM420
IBM424 - IBM424
IBM437 - IBM437
IBM500 - IBM500
IBM775 - IBM775
IBM850 - IBM850
IBM852 - IBM852
IBM855 - IBM855
IBM857 - IBM857
IBM860 - IBM860
IBM861 - IBM861
IBM862 - IBM862
IBM863 - IBM863
IBM864 - IBM864
IBM865 - IBM865
IBM866 - IBM866
IBM868 - IBM868
IBM869 - IBM869
IBM870 - IBM870
IBM871 - IBM871
IBM918 - IBM918
ISO-2022-CN - ISO-2022-CN
ISO-2022-JP - ISO-2022-JP
ISO-2022-KR - ISO-2022-KR
ISO-8859-1 - ISO-8859-1
ISO-8859-13 - ISO-8859-13
ISO-8859-15 - ISO-8859-15
ISO-8859-2 - ISO-8859-2
ISO-8859-3 - ISO-8859-3
ISO-8859-4 - ISO-8859-4
ISO-8859-5 - ISO-8859-5
ISO-8859-6 - ISO-8859-6
ISO-8859-7 - ISO-8859-7
ISO-8859-8 - ISO-8859-8
ISO-8859-9 - ISO-8859-9
JIS_X0201 - JIS_X0201
JIS_X0212-1990 - JIS_X0212-1990
KOI8-R - KOI8-R
Shift_JIS - Shift_JIS
TIS-620 - TIS-620
US-ASCII - US-ASCII
UTF-16 - UTF-16
UTF-16BE - UTF-16BE
UTF-16LE - UTF-16LE
UTF-8 - UTF-8
windows-1250 - windows-1250
windows-1251 - windows-1251
windows-1252 - windows-1252
windows-1253 - windows-1253
windows-1254 - windows-1254
windows-1255 - windows-1255
windows-1256 - windows-1256
windows-1257 - windows-1257
windows-1258 - windows-1258
windows-31j - windows-31j
x-Big5-Solaris - x-Big5-Solaris
x-euc-jp-linux - x-euc-jp-linux
x-EUC-TW - x-EUC-TW
x-eucJP-Open - x-eucJP-Open
x-IBM1006 - x-IBM1006
x-IBM1025 - x-IBM1025
x-IBM1046 - x-IBM1046
x-IBM1097 - x-IBM1097
x-IBM1098 - x-IBM1098
x-IBM1112 - x-IBM1112
x-IBM1122 - x-IBM1122
x-IBM1123 - x-IBM1123
x-IBM1124 - x-IBM1124
x-IBM1381 - x-IBM1381
x-IBM1383 - x-IBM1383
x-IBM33722 - x-IBM33722
x-IBM737 - x-IBM737
x-IBM856 - x-IBM856
x-IBM874 - x-IBM874
x-IBM875 - x-IBM875
x-IBM921 - x-IBM921
x-IBM922 - x-IBM922
x-IBM930 - x-IBM930
x-IBM933 - x-IBM933
x-IBM935 - x-IBM935
x-IBM937 - x-IBM937
x-IBM939 - x-IBM939
x-IBM942 - x-IBM942
x-IBM942C - x-IBM942C
x-IBM943 - x-IBM943
x-IBM943C - x-IBM943C
x-IBM948 - x-IBM948
x-IBM949 - x-IBM949
x-IBM949C - x-IBM949C
x-IBM950 - x-IBM950
x-IBM964 - x-IBM964
x-IBM970 - x-IBM970
x-ISCII91 - x-ISCII91
x-ISO-2022-CN-CNS - x-ISO-2022-CN-CNS
x-ISO-2022-CN-GB - x-ISO-2022-CN-GB
x-iso-8859-11 - x-iso-8859-11
x-JIS0208 - x-JIS0208
x-JISAutoDetect - x-JISAutoDetect
x-Johab - x-Johab
x-MacArabic - x-MacArabic
x-MacCentralEurope - x-MacCentralEurope
x-MacCroatian - x-MacCroatian
x-MacCyrillic - x-MacCyrillic
x-MacDingbat - x-MacDingbat
x-MacGreek - x-MacGreek
x-MacHebrew - x-MacHebrew
x-MacIceland - x-MacIceland
x-MacRoman - x-MacRoman
x-MacRomania - x-MacRomania
x-MacSymbol - x-MacSymbol
x-MacThai - x-MacThai
x-MacTurkish - x-MacTurkish
x-MacUkraine - x-MacUkraine
x-MS950-HKSCS - x-MS950-HKSCS
x-mswin-936 - x-mswin-936
x-PCK - x-PCK
x-windows-874 - x-windows-874
x-windows-949 - x-windows-949
x-windows-950 - x-windows-950
Big5 - Big5

I expect to close this bug.

###@###.### 2005-1-21 00:43:13 GMT

The work to migrate the old sun.io converters to the java.nio.charset APIs was covered by a number of bugs, all fixed in Tiger:

4890306: Make NIO charsets sun.nio.cs.*, sun.nio.cs.ext.* self standing
4891216: Migrate IBM host/ebcdic converters to use java.nio.charset API/SPI
5015668: Big5_Solaris (variant of Big5 for Solaris) needs to be a java.nio.charset supp

I'm going to close this bug as a duplicate of 4890306, the oldest bug on the list, where most of the conversion work was done.

###@###.### 2005-1-25 00:42:31 GMT
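
To illustrate what availability through the charset encoder/decoder API means in practice, here is a minimal sketch, not code from the fix itself. It assumes Java 7 or later (for StandardCharsets). The class name CharsetRoundTrip, its helper methods, and the fallback to UTF-8 are hypothetical choices made for illustration. It encodes and decodes a string with a migrated charset such as Big5 when the runtime provides it.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative only.
class CharsetRoundTrip {

    // Returns the named charset when the runtime has it, otherwise UTF-8.
    static Charset pick(String name) {
        return Charset.isSupported(name) ? Charset.forName(name)
                                         : StandardCharsets.UTF_8;
    }

    // Encodes text to bytes and decodes it back with fresh coders.
    // New coders report malformed or unmappable input by throwing,
    // so lossy conversions surface here instead of passing silently.
    static String roundTrip(String text, Charset cs) {
        try {
            ByteBuffer bytes = cs.newEncoder().encode(CharBuffer.wrap(text));
            return cs.newDecoder().decode(bytes).toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException(
                    "text is not representable in " + cs.name(), e);
        }
    }

    public static void main(String[] args) {
        Charset cs = pick("Big5");
        // Two CJK characters, written as escapes to keep the source ASCII-only.
        String text = "\u4e2d\u6587";
        System.out.println(cs.name() + " round trip ok: "
                + text.equals(roundTrip(text, cs)));
    }
}
```

Some charsets in the list above, such as x-JISAutoDetect, support decoding only, so a general-purpose version would check Charset.canEncode() before requesting an encoder.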
25-01-2005