JDK-7187821 : Default encoding changed from 6u33 to 7u6 [macosx]
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio.charsets
  • Affected Version: 7u6
  • Priority: P2
  • Status: Resolved
  • Resolution: Duplicate
  • OS: os_x
  • CPU: x86
  • Submitted: 2012-07-30
  • Updated: 2013-04-12
  • Resolved: 2013-04-12
Related Reports
Duplicate :  
Description
J2SE Version (please include all output from java -version flag):
7u6

Does this problem occur on J2SE 6ux or 7ux?  Yes / No (pick one)
no

Operating System Configuration Information (be specific):
Mac OSX 10.8

Bug Description:

The charset encoding behavior of the File.listFiles method has changed
from Java 1.6.33 to 1.7.6.

Following test case:
  * OSX 10.8 default installation
  * Java installation that came from Apple
  * Current developer release of Java 1.7
  * Folder with files or folders containing special characters like umlauts, 
    Chinese characters or the Apple symbol
  * Run the File.listFiles method against this folder with version 1.6 to 
    see all the files
  * Run the File.listFiles method against this folder with version 1.7 to 
    see only the files without special characters
 
When debugging into listFiles -> File.list -> UnixFileSystem -> String.getBytes, 
the charsetName argument differs between version 1.6.33 (UTF-8) and 
version 1.7 (US-ASCII).
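
To compare the two runtimes without a debugger, a small diagnostic along the following lines could be run under both versions. This is only a sketch: the class name EncodingReport is made up for illustration, and sun.jnu.encoding is an implementation-specific property of Sun/Oracle-derived JDKs that may be absent (null) on other runtimes.

```java
import java.io.File;
import java.nio.charset.Charset;

// Hypothetical helper, not part of the original report.
public class EncodingReport {

    // Collects the encoding-related settings that are visible from Java.
    static String describe() {
        StringBuilder sb = new StringBuilder();
        sb.append("file.encoding    = ").append(System.getProperty("file.encoding")).append('\n');
        // Implementation-specific property; may print "null" on some JDKs.
        sb.append("sun.jnu.encoding = ").append(System.getProperty("sun.jnu.encoding")).append('\n');
        sb.append("defaultCharset   = ").append(Charset.defaultCharset().name()).append('\n');
        // Locale environment variables that the JVM may consult at startup.
        for (String var : new String[] {"LANG", "LC_ALL", "LC_CTYPE"}) {
            sb.append(var).append(" = ").append(System.getenv(var)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());

        // List the given folder (or the current one) the same way the report does.
        File dir = new File(args.length > 0 ? args[0] : ".");
        File[] files = dir.listFiles();
        if (files == null) {
            System.out.println("Not a readable directory: " + dir);
            return;
        }
        for (File f : files) {
            System.out.println(f.getName());
        }
    }
}
```

Running it against the test folder with both Java versions should show whether the reported charset and the listed names differ together.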

It is suspicious that we cannot find out where the UTF-8 in version 1.6
comes from, even though it is the correct behavior.

The only workaround is to set the environment variable LANG to en_US.UTF-8.

The system having the trouble is a brand new installation of OS X Mountain Lion, not an upgrade. We double-checked using the Guest Account, which gets cleared every time you log out. The settings have all been the default settings. Please find two screenshots attached (image001), showing the different output.
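
Where the environment of the launching shell cannot be changed, the LANG workaround could in principle be applied per process by starting the affected program from a small launcher. The following is a sketch under that assumption; the class and method names are hypothetical, and it does not cover applications that are started by Java Web Start.

```java
import java.io.IOException;

// Hypothetical launcher, not part of the original report.
public class LaunchWithUtf8Locale {

    // Builds a child process whose environment has LANG set to a UTF-8 locale.
    static ProcessBuilder withUtf8Locale(String... command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.environment().put("LANG", "en_US.UTF-8");
        return pb;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length == 0) {
            System.err.println("usage: java LaunchWithUtf8Locale <command> [args...]");
            return;
        }
        // Forward the child's output to this console and wait for it to finish.
        Process child = withUtf8Locale(args).inheritIO().start();
        System.exit(child.waitFor());
    }
}
```

For example, `java LaunchWithUtf8Locale java -cp . Test` would run the child JVM with LANG=en_US.UTF-8 while leaving the parent environment untouched.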

The Language Settings in the System Prefs are: English with a "UK (German)" Locale (however this does not have any impact).

The bug is not the different setting of file.encoding. The bug is that a 
change of file.encoding has an effect on file names. The file.encoding 
property describes the encoding of file contents, not of file names.
(See the Inet-test.png screenshot.)
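
Why a US-ASCII charset is harmful for file names can be illustrated in isolation. The sketch below does not reproduce the JDK's native file system code path; it only shows, with a made-up class name and sample file name, that encoding a name with US-ASCII loses every non-ASCII character, while UTF-8 preserves it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo, not part of the original report.
public class FileNameEncodingDemo {

    // Encodes a name to bytes and decodes it again with the same charset.
    static String roundTrip(String name, Charset charset) {
        return new String(name.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String name = "t\u00e4st.rb"; // sample name containing the umlaut "ä"

        // UTF-8 can represent the umlaut, so the name survives unchanged.
        System.out.println(roundTrip(name, StandardCharsets.UTF_8));

        // US-ASCII cannot represent it; getBytes substitutes '?', so this prints "t?st.rb".
        System.out.println(roundTrip(name, StandardCharsets.US_ASCII));
    }
}
```

A name damaged this way no longer matches the entry on disk, which is consistent with such files disappearing from the listing.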


This bug has the effect that files with umlauts are not visible in the JFileChooser. See the screenshot FileEncodingChange.png (JFileChooser on the left, Finder on the right). Note the missing "test��.rb" and "pdfc��hm��.pdf" files.

Comments
Yep - done.
12-04-2013

Can we close this one as a dup of JDK-8003228?
12-04-2013

Yes, the CAP member has confirmed that it is fixed in JDK 8 b84.
11-04-2013

This should be fixed now that 8003228 has been backported to 7u.
08-04-2013

It is a showstopper issue for the CAP member's testing, and there is no workaround for JNLP applications.
17-10-2012

EVALUATION The locale environment variables are obviously not set correctly on the submitter's system. See the "Comments" section for details: all are set to "C" except LC_CTYPE="UTF-8", which is also not correct, since "UTF-8" is definitely not an appropriate locale name. Naoto confirmed that all settings are correct/appropriate on his 10.8 system (en_US.UTF-8). We have not heard further from the submitter about his system configuration. Closing this one as "not reproducible" for now.
08-08-2012

EVALUATION It would be really useful if the submitter could tell us which locale they are using. Also, what is the output of this test:

    public class Test {
        public static void main(String[] args) {
            System.out.println(System.getProperty("file.encoding"));
        }
    }

For whatever reason, the Apple JDK maps UTF8 locales to "MacRoman"; this may be the issue here.
31-07-2012