JDK-6464154 : (process) subprocess environment sort order differs from Windows native sort order
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.lang
  • Affected Version: 5.0u7,6
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: windows,windows_xp
  • CPU: generic,x86
  • Submitted: 2006-08-25
  • Updated: 2011-03-07
  • Resolved: 2011-03-07
Other: 5.0u11 (Fixed)
JDK 6: 6u2 (Fixed)
JDK 7: 7 b03 (Fixed)
Description
As specified in the Microsoft documentation,

http://windowssdk.msdn.microsoft.com/en-us/library/ms682009.aspx

"All strings in the environment block must be sorted alphabetically by name. The sort is case-insensitive, Unicode order, without regard to locale"

Unfortunately, this specification does not make it clear whether a character that
falls between the upper-case range and the lower-case range, such as "_", should
sort before or after the alphabetic characters.  An empirical test shows that
Windows (actually cmd.exe's "set" command) sorts it after, while Java's
implementation sorts it before.
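
The difference can be reproduced without spawning a process. The following is a minimal sketch, not the JDK code: `UnderscoreOrder`, `UPPER_FOLD`, and `sorted` are hypothetical names introduced for illustration, and the assumption that folding to upper case matches Windows rests only on the empirical test described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class UnderscoreOrder {
    // Hypothetical comparator: folds each char to upper case before comparing,
    // which is assumed here to approximate the order cmd.exe's "set" shows.
    static final Comparator<String> UPPER_FOLD = (s1, s2) -> {
        int min = Math.min(s1.length(), s2.length());
        for (int i = 0; i < min; i++) {
            char c1 = Character.toUpperCase(s1.charAt(i));
            char c2 = Character.toUpperCase(s2.charAt(i));
            if (c1 != c2) {
                return Character.compare(c1, c2);
            }
        }
        return Integer.compare(s1.length(), s2.length());
    };

    // Sorts the variable names used in this report with the given comparator.
    static List<String> sorted(Comparator<String> cmp) {
        List<String> names =
            new ArrayList<>(Arrays.asList("Z", "A", "+", "_", "~", "x", "b"));
        names.sort(cmp);
        return names;
    }

    public static void main(String[] args) {
        // Java's built-in comparator effectively compares lower-cased chars,
        // so '_' (0x5F) lands before 'a' (0x61).
        System.out.println(sorted(String.CASE_INSENSITIVE_ORDER));
        // prints [+, _, A, b, x, Z, ~]

        // Upper-case folding puts '_' (0x5F) after 'Z' (0x5A).
        System.out.println(sorted(UPPER_FOLD));
        // prints [+, A, b, x, Z, _, ~]
    }
}
```

Only characters in the range 0x5B to 0x60 (`[`, `\`, `]`, `^`, `_` and the backquote) change position relative to the ASCII letters between the two orderings.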

Comments
SUGGESTED FIX Peter Ahe writes: "I think the code is fine but you are using a dangerous technique and anyone reading this code will have to think carefully to rule out the potential for problems. Josh and Neal suggest using "return (i2 < i1 ? -1 : (i2 == i1 ? 0 : 1))" in Puzzler 65."
28-08-2006
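
The hazard behind that comment can be shown with a minimal sketch. The class and method names below are hypothetical, and `explicit` is the natural-order form of the three-way comparison rather than the exact expression quoted above. Subtracting two `char` values is safe because both are promoted to `int` and lie in 0..65535, but the same idiom overflows for arbitrary `int` values.

```java
public class SubtractionComparator {
    // Dangerous for arbitrary ints: the difference can overflow.
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    // Explicit three-way comparison; cannot overflow.
    static int explicit(int a, int b) {
        return a < b ? -1 : (a == b ? 0 : 1);
    }

    // Safe: chars are promoted to int, so the result lies in -65535..65535.
    static int charDifference(char c1, char c2) {
        return c1 - c2;
    }

    public static void main(String[] args) {
        // Integer.MIN_VALUE - 1 wraps around to Integer.MAX_VALUE, so the
        // subtraction idiom claims the smaller value is the larger one.
        System.out.println(bySubtraction(Integer.MIN_VALUE, 1)); // prints 2147483647
        System.out.println(explicit(Integer.MIN_VALUE, 1));      // prints -1
        System.out.println(charDifference('\u0000', '\uffff'));  // prints -65535
    }
}
```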

SUGGESTED FIX
--- /u/martin/ws/mustang/src/windows/classes/java/lang/ProcessEnvironment.java 2005-12-04 15:11:48.242579000 -0800
+++ /u/martin/ws/process/src/windows/classes/java/lang/ProcessEnvironment.java 2006-08-26 00:01:35.117667000 -0700
@@ -171,8 +171,25 @@
     private static final class NameComparator
         implements Comparator<String> {
-        public int compare(String x, String y) {
-            return x.compareToIgnoreCase(y);
+        public int compare(String s1, String s2) {
+            // We can't use String.compareToIgnoreCase since it
+            // canonicalizes to lower case, while Windows
+            // canonicalizes to upper case!  For example, "_" should
+            // sort *after* "Z", not before.
+            int n1 = s1.length();
+            int n2 = s2.length();
+            int min = Math.min(n1, n2);
+            for (int i = 0; i < min; i++) {
+                char c1 = s1.charAt(i);
+                char c2 = s2.charAt(i);
+                if (c1 != c2) {
+                    c1 = Character.toUpperCase(c1);
+                    c2 = Character.toUpperCase(c2);
+                    if (c1 != c2)
+                        return c1 - c2;
+                }
+            }
+            return n1 - n2;
         }
     }
@@ -180,7 +197,7 @@
         implements Comparator<Map.Entry<String,String>> {
         public int compare(Map.Entry<String,String> e1,
                            Map.Entry<String,String> e2) {
-            return e1.getKey().compareToIgnoreCase(e2.getKey());
+            return nameComparator.compare(e1.getKey(), e2.getKey());
         }
     }
26-08-2006

EVALUATION Here's a test case demonstrating the problem on Windows 2000:

import java.util.*;
import java.io.*;

public class EnvSort {
    public static String commandOutput(Process p) throws IOException {
        StringBuilder sb = new StringBuilder();
        InputStream is = p.getInputStream();
        int c;
        while ((c = is.read()) != -1)
            sb.append((char)c);
        return sb.toString().replaceAll("\r", "");
    }

    public static String commandOutput(String[] command, String[] env)
        throws IOException {
        Process p1 = Runtime.getRuntime().exec(command, env);
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.environment().clear();
        for (String envvar : env) {
            String[] split = envvar.split("=");
            pb.environment().put(split[0], split[1]);
        }
        String output1 = commandOutput(p1);
        String output2 = commandOutput(pb.start());
        if (! output1.equals(output2))
            throw new Error();
        return output1;
    }

    public static void main(String[] args) throws Throwable {
        String[] env = {"Z=Z", "A=A", "+=+", "_=_", "~=~", "x=x", "b=b"};
        String childEnv1 = commandOutput("cmd /c set".split(" "), env)
            .replaceAll("(?m)^.[^=].*\n", "");
        Writer w = new OutputStreamWriter
            (new FileOutputStream("EnvSort.bat"));
        w.write("@echo off\r\n");
        for (String envvar : env)
            w.write("set " + envvar + "\r\n");
        w.write("set\r\n");
        w.close();
        String childEnv2 = commandOutput("cmd /c EnvSort.bat".split(" "),
                                         new String[]{})
            .replaceAll("(?m)^.[^=].*\n", "");
        new File("EnvSort.bat").delete();
        System.out.print(childEnv1);
        System.out.println("--------------------------------------");
        System.out.print(childEnv2);
        if (! childEnv1.equals(childEnv2))
            throw new Error("inconsistent environments!");
    }
}
------------------------------------------------------------
$ jver mustang jr EnvSort.java
==> javac -source 1.6 -Xlint:all EnvSort.java
==> java -esa -ea EnvSort
+=+
_=_
A=A
b=b
x=x
Z=Z
~=~
--------------------------------------
+=+
A=A
b=b
x=x
Z=Z
_=_
~=~
Exception in thread "main" java.lang.Error: inconsistent environments!
        at EnvSort.main(EnvSort.java:53)
25-08-2006

EVALUATION The Process code uses the built-in String case-insensitive comparator. This is *different* from the Windows one.

    private static class CaseInsensitiveComparator
        implements Comparator<String>, java.io.Serializable {
        // use serialVersionUID from JDK 1.2.2 for interoperability
        private static final long serialVersionUID = 8575799808933029326L;

        public int compare(String s1, String s2) {
            int n1=s1.length(), n2=s2.length();
            for (int i1=0, i2=0; i1<n1 && i2<n2; i1++, i2++) {
                char c1 = s1.charAt(i1);
                char c2 = s2.charAt(i2);
                if (c1 != c2) {
                    c1 = Character.toUpperCase(c1);
                    c2 = Character.toUpperCase(c2);
                    if (c1 != c2) {
                        c1 = Character.toLowerCase(c1);
                        c2 = Character.toLowerCase(c2);
                        if (c1 != c2) {
                            return c1 - c2;
                        }
                    }
                }
            }
            return n1 - n2;
        }
    }

Windows apparently canonicalizes to upper case, not lower case. Doh!

Which choice is better? Well, what would a *standard* function do? Let's check Unix 2003... hmmm... strcasecmp

http://www.opengroup.org/onlinepubs/009695399/functions/strncasecmp.html

"strcasecmp() and strncasecmp() shall behave as if the strings had been converted to lowercase and then a byte comparison performed"

so Java's choice is best (ignoring lingering issues about characters that don't have unique upper-case and lower-case versions, and character sequences that have a different length in a different case, such as the German eszett).
25-08-2006
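
The length-changing case mentioned in the evaluation can be seen with the German eszett (U+00DF). This is a minimal sketch with a hypothetical class name; it shows only that per-character folding, as used by both comparators in this report, cannot see the one-to-many mapping that the String-level conversion applies.

```java
import java.util.Locale;

public class CaseMappingEdgeCases {
    // Locale-independent upper-casing of a whole string.
    static String upper(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String eszett = "\u00DF";

        // String-level mapping expands the single char to two chars.
        System.out.println(upper(eszett));          // prints SS
        System.out.println(upper(eszett).length()); // prints 2

        // Char-level mapping has no single-char upper-case form to return,
        // so the character comes back unchanged.
        System.out.println(Character.toUpperCase('\u00DF') == '\u00DF'); // prints true
    }
}
```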