OPERATING SYSTEM(S)
Microsoft Windows XP Professional
5.1.2600 Service Pack 2 Build 2600
X86-based PC
FULL JDK VERSION(S)
All 1.4.2, 5.0 and 6 and 7, including 1.6.0_18-ea-b04.
PROBLEM DESCRIPTION from LICENSEE
In the testcase provided below, two threads create one child process each using Runtime.exec(). One thread creates a short-lived child process by running the command:
ping -n 1 -w 1000 10.0.0.0
(This basically acts as "sleep 1" would on a Unix system.)
The second thread runs a longer lived command, equivalent to "sleep 10":
ping -n 1 -w 10000 10.0.0.0
The input and error streams are read from both processes, on each respective thread, as quickly as possible.
The problem is that even though the shorter process must be finishing after one second, under certain circumstances the reading of the input/error streams seems to be delayed until the longer lived process has finished.
EXPECTED BEHAVIOUR
The read should always return immediately after the shorter process completes, and the message "[Exec Thread Short] - "ping -n 1 -w 1000 10.0.0.0" finished" should be printed approximately one second after the program is launched.
For example:
---------------------------------------
[Thu Nov 19 11:27:07 GMT 2009] [Exec Thread Long ] - "ping -n 1 -w 10000 10.0.0.0" started
[Thu Nov 19 11:27:07 GMT 2009] [Exec Thread Short] - "ping -n 1 -w 1000 10.0.0.0" started
[Thu Nov 19 11:27:07 GMT 2009] [Exec Thread Short] - read InputStream - START
[Thu Nov 19 11:27:07 GMT 2009] [Exec Thread Long ] - read InputStream - START
[Thu Nov 19 11:27:08 GMT 2009] [Exec Thread Short] - read InputStream - END
[Thu Nov 19 11:27:08 GMT 2009] [Exec Thread Short] - read ErrorStream - START
[Thu Nov 19 11:27:08 GMT 2009] [Exec Thread Short] - read ErrorStream - END
[Thu Nov 19 11:27:08 GMT 2009] [Exec Thread Short] - "ping -n 1 -w 1000 10.0.0.0" finished
[Thu Nov 19 11:27:17 GMT 2009] [Exec Thread Long ] - read InputStream - END
[Thu Nov 19 11:27:17 GMT 2009] [Exec Thread Long ] - read ErrorStream - START
[Thu Nov 19 11:27:17 GMT 2009] [Exec Thread Long ] - read ErrorStream - END
[Thu Nov 19 11:27:17 GMT 2009] [Exec Thread Long ] - "ping -n 1 -w 10000 10.0.0.0" finished
---------------------------------------
OBSERVED BEHAVIOUR
However, in approximately 50% of runs we observe that the "[Exec Thread Short] - "ping -n 1 -w 1000 10.0.0.0" finished" message is printed at approximately the same time as the "[Exec Thread Long ] - "ping -n 1 -w 10000 10.0.0.0" finished" message is printed.
For example:
---------------------------------------
[Thu Nov 19 11:26:33 GMT 2009] [Exec Thread Long ] - "ping -n 1 -w 10000 10.0.0.0" started
[Thu Nov 19 11:26:33 GMT 2009] [Exec Thread Short] - "ping -n 1 -w 1000 10.0.0.0" started
[Thu Nov 19 11:26:33 GMT 2009] [Exec Thread Long ] - read InputStream - START
[Thu Nov 19 11:26:33 GMT 2009] [Exec Thread Short] - read InputStream - START
[Thu Nov 19 11:26:44 GMT 2009] [Exec Thread Long ] - read InputStream - END
[Thu Nov 19 11:26:44 GMT 2009] [Exec Thread Long ] - read ErrorStream - START
[Thu Nov 19 11:26:44 GMT 2009] [Exec Thread Long ] - read ErrorStream - END
[Thu Nov 19 11:26:44 GMT 2009] [Exec Thread Short] - read InputStream - END
[Thu Nov 19 11:26:44 GMT 2009] [Exec Thread Long ] - "ping -n 1 -w 10000 10.0.0.0" finished
[Thu Nov 19 11:26:44 GMT 2009] [Exec Thread Short] - read ErrorStream - START
[Thu Nov 19 11:26:44 GMT 2009] [Exec Thread Short] - read ErrorStream - END
[Thu Nov 19 11:26:44 GMT 2009] [Exec Thread Short] - "ping -n 1 -w 1000 10.0.0.0" finished
---------------------------------------
Note the timestamps: the output from the "Short" thread is delayed until the "Long" thread has completed.
It does not look like a regression, since the behavior is same with 1.3.1, 1.4.2, 5.0, and 6 with both IBM and SUN JDKs.
Note that the problem seems to be specific to the Windows JDK, and it seems to be related to the order in which the threads launch their respective processes (see suggested fix section below for more on this). I could not reproduce the unexpected behaviour on Solaris.
TESTCASE
import java.io.*;
import java.util.*;
public class ExecTest implements Runnable {
private String command = "";
public ExecTest (String command) {
this.command = command;
}
public static void main(String[] args) {
ExecTest exec1 = new ExecTest("ping -n 1 -w 1000 10.0.0.0");
ExecTest exec2 = new ExecTest("ping -n 1 -w 10000 10.0.0.0");
Thread execThread1 = new Thread(exec1);
execThread1.setName("Exec Thread Short");
execThread1.start();
Thread execThread2 = new Thread(exec2);
execThread2.setName("Exec Thread Long ");
execThread2.start();
}
public void run() {
System.out.println("[" + new Date() + "] [" + Thread.currentThread().getName() + "] - \"" + this.command + "\" started");
execute(command);
System.out.println("[" + new Date() + "] [" + Thread.currentThread().getName() + "] - \"" + this.command + "\" finished");
System.out.flush();
}
public void execute(String array) {
try {
// If these lines are uncommented the testcase behaves as expected
//if(Thread.currentThread().getName().equals("Exec Thread Long "))
//Thread.sleep(100);
Process p = Runtime.getRuntime().exec(array);
System.out.println("[" + new Date() + "] [" + Thread.currentThread().getName() + "] - read InputStream - START");
String result = read(p.getInputStream());
System.out.println("[" + new Date() + "] [" + Thread.currentThread().getName() + "] - read InputStream - END");
System.out.println("[" + new Date() + "] [" + Thread.currentThread().getName() + "] - read ErrorStream - START");
result = read(p.getErrorStream());
System.out.println("[" + new Date() + "] [" + Thread.currentThread().getName() + "] - read ErrorStream - END");
} catch (Exception e) {
e.printStackTrace();
}
}
public static String read(InputStream is) {
InputStreamReader isr = new InputStreamReader(is);
StringBuffer data = new StringBuffer();
try {
int i;
while ((i = isr.read()) != -1) {
data.append((char)i);
}
} catch (Exception ioe) {
ioe.printStackTrace();
}
return data.toString();
}
}
REPRODUCTION INSTRUCTIONS
1. javac ExecTest.java
2. java ExecTest
SUGGESTED FIX
I don't have a complete understanding of the problem, but I will share details of the debugging I have done so far:
1. If, instead of using the forced delay workaround in the form of a
sleep() call in the testcase, we introduce a synchronised block to
the java.lang.Win32Process constructor, containing the entire
implementation, the issue also does not occur.
2. We can reduce the scope of the block such that only the
java.lang.Process.create() call is wrapped in a synchronized block,
and this also prevents the issue from occuring.
3. We can effectively reduce the scope of the synchronized block even
further by employing a mutex to ensure that the Win32 CreateProcess()
API is only called by one thread at a time. With this modification in
place the problem does not occur. So the issue seems to be caused by
calling CreateProcess() on multiple concurrent threads.
I hope this information helps.