JDK-6921885 : (process) Multi-threaded re-entrance into Runtime.exec() affects child process's streams
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.lang
  • Affected Version: 7
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • OS: windows_xp
  • CPU: generic
  • Submitted: 2010-02-01
  • Updated: 2013-08-08
  • Resolved: 2013-08-08
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 7
7-poolFixed
Related Reports
Relates :  
Description
OPERATING SYSTEM(S)
Microsoft Windows XP Professional
5.1.2600 Service Pack 2 Build 2600
X86-based PC

FULL JDK VERSION(S)
All 1.4.2, 5.0 and 6 and 7, including 1.6.0_18-ea-b04.

PROBLEM DESCRIPTION from LICENSEE

In the testcase provided below, two threads create one child process each using Runtime.exec(). One thread creates a short-lived child process by running the command:

ping -n 1 -w 1000 10.0.0.0

(This basically acts as "sleep 1" would on a Unix system.)

The second thread runs a longer lived command, equivalent to "sleep 10":

ping -n 1 -w 10000 10.0.0.0

The input and error streams are read from both processes, on each respective thread, as quickly as possible.

The problem is that even though the shorter process must be finishing after one second, under certain circumstances the reading of the input/error streams seems to be delayed until the longer lived process has finished.

EXPECTED BEHAVIOUR

The read should always return immediately after the shorter process completes, and the message "[Exec Thread Short] - "ping -n 1 -w 1000 10.0.0.0" finished" should be printed approximately one second after the program is launched.

For example:
---------------------------------------
[Thu Nov 19 11:27:07 GMT 2009] [Exec Thread Long ] - "ping -n 1 -w 10000 10.0.0.0" started
[Thu Nov 19 11:27:07 GMT 2009] [Exec Thread Short] - "ping -n 1 -w 1000 10.0.0.0" started
[Thu Nov 19 11:27:07 GMT 2009] [Exec Thread Short] - read InputStream - START
[Thu Nov 19 11:27:07 GMT 2009] [Exec Thread Long ] - read InputStream - START
[Thu Nov 19 11:27:08 GMT 2009] [Exec Thread Short] - read InputStream - END
[Thu Nov 19 11:27:08 GMT 2009] [Exec Thread Short] - read ErrorStream - START
[Thu Nov 19 11:27:08 GMT 2009] [Exec Thread Short] - read ErrorStream - END
[Thu Nov 19 11:27:08 GMT 2009] [Exec Thread Short] - "ping -n 1 -w 1000 10.0.0.0" finished
[Thu Nov 19 11:27:17 GMT 2009] [Exec Thread Long ] - read InputStream - END
[Thu Nov 19 11:27:17 GMT 2009] [Exec Thread Long ] - read ErrorStream - START
[Thu Nov 19 11:27:17 GMT 2009] [Exec Thread Long ] - read ErrorStream - END
[Thu Nov 19 11:27:17 GMT 2009] [Exec Thread Long ] - "ping -n 1 -w 10000 10.0.0.0" finished
---------------------------------------

OBSERVED BEHAVIOUR

However, in approximately 50% of runs we observe that the "[Exec Thread Short] - "ping -n 1 -w 1000 10.0.0.0" finished" message is printed at approximately the same time as the "[Exec Thread Long ] - "ping -n 1 -w 10000 10.0.0.0" finished" message is printed.

For example:
---------------------------------------
[Thu Nov 19 11:26:33 GMT 2009] [Exec Thread Long ] - "ping -n 1 -w 10000 10.0.0.0" started
[Thu Nov 19 11:26:33 GMT 2009] [Exec Thread Short] - "ping -n 1 -w 1000 10.0.0.0" started
[Thu Nov 19 11:26:33 GMT 2009] [Exec Thread Long ] - read InputStream - START
[Thu Nov 19 11:26:33 GMT 2009] [Exec Thread Short] - read InputStream - START
[Thu Nov 19 11:26:44 GMT 2009] [Exec Thread Long ] - read InputStream - END
[Thu Nov 19 11:26:44 GMT 2009] [Exec Thread Long ] - read ErrorStream - START
[Thu Nov 19 11:26:44 GMT 2009] [Exec Thread Long ] - read ErrorStream - END
[Thu Nov 19 11:26:44 GMT 2009] [Exec Thread Short] - read InputStream - END
[Thu Nov 19 11:26:44 GMT 2009] [Exec Thread Long ] - "ping -n 1 -w 10000 10.0.0.0" finished
[Thu Nov 19 11:26:44 GMT 2009] [Exec Thread Short] - read ErrorStream - START
[Thu Nov 19 11:26:44 GMT 2009] [Exec Thread Short] - read ErrorStream - END
[Thu Nov 19 11:26:44 GMT 2009] [Exec Thread Short] - "ping -n 1 -w 1000 10.0.0.0" finished
---------------------------------------

Note the timestamps: the output from the "Short" thread is delayed until the "Long" thread has completed.

It does not look like a regression, since the behavior is same with 1.3.1, 1.4.2, 5.0, and 6 with both IBM and SUN JDKs.

Note that the problem seems to be specific to the Windows JDK, and it seems to be related to the order in which the threads launch their respective processes (see suggested fix section below for more on this). I could not reproduce the unexpected behaviour on Solaris.

TESTCASE

import java.io.*;
import java.util.*;
public class ExecTest implements Runnable {
    private String command = "";

    public ExecTest (String command) {
        this.command = command;
    }

    public static void main(String[] args) {
        ExecTest exec1 = new ExecTest("ping -n 1 -w 1000 10.0.0.0");
        ExecTest exec2 = new ExecTest("ping -n 1 -w 10000 10.0.0.0");

        Thread execThread1 = new Thread(exec1);
        execThread1.setName("Exec Thread Short");
        execThread1.start();
        Thread execThread2 = new Thread(exec2);
        execThread2.setName("Exec Thread Long ");
        execThread2.start();
    }

    public void run() {
        System.out.println("[" + new Date() + "] [" + Thread.currentThread().getName() + "] - \"" + this.command + "\" started");
        execute(command);
        System.out.println("[" + new Date() + "] [" + Thread.currentThread().getName() + "] - \"" + this.command + "\" finished");
        System.out.flush();
    }

    public void execute(String array) {
        try {
            // If these lines are uncommented the testcase behaves as expected
            //if(Thread.currentThread().getName().equals("Exec Thread Long "))
            //Thread.sleep(100);
            Process p = Runtime.getRuntime().exec(array);

            System.out.println("[" + new Date() + "] [" + Thread.currentThread().getName() + "] - read InputStream - START");
            String result = read(p.getInputStream());
            System.out.println("[" + new Date() + "] [" + Thread.currentThread().getName() + "] - read InputStream - END");

            System.out.println("[" + new Date() + "] [" + Thread.currentThread().getName() + "] - read ErrorStream - START");
            result = read(p.getErrorStream());
            System.out.println("[" + new Date() + "] [" + Thread.currentThread().getName() + "] - read ErrorStream - END");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static String read(InputStream is) {
        InputStreamReader isr = new InputStreamReader(is);
        StringBuffer data = new StringBuffer();
        try {
            int i;
            while ((i = isr.read()) != -1) {
                data.append((char)i);
            }
        } catch (Exception ioe) {
            ioe.printStackTrace();
        }
        return data.toString();
    }
}

REPRODUCTION INSTRUCTIONS

1. javac ExecTest.java
2. java ExecTest

SUGGESTED FIX

I don't have a complete understanding of the problem, but I will share details of the debugging I have done so far:

1. If, instead of using the forced delay workaround in the form of a
   sleep() call in the testcase, we introduce a synchronised block to
   the java.lang.Win32Process constructor, containing the entire
   implementation, the issue also does not occur.

2. We can reduce the scope of the block such that only the
   java.lang.Process.create() call is wrapped in a synchronized block,
   and this also prevents the issue from occuring.

3. We can effectively reduce the scope of the synchronized block even
   further by employing a mutex to ensure that the Win32 CreateProcess()
   API is only called by one thread at a time. With this modification in
   place the problem does not occur. So the issue seems to be caused by
   calling CreateProcess() on multiple concurrent threads.

I hope this information helps.

Comments
Umbrella fix from JDK-7147084. http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c4f1081a0fa
08-08-2013

Welcome to MS world http://support.microsoft.com/kb/315939
07-05-2013

EVALUATION i double-checked my test scenario and realise now the sleep commands were not producing any output, which is not the same situation. If I replace them with shell scripts that sleep and send something to stdout, then the issue does occur. I'm not convinced there is a jdk bug yet, but I'll accept the report for the moment.
23-07-2010

EVALUATION I observed the same behavior. However, if you replace the ping command with something different eg. the MKS sleep command (ie sleep 1 and sleep 10), then it doesn't happen ever. That suggests there is some occasional time dependence between the two calls to ping. It could be something to do with a common name lookup which prevents the shorter running command from returning before the second one is finished. In any case, this sounds more like a Windows anomaly than a JDK issue. So, I'm going to close the report. If I've missed something, please let me know and it can be re-opened if necessary.
23-07-2010