JDK-4073195 : (process) Process.destroy() isn't guaranteed to kill child process on Solaris
  • Type: Enhancement
  • Component: core-libs
  • Sub-Component: java.lang
  • Affected Version: 1.1.2,1.1.5,5.0,6
  • Priority: P3
  • Status: Resolved
  • Resolution: Duplicate
  • OS: solaris,solaris_2.5.1,solaris_2.6
  • CPU: x86,sparc
  • Submitted: 1997-08-20
  • Updated: 2012-11-15
  • Resolved: 2012-11-15
Related Reports
Duplicate :  
Relates :  
Relates :  
Description
Name: joT67522			Date: 08/20/97


This is a copy of the code fragment that creates the process
and executes the printit shell script...
================================================================================
      try {
        Runtime rt = Runtime.getRuntime();
        Process pro = rt.exec("/export/home/usr/msgoldst/printit test.dat");
        // Echo the child's standard output.
        DataInputStream in =
          new DataInputStream ( new BufferedInputStream(pro.getInputStream()));
        while ((s=in.readLine()) != null) {
           System.out.println("out - > " + s);
        }
        // Then echo the child's standard error.
        DataInputStream ine =
          new DataInputStream ( new BufferedInputStream(pro.getErrorStream()));
        while ((s=ine.readLine()) != null) {
            System.out.println("err - > " + s);
        }
      }
      //.
      //.  catch blocks follow for IOException and Exception ...
      //.
================================================================================
This is a copy of the shell script that is called printit
================================================================================
#!/bin/ksh
lp -dit_printer $1
exit
================================================================================
Here is what the process table looks like after running the program ...
Note process pid=24775.  Every time the program runs I get a new
<defunct> process.
================================================================================
msgoldst 24775 24769  0                   0:00 <defunct>
msgoldst 24349 24347  0 08:10:12 pts/1    0:02 -ksh
msgoldst 24781 24349  1 09:41:02 pts/1    0:00 grep gold
msgoldst 24769 24753 14 09:40:47 pts/2    0:05 /export/home/usr/msgoldst/jdk1.1/bin/../bin/sparc/green_threads/java -Dorbixweb
msgoldst 24753 24544  1 09:40:28 pts/2    0:00 orbixd
msgoldst 24544 24542  0 08:36:08 pts/2    0:01 -ksh

================================================================================
What can I do to clean up the process so that no <defunct> processes
are left behind?
Also note that the program occasionally hangs while reading standard
input from the child process.
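
One common cause of this kind of hang, not confirmed for this report, is reading the child's standard output to EOF before touching its standard error: if the child fills the stderr pipe first, both sides block. The following is a minimal sketch, assuming a Java 8 or later runtime, that drains both streams on separate threads and then waits for the child. The class and method names (StreamDrain, drain, runAndCollect) are hypothetical and only for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class StreamDrain {
    /** Starts a daemon thread that copies every line of the stream into sink. */
    static Thread drain(InputStream in, String prefix, List<String> sink) {
        Thread t = new Thread(() -> {
            try (BufferedReader r = new BufferedReader(new InputStreamReader(in))) {
                String line;
                while ((line = r.readLine()) != null) {
                    sink.add(prefix + line);
                }
            } catch (IOException e) {
                sink.add(prefix + "<read failed: " + e.getMessage() + ">");
            }
        });
        t.setDaemon(true); // a stuck reader must not keep the VM alive
        t.start();
        return t;
    }

    /** Runs the command, draining stdout and stderr concurrently. */
    static List<String> runAndCollect(String... command)
            throws IOException, InterruptedException {
        Process p = Runtime.getRuntime().exec(command);
        List<String> lines = Collections.synchronizedList(new ArrayList<String>());
        Thread out = drain(p.getInputStream(), "out - > ", lines);
        Thread err = drain(p.getErrorStream(), "err - > ", lines);
        p.getOutputStream().close(); // the child gets EOF on its stdin
        int status = p.waitFor();    // collect the child's exit status
        out.join();
        err.join();
        lines.add("exit = " + status);
        return lines;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java StreamDrain command [args...]");
            return;
        }
        for (String line : runAndCollect(args)) {
            System.out.println(line);
        }
    }
}
```

The relative order of "out" and "err" lines is not deterministic, because the two reader threads run independently.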

I did find the following among the important known bugs in the virtual
machine, listed with the fixed bugs for version 1.1.2:

bug id    Summary
-------   ----------------------------------------------------------------------
1237893   On Solaris platforms only, a blocking read of System.in blocks all
          threads. As a crude workaround, call the following readin() routine
          instead of System.in.read():
              static int readin() throws IOException, InterruptedException
          It polls every 50 milliseconds, and it can't detect EOF, so its
          behavior is substandard, but it works.

I wonder if this is what is causing my problem with the process hanging while
reading standard in? I will try the suggested workaround for this problem, but
I don't know what to do about the <defunct> process problem.
================================================================================

Thanks in advance,

Best Regards,

Mark Goldston
###@###.###
Rockwell Automotive
 
company - Rockwell Automotive , email - ###@###.###
======================================================================


1) create dummy long-lived app

public class Bar {

  public static void main(String args[])
  {
    // Just sit here.
    Object foo = new Object();
    synchronized (foo) {
      try {
	foo.wait();
      } catch (InterruptedException e) {}
    }
  }
}

2) create a script that runs the long-lived app.

-rwxrwxr-x   1 abartle        19 Dec 26 12:34 runjava*
#!/bin/sh
java Bar

3) create a Foo class that demonstrates the bug:
Process.destroy won't kill a shell script

import java.io.IOException;
/**
 * Blah
 */
public class Foo {

  public static void main(String args[])
  {
    try 
      {
	Process foo = Runtime.getRuntime().exec("runjava");
	System.out.println("Started runjava");

	// Wait 10 seconds
	Thread.sleep(10000);

	// try to kill the process
	System.out.println("Killing runjava");
	foo.destroy(); // doesn't work

	// get exit value -- will throw exception if process not killed!
	foo.exitValue();
      } 
    catch (InterruptedException e) {System.out.println(e.getMessage());}
    catch (IOException e) {System.out.println(e.getMessage());}
    catch (IllegalThreadStateException e) {System.out.println(e.getMessage());}
  }
}


4) Look at the output of running Foo
foomachine:% java Foo
Started runjava
Killing runjava
process hasn't exited

(it threw the IllegalThreadStateException)

5) Verify that the Solaris processes are all still running
 19073 pts/16   S  0:00 /usr/local/java/bin/../bin/sparc/green_threads/java Foo
 19078 pts/16   S  0:00 /bin/sh ./runjava
 19079 pts/16   S  0:00 /usr/local/java/bin/../bin/sparc/green_threads/java Bar

Note BOTH the "runjava" and java Bar are still there!

Thanks for your help,

Aron

Comments
EVALUATION
I think the Java platform is too mature today for us to actually change the way that Process.destroy() works.

We could enable solutions for users by providing access to the child pid, so that users could more easily run kill -9 childPid themselves. See 4244896: (process) Provide System.getPid(), System.killProcess(String pid).

Or we could add a method Process.destroyWithoutMercy that would have the desired guaranteed kill semantics.

But users are never satisfied. They probably also want a way to kill all descendants of the child process. Not at all easy.

Changing to Cause Known, RFE.
15-03-2006

WORK AROUND
Implement a (non-portable) getPid for the Java VM, for example by running
    /bin/sh -c 'echo $PPID'
Then you can (non-portably) collect process tree information, including all the children of this JVM, using something like
    /usr/bin/ps -e -o 'pid,ppid' ...
Then you can explicitly kill any descendant processes gathered from the previous analysis using /usr/xpg4/bin/kill, with any desired death row policy.
15-03-2006
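
The workaround above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names (ProcessTree, run, selfPid, parsePs, descendants) are hypothetical, the ps path and options are the ones quoted in the workaround, and the actual kill step is left to the caller.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ProcessTree {
    /** Runs a command and returns the lines it printed. */
    static List<String> run(String... cmd) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).redirectErrorStream(true).start();
        p.getOutputStream().close();
        List<String> lines = new ArrayList<>();
        try (BufferedReader r =
                 new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) lines.add(line);
        }
        p.waitFor();
        return lines;
    }

    /** Non-portable getPid: the parent of the spawned shell is this JVM. */
    static int selfPid() throws IOException, InterruptedException {
        return Integer.parseInt(run("/bin/sh", "-c", "echo $PPID").get(0).trim());
    }

    /** Parses "pid ppid" rows into a pid -> ppid map, skipping the header. */
    static Map<Integer, Integer> parsePs(List<String> lines) {
        Map<Integer, Integer> parentOf = new LinkedHashMap<>();
        for (String line : lines) {
            String[] f = line.trim().split("\\s+");
            if (f.length < 2) continue;
            try {
                parentOf.put(Integer.parseInt(f[0]), Integer.parseInt(f[1]));
            } catch (NumberFormatException notARow) {
                // header or malformed line: ignore it
            }
        }
        return parentOf;
    }

    /** Breadth-first walk collecting every descendant of root. */
    static List<Integer> descendants(Map<Integer, Integer> parentOf, int root) {
        List<Integer> found = new ArrayList<>();
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            int parent = queue.remove();
            for (Map.Entry<Integer, Integer> e : parentOf.entrySet()) {
                int pid = e.getKey();
                if (e.getValue() == parent && pid != root && !found.contains(pid)) {
                    found.add(pid);
                    queue.add(pid);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws Exception {
        int self = selfPid();
        Map<Integer, Integer> table = parsePs(run("/usr/bin/ps", "-e", "-o", "pid,ppid"));
        // The ps process itself may appear in this list; it has already exited.
        System.out.println("JVM pid " + self + ", descendants " + descendants(table, self));
    }
}
```

The snapshot is inherently racy: processes can start or exit between the ps call and any later kill, so a pid in the list may no longer refer to the same process.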

EVALUATION
Well, regrettably the two reports that have been combined into this one have different causes. The first problem, with the defunct process, is probably caused by reading from System.in as the submitter suggests. I've sent him email in hopes of getting a reply.

The second report points at a few problems. I can explain to you what I think is going on. The Java VM uses SIGTERM to implement Process.destroy(). Your runjava script uses sh and it hangs around. Apparently sh ignores SIGTERM while it's running, so Process.destroy() has no effect. We could use SIGKILL, but all that would do is kill runjava while "java Bar" continues running. Probably when we create a new process using Runtime.exec() it should be in its own process group, and we should kill the process group using SIGTERM, wait a bit to see if it's still running, and then kill it using SIGKILL. I guess that would work but it's fairly complicated. If you don't use the intermediate shell script then the program works fine. You may also be able to use a different shell or write a shell trap handler which would make your shell script work.

The other problem this points at is that since the subprogram hasn't exited, the VM hangs around because the reader threads it uses aren't daemons. This is fixed in 1.2, but I'm not sure of the bug id. If you have opinions about how this should behave, I'd like to hear them.
tom.rodriguez@Eng 1998-03-04

I think the code should probably be changed to use SIGINT. Potentially we could send SIGINT first, wait a moment and if it's still alive send SIGKILL, but given the complexity of UNIX signals and process groups I'm not sure we can really get great semantics out of this.
tom.rodriguez@Eng 1998-05-05
05-05-1998
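
The "ask politely, wait a bit, then force" scheme discussed in the evaluation can be approximated from user code on much later Java releases. The sketch below assumes Java 9 or later, which added Process.descendants(), Process.toHandle() and ProcessHandle; these APIs did not exist when this report was filed. It walks the descendant list instead of using a process group, so it is not the mechanism proposed above, and the class and method names (GracefulKill, terminateTree) are hypothetical. The API documents whether destroy() is a normal or forcible termination as implementation-dependent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

public class GracefulKill {
    /**
     * Requests termination of the child and its current descendants, waits up
     * to graceMillis for the child, then forcibly kills whatever is left.
     * Returns true if the child itself exited within the grace period.
     */
    static boolean terminateTree(Process child, long graceMillis)
            throws InterruptedException {
        // Snapshot the tree first: once the child dies, its children are
        // re-parented and can no longer be found through it.
        List<ProcessHandle> tree =
            new ArrayList<>(child.descendants().collect(Collectors.toList()));
        tree.add(child.toHandle());

        for (ProcessHandle h : tree) {
            h.destroy();               // polite request
        }
        boolean exitedInTime = child.waitFor(graceMillis, TimeUnit.MILLISECONDS);

        for (ProcessHandle h : tree) {
            if (h.isAlive()) {
                h.destroyForcibly();   // no more mercy
            }
        }
        child.waitFor();               // collect the child's exit status
        return exitedInTime;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java GracefulKill command [args...]");
            return;
        }
        Process p = new ProcessBuilder(args).inheritIO().start();
        Thread.sleep(10000);           // same 10 second delay as Foo
        System.out.println("exited within grace period: " + terminateTree(p, 2000));
    }
}
```

Like any pid-based approach, this is racy: a descendant started after the snapshot is missed, which is one reason a process-group kill inside the VM would give stronger guarantees.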