Customer email:
===============
Although the fix that you provided solved our RMI/IIOP problem, we are
seeing some behaviors that work in 1.4.0 break when we go to 1.4.1_01 with
the patch.
Here are the details of the problem we're having...
Recently, we have upgraded our product development to use JRE 1.4.1 from
JRE 1.4.0 (JRE and entire Java SDK). We did this on recommendation from
our Sun Support representatives, to resolve a serialization problem in
RMI/IIOP with empty String objects.
When we did this, it broke another part of our product, which deals with
process management. We are able to show that the system works properly,
when run with JRE 1.4.0, but we get different behavior and we have
problems, when we run it with JRE 1.4.1. We are using Solaris 8. Here is
an explanation of some of the details.
We have a Java application, which we call the Process Watchdog (or
cemswd). It spawns other Processes (by using the Process class), and reads
from their respective STDOUT and STDERR IO streams (using
BufferedReader(InputStreamReader(InputStream))). These processes are
instances of a shell script (the Process Watchdog Helper, or
cemswdhelper). These shell scripts execute another shell script, which in
turn launches a JVM. So, for a given spawned child process, the
/usr/proc/bin/ptree would look something like this:
-Process Watchdog (Java)
  -Process Watchdog Helper (ksh)
    -Application Wrapper (ksh)
      -Application (Java)
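For reference, the reading pattern described above looks roughly like the following. This is a minimal sketch, not our production code: the class name WatchdogReadSketch, the helper names readAllLines and startLogger, and the stand-in "sh -c" command are illustrative assumptions (the real watchdog launches cemswdhelper).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class WatchdogReadSketch {

    /** Reads lines until readLine() returns null, i.e. until end of stream. */
    static List<String> readAllLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        String line;
        while ((line = reader.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }

    /** Starts a thread that drains one child stream into the given sink. */
    static Thread startLogger(final InputStream in, final List<String> sink) {
        Thread t = new Thread(new Runnable() {
            public void run() {
                try {
                    sink.addAll(readAllLines(in));
                } catch (IOException e) {
                    // Record the failure instead of silently dropping it.
                    sink.add("EXCEPTION: " + e);
                }
            }
        });
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in command (assumption); the real child is a ksh helper script.
        Process child = Runtime.getRuntime().exec(
                new String[] {"sh", "-c", "echo to-stdout; echo to-stderr 1>&2"});
        List<String> out = Collections.synchronizedList(new ArrayList<String>());
        List<String> err = Collections.synchronizedList(new ArrayList<String>());
        // One reader thread per stream, so neither pipe can fill up and block the child.
        Thread outLogger = startLogger(child.getInputStream(), out);
        Thread errLogger = startLogger(child.getErrorStream(), err);
        int exit = child.waitFor();
        outLogger.join();
        errLogger.join();
        System.out.println("stdout=" + out + " stderr=" + err + " exit=" + exit);
    }
}
```

Each stream gets its own reader thread because a child that writes to both STDOUT and STDERR can stall if only one of the two pipes is being drained.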
At some point later, when we call Process.destroy() on these Process
objects (Process Watchdog Helper instances), we see different behavior
depending on which JRE version we use.
In JRE 1.4.0, the BufferedReaders for STDERR and STDOUT return null from
the readLine() method, and the processes (Process Watchdog Helper instances)
get a signal 15, which causes them to do their graceful shutdown (they
shut down their Application Wrapper child processes). When we then do a
Process.waitFor() on the Process Watchdog Helper process, we get a return
value of 0 (graceful exit). This is the behavior we want.
When we execute exactly the same thing (same compiled code, only the JRE
switched) using JRE 1.4.1, the BufferedReaders throw IOExceptions from
their readLine() method, like this:
12 Feb 03 13:34:25.814 pid=15567 tid=20 src=java.io.FileInputStream
EXCEPTION: Unable to read process stream.
Exception: java.io.IOException: Bad file number
Stack Trace:
java.io.FileInputStream.readBytes at line 536
java.io.FileInputStream.read at line 191
sun.nio.cs.StreamDecoder$CharsetSD.readBytes at line 406
sun.nio.cs.StreamDecoder$CharsetSD.implRead at line 446
sun.nio.cs.StreamDecoder.read at line 180
java.io.InputStreamReader.read at line 167
java.io.BufferedReader.fill at line 136
java.io.BufferedReader.readLine at line 299
java.io.BufferedReader.readLine at line 362
com.nortel.cdma.gsf.process.ProcessThread$ProcessOutputLogger.run at line 280
We lose the buffered output from the child process (Process Watchdog
Helper instance), and the Process.waitFor() call returns 13 (abnormal
exit). As a result, the Process Watchdog Helper instances are killed
(we assume by signal 9) before they can gracefully shut down
their child processes (Application Wrapper instances). So, the result is
Application Wrapper instances left hanging around with parent process id
of 1, and the Process Watchdog no longer has any visibility into whether
these child processes are running or not, since its direct children
(Process Watchdog Helper instances) are dead.
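One reader-side mitigation we could imagine is sketched below, purely as an illustration and not as a confirmed fix. The idea is to set a flag just before calling Process.destroy() and have the reader treat an IOException seen after that point as end of stream, so lines already read are kept. The class TolerantStreamReader and its methods are hypothetical names. Note that this would not change which signal the child receives or the exit value returned by Process.waitFor().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

class TolerantStreamReader {

    // Hypothetical flag: set by the watchdog immediately before Process.destroy().
    private volatile boolean shutdownRequested = false;

    void requestShutdown() {
        shutdownRequested = true;
    }

    /**
     * Reads lines until end of stream. An IOException raised after
     * requestShutdown() is treated as end of stream; any other
     * IOException is rethrown to the caller.
     */
    List<String> drain(InputStream in) throws IOException {
        List<String> lines = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            if (!shutdownRequested) {
                throw e; // unexpected failure while the child should be alive
            }
            // Expected during shutdown: keep the lines read so far.
        }
        return lines;
    }
}
```

In this sketch the watchdog would call requestShutdown() on the reader for each stream, then Process.destroy(), then Process.waitFor().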
Please explain the following:
- Exactly what is the behavior of the Process.destroy() method
in JRE 1.4.1 and JRE 1.4.0 (including which signals are used)?
- Why was there a change from JRE 1.4.0 to 1.4.1?
- Is there a JRE 1.4.1 patch to fix this?
- How can we change our code to work-around / fix this problem?