JDK-4154947 : JDK 1.1.6, 1.2/Windows NT: Interrupting a thread blocked does not unblock IO
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.net
  • Affected Version: 1.1.3,1.1.4,1.1.5,1.1.6
  • Priority: P4
  • Status: Closed
  • Resolution: Won't Fix
  • OS: generic,windows_95,windows_nt
  • CPU: generic,x86
  • Submitted: 1998-07-06
  • Updated: 2001-01-22
  • Resolved: 2001-01-22
Description
With JDK 1.1.6, on Solaris, if you interrupt a thread which is blocked
on network IO (specifically, read), the read gets terminated with
InterruptedIOException.   On Windows NT, no such exception is thrown
and the thread is left blocked in the read call.

Attached is a test program to demo the bug.  On Solaris, the output is:
	pause...
	Waiting to accept connections: ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=1234]
	awake...
	Creating connection to server
	Created connection: Socket[addr=localhost/127.0.0.1,port=1234,localport=37294]
	pause...
	Got connection: Socket[addr=localhost/127.0.0.1,port=37294,localport=1234]
	Blocking on read
	awake...
	Interrupting other thread which should be stopped in read
	Interrupt called
	exiting
	PASS: Interrupted as expected
	java.io.InterruptedIOException: operation interrupted
        	at java.net.SocketInputStream.read(SocketInputStream.java:92)
        	at java.net.SocketInputStream.read(SocketInputStream.java:108)
        	at test$1.run(test.java:19)
	gzilla% 


On Windows, the PASS line does not get generated:
	pause...
	Waiting to accept connections: ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=1234]
	awake...
	Creating connection to server
	Created connection: Socket[addr=localhost/127.0.0.1,port=1234,localport=1043]
	pause...
	Got connection: Socket[addr=localhost/127.0.0.1,port=1043,localport=1234]
	Blocking on read
	awake...
	Interrupting other thread which should be stopped in read
	Interrupt called
	exiting

The source of this program is attached.
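The attached program is not reproduced in this report. As a rough stand-in, the scenario can be sketched as below. This is not the original test: the class name InterruptProbe, the method probe(), and the sleep and join intervals are all made up for illustration. The sketch blocks a thread in a socket read, interrupts it, and reports which of the two behaviours described above was observed. The answer depends on the platform and the JDK version, so no particular output is claimed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicReference;

public class InterruptProbe {

    /** Blocks a thread in read(), interrupts it, and describes what happened. */
    static String probe() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {

            AtomicReference<Throwable> failure = new AtomicReference<>();
            Thread reader = new Thread(() -> {
                try {
                    InputStream in = accepted.getInputStream();
                    in.read(); // blocks: the peer never writes anything
                } catch (IOException e) {
                    failure.set(e);
                }
            });
            reader.start();

            Thread.sleep(500);   // give the reader time to block (arbitrary interval)
            reader.interrupt();
            reader.join(1000);   // wait a while to see whether the read was abandoned

            boolean unblocked = !reader.isAlive();
            if (!unblocked) {
                accepted.close(); // release the reader so the program can exit
                reader.join();
            }
            return unblocked
                    ? "interrupt unblocked the read: " + failure.get()
                    : "read stayed blocked after interrupt; released by close()";
        } catch (IOException | InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(probe());
    }
}
```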


Comments
EVALUATION

This is an analysis of the situation provided by hong.zhang@Eng:

Interruptible IO was, and still is, the underlying IO semantics of the Java runtime system. However, it has never been fully implemented since JDK 1.0.2 because of its complexity and difficulty, especially on win32. Many people have tried to implement interruptible IO on win32, and no one has found a nice solution.

The native win32 IO semantics are synchronous IO and asynchronous IO. There is no IO semantics equivalent to interruptible IO, so an implementation of interruptible IO must use some sort of asynchronous IO, which in win32 jargon is overlapped IO. Overlapped IO is a fairly new thing on win32: it was introduced by NT 3.51 and is also supported by Winsock 2. However, not all win32 IO facilities support overlapped IO, for example stdin, anonymous pipes, socket connect(), gethostbyname(), and Winsock 1. So if we have to implement interruptible IO on win32, the only thing we can achieve on both win95 and winnt is to use overlapped (asynchronous) IO to simulate interruptible IO for most socket operations, except connect() and gethostbyname(), unless Microsoft introduces a new API in win98 and nt5.

On Solaris, we have the green and native threads packages. In the green package, we manage our own IO semantics, and with a careful implementation we could largely implement any semantics we need. The problem is with native threads. Even though the Solaris API uses interruptible IO semantics, the actual implementation seems to have some problems. Our current implementation uses user signal 1 to simulate the interruptible IO behavior, and it only barely works. SunSoft proposes a new API to fully support interruptible IO: pthread_interrupt_np(pthread_t tid); (which implicitly means the current Solaris API does not fully support interruptible IO). It would be very nice to have kernel-level support. If it is implemented, we will have a nearly good implementation on Solaris.
Porting such perfection to other platforms will be extremely difficult. Even so, there is still no guarantee about the state of the IO stream, as I discussed the last time I talked to Wesley Chen. However, we still have the question: what are the semantics of interruptible IO, resumption or termination?

Similar to the problem of interruptible IO, the close(int fd) semantics also differ greatly between win32 and Solaris. On win32, if a thread blocks on an fd and a second thread closes the fd, the first thread gets an error return immediately. On Solaris, the second thread is blocked until the first thread finishes its IO operation. If the IO operation is a socket read, both threads may block for a long time. I am now trying to implement the win32 close() semantics on Solaris native threads in a relatively expensive way, and we need to see what result we get.

In addition to the implementation difficulty, there is also a usage difficulty. Most win32 programmers don't understand the interruptible IO semantics. On the other side, most unix developers don't know the close() semantics of win32. A lot of Java code reflects this. For example, most Java libraries (including ours) do not catch InterruptedIOException specifically, and treat it as a fatal IOException. It is also true that several Java IO streams do not implement the close notification semantics (the win32 close semantics).

Besides the above implementation issues, we also need to consider the usage of interruptible semantics. Consider the case where one user (Java) thread needs to wake up another thread (let me name it "Foo") that is blocked on a DataInputStream, which wraps a SocketInputStream, which wraps recv(). When the interrupt exception is thrown, the exception is propagated all the way up to the user level. However, the DataInputStream, the SocketInputStream, and recv() are possibly in an unknown state. If the user ever wants to resume the IO operation later, he may get unknown data from the stream and get totally lost.
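The "unknown state" problem can be made concrete with a small sketch. Because thread interruption of a socket read is not portable, the sketch below uses a socket read timeout as a stand-in: java.net.SocketTimeoutException is a subclass of InterruptedIOException, so it exercises the same exception path. The class name PartialReadDemo, the method name, and the 300 ms timeout are illustrative assumptions, not anything taken from the JDK sources or from the attached test.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InterruptedIOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class PartialReadDemo {

    /** Aborts a readInt() half way through, then "resumes" and returns what is read. */
    static int resumeAfterAbortedRead() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket writerSide = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket readerSide = server.accept()) {

            readerSide.setSoTimeout(300); // stand-in for an interrupt (assumption)
            OutputStream out = writerSide.getOutputStream();
            DataInputStream in = new DataInputStream(readerSide.getInputStream());

            // The writer intends to send the ints 42 and 7, but only the
            // first two bytes of 42 have arrived when the read is aborted.
            out.write(new byte[] {0, 0});
            out.flush();
            try {
                in.readInt();
                throw new IllegalStateException("expected the read to be aborted");
            } catch (InterruptedIOException aborted) {
                // DataInputStream has already consumed and discarded two bytes.
            }

            // The rest of 42 and all of 7 arrive afterwards.
            out.write(new byte[] {0, 42, 0, 0, 0, 7});
            out.flush();

            // Resuming reads the bytes 00 2A 00 00, which is neither 42 nor 7.
            return in.readInt();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("resumed read returned " + resumeAfterAbortedRead());
    }
}
```

The resumed readInt() returns 0x002A0000 rather than 42, because the stream wrapper keeps no record of the bytes it consumed before the exception was thrown.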
So Foo has to remember to close the stream if he gets interrupted. But in this way, the usability of interruptible IO is largely lost. It is much like the close() semantics of Windows.

When I use grep to search the entire build tree, IOException appears in about 1600 places. There are 67 places that catch IOException, but only 9 places that catch InterruptedIOException, in the PrintStream and PrintWriter classes. Generally, InterruptedIOException is considered an IOException and treated as a fatal error. Making InterruptedIOException have resumption semantics will be extremely difficult on any platform, and will go against the semantics of Java language exceptions. But if we choose termination semantics, interruptible IO is very similar to the close() semantics.

In general, even if we had a solid implementation now, we could not expect users to use the semantics correctly, given most win32 developers' background and preferences. Looking at our own VM and core library code, it is obvious that most of our own engineers don't quite understand the issue. But there is a serious backward compatibility issue here.

For now, the Microsoft VM implements interruptible socket IO in a kludgy way. They use an asynchronous select call to implement it, but that only works for sockets. There is some overhead with asynchronous sockets. (At the least, an interrupt takes about 1s to 3s.) In a heavily loaded socket application, one event polling thread may not be enough to handle all the network traffic. In the long term, they will probably use overlapped IO to implement it. The latest Netscape VM does not support interruptible socket IO operations.

The above situation has existed for quite a long time, because we don't have a perfect solution on either side. But JDK 1.2 is our major release. If we cannot define the Java IO semantics during this release, we will have more trouble later on. When we really start to implement them in the future, we will have to face more code breakage.
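The termination-by-close pattern discussed here can be sketched as follows. The java.net.Socket documentation states that closing a socket causes any thread blocked in an IO operation on it to throw a SocketException, and the sketch relies on that. The names CloseToUnblock and closeReleasesReader and the sleep interval are illustrative assumptions.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;
import java.util.concurrent.atomic.AtomicReference;

public class CloseToUnblock {

    /** Returns true if closing the socket released a reader blocked in read(). */
    static boolean closeReleasesReader() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket peer = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket conn = server.accept()) {

            AtomicReference<IOException> seen = new AtomicReference<>();
            Thread reader = new Thread(() -> {
                try {
                    conn.getInputStream().read(); // blocks: the peer never writes
                } catch (IOException e) {
                    seen.set(e); // the reader treats this as termination, not resumption
                }
            });
            reader.start();

            Thread.sleep(300);  // give the reader time to block (arbitrary interval)
            conn.close();       // the "wake up Foo" step: close instead of interrupt
            reader.join(5000);

            return !reader.isAlive() && seen.get() instanceof SocketException;
        } catch (IOException | InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("close() released the reader: " + closeReleasesReader());
    }
}
```

With this approach the reader never tries to resume the stream, which sidesteps the question of what state the stream wrappers are in.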
Even though there is no perfect solution now, we need to face the issue and reach an agreement before 1.2 FCS. At least the users of 1.2 will then have consistent IO semantics to follow.

The situation has changed recently on Solaris. For Solaris 2.7, the close() semantics have been changed to non-blocking; see Bug 4043763 for details. This will greatly affect the JVM, because the VM will automatically report the error code from the OS, which is EBADF in this case. It would be good for the spec to get ready for that now, otherwise the spec will break on Solaris 2.7.

What I suggest is that the spec clearly state the semantics of stream close(), which should be: "if a stream is closed by a thread, all other threads that are blocking on the stream should get an error return". If an existing stream implementation does not meet these semantics, that is a bug. We will try to fix as many as possible before FCS. But whether we should deprecate interruptible IO before FCS is another big question. If we don't deprecate it for 1.2, we will probably never be able to deprecate it and will have to implement it, either before 1.2 FCS or later. For now, no one really knows how to do that on win32.

The issue of Thread.interrupt was examined some time ago by the TRC, see: http://javaweb.eng/trc/minutes/99-07-16.txt

The agreement was that the specified functionality cannot be reasonably implemented and that developers should not rely on interruptible I/O. Even if a solution similar to the Solaris one were implemented on Win32, there would still be inconsistencies on other OS ports where it would not be possible to provide the same behaviour. In merlin, the specification for Thread.interrupt has been updated to indicate that an interrupt just sets the thread's interrupted status. If the thread is blocked in Object.wait, it will receive an InterruptedException. Based on all the above, I am closing this bug.

alan.bateman@ireland 2001-01-22
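The behaviour that remains specified, an interrupt delivered to a thread blocked in Object.wait, can be sketched as below. The class and method names are hypothetical. The wait sits in a loop only to guard against spurious wakeups, so the waiter can leave only through the exception.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class WaitInterruptDemo {

    /** Interrupts a thread blocked in Object.wait() and reports what it observed. */
    static String interruptWaiter() {
        Object lock = new Object();
        CountDownLatch aboutToWait = new CountDownLatch(1);
        AtomicReference<String> outcome = new AtomicReference<>("no exception");

        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                aboutToWait.countDown();
                try {
                    while (true) {
                        lock.wait(); // nobody calls notify; only an interrupt ends this
                    }
                } catch (InterruptedException e) {
                    // Throwing InterruptedException clears the interrupted status.
                    outcome.set("InterruptedException, status cleared = "
                            + !Thread.currentThread().isInterrupted());
                }
            }
        });
        waiter.start();

        try {
            aboutToWait.await();
            waiter.interrupt(); // works whether the waiter is already in wait() or about to enter it
            waiter.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return outcome.get();
    }

    public static void main(String[] args) {
        System.out.println(interruptWaiter());
    }
}
```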