JDK-4469394 : (so) SocketChannels registered with OP_WRITE only release selector once (win)
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio
  • Affected Version: 1.4.0,1.4.1
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • OS: solaris_7,solaris_8,windows_nt
  • CPU: generic,x86,sparc
  • Submitted: 2001-06-13
  • Updated: 2002-06-12
  • Resolved: 2002-04-25
  • Fixed Versions: 1.4.0_02 (build 02), 1.4.1
Description

Name: bsC130419			Date: 06/13/2001


java version "1.4.0-beta"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-beta-b65)
Java HotSpot(TM) Client VM (build 1.4.0-beta-b65, mixed mode)

Non-blocking SocketChannels registered with a Selector and including the
SelectionKey.OP_WRITE operation code will not release Selector.select() more
than once. The first time selector.select() is called and the outgoing socket
is available for writing, the select method returns as expected, and the
selector.selectedKeys Set includes the SelectionKey for the SocketChannel with
readyOps including OP_WRITE. However, after removing the SelectionKey from the
selectedKeys Set, any later call to selector.select() will block indefinitely.

The following simple program demonstrates the problem. The program creates a
ServerSocketChannel in non-blocking mode, bound to an InetSocketAddress built
from "localhost", port 1110. The program registers the ServerSocketChannel with
a Selector for the OP_ACCEPT operation. selector.select() then blocks until a
client (e.g., a Telnet client) connects to localhost 1110.

Once a client is accepted, a SocketChannel for that client is configured in
non-blocking mode. The new SocketChannel is registered with the same Selector
with operation OP_WRITE. The next selector.select() call returns almost
immediately with a selectedKeys Set including the SelectionKey for the new
SocketChannel, indicating that the socket can be written to. The program writes
a simple 7-byte buffer to the socket (using SocketChannel.write(ByteBuffer))
and removes the SelectionKey from the selector's selectedKeys Set. The next
call to selector.select() never returns, even though the outgoing client socket
should be available for writing again almost instantaneously.

Note that the erroneous behavior occurs whether or not the test program
actually writes any bytes to the SocketChannel. In the final block of the
main() method's loop below, simply comment out the call to SocketChannel.write()
(the actual line is "sc.write(bb);"), and the same problem still occurs.

// Simple test program to exercise writing multiple times
// to a SocketChannel using non-blocking I/O in the java.nio.*
// packages.

import java.net.*;
import java.nio.*;
import java.nio.channels.*;
import java.util.*;

public class WriteTest
{
    public static void main(String[] args) throws Exception
    {
	//...contents of this byte buffer will be written to
	//   all connected clients repeatedly until they disconnect...
	ByteBuffer bb = ByteBuffer.wrap(new byte[] {
	    (byte)'B', (byte)'r', (byte)'i', (byte)'a', (byte)'n',
	    (byte)'\r', (byte)'\n'
	});

	//...ServerSocketChannel to receive new client connections on
	//   "localhost" interface, port 1110
	ServerSocketChannel ssc = ServerSocketChannel.open();
	ssc.configureBlocking(false);
	ServerSocket ss = ssc.socket();
	ss.bind(new InetSocketAddress(InetAddress.getByName("localhost"),
		1110));

	//...single Selector used to receive non-blocking I/O events...
	Selector sel = Selector.open();

	//...register the ServerSocketChannel with the Selector...
	SelectionKey keyAccept = ssc.register(sel, ssc.validOps());

	while(true)
	{
	    if(sel.select() == 0)
		continue;

	    Set selectedKeys = sel.selectedKeys();

	    //...if a new client is trying to connect, accept the
	    //   connection then loop...
	    if(selectedKeys.contains(keyAccept))
	    {
		// accept() returns the new client's SocketChannel directly
		SocketChannel sc = ssc.accept();
		sc.configureBlocking(false);
		SelectionKey k = sc.register(sel, SelectionKey.OP_WRITE);
		k.attach(sc);
		selectedKeys.remove(keyAccept);
		continue;
	    }

	    //...no new client connections, meaning that at least one of
	    //   the currently connected clients can receive more bytes...
	    Iterator iter = selectedKeys.iterator();
	    while(iter.hasNext())
	    {
		SelectionKey k = (SelectionKey)iter.next();
		SocketChannel sc = (SocketChannel)k.attachment();
		sc.write(bb); // Problem persists even after
			      // commenting this line out.
		bb.rewind();
		iter.remove(); // remove through the iterator; safe while iterating
	    }
	}
    }
}
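
The same check can be run without an external Telnet client. The following is
a minimal sketch, not part of the original report, assuming Java 7 or later
(try-with-resources and ServerSocketChannel.bind). The class name
WriteReadyCheck and the method countWritableRounds are hypothetical. It
connects a loopback client inside the same JVM and counts how many consecutive
select calls report OP_WRITE after the key has been removed from the
selected-key set. A bounded select(1000) is used so that an affected runtime
stalls for about a second per round instead of blocking forever.

```java
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

class WriteReadyCheck {
    // Returns how many of `rounds` consecutive select calls reported OP_WRITE.
    static int countWritableRounds(int rounds) throws Exception {
        try (ServerSocketChannel ssc = ServerSocketChannel.open();
             Selector sel = Selector.open()) {
            // Port 0 asks the OS for any free port (an assumption of this sketch;
            // the original report uses the fixed port 1110).
            ssc.bind(new InetSocketAddress("localhost", 0));
            try (SocketChannel client = SocketChannel.open(ssc.getLocalAddress());
                 SocketChannel accepted = ssc.accept()) {
                accepted.configureBlocking(false);
                SelectionKey key = accepted.register(sel, SelectionKey.OP_WRITE);
                int hits = 0;
                for (int i = 0; i < rounds; i++) {
                    sel.select(1000); // bounded wait, in milliseconds
                    // Remove the key from the selected-key set, as the report does,
                    // and count the round only if the key was actually selected.
                    if (sel.selectedKeys().remove(key) && key.isWritable()) {
                        hits++;
                    }
                }
                return hits;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countWritableRounds(5) + " of 5 rounds reported OP_WRITE");
    }
}
```

On a runtime with level-triggered OP_WRITE reporting, every round should
report the key, because the freshly accepted socket's send buffer is empty. On
a runtime showing the behavior described above, only the first round would.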

The following modified version of the above test program attempts to work
around the problem by creating a new Selector object after each write operation
and registering all client SocketChannels with the new Selector (instead of re-
using the same Selector in each iteration of the outer infinite loop). This
does not fix the problem, however, indicating to me that the problem is in the
native layer (comment elided for brevity). (Note that this version of the test
program does not work with multiple simultaneous clients -- but it still
demonstrates the problem which is the important part).

import java.net.*;
import java.nio.*;
import java.nio.channels.*;
import java.util.*;

public class WriteTest
{
    public static void main(String[] args) throws Exception
    {
	ByteBuffer bb = ByteBuffer.wrap(new byte[] {
	    (byte)'B', (byte)'r', (byte)'i', (byte)'a', (byte)'n',
	    (byte)'\r', (byte)'\n'
	});

	ServerSocketChannel ssc = ServerSocketChannel.open();
	ssc.configureBlocking(false);
	ServerSocket ss = ssc.socket();
	ss.bind(new InetSocketAddress(InetAddress.getByName("localhost"),
		1110));

	Selector sel = Selector.open();

	SelectionKey keyAccept = ssc.register(sel, ssc.validOps());

	while(true)
	{
	    if(sel.select() == 0)
		continue;

	    Set selectedKeys = sel.selectedKeys();

	    if(selectedKeys.contains(keyAccept))
	    {
		// accept() returns the new client's SocketChannel directly
		SocketChannel sc = ssc.accept();
		sc.configureBlocking(false);
		SelectionKey k = sc.register(sel, SelectionKey.OP_WRITE);
		k.attach(sc);
		selectedKeys.remove(keyAccept);
		continue;
	    }

	    //...this section modified from the previous example to open
	    //   a new Selector and use the new one in the next iteration
	    //   through the output (infinite) while loop...
	    Iterator iter = selectedKeys.iterator();
	    Selector sel2 = Selector.open();
	    while(iter.hasNext())
	    {
		SelectionKey k = (SelectionKey)iter.next();
		SocketChannel sc = (SocketChannel)k.attachment();
		sc.write(bb);
		bb.rewind();
		iter.remove(); // remove through the iterator; safe while iterating

		k = sc.register(sel2, SelectionKey.OP_WRITE);
		k.attach(sc);
	    }
	    ssc.register(sel2, SelectionKey.OP_ACCEPT);
	    sel.close();
	    sel = sel2;
	}
    }
}


Brian Maso
(Review ID: 126007) 
======================================================================

Comments
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX: 1.4.0_02, hopper
FIXED IN: 1.4.0_02, hopper
INTEGRATED IN: 1.4.0_02, hopper
VERIFIED IN: hopper
14-06-2004

WORK AROUND

Name: bsC130419			Date: 06/13/2001

No known workaround that I can find other than not using non-blocking I/O for
socket writes.
11-06-2004

SUGGESTED FIX

###@###.### 2002-04-02

For 1.4.0_02, the file which has to be changed is in
j2se/src/win32/native/sun/nio/ch/PollArrayWrapper.c

sccs diffs -C PollArrayWrapper.c

------- PollArrayWrapper.c -------
*** /tmp/d7Rai8Q Tue Apr 2 20:18:35 2002
--- PollArrayWrapper.c Wed Mar 6 19:39:20 2002
***************
*** 109,114 ****
--- 109,167 ----
      WSAEVENT *events = (WSAEVENT *) jlong_to_ptr(eventAddress);
      int err;
      WSANETWORKEVENTS ne;
+     int i, lastError;
+     FD_SET readfds, writefds, exceptfds;
+     const static struct timeval zerotime = {0, 0};
+     if (numfds > 1) {
+         FD_ZERO(&readfds);
+         FD_ZERO(&writefds);
+         FD_ZERO(&exceptfds);
+         for (i = 0; i < numfds -1; i++) {
+             if (fds[i].events & POLLIN) {
+                 FD_SET(fds[i].fd, &readfds);
+             }
+             if (fds[i].events & POLLOUT) {
+                 FD_SET(fds[i].fd, &writefds);
+             }
+             FD_SET(fds[i].fd, &exceptfds);
+         }
+         while ((result = WSAWaitForMultipleEvents(numfds - 1, events + 1,
+                 FALSE, 0, FALSE)) != WSA_WAIT_TIMEOUT) {
+             if (result == WSA_WAIT_FAILED) {
+                 lastError = WSAGetLastError();
+                 JNU_ThrowIOExceptionWithLastError(env,
+                     "In Poll0 reset events:WSA_WAIT_FAILED");
+                 return lastError;
+             }
+             WSAResetEvent(events[result - WSA_WAIT_EVENT_0 + 1]);
+         };
+
+         if ((result = select(numfds - 1 , &readfds, &writefds, &exceptfds,
+                 &zerotime)) != 0){
+             if (result == SOCKET_ERROR) {
+                 lastError = WSAGetLastError();
+                 JNU_ThrowIOExceptionWithLastError(env,
+                     "Poll0 before select:WSA_WAIT_FAILED");
+                 return lastError;
+             }
+             timeout = 0;
+
+             for (i = 0 ; i < numfds - 1; i++) {
+
+                 if ((fds[i].events & POLLIN) &&
+                     FD_ISSET(fds[i].fd, &readfds)) {
+                     fds[i].revents |= POLLIN;
+                 }
+                 if ((fds[i].events & POLLOUT) && FD_ISSET(fds[i].fd, &writefds)) {
+                     fds[i].revents |= POLLOUT;
+                 }
+                 if (FD_ISSET(fds[i].fd, &exceptfds)){ // error
+                     fds[i].revents = fds[i].events;
+                 }
+             }
+         }
+     }
+
+
      if (timeout < 0) {
          timeout = WSA_INFINITE;
11-06-2004

PUBLIC COMMENTS

A new version of Selector for Windows solves this problem.
10-06-2004

EVALUATION

This bug is due to the fact that the NIO specification requires level-triggered
readiness notifications (which is, conveniently, how they work in Solaris and
Linux) but Windows only provides edge-triggered notifications. A fix is in
progress and is slated for the 1.4.1 release.

-- ###@###.### 2002/3/23
03-10-0189
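
The difference between the two notification styles, and the general shape of
the suggested fix, can be illustrated with a toy model. This is a sketch only,
not JDK code: the class EdgeToLevel and its methods are hypothetical, and a
Semaphore stands in for the Windows event object. An edge-triggered wait
reports a transition once and then blocks, which matches the symptom in this
report. A level-triggered wait can be built on top of it by discarding stale
notifications and sampling the current state before waiting, which is roughly
what the patch does with WSAResetEvent and a zero-timeout select().

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy model (hypothetical, not JDK code): an edge-triggered event plus a queryable state. */
class EdgeToLevel {
    private volatile boolean writable;               // current state, i.e. the "level"
    private final Semaphore edge = new Semaphore(0); // signalled on not-ready -> ready

    void setWritable(boolean now) {
        boolean was = writable;
        writable = now;
        if (now && !was) {
            edge.release(); // the edge fires only on the transition
        }
    }

    /** Edge-triggered wait: consumes the notification, so a repeated call times out. */
    boolean awaitEdge(long timeoutMs) throws InterruptedException {
        return edge.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
    }

    /** Level-triggered wait: drop stale edges, then check the state before waiting. */
    boolean awaitLevel(long timeoutMs) throws InterruptedException {
        edge.drainPermits();      // comparable to resetting already-signalled events
        if (writable) {
            return true;          // comparable to the zero-timeout readiness check
        }
        return awaitEdge(timeoutMs) && writable;
    }
}
```

With the state set to writable once, awaitEdge succeeds a single time and then
times out, while awaitLevel keeps returning true for as long as the state
holds.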