JDK-4729342 : (so) Selector.select() throws CancelledKeyException

Details
Type: Bug
Submit Date: 2002-08-09
Status: Resolved
Updated Date: 2014-04-30
Project Name: JDK
Resolved Date: 2004-04-26
Component: core-libs
OS: solaris_8,linux,windows_2000
Sub-Component: java.nio
CPU: x86,sparc
Priority: P3
Resolution: Fixed
Affected Versions: 1.4.0,1.4.1,1.4.2
Fixed Versions: 1.4.2_06 (06)

Description

Name: nt126004			Date: 08/09/2002


FULL PRODUCT VERSION :
> java -version
java version "1.4.0"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-b92)
Java HotSpot(TM) Client VM (build 1.4.0-b92, mixed mode)


FULL OPERATING SYSTEM VERSION :
> rpm -query glibc
glibc-2.2.4-19

> uname -a
Linux jj.123office.com 2.4.9-7 #1 Thu Oct 18 13:47:25 EDT 2001 i686 unknown



A DESCRIPTION OF THE PROBLEM :
In my java.nio application, I experienced a strange,
intermittent incident where the Selector.select() threw a
CancelledKeyException shortly after another thread had
closed a channel and also cancelled the SelectionKey.

According to the JavaDocs, it is safe to call
SelectionKey.cancel() and/or Channel.close() at any time.

Also, according to the JavaDocs, when Selector.select() is
called, it takes great pains to ensure the safety and
reliability of the operation. See
http://java.sun.com/j2se/1.4/docs/api/java/nio/channels/Selector.html
section "Selection", steps 1, 2a, 2b, and 3.

Here is the stack trace that I believe shows that the NIO
internal classes are in the midst of step 2a (or 2b?) when
the exception is thrown:

java.nio.channels.CancelledKeyException
    at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
    at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
    at sun.nio.ch.SocketChannelImpl.translateAndSetReadyOps(SocketChannelImpl.java:415)
    at sun.nio.ch.AbstractPollSelectorImpl.updateSelectedKeys(AbstractPollSelectorImpl.java:93)
    at sun.nio.ch.PollSelectorImpl.doSelect(PollSelectorImpl.java:65)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:62)
    at com.teamon.util.net.IOPumpNIO.run(IOPumpNIO.java:151)
    at java.lang.Thread.run(Thread.java:536)


Since it is OK to close a channel at any time, and since it
is OK to cancel a SelectionKey at any time, and since
Selector.select() is supposed to correctly handle cancelled
keys, how can it ever throw CancelledKeyException?

I don't see how this should be possible.

And for completeness, JavaDoc for java.nio.channels.Selector
says, regarding concurrency:
    "Application code should be careful to synchronize and
check these conditions as necessary if there is any
possibility that another thread will cancel a key or close a
channel."
It is not clear to me what the application can do that would
either cause this situation or work around the problem,
since the stack trace indicates that the problem is deep
inside the java.nio implementation, where this is supposed
to be handled.


Unfortunately, I don't think that I could provide a simple test case 
right now because:
- has only happened once
- it is in the guts of a large and complicated application which would 
be very difficult to setup

All I have are the logs from our application, which show the events and the
times they occurred. The first event happened on a thread named H93:

10:42:11.071 07/25 [DBUG] [H93] ConnectionNio         ===> QUIT^M
10:42:11.072 07/25 [DBUG] [H93] ConnectionNio         Connection.close()
java.nio.channels.SocketChannel[connected local=/192.168.100.35:38832
remote=/207.66.210.213:110]
10:42:11.073 07/25 [DBUG] [H93] IOPumpNIO             Goodbye sun.nio.ch.SelectionKeyImpl@ca209e

which was produced by the following code:
                if (null != sockChan)
                {
                    if (SyslogUtils.canDebug (this))
                        SyslogUtils.debug (this, "Connection.close() " + sockChan);
                    sockChan.close();
                    sockChan = null;
                }
                ...
                sk.cancel();
                ...

                if (SyslogUtils.canDebug(this))
                    SyslogUtils.debug (this, "Goodbye " + sk);

43 milliseconds later, while blocked in Selector.select(), my main event
loop thread, named "IOPump c05ffd", experiences the
CancelledKeyException, and my catch (Throwable) block logs the exception
and the stack:

10:42:11.116 07/25 [FTAL] [IOPump c05ffd] IOPumpNIO             Leaving run() under duress:
java.nio.channels.CancelledKeyException
java.nio.channels.CancelledKeyException
    at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
    at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
    at sun.nio.ch.SocketChannelImpl.translateAndSetReadyOps(SocketChannelImpl.java:415)
    at sun.nio.ch.AbstractPollSelectorImpl.updateSelectedKeys(AbstractPollSelectorImpl.java:93)
    at sun.nio.ch.PollSelectorImpl.doSelect(PollSelectorImpl.java:65)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:62)
    at com.teamon.util.net.IOPumpNIO.run(IOPumpNIO.java:151)
    at java.lang.Thread.run(Thread.java:536)


Looking over my logs, I don't see any other application threads that 
were active at the time of the exception.

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
1.  This is an intermittent problem - in fact I have only
observed it once, so it seems to be timing dependent.


EXPECTED VERSUS ACTUAL BEHAVIOR :
Expected:  no CancelledKeyException being thrown from the
Selector.select() methods.

Actual:  intermittent CancelledKeyExceptions thrown

REPRODUCIBILITY :
This bug can be reproduced occasionally.
(Review ID: 160037) 
======================================================================

Name: nt126004			Date: 08/29/2002


FULL PRODUCT VERSION :
java version "1.4.1-rc"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1-rc-b19)
Java HotSpot(TM) Client VM (build 1.4.1-rc-b19, mixed mode)


FULL OPERATING SYSTEM VERSION :
Microsoft Windows 2000 [Version 5.00.2195]
Win2K, SP3

A DESCRIPTION OF THE PROBLEM :
Closing a registered SocketChannel in a different Thread
from the Thread that is doing the select will
intermittently generate a NullPointerException:

Exception in thread "main" java.lang.NullPointerException
        at sun.nio.ch.WindowsSelectorImpl$SubSelector.processFdSet(WindowsSelectorImpl.java:302)
        at sun.nio.ch.WindowsSelectorImpl$SubSelector.processSelectedKeys(WindowsSelectorImpl.java:280)
        at sun.nio.ch.WindowsSelectorImpl$SubSelector.access$2600(WindowsSelectorImpl.java:244)
        at sun.nio.ch.WindowsSelectorImpl.updateSelectedKeys(WindowsSelectorImpl.java:405)
        at sun.nio.ch.WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:141)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:62)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:67)
        at nio_test.main(nio_test.java:23)

on Solaris:
Exception in thread "main" java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
        at sun.nio.ch.SocketChannelImpl.translateAndSetReadyOps(SocketChannelImpl.java:682)
        at sun.nio.ch.DevPollSelectorImpl.updateSelectedKeys(DevPollSelectorImpl.java:108)
        at sun.nio.ch.DevPollSelectorImpl.doSelect(DevPollSelectorImpl.java:75)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:62)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:67)
        at nio_test.main(nio_test.java:24)



STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
1. run the enclosed sample code from a DOS box
2. Open a telnet session ala "telnet localhost 10064"
3. Type some text in the telnet session. This will cause
the sample code to close the socket in a separate thread.
4. You will get the above exception in the sample app.
NOTE: You may have to run the telnet session several times -
 it doesn't always reproduce. But, it happens very often
for me.

EXPECTED VERSUS ACTUAL BEHAVIOR :
I don't expect an NPE!

ERROR MESSAGES/STACK TRACES THAT OCCUR :
Exception in thread "main" java.lang.NullPointerException
        at sun.nio.ch.WindowsSelectorImpl$SubSelector.processFdSet(WindowsSelectorImpl.java:302)
        at sun.nio.ch.WindowsSelectorImpl$SubSelector.processSelectedKeys(WindowsSelectorImpl.java:280)
        at sun.nio.ch.WindowsSelectorImpl$SubSelector.access$2600(WindowsSelectorImpl.java:244)
        at sun.nio.ch.WindowsSelectorImpl.updateSelectedKeys(WindowsSelectorImpl.java:405)
        at sun.nio.ch.WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:141)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:62)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:67)
        at nio_test.main(nio_test.java:23)

REPRODUCIBILITY :
This bug can be reproduced often.

---------- BEGIN SOURCE ----------
import java.nio.channels.*;
import java.net.*;
import java.util.*;
import java.io.*;

public class nio_test
{
	public static void main(String[] args) throws Exception
	{
		System.out.println("Java Version: " + 
							System.getProperty("java.vm.version", "???"));

		close_thread_mbr.start();

		final Selector			selector = Selector.open();
		ServerSocketChannel		server = ServerSocketChannel.open();
		server.configureBlocking(false);
		InetSocketAddress		address = new InetSocketAddress(10064);
		server.socket().bind(address);
		server.register(selector, SelectionKey.OP_ACCEPT);

		for(;;)
		{
			if ( selector.select() > 0 )
			{
				Set			keys = selector.selectedKeys();
				Iterator	iterator = keys.iterator();
				while ( iterator.hasNext() )
				{
					SelectionKey		key = (SelectionKey)iterator.next();
					iterator.remove();

					if ( key.isAcceptable() )
					{
						SocketChannel	socket = server.accept();
						socket.configureBlocking(false);
						socket.socket().setTcpNoDelay(false);
						socket.register(selector, SelectionKey.OP_READ, socket);
					}
					else
					{
						if ( key.isReadable() )
						{
							System.out.println("read at " + 
                                               System.currentTimeMillis());

							SocketChannel socket = 
								(SocketChannel)key.attachment();
							synchronized(close_list_mbr)
							{
								close_list_mbr.add(socket);
								close_list_mbr.notifyAll();
							}
						}
					}
				}
			}
		}
	}

	private static class close_thread extends Thread
	{
		public void run()
		{
			for(;;)
			{
				try
				{
					synchronized(close_list_mbr)
					{
						while ( close_list_mbr.size() == 0 )
						{
							close_list_mbr.wait();
						}
						Iterator iterator = close_list_mbr.iterator();
						while ( iterator.hasNext() )
						{
							SocketChannel socket = 
									(SocketChannel)iterator.next();
							socket.close();
						}
					}
				}
				catch ( Exception e )
				{
					e.printStackTrace();
				}
			}
		}
	}

	private static List				close_list_mbr = new ArrayList();
	private static close_thread		close_thread_mbr = new close_thread();
}

---------- END SOURCE ----------
(Review ID: 163458)
======================================================================

                                    

Comments
EVALUATION

Possible race condition in selector implementation.  -- ###@###.### 2002/8/9

In the source code provided, the channel key often gets added to the selectedKey set again before the other thread gets around to closing it. Then either the select logic or the user's isReadable() will throw a CancelledKeyException. The select logic should prevent this from happening and that is a bug. But if the user does not cancel the key themselves, or at least clear the interest ops of the key, before closing the channel, then they will still get the exception when invoking isReadable if the closing thread is slower than the selecting one.
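
The application-side failure described above (isReadable() throwing on a key that another thread has cancelled) can be guarded against in user code. The following is a minimal sketch and not part of the original report: it assumes a current JDK rather than 1.4, and the class name KeyGuardSketch and the helper safeIsReadable are hypothetical. Checking isValid() first narrows the window but cannot close it, because another thread may cancel the key between the check and the call, so the helper also catches CancelledKeyException.

```java
import java.io.IOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class KeyGuardSketch {
    // Hypothetical helper: treats a cancelled key as "not readable"
    // instead of letting CancelledKeyException escape the event loop.
    static boolean safeIsReadable(SelectionKey key) {
        try {
            return key.isValid() && key.isReadable();
        } catch (CancelledKeyException e) {
            // Cancelled by another thread between isValid() and isReadable().
            return false;
        }
    }

    // Returns { whether the unguarded call threw, result of the guarded call }.
    static boolean[] demo() throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SourceChannel source = pipe.source();
             Pipe.SinkChannel sink = pipe.sink()) {
            source.configureBlocking(false);
            SelectionKey key = source.register(selector, SelectionKey.OP_READ);

            // Stand-in for "another thread cancelled the key or closed the channel".
            key.cancel();

            boolean threw = false;
            try {
                key.isReadable();
            } catch (CancelledKeyException e) {
                threw = true;
            }
            return new boolean[] { threw, safeIsReadable(key) };
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] r = demo();
        System.out.println("unguarded threw: " + r[0] + ", guarded result: " + r[1]);
    }
}
```

A guard like this only protects calls made by the application; it does nothing about an exception raised from inside select() itself, which is the part this report treats as a bug.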

Therefore it is a good idea when you are done reading from a channel and don't want to see its key again in select(), to clear its interest ops (may be temporary) or cancel it (more permanent). If you are going to close the channel because you are done with it, then you don't want it appearing back in the selected key set before it gets closed, so if you are closing it in a thread other than the thread doing the selecting, go ahead and cancel the key.
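
The workaround code in this report demonstrates cancelling the key; the temporary alternative mentioned above, clearing the interest ops, can be sketched as follows. This is an illustration under assumptions rather than code from the report: it uses a current JDK (try-with-resources, ServerSocketChannel.bind), a loopback connection on an ephemeral port, an arbitrary 5 second select timeout, and the hypothetical class name InterestOpsSketch. It shows that a key with its interest ops set to 0 is not selected even though unread data is still pending, and that restoring OP_READ makes it selectable again.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class InterestOpsSketch {
    // Returns the number of keys selected in three phases:
    // { OP_READ set, interest ops cleared, OP_READ restored }.
    static int[] run() throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {
                accepted.configureBlocking(false);
                SelectionKey key = accepted.register(selector, SelectionKey.OP_READ);

                // Make the accepted channel readable; the byte is never read,
                // so the channel stays readable for the rest of the demo.
                client.write(ByteBuffer.wrap(new byte[] { 1 }));

                int withRead = selector.select(5000);
                selector.selectedKeys().clear();

                // Temporary: the key stays registered and valid but is ignored.
                key.interestOps(0);
                int cleared = selector.selectNow();

                // Re-arm the key when the application wants read events again.
                key.interestOps(SelectionKey.OP_READ);
                int restored = selector.selectNow();
                selector.selectedKeys().clear();

                return new int[] { withRead, cleared, restored };
            }
        }
    }

    public static void main(String[] args) throws IOException {
        int[] r = run();
        System.out.println("OP_READ: " + r[0] + ", cleared: " + r[1] + ", restored: " + r[2]);
    }
}
```

Clearing the interest ops keeps the registration alive, so it suits a channel that will be read again later; for a channel that is about to be closed, cancelling the key is the more direct choice.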

I have added code to the workaround section, based on the sample in the description, that demonstrates this.

###@###.### 2004-04-01

The select logic no longer throws CancelledKeyException if a key that has become ready has meanwhile been cancelled in another thread.
###@###.### 2004-04-20
                                     
2004-04-01
WORK AROUND

If you are going to close a channel (even by adding it to a list to be closed in another thread), go ahead and cancel the key yourself, or at least clear its interestOps.


import java.nio.channels.*;
import java.net.*;
import java.util.*;
import java.io.*;

public class Blah {
    private static List close_list_mbr = new ArrayList();

    private static CloseThread close_thread_mbr = new CloseThread();

    public static void main(String[] args) throws Exception {
        System.out.println("Java Version: " + 
                           System.getProperty("java.vm.version", "???"));

        close_thread_mbr.start();
        final Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.configureBlocking(false);
        InetSocketAddress address = new InetSocketAddress(10064);
        server.socket().bind(address);
        server.register(selector, SelectionKey.OP_ACCEPT);

        while(true) {
            if (selector.select() > 0) {
                Set keys = selector.selectedKeys();
                Iterator iterator = keys.iterator();
                while (iterator.hasNext()) {
                    SelectionKey key = (SelectionKey)iterator.next();
                    iterator.remove();

                    if (key.isAcceptable()) {
                        SocketChannel socket = server.accept();
                        socket.configureBlocking(false);
                        socket.socket().setTcpNoDelay(false);
                        socket.register(selector,SelectionKey.OP_READ,socket);
                    } else {
                        if (key.isReadable()) {
                            System.out.println("read at " + 
                                               System.currentTimeMillis());
                            SocketChannel socket = 
                                (SocketChannel)key.attachment();
                            // If you are done reading and are going to
                            // close this channel, cancel the key before
                            // calling select() again. Otherwise the key may
                            // reappear in the selected-key set before the
                            // other thread gets around to closing it. The
                            // close then cancels the key, leaving a
                            // cancelled key in the selected-key set, so
                            // the next call to isReadable() throws
                            // CancelledKeyException. Sometimes, due to a
                            // bug, the select logic itself throws the
                            // exception.
                            key.cancel();
                            synchronized(close_list_mbr) {
                                close_list_mbr.add(socket);
                                close_list_mbr.notifyAll();
                            }
                        }
                    }
                }
            }
        }
    }

    private static class CloseThread extends Thread {
        public void run() {
            for(;;) {
                try {
                    synchronized(close_list_mbr) {
                        while (close_list_mbr.size() == 0) {
                            close_list_mbr.wait();
                        }
                        Iterator iterator = close_list_mbr.iterator();
                        while (iterator.hasNext()) {
                            SocketChannel socket = 
                                (SocketChannel)iterator.next();
                            socket.close();
                        }
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
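
A different way to avoid the race, which this report does not describe and which is offered here only as a commonly used pattern, is to never cancel or close a registered channel from a foreign thread at all. Other threads hand the channel to the selecting thread and call Selector.wakeup(); the selecting thread then cancels the key and closes the channel between select() calls. The sketch below assumes a current JDK; the class SelectorThreadCloseSketch and its methods requestClose, shutdown, and demo are hypothetical names, and a Pipe stands in for a socket to keep the example self-contained.

```java
import java.io.IOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class SelectorThreadCloseSketch implements Runnable {
    private final Selector selector;
    private final Queue<SelectableChannel> pendingClose = new ConcurrentLinkedQueue<>();
    private volatile boolean running = true;
    volatile Throwable failure; // set if the loop exits abnormally

    SelectorThreadCloseSketch(Selector selector) {
        this.selector = selector;
    }

    // Safe to call from any thread: only enqueues and wakes the selector.
    void requestClose(SelectableChannel channel) {
        pendingClose.add(channel);
        selector.wakeup();
    }

    void shutdown() {
        running = false;
        selector.wakeup();
    }

    @Override
    public void run() {
        try {
            while (running) {
                // Cancel and close on the selecting thread, never mid-select.
                SelectableChannel channel;
                while ((channel = pendingClose.poll()) != null) {
                    SelectionKey key = channel.keyFor(selector);
                    if (key != null) {
                        key.cancel();
                    }
                    channel.close();
                }
                selector.select();
                // A real event loop would dispatch the ready keys here.
                selector.selectedKeys().clear();
            }
        } catch (Throwable t) {
            failure = t;
        }
    }

    // Returns true if the channel was closed and the loop exited cleanly.
    static boolean demo() throws IOException, InterruptedException {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            pipe.source().configureBlocking(false);
            // Register before the loop starts so registration cannot contend with select().
            pipe.source().register(selector, SelectionKey.OP_READ);

            SelectorThreadCloseSketch loop = new SelectorThreadCloseSketch(selector);
            Thread thread = new Thread(loop, "selector-loop");
            thread.start();

            loop.requestClose(pipe.source()); // issued from the main thread
            long deadline = System.currentTimeMillis() + 5000;
            while (pipe.source().isOpen() && System.currentTimeMillis() < deadline) {
                Thread.sleep(10);
            }
            loop.shutdown();
            thread.join(5000);
            pipe.sink().close();
            return !pipe.source().isOpen() && loop.failure == null;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("closed on selector thread: " + demo());
    }
}
```

The trade-off is an extra queue and a wakeup per close request; in return, key state only ever changes on the thread that calls select(), so the selected-key set cannot contain a key that a concurrent thread has cancelled.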

                                     
2004-07-08
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX:
1.4.2_06
generic
tiger-beta2

FIXED IN:
1.4.2_06
tiger-beta2

INTEGRATED IN:
1.4.2_06
tiger-b49
tiger-beta2


                                     
2004-07-08


