JDK-4744057 : (se) Potential deadlock between Selector and SelectableChannel
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio
  • Affected Version: 1.4.0,1.4.1
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic,solaris_8
  • CPU: generic,sparc
  • Submitted: 2002-09-09
  • Updated: 2006-06-21
  • Resolved: 2006-06-21
Fixed in:
  • Other: 5.0u10 (Fixed)
  • JDK 6: 6 b89 (Fixed)
Description
Name: nt126004			Date: 09/09/2002


FULL PRODUCT VERSION :
java version "1.4.1-rc"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1-rc-b19)
Java HotSpot(TM) Client VM (build 1.4.1-rc-b19, mixed mode)


FULL OPERATING SYSTEM VERSION :
SunOs sss1 5.8 Generic_108528-13 sun4u sparc SUNW,Ultra-80


A DESCRIPTION OF THE PROBLEM :
Closing a SelectableChannel in one thread while another
thread calls Selector.select() may result in a deadlock
when the channel is registered with that selector. The
reason is that SelectableChannel.close() and
Selector.select() synchronize on the same pair of objects
but in opposite order, as the following stack traces show:

CLOSE (Thread A):
=================
java.nio.channels.spi.AbstractSelector.cancel()  => enter
monitor for a java.util.HashSet (protecting this selector's
cancelled key list)
java.nio.channels.spi.AbstractSelectionKey.cancel()
java.nio.channels.spi.AbstractSelectableChannel.implCloseChannel()
=> enter monitor for a java.lang.Object (protecting
this channel's keys)
java.nio.channels.spi.AbstractInterruptibleChannel.close()


SELECT (Thread B):
==================
java.nio.channels.spi.AbstractSelectableChannel.removeKey()
=> enter monitor for a java.lang.Object (protecting
this channel's keys)
java.nio.channels.spi.AbstractSelector.deregister()
sun.nio.ch.DevPollSelectorImpl.implDereg()
sun.nio.ch.SelectorImpl.processDeregisterQueue()  => enter
monitor for a java.util.HashSet (protecting this selector's
cancelled key list)
sun.nio.ch.DevPollSelectorImpl.doSelect()
sun.nio.ch.SelectorImpl.select()


It is possible that Thread A acquires the lock on the Object
and Thread B then acquires the lock on the HashSet before
Thread A does. Each thread then waits for the lock the other
holds, which results in a deadlock.
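
The lock-order inversion described above can be illustrated in isolation. The following is only a minimal sketch, not the NIO code itself: the two plain Objects are hypothetical stand-ins for the channel's key lock and the selector's cancelled-key set, the class and method names are invented for illustration, and it assumes a modern JDK (lambdas, java.lang.management). A latch forces the unlucky interleaving so that each thread holds its first lock before asking for the second, and the JVM's own monitor-deadlock detection is used to confirm the result.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderInversion {

    // Both threads meet here while each still holds its first lock.
    private static void meet(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Starts a daemon thread that takes 'first', waits for its peer, then takes 'second'.
    private static Thread lockInOrder(String name, Object first, Object second,
                                      CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                meet(latch);
                synchronized (second) {
                    // Not reached once both threads hold their first lock.
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit even though the threads stay blocked
        t.start();
        return t;
    }

    /** Returns how many of the two threads the JVM reports as monitor-deadlocked. */
    public static int deadlockedThreadCount() {
        Object keyLock = new Object();        // stand-in: channel's key lock
        Object cancelledKeys = new Object();  // stand-in: selector's cancelled-key set
        CountDownLatch latch = new CountDownLatch(2);

        Thread a = lockInOrder("close", keyLock, cancelledKeys, latch);   // Thread A order
        Thread b = lockInOrder("select", cancelledKeys, keyLock, latch);  // Thread B order

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null when none are deadlocked
            int count = 0;
            if (ids != null) {
                for (long id : ids) {
                    if (id == a.getId() || id == b.getId()) {
                        count++;
                    }
                }
            }
            if (count == 2) {
                return count;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return count;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
    }
}
```

In the real bug nothing forces the interleaving, which is why the reporter could only observe it rarely; the latch here exists solely to make the inversion visible on every run.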


STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the example code under a Thread Analyser (e.g.
OptimizeIt).

Note: the deadlock is difficult to reproduce, but thread /
source code analysis shows it can happen.

REPRODUCIBILITY :
This bug can be reproduced rarely.

---------- BEGIN SOURCE ----------
import java.nio.channels.*;

public class Deadlock
{
    public static void main(String[] args) throws Exception
    {
        final Selector selector = Selector.open();
        
        Thread thread = new Thread(new Runnable() {
                public void run()
                {
                    while (true)
                    {
                        try
                        {
                            System.out.println("select");
                            selector.select(100);
                        }
                        catch (Exception e)
                        {
                            // Deliberately ignored; keep selecting.
                        }
                    }
                }
            });
        thread.start();
        
        while (true)
        {
            SocketChannel channel = SocketChannel.open();
            channel.configureBlocking(false);
            channel.register(selector, 0, null);
            System.out.println("close");
            channel.close();
        }
    }
}

---------- END SOURCE ----------
(Review ID: 164121) 
======================================================================

Comments
EVALUATION -- This bug has also been fixed in 5.0u9.
18-07-2006

EVALUATION Yes, this is possible. -- ###@###.### 2002/12/3

The suggested deadlock scenario is (I slightly modified the info copied from the Description to make it easier to explain): "Thread A acquires a lock on the Object (A-2), then Thread B acquires a lock on the HashSet (B-3) before Thread A does. This results in a deadlock." The stack traces in question are shown below.

CLOSE (Thread A):
=================
(4) java.nio.channels.spi.AbstractSelector.cancel() => enter monitor AbstractSelector.cancelledKeys
(3) java.nio.channels.spi.AbstractSelectionKey.cancel()
(2) java.nio.channels.spi.AbstractSelectableChannel.implCloseChannel() => enter monitor AbstractSelectableChannel.keyLock
(1) java.nio.channels.spi.AbstractInterruptibleChannel.close()

SELECT (Thread B):
==================
(6) java.nio.channels.spi.AbstractSelectableChannel.removeKey() => enter monitor AbstractSelectableChannel.keyLock
(5) java.nio.channels.spi.AbstractSelector.deregister()
(4) sun.nio.ch.DevPollSelectorImpl.implDereg()
(3) sun.nio.ch.SelectorImpl.processDeregisterQueue() => enter monitor AbstractSelector.cancelledKeys
(2) sun.nio.ch.DevPollSelectorImpl.doSelect()
(1) sun.nio.ch.SelectorImpl.select()

To run into a real deadlock, (a) Thread A needs to go up and reach (4) after (3), which means THE SelectionKey has NOT been canceled yet (it is still valid). And (b) Thread B needs to go up and reach (4) after (3) for THE SelectionKey, which means THE SelectionKey is already in the selector's cancelledKeys set, which implies that it has already been canceled. (a) and (b) cannot both be true for the same Selector, Channel and the SelectionKey between them.

###@###.### 2005-06-10 06:50:31 GMT

The above evaluation ((a) and (b) cannot both be true) holds if only two threads A and B are involved, one selecting on the channel while the other is closing it at the same time.
After taking a second look at the code, I realized that there is a possible race condition in java.nio.channels.spi.AbstractSelectionKey.cancel() if a third thread C cancels the SelectionKey directly (by invoking the key's cancel() method) while threads A and B are busy doing the close and select described above.

The deadlock occurs when Threads A and C are both inside AbstractSelectionKey.cancel(), Thread A having arrived via steps (1), (2) and (3) described above and Thread C by invoking cancel() directly, and both pass the "if (valid)" check. Thread C then goes ahead and puts the key into the selector's cancelledKeys set. After the key has been put into the cancelledKeys set, Thread B comes in via the steps described above. Now we have the "(a) and (b) are both true" scenario.

With the following modified version of the test case:

    import java.nio.channels.*;
    import java.util.Iterator;

    public class Deadlock
    {
        public static void main(String[] args) throws Exception
        {
            final Selector selector = Selector.open();

            // Thread B: selects in a loop.
            new Thread(new Runnable() {
                    public void run()
                    {
                        while (true)
                        {
                            try
                            {
                                System.out.println("select");
                                selector.select(100);
                            }
                            catch (Exception e)
                            {}
                        }
                    }
                }).start();

            // Thread C: cancels registered keys directly.
            new Thread(new Runnable() {
                    public void run()
                    {
                        while (true)
                        {
                            try
                            {
                                Iterator<SelectionKey> itr = selector.keys().iterator();
                                if (itr.hasNext())
                                {
                                    System.out.println("cancel key");
                                    itr.next().cancel();
                                }
                            }
                            catch (Exception e)
                            {}
                        }
                    }
                }).start();

            // Thread A (main): registers and closes channels.
            while (true)
            {
                SocketChannel channel = SocketChannel.open();
                channel.configureBlocking(false);
                channel.register(selector, 0, null);
                System.out.println("close");
                channel.close();
            }
        }
    }

and with a Thread.sleep() call purposely added to the AbstractSelectionKey.cancel() method to increase the likelihood of the race condition, as shown below:

    public final void cancel()
    {
        if (valid)
        {
            try
            {
                Thread.sleep(100);
            }
            catch (Exception e)
            {}
            valid = false;
            ((AbstractSelector)selector()).cancel(this);
        }
    }

the deadlock is easily reproducible on my sparc machine. Obviously we need to make the "valid check" and the "cancel" one atomic block.
15-09-2004
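
The closing remark of the evaluation, that the validity check and the cancel should form one atomic block, can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch that was shipped: Key is a hypothetical stand-in for a selection key, the counter stands in for the hand-off to the selector's cancelled-key set, and the choice to synchronize on the key itself is an assumption made for the sketch. It only demonstrates the check-then-act property, namely that exactly one of several concurrent cancel() callers performs the hand-off.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCancelSketch {

    /** Hypothetical stand-in for a selection key; not the JDK class. */
    static final class Key {
        private boolean valid = true;          // guarded by "this"
        private final AtomicInteger handOffs;  // stands in for the selector's cancelled-key set

        Key(AtomicInteger handOffs) {
            this.handOffs = handOffs;
        }

        /** Checks and clears 'valid' in one critical section, so only one caller wins. */
        void cancel() {
            synchronized (this) {
                if (valid) {
                    valid = false;
                    handOffs.incrementAndGet(); // stands in for selector.cancel(this)
                }
            }
        }
    }

    /** Cancels one key from several threads at once; returns the number of hand-offs. */
    public static int cancelConcurrently(int threads) {
        AtomicInteger handOffs = new AtomicInteger();
        Key key = new Key(handOffs);
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];

        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all workers together to provoke contention
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                key.cancel();
            });
            workers[i].start();
        }

        start.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return handOffs.get();
    }

    public static void main(String[] args) {
        System.out.println("hand-offs: " + cancelConcurrently(8));
    }
}
```

With the check and the state change under one monitor, Threads A and C from the scenario above can no longer both pass the "if (valid)" test for the same key.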