JDK-4881228 : (so) Selector.select() fails on OP_ACCEPT when the network is unplugged (wxp)

Details
Type: Bug
Submit Date: 2003-06-19
Status: Closed
Updated Date: 2012-10-10
Project Name: JDK
Resolved Date: 2004-02-06
Component: core-libs
OS: windows_xp,windows_2000
Sub-Component: java.nio
CPU: x86
Priority: P3
Resolution: Fixed
Affected Versions: 1.4.2,1.4.2_02,1.4.2_03
Fixed Versions: 1.4.2_05 (05)

Description

Name: rmT116609			Date: 06/19/2003


FULL PRODUCT VERSION :
java version "1.4.2-beta"
Java(TM) 2 Runtime Environment, standard Edition (build 1.4.2-beta-b19)
Java HotSpot(TM) Client VM (build 1.4.2-beta-b19, mixed mode)



FULL OS VERSION :
Microsoft Windows XP [Version 5.1.2600]

A DESCRIPTION OF THE PROBLEM :
When a non-blocking server socket channel is registered with a selector for OP_ACCEPT and Windows detects that the network cable is unplugged, Selector.select() stops blocking and returns 0 immediately.  The selected-key set is empty.  This continues even after the network cable is plugged in again.

This is a problem because according to most examples, Selector.select() is safe to put in an infinite loop.  The abnormal behavior of continuously returning immediately when there are no connections to be accepted causes the loop accepting connections to spin, hogging the CPU.

This seems to be a Windows-specific problem.  The Apple JVM for Mac OS X does not spin when the network is unplugged.  Instead, Selector.select() remains blocked, as it should according to the spec.

I tested with Apple VM version:
java version "1.4.1_01"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_01-39)
Java HotSpot(TM) Client VM (build 1.4.1_01-14, mixed mode)


STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the example code, following directions.  When prompted, unplug the network cable and wait 30 seconds.  Shortly, the program will begin printing thousands of lines of output.  At this point, ctrl-c the program, as it will not exit.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The program should not print anything out while the network connection is unplugged.  The operation Selector.select() should only return when another operation specified in the documentation occurs.  When the network is unplugged, nothing is connecting to the selector, and thus, no output should be printed.
ACTUAL -
The program starts printing hundreds of lines of output beginning with "<accept loop>".  These are all printed from the accept loop, which no longer blocks on Selector.select() but passes straight through it, spinning in an infinite loop.

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.util.*;
import java.nio.channels.*;
import java.io.*;
import java.net.*;

/**
 * <p>Title: Windows NIO bug test case</p>
 * <p>Description: This is a simple program to run to test the Windows NIO problem with Selector.select() when
 * it is selecting for OP_ACCEPT on a ServerSocketChannel and the network cable is unplugged. Simply run
 * the program on Windows XP and unplug the network cable when instructed. </p>
 * <p>Copyright: Copyright (c) 2003 Heroix Corp.</p>
 * <p>Company: Heroix Corp</p>
 * @author Jonthan Hess (###@###.###)
 * @version 1.0
 */

public class WindowsNioBug implements Runnable {
    /** The connection selector for the server socket.  This is used to find connections */
    private Selector connectionSelector;

    /**The server socket that is listening for connections */
    private ServerSocketChannel ssc;

    /**The socket address that we are listening on (port 8987, set in the constructor) */
    private SocketAddress address;

    public WindowsNioBug() {
        address = new InetSocketAddress(8987);
        System.out.println("Opening socket on port 8987");
        try {
            ssc = ServerSocketChannel.open();
            this.connectionSelector = Selector.open();

            this.ssc.configureBlocking(false);
            this.ssc.socket().bind(address);
            this.ssc.register(this.connectionSelector, SelectionKey.OP_ACCEPT);
        }
        catch (IOException ex) {
            System.out.println("Error opening the socket on port 8987");
            ex.printStackTrace();
        }
    }

    public static void main(String[] args) {
        System.out.println("Creating new instance of WindowsNioBug");
        WindowsNioBug bug = new WindowsNioBug();
        Thread t = new Thread(bug);
        t.start();

        System.out.println("Waiting 3 seconds, then attempting to connect");
        try {
            Thread.sleep(3000);
        }
        catch (InterruptedException ex) {
        }
        System.out.println("Now attempting to connect");
        try {
            Socket s = new Socket();
            s.connect(bug.address);
            s.close();
            System.out.println("Done connecting");
        }
        catch (IOException ex1) {
            System.out.println("Error while connecting to server socket");
            ex1.printStackTrace();
        }
        System.out.println("Now, unplug your network cable and wait about 15 seconds.  The loop in listenForConnections() will spin infinitely around Selector.select()");
        System.out.println("This is not the expected behavior");
    }
    public void run() {
        System.out.println("Listening for connections.  Telnet to port 8987 to prove that it's working");
        listenForConnections();
    }
    /**
     * Listens for connections on the server socket. Each accepted
     * SocketChannel is reported and then closed immediately.
     */
    private void listenForConnections() {
        System.out.println("listenForConnections");
        try {
            while (true) {
                // this next operation is the one that is broken
                int keys = connectionSelector.select(); // select all connections that need to be serviced, blocks until service is required
                
                System.out.println("<accept loop>  connectionSelector.select() has returned "+keys );
                System.out.println("<accept loop> ssc.isOpen() "+ssc.isOpen() );
                System.out.println("<accept loop> ssc.isRegistered() "+ssc.isRegistered() );
                System.out.println("<accept loop> ssc.isBlocking() "+ssc.isBlocking() );


                // iterate through available connections
                for (Iterator i = connectionSelector.selectedKeys().iterator();
                     i.hasNext(); ) {
                    SelectionKey key = (SelectionKey) i.next();
                    i.remove();
                    System.out.println("<accept loop> Selected key: "+key);

                    ServerSocketChannel readyServer = (ServerSocketChannel) key.
                            channel();
                    SocketChannel sc = readyServer.accept(); // this is non-blocking, returns null if none are ready

                    if (sc != null) {
                       System.out.println("<accept loop> The connection was good, closing");
                       sc.close();
                    }

                }
            }
        } catch (IOException ex) {
            System.out.println("<accept loop> IOException while handling connections");
            ex.printStackTrace();
        } finally {
            System.out.println("<accept loop> exiting AcceptConnectionThread.listenForConnections");
        }
    }

}
---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Check if the operating system is Windows.  If it is and Selector.select() returns 0, wait 100 milliseconds before continuing.  If 0 is returned several times in a row, close the socket and attempt to re-open it at less frequent intervals.
(Review ID: 188371) 
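
The workaround above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name SelectSpinGuard, its method names, and the configurable threshold are hypothetical and do not come from the report. The report itself gives only the 100 millisecond pause and "several times in a row". Closing and re-opening the socket, and backing off between attempts, is left to the caller.

```java
import java.util.Locale;

/** Hypothetical helper: counts consecutive zero-key returns from Selector.select(). */
final class SelectSpinGuard {
    private final long pauseMillis;     // the report suggests 100 ms
    private final int reopenThreshold;  // assumption: what "several times in a row" means
    private int consecutiveZeros;

    SelectSpinGuard(long pauseMillis, int reopenThreshold) {
        this.pauseMillis = pauseMillis;
        this.reopenThreshold = reopenThreshold;
    }

    /** The workaround is only meant to be applied on Windows. */
    static boolean isWindows() {
        return System.getProperty("os.name", "")
                .toLowerCase(Locale.ROOT)
                .startsWith("windows");
    }

    /**
     * Call with the value returned by select(). Returns true when the caller
     * should close the server socket and try to re-open it; otherwise pauses
     * briefly after a zero return so the loop cannot hog the CPU.
     */
    boolean shouldReopen(int selectedCount) {
        if (selectedCount > 0) {
            consecutiveZeros = 0;
            return false;
        }
        consecutiveZeros++;
        if (consecutiveZeros >= reopenThreshold) {
            consecutiveZeros = 0;
            return true;
        }
        try {
            Thread.sleep(pauseMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return false;
    }
}
```

In the accept loop of the test case, one would call something like `if (SelectSpinGuard.isWindows() && guard.shouldReopen(keys)) { ... }` right after `connectionSelector.select()`. Note that select() can legitimately return 0 after Selector.wakeup(), so a guard like this may add a short delay in that case too.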
======================================================================

                                    

Comments
EVALUATION

Not a Mantis showstopper.  -- ###@###.### 2003/6/24

Selector spin can be caused by these two situations, both of which
are the result of ready to write being identical to ready to connect
at the native level.

1. The channel is not connected, and the key is interested in writing
   and ready for writing. Because the key is ready for something it
   is interested in, the select operation returns immediately, but the
   key is not marked as ready for write, because the channel is not
   in the connected state. The key is therefore not added to the selected
   set, so the selector spins and returns 0.

2. The channel is connected, and the key is interested in connecting
   and ready for connecting. Because the key is ready for something it
   is interested in, the select operation returns immediately, but the
   key is not marked as ready to connect, because the channel is already
   connected. The key is therefore not added to the selected set, so the
   selector spins and returns 0.

###@###.### 2003-08-13



The pipe used for Selector wakeup is based on a socket connection
through the network interface. When the network cable is disconnected 
this connection is disrupted and all subsequent selects will complete
immediately. The issue can be addressed by basing the pipe implementation
on a connection through the loopback interface.
###@###.### 2004-01-07
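
The idea of basing the pipe on a loopback connection can be illustrated with a minimal sketch. This is not the JDK's actual implementation. The class LoopbackPipe and its field names are hypothetical, and the code uses APIs (InetAddress.getLoopbackAddress(), ServerSocketChannel.bind()) that appeared in releases later than the ones this report covers.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

/** Hypothetical illustration: a two-ended pipe built from a TCP connection over loopback. */
final class LoopbackPipe implements AutoCloseable {
    final SocketChannel source; // read end; a selector would watch this for readability
    final SocketChannel sink;   // write end; writing a byte here makes the source readable

    private LoopbackPipe(SocketChannel source, SocketChannel sink) {
        this.source = source;
        this.sink = sink;
    }

    static LoopbackPipe open() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Temporary listener on an ephemeral port, bound to the loopback
        // interface only, so the connection never touches the physical network.
        try (ServerSocketChannel listener = ServerSocketChannel.open()) {
            listener.bind(new InetSocketAddress(loopback, 0));
            SocketChannel sink = SocketChannel.open(listener.getLocalAddress());
            try {
                SocketChannel source = listener.accept();
                return new LoopbackPipe(source, sink);
            } catch (IOException e) {
                sink.close(); // do not leak the connected end if accept fails
                throw e;
            }
        }
    }

    @Override
    public void close() throws IOException {
        try {
            sink.close();
        } finally {
            source.close();
        }
    }
}
```

Because both ends are bound to the loopback address, unplugging the network cable should not disrupt the connection. A production version would also need to check that the accepted connection really came from its own sink, since another local process could connect to the temporary listener first.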
                                     
2004-01-07
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX:
1.4.2_05
generic
tiger-beta2

FIXED IN:
1.4.2_05
tiger-beta2

INTEGRATED IN:
1.4.2_05
tiger-b38
tiger-beta2

VERIFIED IN:
1.4.2_05


                                     
2004-09-20


