JDK-6445262 : Need analysis on nio error
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio
  • Affected Version: 5.0u4
  • Priority: P3
  • Status: Closed
  • Resolution: Duplicate
  • OS: linux_2.6
  • CPU: generic
  • Submitted: 2006-06-29
  • Updated: 2011-02-16
  • Resolved: 2006-07-12
Other
5.0u4: Resolved
Related Reports
Duplicate :  
Relates :  
Description
The complete details provided by the customer are below:

I am running a standalone JGroups test program that has completed and is trying to shut down 
our NIO TCP based communications connections and then the NIO Selector handling threads.  
At the time of failure, we are trying to close all communication connections.  
I am trying to close the connections from the main Java thread 
(the read + write Selector processing threads are still registered to receive event notifications for these connections).

I can probably work around the problem once I understand why the exception is generated.  
Would it be possible for you to explain what the exception means and what the native code is trying to do?


Test Case :
============
Attached.
 P.S.: capture.txt is a text file containing the logs seen on the customer's setup.

Steps To Reproduce The Problem Reported :
==========================================
Please place the attached files in a folder on your Linux machine and update tcp_nio.xml to use 
your machine's IP address (change the 2 instances of bind_addr="164.99.208.53").

Then open 4 terminal shells and start the ./sendernio.sh command in each one.  
Wait about 5 seconds after starting one before starting the next.  
The four instances will communicate with each other for a short while and then terminate the test.  
You will see the exception in some of the windows.

Also attached is capture.txt which is the output from one of my test runs.



The sequence of events in the Test Case is :
=============================================

1.  Wait for the configured number of members to connect (num_members=4).
2.  Each member sends the configured number of messages to other members (num_msgs=10000).
3.  The Java main entry point is in control of starting the test and after completion, stopping it.  
    The main thread shuts down the group communication layer and exits.

O/S Details :
==============
Novell Linux Desktop 9 (based on Suse Linux) 
uname -a output :

"Linux smarlow 2.6.5-7.243-default #1 Mon Dec 5 21:08:42 
UTC 2005 i686 i686 i386 GNU/Linux"


JDK Version Details :
======================
I tried both 1.5.0_04-b05 and 1.6.0-beta-b59g.  
Note that the 1.6-based output, included below from a previous message, has a slightly different error message (Thread signal failed).


We think the problem is caused by the way I have written the JGroups NIO code.  
I would like to correct this.  Previously, I was just ignoring the error but I want to fix it.


Error Messages :
==================

I am getting the following error and am not sure what it really means.  
It sounds like a bug between Java + native code but I'm just guessing.

A snippet of the exception stack trace follows :

With 1.5.0_04-b05 :
--------------------

Jun 27, 2006 3:33:44 PM org.jgroups.blocks.ConnectionTableNIO$Connection closeSocket
SEVERE: error closing socket connection
java.io.IOException: Invalid argument
        at sun.nio.ch.NativeThread.signal(Native Method)
        at sun.nio.ch.SocketChannelImpl.implCloseSelectableChannel(SocketChannelImpl.java:634)
        at java.nio.channels.spi.AbstractSelectableChannel.implCloseChannel(AbstractSelectableChannel.java:201)
        at java.nio.channels.spi.AbstractInterruptibleChannel.close(AbstractInterruptibleChannel.java:97)
        at org.jgroups.blocks.ConnectionTableNIO$Connection.closeSocket(ConnectionTableNIO.java:913)



With 1.6.0-beta-b59 :
-----------------------

java.io.IOException: Thread signal failed
        at sun.nio.ch.NativeThread.signal(Native Method)
        at sun.nio.ch.SocketChannelImpl.implCloseSelectableChannel(SocketChannelImpl.java:638)
        at java.nio.channels.spi.AbstractSelectableChannel.implCloseChannel(AbstractSelectableChannel.java:201)
        at java.nio.channels.spi.AbstractInterruptibleChannel.close(AbstractInterruptibleChannel.java:97)
        at org.jgroups.blocks.ConnectionTableNIO$Connection.closeSocket(ConnectionTableNIO.java:913)


Code snippet closing the socket :
----------------------------------

The org.jgroups.blocks.ConnectionTableNIO$Connection.closeSocket looks like this:

      void closeSocket()
      {
         if (sock_ch != null)
         {
            try
            {
               if(sock_ch.isConnected() && sock_ch.isOpen()) {
                  sock_ch.close();
               }
            }
            catch (Exception e)
            {
               log.error("error closing socket connection", e);
            }
            sock_ch = null;
         }
      }

The root cause of this problem remains to be determined.

Comments
EVALUATION The customer has verified that the issue goes away with 5.0u8.
12-07-2006

EVALUATION -- The stack traces show sun.nio.ch.NativeThread.signal failing as it tries to signal a nonexistent thread. The issue may therefore be 6380091, which is fixed in Mustang and 5.0u8.
11-07-2006

EVALUATION We do have a race condition in SocketChannelImpl: a reader thread sets and clears "readerThread" while another thread, in implCloseSelectableChannel (or shutdownInput/Output), signals the possibly blocked reader/writer. The setting and clearing of "readerThread" need to be synchronized on "stateLock", which we are already doing with the fix we put back for bug 6285901 in Mustang b86. Is it possible for the submitter to try out JDK 6 b86 to see whether the issue is gone?
29-06-2006
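
The locking discipline described in the 29-06-2006 evaluation can be illustrated with a small model. This is only a sketch and not the JDK source: the class GuardedChannelSketch and its methods are hypothetical, Thread.interrupt() stands in for the native NativeThread.signal call, and Thread.sleep() stands in for a blocking native read. The point it shows is that the reader registers and unregisters itself under the same lock that close() holds while signalling, so close() can only signal a thread that is still inside read().

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model of a channel that tracks its blocked reader thread. */
final class GuardedChannelSketch {
    private final Object stateLock = new Object();
    private Thread readerThread;          // null when no thread is inside read()
    private boolean open = true;
    private final AtomicInteger signalsSent = new AtomicInteger();

    /** Simulated blocking read; returns false if closed or woken by close(). */
    boolean read(long blockMillis) {
        synchronized (stateLock) {
            if (!open) {
                return false;
            }
            readerThread = Thread.currentThread();   // register under the lock
        }
        try {
            Thread.sleep(blockMillis);    // stand-in for a blocking native read
            return true;
        } catch (InterruptedException e) {
            return false;                 // woken by close()
        } finally {
            synchronized (stateLock) {
                readerThread = null;      // unregister under the same lock
            }
        }
    }

    /** Closes the channel and wakes the reader, if one is registered. */
    void close() {
        synchronized (stateLock) {
            open = false;
            if (readerThread != null) {
                // Stand-in for NativeThread.signal: the target is known to be
                // registered because we hold stateLock while checking it.
                readerThread.interrupt();
                signalsSent.incrementAndGet();
            }
        }
    }

    boolean hasReader() {
        synchronized (stateLock) {
            return readerThread != null;
        }
    }

    boolean isOpen() {
        synchronized (stateLock) {
            return open;
        }
    }

    int signalsSent() {
        return signalsSent.get();
    }
}
```

In this model, calling close() when no thread is reading sends no signal at all, and calling it while a thread is blocked in read() wakes exactly that thread. Without the synchronized blocks around the assignments to readerThread, close() could observe a stale value and try to signal a thread that has already left read(), which is the kind of failure the stack traces above suggest.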