JDK-6378870 : Confusing error "java.net.SocketException: Invalid argument" for socket disconnection
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.net
  • Affected Version: 5.0u5,6u20
  • Priority: P4
  • Status: Closed
  • Resolution: Fixed
  • OS: solaris_9,windows_2008
  • CPU: x86,sparc
  • Submitted: 2006-01-30
  • Updated: 2011-05-18
  • Resolved: 2011-05-18
Fix versions
  • JDK 6: 6u22-rev (Fixed)
  • JDK 7: 7 b05 (Fixed)
Description
* socket.setTcpNoDelay(tcpNoDelay) reported the following error:

ERROR [org.apache.tomcat.util.net.PoolTcpEndpoint] Socket error caused by remote host /167.10.54.100
java.net.SocketException: Invalid argument
        at java.net.PlainSocketImpl.socketSetOption(Native Method)
        at java.net.PlainSocketImpl.setOption(Unknown Source)
        at java.net.Socket.setTcpNoDelay(Unknown Source)
        at org.apache.tomcat.util.net.PoolTcpEndpoint.setSocketOptions(PoolTcpEndpoint.java:503)
        at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:515)
        at org.apache.tomcat.util.net.MasterSlaveWorkerThread.run(MasterSlaveWorkerThread.java:112)
        at java.lang.Thread.run(Unknown Source)

* corresponding truss output:

/182: setsockopt(244, tcp, TCP_NODELAY, 0xFFFFFFFE907FEE80, 4, 1) Err#22 EINVAL 
/182: write(11, " 2 0 0 5 - 1 2 - 2 2   2".., 644) = 644 
/182:     Incurred fault #6, FLTBOUNDS  %pc = 0xFFFFFFFF3905CCC0 
/182:       siginfo: SIGSEGV SEGV_MAPERR addr=0x00000008 
/182:     Received signal #11, SIGSEGV [caught] 
/182:       siginfo: SIGSEGV SEGV_MAPERR addr=0x00000008 
 
/180: setsockopt(31, tcp, TCP_NODELAY, 0xFFFFFFFE90DFED80, 4, 1) Err#22 EINVAL 
/180: write(11, " 2 0 0 5 - 1 2 - 2 2   2".., 644) = 644 
/180: sysinfo(SI_HOSTNAME, "i240", 256)      = 5 
/180: door_info(4, 0xFFFFFFFE90DFB528)    = 0 

* JBoss support evaluated the problem and recommended that:

"we recommend Sun take a look at it to prevent further confusion for others
later.  Tomcat developers have already agreed to modify Tomcat to ignore your
error message when running in a Solaris environment.  This change should make
it into the next revision of Tomcat.

The problem seems to be specific to Solaris, and is just that Solaris reports
an EINVAL when most other implementations do not.  Apparently, this behavior
wasn't documented until Solaris 9, and that's why it wasn't accounted for.
Foo.java (written by JBoss Support) demonstrates the issue.

At a minimum, our team recommends updating Java documentation to note this
condition when running in Solaris.  It would be cute if the JVM could know that
Solaris behaves that way and react accordingly."

* Here's the evaluation:

"The EINVAL is in response to a TCP RST sent by the content switch.  The
content switch sent a TCP RST because Tomcat couldn't respond within the 3
seconds allowed by the content switch.  Tomcat couldn't respond in time because
the Garbage Collector was going wild.  The Garbage Collector was doing tons of
work in response to an application bug.

Therefore, you don't care why the EINVAL was there.  For the most part, that
has been accounted for (there was still one unexplained occurrence.  We'll
update you if we find any evidence regarding that.)

Your concern is just the JVM's reaction to the EINVAL in a Solaris environment,
should Sun care to pursue it.

2. The Sockets API in Java is not truly portable because it still closely
mirrors the behavior of the OS's internal socket implementation. The root of
the problem is that Solaris is unique in that calls to setsockopt can result in
an EINVAL if the underlying connection has closed. This behavior was not
documented on Solaris 8; it was finally documented in Solaris 9.

So, the JVM does not know the reason for the EINVAL, and thus it just passes it
up to the Java application as a SocketException. So they really aren't doing
anything wrong (since it is Solaris that is doing it). I would recommend
sending them Foo.java in case they want to add special code that relays a
different message, or maybe they want to update the documentation to
Socket.set*() to indicate the behavior on Solaris.

3. Tomcat treated SocketExceptions that occur on Socket.setTcpNoDelay() (and
others) as an error instead of a normal condition. This is because of the
following:

1. Most platforms do not return an error on calls to setsockopt
2. Solaris does do this, but it was not documented at the time the JVM and
Tomcat were developed.
3. The Tomcat error was difficult to reproduce, because it only occurs when a
client quickly closes its connection between the initial call to accept() and
the first call to setsockopt(). (This information was of course not known when
the problem was reported in the past, because no one has been able to gather
the data that shows how it occurs until now)
4. EINVAL is usually used to indicate a bad argument was passed to the call (in
fact this is what the Solaris 8 documentation says). This gives one the
impression of something wrong in the JVM, because it is the JVM's
responsibility to pass correct data structures to OS system calls.

So, while this condition is rare, it is still normal, and so future versions of
Tomcat will treat it as such, and no longer log it."
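
As an illustration of the point above, an application can treat a SocketException from Socket.setTcpNoDelay() on a freshly accepted socket as an ordinary disconnect rather than an error. The sketch below is not Tomcat's actual change; the class name SocketSetup and the method applyTcpNoDelay are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.net.Socket;
import java.net.SocketException;

// Hypothetical helper, not taken from Tomcat or the JDK.
class SocketSetup {
    /**
     * Tries to set TCP_NODELAY on an accepted socket.
     *
     * @return true if the option was applied; false if the socket was
     *         unusable (for example, reset by the peer), in which case it
     *         has been closed and the caller should simply drop it
     */
    static boolean applyTcpNoDelay(Socket socket, boolean on) {
        try {
            socket.setTcpNoDelay(on);
            return true;
        } catch (SocketException e) {
            // On Solaris this can be "Invalid argument" (EINVAL) after a
            // peer reset; here it is assumed to mean the connection is gone.
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful can be done if close() fails as well.
            }
            return false;
        }
    }
}
```

A caller would skip request processing when the method returns false. The trade-off is that a genuinely invalid option value is swallowed along with the disconnect case, so logging the exception at a low level may still be worthwhile.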


* testcase

----------------Foo.java---------------------
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;


public class Foo implements Runnable
{
   public int turn = SERVER;
   public static final int SERVER = 1;
   public static final int CLIENT = 2;

   public static void main(String[] args) throws Exception
   {
      ServerSocket server = null;
      Socket client = null;
      try{
        server = new ServerSocket(4444);
      } catch (IOException e) {
        System.out.println("Could not listen on port 4444");
        System.exit(-1);
      }

      Foo foo = new Foo();
      new Thread(foo).start();

      try{
        client = server.accept();
      } catch (IOException e) {
         System.out.println("Accept failed: " + e);
         System.exit(-1);
      }

      System.out.println("Accepted Socket");
      foo.handOff(CLIENT);
      foo.waitFor(SERVER);

      System.out.println("Setting TCP NO_DELAY");

      // on Solaris this throws SocketException: Invalid argument (EINVAL)
      client.setTcpNoDelay(false);

      // on all other OS's you will see a connection reset error here
      client.getInputStream().read();
      server.close();
   }

   public synchronized void waitFor(int who) throws InterruptedException
   {
      while (turn != who)
         wait();
   }

   public synchronized void handOff (int who) throws InterruptedException
   {
      turn = who;
      notify();
   }
   public void run()
   {
      try
      {
         Socket socket = new Socket("localhost", 4444);
         waitFor(CLIENT);
         System.out.println("Sending RST!");
         socket.setSoLinger(true, 0);
         socket.close();
         handOff(SERVER);
      }
      catch (Exception e)
      {
         throw new RuntimeException(e);
      }
   }
}
----------------Foo.java---------------------
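
A variant of the test case above can report which call surfaces the reset, which makes the platform difference easier to see. This is only a sketch: the class ResetProbe, the Outcome enum, the ephemeral port, and the 5-second timeouts are assumptions of this example and are not part of the original report.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;
import java.net.SocketTimeoutException;
import java.util.concurrent.CountDownLatch;

// Hypothetical probe, loosely based on Foo.java.
class ResetProbe {
    enum Outcome { SET_OPTION_FAILED, READ_FAILED, NO_ERROR }

    static Outcome probe() throws Exception {
        CountDownLatch accepted = new CountDownLatch(1);
        CountDownLatch resetSent = new CountDownLatch(1);
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            server.setSoTimeout(5000);            // do not wait forever in accept()
            int port = server.getLocalPort();
            Thread peer = new Thread(() -> {
                try {
                    Socket s = new Socket(loopback, port);
                    accepted.await();
                    s.setSoLinger(true, 0);       // linger 0: close() sends a RST
                    s.close();
                } catch (Exception e) {
                    e.printStackTrace();
                } finally {
                    resetSent.countDown();
                }
            });
            peer.start();
            try (Socket client = server.accept()) {
                client.setSoTimeout(5000);        // set while the connection is healthy
                accepted.countDown();
                resetSent.await();
                try {
                    client.setTcpNoDelay(false);  // the report says Solaris fails here
                } catch (SocketException e) {
                    return Outcome.SET_OPTION_FAILED;
                }
                try {
                    client.getInputStream().read(); // other platforms fail here
                } catch (SocketTimeoutException e) {
                    return Outcome.NO_ERROR;
                } catch (IOException e) {
                    return Outcome.READ_FAILED;
                }
                return Outcome.NO_ERROR;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Reset surfaced as: " + probe());
    }
}
```

Going by the report, SET_OPTION_FAILED would be expected on Solaris and READ_FAILED on most other platforms; the probe itself does not assume either.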

Comments
SUGGESTED FIX

*** /codereview/6378870/webrev/src/solaris/native/java/net/PlainSocketImpl.c
*** 1007,1016 ****
--- 1007,1026 ----
          optlen = sizeof(optval.i);
      }
      if (NET_SetSockOpt(fd, level, optname, (const void *)&optval, optlen) < 0) {
+ #ifdef __solaris__
+         if (errno == EINVAL) {
+             // On Solaris setsockopt will set errno to EINVAL if the socket
+             // is closed. The default error message is then confusing
+             char fullMsg[128];
+             jio_snprintf(fullMsg, sizeof(fullMsg), "Invalid option or socket reset by remote peer");
+             JNU_ThrowByName(env, JNU_JAVANETPKG "SocketException", fullMsg);
+             return;
+         }
+ #endif /* __solaris__ */
          NET_ThrowByNameWithLastError(env, JNU_JAVANETPKG "SocketException",
                                       "Error setting socket option");
      }
  }
29-09-2010

EVALUATION -- The issue here is not that the socket has been closed but rather that the peer has reset the connection. On Solaris the setsockopt(3SOCKET) call returns EINVAL when the connection is reset.
15-12-2006

EVALUATION Indeed, the Solaris documentation specifies that EINVAL will be returned when the socket is closed. However, the behavior is consistent with the documentation and other occurrences. For instance, if the socket had been closed, a SocketException would be thrown as well. Granted, the message in the exception is confusing, but that doesn't seem to be a huge issue. It would be a good idea to clarify the situation in the future, but it is hardly a high priority since only the message is likely to change.
02-02-2006