JDK-6782668 : (se) "IOException: Invalid argument" thrown on a call to Selector.select(value) with -d64
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio
  • Affected Version: 5.0u15
  • Priority: P3
  • Status: Closed
  • Resolution: Duplicate
  • OS: solaris_9
  • CPU: sparc
  • Submitted: 2008-12-09
  • Updated: 2010-04-02
  • Resolved: 2009-03-20
Description
OPERATING SYSTEM(S):
Solaris sparcv9/x64

FULL JDK VERSION(S):
java version "1.5.0_15"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_15-b04)
Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_15-b04, mixed mode)

DESCRIPTION:
CR 6322825 is marked as fixed in 1.5.0_08, but we are seeing the same issue on 1.5.0_15.

When running our application we see IOExceptions with the following stack trace:

java.io.IOException: Invalid argument
    at sun.nio.ch.DevPollArrayWrapper.poll0(Native Method)
    at sun.nio.ch.DevPollArrayWrapper.poll(DevPollArrayWrapper.java:164)
    at sun.nio.ch.DevPollSelectorImpl.doSelect(DevPollSelectorImpl.java:68)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
    ...

We are seeing the problem with Solaris 10, on sparcv9 and x64.

There are two workarounds available according to 6322825:

1. Use the poll based SelectorProvider by setting the
   java.nio.channels.spi.SelectorProvider system property to
   sun.nio.ch.PollSelectorProvider

2. Increase the hard limit on the number of file descriptors to 8193 or
   higher, i.e. check the hard limit with "ulimit -n -H" and, if it is less
   than 8193, edit /etc/system, add a "set rlim_fd_max=8193" command, and
   reboot
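
To confirm that workaround 1 has actually taken effect, something like the following can be run with and without the system property. This is only a minimal sketch: the class name SelectorProviderCheck and its helper methods are made up for illustration, and only standard java.nio calls are used.

```java
import java.io.IOException;
import java.nio.channels.Selector;
import java.nio.channels.spi.SelectorProvider;

// Hypothetical helper, not part of the application or the JDK.
public class SelectorProviderCheck {

    /** Returns the class name of the SelectorProvider the JVM selected. */
    public static String providerName() {
        return SelectorProvider.provider().getClass().getName();
    }

    /** Opens a selector with no registered channels and does one timed select. */
    public static int timedSelect(long timeoutMillis) throws IOException {
        Selector selector = Selector.open();
        try {
            // This is the call that fails with "Invalid argument" in the report.
            return selector.select(timeoutMillis);
        } finally {
            selector.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("property    = "
                + System.getProperty("java.nio.channels.spi.SelectorProvider"));
        System.out.println("provider    = " + providerName());
        System.out.println("select(100) = " + timedSelect(100));
    }
}
```

Launched as "java -d64 -Djava.nio.channels.spi.SelectorProvider=sun.nio.ch.PollSelectorProvider SelectorProviderCheck", the provider line should name the poll based provider. Without the property, it should name whatever default the platform picks, which the stack trace above suggests is the /dev/poll based implementation.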

On the x64 platform both these workarounds make the problem disappear.

However, on the sparc platform only workaround 1 works. Raising the FD limit has no effect there and we still see the exceptions.

We have tried to construct a testcase that demonstrates the problem using code similar to our application, but we have not yet been successful. The issue only occurs when our full application is running.

We would like to have the fix for 6322825 re-assessed, as it does not seem to have solved the problem (or we are hitting the problem in a new way). We would also like more details on workaround 1, especially on the impact of switching to the PollSelectorProvider in terms of CPU usage and performance. Performance is a very high priority for this application.

Comments
EVALUATION The customer has come back with the truss output we requested. It shows that the max is RLIM_INFINITY. This has the value -3, and negative limits are not handled by the current code. We will fix this for jdk7. If anyone happens to run into this, it can be worked around by removing rlim_fd_max from /etc/system or else setting it to a reasonable value.
20-02-2009
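
The effect of a negative limit can be illustrated with a small sketch. This is not the JDK's code. It assumes the poll array size is derived as min(limit - 1, 8192), an assumption inferred from the "8193 or higher" workaround. The class and method names are hypothetical.

```java
// Hypothetical illustration of sizing a poll array from the fd hard limit.
public class PollArraySizing {

    // Assumed cap, inferred from the "8193 or higher" workaround.
    static final int ASSUMED_CAP = 8192;

    /** Sizing that trusts the limit as a signed value. */
    public static int naiveSize(long fdLimit) {
        return (int) Math.min(fdLimit - 1, ASSUMED_CAP);
    }

    /** Sizing that treats negative or oversized limits as "unlimited". */
    public static int guardedSize(long fdLimit) {
        long limit = (fdLimit < 0 || fdLimit > Integer.MAX_VALUE)
                ? Integer.MAX_VALUE
                : fdLimit;
        return (int) Math.min(limit - 1, ASSUMED_CAP);
    }

    public static void main(String[] args) {
        long rlimInfinity = -3; // value reported in the evaluation above
        System.out.println("naive   = " + naiveSize(rlimInfinity));   // -4
        System.out.println("guarded = " + guardedSize(rlimInfinity)); // 8192
    }
}
```

Under that assumption, a limit of -3 yields a negative array size, which would be consistent with the "Invalid argument" error from the native poll call.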

EVALUATION Dave was able to get /etc/system, and rlim_fd_max is not set. There is no information yet as to whether this problem happens with jdk6. It would also be useful to trace usages of getrlimit(2) so that we can see the RLIMIT_NOFILE limits. This can be done with:
   $ truss -f -t getrlimit -v getrlimit java Application
16-12-2008
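
As a rough complement to the truss trace, the descriptor limit can also be read from inside the JVM. The sketch below uses com.sun.management.UnixOperatingSystemMXBean. The FdLimitReport class is hypothetical. The reported value is the limit as the running JVM sees it, which may differ from the hard limit printed by "ulimit -n -H".

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper for reporting the JVM's view of the fd limit.
public class FdLimitReport {

    /** Returns the max fd count, or null if the platform bean is unavailable. */
    public static Long maxFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return Long.valueOf(unix.getMaxFileDescriptorCount());
        }
        return null; // not a Unix-style JVM, or a different vendor's bean
    }

    /** Formats the limit for logging; a negative value hints at "unlimited". */
    public static String describe() {
        Long max = maxFileDescriptors();
        if (max == null) {
            return "max file descriptors: not available on this JVM";
        }
        return "max file descriptors: " + max;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```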

WORK AROUND Use the poll based SelectorProvider by setting the java.nio.channels.spi.SelectorProvider system property to sun.nio.ch.PollSelectorProvider
09-12-2008

EVALUATION There isn't sufficient information in the bug report to diagnose this. I've asked Dave to get the customer to send /etc/system so that we can see if rlim_fd_max is set (the default is 64k on Solaris 10 if not set). We are aware of a problem with getrlimit(2) when this is set to a negative value (see 6772303). It would also be useful to know if the customer has tried jdk6. That would help narrow down the problem, since we cannot duplicate the issue on any of our test systems.
09-12-2008