JDK-7130796 : (se) Selector.select may hang due to asynchronous close of socket in poll array [win]
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio
  • Affected Version: 6,7
  • Priority: P3
  • Status: Closed
  • Resolution: Duplicate
  • OS: generic,windows_7
  • CPU: generic,x86
  • Submitted: 2012-01-17
  • Updated: 2012-06-04
  • Resolved: 2012-02-20
Description
FULL PRODUCT VERSION :
java version "1.7.0_02"
Java(TM) SE Runtime Environment (build 1.7.0_02-b13)
Java HotSpot(TM) 64-Bit Server VM (build 22.0-b10, mixed mode)


ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows [Version 6.1.7601]

EXTRA RELEVANT SYSTEM CONFIGURATION :
Quad-core i7, Windows 64-bit

A DESCRIPTION OF THE PROBLEM :
Mark A. Ziesemer posted a testcase at https://bugs.eclipse.org/bugs/show_bug.cgi?id=357318#c2 that causes HTTP requests to hang (they never come back with a response). Upon further investigation we discovered that when a request hangs, it is blocked on sun.nio.ch.WindowsSelectorImpl.discardUrgentData().

When a "hang" occurs, Jetty will accept new connections and allow the client to send anything but it will not respond. CPU usage is zero.

I've personally run across this issue multiple times (without the testcase) and my client was only opening a single connection at a time. In other words, this issue occurs even under minimal load.

Please note that once the hang occurs you're forced to restart the server. You may lose active connections and data along the way.
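
One way to confirm the symptom described above from inside the affected JVM is to scan all live thread stacks for the frame named in the report. The following is only a minimal sketch, not part of the original report: the class `StuckFrameFinder` and its method `threadsIn` are hypothetical names chosen for illustration, and the class and method strings passed in `main` are taken from the description above. A `jstack` thread dump would give the same information from outside the process.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not from the bug report): lists the threads whose
// current stack contains a given class/method frame.
class StuckFrameFinder {

    static List<String> threadsIn(String className, String methodName) {
        List<String> names = new ArrayList<>();
        // Thread.getAllStackTraces() returns a snapshot of every live thread's stack.
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getClassName().equals(className)
                        && frame.getMethodName().equals(methodName)) {
                    names.add(entry.getKey().getName());
                    break; // one match per thread is enough
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Frame named in the description; on a healthy JVM this list is usually empty.
        System.out.println(threadsIn("sun.nio.ch.WindowsSelectorImpl", "discardUrgentData"));
    }
}
```

If the same thread name shows up in this frame across several samples taken seconds apart, that is consistent with the hang described here, although a single sample can also catch a thread that is merely passing through the method.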

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the testcase posted at https://bugs.eclipse.org/bugs/show_bug.cgi?id=357318#c2

The issue is 100% reproducible for me if JettyHangTest.ITERATIONS is increased to 10,000 or higher.


EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Testcase doesn't hang
ACTUAL -
Testcase hangs

REPRODUCIBILITY :
This bug can be reproduced always.

Comments
EVALUATION .
22-03-2012

EVALUATION There hasn't been any feedback from the submitter to know if they have tried recent JDK8 builds with the fix. It would be useful to know this before we consider pushing the changes to 7u6.
14-03-2012

EVALUATION The changes for 6346658 should be available in jdk8-b28. Those with an interest in this bug, please try out this build.
01-03-2012

EVALUATION This is fixed by the changes via 6346658 for jdk8. Once the changes have baked in jdk8 for a while, we should consider back-porting them to a jdk7 update, probably 7u6.
20-02-2012

EVALUATION I've changed the synopsis of this bug to properly reflect what this bug is about. Note that this is not a remote DoS, but rather something that this server is running into because it closes the SocketChannels while they are in use in a Selector. Fixing it is straightforward but it means bringing back 4960962. It's not possible to solve all issues without support from the operating system (Windows lacks it unfortunately).
24-01-2012

EVALUATION We've duplicated this on Windows 7 and the issue is a side effect of the changes in 4960962 where it is possible to close a socket while a thread is blocked in select. We have a potential fix but it will take a bit of time to test out.
20-01-2012

EVALUATION The socket is non-blocking and it is not immediately obvious how a thread may be blocked in discardUrgentData. It's possible this could be a resource issue on Windows but we should know more once we have duplicated it. It would be nice if we could also get a test case that doesn't require Jetty or other products.
18-01-2012
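
The evaluations above attribute the hang to a SocketChannel being closed while it is registered with a Selector that another thread is using, and ask for a test case that does not depend on Jetty. The following is a minimal sketch of that access pattern only, not the testcase from the Eclipse bug and not a verified reproducer: the class `AsyncCloseSketch`, the method `run`, the loopback setup, and the 200 ms select timeout are all assumptions made for illustration. Each round makes the registered channel readable and then races an asynchronous `close()` against `select()`.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical sketch (names are illustrative, not from the bug report).
class AsyncCloseSketch {

    /** Runs the close-versus-select race and returns how many rounds completed. */
    static int run(int iterations) throws Exception {
        int completed = 0;
        try (ServerSocketChannel server = ServerSocketChannel.open();
             Selector selector = Selector.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral loopback port
            for (int i = 0; i < iterations; i++) {
                SocketChannel client = SocketChannel.open(server.getLocalAddress());
                SocketChannel accepted = server.accept();
                accepted.configureBlocking(false);
                accepted.register(selector, SelectionKey.OP_READ);

                // Send one byte so the registered channel is (or soon becomes) readable.
                client.write(ByteBuffer.wrap(new byte[] {1}));

                // Close the registered channel from another thread while this
                // thread is entering or blocked in select().
                Thread closer = new Thread(() -> {
                    try {
                        accepted.close();
                    } catch (IOException ignored) {
                        // nothing useful to do in a sketch
                    }
                });
                closer.start();

                selector.select(200); // bounded wait; returns early if the key is ready
                closer.join();
                selector.selectedKeys().clear();
                client.close();
                selector.selectNow(); // lets the selector flush the cancelled key
                completed++;
            }
        }
        return completed;
    }

    public static void main(String[] args) throws Exception {
        int rounds = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
        System.out.println("completed rounds: " + run(rounds));
    }
}
```

On a build affected by this bug, a hang would presumably show up as `run` never returning, since the select timeout does not bound time spent in native code after the poll has returned. Whether this sketch actually hits the race on a given machine is untested here; the original report needed 10,000 or more iterations of the Jetty testcase.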