JDK-7132889 : (se) AbstractSelectableChannel.register and configureBlocking not safe from asynchronous close
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio
  • Affected Version: 6,7
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic,linux
  • CPU: generic,x86
  • Submitted: 2012-01-24
  • Updated: 2013-06-26
  • Resolved: 2012-09-10
  • Fixed in JDK 7: 7u40
  • Fixed in JDK 8: 8 b55
Related Reports
Duplicate :  
Description
SYNOPSIS
--------
Race conditions in NIO Selector code

OPERATING SYSTEMS
----------------
All (discovered on Windows)

FULL JDK VERSIONS
-----------------
All (discovered on Java 6).
Not tested on JDK 7.

PROBLEM DESCRIPTION from LICENSEE
---------------------------------
We have identified a narrow timing window in AbstractSelectableChannel.implCloseChannel() in which a channel's file descriptor can be closed while a selector.select() operation is in progress. In the current implementation, the channel is closed first at the native level; only afterwards does the closing thread acquire keyLock and cancel the key associated with the channel. This ordering (incorrectly) always leaves a timing window between closing the channel and cancelling the corresponding key. The window widens when acquiring keyLock is delayed by lock contention.

We can reproduce the problem only during stress testing of one of our large products; we do not have a small standalone testcase. However, the problem can be simulated by modifying the AbstractSelectableChannel class so that there is a sleep between the call to implCloseSelectableChannel() and the acquisition of keyLock ("synchronized (keyLock)") in AbstractSelectableChannel.implCloseChannel().
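
The ordering described above can be illustrated with a toy model. This is only a sketch and not the JDK source: the class CloseWindowModel, its Key type, and the betweenSteps hook are hypothetical names introduced here, and the hook stands in for the sleep suggested above so that the window can be observed deterministically.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of "close the descriptor first, cancel keys second". Not JDK code. */
class CloseWindowModel {
    /** Hypothetical stand-in for a SelectionKey. */
    static final class Key {
        volatile boolean valid = true;
    }

    private final Object keyLock = new Object();
    private final List<Key> keys = new ArrayList<>();
    private volatile boolean fdOpen = true;

    Key addKey() {
        Key k = new Key();
        synchronized (keyLock) {
            keys.add(k);
        }
        return k;
    }

    boolean isFdOpen() {
        return fdOpen;
    }

    /** Closes in the reported order; betweenSteps runs inside the timing window. */
    void closeDescriptorFirst(Runnable betweenSteps) {
        fdOpen = false;            // models implCloseSelectableChannel()
        betweenSteps.run();        // models the sleep placed before synchronized (keyLock)
        synchronized (keyLock) {   // keys are cancelled only now
            for (Key k : keys) {
                k.valid = false;
            }
        }
    }

    /** Returns true if a still-valid key was seen for an already closed descriptor. */
    static boolean windowObserved() {
        CloseWindowModel ch = new CloseWindowModel();
        Key key = ch.addKey();
        boolean[] seen = {false};
        ch.closeDescriptorFirst(() -> seen[0] = !ch.isFdOpen() && key.valid);
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("stale key observed in window: " + windowObserved());
    }
}
```

In this model the hook always sees a valid key for a closed descriptor, which corresponds to the state a concurrent select() could encounter.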

The problem can be fixed if we change the implementation such that the keys are cancelled first, before closing the corresponding channel.

We also identified another small timing window between registering the key and closing the channel, caused by the fact that these two operations acquire different locks (regLock and keyLock). While closing the channel, the JDK has to ensure that no more keys are registered by the selector for that channel. The problem can be avoided if AbstractSelectableChannel.implCloseChannel() acquires regLock as well as keyLock. We have run performance benchmark tests with the proposed fix in place and see no noticeable performance degradation.

SUGGESTED FIX
-------------
The following diff is based on "6u27-b05/j2se/src/share/classes/java/nio/channels/spi/AbstractSelectableChannel.java"

165a166
>         synchronized (regLock) {
170c171
<       synchronized (regLock) {
---
>
201,202c202,203
<       implCloseSelectableChannel();
<       synchronized (keyLock) {
---
>         synchronized (regLock) {
>          synchronized (keyLock) {
209c210,212
<       }
---
>         }
>         implCloseSelectableChannel();
>        }
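
The lock ordering proposed in the diff above can be sketched on a simplified model. This is an illustration under stated assumptions, not the patched JDK class: LockOrderModel, its Key type, and the beforeDescriptorClose hook are hypothetical, and the model folds the channel's open flag into close(), whereas the real class sets it before implCloseChannel() is called.

```java
import java.nio.channels.ClosedChannelException;
import java.util.ArrayList;
import java.util.List;

/** Toy model of the suggested ordering: regLock, then keyLock, then close. Not JDK code. */
class LockOrderModel {
    /** Hypothetical stand-in for a SelectionKey. */
    static final class Key {
        volatile boolean valid = true;
    }

    private final Object regLock = new Object();
    private final Object keyLock = new Object();
    private final List<Key> keys = new ArrayList<>();
    private volatile boolean open = true;
    private volatile boolean fdOpen = true;

    /** The open check and the key insertion both happen under regLock. */
    Key register() throws ClosedChannelException {
        synchronized (regLock) {
            if (!open) {
                throw new ClosedChannelException();
            }
            Key k = new Key();
            synchronized (keyLock) {
                keys.add(k);
            }
            return k;
        }
    }

    /** Cancels keys first and closes the descriptor last, all while holding regLock. */
    void close(Runnable beforeDescriptorClose) {
        synchronized (regLock) {
            open = false;
            synchronized (keyLock) {
                for (Key k : keys) {
                    k.valid = false;
                }
            }
            beforeDescriptorClose.run(); // observation point for the sketch
            fdOpen = false;              // models implCloseSelectableChannel()
        }
    }

    /** True if the key was already invalid while the descriptor was still open. */
    static boolean keysCancelledBeforeDescriptorClose() throws ClosedChannelException {
        LockOrderModel ch = new LockOrderModel();
        Key key = ch.register();
        boolean[] ok = {false};
        ch.close(() -> ok[0] = !key.valid && ch.fdOpen);
        return ok[0];
    }

    /** True if register() is rejected once close() has completed. */
    static boolean registerAfterCloseRejected() {
        LockOrderModel ch = new LockOrderModel();
        ch.close(() -> { });
        try {
            ch.register();
            return false;
        } catch (ClosedChannelException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws ClosedChannelException {
        System.out.println("keys cancelled before close: " + keysCancelledBeforeDescriptorClose());
        System.out.println("register after close rejected: " + registerAfterCloseRejected());
    }
}
```

Because register() and close() contend on the same regLock in this model, a registration either completes before the keys are cancelled or is rejected afterwards; it cannot slip in between.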

Comments
EVALUATION As per the discussion on nio-dev, there are two race conditions that need to be fixed:
1. Calling register at around the time that the channel is closed means it is possible for register to return a SelectionKey that is never invalidated.
2. Calling configureBlocking at around the time that the channel is closed can cause implConfigureBlocking to operate on a closed file descriptor.
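
For reference, the sequential (non-racing) behaviour that both cases should preserve can be checked with the public java.nio API. The sketch below does not reproduce the races. It only exercises the documented outcomes once close() has completed, and the class name ClosedChannelChecks and its helper methods are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

/** Sequential checks of the behaviour expected after close(); no concurrency involved. */
class ClosedChannelChecks {
    /** A key obtained before close() should be invalid afterwards. */
    static boolean keyInvalidAfterClose() {
        try (Selector sel = Selector.open()) {
            SocketChannel ch = SocketChannel.open();
            ch.configureBlocking(false);
            SelectionKey key = ch.register(sel, SelectionKey.OP_CONNECT);
            ch.close();
            return !key.isValid();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** register() on a closed channel should throw ClosedChannelException. */
    static boolean registerOnClosedChannelRejected() {
        try (Selector sel = Selector.open()) {
            SocketChannel ch = SocketChannel.open();
            ch.configureBlocking(false);
            ch.close();
            try {
                ch.register(sel, SelectionKey.OP_CONNECT);
                return false;
            } catch (ClosedChannelException expected) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** configureBlocking() on a closed channel should throw ClosedChannelException. */
    static boolean configureBlockingOnClosedChannelRejected() {
        try {
            SocketChannel ch = SocketChannel.open();
            ch.close();
            try {
                ch.configureBlocking(false);
                return false;
            } catch (ClosedChannelException expected) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("key invalid after close: " + keyInvalidAfterClose());
        System.out.println("register rejected: " + registerOnClosedChannelRejected());
        System.out.println("configureBlocking rejected: " + configureBlockingOnClosedChannelRejected());
    }
}
```

The two races arise when close() runs concurrently with these calls, so that the open check passes just before the channel is closed.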
21-08-2012

EVALUATION The implementation has changed in JDK 8 so that sockets are no longer closed while being polled by a Selector (closing them during a poll was a bug). We need to study the locking to see if the timing windows suggested here are an issue.
24-02-2012