JDK-6791815 : Fix for 6471657 can cause deadlock on non-Solaris platforms when initializing direct buffer support
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: hs10,6u14
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2009-01-09
  • Updated: 2012-10-08
  • Resolved: 2009-02-11
Fixed in:
  • JDK 6: 6u18
  • JDK 7: 7
  • Other: hs15
Related Reports
Duplicate :  
Description
The fix for 6471657 introduced a ThreadInVMfromNative transition prior to calling os::yield_all from a thread that needed to wait for another thread to complete initialization of the direct buffer support. This was needed because on Solaris yield_all calls os::sleep, and os::sleep (due to a then-recent change) requires that the calling thread be threadInVM.

However, on Linux yield_all() simply calls sched_yield() and there is no need to be threadInVM. Further, the state change leads to a deadlock scenario. The thread that is busy-waiting for initialization to complete is marked as being threadInVM, so if the thread doing the initialization triggers a safepoint, the VMThread will wait for this busy-waiting thread to reach the safepoint. But the busy-waiting thread never checks for a safepoint while it spins, so that never happens and we get a deadlock.

Similarly on Windows, yield_all() is a win32 Sleep call, so the same deadlock could occur.

There doesn't seem to be an easy solution for positioning the state transition, as it is only needed when yield_all is called from this code on Solaris, but this code is platform independent. You can't unconditionally place the transition inside yield_all because all the other locations that call it already have the thread state set correctly - and a conditional insertion of a thread-state transition requires that the internal logic be duplicated, because the transition objects operate at block scope.
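
To illustrate the block-scope point, here is a minimal sketch of a scoped transition object. FakeThread, ScopedInVM, yield_body and yield_with_optional_transition are hypothetical stand-ins invented for this example; they are not HotSpot classes and do not model safepoints. The sketch only shows why making the transition conditional forces the guarded logic to appear in both branches.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; none of these names are HotSpot's.
enum class ThreadState { InNative, InVM };

struct FakeThread {
  ThreadState state = ThreadState::InNative;
  std::vector<std::string> log;  // records what happened, in order
};

// Scoped transition: switches to InVM on construction and restores the
// previous state when the enclosing block ends.
class ScopedInVM {
 public:
  explicit ScopedInVM(FakeThread& t) : t_(t), saved_(t.state) {
    t_.state = ThreadState::InVM;
    t_.log.push_back("enter InVM");
  }
  ~ScopedInVM() {
    t_.state = saved_;
    t_.log.push_back("leave InVM");
  }
  ScopedInVM(const ScopedInVM&) = delete;
  ScopedInVM& operator=(const ScopedInVM&) = delete;

 private:
  FakeThread& t_;
  ThreadState saved_;
};

// The logic that may or may not need the transition around it.
void yield_body(FakeThread& t) { t.log.push_back("yield"); }

// The guard only covers the block in which it is declared, so a
// conditional transition means the body is written once per branch.
void yield_with_optional_transition(FakeThread& t, bool needs_transition) {
  if (needs_transition) {
    ScopedInVM guard(t);
    yield_body(t);  // runs while the thread is marked InVM
  } else {
    yield_body(t);  // same logic, repeated without the guard
  }
}
```

In this toy the duplicated body is a single call; in real code it would be whatever logic has to run under the transition.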

Another solution is to replace the current cmpxchg + busy-wait logic with a traditional mutex/condvar solution using wait/signal. A lock-free fast-path will also be needed for performance.
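
A minimal sketch of that alternative follows, written with standard C++ primitives rather than HotSpot's own mutex and monitor classes. OneTimeInit, ensure and count_initializer_runs are hypothetical names chosen for this example. It assumes the initializer does not throw, and it says nothing about how a blocking wait inside the VM would have to interact with safepoints.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sketch; this is not the HotSpot implementation.
class OneTimeInit {
 public:
  enum State { kUninitialized, kInProgress, kSucceeded, kFailed };

  // Runs `init` at most once across all callers and returns whether
  // initialization succeeded. `init` must return bool and must not throw.
  template <typename Fn>
  bool ensure(Fn init) {
    // Lock-free fast path: once initialization has finished, no lock is taken.
    State s = state_.load(std::memory_order_acquire);
    if (s == kSucceeded) return true;
    if (s == kFailed) return false;

    std::unique_lock<std::mutex> lock(mutex_);
    if (state_.load(std::memory_order_relaxed) == kUninitialized) {
      // This thread won the race and performs the initialization.
      state_.store(kInProgress, std::memory_order_relaxed);
      lock.unlock();  // do not hold the lock while initializing
      bool ok = init();
      lock.lock();
      state_.store(ok ? kSucceeded : kFailed, std::memory_order_release);
      done_.notify_all();
    } else {
      // Losers of the race block here instead of spinning on yield.
      done_.wait(lock, [this] {
        State cur = state_.load(std::memory_order_acquire);
        return cur == kSucceeded || cur == kFailed;
      });
    }
    return state_.load(std::memory_order_acquire) == kSucceeded;
  }

 private:
  std::atomic<State> state_{kUninitialized};
  std::mutex mutex_;
  std::condition_variable done_;
};

// Usage sketch: several threads race, and the function reports how many
// times the initializer body actually ran.
int count_initializer_runs(int n_threads) {
  OneTimeInit once;
  std::atomic<int> runs{0};
  std::vector<std::thread> threads;
  for (int i = 0; i < n_threads; ++i) {
    threads.emplace_back([&] {
      once.ensure([&] {
        runs.fetch_add(1);
        return true;
      });
    });
  }
  for (auto& t : threads) t.join();
  return runs.load();
}
```

Waiting threads sleep on the condition variable until the initializing thread signals completion, so they no longer need to call a yield function in a loop; callers that arrive after initialization has finished take only the atomic load.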

Comments
PUBLIC COMMENTS As per 6852404, the deadlock scenario exists on Solaris as well. If a safepoint is initiated while a thread is doing the initialization, the threads waiting for initialization to complete would never reach a safepoint.
19-06-2009

EVALUATION http://hg.openjdk.java.net/jdk7/hotspot-rt/hotspot/rev/4db4e58c16bd
14-01-2009

PUBLIC COMMENTS There is an easy solution: moving the transition inside the loop:

       } else {
    -    ThreadInVMfromNative tivn(thread); // set state as yield_all can call os:sleep
         while (!directBufferSupportInitializeEnded && !directBufferSupportInitializeFailed) {
    +      ThreadInVMfromNative tivn(thread); // set state as yield_all can call os:sleep
           os::yield_all();
         }
       }

Thanks to Tom R.
09-01-2009

EVALUATION See description.
09-01-2009

WORK AROUND The problem arises when there is a race between two or more threads to initialize the direct buffer support. The workaround is to ensure that a direct buffer is created early by a single thread, before any other thread can attempt it, which avoids the initialization race.
09-01-2009