The fix for 6471657 introduced a ThreadInVMFromNative transition prior to calling os::yield_all from a thread that needed to wait for another thread to complete initialization of the direct buffer support. This was needed because on Solaris yield_all calls os::sleep, and os::sleep (due to a then-recent change) requires that the calling thread be threadInVM.
However, on Linux yield_all() simply calls sched_yield(), and there is no need to be threadInVM. Further, the state change leads to a deadlock scenario. The thread that is busy-waiting for initialization to complete is marked as being threadInVM, so if the thread doing the initialization triggers a safepoint, the VMThread will wait for this busy-waiting thread to reach the safepoint. But that will never happen, because the busy-wait loop never checks for a pending safepoint, and so we get a deadlock.
Similarly, on Windows yield_all() is a Win32 Sleep call, so the same deadlock could occur.
There doesn't seem to be an easy solution for positioning the state transition, as it is only needed when yield_all is called from this code on Solaris, but this code is platform independent. You can't unconditionally place the transition inside yield_all, because all the other locations that call it already have the thread state set correctly, and a conditional insertion of a thread-state transition requires that the internal logic be duplicated, because the transition objects operate at block scope.
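To illustrate the block-scope problem, here is a minimal sketch using stand-in types rather than the real HotSpot classes. MockThread, ScopedInVM, yield_body and yield_all_sketch are all hypothetical names; ScopedInVM only mimics the general idea of a transition object that changes the thread state in its constructor and restores it in its destructor.

```cpp
#include <cassert>

// Hypothetical stand-ins for the VM's thread states; not HotSpot's real types.
enum class ThreadState { InNative, InVM };

struct MockThread {
    ThreadState state = ThreadState::InNative;
};

// Scoped transition: the constructor switches the state to InVM and the
// destructor restores whatever state the thread had before.
class ScopedInVM {
public:
    explicit ScopedInVM(MockThread& t) : thread_(t), saved_(t.state) {
        thread_.state = ThreadState::InVM;
    }
    ~ScopedInVM() { thread_.state = saved_; }
    ScopedInVM(const ScopedInVM&) = delete;
    ScopedInVM& operator=(const ScopedInVM&) = delete;
private:
    MockThread& thread_;
    ThreadState saved_;
};

// Stand-in for the internal logic of yield_all; it just reports the state
// it was executed in.
ThreadState yield_body(const MockThread& t) { return t.state; }

// Conditional transition: the transition only lasts for the block in which
// the object is declared, so the body has to appear in both branches.
ThreadState yield_all_sketch(MockThread& t, bool needs_transition) {
    if (needs_transition) {
        ScopedInVM tiv(t);
        return yield_body(t);   // runs with the state set to InVM
    } else {
        return yield_body(t);   // runs in the caller's existing state
    }
}
```

In this toy version the duplicated logic is a single call, but if the real logic were written inline it would have to be repeated in both branches or first factored out into a helper.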
Another solution is to replace the current cmpxchg + busy-wait logic with a traditional mutex/condvar solution using wait/signal. A lock-free fast-path will also be needed for performance, so that threads arriving after initialization has completed do not have to take the lock.
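A rough sketch of what the mutex/condvar approach with a lock-free fast path could look like is shown here. It uses standard C++ primitives rather than the VM's own lock classes, and OneTimeInit, ensure and run_demo are hypothetical names chosen for illustration only. It also assumes the initialization function does not throw.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

class OneTimeInit {
public:
    // Runs init() exactly once; concurrent callers block until it completes.
    template <typename F>
    void ensure(F init) {
        // Lock-free fast path: once initialization is published, no lock is taken.
        if (state_.load(std::memory_order_acquire) == Done) return;

        std::unique_lock<std::mutex> lock(mutex_);
        if (state_.load(std::memory_order_relaxed) == NotStarted) {
            // This thread claims the initialization.
            state_.store(InProgress, std::memory_order_relaxed);
            lock.unlock();   // do not hold the lock while initializing
            init();
            lock.lock();
            state_.store(Done, std::memory_order_release);
            cond_.notify_all();
            return;
        }
        // Another thread is initializing (or has just finished): block on
        // the condition variable instead of spinning.
        cond_.wait(lock, [this] {
            return state_.load(std::memory_order_acquire) == Done;
        });
    }

private:
    enum State { NotStarted, InProgress, Done };
    std::atomic<int> state_{NotStarted};
    std::mutex mutex_;
    std::condition_variable cond_;
};

// Demo helper: starts nthreads threads that all race to initialize and
// returns how many times the initializer actually ran.
int run_demo(int nthreads) {
    OneTimeInit once;
    std::atomic<int> init_calls{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < nthreads; ++i) {
        workers.emplace_back([&] {
            once.ensure([&] { init_calls.fetch_add(1); });
        });
    }
    for (auto& w : workers) w.join();
    return init_calls.load();
}
```

The waiting threads sleep on the condition variable rather than yielding in a loop, and callers that arrive after initialization has finished only pay for one atomic load.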