JDK-8256383 : PlatformMutex::try_lock has different semantics on Windows and Posix
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 13,15,16
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • OS: windows
  • Submitted: 2020-11-16
  • Updated: 2020-12-18
  • Resolved: 2020-11-18
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 16
16 b26Fixed
Related Reports
Blocks :  
Relates :  
Relates :  
Description
In a situation where the lock is already held by the thread calling try_lock, the two implementations have the opposite semantics.

Windows:
inline bool os::PlatformMutex::try_lock() {
  return TryEnterCriticalSection(&_mutex);
}

"If the critical section is successfully entered or the current thread already owns the critical section, the return value is nonzero.

If another thread already owns the critical section, the return value is zero."

Posix:

inline bool os::PlatformMutex::try_lock() {
  int status = pthread_mutex_trylock(mutex());
  assert_status(status == 0 || status == EBUSY, status, "mutex_trylock");
  return status == 0;
}

"The pthread_mutex_trylock() function shall be equivalent to pthread_mutex_lock(), except that if the mutex object referenced by mutex is currently locked (by any thread, including the current thread), the call shall return immediately."

and:
"[EDEADLK]
    A deadlock condition was detected or the current thread already owns the mutex."

This difference propagates up to the Mutex::try_lock implementation.
Comments
Changeset: 2b155713 Author: David Holmes <dholmes@openjdk.org> Date: 2020-11-18 22:48:39 +0000 URL: https://github.com/openjdk/jdk/commit/2b155713
18-11-2020

Bug is only possible on Windows
17-11-2020

Hmmmm ... We want Mutex::try_lock to return false if the lock is already held by the current thread. This is the semantics we have with Posix as we use "normal" mutexes. But on Windows, because the critical section supports recursive locking we can't tell whether it was unlocked or already locked. Ideally we want to handle this difference in semantics in the Windows specific code, but the only way to make the distinction is to check the owner which is only known at the Mutex level. But if I'm going to pass the owner through to PlatformMutex I may as well just do the check in Mutex before the call. I investigated pushing the ownership notion into the PlatformMutex but it drags too much code and complexity with it. I think we will have to allow PlatformMutex to have different semantics on different platforms, with Mutex knowing that and adapting for it. Side note: We can't utilise the recursive locking semantics on Windows with PlatformMonitor because the "wait" function only works correctly if the critical-section has been locked once.
17-11-2020

Okay. I will look at documenting the PlatformMutex semantics more clearly so that these platforms differences are known. I'll then fix Mutex so that try_lock works as expected. Thanks.
16-11-2020

I also think it makes sense to add the owned_by_self() check in Mutex::try_lock() (which I'm adding in JDK-8255678 as Stefan pointed out) to get consistent behaviour. Note that today calling Mutex::try_lock() when the thread already holds the lock will return false with the pthread version and will assert in the Windows version since we do owner checks after _lock.try_lock() succeeds (at least for non release builds). Using a raw os::PlatformMutex::try_lock() will fail on the pthread case and will succeed on Windows. But I think os::PlatformMutex/Monitor should be only directly used on low level leaf locks (e.g. JDK-8225631) where there is no possibility of calling try_lock() while holding the lock, so I don't know if it makes sense to do much there.
16-11-2020

I found this during a discussion with [~pchilanomate], where we discussed if it was OK to call Mutex::try_lock while already holding the lock. The code I was writing expected try_lock to return false if the thread already held the lock, but his JDK-8255678 patch asserted that the thread was not already holding the lock. To workaround this I had to change the code to: if (!lock->owned_by_self() && lock->try_lock()) { II'm not sure, but maybe it would make sense to move the owned_by_self check into Mutex::try_lock, so that we get a consistent semantics for Mutex::try_lock? And for PlatformMutex::try_lock, maybe it would be best to not expose that function publicly, but only use it to implement Mutex::try_lock? Just some thoughts.
16-11-2020

Backing up a bit. Our PlatformMonitor/Mutex is not intended to support recursive locking by the same thread. If you try this on Posix then you will get a hang (unless you are running with a default pthread_mutex_t type that supports deadlock detection (which they typically don't). If you try on Windows then it will succeed and if this is occurring in platform specific code you can actually use it this way. I'm assuming a recursive locking situation was encountered and the different behaviour between Posix and Windows was observed? What semantics were you looking for?
16-11-2020

Windows CriticalSections allow for reentrant locking in contrast to Pthread Mutexes. This is an oversight but I'm not sure how we can correct for it.
16-11-2020