When a requesting thread calls SuspendibleThreadSet::synchronize, it sets the request flag (_suspend_all) and then waits on the STS_lock until all the threads in the set have either left the set or called yield() (determined by comparing _nthreads_stopped to _nthreads).
When a suspendible thread calls yield() while there is an active request, it updates the yield counter (_nthreads_stopped), calls notify_all on the STS_lock (to wake up the requesting thread so it can recheck and either rewait or proceed), and then waits on the STS_lock until there is no longer an active request.
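The protocol described above can be sketched roughly as follows. This is only an illustration, not the HotSpot source: the class ToySuspendibleSet, its member names, and the run_round helper are hypothetical, and a std::mutex with a single std::condition_variable stands in for the STS_lock monitor. The point is that the requestor and the yielded threads all share one wait set, so every notify_all reaches all of them.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for SuspendibleThreadSet; not the HotSpot code.
class ToySuspendibleSet {
  std::mutex lock_;             // plays the role of STS_lock
  std::condition_variable cv_;  // one wait set shared by requestor and yielders
  bool suspend_all_ = false;    // the request flag
  int nthreads_ = 0;            // threads currently in the set
  int nthreads_stopped_ = 0;    // threads that have yielded for the request

public:
  void join() {
    std::unique_lock<std::mutex> ml(lock_);
    cv_.wait(ml, [this] { return !suspend_all_; });
    ++nthreads_;
  }

  void leave() {
    std::unique_lock<std::mutex> ml(lock_);
    --nthreads_;
    if (suspend_all_) cv_.notify_all();  // also wakes already yielded threads
  }

  void yield() {
    std::unique_lock<std::mutex> ml(lock_);
    if (!suspend_all_) return;
    ++nthreads_stopped_;
    cv_.notify_all();  // meant for the requestor, but wakes every waiter
    cv_.wait(ml, [this] { return !suspend_all_; });
    --nthreads_stopped_;
  }

  void synchronize() {
    std::unique_lock<std::mutex> ml(lock_);
    suspend_all_ = true;
    cv_.wait(ml, [this] { return nthreads_stopped_ == nthreads_; });
  }

  void desynchronize() {
    std::unique_lock<std::mutex> ml(lock_);
    suspend_all_ = false;
    cv_.notify_all();
  }

  bool all_stopped() {
    std::unique_lock<std::mutex> ml(lock_);
    return suspend_all_ && nthreads_stopped_ == nthreads_;
  }
};

// Runs one synchronize/desynchronize round against `workers` threads and
// reports whether every joined thread was stopped while the request was active.
bool run_round(int workers) {
  ToySuspendibleSet sts;
  std::atomic<bool> stop{false};
  std::vector<std::thread> pool;
  for (int i = 0; i < workers; ++i) {
    pool.emplace_back([&sts, &stop] {
      sts.join();
      while (!stop.load()) sts.yield();
      sts.leave();
    });
  }
  sts.synchronize();
  bool ok = sts.all_stopped();
  sts.desynchronize();
  stop.store(true);
  for (auto& t : pool) t.join();
  return ok;
}
```

Because the yielded threads wait on the same condition variable as the requestor, each later yield() briefly wakes them only for them to find the request still active and wait again.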
The problem is that the notification a suspendible thread issues on the STS_lock to alert the requesting thread of a state change also wakes up every previously yielded suspendible thread. These wakeups are not necessary to the semantics of the operation; they are pure overhead. In the worst case, the number of excess wakeups before the requestor can proceed grows as the square of the number of suspendible threads that yield for the request.
[In practice the number of excess wakeups usually seems to be lower (though it can still be significant), because some of the threads that have not yet yielded are often competing for the STS_lock with threads that have been woken from their waits, and so may perform their notifications before the awakened threads have resumed waiting.]
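The quadratic worst case can be made concrete with a small counting model. It assumes the least favorable interleaving: the suspendible threads yield one at a time, and every previously yielded thread is back in its wait before the next notification arrives. The function name worst_case_excess_wakeups is hypothetical, and only wakeups of already yielded threads are counted, not those of the requestor.

```cpp
#include <cassert>
#include <cstdint>

// Counts excess wakeups when n suspendible threads yield strictly in turn.
std::uint64_t worst_case_excess_wakeups(std::uint64_t n) {
  std::uint64_t waiting = 0;  // threads that have yielded and are waiting
  std::uint64_t excess = 0;   // wakeups of threads that must simply rewait
  for (std::uint64_t i = 0; i < n; ++i) {
    excess += waiting;  // this yielder's notify_all wakes every earlier yielder
    ++waiting;          // the notifier then waits as well
  }
  assert(waiting == n);
  return excess;
}
```

Under these assumptions the k-th thread to yield wakes the k-1 threads before it, so n yielding threads produce n(n-1)/2 excess wakeups: 6 for 4 threads, 28 for 8.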
Having a suspendible thread call leave() rather than yield() has a similar effect: it wakes up the requestor and any already yielded threads. Those wakeups are unnecessary unless the leaving thread was the last non-yielded thread in the set.