During an evacuation pause, a GC thread that tries to copy an object first attempts to allocate space for the copy. When this allocation fails we have an evacuation failure, and the thread enters the code that handles such a failure.
The evacuation failure handling code first attempts to atomically forward the object to itself (in case another thread was attempting to copy the object at the same time). If successful, the thread grabs a lock and installs its own data structures into some global fields. While holding the lock, the thread pushes the failed object onto a global refs_to_scan_stack and drains this stack.
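The atomic self-forwarding step can be illustrated with a small standalone sketch. This is not the HotSpot code: the Obj type, its forwardee field, and try_self_forward are hypothetical names, and a plain std::atomic pointer stands in for the forwarding information held in the object header. It only shows how a compare-and-swap lets exactly one thread claim the failed object.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a heap object; the real collector keeps the
// forwarding pointer in the object's header word.
struct Obj {
    std::atomic<Obj*> forwardee{nullptr};  // nullptr means "not forwarded yet"
};

// Tries to install `obj` as its own forwarding pointer.
// Returns nullptr if this thread won the race and now owns the failure
// handling for `obj`; otherwise returns the forwardee that was already
// installed (by another thread that copied or self-forwarded the object).
Obj* try_self_forward(Obj* obj) {
    assert(obj != nullptr);
    Obj* expected = nullptr;
    if (obj->forwardee.compare_exchange_strong(expected, obj)) {
        return nullptr;
    }
    // On failure, compare_exchange_strong stores the current value in `expected`.
    return expected;
}
```

Only the thread that receives nullptr goes on to take the lock; a thread that loses the race simply uses the returned forwardee.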
Draining the global refs_to_scan_stack involves popping the failed object, scanning its reference fields, and applying a copy closure specialized for evacuation failure to the referenced objects. This specialized closure attempts to copy the referenced object and re-enters the evac failure handling code, which pushes the referenced object.
As a result, the objects that are reachable from the original failed object are self-forwarded and scanned by the thread while it holds the lock. Meanwhile, other threads that have their own failed copy are waiting to acquire the lock in the evac failure handling code.
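The push-and-drain behaviour described above can be modelled roughly as follows. This is a simplified sketch, not the actual G1 code: Node, evac_failure_lock, scanned_count, self_forward and handle_evac_failure are hypothetical names, and it assumes that every attempt to copy a referenced object also fails, so the closure is reduced to "self-forward and push". The real closure re-enters the handler; here that re-entry is inlined into the drain loop.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical object model: a forwarding pointer plus reference fields.
struct Node {
    std::atomic<Node*> forwardee{nullptr};
    std::vector<Node*> fields;
};

std::mutex evac_failure_lock;            // the single lock all failing threads contend on
std::vector<Node*> refs_to_scan_stack;   // global stack, guarded by evac_failure_lock
std::size_t scanned_count = 0;           // guarded by evac_failure_lock

// Returns true if the calling thread installed the self-forwarding pointer.
bool self_forward(Node* n) {
    Node* expected = nullptr;
    return n->forwardee.compare_exchange_strong(expected, n);
}

// Called by a GC worker whose allocation for `failed` did not succeed.
void handle_evac_failure(Node* failed) {
    assert(failed != nullptr);
    if (!self_forward(failed)) {
        return;  // another thread already forwarded this object
    }
    // Everything below runs under the lock, so the whole transitive scan
    // is performed by one thread while other failing threads wait here.
    std::lock_guard<std::mutex> guard(evac_failure_lock);
    refs_to_scan_stack.push_back(failed);
    while (!refs_to_scan_stack.empty()) {
        Node* n = refs_to_scan_stack.back();
        refs_to_scan_stack.pop_back();
        ++scanned_count;
        for (Node* ref : n->fields) {
            // Assumption: copying `ref` fails as well, so it is
            // self-forwarded and pushed for scanning.
            if (ref != nullptr && self_forward(ref)) {
                refs_to_scan_stack.push_back(ref);
            }
        }
    }
}
```

If several worker threads call handle_evac_failure at once, each object is still scanned exactly once, but only one thread makes progress at a time because the drain loop runs entirely inside the critical section.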
Hence the evacuation failure handling mechanism is effectively serialized. The code that pushes the self-forwarded object is executed by multiple threads but not in parallel.
This contributes to the excessive object copying times seen when an evacuation failure occurs.