During class unloading, removing the dependencies (calls to DependencyContext::remove_all_dependencies) is very slow.
In a class unloading stress test, the time spent in that call looks like this:
[537,573s][gc ] GC(106) do_unloading: loaders processed 279, loaders removed 6001
[537,573s][gc ] GC(106) Unloading took 45,54, depunload took 38,41 percent 84,33
I.e. dependency unloading ("depunload") took 38,41ms out of 45,54ms, which is 84,33% of the unloading time.
The reason is that, for every dependency, the code performs a separate cmpxchg to add that dependency to the purge list, and each cmpxchg is expensive.
Batching the cmpxchg on a per-instanceKlass basis decreases the depunload time to about one seventh:
[666,449s][gc ] GC(139) do_unloading: loaders processed 280, loaders removed 6001
[666,449s][gc ] GC(139) Unloading took 12,00, depunload took 5,47 percent 45,58
I.e. depunload now takes 5,47ms out of 12,00ms.
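To illustrate the idea, here is a minimal standalone sketch of the difference between one cmpxchg per entry and one cmpxchg per batch. It is not the HotSpot code: Node, purge_list, cas_attempts, push_chain, release_each and release_batched are all hypothetical names, and std::atomic stands in for the VM's own atomic primitives. The assumption is that the entries of one instanceKlass are already linked into a chain, so the whole chain can be spliced onto the purge list at once.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for one dependency list entry (not the HotSpot type).
struct Node {
    int value = 0;
    Node* next = nullptr;
};

// Hypothetical global purge list head, shared between threads.
std::atomic<Node*> purge_list{nullptr};

// Counts compare-and-swap attempts, for illustration only.
std::atomic<std::size_t> cas_attempts{0};

// Splice the already linked chain first..last onto the purge list.
// Costs one successful CAS no matter how long the chain is.
void push_chain(Node* first, Node* last) {
    assert(first != nullptr && last != nullptr);
    Node* old_head = purge_list.load(std::memory_order_relaxed);
    do {
        last->next = old_head;  // chain tail points at the current head
        cas_attempts.fetch_add(1, std::memory_order_relaxed);
    } while (!purge_list.compare_exchange_strong(
                 old_head, first,
                 std::memory_order_release, std::memory_order_relaxed));
}

// Unbatched variant: every entry is pushed with its own CAS.
void release_each(Node* head) {
    while (head != nullptr) {
        Node* next = head->next;  // save before push_chain overwrites it
        push_chain(head, head);
        head = next;
    }
}

// Batched variant: find the tail once, then push the whole chain with one CAS.
void release_batched(Node* head) {
    if (head == nullptr) {
        return;
    }
    Node* last = head;
    while (last->next != nullptr) {
        last = last->next;
    }
    push_chain(head, last);
}
```

With a chain of N entries and no contention, release_each performs N cmpxchg operations and release_batched performs one. The batched variant still walks the chain to find its tail, but that walk uses plain loads rather than atomic read-modify-write operations.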