Starting many threads in a burst can significantly delay safepoint synchronization, in some cases by multiple seconds.
JVM_StartThread takes Threads_lock and, while holding it, appends to the threads list for ThreadSMR support, which can take ~0.1ms. With many concurrent calls to Thread.start, an arbitrary number of callers can be waiting for Threads_lock.
Safepoint synchronization also needs to acquire Threads_lock before arming the safepoint. It has no special priority over the other waiters, so it can be delayed arbitrarily by JVM_StartThread calls.
The attached reproducer demonstrates the issue.
```
java -Xlog:safepoint ThreadStartTtsp.java | grep -o 'Reaching safepoint: [0-9]* ns'
Reaching safepoint: 1291591 ns
Reaching safepoint: 59962 ns
Reaching safepoint: 1958065 ns
Reaching safepoint: 14456666258 ns <-- 14 seconds!
```
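For readers without access to the attachment, the following is a minimal sketch of the kind of workload described above. It is not the attached ThreadStartTtsp.java: the class name `ThreadStartBurstSketch`, the method `runBurst`, and the default thread counts are all hypothetical. Several starter threads call Thread.start concurrently, and `System.gc()` is used on the assumption that it requests an operation that needs a safepoint. Whether this reproduces multi-second delays depends on the machine and JDK build.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, NOT the attached reproducer.
public class ThreadStartBurstSketch {

    // Joins every thread in the list; converts interruption to an unchecked exception.
    private static void joinAll(List<Thread> threads) {
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
    }

    // Runs `starters` threads that each call Thread.start() `perStarter` times.
    // Returns the number of short-lived child threads that ran to completion.
    static int runBurst(int starters, int perStarter) {
        AtomicInteger finished = new AtomicInteger();
        CountDownLatch go = new CountDownLatch(1);
        List<Thread> starterThreads = new ArrayList<>();

        for (int i = 0; i < starters; i++) {
            Thread starter = new Thread(() -> {
                try {
                    go.await(); // release all starters together to form a burst
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                List<Thread> children = new ArrayList<>();
                for (int j = 0; j < perStarter; j++) {
                    Thread child = new Thread(finished::incrementAndGet);
                    child.start(); // each call goes through JVM_StartThread
                    children.add(child);
                }
                joinAll(children);
            });
            starter.start();
            starterThreads.add(starter);
        }

        go.countDown();
        joinAll(starterThreads);
        return finished.get();
    }

    public static void main(String[] args) {
        int starters = args.length > 0 ? Integer.parseInt(args[0]) : 8;
        int perStarter = args.length > 1 ? Integer.parseInt(args[1]) : 1000;
        for (int round = 0; round < 10; round++) {
            int n = runBurst(starters, perStarter);
            System.out.println("round " + round + ": " + n + " threads finished");
            System.gc(); // assumption: this requests an operation that needs a safepoint
        }
    }
}
```

It can be run in source-file mode with the same logging as above, for example `java -Xlog:safepoint ThreadStartBurstSketch.java 8 1000`, where the two arguments are the number of starter threads and the number of threads each one starts.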