Currently the GC threads are all created at once and put to use as soon as the JVM starts up. That may not be necessary, especially on systems with a large number of HW strands, where the JVM would create hundreds of GC threads. App servers would benefit the most, since they spend much of their startup loading classes and the like. Instead, I propose an alternative:
Create a small GC thread pool on startup, capped at some reasonable size (16 threads?), and then create more threads based on some heuristic. Candidate inputs are allocation rate, heap size, etc.
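
To make the idea concrete, here is a rough sketch of what such a sizing heuristic might look like. It is only an illustration, not how HotSpot sizes its GC thread pool: the class name, the method names, and the two per-thread thresholds (64 MB/s of allocation and 1 GB of heap per thread) are all made-up assumptions. Only the cap of 16 comes from the suggestion above.

```java
// Hypothetical sketch of the proposed sizing policy; not HotSpot code.
final class GcThreadSizing {
    // Startup cap taken from the "16 threads?" suggestion in this proposal.
    static final int INITIAL_CAP = 16;
    // Assumed thresholds, chosen only for illustration.
    static final long ALLOC_BYTES_PER_THREAD = 64L * 1024 * 1024;  // 64 MB/s per thread
    static final long HEAP_BYTES_PER_THREAD = 1024L * 1024 * 1024; // 1 GB per thread

    // Number of GC threads to create at JVM startup.
    static int initialThreads(int hwStrands) {
        return Math.max(1, Math.min(hwStrands, INITIAL_CAP));
    }

    // Number of GC threads wanted later, given non-negative allocation rate
    // and heap size. Never drops below the startup pool and never exceeds
    // the number of hardware strands.
    static int targetThreads(int hwStrands, long allocBytesPerSec, long heapBytes) {
        int strands = Math.max(1, hwStrands);
        int base = initialThreads(strands);
        long byRate = allocBytesPerSec / ALLOC_BYTES_PER_THREAD;
        long byHeap = heapBytes / HEAP_BYTES_PER_THREAD;
        long wanted = Math.max(base, Math.max(byRate, byHeap));
        return (int) Math.min(wanted, strands);
    }
}
```

With these assumed thresholds, a 256-strand machine would start with 16 GC threads and only grow past that once the allocation rate or the heap size asks for more.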
Once the system has entered a steady state, GC threads could come and go as needed, depending on load (this is more useful for concurrent collectors such as G1). I may not want to use 40% of my system for GC when I really need to finish some critical part of my application (JBB2012 ops come to mind here).
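
The load-dependent part could be sketched along the following lines. Again this is just an assumption-laden illustration: the class, the method, and both parameters (the fraction of the machine the application is currently using, and the largest share GC is allowed to take) are hypothetical names, not existing JVM options.

```java
// Hypothetical sketch of load-aware GC thread budgeting; not HotSpot code.
final class GcThreadBudget {
    // Returns how many GC threads may be active right now.
    //   hwStrands       number of hardware strands on the machine
    //   appBusyFraction share of the machine the application is using (0.0 to 1.0)
    //   maxGcShare      largest share of the machine GC may take (0.0 to 1.0)
    static int allowedThreads(int hwStrands, double appBusyFraction, double maxGcShare) {
        // Strands the application is not currently using.
        double idle = hwStrands * (1.0 - appBusyFraction);
        // Hard ceiling on GC's share, regardless of how idle the machine is.
        double cap = hwStrands * maxGcShare;
        int allowed = (int) Math.floor(Math.min(idle, cap));
        // Always keep at least one GC thread so collection can make progress.
        return Math.max(1, allowed);
    }
}
```

For example, with 64 strands and an assumed 25% ceiling, GC would get 16 threads while the application uses half the machine, and would shrink to 8 threads when the application is using 87.5% of it.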