At GC time, G1 needs to flush the hot card cache. This is done in parallel, using an overly simple mechanism that splits the whole hot card cache into one chunk per thread. In detail, the chunk size is:

  hot card cache chunk size = size of hot card cache / number of threads

This is too coarse for moderately large hot card cache sizes: in some applications, a single thread that starts processing its (large) chunk late can significantly delay all the other threads, which have to wait for it to finish.

The change can be tested by, e.g., setting G1ConcRSLogCacheSize to 20.

A possible fix is to bound the hot card cache chunk size or, even simpler, to fix it at some reasonable size.
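
To illustrate the suggested fix, here is a minimal sketch of fixed-size chunk claiming. It is not the HotSpot implementation: the class name ChunkClaimer, its members, and the chunk sizes used are hypothetical, and the hot card cache is modelled simply as an index range [0, total). Each worker would call claim() in a loop until it returns false, so a thread that starts late claims fewer chunks instead of holding up the others with one large chunk.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical sketch, not HotSpot code: hands out fixed-size chunks of an
// index range [0, total) to any number of worker threads.
class ChunkClaimer {
public:
    ChunkClaimer(std::size_t total, std::size_t chunk_size)
        : total_(total), chunk_size_(chunk_size), next_(0) {
        assert(chunk_size > 0);
    }

    // Atomically claims the next unprocessed chunk [begin, end).
    // Returns false once the whole range has been handed out.
    bool claim(std::size_t& begin, std::size_t& end) {
        std::size_t start = next_.fetch_add(chunk_size_, std::memory_order_relaxed);
        if (start >= total_) {
            return false;
        }
        begin = start;
        end = std::min(start + chunk_size_, total_);  // last chunk may be short
        return true;
    }

    // Number of chunks the range is split into (rounded up).
    std::size_t chunk_count() const {
        return (total_ + chunk_size_ - 1) / chunk_size_;
    }

private:
    const std::size_t total_;
    const std::size_t chunk_size_;
    std::atomic<std::size_t> next_;  // start index of the next unclaimed chunk
};
```

Assuming G1ConcRSLogCacheSize is the base-2 logarithm of the cache length, a value of 20 gives 1048576 entries. With 8 threads the current scheme yields 8 chunks of 131072 entries each, whereas a fixed chunk size of, say, 1024 entries would yield 1024 small chunks that the threads can balance among themselves.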