We recently tried G1 on a 32GB heap, with a 4GB and an 8GB young gen. The serial Other time was very high: around 50ms for the 4GB young gen and around 100ms for the 8GB young gen, which looks proportional to the size of the collection set (i.e., the young gen in this case). This is because we do work proportional to the collection set before and after the parallel part of the collection: we add regions to the collection set before it and free them after it. We should eliminate these expensive serial bottlenecks from the code.
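To make the scaling concrete, here is a rough toy model, not HotSpot code. It counts the per-region serial steps described above (one add before the parallel phase, one free after it) and normalises the reported Other times by young gen size. The 1MB region size and the function names `serial_region_ops` and `ms_per_gb` are assumptions made only for this illustration.

```python
# Toy model (not HotSpot code): count the serial steps a pause performs
# when every collection-set region is added before, and freed after,
# the parallel phase.

REGION_SIZE_MB = 1  # assumed region size, for illustration only


def serial_region_ops(young_gen_gb, region_size_mb=REGION_SIZE_MB):
    """Return (regions, serial_ops) for a young-only collection set."""
    regions = young_gen_gb * 1024 // region_size_mb
    add_ops = regions   # serial: add each region to the collection set
    free_ops = regions  # serial: free each region after evacuation
    return regions, add_ops + free_ops


def ms_per_gb(serial_other_ms, young_gen_gb):
    """Serial Other time normalised by young gen size."""
    return serial_other_ms / young_gen_gb


if __name__ == "__main__":
    # Approximate figures taken from the report above.
    for gb, observed_ms in [(4, 50), (8, 100)]:
        regions, ops = serial_region_ops(gb)
        print(f"{gb}GB young gen: {regions} regions, {ops} serial ops, "
              f"{ms_per_gb(observed_ms, gb):.1f} ms/GB observed")
```

Under these assumptions, doubling the young gen doubles the number of serial per-region steps, and both measurements work out to roughly 12.5 ms per GB of young gen, which is consistent with the serial Other time growing linearly with the collection set.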
We're going to split this CR into two: this one and CR 6868854. This one is going to deal with high serial Other times at the beginning of a GC pause. CR 6868854 is going to deal with high serial Other times at the end of a GC pause.