There are some simple optimizations we can do to improve the performance of the concurrent marking phase.

- CMOopClosure, which is used to scan objects during marking, is not specialized. By specializing it we will be able to get a nice performance boost. We should also rename it with a G1-specific name (say: G1CMOopClosure) given that its declaration will move to an .hpp file.
- There are a couple of methods in the fast path that will benefit from getting inlined. These are CMTask::deal_with_reference() and CMTask::push().
- We are using the wrong bitmap operations! In the parallel case we are using par_at_put(), which in turn calls either par_set_bit() or par_clear_bit(). We should call the latter directly (they will also be inlined, whereas par_at_put() is not). Ditto for at_put() and set_bit() / clear_bit().
- There are places where we can use the slightly more efficient heap_region_containing_raw() instead of heap_region_containing(), as we know that the address is in the G1 heap.
- When we check whether an object is live or not, we should first check whether it's marked on the bitmap and, only if it's not, get its containing region.
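
The bitmap-related items above can be illustrated with a small standalone sketch. This is not HotSpot code: SketchBitMap, SketchHeap and is_live() are hypothetical stand-ins, addresses are modelled as plain bit indices, regions are assumed to be fixed-size, and the "allocated at or above top-at-mark-start counts as live" rule is a simplification of what G1 does. The atomic fetch_or / fetch_and calls are likewise an assumption of this sketch and not how the real par_set_bit() / par_clear_bit() are written.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a mark bitmap (not the HotSpot BitMap class).
class SketchBitMap {
 public:
  explicit SketchBitMap(std::size_t bits)
      : _size(bits), _words((bits + 63) / 64) {}

  bool at(std::size_t idx) const {
    assert(idx < _size);
    return (_words[idx / 64].load(std::memory_order_relaxed) & mask(idx)) != 0;
  }

  // Atomically sets the bit; returns true if this call changed it.
  bool par_set_bit(std::size_t idx) {
    assert(idx < _size);
    std::uint64_t old = _words[idx / 64].fetch_or(mask(idx));
    return (old & mask(idx)) == 0;
  }

  // Atomically clears the bit; returns true if this call changed it.
  bool par_clear_bit(std::size_t idx) {
    assert(idx < _size);
    std::uint64_t old = _words[idx / 64].fetch_and(~mask(idx));
    return (old & mask(idx)) != 0;
  }

  // Generic entry point: adds a branch on 'value' before doing the real work.
  // A caller that already knows it wants to set the bit can skip this layer.
  bool par_at_put(std::size_t idx, bool value) {
    return value ? par_set_bit(idx) : par_clear_bit(idx);
  }

 private:
  static std::uint64_t mask(std::size_t idx) {
    return std::uint64_t{1} << (idx % 64);
  }

  std::size_t _size;
  std::vector<std::atomic<std::uint64_t>> _words;
};

// Hypothetical heap with fixed-size regions; tams[i] is the
// top-at-mark-start of region i, expressed as a bit index.
struct SketchHeap {
  SketchHeap(std::size_t region_size, std::vector<std::size_t> tams)
      : region_size(region_size), tams(std::move(tams)), region_lookups(0) {}

  std::size_t region_index(std::size_t addr) const {
    ++region_lookups;  // counts how often the slower lookup is taken
    std::size_t idx = addr / region_size;
    assert(idx < tams.size());
    return idx;
  }

  std::size_t region_size;
  std::vector<std::size_t> tams;
  mutable std::size_t region_lookups;
};

// Liveness check in the order suggested above: consult the bitmap first and
// look up the containing region only when the object is not marked.
inline bool is_live(const SketchBitMap& marks, const SketchHeap& heap,
                    std::size_t addr) {
  if (marks.at(addr)) {
    return true;  // fast path, no region lookup needed
  }
  return addr >= heap.tams[heap.region_index(addr)];
}
```

In this sketch, a marking thread would call marks.par_set_bit(idx) directly, and heap.region_lookups stays at zero for every object that is already marked, which is the effect the last item is after.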