Relates :
|
|
During recent work the following worthwhile micro-optimizations for scanning remembered sets (or in general, cards) have been found:

- HeapRegion::oops_on_card_seq_iterate_careful is faster than using HeapRegionDCTOC during scan RS.
- HeapRegion::oops_on_card_seq_iterate_careful can be sped up by specializing it for use during GC vs. during mutator time. E.g. a lot of extra checks can go away in the GC-time specialization, like the filter_young one, the g1h->is_gc_active(), the card_ptr != NULL, the various checks whether we are scanning into an unparseable point, etc.
- HeapRegion::oops_on_card_seq_iterate_careful() always does at least one unnecessary call to HeapRegion::block_size(). I.e. the result of the call made while positioning the cursor at the object starting at or spanning into the card in question is not reused on entry to the iteration loop. HeapRegion::block_size() is very expensive in G1.
- one can aggressively specialize HeapRegion::block_size() for the use case during GC:
  - addr cannot be >= top(), so that check can be dropped
  - the repeated calculation of g1h->concurrent_mark()->prevMarkBitMap() is very expensive. Its load should be hoisted out of the oops_on_card_seq_iterate_careful() main loop and passed in from a local variable.
  - further, the information that the object is dead should be returned from block_size() (or a specialized version of it). Currently, after determining block_size(), oops_on_card_iterate() does another expensive lookup of the prev mark bitmap to check whether the object is dead.
  - need to look at the called methods to see whether it is appropriate to make them more amenable to inlining (some short called methods are in cpp files)
- HeapRegion::block_is_obj() could be aggressively specialized for RS scan too: the first check for whether the given address is in a continues humongous region can be hoisted out of the entire oop iteration loop into oops_on_card_seq_iterate_careful().
- HeapRegion::is_obj_dead() could be specialized too: e.g. the is_archive check can be hoisted out to top level, and is actually superfluous, since archive regions do not contain any references to non-archive regions.
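The block_size() related points above can be illustrated with a small standalone sketch. This is not HotSpot code: ToyBitmap, ToyRegion, block_size_during_gc() and scan_card() are hypothetical names invented for the illustration, addresses are plain word indices, and block sizes come from a table rather than from object headers or the bitmap. It only shows the shape of the changes: the bitmap is loaded once outside the loop and passed in, the size computed while positioning the cursor is reused on entry to the iteration loop, the addr >= top() check is replaced by a caller guarantee, and liveness is returned together with the size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the prev mark bitmap: one bit per word,
// set if a live object starts at that word.
struct ToyBitmap {
  std::vector<bool> marked;
  bool is_marked(std::size_t addr) const { return marked[addr]; }
};

// Hypothetical stand-in for a heap region.
struct ToyRegion {
  std::vector<std::size_t> size_at;  // block size in words at each block start, else 0
  std::size_t top;                   // first unallocated word
  const ToyBitmap* bitmap;           // imagine this being expensive to reach
  mutable std::size_t block_size_calls = 0;  // instrumentation for the sketch only

  // Specialized for use during GC: the caller guarantees addr < top (no check),
  // the bitmap is passed in, and liveness is returned along with the size.
  std::size_t block_size_during_gc(std::size_t addr, const ToyBitmap& bm,
                                   bool* is_dead) const {
    ++block_size_calls;
    *is_dead = !bm.is_marked(addr);
    return size_at[addr];
  }
};

// Visits every live block intersecting [card_start, card_end) and returns how
// many were visited. block_start must be the start of a block at or before
// card_start, and card_start must be below r.top.
template <typename Visitor>
std::size_t scan_card(const ToyRegion& r, std::size_t block_start,
                      std::size_t card_start, std::size_t card_end,
                      Visitor visit) {
  assert(block_start <= card_start && card_start < r.top);
  const ToyBitmap& bm = *r.bitmap;   // hoisted: loaded once, not per block
  bool is_dead = false;
  std::size_t cur = block_start;
  std::size_t size = r.block_size_during_gc(cur, bm, &is_dead);

  // Position the cursor at the block starting at or spanning into the card.
  while (cur + size <= card_start) {
    cur += size;
    size = r.block_size_during_gc(cur, bm, &is_dead);
  }

  // size and is_dead for cur are already known here and are reused, so the
  // loop does not start with another block_size call.
  std::size_t visited = 0;
  while (true) {
    if (!is_dead) {
      visit(cur, size);
      ++visited;
    }
    cur += size;
    if (cur >= card_end || cur >= r.top) break;
    size = r.block_size_during_gc(cur, bm, &is_dead);
  }
  return visited;
}
```

In this toy model block_size_during_gc() is called exactly once per block touched, including the blocks walked over while positioning the cursor. Whether the same structure pays off in the real code depends on how well the specialized methods inline.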
|