JDK-8071280 : Specialize HeapRegion::oops_on_card_seq_iterate_careful() for use during concurrent refinement and updating the rset
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 9
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2015-01-21
  • Updated: 2018-06-21
  • Resolved: 2017-06-02
JDK 10: 10 b21 (Fixed)
Description
HeapRegion::oops_on_card_seq_iterate_careful() is currently used to walk the parts of the heap that correspond to cards, both during concurrent refinement and during the update RS phase of a GC.

For this reason it does too much, particularly during GC: the filter_young and card_ptr parameters are actually only used during refinement, and some code that is not necessary during the update RS phase is executed anyway.

In particular:
  - during update RS, filter_young is always false, and card_ptr is always NULL.
  - the check for whether we are in a GC could then be removed
  - during GC it is not possible to run into an unparseable point, so all the corresponding checks are superfluous there
  - I am not sure whether the loop that gets from the first block_start() call to the first object reaching into the given memory range is actually necessary. A single check whether the parseable point has been reached is probably sufficient (if any), given that it should be impossible to get a card in the queue that has not been allocated into at all. The loop is likely a remnant of the previous use of block_start_careful() instead of block_start().

The code should get slightly faster too, although probably not noticeably.
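To illustrate the kind of specialization proposed here, the following is a minimal C++ sketch on a toy model. It is not HotSpot code: ToyObject, ToyRegion, WalkResult, walk_card and the card values are hypothetical names, and the roles given to the young filter and to card_ptr are assumptions made only for the example. The point is that a compile-time flag lets the GC instantiation drop the refinement-only branches.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only; none of these names are HotSpot's.
struct ToyObject {
  std::size_t start;  // first word of the object
  std::size_t size;   // size in words
  bool parseable;     // false while the object header is not yet readable
};

struct ToyRegion {
  bool is_young;
  std::vector<ToyObject> objects;  // sorted by start, contiguous
};

enum class WalkResult { Done, SkippedYoung, HitUnparseable };

const unsigned char kDirtyCard = 1;  // hypothetical card values
const unsigned char kCleanCard = 0;

// during_gc == true  : update RS variant, no filtering, no card update.
// during_gc == false : refinement variant, keeps all the careful checks.
template <bool during_gc>
WalkResult walk_card(const ToyRegion& region,
                     std::size_t card_start, std::size_t card_end,
                     unsigned char* card_ptr,
                     std::vector<std::size_t>& visited) {
  if (!during_gc) {
    // Refinement-only work (assumed roles of filter_young and card_ptr).
    if (region.is_young) return WalkResult::SkippedYoung;
    if (card_ptr != nullptr) *card_ptr = kCleanCard;
  }
  for (const ToyObject& obj : region.objects) {
    if (obj.start + obj.size <= card_start) continue;  // ends before the card
    if (obj.start >= card_end) break;                  // starts after the card
    if (!during_gc && !obj.parseable) return WalkResult::HitUnparseable;
    assert(obj.parseable);  // during GC this is expected to always hold
    visited.push_back(obj.start);
  }
  return WalkResult::Done;
}
```

In this sketch a GC caller would use walk_card<true>(region, start, end, nullptr, visited), while refinement would use walk_card<false> and pass a real card pointer. Because the flag is a template argument, the condition is a compile-time constant in each instantiation.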
Comments
Adding more info after thinking about it more: the loop is actually required, because block_start() does not report whether it stopped at an unparseable point or successfully returned the address of the first object reaching into the given memory range. Since we do not know that after block_start() returns, we need to try to reposition the current "finger", and that may simply succeed again after the block_start() call without us actually being at the position we want to be at. The interface of the block_start() method prevents us from avoiding the re-check and the loop. One idea would be to return NULL from block_start() (or from a new method) when it stops at an unparseable point.
22-01-2015
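The interface idea from the comment dated 22-01-2015 can be sketched on a toy model as follows. This is not HotSpot code: ToyBlock, Lookup, kNoBlock, find_block, toy_block_start and toy_block_start_or_none are hypothetical names, and a sentinel value stands in for returning NULL. The sketch only shows how a plain address return hides the "stopped at an unparseable point" case, and how a distinct return value would expose it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only; none of these names are HotSpot's.
struct ToyBlock {
  std::size_t start;  // first word of the block
  std::size_t size;   // size in words
  bool parseable;     // false if the block cannot be stepped over yet
};

struct Lookup {
  std::size_t start;
  bool stopped_at_unparseable;
};

const std::size_t kNoBlock = static_cast<std::size_t>(-1);  // stands in for NULL

// Walk forward over the blocks until one covers addr or cannot be parsed.
// Precondition: blocks are sorted, contiguous, and cover addr.
Lookup find_block(const std::vector<ToyBlock>& blocks, std::size_t addr) {
  assert(!blocks.empty() && blocks.front().start <= addr);
  assert(addr < blocks.back().start + blocks.back().size);
  for (const ToyBlock& b : blocks) {
    if (!b.parseable) return Lookup{b.start, true};  // cannot step over it
    if (addr < b.start + b.size) return Lookup{b.start, false};
  }
  return Lookup{kNoBlock, false};  // unreachable when the preconditions hold
}

// Shape described in the comment: only an address comes back, so the caller
// cannot tell a covering block from an unparseable stopping point.
std::size_t toy_block_start(const std::vector<ToyBlock>& blocks, std::size_t addr) {
  return find_block(blocks, addr).start;
}

// Suggested shape: a distinct value when the walk stopped early.
std::size_t toy_block_start_or_none(const std::vector<ToyBlock>& blocks,
                                    std::size_t addr) {
  Lookup l = find_block(blocks, addr);
  return l.stopped_at_unparseable ? kNoBlock : l.start;
}
```

With blocks at words 0, 4 and 8, each of size 4, and the middle one unparseable, toy_block_start returns 4 for address 5 and also for address 9, so the caller has to re-check. toy_block_start_or_none returns kNoBlock in both cases, which would make the retry loop unnecessary in this model.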

After some digging into the code, I think that point 4 above is actually true. block_start() stops (naturally) at the first unparseable point before the given address. If so, block_start() returns that point (the one closest to the given address). I do not see how checking that in a loop and retrying would improve the situation. It may be that in the meantime (since the block_size() call just now) that location magically became parseable, but there is no good reason why this would happen just a few instructions after recognizing that we are at an unparseable point. (By the same reasoning we could just as well wait indefinitely for that to happen here.) At GC time the area between bottom() and scan_top() should be completely parseable. I am sure we never allocate in there (below scan_top()) during GC, and we should never scan the area we just allocated into (above scan_top()) using this code path. Maybe I am missing something, though.
21-01-2015
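The assumption in the comment dated 21-01-2015, that during GC only the area between bottom() and scan_top() is scanned through this code path, can be pictured as clamping a card's range before walking it. The sketch below is a toy model, not HotSpot code: ToyRange and clamp_to_parseable are hypothetical names, and word offsets stand in for heap addresses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical half-open word range [start, end); not HotSpot's MemRegion.
struct ToyRange {
  std::size_t start;
  std::size_t end;
  bool empty() const { return start >= end; }
};

// Restrict a card's range to [bottom, scan_top), the part of the region that
// the comment expects to be completely parseable at GC time.
ToyRange clamp_to_parseable(ToyRange card, std::size_t bottom, std::size_t scan_top) {
  assert(bottom <= scan_top);
  ToyRange r{std::max(card.start, bottom), std::min(card.end, scan_top)};
  if (r.empty()) r.end = r.start;  // normalize: nothing to scan
  return r;
}
```

If the clamped range is empty, the card lies entirely above scan_top in this model and there is nothing to walk. Otherwise every object in the clamped range is assumed to be parseable, which is what would make the unparseable-point checks superfluous on the GC path.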