JDK-8163440 : Missing load-acquire in relation to CardTableModRefBS::inline_write_ref_field
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 9
  • Priority: P4
  • Status: Resolved
  • Resolution: Not an Issue
  • Submitted: 2016-08-09
  • Updated: 2016-08-16
  • Resolved: 2016-08-15
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 9
9Resolved
Related Reports
Relates :  
Relates :  
Description
./share/vm/gc/shared/cardTableModRefBS.inline.hpp:    

The CardTableModRefBS::inline_write_ref_field method takes a release parameter that causes a release_store to be used. However release-store, if needed, is generally expected to be paired with a load-acquire - unless, for example, all such loads can be proven to be safe, such as if only occurring at a safepoint.


Comments
Okay ... but that suggests that the thread doing the release-store is not involved in the synchronization that involves the CMSToken, which indicates to me that the CMSToken in itself does not negate the need for the load-acquire. Now you can argue that there are some "global" barriers lurking inside some of these other synchronization actions but the interaction with the thread doing the release-store in that "global" barrier still seems unclear.
16-08-2016

The release-store ensures the store of the value happens before the associated card marking. Without it, the reader (e.g. preclean) could see the previous value.
16-08-2016

Thanks for investigating - though the analysis makes me wonder if the release-store is even necessary.
16-08-2016

I agree with Kim's assessment that this is not a problem. This example in the precleaning is in "share/vm/gc/cms/concurrentMarkSweepGeneration.cpp" 4053 if (!dirtyRegion.is_empty()) { 4054 stopTimer(); 4055 CMSTokenSyncWithLocks ts(true, old_gen->freelistLock(), bitMapLock()) ; 4056 startTimer(); 4057 sample_eden(); 4058 verify_work_stacks_empty(); 4059 verify_overflow_empty(); 4060 HeapWord* stop_point = 4061 old_gen->cmsSpace()->object_iterate_careful_m(dirtyRegion, cl); 4062 if (stop_point != NULL) { 4063 assert((_collectorState == AbortablePreclean && should_abort_precle an()), 4064 "Should only be AbortablePreclean."); 4065 _ct->ct_bs()->invalidate(MemRegion(stop_point, dirtyRegion.end())); line 4055 (CMSTokenSyncWithLocks) is after the card table has been examined to determine the dirty region and before the examining of the locations at 4061 (object_iterate_careful_m() call). Additionally, the remark phase that also examines cards marks is at a safepoint so is after locking has occurred.
15-08-2016

Digging further, I see how this works. In CMSCollector::preclean_card_table, we get a dirty range and fill the range with precleaned_card. Then we get the CMS token. And then we iterate over the dirty range. Getting the CMS token provides the needed barrier between searching the card table and scanning the covered region. Similarly in preclean_mod_union_table. The generic code I was looking at is used by CMS in a pause. I'm not so familiar with CMS though, so I'd like a second opinion before closing this as not a bug.
11-08-2016

I agree that the problem here is CMS-specific. However, I don't think the mutex locking by run_service deals with this at all. The release_store to the card table is to ensure the value store is visible to readers before the card table update. The CMS concurrent processing searches the card table for dirty (actually non-clean?) regions and scans the covered region. There doesn't appear to be any memory barrier between the card table search and the covered region scanning. That appears to all be in "generic" code, with no specialization for CMS to account for the need for a barrier between the card table read and the reads from the covered region.
11-08-2016

I remember that this (optional) call to release_store in CardTableModRefBS::inline_write_ref_field() has been introduced for CMS in particular (and is only executed for it), see JDK-8029396. I remember some discussion (might be a different one) arguing that the corresponding load_acquires are subsumed by the mandatory MutexLocker locking in ConcurrentMarkSweepThread::run_service(). That thread is the only one reading the card table in parallel to the mutators.
10-08-2016