JDK-6940310 : G1: MT-unsafe calls to CM::region_stack_push() / CM::region_stack_pop()
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: hs16,hs18
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic,windows_2003
  • CPU: generic,x86
  • Submitted: 2010-04-01
  • Updated: 2013-09-18
  • Resolved: 2010-04-21
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 6 JDK 7 Other
6u21Fixed 7Fixed hs17Fixed
Related Reports
Duplicate :  
Relates :  
Relates :  
Description
We add / remove entries to / from the region stack in the ConcurrentMark class using the region_stack_push() / region_stack_pop() methods. The assumption for those methods is that region_stack_push() will only be called during evacuation pauses and the region_stack_pop() will only be called during concurrent marking. So, their implementation which is based on CAS is safe as long as we only call one of those methods concurrently by multiple threads but not both.

Unfortunately, there's a case where both can be called concurrently: during marking we mostly call region_stack_pop() to process entries from the region stack. But if we get a safepoint while we're processing one of the entries we will use region_stack_push() to push the remainder of the region we're processing so that we can carry on processing it in the future. And if we call region_stack_push() / region_stack_pop() concurrently, the results are undefined (i.e., the pop method can decrement the index and read the contents from the array before the push method has actaully installed them).

Comments
EVALUATION http://hg.openjdk.java.net/hsx/hsx17/master/rev/e9ae45b9ec48
09-04-2010

EVALUATION http://hg.openjdk.java.net/hsx/hsx17/baseline/rev/e9ae45b9ec48
08-04-2010

EVALUATION http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/72f725c5a7be
07-04-2010

SUGGESTED FIX For this fix we'll take a conservative approach. We'll carry on using the CAS-based region_stack_push() method during evacuation pauses. However, we'll introduce region_stack_push_with_lock() / region_stack_pop_with_lock() to be used during the marking phase (so that they can be safely called concurrently with each other). The alternative fix (each CMTask saves a region locally to be able to get back to it later) is a bit more fiddly.
02-04-2010

SUGGESTED FIX There are a couple of solutions to this problem. First: manipulate the region stack while holding a lock instead of using a CAS. Second: if a CM task wants to retain a region for later processing, it can save it on a local MemRegion field and make sure it processes it next time around, instead of pushing it on the region stack. The latter solution will be a bit more efficient. However, it will complicate a few things in the concurrent marking code given that there are a couple of places when we need to know whether the region stack is empty or not and having half-scanned regions saved for later processing will be the same as having a non-empty region stack.
01-04-2010

EVALUATION See Description.
01-04-2010