JDK-8043239 : G1: Missing post barrier in processing of j.l.ref.Reference objects
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 8u20,9
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2014-05-15
  • Updated: 2017-08-02
  • Resolved: 2014-06-03
Fixed Versions
  • JDK 8: 8u20 (Fixed)
  • JDK 9: 9 b19 (Fixed)
Description
The fix for JDK-8029255 removed the post barrier for setting the next field of Reference objects. 

http://hg.openjdk.java.net/jdk9/hs-gc/hotspot/rev/fbc1677398c0

That post barrier dirtied a lot of cards that didn't really have to be dirty, but it saved us from other missed card marks. See for example JDK-8031703.

When running the Stress BPM another case of a missing barrier was detected.

The missing barrier occurs when we prune the discovered list in G1 and the list contains reference objects that are in different old regions.
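
As a rough illustration of what a card-marking post barrier does, here is a minimal sketch in Java. It is a toy model, not HotSpot code: the class name, the 1 MiB region size, and the same-region filter are assumptions made for illustration, and the real G1 barrier does more work (for example, queueing cards for refinement).

```java
import java.util.BitSet;

/** Toy model of a card-marking post barrier; names are hypothetical, not HotSpot code. */
final class CardTableSketch {
    static final int CARD_SHIFT = 9;     // 512-byte cards
    static final int REGION_SHIFT = 20;  // 1 MiB regions (assumed for illustration)

    private final BitSet dirty = new BitSet();

    static int cardIndex(long addr) { return (int) (addr >>> CARD_SHIFT); }
    static long regionIndex(long addr) { return addr >>> REGION_SHIFT; }

    /** Post barrier: the field at fieldAddr has just been set to point at valueAddr. */
    void postBarrier(long fieldAddr, long valueAddr) {
        if (valueAddr == 0) return;                                   // null store, nothing to track
        if (regionIndex(fieldAddr) == regionIndex(valueAddr)) return; // same region, filtered out
        dirty.set(cardIndex(fieldAddr));                              // cross-region store, dirty the card
    }

    boolean isDirty(long fieldAddr) { return dirty.get(cardIndex(fieldAddr)); }

    public static void main(String[] args) {
        CardTableSketch ct = new CardTableSketch();
        long field = 0x1000L;                       // a field located in region 0
        ct.postBarrier(field, 0x2000L);             // target is also in region 0
        System.out.println(ct.isDirty(field));      // prints false
        ct.postBarrier(field, 3L << REGION_SHIFT);  // target is in region 3
        System.out.println(ct.isDirty(field));      // prints true
    }
}
```

In this model, a store that skips postBarrier leaves the card clean even when the field now points into another region, which is the kind of missed card mark described above.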
Comments
The regression test in the example was considered not stable enough to integrate into the repository.
02-08-2017

After some discussions and prototyping we came to the conclusion that there may be more barriers missing and that it is difficult to get the dirtying done the way our verification code assumes. A simpler solution seems to be to free the reference processing of all barriers and instead just make sure that we dirty all the right cards in the last pass. The fix is to re-introduce the post barrier when we iterate over the discovered list. This time it uses the discovered field for the barrier to be more explicit about what is going on.
03-06-2014
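
The idea in the comment above (prune without barriers, then dirty the right cards while iterating over the discovered list in the last pass) can be sketched with a toy model. This is not the actual change: the class and method names are hypothetical, each object stands in for its own card, and the end of the list is modelled as null.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of dirtying cards in the last pass; names are hypothetical, not HotSpot code. */
final class FinalPassSketch {
    static final class Ref {
        final String name;
        Ref discovered;     // next entry in the discovered list, or null at the end
        boolean cardDirty;  // stand-in for the card covering this object's discovered field
        Ref(String name, Ref discovered) { this.name = name; this.discovered = discovered; }
    }

    /** Earlier passes: unlink cur with a plain store and no barrier. */
    static void removeRaw(Ref prev, Ref cur) {
        prev.discovered = cur.discovered;
        cur.discovered = null;
    }

    /** Last pass: apply the post barrier to the discovered field of every remaining entry. */
    static List<String> finalPass(Ref head) {
        List<String> dirtied = new ArrayList<>();
        for (Ref r = head; r != null; r = r.discovered) {
            r.cardDirty = true;
            dirtied.add(r.name);
        }
        return dirtied;
    }

    public static void main(String[] args) {
        Ref wr3 = new Ref("WR3", null);
        Ref wr2 = new Ref("WR2", wr3);
        Ref wr1 = new Ref("WR1", wr2);
        removeRaw(wr1, wr2);                     // WR2 is pruned without a barrier
        System.out.println(finalPass(wr1));      // prints [WR1, WR3]
    }
}
```

The point of the sketch is that the pruning step no longer has to get every barrier right, because each entry still on the list has its card dirtied once at the end.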

I=Crash -> H, L=Hard to provoke -> L, W=None -> H; HLH=P2
19-05-2014

The following patch fixes the problem and makes the above test pass. It just adds back the post barrier that was removed before. Not sure that this is the right way to solve the problem.

diff --git a/src/share/vm/memory/referenceProcessor.cpp b/src/share/vm/memory/referenceProcessor.cpp
--- a/src/share/vm/memory/referenceProcessor.cpp
+++ b/src/share/vm/memory/referenceProcessor.cpp
@@ -365,6 +365,7 @@
       // Self-loop next, so as to make Ref not active.
       // Post-barrier not needed when looping to self.
       java_lang_ref_Reference::set_next_raw(obj, obj);
+      oopDesc::bs()->write_ref_field(java_lang_ref_Reference::next_addr(obj), obj);
       if (next_d == obj) {  // obj is last
         // Swap refs_list into pending_list_addr and
         // set obj's discovered to what we read from pending_list_addr.
15-05-2014

Here's an email that Stefan wrote to explain the situation. The test mentioned in the email is attached to this bug report (but it is not cleaned up as the email suggests :) ).

This most likely happens because of a missing write barrier in the DiscoveredListIterator::remove function when running mixed GCs. Bengt and I (and Per) tried to reproduce this with a smaller test and managed to write a reproducer that causes us to miss dirtying the card for a Reference.

The test tries to place three WeakReferences (WR1, WR2, WR3) in the discovered list, all linked through the discovered field. WR1 is placed in the old gen, while the others are kept in the young gen. Then we make sure that the referent of WR2 is kept alive, so that WR2 gets cut out of the discovered list. WR1.discovered will now be set to point to WR3 without any write barrier!

I've attached a test, for the curious. Beware that there are a lot of subtleties in the test that aren't explained. We'll refine it before attaching it to a bug report.

We get the expected verification error with this command line:

ALT_JAVA_HOME=/localhome/java/jdk-8-fcs-bin-b132 ~/hg/hs-gc/build/linux/linux_amd64_compiler2/debug/hotspot -XX:+VerifyBeforeGC -XX:+VerifyAfterGC -Dgud -XX:-ResizePLAB -DXX:MaxTLABSize=1024K -XX:MinTLABSize=300K -XX:TLABSize=3000K -XX:+PrintPLAB -Dgud -XX:-UseCompressedOops -Xmx16m -Xmn2m -XX:+UseG1GC -DXX:+PrintHeapAtGC{Extended,} -XX:+PrintGC -XX:+UnlockExperimentalVMOptions -XX:G1HeapWastePercent=0 -XX:G1MixedGCLiveThresholdPercent=2000 -DXX:InitiatingHeapOccupancyPercent=1 -XX:MaxTenuringThreshold=2 -XX:ParallelGCThreads=1 -XX:+ExplicitGCInvokesConcurrent ReferenceCrasher 10000
15-05-2014
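
The WR1/WR2/WR3 scenario from the email can be mimicked with a small toy model. This is only a sketch and is not the attached reproducer: all names are hypothetical, each object stands in for its own card, the list end is modelled as null, and the model starts with all cards clean, which glosses over how discovery itself treats cards.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of pruning a discovered list; names are hypothetical, not HotSpot code. */
final class DiscoveredListSketch {
    static final class Ref {
        final String name;
        final boolean inOldGen;
        Ref discovered;  // link to the next entry in the discovered list
        Ref(String name, boolean inOldGen) { this.name = name; this.inOldGen = inOldGen; }
    }

    /** One "card" per object, to keep the model small. */
    private final Set<Ref> dirtyCards = new HashSet<>();

    /** Unlinks cur from the list, as happens when its referent is still alive. */
    void remove(Ref prev, Ref cur, boolean useBarrier) {
        prev.discovered = cur.discovered;         // plain store
        if (useBarrier) dirtyCards.add(prev);     // post barrier on prev's discovered field
        cur.discovered = null;
    }

    /** Verification: every old-to-young link must sit on a dirty card. */
    boolean verify(Ref head) {
        for (Ref r = head; r != null; r = r.discovered) {
            Ref next = r.discovered;
            if (r.inOldGen && next != null && !next.inOldGen && !dirtyCards.contains(r)) {
                return false;
            }
        }
        return true;
    }

    /** Builds WR1 (old) -> WR2 (young) -> WR3 (young), prunes WR2, then verifies. */
    static boolean pruneAndVerify(boolean useBarrier) {
        DiscoveredListSketch s = new DiscoveredListSketch();
        Ref wr1 = new Ref("WR1", true);
        Ref wr2 = new Ref("WR2", false);
        Ref wr3 = new Ref("WR3", false);
        wr1.discovered = wr2;
        wr2.discovered = wr3;
        s.remove(wr1, wr2, useBarrier);           // WR1.discovered now points at WR3
        return s.verify(wr1);
    }

    public static void main(String[] args) {
        System.out.println(pruneAndVerify(false)); // prints false: missed card mark
        System.out.println(pruneAndVerify(true));  // prints true
    }
}
```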

Reproducer
15-05-2014