JDK-8241137 : AArch64: Volatile accesses are not sequentially consistent with VM option "-XX:+UseBarriersForVolatile"
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 15
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • CPU: aarch64
  • Submitted: 2020-03-18
  • Updated: 2020-10-14
  • Resolved: 2020-10-14
Other: tbd (Resolved)
Related Reports
Duplicate :  
Relates :  
Relates :  
Description
Several Renaissance benchmarks hang on AArch64 with the VM option "-XX:+UseBarriersForVolatile". This happens on two types of AArch64 platforms from different partners. After analysing the cause, we found that it is closely related to the sequential-consistency issue described in JDK-8179954.

To resolve the issue caused by mixing "LDR, DMB" for volatile loads with "STLR/STLXR" for volatile stores, the fix for JDK-8179954 inserts a full barrier before the volatile load in C1 and the interpreter. However, that barrier is only emitted under the condition "UseBarriersForVolatile == false".
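The ordering at stake is the classic store-buffering shape: each thread does a volatile store followed by a volatile load of the other variable. The Java memory model forbids both loads from returning 0, which is exactly what is lost if a store-release ("STLR") can be reordered with a later plain "LDR". Below is a minimal Java sketch of that shape. The class name StoreBufferingLitmus and its fields are hypothetical and not taken from the Renaissance benchmarks. Because it uses short-lived threads that mostly run in the interpreter, it illustrates the pattern and is not expected to be a reliable reproducer.

```java
final class StoreBufferingLitmus {
    // Volatile fields: accesses must appear sequentially consistent.
    static volatile int x, y;
    // Plain result fields; they are read only after join(), which
    // establishes happens-before with the writer threads.
    static int r1, r2;

    /** Returns how many iterations observed the forbidden outcome r1 == 0 && r2 == 0. */
    static int run(int iterations) {
        int bothZero = 0;
        for (int i = 0; i < iterations; i++) {
            x = 0;
            y = 0;
            // Each thread: volatile store, then volatile load of the other field.
            Thread t1 = new Thread(() -> { x = 1; r1 = y; });
            Thread t2 = new Thread(() -> { y = 1; r2 = x; });
            t1.start();
            t2.start();
            try {
                t1.join();
                t2.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            if (r1 == 0 && r2 == 0) {
                bothZero++; // not allowed for volatile accesses
            }
        }
        return bothZero;
    }

    public static void main(String[] args) {
        System.out.println("forbidden outcomes: " + run(2000));
    }
}
```

On a VM that implements volatile correctly, the count should stay at zero. Any non-zero count would indicate a store being reordered with the following load.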

With option "-XX:+UseBarriersForVolatile", C2 inserts a leading full barrier and a trailing "LOAD_LOAD | LOAD_STORE" barrier for atomics. 
E.g., the code generated for CAS:
                dmb     ish
        retry:
                ldxr    w0, [address]
                cmp     w0, w1
                b.ne    done
                stlxr   w8, w2, [address]
                cbnz    w8, retry
        done:
                dmb     ishld

The trailing "LOAD_LOAD | LOAD_STORE" barrier cannot guarantee the ordering between the "STLXR" and a subsequent "LDR" issued for a volatile load. A full barrier is needed here to guarantee sequential consistency. As a result, both inserting a full barrier before the volatile load in C1/interpreter (removing the "UseBarriersForVolatile" check) and changing the trailing barrier of atomics to "MemBarVolatile" resolve the hang on one machine.
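At the Java level, the problematic sequence corresponds to a CAS on one variable followed by a volatile load of another. The sketch below shows that shape, with an optional explicit VarHandle.fullFence() after the CAS as a rough source-level analogue of a trailing full barrier. It is an illustration under assumptions: the class name CasThenLoadLitmus and the explicitFence parameter are hypothetical, and the explicit fence is not the change proposed for HotSpot, which would be made in the code generator.

```java
import java.lang.invoke.VarHandle;
import java.util.concurrent.atomic.AtomicInteger;

final class CasThenLoadLitmus {
    // Plain result fields, read only after join().
    static int r1, r2;

    /**
     * Each thread performs a CAS on one variable and then a volatile load
     * of the other. Returns how many iterations saw both loads return 0,
     * which sequential consistency forbids.
     */
    static int run(int iterations, boolean explicitFence) {
        int bothZero = 0;
        for (int i = 0; i < iterations; i++) {
            AtomicInteger a = new AtomicInteger();
            AtomicInteger b = new AtomicInteger();
            Thread t1 = new Thread(() -> {
                a.compareAndSet(0, 1);          // store half of the CAS
                if (explicitFence) {
                    VarHandle.fullFence();      // explicit full barrier after the CAS
                }
                r1 = b.get();                   // volatile load
            });
            Thread t2 = new Thread(() -> {
                b.compareAndSet(0, 1);
                if (explicitFence) {
                    VarHandle.fullFence();
                }
                r2 = a.get();
            });
            t1.start();
            t2.start();
            try {
                t1.join();
                t2.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            if (r1 == 0 && r2 == 0) {
                bothZero++;
            }
        }
        return bothZero;
    }
}
```

Calling run(n, false) exercises the plain CAS-then-load sequence, while run(n, true) adds the explicit fence. A correct VM should return zero in both cases.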

Apart from the atomics, there are other uses of "STLR / STLXR". These might be what makes the hang happen on the other Arm machine. I'm not sure whether it is the same issue as JDK-8179954, but the hang can also be resolved by inserting a full barrier before the volatile load in the interpreter (code in "./src/hotspot/cpu/aarch64/templateTable_aarch64.cpp"). Unfortunately, I have not yet found the code that generates the incorrect reordering.

So do we actually need to consider the VM option "-XX:+UseBarriersForVolatile"? If it is needed and used somewhere, I think the issue is worth fixing.
Comments
Bugs should only be closed as "Fixed" if there is an actual fix (changeset). Closing as duplicate of JDK-8243339.
14-10-2020