Relates :
|
|
Relates :
|
|
Relates :
|
|
Relates :
|
|
Relates :
|
David Holmes writes: Okay Dave Dice has managed to untangle all this for me. Thanks Dave! Basic problem is that we have three separate, but related, API's defined in OrderAccess: 1. Membars These are basic membar operations to provide storestore, storeload, loadstore and loadload. 2. acquire()/release()/fence() These are "traditional" acquire/release semantics that provide the basic "roach-motel" property: // A acquire(); // B release(); // C accesses in A and C can move into section B. But nothing in B can move into A or C. Plus we have the heavyweight fence(). 3. load_acquire/release_store/store_fence/release_store_fence These are/were intended primarily to map to the ia64 instruction ld.acq and st.rel (load with acquire semantics, store with release semantics), plus additional fences as needed - and as explained on orderAccess.hpp. The critical point here is that: release_store(addr, v) IS NOT THE SAME AS release(); store(addr, v) because the ia64 release semantics are stronger than general release() semantics in that the store (region C) can not be moved up ( to region B). So release_store(lock, 0) is exactly what we want when doing the IUnlock because it is equivalent to: storeStore|storeload; store(lock, 0); with regards to the store to the lock. Implementing release_store as "storeStore|storeload; store" is valid but provides stronger semantics (the barriers are with respect to all subsequent stores) than the ia64 st.rel. The problem is that "we" have confused the implementation by defining some of these methods in terms of the others, in an incorrect way eg: storestore() != release() release_store(addr, v) != release(); *addr=v; So we need to fix this so as not to confuse others - particularly other porters. It seems to me that having explained the above, the release() and acquire() functions are not useful as we will typically want load_acquire() and store_release() for Java volatile semantics; release_store for monitor unlock, and the CAS on monitor locking gives us a full fence(). Note: on TSO systems there is no runtime problem because these store barriers are no-ops anyway. On other platforms it is not an issue if a strong enough h/w barrier is used for the actual implementation eg: DMB on ARM, lwsync on PPC (see JSR133 Compiler Writers Cookbook). But it is important that the correct semantics are expressed in the code.
|