JDK-8158854 : Ensure release_store is paired with load_acquire in lock-free code
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 9
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2016-06-06
  • Updated: 2016-09-22
  • Resolved: 2016-08-30
  • Fix Version: JDK 9 b137 (Fixed)
Description
In the follow-up to JDK-8154750, Zhengyu Gu wrote:
>
> In classLoader.hpp
>
>  // Next entry in class path
>   ClassPathEntry* next() const { return _next; }
>   void set_next(ClassPathEntry* next) {
>     // may have unlocked readers, so write atomically.
>     OrderAccess::release_store_ptr(&_next, next);
>   }
>
> If there are unlocked readers, looks like that load_acquire is needed in
> next() function. 

In general we need to verify that release_store is always paired with load_acquire on lock-free paths.
Comments
Here are the uses of release_store_ptr that I have examined:

./share/vm/utilities/hashtable.inline.hpp: OrderAccess::release_store_ptr(&_entry, l);
Ok - has matching load-acquire.

---
./share/vm/oops/instanceKlass.hpp: { OrderAccess::release_store_ptr(&_methods_jmethod_ids, jmeths); }
./share/vm/oops/instanceKlass.cpp: OrderAccess::release_store_ptr(&_oop_map_cache, oop_map_cache);
./share/vm/oops/instanceKlass.cpp: OrderAccess::release_store_ptr(&jmeths[idnum+1], id);
Ok - has matching load-acquire.

---
./share/vm/classfile/classLoaderData.cpp: OrderAccess::release_store_ptr(&_klasses, k);
./share/vm/classfile/classLoaderData.cpp: OrderAccess::release_store_ptr(&_packages, packages);
./share/vm/classfile/classLoaderData.cpp: OrderAccess::release_store_ptr(&_modules, modules);
./share/vm/classfile/classLoaderData.cpp: OrderAccess::release_store_ptr(&_metaspace, metaspace);
Ok - has matching load-acquire.

---
./share/vm/code/compiledMethod.hpp: void release_set_exception_cache(ExceptionCache *ec) { OrderAccess::release_store_ptr(&_exception_cache, ec); }
OK. This has the comment (ref JDK-8151956):
// Note: _exception_cache may be read concurrently. We rely on memory_order_consume here.
By which they mean that they are relying on the cache field only being read within a data-dependency chain that ensures the correct ordering - ref: http://preshing.com/20140709/the-purpose-of-memory_order_consume-in-cpp11/

---
./share/vm/oops/cpCache.cpp: OrderAccess::release_store_ptr(&_indices, _indices | ((u_char)code << bytecode_1_shift));
./share/vm/oops/cpCache.cpp: OrderAccess::release_store_ptr(&_indices, _indices | ((u_char)code << bytecode_2_shift));
Ok - there is a set protocol for accessing these fields which includes first reading a particular field with load-acquire.

---
./share/vm/oops/oop.inline.hpp: OrderAccess::release_store_ptr(&_mark, m);
Ok - mark management is highly complex and most paths involve Atomic::cmpxchg_ptr operations to see updates.
---
./share/vm/services/memoryPool.cpp: OrderAccess::release_store_ptr(&_memory_pool_obj, pool_obj);
./share/vm/services/memoryManager.cpp: OrderAccess::release_store_ptr(&_memory_mgr_obj, mgr_obj);
Ok - load-acquire is used (even within a locked region, where it is unnecessary).

---
./share/vm/gc/g1/g1CodeCacheRemSet.cpp: OrderAccess::release_store_ptr(&_table, temp);
./share/vm/gc/g1/g1CodeCacheRemSet.cpp: OrderAccess::release_store_ptr(&_table, temp);
Ok - uses load-acquire.

---
./share/vm/gc/g1/heapRegionRemSet.cpp: OrderAccess::release_store_ptr(&_hr, hr);
Ok - uses load-acquire.

---
./share/vm/oops/methodData.hpp: OrderAccess::release_store_ptr(&_cells[index], value);
This usage is highly suspect. The methodData code has a lot of potential paths to release_store_ptr, and in some places a simple release(), combined with some locking. There are no load-acquire uses; in fact there are a couple of comments:
// No need for "OrderAccess::load_acquire" ops,
// since the data structure is monotonic.
I do not know what that is intended to mean. It is very unclear how the accessors are used in this code. Filed: JDK-8162826

---
./share/vm/classfile/classLoader.hpp: OrderAccess::release_store_ptr(&_next, next);
This is the initial example from the bug description. The release happens in set_next, which simply links in the next ClassPathEntry. A number of code paths traverse these linked lists of ClassPathEntry objects, and while many of them occur during VM initialization (and/or creation of the list occurs during VM initialization) it seems prudent to add acquire semantics to the next() accessor. If this is considered a concern then we can add an additional accessor with acquire semantics and use it as appropriate - but one would then argue the same is needed for set_next(). Status: fixed.
---
./share/vm/classfile/verifier.cpp: OrderAccess::release_store_ptr(&_verify_byte_codes_fn, func);
./share/vm/classfile/verifier.cpp: OrderAccess::release_store_ptr(&_verify_byte_codes_fn, func);
Needs fixing: missing load-acquire of _verify_byte_codes_fn. There is also an unnecessary release_store of a boolean in the same code. Status: fixed

---
./share/vm/classfile/dictionary.cpp: OrderAccess::release_store_ptr(&_pd_set, new_head);
Highly suspect: no load-acquire, and also many plain stores to the _pd_set field. But upon investigation it seems the reads occur at safepoints, so the release_store may be unnecessary. Filed: JDK-8164207 for follow-up investigation.

---
./share/vm/runtime/objectMonitor.cpp: OrderAccess::release_store_ptr(&_owner, NULL); // drop the lock
./share/vm/runtime/objectMonitor.cpp: OrderAccess::release_store_ptr(&_owner, NULL); // drop the lock
./share/vm/runtime/objectMonitor.cpp: OrderAccess::release_store_ptr(&_owner, NULL);
Ok. The release-store is really intended to be a storestore|storeload barrier, but was implemented as release_store for historical reasons. The plain load of _owner is not an issue because we only care about additional lock state if we become the owner, and that can only happen after a successful CAS.

---
./share/vm/runtime/synchronizer.cpp: OrderAccess::release_store_ptr(&gBlockList, temp);
Ok. There is one read of gBlockList that doesn't use load-acquire, but it is in a locked section of code. Some uses of load-acquire occur at safepoints so are not strictly needed, but harmless.
17-08-2016

Uses of release_store() examined:

./share/vm/utilities/array.hpp: void release_at_put(int which, T contents) { OrderAccess::release_store(adr_at(which), contents); }
./share/vm/prims/jni.cpp: OrderAccess::release_store(&vm_created, 0);
./share/vm/code/compiledMethod.hpp: void increment_count() { OrderAccess::release_store(&_count, _count + 1); }
./share/vm/code/compiledMethod.cpp: OrderAccess::release_store((volatile jubyte*)&_unloading_clock, unloading_clock);
./share/vm/oops/typeArrayOop.hpp: void release_byte_at_put(int which, jbyte contents) { OrderAccess::release_store(byte_at_addr(which), contents); }
./share/vm/oops/oop.inline.hpp: various
./share/vm/runtime/mutex.cpp: OrderAccess::release_store(&_LockWord.Bytes[_LSBINDEX], 0); // drop outer lock
./share/vm/runtime/thread.inline.hpp: OrderAccess::release_store((volatile jint*)&_thread_state, (jint)s);
Ok - all of the above have a suitably paired load-acquire, or else do the initial reads via an Atomic operation.

---
./share/vm/runtime/perfMemory.cpp: OrderAccess::release_store(&_initialized, 1);
Incorrect usage: the initialization logic is broken. We write _initialized last with release_store, but there is no load_acquire. Worse, the only code that checks _initialized is PerfMemory::destroy. The actual PerfMemory functions either don't check initialization or use an assert that checks a different field! Filed: JDK-8163437

---
./share/vm/gc/shared/taskqueue.inline.hpp: OrderAccess::release_store(&_bottom, increment_index(localBot));
./share/vm/gc/shared/taskqueue.inline.hpp: OrderAccess::release_store(&_bottom, increment_index(localBot));
Complex: this GC code uses a complex mix of raw accessors, load_acquire and various fences. This code is a quagmire I will not enter as part of this bug. The taskQueue issues are well known to the GC team.

---
./share/vm/gc/shared/cardTableModRefBS.inline.hpp: OrderAccess::release_store((volatile jbyte*) byte, dirty_card);
Suspect: a specific API that can use release semantics if requested, but there doesn't appear to be any matching acquire-using API. That is okay if, for example, accesses always occur at a safepoint. Filed: JDK-8163440
09-08-2016

Adding Derek to watching list.
08-07-2016

I've investigated the *Klass::array_klass_impl uses of double-checked locking and converted them from using OrderAccess::storestore to a "store release" paired with a "load acquire".
05-07-2016

Here is my analysis of ClassLoaderData::Dependencies. The Dependencies class represents a singly-linked list of dependent class-mirror or classloader oops, accessed from _list_head. The nodes in this list are two-element objArrayOops: arr[0] is the classloader or class-mirror oop; arr[1] is the link to the next node. The Dependencies instance for a ClassLoaderData is initialized (to the two-element array) shortly after the CLD is created. The CLDs themselves are published into the ClassLoaderDataGraph using Atomic::cmpxchg, so the CLD, and thus its Dependencies instance, is safely published initially.

The Dependencies list is updated using ClassLoaderData::Dependencies::add(Handle mirror_or_loader, TRAPS). This occurs in two steps:
- First we walk the existing list, from _list_head, looking for the mirror_or_loader, with no locks held. If we find it we are done; otherwise
- we create a new node for this apparently new dependency, acquire the lock for the list (which is the monitor of the objArrayOop that _list_head refers to), and search again. If it is found we are done; else we insert the new dependency at the end of the list.

It is important to note that the mirror_or_loader object is itself already safely published - this is not a case where we create it, then insert it into the list, and so have to worry about other threads seeing a partially initialized object. So our only concern with the lock-free traversal of the list, when adding, is whether things work correctly if we see a "next" link that is null when it should not be, or a mirror_or_loader element that is null when it should not be. In both cases, if we see an unexpected null we will simply continue to the locked_add part of the operation, and under the lock we must see the correct values. Note that the list never shrinks or has nodes removed - the entire list is discarded when the CLD is discarded.

So I conclude that the lock-free traversal in ClassLoaderData::Dependencies::add is perfectly safe without the need for additional memory barriers.

The dependency list is also traversed via oops_do:
void ClassLoaderData::Dependencies::oops_do(OopClosure* f) { f->do_oop((oop*)&_list_head); }
As long as this can only happen at a safepoint - which I believe is always the case with oops_do methods - this also needs no further memory barrier. We cannot be at a safepoint and at the same time have a thread in the middle of updating the list; the safepoint protocol ensures that the traversing thread is fully synchronized with the mutators, so the list must be stable and visible.

In conclusion, I see no need for any additional memory barriers in relation to the lock-free examination of the Dependencies linked list.
05-07-2016

Not sure what safepoint housekeeping tasks you are referring to specifically but agree they are unrelated to this bug :)
27-06-2016

VirtualSpaceList::contains() also seems problematic. Unrelated to this particular bug: I notice that these lock-free lookup/check patterns are used quite frequently, but they force more and more housekeeping tasks into safepoints, which is counterproductive as we are trying to reduce GC pause times. Thoughts?
26-06-2016

On 22/06/2016 2:55 AM, Coleen Phillimore wrote:
> This looks like another instance of double check locking.
>
> Klass* InstanceKlass::array_klass_impl(instanceKlassHandle this_k, bool or_null, int n, TRAPS) {
>   if (this_k->array_klasses() == NULL) {
>     if (or_null) return NULL;
>
>     ResourceMark rm;
>     JavaThread *jt = (JavaThread *)THREAD;
>     {
>       // Atomic creation of array_klasses
>       MutexLocker mc(Compile_lock, THREAD);   // for vtables
>       MutexLocker ma(MultiArray_lock, THREAD);
>
>       // Check if update has already taken place
>       if (this_k->array_klasses() == NULL) {
>         Klass* k = ObjArrayKlass::allocate_objArray_klass(this_k->class_loader_data(), 1, this_k, CHECK_NULL);
>         this_k->set_array_klasses(k);
>       }
>     }
>   }
>   // _this will always be set at this point
>   ObjArrayKlass* oak = (ObjArrayKlass*)this_k->array_klasses();
>   if (or_null) {
>     return oak->array_klass_or_null(n);
>   }
>   return oak->array_klass(n, THREAD);
> }
21-06-2016

method.hpp: set_method_data() and method_data() do not match
21-06-2016

ClassLoaderData::Dependencies::add() is a tricky one. Can we walk the list lock-free? Technically I think not.
20-06-2016

ClassLoaderData::Dependencies::add() and locked_add() exhibit a similar problem - or does this imply that ObjectLocker provides a full fence?
20-06-2016