JDK-8206471 : Race with ConcurrentHashTable deleting items on insert with cleanup thread
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 11
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2018-07-06
  • Updated: 2019-09-19
  • Resolved: 2018-07-10
Fixed In
  • JDK 11: 11 (Fixed)
  • JDK 12: 12 b02 (Fixed)
Description
While working on the SymbolTable changes, we are seeing a crash in

test/hotspot/jtreg/vmTestbase/metaspace/staticReferences/StaticReferences.java

From the crash, it looks like the crashing thread is cleaning concurrently (all line numbers below refer to concurrentHashTable.inline.hpp) here:

509        if (!HaveDeletables<IsPointer<VALUE>::value, EVALUATE_FUNC>::
510            have_deletable(bucket, eval_f, prefetch_bucket)) {
511            // Nothing to remove in this bucket.
512            continue;
513        }

which calls down into:

266        if (next->next() != NULL) {
267          Prefetch::read(*next->next()->value(), 0);
268        }

The next->value() pointer is 0x8, i.e. a near-NULL address.
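
A fault address of 0x8 is consistent with calling value() through a Node pointer that raced to NULL: value() returns the address of a field at a small offset, so NULL plus that offset gives 0x8. A minimal sketch of the arithmetic, assuming an illustrative node layout (field names and offsets are assumptions, not the actual HotSpot layout):

    #include <cstddef>
    #include <cstdio>

    // Illustrative 64-bit layout; names and offsets are assumptions.
    struct Node {
      Node* _next;    // offset 0
      void* _value;   // offset 8
      void** value() { return &_value; }
    };

    int main() {
      // Calling value() through a NULL Node* is undefined behavior, but in
      // practice it computes NULL + offsetof(Node, _value) == 0x8 -- the
      // address seen in the crash.
      std::printf("offset of _value: %zu\n", offsetof(Node, _value));
      return 0;
    }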

Another thread is trying to insert and has decided to do cleanup during its fast insert, reaching delete_in_bucket() on line 926:

922      } else if (i == 0 && clean) {
923        // We only do cleaning on fast inserts.
924        Bucket* bucket = get_bucket_locked(thread, lookup_f.get_hash());
925        assert(bucket->is_locked(), "Must be locked.");
926        delete_in_bucket(thread, bucket, lookup_f);
927        bucket->unlock();

In delete_in_bucket(), the other thread is trying to write_synchronize():

562        GlobalCounter::write_synchronize();
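
For context, write_synchronize() implements an RCU-style handshake: a deleter first unlinks a node (so no new reader can find it), then waits until every reader that might already hold a pointer to it has left its critical section, and only then frees it. A minimal sketch of the idea, using a single global reader counter as a stand-in (the real GlobalCounter uses per-thread counters and is considerably more subtle):

    #include <atomic>
    #include <thread>

    struct Node { Node* next; int value; };

    static std::atomic<int> g_readers{0};   // stand-in for GlobalCounter

    void reader(Node* head) {
      g_readers.fetch_add(1, std::memory_order_acquire);  // enter critical section
      for (Node* n = head; n != nullptr; n = n->next) {
        (void)n->value;                  // safe: the deleter waits for us
      }
      g_readers.fetch_sub(1, std::memory_order_release);  // leave critical section
    }

    void delete_unlinked(Node* dead) {
      // Precondition: 'dead' is already unlinked, so only readers that were
      // in flight before the unlink can still reach it.
      while (g_readers.load(std::memory_order_acquire) != 0) {
        std::this_thread::yield();       // the "write_synchronize" wait
      }
      delete dead;                       // no reader can hold a pointer now
    }

    int main() {
      Node* n = new Node{nullptr, 42};
      std::thread t(reader, n);
      t.join();
      delete_unlinked(n);                // pretend n was unlinked by CAS first
      return 0;
    }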

I've had and discarded several theories about the lock/critical-section ordering. I'm trying to see whether the crash still reproduces without the prefetching, because one thread might be reading entries from the prefetched bucket while the inserting thread is deleting those entries. But the bucket linked-list pointers should be updated with CAS, so that should be OK (?). A sketch of that CAS update follows below.
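
For reference, a minimal sketch of the CAS-based head insertion that last sentence alludes to (an assumed shape for illustration, not the actual ConcurrentHashTable code):

    #include <atomic>

    struct Node {
      Node* _next;
      int   _value;
    };

    struct Bucket {
      std::atomic<Node*> _first;

      // Insert at the head of the chain; retry if another thread swapped
      // _first between our load and our CAS.
      void cas_insert(Node* n) {
        Node* old_head = _first.load(std::memory_order_acquire);
        do {
          n->_next = old_head;           // old_head is refreshed on CAS failure
        } while (!_first.compare_exchange_weak(old_head, n,
                                               std::memory_order_release,
                                               std::memory_order_acquire));
      }
    };

Note that CAS keeps the list structure consistent, but it says nothing about the lifetime of a node a concurrent reader has already loaded or prefetched; that is exactly the gap the critical-section/write_synchronize protocol has to cover.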

Comments
Robbin's reply:

Hi, I'm not happy with the double load of next.

diff -r 7d078d2daacc src/hotspot/share/utilities/concurrentHashTable.inline.hpp
--- a/src/hotspot/share/utilities/concurrentHashTable.inline.hpp  Thu Jul 05 14:35:03 2018 -0700
+++ b/src/hotspot/share/utilities/concurrentHashTable.inline.hpp  Fri Jul 06 12:14:25 2018 +0200
@@ -265,4 +265,5 @@
   }
-  if (next->next() != NULL) {
-    Prefetch::read(*next->next()->value(), 0);
+  Node* next_perf = next->next();
+  if (next_perf != NULL) {
+    Prefetch::read(*next_perf->value(), 0);
   }

Since we are inside a critical section, next->next() can give either NULL or a valid next Node. And if SymbolTable does the right thing, the value is always stable for a loaded Node. So the first load of next->next() can be non-NULL while the load in the prefetch is NULL.
06-07-2018
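
To make Robbin's point concrete, here is a self-contained sketch of the double-load hazard (illustrative types, not the HotSpot code): the first load of the shared next pointer can observe a node while a concurrent unlink makes the second load observe NULL.

    #include <atomic>

    struct Node {
      std::atomic<Node*> _next{nullptr};
      int _value{0};
      Node* next() { return _next.load(std::memory_order_acquire); }
      int* value() { return &_value; }
    };

    // Racy: two separate loads of the same shared pointer.
    int* racy_peek(Node* n) {
      if (n->next() != nullptr) {        // load #1 may see a node...
        return n->next()->value();       // ...load #2 may see NULL -> 0x8 crash
      }
      return nullptr;
    }

    // Fixed, matching Robbin's patch: load once into a local.
    int* safe_peek(Node* n) {
      Node* next_perf = n->next();       // single load, stable snapshot
      return next_perf != nullptr ? next_perf->value() : nullptr;
    }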