JDK-8181450 : assert in BasicHashtable::verify_table
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 10
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • CPU: x86_64
  • Submitted: 2017-06-01
  • Updated: 2019-05-22
  • Resolved: 2017-06-16
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 10
10 b21Fixed
Related Reports
Relates :  
Description
I got the following assert when running a hotspot push job to jdk10/hs:

#  Internal Error (/scratch/opt/jprt/T/P1/165928.mikael/s/hotspot/src/share/vm/utilities/hashtable.cpp:339), pid=4554, tid=4584
#  assert(max_bucket_count < ((1 + number_of_entries()/table_size())*10)) failed: Table is unbalanced

Specifically, this was running the linux_x64_3.8-fastdebug-c2-hotspot_tier1_gc_2 tests, and seemed to crash when running the TestNewRatioFlag$NewRatioVerifier test.
Comments
The python script run several times may have started with the same seed, therefore generating the same sequence that doesn't generate 10 of the same values (for bucket number). The attached C program uses both c++11 uniform_int_distribution and our os::random Park-Miller implementation, and does randomly get 10 values that are the same, like: ... uniform_int_distribution max bucket is 10 a.out: rand.cpp:23: void check(const char*, std::array<int, 858ul>): Assertion `max_bucket < 10' failed. Abort count 2502218 fail 110 test_random_with_mod max bucket is 9 uniform_int_distribution max bucket is 9 test_random_with_mod max bucket is 9 test_random_with_mod max bucket is 9 test_random_with_mod max bucket is 9 uniform_int_distribution max bucket is 9 uniform_int_distribution max bucket is 9 uniform_int_distribution max bucket is 9 test_random_with_mod max bucket is 10 a.out: rand.cpp:23: void check(const char*, std::array<int, 858ul>): Assertion `max_bucket < 10' failed. Abort count 2522220 fail 111 So the probability that this assert that I added is not zero for a random distribution. Experimentally, it's something on the order of 1/25000 runs of java (with -XX:VerifyOnExit). So it is less difficult to write a C++ program than solving the generalized Birthday problem. https://en.wikipedia.org/wiki/Birthday_problem
15-06-2017

https://stackoverflow.com/questions/2999075/generate-a-random-number-within-range/2999130#2999130 The mod function might be what is skewing the results.
13-06-2017

Running a simple python script that generates randint(0,1009) for 858 entries finds that no random number is repeated 10 times. They may be repeated 2 or maybe 3 times in multiple days of running. Without doing the math for the birthday problem, I don't think hitting a 10 deep bucket for 858 entries with a decently distributed hash function is statistically possible. Which leads to suspicion that the hash function that we're using for Symbol and class_loader does not distribute well for 1009 buckets. Just running java -version gets a bucket length of 4 for only 473 entries. java -Xlog:hashtables -version java version "10-internal" Java(TM) SE Runtime Environment (fastdebug build 10-internal+0-2017-06-02-153152.cphillim.10sysdict-perf2) Java HotSpot(TM) 64-Bit Server VM (fastdebug build 10-internal+0-2017-06-02-153152.cphillim.10sysdict-perf2, mixed mode) [0.278s][info][hashtables] System Dictionary max bucket size 4 bucket 349 element count 473 table size 1009 [0.278s][info][hashtables] Protection Domain Table max bucket size 0 bucket 0 element count 0 table size 2017
12-06-2017

The DictionaryEntry hash code comes from this code: The Symbol hash code is: ... _identity_hash = (short)os::random(); ... unsigned identity_hash() const { unsigned addr_bits = (unsigned)((uintptr_t)this >> (LogMinObjAlignmentInBytes + 3)); return ((unsigned)_identity_hash & 0xffff) | ((addr_bits ^ (_length << 8) ^ (( _body[0] << 8) | _body[1])) << 16); } SystemDictionary compute_hash is: unsigned int compute_hash(const Symbol* name, const ClassLoaderData* loader_data) const { unsigned int name_hash = name->identity_hash(); // loader is null with CDS assert(loader_data != NULL || UseSharedSpaces || DumpSharedSpaces, "only allowed with shared spaces"); unsigned int loader_hash = loader_data == NULL ? 0 : loader_data->identity_hash(); return name_hash ^ loader_hash; } unsigned int ClassLoaderData::identity_hash() const { return _class_loader == NULL ? 0 : _class_loader->identity_hash(); } ... intptr_t oopDesc::identity_hash() { // Fast case; if the object is unlocked and the hash value is set, no locking is needed // Note: The mark must be read into local variable to avoid concurrent updates. markOop mrk = mark(); if (mrk->is_unlocked() && !mrk->has_no_hash()) { return mrk->hash(); } else if (mrk->is_marked()) { return mrk->hash(); } else { return slow_identity_hash(); } } ... intptr_t oopDesc::slow_identity_hash() { // slow case; we have to acquire the micro lock in order to locate the header Thread* THREAD = Thread::current(); ResetNoHandleMark rnm; // Might be called from LEAF/QUICK ENTRY HandleMark hm(THREAD); Handle object(THREAD, this); return ObjectSynchronizer::identity_hash_value_for(object); } ... runtime/globals.hpp: experimental(intx, hashCode, 5, \ runtime/globals.hpp: "(Unstable) select hashCode generation algorithm") \ static inline intptr_t get_next_hash(Thread * Self, oop obj) { intptr_t value = 0; if (hashCode == 0) { // This form uses an unguarded global Park-Miller RNG, // so it's possible for two threads to race and generate the same RNG. // On MP system we'll have lots of RW access to a global, so the // mechanism induces lots of coherency traffic. value = os::random(); } else { // Marsaglia's xor-shift scheme with thread-specific state // This is probably the best overall implementation -- we'll // likely make this the default in future releases. unsigned t = Self->_hashStateX; t ^= (t << 11); Self->_hashStateX = Self->_hashStateY; Self->_hashStateY = Self->_hashStateZ; Self->_hashStateZ = Self->_hashStateW; unsigned v = Self->_hashStateW; v = (v ^ (v >> 19)) ^ (t ^ (t >> 8)); Self->_hashStateW = v; value = v; } ClassLoaderData:: should have an os::random() _identity_hash so that the class_loader oop doesn't have to do all of this (ie have a hashCode in the oop header and potentially safepoint with biased locking). This seems expensive. Also with the NULL class loader oop, the hash is only Symbol* hashcode. Maybe this would improve things. Or not.
02-06-2017

From Mikael's core file. I think the max_bucket_length is 10 (optimized away). From 858 entries on a 1009 bucket hashtable. That does seem bad. (gdb) print *this $8 = {<CHeapObj<(MemoryType)1>> = {<AllocatedObj> = { _vptr.AllocatedObj = 0x7f95e4375270 <vtable for Dictionary+16>}, <No data fields>}, _table_size = 1009, _buckets = 0x7f95dc0f7f30, _free_list = 0x0, _first_free_entry = 0x7f95dc3e2df0 '\361' <repeats 200 times>..., _end_block = 0x7f95dc3e4cf0 '\253' <repeats 16 times>, "\272\272\272\272\272\272\272\272U", _entry_size = 48, _number_of_entries = 858}
02-06-2017

It's the Dictionary.
01-06-2017