Currently the method to add entries to the inline pointers is somewhat inefficient:
// Check if the card is already stored in the pointer.
if (contains(card_idx, bits_per_card)) {
  return Found;
}
// Check if there is actually enough space.
if (card_pos_for(num_elems + 1, bits_per_card) >= BitsInValue) {
  return Overflow;
}
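I.e. contains() already has to look at the stored elements, so it could also return how many elements are currently in the card set container instead of that being determined separately for the space check.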
Also, on every iteration of the cmpxchg loop we restart the search for the element from the beginning. That is not necessary: values already inserted into the inline pointer are never overwritten, so after a failed cmpxchg only the entries added since the last check need to be looked at.
Improve this.
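
A minimal, self-contained sketch of the idea follows. It is not the HotSpot code: the layout (a 4-bit element count followed by fixed 12-bit cards in a 64-bit word), the constants, and the helpers num_cards_in(), card_at(), contains_from(), merge() and add() as written here are all assumptions made for illustration, and std::atomic stands in for the VM's own atomics. It also ignores the case where the pointer changes to a different container type. The element count is read once per iteration and reused for both the search bound and the space check, and the search resumes at the first entry that has not been checked yet.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

enum AddResult { Added, Found, Overflow };

// Hypothetical layout: low HeaderBits hold the element count,
// followed by fixed-width card entries.
constexpr unsigned HeaderBits  = 4;
constexpr unsigned BitsPerCard = 12;
constexpr unsigned MaxCards    = (64 - HeaderBits) / BitsPerCard;

inline unsigned num_cards_in(uint64_t v) {
  return static_cast<unsigned>(v & ((1u << HeaderBits) - 1));
}

inline unsigned card_at(uint64_t v, unsigned i) {
  return static_cast<unsigned>((v >> (HeaderBits + i * BitsPerCard)) &
                               ((1u << BitsPerCard) - 1));
}

// Searches only entries [start, num_elems) of v for card.
inline bool contains_from(uint64_t v, unsigned card,
                          unsigned start, unsigned num_elems) {
  for (unsigned i = start; i < num_elems; i++) {
    if (card_at(v, i) == card) {
      return true;
    }
  }
  return false;
}

// Appends card as entry number num_elems and bumps the count.
inline uint64_t merge(uint64_t v, unsigned card, unsigned num_elems) {
  assert(num_elems < MaxCards);
  assert(card < (1u << BitsPerCard));
  uint64_t without_count = v & ~uint64_t((1u << HeaderBits) - 1);
  return without_count |
         (uint64_t(card) << (HeaderBits + num_elems * BitsPerCard)) |
         (num_elems + 1);
}

AddResult add(std::atomic<uint64_t>& slot, unsigned card) {
  uint64_t value = slot.load(std::memory_order_relaxed);
  unsigned checked = 0;  // Entries [0, checked) are known not to hold card.
  while (true) {
    unsigned num_elems = num_cards_in(value);
    // Existing entries are never overwritten, so only the entries added
    // since the previous iteration need to be examined.
    if (contains_from(value, card, checked, num_elems)) {
      return Found;
    }
    checked = num_elems;
    // Reuse the count we already have for the space check.
    if (num_elems >= MaxCards) {
      return Overflow;
    }
    uint64_t new_value = merge(value, card, num_elems);
    // On failure, value is updated to the current contents of slot.
    if (slot.compare_exchange_strong(value, new_value,
                                     std::memory_order_relaxed)) {
      return Added;
    }
  }
}
```

With this shape, a retry after a lost race only scans the entries that the competing threads appended, rather than the whole inline pointer again.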