Parallel old GC suffers a fairly massive performance degradation when
running on x86-64/Linux. The degradation can also be seen in the recently
released 6u14 from Sun.
The cause is the code that gcc generates for these two functions in
ParMarkBitMap.hpp:
inline size_t ParMarkBitMap::bits_to_words(idx_t bits) {
  return bits * obj_granularity();
}
and
inline ParMarkBitMap::idx_t ParMarkBitMap::words_to_bits(size_t words) {
  return words / obj_granularity();
}
In both cases, the value returned by obj_granularity() is 1, so the
multiplication and the division are no-ops. However, gcc does not fold
them away and generates a div instruction when the code is called from
ParMarkBitMap::live_words_in_range().
See OpenJDK Bugzilla bug https://bugs.openjdk.java.net/show_bug.cgi?id=100006
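
One possible way to avoid the div, sketched here only as an illustration, is
to express the conversion as a shift by a compile-time constant. The
namespace, the constant obj_granularity_shift and the helper self_check()
below are hypothetical names chosen for this sketch and are not taken from
the HotSpot sources; the sketch assumes the granularity is a power of two
(a shift of 0 corresponds to the granularity of 1 described above).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the ParMarkBitMap helpers; not HotSpot source.
namespace sketch {

typedef std::size_t idx_t;

// Assumption: the granularity is a power of two known at compile time.
// A shift of 0 corresponds to a granularity of 1.
const unsigned obj_granularity_shift = 0;

// Equivalent to bits * granularity, written as a left shift.
inline std::size_t bits_to_words(idx_t bits) {
  return bits << obj_granularity_shift;
}

// Equivalent to words / granularity for unsigned values, written as a
// right shift so the compiler has no reason to emit a div instruction.
inline idx_t words_to_bits(std::size_t words) {
  return words >> obj_granularity_shift;
}

// With a shift of 0 both conversions are the identity.
inline void self_check() {
  assert(bits_to_words(7) == 7);
  assert(words_to_bits(7) == 7);
}

}  // namespace sketch
```

Whether a given compiler still emits a div for the original code depends on
whether it can see that obj_granularity() is constant at the call site, so
the generated assembly should be inspected before and after any such change.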