Bug ID: JDK-6814552 par compact - some compilers fail to optimize bitmap code

Details
Type: Enhancement
Submit Date: 2009-03-07
Status: Resolved
Updated Date: 2010-04-02
Project Name: JDK
Resolved Date: 2009-06-30
Component: hotspot
OS: generic
Sub-Component: gc
CPU: generic
Priority: P3
Resolution: Fixed
Affected Versions: hs14
Fixed Versions: hs16 (b05)

Description
Parallel old GC shows a substantial performance degradation when running on
x86-64/Linux. The degradation can also be seen in the recently released 6u14
from Sun.

The issue is the code generated for these functions in
parMarkBitMap.hpp:

inline size_t ParMarkBitMap::bits_to_words(idx_t bits) {
        return bits * obj_granularity();
}

and

inline ParMarkBitMap::idx_t ParMarkBitMap::words_to_bits(size_t words) {
        return words / obj_granularity();
}

In both cases, the value returned by obj_granularity() is 1. However, gcc
still generates a div instruction when this code is called from
ParMarkBitMap::live_words_in_range().

See openjdk bugzilla bug https://bugs.openjdk.java.net/show_bug.cgi?id=100006

                                    

Comments
EVALUATION

The g++ compiler failed to optimize away a divide by 1 in code that is important to GC performance.  This code has been part of parallel compaction since its initial release in JDK 5 update 6; it's possible that gcc changes resulted in the regression.  A simple fix is to change the code to use explicit shifts.
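
As a minimal sketch of why explicit shifts are a safe substitute here: for unsigned values and a power-of-two granularity, multiplying or dividing by the granularity gives the same result as shifting by its base-2 logarithm. The names kGranularity and kLogGranularity below are hypothetical stand-ins for MinObjAlignment and LogMinObjAlignment, and the value chosen is only illustrative; this is not the HotSpot code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for LogMinObjAlignment and MinObjAlignment.
// Assumption: the granularity is always a power of two.
constexpr int kLogGranularity = 1;  // illustrative value only
constexpr std::size_t kGranularity = std::size_t(1) << kLogGranularity;

// Arithmetic forms, mirroring the original bits_to_words/words_to_bits.
constexpr std::size_t bits_to_words_mul(std::size_t bits) {
  return bits * kGranularity;
}
constexpr std::size_t words_to_bits_div(std::size_t words) {
  return words / kGranularity;
}

// Shift forms, mirroring the suggested fix.
constexpr std::size_t bits_to_words_shift(std::size_t bits) {
  return bits << kLogGranularity;
}
constexpr std::size_t words_to_bits_shift(std::size_t words) {
  return words >> kLogGranularity;
}

// Compares both forms for every value in [0, limit).
inline bool shift_matches_arithmetic(std::size_t limit) {
  for (std::size_t n = 0; n < limit; ++n) {
    if (bits_to_words_mul(n) != bits_to_words_shift(n)) return false;
    if (words_to_bits_div(n) != words_to_bits_shift(n)) return false;
  }
  return true;
}
```

The equivalence relies on the operands being unsigned (size_t); for signed values, a right shift and a division can round differently on negative inputs.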
                                     
2009-03-07
EVALUATION

Using shift instead of div helps performance, but the more likely cause of the regression is changes to the BitMap class which inadvertently enabled calls to empty verification methods in the product build; see 6849716.
                                     
2009-06-10
SUGGESTED FIX

diff -r f89cf529c3c7 -r 348e8b681498 src/share/vm/gc_implementation/parallelScavenge/parMarkBitMap.hpp
--- a/src/share/vm/gc_implementation/parallelScavenge/parMarkBitMap.hpp	Mon Jun 08 16:14:19 2009 -0700
+++ b/src/share/vm/gc_implementation/parallelScavenge/parMarkBitMap.hpp	Sun Jun 07 22:08:24 2009 -0700
@@ -177,6 +177,7 @@
   // are double-word aligned in 32-bit VMs, but not in 64-bit VMs, so the 32-bit
   // granularity is 2, 64-bit is 1.
   static inline size_t obj_granularity() { return size_t(MinObjAlignment); }
+  static inline int obj_granularity_shift() { return LogMinObjAlignment; }
 
   HeapWord*       _region_start;
   size_t          _region_size;
@@ -299,13 +300,13 @@
 inline size_t
 ParMarkBitMap::bits_to_words(idx_t bits)
 {
-  return bits * obj_granularity();
+  return bits << obj_granularity_shift();
 }
 
 inline ParMarkBitMap::idx_t
 ParMarkBitMap::words_to_bits(size_t words)
 {
-  return words / obj_granularity();
+  return words >> obj_granularity_shift();
 }
 
 inline size_t ParMarkBitMap::obj_size(idx_t beg_bit, idx_t end_bit) const
                                     
2009-06-10
EVALUATION

http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/353ba4575581
                                     
2009-06-14


