JDK 25 |
---|
25 b17 Fixed |
The DaCapo xalan benchmark is around 14% slower with -XX:+UseObjectMonitorTable. For now the OM table is off by default, so this regression would show up once it is turned on by default.

I have tried out a couple of ideas to see if they affect the performance of xalan (I'm told it's pronounced zay-lon, not x-Alan).

Ideas:
1. Adjust the size of the OMCache: 2, 4, 8, 12, 24. None matter. Keeping it at 8.
2. Not use the OMCache at all: worse.
3. Not clear the OM cache during GC (added oops_do, which unfortunately keeps things alive). Better hit rate but no better performance overall.
4. Skip using the OM cache in the fast path (quick_enter), since it seems to repeat checks: no difference.
5. Took out spinning before inflating the monitor: worse, even though the hit rate is bad:
    _fast_lock_spin_failure = 37987135
    _fast_lock_spin_success = 556770
    _fast_lock_spin_attempt = 1039882

Doing a table or OM-cache lookup on each monitor enter is 14% worse, since these monitors are contended. Other benchmarks don't show this regression (except Dacapo23_spring, which is maybe the same thing).

perf on xalan shows the code mostly in ObjectMonitor::TrySpin, with and without the table. Adaptive spinning is something that really helps xalan though.

Added some counters to the runtime code (C1-only performance was equivalently slower with the OM table, so ignoring c2_MacroAssembler for now):

===== DaCapo 9.12-MR1 xalan PASSED in 4435 msec =====
    _om_cache_hits = 2456302
    _om_cache_misses = 1327485
    _try_enter_success = 1198359
    _try_enter_failure = 1257943
    _try_enter_slow_failure = 958268
    _try_enter_slow_success = 1672344
    _fast_lock_spin_attempt = 33427
    _fast_lock_spin_success = 4896
    _fast_lock_spin_failure = 28531
    _table_lookups = 1339097
    _table_hits = 1338926
    _items_count = 171
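To make the counter dump from the 4435 msec run easier to read, here is a small sketch that turns the raw counts into rates. The counter values are copied from the report. How each counter is read is my assumption and is not confirmed by the runtime code: hits plus misses is taken as total OM cache lookups, and spin success divided by spin attempts as the spin success rate. The helper name `rate` is made up for illustration.

```python
# Counter values copied verbatim from the xalan run in the report.
counters = {
    "_om_cache_hits": 2456302,
    "_om_cache_misses": 1327485,
    "_fast_lock_spin_attempt": 33427,
    "_fast_lock_spin_success": 4896,
    "_fast_lock_spin_failure": 28531,
    "_table_lookups": 1339097,
    "_table_hits": 1338926,
    "_items_count": 171,
}


def rate(part: int, total: int) -> float:
    """Return part/total as a fraction, or 0.0 when total is zero."""
    return part / total if total else 0.0


# Assumption: every OM cache lookup is counted as either a hit or a miss.
om_cache_lookups = counters["_om_cache_hits"] + counters["_om_cache_misses"]
om_cache_hit_rate = rate(counters["_om_cache_hits"], om_cache_lookups)

# Table hit rate, and the number of lookups that did not hit.
table_hit_rate = rate(counters["_table_hits"], counters["_table_lookups"])
table_misses = counters["_table_lookups"] - counters["_table_hits"]

# Assumption: success / attempt is the meaningful spin success rate.
spin_success_rate = rate(
    counters["_fast_lock_spin_success"], counters["_fast_lock_spin_attempt"]
)

print(f"OM cache hit rate: {om_cache_hit_rate:.2%}")
print(f"Table hit rate:    {table_hit_rate:.2%}")
print(f"Table misses:      {table_misses}")
print(f"Spin success rate: {spin_success_rate:.2%}")
```

Read this way, roughly two thirds of OM cache lookups hit, nearly every table lookup hits, and only about one in seven fast-lock spin attempts succeeds. The table miss count also comes out equal to `_items_count`, which would fit each miss inserting one new entry, though that reading is a guess from the numbers alone.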