JDK-8046133 : JEP 143: Improve Contended Locking
  • Type: JEP
  • Component: hotspot
  • Sub-Component: runtime
  • Priority: P3
  • Status: Closed
  • Resolution: Delivered
  • Fix Versions: 9
  • Submitted: 2011-11-30
  • Updated: 2017-03-06
  • Resolved: 2015-10-20
Sub Tasks
JDK-8049158 :  
Description
Summary
-------

Improve the performance of contended Java object monitors.


Goals
-----

Improve the overall performance of contended Java object monitors as
measured by the following benchmarks and tests:

  - CallTimerGrid (though more of a stress test than a benchmark)
  - Dacapo-bach (was dacapo2009)
      - avrora
      - batik
      - fop
      - h2
      - luindex
      - lusearch
      - pmd
      - sunflow
      - tomcat
      - tradebeans
      - tradesoap
      - xalan
  - DerbyContentionModelCounted
  - HighContentionSimulator
  - LockLoops-JSR166-Doug-Sept2009 (was LockLoops)
  - PointBase
  - SPECjbb2013-critical (was specjbb2005)
  - SPECjbb2013-max
  - specjvm2008
  - volano29 (was volano2509)


Non-Goals
---------

It is not a goal of this project to improve the performance of internal
VM monitors or mutexes; Java monitors and internal VM monitors/mutexes
are implemented by different code.  While some of the concepts in this
project might be applicable to internal VM monitors/mutexes, the code is
not directly applicable.

It is not a goal of this project to improve contended Java monitor
performance on every benchmark or test; in some cases there may be a
performance degradation in a specific benchmark or test. That performance
degradation might be considered acceptable in order to gain a performance
improvement on another benchmark or test.


Success Metrics
---------------

This project will be considered a success if there are demonstrable
performance gains as measured by the above benchmarks without
significant offsetting performance regressions.

There must not be a non-trivial performance regression for uncontended
locks.


Motivation
----------

Improving contended locking will significantly benefit real-world
applications, in addition to industry benchmarks such as Volano and
DaCapo.


Description
-----------

This project will explore performance improvements in the following areas
related to contended Java monitors:

  - Field reordering and cache line alignment
  - Speed up `PlatformEvent::unpark()`
  - Fast Java monitor enter operations
  - Fast Java monitor exit operations
  - Fast Java monitor `notify`/`notifyAll` operations

The original body of work also included changes for "faster hashcode";
since Java object hashcode support is not directly related to contended
Java monitors, that work will not be included in this project.

This project will also generate fixes for various bugs discovered during
the course of the work; these bug fixes will be managed independently of
the performance improvement work so that the fixes can be integrated
sooner.

This project is covered by the following "umbrella" bug for
administrative simplicity:

JDK-6607129 Reduce L2$ coherence miss traffic in contended lock spin loop,
specifically for derby on ctn-family

However, as sub-tasks or bug fixes are completed the work will be
integrated using a separate bug id.  This allows the entire project to be
referred to via one bug ID (JDK-6607129) while allowing incremental
improvements to be made available more quickly than waiting for the
entire project to complete.


Testing
-------

### Functional testing

There does not appear to be a specific set of functional tests
exclusively for Java monitors, nor is one necessary.  Java monitors are
so widely used by even the simplest of Java programs that almost any
functional breakage in Java monitors should be obvious.

### Stress Tests

There needs to be a set of well-known stress tests for Java monitors.
These can be targeted stress tests for specific Java monitor scenarios or
tests generally known to be heavy users of Java monitors run with
specific stress-inducing options.

Note: Use `-XX:-UseBiasedLocking -XX:+UseHeavyMonitors` to bypass both
biased locking and stack-based locking; this forces the use of
ObjectMonitor objects.

### Field reordering and cache line alignment sub-task stress tests

Stress tests should focus on generating high numbers of active
ObjectMonitor objects.  The targets of the stress testing are peak
ObjectMonitor usage, the ObjectMonitor block allocation algorithm and the
ObjectMonitor free list management code.  The following are the goals:

  1. To have the same or better peak ObjectMonitor usage for small to
     medium configurations,
  2. To have no memory leaks, and
  3. To have no data-structure management failures.
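
A driver for this kind of test might look like the following minimal sketch. The class name `ManyMonitorsStress` and its parameters are hypothetical and not part of the JEP. The sketch assumes it is run with the options from the note above so that every `synchronized` entry uses an ObjectMonitor. Peak ObjectMonitor usage and leaks are not visible from Java code, so they would have to be observed with external VM tooling; the sketch only creates the load and checks that no monitor entry was lost.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stress driver: each round allocates a fresh pool of objects and
// lets several threads lock every object once, so monitors are created in bulk
// and become reclaimable again when the pool is dropped.
final class ManyMonitorsStress {
    /** Returns the total number of monitor entries performed. */
    static long run(int threads, int objects, int rounds) {
        AtomicLong entries = new AtomicLong();
        for (int r = 0; r < rounds; r++) {
            Object[] pool = new Object[objects];
            for (int i = 0; i < objects; i++) {
                pool[i] = new Object();
            }
            Thread[] workers = new Thread[threads];
            for (int t = 0; t < threads; t++) {
                workers[t] = new Thread(() -> {
                    for (Object o : pool) {
                        synchronized (o) {
                            entries.incrementAndGet();
                        }
                    }
                });
                workers[t].start();
            }
            for (Thread w : workers) {
                try {
                    w.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while joining", e);
                }
            }
            // The pool becomes unreachable here, so its monitors can be recycled.
        }
        return entries.get();
    }

    public static void main(String[] args) {
        System.out.println("monitor entries = " + run(8, 10_000, 20));
    }
}
```

The returned count should equal `threads * objects * rounds`; any other value would indicate a failure in the test itself or in monitor handling.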

### Speed up `PlatformEvent::unpark()` sub-task stress tests

Stress tests should focus on high numbers of concurrent waiters and/or
concurrent enter-exit threads.  The mix of enter-wait-exit and enter-exit
threads should be configurable.  The target of the stress testing is the
successor mechanism.

Goal: no hangs due to lost unpark operations.
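
As an illustration only, a test with a configurable mix of enter-wait-exit and enter-exit threads could be sketched as follows. `UnparkHangStress` and its parameters are hypothetical names. `PlatformEvent::unpark()` is internal to the VM, so the sketch exercises it only indirectly through `wait()`/`notify()` and treats a thread that fails to finish before a timeout as a suspected lost wakeup. At least one waiter thread is assumed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hang detector: enter-exit threads post items and notify,
// enter-wait-exit threads block in an untimed wait() until an item arrives.
final class UnparkHangStress {
    private final Object lock = new Object();
    private long pending;   // items posted but not yet taken; guarded by lock
    private long taken;     // items taken by waiters; guarded by lock
    private boolean done;   // set once all posters have finished; guarded by lock

    private void enterExit(int iterations) {
        for (int i = 0; i < iterations; i++) {
            synchronized (lock) {
                pending++;
                lock.notify();
            }
        }
    }

    private void enterWaitExit() {
        try {
            while (true) {
                synchronized (lock) {
                    while (pending == 0 && !done) {
                        lock.wait();   // untimed, so a lost wakeup shows up as a hang
                    }
                    if (pending == 0) {
                        return;        // done and fully drained
                    }
                    pending--;
                    taken++;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static long remaining(long deadline) {
        // join(0) would wait forever, so never return less than 1 ms.
        return Math.max(1, deadline - System.currentTimeMillis());
    }

    /** Returns true if every thread finished in time and every item was taken. */
    static boolean run(int waiters, int enterExitThreads, int iterations, long timeoutMillis) {
        UnparkHangStress s = new UnparkHangStress();
        List<Thread> posters = new ArrayList<>();
        List<Thread> all = new ArrayList<>();
        for (int i = 0; i < waiters; i++) {
            all.add(new Thread(s::enterWaitExit));
        }
        for (int i = 0; i < enterExitThreads; i++) {
            Thread t = new Thread(() -> s.enterExit(iterations));
            posters.add(t);
            all.add(t);
        }
        for (Thread t : all) {
            t.setDaemon(true);   // a hung thread must not keep the JVM alive
            t.start();
        }
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try {
            for (Thread t : posters) {
                t.join(remaining(deadline));
            }
            synchronized (s.lock) {
                s.done = true;
                s.lock.notifyAll();
            }
            for (Thread t : all) {
                t.join(remaining(deadline));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        for (Thread t : all) {
            if (t.isAlive()) {
                return false;    // suspected hang
            }
        }
        synchronized (s.lock) {
            return s.taken == (long) enterExitThreads * iterations;
        }
    }
}
```

A call such as `UnparkHangStress.run(16, 4, 100_000, 60_000)` weights the mix toward waiters; swapping the first two arguments weights it toward enter-exit threads.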

### Fast Java monitor enter operations sub-task stress tests

Stress tests should focus on correctness of enter-exit operations with a
scalable number of parallel threads. The target of the stress testing is
Java monitor ownership.

Goal: No ownership conflicts where more than one thread thinks it owns
the Java monitor.
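
One way to detect such a conflict, shown here as a minimal sketch with hypothetical names, is to count the threads currently inside the synchronized region and record a conflict whenever that count is not exactly one. The thread and iteration counts are parameters so the test can be scaled.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical ownership check: at most one thread may be inside the
// synchronized block at any time.
final class MonitorOwnershipStress {
    /** Returns the number of observed ownership conflicts (expected: 0). */
    static int run(int threads, int iterations) {
        final Object lock = new Object();
        final AtomicInteger inside = new AtomicInteger();
        final AtomicInteger conflicts = new AtomicInteger();
        final CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    start.await();   // release all threads together to raise contention
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < iterations; i++) {
                    synchronized (lock) {
                        if (inside.incrementAndGet() != 1) {
                            conflicts.incrementAndGet();
                        }
                        inside.decrementAndGet();
                    }
                }
            });
            workers[t].start();
        }
        start.countDown();
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return conflicts.get();
    }

    public static void main(String[] args) {
        int threads = args.length > 0 ? Integer.parseInt(args[0]) : 8;
        System.out.println("conflicts = " + run(threads, 100_000));
    }
}
```

The counter is an `AtomicInteger` rather than a plain field so that the check itself stays reliable even if mutual exclusion were broken.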

### Fast Java monitor exit operations sub-task stress tests

These operations should be covered by the stress tests for the "speed up
`PlatformEvent::unpark()`" and "fast Java monitor enter operations"
sub-tasks.

### Fast Java monitor `notify`/`notifyAll` operations sub-task stress tests

Stress tests should focus on correctness of enter-wait-exit operations
with a scalable number of parallel threads.  The target of the stress
testing is Java monitor ownership after `wait()` completes and the Java
monitor is re-entered.

Goal: No ownership conflicts where more than one thread thinks it owns
the Java monitor.
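
The re-entry check could be sketched as below, again with hypothetical names. Each thread verifies exclusive ownership once after entering the monitor and once more after `wait()` returns. The sketch uses a timed `wait(1)` so that it terminates even when no further `notifyAll()` arrives; that is a choice made for this example, not something the JEP prescribes.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical re-entry check: ownership is verified both on enter and after
// wait() has re-acquired the monitor.
final class WaitReentryStress {
    private static void check(Object lock, AtomicInteger inside, AtomicInteger conflicts) {
        int now = inside.incrementAndGet();
        if (now != 1 || !Thread.holdsLock(lock)) {
            conflicts.incrementAndGet();
        }
        inside.decrementAndGet();
    }

    /** Returns the number of observed ownership conflicts (expected: 0). */
    static int run(int threads, int iterations) {
        final Object lock = new Object();
        final AtomicInteger inside = new AtomicInteger();
        final AtomicInteger conflicts = new AtomicInteger();
        Runnable body = () -> {
            try {
                for (int i = 0; i < iterations; i++) {
                    synchronized (lock) {
                        check(lock, inside, conflicts);   // ownership after enter
                        lock.notifyAll();                 // wake the other waiters
                        lock.wait(1);                     // releases, then re-enters the monitor
                        check(lock, inside, conflicts);   // ownership after re-entry
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(body);
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return conflicts.get();
    }
}
```

The in-region counter is incremented and decremented around each check rather than held across `wait()`, because `wait()` releases the monitor and other threads are then allowed to enter.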


Comments
TOI no longer needed as confirmed by Sustaining.
06-03-2017

The last bucket for this task was pushed via JDK-8075171 which was integrated in JDK9-B76.
13-08-2015

Moved the Adaptive Spin and SpinPause on SPARC improvements out of this JEP to a future RFE due to continued investigation into better optimizations on the latest hardware.
02-07-2015