JDK-8031320 : Use Intel RTM instructions for locks
  • Type: New Feature
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 8u20
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: x86
  • Fix Versions: 8u20
  • Submitted: 2014-01-07
  • Updated: 2016-03-07
  • Resolved: 2014-03-25
Description
The Intel architecture codenamed Haswell supports the RTM instructions xbegin, xabort, xend and xtest as part of Intel Transactional Synchronization Extensions (TSX). The xbegin and xend instructions enclose a set of instructions to be executed as a transaction. If no conflict is found during execution of the transaction, the memory and register modifications are committed together at xend. The xabort instruction can be used to abort a transaction explicitly, and xtest to check whether execution is currently inside a transaction.

RTM is useful for highly contended locks with low conflict in the critical region. Such locks do not scale well otherwise, but with RTM they show good scaling. RTM therefore allows applications to use coarse-grained locking. For lightly contended locks that are used by different threads, RTM can also reduce cache line ping-pong and thereby improve performance.
 
Implementation:
---------------------

Generate RTM locking code for all inflated locks when the "UseRTM" option is on, with the normal locking mechanism as the fallback handler. On abort or lock busy, the lock will be retried under RTM a fixed number of times, as specified by the "RTMRetryCount" option. Locks which abort too often can be auto-tuned or manually tuned.
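The retry-then-fallback control flow described above can be modelled in plain Java. This is only a single-threaded sketch of the control flow, not the generated code: the class RtmRetrySketch, its fields, and the transactionAborts callback are hypothetical names, and treating "RTMRetryCount" as the number of retries after one initial attempt is an assumption.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.BooleanSupplier;

// Hypothetical model of "try under RTM, retry, then fall back to the normal lock".
// Real RTM locking is emitted as machine code by the JIT; nothing here is HotSpot API.
final class RtmRetrySketch {
    private final ReentrantLock fallbackLock = new ReentrantLock(); // stands in for normal locking
    private final int retryCount;  // plays the role of "RTMRetryCount" (assumed: retries after the first attempt)
    int transactionalRuns = 0;     // critical sections that "committed" transactionally
    int fallbackRuns = 0;          // critical sections that ran under the fallback lock

    RtmRetrySketch(int retryCount) {
        this.retryCount = retryCount;
    }

    // transactionAborts simulates one xbegin..xend attempt: true means the attempt
    // aborted (or found the lock busy) and left no visible effects.
    void run(BooleanSupplier transactionAborts, Runnable criticalSection) {
        for (int attempt = 0; attempt <= retryCount; attempt++) {
            if (!transactionAborts.getAsBoolean()) {
                criticalSection.run();   // effects become visible as if committed at xend
                transactionalRuns++;
                return;
            }
        }
        // All transactional attempts failed: take the normal lock instead.
        fallbackLock.lock();
        try {
            criticalSection.run();
            fallbackRuns++;
        } finally {
            fallbackLock.unlock();
        }
    }
}
```

With a retry count of 2, a critical section whose simulated transaction always aborts is attempted three times before the fallback lock is taken.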

Auto-tuning can be enabled by an option like "UseRTMDeopt"; it needs an abort ratio calculation for each lock. The abort ratio will be calculated after "RTMAbortThreshold" aborts are encountered.

With "UseRTMDeopt", if the abort ratio reaches "RTMAbortRatio", the method containing the lock will be deoptimized and recompiled with all locks as normal locks. If the abort ratio continues to remain low after "RTMLockingThreshold" locks are attempted, then the method will be deoptimized and recompiled with all locks as RTM locks without abort ratio calculation code. The abort ratio calculation can be delayed by specifying -XX:RTMLockingCalculationDelay in milliseconds.
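The auto-tuning decision described above can be sketched as a small pure function. This is a hypothetical illustration, not the HotSpot implementation: the class, enum and parameter names are invented here, and it assumes that the abort ratio is the percentage of aborts relative to lock attempts and that "reaches" means greater than or equal.

```java
// Hypothetical sketch of the per-lock auto-tuning decision; not HotSpot code.
final class RtmAbortRatioSketch {
    enum Decision {
        KEEP_PROFILING,                  // keep running with abort ratio calculation code
        RECOMPILE_WITH_NORMAL_LOCKS,     // deoptimize; recompile with all locks as normal locks
        RECOMPILE_WITH_RTM_NO_PROFILING  // deoptimize; recompile with RTM locks, no ratio code
    }

    static Decision decide(long lockAttempts, long aborts,
                           long abortThreshold,   // role of "RTMAbortThreshold"
                           int abortRatioPercent, // role of "RTMAbortRatio" (assumed to be a percentage)
                           long lockingThreshold) // role of "RTMLockingThreshold"
    {
        // The ratio is only evaluated once enough aborts have been seen.
        // Integer arithmetic: aborts / lockAttempts >= abortRatioPercent / 100.
        if (aborts >= abortThreshold && aborts * 100 >= lockAttempts * abortRatioPercent) {
            return Decision.RECOMPILE_WITH_NORMAL_LOCKS;
        }
        // The ratio stayed low for enough lock attempts: drop the profiling code.
        if (lockAttempts >= lockingThreshold) {
            return Decision.RECOMPILE_WITH_RTM_NO_PROFILING;
        }
        return Decision.KEEP_PROFILING;
    }
}
```

For example, with illustrative settings of 1000 for the abort threshold, 50 for the ratio and 10000 for the locking threshold, decide(2000, 1000, 1000, 50, 10000) yields RECOMPILE_WITH_NORMAL_LOCKS, while decide(10000, 10, 1000, 50, 10000) yields RECOMPILE_WITH_RTM_NO_PROFILING.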

For manual tuning, the abort statistics for each lock need to be provided to the user via a JVM option like "PrintPreciseRTMLockingStatistics". Based on the abort statistics, users can generate a .hotspot_compiler file and specify the methods for which RTM locking should be disabled, using the "DoNotElide" option for each such method.

Support for stack locks using RTM locking can be provided along similar lines with an option like "UseRTMForThinLocks".

Comments
webrev: http://cr.openjdk.java.net/~kvn/8031320/webrev.06/
06-03-2014

We need to consolidate Fast_Lock/Fast_Unlock in one place before applying RTM changes to avoid duplication.
06-02-2014

Why "Biased locking is not supported with RTM locking"? "The thought was that the TSX locking is most useful when there is high lock contention and low data contention. On high lock contention the lock is usually inflated and biased locking is not suitable for that case." Note, UseRTM is off by default. It should be used for applications with high lock contention. Also: "At the beginning of development the implementation also had used some bits in the mark word for instance based tsx locking which might have had some interference with biased locking code. The instance based tsx locking was removed in the final changes. Some portions of the current implementation also assumes that both are not on together (specifically the register usage in the fast lock case)."
09-01-2014