JDK-8133070 : Hot lock on BulkCipher.isAvailable
  • Type: Enhancement
  • Component: security-libs
  • Sub-Component: javax.net.ssl
  • Affected Version: 8u60,9
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2015-08-05
  • Updated: 2019-02-18
  • Resolved: 2015-12-23
  • Fixed in JDK 7: 7u211
  • Fixed in JDK 8: 8u112
  • Fixed in JDK 9: 9 b100
  • Fixed in Other: openjdk7u
Description
While running the EEJS benchmark, I noticed that the client side throws an IOException when creating a socket.

Looking at the jstack output for worker threads on the server side, I noticed that many threads are blocked in sun.security.ssl.CipherSuite$BulkCipher.isAvailable(), and some other threads are blocked in sun.security.ssl.CipherSuite$BulkCipher.clearAvailableCache(). The jstack output is at the bottom of this description.

For each cipher suite, the method that checks the availability of the bulk cipher is synchronized, which causes the threads to block.

In the current implementation, getting the default cipher suite list iterates over the cipher suite list and checks the availability of each cipher suite. Inside sun.security.ssl.CipherSuite.java, there is an availableCache, which maps each bulk cipher to its availability.

However, this availableCache is cleared each time the default cipher suite list is requested. As a result, the cache is never actually used when checking the availability of each bulk cipher, and the synchronized CipherSuite$BulkCipher.isAvailable() method does the full check every time, which hurts performance a lot.

I deleted the clearAvailableCache() call inside getDefaultCipherSuiteList() and ran the benchmark again: there were no errors and no more blocking. The response time went down from 1.389s to 0.979s for 1000 users. With this change, I also ran the benchmark with 1300 users, and the response time was 1.283s.
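
The effect described above can be reproduced in miniature. The following is a minimal sketch, not the JDK's actual code: the class name AvailabilityCacheSketch, the cipher names, and the probe counter are all hypothetical stand-ins. It only illustrates how clearing a cache before every lookup pass forces the expensive check to run again under the class-level lock.

```java
import java.util.HashMap;
import java.util.Map;

public class AvailabilityCacheSketch {
    // Hypothetical stand-in for the availability cache; not the JDK's actual code.
    private static final Map<String, Boolean> cache = new HashMap<>();
    private static int expensiveProbes = 0;

    // Static synchronized methods lock on the Class object, like the
    // java.lang.Class monitor that the blocked threads wait for in the jstack.
    static synchronized boolean isAvailable(String cipher) {
        Boolean cached = cache.get(cipher);
        if (cached == null) {
            expensiveProbes++;        // stands in for the costly crypto lookup
            cached = Boolean.TRUE;    // assumption: every cipher is available
            cache.put(cipher, cached);
        }
        return cached;
    }

    static synchronized void clearAvailableCache() {
        cache.clear();
    }

    static synchronized int probes() {
        return expensiveProbes;
    }

    static synchronized void reset() {
        cache.clear();
        expensiveProbes = 0;
    }

    // Mimics building the default cipher suite list once per new socket.
    static int countProbes(String[] ciphers, int sockets, boolean clearEachTime) {
        reset();
        for (int s = 0; s < sockets; s++) {
            if (clearEachTime) {
                clearAvailableCache();
            }
            for (String c : ciphers) {
                isAvailable(c);
            }
        }
        return probes();
    }

    public static void main(String[] args) {
        String[] ciphers = {"AES_128_CBC", "AES_256_CBC", "AES_128_GCM"};
        System.out.println("cache cleared each time: " + countProbes(ciphers, 100, true));
        System.out.println("cache kept:              " + countProbes(ciphers, 100, false));
    }
}
```

With three ciphers and 100 simulated sockets, the sketch performs 300 expensive probes when the cache is cleared each time and 3 when it is kept. In a multi-threaded server, each of those probes would also hold the shared lock for longer, which is consistent with the blocked threads in the jstack output.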

A part of the jstack is shown like this:
"[ACTIVE] ExecuteThread: '126' for queue: 'weblogic.kernel.Default (self-tuning)'" #165 daemon prio=5 os_prio=64 tid=0x0000000108654800 nid=0xbe waiting for monitor entry [0xffffffff07efd000]
    java.lang.Thread.State: BLOCKED (on object monitor)
         at sun.security.ssl.CipherSuite$BulkCipher.isAvailable(CipherSuite.java:542)
         - waiting to lock <0x00000003c43a9320> (a java.lang.Class for sun.security.ssl.CipherSuite$BulkCipher)
         at sun.security.ssl.CipherSuite$BulkCipher.isAvailable(CipherSuite.java:527)
         at sun.security.ssl.CipherSuite.isAvailable(CipherSuite.java:194)
         at sun.security.ssl.SSLContextImpl.getApplicableCipherSuiteList(SSLContextImpl.java:346)
         at sun.security.ssl.SSLContextImpl.getDefaultCipherSuiteList(SSLContextImpl.java:304)
         - locked <0x00000005c481bf40> (a sun.security.ssl.SSLContextImpl$TLSContext)
         at sun.security.ssl.SSLSocketImpl.init(SSLSocketImpl.java:614)
         at sun.security.ssl.SSLSocketImpl.<init>(SSLSocketImpl.java:555)
         at sun.security.ssl.SSLSocketFactoryImpl.createSocket(SSLSocketFactoryImpl.java:110)
         at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:364)

"[ACTIVE] ExecuteThread: '125' for queue: 'weblogic.kernel.Default (self-tuning)'" #164 daemon prio=5 os_prio=64 tid=0x0000000108655800 nid=0xbd waiting for monitor entry [0xffffffff080fe000]
    java.lang.Thread.State: BLOCKED (on object monitor)
         at sun.security.ssl.CipherSuite$BulkCipher.clearAvailableCache(CipherSuite.java:537)
         - waiting to lock <0x00000003c43a9320> (a java.lang.Class for sun.security.ssl.CipherSuite$BulkCipher)
         at sun.security.ssl.SSLContextImpl.clearAvailableCache(SSLContextImpl.java:386)
         at sun.security.ssl.SSLContextImpl.getDefaultCipherSuiteList(SSLContextImpl.java:293)
         - locked <0x00000005c44d3328> (a sun.security.ssl.SSLContextImpl$TLSContext)
         at sun.security.ssl.SSLSocketImpl.init(SSLSocketImpl.java:614)
         at sun.security.ssl.SSLSocketImpl.<init>(SSLSocketImpl.java:555)
         at sun.security.ssl.SSLSocketFactoryImpl.createSocket(SSLSocketFactoryImpl.java:110)
         at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:364)
Comments
Code review: http://mail.openjdk.java.net/pipermail/security-dev/2015-December/013121.html

Support for EC cipher suites was added to Java in (Open)JDK 6. However, there was no default EC provider in JDK 6 at that time. In order to support third-party EC algorithm JCE providers dynamically, the SunJSSE provider was hard-coded to check cipher suite availability dynamically for EC algorithms. Here is an example update in CipherSuite.java for the checking:

- static final boolean DYNAMIC_AVAILABILITY = false;
+ static final boolean DYNAMIC_AVAILABILITY = true;

The dynamic checking impacts performance significantly:
1. The availability check is expensive, as it involves crypto operations.
2. A cache is used to maintain the availability of bulk ciphers in order to mitigate the performance impact of #1. However, access and updates to the cache need to be synchronized.
3. In order to support dynamic checking, the cache may be cleared if a particular cipher is not found or a new SSLContext is generated, which brings the performance impact of #1 back again.

Later, AEAD algorithms were defined in Java, and the same mechanism was used to support AEAD ciphers.

Now, EC and GCM algorithms are supported in JDK crypto providers, so the hard-coded checking can be improved.

This fix:
1. Removes the dynamic checking of cipher suite availability.
2. Removes the cipher suite availability cache accordingly (so the synchronization is no longer needed).
3. Makes other updates that are affected by the availability checking.

The performance improvement was tested with the following simple case. Run the following code 1000, 2000, and 3000 times in single-thread mode and measure the milliseconds for each:

---------
String[] cipherSuites = {"TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256"};

for (int i = 0; i < loops; i++) {    // loops: 1000, 2000, 3000
    SSLEngine engine = SSLContext.getDefault().createSSLEngine();

    engine.getEnabledCipherSuites();
    engine.getSupportedCipherSuites();
}
---------

The milliseconds for each test case (less is better) look like:

  loops  |   old   |  fixed
---------+---------+----------
  1000   |  2736   |   735
---------+---------+----------
  2000   |  3718   |   746
---------+---------+----------
  3000   |  4750   |   765
---------+---------+----------

This fix improves the performance. The existing regression tests pass. No new regression test is planned, as this is a performance enhancement fix.
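
For readers who want to repeat the measurement, here is a self-contained sketch of the loop from this comment. The class name CipherSuiteListBench and the use of System.nanoTime() for timing are assumptions; the comment does not say how the original numbers were measured, and no JIT warm-up is done here, so results will differ from the table above.

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

public class CipherSuiteListBench {
    // Repeats the engine creation and cipher suite queries from the comment
    // and returns the elapsed wall-clock time in milliseconds.
    static long timeLoops(int loops) {
        long start = System.nanoTime();
        try {
            for (int i = 0; i < loops; i++) {
                SSLEngine engine = SSLContext.getDefault().createSSLEngine();
                engine.getEnabledCipherSuites();
                engine.getSupportedCipherSuites();
            }
        } catch (NoSuchAlgorithmException e) {
            // Thrown by SSLContext.getDefault() if no default context is available.
            throw new IllegalStateException("no default SSLContext", e);
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        for (int loops : new int[] {1000, 2000, 3000}) {
            System.out.println(loops + " loops: " + timeLoops(loops) + " ms");
        }
    }
}
```

Running it on a JDK build with and without the fix should show whether the per-engine cost stays roughly flat, as in the "fixed" column, or grows with the loop count.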
23-12-2015

Attached the patch for this change.
05-08-2015