JDK-8259886 : Improve SSL session cache performance and scalability
  • Type: Enhancement
  • Component: security-libs
  • Sub-Component: javax.net.ssl
  • Affected Version: 11
  • Priority: P4
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2021-01-14
  • Updated: 2021-08-13
  • Resolved: 2021-03-07
JDK 11          JDK 8         Other
11.0.12 Fixed   8u291 Fixed   openjdk8u302 Fixed
Description
A DESCRIPTION OF THE PROBLEM :
Background: 
We observed that some of our servers were slow to respond even though sufficient hardware resources were available. A thread dump taken during one such event showed multiple threads blocked on MemoryCache.put while a single thread was executing expungeExpiredEntries:
   java.lang.Thread.State: RUNNABLE
	at sun.security.util.MemoryCache.expungeExpiredEntries(java.base@11.0.8/Unknown Source)
	at sun.security.util.MemoryCache.put(java.base@11.0.8/Unknown Source)
	- locked <0x000000040ce43068> (a sun.security.util.MemoryCache)
	at sun.security.ssl.SSLSessionContextImpl.put(java.base@11.0.8/Unknown Source) 

Analysis:
When the cache is full of recent (non-expired) entries, expungeExpiredEntries is called on every put. This operation does a full scan of the cache, linear in the size of the cache.
That scan can be easily avoided by setting either infinite capacity or infinite session timeout; both make the put operation constant-time again. However, infinite capacity quickly leads to JDK-8210985, and infinite timeout is not in line with security best practices, so a different solution is needed.
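
To make the cost concrete, here is a simplified model of the behaviour described above. It is only a sketch: ScanningCache, its fields, the explicit "now" parameter and the entriesVisited counter are hypothetical names for illustration and are not the actual sun.security.util.MemoryCache code. It assumes a full cache triggers a scan of every entry on each put, and that a lifetime of 0 means entries never expire.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

// Hypothetical model of a size- and lifetime-bounded cache; NOT the JDK's MemoryCache.
class ScanningCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAt;
        Entry(V value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final LinkedHashMap<K, Entry<V>> map = new LinkedHashMap<>();
    private final int maxSize;          // 0 = unbounded
    private final long lifetimeMillis;  // 0 = entries never expire
    long entriesVisited;                // instrumentation: entries touched by scans

    ScanningCache(int maxSize, long lifetimeMillis) {
        this.maxSize = maxSize;
        this.lifetimeMillis = lifetimeMillis;
    }

    void put(K key, V value, long now) {
        long expiresAt = lifetimeMillis == 0 ? Long.MAX_VALUE : now + lifetimeMillis;
        map.put(key, new Entry<>(value, expiresAt));
        if (maxSize > 0 && map.size() > maxSize) {
            expungeExpiredEntries(now);           // full scan, linear in cache size
            if (map.size() > maxSize) {           // nothing expired: evict the eldest entry
                Iterator<K> it = map.keySet().iterator();
                it.next();
                it.remove();
            }
        }
    }

    private void expungeExpiredEntries(long now) {
        if (lifetimeMillis == 0) {
            return;                               // nothing can expire, so no scan is needed
        }
        Iterator<Entry<V>> it = map.values().iterator();
        while (it.hasNext()) {
            entriesVisited++;
            if (it.next().expiresAt <= now) {
                it.remove();
            }
        }
    }

    int size() { return map.size(); }

    public static void main(String[] args) {
        ScanningCache<Integer, String> cache = new ScanningCache<>(1000, 86_400_000L);
        for (int i = 0; i < 2000; i++) {
            cache.put(i, "session", 0L);          // all entries are recent, none expire
        }
        System.out.println(cache.entriesVisited); // prints 1001000
    }
}
```

In this model the last 1000 puts each walk all 1001 entries without finding anything to remove. With a lifetime of 0 or a maxSize of 0 the scan is never reached, which matches the two workarounds mentioned above.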

Enhancement request:
Make the put operation constant-time (or amortized constant) in cache size even when both max size and lifetime limits are set.
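
One possible way to get amortized constant-time puts is sketched below. It is an illustration only and not necessarily what the eventual changeset does. EarlyStopCache and all of its members are hypothetical names. The sketch assumes that every entry has the same lifetime and that the caller-supplied "now" never goes backwards, so insertion order equals expiration order and the scan can stop at the first entry that has not expired yet.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

// Hypothetical sketch of an amortized O(1) put; NOT the JDK's MemoryCache.
class EarlyStopCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAt;
        Entry(V value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    // Insertion-ordered; with a uniform lifetime this is also expiration order.
    private final LinkedHashMap<K, Entry<V>> map = new LinkedHashMap<>();
    private final int maxSize;          // 0 = unbounded
    private final long lifetimeMillis;  // 0 = entries never expire
    long entriesVisited;                // instrumentation: entries touched by scans

    EarlyStopCache(int maxSize, long lifetimeMillis) {
        this.maxSize = maxSize;
        this.lifetimeMillis = lifetimeMillis;
    }

    void put(K key, V value, long now) {
        long expiresAt = lifetimeMillis == 0 ? Long.MAX_VALUE : now + lifetimeMillis;
        map.remove(key);                // re-insert at the tail so the order stays sorted by expiry
        map.put(key, new Entry<>(value, expiresAt));
        if (maxSize > 0 && map.size() > maxSize) {
            expungeExpiredEntries(now);
            if (map.size() > maxSize) { // nothing expired: evict the eldest entry
                Iterator<K> it = map.keySet().iterator();
                it.next();
                it.remove();
            }
        }
    }

    V get(K key, long now) {
        Entry<V> e = map.get(key);
        if (e == null) {
            return null;
        }
        if (e.expiresAt <= now) {       // lazily drop an expired entry on lookup
            map.remove(key);
            return null;
        }
        return e.value;
    }

    private void expungeExpiredEntries(long now) {
        if (lifetimeMillis == 0) {
            return;
        }
        Iterator<Entry<V>> it = map.values().iterator();
        while (it.hasNext()) {
            entriesVisited++;
            if (it.next().expiresAt > now) {
                break;                  // every later entry expires even later
            }
            it.remove();
        }
    }

    int size() { return map.size(); }

    public static void main(String[] args) {
        EarlyStopCache<Integer, String> cache = new EarlyStopCache<>(1000, 86_400_000L);
        for (int i = 0; i < 2000; i++) {
            cache.put(i, "session", 0L);
        }
        System.out.println(cache.entriesVisited); // prints 1000
    }
}
```

Each entry is removed at most once and each put inspects at most one surviving entry, so the total scan work is bounded by the number of puts. The trade-off in this sketch is that eviction is strictly oldest-first; a cache that reorders entries on access would break the ordering assumption.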

Data:
JMH benchmark of MemoryCache.put run on current git master (jdk 17) gives the following result on my machine:

Benchmark       (size)  (timeout)  Mode  Cnt     Score    Error  Units
CacheBench.put   20480      86400  avgt   25    83.653 ±  6.269  us/op
CacheBench.put   20480          0  avgt   25     0.107 ±  0.001  us/op
CacheBench.put  204800      86400  avgt   25  2057.781 ± 35.942  us/op
CacheBench.put  204800          0  avgt   25     0.108 ±  0.001  us/op

Other reports:
Session cache performance problems were also reported in JDK-8202086 and JDK-8253116, but neither of them points to this particular issue.



Comments
Fix Request [8u] Backport this patch to improve SSL session cache performance and scalability. Tested with tier1. No regression in tests. Review thread: https://mail.openjdk.java.net/pipermail/jdk8u-dev/2021-April/013628.html
12-04-2021

From the submitter: Yes I know, I contributed the fix :) and yes, it's better now.
23-03-2021

Requested the submitter to verify the fix with the latest version of JDK at https://jdk.java.net/17/
22-03-2021

Fix Request [11u] On behalf of Daniel Jeliński <djelinski1@gmail.com>. Webrev: https://djelinski.github.io/8259886-11u/webrev2/index.html Clean except for changes to make/test/BuildMicrobenchmark.gmk, which does not exist in jdk11. Review thread: https://mail.openjdk.java.net/pipermail/jdk-updates-dev/2021-March/005264.html. Approved by phh https://mail.openjdk.java.net/pipermail/jdk-updates-dev/2021-March/005341.html.
17-03-2021

Changeset: 18fc3505 Author: djelinski <30433125+djelinski@users.noreply.github.com> Committer: Xue-Lei Andrew Fan <xuelei@openjdk.org> Date: 2021-03-07 01:13:24 +0000 URL: https://git.openjdk.java.net/jdk/commit/18fc3505
07-03-2021

Stateless session resumption (JDK-8211018), which was put into jdk13, should address the caching issues. I believe it's on by default in jdk14. The JMH numbers provided for jdk17 should be using stateless unless otherwise configured. Given that the bug does not mention stateless at all, I assume it was not known to the user, so the provided numbers are likely not relevant to jdk 11. I'm not saying MemoryCache is performing optimally, but stateless may be a better solution moving forward.
20-01-2021

Moved to JDK for further evaluations.
18-01-2021