A DESCRIPTION OF THE PROBLEM :
We observed that some of our servers were slow to respond even though sufficient hardware resources were available. A thread dump taken during one such event showed multiple threads blocked on MemoryCache.put, with a single thread executing expungeExpiredEntries:
at sun.security.util.MemoryCache.expungeExpiredEntries(java.base/Unknown Source)
at sun.security.util.MemoryCache.put(java.base/Unknown Source)
- locked <0x000000040ce43068> (a sun.security.util.MemoryCache)
at sun.security.ssl.SSLSessionContextImpl.put(java.base/Unknown Source)
When the cache is full of recent (non-expired) entries, expungeExpiredEntries is called on every put. This operation does a full scan of the cache, linear in the size of the cache.
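The behaviour described above can be illustrated with a simplified model. This is a minimal sketch, not the JDK source: the class NaiveBoundedCache, its fields, the explicit `now` parameter, and the entriesScanned counter are all hypothetical names introduced for illustration. It only assumes what is stated above, namely that a put into a full cache triggers a scan of every entry before an entry is evicted.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

// Hypothetical, simplified model of a size- and time-bounded cache.
// It is NOT the actual sun.security.util.MemoryCache implementation.
class NaiveBoundedCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAt;
        Entry(V value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final LinkedHashMap<K, Entry<V>> map = new LinkedHashMap<>();
    private final int maxSize;
    private final long lifetime;   // 0 is treated as "never expires" in this model
    long entriesScanned = 0;       // instrumentation only: counts entries visited by scans

    NaiveBoundedCache(int maxSize, long lifetime) {
        this.maxSize = maxSize;
        this.lifetime = lifetime;
    }

    synchronized void put(K key, V value, long now) {
        map.remove(key);
        map.put(key, new Entry<>(value, now + lifetime));
        if (map.size() > maxSize) {
            expungeExpiredEntries(now);          // full scan on every overflowing put
            if (map.size() > maxSize) {          // nothing expired: evict the eldest entry
                Iterator<K> it = map.keySet().iterator();
                it.next();
                it.remove();
            }
        }
    }

    synchronized int size() { return map.size(); }

    // Visits every entry, so the cost is linear in the cache size.
    private void expungeExpiredEntries(long now) {
        if (lifetime == 0) return;               // no expiry configured: nothing to scan
        Iterator<Entry<V>> it = map.values().iterator();
        while (it.hasNext()) {
            entriesScanned++;
            if (it.next().expiresAt <= now) it.remove();
        }
    }
}
```

In this model, once the cache holds maxSize unexpired entries, every further put visits maxSize + 1 entries while holding the lock, which matches the pattern of one scanning thread and many blocked threads seen in the thread dump.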
That scan can be easily avoided by setting either infinite capacity or infinite session timeout; both make the put operation constant-time again. However, infinite capacity quickly leads to JDK-8210985, and infinite timeout is not in line with security best practices, so a different solution is needed.
Make the put operation constant-time (or amortized constant) in cache size even when both max size and lifetime limits are set.
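One possible way to reach amortized constant time is sketched below. It is an illustration under stated assumptions, not a proposed patch: the class QueueOrderedCache and all of its members are hypothetical. The sketch assumes that every entry has the same lifetime, that entries are kept in insertion order rather than access order, and that the `now` values passed to put never decrease. Under those assumptions, expiration order equals insertion order, so the scan can stop at the first entry that has not expired.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

// Hypothetical sketch: entries are kept in insertion order, which (with a single
// shared lifetime and non-decreasing timestamps) is also expiration order.
class QueueOrderedCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAt;
        Entry(V value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final LinkedHashMap<K, Entry<V>> map = new LinkedHashMap<>(); // insertion order
    private final int maxSize;
    private final long lifetime;
    long entriesScanned = 0;       // instrumentation only

    QueueOrderedCache(int maxSize, long lifetime) {
        this.maxSize = maxSize;
        this.lifetime = lifetime;
    }

    synchronized void put(K key, V value, long now) {
        map.remove(key);           // re-insert at the tail so order matches expiry order
        map.put(key, new Entry<>(value, now + lifetime));
        if (map.size() > maxSize) {
            expungeExpiredEntries(now);
            if (map.size() > maxSize) {          // nothing expired: evict the eldest entry
                Iterator<K> it = map.keySet().iterator();
                it.next();
                it.remove();
            }
        }
    }

    synchronized int size() { return map.size(); }

    // Removes expired entries from the head and stops at the first live one.
    private void expungeExpiredEntries(long now) {
        Iterator<Entry<V>> it = map.values().iterator();
        while (it.hasNext()) {
            entriesScanned++;
            if (it.next().expiresAt > now) break; // all later entries expire even later
            it.remove();
        }
    }
}
```

Each entry visited by the scan is either removed, which can be charged to the put that inserted it, or ends the scan, which happens at most once per put. The trade-off is that eviction under this sketch is by insertion age and not by least recent use.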
A JMH benchmark of MemoryCache.put, run on the current git master (JDK 17), gives the following results on my machine:
Benchmark       (size)  (timeout)  Mode  Cnt     Score    Error  Units
CacheBench.put   20480      86400  avgt   25    83.653 ±  6.269  us/op
CacheBench.put   20480          0  avgt   25     0.107 ±  0.001  us/op
CacheBench.put  204800      86400  avgt   25  2057.781 ± 35.942  us/op
CacheBench.put  204800          0  avgt   25     0.108 ±  0.001  us/op
Session cache performance problems were also reported in JDK-8202086 and JDK-8253116, but neither of them points to this particular issue.