JDK-8197518 : Kerberos krb5 authentication: AuthList's put method leads to performance issue
  • Type: Bug
  • Component: security-libs
  • Sub-Component: org.ietf.jgss
  • Affected Version: 8u74
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2018-02-09
  • Updated: 2020-02-26
  • Resolved: 2018-02-26
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 11 JDK 8 Other
11 b03Fixed 8u172Fixed openjdk7uFixed
Description
For the same pair of client & server principal, invocation of
acceptSecContext() at a high rate leads to serious performance problem.
The main problem is with the clean-up right after adding the authentication
timestamp in the LinkedList<AuthTimeWithHash>. See the AuthList put method
which is invoked from sun.security.krb5.internal.rcache.MemoryCache
synchronized void checkAndStore(KerberosTime currTime, AuthTimeWithHash time)
method.  Removing last entry from a LinkedList one by one in a synchronized
block is not good for performance.

From JFR Hot Methods:
#233 daemon prio=5 os_prio=0 tid=0x0000000000c26000 nid=0x956 runnable
[0x00007f6c75d0f000]
  java.lang.Thread.State: RUNNABLE
at java.util.LinkedList.indexOf(LinkedList.java:604)
at sun.security.krb5.internal.rcache.AuthList.put(AuthList.java:116)
at
sun.security.krb5.internal.rcache.MemoryCache.checkAndStore(MemoryCache.java:7
1)
- locked <0x00000000b09603f0> (a
sun.security.krb5.internal.rcache.MemoryCache)
at sun.security.krb5.KrbApReq.authenticate(KrbApReq.java:323)
at sun.security.krb5.KrbApReq.<init>(KrbApReq.java:149)
at
sun.security.jgss.krb5.InitSecContextToken.<init>(InitSecContextToken.java:108
)

What we see at the server side when the same client principal sends high rate
requests in multiple threads is as below:

#229 daemon prio=5 os_prio=0 tid=0x000000000136b000 nid=0x952 runnable
[0x00007f6c765b3000]
  java.lang.Thread.State: RUNNABLE
       at
sun.security.krb5.internal.rcache.MemoryCache.checkAndStore(MemoryCache.java:7
1)
       - locked <0x00000000b09ab380> (a
sun.security.krb5.internal.rcache.MemoryCache)
       at sun.security.krb5.KrbApReq.authenticate(KrbApReq.java:323)

#231 daemon prio=5 os_prio=0 tid=0x0000000000c24000 nid=0x954 waiting for
monitor entry [0x00007f6c7456b000]
  java.lang.Thread.State: BLOCKED (on object monitor)
       at
sun.security.krb5.internal.rcache.MemoryCache.checkAndStore(MemoryCache.java:5
6)
       - waiting to lock <0x00000000b09ab380> (a
sun.security.krb5.internal.rcache.MemoryCache)
       at sun.security.krb5.KrbApReq.authenticate(KrbApReq.java:323)


Instead of LinkedList, a better suited Collection should be used, therefore a
specific implementation for this purpose might be needed.

If the service is not kerberised, normally the server can respond pretty fast
around 1 ms. 
Comments
I studied list.subList().clear() on a LinkedList and it still has to go through the node one by one. I tried JMH and it's 8% slower than my proposed removeLast-until-this approach.
22-02-2018

your current webrev looks good. I think it'll address the issue at hand. Would the subList method be a better approach for deleting the tail end of your List ? https://docs.oracle.com/javase/8/docs/api/java/util/List.html#subList-int-int- We know what index to start deleting from since you could use a simple int counter for every iteration of the "while (it.hasNext()) {" loop.
21-02-2018

[~coffeys] In fact, I've prepared a webrev that almost does exactly you described at http://cr.openjdk.java.net/~weijun/8197518/webrev.00/. Then I learnt about ConcurrentSkipListSet and wondered if it can be better.
16-02-2018

problem code : // let us cleanup while we are here long timeLimit = currentTime.getSeconds() - lifespan; ListIterator<AuthTimeWithHash> it = entries.listIterator(0); AuthTimeWithHash temp = null; int index = -1; while (it.hasNext()) { // search expired timestamps. temp = it.next(); if (temp.ctime < timeLimit) { index = entries.indexOf(temp); break; } } Stacktrace suggests that the threads are locked up due to LinkedList.indexOf call. I'm wondering if this call is necessary at all. Maybe we could use a simple counter in the while loop is keep record of where we are in the list. Also - not sure if performing a cleanup is necessary for every put call. Perhaps we could implement a time delay (e.g. clean up only if x seconds have elapsed since last put call? )
16-02-2018

I'll see if ConcurrentSkipListSet() can be used.
11-02-2018