JDK-7118809 : rcache deadlock
  • Type: Bug
  • Component: security-libs
  • Sub-Component: org.ietf.jgss:krb5
  • Affected Version: 6u29
  • Priority: P4
  • Status: Closed
  • Resolution: Fixed
  • OS: linux_ubuntu
  • CPU: x86
  • Submitted: 2011-12-07
  • Updated: 2012-04-16
  • Resolved: 2012-04-16
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 7 JDK 8
7u4Fixed 8 b23Fixed
Description
FULL PRODUCT VERSION :
java version "1.6.0_22"
Java(TM) SE Runtime Environment (build 1.6.0_22-b04)
Java HotSpot(TM) Client VM (build 17.1-b03, mixed mode, sharing)


ADDITIONAL OS VERSION INFORMATION :
Linux vivek-laptop 2.6.32-26-generic #48-Ubuntu SMP Wed Nov 24 09:00:03 UTC 2010 i686 GNU/Linux


A DESCRIPTION OF THE PROBLEM :
The program as below:
--------------------------------------------------
import sun.security.krb5.internal.rcache.*;
import sun.security.krb5.internal.*;
import java.util.*;

public class KTest2 {
  public static void main(String [] a) throws Exception{
    final CacheTable ct = new CacheTable();
    final long time = System.currentTimeMillis();
    ct.put("TheOnlyOne", new AuthTime( time - Krb5.DEFAULT_ALLOWABLE_CLOCKSKEW * 1000L, 0), time);
    final ReplayCache rc = (ReplayCache) ct.get("TheOnlyOne");
    new Thread() {
      public void run() {
        rc.put(new AuthTime( time - Krb5.DEFAULT_ALLOWABLE_CLOCKSKEW * 1000L, 0), time + 300*1000);
      }
    }.start();
    ct.put("TheOnlyOne", new AuthTime( time - Krb5.DEFAULT_ALLOWABLE_CLOCKSKEW * 1000L, 0), time);
  }
}
--------------------------------------------------
When compiles as in: javac KTest2.java
and executed as in: java KTest2
can deadlock the hosting JVM as is reproduced by the stack-trace dump, below:
-------------------------------------------------
Found one Java-level deadlock:
=============================
"Thread-0":
  waiting to lock monitor 0x0921d720 (object 0xa5621b80, a sun.security.krb5.internal.rcache.CacheTable),
  which is held by "main"
"main":
  waiting to lock monitor 0x0921caa0 (object 0xa5622fb8, a sun.security.krb5.internal.rcache.ReplayCache),
  which is held by "Thread-0"

Java stack information for the threads listed above:
===================================================
"Thread-0":
	at java.util.Hashtable.remove(Hashtable.java:435)
	- waiting to lock <0xa5621b80> (a sun.security.krb5.internal.rcache.CacheTable)
	at sun.security.krb5.internal.rcache.ReplayCache.put(ReplayCache.java:123)
	- locked <0xa5622fb8> (a sun.security.krb5.internal.rcache.ReplayCache)
	at KTest2$1.run(KTest2.java:13)
"main":
	at sun.security.krb5.internal.rcache.ReplayCache.put(ReplayCache.java:62)
	- waiting to lock <0xa5622fb8> (a sun.security.krb5.internal.rcache.ReplayCache)
	at sun.security.krb5.internal.rcache.CacheTable.put(CacheTable.java:59)
	- locked <0xa5621b80> (a sun.security.krb5.internal.rcache.CacheTable)
	at KTest2.main(KTest2.java:16)

Found 1 deadlock.

Heap
 def new generation   total 3456K, used 347K [0xa55f0000, 0xa59b0000, 0xa91f0000)
  eden space 3072K,  11% used [0xa55f0000, 0xa5646dd0, 0xa58f0000)
  from space 384K,   0% used [0xa58f0000, 0xa58f0000, 0xa5950000)
  to   space 384K,   0% used [0xa5950000, 0xa5950000, 0xa59b0000)
 tenured generation   total 7680K, used 0K [0xa91f0000, 0xa9970000, 0xb09f0000)
   the space 7680K,   0% used [0xa91f0000, 0xa91f0000, 0xa91f0200, 0xa9970000)
 compacting perm gen  total 12288K, used 1829K [0xb09f0000, 0xb15f0000, 0xb49f0000)
   the space 12288K,  14% used [0xb09f0000, 0xb0bb9600, 0xb0bb9600, 0xb15f0000)
No shared spaces configured
-------------------------------------------------

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
On my  lap-top that is a pentium-box, this deadlock does not as readily happen,
 as much as on my friends core-duo box.
But I have produced this interleaving using jdb to control the progress
of the two threads involved, to produce the stack-trace as reproduced
above.


EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The program could run to completion if it were not to deadlock.

As a Java-user, I would desire that the JVM *always* select the interleaving that would lead the program to its completion.
ACTUAL -
Found one Java-level deadlock:
=============================
"Thread-0":
  waiting to lock monitor 0x0921d720 (object 0xa5621b80, a sun.security.krb5.internal.rcache.CacheTable),
  which is held by "main"
"main":
  waiting to lock monitor 0x0921caa0 (object 0xa5622fb8, a sun.security.krb5.internal.rcache.ReplayCache),
  which is held by "Thread-0"

Java stack information for the threads listed above:
===================================================
"Thread-0":
	at java.util.Hashtable.remove(Hashtable.java:435)
	- waiting to lock <0xa5621b80> (a sun.security.krb5.internal.rcache.CacheTable)
	at sun.security.krb5.internal.rcache.ReplayCache.put(ReplayCache.java:123)
	- locked <0xa5622fb8> (a sun.security.krb5.internal.rcache.ReplayCache)
	at KTest2$1.run(KTest2.java:13)
"main":
	at sun.security.krb5.internal.rcache.ReplayCache.put(ReplayCache.java:62)
	- waiting to lock <0xa5622fb8> (a sun.security.krb5.internal.rcache.ReplayCache)
	at sun.security.krb5.internal.rcache.CacheTable.put(CacheTable.java:59)
	- locked <0xa5621b80> (a sun.security.krb5.internal.rcache.CacheTable)
	at KTest2.main(KTest2.java:16)

Found 1 deadlock.

Heap
 def new generation   total 3456K, used 347K [0xa55f0000, 0xa59b0000, 0xa91f0000)
  eden space 3072K,  11% used [0xa55f0000, 0xa5646dd0, 0xa58f0000)
  from space 384K,   0% used [0xa58f0000, 0xa58f0000, 0xa5950000)
  to   space 384K,   0% used [0xa5950000, 0xa5950000, 0xa59b0000)
 tenured generation   total 7680K, used 0K [0xa91f0000, 0xa9970000, 0xb09f0000)
   the space 7680K,   0% used [0xa91f0000, 0xa91f0000, 0xa91f0200, 0xa9970000)
 compacting perm gen  total 12288K, used 1829K [0xb09f0000, 0xb15f0000, 0xb49f0000)
   the space 12288K,  14% used [0xb09f0000, 0xb0bb9600, 0xb0bb9600, 0xb15f0000)
No shared spaces configured.


ERROR MESSAGES/STACK TRACES THAT OCCUR :
The screen-dump above reports the details of the deadlock reached !

REPRODUCIBILITY :
This bug can be reproduced occasionally.

---------- BEGIN SOURCE ----------
import sun.security.krb5.internal.rcache.*;
import sun.security.krb5.internal.*;
import java.util.*;

public class KTest2 {
  public static void main(String [] a) throws Exception{
    final CacheTable ct = new CacheTable();
    final long time = System.currentTimeMillis();
    ct.put("TheOnlyOne", new AuthTime( time - Krb5.DEFAULT_ALLOWABLE_CLOCKSKEW * 1000L, 0), time);
    final ReplayCache rc = (ReplayCache) ct.get("TheOnlyOne");
    new Thread() {
      public void run() {
        rc.put(new AuthTime( time - Krb5.DEFAULT_ALLOWABLE_CLOCKSKEW * 1000L, 0), time + 300*1000);
      }
    }.start();
    ct.put("TheOnlyOne", new AuthTime( time - Krb5.DEFAULT_ALLOWABLE_CLOCKSKEW * 1000L, 0), time);
  }
}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Hope for the best, and if not successful, try again !!

Comments
EVALUATION Basically, CacheTable.put() operates on ReplayCache inside it, and ReplayCache.put() operates on CacheTable containing it, and both methods are synchronized and using thread-safe Hashtable. The solution is to move the "table.remove(principal)" line in ReplayCache.put() outside the method into CacheTable.put() which is the only place the method is called.
16-01-2012

EVALUATION http://hg.openjdk.java.net/jdk8/tl/jdk/rev/e8e08d46cc37
16-01-2012