JDK-7011862 : java/util/concurrent utilities need to handle StackOverflowError:Class loading hang with clss21201m1
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: hs20,7,8
  • Priority: P2
  • Status: Closed
  • Resolution: Duplicate
  • OS: generic
  • CPU: generic
  • Submitted: 2011-01-12
  • Updated: 2016-12-14
  • Resolved: 2016-12-14
JDK 9: 9 (Resolved)
Description
Stress test
	nsk/stress/jck60/jck60001	

hangs intermittently starting with JDK 7 b123 (HS 20 b04); it does not hang with JDK 7 b122 (HS 20 b03).

The Java stacks may differ between runs. Sometimes other threads are doing class loading too. The hang always seems to involve clss21201m1.

1. 
"Thread-840" prio=3 tid=0x00920800 nid=0x354 waiting on condition [0xb20fd000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0xed5905e8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
        at java.util.concurrent.ConcurrentHashMap$Segment.put(ConcurrentHashMap.java:445)
        at java.util.concurrent.ConcurrentHashMap.putIfAbsent(ConcurrentHashMap.java:925)
        at java.lang.ClassLoader.getClassLoadingLock(ClassLoader.java:464)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:405)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:327)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at javasoft.sqe.tests.lang.dasg196.dasg19603m0.dasg19603m0.run(dasg19603m0.java:310)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:613)
        at nsk.stress.share.StressTest$TestThread.runTest(StressTest.java:735)
        at nsk.stress.share.StressTest$TestThread.run(StressTest.java:768)

"Thread-459" prio=3 tid=0x007b7400 nid=0x1d7 waiting on condition [0xc9f7d000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0xed5905e8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
        at java.util.concurrent.ConcurrentHashMap$Segment.put(ConcurrentHashMap.java:445)
        at java.util.concurrent.ConcurrentHashMap.putIfAbsent(ConcurrentHashMap.java:925)
        at java.lang.ClassLoader.getClassLoadingLock(ClassLoader.java:464)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:405)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:327)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:186)
        at javasoft.sqe.tests.lang.clss212.clss21201m1.clss21201m1.run(clss21201m1.java:1200)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:613)
        at nsk.stress.share.StressTest$TestThread.runTest(StressTest.java:735)
        at nsk.stress.share.StressTest$TestThread.run(StressTest.java:768)

"Thread-174" prio=3 tid=0x005ad800 nid=0xba waiting on condition [0xdbd7c000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0xed5905e8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
        at java.util.concurrent.ConcurrentHashMap$Segment.put(ConcurrentHashMap.java:445)
        at java.util.concurrent.ConcurrentHashMap.putIfAbsent(ConcurrentHashMap.java:925)
        at java.lang.ClassLoader.getClassLoadingLock(ClassLoader.java:464)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:405)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:327)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at javasoft.sqe.tests.lang.binc049.binc04901.binc04901c.<init>(binc04901c.java:33)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:530)
        at java.lang.Class.newInstance0(Class.java:372)
        at java.lang.Class.newInstance(Class.java:325)
        at javasoft.sqe.tests.lang.binc049.binc04901.binc04901.run(binc04901.java:23)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:613)
        at nsk.stress.share.StressTest$TestThread.runTest(StressTest.java:735)
        at nsk.stress.share.StressTest$TestThread.run(StressTest.java:768)



2. 

"Thread-459" prio=3 tid=0x006b2800 nid=0x1d7 waiting on condition [0xc9f7d000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0xed58f930> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
        at java.util.concurrent.ConcurrentHashMap$Segment.put(ConcurrentHashMap.java:445)
        at java.util.concurrent.ConcurrentHashMap.putIfAbsent(ConcurrentHashMap.java:925)
        at java.lang.ClassLoader.getClassLoadingLock(ClassLoader.java:464)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:405)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:327)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:186)
        at javasoft.sqe.tests.lang.clss212.clss21201m1.clss21201m1.run(clss21201m1.java:1200)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:613)
        at nsk.stress.share.StressTest$TestThread.runTest(StressTest.java:735)
        at nsk.stress.share.StressTest$TestThread.run(StressTest.java:768)

3.
"Thread-459" prio=3 tid=0x007af400 nid=0x1d7 waiting on condition [0xc9f60000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0xed58f930> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
        at java.util.concurrent.ConcurrentHashMap$Segment.put(ConcurrentHashMap.java:445)
        at java.util.concurrent.ConcurrentHashMap.putIfAbsent(ConcurrentHashMap.java:925)
        at java.lang.ClassLoader.getClassLoadingLock(ClassLoader.java:464)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:405)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:327)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:796)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:144)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:382)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:75)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:294)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:288)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:287)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        - locked <0xe8029340> (a java.lang.Object)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:327)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
...
        - locked <0xe8452dc0> (a java.lang.Object)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:327)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:186)
        at javasoft.sqe.tests.lang.clss212.clss21201m1.clss21201m1.run(clss21201m1.java:1200)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:613)
        at nsk.stress.share.StressTest$TestThread.runTest(StressTest.java:735)
        at nsk.stress.share.StressTest$TestThread.run(StressTest.java:768)

Comments
Issue resolved by https://bugs.openjdk.java.net/browse/JDK-8046936
18-07-2016

This issue has been resolved by JEP-270: Reserved Stack Areas for Critical Sections https://bugs.openjdk.java.net/browse/JDK-8046936
18-07-2016

Theory: a StackOverflowError is being encountered and swallowed somewhere, so it is not evident. If this occurs here:

        final void lock() {
            if (compareAndSetState(0, 1))
                setExclusiveOwnerThread(Thread.currentThread()); <== overflow

then we have a lock that appears locked (state==1) but no owner is set, and no thread will ever release it.
04-06-2014
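The theory in the 04-06-2014 comment can be modeled with a toy lock. This is only a sketch under stated assumptions: ModelLock, its failAfterCas flag, and looksOrphaned() are hypothetical names invented for illustration, and the StackOverflowError is thrown by hand at the point marked "<== overflow" rather than produced by a real stack exhaustion. It is not the ReentrantLock implementation.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of a nonfair lock (hypothetical; not the real ReentrantLock code). */
class ModelLock {
    private final AtomicInteger state = new AtomicInteger(0);
    private volatile Thread owner;

    /**
     * Tries to take the lock. When failAfterCas is true, an error is thrown
     * after the CAS succeeds but before the owner is recorded, standing in
     * for a StackOverflowError at that point.
     */
    boolean tryLock(boolean failAfterCas) {
        if (state.compareAndSet(0, 1)) {
            if (failAfterCas) {
                throw new StackOverflowError("simulated overflow after CAS");
            }
            owner = Thread.currentThread();
            return true;
        }
        return false;
    }

    /** Only the recorded owner may release the lock. */
    void unlock() {
        if (owner != Thread.currentThread()) {
            throw new IllegalMonitorStateException("caller is not the owner");
        }
        owner = null;
        state.set(0);
    }

    /** True when the lock is marked held but no owner was recorded. */
    boolean looksOrphaned() {
        return state.get() == 1 && owner == null;
    }
}

class OrphanedLockDemo {
    public static void main(String[] args) {
        ModelLock lock = new ModelLock();
        try {
            lock.tryLock(true);
        } catch (StackOverflowError e) {
            System.out.println("caught: " + e.getMessage());
        }
        System.out.println("orphaned = " + lock.looksOrphaned());        // true
        System.out.println("second tryLock = " + lock.tryLock(false));  // false
    }
}
```

After the simulated failure nobody can release the lock: unlock() rejects every caller because no owner was recorded, and every later tryLock() fails because state is still 1.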

Parallel class loading in JDK 8 and JDK 9 should not be impacted by this bug because the implementation of ConcurrentHashMap was completely changed in this changeset:

        changeset: 7302:d6401129327e
        user: dl
        date: Tue Jun 04 21:59:23 2013 +0100
        summary: 8005704: Update ConcurrentHashMap to v8

The new implementation no longer uses java.util.concurrent.locks.ReentrantLock instances, so even if the java.util.concurrent utilities still need better handling of StackOverflowError, this bug no longer impacts usages of ConcurrentHashMap (including parallel class loading).
03-06-2014
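For context on why ConcurrentHashMap sits on the class loading path at all: the stack traces in the description show ClassLoader.getClassLoadingLock calling ConcurrentHashMap.putIfAbsent. The pattern is a per-name lock registry, roughly as sketched below. NameLockRegistry and lockFor are hypothetical names for illustration; this is not the JDK source.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-name lock registry in the style seen in the stack traces. */
class NameLockRegistry {
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();

    /** Returns the single lock object associated with the given class name. */
    Object lockFor(String className) {
        Object candidate = new Object();
        // putIfAbsent returns the previously registered lock, or null if ours won.
        Object existing = locks.putIfAbsent(className, candidate);
        return existing != null ? existing : candidate;
    }
}

class NameLockRegistryDemo {
    public static void main(String[] args) {
        NameLockRegistry registry = new NameLockRegistry();
        Object first = registry.lockFor("cl124");
        Object second = registry.lockFor("cl124");
        Object other = registry.lockFor("cl123");
        System.out.println("same name, same lock: " + (first == second));
        System.out.println("different name, different lock: " + (first != other));
        synchronized (first) {
            System.out.println("loading cl124 under its own lock");
        }
    }
}
```

Every class load goes through the map, so a map whose internal lock is wedged (as in the pre-v8 Segment implementation) stalls all loaders that share it.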

PUBLIC COMMENTS

The failing test involved loading a class with a large class hierarchy. The basic form of the test is:

        try {
            n = new cl124();
        } catch (StackOverflowError e) {
            out.println("res = "+res+" Exception: " + e);
        }

where class cl124 extends cl123 extends cl122 ... extends cl1

So if you run this test normally and observe the output from -XX:+TraceExceptions you will see (depending on where the actual stack overflow was generated) sequences of class loading frames:

Exception <a 'java/lang/StackOverflowError'> (0xb608e578 ) thrown [/tmp/workspace/jdk7-2-build-solaris-i586-product/jdk7/hotspot/src/share/vm/runtime/javaCalls.cpp, line 368] for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in compiled method <{method} '<init>' '()V' in 'java/lang/Object'> at PC 0xca4ec427 for thread 0x80b9400
Thread 0x080b9400 continuing at PC 0xca4ec434 for exception thrown at PC 0xca4ec427
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} '<init>' '()V' in 'java/io/InputStream'> at bci 1 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} '<init>' '(Ljava/io/File;)V' in 'java/io/FileInputStream'> at bci 1 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'getInputStream' '()Ljava/io/InputStream;' in 'sun/misc/URLClassPath$FileLoader$1'> at bci 8 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'cachedInputStream' '()Ljava/io/InputStream;' in 'sun/misc/Resource'> at bci 9 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'getByteBuffer' '()Ljava/nio/ByteBuffer;' in 'sun/misc/Resource'> at bci 1 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'defineClass' '(Ljava/lang/String;Lsun/misc/Resource;)Ljava/lang/Class;' in 'java/net/URLC> at bci 132 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'access$100' '(Ljava/net/URLClassLoader;Ljava/lang/String;Lsun/misc/Resource;)Ljava/lang/C> at bci 3 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'run' '()Ljava/lang/Class;' in 'java/net/URLClassLoader$1'> at bci 43 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'run' '()Ljava/lang/Object;' in 'java/net/URLClassLoader$1'> at bci 1 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578 ) thrown [/tmp/workspace/jdk7-2-build-solaris-i586-product/jdk7/hotspot/src/share/vm/prims/jvm.cpp, line 1115] for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'doPrivileged' '(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlCont> at bci 0 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'findClass' '(Ljava/lang/String;)Ljava/lang/Class;' in 'java/net/URLClassLoader'> at bci 13 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'loadClass' '(Ljava/lang/String;Z)Ljava/lang/Class;' in 'java/lang/ClassLoader'> at bci 70 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'loadClass' '(Ljava/lang/String;Z)Ljava/lang/Class;' in 'java/lang/ClassLoader'> at bci 121 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'loadClass' '(Ljava/lang/String;Z)Ljava/lang/Class;' in 'sun/misc/Launcher$AppClassLoader'> at bci 40 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'loadClass' '(Ljava/lang/String;)Ljava/lang/Class;' in 'java/lang/ClassLoader'> at bci 3 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'defineClass1' '(Ljava/lang/String;[BIILjava/security/ProtectionDomain;Ljava/lang/String;)> at bci 0 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'defineClass' '(Ljava/lang/String;[BIILjava/security/ProtectionDomain;)Ljava/lang/Class;' > at bci 30 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'defineClass' '(Ljava/lang/String;[BIILjava/security/CodeSource;)Ljava/lang/Class;' in 'ja> at bci 12 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'defineClass' '(Ljava/lang/String;Lsun/misc/Resource;)Ljava/lang/Class;' in 'java/net/URLC> at bci 220 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'access$100' '(Ljava/net/URLClassLoader;Ljava/lang/String;Lsun/misc/Resource;)Ljava/lang/C> at bci 3 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'run' '()Ljava/lang/Class;' in 'java/net/URLClassLoader$1'> at bci 43 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578) thrown in interpreter method <{method} 'run' '()Ljava/lang/Object;' in 'java/net/URLClassLoader$1'> at bci 1 for thread 0x080b9400
Exception <a 'java/lang/StackOverflowError'> (0xb608e578 ) thrown [/tmp/workspace/jdk7-2-build-solaris-i586-product/jdk7/hotspot/src/share/vm/prims/jvm.cpp, line 1115] for thread 0x080b9400
...

jvm.cpp line 1115 is JVM_DoPrivileged.

If you run in Xcomp mode, it seems the Java frame information is lost, so all we see, in part, is:

Exception <a 'java/lang/StackOverflowError'> (0xb608e618 ) thrown [/tmp/workspace/jdk7-2-build-solaris-i586-product/jdk7/hotspot/src/share/vm/prims/jvm.cpp, line 1115] for thread 0x080b9800
Exception <a 'java/lang/StackOverflowError'> (0xb608e618 ) thrown [/tmp/workspace/jdk7-2-build-solaris-i586-product/jdk7/hotspot/src/share/vm/prims/jvm.cpp, line 1115] for thread 0x080b9800
Exception <a 'java/lang/StackOverflowError'> (0xb608e618 ) thrown [/tmp/workspace/jdk7-2-build-solaris-i586-product/jdk7/hotspot/src/share/vm/prims/jvm.cpp, line 1115] for thread 0x080b9800
Exception <a 'java/lang/StackOverflowError'> (0xb608e618 ) thrown [/tmp/workspace/jdk7-2-build-solaris-i586-product/jdk7/hotspot/src/share/vm/prims/jvm.cpp, line 1115] for thread 0x080b9800
Exception <a 'java/lang/StackOverflowError'> (0xb608e618 ) thrown [/tmp/workspace/jdk7-2-build-solaris-i586-product/jdk7/hotspot/src/share/vm/prims/jvm.cpp, line 1115] ...

which explains the apparent recursion in doPrivileged.

That leaves the remaining question: where did the StackOverflowError go once it reached the top level of the test? The code above shows a simple println:

        out.println("res = "+res+" Exception: " + e);

and this should not itself cause a secondary StackOverflowError, as we have unwound dozens of frames from the stack. However, this test is actually being run as part of another stress test, and that wrapper seems to be swallowing all output from this particular test. Hence we never see the failure, only the consequence of that failure when the test execution later hangs.

Edited to fix typo: should -> should not
08-02-2011
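The 08-02-2011 comment turns on a StackOverflowError that was caught and reported to a stream nobody read. The sketch below shows the same catch-and-report shape, writing to System.err so the failure stays visible. It makes several assumptions: plain unbounded recursion stands in for the deep cl124 ... cl1 class hierarchy, OverflowProbe and depthAtOverflow are hypothetical names, and the 256 KB stack size is only a hint that the JVM may adjust.

```java
/** Hypothetical probe: forces a StackOverflowError and reports it visibly. */
class OverflowProbe {
    private int depth;

    private void recurse() {
        depth++;
        recurse();
    }

    /**
     * Runs unbounded recursion on a small-stack thread.
     * Returns the depth reached when the error was caught, or -1 if none was seen.
     */
    static int depthAtOverflow() {
        final OverflowProbe probe = new OverflowProbe();
        final int[] result = {-1};
        Thread t = new Thread(null, () -> {
            try {
                probe.recurse();
            } catch (StackOverflowError e) {
                // The stack has unwound to this frame, so reporting is safe here.
                result[0] = probe.depth;
                System.err.println("StackOverflowError caught at depth " + probe.depth);
            }
        }, "overflow-probe", 256 * 1024);
        t.start();
        try {
            t.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }
}

class OverflowProbeDemo {
    public static void main(String[] args) {
        int depth = OverflowProbe.depthAtOverflow();
        System.out.println("overflow observed: " + (depth > 0));
    }
}
```

The depth reached varies with the JVM, the platform, and the compilation mode, so no particular value should be expected.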

EVALUATION

The test was hanging due to StackOverflowErrors that hit while deep inside the internals of AbstractQueuedSynchronizer, which is used by ReentrantLock, which is used by ConcurrentHashMap, which is used within the class loading framework.

Taking a simple example, consider a ReentrantLock.lock() operation for the common nonfair mode:

    static final class NonfairSync extends Sync {
        ...
        /**
         * Performs lock.  Try immediate barge, backing up to normal
         * acquire on failure.
         */
        final void lock() {
            if (compareAndSetState(0, 1))
                setExclusiveOwnerThread(Thread.currentThread());
            else
                acquire(1);
        }

If StackOverflowError is thrown from Thread.currentThread() or setExclusiveOwnerThread then we end up with a lock in which state==1 but owner==null. The lock appears to be locked but no thread actually owns it, nor can it be unlocked. Hence we get a hang as all threads pile up on this lock. Similarly, if StackOverflowError occurs during lock release you can end up with inconsistent state that renders the lock unusable.

This problem was exacerbated by the fact that something in the call chain was silently swallowing the StackOverflowError, and that needs to be found and fixed (as this was an exceedingly difficult problem to track down!).

There is a question as to whether we can make the locking code more resilient to StackOverflowError. It seems simple enough to do, perhaps, for an individual case, e.g.:

        final void lock() {
            if (compareAndSetState(0, 1)) {
                try {
                    setExclusiveOwnerThread(Thread.currentThread());
                } catch (StackOverflowError soe) {
                    fixit();
                    throw soe;
                }
            }
            else
                acquire(1);
        }

But what exactly must fixit() do? If Thread.currentThread() caused the exception then we perhaps only need to act as if we are releasing the lock and so fix the state value. But what if it occurred inside setExclusiveOwnerThread? Okay, that's a trivial method too, so we could ensure the owner field is clear, but in general we either need to know the exact state of everything in the call stack, or else assume we are calling code that itself is completely safe in the face of StackOverflowError. This is generally an impractical problem to solve and is only somewhat simpler than dealing with the dreaded asynchronous exceptions from Thread.stop.

Arguably Lock and Condition implementations should be as resilient to exceptions as the built-in synchronized construct and the use of the wait/notify methods. But use of synchronized is effectively atomic with respect to exceptions: either the exception is seen before, or seen after, but it can't be seen in the middle. The same holds for the native/VM-based implementations of wait/notify. The utility classes in java.util.concurrent do not have, and cannot have, exception atomicity: exceptions can happen on any line of code, or, for StackOverflowError, at any line that makes a method call.

Even normal use of try/catch/finally is not sufficient to handle these cases. Consider the idiomatic lock usage example:

        lock.lock();
        try {
            criticalActions();
        } finally {
            lock.unlock();
        }

If an exception (stack overflow or asynchronous) occurs inside lock() after the lock is actually acquired, then we are not in the try block yet and so the finally clause, and with it unlock(), will not run.

Trying to deal with this makes code almost impossible to write. We've outlawed asynchronous exceptions, and we've made some progress in making library code OutOfMemoryError resilient (though I suspect a lot more work is needed), but StackOverflowError seems to be something we just have to tolerate. As long as these problems are clearly visible we should be able to manage them. To that end we need to find what code swallowed the exception here (it may be in the VM), and I think we need to add a VM option that will report when StackOverflowError gets generated, just in case the Java code does swallow the exception.
03-02-2011
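The evaluation argues that the inconsistent state (held, but with no owner) is tolerable only if it is visible. One way to make it visible in application code is a diagnostic subclass of ReentrantLock, sketched below. InspectableLock, looksOrphaned, and describe are hypothetical names. ReentrantLock.getOwner() is documented as a best-effort approximation when called from a non-owner thread, so a single positive reading can be a transient race between the state update and the owner update; only a persistent one is suspicious.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical diagnostic lock exposing the "locked but ownerless" condition. */
class InspectableLock extends ReentrantLock {
    /** True if the lock reports held while no owner thread is recorded. */
    boolean looksOrphaned() {
        return isLocked() && getOwner() == null;
    }

    /** One-line summary suitable for a watchdog log. */
    String describe() {
        Thread owner = getOwner();
        return "locked=" + isLocked()
                + ", owner=" + (owner == null ? "none" : owner.getName())
                + ", queued=" + getQueueLength();
    }
}

class InspectableLockDemo {
    public static void main(String[] args) {
        InspectableLock lock = new InspectableLock();
        lock.lock();
        try {
            System.out.println(lock.describe());
            System.out.println("orphaned = " + lock.looksOrphaned());
        } finally {
            lock.unlock();
        }
        System.out.println(lock.describe());
    }
}
```

A watchdog thread could poll looksOrphaned() and log describe() when the condition persists across several samples. This only reports the problem; it does not repair the lock.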

PUBLIC COMMENTS

The jstack1.log file shows there is some deeply nested class loading occurring; however, there is no indication that any thread actually owns the lock that can't be acquired:

   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0xed58f930> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
        at java.util.concurrent.ConcurrentHashMap$Segment.put(ConcurrentHashMap.java:445)
        at java.util.concurrent.ConcurrentHashMap.putIfAbsent(ConcurrentHashMap.java:925)
        at java.lang.ClassLoader.getClassLoadingLock(ClassLoader.java:464)

This failure on Solaris sparc is being reported at the same time as:

        7011859 java/util/concurrent/Semaphore/RacingReleases.java fails in fastdebug on solaris-sparc

which reports:

        java.lang.Error: Semaphore stuck: permits 1, thread waiting true

which is reminiscent of the problem that 6822370 addressed. The failures are occurring on T1000 and T2000 systems.
13-01-2011
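The observation in the 13-01-2011 comment (threads parked on a lock that no thread owns) can also be checked from inside the JVM with the java.lang.management API instead of reading jstack output. The sketch below is illustrative only: BlockedThreadReport, describeBlocker, and sampleReport are hypothetical names, and the demo uses a healthy lock held by the calling thread, so an owner is normally reported. In the failure described in this report, the expectation would be that no owner is reported.

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper that reports what a thread is waiting for and who owns it. */
class BlockedThreadReport {
    static String describeBlocker(Thread t) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo info = mx.getThreadInfo(t.getId());
        if (info == null) {
            return t.getName() + ": not alive";
        }
        LockInfo lock = info.getLockInfo();
        if (lock == null) {
            return t.getName() + ": not waiting on a lock";
        }
        String owner = info.getLockOwnerName();
        return t.getName() + " waits for " + lock.getClassName()
                + ", owner=" + (owner == null ? "none reported" : owner);
    }

    /** Parks a daemon thread on a lock held by the caller and reports on it. */
    static String sampleReport() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        Thread waiter = new Thread(() -> {
            lock.lock();
            lock.unlock();
        }, "waiter");
        waiter.setDaemon(true);
        try {
            waiter.start();
            // Wait until the waiter has parked inside lock().
            while (waiter.getState() != Thread.State.WAITING) {
                Thread.sleep(10);
            }
            return describeBlocker(waiter);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            lock.unlock();
        }
    }
}

class BlockedThreadReportDemo {
    public static void main(String[] args) {
        System.out.println(BlockedThreadReport.sampleReport());
    }
}
```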