JDK-6982348 : Deadlock with combined use of EDT and CountDownLatch
  • Type: Bug
  • Component: client-libs
  • Sub-Component: java.awt
  • Affected Version: 7
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • OS: solaris
  • CPU: x86
  • Submitted: 2010-09-03
  • Updated: 2012-03-22
  • Resolved: 2010-10-01
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 7
7 b112Fixed
Related Reports
Relates :  
Relates :  
Description
PASSES: JDK 6RI, 6u12, 6u21 - Solaris/Windows
PASSES: JDK7 b80, Solaris

FAILS : at least JDK 7b106 - Solaris (and very rarely on Windows)

Reproduced on JDK7 b106

Thread dump is attached

The following code reflects the code that causes the tests to fail
Deadlock is usually reproduces on 4-60 iteration
Was reproduced on zenit.russia.sun.com, Solaris i586
---------------------------------------------------------------------
import javax.swing.*;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class HangUp {

    public static void main(String[] args) throws Exception {

        for (int i = 0; i < 500; i++) {
            System.err.println("iteration = " + i);
            SwingUtilities.invokeAndWait(new Runnable() {
                public void run() { /* does nothing */ }
            });
            new CountDownLatch(1).await(1, TimeUnit.SECONDS);
        }
    }
}
---------------------------------------------------------------------

Comments
EVALUATION Introduced by the fix for 6424157 in 7.0-b102, when a single EventDispatchThread was reused for multiple EventQueues. Prior to this change, EDT was supposed to be used only once: try { pumpEvents(new Conditional() { public boolean evaluate() { return true; } }); } finally { theQueue.detachDispatchThread(this, true); } In 6424157 this piece of code was wrapped into 'while (true)' loop. The very first loop iteration workes as expected, but every next pumpEvents() just exited immediately as 'doDispatch' was false after the first iteration. As a result, we got an infinite loop in EDT.run(): EQ.detachDispatchThread() always returned false because of some events pending, but no events were dispatched in pumpEvents(). Fixed by the change in EventDispatchThread.java from 6949936: void pumpEventsForFilter(int id, Conditional cond, EventFilter filter) { addEventFilter(filter); + doDispatch = true; while (doDispatch && cond.evaluate()) { if (isInterrupted() || !pumpOneEventForFilters(id)) { doDispatch = false; } } removeEventFilter(filter); }
01-10-2010

PUBLIC COMMENTS No it isn't a deadlock it appears to be a livelock. When the test program is hung I still see the Java process consuming a full CPU. Using pstack there is a curious thread entry: ----------------- lwp# 62 / thread# 62 -------------------- cb8775bb * *java/util/Vector.removeElementAt(I)V [compiled] cdd2451e __1cJJavaCallsLcall_helper6FpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v_ (b6f1dbc0, b6f1dad8, b6f1dae0, 8175000, b6f1db30, 817c850) + 53a cdd23fe0 __1cCosUos_exception_wrapper6FpFpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v2468_v_ (cdd23fe4, b6f1dbc0, b6f1dad8, b6f1dae0, 8175000, 8175000) + 18 cdd3974c __1cJJavaCallsMcall_virtual6FpnJJavaValue_nGHandle_nLKlassHandle_nMsymbolHandle_5pnGThread__v_ (b6f1dbc0, 817c850, 817c854, ce5064a4, ce5067d8) + 110 cdd39626 __1cMthread_entry6FpnKJavaThread_pnGThread__v_ (8175000) + e2 cdd39270 __1cKJavaThreadDrun6M_v_ (8175000, ce501ee0, 0, ce278a72) + 2f0 ce278ee4 java_start (8175000) + 4c8 ce6c73d7 _thr_setup (cda05a00) + 4e ce6c76c0 _lwp_start (cda05a00, 0, 0, b6f1dff8, ce6c76c0, cda05a00) I don't see how we can jump straight into java/util/Vector.removeElementAt I don't think the CountDownlatch is significant here, it's just the relative timing.
30-09-2010

PUBLIC COMMENTS The strange thread is the event dispatch thread. For some reason it's stack information has been corrupt and we only seem to see the top Java frame. Repeatedly doing pstack shows it doing different Java calls.
30-09-2010

PUBLIC COMMENTS I suspect this may be the VM deadlock reported in 6988439.
30-09-2010

EVALUATION No it isn't a deadlock it appears to be a livelock. When the test program is hung I still see the Java process consuming a full CPU. Using pstack there is a curious thread entry: ----------------- lwp# 62 / thread# 62 -------------------- cb8775bb * *java/util/Vector.removeElementAt(I)V [compiled] cdd2451e __1cJJavaCallsLcall_helper6FpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v_ (b6f1dbc0, b6f1dad8, b6f1dae0, 8175000, b6f1db30, 817c850) + 53a cdd23fe0 __1cCosUos_exception_wrapper6FpFpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v2468_v_ (cdd23fe4, b6f1dbc0, b6f1dad8, b6f1dae0, 8175000, 8175000) + 18 cdd3974c __1cJJavaCallsMcall_virtual6FpnJJavaValue_nGHandle_nLKlassHandle_nMsymbolHandle_5pnGThread__v_ (b6f1dbc0, 817c850, 817c854, ce5064a4, ce5067d8) + 110 cdd39626 __1cMthread_entry6FpnKJavaThread_pnGThread__v_ (8175000) + e2 cdd39270 __1cKJavaThreadDrun6M_v_ (8175000, ce501ee0, 0, ce278a72) + 2f0 ce278ee4 java_start (8175000) + 4c8 ce6c73d7 _thr_setup (cda05a00) + 4e ce6c76c0 _lwp_start (cda05a00, 0, 0, b6f1dff8, ce6c76c0, cda05a00) I don't see how we can jump straight into java/util/Vector.removeElementAt I don't think the CountDownlatch is significant here, it's just the relative timing.
30-09-2010

EVALUATION I suspect this may be the VM deadlock reported in 6988439.
30-09-2010

EVALUATION Another observation is that if I replace CountDownLatch.await(1sec) with a plain Thread.sleep(1sec), the hang is no longer reproducible.
29-09-2010

EVALUATION Here is what AWT EventDispatchThread does according to the thread dump above: "AWT-EventQueue-0" prio=10 tid=0x6ff54000 nid=0x1e22 runnable [0x6fa2a000] java.lang.Thread.State: RUNNABLE at java.util.concurrent.locks.ReentrantLock$NonfairSync.tryAcquire(ReentrantLock.java:218) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290) at java.awt.EventQueue.peekEvent(EventQueue.java:577) at java.awt.EventQueue.detachDispatchThread(EventQueue.java:972) at java.awt.EventDispatchThread.run(EventDispatchThread.java:97) This stack trace is pretty strange as in 'detachDispatchThread()' the same lock as used in peekEvent() must be already aquired. Here is a piece of detachDispatchThread(): ->> pushPopLock.lock(); try { if (edt == dispatchThread) { ->> if ((peekEvent() != null) || !SunToolkit.isPostEventQueueEmpty()) { return false; } dispatchThread = null; ... } } finally { pushPopLock.unlock(); }
29-09-2010

EVALUATION Even more strange thing is that sometimes I observe the following entry in the thread dump when the hang occurs: "AWT-EventQueue-0" prio=3 tid=0x082a9800 nid=0x20 runnable [0xb592b000] java.lang.Thread.State: RUNNABLE at java.awt.EventDispatchThread.run(EventDispatchThread.java:90)
29-09-2010

EVALUATION Reproduced on ubuntu10 with JDK7b104. Here is the thread dump. Full thread dump Java HotSpot(TM) Server VM (19.0-b05 mixed mode): "AWT-Shutdown" prio=10 tid=0x6ff58800 nid=0x1e29 in Object.wait() [0x6f9d9000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x74850470> (a java.lang.Object) at java.lang.Object.wait(Object.java:502) at sun.awt.AWTAutoShutdown.run(AWTAutoShutdown.java:287) - locked <0x74850470> (a java.lang.Object) at java.lang.Thread.run(Thread.java:717) "AWT-EventQueue-0" prio=10 tid=0x6ff54000 nid=0x1e22 runnable [0x6fa2a000] java.lang.Thread.State: RUNNABLE at java.util.concurrent.locks.ReentrantLock$NonfairSync.tryAcquire(ReentrantLock.java:218) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290) at java.awt.EventQueue.peekEvent(EventQueue.java:577) at java.awt.EventQueue.detachDispatchThread(EventQueue.java:972) at java.awt.EventDispatchThread.run(EventDispatchThread.java:97) "AWT-XAWT" daemon prio=10 tid=0x6ff52800 nid=0x1dee runnable [0x6fa7b000] java.lang.Thread.State: RUNNABLE at sun.awt.X11.XToolkit.waitForEvents(Native Method) at sun.awt.X11.XToolkit.run(XToolkit.java:625) at sun.awt.X11.XToolkit.run(XToolkit.java:589) at java.lang.Thread.run(Thread.java:717) "Java2D Disposer" daemon prio=10 tid=0x6ff3d400 nid=0x1ded in Object.wait() [0x6fadc000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x74858890> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135) - locked <0x74858890> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151) at sun.java2d.Disposer.run(Disposer.java:145) at java.lang.Thread.run(Thread.java:717) "Low Memory Detector" daemon prio=10 tid=0x6ff06400 nid=0x1deb runnable [0x00000000] java.lang.Thread.State: RUNNABLE "CompilerThread1" daemon prio=10 tid=0x6ff04400 nid=0x1dea waiting on condition [0x00000000] java.lang.Thread.State: RUNNABLE "CompilerThread0" daemon prio=10 tid=0x091f1000 nid=0x1de9 waiting on condition [0x00000000] java.lang.Thread.State: RUNNABLE "Signal Dispatcher" daemon prio=10 tid=0x091ef400 nid=0x1de8 waiting on condition [0x00000000] java.lang.Thread.State: RUNNABLE "Finalizer" daemon prio=10 tid=0x091dfc00 nid=0x1de7 in Object.wait() [0x700f3000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x74858c58> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135) - locked <0x74858c58> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177) "Reference Handler" daemon prio=10 tid=0x091de400 nid=0x1de6 in Object.wait() [0x70144000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x74858760> (a java.lang.ref.Reference$Lock) at java.lang.Object.wait(Object.java:502) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133) - locked <0x74858760> (a java.lang.ref.Reference$Lock) "main" prio=10 tid=0x0915f800 nid=0x1dde in Object.wait() [0xb69fc000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x748584d8> (a java.awt.EventQueue$1AWTInvocationLock) at java.lang.Object.wait(Object.java:502) at java.awt.EventQueue.invokeAndWait(EventQueue.java:1156) - locked <0x748584d8> (a java.awt.EventQueue$1AWTInvocationLock) at javax.swing.SwingUtilities.invokeAndWait(SwingUtilities.java:1346) at HangUp.main(HangUp.java:11) "VM Thread" prio=10 tid=0x091dbc00 nid=0x1de5 runnable "GC task thread#0 (ParallelGC)" prio=10 tid=0x09166c00 nid=0x1ddf runnable "GC task thread#1 (ParallelGC)" prio=10 tid=0x09168400 nid=0x1de0 runnable "VM Periodic Task Thread" prio=10 tid=0x6ff08400 nid=0x1dec waiting on condition JNI global references: 988 Heap PSYoungGen total 5248K, used 1057K [0x9e850000, 0x9fa50000, 0xb3850000) eden space 5184K, 20% used [0x9e850000,0x9e954790,0x9ed60000) from space 64K, 25% used [0x9fa30000,0x9fa34000,0x9fa40000) to space 64K, 0% used [0x9fa40000,0x9fa40000,0x9fa50000) PSOldGen total 43008K, used 227K [0x74850000, 0x77250000, 0x9e850000) object space 43008K, 0% used [0x74850000,0x74888c70,0x77250000) PSPermGen total 16384K, used 3565K [0x70850000, 0x71850000, 0x74850000) object space 16384K, 21% used [0x70850000,0x70bcb788,0x71850000)
06-09-2010