United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
JDK-6241823 : Infinite loop in timed Semaphore.tryAcquire

Details
Type:
Bug
Submit Date:
2005-03-16
Status:
Closed
Updated Date:
2010-05-09
Project Name:
JDK
Resolved Date:
2005-04-01
Component:
core-libs
OS:
solaris_9,windows_xp
Sub-Component:
java.util.concurrent
CPU:
x86,sparc
Priority:
P3
Resolution:
Fixed
Affected Versions:
5.0
Fixed Versions:

Related Reports
Backport:
Duplicate:
Relates:

Sub Tasks

Description
FULL PRODUCT VERSION :
java version "1.5.0_02"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_02-b09)
Java HotSpot(TM) Client VM (build 1.5.0_02-b09, mixed mode, sharing)


A DESCRIPTION OF THE PROBLEM :
Consecutive calls to Semaphore.tryAcquire(int permits, long timeout, TimeUnit unit) on a fair semaphore with no permits available causes infinite loop.


STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
    public static void main(String[] args) throws Exception {
        Semaphore sem = new Semaphore(0,true);
        System.out.println(sem.tryAcquire(100,TimeUnit.MILLISECONDS));
        System.out.println(sem.tryAcquire(100,TimeUnit.MILLISECONDS));
        System.out.println("OK!");
    }


EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
false
false
OK!
ACTUAL -
false


ERROR MESSAGES/STACK TRACES THAT OCCUR :
There is no error message, as the second call to tryAcquire() never returns.

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.util.concurrent.*;
public class Test {
    public static void main(String[] args) throws Exception {
        Semaphore sem = new Semaphore(0,true);
        System.out.println(sem.tryAcquire(100,TimeUnit.MILLISECONDS));
        System.out.println(sem.tryAcquire(100,TimeUnit.MILLISECONDS));
        System.out.println("OK!");
    }
}
---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Do not use fair semaphores (changing the above case to new Semaphore(0,false) works)
###@###.### 2005-03-16 22:22:49 GMT

                                    

Comments
SUGGESTED FIX

--- /u/martin/ws/AQS/webrev/src/share/classes/java/util/concurrent/locks/AbstractQueuedSynchronizer.java-       2005-03-27 14:13:44.458487000 -0800
+++ AbstractQueuedSynchronizer.java     2005-03-21 14:45:28.988658000 -0800
@@ -1235,51 +1235,46 @@
 
     /**
      * Version of getFirstQueuedThread called when fastpath fails
      */
     private Thread fullGetFirstQueuedThread() {
-        /*
-         * This loops only if the queue changes while we read sets of
-         * fields.
-         */
-        for (;;) {
             Node h = head;
             if (h == null)                    // No queue
                 return null;
 
             /*
              * The first node is normally h.next. Try to get its
              * thread field, ensuring consistent reads: If thread
              * field is nulled out or s.prev is no longer head, then
              * some other thread(s) concurrently performed setHead in
-             * between some of our reads, so we must reread.
+         * between some of our reads.
              */
             Node s = h.next;
             if (s != null) {
                 Thread st = s.thread;
                 Node sp = s.prev;
                 if (st != null && sp == head)
                     return st;
             }
 
             /*
-             * Head's next field might not have been set yet, or may
-             * have been unset after setHead. So we must check to see
-             * if tail is actually first node, in almost the same way
-             * as above.
+         * Head's next field might not have been set yet, or may have
+         * been unset after setHead. So we must check to see if tail
+         * is actually first node. If not, we continue on, safely
+         * traversing from tail back to head to find first,
+         * guaranteeing termination.
              */
-            Node t = tail; 
-            if (t == h)                       // Empty queue
-                return null;
 
-            if (t != null) {
+        Node t = tail;
+        Thread firstThread = null;
+        while (t != null && t != head) {
                 Thread tt = t.thread;
-                Node tp = t.prev;
-                if (tt != null && tp == head)
-                    return tt;
-            }
+            if (tt != null)
+                firstThread = tt;
+            t = t.prev;
         }
+        return firstThread;
     }
 
     /**
      * Returns true if the given thread is currently queued. 
      *
                                     
2005-07-22
EVALUATION

I can reproduce this.  Here is a complete program:

import java.util.concurrent.*;

public class Bug {
    public static void main(String[] args) throws Exception {
        Semaphore sem = new Semaphore(0,true);
        System.out.println(sem.tryAcquire(100,TimeUnit.MILLISECONDS));
        System.out.println(sem.tryAcquire(100,TimeUnit.MILLISECONDS));
        System.out.println("OK!");
    }
}

Doug Lea writes:
Ugh. The problem was that AbstractQueuedSynchronizer
fullGetQueuedThread could fail to notice a node had cancelled, and 
instead retried (infinitely.)

The fix will be checked in after paranoically running all stress tests
overnight.
(It's impressive/embarassing that the error could be triggered by a
3-line program.)

###@###.### 2005-03-17 01:16:05 GMT
                                     
2005-03-17



Hardware and Software, Engineered to Work Together