JDK-6311057 : Java Thread.sleep(timeout) is influenced by changes to System time on Linux
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 5.0u3
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: linux_redhat_9.0
  • CPU: sparc
  • Submitted: 2005-08-15
  • Updated: 2021-11-09
  • Resolved: 2005-09-22
Fixed in: JDK 6 b53
Description
Thread.sleep(timeout) is influenced by changes to the system time on Linux.
   
    Tested both on:

        Red Hat Linux release 9 (Shrike)

        Kernel 2.4.20-8 on an i686

    and

       Red Hat Enterprise Linux ES release 3 (Taroon Update 4)



    The test is simple: call Thread.sleep(15000) and, while the thread is sleeping,
    set the system time two hours back. For example, start at 13:00, call Thread.sleep,
    and at 13:05 change the time to 11:05. The sleep does not return at 11:15, not even
    when the time is then set back to 13:15.


        The test case:

        System.out.println("Press key before sleeping for 15 seconds...");
        System.in.read();
        Thread.sleep(15000);
        System.out.println("Woke-up.");

        ---
        e.g. 
          run the test, Press <Enter> at 11:34:00 (00 seconds)
          At 11:34:05 change the clock to 09:34:05 (two hours back)
          Expected result: at 09:34:15 the thread will wake up.

          If the same test is run on a Windows machine, it works.
          If the clock is set to a future time, the test also works on Linux.

          I have tested this with the following JDKs:
          1. Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_03-b07)
             Java HotSpot(TM) Client VM (build 1.5.0_03-b07, mixed mode, sharing)
                        [ FAILED ]
          2. Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0-b64)
             Java HotSpot(TM) Client VM (build 1.5.0-b64, mixed mode, sharing)
                        [ FAILED ]
          3. Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_07-b05)
             Java HotSpot(TM) Client VM (build 1.4.2_07-b05, mixed mode)
                        [ PASSED ]
        ---
        Please note that this same test PASSED on build 1.4.2_07-b05.

        The following files are attached:
        Tester.java - test case
        specification.txt - machine specification
        threadDump.txt - thread dump taken while Thread.sleep did not wake up

Comments
EVALUATION It is unfortunate that no detailed explanation of this bug or its fix was given, and it is worthwhile to provide one now.

In Java 5+ a Thread.sleep(millis) eventually calls os::sleep(millis), which in turn uses ParkEvent.park(millis), which in turn uses the OS-specific PlatformEvent.park(millis) method. On Linux, PlatformEvent is simply a pthread_mutex and pthread_cond pair. A "park" does a pthread_cond_wait on the condition variable and an "unpark" does a pthread_cond_signal. The timed park, park(millis), uses pthread_cond_timedwait, which takes an absolute time value. So the VM code converts the millis value into an absolute value "millis from now" based on the current clock time.

Herein lies the basic problem:
- the delay passed to sleep is a relative delay (block this thread until this number of milliseconds has passed)
- the value passed to pthread_cond_timedwait is an absolute time (block this thread until the clock reaches the time specified)

By default that clock is CLOCK_REALTIME, the time-of-day (TOD) clock. Now consider what happens if someone changes the TOD clock:

Case 1: If the TOD is adjusted forward, the absolute time will occur sooner than expected and the park(millis) will return early.
Case 2: If the TOD is adjusted backwards, the absolute time will occur much later than expected (by the amount of the time adjustment) and the park(millis) will return late.

That was the theory of operation, assuming that Linux actually handled pthread_cond_timedwait as described. In practice it seems this is not what happens and that (depending on the Linux version, perhaps) changes to the TOD clock have no effect on the timed wait!

So what was the bug? The time/timed-waiting functions have a history of being unreliable on all operating systems, so our os::sleep code already had in place some code to watch for early returns.
That code used javaTimeMillis to check the current time before and after calling park(millis), to ensure we don't return less than millis milliseconds after the call. But that was to catch things like signals terminating the pthread_cond_timedwait early (which they shouldn't).

If the TOD is adjusted forward, the park returns after the expected elapsed interval, we check the current time (which has advanced due to the change in the TOD), see that we waited long enough, and so return; all appears well.

However, if the TOD is adjusted back in time, as per this CR, then when we check the current time it will be in the past relative to the original time (before park was called). The elapsed time will be negative (on some debug versions of the VM this will trigger an assertion failure) and we then subtract that elapsed time from the original millis value. Subtracting a negative is the same as adding, so our millis value actually increases! Because we do not perceive that we have waited long enough (millis is still >= 0), we make another park call with the new, larger millis value. If the TOD is adjusted forward again this has no effect on the new park call, as observed by the reporter.

The fix applied here is to change the use of javaTimeMillis to javaTimeNanos, which simply uses a monotonic clock source (if available, and Java 6+ only!) to measure the elapsed time around the park(millis) call. Now when the park returns at the expected time, the measured elapsed time is not affected by the change in the TOD clock and so all appears to work correctly. However, it should be noted that if the Linux system treats absolute timeouts the way they "should" be treated, then changing the clock backward would again result in longer than expected sleeps, because the underlying park would not return when desired.
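
The retry arithmetic described in this evaluation can be illustrated with a small Java model. This is only a sketch under stated assumptions: the names SleepLoopSketch and remainingAfterPark are hypothetical, the real os::sleep is C++ code inside HotSpot, and the clock readings are hard-coded values rather than samples from a real clock. It shows how a two-hour backward step makes the outstanding millis grow when elapsed time is measured on the TOD clock, whereas a monotonic reading brings it to zero.

```java
// Illustrative model only; this is not the HotSpot source.
final class SleepLoopSketch {

    /**
     * Mirrors the bookkeeping described in the evaluation: the elapsed time
     * measured around a park is subtracted from the outstanding millis.
     * The caller would park again while the result is still positive.
     */
    static long remainingAfterPark(long millis, long clockBefore, long clockAfter) {
        long elapsed = clockAfter - clockBefore; // negative if the clock was set back
        return millis - elapsed;                 // subtracting a negative increases millis
    }

    public static void main(String[] args) {
        long millis = 15_000;                  // Thread.sleep(15000) from the test case
        long twoHours = 2L * 60 * 60 * 1000;   // 7,200,000 ms

        // TOD clock (javaTimeMillis style): the park really lasted 15 s,
        // but the clock was stepped back two hours in the meantime.
        long todBefore = 1_000_000_000L;       // arbitrary reading, in ms
        long todAfter = todBefore + 15_000 - twoHours;
        System.out.println("TOD clock:       remaining = "
                + remainingAfterPark(millis, todBefore, todAfter) + " ms");

        // Monotonic clock (javaTimeNanos style, shown here in ms):
        // unaffected by the step, so it sees the true 15 s.
        long monoBefore = 500_000L;
        long monoAfter = monoBefore + 15_000;
        System.out.println("Monotonic clock: remaining = "
                + remainingAfterPark(millis, monoBefore, monoAfter) + " ms");
    }
}
```

With these inputs the TOD-based calculation leaves 7,200,000 ms outstanding (the full two-hour step), which is why the sleeping thread appears never to wake, while the monotonic calculation leaves 0 ms.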

The correct fix here is to not use condition variables associated with the TOD clock (CLOCK_REALTIME) to implement relative timed-wait operations (Thread.sleep, Object.wait and some of the java.util.concurrent operations). Rather, these condition variables should be associated with CLOCK_MONOTONIC and hence be "guaranteed" to be immune to changes in the system clock.
11-11-2009
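
Case 1 and Case 2 from the evaluation above, and the reason a deadline on a monotonic clock avoids both, can be modelled with simple arithmetic. The following Java sketch is an assumption-laden illustration, not VM code: AbsoluteDeadlineSketch and actualWaitMillis are hypothetical names, nothing here calls pthread_cond_timedwait, and it models only the "theory of operation" (the evaluation notes that real Linux versions may behave differently).

```java
// Illustrative model only; it does not call pthread_cond_timedwait.
final class AbsoluteDeadlineSketch {

    /**
     * Real time (ms) a timed wait blocks if its timeout is an absolute deadline
     * on a clock that is stepped once by stepMillis, stepAtMillis into the wait.
     * A monotonic clock is never stepped, which corresponds to stepMillis == 0.
     */
    static long actualWaitMillis(long requestedMillis, long stepAtMillis, long stepMillis) {
        if (stepAtMillis >= requestedMillis) {
            return requestedMillis; // deadline was already reached before the step
        }
        // After the step the clock reads (start + t + step); it reaches the
        // deadline (start + requested) at t = requested - step, but the wait
        // cannot end before the moment of the step itself.
        return Math.max(stepAtMillis, requestedMillis - stepMillis);
    }

    public static void main(String[] args) {
        long twoHours = 2L * 60 * 60 * 1000; // 7,200,000 ms
        // A 15 s wait, with the clock changed 5 s in, as in the reporter's example.
        System.out.println("Case 1, forward:  " + actualWaitMillis(15_000, 5_000, twoHours) + " ms");
        System.out.println("Case 2, backward: " + actualWaitMillis(15_000, 5_000, -twoHours) + " ms");
        System.out.println("Monotonic clock:  " + actualWaitMillis(15_000, 5_000, 0) + " ms");
    }
}
```

In this model the forward step ends the wait at 5,000 ms (early), the backward step stretches it to 7,215,000 ms (late by the size of the adjustment), and the unstepped clock gives the requested 15,000 ms.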

EVALUATION Fixed by using the monotonic clock if it is available on the system (2.6 kernel).
14-09-2005