JDK-8146527 : Scheduled execution ignores PC suspend mode
  • Type: Enhancement
  • Component: core-libs
  • Sub-Component: java.util.concurrent
  • Affected Version: 8u66
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • OS: windows_8
  • CPU: x86_64
  • Submitted: 2016-01-05
  • Updated: 2019-04-04
Description
Steps to reproduce:
* Start the following program on Windows 8.1, then immediately suspend the PC. Wait for ten minutes. Then wake up the PC.
```
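	import java.util.concurrent.Executors;
	import java.util.concurrent.ScheduledExecutorService;
	import java.util.concurrent.TimeUnit;

	public class SchedulerSleepover {
		private static final ScheduledExecutorService SCHEDULER = Executors.newSingleThreadScheduledExecutor();

		static final void demonstrateSchedulerSleepover() {
			final long requestedDelay = 60L;
			final long startTime = System.nanoTime();
			// Prints the requested delay next to the delay measured with System.nanoTime().
			SCHEDULER.schedule(() -> System.out.printf("Requested Delay: %ds / Actual delay: %ds%n", requestedDelay, (System.nanoTime() - startTime) / 1000000000L),
					requestedDelay, TimeUnit.SECONDS);
		}
		public static void main(String[] args) { demonstrateSchedulerSleepover(); }
	}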
```

Expected result:
* Program prints "Requested Delay: 60s / Actual delay: 600s"

Actual result:
* Program prints "Requested Delay: 60s / Actual delay: 660s"

Conclusion:
* The length of the sleep period is added on top of the requested delay. This is annoying and unexpected. There seems to be no way to tell executors that the delay has to be measured in terms of "real time", not in terms of "non-suspended VM ticks".
Comments
I agree with Markus about the desired goal, but it's not going to be easy getting there. The problem is not just about Windows - modern OSes are confused about whether "time" should elapse while "suspended". We will not try to solve this within java.util.concurrent, which we could do only by polling.
13-01-2016

As noted in JDK-8146730, waitable timers may be the means to implement an absolute-time based scheduling mechanism.
10-01-2016

As in the case of LockSupport.parkUntil the docs clearly say it is an absolute deadline, I have filed JDK-8146730, which is *really* a bug -- either in the JavaDocs or in the implementation.
09-01-2016

As the current behaviour is as designed, and usually desired, I've changed this to an Enhancement request. I would encourage further discussion on this, perhaps on Doug Lea's concurrency-interest@cs.oswego.edu mailing list (where 99% of discussion of java.util.concurrent APIs occurs). Thanks.
08-01-2016

As I stated previously, the Windows APIs that are used to implement timed-blocking only take a relative timeout, hence the observed problem with parkUntil. [Edited as I misread the timer queue API comment.] As you note, the timer queue API would work as you desired in old Windows versions, but not current ones, so it seems that the OS itself has moved away from directly supporting this. Even if an API existed that did track the time while suspended, using it would require a major redesign of this part of the VM. Using a polling loop in j.u.c or the VM would be a much simpler alternative. Regardless, I would only consider this for a schedule(Date at, Runnable task) style method - no change to relative timeout method behaviour. The email example is a good one as it demonstrates that the requirement pertains to time as observed outside of the machine (i.e. by the user). Such examples do want to see time tracked across a suspension. Effectively categorizing/describing these different kinds of requirements is difficult.
08-01-2016

Side note: A quote from the Windows API description of "CreateTimerQueueTimer": <quote> The time that the system spends in sleep or hibernation does not count toward the expiration of the timer. The timer is signaled when the cumulative amount of elapsed time the system spends in the waking state matches the timer's due time or period. Windows Server 2003 and Windows XP: The time that the system spends in sleep or hibernation counts toward the expiration of the timer. If the timer expires while the system is sleeping, the timer is signaled immediately when the system wakes. </quote> This means that the API functions work differently on different Windows versions. Also, there still are API functions that work "correctly" even on Windows 8.1. So as Windows can provide "correct" events, the JRE should use them. Possibly the only thing to do is to replace one Windows API function with a different one.
07-01-2016

I have tried out LockSupport#parkUntil on Windows 8.1 and it fails just as I expected:
static final void demonstrateParkUntilSleepover() {
	final long requestedDelay = 60L;
	final long startTime = System.nanoTime();
	LockSupport.parkUntil(Instant.now().plus(1L, ChronoUnit.MINUTES).toEpochMilli());
	System.out.printf("Requested Delay: %ds / Actual delay: %ds%n", requestedDelay, (System.nanoTime() - startTime) / 1000000000L);
}
Test 1: Start program, keep laptop lid open, wait for sixty seconds, and it prints "60 / 60" exactly at 60 seconds after starting the program. :-)
Test 2: Start program, close laptop lid for 30 seconds, open laptop lid, and it prints "60 / 90" exactly at *90* seconds after starting the program! :-(
This is really not what an application developer expects to get from a method that says in its JavaDocs: "@param deadline the **absolute** time, in milliseconds from the Epoch, to wait until"! See, it does not even work using those methods which are explicitly intended for use with *absolute* time. :-(
07-01-2016

While I understand your arguments, I hope you also understand mine: Think of an application vendor that wants to write a simple program that checks for new emails once per real-world hour. What he will do in 99% of all cases is "ScheduledExecutorService#schedule()". Only 1% of all coders would understand the JavaDocs in a way that means "The JRE will check your mails once per hour while it is running, but if you close your laptop lid for 59 minutes after just having checked your mails, you will still have to wait another 60 minutes after opening the laptop lid." Really, the 99% would expect that one minute after opening the laptop lid, the software polls for mails, because the 60 minutes are over. Things like this are the main driver for application developers to use a *scheduler*. Hence, I see a huge difference between a system developer just wanting to park a thread for some seconds, and an application developer wanting to write business applications. Both want relative times. But system programmers think in virtual periods (running only while the JVM is running) while application programmers think in real-world periods. Both are "relative to the previous event", but the reference system is different. The virtual reference system is non-linear. The real-world reference system is linear. Certainly you can ask the application programmer to periodically check the current time to see whether the real-world period is over. But shouldn't such a task be part of the scheduler's API? Anyway, apparently your opinion is that the JRE shall work as it does (counter-intuitive and different on Windows and Mac), so we should at least clearly state in the JavaDocs of the schedule() methods that the delays will pause as long as the JVM is suspended and continue to tick once it resumes, so application programmers will know.
The current wording "relative vs absolute" is anything but clear; it is rather ambiguous as "relative" and "absolute" are used differently in normal language: "Call me in five minutes" is relative. "Call me at five p.m." is absolute. At least in my language. ;-)
07-01-2016
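
The application-level check mentioned in the comment above ("periodically check the current time to see whether the real-world period is over") could be sketched roughly as follows. `WallClockPeriod` is a hypothetical helper, not a JDK class, and it assumes the wall clock is not adjusted backwards between checks.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper (not part of the JDK): tracks a period against wall-clock time. */
final class WallClockPeriod {
    private final Duration period;
    private Instant lastRun;

    WallClockPeriod(Duration period, Instant start) {
        this.period = period;
        this.lastRun = start;
    }

    /** True once at least one full period of wall-clock time has passed since the last run. */
    boolean isDue(Instant now) {
        return !now.isBefore(lastRun.plus(period));
    }

    /** Records that the task ran at the given instant. */
    void markRun(Instant now) {
        lastRun = now;
    }
}
```

An application would call `isDue(Instant.now())` from a short-interval scheduled task and run the mail check (followed by `markRun`) only when it returns true, so time spent suspended counts toward the hour.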

Okay, to be fair/honest, at either the VM or j.u.c level you could convert timed-waits into smaller polling loops that would allow you to re-assess whether a wakeup time should have occurred. However, that has its own problems with performance and overhead, e.g. in power-managed environments.
07-01-2016
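
As a rough illustration of the polling idea in the comment above, a timed wait can be cut into short slices, with the wall clock re-read after each one. `SlicedWait` and `awaitWallClock` are made-up names for this sketch, not JDK APIs, and the slice length is an arbitrary trade-off between wakeup latency after resume and polling overhead.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

/** Hypothetical sketch: a wall-clock wait built from short relative parks. */
final class SlicedWait {
    /**
     * Blocks until System.currentTimeMillis() reaches deadlineMillis, parking for at
     * most sliceMillis at a time so the deadline is re-assessed after every slice.
     * Returns true if the deadline was reached, false if the thread was interrupted
     * (the interrupt flag is left set for the caller).
     */
    static boolean awaitWallClock(long deadlineMillis, long sliceMillis) {
        if (sliceMillis <= 0) {
            throw new IllegalArgumentException("sliceMillis must be positive");
        }
        long remaining;
        while ((remaining = deadlineMillis - System.currentTimeMillis()) > 0) {
            // Each park is still a relative wait, but it is never longer than one slice.
            LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(Math.min(remaining, sliceMillis)));
            if (Thread.currentThread().isInterrupted()) {
                return false;
            }
        }
        return true;
    }
}
```

If the machine is suspended during a slice, the wait should end at most about one slice of running time after resume, provided the wall clock advanced across the suspension.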

I don't agree with the analogy of modelling suspension as having an overloaded system. To me suspension is a "pause" button. It is also not something you do on a machine that runs applications that have to interact outside the machine in "real time". Anyway, there is nothing we can do in the VM either - the underlying blocking calls will not unblock. You would have to try and factor suspension support into the VM - get an event when it happens and another when you resume. You'd have to track all timed-blocking calls and unblock them "just in case". Coming up with meaningful cross-platform semantics would be hard - actually implementing them harder still. If someone thinks that is worthwhile then they can start a discussion, file a JEP and start work on it. I haven't seen any demand for it. I can see the conceptual case for a schedule-at-absolute-time function on ScheduledExecutor, but with no underlying OS support on Windows you wouldn't gain anything. Even parkUntil has to convert the absolute time to relative on Windows and so would also wake up late after suspension. Also note that we (JSR166 expert group) explicitly killed the absolute-time schedule methods due to their problematic nature (as evidenced by bug reports on java.util.Timer functionality) - and that wasn't even considering PC sleep/suspend/hibernate issues. This lack was documented: http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ScheduledExecutorService.html "All schedule methods accept relative delays and periods as arguments, not absolute times or dates. It is a simple matter to transform an absolute time represented as a Date to the required form. For example, to schedule at a certain future date, you can use: schedule(task, date.getTime() - System.currentTimeMillis(), TimeUnit.MILLISECONDS). Beware however that expiration of a relative delay need not coincide with the current Date at which the task is enabled due to network time synchronization protocols, clock drift, or other factors."
PC suspension would be covered by "other factors" but it wasn't explicitly discussed. There is/was simply no goal of having any specified behaviour for the JVM across a machine suspension.
07-01-2016
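
The Date-to-delay conversion quoted from the ScheduledExecutorService documentation can be wrapped in a small helper, sketched below. `AbsoluteSchedule`, `delayMillisUntil` and `scheduleAt` are hypothetical names for illustration. The result is still a relative delay computed once at submission time, so it is subject to the same behaviour across a suspension as discussed in this report.

```java
import java.util.Date;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper wrapping the conversion shown in the ScheduledExecutorService docs. */
final class AbsoluteSchedule {
    /** Relative delay in milliseconds from nowMillis to the given date; zero if already past. */
    static long delayMillisUntil(Date when, long nowMillis) {
        return Math.max(0L, when.getTime() - nowMillis);
    }

    /** Schedules the task using a relative delay derived from the absolute date. */
    static ScheduledFuture<?> scheduleAt(ScheduledExecutorService scheduler, Runnable task, Date when) {
        long delay = delayMillisUntil(when, System.currentTimeMillis());
        return scheduler.schedule(task, delay, TimeUnit.MILLISECONDS);
    }
}
```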

David: I did some thinking ... anything timing-related will be messed up by a suspension, but I still think trying to come as close as possible to Real World elapsed time is probably best. This is especially true for ScheduledExecutors. Despite the risk of a wakeup storm. Another goal is for all JDK platforms to act similarly. You can model a suspension as simply running on a suddenly overloaded system. Real world software systems often end up with retry loops at a higher level than the JDK, to deal with unreliable modern computing environments. I don't think there's anything we can do in j.u.c. library code.
07-01-2016

Markus: time in the "real world" IS absolute time. If now is 3pm and you want to fire in 1 hour then you want to fire at 4pm. The distinction becomes important in this situation where real-world time and machine time cease to pass at the same rate due to, in this case, machine suspension. Martin: think carefully about this. You generally do not want relative timeouts to fire because the machine returns from suspension - because your timeout is generally related to the actions of other threads/processes on the same machine. If you suspend the VM you want the application frozen (I'll ignore issues about socket connections etc) and then resumed as-if it hadn't been frozen. The win32 documentation for the timed-blocking calls makes no mention of the timeout tracking elapsed time while the PC is suspended/sleeping/hibernating, so I would not expect them to do so and experimentation indeed shows that is the case. While the clocks will update (questionable in itself for QPC), the blocked calls will not return until the requested time has elapsed on the non-suspended machine.
07-01-2016

I'd be glad to. I will hack it together later this week and post the test results here for both Win 8.1 and Mac.
06-01-2016

Markus, I think we're in agreement about the desired result. Could you try the experiment of calling low-level timed LockSupport park methods (including parkUntil) and seeing whether they also "over-sleep" and how nanoTime behaves? Then we can move this to hotspot runtime?
06-01-2016
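
One possible shape for the experiment requested above is sketched here: it parks with LockSupport.parkUntil and reports the elapsed time according to both System.nanoTime() and System.currentTimeMillis(), so the two clocks can be compared across a suspension. `ParkExperiment` and `measureParkUntil` are names chosen for this sketch, and the 60-second delay in `main` simply mirrors the test program in the description.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

/** Hypothetical experiment: how long does parkUntil block, by nanoTime and by wall clock? */
final class ParkExperiment {
    /** Returns { elapsed ms by System.nanoTime(), elapsed ms by System.currentTimeMillis() }. */
    static long[] measureParkUntil(long delayMillis) {
        long startNanos = System.nanoTime();
        long startWall = System.currentTimeMillis();
        long deadline = startWall + delayMillis;
        // parkUntil may return early for no reason, so loop until the wall clock agrees.
        while (System.currentTimeMillis() < deadline) {
            LockSupport.parkUntil(deadline);
            if (Thread.currentThread().isInterrupted()) {
                break; // avoid spinning: park returns immediately while the flag is set
            }
        }
        long byNanoTime = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
        long byWallClock = System.currentTimeMillis() - startWall;
        return new long[] { byNanoTime, byWallClock };
    }

    public static void main(String[] args) {
        long[] elapsed = measureParkUntil(60_000L);
        System.out.printf("Requested: 60000 ms / nanoTime: %d ms / wall clock: %d ms%n",
                elapsed[0], elapsed[1]);
    }
}
```

Suspending the machine shortly after starting `main` and comparing the two figures with the requested delay would show whether the park call, nanoTime, or both account for the suspended period on a given platform.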

Martin, yes. As you can see in the test program, what the application programmer intends is to get an event fired after 60s of real-world time. Hence the expectation is "60s / 60s" (as event time, not as printed result). Honestly, I made a mistake in my description. In fact I do not care about the actual output. I only care about the fact that the event is fired 60s after the end of hibernation (which is simply weird), instead of INSTANTLY after hibernation (which is what people would expect). Sorry if that was not clear enough. The problem is not the content of the output. The problem solely is the time *when* the event is fired. Actually I don't care about nanoTime. All I care about is WHEN DOES THE EVENT FIRE. Please fire it after the expected number of REAL WORLD SECONDS. Do not fire it after VIRTUAL SECONDS. Hope it is clear now. :-)
06-01-2016

Markus: shouldn't your expected output include "Actual delay: 600s" , not "Actual delay: 60s" ? I agree with Markus that I expect nanoTime and java.util.concurrent APIs to reflect elapsed time as accurately as possible. We should figure out what mechanism is actually failing in the windows suspend case. I'm not aware of anything in j.u.c. itself. If there's anything to fix here, it's probably in hotspot. David says that nanoTime is correctly updated across suspends. Perhaps the mechanism that is failing is timed park?
06-01-2016

This is not an issue of "virtual time" versus "real-world time" but absolute time versus relative time. Most APIs in the JDK deal with relative time - an elapsed time interval - including the ScheduledExecutor as I already said. What you seem to want is an absolute-time API for the scheduled executor - not unreasonable, but usually easily approximated by using relative times (except when wall clock time can "jump" - via direct time changes, or apparent time changes due to suspension). Though even then I could not guarantee things would work as you expect across a machine suspension - that would depend on the details of the OS (eg it might convert absolute time to relative internally and so still exhibit the same behaviour; or it may not support absolute times in APIs to begin with - win32 does not!). Exactly how the demo program behaves depends on the OS behaviour with regard to the low-level APIs and the exact implementation at the Java and JVM levels. If nanoTime is implemented using a mechanism that does not advance when the machine is suspended (or is adjusted by the OS) then you will not see the time that was "lost" during the suspension. On Windows it does advance: https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408%28v=vs.85%29.aspx but OS X may well be different.
06-01-2016

Update: You say the OS triggers the JVM at absolute time. This seems not to be true. The JVM uses System.nanoTime() for computing the target time. System.nanoTime() is *not* wall-clock time. So at least on Windows the OS will definitely *not* fire at absolute time. I would never have filed this JIRA ticket if that were the case, as that actually is what I want it to be. Okay, so how to go on? * If the intended design target of the existing API is *virtual* time (i.e. the time axis does not advance in sleeping JVMs), then this SHOULD be explicitly mentioned in the JavaDocs, as things explicitly called "scheduler" in the IT world typically work with *real world time* (i.e. the time axis advances even while a JVM is sleeping), not *virtual* time. Also in that case I would plead for adding an additional method to schedule *real world time* delays, so application programmers can safely rely upon immediate firing of the event once the JVM is resumed after hibernation. * If instead the intended design target of the existing API is *real world time* (which is what I assumed prior to opening this ticket) then the question actually is: Why does the above demo application prove that it does *not work* on Windows (BTW, on Mac it seems to fail, too [it fires too late], but it prints the correct time -- strange, but true)? Adding a new API that allows applications to get notified upon suspend and resume OS events is nice but out of scope of this ticket I think, as it solves a different problem and offloads this JVM-internal problem to each application vendor.
06-01-2016

The API you use above is a relative-time API, not an absolute-time one. If there is any flaw here it is that ScheduledExecutorService does not provide a mechanism to schedule a task at an absolute time. The OS does not "send us periodic timer ticks"; we use APIs that block until specific amounts of time have elapsed, or until a specific absolute (wall-clock) time is reached (if supported by the OS), depending on whether the Java level API uses relative or absolute time values.
06-01-2016

The solution is not to *support* suspension. You have two possible solutions. (a) If the OS has an API that can send you an event at an absolute point in time, use that one. (b) If the OS only can send you periodic timer ticks, then at each tick (or a number of ticks) check the absolute time of the OS and compare the actually spent delay (just as the test program does it).
06-01-2016
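
Option (b) above, checking the OS time at each tick, might look roughly like the following when layered on top of an existing ScheduledExecutorService. This is only a sketch under stated assumptions: `TickScheduler` and `scheduleWallClock` are invented names, the tick length is arbitrary, and System.currentTimeMillis() is assumed to advance while the machine is suspended.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: run a task after a wall-clock delay by polling on short ticks. */
final class TickScheduler {
    /**
     * Runs the task once System.currentTimeMillis() has advanced by delayMillis,
     * re-checking the wall clock every tickMillis. The returned future completes
     * after the task has run (or fails with the task's exception).
     */
    static CompletableFuture<Void> scheduleWallClock(ScheduledExecutorService scheduler,
            Runnable task, long delayMillis, long tickMillis) {
        long deadline = System.currentTimeMillis() + delayMillis;
        CompletableFuture<Void> done = new CompletableFuture<>();
        ScheduledFuture<?> ticker = scheduler.scheduleWithFixedDelay(() -> {
            if (done.isDone() || System.currentTimeMillis() < deadline) {
                return; // already finished, or not due yet: wait for the next tick
            }
            try {
                task.run();
                done.complete(null);
            } catch (Throwable t) {
                done.completeExceptionally(t);
            }
        }, 0L, tickMillis, TimeUnit.MILLISECONDS);
        // Stop ticking as soon as the task has run, failed, or the caller completed the future.
        done.whenComplete((value, error) -> ticker.cancel(false));
        return done;
    }
}
```

The cost is one wakeup per tick for every pending task, which is the performance and power overhead raised elsewhere in this thread.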

Also note that semantically many applications would want the suspension to be invisible - i.e. a relative sleep/timeout/delay should behave as if the suspension never occurred. Consider a blocking call with a timeout specified: if you make the call just before suspension you generally don't want it to time out upon resumption. Functions involving absolute times should already work as expected.
06-01-2016

I'd be interested to hear how this can be simply fixed. If the OS APIs already support this then many things should "just work". My understanding is that applications have to be explicitly designed to support suspension - they need to be notified when suspension occurs, so they can save state in a way that is resumable, and then they must be notified upon resumption.
06-01-2016

Sounds like a poor excuse for a really big problem. That might be the case, but for the application programmer this simply means that it is *his* responsibility then. The app programmer does not know that it is *his* responsibility, as in native applications the Windows API solves the problem for him and the JavaDocs do not tell him to care. So at least the JavaDocs should say so absolutely clearly (which is a poor solution), while actually it should be simple to fix this.
06-01-2016

I'm not aware of anything in the JDK that is designed to work across suspension of the host machine. Certainly none of the timing facilities are.
06-01-2016

Sorry but you still miss my point. I do *not* want to say "fire this timer at 8:00 p.m." -- that would be *absolute* time. What I *do* want is to say "fire this timer in one hour from now". The problem is that this does not work. It fires the timer in 1 virtual hour from now, but I want it to fire in 1 real world hour from now. So actually I *want* relative time, but relative in the *real world*! I hope it is clear now. :-)
06-01-2016