Bug ID: JDK-8092246 Delays in D3D Present affect performance

Type: Enhancement
Component: javafx
Sub-Component: graphics
Affected Version: fx2.0

Priority: P4
Status: Open
Resolution: Unresolved

Submitted: 2011-04-08
Updated: 2018-09-05

Other: tbd (Unresolved)

A very simple animation shows that the render thread spends significant time in com.sun.prism.d3d.D3DSwapChain.nPresent().
The length of those delays increases linearly and resets periodically, which suggests that the pulse is out of sync with VSync and fires at a slightly higher rate (a period of approx. 16 ms vs. 16.67 ms). Since the user thread waits for the renderer to finish its job, time spent in nPresent() reduces the number of frames that can be created and rendered per second in computationally intensive applications. (See the attached timeline view for the user and render threads.)
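The sawtooth pattern described above can be reproduced with a small model. This is only a sketch under stated assumptions: pulses are taken to fire on a fixed 16 ms grid, the display refreshes at exactly 60 Hz, rendering itself takes no time, and present blocks until the next vsync. The class and method names are hypothetical and are not part of Prism or Quantum.

```java
class PresentDrift {
    // Time unit: one third of a microsecond, so that both periods are exact integers.
    static final long PULSE_PERIOD = 48_000; // 16 ms (assumed pulse period)
    static final long VSYNC_PERIOD = 50_000; // 1/60 s = 16.666... ms (assumed refresh period)

    /** How long pulse k, fired at k * 16 ms, would block waiting for the next vsync, in ms. */
    static double waitMillis(long k) {
        long phase = (k * PULSE_PERIOD) % VSYNC_PERIOD;     // position inside the current refresh interval
        long wait = (VSYNC_PERIOD - phase) % VSYNC_PERIOD;  // time left until the next vsync
        return wait / 3000.0;                               // 3000 units per millisecond
    }

    public static void main(String[] args) {
        for (long k = 0; k <= 50; k++) {
            System.out.printf("pulse %2d: waits %6.3f ms%n", k, waitMillis(k));
        }
    }
}
```

In this model the wait grows by about 0.667 ms per pulse and drops back to zero every 25 pulses (every 400 ms), which is 2.5 resets per second: the difference between 62.5 Hz and 60 Hz.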

Reopened as a "Tweak" for Lombard to allow Oleg a chance to continue the analysis. I have marked this as a potential duplicate of RT-15195. That can be verified once we are able to run in parallel.
29-03-2012
Until the threads can run in parallel, this fact isn't going to change. That issue is tracked separately by http://javafx-jira.kenai.com/browse/RT-15195
14-01-2012
I realize the initial description used too simple an example to hint at a larger problem. Frame skipping caused by interference between the two frequencies (pulse and vsync) is gone: we get a healthy 60 fps where we used to get 57-58. For computationally intensive apps (fps < 60) the problem remains, though: the time spent sleeping in present could be used by the application thread, but it is not.
14-01-2012
I think this issue is not yet resolved. With the new scheme, a new pulse can't start until D3DSwapChain.nPresent() returns and it may still accumulate quite noticeable time waiting, which is wasted. For example, GUIMark2.Bitmap with 5000 monsters runs at 30 fps on my machine achieving only 0.73 CPU load (one CPU is used at 73%) while nPresent() spends ~26% of total time sleeping. If a new pulse could be started before nPresent() is called then ~100% CPU load and higher fps could be achieved.
13-01-2012
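A back-of-the-envelope estimate for the GUIMark2.Bitmap numbers in the comment above (30 fps, ~26% of total time sleeping in nPresent()). This is a rough upper bound, not a measurement: it assumes the whole sleeping time could be overlapped with the next pulse and ignores vsync quantization. The class and method names are hypothetical.

```java
class PresentOverlapEstimate {
    /** Frame rate if the given fraction of each frame were no longer spent waiting. */
    static double fpsIfWaitReclaimed(double measuredFps, double waitFraction) {
        double frameMillis = 1000.0 / measuredFps;               // current time per frame
        double busyMillis = frameMillis * (1.0 - waitFraction);  // part of the frame doing real work
        return 1000.0 / busyMillis;
    }

    public static void main(String[] args) {
        // Figures taken from the comment: 30 fps, 26% of time sleeping in nPresent().
        System.out.printf("estimated upper bound: %.1f fps%n", fpsIfWaitReclaimed(30.0, 0.26));
    }
}
```

Under these assumptions the bound is about 40 fps, which is in line with the claim that a higher frame rate is achievable if a new pulse starts before nPresent() returns.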
Fixed in p21-graphics with this change set:
changeset: 15283:4ce4d30ab43d
user: Morris Meyer <morris.meyer@sun.com>
date: Thu Jan 12 22:17:45 2012 -0500
summary: RT-13660 PaintCollectorBehavior v07 - reviewed by Kevin R, Oleg S, Oleg M, Ekaterina P, Artem A, Kirill P, Chien Y, Brent C and many others
Would like to see new JPA runs on Windows though.
13-01-2012
SQE - Ok to defer
23-08-2011
OK to defer.
23-08-2011
Since RT-13660 is deferred from Presidio, this needs to be deferred as well. Target to Lombard.
19-08-2011
Based on the analysis, this seems more like a toolkit issue and is similar to (if not the same as) the issue described in RT-13660.
27-05-2011
Are you saying that a precise timer could be implemented completely in Java using the Object.wait() method? If so, then I don't see why Glass should provide the timer. Glass is an interface to the native platform, and the native platform API provides support for 1ms-resolution multimedia timers on MS Windows: http://msdn.microsoft.com/en-us/library/dd742877(v=vs.85).aspx and http://msdn.microsoft.com/en-us/library/ms713423(v=vs.85).aspx. Note that Glass is no longer responsible for an 'FPS' property - Quantum simply uses the utility machinery of timers provided by Glass. So, if Quantum requires more precision for FPS timers and if they're implementable in Java w/o any native code, then it may simply implement one at the Quantum level. There's no need to involve Glass in that.
04-05-2011
FWIW, Thread.sleep and Object.wait() have versions that take (long millis, int nanos) arguments. In my experience this is more precise & accurate than the versions that take a single (long ms) arg, even (perhaps "particularly") on Windows. I don't have details handy, but I know that the previous, FX Script-based runtime used Executor/Future from the java.util.concurrent package for pulse scheduling. This resulted in more precise delivery of pulses than we got using SwingTimer/java.util.Timer. (IIRC, the java.util.concurrent implementation boiled down to an Object.wait(long,int) call). However, the Futures were only ever scheduled on a millisecond basis, so we never got (or, technically, asked for) a true 60Hz. See my comments in RT-340 regarding the 62.5 fps framerate. Vsync-aligned timers sound promising, though I would still like to see if simply moving the current code to nanosecond APIs and actually asking for 60Hz can get rid of our visual hiccup. At the time of RT-340, vsync improved things as far as getting closer to 60Hz, though I wonder if that was "in spite of" running a 62.5Hz timer - that our 62.5Hz timer was being "quantized" down such that painting happened closer to 60Hz, if you will. Now that we can specify the pulse rate, an interesting experiment to run from an animation smoothness standpoint would be to increase the clock pulse up to, say, 120Hz or 200Hz (we often used 200Hz in the FX Script days) to see if the visual "hiccup" disappears.
02-05-2011
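To illustrate the "actually asking for 60Hz" point from the comment above, here is a minimal sketch of scheduling a pulse with a nanosecond period through java.util.concurrent. It is not the Quantum or FX Script implementation; the class and helper names are hypothetical, and the delivered rate still depends on the OS timer, as discussed elsewhere in this issue.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PulseScheduling {
    /** Pulse period in nanoseconds for the requested rate (60 Hz -> 16,666,666 ns). */
    static long periodNanos(int hz) {
        return TimeUnit.SECONDS.toNanos(1) / hz;
    }

    /** Rate actually requested when the period is truncated to whole milliseconds. */
    static double effectiveHzWithMillisPeriod(int hz) {
        long periodMillis = 1000L / hz; // 60 Hz -> 16 ms
        return 1000.0 / periodMillis;   // 16 ms -> 62.5 Hz
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("millisecond period asks for " + effectiveHzWithMillisPeriod(60) + " Hz");

        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger pulses = new AtomicInteger();
        // Stand-in for the real pulse handler: just count deliveries.
        timer.scheduleAtFixedRate(pulses::incrementAndGet, 0, periodNanos(60), TimeUnit.NANOSECONDS);
        Thread.sleep(1000);
        timer.shutdownNow();
        System.out.println("pulses delivered in ~1 s: " + pulses.get());
    }
}
```

The helper makes the 62.5 fps figure explicit: a 60 Hz request truncated to a whole number of milliseconds becomes a 16 ms period.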
Getting nanoTime and nano-precision timers are two different things. Even System.nanoTime() has a disclaimer: "This method provides nanosecond precision, but not necessarily nanosecond accuracy. No guarantees are made about how frequently values change." If we tried to implement our own timer based on System.nanoTime() in Java, we would have to use some kind of synchronization, e.g. Thread.sleep(), which has ~16ms resolution, or Object.wait(), which is ~1ms precise. Therefore, our implementation would be as accurate as Glass timers or worse. On Windows, it's impossible to get nano- or even micro-second timers. Even if there were such timers, precision would be lost while dispatching timer events to the user thread. What, I believe, Quantum needs instead is vsync-aligned timers. DirectX provides some API of that kind, but we don't use it in Glass, nor do we want to introduce such a dependency. Any vsync synchronization seems to be a Prism issue, not a Glass functionality.
28-04-2011
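The sleep granularity figures quoted in the comment above vary by OS and JVM, so they are easier to check than to argue about. The following sketch (hypothetical class name, not part of Glass or Quantum) times a few Thread.sleep(1) calls with System.nanoTime(); no particular result is assumed.

```java
class SleepGranularity {
    /** Measured duration, in nanoseconds, of each of n consecutive Thread.sleep(1) calls. */
    static long[] measureSleep(int n) {
        long[] elapsed = new long[n];
        for (int i = 0; i < n; i++) {
            long start = System.nanoTime();
            try {
                Thread.sleep(1); // request a 1 ms sleep
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                throw new IllegalStateException("interrupted while measuring", e);
            }
            elapsed[i] = System.nanoTime() - start;
        }
        return elapsed;
    }

    public static void main(String[] args) {
        for (long nanos : measureSleep(20)) {
            System.out.printf("requested 1.000 ms, slept %.3f ms%n", nanos / 1_000_000.0);
        }
    }
}
```

Running it on the target Windows configuration shows how far a Java-level timer would be from the requested period before any event dispatching overhead is added.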
The old Prism toolkit had microsecond accuracy by using System.nanoTime to set the proper pulse width. Glass only has millisecond accuracy in its Windows implementation.
27-04-2011
I believe this effect can be seen visually - even a simple "box moving back and forth" animation will periodically "hiccup", perhaps missing the repaint cycle for a single frame every second or so. I don't have the code in front of me, but I'm fairly certain that pulses are scheduled with millisecond (rather than nanosecond) precision, so we are asking for pulses every 16 ms, rather than every 16.666 ms. I strongly suspect that this is at least part of the problem.
27-04-2011

Blocks :	JDK-8100864 - Quantum threading
Duplicate :	JDK-8100622 - Allow QuantumRenderer thread and FX Application thread to run in parallel
Relates :	JDK-8104684 - Animation stutters
Relates :	JDK-8100622 - Allow QuantumRenderer thread and FX Application thread to run in parallel
Relates :	JDK-8113170 - A simple animation is jerky
Relates :	JDK-8101330 - QuantumToolkit can schedule the pulse better to improve performance