Bug ID: JDK-6525100 OGL: Toolkit.sync() should issue a glFinish()

Versions (Unresolved/Resolved/Fixed)

The Version table shows the release in which this issue/RFE is, or will be, addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

JDK 6	JDK 7
6u4 (Fixed)	7 b10 (Fixed)

In recent fixes involving fragment shaders (6514990, 6521533) it became clear that
the OpenGL pixel pipeline can be quite long, meaning for complex operations happening
on the GPU it can take a while for the pixels to go through the complete pipeline
and appear on the screen.  Our current approach in the OpenGL-based pipeline is to
simply call glFlush() at the end of processing a batch of operations.  glFlush()
is a non-blocking call: it asks the hardware to start working on pending operations
but returns without waiting for them to complete.
This is one of the main benefits of hardware acceleration in that we can offload
complex work to the GPU without having to wait for it on the CPU.
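
To make the flush/finish distinction concrete, here is a minimal toy model in Java. It does not call OpenGL at all: ToyGpu, submit(), flush() and finish() are hypothetical names invented for this sketch, and a single background thread stands in for the GPU pipeline. flush() returns right away, as glFlush() does, while finish() blocks until all submitted work is done, as glFinish() does.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy stand-in for a GPU command stream; hypothetical, not the OpenGL API. */
final class ToyGpu implements AutoCloseable {
    // One daemon worker thread plays the role of the (in-order) hardware pipeline.
    private final ExecutorService pipeline = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "toy-gpu");
        t.setDaemon(true);
        return t;
    });
    private final AtomicInteger completed = new AtomicInteger();
    private Future<?> last;

    /** Queue one slow operation; returns immediately without waiting for it. */
    void submit(long costMillis) {
        last = pipeline.submit(() -> {
            try {
                Thread.sleep(costMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            completed.incrementAndGet();
        });
    }

    /** Analogue of glFlush(): work is already dispatched, so just return. */
    void flush() {
        // Intentionally empty: the caller gets no completion guarantee.
    }

    /** Analogue of glFinish(): block until everything submitted so far is done. */
    void finish() {
        if (last == null) {
            return;
        }
        try {
            last.get(); // the pipeline is in-order, so the last op finishing implies all did
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    int completedOps() {
        return completed.get();
    }

    @Override
    public void close() {
        pipeline.shutdown();
    }
}
```

After flush(), completedOps() may report any value from zero up to the number of submitted operations; only after finish() is the count guaranteed to equal it. That gap is what the rest of this report is about.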

Currently if you call Toolkit.sync(), all it does is force Java 2D's RenderQueue
to be flushed, followed by a call to glFlush().  This is usually sufficient for
ensuring that pending operations are flushed to the screen in a timely manner, but
for complex operations like the ones described above, it is often not enough.
The problem is that in certain scenarios (e.g. microbenchmarks, regression
tests that sample screen pixels via Robot) our current approach with glFlush()
does not always get the pixels to the screen in a timely manner.  This can wreak
havoc with J2DBench results (due to high variance) and can cause regression tests
to fail (because the hardware isn't flushed by the time Robot needs to sample pixels).
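
As an illustration of the regression-test pattern described above, the following sketch paints a frame, calls Toolkit.sync(), and then samples a pixel with Robot. It is not one of the actual JDK regression tests: the class name SyncThenSample, the closeTo() helper, the frame size and the tolerance of 8 are all assumptions made for this example. To exercise the OpenGL pipeline it would need to be run with -Dsun.java2d.opengl=true.

```java
import java.awt.Color;
import java.awt.Frame;
import java.awt.Graphics;
import java.awt.GraphicsEnvironment;
import java.awt.Point;
import java.awt.Robot;
import java.awt.Toolkit;

/** Hypothetical sketch of a "render, sync, sample" test; not an actual JDK test. */
final class SyncThenSample {
    /** True when each of the R, G and B channels differs by at most tolerance. */
    static boolean closeTo(int rgbA, int rgbB, int tolerance) {
        for (int shift = 0; shift <= 16; shift += 8) {
            int a = (rgbA >> shift) & 0xFF;
            int b = (rgbB >> shift) & 0xFF;
            if (Math.abs(a - b) > tolerance) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        if (GraphicsEnvironment.isHeadless()) {
            System.out.println("No display available; skipping.");
            return;
        }
        Frame frame = new Frame("sync demo") {
            @Override
            public void paint(Graphics g) {
                g.setColor(Color.RED);
                g.fillRect(0, 0, getWidth(), getHeight());
            }
        };
        frame.setSize(200, 200);
        frame.setVisible(true);
        try {
            Robot robot = new Robot();
            robot.waitForIdle();                 // let queued AWT events, including the paint, run
            Toolkit.getDefaultToolkit().sync();  // the call whose behavior this report discusses
            Point origin = frame.getLocationOnScreen();
            Color sampled = robot.getPixelColor(origin.x + 100, origin.y + 100);
            boolean ok = closeTo(sampled.getRGB(), Color.RED.getRGB(), 8);
            System.out.println(ok ? "PASS" : "FAIL: sampled " + sampled);
        } finally {
            frame.dispose();
        }
    }
}
```

If sync() returns before the hardware has actually put the pixels on the screen, the getPixelColor() call can read stale contents, which is the failure mode described above.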

EVALUATION Actually, in my measurements so far I haven't seen a significant drop in SwingMark with this change in place; it ranges anywhere from "in the noise" to a 3-4% "slowdown". As discussed above, all this change is doing is making benchmark results more accurate and predictable, so any slowdown shouldn't be considered an actual performance loss (since "real applications" don't have a need to call Toolkit.sync() anyway). J2DBench results now have very small variance, as expected (that was one of the goals of this fix after all).

23-02-2007

EVALUATION The fix is pretty straightforward: we should really be calling glFinish() from Toolkit.sync(), which synchronously forces all operations in hardware to be flushed to the screen (this is a blocking call). I have hesitated to do this in the past mainly because of the SwingMark microbenchmark, which issues a ton of calls to Toolkit.sync() when taking measurements. So if we start making lots of calls to glFinish(), the reported SwingMark performance will drop quite a bit (around 10%, possibly more depending on the board). To make matters worse, we currently have some smart code in OGLRenderQueue.sync() that avoids flushing the RenderQueue if it's currently empty. If we change sync() to add a SYNC opcode and flush the queue no matter what, SwingMark will likely cause even more thread switching for each repaint: once to flush the current queue and copy the Swing backbuffer to the screen, and then again to process the SYNC opcode and call glFinish(). So really we're taking an already artificial benchmark with artificially good numbers and making those numbers worse; for this reason, I'm not too concerned with this particular downside.

14-02-2007
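
The extra thread switch mentioned in the 14-02-2007 evaluation can be illustrated with a toy queue. This is only a sketch under stated assumptions: ToyRenderQueue and its methods are hypothetical and do not reproduce the real OGLRenderQueue code. Each flush stands for one hand-off to the queue flusher, and processing a SYNC opcode stands for a glFinish() call.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a render queue; names and structure are illustrative only. */
final class ToyRenderQueue {
    enum Op { DRAW, SYNC }

    private final Deque<Op> pending = new ArrayDeque<>();
    private int flushes;   // each flush models one hand-off (thread switch) to the flusher
    private int finishes;  // number of simulated glFinish() calls

    /** Queue a drawing operation without flushing. */
    void draw() {
        pending.add(Op.DRAW);
    }

    /** Process everything that is queued, e.g. to copy the back buffer to the screen. */
    void flushNow() {
        flushes++;
        while (!pending.isEmpty()) {
            if (pending.poll() == Op.SYNC) {
                finishes++; // stands in for a blocking glFinish()
            }
        }
    }

    /** Existing behavior as described: skip the flush when the queue is empty. */
    void syncSkippingEmpty() {
        if (!pending.isEmpty()) {
            flushNow();
        }
    }

    /** Proposed behavior as described: always enqueue SYNC, then flush. */
    void syncAlways() {
        pending.add(Op.SYNC);
        flushNow();
    }

    int flushes() {
        return flushes;
    }

    int finishes() {
        return finishes;
    }
}
```

For a repaint modeled as draw(), flushNow(), then a sync call, syncSkippingEmpty() leaves the flush count at one, while syncAlways() raises it to two and records one simulated glFinish(). That is the per-repaint cost the evaluation accepts in exchange for predictable results.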