Bug ID: JDK-6664068 D3D: resizing a window which is rendered to in a tight loop results in artifacts and crashes

Type: Bug
Component: client-libs
Sub-Component: 2d
Affected Version: 6u10

Priority: P2
Status: Closed
Resolution: Fixed
OS: windows_xp
CPU: x86

Submitted: 2008-02-15
Updated: 2010-10-14
Resolved: 2008-03-04

JDK 6
6u10 b13 Fixed

Run the attached test on Windows with a recent 6u10 build; it will
produce artifacts and eventually crash.

SUGGESTED FIX http://sa.sfbay.sun.com/projects/java2d_data/6u10/6664068.0

20-02-2008

EVALUATION

The test renders an unaccelerated BufferedImage (BI) directly to the screen in a tight loop, resizing the window from time to time. After a while, the test stops updating the screen, shows garbage and eventually crashes.

The problem is in D3DScreenUpdateManager's handling of on-screen surfaces. When a window is resized, the current surface gets replaced with a new one, which is initially in the "lost" state. This is so that we don't waste resources on surfaces which aren't rendered to. When a surface is first rendered to, it is "restored": the native surface gets created.

The problem is that this may happen from different threads at the same time: the screen updater thread, which checks the surfaces, flips the ones rendered to and restores the lost ones, and any other thread which does the rendering (the main thread in this particular case). Since the "lost" state of the surface data and the restoration aren't synchronized, the surface may get "restored" from both threads at the same time.

Here's an example of what happens:

    T1                                 ScreenUpdater Thread
    g = getGraphics()                  run()
    ..
    // resize happens, the new dst surface is installed, in lost state
    dst.restoreSurface()
      D3DSD.initSurface() // created new swap chain S1
    g.drawImage()
      BufferedContext.validate(dst)
      ..
      setRenderTarget(S1)
                                       dst.restoreSurface()
                                       ..
                                       D3DSD.initSurface() // dst now has new swap chain S2
                                       ...
                                       dst.flip() // we flip S2, not S1!

Here's a stack trace illustrating this:

    initSurface this=sun.java2d.d3d.D3DSurfaceData$D3DWindowSurfaceData@5e176f
    java.lang.Exception: Stack trace
        at java.lang.Thread.dumpStack(Thread.java:1206)
        at sun.java2d.d3d.D3DSurfaceData.initSurface(D3DSurfaceData.java:353)
        at sun.java2d.d3d.D3DSurfaceData.restoreSurface(D3DSurfaceData.java:752)
        at sun.java2d.d3d.D3DSurfaceData$D3DWindowSurfaceData.restoreSurface(D3DSurfaceData.java:853)
        at sun.java2d.d3d.D3DScreenUpdateManager.validate(D3DScreenUpdateManager.java:478)
        at sun.java2d.d3d.D3DScreenUpdateManager.createGraphics(D3DScreenUpdateManager.java:252)
        at sun.awt.windows.WComponentPeer.getGraphics(WComponentPeer.java:523)
        at java.awt.Component.getGraphics(Component.java:2666)
        at PerfTest.main(PerfTest.java:81)
    initSurface this=sun.java2d.d3d.D3DSurfaceData$D3DWindowSurfaceData@5e176f
    setting RT: sun.java2d.d3d.D3DSurfaceData$D3DWindowSurfaceData@5e176f
    java.lang.Exception: Stack trace
        at java.lang.Thread.dumpStack(Thread.java:1206)
        at sun.java2d.d3d.D3DSurfaceData.initSurface(D3DSurfaceData.java:353)
        at sun.java2d.d3d.D3DSurfaceData.restoreSurface(D3DSurfaceData.java:752)
        at sun.java2d.d3d.D3DSurfaceData$D3DWindowSurfaceData.restoreSurface(D3DSurfaceData.java:853)
        at sun.java2d.d3d.D3DScreenUpdateManager.validate(D3DScreenUpdateManager.java:478)
        at sun.java2d.d3d.D3DScreenUpdateManager.run(D3DScreenUpdateManager.java:453)
        at java.lang.Thread.run(Thread.java:619)

Now the BufferedContext thinks that it has set the render target on the native level (since the destination surface data object doesn't change), but in reality the new native surface is never set as the render target, so nothing gets rendered.

Also, we're leaking the first surface that was restored. The native resource manager still tracks it, but it will only be released when a device reset happens.
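
The sequence described above can be replayed deterministically in a toy model. This is only a sketch and not the real sun.java2d.d3d code: RaceReplay, ToySurface, ToyContext and liveSwapChains are hypothetical names, swap chains are plain strings, and the two threads are replayed in one thread in the order shown in the example rather than actually raced. It assumes the context caches the render target by the identity of the Java-level surface object, as the evaluation describes.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the failure; hypothetical names, not the real D3D pipeline classes. */
class RaceReplay {
    /** Stands in for the native resource manager: every swap chain still allocated. */
    static final List<String> liveSwapChains = new ArrayList<>();
    static int nextId = 1;

    static class ToySurface {
        String swapChain; // current native resource, null while "lost"

        // Unsynchronized restore: overwrites the current swap chain without releasing it.
        void restoreSurface() {
            swapChain = "S" + nextId++;
            liveSwapChains.add(swapChain);
        }
    }

    static class ToyContext {
        ToySurface validatedDst;   // cache key: the Java-level surface object
        String nativeRenderTarget; // what the "device" really renders to

        void validate(ToySurface dst) {
            if (dst != validatedDst) { // same object: assume the target is already set
                validatedDst = dst;
                nativeRenderTarget = dst.swapChain;
            }
        }
    }

    /** Replays the interleaving; returns {render target, flipped chain, live chain count}. */
    static String[] replay() {
        liveSwapChains.clear();
        nextId = 1;
        ToySurface dst = new ToySurface(); // installed after the resize, in lost state
        ToyContext ctx = new ToyContext();

        dst.restoreSurface();              // T1: creates S1
        ctx.validate(dst);                 // T1: render target = S1
        dst.restoreSurface();              // ScreenUpdater: restores again, creates S2
        ctx.validate(dst);                 // T1: next draw, cache hit, target stays S1
        String flipped = dst.swapChain;    // ScreenUpdater: flips S2

        return new String[] {
            ctx.nativeRenderTarget, flipped, String.valueOf(liveSwapChains.size())
        };
    }

    public static void main(String[] args) {
        String[] r = replay();
        System.out.println("rendering to " + r[0] + ", flipping " + r[1]
                + ", live swap chains: " + r[2]);
    }
}
```

In this model the context keeps rendering to S1 while S2 is flipped, and both swap chains remain allocated, which corresponds to the artifacts and the leak.
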

So if we have tons of resizes we quickly exhaust video memory, and then strange things start to happen: createSwapChain returns some weird errors (not the OUT_OF_VIDEO_MEMORY one would expect), and we sometimes crash when transferring pixels to a locked managed surface. This looks like a bug in the D3D runtime, since the lock succeeds and we don't render outside of its memory (I've verified it with memset(ptr, 0, h*lineStride)).

A better but more risky fix would be to make the surfaceData's lost state thread-safe with locking. But this is prone to issues, so instead it appears easier to deal with the consequences of it not being thread-safe.

First of all, this situation can only happen to our own "on-screen" surfaces, belonging to D3DScreenUpdateManager. This is because the D3DVolatileSurfaceManager doesn't actually restore the current accelerated surface; it creates a new one instead. Also, typically applications don't validate volatile images and render to/from them on different threads (hopefully).

Preventing the leaks that cause the crashes is simple: we just need to release the current resource in the native surface before allocating a new one.

But we still have the problem of BufferedContext not resetting the render target (the cause of the rendering artifacts). The workaround for that is to reset the BufferedContext after a successful restoration of the D3DWindowSurfaceData. This makes sure that the next rendering call sets the proper render target on the D3D device.

With this fix the test runs with no problems.
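
The two parts of the workaround can be sketched in a toy model as well. This is an illustration under assumptions, not the actual change: FixReplay, ToySurface, ToyContext, invalidate() and liveSwapChains are hypothetical names, swap chains are plain strings, and the double restoration is replayed in a single thread in the same order as in the example from the evaluation.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the workaround; hypothetical names, not the real D3D pipeline classes. */
class FixReplay {
    /** Stands in for the native resource manager: every swap chain still allocated. */
    static final List<String> liveSwapChains = new ArrayList<>();
    static int nextId = 1;

    static class ToyContext {
        ToySurface validatedDst;   // cache key: the Java-level surface object
        String nativeRenderTarget; // what the "device" really renders to

        void validate(ToySurface dst) {
            if (dst != validatedDst) {
                validatedDst = dst;
                nativeRenderTarget = dst.swapChain;
            }
        }

        // Forget the cached destination so the next validate() sets the target again.
        void invalidate() {
            validatedDst = null;
        }
    }

    static class ToySurface {
        final ToyContext ctx;
        String swapChain; // current native resource, null while "lost"

        ToySurface(ToyContext ctx) {
            this.ctx = ctx;
        }

        void restoreSurface() {
            if (swapChain != null) {
                liveSwapChains.remove(swapChain); // part 1: release before allocating
            }
            swapChain = "S" + nextId++;
            liveSwapChains.add(swapChain);
            ctx.invalidate();                     // part 2: reset the context afterwards
        }
    }

    /** Same interleaving as the failure; returns {render target, flipped chain, live count}. */
    static String[] replay() {
        liveSwapChains.clear();
        nextId = 1;
        ToyContext ctx = new ToyContext();
        ToySurface dst = new ToySurface(ctx);

        dst.restoreSurface();           // T1: creates S1
        ctx.validate(dst);              // T1: render target = S1
        dst.restoreSurface();           // ScreenUpdater: releases S1, creates S2
        ctx.validate(dst);              // T1: cache was reset, target becomes S2
        String flipped = dst.swapChain; // ScreenUpdater: flips S2

        return new String[] {
            ctx.nativeRenderTarget, flipped, String.valueOf(liveSwapChains.size())
        };
    }

    public static void main(String[] args) {
        String[] r = replay();
        System.out.println("rendering to " + r[0] + ", flipping " + r[1]
                + ", live swap chains: " + r[2]);
    }
}
```

The double restoration still happens in this model, but its consequences are contained: only one swap chain stays allocated, and the render target and the flipped swap chain agree again.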

15-02-2008