JDK-4409306 : W2K: mouse and menu flicker occurs when repainting components
  • Type: Bug
  • Component: client-libs
  • Sub-Component: 2d
  • Affected Version: 1.3.0,1.4.0
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • OS: windows_2000,windows_xp
  • CPU: x86
  • Submitted: 2001-01-29
  • Updated: 2002-10-09
  • Resolved: 2002-05-13
Other
1.4.1 beta: Fixed
Description

Name: rmT116609			Date: 01/29/2001


java version "1.3.0"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0-C)
Java HotSpot(TM) Client VM (build 1.3.0-C, mixed mode)

/*
 * W2kFlicker.java
 *
 * Philip Koester
 * ###@###.###
 */

import java.awt.*;
import java.awt.event.*;

/*
 * Bug description:
 *
 * This is a simple program that displays System.currentTimeMillis() in
 * a Frame, ca. 50 times a second. For the time to be updated on screen,
 * the panel's repaint() method is called.
 *
 * This program (or any AWT program that makes use of automatic repaints)
 * will cause excessive mouse and menu flicker on a W2K system. The
 * problem does *not* exist in NT4.
 *
 * Start the program and watch the mouse cursor (wait a few seconds). The mouse
 * will flicker. It doesn't matter where the mouse is or if this program's frame
 * is activated. The mouse will always flicker when a portion of the time
 * panel is visible on screen. (If the panel is obscured by another window,
 * no flicker occurs.)
 *
 * To see the menu flicker, minimize all windows (except this program's frame)
 * and right-click on the Windows desktop. W2K will show one of those new
 * fancy fading-in menus. Again, excessive flicker.
 *
 * There is a weird workaround: press CTRL+ALT+DEL once, then press ESCAPE. The
 * flicker is gone until the next time the program is run.
 */
public class W2kFlicker {
	public static void main(String[] args) throws Exception {
		// create and configure frame
		Frame f = new Frame("W2K Flicker");
		f.setBounds(new Rectangle(100, 100, 300, 300));
		f.setLayout(new GridBagLayout());
		
		// add the panel that displays currentTimeMillis
		final Component timePanel = new TimePanel();
		GridBagConstraints c = new GridBagConstraints();
		c.fill = GridBagConstraints.BOTH; // static constant, referenced via the class
		c.weightx = c.weighty = 1;
		f.add(timePanel, c);
		
		// make the frame closable
		f.addWindowListener(new WindowAdapter() {
			public void windowClosing(WindowEvent e) {
				System.exit(0);
			}
		} );
		
		// show the frame
		f.setVisible(true);
		
		// start a thread that repeatedly repaints the time panel
		new Thread(new Runnable() {
			public void run() {
				while (true) {
					timePanel.repaint();
					try { Thread.sleep(20); }
					catch (InterruptedException e) { return; } // stop repainting once interrupted
				}
			}
		} ).start();
	}
	
	static class TimePanel extends Panel {
		TimePanel() {
			// set a bigger font
			setFont(new Font("SansSerif", Font.BOLD, 28));
		}
		
		public void paint(Graphics g) {
			super.paint(g);
			
			// draw currentTimeMillis
			
			String time = "" + System.currentTimeMillis();
			FontMetrics fm = getFontMetrics(getFont());
			Dimension size = getSize();
			
			int x = (size.width - fm.stringWidth(time)) / 2;
			int y = (size.height - fm.getHeight()) / 2 + fm.getAscent();
			g.drawString(time, x, y);
			
		}
	}
}

// EOF
(Review ID: 115929) 
======================================================================





Name: jl125535			Date: 01/30/2002


The test case also causes problems on Windows XP with merlin-rc1.  In addition, this situation can cause the Start menu to flicker. 

It's a pretty serious flaw for any client-based program to have such a strong effect on normal operation of the system.  Supplying the noddraw option (-Dsun.java2d.noddraw) fixes the issue, however.  Perhaps ddraw should be turned off by default so this doesn't happen as much?
(Review ID: 138798)
======================================================================

Verified fix with jdk build 1.4.1-beta-b09, mixed mode on w2k.
###@###.### 2002-04-24


Comments
CONVERTED DATA
BugTraq+ Release Management Values
COMMIT TO FIX: hopper-beta
FIXED IN: hopper-beta
INTEGRATED IN: hopper-beta
14-06-2004

EVALUATION
Internal flags that can be used to affect this fix include:

  -Dsun.java2d.gdiblit=false
      This disables the first part of this fix, which uses GDI to copy some image types to the screen. If this flag is forced to false, we will attempt to do the copy via a screen lock operation instead.

  -Dsun.java2d.ddlock=true
      This forces us to use ddraw to lock the screen if possible, which we currently do not do on win2k and XP.

  -Dsun.java2d.ddlock=false
      This forces us to avoid the use of ddraw for locking (we still use ddraw by default on win9x and NT4).

If gdiblit=false and ddlock=true are both used, the result is the same performance and flicker behavior as before this bug fix.
11-06-2004

EVALUATION
Sounds kind of like 4362500.
eric.hawkes@eng 2001-02-08

Info from Chet.Haase@eng follows:

I think that's [4362500] a related issue, but not the same bug. The cursor artifacts come from badness in ddraw drivers (I think) that creeps in when we do a screen->screen copy. The problem that this user is talking about is purely a cursor flicker issue. This is usually caused by:
- a software cursor (as in the bug that I fixed)
- a ddraw Lock/copy/Unlock operation. Basically, ddraw is hiding the cursor so that the data being copied to the screen doesn't obliterate the cursor.

The reason (I think) that the user sees the problem on win2k and not NT4 is that win2k comes with a drop shadow for the cursor by default. If the user's video card cannot handle this kind of cursor in hardware, then it becomes a software cursor. But on NT4, the only way to get a software cursor is by specifically choosing a "Pointer Scheme" that has a color cursor (assuming the video card doesn't support color cursors either).

The reason it "works" after a Ctrl-Alt-Del action is that this forces a display change as the task manager comes up. In jdk1.3, this display change forced us to punt ddraw forever until the app is restarted. So the user just found a way to run without our ddraw support.

The workaround is for the user to turn off the cursor shadow (it's a checkbox in the mouse control panel). Or they can run without ddraw (-Dsun.java2d.noddraw). Otherwise, there is no workaround (or fix) that I know of...
eric.hawkes@eng 2001-02-08

More info from ###@###.###

Yes, we should be able to reproduce this effect in a native-only app, if we are running on the right kind of video hardware (one that doesn't support win2k shadow cursors in hardware). One question about this bug, though, is whether there is anything we can do (that we haven't yet thought of) to address this problem with software cursors.
For example, we could provide some sort of hack that copies the cursor into the back buffer so that the copy of the buffer didn't cause the cursor flickering (or at least not as much). We'd need to play around with this (and other) approaches. I'd leave the bug open for now and maybe someone can spend some more quality time with cursor issues for hopper or the follow-on.
###@###.### 2002-02-13

I'm taking over this bug as the fix appears to be in Java2D code.

First, a little more information about the cause of the bug:

A software cursor is drawn to the screen through a process of the GDI runtime calling the GDI driver at frequent intervals. This is in contrast to a hardware cursor, where the cursor simply remains where it is since its presence in its own rendering plane does not disturb rendering in the regular framebuffer. When DDraw takes a lock on the screen, this essentially locks out everyone except the lock holder from rendering to the screen, including the GDI driver that would like to redraw the cursor. So the flickering artifact is not so much that the cursor is being taken down or overwritten, but that it is not being redrawn at the same frequency (the refresh of the screen, presumably) as it usually is.

Similarly, the other artifacts seen (such as the flickering start menu) are due to the same problem. In Win2k and XP, menus may fade in, which again requires frequent redrawing by the GDI driver. When the GDI driver cannot draw to the display, the menu will not be drawn and the effect is a flicker seen by the user.

Now to the fix:

Since DDraw locks are the cause of the problem, the only possible solution is to avoid using DDraw for locks, preferably only when the software cursor was active. There are several issues with this:

- We could take the DDraw lock only on the region that we would like to repaint. Currently, we lock the entire screen, for simplicity and performance reasons.
If we, instead, locked only the region we are painting to, then there would be no flickering artifacts outside of this region. BUT, there _would_ still be flickering inside the region, including the cursor and any fade-in menus that happened to appear on top of the locking region. Having the cursor reside within the window that is being repainted is probably the common case, so this solution is not acceptable (at least not by itself).

- It does not appear possible to detect a sw cursor. We can get some information from the registry on cursors in use, but that will only tell us whether the default cursor or some variant is in use (and possibly whether a drop shadow is active); it still does not tell us whether the video card and driver support that type of cursor in hw or sw. It might be possible to install kernel-level software that could detect information about the GDI driver and percolate it up to us at the user level, but this is still uncertain and I don't relish having to install a kernel driver as a standard part of running the jdk. However, detecting the cursor capabilities may still not allow us to work around the same problems with the fading menus; just because a video card supports a color cursor does not mean that it can handle an arbitrarily large menu fade-in effect in hardware (in fact, I would bet against this in general; I think the support for fading is built into the GDI runtime, not the GDI drivers (yet)). So whatever fix we do will basically have to work across all platforms, regardless of video cards/drivers/cursors/etc. We could optionally limit the fix to Win2k/XP and future OS's, since we know that this bug is generally not a problem prior to win2k.

- Simply turning off DDraw locking is not something we want to do. This support was added in 1.3 as a performance feature and allows us to do much faster copying to the screen for things like software back buffers (such as Swing uses).
1.4 added the support for hardware back buffers, so users can now work around this problem _and_ get better performance, but we still need to support older apps (and applets) that use the old software-based buffers.

That's the list of issues as we look at the fix. Basically:
- we need to turn off DDraw locking on all platforms (or at least win2k/XP) to get rid of the flicker
- turning off DDraw locking significantly decreases performance for common operations (such as double-buffering, or anything else that copies pixels to the screen).

So now what?

It turns out that the mechanism we use when DDraw is not enabled is not the optimal approach. Currently, if ddraw locking fails (or is disabled), we punt to GDI by doing the following:
- Create a temporary bitmap
- Copy the screen pixels into the bitmap if we need to read from those pixels
- Copy our offscreen image into this temporary bitmap
- Use GDI's BitBlt() function to copy the results onto the screen.

This operation takes significantly longer (10-90% in my recent informal testing) than using DDraw.

But there's a better way. Using DIB operations such as StretchDIBits and SetDIBitsToDevice, we should be able to simply wrap a DIB structure around the existing offscreen image memory and call one of these handy functions to copy that image directly onto the screen. This eliminates the overhead of creating the temporary buffer and the time it takes for the redundant copy to that temporary buffer.

Testing so far confirms that this is possible, at least for common cases. Performance ranges anywhere from 10-20% slower than DDraw to 25% _faster_ than DDraw locking. These results are, I believe, highly dependent on OS and video card/driver. More testing is required to see if we are going to be better or worse in general. There may be some issues with general image support (GDI will only support certain types of known image formats), so we will still need a fallback position.
If we can keep the number of corner cases low, then the fallback will probably be to use the old, slower GDI method; hopefully the slower performance will be an uncommon case and worth the improved quality of not using DDraw.

Also, based on performance testing, we will see whether we can eliminate DDraw locking on all platforms or whether we should constrain this fix to key platforms. In my testing so far, I saw much better performance of the new GDI approach on win2k than I did on win98. Since the flicker artifact is win2k/XP specific, we could leave the DDraw approach in the code for the older win9x platforms if it looks significantly faster than GDI on those OS's.

Implementation is underway, followed by testing, but it appears that at least part of what is described above will work.
###@###.### 2002-04-03

This approach works great. The only caveat is that we cannot support all source image types (only types that GDI understands) and cannot support 8-bit destination types (GDI's dithering is way slower than ours). In these cases, we will use the old GDI approach (which uses a temporary buffer; it's slower than the new GDI approach, but it works without flickering and should only occur in these corner cases).

In general, the new approach tends to be faster than ddraw locking, in addition to removing the annoying flicker. There are cases, however, where it may be a bit slower instead; this is due to GDI's need for 32-bit aligned scanline strides. This means that:
- All 32-bit images are aligned and will go at full speed
- One quarter (on average) of 24-bit images will go at full speed
- One half (average) of 16-bit images will go at full speed
- One quarter (average) of 8-bit images will go at full speed

Adjusting the image sizes with respect to alignment will affect this.

Anyway, the flicker is gone and performance, on average, is pretty good.
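The alignment fractions listed above can be checked with a little arithmetic. The following is a minimal sketch, not code from the fix: it assumes tightly packed images whose scanline stride is simply width times bytes per pixel, and the class and method names are hypothetical.

```java
/** Illustrative only: checks which image widths give a 32-bit aligned scanline stride. */
final class StrideAlignment {

    /** Bytes per scanline, assuming a tightly packed image with no row padding (an assumption). */
    static int packedStride(int widthPixels, int bitsPerPixel) {
        return widthPixels * (bitsPerPixel / 8);
    }

    /** True when the packed stride is a multiple of 4 bytes (32 bits). */
    static boolean isDwordAligned(int widthPixels, int bitsPerPixel) {
        return packedStride(widthPixels, bitsPerPixel) % 4 == 0;
    }

    /** Fraction of the widths 1..maxWidth whose packed stride is 32-bit aligned. */
    static double alignedFraction(int bitsPerPixel, int maxWidth) {
        int aligned = 0;
        for (int w = 1; w <= maxWidth; w++) {
            if (isDwordAligned(w, bitsPerPixel)) {
                aligned++;
            }
        }
        return aligned / (double) maxWidth;
    }

    public static void main(String[] args) {
        for (int bpp : new int[] {8, 16, 24, 32}) {
            System.out.println(bpp + "-bit: " + alignedFraction(bpp, 1024));
        }
    }
}
```

Over widths 1 to 1024 the fractions come out to one quarter for 8-bit and 24-bit images, one half for 16-bit, and all 32-bit images, which matches the list in the evaluation. Under the same assumption, rounding an image width up to a multiple of 4 pixels would give an aligned stride at every depth.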
###@###.### 2002-04-21

One unforeseen side-effect of this fix is that _all_ rendering operations performed directly on the screen are potentially adversely affected. For example, any app that draws text to an onscreen window will end up going through an additional level of indirection as we avoid using ddraw to lock the screen. The fix discussed above will cover most situations of double-buffering and image copying to the screen; the use of GDI for these situations has been shown to be similar in performance to our current use of ddraw (sometimes better, sometimes worse). But other situations (such as rendering text to the screen) end up being consistently worse when we avoid using ddraw to lock the screen.

The problem is that using ddraw to lock the framebuffer ends up being the most direct (and therefore fastest) way of copying bits onto the screen (not including, of course, using the ddraw Blt command to copy bits, but this can only be done when both the src and dest are ddraw surfaces). So when we disable ddraw locking of the screen, any method we use to get the bits onto the screen will necessarily be slower.

So the problem and fix here boil down to a performance vs. quality tradeoff: how much performance are we willing to trade for better quality? The quality, in this case, relates to the cursor (and other alpha-effect) flicker. The performance tradeoff depends on the operation we're talking about. Some micro benchmarks show degradations of up to 70% on text operations, but these do not reflect reality because most applications will not be rendering thousands of text strings directly onto the screen with no other intervening rendering operations.

I should note, also, that there are things we can do to mitigate the performance problems we are seeing with disabling ddraw locking on the screen. Currently, when an attempt to lock the screen fails, we fall back to using a temporary GDI buffer.
In the case of text (or other operations that have, essentially, transparent pixels in the bounding rectangle of the primitive), we copy the bits from the screen into the buffer first. Then we do our rendering operation into this buffer and Blt it onto the screen. This operation tends to be quite slow, especially since we end up copying pixels out from and then back to the screen in some cases. Some steps that we could/should take to speed this up include:
- using a static GDI buffer so that we do not suffer the cost of recreating it every time
- using a ddraw buffer instead of GDI in case the speed of a ddraw Blt has any speed advantages
- using a ddraw buffer with chromakey enabled to allow us to render transparent-backed primitives (e.g., text) without having to copy screen pixels out to the buffer.
- caching common primitives (e.g., strings and glyphs) as ddraw surfaces so that we do not always have to re-render common strings and characters but can reduce those operations to simply copying images.
###@###.### 2002-05-07

Note that a partial fix for this has already been integrated into the workspace as of hopper beta: the GDI Blit loops discussed above are now the default way of handling blits from various GDI-friendly image types onto the screen, thus eliminating cursor flickering for those types of situations (including hopefully most double-buffered applications). The fix still to come for this bug is just in the area of direct pixel access to the screen, as discussed in the previous section of this evaluation.
###@###.### 2002-05-07

The complete fix has been putback. We now avoid ddraw locking of the screen entirely on win2k and XP. Since ddraw locking tends to provide the fastest access to the screen pixels, this change causes a performance hit for some operations.
Some work was done to mitigate that hit: our previous approach to screen pixel access without ddraw involved creating a temporary GDI DIB, copying pixels into that bitmap, and then copying that bitmap to the screen. Now, instead of destroying and recreating that bitmap, we simply reuse the same bitmap for every access (recreating as necessary when a larger bitmap is required).

Some operations are somewhat slower with this change. For example, drawing text directly to the screen and doing any copies of bitmask transparent images are both slower. This slowness is due to the fact that these operations require us to copy screen pixels out to the bitmap first so that we do not blow away screen data when we copy the bitmap back (since text and bitmask operations do not cover the entire bitmap area).

Some operations, however, have gotten faster with this change. For example, translucent operations to the screen (such as copying translucent images to the screen) are significantly faster (from 2 to 7 times as fast, in some micro benchmarks). This is because translucent copies require reading destination pixels and operating on them. Reading memory in VRAM is incredibly expensive, so it turns out to be faster to copy those pixels out once (into our bitmap), operate on them there, and then copy the result back to the screen. Some image scaling operations have also gotten faster, as it is faster to operate on the pixels locally in system memory than it is to do it directly in VRAM. Som
11-06-2004
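The reusable bitmap and the copy-out, render, copy-back sequence described in the evaluation can be illustrated in plain Java. This is only a sketch of the idea, not the native Java2D code: a BufferedImage stands in for both the screen and the temporary DIB, and the class and method names are hypothetical.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

/** Illustrative grow-only scratch image, reused across calls instead of being recreated. */
final class ScratchBuffer {
    private BufferedImage cached; // hypothetical cache; null until first use

    /** Returns an image of at least w x h, reallocating only when the cached one is too small. */
    BufferedImage acquire(int w, int h) {
        if (cached == null || cached.getWidth() < w || cached.getHeight() < h) {
            int newW = Math.max(w, cached == null ? 0 : cached.getWidth());
            int newH = Math.max(h, cached == null ? 0 : cached.getHeight());
            cached = new BufferedImage(newW, newH, BufferedImage.TYPE_INT_RGB);
        }
        return cached;
    }

    /**
     * Draws a primitive that covers only part of the region (x, y, w, h) of "screen":
     * copy the destination pixels out, render into the scratch image, copy the region back.
     */
    void paintPartial(BufferedImage screen, int x, int y, int w, int h) {
        BufferedImage tmp = acquire(w, h);
        Graphics2D g = tmp.createGraphics();
        try {
            // 1. Copy the destination pixels out so uncovered pixels survive the copy back.
            g.drawImage(screen, 0, 0, w, h, x, y, x + w, y + h, null);
            // 2. Render a primitive that covers only the left half of the region.
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, w / 2, h);
        } finally {
            g.dispose();
        }
        Graphics2D sg = screen.createGraphics();
        try {
            // 3. Copy the whole region back in one operation.
            sg.drawImage(tmp, x, y, x + w, y + h, 0, 0, w, h, null);
        } finally {
            sg.dispose();
        }
    }
}
```

Step 1 is the extra cost the evaluation attributes to text and bitmask operations: without it, step 3 would overwrite the pixels that the primitive did not touch.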

WORK AROUND The workaround is for the user to turn off the cursor shadow (it's a checkbox in the mouse control panel). Or they can run without ddraw (-Dsun.java2d.noddraw). brent.christian@eng 2001-07-18
18-07-2001
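The noddraw workaround is normally given on the command line, for example `java -Dsun.java2d.noddraw=true W2kFlicker`. Where the launch command cannot be changed, setting the property programmatically might work as well. The sketch below rests on two assumptions: that Java2D reads the property only when it first initializes, so the call has to come before any AWT or Java2D class is touched, and that the W2kFlicker test case from this report is on the class path. NoDdrawLauncher is a hypothetical name.

```java
/** Illustrative launcher that disables DirectDraw before starting the test case. */
final class NoDdrawLauncher {

    /** Sets the workaround property; assumed to be effective only before AWT/Java2D initializes. */
    static void disableDirectDraw() {
        System.setProperty("sun.java2d.noddraw", "true");
    }

    public static void main(String[] args) throws Exception {
        disableDirectDraw();
        // Look up the test case reflectively so this class compiles on its own;
        // assumes W2kFlicker from this report is on the class path.
        Class.forName("W2kFlicker")
             .getMethod("main", String[].class)
             .invoke(null, (Object) args);
    }
}
```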