JDK-4276423 : drawImage of an offscreen image to the screen much slower in JDK 1.2
  • Type: Bug
  • Component: client-libs
  • Sub-Component: 2d
  • Affected Version: 1.2.0,1.4.0
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: windows_nt,windows_2000
  • CPU: x86
  • Submitted: 1999-09-29
  • Updated: 2013-11-01
  • Resolved: 2001-07-18
Other
1.4.0 beta: Fixed
Description
The attached test case measures the performance of copying an offscreen
image to the screen.  This operation is much slower in JDK 1.2 than it
was in JDK 1.1.8, by more than 3x on the win32 runtime.
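
The attached test case is not reproduced in this report. As a rough illustration of this kind of measurement, the following is a minimal sketch, not the original test. It assumes a BufferedImage destination standing in for the screen so that it can run without a display, and the class name DrawImageBench, the method pixelsPerSecond, and the iteration counts are hypothetical. It times repeated drawImage calls and reports pixels per second (pps) for the three image sizes quoted in the evaluation.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

public class DrawImageBench {
    /**
     * Copies a size x size image `iterations` times and returns the
     * observed throughput in pixels per second.
     */
    public static double pixelsPerSecond(int size, int iterations) {
        BufferedImage src = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        // Assumption: an in-memory image stands in for the on-screen destination.
        BufferedImage dst = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            // Warm up so that one-time setup cost is not included in the timing.
            for (int i = 0; i < 1000; i++) {
                g.drawImage(src, 0, 0, null);
            }
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                g.drawImage(src, 0, 0, null);
            }
            // Guard against a zero interval on coarse timers.
            long elapsedNanos = Math.max(1L, System.nanoTime() - start);
            double pixelsCopied = (double) size * size * iterations;
            return pixelsCopied / (elapsedNanos / 1e9);
        } finally {
            g.dispose();
        }
    }

    public static void main(String[] args) {
        for (int size : new int[] {20, 100, 300}) {
            System.out.printf("%dx%d %,.0f pps%n", size, size, pixelsPerSecond(size, 5000));
        }
    }
}
```

Numbers from a sketch like this are not comparable to the figures in the evaluation, which were measured against the actual screen surface on specific hardware.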

Comments
CONVERTED DATA

BugTraq+ Release Management Values
COMMIT TO FIX: merlin-beta
FIXED IN: merlin-beta2
INTEGRATED IN: merlin-beta
14-06-2004

EVALUATION

The performance is off by a bit on Solaris, especially if there are no DGA drivers for the video card, but win32 is seeing the majority of the impact.
jim.graham@Eng 1999-09-28

I recently got the following results on my PIII-dual 866 NT4 system (video card ATI Rage Pro Turbo), at 32 bits per pixel (all figures in pixels per second, pps):

                                       20x20       100x100       300x300
jdk1.1                            16,909,090    22,268,000    24,488,304
jdk1.2                             9,440,362    21,333,600    24,570,419
jdk1.3                             9,106,579    21,231,683    24,616,363
jdk1.4 (my most recent build)      7,495,593    22,236,607    23,695,652

And on my PIII-500 (single CPU) win98 system with a Matrox G400 running 32 bits per pixel:

                                       20x20       100x100       300x300
jdk1.1                            37,724,773    36,661,107    36,808,163
jdk1.2                             3,631,375    31,629,213    43,824,489
jdk1.3                             3,728,680    33,635,275    42,096,774
jdk1.4                             5,750,953   108,602,065   165,263,578

From these results, it looks like:

- There are definitely differences between OSes and video cards, especially when we are comparing hardware-accelerated images and non-accelerated images.
- The overhead of the small (20x20) images appears to drag down the performance of 1.4 offscreen images to nearly the level of the 1.2/1.3 software-based images. In fact, on the older ATI video card, the hw-based images were even slower than the software-based images.
- NT performance of all images seems gated at some maximum amount. This might be a restriction on NT, or it could be a constraint of the older video card. More investigation would be necessary to figure it out. But all larger image sizes on all releases seem about the same.
- win98 shows the difference between jdk1.1 hw-based images (flying at about 36M pps) and jdk1.2/1.3 sw-based images (limited to only about 3M pps on the smallest image).
- win98 on this fast video card shows the advantage of DirectDraw in the latest jdk1.4 builds; performance of jdk1.1 was gated at around 36M pps, but the performance of DirectDraw-based images appears much higher, at around 165M pps for the largest image size.

More work is necessary. We need to make sure that we eliminate any overhead that might be contributing to the lower scores in jdk1.4 for small image sizes. Profiling is necessary...
chet.haase@Eng 2001-04-24

I did a little more debugging/profiling and got the following information:

One of the key pieces of overhead in our Blt processing is due to the ddraw Clipper object. When I eliminate the Clipper (i.e., I don't attach it to the window or set the clipper on the primary), I more than double the performance of the smallest (20x20) image copies. On my test system (PIII-866 dual processor, nVidia TNT2), this made the performance go from 11 M pixels per second to over 26 M pixels per second.

Of course, this is a bottleneck that we cannot do much about: drawing without a Clipper object requires that we do our own clipping to the window (not too hard), but it also means that we would be subject to Windows events that could cause rendering artifacts. For example, if our window was obstructed, we would do our Blts over any overlapping windows, regardless of which window was supposed to be on top (ddraw draws directly to the screen without regard for Window properties). And even if our window was on top at the time we issued the Blt call, this might not prevent some event (such as the user dragging a window) from overlapping the window at the time of the actual Blt operation (there is a delay between our issuing the call and that call actually being processed by the hardware). Actually, this situation might be handled for us through context switching mechanisms of the driver/hardware (hopefully the hardware would flush the graphics pipe before allowing the window system to move things around).
But there is still a small window of opportunity between our checking for obstruction and actually issuing the call.

Anyway, this got our performance up to 26 M pixels per second. But the jdk1.1 version is still at 44 M, nearly twice the performance of our non-clipped jdk1.4 version. I think this difference can be attributed to various overhead elements in our drawImage() processing. During a profiling run (using Compuware's TrueTime product), I found that we are spending significant amounts of time (on the order of one to five percent) in the following routines:

- ClipInfo (used to derive the actual src/dst values after clipping against sg.getCompBounds())
- Blit.getFromCache() (gets the cache entry for our Blit call)
- DrawImage.blitSurfaceData (spends a couple of percent just dealing with setting the CompositeType)
- AcceleratedOffScreenImage.getSourceSurfaceData (gets the accelerated surfaceData object for accelerated images)

There are various other methods and simple operations which end up taking over a percent of the runtime. Many of these functions are very simple (like the equals() comparison when retrieving the Blit from the cache), but when called over 60000 times (in this case), they add up to significant overhead.

The reason for performance loss due to overhead in this case is that the primitives in question are so small (20x20) that the more we do between issuing the call from the application and actually issuing the ddraw call, the more we suffer from each intermediate step. For the larger primitives, the overhead is now insignificant compared to the time taken by the actual rendering, so we see the performance benefits of ddraw much more clearly.
chet.haase@Eng 2001-07-18

I am closing this bug and opening a new bug on just the small-image case. See bug 4481344 for more details on this problem.
The original reason for this bug was to fix general image copying performance; we have done that in jdk1.4 via hardware-accelerated images, and performance of most image sizes is well beyond the performance in any prior jdk. However, since there are still issues with small image sizes (such as the 20x20 case quoted in this bug report), a bug should still be open against this problem. I am marking this bug as Fixed and Integrated because the original problem was fixed many releases ago for general images.
chet.haase@Eng 2001-07-18
18-07-2001
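
The small-image reasoning in the 2001-07-18 evaluation (fixed per-call overhead dominating when the primitive is only 20x20) can be illustrated with a toy cost model. This is a sketch under stated assumptions, not something taken from the JDK sources or from the profiling run: the class BlitOverheadModel, the method effectivePps, and the overhead and peak-rate values in main are all hypothetical. Each call is assumed to cost a constant overhead plus the time to move its pixels at a peak rate.

```java
public class BlitOverheadModel {
    /**
     * Effective throughput in pixels per second when every call pays a
     * fixed overhead (seconds) plus pixels / peakPps of copy time.
     */
    public static double effectivePps(int width, int height,
                                      double overheadSeconds, double peakPps) {
        double pixels = (double) width * height;
        double secondsPerCall = overheadSeconds + pixels / peakPps;
        return pixels / secondsPerCall;
    }

    public static void main(String[] args) {
        // Hypothetical values chosen only for illustration, not measured:
        double overheadSeconds = 50e-6; // per-call cost before the copy starts
        double peakPps = 165e6;         // raw copy rate once the copy is issued
        for (int size : new int[] {20, 100, 300}) {
            System.out.printf("%dx%d %,.0f pps%n", size, size,
                    effectivePps(size, size, overheadSeconds, peakPps));
        }
    }
}
```

In this model the same per-call overhead is spread over 400 pixels for a 20x20 image but over 90,000 pixels for a 300x300 image, so the small case stays far below the peak rate while the large case approaches it.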