Bug ID: JDK-6739267 D3D/OGL: add missing ThreeByteBgr to texture upload blit loop

Details
Type:
Bug
Submit Date:
2008-08-20
Status:
Resolved
Updated Date:
2010-04-02
Project Name:
JDK
Resolved Date:
2009-01-09
Component:
client-libs
OS:
windows_xp
Sub-Component:
2d
CPU:
x86
Priority:
P3
Resolution:
Fixed
Affected Versions:
6u10
Fixed Versions:

Description
The D3D pipeline has no dedicated loop for uploading data from a
ThreeByteBgr image into a texture, so uploads of these images fall back
to a generic loop that is slow and goes through an intermediate image.
That costs one extra copy, plus some GC activity when the intermediate
image is collected.

ThreeByteBgr is often used by video decoders (DirectShow specifically),
so the upload happens on every frame.
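
To make the scenario concrete, here is a minimal sketch of how a decoded video frame might end up in a ThreeByteBgr image and then be drawn both as a straight blit and scaled. The class and method names (`ThreeByteBgrFrame`, `fromBgrBytes`) are hypothetical, and the destination here is an ordinary `BufferedImage`, so this exercises only the software loops; the D3D/OGL texture upload discussed in this report would come into play only with an accelerated destination such as a window back buffer.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferByte;

public class ThreeByteBgrFrame {
    /** Copies one tightly packed frame (B, G, R byte order) into a new image. */
    static BufferedImage fromBgrBytes(byte[] bgr, int width, int height) {
        BufferedImage img =
            new BufferedImage(width, height, BufferedImage.TYPE_3BYTE_BGR);
        // Grabbing the backing array is the usual way decoders fill a frame;
        // it typically also leaves the image unmanaged (not cached in VRAM).
        byte[] dst = ((DataBufferByte) img.getRaster().getDataBuffer()).getData();
        System.arraycopy(bgr, 0, dst, 0, width * height * 3);
        return img;
    }

    public static void main(String[] args) {
        int w = 4, h = 2;
        byte[] frame = new byte[w * h * 3];
        frame[0] = (byte) 0xFF; // blue component of pixel (0,0)

        BufferedImage src = fromBgrBytes(frame, w, h);
        BufferedImage dst =
            new BufferedImage(w * 2, h * 2, BufferedImage.TYPE_INT_RGB);

        Graphics2D g = dst.createGraphics();
        g.drawImage(src, 0, 0, null);               // straight blit
        g.drawImage(src, 0, 0, w * 2, h * 2, null); // scaled blit
        g.dispose();

        System.out.printf("pixel (0,0) = %08x%n", src.getRGB(0, 0));
    }
}
```

A video player would repeat the fill-and-draw step for every frame, which is why the cost of the upload path matters here.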

                                    

Comments
EVALUATION

We need to add the loop.
                                     
2008-08-20
EVALUATION

The same applies to the OpenGL pipeline, but with a twist.

Adding these loops helps only in the scaling case. With the loops
added, straight blits become much slower in OGL, because the data has
to be uploaded scan line by scan line to avoid possible alignment
issues (see bug 6207877).
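
As an illustration of why 3-byte pixels can run into alignment trouble, the sketch below checks whether a row of ThreeByteBgr pixels ends on a 4-byte boundary. This assumes the issue is of the kind caused by OpenGL's default 4-byte row alignment for pixel uploads; the details of bug 6207877 are not given in this report, and the class and method names are hypothetical.

```java
public class ScanlineAlignment {
    static final int BYTES_PER_PIXEL = 3; // ThreeByteBgr

    /** Number of bytes occupied by one tightly packed row of pixels. */
    static int rowBytes(int width) {
        return width * BYTES_PER_PIXEL;
    }

    /** True if each row starts on a multiple of the given alignment. */
    static boolean isAligned(int width, int alignment) {
        return rowBytes(width) % alignment == 0;
    }

    /** Row length rounded up to the next multiple of the alignment. */
    static int paddedRowBytes(int width, int alignment) {
        int n = rowBytes(width);
        return (n + alignment - 1) / alignment * alignment;
    }

    public static void main(String[] args) {
        for (int width : new int[] {250, 1000}) {
            System.out.println("width=" + width
                + " rowBytes=" + rowBytes(width)
                + " aligned4=" + isAligned(width, 4)
                + " padded4=" + paddedRowBytes(width, 4));
        }
    }
}
```

For the two benchmark sizes, a 1000-pixel row (3000 bytes) is a multiple of 4, while a 250-pixel row (750 bytes) is not, so a tightly packed 250-wide image cannot be handed over as one block if the receiver expects 4-byte aligned rows.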

Here's some performance data for the non-optimized vs. optimized (with
the loops added) cases. The benchmark tests blit/scale of an unmanaged
3ByteBgr image.

d3d:
graphics.imaging.tests.drawimage,graphics.opts.sizes=1000:
3byte_noopt: 86301.92230 (var=0.49%) (100.0%)
3byte_opt: 104693.14079 (var=0.73%) (121.31%)
graphics.imaging.tests.drawimage,graphics.opts.sizes=250:
3byte_noopt: 83373.93846 (var=1.73%) (100.0%)
3byte_opt: 103826.17728 (var=0.79%) (124.53%)
graphics.imaging.tests.drawimagescaleup,graphics.opts.sizes=1000:
3byte_noopt: 1527.494908 (var=2.59%) (100.0%)
3byte_opt: 235402.19134 (var=0.56%) (15411.0%)
graphics.imaging.tests.drawimagescaleup,graphics.opts.sizes=250:
3byte_noopt: 1087.896986 (var=1.48%) (100.0%)
3byte_opt: 233458.12958 (var=0.82%) (21459.58%)
Summary:
  3byte_noopt:
    Number of tests:  4
    Overall average:  43072.81316563192
    Best spread:      0.49% variance
    Worst spread:     2.59% variance
    (Basis for results comparison)
  3byte_opt:
    Number of tests:  4
    Overall average:  169344.90975135576
    Best spread:      0.56% variance
    Worst spread:     0.82% variance
    Comparison to basis:
      Best result:      21459.58% of basis
      Worst result:     121.31% of basis
      Number of wins:   4
      Number of ties:   0
      Number of losses: 0

ogl:
graphics.imaging.tests.drawimage,graphics.opts.sizes=1000:
3byte_noopt: 85970.39250 (var=0.78%) (100.0%)
3byte_opt: 7409.440175 (var=0.95%) (8.62%)
graphics.imaging.tests.drawimage,graphics.opts.sizes=250:
3byte_noopt: 80193.14868 (var=5.45%) (100.0%)
3byte_opt: 5703.422053 (var=21.79%) (7.11%)
graphics.imaging.tests.drawimagescaleup,graphics.opts.sizes=1000:
3byte_noopt: 5492.270138 (var=1.89%) (100.0%)
3byte_opt: 319366.47955 (var=0.89%) (5814.84%)
graphics.imaging.tests.drawimagescaleup,graphics.opts.sizes=250:
3byte_noopt: 2676.494431 (var=6.58%) (100.0%)
3byte_opt: 313342.59059 (var=4.39%) (11707.2%)
Summary:
  3byte_noopt:
    Number of tests:  4
    Overall average:  43583.076440138895
    Best spread:      0.78% variance
    Worst spread:     6.58% variance
    (Basis for results comparison)
  3byte_opt:
    Number of tests:  4
    Overall average:  161455.48309378154
    Best spread:      0.89% variance
    Worst spread:     21.79% variance
    Comparison to basis:
      Best result:      11707.2% of basis
      Worst result:     7.11% of basis
      Number of wins:   2
      Number of ties:   0
      Number of losses: 2
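
The "% of basis" and "Overall average" figures above appear to be a plain ratio and an arithmetic mean of the per-test scores. The sketch below reproduces them from the numbers listed in this comment; the class and method names are hypothetical, and this is a check of the arithmetic, not J2DBench's own code.

```java
public class BasisComparison {
    /** Score expressed as a percentage of the basis score. */
    static double percentOfBasis(double score, double basis) {
        return 100.0 * score / basis;
    }

    /** Arithmetic mean of the given scores. */
    static double average(double... scores) {
        double sum = 0.0;
        for (double s : scores) {
            sum += s;
        }
        return sum / scores.length;
    }

    public static void main(String[] args) {
        // d3d drawimage, sizes=1000: reported as 121.31% of basis
        System.out.printf("d3d blit: %.2f%%%n",
            percentOfBasis(104693.14079, 86301.92230));
        // ogl drawimage, sizes=1000: reported as 8.62% of basis
        System.out.printf("ogl blit: %.2f%%%n",
            percentOfBasis(7409.440175, 85970.39250));
        // d3d 3byte_noopt overall average: reported as about 43072.81
        System.out.printf("d3d noopt avg: %.2f%n",
            average(86301.92230, 83373.93846, 1527.494908, 1087.896986));
    }
}
```

Because the overall average mixes blit scores near 85000 with scaling scores near 1500, it is dominated by the blit tests; the per-test percentages are the more informative comparison.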
                                     
2008-08-20
EVALUATION

J2DBench results for d3d, optimized vs non-optimized:

graphics.imaging.tests.drawimage,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=1000:
d3d_3byte_noopt: 86677.36757 (var=0.56%) (100.0%)
d3d_3byte_opt: 104417.67068 (var=0.77%) (120.47%)
graphics.imaging.tests.drawimage,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=250:
d3d_3byte_noopt: 84648.67617 (var=1.55%) (100.0%)
d3d_3byte_opt: 103302.07501 (var=0.92%) (122.04%)
graphics.imaging.tests.drawimagescaledown,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=1000:
d3d_3byte_noopt: 1433.121019 (var=1.24%) (100.0%)
d3d_3byte_opt: 58772.01761 (var=0.6%) (4100.98%)
graphics.imaging.tests.drawimagescaledown,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=250:
d3d_3byte_noopt: 995.8624898 (var=0.45%) (100.0%)
d3d_3byte_opt: 57882.63793 (var=1.13%) (5812.31%)
graphics.imaging.tests.drawimagescaleup,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=1000:
d3d_3byte_noopt: 1571.229050 (var=2.45%) (100.0%)
d3d_3byte_opt: 234860.55776 (var=0.4%) (14947.57%)
graphics.imaging.tests.drawimagescaleup,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=250:
d3d_3byte_noopt: 1092.344644 (var=1.19%) (100.0%)
d3d_3byte_opt: 231106.37876 (var=0.64%) (21156.91%)
graphics.imaging.tests.drawimagetxform,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=1000:
d3d_3byte_noopt: 1718.75 (var=1.19%) (100.0%)
d3d_3byte_opt: 126001.58982 (var=0.92%) (7331.0%)
graphics.imaging.tests.drawimagetxform,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=250:
d3d_3byte_noopt: 1418.912175 (var=1.17%) (100.0%)
d3d_3byte_opt: 123388.15789 (var=0.89%) (8695.97%)

Results for OGL, including loops which we know are slower
and won't be included in the fix:

graphics.imaging.tests.drawimage,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=1000:
ogl_3byte_noopt: 86021.50537 (var=0.56%) (100.0%)
ogl_3byte_opt: 7416.563658 (var=0.33%) (8.62%)
graphics.imaging.tests.drawimage,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=250:
ogl_3byte_noopt: 80210.42084 (var=0.81%) (100.0%)
ogl_3byte_opt: 5773.092369 (var=1.17%) (7.2%)
graphics.imaging.tests.drawimagescaledown,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=1000:
ogl_3byte_noopt: 3155.048076 (var=0.64%) (100.0%)
ogl_3byte_opt: 2708.667736 (var=0.89%) (85.85%)
graphics.imaging.tests.drawimagescaledown,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=250:
ogl_3byte_noopt: 1621.264588 (var=1.91%) (100.0%)
ogl_3byte_opt: 2673.937004 (var=1.0%) (164.93%)
graphics.imaging.tests.drawimagescaleup,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=1000:
ogl_3byte_noopt: 5601.659751 (var=1.25%) (100.0%)
ogl_3byte_opt: 10856.45355 (var=0.89%) (193.81%)
graphics.imaging.tests.drawimagescaleup,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=250:
ogl_3byte_noopt: 2662.923045 (var=0.6%) (100.0%)
ogl_3byte_opt: 10749.11347 (var=1.56%) (403.66%)
graphics.imaging.tests.drawimagetxform,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=1000:
ogl_3byte_noopt: 3942.973523 (var=0.41%) (100.0%)
ogl_3byte_opt: 5826.645264 (var=1.38%) (147.77%)
graphics.imaging.tests.drawimagetxform,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=250:
ogl_3byte_noopt: 2360.369609 (var=1.32%) (100.0%)
ogl_3byte_opt: 5833.668139 (var=0.84%) (247.15%)
                                     
2008-08-21
EVALUATION

Ok, here's the data with the latest version of the fix (full
J2DBench results files attached):

I have extended J2DBench with a set of
  drawImage+touch tests. Currently one has to
  specify accthreshold=0 to properly test texture
  uploads (SwToTexture); otherwise we'd be
  testing the 'unmanaged image' case (SwToSurface).

  So I ran the benchmarks, and now most variants show
  improvement, and the results are consistent between
  ogl and d3d.

  I tested
    unmanaged scale/blit/tx
    'managed, touched on each iteration' scale/blit/tx

  d3d_3byte_noopt:
    Number of tests:  32
    Overall average:  1371683.1740006404
    Best spread:      0.04% variance
    Worst spread:     8.94% variance
    (Basis for results comparison)
  d3d_3byte_opt:
    Number of tests:  32
    Overall average:  1434948.8126830158
    Best spread:      0.0% variance
    Worst spread:     1.58% variance
    Comparison to basis:
      Best result:      20532.49% of basis
      Worst result:     99.58% of basis
      Number of wins:   24
      Number of ties:   8
      Number of losses: 0

  ogl_3byte_noopt:
    Number of tests:  32
    Overall average:  557509.3068086591
    Best spread:      0.04% variance
    Worst spread:     1.31% variance
    (Basis for results comparison)
  ogl_3byte_opt:
    Number of tests:  32
    Overall average:  659663.2264295145
    Best spread:      0.04% variance
    Worst spread:     4.64% variance
    Comparison to basis:
      Best result:      11709.7% of basis
      Worst result:     99.96% of basis
      Number of wins:   24
      Number of ties:   8
      Number of losses: 0

  The ties are "managed, untouched" cases, where we only pay
  the penalty of missing loops once when uploading to the texture
  for the first time.
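
  For readers unfamiliar with the "managed, touched on each iteration"
  case, here is a minimal sketch of what such a loop could look like.
  It is not the J2DBench test itself; the class and method names are
  hypothetical, and the destination is a plain BufferedImage so the
  sketch runs anywhere. To exercise the texture upload path one would
  draw to an accelerated destination and, as noted above, run with
  accthreshold=0 (assumed here to be the sun.java2d.accthreshold
  system property).

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

public class TouchedImageLoop {
    /**
     * Modifies the source before every draw, so any cached accelerated
     * copy would be stale and the pixels have to be uploaded again.
     * Returns the final value of destination pixel (0,0).
     */
    static int drawTouched(BufferedImage src, BufferedImage dst, int iterations) {
        Graphics2D g = dst.createGraphics();
        try {
            for (int i = 0; i < iterations; i++) {
                src.setRGB(0, 0, 0xFF000000 | i); // "touch" the image
                g.drawImage(src, 0, 0, null);     // blit after the touch
            }
        } finally {
            g.dispose();
        }
        return dst.getRGB(0, 0);
    }

    public static void main(String[] args) {
        BufferedImage src =
            new BufferedImage(250, 250, BufferedImage.TYPE_3BYTE_BGR);
        BufferedImage dst =
            new BufferedImage(250, 250, BufferedImage.TYPE_INT_RGB);
        int last = drawTouched(src, dst, 3);
        System.out.printf("last pixel = %08x%n", last);
    }
}
```

  Dropping the setRGB call gives the "managed, untouched" variant, where
  the upload cost is paid only once.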
                                     
2008-08-21
SUGGESTED FIX

http://hg.openjdk.java.net/jdk7/jdk7/rev/cd88b4ad7f25, http://sa.sfbay.sun.com/projects/java2d_data/7/6739267.2
                                     
2008-08-28


