JDK-6529101 : OGL: antialiasing performance needs to be improved
  • Type: Bug
  • Component: client-libs
  • Sub-Component: 2d
  • Affected Version: 6
  • Priority: P4
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2007-02-27
  • Updated: 2011-03-08
  • Resolved: 2011-03-08
  • Fixed in JDK 6: 6u4
  • Fixed in JDK 7: 7 b14
Description
As discussed on this forum thread, AA performance could be improved when the OGL
pipeline is enabled:
http://forums.java.net/jive/thread.jspa?threadID=23548&tstart=0

This bug report doesn't recommend one particular way to improve performance.  There
are likely a number of things we could do that would add up to some real
performance gains.

Comments
EVALUATION (cont'd...)

text.Rendering.tests.drawString,text.opts.data.tlength=128,text.opts.font.fsize=13.0:
  base.1100: 2880.499152 (var=0.1%) (100.0%)
  fix.1100: 3324.782841 (var=0.03%) (115.42%)
text.Rendering.tests.drawString,text.opts.data.tlength=128,text.opts.font.fsize=36.0:
  base.1100: 148.9793605 (var=0.28%) (100.0%)
  fix.1100: 165.1497197 (var=0.36%) (110.85%)
text.Rendering.tests.drawString,text.opts.data.tlength=16,text.opts.font.fsize=13.0:
  base.1100: 1968.584026 (var=0.25%) (100.0%)
  fix.1100: 2910.673605 (var=0.78%) (147.86%)
text.Rendering.tests.drawString,text.opts.data.tlength=16,text.opts.font.fsize=36.0:
  base.1100: 133.3463414 (var=0.31%) (100.0%)
  fix.1100: 151.1997249 (var=1.0%) (113.39%)
text.Rendering.tests.drawString,text.opts.data.tlength=64,text.opts.font.fsize=13.0:
  base.1100: 2697.500674 (var=0.13%) (100.0%)
  fix.1100: 3272.727272 (var=0.2%) (121.32%)
text.Rendering.tests.drawString,text.opts.data.tlength=64,text.opts.font.fsize=36.0:
  base.1100: 145.2682545 (var=0.27%) (100.0%)
  fix.1100: 162.9563200 (var=1.71%) (112.18%)

Summary:
  base.1100:
    Number of tests: 78
    Overall average: 150090.2789310724
    Best spread: 0.0% variance
    Worst spread: 10.83% variance
    (Basis for results comparison)
  fix.1100:
    Number of tests: 78
    Overall average: 178987.5180214179
    Best spread: 0.0% variance
    Worst spread: 10.31% variance
    Comparison to basis:
      Best result: 6896.97% of basis
      Worst result: 86.12% of basis
      Number of wins: 60
      Number of ties: 11
      Number of losses: 7
03-05-2007

EVALUATION
I ended up making a lot of changes, for various reasons. Unfortunately, these don't improve performance of the simple cases (e.g. antialiased drawLines) all that much, but there are huge improvements in the more complex cases. Overall the code is actually simplified, and the changes open the door to future performance improvements. We will continue to work on improving antialiased rendering performance in future putbacks.

Now, on to the specific changes. There are a few ways performance is improved with this fix:

1. Avoid redundant calls to glBindTexture(). We do this by adding two new state tokens (one for mask operations, one for glyph operations). This helps quite a bit (5-10% improvement) for cases like large AA operations, where there are tens or hundreds of OGLMaskFill ops in a row and the mask texture stays constant, or for grayscale text operations, where the glyph cache texture is used over and over again.

2. Introduce vertex caching for mask/glyph operations (see OGLVertexCache.c/h). This allows us to batch up a bunch of mask tiles (up to 31 masks; that number was chosen because that's about how many 1024-byte masks can fit in our existing RenderBuffer) or cached glyphs (up to 256 glyph quads). This is a bit more streamlined than our existing approach:

    glTexSubImage2D();
    glBegin(GL_QUADS);
    glVertex2f(); // x4
    glEnd();
    // repeat over and over

Now we have:

    glTexSubImage2D(); // repeat up to 31 times

and then a single call to:

    glDrawArrays(...);

to flush the vertex cache. The main saving here is in not doing so many glBegin()/glEnd() sequences. It only provides a small gain for mask operations, because 31 quads isn't exactly a huge batch, but for long batches of text it can be quite beneficial. The code in OGLMaskFill and OGLTextRenderer is actually simplified quite a bit because of this change.

3. Change the way we validate color/paint state at the Java level.
Previously, we would only validate single "int pixel" values in BufferedContext.validate(). So changes in Color would go through that path, but changes in Paint (such as GradientPaint, TexturePaint, etc.) were handled specially in OGLPaints in a rather dumb way. Before every operation (even if the Paint hadn't changed since last time), we would:

  a. set up/enable the paint state at the native level
  b. issue the rendering operation
  c. disable the paint state at the native level

This approach is really bad for cases like large AA shapes, where we may issue tens or hundreds of OGLMaskFill calls, each time setting up and tearing down the same paint state redundantly! It's also a total waste for small non-AA operations where the Paint stays constant. By pushing Paint validation into BufferedContext, we can now set up the Paint only once, when it has changed, which greatly reduces the amount of work we have to do in the above cases. The gains here range anywhere from 2x to 200x depending on the operation. (It's really only a win for the non-Color cases; we don't get any improvement here for the simple Color case, unfortunately/obviously.)

Anyway, there are no major performance regressions associated with this fix; things are only getting better (or staying the same, in the case of non-AA operations).

There are also a number of changes here that move lots of Java-level setup code into shared code, where it will be used by the new D3D pipeline. The goal is to get as many of these shared code changes as possible into the various workspaces so that we can rely on them in the near future. Besides code sharing, there are also some concepts like mask/glyph caching that I expect we will be using ASAP in the D3D pipeline.

And now for the detailed performance numbers...
I've attached comprehensive J2DBench results for two configurations to this bug report:

  - Windows XP, 2x 2.8 GHz P4, ATI Radeon 9800 Pro, Catalyst 7.3
  - Solaris 10, 2.0 GHz Opteron, Nvidia Quadro FX 1100, 97.76

The numbers look similar for other platforms/hardware. I've generated numbers for "def" (the default pipeline), "base" (the OGL pipeline in JDK 7-b12), and "fix" (the workspace with my latest changes). I'll include the "base" vs. "fix" numbers here inline. Comparing against the "def" numbers makes it clear that there are still some cases where the OGL pipeline isn't as fast as the default/software-based pipelines; in many other cases, the OGL pipeline blows them away. As I stated earlier, we will continue to tune the OGL pipeline with the hope of being faster in all cases.
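To make the paint validation change from item 3 above more concrete, here is a minimal Java sketch of the idea. It is not the actual BufferedContext or OGLPaints code: the class and method names (PaintValidationSketch, CommandLog, Context, maskFill) and the command strings are hypothetical, and the identity comparison used to detect a changed paint is an assumption made for illustration. The same "only do the work when the state really changed" idea underlies the glBindTexture() state tokens in item 1.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: set up paint state only when the paint changes. */
public class PaintValidationSketch {

    /** Stand-in for the native pipeline; it only records the commands it receives. */
    static final class CommandLog {
        final List<String> commands = new ArrayList<>();

        void emit(String command) {
            commands.add(command);
        }

        long count(String command) {
            return commands.stream().filter(command::equals).count();
        }
    }

    /** Old approach: enable and disable the paint around every single operation. */
    static void renderPerOperation(CommandLog log, Object paint, int operations) {
        for (int i = 0; i < operations; i++) {
            log.emit("SET_PAINT");
            log.emit("MASK_FILL");
            log.emit("RESET_PAINT");
        }
    }

    /** New approach: the context remembers which paint it validated last. */
    static final class Context {
        private final CommandLog log;
        private Object validatedPaint; // null until a paint has been set up

        Context(CommandLog log) {
            this.log = log;
        }

        /** Emits setup commands only if the paint differs from the last one (assumed identity check). */
        void validate(Object paint) {
            if (paint != validatedPaint) {
                if (validatedPaint != null) {
                    log.emit("RESET_PAINT");
                }
                log.emit("SET_PAINT");
                validatedPaint = paint;
            }
        }

        void maskFill(Object paint) {
            validate(paint);
            log.emit("MASK_FILL");
        }
    }

    public static void main(String[] args) {
        Object gradient = new Object(); // placeholder for a GradientPaint

        CommandLog before = new CommandLog();
        renderPerOperation(before, gradient, 100);

        CommandLog after = new CommandLog();
        Context context = new Context(after);
        for (int i = 0; i < 100; i++) {
            context.maskFill(gradient);
        }

        System.out.println("per-operation setups: " + before.count("SET_PAINT"));
        System.out.println("validated setups: " + after.count("SET_PAINT"));
    }
}
```

With 100 fills of the same paint, the per-operation path records 100 setups and the validated path records one, which is the kind of redundancy the evaluation describes for large AA shapes.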
03-05-2007
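The vertex caching described in item 2 of the evaluation above can be sketched roughly as follows. This is an illustrative Java model only, not the native OGLVertexCache.c code: VertexCacheSketch, addMaskQuad, and flush are hypothetical names, and OpenGL calls are recorded as strings instead of being issued. The batch limit of 31 is the mask figure quoted in the evaluation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of batching quads and flushing them with one draw call. */
public class VertexCacheSketch {
    /** Batch limit for mask tiles quoted in the evaluation. */
    static final int MAX_QUADS = 31;

    /** Stand-in for real OpenGL calls: we only record what would be issued. */
    final List<String> glCalls = new ArrayList<>();

    /** Four corners per quad, two floats (x, y) per corner. */
    private final float[] vertices = new float[MAX_QUADS * 4 * 2];
    private int quadCount;

    /** Uploads one mask tile immediately and queues its quad for later drawing. */
    void addMaskQuad(float x, float y, float w, float h) {
        if (quadCount == MAX_QUADS) {
            flush(); // cache is full, so draw what we have first
        }
        glCalls.add("glTexSubImage2D");
        int i = quadCount * 8;
        vertices[i] = x;
        vertices[i + 1] = y;
        vertices[i + 2] = x + w;
        vertices[i + 3] = y;
        vertices[i + 4] = x + w;
        vertices[i + 5] = y + h;
        vertices[i + 6] = x;
        vertices[i + 7] = y + h;
        quadCount++;
    }

    /** Issues a single draw call for every queued quad, then empties the cache. */
    void flush() {
        if (quadCount == 0) {
            return;
        }
        glCalls.add("glDrawArrays(" + quadCount + " quads)");
        quadCount = 0;
    }

    public static void main(String[] args) {
        VertexCacheSketch cache = new VertexCacheSketch();
        for (int tile = 0; tile < 70; tile++) {
            cache.addMaskQuad(tile * 32f, 0f, 32f, 32f);
        }
        cache.flush(); // draw whatever is still queued
        long draws = cache.glCalls.stream()
                .filter(call -> call.startsWith("glDrawArrays"))
                .count();
        System.out.println("tiles=70 drawCalls=" + draws);
    }
}
```

In this model, 70 tiles produce three draw calls (31, 31, and 8 quads) where the immediate-mode approach would have needed 70 glBegin()/glEnd() pairs. The texture upload per tile is unchanged, which fits the evaluation's remark that the gain for masks is small.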

EVALUATION
Config #1: Windows XP, 2x 2.8 GHz P4, ATI Radeon 9800 Pro, Catalyst 7.3

Options common across all tests:
  text.opts.advopts.gvstyle=0
  text.opts.font.fname=Lucida Sans
  text.opts.graphics.textaa=On
  global.dest=VolatileImg
  text.opts.graphics.tfm=false
  graphics.opts.anim=2
  graphics.opts.renderhint=Default
  graphics.opts.clip=false
  text.opts.font.ftx=Identity
  text.opts.advopts.maptype=FONT
  text.opts.graphics.gtx=Identity
  text.opts.graphics.gaa=false
  graphics.opts.extraalpha=false
  graphics.render.opts.alphacolor=false
  graphics.opts.alpharule=SrcOver
  text.opts.advopts.tlruns=1
  text.opts.font.fstyle=0
  graphics.opts.xormode=false
  text.opts.data.tscript=english

graphics.render.tests.drawLine,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2:
  base.9800: 201.5039999 (var=2.06%) (100.0%)
  fix.9800: 2292.850146 (var=0.53%) (1137.87%)
graphics.render.tests.drawLine,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3:
  base.9800: 31.45661641 (var=0.54%) (100.0%)
  fix.9800: 2154.952917 (var=0.03%) (6850.56%)
graphics.render.tests.drawLine,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=random:
  base.9800: 1861.184125 (var=0.0%) (100.0%)
  fix.9800: 1831.173383 (var=0.03%) (98.39%)
graphics.render.tests.drawLine,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=single:
  base.9800: 2497.170588 (var=1.66%) (100.0%)
  fix.9800: 2660.681366 (var=0.56%) (106.55%)
graphics.render.tests.drawLine,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2:
  base.9800: 33.70633333 (var=0.53%) (100.0%)
  fix.9800: 46.85638474 (var=0.0%) (139.01%)
graphics.render.tests.drawLine,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3:
  base.9800: 17.31323283 (var=1.59%) (100.0%)
  fix.9800: 47.09182305 (var=0.03%) (272.0%)
graphics.render.tests.drawLine,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=random:
  base.9800: 45.53433333 (var=0.5%) (100.0%)
  fix.9800: 46.53467336 (var=0.54%) (102.2%)
graphics.render.tests.drawLine,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=single:
  base.9800: 46.75753471 (var=0.0%) (100.0%)
  fix.9800: 47.57774798 (var=0.03%) (101.75%)
graphics.render.tests.drawLine,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2:
  base.9800: 3886.139999 (var=0.53%) (100.0%)
  fix.9800: 46888.13581 (var=0.52%) (1206.55%)
graphics.render.tests.drawLine,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3:
  base.9800: 619.1774461 (var=0.03%) (100.0%)
  fix.9800: 42490.56666 (var=0.0%) (6862.42%)
graphics.render.tests.drawLine,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=random:
  base.9800: 35269.15333 (var=0.0%) (100.0%)
  fix.9800: 33131.48135 (var=0.0%) (93.94%)
graphics.render.tests.drawLine,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=single:
  base.9800: 47160.94818 (var=0.03%) (100.0%)
  fix.9800: 46097.80111 (var=0.0%) (97.75%)
graphics.render.tests.drawLine,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2:
  base.9800: 527.5399999 (var=0.0%) (100.0%)
  fix.9800: 679.4333333 (var=0.0%) (128.79%)
graphics.render.tests.drawLine,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3:
  base.9800: 289.7866666 (var=0.0%) (100.0%)
  fix.9800: 679.0733333 (var=0.0%) (234.34%)
graphics.render.tests.drawLine,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=random:
  base.9800: 651.5199999 (var=0.0%) (100.0%)
  fix.9800: 673.7733333 (var=0.0%) (103.42%)
graphics.render.tests.drawLine,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=single:
  base.9800: 663.0800000 (var=0.53%) (100.0%)
  fix.9800: 683.0764075 (var=0.54%) (103.02%)
graphics.render.tests.drawLine,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2:
  base.9800: 46400.55999 (var=0.48%) (100.0%)
  fix.9800: 470531.25 (var=0.0%) (1014.06%)
graphics.render.tests.drawLine,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3:
  base.9800: 7782.422922 (var=0.03%) (100.0%)
  fix.9800: 43673.92761 (var=0.03%) (561.19%)
graphics.render.tests.drawLine,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=random:
  base.9800: 445811.27272 (var=0.0%) (100.0%)
  fix.9800: 417928.30253 (var=0.0%) (93.75%)
graphics.render.tests.drawLine,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=single:
  base.9800: 470689.0 (var=0.0%) (100.0%)
  fix.9800: 471119.58333 (var=0.0%) (100.09%)
graphics.render.tests.drawLine,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2:
  base.9800: 369.0109514 (var=0.0%) (100.0%)
  fix.9800: 527.8530308 (var=0.58%) (143.05%)
graphics.render.tests.drawLine,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3:
  base.9800: 171.1937716 (var=0.03%) (100.0%)
  fix.9800: 531.3240772 (var=0.03%) (310.36%)
graphics.render.tests.drawLine,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=random:
  base.9800: 509.9898408 (var=0.03%) (100.0%)
  fix.9800: 530.9007788 (var=0.03%) (104.1%)
graphics.render.tests.drawLine,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=single:
  base.9800: 512.4234172 (var=0.03%) (100.0%)
  fix.9800: 531.3240772 (var=0.0%) (103.69%)
graphics.render.tests.fillOval,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2:
  base.9800: 47.40766666 (var=0.5%) (100.0%)
  fix.9800: 66.28090266 (var=1.59%) (139.81%)
graphics.render.tests.fillOval,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3:
  base.9800: 20.15366666 (var=0.0%) (100.0%)
  fix.9800: 65.92400000 (var=0.0%) (327.11%)
graphics.render.tests.fillOval,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=random:
  base.9800: 65.78884856 (var=1.03%) (100.0%)
  fix.9800: 64.42766666 (var=0.0%) (97.93%)
graphics.render.tests.fillOval,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=single:
  base.9800: 66.81544192 (var=0.0%) (100.0%)
  fix.9800: 65.77594339 (var=0.54%) (98.44%)
graphics.render.tests.fillOval,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2:
  base.9800: 38.02177554 (var=0.54%) (100.0%)
  fix.9800: 56.31233243 (var=0.03%) (148.11%)
graphics.render.tests.fillOval,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3:
  base.9800: 18.22466666 (var=0.5%) (100.0%)
  fix.9800: 56.46883378 (var=0.03%) (309.85%)
graphics.render.tests.fillOval,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=random:
  base.9800: 52.19266666 (var=0.5%) (100.0%)
  fix.9800: 55.24733333 (var=0.0%) (105.85%)
graphics.render.tests.fillOval,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=single:
  base.9800: 53.38391959 (var=0.54%) (100.0%)
  fix.9800: 56.44700000 (var=0.5%) (105.74%)
graphics.render.tests.fillOval,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2:
  base.9800: 11022.29413 (var=0.54%) (100.0%)
  fix.9800: 14078.08533 (var=0.0%) (127.72%)
graphics.render.tests.fillOval,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3:
  base.9800: 5438.061333 (var=0.0%) (100.0%)
  fix.9800: 13989.03315 (var=0.53%) (257.24%)
graphics.render.tests.fillOval,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=random:
  base.9800: 14148.52599 (var=0.0%) (100.0%)
  fix.9800: 13840.28266 (var=0.0%) (97.82%)
graphics.render.tests.fillOval,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=single:
  base.9800: 14297.78066 (var=0.53%) (100.0%)
  fix.9800: 14138.68733 (var=0.0%) (98.89%)
graphics.render.tests.fillOval,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2:
  base.9800: 7886.922613 (var=0.54%) (100.0%)
  fix.9800: 10229.45728 (var=0.03%) (129.7%)
graphics.render.tests.fillOval,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3:
  base.9800: 4487.583333 (var=0.54%) (100.0%)
  fix.9800: 10229.51809 (var=0.03%) (227.95%)
graphics.render.tests.fillOval,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=random:
  base.9800: 9575.011333 (var=0.54%) (100.0%)
  fix.9800: 10131.52399 (var=0.0%) (105.81%)
graphics.render.tests.fillOval,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=single:
  base.9800: 9683.131999 (var=0.0%) (100.0%)
  fix.9800: 10228.86399 (var=0.0%) (105.64%)
graphics.render.tests.fillOval,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2:
  base.9800: 638425.52199 (var=0.0%) (100.0%)
  fix.9800: 717392.83355 (var=0.53%) (112.37%)
graphics.render.tests.fillOval,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3:
  base.9800: 65138.44900 (var=0.0%) (100.0%)
  fix.9800: 65465.69566 (var=0.0%) (100.5%)
graphics.render.tests.fillOval,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=random:
  base.9800: 721218.92866 (var=0.0%) (100.0%)
  fix.9800: 713839.51633 (var=0.0%) (98.98%)
graphics.render.tests.fillOval,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=single:
  base.9800: 726962.10766 (var=0.0%) (100.0%)
  fix.9800: 720597.16000 (var=0.0%) (99.12%)
graphics.render.tests.fillOval,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2:
  base.9800: 42869.31333 (var=0.0%) (100.0%)
  fix.9800: 73532.32600 (var=0.5%) (171.53%)
graphics.render.tests.fillOval,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3:
  base.9800: 16935.01499 (var=0.5%) (100.0%)
  fix.9800: 73925.02199 (var=0.0%) (436.52%)
graphics.render.tests.fillOval,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=random:
  base.9800: 65809.30466 (var=0.0%) (100.0%)
  fix.9800: 73926.60120 (var=0.03%) (112.33%)
graphics.render.tests.fillOval,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=single:
  base.9800: 65858.39166 (var=0.0%) (100.0%)
  fix.9800: 73925.02199 (var=0.0%) (112.25%)
graphics.render.tests.fillRect,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2:
  base.9800: 171.8133373 (var=0.03%) (100.0%)
  fix.9800: 2402.098507 (var=0.03%) (1398.09%)
graphics.render.tests.fillRect,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3:
  base.9800: 30.83800000 (var=0.54%) (100.0%)
  fix.9800: 2218.189046 (var=0.0%) (7193.04%)
graphics.render.tests.fillRect,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=random:
  base.9800: 1635.320352 (var=0.0%) (100.0%)
  fix.9800: 1652.669746 (var=0.03%) (101.06%)
graphics.render.tests.fillRect,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=single:
  base.9800: 2341.066353 (var=0.03%) (100.0%)
  fix.9800: 2452.532001 (var=0.0%) (104.76%)
graphics.render.tests.fillRect,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2:
  base.9800: 40.62177554 (var=0.54%) (100.0%)
  fix.9800: 61.82412060 (var=0.54%) (152.19%)
graphics.render.tests.fillRect,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3:
  base.9800: 18.80799999 (var=0.0%) (100.0%)
  fix.9800: 61.82412060 (var=0.5%) (328.71%)
graphics.render.tests.fillRect,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=random:
  base.9800: 56.79932998 (var=0.54%) (100.0%)
  fix.9800: 60.06635388 (var=0.03%) (105.75%)
graphics.render.tests.fillRect,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=single:
  base.9800: 57.93937352 (var=0.54%) (100.0%)
  fix.9800: 61.85752778 (var=0.0%) (106.76%)
graphics.render.tests.fillRect,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2:
  base.9800: 76924.0 (var=0.0%) (100.0%)
  fix.9800: 981919.81036 (var=0.03%) (1276.48%)
graphics.render.tests.fillRect,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3:
  base.9800: 12262.28855 (var=0.53%) (100.0%)
  fix.9800: 109386.66666 (var=0.0%) (892.06%)
graphics.render.tests.fillRect,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=random:
  base.9800: 653234.71749 (var=0.03%) (100.0%)
  fix.9800: 673207.70553 (var=0.03%) (103.06%)
graphics.render.tests.fillRect,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=single:
  base.9800: 880836.70843 (var=0.0%) (100.0%)
  fix.9800: 990650.53908 (var=0.0%) (112.47%)
graphics.render.tests.fillRect,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2:
  base.9800: 11860.93333 (var=0.0%) (100.0%)
  fix.9800: 16248.39999 (var=0.0%) (136.99%)
graphics.render.tests.fillRect,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3:
  base.9800: 6308.800000 (var=0.0%) (100.0%)
  fix.9800: 16248.39999 (var=0.0%) (257.55%)
graphics.render.tests.fillRect,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=random:
  base.9800: 15007.90884 (var=0.54%) (100.0%)
  fix.9800: 15938.39999 (var=0.5%) (106.2%)
graphics.render.tests.fillRect,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=single:
  base.9800: 15208.39999 (var=0.54%) (100.0%)
  fix.9800: 16335.52278 (var=0.03%) (107.41%)
graphics.render.tests.fillRect,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2:
  base.9800: 2089979.89276 (var=0.03%) (100.0%)
  fix.9800: 2363687.5 (var=0.0%) (113.1%)
graphics.render.tests.fillRect,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3:
  base.9800: 127995.14075 (var=0.03%) (100.0%)
  fix.9800: 129416.66666 (var=0.0%) (101.11%)
graphics.render.tests.fillRect,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=random:
  base.9800: 2363687.5 (var=0.0%) (100.0%)
  fix.9800: 2363687.5 (var=0.0%) (100.0%)
graphics.render.tests.fillRect,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=single:
  base.9800: 2363687.5 (var=0.0%) (100.0%)
  fix.9800: 2364479.16666 (var=0.0%) (100.03%)
graphics.render.tests.fillRect,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2:
  base.9800: 100062.5 (var=0.0%) (100.0%)
  fix.9800: 327062.5 (var=0.54%) (326.86%)
graphics.render.tests.fillRect,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3:
  base.9800: 25418.76046 (var=0.54%) (100.0%)
  fix.9800: 107248.14690 (var=0.0%) (421.93%)
graphics.render.tests.fillRect,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=random:
  base.9800: 234479.16666 (var=0.54%) (100.0%)
  fix.9800: 327062.5 (var=0.0%) (139.48%)
graphics.render.tests.fillRect,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=single:
  base.9800: 235845.89614 (var=0.54%) (100.0%)
  fix.9800: 328812.5 (var=0.0%) (139.42%)
03-05-2007

EVALUATION (cont'd...)

text.Rendering.tests.drawString,text.opts.data.tlength=128,text.opts.font.fsize=13.0:
  base.9800: 2635.594988 (var=0.54%) (100.0%)
  fix.9800: 3128.503723 (var=0.54%) (118.7%)
text.Rendering.tests.drawString,text.opts.data.tlength=128,text.opts.font.fsize=36.0:
  base.9800: 271.5737574 (var=0.04%) (100.0%)
  fix.9800: 338.2261402 (var=0.04%) (124.54%)
text.Rendering.tests.drawString,text.opts.data.tlength=16,text.opts.font.fsize=13.0:
  base.9800: 2011.474530 (var=1.08%) (100.0%)
  fix.9800: 2797.095318 (var=0.54%) (139.06%)
text.Rendering.tests.drawString,text.opts.data.tlength=16,text.opts.font.fsize=36.0:
  base.9800: 247.4717494 (var=0.03%) (100.0%)
  fix.9800: 315.8496444 (var=0.0%) (127.63%)
text.Rendering.tests.drawString,text.opts.data.tlength=64,text.opts.font.fsize=13.0:
  base.9800: 2505.093833 (var=0.03%) (100.0%)
  fix.9800: 3026.371967 (var=0.0%) (120.81%)
text.Rendering.tests.drawString,text.opts.data.tlength=64,text.opts.font.fsize=36.0:
  base.9800: 269.1703435 (var=0.04%) (100.0%)
  fix.9800: 340.9086384 (var=0.04%) (126.65%)

Summary:
  base.9800:
    Number of tests: 78
    Overall average: 162964.87022159933
    Best spread: 0.0% variance
    Worst spread: 2.06% variance
    (Basis for results comparison)
  fix.9800:
    Number of tests: 78
    Overall average: 196744.76896701416
    Best spread: 0.0% variance
    Worst spread: 1.59% variance
    Comparison to basis:
      Best result: 7193.04% of basis
      Worst result: 86.12% of basis
      Number of wins: 60
      Number of ties: 16
      Number of losses: 2
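For readers unfamiliar with this output, the last parenthesized figure on each "fix" line is consistent with the fix score expressed as a percentage of the "base" score. The small Java sketch below reproduces that arithmetic; BasisComparisonSketch and percentOfBasis are hypothetical names chosen for illustration and are not part of J2DBench.

```java
/** Hypothetical helper reproducing the "% of basis" arithmetic in the results. */
public class BasisComparisonSketch {

    /** Returns the fix score as a percentage of the base score. */
    static double percentOfBasis(double base, double fix) {
        return fix / base * 100.0;
    }

    public static void main(String[] args) {
        // fillRect, sizes=1, antialias=false, paint=radial3 on Config #1
        double best = percentOfBasis(30.83800000, 2218.189046);
        // Overall averages from the Config #1 summary
        double overall = percentOfBasis(162964.87022159933, 196744.76896701416);
        System.out.printf("best: %.2f%% of basis%n", best);
        System.out.printf("overall average: %.2f%% of basis%n", overall);
    }
}
```

The first call corresponds to the "Best result" entry in the Config #1 summary, and the second gives a rough sense of the overall gain across all 78 tests.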
03-05-2007

EVALUATION
Config #2: Solaris 10, 2.0 GHz Opteron, Nvidia Quadro FX 1100, 97.76

Options common across all tests:
  text.opts.advopts.gvstyle=0
  text.opts.graphics.textaa=On
  text.opts.font.fname=Lucida Sans
  global.dest=VolatileImg
  text.opts.graphics.tfm=false
  graphics.opts.anim=2
  graphics.opts.renderhint=Default
  graphics.opts.clip=false
  text.opts.font.ftx=Identity
  text.opts.advopts.maptype=FONT
  text.opts.graphics.gtx=Identity
  text.opts.graphics.gaa=false
  graphics.opts.extraalpha=false
  graphics.render.opts.alphacolor=false
  graphics.opts.alpharule=SrcOver
  text.opts.advopts.tlruns=1
  text.opts.font.fstyle=0
  graphics.opts.xormode=false
  text.opts.data.tscript=english

graphics.render.tests.drawLine,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2:
  base.1100: 83.58235699 (var=1.04%) (100.0%)
  fix.1100: 3916.111827 (var=0.1%) (4685.33%)
graphics.render.tests.drawLine,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3:
  base.1100: 45.57173219 (var=0.11%) (100.0%)
  fix.1100: 3143.068006 (var=0.03%) (6896.97%)
graphics.render.tests.drawLine,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=random:
  base.1100: 2671.197310 (var=0.54%) (100.0%)
  fix.1100: 2427.155753 (var=0.45%) (90.86%)
graphics.render.tests.drawLine,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=single:
  base.1100: 5046.914728 (var=0.56%) (100.0%)
  fix.1100: 4925.314899 (var=1.21%) (97.59%)
graphics.render.tests.drawLine,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2:
  base.1100: 26.75960219 (var=0.38%) (100.0%)
  fix.1100: 44.57964748 (var=10.13%) (166.59%)
graphics.render.tests.drawLine,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3:
  base.1100: 20.86790540 (var=0.51%) (100.0%)
  fix.1100: 44.65165165 (var=0.1%) (213.97%)
graphics.render.tests.drawLine,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=random:
  base.1100: 37.81812102 (var=0.17%) (100.0%)
  fix.1100: 43.10896367 (var=0.13%) (113.99%)
graphics.render.tests.drawLine,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=single:
  base.1100: 38.32694355 (var=0.28%) (100.0%)
  fix.1100: 44.62287982 (var=0.29%) (116.43%)
graphics.render.tests.drawLine,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2:
  base.1100: 1662.588392 (var=0.33%) (100.0%)
  fix.1100: 80379.92695 (var=0.03%) (4834.63%)
graphics.render.tests.drawLine,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3:
  base.1100: 910.0765391 (var=0.33%) (100.0%)
  fix.1100: 14477.02269 (var=0.0%) (1590.75%)
graphics.render.tests.drawLine,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=random:
  base.1100: 52197.35432 (var=0.5%) (100.0%)
  fix.1100: 49382.04819 (var=0.54%) (94.61%)
graphics.render.tests.drawLine,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=single:
  base.1100: 84414.76126 (var=1.29%) (100.0%)
  fix.1100: 81337.62062 (var=0.04%) (96.35%)
graphics.render.tests.drawLine,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2:
  base.1100: 430.6057596 (var=0.23%) (100.0%)
  fix.1100: 643.3920851 (var=0.23%) (149.42%)
graphics.render.tests.drawLine,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3:
  base.1100: 352.6600000 (var=0.3%) (100.0%)
  fix.1100: 642.5806451 (var=0.2%) (182.21%)
graphics.render.tests.drawLine,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=random:
  base.1100: 571.2840984 (var=0.37%) (100.0%)
  fix.1100: 627.3131849 (var=0.27%) (109.81%)
graphics.render.tests.drawLine,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=single:
  base.1100: 578.4876748 (var=0.1%) (100.0%)
  fix.1100: 642.1645021 (var=0.13%) (111.01%)
graphics.render.tests.drawLine,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2:
  base.1100: 20750.58275 (var=0.03%) (100.0%)
  fix.1100: 479703.75626 (var=0.03%) (2311.76%)
graphics.render.tests.drawLine,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3:
  base.1100: 11350.83333 (var=0.2%) (100.0%)
  fix.1100: 15255.82944 (var=0.03%) (134.4%)
graphics.render.tests.drawLine,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=random:
  base.1100: 479577.54590 (var=9.52%) (100.0%)
  fix.1100: 479737.63782 (var=0.03%) (100.03%)
graphics.render.tests.drawLine,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=single:
  base.1100: 479703.75626 (var=0.03%) (100.0%)
  fix.1100: 479577.12854 (var=0.03%) (99.97%)
graphics.render.tests.drawLine,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2:
  base.1100: 269.1049913 (var=0.21%) (100.0%)
  fix.1100: 434.8603542 (var=0.96%) (161.6%)
graphics.render.tests.drawLine,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3:
  base.1100: 207.6417704 (var=10.83%) (100.0%)
  fix.1100: 347.1423740 (var=0.03%) (167.18%)
graphics.render.tests.drawLine,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=random:
  base.1100: 376.4102564 (var=0.24%) (100.0%)
  fix.1100: 433.6177474 (var=0.1%) (115.2%)
graphics.render.tests.drawLine,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=single:
  base.1100: 377.3890784 (var=0.17%) (100.0%)
  fix.1100: 433.4072305 (var=0.34%) (114.84%)
graphics.render.tests.fillOval,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2:
  base.1100: 82.98881355 (var=2.8%) (100.0%)
  fix.1100: 114.9562093 (var=1.62%) (138.52%)
graphics.render.tests.fillOval,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3:
  base.1100: 60.57806627 (var=0.34%) (100.0%)
  fix.1100: 114.5581629 (var=0.81%) (189.11%)
graphics.render.tests.fillOval,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=random:
  base.1100: 110.2422297 (var=0.17%) (100.0%)
  fix.1100: 112.7843530 (var=0.54%) (102.31%)
graphics.render.tests.fillOval,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=single:
  base.1100: 112.4370460 (var=0.61%) (100.0%)
  fix.1100: 115.7366609 (var=1.17%) (102.93%)
graphics.render.tests.fillOval,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2:
  base.1100: 30.37437437 (var=0.3%) (100.0%)
  fix.1100: 54.50133779 (var=0.44%) (179.43%)
graphics.render.tests.fillOval,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3:
  base.1100: 22.96699999 (var=0.2%) (100.0%)
  fix.1100: 54.42928452 (var=0.3%) (236.99%)
graphics.render.tests.fillOval,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=random:
  base.1100: 45.07395069 (var=0.3%) (100.0%)
  fix.1100: 52.74245939 (var=0.7%) (117.01%)
graphics.render.tests.fillOval,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=single:
  base.1100: 45.90027991 (var=0.14%) (100.0%)
  fix.1100: 54.48816155 (var=0.73%) (118.71%)
graphics.render.tests.fillOval,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2:
  base.1100: 11817.12717 (var=0.38%) (100.0%)
  fix.1100: 22863.47023 (var=0.82%) (193.48%)
graphics.render.tests.fillOval,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3: base.1100: 8530.442817 (var=1.16%) (100.0%) fix.1100: 21995.08483 (var=0.2%) (257.84%) graphics.render.tests.fillOval,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=random: base.1100: 21414.64981 (var=0.34%) (100.0%) fix.1100: 21832.82570 (var=1.01%) (101.95%) graphics.render.tests.fillOval,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=single: base.1100: 22144.24615 (var=0.94%) (100.0%) fix.1100: 22931.92927 (var=0.78%) (103.56%) graphics.render.tests.fillOval,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2: base.1100: 6979.320917 (var=9.65%) (100.0%) fix.1100: 10299.70173 (var=0.13%) (147.57%) graphics.render.tests.fillOval,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3: base.1100: 5628.546848 (var=0.07%) (100.0%) fix.1100: 10290.27039 (var=0.1%) (182.82%) graphics.render.tests.fillOval,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=random: base.1100: 9139.770207 (var=0.2%) (100.0%) fix.1100: 10151.72560 (var=10.31%) (111.07%) graphics.render.tests.fillOval,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=single: base.1100: 9270.546061 (var=0.4%) (100.0%) fix.1100: 10289.24900 (var=0.17%) (110.99%) graphics.render.tests.fillOval,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2: base.1100: 702983.04789 (var=0.61%) (100.0%) fix.1100: 841664.86430 (var=0.03%) (119.73%) graphics.render.tests.fillOval,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3: base.1100: 22824.30887 (var=0.0%) (100.0%) fix.1100: 22882.16938 (var=0.03%) (100.25%) 
graphics.render.tests.fillOval,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=random: base.1100: 794734.69032 (var=1.93%) (100.0%) fix.1100: 803147.93552 (var=0.31%) (101.06%) graphics.render.tests.fillOval,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=single: base.1100: 841789.48025 (var=0.07%) (100.0%) fix.1100: 841856.80561 (var=0.07%) (100.01%) graphics.render.tests.fillOval,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2: base.1100: 32866.09640 (var=0.37%) (100.0%) fix.1100: 67623.34965 (var=0.43%) (205.75%) graphics.render.tests.fillOval,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3: base.1100: 23199.99732 (var=0.6%) (100.0%) fix.1100: 30539.68637 (var=0.03%) (131.64%) graphics.render.tests.fillOval,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=random: base.1100: 57221.69829 (var=0.23%) (100.0%) fix.1100: 67484.36776 (var=0.23%) (117.93%) graphics.render.tests.fillOval,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=single: base.1100: 57290.04144 (var=0.1%) (100.0%) fix.1100: 67506.87225 (var=0.2%) (117.83%) graphics.render.tests.fillRect,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2: base.1100: 87.63894455 (var=1.38%) (100.0%) fix.1100: 3015.298709 (var=0.19%) (3440.59%) graphics.render.tests.fillRect,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3: base.1100: 47.07386363 (var=0.07%) (100.0%) fix.1100: 1571.556260 (var=0.0%) (3338.49%) graphics.render.tests.fillRect,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=random: base.1100: 2496.548548 (var=0.17%) (100.0%) fix.1100: 2150.119015 (var=0.13%) (86.12%) 
graphics.render.tests.fillRect,graphics.opts.sizes=1,graphics.render.opts.antialias=false,graphics.render.opts.paint=single: base.1100: 3733.265656 (var=0.13%) (100.0%) fix.1100: 3382.865214 (var=0.3%) (90.61%) graphics.render.tests.fillRect,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2: base.1100: 31.31862253 (var=0.17%) (100.0%) fix.1100: 58.59481037 (var=0.5%) (187.09%) graphics.render.tests.fillRect,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3: base.1100: 23.61730449 (var=0.2%) (100.0%) fix.1100: 58.05931080 (var=0.17%) (245.83%) graphics.render.tests.fillRect,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=random: base.1100: 48.23235685 (var=0.27%) (100.0%) fix.1100: 56.85880398 (var=0.17%) (117.89%) graphics.render.tests.fillRect,graphics.opts.sizes=1,graphics.render.opts.antialias=true,graphics.render.opts.paint=single: base.1100: 48.75929203 (var=0.21%) (100.0%) fix.1100: 58.46151104 (var=0.36%) (119.9%) graphics.render.tests.fillRect,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2: base.1100: 35049.96686 (var=0.63%) (100.0%) fix.1100: 1206426.93558 (var=0.0%) (3442.02%) graphics.render.tests.fillRect,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3: base.1100: 18802.66844 (var=0.37%) (100.0%) fix.1100: 38098.73248 (var=0.0%) (202.62%) graphics.render.tests.fillRect,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=random: base.1100: 951601.98675 (var=1.72%) (100.0%) fix.1100: 860638.36578 (var=0.19%) (90.44%) graphics.render.tests.fillRect,graphics.opts.sizes=20,graphics.render.opts.antialias=false,graphics.render.opts.paint=single: base.1100: 1376050.50234 (var=0.84%) (100.0%) fix.1100: 1216777.85737 (var=0.16%) (88.43%) 
graphics.render.tests.fillRect,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2: base.1100: 9985.165385 (var=0.37%) (100.0%) fix.1100: 15512.01602 (var=0.03%) (155.35%) graphics.render.tests.fillRect,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3: base.1100: 7906.422628 (var=0.33%) (100.0%) fix.1100: 15356.19425 (var=0.9%) (194.22%) graphics.render.tests.fillRect,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=random: base.1100: 13728.95086 (var=0.23%) (100.0%) fix.1100: 15269.91327 (var=0.2%) (111.22%) graphics.render.tests.fillRect,graphics.opts.sizes=20,graphics.render.opts.antialias=true,graphics.render.opts.paint=single: base.1100: 13851.08514 (var=0.1%) (100.0%) fix.1100: 15526.37545 (var=0.2%) (112.1%) graphics.render.tests.fillRect,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=gradient2: base.1100: 1532074.29048 (var=0.0%) (100.0%) fix.1100: 1666645.81247 (var=0.03%) (108.78%) graphics.render.tests.fillRect,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=radial3: base.1100: 45061.05720 (var=0.03%) (100.0%) fix.1100: 45249.24723 (var=0.03%) (100.42%) graphics.render.tests.fillRect,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=random: base.1100: 1673694.52786 (var=0.03%) (100.0%) fix.1100: 1673694.52786 (var=0.03%) (100.0%) graphics.render.tests.fillRect,graphics.opts.sizes=250,graphics.render.opts.antialias=false,graphics.render.opts.paint=single: base.1100: 1673694.52786 (var=0.03%) (100.0%) fix.1100: 1673694.52786 (var=0.0%) (100.0%) graphics.render.tests.fillRect,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=gradient2: base.1100: 65339.95990 (var=0.77%) (100.0%) fix.1100: 296165.72096 (var=0.43%) (453.27%) 
graphics.render.tests.fillRect,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=radial3: base.1100: 32065.76305 (var=0.03%) (100.0%) fix.1100: 37955.55177 (var=0.0%) (118.37%) graphics.render.tests.fillRect,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=random: base.1100: 236303.36891 (var=0.33%) (100.0%) fix.1100: 295475.43449 (var=0.2%) (125.04%) graphics.render.tests.fillRect,graphics.opts.sizes=250,graphics.render.opts.antialias=true,graphics.render.opts.paint=single: base.1100: 237294.18682 (var=0.34%) (100.0%) fix.1100: 295521.32303 (var=0.2%) (124.54%)
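As a reading aid for the results above: the last percentage on each result can be reproduced by dividing that score by the basis score (base.1100) and multiplying by 100, so values above 100% mean the fix build scored higher than the base build. The small Java sketch below only illustrates that arithmetic; the class and method names are made up for this example and are not part of the benchmark tooling.

```java
import java.util.Locale;

// Illustrative only: reproduces the "percent of basis" figure from two raw scores.
final class RelativeScore {
    /** Returns the score expressed as a percentage of the basis score. */
    static double percentOfBasis(double basis, double score) {
        return score / basis * 100.0;
    }

    public static void main(String[] args) {
        // drawLine, sizes=20, antialias=true, paint=single (scores copied from the results)
        double base = 578.4876748;
        double fix = 642.1645021;
        System.out.printf(Locale.ROOT, "fix is %.2f%% of base%n", percentOfBasis(base, fix));
    }
}
```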
03-05-2007

EVALUATION Sigh... Lots of experiments, but no clear solution thus far.

[All performance results in this part of the evaluation were taken on an Nvidia Quadro FX 1100, Solaris 10, 2x 2.0GHz Opteron, 2GB RAM. The numbers may vary on other platforms, but this data should be representative.]

First I tried to see how much impact the enqueuing code has on performance. Here are some relative results from commenting out certain phases of the RenderBuffer.put(byte[]) process:

graphics.render.tests.fillOval:
  1100.base: 57889.38067 (var=1.09%) (100.0%)
  1100.nomemcpy: 59515.27650 (var=0.44%) (102.81%)
  1100.nonative: 60198.08501 (var=1.32%) (103.99%)
  1100.noput: 60421.20839 (var=0.41%) (104.37%)

So even if RenderBuffer.put(byte[]) did absolutely nothing, we'd only see a 4% gain. It looks like we're not spending much time in this part of the code, so the Unsafe.copyMemory() enhancements mentioned earlier are unlikely to help us much.

Next experiment: where are we spending time on the OGL side?

Options common across all tests:
  graphics.opts.xormode=false
  graphics.render.opts.paint=single
  graphics.opts.renderhint=Default
  graphics.opts.alpharule=SrcOver
  graphics.opts.extraalpha=false
  graphics.render.opts.alphacolor=false
  global.dest=VolatileImg
  graphics.render.opts.antialias=true
  graphics.opts.sizes=250
  graphics.opts.clip=false
  graphics.opts.anim=2

graphics.render.tests.drawLine:
  1100.x11: 544.5535714 (var=1.61%) (100.0%)
  1100.base: 376.7876787 (var=1.29%) (69.19%)
  1100.align: 375.4134509 (var=1.48%) (68.94%)
  1100.notex: 607.7712609 (var=1.58%) (111.61%)
  1100.nobind: 660.3756043 (var=2.38%) (121.27%)
  1100.pbo: 220.5935251 (var=1.12%) (40.51%)

graphics.render.tests.fillOval:
  1100.x11: 47272.04647 (var=1.39%) (100.0%)
  1100.base: 57889.38067 (var=1.09%) (122.46%)
  1100.align: 57537.63123 (var=1.03%) (121.72%)
  1100.notex: 87988.36304 (var=1.17%) (186.13%)
  1100.nobind: 92778.33252 (var=1.25%) (196.26%)
  1100.pbo: 42446.40652 (var=0.82%) (89.79%)

As you can see from these numbers, the OGL pipeline today is about 30% slower than software/X11 for AA drawLines, but about 20% faster for AA fillOvals.

The "align" numbers show the benefit from using UNPACK_ALIGNMENT values of 4 or 8 (instead of 1, which we use now in all cases). Looks like little to be gained here.

The "notex" numbers show how much improvement we'd get from skipping the call to glTexSubImage2D() entirely (which we use to upload the alpha mask into the cached mask texture). Clearly this is where we're spending the bulk of our time. If this call were completely free, then we might have a shot at beating the X11 numbers, but obviously it'll never be free. I don't know of any way to make this method go any faster (assuming the drivers are already as fast as they can possibly be)...

The "nobind" numbers show how much improvement we'd get from avoiding calls to glBindTexture() when the texture hasn't changed since last time. This would show a small gain, but it's pretty tricky to get right. (I explored this a couple years ago, but it would complicate the code quite a bit for only a minor performance win. Maybe we could reconsider later.)

The "pbo" numbers were from an experiment that used the GL_ARB_pixel_buffer_object (PBO) extension as a way to speed up the calls to glTexSubImage2D(). Not surprisingly, this only added more copying and overhead (the extension is meant more for streaming situations, but the OGLMaskFill operation needs more synchronous processing, so PBO's aren't a good fit).

Still more experimentation required...
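To make the "nobind" idea concrete, here is a minimal Java sketch of a cache that skips a bind when the requested texture is already the current one. It is purely illustrative and is not the Java 2D implementation: the class name TextureBindCache is hypothetical, and the IntConsumer passed in is a stand-in for whatever code would actually call glBindTexture(). The invalidate() method hints at why this is tricky to get right: any code path that changes the binding behind the cache's back has to reset it.

```java
import java.util.function.IntConsumer;

// Hypothetical sketch of redundant-bind elimination; not the actual OGL pipeline code.
final class TextureBindCache {
    private final IntConsumer bindCall; // stand-in for a call to glBindTexture()
    private int boundTexture;           // last texture id passed to bindCall
    private boolean valid;              // false until we know what is bound

    TextureBindCache(IntConsumer bindCall) {
        this.bindCall = bindCall;
    }

    /** Binds the texture only if it differs from the last one bound through this cache. */
    void bind(int textureId) {
        if (valid && textureId == boundTexture) {
            return; // same texture as last time: skip the bind call
        }
        bindCall.accept(textureId);
        boundTexture = textureId;
        valid = true;
    }

    /** Must be called whenever other code may have changed the binding. */
    void invalidate() {
        valid = false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        TextureBindCache cache = new TextureBindCache(id -> calls[0]++);
        cache.bind(5);
        cache.bind(5); // skipped
        cache.bind(7);
        System.out.println("bind calls issued: " + calls[0]);
    }
}
```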
07-03-2007

EVALUATION Most people probably wonder why we don't take advantage of OpenGL's built-in line antialiasing. From what we've seen in the past, visual quality of AA lines in OpenGL varies wildly depending on the hardware vendor. It doesn't seem to be something we can completely rely on. (I suppose we could consider using hardware smoothed lines if the developer explicitly sets a RenderingHint like RENDER_SPEED, but that should probably be seen as a last resort. There are other things we could do to improve performance of our existing code first.)

Another possibility is to use texturing tricks or fragment shaders to antialias primitives on the fly, but these techniques tend to be too heavyweight and are more experimental. The same can be said for using OpenGL's multisampling (aka FSAA) capabilities; I've done experiments in the past and have found that the quality of hardware AA still doesn't come close to what we've come to expect from our software rasterizer. These techniques could be considered in the future, but again, we should try to squeeze the last bit of performance out of our existing code first.

As of JDK 6, antialiasing performance comes down to two distinct steps. We rely on Ductus to produce a 32x32 tile containing 8-bit alpha/coverage values. In the new STR architecture we enqueue those byte values onto the RenderBuffer. When it comes time to process the queue, we read those byte values off the queue into an OpenGL GL_INTENSITY texture, modulate those values with the current OpenGL color state, et voila, we produce (part of) an antialiased primitive on the destination. The reason this can be faster than the same operation performed to a software surface is that we use OpenGL for that last modulation and compositing step, which is performed in hardware.

The first potential bottleneck is in copying those 1024 byte values onto the queue. Currently the RenderBuffer.put(byte[]) has to go down to native code in order to efficiently copy the byte values onto the queue. This means one JNI downcall per tile, and JNI downcalls are something we like to avoid in our STR world. The HotSpot team is working on a new Unsafe.copyMemory() method that offers great performance and allows us to copy a byte[] to the RenderBuffer's memory without going down through JNI. Hopefully that alone will buy us some performance improvement, but it will likely be a few more weeks before that new method is available to us.

The second potential problem is that OGLMaskFill performance seems to be lower than I would expect, especially on the Nvidia GeForce 7xxx series. I don't have any details on this yet; more investigation required.
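The two steps described above can be sketched in plain Java. This is only a software illustration under stated assumptions, not the pipeline's code: a direct java.nio.ByteBuffer stands in for the RenderBuffer queue, the class and method names (MaskTileSketch, enqueue, blend) are invented for the example, and the blend function is a simple per-channel SrcOver of an opaque color weighted by 8-bit coverage, which is roughly the work the text says is handed to OpenGL.

```java
import java.nio.ByteBuffer;

// Illustrative sketch: enqueue a 32x32 coverage tile, then modulate a color by coverage.
final class MaskTileSketch {
    static final int TILE = 32; // tile is TILE x TILE, one coverage byte per pixel

    /** Copies one tile of coverage bytes onto the queue (stand-in for RenderBuffer.put(byte[])). */
    static void enqueue(ByteBuffer queue, byte[] tile) {
        queue.put(tile); // bulk copy of all 1024 bytes
    }

    /** Blends one 0..255 color channel over dst; coverage 0 keeps dst, 255 yields src. */
    static int blend(int src, int dst, int coverage) {
        return (src * coverage + dst * (255 - coverage) + 127) / 255;
    }

    public static void main(String[] args) {
        byte[] tile = new byte[TILE * TILE];
        java.util.Arrays.fill(tile, (byte) 128); // roughly half-covered everywhere

        ByteBuffer queue = ByteBuffer.allocateDirect(TILE * TILE);
        enqueue(queue, tile);
        queue.flip(); // switch the buffer from writing to reading

        int coverage = queue.get() & 0xFF; // bytes are signed in Java, so mask to 0..255
        System.out.println("white over black at coverage " + coverage
                + " -> " + blend(255, 0, coverage));
    }
}
```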
27-02-2007