JDK-8097926 : Pisces Renderer shows huge performance win when coded in C
  • Type: Enhancement
  • Component: javafx
  • Sub-Component: graphics
  • Affected Version: 7u6
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2012-06-11
  • Updated: 2015-06-12
  • Resolved: 2012-08-08
  • Fix Version: JDK 8 (Fixed)
Description
I wanted to compare the performance of Java code versus C code for the FX Pisces Renderer class (in particular, the inner class ScanlineIterator).  To do this, I developed a version of ScanlineIterator (called ScanlineIterator2) that would compile under both C and Java and changed Renderer to use this class.  This allowed me to very quickly get working Java and C code for the comparison.

The win is huge.  Here are the results of VectorTest running at full speed (the second run has -DNATIVE_ALPHA to enable the C code; both runs print the total time spent in the method every 100 times it is called):

**** SLOW:

chart width: 1200.0
chart height: 600.0
NATIVE_ALPHA: false
Count=1, Time=39
Count=101, Time=321
Count=201, Time=594
Count=301, Time=867
Count=401, Time=1148
FPS: 22.60
Count=501, Time=1418
Count=601, Time=1696
Count=701, Time=1979
Count=801, Time=2263
Count=901, Time=2549
Count=1001, Time=2832
FPS: 28.77
Count=1101, Time=3113
Count=1201, Time=3394
Count=1301, Time=3676
Count=1401, Time=3962
Count=1501, Time=4242
Count=1601, Time=4523
FPS: 28.29
Count=1701, Time=4811
Count=1801, Time=5092
Count=1901, Time=5377
Count=2001, Time=5662
Count=2101, Time=5945
FPS: 28.57
Count=2201, Time=6230
Count=2301, Time=6520
Count=2401, Time=6804
Count=2501, Time=7085
Count=2601, Time=7365
Count=2701, Time=7632
FPS: 28.45
Count=2801, Time=7902
Count=2901, Time=8189
Count=3001, Time=8495
Count=3101, Time=8788
Count=3201, Time=9081
Count=3301, Time=9375
FPS: 27.63
Count=3401, Time=9668
Count=3501, Time=9966
Count=3601, Time=10263
Count=3701, Time=10569
Count=3801, Time=10866
Count=3901, Time=11161
FPS: 28.28
Count=4001, Time=11454
Count=4101, Time=11743
Count=4201, Time=12034
Count=4301, Time=12335
Count=4401, Time=12640
FPS: 28.43
Count=4501, Time=12933
Count=4601, Time=13225
Count=4701, Time=13523
Count=4801, Time=13820
Count=4901, Time=14120
Count=5001, Time=14422
FPS: 28.66
Count=5101, Time=14722
Count=5201, Time=15021
Count=5301, Time=15317
Count=5401, Time=15618
Count=5501, Time=15911
Count=5601, Time=16206
FPS: 28.89
Count=5701, Time=16507
Count=5801, Time=16789
Count=5901, Time=17087
Count=6001, Time=17380
Count=6101, Time=17676
FPS: 29.28
Count=6201, Time=17974
Count=6301, Time=18261
Count=6401, Time=18559
Count=6501, Time=18862
Count=6601, Time=19158

***** FAST:

chart width: 1200.0
chart height: 600.0
NATIVE_ALPHA: true
Count=1, Time=3
Count=101, Time=203
Count=201, Time=400
Count=301, Time=600
Count=401, Time=795
Count=501, Time=995
Count=601, Time=1192
FPS: 30.89
Count=701, Time=1392
Count=801, Time=1590
Count=901, Time=1782
Count=1001, Time=1980
Count=1101, Time=2181
Count=1201, Time=2379
Count=1301, Time=2577
Count=1401, Time=2774
FPS: 38.40
Count=1501, Time=2974
Count=1601, Time=3167
Count=1701, Time=3361
Count=1801, Time=3557
Count=1901, Time=3745
Count=2001, Time=3950
Count=2101, Time=4139
Count=2201, Time=4336
FPS: 39.40
Count=2301, Time=4529
Count=2401, Time=4725
Count=2501, Time=4916
Count=2601, Time=5108
Count=2701, Time=5304
Count=2801, Time=5502
Count=2901, Time=5700
Count=3001, Time=5889
FPS: 40.32
Count=3101, Time=6087
Count=3201, Time=6283
Count=3301, Time=6479
Count=3401, Time=6675
Count=3501, Time=6873
Count=3601, Time=7076
Count=3701, Time=7273
Count=3801, Time=7466
FPS: 40.36
Count=3901, Time=7660
Count=4001, Time=7858
Count=4101, Time=8057
Count=4201, Time=8264
Count=4301, Time=8460
Count=4401, Time=8668
Count=4501, Time=8860
Count=4601, Time=9055
FPS: 40.20
Count=4701, Time=9270
Count=4801, Time=9463
Count=4901, Time=9666
Count=5001, Time=9866
Count=5101, Time=10062
Count=5201, Time=10266
Count=5301, Time=10467
Count=5401, Time=10659
FPS: 39.89
Count=5501, Time=10857
Count=5601, Time=11060
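
To put a number on the difference, the two logs above can be compared at the last call count they share (Count=5601). The following is a minimal sketch, not part of the original test: the class and method names are hypothetical, and it assumes the Time values are cumulative and in the same unit in both runs (the log does not state the unit).

```java
// Sketch: derive per-call cost and speedup from the cumulative log samples above.
// Class and method names are illustrative; they do not come from VectorTest.
final class PiscesSpeedup {
    /** Average time per call, given the cumulative time at a given call count. */
    static double timePerCall(long totalTime, long count) {
        return (double) totalTime / count;
    }

    /** Ratio of Java time to native time at the same call count (> 1 means native is faster). */
    static double speedup(long javaTime, long nativeTime) {
        return (double) javaTime / nativeTime;
    }

    public static void main(String[] args) {
        long count = 5601;         // last call count present in both runs
        long javaTime = 16206;     // SLOW run, Count=5601
        long nativeTime = 11060;   // FAST run, Count=5601

        System.out.printf("Java:    %.3f per call%n", timePerCall(javaTime, count));
        System.out.printf("Native:  %.3f per call%n", timePerCall(nativeTime, count));
        System.out.printf("Speedup: %.2fx%n", speedup(javaTime, nativeTime));
    }
}
```

The ratio comes out a little under 1.5x, which is in line with the steady-state FPS figures in the logs (roughly 28 versus 40).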
Comments
Closing as verified using a set of benchmarks.
22-01-2014

This fix was checked by testing the performance of various Path benchmarks, especially on embedded. It looks like Ekaterina used something in Aurora, and the GUIMark2 Vector test run in benchmark mode should also confirm the performance boost. The native renderer can be turned off and on using -Dprism.nativepisces={true,false} and the performance seen without specifying anything should match the faster of those two options (which should usually be "true" except on some flavors of Linux).
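
As an illustration of the switch described above, here is a minimal sketch of how a boolean system property with a platform-dependent default can be resolved. Only the property name prism.nativepisces comes from this report; the class, the method, and the platformDefault parameter are hypothetical, and this is not claimed to be how Prism itself reads the setting.

```java
// Sketch: resolve an on/off system property that falls back to a platform default.
// NativePiscesSwitch and useNative are illustrative names, not Prism APIs.
final class NativePiscesSwitch {
    static final String PROPERTY = "prism.nativepisces";

    /**
     * Returns the explicit setting if the property was given,
     * otherwise the default chosen for the current platform.
     */
    static boolean useNative(String propertyValue, boolean platformDefault) {
        if (propertyValue == null) {
            return platformDefault;   // nothing specified on the command line
        }
        // parseBoolean is true only for "true" (case-insensitive)
        return Boolean.parseBoolean(propertyValue.trim());
    }

    public static void main(String[] args) {
        // Assumption for this demo only: treat "true" as the platform default.
        boolean platformDefault = true;
        boolean enabled = useNative(System.getProperty(PROPERTY), platformDefault);
        System.out.println(PROPERTY + " resolved to " + enabled);
    }
}
```

Running the demo with -Dprism.nativepisces=false on the java command line would print false; running it with no property set prints the assumed default.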
02-01-2014

According to Aurora, this change produces the following performance improvements on WinXP mid-range machines:
  • Charts.Bubble: 26.72 fps, +20% (+4.49 fps)
  • Charts.Stock: 19.44 fps, +33% (+4.82 fps)
  • FXFire.Path: 28,819.33 objects, +31% (+6,893.67 objects)
  • GUIMark2.Vector: 30.03 fps, +23% (+5.65 fps)
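
The percentages and absolute deltas in the Aurora figures can be cross-checked against each other. The sketch below assumes that each reported score is the value after the change and that the delta in parentheses is the absolute gain over the earlier build; Aurora's exact reporting convention is not stated in this report, and the class and method names are hypothetical.

```java
// Sketch: recompute the relative gain from a post-change score and its absolute delta.
// Assumes score = after, delta = after - before (an interpretation, not documented here).
final class AuroraGain {
    /** Percentage gain relative to the pre-change score. */
    static double percentGain(double after, double delta) {
        double before = after - delta;
        return 100.0 * delta / before;
    }

    public static void main(String[] args) {
        String[] names  = {"Charts.Bubble", "Charts.Stock", "FXFire.Path", "GUIMark2.Vector"};
        double[] after  = {26.72, 19.44, 28819.33, 30.03};
        double[] deltas = {4.49, 4.82, 6893.67, 5.65};

        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-16s +%d%%%n", names[i],
                    Math.round(percentGain(after[i], deltas[i])));
        }
    }
}
```

Under that interpretation the rounded results match the reported +20%, +33%, +31%, and +23%.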
09-08-2012

For now this is enabled on all platforms except for non-embedded Linux pending the resolution of RT-24104.
08-08-2012

Fixed with changeset:
changeset: 16807:836a8a208426
date: Wed Aug 08 14:21:07 2012 -0700
summary: Fix RT-22244: Pisces rasterizer runs faster in native code.
Changeset includes a minor unit test in prism-common\test\com\sun\prism\impl\shape\NativePiscesRasterizerTest.java
08-08-2012

Using the native code does not seem to have much of an effect on this case.
14-06-2012

Attached a test program that runs much faster on j2d pipeline than on d3d pipeline. This can be used as additional perf test.
13-06-2012

It is also interesting to compare ScanlineIterator2 (the one coded in Java) to the original inner class version.
**** SLOW (but faster than inner class)
chart width: 1200.0
chart height: 600.0
NATIVE_ALPHA: true
Count=1, Time=36
Count=101, Time=307
Count=201, Time=579
Count=301, Time=850
Count=401, Time=1126
Count=501, Time=1397
FPS: 22.96
Count=601, Time=1669
Count=701, Time=1943
Count=801, Time=2217
Count=901, Time=2494
Count=1001, Time=2778
FPS: 28.90
Count=1101, Time=3060
Count=1201, Time=3335
Count=1301, Time=3615
Count=1401, Time=3886
Count=1501, Time=4158
Count=1601, Time=4428
FPS: 29.17
Count=1701, Time=4700
Count=1801, Time=4978
Count=1901, Time=5261
Count=2001, Time=5530
Count=2101, Time=5802
Count=2201, Time=6082
FPS: 29.46
Count=2301, Time=6351
Count=2401, Time=6622
Count=2501, Time=6906
Count=2601, Time=7168
Count=2701, Time=7431
Count=2801, Time=7705
FPS: 29.97
Count=2901, Time=7965
Count=3001, Time=8230
Count=3101, Time=8491
Count=3201, Time=8753
Count=3301, Time=9021
Count=3401, Time=9283
FPS: 31.23
Count=3501, Time=9549
Count=3601, Time=9811
Count=3701, Time=10075
Count=3801, Time=10351
Count=3901, Time=10624
Count=4001, Time=10888
FPS: 30.14
Count=4101, Time=11160
Count=4201, Time=11416
Count=4301, Time=11682
Count=4401, Time=11950
Count=4501, Time=12208
Count=4601, Time=12470
Count=4701, Time=12739
FPS: 31.61
Count=4801, Time=13002
Count=4901, Time=13276
Count=5001, Time=13544
Count=5101, Time=13802
Count=5201, Time=14069
Count=5301, Time=14330
FPS: 31.33
Count=5401, Time=14606
Count=5501, Time=14869
Count=5601, Time=15139
Count=5701, Time=15420
Count=5801, Time=15683
Count=5901, Time=15950
FPS: 30.59
Count=6001, Time=16211
Count=6101, Time=16477
Count=6201, Time=16749
Count=6301, Time=17021
Count=6401, Time=17291
Count=6501, Time=17554
FPS: 30.86
Count=6601, Time=17822
Count=6701, Time=18089
Count=6801, Time=18354
Count=6901, Time=18625
Count=7001, Time=18893
Count=7101, Time=19158
Count=7201, Time=19432
FPS: 30.98
Count=7301, Time=19698
Count=7401, Time=19974
Count=7501, Time=20245
11-06-2012