EVALUATION
As mentioned in the evaluation above, we can't simply empty the finalize()
method at the ImageInputStreamImpl level, since third-party IISI subclasses
may rely on the finalize() method calling their close() method as per the spec.
But for ImageInputStreamImpl subclasses under our control, we can empty their
finalize() methods and use the Java2D Disposer mechanism instead. A few of
them are easy (from the javax.imageio.stream package):
FileImageInputStream
FileImageOutputStream
FileCacheImageInputStream
MemoryCacheImageInputStream
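To illustrate the idea behind a Disposer-style mechanism, here is a minimal sketch built on PhantomReference and ReferenceQueue. It is not the actual Java2D Disposer source; the names DisposerSketch, Record, addRecord and drain are made up for illustration, and a real implementation would more likely drain the queue on a dedicated daemon thread than by polling.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a Disposer-style helper (not the real Java2D class).
final class DisposerSketch {
    /** Cleanup action; it must NOT hold a reference to the owner object. */
    interface Record { void dispose(); }

    private static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();
    // Keeps the phantom references themselves strongly reachable until disposal.
    private static final Set<Entry> LIVE = ConcurrentHashMap.newKeySet();

    private static final class Entry extends PhantomReference<Object> {
        final Record record;
        Entry(Object owner, Record record) {
            super(owner, QUEUE);
            this.record = record;
        }
    }

    /** Registers a cleanup action to run once 'owner' becomes unreachable. */
    static void addRecord(Object owner, Record record) {
        LIVE.add(new Entry(owner, record));
    }

    /** Runs the cleanup for every owner the GC has enqueued; returns the count. */
    static int drain() {
        int disposed = 0;
        Reference<?> ref;
        while ((ref = QUEUE.poll()) != null) {
            Entry entry = (Entry) ref;
            if (LIVE.remove(entry)) {
                entry.record.dispose();
                disposed++;
            }
        }
        return disposed;
    }

    public static void main(String[] args) throws InterruptedException {
        Object stream = new Object(); // stands in for a stream object
        addRecord(stream, () -> System.out.println("native handle released"));
        stream = null;                // drop the only strong reference
        for (int i = 0; i < 50 && drain() == 0; i++) {
            System.gc();              // only a hint; collection is not guaranteed
            Thread.sleep(20);
        }
    }
}
```

The key design point is that the cleanup record holds only the native resource (for example, the file handle), never the stream itself, so the stream can be reclaimed in a single GC cycle without passing through the finalizer queue.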
Unfortunately, we cannot easily apply this technique to the two remaining
public stream classes that ship with the JDK:
FileCacheImageOutputStream
MemoryCacheImageOutputStream
In these two cases, there is some complex logic in the close() method that
ensures that any remaining cached data is written to the underlying output
stream before the ImageOutputStream is disposed itself. With some major
reworking we could probably find a way to get the Disposer mechanism to
handle this work, but it will be messy, so for now I suggest we just forget
about having an empty finalize() method in these two subclasses.
Interestingly, there are other subclasses of Image{Input,Output}StreamImpl
in our implementation that I had not previously considered:
com.sun.imageio.plugins.common.SubImageInputStream
com.sun.imageio.plugins.png.PNGImageWriter (ChunkStream and IDATOutputStream)
These classes are completely under our control and do not require any
explicit disposal, so we can simply add an empty finalize() method to these
classes.
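As a rough sketch of what "adding an empty finalize() method" looks like for a subclass that holds no native resources, consider the following. InMemoryStreamSketch and its sumBytes helper are hypothetical and are not one of the classes named above; the benefit assumes that the VM does not register objects for finalization when their class's finalize() body is empty, which is what the approach described here relies on.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.stream.ImageInputStreamImpl;

// Hypothetical stream over a byte array; it owns nothing that needs disposal.
class InMemoryStreamSketch extends ImageInputStreamImpl {
    private final byte[] data;

    InMemoryStreamSketch(byte[] data) {
        this.data = data.clone();
    }

    @Override
    public int read() throws IOException {
        checkClosed();
        bitOffset = 0;
        return streamPos < data.length ? (data[(int) streamPos++] & 0xff) : -1;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        checkClosed();
        bitOffset = 0;
        if (len == 0) {
            return 0;
        }
        int remaining = data.length - (int) streamPos;
        if (remaining <= 0) {
            return -1;
        }
        int n = Math.min(len, remaining);
        System.arraycopy(data, (int) streamPos, b, off, n);
        streamPos += n;
        return n;
    }

    // Intentionally empty: overrides the superclass finalize(), which would
    // otherwise call close(). Nothing here needs to be released at GC time.
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
    }

    /** Small helper for demonstration: sums all bytes read from a new stream. */
    static int sumBytes(byte[] data) {
        try {
            InMemoryStreamSketch in = new InMemoryStreamSketch(data);
            int sum = 0;
            for (int b = in.read(); b != -1; b = in.read()) {
                sum += b;
            }
            in.close();
            return sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```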
There are some substantial performance gains associated with these
changes, especially for small images where creating and disposing
ImageInputStreams accounts for much of the overhead. Most of the benefit
comes from more prompt disposal of native resources (e.g. RandomAccessFile
handles) via the Disposer mechanism, instead of waiting around for finalizers
to be run. This means smaller, less-frequent GCs are required overall.
For most of my J2DBench test runs, I've used large heap settings to help minimize
the cost of GC pauses in the results, but as you can see, even with a large
heap there will still be lots of pauses from GC due to the cost of
finalization (prior to my changes). The results I report here will look even
better with smaller/default heap sizes (for example, if you see a performance
improvement of 40% with a large heap size, you might see a performance
improvement of 3x with a smaller heap size with these changes, since fewer
full GCs are required now).
For example, here are some J2DBench results for constructing and closing
ImageInputStreams in a tight loop. These results were taken with
solaris-sparc on SB2000, 900 MHz USIII, 1GB RAM, -client, -Xms256M -Xmx512M,
"base" is the 2005-11-19.mustang build, "test" contains the changes outlined
above:
Options common across all tests:
testname=imageio.input.stream.tests.construct
imageio.opts.size=1
imageio.opts.content=blank
imageio.input.opts.imageio.useCache=false
imageio.input.opts.general.source=byteArray:
base: 10.26186612 (var=1.96%) (100.0%)
test: 13.51616839 (var=3.39%) (131.71%)
imageio.input.opts.general.source=file:
base: 11.54184630 (var=2.51%) (100.0%)
test: 16.06037735 (var=2.72%) (139.15%)
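For readers unfamiliar with the "construct" test, the loop being measured is roughly of the following shape. This is only a sketch using the public ImageIO API, not the J2DBench source; the class name StreamConstructLoop, the 16-byte dummy buffer and the iteration count are assumptions chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;
import javax.imageio.stream.ImageInputStream;

// Hypothetical micro-loop: create and close an ImageInputStream per iteration.
class StreamConstructLoop {
    /** Returns the number of streams that were created and closed. */
    static int constructAndClose(byte[] data, int iterations) {
        int created = 0;
        try {
            for (int i = 0; i < iterations; i++) {
                ImageInputStream iis =
                    ImageIO.createImageInputStream(new ByteArrayInputStream(data));
                if (iis == null) {
                    throw new IOException("no ImageInputStream provider for this source");
                }
                iis.close();
                created++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return created;
    }

    public static void main(String[] args) {
        ImageIO.setUseCache(false); // corresponds to useCache=false above
        int iterations = 100_000;   // arbitrary count for this sketch
        long start = System.nanoTime();
        int n = constructAndClose(new byte[16], iterations);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.println(n + " streams in " + seconds + " s");
    }
}
```

Because each iteration allocates a short-lived stream object, any per-object finalization cost shows up directly in a loop like this one.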
Here are some more J2DBench results, this time measuring performance of
ImageReader.read() in a tight loop. In this case, we construct and close a
new ImageInputStream for each iteration. These results were taken with
solaris-i586 on W2100z, 2.0 GHz Opteron, 2GB RAM, -client, -Xms256M -Xmx512M,
"base" is the 2005-11-19.mustang build, "test" contains the changes outlined
above:
Options common across all tests:
testname=imageio.input.image.imageio.reader.tests.read
imageio.input.image.imageio.reader.opts.ignoreMetadata=true
imageio.input.image.imageio.reader.opts.installListener=false
imageio.input.image.imageio.reader.opts.seekForwardOnly=true
imageio.opts.content=blank
imageio.input.image.imageio.opts.format=core-jpg,imageio.input.opts.general.source=byteArray,imageio.input.opts.imageio.useCache=false,imageio.opts.size=1:
base: 9.386375088 (var=1.38%) (100.0%)
test: 11.34964176 (var=0.62%) (120.92%)
imageio.input.image.imageio.opts.format=core-jpg,imageio.input.opts.general.source=byteArray,imageio.input.opts.imageio.useCache=false,imageio.opts.size=20:
base: 2369.359445 (var=0.75%) (100.0%)
test: 2788.373506 (var=0.8%) (117.68%)
imageio.input.image.imageio.opts.format=core-jpg,imageio.input.opts.general.source=byteArray,imageio.input.opts.imageio.useCache=true,imageio.opts.size=1:
base: 3.650076962 (var=0.42%) (100.0%)
test: 4.027091058 (var=0.4%) (110.33%)
imageio.input.image.imageio.opts.format=core-jpg,imageio.input.opts.general.source=byteArray,imageio.input.opts.imageio.useCache=true,imageio.opts.size=20:
base: 1205.155702 (var=0.43%) (100.0%)
test: 1306.425452 (var=0.8%) (108.4%)
imageio.input.image.imageio.opts.format=core-jpg,imageio.input.opts.general.source=file,imageio.input.opts.imageio.useCache=false,imageio.opts.size=1:
base: 9.402652627 (var=0.34%) (100.0%)
test: 11.49206186 (var=0.87%) (122.22%)
imageio.input.image.imageio.opts.format=core-jpg,imageio.input.opts.general.source=file,imageio.input.opts.imageio.useCache=false,imageio.opts.size=20:
base: 2365.254495 (var=0.34%) (100.0%)
test: 2761.758152 (var=0.37%) (116.76%)
imageio.input.image.imageio.opts.format=core-jpg,imageio.input.opts.general.source=file,imageio.input.opts.imageio.useCache=true,imageio.opts.size=1:
base: 9.327962944 (var=0.55%) (100.0%)
test: 11.36734489 (var=0.92%) (121.86%)
imageio.input.image.imageio.opts.format=core-jpg,imageio.input.opts.general.source=file,imageio.input.opts.imageio.useCache=true,imageio.opts.size=20:
base: 2391.495451 (var=0.82%) (100.0%)
test: 2749.798224 (var=0.12%) (114.98%)
imageio.input.image.imageio.opts.format=core-png,imageio.input.opts.general.source=byteArray,imageio.input.opts.imageio.useCache=false,imageio.opts.size=1:
base: 7.070189840 (var=0.53%) (100.0%)
test: 8.279342487 (var=1.22%) (117.1%)
imageio.input.image.imageio.opts.format=core-png,imageio.input.opts.general.source=byteArray,imageio.input.opts.imageio.useCache=false,imageio.opts.size=20:
base: 2471.124744 (var=0.71%) (100.0%)
test: 2852.417970 (var=0.56%) (115.43%)
imageio.input.image.imageio.opts.format=core-png,imageio.input.opts.general.source=byteArray,imageio.input.opts.imageio.useCache=true,imageio.opts.size=1:
base: 2.924363636 (var=0.33%) (100.0%)
test: 3.168499688 (var=0.28%) (108.35%)
imageio.input.image.imageio.opts.format=core-png,imageio.input.opts.general.source=byteArray,imageio.input.opts.imageio.useCache=true,imageio.opts.size=20:
base: 1101.486641 (var=0.83%) (100.0%)
test: 1185.383022 (var=0.54%) (107.62%)
imageio.input.image.imageio.opts.format=core-png,imageio.input.opts.general.source=file,imageio.input.opts.imageio.useCache=false,imageio.opts.size=1:
base: 6.617219917 (var=0.44%) (100.0%)
test: 7.626905644 (var=0.88%) (115.26%)
imageio.input.image.imageio.opts.format=core-png,imageio.input.opts.general.source=file,imageio.input.opts.imageio.useCache=false,imageio.opts.size=20:
base: 2307.011070 (var=0.72%) (100.0%)
test: 2639.233097 (var=1.09%) (114.4%)
imageio.input.image.imageio.opts.format=core-png,imageio.input.opts.general.source=file,imageio.input.opts.imageio.useCache=true,imageio.opts.size=1:
base: 6.542915185 (var=0.69%) (100.0%)
test: 7.696662270 (var=1.27%) (117.63%)
imageio.input.image.imageio.opts.format=core-png,imageio.input.opts.general.source=file,imageio.input.opts.imageio.useCache=true,imageio.opts.size=20:
base: 2296.733617 (var=0.96%) (100.0%)
test: 2658.816412 (var=1.23%) (115.77%)
So these results show a roughly 8-22% performance improvement in reading
smallish (1x1 and 20x20) images. As described earlier, if we use default heap
sizes, the performance gain becomes much more apparent. The same holds in a
server-side situation, where full GCs can bring the server to
a crawl, so the changes suggested in this bug report should help reduce
expensive full GCs in heavily loaded server environments. Here are some
J2DBench results for windows-i586 showing this effect, taken on a 2x 2.8 GHz P4,
1GB RAM, -client, default heap settings:
Options common across all tests:
testname=imageio.input.image.imageio.reader.tests.read
imageio.input.image.imageio.reader.opts.ignoreMetadata=true
imageio.input.image.imageio.reader.opts.installListener=false
imageio.input.image.imageio.reader.opts.seekForwardOnly=true
imageio.opts.content=blank
imageio.input.image.imageio.opts.format=core-jpg,imageio.input.opts.general.source=byteArray,imageio.input.opts.imageio.useCache=false,imageio.opts.size=1:
base: 2.843131123 (var=0.01%) (100.0%)
test: 9.026369168 (var=0.31%) (317.48%)
imageio.input.image.imageio.opts.format=core-jpg,imageio.input.opts.general.source=byteArray,imageio.input.opts.imageio.useCache=false,imageio.opts.size=20:
base: 1048.280175 (var=0.16%) (100.0%)
test: 2378.619701 (var=0.0%) (226.91%)
imageio.input.image.imageio.opts.format=core-jpg,imageio.input.opts.general.source=byteArray,imageio.input.opts.imageio.useCache=true,imageio.opts.size=1:
base: 0.514964610 (var=6.45%) (100.0%)
test: 0.627415777 (var=5.53%) (121.84%)
imageio.input.image.imageio.opts.format=core-jpg,imageio.input.opts.general.source=byteArray,imageio.input.opts.imageio.useCache=true,imageio.opts.size=20:
base: 203.4689116 (var=6.33%) (100.0%)
test: 239.4026666 (var=3.37%) (117.66%)
imageio.input.image.imageio.opts.format=core-jpg,imageio.input.opts.general.source=file,imageio.input.opts.imageio.useCache=false,imageio.opts.size=1:
base: 2.113619422 (var=0.01%) (100.0%)
test: 5.392725355 (var=0.33%) (255.14%)
imageio.input.image.imageio.opts.format=core-jpg,imageio.input.opts.general.source=file,imageio.input.opts.imageio.useCache=false,imageio.opts.size=20:
base: 747.0786044 (var=0.66%) (100.0%)
test: 1592.436363 (var=0.0%) (213.16%)
imageio.input.image.imageio.opts.format=core-jpg,imageio.input.opts.general.source=file,imageio.input.opts.imageio.useCache=true,imageio.opts.size=1:
base: 2.113211532 (var=0.5%) (100.0%)
test: 5.340571428 (var=0.65%) (252.72%)
imageio.input.image.imageio.opts.format=core-jpg,imageio.input.opts.general.source=file,imageio.input.opts.imageio.useCache=true,imageio.opts.size=20:
base: 746.4361029 (var=0.5%) (100.0%)
test: 1598.984635 (var=27.19%) (214.22%)
imageio.input.image.imageio.opts.format=core-png,imageio.input.opts.general.source=byteArray,imageio.input.opts.imageio.useCache=false,imageio.opts.size=1:
base: 1.273921200 (var=0.0%) (100.0%)
test: 5.714314374 (var=0.0%) (448.56%)
imageio.input.image.imageio.opts.format=core-png,imageio.input.opts.general.source=byteArray,imageio.input.opts.imageio.useCache=false,imageio.opts.size=20:
base: 537.3260227 (var=0.01%) (100.0%)
test: 1916.055469 (var=0.97%) (356.59%)
imageio.input.image.imageio.opts.format=core-png,imageio.input.opts.general.source=byteArray,imageio.input.opts.imageio.useCache=true,imageio.opts.size=1:
base: 0.448228595 (var=2.31%) (100.0%)
test: 0.593630835 (var=3.27%) (132.44%)
imageio.input.image.imageio.opts.format=core-png,imageio.input.opts.general.source=byteArray,imageio.input.opts.imageio.useCache=true,imageio.opts.size=20:
base: 180.4028535 (var=1.97%) (100.0%)
test: 225.3961804 (var=3.76%) (124.94%)
imageio.input.image.imageio.opts.format=core-png,imageio.input.opts.general.source=file,imageio.input.opts.imageio.useCache=false,imageio.opts.size=1:
base: 1.077513513 (var=0.17%) (100.0%)
test: 3.444194142 (var=1.29%) (319.64%)
imageio.input.image.imageio.opts.format=core-png,imageio.input.opts.general.source=file,imageio.input.opts.imageio.useCache=false,imageio.opts.size=20:
base: 439.9187947 (var=6.58%) (100.0%)
test: 1324.614065 (var=0.51%) (301.1%)
imageio.input.image.imageio.opts.format=core-png,imageio.input.opts.general.source=file,imageio.input.opts.imageio.useCache=true,imageio.opts.size=1:
base: 1.103473684 (var=0.5%) (100.0%)
test: 3.428396447 (var=0.48%) (310.69%)
imageio.input.image.imageio.opts.format=core-png,imageio.input.opts.general.source=file,imageio.input.opts.imageio.useCache=true,imageio.opts.size=20:
base: 467.3706666 (var=0.33%) (100.0%)
test: 1295.015384 (var=2.59%) (277.09%)
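The percentage in parentheses on each "test" line appears to be the test score expressed relative to the base score. A small sketch of that arithmetic follows, assuming the figure is simply test/base rounded to two decimal places; the class and method names are made up for illustration.

```java
// Hypothetical helper reproducing the relative-score column in the results.
class RelativeScore {
    /** Returns the test score as a percentage of base, rounded to 2 decimals. */
    static double percentOfBase(double base, double test) {
        return Math.round(test / base * 10000.0) / 100.0;
    }

    public static void main(String[] args) {
        // core-jpg, byteArray, useCache=false, size=1 (default heap run above)
        System.out.println(percentOfBase(2.843131123, 9.026369168)); // prints 317.48
        // construct test, byteArray source (large heap run above)
        System.out.println(percentOfBase(10.26186612, 13.51616839)); // prints 131.71
    }
}
```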
It is interesting to note that the PNGImageReader actually
creates one or more SubImageInputStreams in addition to the provided
ImageInputStream as part of each decoding process. There will be one
SubImageInputStream for every IDAT chunk within the image stream, so for
example, small PNG images may use only one SubImageInputStream, but larger
PNG images with little compression may use lots of SubImageInputStreams.
Now that we've added an empty finalize() method to SubImageInputStream, we
will see performance improvements for all PNG images, both large and small.
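To get a feel for how many SubImageInputStreams a given PNG would involve, one can count its IDAT chunks. The sketch below walks the standard PNG chunk layout (8-byte signature, then chunks of 4-byte length, 4-byte type, data, 4-byte CRC). PngChunkCounter is a hypothetical helper, it does not validate CRCs, and the syntheticPng method builds a structurally chunk-shaped byte array with dummy contents for demonstration only, not a decodable image.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: counts chunks of a given type in a PNG byte stream.
class PngChunkCounter {
    static int countChunks(byte[] png, String type) {
        ByteBuffer buf = ByteBuffer.wrap(png); // big-endian by default, as PNG requires
        if (buf.remaining() < 8) {
            return 0;
        }
        buf.position(8);                       // skip the 8-byte PNG signature
        int count = 0;
        while (buf.remaining() >= 12) {        // length + type + CRC at minimum
            int length = buf.getInt();
            byte[] tag = new byte[4];
            buf.get(tag);
            if (type.equals(new String(tag, StandardCharsets.US_ASCII))) {
                count++;
            }
            if (length < 0 || length > buf.remaining() - 4) {
                break;                         // truncated or malformed chunk
            }
            buf.position(buf.position() + length + 4); // skip data and CRC
        }
        return count;
    }

    /** Builds a chunk-shaped byte array with dummy data and zeroed CRCs. */
    static byte[] syntheticPng(int idatChunks) {
        ByteBuffer out = ByteBuffer.allocate(8 + (12 + 13) + idatChunks * (12 + 1) + 12);
        out.put(new byte[] {(byte) 0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n'});
        putChunk(out, "IHDR", 13);
        for (int i = 0; i < idatChunks; i++) {
            putChunk(out, "IDAT", 1);
        }
        putChunk(out, "IEND", 0);
        return out.array();
    }

    private static void putChunk(ByteBuffer out, String type, int length) {
        out.putInt(length);
        out.put(type.getBytes(StandardCharsets.US_ASCII));
        out.put(new byte[length]); // dummy chunk data
        out.putInt(0);             // dummy CRC, never checked by countChunks
    }

    public static void main(String[] args) {
        System.out.println(countChunks(syntheticPng(3), "IDAT")); // prints 3
    }
}
```

For a real file, the bytes could be obtained with java.nio.file.Files.readAllBytes and passed to countChunks with "IDAT" as the type.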
More results taken with the solaris-i586 configuration described above:
Options common across all tests:
testname=imageio.input.image.imageio.reader.tests.read
imageio.input.image.imageio.reader.opts.ignoreMetadata=true
imageio.input.opts.general.source=file
imageio.input.image.imageio.opts.format=core-png
imageio.input.image.imageio.reader.opts.installListener=false
imageio.input.image.imageio.reader.opts.seekForwardOnly=true
imageio.input.opts.imageio.useCache=false
imageio.opts.content=random
imageio.opts.size=1000:
base: 25511.84478 (var=0.06%) (100.0%)
test: 26318.99597 (var=0.2%) (103.16%)
imageio.opts.size=20:
base: 2266.548403 (var=0.53%) (100.0%)
test: 2833.257023 (var=0.29%) (125.0%)
imageio.opts.size=250:
base: 25526.77439 (var=0.04%) (100.0%)
test: 25802.99502 (var=0.67%) (101.08%)