JDK-6340082 : OGL: Slowness when rendering transparent image created via getScaledInstance()
  • Type: Bug
  • Component: client-libs
  • Sub-Component: 2d
  • Affected Version: 6
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • OS: linux
  • CPU: x86
  • Submitted: 2005-10-21
  • Updated: 2011-02-16
  • Resolved: 2005-10-27
Related Reports
Duplicate :  
Description
FULL PRODUCT VERSION :
java version "1.6.0-ea"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.6.0-ea-b56)
Java HotSpot(TM) Client VM (build 1.6.0-ea-b56, mixed mode, sharing)


ADDITIONAL OS VERSION INFORMATION :
Linux ce 2.6.13-1.1526_FC4 #1 Wed Sep 28 19:15:10 EDT 2005 i686 i686 i386 GNU/Linux


EXTRA RELEVANT SYSTEM CONFIGURATION :
[ce@ce ~]$ glxinfo
name of display: :0.0
display: :0  screen: 0
direct rendering: Yes
server glx vendor string: NVIDIA Corporation
server glx version string: 1.3
server glx extensions:
    GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_SGIX_fbconfig,
    GLX_SGIX_pbuffer, GLX_SGI_video_sync, GLX_SGI_swap_control
client glx vendor string: NVIDIA Corporation
client glx version string: 1.3
client glx extensions:
    GLX_ARB_get_proc_address, GLX_ARB_multisample, GLX_EXT_visual_info,
    GLX_EXT_visual_rating, GLX_EXT_import_context, GLX_SGI_video_sync,
    GLX_NV_swap_group, GLX_NV_video_out, GLX_SGIX_fbconfig, GLX_SGIX_pbuffer,
    GLX_SGI_swap_control, GLX_NV_float_buffer, GLX_ARB_fbconfig_float
GLX extensions:
    GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_SGIX_fbconfig,
    GLX_SGIX_pbuffer, GLX_SGI_video_sync, GLX_SGI_swap_control,
    GLX_ARB_get_proc_address
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: GeForce4 488 Go/AGP/SSE2
OpenGL version string: 1.5.3 NVIDIA 76.76
OpenGL extensions:
    GL_ARB_imaging, GL_ARB_multitexture, GL_ARB_point_parameters,
    GL_ARB_point_sprite, GL_ARB_shader_objects, GL_ARB_shading_language_100,
    GL_ARB_texture_compression, GL_ARB_texture_cube_map,
    GL_ARB_texture_env_add, GL_ARB_texture_env_combine,
    GL_ARB_texture_env_dot3, GL_ARB_texture_mirrored_repeat,
    GL_ARB_texture_rectangle, GL_ARB_transpose_matrix,
    GL_ARB_vertex_buffer_object, GL_ARB_vertex_program, GL_ARB_vertex_shader,
    GL_ARB_window_pos, GL_S3_s3tc, GL_EXT_texture_env_add, GL_EXT_abgr,
    GL_EXT_bgra, GL_EXT_blend_color, GL_EXT_blend_minmax,
    GL_EXT_blend_subtract, GL_EXT_clip_volume_hint,
    GL_EXT_compiled_vertex_array, GL_EXT_Cg_shader,
    GL_EXT_draw_range_elements, GL_EXT_fog_coord, GL_EXT_multi_draw_arrays,
    GL_EXT_packed_pixels, GL_EXT_paletted_texture, GL_EXT_pixel_buffer_object,
    GL_EXT_point_parameters, GL_EXT_rescale_normal, GL_EXT_secondary_color,
    GL_EXT_separate_specular_color, GL_EXT_shared_texture_palette,
    GL_EXT_stencil_wrap, GL_EXT_texture_compression_s3tc,
    GL_EXT_texture_cube_map, GL_EXT_texture_edge_clamp,
    GL_EXT_texture_env_combine, GL_EXT_texture_env_dot3,
    GL_EXT_texture_filter_anisotropic, GL_EXT_texture_lod,
    GL_EXT_texture_lod_bias, GL_EXT_texture_object, GL_EXT_vertex_array,
    GL_IBM_rasterpos_clip, GL_IBM_texture_mirrored_repeat,
    GL_KTX_buffer_region, GL_NV_blend_square, GL_NV_fence,
    GL_NV_fog_distance, GL_NV_light_max_exponent, GL_NV_packed_depth_stencil,
    GL_NV_pixel_data_range, GL_NV_point_sprite, GL_NV_register_combiners,
    GL_NV_texgen_reflection, GL_NV_texture_env_combine4,
    GL_NV_texture_rectangle, GL_NV_vertex_array_range,
    GL_NV_vertex_array_range2, GL_NV_vertex_program, GL_NV_vertex_program1_1,
    GL_SGIS_generate_mipmap, GL_SGIS_multitexture, GL_SGIS_texture_lod,
    GL_SUN_slice_accum
glu version: 1.3
glu extensions:
    GLU_EXT_nurbs_tessellator, GLU_EXT_object_space_tess

   visual  x  bf lv rg d st colorbuffer ax dp st accumbuffer  ms  cav
 id dep cl sp sz l  ci b ro  r  g  b  a bf th cl  r  g  b  a ns b eat
----------------------------------------------------------------------
0x21 24 tc  0 32  0 r  y  .  8  8  8  0  4 24  8 16 16 16 16  0 0 None
0x22 24 dc  0 32  0 r  y  .  8  8  8  0  4 24  8 16 16 16 16  0 0 None
0x23 24 tc  0 32  0 r  y  .  8  8  8  8  4 24  8 16 16 16 16  0 0 None
0x24 24 tc  0 32  0 r  .  .  8  8  8  0  4 24  8 16 16 16 16  0 0 None
0x25 24 tc  0 32  0 r  .  .  8  8  8  8  4 24  8 16 16 16 16  0 0 None
0x26 24 tc  0 32  0 r  y  .  8  8  8  0  4 24  0 16 16 16 16  0 0 None
0x27 24 tc  0 32  0 r  y  .  8  8  8  8  4 24  0 16 16 16 16  0 0 None
0x28 24 tc  0 32  0 r  .  .  8  8  8  0  4 24  0 16 16 16 16  0 0 None
0x29 24 tc  0 32  0 r  .  .  8  8  8  8  4 24  0 16 16 16 16  0 0 None
0x2a 24 tc  0 32  0 r  y  .  8  8  8  0  4 16  0 16 16 16 16  0 0 None
0x2b 24 tc  0 32  0 r  y  .  8  8  8  8  4 16  0 16 16 16 16  0 0 None
0x2c 24 tc  0 32  0 r  .  .  8  8  8  0  4 16  0 16 16 16 16  0 0 None
0x2d 24 tc  0 32  0 r  .  .  8  8  8  8  4 16  0 16 16 16 16  0 0 None
0x2e 24 tc  0 32  0 r  y  .  8  8  8  0  4  0  0 16 16 16 16  0 0 None
0x2f 24 tc  0 32  0 r  y  .  8  8  8  8  4  0  0 16 16 16 16  0 0 None
0x30 24 tc  0 32  0 r  .  .  8  8  8  0  4  0  0 16 16 16 16  0 0 None
0x31 24 tc  0 32  0 r  .  .  8  8  8  8  4  0  0 16 16 16 16  0 0 None
0x32 24 dc  0 32  0 r  y  .  8  8  8  8  4 24  8 16 16 16 16  0 0 None
0x33 24 dc  0 32  0 r  .  .  8  8  8  0  4 24  8 16 16 16 16  0 0 None
0x34 24 dc  0 32  0 r  .  .  8  8  8  8  4 24  8 16 16 16 16  0 0 None
0x35 24 dc  0 32  0 r  y  .  8  8  8  0  4 24  0 16 16 16 16  0 0 None
0x36 24 dc  0 32  0 r  y  .  8  8  8  8  4 24  0 16 16 16 16  0 0 None
0x37 24 dc  0 32  0 r  .  .  8  8  8  0  4 24  0 16 16 16 16  0 0 None
0x38 24 dc  0 32  0 r  .  .  8  8  8  8  4 24  0 16 16 16 16  0 0 None
0x39 24 dc  0 32  0 r  y  .  8  8  8  0  4 16  0 16 16 16 16  0 0 None
0x3a 24 dc  0 32  0 r  y  .  8  8  8  8  4 16  0 16 16 16 16  0 0 None
0x3b 24 dc  0 32  0 r  .  .  8  8  8  0  4 16  0 16 16 16 16  0 0 None
0x3c 24 dc  0 32  0 r  .  .  8  8  8  8  4 16  0 16 16 16 16  0 0 None
0x3d 24 dc  0 32  0 r  y  .  8  8  8  0  4  0  0 16 16 16 16  0 0 None
0x3e 24 dc  0 32  0 r  y  .  8  8  8  8  4  0  0 16 16 16 16  0 0 None
0x3f 24 dc  0 32  0 r  .  .  8  8  8  0  4  0  0 16 16 16 16  0 0 None
0x40 24 dc  0 32  0 r  .  .  8  8  8  8  4  0  0 16 16 16 16  0 0 None

A DESCRIPTION OF THE PROBLEM :
When blits of scaled and unscaled images are mixed, severe slowdowns occur if the OpenGL pipeline is enabled.
The strange thing is that blits of scaled images alone are as fast as blits of unscaled images, but as soon as blits of an unscaled image are added, performance drops.

We discovered the problem while writing a spreadsheet-like application that uses image scaling so that signs stored as images can also be used with dynamically adjustable grid sizes.
Our grid does not use any non-scaled images, so the slowdown there may be triggered by other operations such as fills or glyph painting.

The image I used is a 20x20 grayscale image and is 1-bit transparent.

These are the results I get when running the sample code on my test machine, which blits each image 10,000 times to a JFrame:
Scaled image took: 747
Full image took: 32

This is what I get when blitting only scaled images:
Scaled image took: 1
Scaled image took: 1



STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the sample-code with the OpenGL pipeline enabled.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
blit should be as fast as when only doing scaled blits.
ACTUAL -
very slow blits.

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.awt.*;

import javax.swing.*;

public class ScaleBench extends JFrame
{
	Image origImage;
	Image scaledImage;

	public ScaleBench()
	{
		origImage = new ImageIcon("img.gif").getImage();
		scaledImage = origImage.getScaledInstance(25, 30, Image.SCALE_SMOOTH);

		setSize(500, 500);
		setVisible(true);
	}

	public void paint(Graphics g)
	{
		for (int m = 0; m < 1000; m++)
		{
			long startNorm = System.currentTimeMillis();
			for (int i = 0; i < 10000; i++)
			{
				g.drawImage(origImage, 25, 25, this);
			}
			long endNorm = System.currentTimeMillis();

			long startScaled = System.currentTimeMillis();
			for (int i = 0; i < 10000; i++)
			{
				g.drawImage(scaledImage, 25, 25, this);
			}
			long endScaled = System.currentTimeMillis();

			System.out.println("Scaled image took: " + (endScaled - startScaled));
			System.out.println("Full image took: " + (endNorm - startNorm));
			System.out.println();
		}
	}

	public static void main(String[] args)
	{
		new ScaleBench();
	}
}

---------- END SOURCE ----------

Comments
WORK AROUND
Don't use getScaledInstance(). Instead, use drawImage() to create a scaled version of the original image, as shown in the evaluation.
27-10-2005

EVALUATION
I've modified the testcase so that it fully loads the scaled image using MediaTracker, and I've attached it to this bug report (along with opaque and 1-bit transparent GIF images). I also changed the testcase so that it renders each image 100000 times.

I am not able to reproduce the problem with the opaque image (note that the opaque image here is 111x140, which explains why it renders slower than the 25x30 scaled instance):

bash-2.05$ ~/ws5/build/solaris-sparc/bin/java -Dsun.java2d.opengl=True -Dsun.java2d.trace=count ScaleBench
OpenGL pipeline enabled for default config on screen 0
Scaled image took: 893
Full image took: 15385
4 calls to sun.java2d.loops.Blit::Blit(ByteIndexed, SrcNoEa, IntArgbPre)
2 calls to sun.java2d.opengl.OGLGeneralBlit::Blit(Any, AnyAlpha, "OpenGL Surface")
2 calls to sun.java2d.opengl.OGLGeneralBlit::Blit(Any, SrcNoEa, "OpenGL Texture")
199998 calls to sun.java2d.opengl.OGLTextureToSurfaceBlit::Blit("OpenGL Texture", AnyAlpha, "OpenGL Surface")
200006 total calls to 4 different primitives

I am able to reproduce the problem reported by the submitter if I use a 1-bit transparent GIF image (this one is 55x68, which is larger than the scaled instance, so it should exhibit the same characteristics as the opaque case above, but it does not):

bash-2.05$ ~/ws5/build/solaris-sparc/bin/java -Dsun.java2d.opengl=True -Dsun.java2d.trace=count ScaleBench foo
OpenGL pipeline enabled for default config on screen 0
Scaled image took: 31475
Full image took: 6188
100001 calls to sun.java2d.opengl.OGLGeneralBlit::Blit(Any, AnyAlpha, "OpenGL Surface")
100000 calls to sun.java2d.loops.Blit::Blit(IntArgb, SrcNoEa, IntArgbPre)
99999 calls to sun.java2d.opengl.OGLTextureToSurfaceBlit::Blit("OpenGL Texture", AnyAlpha, "OpenGL Surface")
1 call to sun.java2d.opengl.OGLGeneralBlit::Blit(Any, SrcNoEa, "OpenGL Texture")
2 calls to sun.java2d.loops.Blit::Blit(ByteIndexed, SrcNoEa, IntArgbPre)
300003 total calls to 5 different primitives

Note that rendering a scaled image is significantly slower in this case, and this is even seen in the trace report (lots of software blits for the scaled instance case).

It turns out that this is the same issue filed in 6231864. The reason this slowdown is only seen for bitmask transparent images created via getScaledInstance() is that the latter will internally generate an IntArgb BufferedImage to create the scaled instance (from the original ByteIndexed image). Since this scaled instance is IntArgb, we will go through the code in ImageRepresentation.getOpaqueRGBImage(), which grabs the underlying DataBuffer and prevents the image from being accelerated in hardware (as described in the evaluation of 6231864). Note that this problem is not seen for the original/scaled opaque image or the original transparent image because those are all classified as ByteIndexed and therefore do not hit the offending code in getOpaqueRGBImage().

I will close this as a duplicate of 6231864. I urge the submitter to work around this problem by avoiding getScaledInstance() and using drawImage() to create a scaled version of the image (this is usually more efficient anyway).
27-10-2005
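
The attached, modified testcase is not reproduced in this report. As a rough sketch of what "fully loads the scaled image using MediaTracker" could look like, the helper below blocks until an Image has finished loading. The class name ImageLoading, the method name waitFor, and the throwaway Canvas used as the tracker's component are assumptions made for illustration; they are not taken from the attachment.

```java
import java.awt.Canvas;
import java.awt.Image;
import java.awt.MediaTracker;

/** Hypothetical helper; names are illustrative, not from the attached testcase. */
final class ImageLoading {
    private ImageLoading() {}

    /** Blocks until img is fully loaded; returns false on error or interruption. */
    static boolean waitFor(Image img) {
        // MediaTracker needs a Component to act on behalf of; a throwaway
        // Canvas is an assumption here, any component would do.
        MediaTracker tracker = new MediaTracker(new Canvas());
        tracker.addImage(img, 0);
        try {
            // Starts loading (if not already started) and waits for completion.
            tracker.waitForID(0);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
        return !tracker.isErrorID(0);
    }
}
```

In the submitter's ScaleBench constructor, a call such as ImageLoading.waitFor(scaledImage) placed right after getScaledInstance() would make sure the timing loop measures blits of a fully produced image rather than of one that is still loading.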

EVALUATION
I haven't looked into this issue in great detail yet, but one recommendation I have (as always) is: don't use Image.getScaledInstance(). It is not the most efficient approach to scaling an image. Try this instead:

    BufferedImage scaledBI = new BufferedImage(w, h, type);
    Graphics2D g = scaledBI.createGraphics();
    g.setComposite(AlphaComposite.Src);
    g.drawImage(originalImage, 0, 0, w, h, null);
    g.dispose();

Also, (as mentioned in the JDC comments from the submitter) the given testcase does not use MediaTracker to ensure that the original image is fully loaded before rendering it, which explains the reason for the unrealistically fast numbers when only doing scaling. I will fix up the testcase to use MediaTracker and will look into it further.
27-10-2005
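
The drawImage() recommendation in this evaluation can be wrapped in a small self-contained helper. This is only a sketch under stated assumptions: the class name ScaledCopy and the method name scale are hypothetical, TYPE_INT_ARGB is an assumed choice for the "type" argument that the evaluation leaves open, and the bilinear interpolation hint is an addition that the evaluation does not mention.

```java
import java.awt.AlphaComposite;
import java.awt.Graphics2D;
import java.awt.Image;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Hypothetical helper; class and method names are illustrative only. */
final class ScaledCopy {
    private ScaledCopy() {}

    /** Draws original into a new w-by-h BufferedImage and returns it. */
    static BufferedImage scale(Image original, int w, int h) {
        // TYPE_INT_ARGB is an assumption; the evaluation leaves the type open.
        BufferedImage scaled = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = scaled.createGraphics();
        try {
            // Src overwrites the destination, so transparent source pixels
            // stay transparent instead of being blended.
            g.setComposite(AlphaComposite.Src);
            // Assumed quality setting; not part of the original recommendation.
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                               RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            // A null observer is only safe if original is already fully loaded.
            g.drawImage(original, 0, 0, w, h, null);
        } finally {
            g.dispose();
        }
        return scaled;
    }
}
```

In the submitter's testcase this would replace the getScaledInstance() call, for example scaledImage = ScaledCopy.scale(origImage, 25, 30). Because ImageIcon waits for its image to load before getImage() returns, passing a null observer should be acceptable there.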