draw_trans_sprite, Devcpp and optimizations
Paul whoknows

I am writing a game using only sprites with an alpha channel, so I am using only draw_trans_sprite for blitting.
The weird thing is that without optimization my game runs at 60 fps, but when I use the compiler optimization options in Dev-C++ (-O1, -O2 or -O3) the frame rate drops to 35 fps or less! What's wrong here? Code compiled with optimization options should run faster, I think.

Milan Mimica

It should run faster, unless there is a compiler bug. What is the version of gcc? It is rare to hit a compiler bug with just -O1.

Paul whoknows

Dev-C++ 5.0 beta 9.2 (4.9.9.2) with Mingw/GCC 3.4.2
I get it from here
The frame drop happens with draw_trans_sprite only. I am also using Fladimir's blender routines and they run faster (as expected) with -O3, so I am not sure it is a compiler bug.

Milan Mimica

Sounds like a compiler bug that is triggered by Allegro's draw_trans_sprite(). But someone with the same compiler version should confirm.

Fladimir da Gorf

Could you try out with -O2? -O3 may turn on some experimental stuff, that may or may not speed things up...

Paul whoknows
Quote:

Could you try out with -O2?

draw_trans_sprite runs slower with optimizations enabled. I tried -O1, -O2 and -O3 and there is no difference between them: it is much slower in every case, going from 60 fps to 35 fps.

Fladimir da Gorf

Maybe there's something fishy in your FPS calculations? After all, I bet you're not compiling Allegro with different settings...
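One way to rule out the FPS calculation itself is to count frames over a fixed time window instead of timing single frames. The following is a minimal sketch under that assumption; FpsCounter, fps_init and fps_tick are hypothetical names, not Allegro functions, and the caller is assumed to supply the current time in milliseconds from whatever timer it trusts.

```c
/* Hypothetical frame counter: these names are illustrative, not part of Allegro. */
typedef struct {
    long   window_start_ms;  /* time at which the current window began */
    int    frames;           /* frames counted in the current window */
    double fps;              /* result of the last completed window */
} FpsCounter;

void fps_init(FpsCounter *c, long now_ms)
{
    c->window_start_ms = now_ms;
    c->frames = 0;
    c->fps = 0.0;
}

/* Call once per rendered frame with the current time in milliseconds. */
void fps_tick(FpsCounter *c, long now_ms)
{
    long elapsed_ms;

    c->frames++;
    elapsed_ms = now_ms - c->window_start_ms;

    /* Publish a new reading once at least one second has passed. */
    if (elapsed_ms >= 1000) {
        c->fps = c->frames * 1000.0 / elapsed_ms;
        c->frames = 0;
        c->window_start_ms = now_ms;
    }
}
```

Averaging over a full second keeps timer granularity from dominating the reading, which matters when the difference being investigated is 60 fps versus 35 fps.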

ImLeftFooted

Do you do a complete recompile each time you switch the -On?

Try recompiling allegro with all asm turned off and see if the difference still occurs.

Paul whoknows

I compiled the exalpha.c example 3 times: with -O3, with -O2, and without optimization. The 3 executables are attached; try them on your PCs.

Here is the exalpha source (slightly modified) I used.

```c
/*
 *    Example program for the Allegro library, by Shawn Hargreaves.
 *
 *    This program demonstrates how to use the 32 bit RGBA
 *    translucency functions to store an alpha channel along with
 *    a bitmap graphic. Two images are loaded from disk. One will
 *    be used for the background and the other as a sprite. The
 *    example generates an alpha channel for the sprite image,
 *    composing the 32 bit RGBA bitmap during runtime, and draws
 *    it at the position of the mouse cursor.
 */


#include <allegro.h>



int main(int argc, char *argv[])
{
   char buf[256];
   BITMAP *background;
   BITMAP *alpha;
   BITMAP *sprite;
   BITMAP *buffer;
   int bpp = -1;
   int ret = -1;
   int x, y, c, a;

   if (allegro_init() != 0)
      return 1;
   install_keyboard();
   install_mouse();
   install_timer();

   /* what color depth should we use? */
   if (argc > 1) {
      if ((argv[1][0] == '-') || (argv[1][0] == '/'))
         argv[1]++;
      bpp = atoi(argv[1]);
      if ((bpp != 15) && (bpp != 16) && (bpp != 24) && (bpp != 32)) {
         allegro_message("Invalid color depth '%s'\n", argv[1]);
         return 1;
      }
   }

   if (bpp > 0) {
      /* set a user-requested color depth */
      set_color_depth(bpp);
      ret = set_gfx_mode(GFX_AUTODETECT, 640, 480, 0, 0);
   }
   else {
      /* autodetect what color depths are available */
      static int color_depths[] = { 16, 15, 32, 24, 0 };
      for (a=0; color_depths[a]; a++) {
         bpp = color_depths[a];
         set_color_depth(bpp);
         ret = set_gfx_mode(GFX_AUTODETECT, 640, 480, 0, 0);
         if (ret == 0)
            break;
      }
   }

   /* did the video mode set properly? */
   if (ret != 0) {
      set_gfx_mode(GFX_TEXT, 0, 0, 0, 0);
      allegro_message("Error setting %d bit graphics mode\n%s\n", bpp,
                      allegro_error);
      return 1;
   }

   /* load the background picture */
   replace_filename(buf, argv[0], "allegro.pcx", sizeof(buf));
   background = load_bitmap(buf, NULL);
   if (!background) {
      set_gfx_mode(GFX_TEXT, 0, 0, 0, 0);
      allegro_message("Error reading %s!\n", buf);
      return 1;
   }

   /* make a copy of it */
   set_color_depth(32);
   sprite = create_bitmap(background->w, background->h);
   blit(background, sprite, 0, 0, 0, 0, background->w, background->h);

   /* load the alpha sprite image. Note that we specifically force this
    * to load in a 32 bit format by calling set_color_depth(). That is
    * because the disk file is actually only a 256 color graphic: if it
    * was already a 32 bit RGBA sprite, we would probably want to use
    * set_color_conversion(COLORCONV_NONE) instead.
    */
   replace_filename(buf, argv[0], "mysha.pcx", sizeof(buf));
   alpha = load_bitmap(buf, NULL);
   if (!alpha) {
      set_gfx_mode(GFX_TEXT, 0, 0, 0, 0);
      allegro_message("Error reading %s!\n", buf);
      return 1;
   }

   /* normally we would have loaded an RGBA image directly from disk. Since
    * I don't have one lying around, and am too lazy to draw one (or I could
    * rationalise this by saying that I'm trying to save download size by
    * reusing graphics :-) I'll just have to generate an alpha channel in
    * code. I do this by using greyscale values from the mouse picture as an
    * alpha channel for the Allegro image. Don't worry about this code: you
    * wouldn't normally need to write anything like this, because you'd just
    * get the right graphics directly out of a datafile.
    */
   drawing_mode(DRAW_MODE_TRANS, NULL, 0, 0);
   set_write_alpha_blender();

   for (y=0; y<sprite->h; y++) {
      for (x=0; x<sprite->w; x++) {
         c = getpixel(alpha, x, y);
         a = getr(c) + getg(c) + getb(c);
         a = MID(0, a/2-128, 255);

         putpixel(sprite, x, y, a);
      }
   }

   destroy_bitmap(alpha);

   set_color_depth(bpp);

   /* darken the background image down a bit */
   drawing_mode(DRAW_MODE_TRANS, NULL, 0, 0);
   set_multiply_blender(0, 0, 0, 255);
   rectfill(background, 0, 0, background->w, background->h,
            makecol(32, 16, 128));
   solid_mode();

   /* create a double buffer bitmap */
   buffer = create_bitmap(SCREEN_W, SCREEN_H);

   /* scale the background image to be the same size as the screen */
   stretch_blit(background, buffer, 0, 0, background->w, background->h,
                0, 0, SCREEN_W, SCREEN_H);

   textprintf_ex(buffer, font, 0, 0, makecol(255, 255, 255), -1,
                 "%dx%d, %dbpp", SCREEN_W, SCREEN_H, bpp);

   destroy_bitmap(background);
   background = create_bitmap(SCREEN_W, SCREEN_H);
   blit(buffer, background, 0, 0, 0, 0, SCREEN_W, SCREEN_H);

   set_alpha_blender();
   while (!keypressed()) {
      /* draw the alpha sprite */
      x = mouse_x - sprite->w/2;
      y = mouse_y - sprite->h/2;

      // set_alpha_blender();
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);
      draw_trans_sprite(buffer, sprite, x, y);

      /* flip it across to the screen */
      blit(buffer, screen, 0, 0, 0, 0, SCREEN_W, SCREEN_H);

      /* replace the background where we drew the sprite */
      blit(background, buffer, x, y, x, y, sprite->w, sprite->h);
   }

   clear_keybuf();

   destroy_bitmap(background);
   destroy_bitmap(sprite);
   destroy_bitmap(buffer);

   return 0;
}

END_OF_MAIN()
```

Quote:

After all, I bet you're not compiling Allegro with different settings...

I don't understand what you meant, can you reword it please?

Evert

He means, are you recompiling Allegro along with your program?
If not, then what optimisation options you specify should not influence draw_trans_sprite.

Kauhiz

I think he means, that you only compile your project with different settings, not allegro.

Edit: beaten...

Paul whoknows

I am only compiling my project; I am using the pre-built Allegro binaries for MinGW.
Here is the compile log:

Building Makefile: "C:\Documents and Settings\pablo\Escritorio\Makefile.win"
Finding dependencies for file: C:\Documents and Settings\pablo\Escritorio\allegro-misc-4.2.1\examples\exalpha.c
Executing make...
make.exe -f "C:\Documents and Settings\pablo\Escritorio\Makefile.win" all
g++.exe -c allegro-misc-4.2.1/examples/exalpha.c -o allegro-misc-4.2.1/examples/exalpha.o -I"C:/Dev-Cpp/lib/gcc/mingw32/3.4.2/include" -I"C:/Dev-Cpp/include/c++/3.4.2/backward" -I"C:/Dev-Cpp/include/c++/3.4.2/mingw32" -I"C:/Dev-Cpp/include/c++/3.4.2" -I"C:/Dev-Cpp/include" -O3
g++.exe allegro-misc-4.2.1/examples/exalpha.o -o "draw_alpha.exe" -L"C:/Dev-Cpp/lib" -mwindows ../../../Dev-Cpp/lib/liballeg.a
Execution terminated
Compilation successful

Am I doing something wrong? If so, how should I compile my project with optimizations in Dev-C++?

Fladimir da Gorf

Paul, by the way, my routines are optimized for anti-aliased sprite rendering, so they perform best when the percentage of translucent pixels is low.
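To illustrate why a low share of translucent pixels helps, here is a minimal sketch of a per-pixel alpha blend with fast paths for fully transparent and fully opaque pixels. It is not the actual code of the routines mentioned above; blend_channel and blend_row are hypothetical names, and the 0xAARRGGBB source layout is an assumption made for the example.

```c
#include <stdint.h>

/* Blend one 8-bit channel: (src*a + dst*(255-a)) / 255, rounded to nearest. */
static uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t a)
{
    return (uint8_t)((src * a + dst * (255 - a) + 127) / 255);
}

/* Blend n source pixels (assumed 0xAARRGGBB) onto n destination pixels
 * (assumed 0x00RRGGBB). Hypothetical helper, not an Allegro function. */
void blend_row(const uint32_t *src, uint32_t *dst, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        uint32_t s = src[i];
        uint32_t d;
        uint8_t a = (uint8_t)(s >> 24);
        uint8_t r, g, b;

        if (a == 0)
            continue;                    /* fully transparent: nothing to do */

        if (a == 255) {
            dst[i] = s & 0x00FFFFFFu;    /* fully opaque: plain copy */
            continue;
        }

        /* Only genuinely translucent pixels pay for the arithmetic. */
        d = dst[i];
        r = blend_channel((uint8_t)(s >> 16), (uint8_t)(d >> 16), a);
        g = blend_channel((uint8_t)(s >> 8),  (uint8_t)(d >> 8),  a);
        b = blend_channel((uint8_t)s,         (uint8_t)d,         a);
        dst[i] = ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
    }
}
```

With an anti-aliased sprite, only the edge pixels reach the slow path, so most of the row is either skipped or copied.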

Evert

Do you know how to use a profiler? That could help determine what's going on.
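Short of a full profiler, a small timing harness can already show whether the time goes into the call under suspicion. This is only a sketch in standard C: time_calls and count_call are hypothetical names, and clock() reports CPU time at whatever resolution the C library provides, so the call count should be large enough to dwarf that resolution.

```c
#include <time.h>

/* Hypothetical helper: run fn(arg) n times and return the elapsed
 * CPU time in seconds as reported by clock(). */
double time_calls(void (*fn)(void *), void *arg, int n)
{
    clock_t start = clock();
    int i;

    for (i = 0; i < n; i++)
        fn(arg);

    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Placeholder workload for illustration only: counts how often it ran. */
void count_call(void *arg)
{
    (*(int *)arg)++;
}
```

In the exalpha test, the workload would be a small wrapper that calls draw_trans_sprite(buffer, sprite, x, y), timed once in each build and compared.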

Paul whoknows
Quote:

Paul, by the way, my routines are optimized for anti-aliased sprite rendering, so they perform best when the percentage of translucent pixels is low.

Anti-aliased sprites are exactly what I am using. :)
[Attached image: 591104]
And your routines (with -O3) are faster than draw_trans_sprite (without optimization), and much, much faster than draw_trans_sprite (with -O3).

Quote:

Do you know how to use a profiler? That could help determine what's going on.

Yes, I'll profile it. However, I don't need to profile anything to see the speed loss in draw_trans_sprite with -On.

Fladimir da Gorf

In fact, I could implement a run-length encoding of translucent sprites which might be even faster... but it'd be some work. However, it seems that these days the memory speed plays even a bigger role than before, so the MMX logic doesn't speed things up as drastically as in my old Pentium 800Mhz...

ImLeftFooted

draw_trans_sprite won't change when you change your -On parameter. The problem would have to be somewhere else...

Paul whoknows
Quote:

In fact, I could implement a run-length encoding of translucent sprites which might be even faster... but it'd be some work. However, it seems that these days the memory speed plays even a bigger role than before, so the MMX logic doesn't speed things up as drastically as in my old Pentium 800Mhz...

That would be great! More speed == more sprites on the screen.
I don't know, you are the expert, but if the RLE works on the alpha channel as I think it should, then blitting a big sprite, with an alpha channel like the one I posted below, should obtain a significant gain in speed, right?

[Attached image: 591106, 320x240]

Quote:

draw_trans_sprite won't change when you change your -On parameter. The problem would have to be somewhere else...

Yes, but does anyone know why -On in Dev-C++ makes exalpha.c run slower? Without optimization it runs faster, so what is wrong?

Milan Mimica

I see no difference with gcc-3.4.6.

Paul whoknows
Quote:

I see no difference with gcc-3.4.6.

I am using devcpp with GCC 3.4.2.
Did you try the binaries I attached in my previous post? They run at different speeds on my P4 2400 MHz; if you have a faster PC you probably would not notice the difference.

Fladimir da Gorf
Quote:

if the RLE works on the alpha channel as I think it should, then blitting a big sprite, with an alpha channel like the one I posted below, should obtain a significant gain in speed

Yes, indeed. The more fully transparent or fully solid pixels there are, the faster it'd be.
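As a rough illustration of the idea, the alpha values of one sprite row can be run-length encoded into spans of transparent, opaque and translucent pixels, so that a drawing routine could skip the first kind, copy the second and blend only the third. This is a sketch of the general technique, not the implementation discussed above; SpanKind, Span, classify and encode_alpha_row are hypothetical names.

```c
#include <stdint.h>

typedef enum { SPAN_TRANSPARENT, SPAN_OPAQUE, SPAN_TRANSLUCENT } SpanKind;

typedef struct {
    SpanKind kind;    /* how every pixel in this run must be handled */
    int      length;  /* number of consecutive pixels in the run */
} Span;

/* Classify a single alpha value (0 = transparent, 255 = opaque). */
static SpanKind classify(uint8_t a)
{
    if (a == 0)
        return SPAN_TRANSPARENT;
    if (a == 255)
        return SPAN_OPAQUE;
    return SPAN_TRANSLUCENT;
}

/* Encode n alpha values into runs of equal kind. `out` must have room
 * for n spans (the worst case). Returns the number of spans written. */
int encode_alpha_row(const uint8_t *alpha, int n, Span *out)
{
    int count = 0;
    int i = 0;

    while (i < n) {
        SpanKind kind = classify(alpha[i]);
        int start = i;

        /* Extend the run while the pixel kind stays the same. */
        while (i < n && classify(alpha[i]) == kind)
            i++;

        out[count].kind = kind;
        out[count].length = i - start;
        count++;
    }
    return count;
}
```

A sprite whose rows are mostly long transparent or opaque runs encodes to very few spans, which is why fewer translucent pixels would mean less per-pixel work.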

Thread #589770. Printed from Allegro.cc