|
Pushing allegro performance (sprites per second) to its limits |
kovarex
Member #14,203
April 2012
|
This is what we are already doing:
This way we managed to get quite far (~27,000 sprites drawn per tick at 60 FPS), but it is still not enough at maximum zoom-out in crowded places (screenshot for reference; warning, it is huge: http://www.factorioforums.com/download/big_screenshot.jpg).
When profiling, I found that in _draw_tinted_rotated_scaled_bitmap_region a big part of the time is lost manipulating the backup transform:
...
ALLEGRO_TRANSFORM backup;
...
al_copy_transform(&backup, al_get_current_transform());
...
al_compose_transform(&t, &backup);
...
al_use_transform(&backup);
...
When I commented these lines out, the game still worked almost perfectly (only some primitives were off, which is solvable), and it proved to be a noticeable performance boost, since almost all sprites are drawn this way.
Another thing: most of the pictures are not rotated, so an optimised version of the function that does not account for rotation at all, used in those cases, might help as well.
I'm looking for tricks like this to push the performance to its limits. They can be nasty tricks, they can modify Allegro code, they can be heavily customised. Any advice would be appreciated.
www.factorio.com - a factory building game. |
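A minimal sketch of the "no rotation" fast path mentioned in the post above, in plain C with no Allegro calls. The type Vec2 and both function names are hypothetical and only illustrate the idea: when a sprite is not rotated, its four corners form an axis-aligned box and can be computed without any sine/cosine terms or matrix composition. The rotation is passed as a precomputed cosine/sine pair (c, s), which is an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical helper type; not part of Allegro. */
typedef struct { float x, y; } Vec2;

/* General path: scale the corner, rotate it by (c, s) = (cos, sin),
 * then translate it to (dx, dy). */
void sprite_corners_rotated(float w, float h, float sx, float sy,
                            float c, float s, float dx, float dy,
                            Vec2 out[4])
{
    const float lx[4] = { 0.0f, w, w, 0.0f };
    const float ly[4] = { 0.0f, 0.0f, h, h };
    for (size_t i = 0; i < 4; ++i) {
        float px = lx[i] * sx;
        float py = ly[i] * sy;
        out[i].x = px * c - py * s + dx;
        out[i].y = px * s + py * c + dy;
    }
}

/* Fast path: no rotation, so the corners are an axis-aligned box. */
void sprite_corners_axis_aligned(float w, float h, float sx, float sy,
                                 float dx, float dy, Vec2 out[4])
{
    float x1 = dx + w * sx;
    float y1 = dy + h * sy;
    out[0].x = dx; out[0].y = dy;
    out[1].x = x1; out[1].y = dy;
    out[2].x = x1; out[2].y = y1;
    out[3].x = dx; out[3].y = y1;
}
```

A caller would pick the second function whenever the sprite's angle is zero; both produce the same corners in that case, the second with far fewer multiplications.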
ph03nix
Member #15,028
April 2013
|
Perhaps instead you could use al_use_transform() with your own ALLEGRO_TRANSFORM matrix and just draw a tinted bitmap instead. You can modify the values of the matrix yourself or use al_scale_transform, al_rotate_transform, and al_translate_transform (and al_identity_transform of course). If performance is critical I would suggest manipulating the array yourself since it's a 4x4 array but you only need to modify the 2x2 part for 2D, and those functions I mentioned probably multiply two 4x4 matrices together. EDIT: for sprites not using rotation and scaling, you can simply skip all the matrix stuff and al_use_transform |
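A rough sketch of filling the matrix by hand, as suggested above. Transform4 is a hypothetical stand-in that only mirrors the float m[4][4] member of ALLEGRO_TRANSFORM, and the indexing (x' = x*m[0][0] + y*m[1][0] + m[3][0]) is assumed to match Allegro's convention; check it against the Allegro headers before relying on it. Note that besides the 2x2 block, the translation entries m[3][0] and m[3][1] are written too. The cosine and sine are passed in precomputed, which is an assumption of this sketch.

```c
#include <string.h>

/* Hypothetical stand-in with the same float m[4][4] layout
 * as ALLEGRO_TRANSFORM. */
typedef struct { float m[4][4]; } Transform4;

/* Builds "scale, then rotate by (c, s) = (cos, sin), then translate"
 * directly, without multiplying two 4x4 matrices together. */
void transform_build_2d(Transform4 *t, float sx, float sy,
                        float c, float s, float dx, float dy)
{
    memset(t, 0, sizeof *t);
    t->m[0][0] = sx * c;
    t->m[0][1] = sx * s;
    t->m[1][0] = -sy * s;
    t->m[1][1] = sy * c;
    t->m[2][2] = 1.0f;
    t->m[3][0] = dx;
    t->m[3][1] = dy;
    t->m[3][3] = 1.0f;
}

/* Applies the 2D part of the transform to a point, in place. */
void transform_apply_2d(const Transform4 *t, float *x, float *y)
{
    float nx = *x * t->m[0][0] + *y * t->m[1][0] + t->m[3][0];
    float ny = *x * t->m[0][1] + *y * t->m[1][1] + t->m[3][1];
    *x = nx;
    *y = ny;
}
```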
kazzmir
Member #1,786
December 2001
|
Just to avoid a dumb situation, are you using an optimized build of Allegro? |
SiegeLord
Member #7,827
October 2006
|
ph03nix said: Perhaps instead you could use al_use_transform() with your own ALLEGRO_TRANSFORM matrix and just draw a tinted bitmap instead.
All al_draw_*_bitmap calls go through the same function (al_draw_tinted_scaled_rotated_bitmap_region), so you are unlikely to get any speedup by switching functions.
kovarex said: Any advice would be appreciated.
I made a primitives addon replacement for held drawing that does not fiddle with transformations: https://github.com/SiegeLord/FastDraw. I found it to be faster than held drawing (about 3x on my machine). I am implementing vertex buffers for Allegro and will try them for the same purpose. In a different test, they were 1.5x faster than al_draw_prim, so this approach might end up nearly 5x faster than default Allegro drawing (3 x 1.5 = 4.5).
"For in much wisdom is much grief: and he that increases knowledge increases sorrow."-Ecclesiastes 1:18 |
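To illustrate the general idea of batching sprites without touching transformations, here is a small sketch in plain C. It is not FastDraw's actual API: SpriteVertex, SpriteBatch, BatchDrawFn and the function names are all hypothetical. Sprite corners are computed on the CPU, appended to one vertex array as two triangles each, and the whole array is handed to a single draw callback when the batch is flushed.

```c
#include <stddef.h>

/* Hypothetical vertex and batch types; not part of Allegro or FastDraw. */
typedef struct { float x, y, u, v; } SpriteVertex;

enum { BATCH_MAX_SPRITES = 256, VERTS_PER_SPRITE = 6 };

typedef struct {
    SpriteVertex verts[BATCH_MAX_SPRITES * VERTS_PER_SPRITE];
    size_t count; /* number of vertices currently stored */
} SpriteBatch;

/* Hypothetical callback that would issue one draw call for all vertices. */
typedef void (*BatchDrawFn)(const SpriteVertex *verts, size_t count);

/* Appends one axis-aligned quad as two triangles.
 * Returns 1 on success, 0 if the batch is full (flush, then retry). */
int batch_push_quad(SpriteBatch *b,
                    float x0, float y0, float x1, float y1,
                    float u0, float v0, float u1, float v1)
{
    if (b->count + VERTS_PER_SPRITE >
        (size_t)BATCH_MAX_SPRITES * VERTS_PER_SPRITE)
        return 0;
    SpriteVertex *out = &b->verts[b->count];
    out[0] = (SpriteVertex){ x0, y0, u0, v0 };
    out[1] = (SpriteVertex){ x1, y0, u1, v0 };
    out[2] = (SpriteVertex){ x1, y1, u1, v1 };
    out[3] = (SpriteVertex){ x0, y0, u0, v0 };
    out[4] = (SpriteVertex){ x1, y1, u1, v1 };
    out[5] = (SpriteVertex){ x0, y1, u0, v1 };
    b->count += VERTS_PER_SPRITE;
    return 1;
}

/* Hands all stored vertices to the callback (if any) and empties the batch. */
void batch_flush(SpriteBatch *b, BatchDrawFn draw)
{
    if (b->count > 0 && draw != NULL)
        draw(b->verts, b->count);
    b->count = 0;
}
```

This only works as one draw call if all sprites in the batch share a texture, which is one reason atlases matter here.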
kovarex
Member #14,203
April 2012
|
One of the time consuming tasks in our render preparation is the sorting of the sprites to be drawn to have the isometric view. kazzmir: ph03nix: SiegeLord: www.factorio.com - a factory building game. |
l j
Member #10,584
January 2009
|
kovarex said: If I understand it correctly, the 2d is rendered in fact as 3d, couldn't we just use some trick, to set the depth of the bitmaps by some formula to avoid sorting of these? So it would be sorted by the hardware almost for free.
Draw the bitmaps on different Z values with depth buffering enabled. Don't ask me how to do that though, I barely know anything about DirectX and even less about OpenGL. I do know that you'll still have to sort any transparent bitmaps yourself.
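One possible shape for such a depth formula, as a sketch only. The function name, the row/layer parameters and the "smaller depth is nearer" convention (a "less than" depth test) are all assumptions, not anything from Allegro, DirectX or OpenGL. Each sprite gets a depth derived from its map row and its layer within that row, so no CPU sort is needed for opaque sprites; the number of distinct keys is limited by the depth buffer's precision.

```c
/* Maps (row, layer) to a depth strictly inside (0, 1).
 * Rows further down the screen and higher layers get smaller depth,
 * i.e. they end up in front under a "less than" depth test.
 * All parameters are hypothetical sort keys of the game. */
float depth_from_row(unsigned row, unsigned layer,
                     unsigned rows, unsigned layers)
{
    unsigned total = rows * layers;
    if (total == 0)
        return 0.5f;
    unsigned key = row * layers + layer;
    if (key >= total)
        key = total - 1; /* clamp out-of-range input */
    return 1.0f - ((float)key + 0.5f) / (float)total;
}
```

As noted above, sprites with partial transparency would still need to be drawn back to front by hand.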
|
kovarex
Member #14,203
April 2012
|
Ok, so after a whole day of profiling, digging and fiddling with the code, I achieved a 2.5x speed improvement of the rendering method (the call to al_draw_tinted_scaled_rotated_bitmap_region). Some of these optimisations were very custom and the result of tighter integration of our rendering method with Allegro, but I believe a big part of them could be applied to Allegro itself to make it generally faster. These changes were the most important
Custom optimisations were mainly: I reduced the number of function calls by taking the internals of the public Allegro draw function and of the underlying drawer function and using that code directly in my own draw routine, merging some functions along the way. All my sprites are sub-bitmaps (parts of atlases), so I could remove the ifs that check for sub-bitmaps. I removed all the branches we never use (non-accelerated drawing, drawing from the backbuffer and similar); smaller functions are better for cache hits. www.factorio.com - a factory building game. |
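As a sketch of what "every sprite is an atlas region" can buy: if that is guaranteed, the per-sprite texture coordinates reduce to unconditional arithmetic, with no branch asking whether the bitmap has a parent. AtlasRegion, TexCoords and atlas_tex_coords are hypothetical names for illustration and do not correspond to Allegro internals.

```c
/* Hypothetical types; not part of Allegro. */
typedef struct { int x, y, w, h; } AtlasRegion;       /* in pixels */
typedef struct { float u0, v0, u1, v1; } TexCoords;   /* normalised */

/* Converts a pixel region of an atlas into normalised texture
 * coordinates. Assumes atlas_w and atlas_h are non-zero. */
TexCoords atlas_tex_coords(AtlasRegion r, int atlas_w, int atlas_h)
{
    TexCoords tc;
    tc.u0 = (float)r.x / (float)atlas_w;
    tc.v0 = (float)r.y / (float)atlas_h;
    tc.u1 = (float)(r.x + r.w) / (float)atlas_w;
    tc.v1 = (float)(r.y + r.h) / (float)atlas_h;
    return tc;
}
```

Since the regions never change, these coordinates could also be computed once at load time and stored with the sprite.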
Dizzy Egg
Member #10,824
March 2009
|
What compiler flags are you using?
---------------------------------------------------- |
axilmar
Member #1,204
April 2001
|
kovarex said: This way we managed to get quite far (~27 000 sprites drawn per tick at 60 FPS) but it is still not enough with maximum zoomout in crowded places.
Have you thought about using a different algorithm? For example, using lower-res textures at lower zoom levels? |
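A small sketch of how a lower-resolution sprite set could be chosen per zoom level. The function name and the convention that level 0 is full resolution and each further level is half the size of the previous one are assumptions for illustration only.

```c
/* Picks a hypothetical level of detail from the zoom factor.
 * zoom 1.0 means full size; level n is assumed to hold sprites
 * pre-scaled to 1/2^n of full resolution. */
int lod_for_zoom(float zoom, int max_level)
{
    int level = 0;
    /* Each time the zoom is at most one half, a half-size set suffices. */
    while (zoom > 0.0f && zoom <= 0.5f && level < max_level) {
        zoom *= 2.0f;
        ++level;
    }
    return level;
}
```

With smaller source sprites at far zoom, each sprite touches fewer texels, although the number of draw calls stays the same.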
SiegeLord
Member #7,827
October 2006
|
kovarex said: Could it be extended to support rotating of bitmaps as well?
Yes, sure.
kovarex said: * The backup could be easily removed as long as I initialised the identity transform in drawing of primitives, but the overall gain is big
I'm not sure I understand... does this work only if you don't use transformations, or will this work if the user has non-identity transformations set? If not, maybe we could detect the identity transform and do a "fast" path if it's active.
Quote: * in d3d drawing, it checks if the VERTICAL/HORIZONTAL flip is active, but that is already dealt with in the allegro method (and the flag is turned off), so these ifs are always off and can be removed
These two probably can be applied to the non-optimized Allegro functions, no?
Quote: * The internal quad function could be integrated into the d3d drawing function
Not sure what you mean here.
"For in much wisdom is much grief: and he that increases knowledge increases sorrow."-Ecclesiastes 1:18 |
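The identity check mentioned above could look roughly like this. Transform4 is a hypothetical stand-in that only mirrors the float m[4][4] member of ALLEGRO_TRANSFORM; this is not existing Allegro code.

```c
/* Hypothetical stand-in with the same float m[4][4] layout
 * as ALLEGRO_TRANSFORM. */
typedef struct { float m[4][4]; } Transform4;

/* Returns 1 if every entry matches the identity matrix exactly. */
int transform_is_identity(const Transform4 *t)
{
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            float expected = (i == j) ? 1.0f : 0.0f;
            if (t->m[i][j] != expected)
                return 0;
        }
    }
    return 1;
}
```

Sixteen comparisons per draw call are not free either, so a cheaper variant might cache the result in a flag that is updated whenever the current transform changes.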
|