Allegro 5 performance |
Chris Katko
Member #1,881
January 2002
|
95% of all problems would be solved by posting the project file instead of just guessing into a black box. |
Andrew Gillett
Member #15,868
January 2015
|
I have uploaded a project here: https://drive.google.com/file/d/106m6ZxFLGTzWhnULdlzYuJM172r8dgkp/view?usp=sharing What I've found is that for this test, performance is much worse in OpenGL than in Direct3D with ALLEGRO_NO_PRESERVE_TEXTURE set on the backbuffer. The Render1 step, which draws lots of small sprites, takes about 24ms, compared with 8ms for Direct3D. If you want to try it, the key lines to look at are: Psector.cpp 15, 441, 469, 499 |
Chris Katko
Member #1,881
January 2002
|
I installed Visual Studio 2019, loaded your project, and got this error: C:\Users----\Downloads\psector.ultrasimplifiedprofiling\psector.vcxproj : error : Error HRESULT E_FAIL has been returned from a call to a COM component. Sigh... why does this stuff never work the first time? Quote:
</entry> ... Maybe C++ didn't install right? I have no idea why I'm a magnet for such obscure errors. [edit] Hmm... maybe a combination of Visual Studio "wanting to restart to finish" and a Windows feature update applying on restart clobbered a dependency. I bet that's it. GG Microsoft. [edit] It compiles now. |
Andrew Gillett
Member #15,868
January 2015
|
Which version of MSVC 2019 are you on? Here they suggest updating and deleting the .vs folder |
Chris Katko
Member #1,881
January 2002
|
[edit] nvm, it's a user pragma. Well, this looks wrong. I sure as heck don't have a "23bit" or "13bit" color screen.
The thread 0x3dec has exited with code 0 (0x0).
0: 640x480, 23bit, 60Hz
0: 640x480, 23bit, 72Hz
0: 640x480, 23bit, 75Hz
0: 720x480, 23bit, 60Hz
0: 720x480, 23bit, 59Hz
0: 720x576, 23bit, 50Hz
0: 800x600, 23bit, 60Hz
0: 800x600, 23bit, 72Hz
0: 800x600, 23bit, 75Hz
0: 1024x768, 23bit, 60Hz
0: 1024x768, 23bit, 70Hz
0: 1024x768, 23bit, 75Hz
0: 1152x864, 23bit, 75Hz
0: 1176x664, 23bit, 50Hz
0: 1176x664, 23bit, 60Hz
0: 1176x664, 23bit, 59Hz
0: 1280x720, 23bit, 60Hz
0: 1280x720, 23bit, 59Hz
0: 1280x720, 23bit, 50Hz
0: 1280x768, 23bit, 60Hz
0: 1280x800, 23bit, 60Hz
0: 1280x960, 23bit, 60Hz
0: 1280x960, 23bit, 75Hz
0: 1280x1024, 23bit, 60Hz
0: 1280x1024, 23bit, 75Hz
0: 1360x768, 23bit, 60Hz
0: 1366x768, 23bit, 60Hz
0: 1600x900, 23bit, 60Hz
0: 1600x1024, 23bit, 59Hz
0: 1600x1024, 23bit, 60Hz
0: 1600x1200, 23bit, 60Hz
0: 1600x1200, 23bit, 59Hz
0: 1600x1200, 23bit, 24Hz
0: 1600x1200, 23bit, 23Hz
0: 1600x1200, 23bit, 25Hz
0: 1600x1200, 23bit, 30Hz
0: 1600x1200, 23bit, 29Hz
0: 1600x1200, 23bit, 50Hz
0: 1680x1050, 23bit, 59Hz
0: 1680x1050, 23bit, 60Hz
0: 1768x992, 23bit, 25Hz
0: 1768x992, 23bit, 30Hz
0: 1768x992, 23bit, 29Hz
0: 1920x1080, 23bit, 60Hz
0: 1920x1080, 23bit, 59Hz
0: 1920x1080, 23bit, 50Hz
0: 1920x1080, 23bit, 30Hz
0: 1920x1080, 23bit, 29Hz
0: 1920x1080, 23bit, 25Hz
0: 1920x1080, 23bit, 24Hz
0: 1920x1080, 23bit, 23Hz
0: 1920x1200, 23bit, 60Hz
0: 1920x1200, 23bit, 59Hz
0: 1920x1200, 23bit, 24Hz
0: 1920x1200, 23bit, 23Hz
0: 1920x1200, 23bit, 25Hz
0: 1920x1200, 23bit, 30Hz
0: 1920x1200, 23bit, 29Hz
0: 1920x1200, 23bit, 50Hz
0: 1920x1440, 23bit, 60Hz
0: 1920x1440, 23bit, 59Hz
0: 1920x1440, 23bit, 24Hz
0: 1920x1440, 23bit, 23Hz
0: 1920x1440, 23bit, 25Hz
0: 1920x1440, 23bit, 30Hz
0: 1920x1440, 23bit, 29Hz
0: 1920x1440, 23bit, 50Hz
0: 2048x1536, 23bit, 60Hz
0: 2048x1536, 23bit, 59Hz
0: 2048x1536, 23bit, 24Hz
0: 2048x1536, 23bit, 23Hz
0: 2048x1536, 23bit, 25Hz
0: 2048x1536, 23bit, 30Hz
0: 2048x1536, 23bit, 29Hz
0: 2048x1536, 23bit, 50Hz
0: 2560x1440, 23bit, 60Hz
0: 2560x1440, 23bit, 59Hz
0: 2560x1440, 23bit, 24Hz
0: 2560x1440, 23bit, 23Hz
0: 2560x1440, 23bit, 25Hz
0: 2560x1440, 23bit, 30Hz
0: 2560x1440, 23bit, 29Hz
0: 2560x1440, 23bit, 50Hz
0: 2560x1600, 23bit, 60Hz
0: 2560x1600, 23bit, 59Hz
0: 2560x1600, 23bit, 24Hz
0: 2560x1600, 23bit, 23Hz
0: 2560x1600, 23bit, 25Hz
0: 2560x1600, 23bit, 30Hz
0: 2560x1600, 23bit, 29Hz
0: 2560x1600, 23bit, 50Hz
0: 1440x900, 23bit, 60Hz
0: 1400x1050, 23bit, 59Hz
0: 1400x1050, 23bit, 60Hz
0: 3840x2160, 23bit, 60Hz
0: 3840x2160, 23bit, 59Hz
0: 3840x2160, 23bit, 24Hz
0: 3840x2160, 23bit, 23Hz
0: 3840x2160, 23bit, 25Hz
0: 3840x2160, 23bit, 30Hz
0: 3840x2160, 23bit, 29Hz
0: 3840x2160, 23bit, 50Hz
0: 4096x2160, 23bit, 50Hz
0: 4096x2160, 23bit, 60Hz
0: 4096x2160, 23bit, 59Hz
0: 4096x2160, 23bit, 24Hz
0: 4096x2160, 23bit, 23Hz
0: 4096x2160, 23bit, 25Hz
0: 4096x2160, 23bit, 30Hz
0: 4096x2160, 23bit, 29Hz
0: 640x480, 13bit, 60Hz
0: 640x480, 13bit, 72Hz
0: 640x480, 13bit, 75Hz
0: 720x480, 13bit, 60Hz
0: 720x480, 13bit, 59Hz
0: 720x576, 13bit, 50Hz
0: 800x600, 13bit, 60Hz
0: 800x600, 13bit, 72Hz
0: 800x600, 13bit, 75Hz
0: 1024x768, 13bit, 60Hz
0: 1024x768, 13bit, 70Hz
0: 1024x768, 13bit, 75Hz
0: 1152x864, 13bit, 75Hz
0: 1176x664, 13bit, 50Hz
0: 1176x664, 13bit, 60Hz
0: 1176x664, 13bit, 59Hz
0: 1280x720, 13bit, 60Hz
0: 1280x720, 13bit, 59Hz
0: 1280x720, 13bit, 50Hz
0: 1280x768, 13bit, 60Hz
0: 1280x800, 13bit, 60Hz
0: 1280x960, 13bit, 60Hz
0: 1280x960, 13bit, 75Hz
0: 1280x1024, 13bit, 60Hz
0: 1280x1024, 13bit, 75Hz
0: 1360x768, 13bit, 60Hz
0: 1366x768, 13bit, 60Hz
0: 1600x900, 13bit, 60Hz
0: 1600x1024, 13bit, 59Hz
0: 1600x1024, 13bit, 60Hz
0: 1600x1200, 13bit, 60Hz
0: 1600x1200, 13bit, 59Hz
0: 1600x1200, 13bit, 24Hz
0: 1600x1200, 13bit, 23Hz
0: 1600x1200, 13bit, 25Hz
0: 1600x1200, 13bit, 30Hz
0: 1600x1200, 13bit, 29Hz
0: 1600x1200, 13bit, 50Hz
0: 1680x1050, 13bit, 59Hz
0: 1680x1050, 13bit, 60Hz
0: 1768x992, 13bit, 25Hz
0: 1768x992, 13bit, 30Hz
0: 1768x992, 13bit, 29Hz
0: 1920x1080, 13bit, 60Hz
0: 1920x1080, 13bit, 59Hz
0: 1920x1080, 13bit, 50Hz
0: 1920x1080, 13bit, 30Hz
0: 1920x1080, 13bit, 29Hz
0: 1920x1080, 13bit, 25Hz
0: 1920x1080, 13bit, 24Hz
0: 1920x1080, 13bit, 23Hz
0: 1920x1200, 13bit, 60Hz
0: 1920x1200, 13bit, 59Hz
0: 1920x1200, 13bit, 24Hz
0: 1920x1200, 13bit, 23Hz
0: 1920x1200, 13bit, 25Hz
0: 1920x1200, 13bit, 30Hz
0: 1920x1200, 13bit, 29Hz
0: 1920x1200, 13bit, 50Hz
0: 1920x1440, 13bit, 60Hz
0: 1920x1440, 13bit, 59Hz
0: 1920x1440, 13bit, 24Hz
0: 1920x1440, 13bit, 23Hz
0: 1920x1440, 13bit, 25Hz
0: 1920x1440, 13bit, 30Hz
0: 1920x1440, 13bit, 29Hz
0: 1920x1440, 13bit, 50Hz
0: 2048x1536, 13bit, 60Hz
0: 2048x1536, 13bit, 59Hz
0: 2048x1536, 13bit, 24Hz
0: 2048x1536, 13bit, 23Hz
0: 2048x1536, 13bit, 25Hz
0: 2048x1536, 13bit, 30Hz
0: 2048x1536, 13bit, 29Hz
0: 2048x1536, 13bit, 50Hz
0: 2560x1440, 13bit, 60Hz
0: 2560x1440, 13bit, 59Hz
0: 2560x1440, 13bit, 24Hz
0: 2560x1440, 13bit, 23Hz
0: 2560x1440, 13bit, 25Hz
0: 2560x1440, 13bit, 30Hz
0: 2560x1440, 13bit, 29Hz
0: 2560x1440, 13bit, 50Hz
0: 2560x1600, 13bit, 60Hz
0: 2560x1600, 13bit, 59Hz
0: 2560x1600, 13bit, 24Hz
0: 2560x1600, 13bit, 23Hz
0: 2560x1600, 13bit, 25Hz
0: 2560x1600, 13bit, 30Hz
0: 2560x1600, 13bit, 29Hz
0: 2560x1600, 13bit, 50Hz
0: 1440x900, 13bit, 60Hz
0: 1400x1050, 13bit, 59Hz
0: 1400x1050, 13bit, 60Hz
0: 3840x2160, 13bit, 60Hz
0: 3840x2160, 13bit, 59Hz
0: 3840x2160, 13bit, 24Hz
0: 3840x2160, 13bit, 23Hz
0: 3840x2160, 13bit, 25Hz
0: 3840x2160, 13bit, 30Hz
0: 3840x2160, 13bit, 29Hz
0: 3840x2160, 13bit, 50Hz
0: 4096x2160, 13bit, 50Hz
0: 4096x2160, 13bit, 60Hz
0: 4096x2160, 13bit, 59Hz
0: 4096x2160, 13bit, 24Hz
0: 4096x2160, 13bit, 23Hz
0: 4096x2160, 13bit, 25Hz
0: 4096x2160, 13bit, 30Hz
0: 4096x2160, 13bit, 29Hz
0: Config screen resolution is invalid - trying desktop res (fullscreen) or first resolution below desktop (windowed)
0: Changing video mode to 1400x1050, windowed, 32 bit colour
|
Andrew Gillett
Member #15,868
January 2015
|
That's fine, it's just a reminder to me that I need to update that code at some point. |
Chris Katko
Member #1,881
January 2002
|
One thing I notice is al_set_target_bitmap(dest); being called for every sprite. That may incur a cost (and the driver update may have reduced that cost). If the target is always a render buffer, or the screen, etc., you should only call it once [per actual need], as the driver may be using a change of [current target] to decide when to cache or move memory around. (Complete guess, but it's definitely different from how I've ever coded my projects.) |
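A minimal sketch of the "only call it when the target changes" idea, assuming nothing about the actual project code. `Bitmap`, `TargetCache` and the injected `set_target` callback are hypothetical stand-ins so the example builds without Allegro; in a real program the callback would wrap `al_set_target_bitmap`.

```cpp
#include <functional>
#include <utility>

// Hypothetical stand-in for ALLEGRO_BITMAP so the sketch builds without Allegro.
struct Bitmap {};

// Forwards to the real "set target" call only when the target actually changes.
class TargetCache {
public:
    explicit TargetCache(std::function<void(Bitmap*)> set_target)
        : set_target_(std::move(set_target)) {}

    void use(Bitmap* dest) {
        if (dest == current_) {
            return;  // same target as last time: skip the (possibly costly) call
        }
        set_target_(dest);
        current_ = dest;
    }

    // Call this if other code changes the target behind the cache's back.
    void invalidate() { current_ = nullptr; }

private:
    std::function<void(Bitmap*)> set_target_;
    Bitmap* current_ = nullptr;
};
```

Drawing many sprites to the same destination through `cache.use(dest)` then results in one underlying call per change of destination rather than one per sprite. Whether that helps on a given driver is something only a profile can confirm.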
Andrew Gillett
Member #15,868
January 2015
|
I mentioned in a previous post: This is the code which prints the modes:
for (int i = 0; i < al_get_num_display_modes(); ++i)
{
    ALLEGRO_DISPLAY_MODE mode;
    al_get_display_mode(i, &mode);
    Log("%dx%d, %dbit, %dHz\n", mode.width, mode.height, mode.format, mode.refresh_rate);
}
|
Chris Katko
Member #1,881
January 2002
|
The profile shows 7% of the time was spent just in al_set_target() on my machine, though I'm not used to this MSVC UI for profiling.
[edit] 24% was in waiting for events, so it's either vsync or set to only draw at a set rate.
[edit] I set
al_set_new_display_option(ALLEGRO_VSYNC, 2, ALLEGRO_REQUIRE); //vsync off
600: Advance: 0.0ms Render1: 7.0ms Render2: 0.5ms Frame: 7.4ms
600: Advance: 0.0ms Render1: 7.5ms Render2: 0.5ms Frame: 8.0ms
600: Advance: 0.0ms Render1: 7.9ms Render2: 0.3ms Frame: 8.3ms
600: Advance: 0.0ms Render1: 9.9ms Render2: 0.3ms Frame: 10.3ms
600: Advance: 0.0ms Render1: 7.3ms Render2: 0.3ms Frame: 7.6ms
600: Advance: 0.0ms Render1: 9.5ms Render2: 0.3ms Frame: 9.8ms
600: Advance: 0.0ms Render1: 7.3ms Render2: 0.3ms Frame: 7.7ms
600: Advance: 0.0ms Render1: 9.5ms Render2: 0.5ms Frame: 10.0ms
600: Advance: 0.0ms Render1: 7.9ms Render2: 0.4ms Frame: 8.2ms
600: Advance: 0.0ms Render1: 7.7ms Render2: 0.3ms Frame: 8.0ms
I'm getting over 100 FPS even with al_set_target_bitmap (still uses 8% CPU time!). Granted, it's a GTX 1060. Still, it's showing plenty of time taken in the event queue, so maybe I'll need to force no vsync with my drivers. That, or you did something wrong with the event queue/timers and it's forcing it to wait.
[edit] Still blowing 25% of the time waiting for events with vsync forced off. Take a look through your event handler / timing code when you can, I'm out of time for the moment.
//25% in this block
//psector.cpp:354
while (!al_is_event_queue_empty(eventQueue))
{
    ALLEGRO_EVENT event;

    al_wait_for_event(eventQueue, &event);

    switch (event.type)
    {
    case ALLEGRO_EVENT_DISPLAY_CLOSE:
        quit = true;
        break;
    case ALLEGRO_EVENT_DISPLAY_LOST:
        Log("ALLEGRO_EVENT_DISPLAY_LOST\n");
        break;
    case ALLEGRO_EVENT_DISPLAY_HALT_DRAWING:
        Log("ALLEGRO_EVENT_DISPLAY_HALT_DRAWING\n");
        break;
    case ALLEGRO_EVENT_DISPLAY_RESUME_DRAWING:
        Log("ALLEGRO_EVENT_DISPLAY_RESUME_DRAWING\n");
        break;
    case ALLEGRO_EVENT_TIMER:
        ++timer_value;
        ++totalTicks;
        break;
    }
}
You know, the second I look at that, it looks completely normal. I was expecting something more... squirrelly.
[edit] Wait, no... how... if you're waiting for Allegro to fire off an event, it's literally impossible to exceed that speed. I don't think that's how you'd set up timing for a benchmark. Here you set a timer based on the refresh rate of the screen:
//psector.cpp:232
alTimer = al_create_timer(1.0 / update_rate);
al_register_event_source(eventQueue, al_get_timer_event_source(alTimer));
al_start_timer(alTimer);
and that's the only thing triggering the exit from the while loop above. p.s. profilers rule. |
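To illustrate the point about timer-gated loops: when a frame may only start after a timer tick, the frame count can never exceed the tick count, however fast each frame renders. The following is a small simulation of that, not the project's loop; `LoopResult`, `simulate_timer_gated` and all parameters are hypothetical, and no Allegro calls are involved.

```cpp
#include <cstdint>

// Result of one simulated run: timer ticks delivered and frames drawn.
struct LoopResult {
    std::int64_t ticks;
    std::int64_t frames;
};

// Simulates a loop that may only start a frame once at least one timer tick
// has arrived. All times are in microseconds; tick_period_us must be positive.
// Hypothetical model only, not the project's actual loop.
LoopResult simulate_timer_gated(std::int64_t tick_period_us,
                                std::int64_t render_us,
                                std::int64_t duration_us) {
    LoopResult result{0, 0};
    std::int64_t now = 0;
    std::int64_t next_tick = tick_period_us;

    while (next_tick <= duration_us) {
        if (now < next_tick) {
            now = next_tick;  // idle: a profiler would attribute this to waiting
        }
        // Drain every tick that is already due, like emptying the event queue.
        while (next_tick <= now && next_tick <= duration_us) {
            ++result.ticks;
            next_tick += tick_period_us;
        }
        now += render_us;  // draw exactly one frame
        ++result.frames;
    }
    return result;
}
```

With a 10 ms tick and a 2 ms frame, the model draws one frame per tick and idles for the rest of each period; making the frame cheaper does not raise the frame count. That is consistent with a large share of profile time showing up as waiting for events.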
Peter Hull
Member #1,136
March 2001
|
Great work CK! Just one thing, Chris Katko said: Well this looks wrong. I sure as heck don't have a "23bit" or "13bit" color screen. The format member is not a bit depth; it's a pixel format, one of these. So 13 is ALLEGRO_PIXEL_FORMAT_RGB_565 and 23 is ALLEGRO_PIXEL_FORMAT_XRGB_8888. Hopefully that makes more sense.
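As a rough illustration of the distinction, the sketch below maps the two values seen in the log to their pixel format names and bit depths. The table is hand-written for just these two formats and `describe_format` is a hypothetical helper; in a real Allegro program, `al_get_pixel_format_bits(mode.format)` should give the depth directly.

```cpp
#include <string>

// Name and bit depth for a pixel format id.
struct FormatInfo {
    std::string name;
    int bits;
};

// Covers only the two ids that appear in the log above; anything else is
// reported as unknown rather than guessed.
FormatInfo describe_format(int format) {
    switch (format) {
        case 13: return {"ALLEGRO_PIXEL_FORMAT_RGB_565", 16};
        case 23: return {"ALLEGRO_PIXEL_FORMAT_XRGB_8888", 32};
        default: return {"(not covered by this sketch)", 0};
    }
}
```

Read this way, the "13bit" entries in the log are 16-bit modes and the "23bit" entries are 32-bit modes.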
|
Andrew Gillett
Member #15,868
January 2015
|
The game runs at a maximum of 60fps. The timings I've been looking at are those measured by my own timing system, which measures the render1 and render2 times and outputs them once the level has been running for 10 seconds. You're right in saying that this system isn't ideal for profiling using the MSVC profiler. Commenting out "if (timer_value > 0)" and "if (timer_value == 0 || ++framesSkipped == 4)" should allow it to run flat-out.
UPDATE: Direct3D
Without targetBitmap optimisation
Render1: 9.3ms Render2: 0.0ms Flip: 0.5ms Frame: 9.8ms
Render1: 9.2ms Render2: 0.0ms Flip: 0.6ms Frame: 9.9ms
Render1: 9.3ms Render2: 0.0ms Flip: 0.7ms Frame: 9.9ms
Render1: 8.9ms Render2: 0.0ms Flip: 0.5ms Frame: 9.5ms
Render1: 9.0ms Render2: 0.0ms Flip: 0.5ms Frame: 9.5ms
With targetBitmap optimisation
Render1: 6.5ms Render2: 0.0ms Flip: 0.9ms Frame: 7.5ms
Render1: 6.9ms Render2: 0.0ms Flip: 1.0ms Frame: 7.9ms
Render1: 6.9ms Render2: 0.0ms Flip: 0.9ms Frame: 7.9ms
Render1: 7.4ms Render2: 0.0ms Flip: 1.0ms Frame: 8.4ms
Render1: 6.7ms Render2: 0.0ms Flip: 1.0ms Frame: 7.7ms
I also added an option for it to run flat out, without waiting for a timer event. When I enable that, I get these results:
With RUN_FLAT_OUT and targetBitmap optimisation
Render1: 7.9ms Render2: 0.0ms Flip: 8.0ms Frame: 16.0ms
Render1: 7.6ms Render2: 0.0ms Flip: 8.0ms Frame: 15.7ms
Render1: 7.4ms Render2: 0.0ms Flip: 7.7ms Frame: 15.2ms
Render1: 7.4ms Render2: 0.0ms Flip: 7.3ms Frame: 14.7ms
Render1: 6.0ms Render2: 0.0ms Flip: 8.7ms Frame: 14.8ms
This is with vsync off using ALLEGRO_REQUIRE, although as you say the driver does not necessarily honour that. Updated project: See these lines in updated project: |
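Reporting Flip separately from Render2 amounts to timing each stage on its own and averaging over many frames. A minimal sketch of such a per-stage timer using `std::chrono` follows; `StageTimer` and the stage names are hypothetical and not taken from the project's timing system.

```cpp
#include <chrono>
#include <map>
#include <string>

// Accumulates wall-clock time per named stage. Hypothetical helper, not the
// project's own timing system.
class StageTimer {
public:
    // Runs `work` and adds its duration to the named stage.
    template <typename Work>
    void measure(const std::string& stage, Work&& work) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const std::chrono::duration<double, std::milli> elapsed =
            std::chrono::steady_clock::now() - start;
        Entry& entry = stages_[stage];
        entry.total_ms += elapsed.count();
        ++entry.samples;
    }

    // Number of measurements recorded for a stage (0 if never measured).
    int samples(const std::string& stage) const {
        const auto it = stages_.find(stage);
        return it == stages_.end() ? 0 : it->second.samples;
    }

    // Mean duration in milliseconds (0.0 if never measured).
    double average_ms(const std::string& stage) const {
        const auto it = stages_.find(stage);
        if (it == stages_.end() || it->second.samples == 0) return 0.0;
        return it->second.total_ms / it->second.samples;
    }

private:
    struct Entry {
        double total_ms = 0.0;
        int samples = 0;
    };
    std::map<std::string, Entry> stages_;
};
```

In a real frame loop this might look like `timer.measure("Flip", [] { al_flip_display(); });`. One caveat: a CPU-side timer like this measures how long each call blocks, and drivers may queue drawing work and only wait for it at the flip, so time can appear to move from Render1 to Flip without the total changing much.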
Chris Katko
Member #1,881
January 2002
|
Andrew Gillett said: I added the optimisation to only call al_set_target_bitmap when it needs to change. I see a boost to Render1 performance but strangely, al_flip_display consistently takes longer. I have split it out from the Render2 timing, which confirms that all the Render2 time is taken up by al_flip_display. That still does seem like vsync (or another hard limiter). So the question now isn't so much performance but "is there something wrong in your code, or does everyone else have the same FPS limit when using Allegro in Windows?" Also, do you have Intel, AMD, or nVidia hardware? Thanks, [edit] I dev in Linux and I've never had these problems, so I'm guessing it's a Windows-only issue unless there's another squirrelly timing issue hidden. |
Andrew Gillett
Member #15,868
January 2015
|
Intel Core i5-3570K with integrated Intel HD Graphics 4000 GPU |
Chris Katko
Member #1,881
January 2002
|
Is it a laptop, with some sort of power saving mode? Andrew Gillett said: This is with vsync off using ALLEGRO_REQUIRE, although as you say the driver does not necessarily honour that. Oddly enough, with REQUIRE, it's supposed to fail/crash if it can't force it. But who knows if that actually works. |
Andrew Gillett
Member #15,868
January 2015
|
No, it's a desktop. |
Edgar Reynaldo
Major Reynaldo
May 2007
|
Allegro can't always override the driver's decision about VSYNC. To be sure, you need to disable VSYNC in your driver for your application. ALLEGRO_VSYNC is just a suggestion, even with ALLEGRO_REQUIRE, because the driver is in control. A 16ms frame corresponds to 60Hz, which indicates VSYNC. |
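The arithmetic behind "16ms is 60Hz" is just the frame period, 1000 ms / 60 ≈ 16.7 ms. A small sketch of that check follows; `frame_period_ms`, `looks_vsync_limited` and the 2 ms tolerance are hypothetical choices for illustration, and a match is only a hint, not proof that vsync is active.

```cpp
// Frame period in milliseconds for a refresh rate in Hz.
constexpr double frame_period_ms(double refresh_hz) {
    return 1000.0 / refresh_hz;
}

// True when a measured frame time lies within `tolerance_ms` of the refresh
// period. The default tolerance is an arbitrary assumption for this sketch.
bool looks_vsync_limited(double frame_ms, double refresh_hz,
                         double tolerance_ms = 2.0) {
    const double diff = frame_ms - frame_period_ms(refresh_hz);
    return diff > -tolerance_ms && diff < tolerance_ms;
}
```

By this measure the 16.0 ms RUN_FLAT_OUT frame posted earlier in the thread sits close to the 60 Hz period, while the 7.5 ms frame from the timer-gated run does not.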
|