OpenGL Programming/Performance
When to optimize
A common pitfall is to become obsessed with optimization, spending a lot of coding time on tiny gains while making the code more complex and harder to debug.
Optimize only when you have evidence that the code in question truly slows down the application. Make measurements, and compare across different use cases and, if possible, different hardware.
Also, we recommend that when implementing a new feature, you write your first version as clearly as possible, and optimize it in a second step once the feature works correctly.
Measuring frame time
A common way to measure performance in applications is FPS, or frames per second. However, FPS is the inverse of the time spent per frame, so it is not linear in the actual rendering cost. This is described at length elsewhere on the web, but the basic idea is this: a given change in FPS does not correspond to a fixed change in rendering time. A drop of 450 FPS from 900 FPS adds about 1.1 milliseconds per frame, while a drop of only 5 FPS from 60 FPS adds about 1.5 milliseconds. The lower the starting frame rate, the more time each lost frame per second represents, so as FPS tends towards 0, frame time grows very quickly. If you only look at FPS, it is easy to shrug off a small drop that in fact hides a significant slowdown.
Instead, you should use frame time, the amount of time needed to render one frame. Although it may seem less intuitive, it grows in direct proportion to the work done per frame, which makes it a reliable way to tell whether a piece of code is a bottleneck.
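To make the non-linearity concrete, here is a minimal sketch that converts frame rates to frame times. The helper names `frame_time_ms`, `added_frame_time_ms` and `print_drop` are illustrative only; they are not part of OpenGL or GLUT.

```cpp
#include <iostream>

// Time spent on one frame, in milliseconds, at a given frame rate.
double frame_time_ms(double fps) {
    return 1000.0 / fps;
}

// Extra milliseconds per frame when the rate drops from fps_before to fps_after.
double added_frame_time_ms(double fps_before, double fps_after) {
    return frame_time_ms(fps_after) - frame_time_ms(fps_before);
}

// Print the cost of a frame-rate drop, expressed as added frame time.
void print_drop(double fps_before, double fps_after) {
    std::cout << fps_before << " -> " << fps_after << " FPS: +"
              << added_frame_time_ms(fps_before, fps_after)
              << " ms per frame" << std::endl;
}
```

Calling `print_drop(900, 450)` and `print_drop(60, 55)` reports roughly 1.1 ms and 1.5 ms of added frame time: two very different FPS drops with a similar real cost.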
Here is a simple snippet that prints the average time taken per frame (in milliseconds) to the console, about once per second:
/* Global */
static unsigned int fps_start = 0;
static unsigned int fps_frames = 0;
/* init_resources() */
fps_start = glutGet(GLUT_ELAPSED_TIME);
/* idle() */
/* Frame time, averaged over a window of at least one second */
{
  fps_frames++;
  unsigned int now = glutGet(GLUT_ELAPSED_TIME);
  unsigned int delta_t = now - fps_start;
  if (delta_t > 1000) {
    /* Floating-point division, so that fractions of a millisecond are not truncated */
    cout << (double)delta_t / fps_frames << endl;
    fps_frames = 0;
    fps_start = now;
  }
}
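The same counter can be factored into a small helper that does not depend on GLUT, which makes it easier to reuse and to test. This is only a sketch: the struct `FrameTimeAverager` and its members are hypothetical names, and it assumes the caller supplies a millisecond timestamp such as the value of `glutGet(GLUT_ELAPSED_TIME)`.

```cpp
// Hypothetical helper (not part of GLUT): averages frame time over a window.
struct FrameTimeAverager {
    unsigned int window_start_ms;  // timestamp at which the current window began
    unsigned int frames;           // frames counted in the current window
    unsigned int window_ms;        // minimum window length before reporting

    explicit FrameTimeAverager(unsigned int start_ms, unsigned int window = 1000)
        : window_start_ms(start_ms), frames(0), window_ms(window) {}

    // Call once per frame with the current time in milliseconds.
    // Returns the average frame time in ms when a window completes,
    // or a negative value while the window is still being filled.
    double on_frame(unsigned int now_ms) {
        frames++;
        unsigned int elapsed = now_ms - window_start_ms;
        if (elapsed <= window_ms)
            return -1.0;
        double average = static_cast<double>(elapsed) / frames;
        frames = 0;
        window_start_ms = now_ms;
        return average;
    }
};
```

In `idle()`, one would call `on_frame(glutGet(GLUT_ELAPSED_TIME))` and print the result whenever it is non-negative.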
Vertical sync
Often, OpenGL is configured to wait for the physical screen's vertical refresh before pushing the new color buffer:
- this prevents tearing, a visual artifact that mixes part of the previous buffer with part of the new one
- there is no visual need to display more frames than the screen can handle, typically 60-75 Hz (depending on the screen)
- this saves resources such as the battery
However, this means that even if the application could render 200 FPS, it will be capped at 60-75 FPS, which makes it difficult to measure performance.
In such cases, it can be useful to disable vertical sync:
- your graphics card driver may come with utilities to enable and disable it
- with Mesa, you can start your application with vblank_mode=0 set in the environment, or configure this behavior more permanently in ~/.drirc; some earlier versions have bugs, so you may have to try both.
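For benchmarking builds, the Mesa variable can also be set from inside the program instead of on the command line. The sketch below is an assumption-laden variant of the tip above, not a documented GLUT feature: it assumes a POSIX system (for `setenv`), assumes the Mesa driver reads `vblank_mode` from the environment when the OpenGL context is created, and uses a hypothetical function name. Other drivers may simply ignore the variable.

```cpp
#include <stdlib.h>  // setenv (POSIX)

// Hypothetical helper: ask Mesa not to wait for the vertical refresh.
// Call it before the OpenGL context is created, e.g. before glutInit().
// Returns true if the environment was updated (or already had a value).
bool request_vblank_mode_0() {
    // Third argument 0: keep any value the user already set on the command line.
    return setenv("vblank_mode", "0", 0) == 0;
}
```

Because the variable is not overwritten, launching the program with an explicit vblank_mode value still takes precedence.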
Stencil buffer tests
See the performance tips in the Stencil buffer section.