Out of curiosity, I recently spent some time (maybe too much time) taking measurements of my software renderer, DIWide. I compiled it with different compilers, at different optimization levels, running different code paths, and measured the speed of the generated code, as well as the size of the resulting executable and the time it took to compile.
Here is the code being measured: https://github.com/notnullnotvoid/DIWide/tree/c4fa0b297b1f30794fe01e796e6ca595d106abc3
The program in question has a complex inner loop with many small functions to inline, a wealth of common subexpressions and loop-invariant calculations to hoist out, a healthy mix of integer and floating point calculations, and a variety of memory access patterns and different kinds of instructions to schedule. For this reason, I consider it a good code sample for testing the compiler's overall capabilities, or at least better than a series of microbenchmarks.
methodology: [ul] [li]Execution times represent time