I spent this last week profiling, trying to figure out why the C++ version turned out to be slower than the C one, so I am going to share some tips that one should follow when refactoring in order to keep a decent level of performance.
Avoid implementing copy constructors if you don’t need to
When I was implementing classes such as Vector3 and Matrix4, I wrote my own copy constructor and assignment operator using memcpy. It turned out that my versions were slower than what the compiler would have generated on its own, so if you don’t have a particular need for them, just avoid implementing them (you’ll also have less code to maintain!).
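To make this concrete, here is a minimal sketch of what "not implementing it" looks like. The layout of Vector3 below is my assumption for illustration (three floats), and copy_of is a hypothetical helper, not code from the actual project.

```cpp
#include <type_traits>

// Hypothetical Vector3: three floats, nothing else.
struct Vector3 {
    float x, y, z;
    // No copy constructor or copy assignment operator declared here:
    // the compiler generates member-wise versions on its own.
};

// Without user-provided copy operations the type stays trivially copyable,
// which leaves the compiler free to copy it in whatever way is cheapest.
static_assert(std::is_trivially_copyable<Vector3>::value,
              "Vector3 should be trivially copyable");

// Hypothetical helper that relies only on the compiler-generated copy.
inline Vector3 copy_of(const Vector3& v) {
    Vector3 result = v;  // implicit copy constructor
    return result;
}
```

If a hand-written copy constructor were added to the struct, the static_assert would fail, which is a cheap way to notice that someone reintroduced one.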
Never assume that return value optimization will be employed
This goes hand in hand with function inlining: you should never assume that the compiler will apply return value optimization (RVO) to the functions you write. When I replaced functions that returned a new object by value with functions that write the result into an object passed by reference, I got a huge increase in performance, so consider this if you’re noticing suspicious performance problems after a refactoring.
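The two styles can be sketched side by side as follows. The Matrix4 layout (16 floats, row-major) and the function names are assumptions made for the example, not the real code. Note also that since C++17 elision is guaranteed when a function returns a temporary directly, while elision of a named local variable (as in the first function below) is still optional, so whether the by-reference version actually wins is something to measure with your own compiler and settings.

```cpp
// Hypothetical Matrix4: 16 floats stored in row-major order.
struct Matrix4 {
    float m[16];
};

// Hypothetical helper that builds an identity matrix.
inline Matrix4 identity() {
    Matrix4 r;
    for (int i = 0; i < 16; ++i) {
        r.m[i] = (i % 5 == 0) ? 1.0f : 0.0f;  // indices 0, 5, 10, 15 are the diagonal
    }
    return r;
}

// Style 1: return the result by value and hope the copy is elided.
inline Matrix4 multiply(const Matrix4& a, const Matrix4& b) {
    Matrix4 r;
    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k) {
                sum += a.m[row * 4 + k] * b.m[k * 4 + col];
            }
            r.m[row * 4 + col] = sum;
        }
    }
    return r;
}

// Style 2: write the result into an object owned by the caller.
// 'out' must not be the same object as 'a' or 'b', since it is
// overwritten while the inputs are still being read.
inline void multiply(const Matrix4& a, const Matrix4& b, Matrix4& out) {
    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k) {
                sum += a.m[row * 4 + k] * b.m[k * 4 + col];
            }
            out.m[row * 4 + col] = sum;
        }
    }
}
```

The second style is less pleasant to read at the call site, so it is probably only worth it in the hot paths that the profiler points at.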
Be careful about function inlining
Even if you put the code in the header file and use the keyword “inline”, you’re just giving the compiler a hint, with no guarantee that the code will actually be inlined. It’s easy to think that an inline function that performs some simple operation and returns a new value will be inlined (and that the compiler will use RVO to remove the unnecessary temporary object), but more often than not the compiler will ignore you and generate more overhead than you expected in the first place.
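As a rough illustration, the sketch below shows a plain inline function next to one marked with a compiler-specific attribute that asks for inlining more forcefully. FORCE_INLINE, Vector3, dot and dot_forced are names I made up for the example. __forceinline (MSVC) and __attribute__((always_inline)) (GCC and Clang) are real compiler extensions, but even those are stronger requests rather than hard guarantees in every situation, so checking the generated assembly or the profiler output is still the only way to be sure.

```cpp
// Hypothetical macro that picks the strongest inlining request available.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline  // fall back to the plain hint
#endif

// Hypothetical Vector3 used only for this example.
struct Vector3 {
    float x, y, z;
};

// Plain 'inline': only a hint, the compiler may still emit a call.
inline float dot(const Vector3& a, const Vector3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Same operation with the stronger, compiler-specific request.
FORCE_INLINE float dot_forced(const Vector3& a, const Vector3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
```

Both functions compute the same value; the only difference is how insistently the compiler is asked to inline them.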
And that’s all for this week!
One more piece of advice: if you’re using the Visual Studio profiler, use its Compare Performance Reports feature, as it will help you a lot in finding out exactly which function is slower than before and by how much.