Since I spent this last week profiling to figure out why the C++ version turned out slower than the C one, I am going to share some tips and hints to follow when refactoring in order to keep a decent level of performance.
As planned, I am still refactoring the maths code, as I’ve encountered some problems along the way.
The renderer is now working, but there are some minor lighting issues that I am trying to address. However, the code is now much cleaner and more readable than before, and to show this I am going to paste snippets of the same code before and after refactoring:
Before refactoring –
float *m;
V4 *n;

if (c->lighting_enabled) {
    // eye coordinates needed for lighting
    m = &c->matrix_stack_ptr[0]->m[0][0];
    v->ec.X = (v->coord.X * m[0] + v->coord.Y * m[1] + v->coord.Z * m[2] + m[3]);
    v->ec.Y = (v->coord.X * m[4] + v->coord.Y * m[5] + v->coord.Z * m[6] + m[7]);
    v->ec.Z = (v->coord.X * m[8] + v->coord.Y * m[9] + v->coord.Z * m[10] + m[11]);
    v->ec.W = (v->coord.X * m[12] + v->coord.Y * m[13] + v->coord.Z * m[14] + m[15]);

    // projection coordinates
    m = &c->matrix_stack_ptr[1]->m[0][0];
    v->pc.X = (v->ec.X * m[0] + v->ec.Y * m[1] + v->ec.Z * m[2] + v->ec.W * m[3]);
    v->pc.Y = (v->ec.X * m[4] + v->ec.Y * m[5] + v->ec.Z * m[6] + v->ec.W * m[7]);
    v->pc.Z = (v->ec.X * m[8] + v->ec.Y * m[9] + v->ec.Z * m[10] + v->ec.W * m[11]);
    v->pc.W = (v->ec.X * m[12] + v->ec.Y * m[13] + v->ec.Z * m[14] + v->ec.W * m[15]);

    m = &c->matrix_model_view_inv.m[0][0];
    n = &c->current_normal;
    v->normal.X = (n->X * m[0] + n->Y * m[1] + n->Z * m[2]);
    v->normal.Y = (n->X * m[4] + n->Y * m[5] + n->Z * m[6]);
    v->normal.Z = (n->X * m[8] + n->Y * m[9] + n->Z * m[10]);

    if (c->normalize_enabled) {
        gl_V3_Norm(&v->normal);
    }
} else {
    // no eye coordinates needed, no normal
    // NOTE: W = 1 is assumed
    m = &c->matrix_model_projection.m[0][0];
    v->pc.X = (v->coord.X * m[0] + v->coord.Y * m[1] + v->coord.Z * m[2] + m[3]);
    v->pc.Y = (v->coord.X * m[4] + v->coord.Y * m[5] + v->coord.Z * m[6] + m[7]);
    v->pc.Z = (v->coord.X * m[8] + v->coord.Y * m[9] + v->coord.Z * m[10] + m[11]);
    if (c->matrix_model_projection_no_w_transform) {
        v->pc.W = m[15];
    } else {
        v->pc.W = (v->coord.X * m[12] + v->coord.Y * m[13] + v->coord.Z * m[14] + m[15]);
    }
}

v->clip_code = gl_clipcode(v->pc.X, v->pc.Y, v->pc.Z, v->pc.W);
After refactoring –
Matrix4 *m;
Vector4 *n;

if (c->lighting_enabled) {
    // eye coordinates needed for lighting
    m = c->matrix_stack_ptr[0];
    v->ec = m->transform3x4(v->coord);

    // projection coordinates
    m = c->matrix_stack_ptr[1];
    v->pc = m->transform(v->ec);

    m = &c->matrix_model_view_inv;
    n = &c->current_normal;
    v->normal = m->transform3x3(n->toVector3());

    if (c->normalize_enabled) {
        v->normal.normalize();
    }
} else {
    // no eye coordinates needed, no normal
    // NOTE: W = 1 is assumed
    m = &c->matrix_model_projection;
    v->pc = m->transform3x4(v->coord);
    if (c->matrix_model_projection_no_w_transform) {
        v->pc.setW(m->get(3,3));
    }
}

v->clip_code = gl_clipcode(v->pc.getX(), v->pc.getY(), v->pc.getZ(), v->pc.getW());
As you can see, the code is more readable and the operations performed are clearly stated.
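To make the comparison concrete, here is a minimal sketch of what a helper like transform3x4 might do, derived only from the hand-written arithmetic in the "before" snippet: the input is treated as a point with W = 1, so the last column of the matrix is simply added. The names SketchMatrix4 and SketchVector4 and the flat row-major storage are assumptions made for illustration; they are not the actual TinyGL classes.

```cpp
#include <array>

// Hypothetical stand-in for the real vector class.
struct SketchVector4 {
    float x, y, z, w;
};

// Hypothetical stand-in for the real matrix class; values are stored
// row by row, matching the m[0]..m[15] indexing of the old C code.
class SketchMatrix4 {
public:
    explicit SketchMatrix4(const std::array<float, 16> &values) : m(values) {}

    // Transforms a point whose W component is assumed to be 1:
    // the input W is ignored and the fourth column is added as-is.
    SketchVector4 transform3x4(const SketchVector4 &v) const {
        return {
            v.x * m[0]  + v.y * m[1]  + v.z * m[2]  + m[3],
            v.x * m[4]  + v.y * m[5]  + v.z * m[6]  + m[7],
            v.x * m[8]  + v.y * m[9]  + v.z * m[10] + m[11],
            v.x * m[12] + v.y * m[13] + v.z * m[14] + m[15]
        };
    }

private:
    std::array<float, 16> m;
};
```

With a matrix that only translates by (10, 20, 30), the point (1, 2, 3) ends up at (11, 22, 33) with W = 1, whatever W the input carried.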
While fixing some issues that arose during the refactoring I also stumbled upon the git command “stash”. This command lets you set your uncommitted changes aside and apply them again later. I used it to keep my changes while switching between branches to run the old and new versions of the code as I fixed the issues I found, so I highly recommend reading more about it and learning how to use it!
Today I’m going to write a post about how refactoring is useful to increase the readability (and thus, maintainability) of the code.
I will start with a basic example: TinyGL uses a struct called V3 to represent three-dimensional vectors:
struct V3 { float v[3]; };
The API also defines some functions to work with vectors (namely construction, normalization, multiplication, copy, etc.):
V3 gl_V3_New(float x, float y, float z);
int gl_V3_Norm(V3 *a);
void gl_MulM3V3(V3 *a, const M4 *b, const V3 *c);
void gl_MoveV3(V3 *a, const V3 *b);
Which in turn leads to code like this:
V3 a, b;
a = gl_V3_New(5, 5, 5);
M4 transform;
// the C API takes pointers to the result and to both operands
gl_MulM4V3(&b, &transform, &a);
My goal with this refactoring is to increase readability by introducing a class Vector3, which will make creating vectors and using them much easier (I also plan to write classes that represent matrices, so that operations between vectors and matrices can be expressed in a much more concise and intuitive way):
// forward declarations, so Matrix4 can mention the vector types
class Vector3;
class Vector4;

class Matrix4 {
public:
    Matrix4();
    Matrix4(const Matrix4 &other);
    Matrix4 operator=(const Matrix4 &other);
    Matrix4 operator*(const Matrix4 &b);
    static Matrix4 identity();
    Matrix4 transpose() const;
    Matrix4 inverse_ortho() const;
    Matrix4 inverse() const;
    Matrix4 rotation() const;
    Vector3 transform(const Vector3 &vector) const;
    Vector4 transform(const Vector4 &vector) const;
private:
    float m[4][4];
};

class Vector3 {
public:
    Vector3();
    Vector3(const Vector3 &other);
    Vector3(float x, float y, float z);
    Vector3 operator=(const Vector3 &other);
    static Vector3 normal(const Vector3 &v);
    Vector3 operator*(float value);
    Vector3 operator+(const Vector3 &other);
    Vector3 operator-(const Vector3 &other);
private:
    float v[3];
};
With those changes, the previous snippet would look like this:
Vector3 a(5, 5, 5);
Matrix4 transform;
Vector3 b = transform.transform(a);
As we can see from this little snippet, the code is more readable. Plus, now that vectors are represented by classes, we can also overload binary operators such as + and - to express vector addition and subtraction in a more concise way (instead of having to write the addition separately for each component).
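As a rough illustration of the operator overloading mentioned above, here is a minimal sketch. SketchVector3 and its get() accessor are hypothetical names used only for this example, and the operators are declared const here, which is a choice of mine rather than something taken from the class declaration above.

```cpp
// Hypothetical vector class showing component-wise operators.
class SketchVector3 {
public:
    SketchVector3(float x, float y, float z) : v{x, y, z} {}

    // Component-wise addition.
    SketchVector3 operator+(const SketchVector3 &other) const {
        return SketchVector3(v[0] + other.v[0], v[1] + other.v[1], v[2] + other.v[2]);
    }

    // Component-wise subtraction.
    SketchVector3 operator-(const SketchVector3 &other) const {
        return SketchVector3(v[0] - other.v[0], v[1] - other.v[1], v[2] - other.v[2]);
    }

    // Scaling by a scalar.
    SketchVector3 operator*(float value) const {
        return SketchVector3(v[0] * value, v[1] * value, v[2] * value);
    }

    // Read-only access to a component (0 = x, 1 = y, 2 = z).
    float get(int index) const { return v[index]; }

private:
    float v[3];
};
```

With this in place, an expression such as (a + b) * 0.5f reads like the maths it represents, instead of three separate per-component statements.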
That’s all for now! I will try to update the blog again this week to show more examples.
Google Summer of Code begins today, the 19th of May.
I thought it would be nice to share in this blog post my plan for the next weeks and how I am going to proceed with it.
According to my project proposal I will be working on math code refactoring for the next 2 weeks: the reason why I chose to put this task as the first one is that refactoring will allow me to get used to the codebase and coding conventions.
Moreover, refactoring the math code will simplify the task of optimizing it later. Since I am going to refactor a substantial amount of code, I need to make sure that the library runs at the same speed as before once I am done, so I will need to set up some sort of performance benchmark to verify that the refactoring didn’t hurt performance. Refactoring will also make the whole codebase more readable, which will make it easier to spot performance bottlenecks and address them.
Going into more depth about the task: I will need to rewrite most of the maths functions from C to C++, which will involve writing a Vector3/4 class and a Matrix class.
One of my goals for the design of these classes is to make sure that a SIMD implementation will be possible in the future without having to change all the code that is using those classes.
I think that’s all for this blog post. I will keep you updated with code snippets in the future. Thanks for reading!
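A benchmark of this kind can start out very small. The sketch below is only one possible shape for it, not the harness actually used in the project: averageMilliseconds is a hypothetical helper that runs a piece of work a number of times and reports the average wall-clock time per run using std::chrono::steady_clock.

```cpp
#include <chrono>

// Runs `work` `iterations` times and returns the average duration of one
// run in milliseconds. The function name is an assumption for this sketch.
template <typename Func>
double averageMilliseconds(Func &&work, int iterations) {
    using Clock = std::chrono::steady_clock;

    const auto start = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;

    return elapsed.count() / iterations;
}
```

Running the same workload through the old and the new code paths and comparing the two averages gives a first indication of whether a refactoring step changed the speed. Results from a loop like this are noisy, so it is worth repeating the measurement several times.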
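One way to keep that door open, sketched here under my own assumptions rather than as the final design, is to keep the storage private and expose only accessors and whole-vector operations. Callers then never touch the float array directly, so it could later be swapped for a SIMD register type without changing them. SimdReadyVector4 is a hypothetical name, and the 16-byte alignment is just a common requirement for 4-float SIMD loads.

```cpp
// Hypothetical vector class whose internals are hidden from callers.
class SimdReadyVector4 {
public:
    SimdReadyVector4(float x, float y, float z, float w) : data{x, y, z, w} {}

    float getX() const { return data[0]; }
    float getY() const { return data[1]; }
    float getZ() const { return data[2]; }
    float getW() const { return data[3]; }

    // Whole-vector operations: the scalar bodies below are the only code
    // that would need rewriting for a SIMD implementation.
    SimdReadyVector4 operator+(const SimdReadyVector4 &other) const {
        return SimdReadyVector4(data[0] + other.data[0], data[1] + other.data[1],
                                data[2] + other.data[2], data[3] + other.data[3]);
    }

    float dot(const SimdReadyVector4 &other) const {
        return data[0] * other.data[0] + data[1] * other.data[1] +
               data[2] * other.data[2] + data[3] * other.data[3];
    }

private:
    // Aligned so the four floats could be loaded as one 128-bit value.
    alignas(16) float data[4];
};
```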
Hello everyone!
I am extremely happy to be blogging this today, as I have been accepted into the Google Summer of Code 2014 programme!
I will be working on ResidualVM and my goal will be to optimize and refactor TinyGL (a software implementation of OpenGL).
I am still working on my university projects, but I will be done soon and I can’t wait to share this coding journey with other people. I intend to write about the challenges I will be facing during the project, and I think this might be valuable for anyone interested in optimization and software design in general.
Thanks for reading this!
In the past week I’ve been working on refactoring the “z buffer” code.
The main purpose of this class is to store and manage all the rendering information used inside TinyGL.
I started off by moving all the external C functions inside the struct ZBuffer (which was subsequently renamed to FrameBuffer). Then I removed all direct access to the rendering information by encapsulating it in member functions. This also opened up the possibility of implementing different pixel blending logic directly inside the class, without having to modify every external access to add it.
During the refactoring of the z buffer code I also had to heavily rewrite some functions that performed triangle rasterization, as they relied too much on macros and other C-style performance tricks. I replaced those functions with a single templatized function that handles all the different cases at compile time, yielding a different version of the function for each combination of template parameters.
My task for this week is to provide an implementation of the function glBlendFunc, which will allow the renderer to support alpha, additive and subtractive blending. To implement this I did some research on how the blending should be performed and what the results should look like on screen, and I found this image that visually describes what should happen with every combination of parameters:
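The benefit of routing every write through a member function can be shown with a small sketch. SketchFrameBuffer below is a hypothetical, single-channel simplification and not the real FrameBuffer class: because callers only use writePixel, blending can be switched on inside the class without touching any call site.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-channel frame buffer with encapsulated pixel access.
class SketchFrameBuffer {
public:
    SketchFrameBuffer(int width, int height)
        : _width(width), _height(height),
          _pixels(static_cast<std::size_t>(width) * height, 0),
          _blendingEnabled(false) {}

    void enableBlending(bool enabled) { _blendingEnabled = enabled; }

    // The only way to write a pixel: blending logic lives here.
    void writePixel(int x, int y, std::uint8_t value, std::uint8_t alpha) {
        std::uint8_t &dst = _pixels[static_cast<std::size_t>(y) * _width + x];
        if (_blendingEnabled) {
            // Simple alpha blend between the new value and the stored one.
            dst = static_cast<std::uint8_t>((value * alpha + dst * (255 - alpha)) / 255);
        } else {
            dst = value;
        }
    }

    std::uint8_t readPixel(int x, int y) const {
        return _pixels[static_cast<std::size_t>(y) * _width + x];
    }

private:
    int _width;
    int _height;
    std::vector<std::uint8_t> _pixels;
    bool _blendingEnabled;
};
```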
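The idea of replacing macro-generated variants with one template can be sketched on a much smaller problem than a full triangle rasterizer. In the hypothetical example below, fillSpan fills one scanline span, and two boolean template parameters decide at compile time whether a depth test is performed and whether the colour is interpolated. All names here are assumptions for illustration and the real function handles many more cases.

```cpp
#include <vector>

// Hypothetical description of one horizontal span, covering [xStart, xEnd).
struct SketchSpan {
    int xStart, xEnd;
    float zStart, zStep;
    float colorStart, colorStep;
};

// Each combination of template arguments produces its own specialised
// function; the constant conditions are resolved by the compiler.
template <bool kDepthTest, bool kInterpolateColor>
void fillSpan(const SketchSpan &span, std::vector<float> &color, std::vector<float> &depth) {
    float z = span.zStart;
    float c = span.colorStart;

    for (int x = span.xStart; x < span.xEnd; ++x) {
        bool visible = true;
        if (kDepthTest) {
            visible = z < depth[x];
        }
        if (visible) {
            color[x] = c;
            if (kDepthTest) {
                depth[x] = z;
            }
        }
        z += span.zStep;
        if (kInterpolateColor) {
            c += span.colorStep;
        }
    }
}
```

A call such as fillSpan<true, false>(span, color, depth) selects the depth-tested, flat-coloured variant, so the per-pixel loop carries no runtime checks for features that are switched off.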
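In OpenGL, glBlendFunc selects a source factor and a destination factor, and the blended colour is the source colour times the source factor plus the destination colour times the destination factor. The sketch below shows that formula for a handful of factors only. The enum, struct and function names are hypothetical and do not come from TinyGL, and only the alpha and additive cases are covered.

```cpp
#include <algorithm>

// Hypothetical subset of the blend factors accepted by glBlendFunc.
enum class SketchBlendFactor { Zero, One, SrcAlpha, OneMinusSrcAlpha };

// Colour with components in the range [0, 1].
struct SketchColor {
    float r, g, b, a;
};

// Converts a factor into the scalar it stands for, given the source colour.
inline float factorValue(SketchBlendFactor factor, const SketchColor &src) {
    switch (factor) {
    case SketchBlendFactor::Zero:             return 0.0f;
    case SketchBlendFactor::One:              return 1.0f;
    case SketchBlendFactor::SrcAlpha:         return src.a;
    case SketchBlendFactor::OneMinusSrcAlpha: return 1.0f - src.a;
    }
    return 1.0f;
}

// result = src * sfactor + dst * dfactor, clamped to [0, 1].
inline SketchColor blend(const SketchColor &src, const SketchColor &dst,
                         SketchBlendFactor sfactor, SketchBlendFactor dfactor) {
    const float s = factorValue(sfactor, src);
    const float d = factorValue(dfactor, src);
    auto clamp01 = [](float value) { return std::min(1.0f, std::max(0.0f, value)); };

    return {clamp01(src.r * s + dst.r * d), clamp01(src.g * s + dst.g * d),
            clamp01(src.b * s + dst.b * d), clamp01(src.a * s + dst.a * d)};
}
```

Passing SrcAlpha and OneMinusSrcAlpha gives classic alpha blending, while One and One gives additive blending.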