Hi all! As promised, in this week’s blog post we’ll look more into the performance issues I faced while porting.
The progress so far
For those who did not read my previous blog post, I’ll quickly recap where the performance stood.
We had the game running at 10 fps when in reality it should run at 60 fps. I tried optimizing some minor things but did not gain any massive advantage in fps. This prompted a deeper look into my blitting code and how the game blits the level itself.
A deeper look into the problem
Before we dive into the solutions, we must first make sure we understand the root cause of the problem and build a mental picture of what we can realistically optimize and what we can’t.
Let’s look at the video again.
Notice how the camera moves around with the player? This means that everything on the screen (every tile, every object, etc.) needs to be redrawn every frame. As you can imagine, this is really costly.
By looking at the video, we can think of the following optimizations:
- This game did work at 60 fps in the stock build, which suggests that ScummVM’s blitting code is the slow part. If we could somehow optimize the blitting code, we could gain a lot of performance.
- Most of the objects on the screen appear to be static, i.e., props and tiles do not seem to change a lot. We could pre-render the stage/level to a texture and move that texture around as per our needs.
Let’s talk about these ideas in a bit more detail.
The important thing to ask about the first idea is why. Why are ScummVM’s blitting calls slower than SDL’s? The answer is vectorization. SDL has vectorized blitting routines which are orders of magnitude faster. I suggest reading Eklipsed’s excellent blog on the same subject, where he talks about this in a more technical fashion. As he puts it: “Overall the worst culprit was converting the pixel formats, and then blending.” If we could somehow optimize this, we would definitely get better performance.
Coming to the next idea, the question that instantly comes to mind is: how is it any better if we still have to redraw the whole screen every time? The thing is that every level is divided into several layers, and each of these layers is drawn over the others to render the full level. The bottom layers can contain pixels that are then overdrawn by pixels of the upper layers, which wastes a lot of CPU cycles. We could simply pre-render all of this to a texture and move that texture around. Since we won’t have to redraw every layer again and again, we can expect to gain some performance.
Now, we’ll see how I actually implemented these ideas.
Idea 1 (Optimize pixel conversion and alpha blending/testing)
Both our screen and our textures are RGBA8888, so we actually don’t need to convert pixel formats at all. 😉
However, we still need to access every individual pixel because of alpha blending/testing. As per a suggestion by sev, I implemented alpha testing as a quick test, since all it has to do is check whether a given pixel has a non-zero alpha value.
Sadly, this did not give us the massive improvement I was hoping for.
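To make the alpha-testing idea concrete, here is a minimal sketch in C++. It is not the actual blitting code from the port: `Surface32` and `blitAlphaTest` are hypothetical names made up for illustration, and the sketch assumes that alpha sits in the lowest byte of each 32-bit pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 32-bit surface, for illustration only.
// Assumption: alpha is stored in the lowest byte of each pixel.
struct Surface32 {
    int w, h;
    std::vector<std::uint32_t> pixels;

    Surface32(int width, int height, std::uint32_t fill = 0)
        : w(width), h(height),
          pixels(static_cast<std::size_t>(width) * height, fill) {
        assert(width > 0 && height > 0);
    }
};

// Copy src onto dst at (dx, dy). Pixels whose alpha is zero are skipped,
// everything else overwrites the destination without any blending.
void blitAlphaTest(Surface32 &dst, const Surface32 &src, int dx, int dy) {
    for (int y = 0; y < src.h; ++y) {
        const int ty = dy + y;
        if (ty < 0 || ty >= dst.h)
            continue; // row is off-screen
        for (int x = 0; x < src.w; ++x) {
            const int tx = dx + x;
            if (tx < 0 || tx >= dst.w)
                continue; // column is off-screen
            const std::uint32_t p = src.pixels[static_cast<std::size_t>(y) * src.w + x];
            if ((p & 0xFFu) != 0) // the alpha test
                dst.pixels[static_cast<std::size_t>(ty) * dst.w + tx] = p;
        }
    }
}
```

Note that the inner loop still visits every source pixel, which may be part of why alpha testing alone did not help much.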
Idea 2 (Optimize drawing code)
At the cost of increased load times, we could pre-render a level with its different layers to a texture. We can actually combine idea 1 and idea 2 to create a more optimized solution. Theoretically, once you have the texture of the level, you just need to copy it to the screen. You do not need to perform any sort of blending or pixel conversion! This sounds like a perfect use-case for memcpy, and guess what? Yup, you guessed it right, memcpy is vectorized.
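As a rough sketch of what that copy could look like, assume the level has already been flattened into a single 32-bit texture at load time. The names `LevelTexture` and `copyCameraWindow` are hypothetical, and the screen is modeled as a plain vector rather than a real ScummVM surface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical container for the pre-rendered (already flattened) level.
struct LevelTexture {
    int w, h;
    std::vector<std::uint32_t> pixels;
};

// Copy the part of the level seen by the camera to the screen.
// Both buffers share the same pixel format, so each row is a plain memcpy
// with no per-pixel conversion, blending, or testing.
void copyCameraWindow(const LevelTexture &level,
                      std::vector<std::uint32_t> &screen,
                      int screenW, int screenH, int camX, int camY) {
    assert(screenW > 0 && screenH > 0);
    assert(screenW <= level.w && screenH <= level.h);
    assert(screen.size() == static_cast<std::size_t>(screenW) * screenH);

    // Keep the camera window inside the level.
    camX = std::max(0, std::min(camX, level.w - screenW));
    camY = std::max(0, std::min(camY, level.h - screenH));

    for (int y = 0; y < screenH; ++y) {
        const std::uint32_t *srcRow =
            &level.pixels[static_cast<std::size_t>(camY + y) * level.w + camX];
        std::uint32_t *dstRow = &screen[static_cast<std::size_t>(y) * screenW];
        std::memcpy(dstRow, srcRow, static_cast<std::size_t>(screenW) * sizeof(std::uint32_t));
    }
}
```

The copy goes row by row because the level is wider than the screen, so the visible window is not contiguous in memory. A single memcpy for the whole frame would only work if the window spanned the full width of the level.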
Implementing this was a bit harder, but at last, I was able to get it to work.
We finally get the glorious 60 fps we had been waiting for. But as some of you might have already noticed, layering is broken. 🙂
Since we’re pre-rendering the object and prop layers, the character is simply drawn over the texture, totally breaking layering. There is one possible solution to this: calculate which object is colliding with the player and redraw that.
However, while this works for objects, this does not work with the normal layer.
What I ended up doing was re-rendering only the parts of the normal layer and the objects that were colliding with the character. This performs even better, since we redraw only those parts and not the whole texture!
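One way to picture the "redraw only what collides" step is as a rectangle intersection test. The sketch below is a guess at the general shape, not the code from the port: `Rect`, `intersect`, and `regionsToRedraw` are hypothetical names, and real sprites would likely need more than bounding boxes.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical axis-aligned rectangle; right and bottom are exclusive.
struct Rect {
    int left, top, right, bottom;
    bool isEmpty() const { return left >= right || top >= bottom; }
};

// Overlapping area of two rectangles (empty if they do not touch).
Rect intersect(const Rect &a, const Rect &b) {
    return { std::max(a.left, b.left), std::max(a.top, b.top),
             std::min(a.right, b.right), std::min(a.bottom, b.bottom) };
}

// Collect the regions that overlap the player. Only these would need to be
// re-rendered on top of the character; the rest of the pre-rendered
// texture is left untouched.
std::vector<Rect> regionsToRedraw(const Rect &player,
                                  const std::vector<Rect> &objects) {
    assert(!player.isEmpty());
    std::vector<Rect> regions;
    for (const Rect &object : objects) {
        const Rect overlap = intersect(player, object);
        if (!overlap.isEmpty())
            regions.push_back(overlap);
    }
    return regions;
}
```

For a player at (10, 10) to (20, 30) and an object at (15, 0) to (40, 15), only the 5x5 overlap would be redrawn instead of the whole object.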
I would like to note that the solution is not perfect but it does work quite nicely.
TL;DR: I was able to optimize the code and get it to render at 60 fps albeit with some small issues.
Thanks for reading and please look forward to future blog posts!