
Optimizing performance

Hi all! As promised, in this week’s blog post we’ll look more into the performance issues I faced while porting.

The progress so far

For those who did not read my previous blog post, I’ll quickly recap where the performance stood at the end of it.

We had the game running at 10 fps when in reality it should run at 60 fps. I tried optimizing some minor things but did not gain any massive advantage in fps. This prompted a deeper look into my blitting code and how the game blits the level itself.

A deeper look into the problem

Before we dive into the solutions, we must first make sure we understand the root cause of the problem and form a mental picture of what we can and cannot optimize.

Let’s look at the video again.

Notice how the camera moves around with the player? This means that everything on the screen (every tile, every object, etc.) needs to be redrawn every frame. As you can imagine, this is really costly.

By looking at the video, we can think of the following optimizations:

  • This game did work at 60 fps in the stock build, which means ScummVM’s blitting code is slow. If we could somehow optimize the blitting code, we could gain a lot of performance.
  • Most of the objects on the screen appear to be static, i.e., props and tiles do not seem to change a lot. We could pre-render the stage/level to a texture and move that texture around as per our needs.

Let’s talk about these ideas in a bit more detail.

The important thing to ask about the first idea is why. Why are ScummVM’s blitting calls slower than SDL’s? The answer is vectorization. SDL has vectorized blitting routines which are orders of magnitude faster. I suggest reading Eklipsed’s excellent blog on the same subject, where he talks about this in a more technical fashion. To quote: “Overall the worst culprit was converting the pixel formats, and then blending.” If we could somehow optimize this, we would definitely get better performance.

Coming to the next idea, the obvious question is: how is this any better if we still have to redraw the whole screen every frame? The thing is that every level is divided into several layers, and these layers are drawn on top of one another to render the full level. Pixels in the bottom layers are often overdrawn by pixels of the upper layers, which wastes a lot of CPU cycles. We could instead pre-render all of this to a texture once and move that texture around. Since we won’t have to redraw every layer again and again, we can expect to gain some performance.

Now, we’ll see how I actually implemented these ideas.

Solutions:

Idea 1 (Optimize pixel conversion and alpha blending/testing)

Both our screen and our textures are RGBA8888, so we actually don’t need to perform this step at all. 😉

However, we still need to access every individual pixel because of alpha blending/testing. Following a suggestion by sev, I implemented alpha testing as a quick experiment, since all it has to do is check whether a given pixel has a non-zero alpha value and skip the pixel otherwise.
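As a rough illustration of what alpha testing means here, the following is a minimal sketch and not the actual CRAB or ScummVM code. The function name `blitAlphaTest` and its parameters are hypothetical, and it assumes 32-bit pixels with the alpha channel in the low 8 bits and pitches measured in pixels.

```cpp
#include <cstdint>

// Hypothetical helper: copy a w x h block from src to dst, skipping every
// pixel whose alpha is zero. Assumes alpha lives in the low 8 bits of each
// 32-bit pixel and that both pitches are given in pixels, not bytes.
void blitAlphaTest(uint32_t *dst, int dstPitch,
                   const uint32_t *src, int srcPitch,
                   int w, int h) {
    for (int y = 0; y < h; ++y) {
        const uint32_t *s = src + y * srcPitch;
        uint32_t *d = dst + y * dstPitch;
        for (int x = 0; x < w; ++x) {
            // Alpha test: a pixel is either copied as-is or skipped.
            // No blending with the destination happens.
            if ((s[x] & 0xFF) != 0)
                d[x] = s[x];
        }
    }
}
```

Note that partially transparent pixels are copied without blending in this scheme, so soft edges may look wrong compared to real alpha blending.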

Sadly, this did not give us the massive improvement which I was hoping for.

15 fps only :/ and with artefacts

Idea 2 (Optimize drawing code)

At the cost of increased load times, we could pre-render a level with its different layers to a texture. We can actually combine idea 1 and idea 2 to create a more optimized solution. Theoretically, once you have the texture of the level, you just need to copy it to the screen. You do not need to perform any sort of blending or pixel conversion! This sounds like a perfect use case for memcpy, and guess what? Yup, you guessed it right, memcpy is vectorized.
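To make the copy step concrete, here is a minimal sketch under a few assumptions: the level has already been composited into one 32-bit buffer, the screen uses the same pixel format, and the camera window lies fully inside the level. The name `blitViewport` and its parameters are hypothetical and not taken from the engine.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the screenW x screenH window of a pre-rendered
// level that starts at (camX, camY) onto the screen, one row at a time.
// levelPitch is the level width in pixels. The caller is assumed to keep
// the window inside the level; no clipping is done here.
void blitViewport(uint32_t *screen, int screenW, int screenH,
                  const uint32_t *level, int levelPitch,
                  int camX, int camY) {
    for (int y = 0; y < screenH; ++y) {
        const uint32_t *srcRow = level + (camY + y) * levelPitch + camX;
        // Same pixel format on both sides, so a plain row copy is enough.
        std::memcpy(screen + y * screenW, srcRow, screenW * sizeof(uint32_t));
    }
}
```

Moving the camera then only changes `camX` and `camY`; the per-pixel work of compositing layers is paid once at load time.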

Implementing this was a bit harder, but at last, I was able to get it to work.

We finally get the glorious 60 fps we had been waiting for. But as some of you might have already noticed, layering is broken. 🙂

Uh-oh!

Since we’re pre-rendering the object and prop layers, the character is simply drawn over the texture, which totally breaks layering. There is one possible solution to this situation: calculate which object is colliding with the player and redraw that object.

Fixed layering – the character is drawn behind the pillar as intended!

However, while this works for objects, this does not work with the normal layer.

What I ended up doing was re-rendering only those parts of the normal layer and the objects that were colliding with the character. This performs even better, since we redraw only parts of the texture and not the whole thing!
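One way to find the part that needs redrawing is a rectangle intersection. The sketch below assumes that "colliding" means overlapping axis-aligned bounding boxes, which the post does not spell out; the `Rect` struct and `intersect` function are hypothetical and not the engine's own types.

```cpp
#include <algorithm>

// Hypothetical axis-aligned rectangle; right and bottom are exclusive.
struct Rect {
    int left, top, right, bottom;
    bool isEmpty() const { return left >= right || top >= bottom; }
};

// Return the region shared by a and b. The result is empty (isEmpty()
// returns true) when the two rectangles do not overlap.
Rect intersect(const Rect &a, const Rect &b) {
    return Rect{std::max(a.left, b.left), std::max(a.top, b.top),
                std::min(a.right, b.right), std::min(a.bottom, b.bottom)};
}
```

Only the returned region would then be re-rendered on top of the character, and nothing is redrawn when the result is empty.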

I would like to note that the solution is not perfect but it does work quite nicely.

TL;DR: I was able to optimize the code and get it to render at 60 fps, albeit with some small issues.

Thanks for reading and please look forward to future blog posts!


Preliminary Work

Introduction

Sev prompted me to actually start the work on porting the CRAB engine after I had completed some tasks which he had assigned me.

On this page, I hope to document, as best I can, the work I have accomplished as of writing.

Importing the Engine Source code into ScummVM

The very first thing, intuitively, was to import the engine sources into ScummVM and get them to compile. To accomplish this, I first created a new skeleton engine and then imported all the files.

Initially, I took the wrong approach: I focused on removing STL usage and rewrote some functions before anything even compiled. Sev later made me realize that this was probably not the best approach and that I should first focus on getting all files to compile, and only then remove STL usage.

I got introduced to `FORBIDDEN_SYMBOL_ALLOW_ALL`, which disables ScummVM’s forbidden-symbol checks and so let the code keep using the C++ STL for the time being (future engine porters take note :p).

One thing that I took advantage of was that the source code of CRAB was well separated into different components. So, instead of getting the whole codebase to compile at once, I made the components compilable one by one.

With commit #07a97c7f all files were compilable. It should be noted that I had to stub/comment out some functions as they errored out.

It was during this stage that I also got the first visuals on screen by porting the Image class to use ScummVM’s equivalent of SDL drawing functions.

This is the game loading screen just before the main menu.

Getting the game to render and running into issues

Next, I ported the text drawing functions and soon I was able to go in-game. Yay!

The game renders correctly!

However, if I pressed the left mouse button even once, the screen would simply scroll way past the point it was supposed to.

I spent days debugging this, and of course, the problem was the newly rewritten functions I had created :). Reverting those fixed the issue, and the game now worked as intended. (A more technical description: a templated function did not take into account that floats/doubles could be passed to it.)

One important aspect that I haven’t talked about is performance. The game is originally supposed to run at 60 fps; however, I was only getting a glorious ~6 fps with ScummVM’s drawing functions.

Optimizing performance

The very first suggestion I got was to work with the minimum resolution that the game supported, which in this case was 1280×720.

This boosted the fps to 10. It was definitely not a massive improvement, but an improvement nonetheless.

 

This is where the current progress stands. There are a few other optimizations which have definitely helped, but that is for a future blog post!

Thanks for reading.