Tiled Surfaces

Okay, today’s entry is about optimizing tiled surfaces by caching them.
Here’s the deal – we have tiled surfaces, surfaces where a little square is repeated all over.

It’s great for backgrounds, as people who had a Geocities page in 1999 know (if you did not: http://wonder-tonic.com/geocitiesizer/index.php)

An example of a tiled image is the default system menu:

Another is dialogue balloons in Rosemary:

They are heavy.

Each tile is handled more or less as a stand alone surface at every redraw, which means: huge overhead.

The fix is: cache them.

This involves hacking the renderer a bit so that it draws to a “dummy” surface and changing the UITIledImage code accordingly:

bool BaseRenderOSystem::startSpriteBatch(bool swap, int width, int height) {
    if (swap) {
        _swapped = true;
        _auxSurface = _renderSurface;
        _renderSurface = new
 
 Graphics::Surface();
        _renderSurface->create(width, height, _auxSurface->format);
    }
    _spriteBatch = true;
    _batchNum = 1;
    return STATUS_OK;
}
 
bool BaseRenderOSystem::endSpriteBatch(bool swap) {
    if (_swapped) {
        _swapped = false;
        Graphics::Surface *temp;
        temp = _renderSurface;
        _renderSurface = _auxSurface;
        _auxSurface = temp;
    }
    _spriteBatch = false;
    _batchNum = 0;
    return STATUS_OK;
}
 
// ....
 
bool UITiledImage::display(int x, int y, int width, int height) {
 
    int tileWidth = _middleMiddle.right - _middleMiddle.left;
    int tileHeight = _middleMiddle.bottom - _middleMiddle.top;
 
    int nuColumns = (width - (_middleLeft.right - _middleLeft.left) - (_middleRight.right - _middleRight.left)) / tileWidth;
    int nuRows = (height - (_upMiddle.bottom - _upMiddle.top) - (_downMiddle.bottom - _downMiddle.top)) / tileHeight;
 
    int col, row;
 
    assert (width != 0);
    assert (height != 0);
 
    // Below is the new if block, which is executed only once:
 
    if (_cache == nullptr || width != _width || height != _height) {
 
        // This as a side effect enables the special "dummy" rendering mode
        _gameRef->_renderer->startSpriteBatch(true, width, height);
 
        int x = 0;
        int y = 0;
        _width = width;
        _height = height;
        // top left/right
        _image->_surface->displayTrans(x,                                                       y, _upLeft);
        _image->_surface->displayTrans(x + (_upLeft.right - _upLeft.left) + nuColumns * tileWidth, y, _upRight);
 
        // Etc, same for SW, NW, NE tiles
 
        // tiles
        if (nuRows > 0 && nuColumns > 0) {
            yyy = y + (_upMiddle.bottom - _upMiddle.top);
            xxx = x + (_upLeft.right - _upLeft.left);
            _image->_surface->displayTrans(xxx, yyy, _middleMiddle);
            _image->_surface->repeatLastDisplayOp(tileWidth, tileWidth, nuColumns, nuRows);
        }
 
        _gameRef->_renderer->endSpriteBatch(false);
 
        // This is where we retrieve the complete image
        _cache = _gameRef->_renderer->getAuxSurface();
    }
 
    Rect32 dst;
    dst.top = 0;
    dst.left = 0;
    dst.setWidth(width);
    dst.setHeight(height);
    _cache->displayTrans(x, y, dst);
    return STATUS_OK;
}

Then we store that surface and when we need to redraw the UITiledImage we simply use the cached version: single drawing call, less overhead.

Multi rects

In our last post we were talking about the dirty rect system.
The simplest implementation uses a single rectangle; it can save us quite some unnecessary blitting already, but there is some dangers to be taken into

account that could spoil our grand plan.

For example, what happens if a character is moving in the SE corner of the screen and I’m moving the mouse cursor in the NW corner?

What would be the screen area to be considered “dirty” and thus redrawn?

Yu guessed it – basically, the whole screen.

Baaaaad.

This was in fact the way dirty recting was originally implemented in the WME engine for ScummVM:

void BaseRenderOSystem::addDirtyRect(const Common::Rect &rect) {
  if (!_dirtyRect) {
    _dirtyRect = new Common::Rect(rect);
  } else {
    _dirtyRect->extend(rect);
  }
  _dirtyRect->clip(_renderRect); 
}

That’s it, one rect, extend it as needed.

We needed multiple dirty rects.

Whenever we “dirty” a screen area, we need to mark that separately.
But a simple array is probably not enough.

For example, what happens if multiple partially overlapping sprites are to be redrawn and we naively traverse the array?

We redraw the same area over and over again; if the rects are big enough, we may end up doing *more* blitting than without dirty rects, *more* blitting than

with a single rect.

Also bad, very, very bad.

Even if unnecessary redrawing is avoided, needlessly big rect lists can be a problem.

I chose to address the issue with a special class that:

a) Takes requests for adding rects…
a’) …while doing some on-the-fly-optimizations (e.g. discarding duplicates)
b) Can spit out, upon request, an optimized version with only the necessary rectangles.

It is still in the works so this is not final code, but the basic idea is definitely there.
Have a look – code is partial and amended as usual, but you don’t want to know about all the boring housekeeping and utility methods, do you?

namespace Wintermute {
 
 
DirtyRectContainer::DirtyRectContainer() {
    _disableDirtyRects = false;
    _clipRect = nullptr;
}
 
DirtyRectContainer::~DirtyRectContainer() {
    delete _clipRect;
}
 
void DirtyRectContainer::addDirtyRect(const Common::Rect &rect, const Common::Rect *clipRect) {
 
    if (_clipRect == nullptr) {
        _clipRect = new Common::Rect(*clipRect);
    } else {
        assert(clipRect->equals(*_clipRect));
    }
 
    if (target > kMaxInputRects) {
        _disableDirtyRects = true;
    } else if (isHuge(&rect)) { // More than 90% of the viewport
        _disableDirtyRects = true;
    } else if (rect.width() == 0 || rect.height() == 0) {
    } else {
        _rectArray.insert_at(target, tmp);
        _rectArray[target]->clip(*clipRect);
    }
}
 
 
Common::Array DirtyRectContainer::getOptimized() {
 
    if (_disableDirtyRects) {
        return getFallback();
    }
 
    Common::Array ret;
    Common::Array queue;
 
    for (int i = 0; i < _rectArray.size(); i++) {
        queue.insert_at(queue.size(), _rectArray[i]);
    }
 
    int j = 0;
    while (j < queue.size()) { Common::Rect *candidate = queue[j]; if (candidate->width() == 0 || candidate->height() == 0) {
            // We have no use for this
            queue.remove_at(j);
            continue;
        }
 
        if (ret.size() > kMaxOutputRects) {
            _disableDirtyRects = true;
            return getFallback();
        }
 
        bool discard = false;
 
        // See if it's contained or containes
        for (uint i = 0; i < ret.size() && !discard; i++) { Common::Rect *existing = ret[i]; if (existing->contains(*candidate) || existing->equals(*candidate)) {
                // It is already contained by an existing rect - useless!
                discard = true;
                continue;
            }
 
            if (candidate->contains(*(existing))) {
                // Contains an existing one.
                // Extend the pre-existing one and discard this.
                existing->extend(*candidate);
                discard = true;
                continue;
            }
 
            // Okay, we now see if we have an overlapping corner and slice accordingly.
            Common::Rect intersecting = existing->findIntersectingRect(*candidate);
            // We have to remove intersecting and enqueue the rest
            // We know that it's not a simple contained rect, we know that
            // it's not a cross-shaped thing like above, we know that it's not
            // a Commodore-logo type of deal (where the C is the bigger rect)
            // So intersecting is either the NE, SE, SW or NW corner of candidate
            if (intersecting.width() >= kMaxSplicingX &&
                    intersecting.height() >= kMaxSplicingY
               ) {
                if (intersecting.top >= candidate->top &&
                        intersecting.left >= candidate->left &&
                        intersecting.bottom == candidate->bottom &&
                        intersecting.right == candidate->right
                   ) { // SE case
                    Common::Rect *neslice = new Common::Rect(*candidate);
                    neslice->bottom = intersecting.top;
                    neslice->left = intersecting.left;
                    if (neslice->width() != 0 && neslice->height() != 0) {
                        queue.insert_at(queue.size(), neslice);
                    }
                    candidate->right = intersecting.left;
                    if (candidate->width() == 0) {
                        discard = true;
                    }
                     
                } else if () //SAME FOR NE 
                { // ...
                } else if () //SAME FOR NW
                { // ...
                } else if () //SAME FOR SW 
                { // ...
                } else if (existing->left <= candidate->left &&
                           existing->right >= candidate->right &&
                           existing->top >= candidate->top &&
                           existing->bottom <= candidate->bottom) { // Cross shaped intersections
                    Common::Rect *top_slice = new Common::Rect(*candidate);
                    Common::Rect *bottom_slice = new Common::Rect(*candidate);
                    top_slice->bottom = existing->top;
                    bottom_slice->top = existing->bottom;
 
                    if (top_slice->height() > 0 && top_slice->width() > 0) {
                        queue.insert_at(queue.size(), top_slice);
                    }
 
                    if (bottom_slice->height() > 0 && bottom_slice->width() > 0) {
                        queue.insert_at(queue.size(), bottom_slice);
                    }
 
                    discard = true;
                    continue;
                } else if () // Same for vertical vs horizontal cross intersection
                {
                }
            } // End of intersecting test
 
        } // End loop
        if (discard) {
            queue.remove_at(j);
        } else {
            if (isHuge(candidate)) {
                _disableDirtyRects = true;
                return getFallback();
            }
            ret.insert_at(ret.size(), candidate);
            j++;
        }
    } // End loop
 
    return ret;
}
 
}

The Dreaded Rosemary Bug

The Rosemary Bug, as we call it, is the first bug I’ve run into in ScummVM.

That means – after 10 years I found a bug in an unsupported, unstable, unreleased (in builds) piece of code, which must mean something about the QA at ScummVM.

It caused the main character to appear “jigsawed” when looking east AND you crossed her with the mouse pointer.
Like this:

Not pretty, huh?

This is due to the dirty rect system.
Basically, the idea is that normally you don’t need to redraw ALL of the rectangle containing the sprite, but just the little square where the cursor was and is no more.
This applies not only to the cursor, of course, but to ANY need that may arise to redraw part of a sprite that’s been obscured by something smaller without redrawing the rest, which stays unchanged.

E.g. if your game had a lunar eclipse, when it’s over you would need to redraw the moon, not the whole sky.

Now, there is a problem.

Consider this example:

1) Our background sprite, painted in our viewport:

2) An eclipse! Another sprite (the black disc) is drawn on top of the first one.

3) For the eclipse to be over, we only need redraw the bit “ruined” by the black disc; everything else is *already* on the viewport.
We take the area that corresponds to the pink square from the original sprite and paint it over:

4) All is fine!

Everything is okay, but what if a sprite is mirrored – that is , it has a bit set that tells the engine “always draw this flipped”?
Look what happens if we don’t take special measures:

1) We draw the original texture FLIPPED.

2)

3) Now, as before, we take the area that corresponds to the pink square from the original texture (which we had drawn flipped before)

4)

Uh, oh.

Mike Oldfield is not happy.

We need to “mirror” the offsets too for dirty rects to work with mirrored sprites, of course.

That’s what was happening with Rosemary, in transparent_surface.cpp:

Common::Rect TransparentSurface::blit(Graphics::Surface &target, int posX, int posY, int flipping, Common::Rect *pPartRect, uint color, int width, int height) {
    int ca = (color >> 24) & 0xff;
 
    Common::Rect retSize;
    retSize.top = 0;
    retSize.left = 0;
    retSize.setWidth(0);
    retSize.setHeight(0);
    // Check if we need to draw anything at all
    if (ca == 0)
        return retSize;
 
    int cr = (color >> 16) & 0xff;
    int cg = (color >> 8) & 0xff;
    int cb = (color >> 0) & 0xff;
 
    // Compensate for transparency. Since we're coming
    // down to 255 alpha, we just compensate for the colors here
    if (ca != 255) {
        cr = cr * ca >> 8;
        cg = cg * ca >> 8;
        cb = cb * ca >> 8;
    }
 
    // Create an encapsulating surface for the data
    TransparentSurface srcImage(*this, false);
    // TODO: Is the data really in the screen format?
    if (format.bytesPerPixel != 4) {
        warning("TransparentSurface can only blit 32 bpp images");
        return retSize;
    }
 
    if (pPartRect) {
        srcImage.pixels = getBasePtr(pPartRect->left, pPartRect->top);
        srcImage.w = pPartRect->width();
        srcImage.h = pPartRect->height();
    }
 
    if (width == -1)
        width = srcImage.w;
    if (height == -1)
        height = srcImage.h;
 
    Graphics::Surface *img = nullptr;
    Graphics::Surface *imgScaled = nullptr;
    byte *savedPixels = nullptr;
        /*
         * et cetera
         * ...
         */
    }
}

This is how it’s been fixed:

if (pPartRect) {
 
    int xOffset = pPartRect->left;
    int yOffset = pPartRect->top;
 
    if (flipping & FLIP_V) {
        yOffset = srcImage.h - pPartRect->bottom;
    }
 
    if (flipping & FLIP_H) {
        xOffset = srcImage.w - pPartRect->right;
    }
 
    srcImage.pixels = getBasePtr(xOffset, yOffset);
    srcImage.w = pPartRect->width();
    srcImage.h = pPartRect->height();
}

You can now move the cursor over Rosemary all you want – it’s not gonna break ?

The debug console

So, we were talking about the debugger.

We’ve seen how hooks are inserted into the execution unit, let’s see how you interact with the debugger.
We extend Common::Console, which already provides some very minimal functionality.
It mainly enables us to easily bind commands to functions, tokenizing arguments on our behalf and giving us C-style argc/argv.
It works like this:

Console::Console(WintermuteEngine *vm) : GUI::Debugger() {
    DCmd_Register("print_foo", WRAP_METHOD(Console, Cmd_PrintFoo));
    DCmd_Register("do_bar", WRAP_METHOD(Console, Cmd_DoBar));
    //
}
 
// ...
bool Console::Cmd_AddBreakpoint(int argc, const char **argv) {
    DebugPrintf("Foo!\n");
    return true;
}
//...

Then, we proceed to parse user input on our own.

Our methods must return true or false – true for keeping the console open, false for closing it.

Ideally, for simple tasks, we could work on the engine directly from there, e.g.:

DCmd_Register( ....  Cmd_SkipInstruction));
 
Cmd_SkipInstruction (...) {
    // usage: skip threadnumber
    _script[atoi(argv[1]))->_iP++;
    return true;
}

This could get messy fast, though.
What we have going on here, though, is a MVA sort of pattern, which is more or less like its more famous brother MVC, except that the +** is bidirectional – the adapter can fiddle with the model and vice versa – the reason is decoupling and ease of mantainance: http://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93adapter

MVC:

MVA:

(Image: https://www.palantir.com)

We have a middle layer called the DebuggerAdapter that works on the script engine for us, while we need to knownothing about the script engine itself in the console proper.

The header looks kinda like this:

namespace Wintermute {
class DebuggerAdapter {
public:
    DebuggerAdapter(WintermuteEngine *vm);
    // Called by Script (=~Model)
    bool triggerBreakpoint(ScScript *script);
    bool triggerStep(ScScript *script);
    bool triggerWatch(ScScript *script, const char *symbol);
    // Called by Console (~=View)
    int addWatch(const char *filename, const char *symbol);
    int addBreakpoint(const char *filename, int line);
    // ...
    int stepOver();
    int stepInto();
    int stepContinue();
    // ...
    SourceFile *_lastSource;
private:
    bool compiledExists(Common::String filename);
    void reset();
    WintermuteEngine *_engine;
    int32 _lastDepth;
    ScScript *_lastScript;
    int32 _lastLine;
};
}

An example of how it works:

// in debugger.cpp:
 
bool Console::Cmd_AddBreakpoint(int argc, const char **argv) {
    /**
     * Add a breakpoint
     */
    if (argc == 3) {
        int error = ADAPTER->addBreakpoint(argv[1], atoi(argv[2]));
        if (!error) {
            DebugPrintf("%s: OK\n", argv[0]);
        } else if (error == NO_SUCH_SCRIPT) {
            Common::String msg = Common::String::format("no such script: %s, breakpoint NOT created\n", argv[1]);
            debugWarning(argv[0], ERROR, msg);
        } else if (error == NO_SUCH_SOURCE) {
            Common::String msg = Common::String::format("no such source file: %s\n", argv[1]);
            debugWarning(argv[0], WARNING, msg);
        } else if (error == NO_SUCH_LINE) {
            Common::String msg = Common::String::format("source %s has no line %d\n", argv[1], atoi(argv[2]));
            debugWarning(argv[0], WARNING, msg);
        } else if (error == IS_BLANK) {
            Common::String msg = Common::String::format("%s:%d looks like a comment/blank line.\n", argv[1], atoi(argv[2]));
            debugWarning(argv[0], WARNING, msg);
        } else {
            Common::String msg = Common::String::format("Error code %d", error);
            debugWarning(argv[0], WARNING, msg);
        }
    } else {
        DebugPrintf("Usage: %s   to break at line  of file \n", argv[0]);
    }
    return true;
}

// in debugger_adapter.cpp
 
int DebuggerAdapter::addBreakpoint(const char *filename, int line) {
    // TODO: Check if file exists, check if line exists
    assert(SCENGINE);
    if (!compiledExists(filename)) {
        return NO_SUCH_SCRIPT;
    }
    int isLegal = isBreakpointLegal(filename, line);
    if (isLegal == OK) {
        SCENGINE->addBreakpoint(filename, line);
        return OK;
    } else if (isLegal == IS_BLANK) {
        // We don't have the SOURCE. A warning will do.
        SCENGINE->addBreakpoint(filename, line);
        return IS_BLANK;
    } else if (isLegal == NO_SUCH_SOURCE) {
        // We don't have the SOURCE. A warning will do.
        SCENGINE->addBreakpoint(filename, line);
        return NO_SUCH_SOURCE;
    } else if (isLegal == NO_SUCH_LINE) {
        // No line in the source A warning will do.
        SCENGINE->addBreakpoint(filename, line);
        return NO_SUCH_LINE;
    } else {
        // Something weird? Don't do anything.
        return isLegal;
    }
}

This is a somewhat trivial case since SCENGINE gives us addBreakpoint and we simply wrap it, but DebuggerAdapter does get more complex.

This way, when and if we want a prettier user interface, we don’t need to worry (not too much, at least) about breaking functionality, and if we decide to restructure the script engine and / or the way the adapter interacts with it, we don’t need to worry too much about rendering the interface unusable.

This could actually turn out useful if somebody decides to stitch together a graphical debugger – just code the actual GUI and add water.

One cool thing our interface does is displaying source files – we’ll talk about it in the next installment.