
Week 10: AGI

Introduction

This week, I opened a PR for adding text-to-speech to AGI. It was the last engine listed in my proposal, which means that I have now made PRs for every engine that was planned for my project. This is a major milestone, and it gives me time to work on Gob as a stretch goal.


AGI

I spent much of the week working on AGI, which was fortunately a rather simple engine. While it supports many different games and fangames, most of them seem to function very similarly: introductions or other screens may display text using TextMgr::display, while most other text is displayed through popup windows, typed commands, or menus. Thus, AGI was fairly straightforward, requiring only a handful of TTS calls in key methods. The remaining work was deciding when to stop TTS and when to voice the status menu and clock, which are always visible. To avoid voicing them too frequently, I settled for voicing one or the other primarily when the game is loaded, when the status menu changes, and when the clock is enabled.

The major concerns with AGI were how it handles timing and TextMgr::display across games. Some games appear to use timer variables to signal when it’s time to change the text on screen, while others wait for a sound to end. Therefore, from what I could tell, there was no consistent way to predict when text will change, which made it difficult to delay text changes until TTS is finished. I eventually settled for queuing text and delaying room and window changes until TTS finishes, which seems to be enough to allow TTS to voice everything displayed on screen in a reasonable manner.

In addition, TextMgr::display is used differently between games: some call it only once, while others call it every frame, which necessitates keeping track of the previously said text to avoid speech loops. However, much of the text passed through this method arrives in sentences or chunks across several calls, rather than all at once in a single call. Simply tracking the previously said text therefore fails, and it results in awkward, choppy voicing when a sentence is broken across lines. My solution to this issue was to combine the text passed to TextMgr::display in a _combinedText variable, and then voice it when the game script returns or halts. Using this method, paragraphs and blocks of text are spoken cleanly as one sentence, and the previously said text can simply be set to this combined text to prevent speech loops. I also decided to add newlines between these pieces of text if they aren’t displayed on consecutive rows, since in most cases, pieces of text that aren’t on consecutive rows shouldn’t be voiced as one quick sentence.
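The combining approach described above can be sketched roughly as follows. This is a minimal, standalone illustration rather than the actual engine code: only the _combinedText name comes from the description above, while SpeechLog, DisplayTextCombiner, addText, and flush are hypothetical names, and the rule that only the row directly below the previous piece counts as a continuation is an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the TTS backend: it only records what would be spoken.
struct SpeechLog {
    std::vector<std::string> spoken;
    void say(const std::string &text) { spoken.push_back(text); }
};

// Hypothetical helper that gathers the pieces of text drawn during one script run.
class DisplayTextCombiner {
public:
    // Would be called for each piece of text drawn at a given screen row.
    void addText(const std::string &text, int row) {
        if (text.empty())
            return;
        if (!_combinedText.empty()) {
            // Assumption: text on the row directly below the previous piece
            // continues the same sentence; anything else starts a new block.
            _combinedText += (row == _lastRow + 1) ? ' ' : '\n';
        }
        _combinedText += text;
        _lastRow = row;
    }

    // Would be called when the game script returns or halts.
    void flush(SpeechLog &tts) {
        // Skip text identical to what was last said, so games that redraw
        // the same text every frame don't cause a speech loop.
        if (!_combinedText.empty() && _combinedText != _previousSaid) {
            tts.say(_combinedText);
            _previousSaid = _combinedText;
        }
        _combinedText.clear();
        _lastRow = -1;
    }

private:
    std::string _combinedText;
    std::string _previousSaid;
    int _lastRow = -1;
};
```

In this sketch, a sentence split across rows 3 and 4 is joined with a space and spoken as one unit, a piece of text on row 10 is separated by a newline, and redrawing the same text on the next frame produces no new speech.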

In conclusion, AGI was a fairly simple engine, since its games didn’t have much complexity. Most of the work went into handling timing and text displayed across several calls, as these factors can differ significantly between its many games. Otherwise, AGI fangames seem to have little variation in how they display text, which simplified work on the engine. However, I mainly worked with fangames, so there’s a chance that games that aren’t fangames function differently.


Gob

After opening a PR for AGI, I started work on Gob. So far, Gob doesn’t seem too complex: its text seems limited and appears to be displayed through only a few select methods. However, certain games, like Adibou 2, separate text from many of their buttons, which requires a technique to sync them. So far, I’ve decided on a method of checking whether the displayed text intersects with a hotspot, then expanding the collision rectangle of the text accordingly, which seems to work fairly well. Nonetheless, I would like to see if I can make this solution more robust for better compatibility, as it seems that Gob supports a variety of games with many differences between them.
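As a rough illustration of the intersection check described above, here is a minimal sketch. The Rect struct and expandToHotspots are hypothetical stand-ins written for this example, not the engine’s own types or the code in my branch, and comparing each hotspot against the original text rectangle (rather than the growing one) is an assumption made to keep the result independent of hotspot order.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical rectangle type; right and bottom are exclusive.
struct Rect {
    int left, top, right, bottom;

    bool intersects(const Rect &o) const {
        return left < o.right && o.left < right &&
               top < o.bottom && o.top < bottom;
    }

    bool contains(int x, int y) const {
        return x >= left && x < right && y >= top && y < bottom;
    }

    // Grow this rectangle so that it also covers o.
    void extend(const Rect &o) {
        left = std::min(left, o.left);
        top = std::min(top, o.top);
        right = std::max(right, o.right);
        bottom = std::max(bottom, o.bottom);
    }
};

// Returns the text rectangle grown to cover every hotspot that overlaps
// the original text area; hotspots that don't touch the text are ignored.
Rect expandToHotspots(const Rect &textRect, const std::vector<Rect> &hotspots) {
    Rect result = textRect;
    for (const Rect &hotspot : hotspots) {
        if (textRect.intersects(hotspot))
            result.extend(hotspot);
    }
    return result;
}
```

With the expanded rectangle, hovering anywhere over the button could trigger voicing of its label, even when the label only covers part of the button.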


Conclusion

This week, I opened a PR for my last planned engine, AGI, and started work on Gob as a stretch goal. I plan to have a PR up for Gob by the end of this week, which may give me enough time to start another stretch goal.


Week 9: SCUMM

Introduction

This week, I opened a PR for adding text-to-speech to SCUMM, which was an interesting engine to work on. I’m quite happy with the result, though with the wide variety of games supported by SCUMM, it may need more work in the future.


SCUMM

Most of my week was spent on adding TTS to SCUMM, an engine that was neither especially difficult nor especially easy to work on. On one hand, SCUMM games are very similar to most of the games I’ve worked on before. Most of them have few menus and rather simple means of displaying text, unlike engines like Efh or MM. Thus, creating a user-friendly TTS system was fairly easy for SCUMM. In addition, finding where text is displayed wasn’t difficult: most of it goes through printString and drawString, with actor speech being displayed with displayDialog. Furthermore, most of the GUI controls and buttons come with labels built into them, which makes voicing buttons when they’re hovered over a very simple process. As such, much of the work for SCUMM involved simply adding TTS voicing calls to these functions, while accounting for situations such as subtitles for voiced dialog and the need to delay the disappearance of text until TTS is finished.

On the other hand, SCUMM is larger than some of the engines I’ve worked with previously, and it supports many different games. This means that compatibility issues are a recurring problem, as what works for TTS in one version of the SCUMM engine may not work in others. A good example of this is verbMouseOver. When I first came across this method, I thought that it would be a great place to voice verbs when they’re hovered over. However, SCUMM version 5, or at least Indiana Jones and the Fate of Atlantis, doesn’t seem to reliably use this function to detect hovering over verbs. In addition, while some games only call drawVerb once when a verb is hovered over, games like Fate of Atlantis call it every frame. Thus, to voice verbs as robustly as possible, I added code to drawVerb, which most SCUMM versions seem to go through for verb drawing, that checks whether the current verb is highlighted before voicing it. This strategy seemed to work for many games. There were other compatibility issues that I had to resolve as well: SCUMM versions 0, 1, and 2 use custom text encodings that needed to be replicated; drawVerb is sometimes used to print strings that aren’t verbs, which have to be voiced even if they aren’t highlighted; and Passport to Adventure has a special help menu whose buttons aren’t considered GUI controls, which required a means of storing the text for each button and detecting when they’re hovered over.
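The highlight check for verbs can be illustrated with a small standalone sketch. VerbVoicer and onDrawVerb are hypothetical names, and the bookkeeping with a last-voiced verb ID is an assumption about one way to keep a routine that runs every frame from repeating itself; the special cases mentioned above, such as non-verb strings, are left out.

```cpp
#include <string>
#include <vector>

// Hypothetical helper that decides whether a verb should be voiced.
struct VerbVoicer {
    std::vector<std::string> spoken; // stand-in for the TTS backend
    int lastVoicedVerb = -1;         // -1 means no verb is currently voiced

    // Meant to be called from a drawVerb-like routine, which some games
    // run once per hover and others run every frame.
    void onDrawVerb(int verbId, const std::string &label, bool highlighted) {
        if (!highlighted) {
            // The cursor left this verb, so it may be voiced again later.
            if (verbId == lastVoicedVerb)
                lastVoicedVerb = -1;
            return;
        }
        // Already voiced during this hover: don't repeat it every frame.
        if (verbId == lastVoicedVerb)
            return;
        spoken.push_back(label);
        lastVoicedVerb = verbId;
    }
};
```

In this sketch, a verb redrawn as highlighted on every frame is voiced only once, and it becomes eligible again once it has been drawn without the highlight.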

Another concern was that SCUMM versions 7 and 8 use their own methods for displaying text, though they were fortunately similar enough to those used by earlier versions that it wasn’t too difficult to voice them. Humongous Entertainment games also seem to have different means of handling text, but because they don’t have much text in the first place, I didn’t have to worry as much about them. Thus, most of the compatibility issues were in earlier SCUMM versions, with later ones having fewer problems.

Ultimately, the most difficult component of SCUMM was the wide variety of games supported. Each version has its own ways of handling text that have to be considered, requiring careful thought about the best places to voice or stop text. I’m fairly happy with my implementation of TTS for this engine, but because of its many games, there may be some oddities that will need to be resolved.


Conclusion

I opened a PR for SCUMM this week, which was an interesting engine to explore due to its greater size and variety of versions. I also revisited my PR for MADE, an engine that had its own compatibility issues, with text indices varying across game versions; these should be solved now. Next week, I’ll be working on AGI, the last engine listed in my proposal.