Week 3: Draci – Ellen's GSoC Blog

Introduction

Another week of GSoC has passed, with more text-to-speech work done, as a PR has been opened adding TTS to Draci. In addition, TTS for WAGE and CruisE has been merged, and TTS for Cine has been more rigorously tested.

Draci

Much of my time this week was invested in adding TTS to Draci. In terms of text complexity, Draci was roughly below average: most of its text was displayed in a few places, which were easy to track down. There were, however, several considerations regarding the engine. For one, Draci’s Czech and Polish translations have full (or almost full) voiceovers, while the German and English translations only have subtitles. To account for these differences, I split TTS into subtitles and objects, my first time doing so, and had to refrain from voicing subtitles for versions with voiceovers. Beyond that, I had to implement making the subtitles last as long as the TTS is speaking, as the subtitles normally move so fast that the speech can’t keep up; TTS for missing voice clips for the Czech and Polish versions, which received its own option at the suggestion of my mentor; and recognition of cases where the text is only punctuation, which isn’t voiced by the TTS system, and is thus immediately skipped (the text remains on screen for as long as TTS is speaking, meaning if the text isn’t voiced, the system immediately believes that it’s time to move to the next subtitle). All of these were interesting exercises in timing and accommodating different versions.

In my opinion, one of the most interesting components of Draci was the encoding. Draci uses encodings not present in ScummVM’s CodePage enum, which meant I had to hunt them down and manually write a translation table. Fortunately, the Dragon History website states that the Czech, English, and German translations all use Kamenický encoding, with the German translation having an exception for ß. This simplified the process greatly, as it just required creating a translation table for Kamenický encoding and converting the bytes, a similar process to the fix for TeenAgent’s Russian translation. Unfortunately, the Polish encoding is only described by the same website as “some ridiculous proprietary encoding”. I initially tried observing the bytes of the in-game strings to figure out the encoding for each character, but I worried that I’d miss a few characters. Instead, I found the character sets in the original Draci source code, which showed the bytes of UTF-8 encoding for each character as characters in the Windows-1250 code page. This meant that all I had to do was translate these characters according to Windows-1250 encoding to bytes, then combine these bytes to obtain the UTF-8 encodings for the Polish characters. That was most of the work for encodings, aside from altering the encoding table for German and English to replace certain Czech characters with equivalents in those languages, as TTS for the credits struggled to pronounce those characters.

Ultimately, Draci’s encodings were interesting to explore, and I’m very glad that the Dragon History website had them listed.

MADE

At the end of the week, I started work on MADE, but more work must be done for it. It seems that Return to Zork handles text a little differently from what I’m used to, as it displays text in several simultaneous channels every frame. My usual solution of tracking the previously said text with a variable won’t work here, as it’ll be overridden by the other text. I’ve thought of a few possible solutions, including tracking several different previously said texts or finding where the text is set – which I may have already found, but I need to do more investigation – and I plan to explore them over this week.

Miscellaneous

Aside from Draci and MADE, I fixed a few problems with CruisE – thanks to my mentor for noticing that the encoding was wrong for some of the versions, among other necessary changes – and tested TTS for Cine for other versions. For Cine, I was worried that my method of checking image names for text in the form of an image would fail across different versions of Future Wars and Operation Stealth, but I was pleasantly surprised to not find any issues, at least in my testing. The only changes I ended up making were adding a few new translations, fixing some small caveats with the encoding for the German and French translations, adding line-by-line TTS for the Operation Stealth copy protection screen (which I initially had, but I removed due to fears that it wouldn’t work across versions), adding TTS for Future Wars’s grid copy protection screen, and adjusting for exceptions across versions (such as the US Amiga version of Future Wars using a different copy protection screen from all other Amiga versions). Cine is thus much more thoroughly tested across different game editions now.

Conclusion

This week involved more work on TTS, with testing for Cine, TTS implementation for Draci, and the beginnings of TTS implementation for MADE. Exploring Draci’s encodings was the most entertaining part, and I hope to bring over my more in-depth knowledge of encoding methods into future implementations. Next week, I’ll be finishing TTS for MADE and beginning TTS for ADL, alongside any changes I need to make to my open PRs for Cine and Draci.

Recent Posts

Recent Comments

Archives

Categories

Introduction

Draci

MADE

Miscellaneous

Conclusion

Leave a Reply Cancel reply