Introduction
This week, I worked on adding text-to-speech to Prince, which was a fun engine to work on. A PR has been opened for it, though more work may be needed in the future. In addition, I began work on adding TTS to Efh, which I plan to finish next week. My ADL PR was also merged this week, and I updated some of my earlier TTS PRs.
Prince
Most of my week was spent on adding TTS to Prince. Fortunately, The Prince and the Coward has no text in the form of images that I could find, which meant no hardcoded text was needed for this engine. However, Prince had some complexities in how it displays text. Its text is displayed rather simply in methods such as checkMob and printAt, which are either called only once or can track text changes by checking for changes in indices, so there is no need to track the previously spoken text for this engine. Still, there are several exceptions to consider. For example, how the text in printAt should be voiced depends on several factors, including slot and location: slot 9 is generally subtitles, while slot 10 is often either subtitles or, if the location is the map, map text. Differentiating between these types of text is important because of the presence of the dub, which necessitates splitting TTS into several categories: subtitles, objects, and missing voiceovers. Since the Polish, German, and Russian translations all have dubs in their languages, subtitles should almost never be voiced for them, while the English and Spanish translations, which lack dubs in their languages, should only have subtitles voiced if the dub is muted. Thus, a fair amount of consideration had to be given to splitting up the text.
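To make the subtitle rules above concrete, here is a minimal sketch of how the decision could be written. It is not the code from the PR: the enum and function names are hypothetical, and it treats "generally" and "almost never" as hard rules for simplicity. It also assumes that anything that isn't a subtitle is always voiced.

```cpp
#include <cassert>

// Hypothetical names for illustration; the engine's real types differ.
enum class TextKind { Subtitle, MapText, Other };
enum class Language { Polish, German, Russian, English, Spanish };

constexpr int kSubtitleSlot = 9; // slot 9 is generally subtitles
constexpr int kSharedSlot = 10;  // slot 10 is subtitles, or map text on the map

// Decide what kind of text a printAt-style call is showing.
TextKind classifySlot(int slot, bool onMap) {
    assert(slot >= 0);
    if (slot == kSubtitleSlot)
        return TextKind::Subtitle;
    if (slot == kSharedSlot)
        return onMap ? TextKind::MapText : TextKind::Subtitle;
    return TextKind::Other;
}

// Polish, German, and Russian ship with a dub in their own language.
bool hasNativeDub(Language lang) {
    return lang == Language::Polish || lang == Language::German ||
           lang == Language::Russian;
}

// Subtitles are only voiced when there is no native dub and the dub is muted.
// Assumption: every other kind of text is voiced unconditionally.
bool shouldVoice(TextKind kind, Language lang, bool dubMuted) {
    if (kind != TextKind::Subtitle)
        return true;
    if (hasNativeDub(lang))
        return false;
    return dubMuted;
}
```

With this sketch, shouldVoice(classifySlot(9, false), Language::English, true) is true, while the same call for Language::Polish is false.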
Prince also had a few other key exceptions. For one, when I worked on voicing the text of objects when they’re hovered over, I initially thought to use the _selectedMob variable, which keeps track of the mob that the player is hovering over: if, in checkMob, the selected mob doesn’t match the current mob number, then the user must be hovering over a new mob, meaning that the text should be voiced. However, I found that left clicking resets _selectedMob, which results in the text being awkwardly voiced again even though it hasn’t changed. This was easily fixed by introducing a new variable that tracks the selected mob but is not reset upon left clicking. In addition, I worked on speaking missing voiceovers, created several custom encoding tables, and solved an issue with the gambling merchants in the town. They talk constantly even as the player interacts with the environment and thus interrupt other TTS, so I added an exception that only voices their text if the player isn’t in dialog.
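The hover fix can be sketched roughly as follows. This is a simplified stand-in rather than the engine's code: HoverTracker, lastVoicedMob, onHover, and onLeftClick are hypothetical names, and -1 is assumed to mean "no mob".

```cpp
#include <cassert>

// Hypothetical sketch: selectedMob mirrors the engine's _selectedMob,
// while lastVoicedMob is the extra variable that survives left clicks.
struct HoverTracker {
    int selectedMob = -1;   // reset by a left click
    int lastVoicedMob = -1; // not reset by a left click

    // Returns true when the hovered mob's text should be voiced.
    bool onHover(int mob) {
        assert(mob >= 0); // -1 is reserved for "no mob"
        selectedMob = mob;
        if (mob == lastVoicedMob)
            return false; // same mob as before, so stay quiet
        lastVoicedMob = mob;
        return true;
    }

    // A left click clears the selection but not the voiced-mob memory.
    void onLeftClick() {
        selectedMob = -1;
    }
};
```

Comparing against selectedMob instead would voice the text again after every click, which is the behavior described above.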
Another significant problem was changing voices. There appears to be no easy indicator that differentiates speaking characters in Prince: text colors are shared across several characters, mob numbers are not unique to certain characters and are instead specific to each location, and dialog seems to be controlled almost entirely by game scripts without any key character indicators. Therefore, my solution was to use a combination of several factors to determine the voice. The text color is enough to differentiate some characters, as the color is sometimes unique. For characters that share text colors, I opted to also check the location number, since most characters don’t move between locations, and those that do can serve as the fallback for when the location number doesn’t match that of any other character. However, I found that this didn’t work in a few specific scenarios, such as the tavern with Arivald and the bard, who both have the same text color and are in the same location. For such exceptions, I decided to check the mob number as well, as it differs between them. The result is a different voice for each character, though I do wonder if there may be some other, cleaner indicator I could use.
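One way to picture this layered lookup is a small rule table where the most specific matching rule wins. The sketch below is only an illustration of the idea, not the PR's implementation: VoiceRule, pickVoice, and exampleRules are hypothetical, and every color, location, mob, and voice number in the table is made up.

```cpp
#include <cassert>
#include <vector>

constexpr int kAny = -1; // wildcard for a field the rule doesn't care about

// Hypothetical rule: match on color, optionally on location and mob.
struct VoiceRule {
    int color;
    int location;
    int mob;
    int voice;
};

// Returns the voice of the most specific matching rule, or defaultVoice.
int pickVoice(const std::vector<VoiceRule> &rules, int color, int location,
              int mob, int defaultVoice) {
    assert(color != kAny); // the query itself must be concrete
    const VoiceRule *best = nullptr;
    int bestScore = -1;
    for (const VoiceRule &rule : rules) {
        if (rule.color != color)
            continue;
        if (rule.location != kAny && rule.location != location)
            continue;
        if (rule.mob != kAny && rule.mob != mob)
            continue;
        // More non-wildcard fields means a more specific rule.
        const int score = (rule.location != kAny ? 1 : 0) +
                          (rule.mob != kAny ? 1 : 0);
        if (score > bestScore) {
            best = &rule;
            bestScore = score;
        }
    }
    return best ? best->voice : defaultVoice;
}

// Made-up data showing the three cases described above.
std::vector<VoiceRule> exampleRules() {
    return {
        {220, kAny, kAny, 0}, // unique color: color alone is enough
        {210, 7, kAny, 1},    // shared color: stationary character
        {210, kAny, kAny, 2}, // shared color: roaming character as fallback
        {230, 12, 4, 3},      // same color and location: split by mob
        {230, 12, 9, 4},
    };
}
```

The roaming character gets a color-only rule, so it is chosen whenever no location-specific rule for that color matches.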
Ultimately, Prince was an entertaining engine to explore. It was neither particularly difficult nor particularly easy, as it had its own unique set of challenges, but none that were daunting.
Efh
After opening a PR for Prince, I started work on Efh. So far, Efh seems fairly simple, as much of its text is directly hardcoded, making displayed text very easy to find. However, its menus display many pieces of text at once every frame, which is different from most of the engines I’ve worked with. For now, I’ve solved the issue with a simple flag that is toggled on after user input and toggled off after voicing occurs. Aside from that, there doesn’t seem to be much complexity with Efh, though I still have a fair amount of text left to voice, since I need to account for user input and ease of use.
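The flag approach for the menus can be sketched like this. It is a simplified model, not Efh's code: MenuSpeaker, onUserInput, and drawFrame are hypothetical names, and the spoken vector stands in for the actual TTS call so that the behavior is easy to inspect.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch of a menu that is redrawn every frame.
struct MenuSpeaker {
    bool pending = false;            // set by input, cleared after voicing
    std::vector<std::string> spoken; // stand-in for the real TTS call

    // Any user input means the menu may have changed, so voice it once.
    void onUserInput() {
        pending = true;
    }

    // Called every frame with all of the menu's lines.
    void drawFrame(const std::vector<std::string> &lines) {
        if (!pending)
            return; // nothing new since the last voicing
        std::string text;
        for (const std::string &line : lines) {
            if (!text.empty())
                text += '\n';
            text += line;
        }
        spoken.push_back(text);
        pending = false;
    }
};
```

However many frames are drawn, the menu is only voiced once per input.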
Conclusion
During this week of GSoC, I opened a PR for adding TTS to Prince, started work on Efh, and updated some of my earlier PRs. It was an interesting week, since I enjoyed Prince. Next week, I’ll be continuing work on Efh, and possibly beginning MM if all goes well.