{"id":200,"date":"2020-07-20T12:00:29","date_gmt":"2020-07-20T12:00:29","guid":{"rendered":"https:\/\/blogs.scummvm.org\/djsrv\/?p=200"},"modified":"2022-05-24T00:31:19","modified_gmt":"2022-05-24T00:31:19","slug":"learning-the-lingo","status":"publish","type":"post","link":"https:\/\/blogs.scummvm.org\/djsrv\/2020\/07\/20\/learning-the-lingo\/","title":{"rendered":"Learning the Lingo"},"content":{"rendered":"<section class=\"body\"><p>This week I worked out some longstanding Lingo issues!<\/p>\n<h2 id=\"duplicate-scripts\">Duplicate Scripts<\/h2>\n<p>The first issue was duplicate scripts in Director 4 movies. Each cast member should have at most one Lingo script associated with it, but we were running into movies in which a cast member seemingly had several scripts. There was no obvious way to deal with this &#8211; redefining the script usually led to incorrect behavior, and so did keeping the original definition.<\/p>\n<p>These duplicate scripts were rare in most movies, so the problem went ignored for a while, but in our recently added target <em>Majestic Part 1: Alien Encounter<\/em>, there were several hundred scripts, and almost every one conflicted with another.<\/p>\n<p>Initially, I thought that there must be something that indicated certain scripts, or at least certain handlers within these scripts, were unused. The first place I investigated was the script\u2019s \u201chandler vectors.\u201d These differed between some of the duplicate scripts, and I thought they might hold the key to how the script conflicts should be handled.<\/p>\n<p>\u201cHandler vectors\u201d were identified as an array of 16-bit integers in <a href=\"https:\/\/docs.google.com\/document\/d\/1jDBXE4Wv1AEga-o1Wi8xtlNZY4K2fHxW2Xs8RgARrqk\/edit\">Anthony Kleine\u2019s Director documentation<\/a>, but there was no explanation of their purpose. I got in touch with Anthony, but he couldn\u2019t remember what they were for, and they remained a mystery to me for weeks. 
Once I began deeper investigation, it quickly became apparent that the \u201chandler vectors\u201d are just used to map event IDs to handler IDs. Totally unrelated.<\/p>\n<p>The next suspect was the Lingo context, a container which maps script IDs to script data:<\/p>\n<pre><code>Section LctX {\r\n\tStruct header {\r\n\t\t...\r\n\t\tUint16 [big] freePtr\r\n\t}\r\n\t...\r\n\tArray scripts(count) {\r\n\t\tStruct scriptLink {\r\n\t\t\tUint32 unknown\r\n\t\t\tUint32 [big] ID \/\/ use MMAP!\r\n\t\t\tUint16 [big] used \/\/ 0 : unused , 4: used\r\n\t\t\tUint16 [big] link \/\/ For unused entries: link to next unused, or -1.\r\n\t\t}\r\n\t}\r\n}\r\n<\/code><\/pre>\n<p><a href=\"https:\/\/github.com\/Earthquake-Project\/Format-Documentation\/blob\/master\/structure\/scripting\/FormatNotes_Scripts.txt\">(Source: Brian151)<\/a><\/p>\n<p>The two areas of interest are:<\/p>\n<ol>\n<li>The script entry\u2019s <code>used<\/code> field<\/li>\n<li>A linked list of unused scripts, which begins at the script entry at index <code>freePtr<\/code>. The entry\u2019s <code>link<\/code> field gives the index of the next unused script, or -1.<\/li>\n<\/ol>\n<p>However, after much investigation, it seems that there is actually no difference in how a script with <code>used = 0<\/code> and a script with <code>used = 4<\/code> should be handled. The linked list does indeed indicate unused scripts, but all of the entries in the list seem to have an <code>ID<\/code> of -1. Thus, these unused scripts have no script data associated with them, and they were never being loaded in the first place. There was no way they could be causing conflicts, since they didn\u2019t really exist.<\/p>\n<p>To finally solve the mystery, I had to throw out what I thought I knew about how Lingo scripts are linked to cast members. 
For years, the Director reverse engineering community understood that:<\/p>\n<blockquote><p>The scripts are not owned by their individual Cast Members in the Key Table [which links cast members to most of their assets] as you may expect. Instead, each Lingo Script has the number of its corresponding Cast Member <a href=\"https:\/\/docs.google.com\/document\/d\/1jDBXE4Wv1AEga-o1Wi8xtlNZY4K2fHxW2Xs8RgARrqk\/edit\">(Source: Anthony Kleine)<\/a>.<\/p><\/blockquote>\n<p>After a few days of testing, I noticed that the cast member ID stored within Lingo scripts was sometimes incorrect. Or, as in <em>Majestic<\/em>, almost always incorrect. There had to be some other way by which cast members were linked to their scripts.<\/p>\n<p>The obvious place to look was in the cast member data, which is split into two parts &#8211; data specific to the cast member type, followed by largely standard cast member info. We had previously identified a <code>scriptId<\/code> field in the data specific to script cast members, and these IDs always seemed to be correct. However, other types of cast members could have scripts as well, and since they wouldn\u2019t have this field, this solution wouldn\u2019t work for them. Or so it seemed.<\/p>\n<p>Long story short, we were treating too many bytes as type-specific data, and the <code>scriptId<\/code> was actually in the standard cast member info. Once that was fixed, every cast member had a single, correct <code>scriptId<\/code> associated with it. Use that to link cast members to the scripts, and no more duplicate scripts!<\/p>\n<h2 id=\"grammar\">Grammar<\/h2>\n<p>Next was improvements to the Lingo grammar.<\/p>\n<p>First, I needed to differentiate between statements and expressions. An expression by itself, like <code>2 + 2<\/code>, isn\u2019t a valid Lingo script &#8211; it needs to be an argument to a statement, like <code>put 2 + 2<\/code>. 
However, we were treating expressions and statements exactly the same, which allowed incorrect scripts and significantly complicated the grammar. Once this was fixed, half of the grammar\u2019s 441 conflicts were gone.<\/p>\n<p>Next, I needed to get rid of the differentiation between Lingo\u2019s subroutine types. Confusingly, Lingo has (at least) 3 different types, with overlapping purposes:<\/p>\n<ul>\n<li>Commands &#8211; These are built-in, and invoked by a call statement, like <code>foo()<\/code> or <code>foo<\/code>.<\/li>\n<li>Functions &#8211; These are also built-in, and invoked by a call expression, like <code>put foo()<\/code>. Very rarely, you can also invoke them as statements.<\/li>\n<li>Handlers &#8211; These are user-defined, and can be invoked by either call statements or call expressions.<\/li>\n<\/ul>\n<p>Now, these are separate things, but they should only be treated separately during execution. Previously we were differentiating them in the grammar, which again complicated things.<\/p>\n<p>With that done, I began general cleanup. Reorganizing things where conflicts could be eliminated, reducing the use of right recursion, and adding support for fun statements like this one:<\/p>\n<pre><code>put cast cast\r\n<\/code><\/pre>\n<p>What should this do? Why, of course, it prints the cast member whose ID is equal to the variable <code>cast<\/code>:<\/p>\n<pre><code>set cast = 1\r\nput cast cast\r\n-- (cast 1)\r\n<\/code><\/pre>\n<p>All in all, the grammar is now truer to the original, and we\u2019re down to 6 conflicts from 441!<\/p>\n<\/section>\n","protected":false},"excerpt":{"rendered":"<p>This week I worked out some longstanding Lingo issues! Duplicate Scripts The first issue was duplicate scripts in Director 4 movies. Each cast member should have at most one Lingo script associated with it, but we were running into movies in which a cast member seemingly had several scripts. 
There was no obvious way to [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-200","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/blogs.scummvm.org\/djsrv\/wp-json\/wp\/v2\/posts\/200","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.scummvm.org\/djsrv\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.scummvm.org\/djsrv\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.scummvm.org\/djsrv\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.scummvm.org\/djsrv\/wp-json\/wp\/v2\/comments?post=200"}],"version-history":[{"count":1,"href":"https:\/\/blogs.scummvm.org\/djsrv\/wp-json\/wp\/v2\/posts\/200\/revisions"}],"predecessor-version":[{"id":201,"href":"https:\/\/blogs.scummvm.org\/djsrv\/wp-json\/wp\/v2\/posts\/200\/revisions\/201"}],"wp:attachment":[{"href":"https:\/\/blogs.scummvm.org\/djsrv\/wp-json\/wp\/v2\/media?parent=200"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.scummvm.org\/djsrv\/wp-json\/wp\/v2\/categories?post=200"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.scummvm.org\/djsrv\/wp-json\/wp\/v2\/tags?post=200"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}