{"id":14,"date":"2013-06-12T22:59:34","date_gmt":"2013-06-12T22:59:34","guid":{"rendered":"https:\/\/blogs.scummvm.org\/richiesams\/?p=14"},"modified":"2022-05-22T19:52:07","modified_gmt":"2022-05-22T19:52:07","slug":"zfs-file-format","status":"publish","type":"post","link":"https:\/\/blogs.scummvm.org\/richiesams\/2013\/06\/12\/zfs-file-format\/","title":{"rendered":"ZFS File format"},"content":{"rendered":"<p>Over the years I&#8217;ve reverse engineered quite a few file formats, but I&#8217;ve never really sat down and picked apart why a format was designed the way it was. With that said, I wanted to show the ZFS archive file format and highlight\u00a0some of the peculiarities I saw and perhaps you guys can answer some of my questions.<\/p>\n<div>\nFor some context, Z-engine was created around 1995 and was used on Macintosh, MS-DOS, and Windows 95.<\/p>\n<\/div>\n<div><b>Format<\/b><\/div>\n<div>The main file header is defined as:<\/div>\n<pre class=\"brush:cpp\">struct ZfsHeader {\r\n    uint32 magic;\r\n    uint32 unknown1;\r\n    uint32 maxNameLength;\r\n    uint32 filesPerBlock;\r\n    uint32 fileCount;\r\n    byte xorKey[4];\r\n    uint32 fileSectionOffset;\r\n};\r\n<\/pre>\n<div>\n<ul>\n<li>magic\u00a0and\u00a0unknown1\u00a0are self explanatory<\/li>\n<li>maxNameLength\u00a0refers to the length of the block that stores a file&#8217;s name. Any extra spaces are null.<\/li>\n<li>The archive is split into &#8216;pages&#8217; or &#8216;blocks&#8217;. 
Each &#8216;page&#8217; contains, at max,\u00a0filesPerBlock\u00a0files<\/li>\n<li>fileCount\u00a0is the total number of files the archive contains<\/li>\n<li>xorKey\u00a0is the\u00a0XOR cipher key used to encrypt the files<\/li>\n<li>fileSectionOffset\u00a0is the offset of the main data section, aka fileLength &#8211; mainHeaderLength<\/li>\n<\/ul>\n<div><\/div>\n<\/div>\n<div>The file entry header is defined as:<\/div>\n<pre class=\"brush:cpp\">struct ZfsEntryHeader {\r\n    char name[16];\r\n    uint32 offset;\r\n    uint32 id;\r\n    uint32 size;\r\n    uint32 time;\r\n    uint32 unknown;\r\n};\r\n<\/pre>\n<div>\n<ul>\n<li>name\u00a0is the file name right-padded with null characters<\/li>\n<li>offset\u00a0is the offset to the actual file data<\/li>\n<li>id\u00a0is the numeric id of the file. The ids increment from 0 to\u00a0fileCount<\/li>\n<li>size\u00a0is the length of the file<\/li>\n<li>unknown\u00a0is self explanatory<\/li>\n<\/ul>\n<div><\/div>\n<div>Therefore, the entire file structure is as follows:<\/div>\n<\/div>\n<pre class=\"brush:plain\">[Main Header]\r\n \r\n[uint32 offsetToPage2]\r\n[Page 1 File Entry Headers]\r\n[Page 1 File Data]\r\n \r\n[uint32 offsetToPage3]\r\n[Page 2 File Entry Headers]\r\n[Page 2 File Data]\r\n \r\netc.\r\n<\/pre>\n<div><\/div>\n<div><\/div>\n<div><b>Questions and Observations<\/b><br \/>\n<b><br \/>\n<\/b>maxNameLength<br \/>\nWhy have a fixed size name block vs. null terminated or [size][string]? Was that just the popular thing to do back then so the entire header could be cast directly to a struct?<\/p>\n<p>filesPerBlock<br \/>\nWhat is the benefit to pagination? The only explanation I can see atm is that it was some artifact of their asset compiler max memory. 
Maybe I&#8217;m missing something since I&#8217;ve never programmed for that type of hardware.<\/p>\n<p>fileSectionOffset<br \/>\nI&#8217;ve seen things like this a lot in my reverse engineering; they give the offset to a section that&#8217;s literally just after the header. Even if they were doing straight casting instead of incremental reading, a simple sizeof(mainHeader) would give them the offset to the next section. Again, if I&#8217;m missing something, please let me know.<\/p>\n<p>Well that&#8217;s it for now,<br \/>\n-RichieSams<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>Over the years I&#8217;ve reverse engineered quite a few file formats, but I&#8217;ve never really sat down and picked apart why a format was designed the way it was. With that said, I wanted to show the ZFS archive file format and highlight\u00a0some of the peculiarities I saw and perhaps you guys can answer some [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-14","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/blogs.scummvm.org\/richiesams\/wp-json\/wp\/v2\/posts\/14","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.scummvm.org\/richiesams\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.scummvm.org\/richiesams\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.scummvm.org\/richiesams\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.scummvm.org\/richiesams\/wp-json\/wp\/v2\/comments?post=14"}],"version-history":[{"count":1,"href":"https:\/\/blogs.scummvm.org\/richiesams\/wp-json\/wp\/v2\/posts\/14\/revisions"}],"predecessor-version":[{"id":15,"href":"https:\/\/blogs.scummvm.org\/richiesams\/wp-json\/wp\/v2\
/posts\/14\/revisions\/15"}],"wp:attachment":[{"href":"https:\/\/blogs.scummvm.org\/richiesams\/wp-json\/wp\/v2\/media?parent=14"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.scummvm.org\/richiesams\/wp-json\/wp\/v2\/categories?post=14"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.scummvm.org\/richiesams\/wp-json\/wp\/v2\/tags?post=14"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}