Welcome to this week’s blog update. This week, I focused primarily on completing the `scan.dat` processing, as well as working on `user.dat` handling, which covers the data coming from the user side. The scan processing was almost complete by the previous week; the remaining task was to add a modification-time field to the file table in the database and reflect that change in the frontend.
One significant fix was introducing checksum-based filtering at the very beginning of the filtering logic for all dats. Previously, I had placed it after the maximum-matched files had already been filtered, which did not align with ScummVM’s detection algorithm. Furthermore, the detection entries from the SCUMM engine had 1 MB checksums, so checksum-based filtering worked particularly well there. Another improvement was how I handle duplicate entries: initially, I dropped all entries when duplicates occurred, but it is more efficient to retain the first entry and discard the rest, which reduces the need to manually add extra detection entries for such cases.
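The keep-first duplicate handling can be sketched as follows. This is a minimal illustration, not the actual dat-processing code; the entry fields and the default key are assumptions for the example:

```python
def dedupe_keep_first(entries, key=lambda e: (e["name"], e["checksum"])):
    """Retain the first entry for each duplicate key and discard the rest.

    Keeping the first occurrence (instead of dropping every duplicate)
    means fewer cases need a manually added detection entry later.
    """
    seen = set()
    result = []
    for entry in entries:
        k = key(entry)
        if k not in seen:
            seen.add(k)
            result.append(entry)
    return result
```

Because `seen` is a set, the pass stays linear in the number of entries even for large dats.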
Then I worked on `user.dat`, where I rewrote some of the matching logic. The filtering approach remains consistent with `scan.dat`, and I made minimal changes to the existing response logic. Some work remains on the moderation queue for reviewing user data and on IP-based checking.

Other fixes and improvements:
- Parameterized all old SQL queries: I had postponed this task for a while, but finally sat down to parameterize them all.
- Formatting `compute_hash.py`: I had avoided running the Ruff formatter on this file because it was interfering with specific parts of the scan utility code. However, thanks to a suggestion from rvanlaar, I used `# fmt: off` and `# fmt: on` comments to selectively disable formatting for those sections.
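The parameterized-query cleanup in the first item boils down to passing values separately from the SQL text instead of interpolating them into the string. A minimal sketch using Python's built-in `sqlite3` for illustration (the server's actual database driver may differ, and placeholder syntax varies by driver, e.g. `%s` for MySQL connectors):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file (name TEXT, mtime INTEGER)")

# Unsafe pattern (what parameterization replaces): interpolating values
# into the SQL string allows injection through crafted filenames.
#   conn.execute(f"INSERT INTO file VALUES ('{name}', {mtime})")

# Safe: the value travels separately and the driver handles escaping,
# so even a hostile filename is stored as plain data.
name = "b.exe'; DROP TABLE file; --"
conn.execute("INSERT INTO file (name, mtime) VALUES (?, ?)", (name, 123))
rows = conn.execute("SELECT mtime FROM file WHERE name = ?", (name,)).fetchall()
print(rows)
```

Beyond safety, parameterized queries also let the database reuse the prepared statement across calls.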
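The `# fmt: off` / `# fmt: on` comments from the second item bracket a region that Ruff's formatter should leave untouched, which is handy for hand-aligned literals. The dictionary below is made up for illustration and is not from `compute_hash.py`:

```python
# fmt: off
# Hand-aligned table: the formatter would collapse this spacing,
# so formatting is disabled just for this block.
CHECKSUM_SIZES = {
    "md5":    5000,
    "md5-1M": 1048576,
}
# fmt: on


def size_for(kind: str) -> int:
    """Code outside the markers is formatted normally."""
    return CHECKSUM_SIZES[kind]
```

Only the bracketed region is exempt; everything after `# fmt: on` is formatted as usual.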