Month: August 2025

Week 12

Welcome to this week’s blog. This week, I added features related to updating metadata as well as file data directly from the application UI, along with some smaller fixes and improvements.

For any fileset, you can now update metadata fields directly from the UI. User filesets need an extra step: their metadata, particularly gameid and engineid, must be added first, because these fields require creating entries in separate tables. To make filling in metadata easier, I also added a dropdown that lists all existing values for a field from the database, so moderators can either type a new value or pick an existing one. In addition to metadata, I added the ability to update individual files. This is useful for tasks such as manually marking a file as a detection file or updating other fields.

For better reliability, confirmation dialogs have been added for most buttons, such as those for deleting or updating files and adding or updating metadata. Furthermore, a separate button has been added for deleting an entire fileset. Another improvement is the ability to bulk-delete all filesets that appear in a filtered search result on the fileset search page.

To enhance logging for scanned files, a new field called data_path has been introduced. This field stores the relative path of the game directory, which is particularly useful when multiple files are scanned at once. This information can later be included in scan.dat-related logs.

Lastly, I added an endpoint that the mail server can use to send a fileset ID as a mail notification. It is supposed to be triggered by the mail server whenever a user submits fileset-related information using the predefined mail structure in the ScummVM application. (This feature has not yet been integrated with the mail server.)

Week 11

Welcome to this week’s blog. This week, I worked on testing the workflow for scan data as well as the user file integrity service.

For the scan data, we tested with the WAGE game archives, which provided a good opportunity to test both the scan utility and the scan matching for Mac files. Some fixes were indeed needed in the matching process. Initially, I was using both size (data fork size) and size-rd (the size of the resource fork's data section) simultaneously while filtering filesets. However, this was incorrect, since detection filesets only contain one of these at a time. Additionally, I fixed how matched entries were being processed. Previously, entries matched with detection were queued for manual merge, so that specific files could be added while avoiding unnecessary ones such as license or readme files from commercial games. However, it made more sense to merge them automatically and remove such files later if necessary, especially since, for archives like WAGE, the issue of extra files from commercial games does not arise.

I also carried out testing for the user integrity service, focusing on different response cases:

- All files are okay when a full fileset matches.
- Extra files are present.
- Some files are missing.

Another missing piece was reporting files with checksum mismatches, which were previously classified as extra files. This is now fixed. I also reviewed the manual merge process for user filesets. Unlike set filesets, the source fileset (here, the user fileset) should not be deleted after a manual merge, since it could be a new variant that needs additional metadata. To support this, I implemented a feature to update fileset metadata, though it still requires some refinement. One more thing I need to add is an endpoint in the web server that can be triggered by the mail server. This endpoint will receive the mail information, particularly the user fileset ID, for filesets where the user has provided additional information via the pre-drafted email that is prompted when the user uses the 'check integrity' feature in the ScummVM application.
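The response cases described above (okay, extra, missing, and checksum mismatch) can be illustrated with a small sketch. This is not the service's actual code: the function name `classify_files` and the "file name to checksum" dictionary layout are assumptions made only for illustration.

```python
def classify_files(user_files, reference_files):
    """Compare a user's files against a reference fileset.

    Both arguments map a file name to its checksum. This layout is
    hypothetical and chosen only to keep the example short.
    """
    report = {"ok": [], "missing": [], "extra": [], "mismatch": []}

    for name, checksum in user_files.items():
        if name not in reference_files:
            # The file is unknown to the reference fileset.
            report["extra"].append(name)
        elif reference_files[name] != checksum:
            # Known file, different content: reported separately from "extra".
            report["mismatch"].append(name)
        else:
            report["ok"].append(name)

    for name in reference_files:
        if name not in user_files:
            report["missing"].append(name)

    return report
```

The key point is that a file whose name is known but whose checksum differs gets its own category instead of being counted as an extra file.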

A few other fixes this week included:

- Deleting multiple files from a fileset through the dashboard: Previously, the query was generated incorrectly. Instead of "DELETE FROM file WHERE id IN ('1', '2', '3')", it produced "DELETE FROM file WHERE id IN ('1, 2, 3')", which, of course, did not work. This issue is now fixed.
- Search filter issue: A bug occurred when a single quote (') was used as a value in search filters, breaking the query because the quote was not escaped. This has also been fixed.
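Both of these bugs belong to the same family, and one common way to avoid them is to let the database driver bind the values instead of building the SQL string by hand. The sketch below is only an illustration and not the project's code: it uses Python's built-in sqlite3 module with an in-memory table so that it is self-contained, and the helper names `delete_files` and `search_files` are hypothetical. Other drivers may use a different placeholder style (for example %s instead of ?).

```python
import sqlite3


def delete_files(conn, file_ids):
    """Delete rows from `file` by id, using one placeholder per id."""
    if not file_ids:
        return 0  # "IN ()" is not valid SQL, so skip the query entirely
    placeholders = ", ".join("?" for _ in file_ids)
    cur = conn.execute(
        f"DELETE FROM file WHERE id IN ({placeholders})", list(file_ids)
    )
    return cur.rowcount


def search_files(conn, name_fragment):
    """Search by name; the bound value may safely contain a single quote."""
    cur = conn.execute(
        "SELECT name FROM file WHERE name LIKE ? ORDER BY id",
        (f"%{name_fragment}%",),
    )
    return [row[0] for row in cur.fetchall()]


# Demo data in an in-memory database (assumed schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO file (id, name) VALUES (?, ?)",
    [(1, "a.dat"), (2, "king's quest.dat"), (3, "c.dat"), (4, "d.dat")],
)
```

With this approach, only the placeholders are interpolated into the query text, so each id is sent as a separate value and a quote inside a search term never reaches the SQL parser as syntax.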

Week 10

Welcome to this week’s blog. This week, my work focused on enhancing API security, adding GitHub authentication, refining the project structure, and introducing a faster Python package manager (UV).

API Security Improvements

I implemented some checks on the validation endpoint, which processes the game file data sent by users from the ScummVM application. These checks are designed to prevent brute-force attempts.

On top of that, I introduced rate limiting using Flask-Limiter. Currently, the validation endpoint allows a maximum of 3 requests per minute per user.
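To show what "3 requests per minute per user" means in practice, here is a minimal sliding-window sketch written with the standard library only. It is not how Flask-Limiter is implemented or configured (there, a limit like this is typically declared with a decorator such as @limiter.limit("3 per minute")); the class name `SlidingWindowLimiter` and the injectable `clock` parameter are assumptions made for this example.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each user key."""

    def __init__(self, max_requests=3, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._hits = defaultdict(deque)  # user key -> timestamps of accepted requests

    def allow(self, user_key):
        now = self.clock()
        hits = self._hits[user_key]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: the caller would answer with HTTP 429
        hits.append(now)
        return True
```

Each user key (for example an IP address) gets its own window, so one user hitting the limit does not affect anyone else.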

GitHub OAuth & Role-Based Access

GitHub OAuth authentication is now in place, along with a three-level role-based system. I have tested it with my own dummy organisation, but the integration with ScummVM is still pending. The three roles are:

- Admin: Full access, plus the ability to clear the database.
- Moderators: Same permissions as Admin, except database clearing.
- Read-Only: Logged-in users with viewing rights only.
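The three levels above form a simple hierarchy, which can be sketched as an ordered enum plus a minimum role per action. This is only an illustration under assumed names: `Role`, `REQUIRED_ROLE`, `is_allowed`, and the action strings are all hypothetical and not taken from the project.

```python
from enum import IntEnum


class Role(IntEnum):
    """Roles ordered from least to most privileged."""
    READ_ONLY = 1
    MODERATOR = 2
    ADMIN = 3


# Minimum role needed for each action (action names are hypothetical).
REQUIRED_ROLE = {
    "view": Role.READ_ONLY,
    "edit_fileset": Role.MODERATOR,
    "clear_database": Role.ADMIN,
}


def is_allowed(role, action):
    """Return True if `role` is privileged enough to perform `action`."""
    required = REQUIRED_ROLE.get(action)
    if required is None:
        return False  # unknown actions are denied by default
    return role >= required
```

Because moderators differ from admins only in database clearing, a single ordered comparison is enough here; a finer-grained permission model would need an explicit set of permissions per role.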

Project Restructuring & UV Integration

As suggested by my mentor Rvanlaar, I restructured the project into a Python module, making the import logic cleaner and improving overall modularity. I also added UV, a high-performance Python package and project manager, offering faster dependency handling compared to pip.

Other Fixes & Improvements

- Updated the Apache config file to use the Python virtual environment instead of the global installation.
- Fixed decoding of MacBinary filenames from headers to use MacRoman instead of UTF-8.
- Improved error handling for the scan utility.
- Switched to using either size or size-rd, rather than both simultaneously, when filtering filesets for scan.dat in the case of Mac files.
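The MacRoman fix can be illustrated with a short sketch. In a MacBinary header, byte 1 holds the filename length (1 to 63) and the name itself starts at byte 2; Python ships a "mac_roman" codec for decoding it. The function name `macbinary_filename` is hypothetical, and this is not the scan utility's actual code.

```python
def macbinary_filename(header: bytes) -> str:
    """Return the filename stored in a MacBinary header.

    Byte 1 is the name length (1-63) and the name starts at byte 2.
    Classic Mac OS stored these names in MacRoman, not UTF-8.
    """
    length = header[1]
    if not 1 <= length <= 63:
        raise ValueError("invalid MacBinary filename length")
    return header[2:2 + length].decode("mac_roman")
```

For example, the byte 0x8E is "é" in MacRoman, but it is not a valid start byte in UTF-8, so decoding such a name as UTF-8 raises a UnicodeDecodeError.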

Week 9

Welcome to this week’s blog. This week was a busy one due to my college workload, but I mostly focused on enhancing the webpage. I worked on the configuration page, the manual merge dashboard, filtering, search-related improvements, and more.

Configuration Page:
I added a new configuration page that allows users to customize their preferences, including:
- Number of filesets per page
- Number of logs per page
- Column width percentages for the fileset search page
- Column width percentages for the log page
All these preferences are stored in cookies for persistence.

[Figure: User Configuration Page]

Manual Merge Dashboard:
I performed some refactoring of the codebase for manual merging. Additionally, I added options to:
- Show either all files or only the common ones
- Display either all fields of the files, or just the full-size MD5 and size (or size-rd in the case of Mac files)

Search Functionality:
I improved the search system with the following features:
- Exact match: Values wrapped in double quotes are matched exactly
- OR search: Multiple terms separated by spaces are treated as an OR
- AND search: Terms separated by + are treated as an AND
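The three search rules above can be sketched as a small parser. This is only one possible interpretation, not the project's implementation: it assumes that spaces split the input into OR groups, that + joins AND terms within a group, and that a quoted value may contain spaces. The name `parse_search` and the output shape are hypothetical.

```python
import re

# A group is a run of non-space characters and/or quoted phrases.
_GROUP = re.compile(r'(?:"[^"]*"|[^\s"])+')
# Inside a group, terms are quoted phrases or runs without '+' and '"'.
_TERM = re.compile(r'"[^"]*"|[^+"]+')


def parse_search(value):
    """Parse a search string into OR groups of AND terms.

    Returns a list of groups; each group is a list of (text, exact) pairs.
    Groups are meant to be OR-ed together, and terms within a group AND-ed.
    """
    groups = []
    for group in _GROUP.findall(value):
        terms = []
        for raw in _TERM.findall(group):
            if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
                terms.append((raw[1:-1], True))   # quoted: exact match
            else:
                terms.append((raw, False))        # unquoted: partial match
        if terms:
            groups.append(terms)
    return groups
```

For instance, the input sky+"Beneath a Steel Sky" queen would be read as (sky AND exactly "Beneath a Steel Sky") OR queen. Each pair could then be turned into either an equality or a LIKE condition with bound parameters.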

Sorting Enhancements:
The sorting feature now includes three states for each column: ascending, descending, and default (unsorted).

Minor Fixes & Improvements

- Added a favicon to display on the webpage tab
- Implemented checksum-based filtering on the fileset search page
- Included metadata information in seeding logs (unless --skiplog is passed)

Goals for Next Week

- Add GitHub-based authentication
- Implement a three-tier user system: admin, moderator, and read-only
- Add validation checks on user data to prevent brute-force attacks
- Refactor the entire project into a Python module for better structure and cleaner imports
