
Week-6

Welcome to this week’s blog. I primarily worked on the scan utility and the scan processing logic.

Scan Utility

The scan utility was mostly complete in the first week, but I added three more features:

  1. Modification Time Filter:
    The scan utility now writes each scanned file’s modification time to the .dat file. A new command-line argument lets users specify a cutoff time: files modified after the cutoff are filtered out, except files modified today, which are kept.
    Extracting the modification time was straightforward for non-Mac files, since it can be retrieved from the OS. For Mac-specific formats (specifically MacBinary and AppleDouble), however, I had to extract the modification time from the Finder Info.

  2. Size Fields:
    All size types (size, size-r, and size-rd) are now written to the .dat file.

  3. Punycode Path Encoding:
    Filepath components are now punycode-encoded individually.
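
As a rough illustration of the modification time filter described in item 1, here is a minimal sketch. It reflects my reading of the rule (keep files modified at or before the cutoff, and always keep files modified today). The names `keep_file` and `filter_by_mtime` and the dictionary layout are hypothetical, not the scan utility's actual code.

```python
from datetime import date, datetime
from typing import Dict, Optional


def keep_file(mtime: datetime, cutoff: datetime, today: date) -> bool:
    # Keep files last modified at or before the cutoff.
    # Files modified today are kept regardless of the cutoff.
    return mtime <= cutoff or mtime.date() == today


def filter_by_mtime(
    files: Dict[str, datetime],
    cutoff: datetime,
    today: Optional[date] = None,
) -> Dict[str, datetime]:
    # `files` maps a path to its modification time (hypothetical layout).
    # `today` is injectable so the filter can be checked with fixed dates.
    if today is None:
        today = date.today()
    return {
        path: mtime
        for path, mtime in files.items()
        if keep_file(mtime, cutoff, today)
    }
```

Passing `today` explicitly makes the behaviour reproducible; in normal use it defaults to the current date.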

Scan Processing Logic

For processing scan.dat, the first improvement was updating the checksum of all files in the database that matched both the checksum and file size.

The rest of the processing is similar to the set.dat logic. Filtering finds candidate filesets whose detection files match on filename, size, and additionally checksum.

  • Single Candidate:
    • If the candidate’s status is partial, it’s upgraded to full (files are updated in case they were skipped earlier due to missing size info).

    • If the candidate’s status is detection and its number of files equals the number of files in the scan.dat, the status is set to full. Otherwise, it’s flagged for manual merge.

    • If the candidate status is already full, all files are compared, and any differences are reported.

  • Multiple Candidates:
    All candidates are added for manual merging.
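
The candidate handling above can be sketched as a small decision function. This is only an illustration under assumptions: the function name, the dictionary keys (`status`, `file_count`), and the returned action labels are hypothetical, and I assume the file-count comparison is between the candidate and the scanned fileset. The case of zero candidates is not described here, so the sketch rejects it.

```python
from typing import Dict, List


def resolve_candidates(candidates: List[Dict], scan_file_count: int) -> str:
    # Returns a label for the action to take (labels are illustrative only).
    if not candidates:
        raise ValueError("this sketch expects at least one candidate")

    # Multiple candidates: everything goes to manual merge.
    if len(candidates) > 1:
        return "manual_merge"

    candidate = candidates[0]
    status = candidate["status"]

    if status == "partial":
        # Partial filesets are upgraded; their files are refreshed as well.
        return "upgrade_to_full"
    if status == "detection":
        # Upgrade only when the file counts agree (assumed comparison).
        if candidate["file_count"] == scan_file_count:
            return "upgrade_to_full"
        return "manual_merge"
    if status == "full":
        # Already full: compare all files and report any differences.
        return "compare_and_report"

    raise ValueError(f"unexpected status: {status!r}")
```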

Other Fixes and Improvements

  • Fix in set.dat Handling:
    Sometimes, filesets in the candidate list were updated during the same run because of other filesets. These updated filesets could then show up as false positives for manual merge if their size had changed. Now, if a fileset gets updated and its size no longer matches, it’s removed from the candidate list.

  • Database Schema Update:
    An extra column was added to the fileset table to store set.dat metadata.

  • Website Navbar:
    A new navbar has been added to the webpage, along with the updated logo provided by Sev.

  • Database Connection Fix in Flask:
    For development, a “Clear Database” button was added to the webpage. However, the Flask code previously used a global database connection object. This led to multiple user connections persisting and occasionally locking the database. I’ve refactored the code to eliminate the global connection, resolving the issue.
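
To illustrate the idea behind the connection fix, here is a minimal sketch of opening a short-lived connection per operation instead of sharing a global one. It uses Python's built-in `sqlite3` purely as a stand-in for the real database driver, and `db_connection` and `clear_database` are hypothetical names rather than the project's actual functions.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def db_connection(path: str):
    # Open a fresh connection for one unit of work and always close it,
    # so no connection outlives the request that needed it.
    conn = sqlite3.connect(path)
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()


def clear_database(path: str) -> None:
    # Hypothetical handler body for a "Clear Database" button.
    # The table name `fileset` is taken from the post; a real schema
    # would have more tables to clear.
    with db_connection(path) as conn:
        conn.execute("DELETE FROM fileset")
```

In a Flask app, each route handler would wrap its database work in `with db_connection(...)`, so nothing is left open between requests.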
