Commit Graph

15 Commits

Author SHA1 Message Date
Brandon Rising
dc23bebebf Run ruff 2024-06-26 21:46:59 +10:00
Kent Keirsey
38b6f90c02 Update prevention exception message 2024-06-26 21:46:59 +10:00
Brandon Rising
63a7e19dbf Run ruff 2024-06-18 10:38:29 -04:00
Brandon Rising
fbc5a8ec65 Ignore validation on improperly formatted hashes (pytest) 2024-06-18 10:38:29 -04:00
Brandon Rising
8ce6e4540e Run ruff 2024-06-18 10:38:29 -04:00
Brandon Rising
f14f377ede Update validator list 2024-06-18 10:38:29 -04:00
Brandon Rising
1925f83f5e Update validator list 2024-06-18 10:38:29 -04:00
Brandon Rising
3a5ad6d112 Update validator list 2024-06-18 10:38:29 -04:00
Brandon Rising
41a6bb45f3 Initial functionality 2024-06-18 10:38:29 -04:00
psychedelicious
72b44f7ebc feat(mm): rename "blake3" to "blake3_multi"
Just make it clearer which is which.
2024-03-22 08:26:36 +11:00
psychedelicious
7726d312e1 feat(mm): default hashing algo to blake3_single
For SSDs, `blake3` is about 10x faster than `blake3_single` - 3 files/second vs 30 files/second.

For spinning HDDs, `blake3` is about 100x slower than `blake3_single` - 300 seconds/file vs 3 seconds/file.

For external drives, `blake3` is always worse, but the difference is highly variable. For external spinning drives, it's probably way worse than internal.

The least offensive algorithm is `blake3_single`, and it's still _much_ faster than any other algorithm.
2024-03-22 08:26:36 +11:00
psychedelicious
7a4122235f feat(mm): display progress when hashing files 2024-03-21 17:24:48 +11:00
psychedelicious
4117cea5bf tidy(mm): remove misplaced comment 2024-03-14 15:54:42 +11:00
psychedelicious
9fcd67b5c0 feat(mm): add algorithm prefix to hashes
For example:
- md5:a0cd925fc063f98dbf029eee315060c3
- sha1:9e362940e5603fdc60566ea100a288ba2fe48b8c
- blake3:ce3f0c5f3c05d119f4a5dcaf209b50d3149046a0d3a9adee9fed4c83cad6b4d0
2024-03-14 15:54:42 +11:00
psychedelicious
eb6e6548ed feat(mm): faster hashing for spinning disk HDDs
BLAKE3 has poor performance on spinning disks when parallelized. See https://github.com/BLAKE3-team/BLAKE3/issues/31

- Replace `skip_model_hash` setting with `hashing_algorithm`. Any algorithm we support is accepted.
- Add `random` algorithm: hashes a UUID with BLAKE3 to create a random "hash". Equivalent to the previous skip functionality.
- Add `blake3_single` algorithm: hashes on a single thread using BLAKE3, fixes the aforementioned performance issue
- Update model probe to accept the algorithm to hash with as an optional arg, defaulting to `blake3`
- Update all calls of the probe to use the app's configured hashing algorithm
- Update an external script that probes models
- Update tests
- Move ModelHash into its own module to avoid circuclar import issues
2024-03-14 15:54:42 +11:00