feat(mm): faster hashing for spinning disk HDDs

BLAKE3 has poor performance on spinning disks when parallelized. See https://github.com/BLAKE3-team/BLAKE3/issues/31

- Replace `skip_model_hash` setting with `hashing_algorithm`. Any algorithm we support is accepted.
- Add `random` algorithm: hashes a UUID with BLAKE3 to create a random "hash". Equivalent to the previous skip functionality.
- Add `blake3_single` algorithm: hashes on a single thread using BLAKE3, which fixes the aforementioned performance issue.
- Update model probe to accept the algorithm to hash with as an optional arg, defaulting to `blake3`.
- Update all calls of the probe to use the app's configured hashing algorithm.
- Update an external script that probes models.
- Update tests.
- Move ModelHash into its own module to avoid circular import issues.
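
The two new algorithms can be illustrated with a minimal sketch. This is not the code from this commit: BLAKE2b from the standard library stands in for BLAKE3 so the example runs without third-party packages, and the names `SketchAlgorithm`, `random_hash`, `single_thread_hash`, and `hash_model` are hypothetical.

```python
import hashlib
import uuid
from pathlib import Path
from typing import Literal, get_args

# Hypothetical algorithm names for this sketch only; the real
# HASHING_ALGORITHMS type in the commit may contain different values.
SketchAlgorithm = Literal["random", "single_thread"]


def _new_hasher():
    # Assumption: BLAKE2b (stdlib) is swapped in for BLAKE3 here.
    return hashlib.blake2b(digest_size=32)


def random_hash() -> str:
    """Hash a fresh UUID: a unique value produced without reading the model."""
    hasher = _new_hasher()
    hasher.update(uuid.uuid4().bytes)
    return hasher.hexdigest()


def single_thread_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file by reading it sequentially in chunks on one thread."""
    hasher = _new_hasher()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()


def hash_model(path: Path, algorithm: str = "single_thread") -> str:
    """Dispatch on the configured algorithm name."""
    if algorithm not in get_args(SketchAlgorithm):
        raise ValueError(f"Unknown hashing algorithm: {algorithm}")
    if algorithm == "random":
        return random_hash()
    return single_thread_hash(path)
```

Sequential reads keep a spinning disk's head moving in one direction, which is the access pattern the single-threaded variant is meant to preserve. The `random` branch never opens the file, so its result is not a content hash and differs on every call.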
2024-08-30 20:32:17 +00:00 · 2024-03-14 09:44:55 +11:00
parent 8287fcf097
commit eb6e6548ed
6 changed files with 78 additions and 33 deletions
--- a/scripts/probe-model.py
+++ b/scripts/probe-model.py
@@ -4,20 +4,30 @@

 import argparse
 from pathlib import Path
+from typing import get_args

+from invokeai.backend.model_hash.model_hash import HASHING_ALGORITHMS
 from invokeai.backend.model_manager import InvalidModelConfigException, ModelProbe

+algos = ", ".join(set(get_args(HASHING_ALGORITHMS)))
+
 parser = argparse.ArgumentParser(description="Probe model type")
 parser.add_argument(
    "model_path",
    type=Path,
    nargs="+",
 )
+parser.add_argument(
+    "--hash_algo",
+    type=str,
+    default="blake3",
+    help=f"Hashing algorithm to use (default: blake3), one of: {algos}",
+)
 args = parser.parse_args()

 for path in args.model_path:
    try:
-        info = ModelProbe.probe(path)
+        info = ModelProbe.probe(path, hash_algo=args.hash_algo)
        print(f"{path}:{info.model_dump_json(indent=4)}")
    except InvalidModelConfigException as exc:
        print(exc)
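
The script above joins a `set` of algorithm names for the help text and accepts any string for `--hash_algo`. A possible variation, sketched here under assumptions, sorts the names so the help text is stable between runs and passes them as `choices` so an unknown name is rejected at parse time. `SKETCH_ALGORITHMS` and `build_parser` are hypothetical; the `Literal` lists only the three names mentioned in the commit message, and the real `HASHING_ALGORITHMS` may contain more.

```python
import argparse
from typing import Literal, get_args

# Hypothetical stand-in for HASHING_ALGORITHMS, limited to the names
# mentioned in the commit message.
SKETCH_ALGORITHMS = Literal["blake3", "blake3_single", "random"]


def build_parser() -> argparse.ArgumentParser:
    # sorted() gives a deterministic order; iterating a set of strings
    # can produce a different order from one interpreter run to the next.
    names = sorted(get_args(SKETCH_ALGORITHMS))
    parser = argparse.ArgumentParser(description="Probe model type")
    parser.add_argument(
        "--hash_algo",
        type=str,
        default="blake3",
        choices=names,  # argparse exits with an error on any other value
        help=f"Hashing algorithm to use (default: blake3), one of: {', '.join(names)}",
    )
    return parser
```

With `choices` set, a typo such as `--hash_algo blake4` fails before any model is probed, rather than surfacing later inside the hashing code.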