feat(mm): default hashing algo to blake3_single

For SSDs, `blake3` is about 10x faster than `blake3_single` - 30 files/second vs 3 files/second.

For spinning HDDs, `blake3` is about 100x slower than `blake3_single` - 300 seconds/file vs 3 seconds/file.

For external drives, `blake3` is always worse, but the difference is highly variable. For external spinning drives, it's probably way worse than internal.

The least offensive algorithm is `blake3_single`, and it's still _much_ faster than any other algorithm.
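
A rough sketch of the single-threaded pattern, assuming `blake3_single` hashes each file in one thread and reads it front to back (this commit does not show its implementation). `hashlib.blake2b` is used only as a standard-library stand-in for BLAKE3, and `hash_file_sequential` and `CHUNK_SIZE` are hypothetical names, not part of the codebase:

```python
import hashlib
from pathlib import Path

# Hypothetical chunk size (1 MiB); not taken from the codebase.
CHUNK_SIZE = 1024 * 1024


def hash_file_sequential(path: Path, chunk_size: int = CHUNK_SIZE) -> str:
    """Hash one file in a single thread, reading it from start to end."""
    # Stand-in for BLAKE3, which is not in the standard library.
    hasher = hashlib.blake2b()
    with open(path, "rb") as f:
        # Each read continues where the previous one stopped, so the
        # drive sees one sequential access pattern.
        while chunk := f.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()
```

Sequential reads like this avoid the seek-heavy access that several concurrent readers can cause on a spinning disk, which may explain the HDD numbers above.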
psychedelicious
2024-03-21 17:38:46 +11:00
parent 61520dfb86
commit 7726d312e1
5 changed files with 12 additions and 12 deletions

@@ -61,7 +61,7 @@ class ModelHash:
     """
     def __init__(
-        self, algorithm: HASHING_ALGORITHMS = "blake3", file_filter: Optional[Callable[[str], bool]] = None
+        self, algorithm: HASHING_ALGORITHMS = "blake3_single", file_filter: Optional[Callable[[str], bool]] = None
     ) -> None:
         self.algorithm: HASHING_ALGORITHMS = algorithm
         if algorithm == "blake3":

@@ -114,7 +114,7 @@ class ModelProbe(object):
     @classmethod
     def probe(
-        cls, model_path: Path, fields: Optional[Dict[str, Any]] = None, hash_algo: HASHING_ALGORITHMS = "blake3"
+        cls, model_path: Path, fields: Optional[Dict[str, Any]] = None, hash_algo: HASHING_ALGORITHMS = "blake3_single"
     ) -> AnyModelConfig:
         """
         Probe the model at model_path and return its configuration record.