616 Commits

SHA1 Message Date
29449ec27d Implement new api for LLaVA 2025-03-21 17:17:56 +11:00
e38f778d28 Extend ModelOnDisk 2025-03-21 17:17:15 +11:00
5ea3ec5cc8 Get FLUX Fill working. Note: To use FLUX Fill, set guidance to ~30. 2025-03-19 14:45:18 +11:00
f7cfbd1323 Add FLUX Fill starter model. 2025-03-19 14:45:18 +11:00
2806b60701 Add logic to probe FLUX variant (NORMAL vs INPAINT). 2025-03-19 14:45:18 +11:00
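These three commits wire up FLUX Fill (note the ~30 guidance recommendation above). A minimal sketch of how a variant probe like this can work, assuming the inpaint checkpoint is recognizable by its widened `img_in` projection, which carries the extra mask-conditioning channels; the key name and channel counts below are assumptions, not taken from the repository:

```python
from enum import Enum

import torch


class FluxVariantType(Enum):
    NORMAL = "normal"
    INPAINT = "inpaint"


def probe_flux_variant(state_dict: dict[str, torch.Tensor]) -> FluxVariantType:
    """Guess the FLUX variant from the shape of the image input projection.

    Assumption: the Fill variant concatenates mask/image conditioning onto the
    latent input, so its `img_in` linear layer has more input features (~384)
    than the base model (64). Key names may need a prefix strip depending on
    how the checkpoint was saved.
    """
    img_in = state_dict.get("img_in.weight")
    if img_in is None:
        raise ValueError("Not a recognizable FLUX transformer state dict.")
    in_features = img_in.shape[1]
    return FluxVariantType.INPAINT if in_features > 64 else FluxVariantType.NORMAL
```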
8e14f9d971 Merge branch 'main' into stripped-models 2025-03-19 07:52:56 +11:00
7fe4d4c21a feat(app): better errors when scanning models with picklescan. Differentiate between malware detection and scan error. 2025-03-19 07:20:25 +11:00
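A sketch of the distinction this commit draws, using picklescan's `scan_file_path` API: a scan that completes and finds malicious opcodes is a malware detection, while a scan that cannot run at all is a scan error. The exception names here are hypothetical:

```python
from pathlib import Path

from picklescan.scanner import scan_file_path


class ModelScanError(Exception):
    """The file could not be scanned (corrupt, unreadable, etc.)."""


class MalwareDetectedError(Exception):
    """The scan completed and found malicious pickle opcodes."""


def scan_model_file(path: Path) -> None:
    result = scan_file_path(path)
    # A positive detection and a failed scan are different failure modes
    # and should surface as different errors to the user.
    if result.infected_files:
        raise MalwareDetectedError(f"Malware detected in {path}.")
    if result.scan_err:
        raise ModelScanError(f"Could not scan {path}; the file may be corrupt.")
```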
b9972be7f1 Merge branch 'model-classification-api' into stripped-models 2025-03-18 14:57:23 +11:00
e61c5a3f26 Merge 2025-03-18 14:55:11 +11:00
9a389e6b93 Add a LLaVA OneVision starter model. 2025-03-18 11:53:06 +11:00
2ef1ecf381 Fix copy-paste errors. 2025-03-18 11:53:06 +11:00
e9714fe476 Add LLaVA Onevision model loading and inference support. 2025-03-18 11:53:06 +11:00
3f29293e39 Add LlavaOnevision model type and probing logic. 2025-03-18 11:53:06 +11:00
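For context on this group of commits, a minimal inference sketch using the Hugging Face transformers implementation of LLaVA OneVision (the public 0.5b checkpoint is assumed; the repository's own loader will differ):

```python
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration

model_id = "llava-hf/llava-onevision-qwen2-0.5b-ov-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaOnevisionForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

image = Image.open("photo.jpg")
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = processor(images=image, text=prompt, return_tensors="pt").to(
    model.device, torch.float16
)
output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```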
3469fc9843 Ruff 2025-03-18 09:22:16 +11:00
7cdd4187a9 Update classify script 2025-03-18 09:21:38 +11:00
24218b34bf Make ruff happy 2025-03-17 12:04:26 +11:00
d970c6d6d5 Use override fixture 2025-03-17 11:58:13 +11:00
8bcd9fe4b7 Extend ModelOnDisk 2025-03-17 09:18:51 +11:00
4377158503 Variant 2025-03-13 13:32:57 +11:00
d8b9a8d0dd Merge branch 'main' into model-classification-api 2025-03-13 13:03:51 +11:00
39a4608d15 Fix annotations compatibility with Python 3.11 2025-03-13 13:01:19 +11:00
b86ac5e049 Explicit union 2025-03-13 10:28:07 +11:00
665236bb79 Type hints 2025-03-13 09:21:58 +11:00
f45400a275 Remove hash algo 2025-03-12 18:39:29 +11:00
e35537e60a fix(mm): move flux_redux starter model to the flux bundle, make siglip a dependency of it 2025-03-11 11:17:19 +11:00
d86b392bfd Remove redundant hash_algo field 2025-03-11 09:16:59 +11:00
3e9e45b177 Update comments 2025-03-11 09:04:19 +11:00
907d960745 PR suggestions 2025-03-11 08:37:43 +11:00
bfdace6437 New API for model classification 2025-03-11 08:34:34 +11:00
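A plausible shape for a classification API like this, sketched with hypothetical names (the repository's actual classes and signatures may differ): each config class reports whether a model on disk matches it, and classification tries the registered configs in turn.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class ModelOnDisk:
    """Thin wrapper around model files inspected during probing (sketch)."""

    def __init__(self, path: Path):
        self.path = path


class ModelConfigBase(ABC):
    _registry: list[type["ModelConfigBase"]] = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        ModelConfigBase._registry.append(cls)

    @classmethod
    @abstractmethod
    def matches(cls, mod: ModelOnDisk) -> bool:
        """Cheap check: could this config describe the model on disk?"""

    @classmethod
    def classify(cls, path: Path) -> type["ModelConfigBase"]:
        mod = ModelOnDisk(path)
        for candidate in cls._registry:
            if candidate.matches(mod):
                return candidate
        raise ValueError(f"Unable to classify model at {path}.")
```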
cf0cbaf0ae chore: ruff (more) 2025-03-06 10:57:54 +11:00
ac6fc6eccb chore: ruff 2025-03-06 10:57:54 +11:00
8e28888bc4 Fix SigLipPipeline model size calculation. 2025-03-06 10:31:17 +11:00
f1fde792ee Get FLUX Redux working: model loading and inference. 2025-03-06 10:31:17 +11:00
e82393f7ed Add FLUX Redux to starter models list. 2025-03-06 10:31:17 +11:00
d5211a8088 Add FluxRedux model type and probing logic. 2025-03-06 10:31:17 +11:00
3b095b5945 Add SigLIP starter model. 2025-03-06 10:31:17 +11:00
34959ef573 Add SigLIP model type and probing. 2025-03-06 10:31:17 +11:00
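SigLIP enters here because FLUX Redux conditions generation on SigLIP image embeddings. A sketch of the encoding step using the transformers SigLIP classes; the checkpoint id is the SigLIP variant FLUX Redux is commonly paired with (an assumption), and the Redux projection itself is repository-internal, so it appears only as a comment:

```python
import torch
from PIL import Image
from transformers import SiglipImageProcessor, SiglipVisionModel

model_id = "google/siglip-so400m-patch14-384"  # assumed encoder for FLUX Redux
processor = SiglipImageProcessor.from_pretrained(model_id)
encoder = SiglipVisionModel.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

image = Image.open("reference.jpg")
inputs = processor(images=image, return_tensors="pt").to("cuda", torch.float16)
with torch.no_grad():
    vision_out = encoder(**inputs)

# FLUX Redux then runs these patch embeddings through a small projection and
# appends the result to the FLUX text-conditioning sequence.
image_embeds = vision_out.last_hidden_state  # (1, num_patches, hidden_dim)
```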
f2689598c0 Formatting 2025-03-06 09:11:00 +11:00
cc9d215a9b Add endpoint for emptying the model cache. Also add a threading lock to the ModelCache to make it thread-safe. 2025-01-30 09:18:28 -05:00
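A sketch of the two pieces this commit names, assuming a FastAPI app (which the project uses) and a hypothetical route: a single lock serializes cache mutation, so the new endpoint can empty the cache safely while other threads are using it.

```python
import threading

from fastapi import FastAPI


class ModelCache:
    """Sketch: every mutating operation takes the same lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._models: dict[str, object] = {}

    def put(self, key: str, model: object) -> None:
        with self._lock:
            self._models[key] = model

    def empty(self) -> None:
        with self._lock:
            self._models.clear()


app = FastAPI()
cache = ModelCache()


@app.post("/api/v2/models/empty_model_cache")  # hypothetical route
def empty_model_cache() -> dict[str, str]:
    cache.empty()
    return {"status": "ok"}
```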
f7315f0432 Make the default max RAM cache size more conservative. 2025-01-30 08:46:59 -05:00
229834a5e8 Performance optimizations for LoRAs applied on top of GGML-quantized tensors. 2025-01-28 14:51:35 +00:00
5d472ac1b8 Move quantized weight handling for patch layers up from ConcatenatedLoRALayer to CustomModuleMixin. 2025-01-28 14:51:35 +00:00
28514ba59a Update ConcatenatedLoRALayer to work with all sub-layer types. 2025-01-28 14:51:35 +00:00
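A common way to apply a LoRA over a quantized base weight without dequantizing and re-patching it is to run the low-rank update as a sidecar on the layer's output. A sketch of that pattern with hypothetical names (the repository's CustomModuleMixin generalizes the idea across layer types):

```python
import torch


class SidecarLoRALinear(torch.nn.Module):
    """Wrap a linear-like layer (including a GGML-quantized one) and add a
    low-rank update to its output, leaving the base weight untouched."""

    def __init__(self, base: torch.nn.Module, down: torch.Tensor, up: torch.Tensor, scale: float):
        super().__init__()
        self.base = base    # e.g. a GGML-quantized linear layer
        self.down = down    # (rank, in_features)
        self.up = up        # (out_features, rank)
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The base path uses the quantized kernel directly; the LoRA delta is
        # computed in the activation dtype, so the full weight matrix never
        # needs to be dequantized.
        return self.base(x) + self.scale * (x @ self.down.T @ self.up.T)
```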
0db6639b4b Add FLUX OneTrainer model probing. 2025-01-28 14:51:35 +00:00
0cf51cefe8 Revise the logic for calculating the RAM model cache limit. 2025-01-16 23:46:07 +00:00
da589b3f1f Memory optimization to load state dicts one module at a time in CachedModelWithPartialLoad when we are not storing a CPU copy of the state dict (i.e. when keep_ram_copy_of_weights=False). 2025-01-16 17:00:33 +00:00
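A simplified sketch of the one-module-at-a-time idea using standard torch APIs; the real CachedModelWithPartialLoad also manages device placement, but the memory benefit comes from releasing each module's source tensors before touching the next:

```python
import torch


def load_state_dict_module_by_module(
    model: torch.nn.Module, state_dict: dict[str, torch.Tensor]
) -> None:
    """Assign tensors one module at a time so that at most one module's
    weights are duplicated in memory at any point (sketch)."""
    for name, module in model.named_modules():
        prefix = f"{name}." if name else ""
        # Keys that belong directly to this module (no deeper dots).
        local_keys = [
            k for k in state_dict
            if k.startswith(prefix) and "." not in k[len(prefix):]
        ]
        if not local_keys:
            continue
        # Popping drops the source references so they can be freed promptly.
        local_sd = {k[len(prefix):]: state_dict.pop(k) for k in local_keys}
        module.load_state_dict(local_sd, strict=False, assign=True)
        del local_sd
```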
36a3869af0 Add keep_ram_copy_of_weights config option. 2025-01-16 15:35:25 +00:00
c76d08d1fd Add keep_ram_copy option to CachedModelOnlyFullLoad. 2025-01-16 15:08:23 +00:00
04087c38ce Add keep_ram_copy option to CachedModelWithPartialLoad. 2025-01-16 14:51:44 +00:00
b2bb359d47 Update the model loading logic for several of the large FLUX-related models to ensure that the model is initialized on the meta device prior to loading the state dict into it. This helps to keep peak memory down. 2025-01-16 02:30:28 +00:00
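A sketch of the meta-device pattern this commit describes, using standard torch APIs: constructing the module under `torch.device("meta")` allocates no real storage, and `load_state_dict(..., assign=True)` adopts the checkpoint tensors directly, so the weights are never held twice.

```python
import torch


def load_without_double_allocation(model_cls, config, checkpoint_path: str) -> torch.nn.Module:
    # 1) Build the module on the meta device: parameters are shape-only, so
    #    no RAM is spent on weights that are about to be overwritten.
    with torch.device("meta"):
        model = model_cls(config)

    # 2) Load the checkpoint; mmap keeps it lazily paged rather than fully
    #    resident (safetensors offers the same benefit).
    state_dict = torch.load(checkpoint_path, map_location="cpu", mmap=True)

    # 3) assign=True swaps the meta parameters for the loaded tensors
    #    instead of copying into preallocated storage.
    model.load_state_dict(state_dict, assign=True)
    return model
```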