Commit Graph

99 Commits

Author SHA1 Message Date
e22df59239 proof-of-principle support for stable-fast
only compile model the first time :-)

probe for availability of stable-fast compiler and triton at startup time

simplify config logic
2023-12-21 16:28:42 -05:00
6494e8e551 chore: ruff format 2023-11-11 10:55:40 +11:00
3a136420d5 chore: ruff check - fix flake8-comprensions 2023-11-11 10:55:23 +11:00
aa02ebf8f5 Fix model cache gc.collect() condition. 2023-11-04 08:52:10 -04:00
fb3d0c4b12 Fix bug in model cache reference count checking. 2023-11-03 13:50:40 -07:00
8488ab0134 Reduce frequency that we call gc.collect() in the model cache. 2023-11-03 13:50:40 -07:00
875231ed3d Add reminder to clean up our model cache clearing logic. 2023-11-03 13:50:40 -07:00
43b300498f Remove explicit gc.collect() after transferring models from device to CPU. I'm not sure why this was there in the first place, but it was taking a significant amount of time (up to ~1sec in my tests). 2023-11-03 13:50:40 -07:00
3781e56e57 Add log_memory_usage param to ModelCache. 2023-11-02 19:20:37 -07:00
40f9e49b5e Demote model cache logs from warning to debug based on the conversation here: https://discord.com/channels/1020123559063990373/1049495067846524939/1161647290189090816 2023-10-11 12:02:46 -04:00
58b56e9b1e Add a skip_torch_weight_init() context manager to improve model load times (from disk). 2023-10-09 14:12:56 -04:00
2479a59e5e Re-enable garbage collection in model cache MemorySnapshots. 2023-10-03 15:18:47 -04:00
22a84930f6 Disable garbage collection in ModelCache calls to MemorySnapshot in order minimize snapshot overhead. 2023-10-03 14:25:34 -04:00
5915a4a51c Minor fixes. 2023-10-03 14:25:34 -04:00
4580ba0d87 Remove logic to update model cache size estimates dynamically. 2023-10-03 14:25:34 -04:00
b9fd2e9e76 Improve get_pretty_snapshot_diff(...) message formatting. 2023-10-03 14:25:34 -04:00
2a3c0ab5d2 Move MemorySnapshot to its own file. 2023-10-03 14:25:34 -04:00
7d65555a5a Fix type error in torch device comparison. 2023-10-03 14:25:34 -04:00
123f2b2dbc Update cache model size estimates based on changes in VRAM when moving models to/from CUDA. 2023-10-03 14:25:34 -04:00
1e4e42556e Update model cache device comparison to treat 'cuda' and 'cuda:0' as the same device type. 2023-10-03 14:25:34 -04:00
1f6699ac43 Consolidate all model.to(...) calls in the model cache to use a utility function with better logging. 2023-10-03 14:25:34 -04:00
ace8665411 Add warning log if moving a model from cuda to cpu causes unexpected change in VRAM usage. 2023-10-03 14:25:34 -04:00
7fa5bae8fd Add warning log if moving model from RAM to VRAM causes an unexpected change in VRAM usage. 2023-10-03 14:25:34 -04:00
f9faca7c91 Add warning log if model mis-reports its required cache memory before load from disk. 2023-10-03 14:25:34 -04:00
594fd3ba6d Add debug logging of changes in RAM and VRAM for all model cache operations. 2023-10-03 14:25:34 -04:00
44d68f5ed5 Auto-format model_cache.py. 2023-10-03 14:25:34 -04:00
2f5e923008 Removed duplicate import in model_cache.py 2023-09-13 19:33:43 -04:00
b7296000e4 made MPS calls conditional on MPS actually being the chosen device with backend available 2023-09-13 19:33:43 -04:00
fab055995e Add empty_cache() for MPS hardware. 2023-09-13 19:33:43 -04:00
e88d7c242f isort wip 3 2023-09-12 13:01:58 -04:00
537ae2f901 Resolving merge conflicts for flake8 2023-08-18 15:52:04 +10:00
bb1b8ceaa8 Update invokeai/backend/model_management/model_cache.py
Co-authored-by: StAlKeR7779 <stalkek7779@yandex.ru>
2023-08-16 08:48:44 -04:00
f9958de6be added memory used to load models 2023-08-15 21:56:19 -04:00
ec10aca91e report RAM and RAM cache statistics 2023-08-15 21:00:30 -04:00
6ad565d84c folded in changes from 4099 2023-08-04 18:24:47 -04:00
1ac14a1e43 add sdxl lora support 2023-08-04 11:44:56 -04:00
f5ac73b091 Merge branch 'main' into feat/onnx 2023-07-31 10:58:40 -04:00
e20c4dc1e8 blackified 2023-07-30 08:17:10 -04:00
844578ab88 fix lora loading crash 2023-07-30 07:57:10 -04:00
99daa97978 more refactoring; fixed place where rel conversion missed 2023-07-29 13:00:07 -04:00
da751da3dd Merge branch 'main' into feat/onnx 2023-07-28 09:59:35 -04:00
2b7b3dd4ba Run python black 2023-07-28 09:46:44 -04:00
bfdc8c80f3 Testing caching onnx sessions 2023-07-27 14:13:29 -04:00
218b6d0546 Apply black 2023-07-27 10:54:01 -04:00
bda0000acd Cleanup vram after models offloading, tweak to cleanup local variable references on ram offload 2023-07-18 23:21:18 +03:00
bc11296a5e Disable lazy offloading on disabled vram cache, move resulted tensors to cpu(to not stack vram tensors in cache), fix - text encoder not freed(detach) 2023-07-18 16:20:25 +03:00
dab03fb646 rename gpu_mem_reserved to max_vram_cache_size
To be consistent with max_cache_size, the amount of memory to hold in
VRAM for model caching is now controlled by the max_vram_cache_size
configuration parameter.
2023-07-11 15:25:39 -04:00
d32f9f7cb0 reverse logic of gpu_mem_reserved
- gpu_mem_reserved now indicates the amount of VRAM that will be reserved
  for model caching (similar to max_cache_size).
2023-07-11 15:16:40 -04:00
5759a390f9 introduce gpu_mem_reserved configuration parameter 2023-07-09 18:35:04 -04:00
8d7dba937d fix undefined variable 2023-07-09 14:37:45 -04:00