Commit Graph

95 Commits

Author SHA1 Message Date
Ryan Dick
fb3d0c4b12 Fix bug in model cache reference count checking. 2023-11-03 13:50:40 -07:00
Ryan Dick
8488ab0134 Reduce frequency that we call gc.collect() in the model cache. 2023-11-03 13:50:40 -07:00
Ryan Dick
875231ed3d Add reminder to clean up our model cache clearing logic. 2023-11-03 13:50:40 -07:00
Ryan Dick
43b300498f Remove explicit gc.collect() after transferring models from device to CPU. I'm not sure why this was there in the first place, but it was taking a significant amount of time (up to ~1sec in my tests). 2023-11-03 13:50:40 -07:00
Ryan Dick
3781e56e57 Add log_memory_usage param to ModelCache. 2023-11-02 19:20:37 -07:00
Ryan Dick
40f9e49b5e Demote model cache logs from warning to debug based on the conversation here: https://discord.com/channels/1020123559063990373/1049495067846524939/1161647290189090816 2023-10-11 12:02:46 -04:00
Ryan Dick
58b56e9b1e Add a skip_torch_weight_init() context manager to improve model load times (from disk). 2023-10-09 14:12:56 -04:00
Ryan Dick
2479a59e5e Re-enable garbage collection in model cache MemorySnapshots. 2023-10-03 15:18:47 -04:00
Ryan Dick
22a84930f6 Disable garbage collection in ModelCache calls to MemorySnapshot in order minimize snapshot overhead. 2023-10-03 14:25:34 -04:00
Ryan Dick
5915a4a51c Minor fixes. 2023-10-03 14:25:34 -04:00
Ryan Dick
4580ba0d87 Remove logic to update model cache size estimates dynamically. 2023-10-03 14:25:34 -04:00
Ryan Dick
b9fd2e9e76 Improve get_pretty_snapshot_diff(...) message formatting. 2023-10-03 14:25:34 -04:00
Ryan Dick
2a3c0ab5d2 Move MemorySnapshot to its own file. 2023-10-03 14:25:34 -04:00
Ryan Dick
7d65555a5a Fix type error in torch device comparison. 2023-10-03 14:25:34 -04:00
Ryan Dick
123f2b2dbc Update cache model size estimates based on changes in VRAM when moving models to/from CUDA. 2023-10-03 14:25:34 -04:00
Ryan Dick
1e4e42556e Update model cache device comparison to treat 'cuda' and 'cuda:0' as the same device type. 2023-10-03 14:25:34 -04:00
Ryan Dick
1f6699ac43 Consolidate all model.to(...) calls in the model cache to use a utility function with better logging. 2023-10-03 14:25:34 -04:00
Ryan Dick
ace8665411 Add warning log if moving a model from cuda to cpu causes unexpected change in VRAM usage. 2023-10-03 14:25:34 -04:00
Ryan Dick
7fa5bae8fd Add warning log if moving model from RAM to VRAM causes an unexpected change in VRAM usage. 2023-10-03 14:25:34 -04:00
Ryan Dick
f9faca7c91 Add warning log if model mis-reports its required cache memory before load from disk. 2023-10-03 14:25:34 -04:00
Ryan Dick
594fd3ba6d Add debug logging of changes in RAM and VRAM for all model cache operations. 2023-10-03 14:25:34 -04:00
Ryan Dick
44d68f5ed5 Auto-format model_cache.py. 2023-10-03 14:25:34 -04:00
Ryan
2f5e923008 Removed duplicate import in model_cache.py 2023-09-13 19:33:43 -04:00
Ryan
b7296000e4 made MPS calls conditional on MPS actually being the chosen device with backend available 2023-09-13 19:33:43 -04:00
Ryan
fab055995e Add empty_cache() for MPS hardware. 2023-09-13 19:33:43 -04:00
Martin Kristiansen
e88d7c242f isort wip 3 2023-09-12 13:01:58 -04:00
Martin Kristiansen
537ae2f901 Resolving merge conflicts for flake8 2023-08-18 15:52:04 +10:00
Lincoln Stein
bb1b8ceaa8
Update invokeai/backend/model_management/model_cache.py
Co-authored-by: StAlKeR7779 <stalkek7779@yandex.ru>
2023-08-16 08:48:44 -04:00
Lincoln Stein
f9958de6be added memory used to load models 2023-08-15 21:56:19 -04:00
Lincoln Stein
ec10aca91e report RAM and RAM cache statistics 2023-08-15 21:00:30 -04:00
Lincoln Stein
6ad565d84c folded in changes from 4099 2023-08-04 18:24:47 -04:00
Sergey Borisov
1ac14a1e43 add sdxl lora support 2023-08-04 11:44:56 -04:00
Brandon Rising
f5ac73b091 Merge branch 'main' into feat/onnx 2023-07-31 10:58:40 -04:00
Lincoln Stein
e20c4dc1e8 blackified 2023-07-30 08:17:10 -04:00
Lincoln Stein
844578ab88 fix lora loading crash 2023-07-30 07:57:10 -04:00
Lincoln Stein
99daa97978 more refactoring; fixed place where rel conversion missed 2023-07-29 13:00:07 -04:00
Brandon Rising
da751da3dd Merge branch 'main' into feat/onnx 2023-07-28 09:59:35 -04:00
Brandon Rising
2b7b3dd4ba Run python black 2023-07-28 09:46:44 -04:00
Brandon Rising
bfdc8c80f3 Testing caching onnx sessions 2023-07-27 14:13:29 -04:00
Martin Kristiansen
218b6d0546 Apply black 2023-07-27 10:54:01 -04:00
Sergey Borisov
bda0000acd Cleanup vram after models offloading, tweak to cleanup local variable references on ram offload 2023-07-18 23:21:18 +03:00
Sergey Borisov
bc11296a5e Disable lazy offloading on disabled vram cache, move resulted tensors to cpu(to not stack vram tensors in cache), fix - text encoder not freed(detach) 2023-07-18 16:20:25 +03:00
Lincoln Stein
dab03fb646 rename gpu_mem_reserved to max_vram_cache_size
To be consistent with max_cache_size, the amount of memory to hold in
VRAM for model caching is now controlled by the max_vram_cache_size
configuration parameter.
2023-07-11 15:25:39 -04:00
Lincoln Stein
d32f9f7cb0 reverse logic of gpu_mem_reserved
- gpu_mem_reserved now indicates the amount of VRAM that will be reserved
  for model caching (similar to max_cache_size).
2023-07-11 15:16:40 -04:00
Lincoln Stein
5759a390f9 introduce gpu_mem_reserved configuration parameter 2023-07-09 18:35:04 -04:00
Lincoln Stein
8d7dba937d fix undefined variable 2023-07-09 14:37:45 -04:00
Lincoln Stein
d6cb0e54b3 don't unload models from GPU until the space is needed 2023-07-09 14:26:30 -04:00
Lincoln Stein
0a6dccd607 expose max_cache_size to invokeai-configure interface 2023-07-05 20:59:14 -04:00
Lincoln Stein
6935858ef3 add debugging messages to aid in memory leak tracking 2023-07-02 13:34:53 -04:00
Lincoln Stein
0f02915012 remove hardcoded cuda device in model manager init 2023-07-01 21:15:42 -04:00