Commit Graph

64 Commits

Author SHA1 Message Date
Lincoln Stein
21a60af881
when unlocking models, offload_unlocked_models should prune to vram limit only (#6450)
Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-05-29 03:01:21 +00:00
Lincoln Stein
532f82cb97
Optimize RAM to VRAM transfer (#6312)
* avoid copying model back from cuda to cpu

* handle models that don't have state dicts

* add assertions that models need a `device()` method

* do not rely on torch.nn.Module having the device() method

* apply all patches after model is on the execution device

* fix model patching in latents too

* log patched tokenizer

* closes #6375

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-05-24 17:06:09 +00:00
Lincoln Stein
e93f4d632d
[util] Add generic torch device class (#6174)
* introduce new abstraction layer for GPU devices

* add unit test for device abstraction

* fix ruff

* convert TorchDeviceSelect into a stateless class

* move logic to select context-specific execution device into context API

* add mock hardware environments to pytest

* remove dangling mocker fixture

* fix unit test for running on non-CUDA systems

* remove unimplemented get_execution_device() call

* remove autocast precision

* Multiple changes:

1. Remove TorchDeviceSelect.get_execution_device(), as well as calls to
   context.models.get_execution_device().
2. Rename TorchDeviceSelect to TorchDevice
3. Added back the legacy public API defined in `invocation_api`, including
   choose_precision().
4. Added a config file migration script to accommodate removal of precision=autocast.

* add deprecation warnings to choose_torch_device() and choose_precision()

* fix test crash

* remove app_config argument from choose_torch_device() and choose_torch_dtype()

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-04-15 13:12:49 +00:00
Lincoln Stein
651c0b39b1 clear cache on all exceptions 2024-04-12 07:19:16 +10:00
Lincoln Stein
46d23cd868 catch RunTimeError during model to() call rather than OutOfMemoryError 2024-04-12 07:19:16 +10:00
Lincoln Stein
579082ac10 [mm] clear the cache entry for a model that got an OOM during loading 2024-04-12 07:19:16 +10:00
psychedelicious
9ab6655491 feat(backend): clean up choose_precision
- Allow user-defined precision on MPS.
- Use more explicit logic to handle all possible cases.
- Add comments.
- Remove the app_config args (they were effectively unused, just get the config using the singleton getter util)
2024-04-07 09:41:05 -04:00
psychedelicious
4068e817d6 fix(mm): typing issues in model cache 2024-04-06 14:35:36 +11:00
psychedelicious
a09d705e4c fix(mm): remove vram check
This check prematurely reports insufficient VRAM on Windows. See #6106 for details.
2024-04-06 14:35:36 +11:00
Lincoln Stein
4571986c63 fix misplaced lock call 2024-04-05 14:32:18 +11:00
Lincoln Stein
812f10730f
adjust free vram calculation for models that will be removed by lazy offloading (#6150)
Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-04-04 22:51:12 -04:00
psychedelicious
85f53f94f8 feat(mm): include needed vs free in OOM
Gives us a bit more visibility into these errors, which seem to be popping up more frequently with the new MM.
2024-04-04 06:26:15 +11:00
blessedcoolant
e574815413 chore: clean up merge conflicts 2024-04-03 20:28:00 +05:30
blessedcoolant
fb293dcd84 Merge branch 'checkpoint-ip-adapter' of https://github.com/blessedcoolant/InvokeAI into checkpoint-ip-adapter 2024-04-03 20:23:07 +05:30
blessedcoolant
2dcbb7223b fix: use Path for ip_adapter_ckpt_path instead of str 2024-04-03 20:21:03 +05:30
blessedcoolant
67afb1763e wip: Initial implementation of safetensor support for IP Adapter 2024-04-03 12:39:52 +05:30
psychedelicious
59b4a23479 feat(mm): use same pattern for vae converter as others
Add `dump_path` arg to the converter function & save the model to disk inside the conversion function. This is the same pattern as in the other conversion functions.
2024-04-01 12:34:49 +11:00
psychedelicious
13f410478a fix(mm): typing issues in vae loader 2024-04-01 12:34:49 +11:00
psychedelicious
25ff0bf80f fix(mm): return converted vae model instead of path
This was missed in #6072.
2024-04-01 12:34:49 +11:00
Lincoln Stein
3d6d89feb4
[mm] Do not write diffuser model to disk when convert_cache set to zero (#6072)
* pass model config to _load_model

* make conversion work again

* do not write diffusers to disk when convert_cache set to 0

* adding same model to cache twice is a no-op, not an assertion error

* fix issues identified by psychedelicious during pr review

* following conversion, avoid redundant read of cached submodels

* fix error introduced while merging

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-03-29 16:11:08 -04:00
psychedelicious
6d261a5a13 fix(mm): handle relative conversion config paths
I have tested main, controlnet and vae checkpoint conversions.
2024-03-29 10:56:06 -04:00
blessedcoolant
b013d0e064 wip: Initial implementation of safetensor support for IP Adapter 2024-03-27 22:08:14 +05:30
psychedelicious
eb33303e79 fix(mm): handle depth and inpainting models when converting to diffusers
"Normal" models have 4 in-channels, while "Depth" models have 5 and "Inpaint" models have 9.

We need to explicitly tell diffusers the channel count when converting models.

Closes  #6058
2024-03-27 07:48:54 -04:00
Lincoln Stein
27622dfd5e allow checkpoint config files to use root-relative paths 2024-03-22 08:57:45 +11:00
psychedelicious
6c13fa13ea fix(mm): regression from change to legacy conf dir change 2024-03-20 15:05:25 +11:00
Lincoln Stein
c87497fd54
record model_variant in t2i and clip_vision configs (#5989)
- Move base of t2i and clip_vision config models to DiffusersBase, which contains
  a field to record the model variant (e.g. "fp16")
- This restore the ability to load fp16 t2i and clip_vision models
- Also add defensive coding to load the vanilla model when the fp16 model
  has been replaced (or more likely, user's preferences changed since installation)

Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-03-19 20:14:12 +00:00
Lincoln Stein
f8df293d2c Revert "fix(mm): provide ckpt config as stream to diffusers"
This reverts commit 9d045964d6.
2024-03-19 14:24:54 +11:00
psychedelicious
9d045964d6 fix(mm): provide ckpt config as stream to diffusers
Fixes converting ckpt main models
2024-03-19 09:24:28 +11:00
Lincoln Stein
71a1740740 Remove core safetensors->diffusers conversion models
- No longer install core conversion models. Use the HuggingFace cache to load
  them if and when needed.

- Call directly into the diffusers library to perform conversions with only shallow
   wrappers around them to massage arguments, etc.

- At root configuration time, do not create all the possible model subdirectories,
  but let them be created and populated at model install time.

- Remove checks for missing core conversion files, since they are no
  longer installed.
2024-03-17 19:13:18 -04:00
Ryan Dick
9ee2e7ff25 Do not override log_memory_usage when debug logs are enabled. The speed cost of log_memory_usage=True is large. It is common to want debug log without enabling log_memory_usage. 2024-03-12 09:48:50 +11:00
Brandon Rising
e52274ecac Experiment with using absolute paths within model management 2024-03-08 15:36:14 -05:00
psychedelicious
132790eebe tidy(nodes): use canonical capitalizations 2024-03-07 10:56:59 +11:00
psychedelicious
7c9128b253 tidy(mm): use canonical capitalization for all model-related enums, classes
For example, "Lora" -> "LoRA", "Vae" -> "VAE".
2024-03-05 23:50:19 +11:00
psychedelicious
bd4fd9693d tidy(mm): rename ckpt "last_modified" -> "converted_at"
Clarify what this timestamp means
2024-03-05 23:50:19 +11:00
psychedelicious
9b40c28144 tidy(mm): rename ckpy "config" -> "config_path" 2024-03-05 23:50:19 +11:00
psychedelicious
16a5d718bf fix(mm): add config field to ckpt vaes 2024-03-05 23:50:19 +11:00
psychedelicious
76cbc745e1 refactor(mm): add CheckpointConfigBase for all ckpt models 2024-03-05 23:50:19 +11:00
psychedelicious
e426096d32 fix(mm): misc typing fixes for model loaders 2024-03-05 23:50:19 +11:00
psychedelicious
c561cd751f fix(mm): use correct import path for ConfigMixin, ModelMixin 2024-03-05 23:50:19 +11:00
psychedelicious
5b74117836 fix(mm): use generic for model loader registry
This preserves the typing for classes using the decorator
2024-03-05 23:50:19 +11:00
psychedelicious
dd31bc4586 refactor(mm): remove vae field on _MainConfig
We will handle default VAE selection in the UI.
2024-03-05 23:50:19 +11:00
psychedelicious
dd9daf8efb chore: ruff 2024-03-01 10:42:33 +11:00
psychedelicious
c80c0f0fb9 fix(mm): fix ModelCacheBase method name 2024-03-01 10:42:33 +11:00
psychedelicious
37d66488c5 chore: ruff 2024-03-01 10:42:33 +11:00
Lincoln Stein
371e3cc260 recover gracefuly from GPU out of memory errors (next version) 2024-03-01 10:42:33 +11:00
Lincoln Stein
d22738723d clear out VRAM when an OOM occurs 2024-03-01 10:42:33 +11:00
psychedelicious
5a3195f757 final tidying before marking PR as ready for review
- Replace AnyModelLoader with ModelLoaderRegistry
- Fix type check errors in multiple files
- Remove apparently unneeded `get_model_config_enum()` method from model manager
- Remove last vestiges of old model manager
- Updated tests and documentation

resolve conflict with seamless.py
2024-03-01 10:42:33 +11:00
Lincoln Stein
5d612ec095 Tidy names and locations of modules
- Rename old "model_management" directory to "model_management_OLD" in order to catch
  dangling references to original model manager.
- Caught and fixed most dangling references (still checking)
- Rename lora, textual_inversion and model_patcher modules
- Introduce a RawModel base class to simplfy the Union returned by the
  model loaders.
- Tidy up the model manager 2-related tests. Add useful fixtures, and
  a finalizer to the queue and installer fixtures that will stop the
  services and release threads.
2024-03-01 10:42:33 +11:00
Lincoln Stein
996eb96b4e Fix issues identified during PR review by RyanjDick and brandonrising
- ModelMetadataStoreService is now injected into ModelRecordStoreService
  (these two services are really joined at the hip, and should someday be merged)
- ModelRecordStoreService is now injected into ModelManagerService
- Reduced timeout value for the various installer and download wait*() methods
- Introduced a Mock modelmanager for testing
- Removed bare print() statement with _logger in the install helper backend.
- Removed unused code from model loader init file
- Made `locker` a private variable in the `LoadedModel` object.
- Fixed up model merge frontend (will be deprecated anyway!)
2024-03-01 10:42:33 +11:00
Brandon Rising
88d6de4101 Raise InvalidModelConfigException when unable to detect load class in ModelLoader 2024-03-01 10:42:33 +11:00