Commit Graph

311 Commits

Author SHA1 Message Date
Ryan Dick
f866b49255 Add some ESRGAN and SwinIR upscale models to the starter models list. 2024-07-16 15:55:10 -04:00
Ryan Dick
81991e072b Merge branch 'main' into ryan/spandrel-upscale 2024-07-16 15:14:08 -04:00
psychedelicious
38343917f8 fix(backend): revert non-blocking device transfer
In #6490 we enabled non-blocking torch device transfers throughout the model manager's memory management code. When using this torch feature, torch attempts to wait until the tensor transfer has completed before allowing any access to the tensor. Theoretically, that should make this a safe feature to use.

This provides a small performance improvement but causes race conditions in some situations. Specific platforms/systems are affected, and complicated data dependencies can make this unsafe.

- Intermittent black images on MPS devices - reported on discord and #6545, fixed with special handling in #6549.
- Intermittent OOMs and black images on a P4000 GPU on Windows - reported in #6613, fixed in this commit.

On my system, I haven't experienced any issues with generation, but targeted testing of non-blocking ops did expose a race condition when moving tensors from CUDA to CPU.

One workaround is to use torch streams with manual sync points. Our application logic is complicated enough that this would be a lot of work and feels ripe for edge cases and missed spots.

It is much safer to fully revert non-blocking transfers - which is what this change does.
2024-07-16 08:59:42 +10:00
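
A minimal sketch of the race described above, assuming PyTorch's standard CUDA stream semantics (illustrative only, not the project's code):

```python
import torch

# With non_blocking=True, a CUDA -> CPU copy returns before the transfer
# is guaranteed complete; reading the CPU tensor without a sync point can
# observe stale or partial data.
if torch.cuda.is_available():
    src = torch.randn(1024, 1024, device="cuda")

    dst = src.to("cpu", non_blocking=True)  # unsafe: copy may still be in flight
    torch.cuda.synchronize()                # explicit sync point makes it safe
    checksum = dst.sum()                    # now guaranteed to see the real data

    # The revert is equivalent to always using the blocking default,
    # which synchronizes implicitly:
    dst_safe = src.to("cpu")
```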
Ryan Dick
650902dc29 Fix broken unit test caused by non-existent model path. 2024-07-10 13:59:17 -04:00
Ryan Dick
7b5d4935b4 Merge branch 'main' into ryan/spandrel-upscale 2024-07-09 13:47:11 -04:00
Ryan Dick
af63c538ed Demote error log to warning for models treated as having size 0. 2024-07-09 08:35:43 -04:00
Ryan Dick
0ce6ec634d Do not assign the result of SpandrelImageToImageModel.load_from_file(...) during probe to ensure that the model is immediately gc'd. 2024-07-05 14:05:12 -04:00
Ryan Dick
35f8781ea2 Fix static type errors with SCHEDULER_NAME_VALUES, and avoid bi-directional cross-directory imports, which contribute to circular import issues. 2024-07-05 07:38:35 -07:00
Ryan Dick
36202d6d25 Delete unused duplicate libc_util.py file. The active version is at invokeai/backend/model_manager/libc_util.py. 2024-07-04 10:30:40 -04:00
Ryan Dick
1d449097cc Apply ruff rule to disallow all relative imports. 2024-07-04 09:35:37 -04:00
Ryan Dick
9da5925287 Add ruff rule to disallow relative parent imports. 2024-07-04 09:35:37 -04:00
Ryan Dick
414750a45d Update calc_model_size_by_data(...) to handle all expected model types, and to log an error if an unexpected model type is received. 2024-07-04 09:08:25 -04:00
Ryan Dick
a405f14ea2 Fix SpandrelImageToImageModel size calculation for the model cache. 2024-07-03 16:38:16 -04:00
Ryan Dick
1ab20f43c8 Tidy spandrel model probe logic, and document the reasons behind the current implementation. 2024-07-03 16:28:21 -04:00
Ryan Dick
29c8ddfb88 WIP - A bunch of boilerplate to support Spandrel Image-to-Image models throughout the model manager and the frontend. 2024-07-03 16:28:21 -04:00
Ryan Dick
2a1514272f Set the dtype correctly for SpandrelImageToImageModels when they are loaded. 2024-07-03 16:28:21 -04:00
Ryan Dick
59ce9cf41c WIP - Begin to integrate SpandrelImageToImageModel type into the model manager. 2024-07-03 16:28:21 -04:00
Ryan Dick
e6abea7bc5 (minor) Remove redundant else clause on a for-loop with no break statement. 2024-07-03 16:28:21 -04:00
Ryan Dick
c335f92345 (minor) simplify startswith(...) syntax. 2024-07-03 16:28:21 -04:00
Ryan Dick
e4813f800a Update calc_model_size_by_data(...) to handle all expected model types, and to log an error if an unexpected model type is received. 2024-07-02 21:51:45 -04:00
Kent Keirsey
5df2a79549 Update starter models 2024-06-28 17:49:45 +10:00
Kent Keirsey
10b9088312 update controlnet starter models 2024-06-28 17:49:45 +10:00
Lincoln Stein
3e0fb45dd7
Load single-file checkpoints directly without conversion (#6510)
* use model_class.load_singlefile() instead of converting; works, but performance is poor

* adjust the convert api - not right just yet

* working, needs sql migrator update

* rename migration_11 before conflict merge with main

* Update invokeai/backend/model_manager/load/model_loaders/stable_diffusion.py

Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>

* Update invokeai/backend/model_manager/load/model_loaders/stable_diffusion.py

Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>

* implement lightweight version-by-version config migration

* simplified config schema migration code

* associate sdxl config with sdxl VAEs

* remove use of original_config_file in load_single_file()

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
2024-06-27 17:31:28 -04:00
Ryan Dick
14775cc9c4 ruff format 2024-06-27 09:45:13 -04:00
psychedelicious
c7562dd6c0
fix(backend): mps should not use non_blocking
We can get black outputs when moving tensors from CPU to MPS. It appears MPS to CPU is fine. See:
- https://github.com/pytorch/pytorch/issues/107455
- https://discuss.pytorch.org/t/should-we-set-non-blocking-to-true/38234/28

Changes:
- Add properties for each device on `TorchDevice` as a convenience.
- Add `get_non_blocking` static method on `TorchDevice`. This utility takes a torch device and returns the flag to be used for non_blocking when moving a tensor to the device provided.
- Update model patching and caching APIs to use this new utility.

Fixes: #6545
2024-06-27 19:15:23 +10:00
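
A sketch of the utility described above; the method body is an assumption based on the commit message, with the `mps` check reflecting the linked PyTorch issues:

```python
import torch

class TorchDevice:
    @staticmethod
    def get_non_blocking(to_device: torch.device) -> bool:
        """Return the non_blocking flag to use when moving a tensor to to_device.

        CPU -> MPS transfers can produce black outputs with non_blocking=True,
        so disable it whenever the destination is an MPS device.
        """
        return False if to_device.type == "mps" else True

# Callers in patching/caching code would then look roughly like:
# tensor = tensor.to(device, non_blocking=TorchDevice.get_non_blocking(device))
```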
Lincoln Stein
b03073d888
[MM] Add support for probing and loading SDXL VAE checkpoint files (#6524)
* add support for probing and loading SDXL VAE checkpoint files

* broaden regexp probe for SDXL VAEs

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-06-20 02:57:27 +00:00
Ryan Dick
8e47e005a7 Tidy SilenceWarnings context manager:
- Fix type errors
- Enable SilenceWarnings to be used as both a context manager and a decorator
- Remove duplicate implementation
- Check the initial verbosity on __enter__() rather than __init__()
2024-06-18 15:06:22 -04:00
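
A sketch of the dual context-manager/decorator pattern, assuming the standard-library `ContextDecorator` approach (simplified; the real class also manages library log verbosity):

```python
import warnings
from contextlib import ContextDecorator

class SilenceWarnings(ContextDecorator):
    """Usable as `with SilenceWarnings(): ...` or `@SilenceWarnings()`."""

    def __enter__(self):
        # Capture state on __enter__ (not __init__), so a single instance
        # reused as a decorator sees the verbosity current at call time.
        self._catcher = warnings.catch_warnings()
        self._catcher.__enter__()
        warnings.simplefilter("ignore")
        return self

    def __exit__(self, *exc):
        self._catcher.__exit__(*exc)
        return False
```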
Brandon Rising
41a6bb45f3 Initial functionality 2024-06-18 10:38:29 -04:00
Lincoln Stein
a3cb5da130
Improve RAM<->VRAM memory copy performance in LoRA patching and elsewhere (#6490)
* allow model patcher to optimize away the unpatching step when feasible

* remove lazy_offloading functionality

* allow model patcher to optimize away the unpatching step when feasible

* remove lazy_offloading functionality

* do not save original weights if there is a CPU copy of state dict

* Update invokeai/backend/model_manager/load/load_base.py

Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>

* documentation fixes requested during penultimate review

* add non-blocking=True parameters to several torch.nn.Module.to() calls, for slight performance increases

* fix ruff errors

* prevent crash on non-cuda-enabled systems

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: Kent Keirsey <31807370+hipsterusername@users.noreply.github.com>
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
2024-06-13 17:10:03 +00:00
psychedelicious
fde58ce0a3 Merge remote-tracking branch 'origin/main' into lstein/feat/simple-mm2-api 2024-06-07 14:23:41 +10:00
Lincoln Stein
f81b8bc9f6 add support for generic loading of diffusers directories 2024-06-07 13:54:30 +10:00
Lincoln Stein
2871676f79
LoRA patching optimization (#6439)
* allow model patcher to optimize away the unpatching step when feasible

* remove lazy_offloading functionality

* allow model patcher to optimize away the unpatching step when feasible

* remove lazy_offloading functionality

* do not save original weights if there is a CPU copy of state dict

* Update invokeai/backend/model_manager/load/load_base.py

Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>

* documentation fixes added during penultimate review

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: Kent Keirsey <31807370+hipsterusername@users.noreply.github.com>
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
2024-06-06 13:53:35 +00:00
psychedelicious
e7513f6088 docs(mm): add comment in move_model_to_device 2024-06-03 10:56:04 +10:00
Lincoln Stein
2276f327e5
Merge branch 'main' into lstein/feat/simple-mm2-api 2024-06-02 09:45:31 -04:00
Lincoln Stein
21a60af881
when unlocking models, offload_unlocked_models should prune to vram limit only (#6450)
Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-05-29 03:01:21 +00:00
Lincoln Stein
34e1eb19f9 merge with main and resolve conflicts 2024-05-27 22:20:34 -04:00
Lincoln Stein
532f82cb97
Optimize RAM to VRAM transfer (#6312)
* avoid copying model back from cuda to cpu

* handle models that don't have state dicts

* add assertions that models need a `device()` method

* do not rely on torch.nn.Module having the device() method

* apply all patches after model is on the execution device

* fix model patching in latents too

* log patched tokenizer

* closes #6375

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-05-24 17:06:09 +00:00
Lincoln Stein
f29c406fed refactor model_install to work with refactored download queue 2024-05-13 22:49:15 -04:00
Lincoln Stein
0bf14c2830 add multifile_download() method to download service 2024-05-12 20:14:00 -06:00
Lincoln Stein
b48d4a049d bad implementation of diffusers folder download 2024-05-08 21:21:01 -07:00
Lincoln Stein
7c39929758 support VRAM caching of dict models that lack to() 2024-04-28 13:41:06 -04:00
Lincoln Stein
a26667d3ca make download and convert cache keys safe for filename length 2024-04-28 12:24:36 -04:00
Lincoln Stein
bb04f496e0 Merge branch 'main' into lstein/feat/simple-mm2-api 2024-04-28 11:33:26 -04:00
Lincoln Stein
70903ef057 refactor load_ckpt_from_url() 2024-04-28 11:33:23 -04:00
psychedelicious
241a1fdb57 feat(mm): support sdxl ckpt inpainting models
There are only a couple of SDXL inpainting models, and my tests indicate they are not as good as SD1.5 inpainting, but at least we support them now.

- Add the config file. This matches what is used in A1111. The only difference from the non-inpainting SDXL config is the number of in-channels.
- Update the legacy config maps to use this config file.
2024-04-28 12:57:27 +10:00
Lincoln Stein
d72f272f16 Address change requests in first round of PR reviews.
Pending:

- Move model install calls into model manager and create passthrus in invocation_context.
- Consider splitting load_model_from_url() into a call to get the path and a call to load the path.
2024-04-24 23:53:30 -04:00
blessedcoolant
260e24733f fix: update SDXL IP Adapter starter model to be ViT-H 2024-04-24 00:08:21 -04:00
blessedcoolant
6b394554e2 fix: update ip adapter starter models path 2024-04-24 08:48:25 +05:30
psychedelicious
a461537087 chore: ruff 2024-04-23 07:32:53 -04:00
psychedelicious
0aa5aadfe8 fix(mm): move variant to MainConfigBase
shoulda been here all along
2024-04-23 07:32:53 -04:00
Lincoln Stein
470a39935c fix merge conflicts with main 2024-04-15 09:24:57 -04:00
Lincoln Stein
e93f4d632d
[util] Add generic torch device class (#6174)
* introduce new abstraction layer for GPU devices

* add unit test for device abstraction

* fix ruff

* convert TorchDeviceSelect into a stateless class

* move logic to select context-specific execution device into context API

* add mock hardware environments to pytest

* remove dangling mocker fixture

* fix unit test for running on non-CUDA systems

* remove unimplemented get_execution_device() call

* remove autocast precision

* Multiple changes:

1. Remove TorchDeviceSelect.get_execution_device(), as well as calls to
   context.models.get_execution_device().
2. Rename TorchDeviceSelect to TorchDevice
3. Added back the legacy public API defined in `invocation_api`, including
   choose_precision().
4. Added a config file migration script to accommodate removal of precision=autocast.

* add deprecation warnings to choose_torch_device() and choose_precision()

* fix test crash

* remove app_config argument from choose_torch_device() and choose_torch_dtype()

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-04-15 13:12:49 +00:00
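
A sketch of the resulting stateless class, assuming the method names from the commit message (`choose_torch_device`, `choose_torch_dtype`, no `app_config` argument):

```python
import torch

class TorchDevice:
    """Stateless device abstraction; all methods are class-level."""

    @classmethod
    def choose_torch_device(cls) -> torch.device:
        # Prefer CUDA, then MPS, then fall back to CPU.
        if torch.cuda.is_available():
            return torch.device("cuda")
        if torch.backends.mps.is_available():
            return torch.device("mps")
        return torch.device("cpu")

    @classmethod
    def choose_torch_dtype(cls, device: torch.device) -> torch.dtype:
        # With precision=autocast removed, dtype is a simple device mapping.
        return torch.float16 if device.type in ("cuda", "mps") else torch.float32
```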
Lincoln Stein
3a26c7bb9e fix merge conflicts 2024-04-12 00:58:11 -04:00
Lincoln Stein
df5ebdbc4f add invocation_context.load_ckpt_from_url() method 2024-04-12 00:55:21 -04:00
Lincoln Stein
651c0b39b1 clear cache on all exceptions 2024-04-12 07:19:16 +10:00
Lincoln Stein
46d23cd868 catch RuntimeError during model to() call rather than OutOfMemoryError 2024-04-12 07:19:16 +10:00
Lincoln Stein
579082ac10 [mm] clear the cache entry for a model that got an OOM during loading 2024-04-12 07:19:16 +10:00
psychedelicious
9ab6655491 feat(backend): clean up choose_precision
- Allow user-defined precision on MPS.
- Use more explicit logic to handle all possible cases.
- Add comments.
- Remove the app_config args (they were effectively unused, just get the config using the singleton getter util)
2024-04-07 09:41:05 -04:00
psychedelicious
4068e817d6 fix(mm): typing issues in model cache 2024-04-06 14:35:36 +11:00
psychedelicious
a09d705e4c fix(mm): remove vram check
This check prematurely reports insufficient VRAM on Windows. See #6106 for details.
2024-04-06 14:35:36 +11:00
Lincoln Stein
4571986c63 fix misplaced lock call 2024-04-05 14:32:18 +11:00
Lincoln Stein
812f10730f
adjust free vram calculation for models that will be removed by lazy offloading (#6150)
Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-04-04 22:51:12 -04:00
brandonrising
51ca59c088 Update probe to always use cpu for loading models 2024-04-04 07:34:43 +11:00
psychedelicious
85f53f94f8 feat(mm): include needed vs free in OOM
Gives us a bit more visibility into these errors, which seem to be popping up more frequently with the new MM.
2024-04-04 06:26:15 +11:00
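
A sketch of the improved error reporting, with hypothetical names (`CacheOutOfMemoryError`, `check_vram`) standing in for whatever the cache actually raises:

```python
import torch

class CacheOutOfMemoryError(Exception):
    pass

def check_vram(needed_bytes: int, device: torch.device) -> None:
    """Raise with both needed and free VRAM so OOM reports are actionable."""
    free_bytes, _total_bytes = torch.cuda.mem_get_info(device)
    if needed_bytes > free_bytes:
        gb = 2**30
        raise CacheOutOfMemoryError(
            f"Insufficient VRAM to load model: needed {needed_bytes / gb:.2f} GB, "
            f"free {free_bytes / gb:.2f} GB"
        )
```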
blessedcoolant
5f01de1993 chore: ruff and lint fixes 2024-04-03 20:41:51 +05:30
blessedcoolant
e574815413 chore: clean up merge conflicts 2024-04-03 20:28:00 +05:30
blessedcoolant
fb293dcd84 Merge branch 'checkpoint-ip-adapter' of https://github.com/blessedcoolant/InvokeAI into checkpoint-ip-adapter 2024-04-03 20:23:07 +05:30
blessedcoolant
2dcbb7223b fix: use Path for ip_adapter_ckpt_path instead of str 2024-04-03 20:21:03 +05:30
blessedcoolant
a14ce0edab chore: rename IPAdapterDiffusersConfig to IPAdapterInvokeAIConfig 2024-04-03 12:40:10 +05:30
blessedcoolant
4a0dfc3b2d ui: improve the clip vision model picker layout 2024-04-03 12:40:08 +05:30
blessedcoolant
79f7b61dfe fix: cleanup across various ip adapter files 2024-04-03 12:39:52 +05:30
blessedcoolant
b1c8266e22 feat: add base model recognition for ip adapter safetensor files 2024-04-03 12:39:52 +05:30
blessedcoolant
67afb1763e wip: Initial implementation of safetensor support for IP Adapter 2024-04-03 12:39:52 +05:30
psychedelicious
59b4a23479 feat(mm): use same pattern for vae converter as others
Add `dump_path` arg to the converter function & save the model to disk inside the conversion function. This is the same pattern as in the other conversion functions.
2024-04-01 12:34:49 +11:00
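
A sketch of the pattern being adopted, with a hypothetical simplified converter (real code must first map checkpoint keys to the diffusers layout):

```python
from pathlib import Path
from diffusers import AutoencoderKL

def convert_vae_checkpoint(checkpoint: dict, dump_path: Path) -> AutoencoderKL:
    # Build the diffusers model, persist it inside the conversion function
    # at dump_path, and return the model itself rather than a path.
    vae = AutoencoderKL()  # default config; real code derives it from the ckpt
    vae.load_state_dict(checkpoint, strict=False)
    vae.save_pretrained(dump_path)
    return vae
```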
psychedelicious
13f410478a fix(mm): typing issues in vae loader 2024-04-01 12:34:49 +11:00
psychedelicious
25ff0bf80f fix(mm): return converted vae model instead of path
This was missed in #6072.
2024-04-01 12:34:49 +11:00
Lincoln Stein
3d6d89feb4
[mm] Do not write diffuser model to disk when convert_cache set to zero (#6072)
* pass model config to _load_model

* make conversion work again

* do not write diffusers to disk when convert_cache set to 0

* adding same model to cache twice is a no-op, not an assertion error

* fix issues identified by psychedelicious during pr review

* following conversion, avoid redundant read of cached submodels

* fix error introduced while merging

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-03-29 16:11:08 -04:00
psychedelicious
6d261a5a13 fix(mm): handle relative conversion config paths
I have tested main, controlnet and vae checkpoint conversions.
2024-03-29 10:56:06 -04:00
blessedcoolant
cd52e99bb9 Merge branch 'main' into checkpoint-ip-adapter 2024-03-29 12:39:53 +05:30
blessedcoolant
0d8b535131 chore: rename IPAdapterDiffusersConfig to IPAdapterInvokeAIConfig 2024-03-29 11:50:18 +05:30
psychedelicious
2f6cce48af docs(mm): update ModelSearch 2024-03-28 12:35:41 +11:00
blessedcoolant
1a93f56d06 ui: improve the clip vision model picker layout 2024-03-27 22:11:07 +05:30
blessedcoolant
4ed2bf53ca fix: cleanup across various ip adapter files 2024-03-27 22:08:14 +05:30
blessedcoolant
60bf0caca3 feat: add base model recognition for ip adapter safetensor files 2024-03-27 22:08:14 +05:30
blessedcoolant
b013d0e064 wip: Initial implementation of safetensor support for IP Adapter 2024-03-27 22:08:14 +05:30
psychedelicious
21758e7b49 fix(mm): move depth variant config to sd2
Looks like a copy/paste got mixed up.
2024-03-27 07:48:54 -04:00
psychedelicious
eb33303e79 fix(mm): handle depth and inpainting models when converting to diffusers
"Normal" models have 4 in-channels, while "Depth" models have 5 and "Inpaint" models have 9.

We need to explicitly tell diffusers the channel count when converting models.

Closes  #6058
2024-03-27 07:48:54 -04:00
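
A sketch of the channel-count rule above; the variant keys are assumptions:

```python
# UNet in-channel counts to pass to diffusers when converting checkpoints.
IN_CHANNELS = {"normal": 4, "depth": 5, "inpaint": 9}

def unet_in_channels(variant: str) -> int:
    try:
        return IN_CHANNELS[variant]
    except KeyError:
        raise ValueError(f"Unknown model variant: {variant!r}")
```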
psychedelicious
243de683d7 fix(mm): handle integer state dict keys in probe
It's possible for a model's state dict to have integer keys, though we do not actually support such models.

As part of probing, we call `key.startswith(...)` on the state dict keys. This raises an `AttributeError` for integer keys.

This logic is in `invokeai/backend/model_manager/probe.py:get_model_type_from_checkpoint`

To fix this, we can cast the keys to strings first. The models with integer keys will still fail to be probed, but we'll get an `InvalidModelConfigException` instead of an `AttributeError`.

Closes #6044
2024-03-27 09:30:25 +11:00
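
A sketch of the fix, with a simplified signature and placeholder key/return values:

```python
class InvalidModelConfigException(Exception):
    pass

def get_model_type_from_checkpoint(state_dict: dict) -> str:
    # Cast every key to str first: integer keys would otherwise raise
    # AttributeError on .startswith(). They now simply fail to match,
    # and we fall through to InvalidModelConfigException.
    for key in (str(k) for k in state_dict.keys()):
        if key.startswith("cond_stage_model.transformer"):
            return "main"  # placeholder for the real ModelType value
    raise InvalidModelConfigException("Unable to determine model type")
```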
psychedelicious
2ec03ae95c fix(mm): default settings pydantic error
Add `extra="forbid"` to the default settings models.

Closes #6035.

Pydantic has some quirks related to unions. This affected how the union of default settings was evaluated. See https://github.com/pydantic/pydantic/issues/9095 for a detailed description of the behaviour that this change addresses.
2024-03-25 07:40:52 -04:00
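
A sketch of the behaviour this addresses, with illustrative model and field names (every field optional, so extras decide which union member matches):

```python
from typing import Union
from pydantic import BaseModel, ConfigDict, TypeAdapter

class MainModelDefaultSettings(BaseModel):
    model_config = ConfigDict(extra="forbid")  # reject unknown fields
    cfg_scale: float | None = None

class ControlAdapterDefaultSettings(BaseModel):
    model_config = ConfigDict(extra="forbid")
    preprocessor: str | None = None

adapter = TypeAdapter(Union[MainModelDefaultSettings, ControlAdapterDefaultSettings])

# Without extra="forbid", {"preprocessor": ...} could also validate as
# MainModelDefaultSettings (all fields optional, extras silently ignored),
# so the union could resolve to the wrong member.
settings = adapter.validate_python({"preprocessor": "canny_image_processor"})
assert isinstance(settings, ControlAdapterDefaultSettings)
```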
skunkworxdark
37fd57d4d9 Update probe.py
Minor casing typo: `ModelType.Lora` should be `ModelType.LoRA`
2024-03-22 09:09:56 -07:00
psychedelicious
e7a096dec1 fix(mm): remove proteus model
This model is SDXL and relies on CLIP Skip. We don't support that yet.
2024-03-22 02:22:03 -07:00
psychedelicious
05d6661877 feat(mm): revised list of starter models
- Enriched dependencies so they are no longer plain strings - this allows a model to be reused as a starter model _and_ as a dependency of another model. For example, all the SDXL models have the fp16 VAE as a dependency, but you can also download it on its own.
- Looked at popular models on the major model sites to select the list. No SD2 models. All hosted on HF.
2024-03-22 14:59:33 +11:00
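
A sketch of the enriched structure, with illustrative field names and repo ids:

```python
from pydantic import BaseModel

class StarterModel(BaseModel):
    name: str
    source: str  # e.g. a Hugging Face repo id
    description: str = ""
    # A dependency is a full StarterModel, not a bare source string, so the
    # same entry can also be offered as a standalone download.
    dependencies: list["StarterModel"] = []

sdxl_fp16_vae = StarterModel(name="sdxl-vae-fp16-fix", source="madebyollin/sdxl-vae-fp16-fix")
juggernaut = StarterModel(
    name="Juggernaut XL",
    source="RunDiffusion/Juggernaut-XL-v9",
    dependencies=[sdxl_fp16_vae],  # shared by the other SDXL starters too
)
```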
Lincoln Stein
27622dfd5e allow checkpoint config files to use root-relative paths 2024-03-22 08:57:45 +11:00
psychedelicious
7726d312e1 feat(mm): default hashing algo to blake3_single
For SSDs, `blake3` is about 10x faster than `blake3_single` - 30 files/second vs 3 files/second.

For spinning HDDs, `blake3` is about 100x slower than `blake3_single` - 300 seconds/file vs 3 seconds/file.

For external drives, `blake3` is always worse, but the difference is highly variable. For external spinning drives, it's probably way worse than internal.

The least offensive algorithm is `blake3_single`, and it's still _much_ faster than any other algorithm.
2024-03-22 08:26:36 +11:00
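
A sketch of the two strategies, assuming the `blake3` PyPI package:

```python
from pathlib import Path
from blake3 import blake3

def hash_blake3_single(path: Path) -> str:
    """Single-threaded: consistent performance on any disk type."""
    hasher = blake3()  # default is single-threaded
    with open(path, "rb") as f:
        while chunk := f.read(2**20):
            hasher.update(chunk)
    return hasher.hexdigest()

def hash_blake3(path: Path) -> str:
    """Multithreaded: ~10x faster on SSDs, far slower on spinning disks."""
    hasher = blake3(max_threads=blake3.AUTO)
    hasher.update_mmap(path)  # memory-map so worker threads share the file
    return hasher.hexdigest()
```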
Lincoln Stein
d4d0fea078
[feature] Add probe for SDXL controlnet models (#5382)
* add probe for SDXL controlnet models

* Update invokeai/backend/model_management/model_probe.py

Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>

* Update invokeai/backend/model_manager/probe.py

Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
2024-03-21 14:49:45 +00:00
psychedelicious
6c13fa13ea fix(mm): regression from change to legacy conf dir 2024-03-20 15:05:25 +11:00
psychedelicious
5ceaeb234d feat(mm): add starter models route
The models from INITIAL_MODELS.yaml have been recreated as a structured Python object. This data is served on a new route. The model sources are compared against currently-installed models to determine whether they are already installed.
2024-03-20 15:05:25 +11:00
Lincoln Stein
c87497fd54
record model_variant in t2i and clip_vision configs (#5989)
- Move base of t2i and clip_vision config models to DiffusersBase, which contains
  a field to record the model variant (e.g. "fp16")
- This restores the ability to load fp16 t2i and clip_vision models
- Also add defensive coding to load the vanilla model when the fp16 model
  has been replaced (or, more likely, the user's preferences changed since installation)

Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-03-19 20:14:12 +00:00
Brandon Rising
9d5b96c119 Pull format type setting out of model_type if statement 2024-03-19 01:16:11 -04:00
Brandon Rising
5daefccf77 Simplify logic for determining model type in probe 2024-03-19 01:16:11 -04:00