Commit Graph

1558 Commits

Author SHA1 Message Date
Lincoln Stein
e26360f85b merged multi-gpu support into new session_processor architecture 2024-06-02 14:10:08 -04:00
Lincoln Stein
21a60af881
when unlocking models, offload_unlocked_models should prune to vram limit only (#6450)
Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-05-29 03:01:21 +00:00
Ryan Dick
829b9ad66b Add a callout about the hackiness of dropping tokens in the TextualInversionManager. 2024-05-28 05:11:54 -07:00
Ryan Dick
3aa1c8d3a8 Update TextualInversionManager for compatibility with the latest transformers release. See https://github.com/invoke-ai/InvokeAI/issues/6445. 2024-05-28 05:11:54 -07:00
Ryan Dick
994c61b67a Add docs to TextualInversionManager and improve types. No changes to functionality. 2024-05-28 05:11:54 -07:00
Lincoln Stein
532f82cb97
Optimize RAM to VRAM transfer (#6312)
* avoid copying model back from cuda to cpu

* handle models that don't have state dicts

* add assertions that models need a `device()` method

* do not rely on torch.nn.Module having the device() method

* apply all patches after model is on the execution device

* fix model patching in latents too

* log patched tokenizer

* closes #6375

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-05-24 17:06:09 +00:00
psychedelicious
93da75209c feat(nodes): use new blur_if_nsfw method 2024-05-14 07:23:38 +10:00
psychedelicious
9c819f0fd8 fix(nodes): fix nsfw checker model download 2024-05-14 07:23:38 +10:00
blessedcoolant
da61396b1c cleanup: seamless unused older code cleanup 2024-05-13 08:11:08 +10:00
blessedcoolant
6c9fb617dc fix: fix seamless 2024-05-13 08:11:08 +10:00
Lincoln Stein
e57809e1c6
Merge branch 'main' into lstein/feat/multi-gpu 2024-05-03 00:05:04 -04:00
Lincoln Stein
1c0067f931
Merge branch 'main' into lstein/feat/multi-gpu 2024-04-30 18:14:03 -04:00
blessedcoolant
39ab4dd83e Merge branch 'main' into pr/6086 2024-05-01 00:37:06 +05:30
psychedelicious
2d7b8c2a1b fix(backend): do not round image dims to 64 in controlnet processor resize
Rounding the dims results in control images that are subtly different than the input. We round to the nearest 8px later, there's no need to round now.
2024-04-30 08:10:59 -04:00
psychedelicious
241a1fdb57 feat(mm): support sdxl ckpt inpainting models
There are only a couple SDXL inpainting models, and my tests indicate they are not as good as SD1.5 inpainting, but at least we support them now.

- Add the config file. This matches what is used in A1111. The only difference from the non-inpainting SDXL config is the number of in-channels.
- Update the legacy config maps to use this config file.
2024-04-28 12:57:27 +10:00
psychedelicious
6b0bf59682 feat(backend): update nms util to make blur/thresholding optional 2024-04-25 13:20:09 +10:00
blessedcoolant
260e24733f fix: update SDXL IP Adpater starter model to be ViT-H 2024-04-24 00:08:21 -04:00
blessedcoolant
6b394554e2 fix: update ip adapter starter models path 2024-04-24 08:48:25 +05:30
psychedelicious
a461537087 chore: ruff 2024-04-23 07:32:53 -04:00
psychedelicious
0aa5aadfe8 fix(mm): move variant to MainConfigBase
shoulda been here all along
2024-04-23 07:32:53 -04:00
Lincoln Stein
2b9f06dc4c
Re-enable app shutdown actions (#6244)
* closes #6242

* only override sigINT during slow model scanning

* fix ruff formatting

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-04-19 06:45:42 -04:00
Lincoln Stein
c3d1252892 revert to old system for doing RAM <-> VRAM transfers; new way leaks memory 2024-04-17 09:51:57 -04:00
Lincoln Stein
84f5cbdd97 make choose_torch_dtype() usable outside an invocation context 2024-04-16 19:19:19 -04:00
Lincoln Stein
edac01d4fb reverse stupid hack 2024-04-16 18:13:59 -04:00
Lincoln Stein
d04c880cce fix ValueError on model manager install 2024-04-16 17:57:40 -04:00
Lincoln Stein
89f8326c0b Merge branch 'lstein/feat/multi-gpu' of github.com:invoke-ai/InvokeAI into lstein/feat/multi-gpu 2024-04-16 16:27:08 -04:00
Lincoln Stein
99558de178 device selection calls go through TorchDevice 2024-04-16 16:26:58 -04:00
Lincoln Stein
77130f108d
Merge branch 'main' into lstein/feat/multi-gpu 2024-04-16 16:14:27 -04:00
Lincoln Stein
371f5bc782 simplify logic for retrieving execution devices 2024-04-16 15:52:03 -04:00
Lincoln Stein
fb9b7fb63a make object_serializer._new_name() thread-safe; add max_threads config 2024-04-16 15:23:49 -04:00
blessedcoolant
6bab040d24 Merge branch 'main' into ip-adapter-style-comp 2024-04-16 21:14:06 +05:30
blessedcoolant
f46bbaf8c4 fix: make ip-adapter weights not be optional 2024-04-16 21:12:45 +05:30
Lincoln Stein
a84f3058e2 revert object_serializer_forward_cache.py 2024-04-15 22:28:48 -04:00
Lincoln Stein
f7436f3bae fixup config_default; patch TorchDevice to work dynamically 2024-04-15 22:15:50 -04:00
Lincoln Stein
7dd93cb810 fix merge issues; likely nonfunctional 2024-04-15 21:16:21 -04:00
blessedcoolant
d27907cc6d fix: entire reshaping block needs to be skipped 2024-04-16 04:29:53 +05:30
blessedcoolant
7ee3fef2db cleanup: better var names for the ip adapter weight collection block 2024-04-16 04:23:50 +05:30
blessedcoolant
a148c4322c fix: IP Adapter weights being incorrectly applied
They were being overwritten rather than being appended
2024-04-16 04:10:41 +05:30
blessedcoolant
5f6c6abf9c chore: change IPAdapterAttentionWeights to a dataclass 2024-04-15 23:38:55 +05:30
Lincoln Stein
e93f4d632d
[util] Add generic torch device class (#6174)
* introduce new abstraction layer for GPU devices

* add unit test for device abstraction

* fix ruff

* convert TorchDeviceSelect into a stateless class

* move logic to select context-specific execution device into context API

* add mock hardware environments to pytest

* remove dangling mocker fixture

* fix unit test for running on non-CUDA systems

* remove unimplemented get_execution_device() call

* remove autocast precision

* Multiple changes:

1. Remove TorchDeviceSelect.get_execution_device(), as well as calls to
   context.models.get_execution_device().
2. Rename TorchDeviceSelect to TorchDevice
3. Added back the legacy public API defined in `invocation_api`, including
   choose_precision().
4. Added a config file migration script to accommodate removal of precision=autocast.

* add deprecation warnings to choose_torch_device() and choose_precision()

* fix test crash

* remove app_config argument from choose_torch_device() and choose_torch_dtype()

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
2024-04-15 13:12:49 +00:00
blessedcoolant
8426f1e7b2 fix(experimental): Possible fix for conflict with regional embed length mismatch
Pushing this so people can test it out and see if this needs to be handled in a different way.
2024-04-14 12:19:19 +05:30
blessedcoolant
9cb0f63c44 refactor: fix a bunch of type issues in custom_attention 2024-04-13 14:17:25 +05:30
blessedcoolant
d4393e4170 chore: linter fixes 2024-04-13 12:14:45 +05:30
blessedcoolant
6ea183f0d4 wip: Initial Implementation IP Adapter Style & Comp Modes 2024-04-13 11:09:45 +05:30
Lincoln Stein
651c0b39b1 clear cache on all exceptions 2024-04-12 07:19:16 +10:00
Lincoln Stein
46d23cd868 catch RunTimeError during model to() call rather than OutOfMemoryError 2024-04-12 07:19:16 +10:00
Lincoln Stein
579082ac10 [mm] clear the cache entry for a model that got an OOM during loading 2024-04-12 07:19:16 +10:00
psychedelicious
7bc77ddb40 fix(nodes): doubly-noised latents
When using refiner with a mask (i.e. inpainting), we don't have noise provided as an input to the node.

This situation uniquely hits a code path that wasn't reviewed when gradient denoising was implemented.

That code path does two things wrong:
- It lerp'd the input latents. This was fixed in 5a1f4cb1ce.
- It added noise to the latents an extra time. This is fixed in this change.

We don't need to add noise in `latents_from_embeddings` because we do it just a lines later in `AddsMaskGuidance`.

- Remove the extraneous call to `add_noise`
- Make `seed` a required arg. We never call the function without seed anyways. If we refactor this in the future, it will be clearer that we need to look at how seed is handled.
- Move the call to create the noise to a deeper conditional, just before we call `AddsMaskGuidance`. The created noise tensor is now only used in that function, no need to create it every time.

Note: Whether or not having both noise and latents as inputs on the node is correct is a separate conversation. This change just fixes the issue with the current setup.
2024-04-11 07:21:50 -04:00
Ryan Dick
f9af32a6d1 Fix the padding behavior when max-pooling regional IP-Adapter masks to mirror the downscaling behavior of SD and SDXL. Prior to this change, denoising with input latent dimensions that were not evenly divisible by 8 would raise an exception. 2024-04-09 16:50:43 -04:00
Ryan Dick
fba40eb1bd Fix the padding behavior when max-pooling regional prompt masks to mirror the downscaling behavior of SD and SDXL. Prior to this change, denoising with input latent dimensions that were not evenly divisible by 8 would raise an exception. 2024-04-09 16:50:43 -04:00