Commit Graph

48 Commits

Author SHA1 Message Date
psychedelicious
38343917f8 fix(backend): revert non-blocking device transfer
In #6490 we enabled non-blocking torch device transfers throughout the model manager's memory management code. When using this torch feature, torch attempts to wait until the tensor transfer has completed before allowing any access to the tensor. Theoretically, that should make this a safe feature to use.

This provides a small performance improvement but causes race conditions in some situations. Specific platforms/systems are affected, and complicated data dependencies can make this unsafe.

- Intermittent black images on MPS devices - reported on discord and #6545, fixed with special handling in #6549.
- Intermittent OOMs and black images on a P4000 GPU on Windows - reported in #6613, fixed in this commit.

On my system, I haven't experience any issues with generation, but targeted testing of non-blocking ops did expose a race condition when moving tensors from CUDA to CPU.

One workaround is to use torch streams with manual sync points. Our application logic is complicated enough that this would be a lot of work and feels ripe for edge cases and missed spots.

Much safer is to fully revert non-locking - which is what this change does.
2024-07-16 08:59:42 +10:00
Ryan Dick
1d449097cc Apply ruff rule to disallow all relative imports. 2024-07-04 09:35:37 -04:00
Ryan Dick
9da5925287 Add ruff rule to disallow relative parent imports. 2024-07-04 09:35:37 -04:00
Ryan Dick
414750a45d Update calc_model_size_by_data(...) to handle all expected model types, and to log an error if an unexpected model type is received. 2024-07-04 09:08:25 -04:00
Lincoln Stein
a3cb5da130
Improve RAM<->VRAM memory copy performance in LoRA patching and elsewhere (#6490)
* allow model patcher to optimize away the unpatching step when feasible

* remove lazy_offloading functionality

* allow model patcher to optimize away the unpatching step when feasible

* remove lazy_offloading functionality

* do not save original weights if there is a CPU copy of state dict

* Update invokeai/backend/model_manager/load/load_base.py

Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>

* documentation fixes requested during penultimate review

* add non-blocking=True parameters to several torch.nn.Module.to() calls, for slight performance increases

* fix ruff errors

* prevent crash on non-cuda-enabled systems

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: Kent Keirsey <31807370+hipsterusername@users.noreply.github.com>
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
2024-06-13 17:10:03 +00:00
blessedcoolant
be574cb764 fix: incorrect suffix check in ip adapter checkpoint file 2024-04-03 22:38:28 +05:30
blessedcoolant
e574815413 chore: clean up merge conflicts 2024-04-03 20:28:00 +05:30
blessedcoolant
fb293dcd84 Merge branch 'checkpoint-ip-adapter' of https://github.com/blessedcoolant/InvokeAI into checkpoint-ip-adapter 2024-04-03 20:23:07 +05:30
blessedcoolant
414851f2f0 fix: raise and present the runtime error from the exception 2024-04-03 20:21:50 +05:30
blessedcoolant
2dcbb7223b fix: use Path for ip_adapter_ckpt_path instead of str 2024-04-03 20:21:03 +05:30
blessedcoolant
14a9f74b17 cleanup: use load_file of safetensors directly for loading ip adapters 2024-04-03 12:40:13 +05:30
blessedcoolant
1372ef15b3 fix: Fail when unexpected keys are found in IP Adapter models 2024-04-03 12:40:11 +05:30
blessedcoolant
be1212de9a fix: Raise a better error when incorrect CLIP Vision model is used 2024-04-03 12:40:10 +05:30
blessedcoolant
936b99bd3c chore: improve types in ip_adapter backend file 2024-04-03 12:40:02 +05:30
blessedcoolant
67afb1763e wip: Initial implementation of safetensor support for IP Adapter 2024-04-03 12:39:52 +05:30
blessedcoolant
23390f1516 cleanup: use load_file of safetensors directly for loading ip adapters 2024-04-01 06:37:38 +05:30
blessedcoolant
6e4c2d3685 fix: Fail when unexpected keys are found in IP Adapter models 2024-03-29 12:34:56 +05:30
blessedcoolant
cd078b1865 fix: Raise a better error when incorrect CLIP Vision model is used 2024-03-29 11:58:10 +05:30
blessedcoolant
688a0f30bb chore: improve types in ip_adapter backend file 2024-03-27 22:08:23 +05:30
blessedcoolant
b013d0e064 wip: Initial implementation of safetensor support for IP Adapter 2024-03-27 22:08:14 +05:30
psychedelicious
5a3195f757 final tidying before marking PR as ready for review
- Replace AnyModelLoader with ModelLoaderRegistry
- Fix type check errors in multiple files
- Remove apparently unneeded `get_model_config_enum()` method from model manager
- Remove last vestiges of old model manager
- Updated tests and documentation

resolve conflict with seamless.py
2024-03-01 10:42:33 +11:00
Lincoln Stein
5d612ec095 Tidy names and locations of modules
- Rename old "model_management" directory to "model_management_OLD" in order to catch
  dangling references to original model manager.
- Caught and fixed most dangling references (still checking)
- Rename lora, textual_inversion and model_patcher modules
- Introduce a RawModel base class to simplfy the Union returned by the
  model loaders.
- Tidy up the model manager 2-related tests. Add useful fixtures, and
  a finalizer to the queue and installer fixtures that will stop the
  services and release threads.
2024-03-01 10:42:33 +11:00
Lincoln Stein
a23dedd2ee make model manager v2 ready for PR review
- Replace legacy model manager service with the v2 manager.

- Update invocations to use new load interface.

- Fixed many but not all type checking errors in the invocations. Most
  were unrelated to model manager

- Updated routes. All the new routes live under the route tag
  `model_manager_v2`. To avoid confusion with the old routes,
  they have the URL prefix `/api/v2/models`. The old routes
  have been de-registered.

- Added a pytest for the loader.

- Updated documentation in contributing/MODEL_MANAGER.md
2024-03-01 10:42:33 +11:00
Ryan Dick
693c6cf5e4 Add support for IPAdapterFull models. The changes are based on this upstream PR: https://github.com/tencent-ailab/IP-Adapter/pull/139 . 2023-11-29 15:07:21 -08:00
Ryan Dick
26b91a538a Fixes to get IP-Adapter tests working with new multi-IP-Adapter support. 2023-10-06 20:43:43 -04:00
Ryan Dick
7ca456d674 Update IP-Adapter model to enable running multiple IP-Adapters at once. (Not tested yet.) 2023-10-06 20:43:43 -04:00
Ryan Dick
fbe6452c45 Add support for IPAdapterPlusXL based on 6219530507. 2023-10-04 22:35:17 -04:00
Ryan Dick
399ebe443e Fix IP-Adapter calculation of memory footprint. 2023-09-25 18:28:10 -04:00
Ryan Dick
b05b8ef677 Switch to using torch 2.0 attention for IP-Adapter (more memory-efficient). 2023-09-18 16:30:53 -04:00
user1
ced297ed21 Initial implementation of IP-Adapter "begin_step_percent" and "end_step_percent" for controlling on which steps IP-Adapter is applied in the denoising loop. 2023-09-16 08:24:12 -07:00
Ryan Dick
c2f074dc2f Fix python static checks. 2023-09-14 16:48:47 -04:00
Ryan Dick
94c186bb4c Fix bug in IPAdapter.to(...). 2023-09-14 15:45:25 -04:00
Ryan Dick
a22c8cb3a1 Improve robustness of check for IPAdapter vs IPAdapterPlus. 2023-09-14 15:25:41 -04:00
Ryan Dick
781e8521d5 Eliminate the need for IPAdapter.initialize(). 2023-09-14 15:02:59 -04:00
Ryan Dick
d114d0ba95 Remove need for the image_encoder param in IPAdapter.initialize(). 2023-09-14 14:14:35 -04:00
Ryan Dick
1c8991a3df Use CLIPVisionModel under model management for IP-Adapter. 2023-09-13 19:10:02 -04:00
Ryan Dick
3ee9a21647 Initial (barely) working version of IP-Adapter model management. 2023-09-13 08:27:24 -04:00
Ryan Dick
6ca6cf713c Tidy IPAdapter. Add types, improve field/method naming. 2023-09-08 16:00:58 -04:00
Ryan Dick
3f7d5b4e0f Remove redundant IPAdapterXL class. 2023-09-08 15:46:10 -04:00
Ryan Dick
91596d9527 Re-factor IPAdapter to patch UNet in a context manager. 2023-09-08 15:39:22 -04:00
Ryan Dick
d669f0855d Comment unused IPAdapter generate(...) methods. 2023-09-08 13:12:42 -04:00
Ryan Dick
b2d5b53b5f Pass IP-Adapter conditioning via cross_attention_kwargs instead of concatenating to the text embedding. This avoids interference with other features that manipulate the text embedding (e.g. long prompts). 2023-09-08 11:47:36 -04:00
Ryan Dick
c2d43f007b Specify the image_embedding_len in the IPAttnProcessor rather than the text embedding length. This enables the IPAttnProcessor to handle text embeddings of varying lengths. 2023-09-07 18:20:21 -04:00
Ryan Dick
7703bf2ca1 Delete IP-Adapter copies of AttnProcessor and AttnProcessor2_0, which were unmodified from diffusers. 2023-09-07 15:00:13 -04:00
blessedcoolant
65a76a086b cleanup: Some basic cleanup 2023-09-05 11:54:28 +12:00
blessedcoolant
07381e5a26 cleanup: merge conflicts 2023-09-05 11:37:12 +12:00
user1
8c1390166f Modifying code from https://github.com/tencent-ailab/IP-Adapter. Also adding license notice at top. 2023-08-30 17:28:30 -07:00
user1
1ad98ce999 Core ip_adapter files from https://github.com/tencent-ailab/IP-Adapter
Copied into InvokeAI since IP-Adapter repo is not a package. Is there a better way to do this for non-packaged Python code while still keeping InvokeAI install easy?
2023-08-30 17:28:30 -07:00