Commit Graph

1753 Commits

SHA1 Message Date
81991e072b Merge branch 'main' into ryan/spandrel-upscale 2024-07-16 15:14:08 -04:00
cec345cb5c Change attention processor apply logic 2024-07-16 20:03:29 +03:00
608cbe3f5c Separate inputs in denoise context 2024-07-16 19:30:29 +03:00
38343917f8 fix(backend): revert non-blocking device transfer
In #6490 we enabled non-blocking torch device transfers throughout the model manager's memory management code. When using this torch feature, torch attempts to wait until the tensor transfer has completed before allowing any access to the tensor. Theoretically, that should make this a safe feature to use.

This provides a small performance improvement but causes race conditions in some situations. Specific platforms/systems are affected, and complicated data dependencies can make this unsafe.

- Intermittent black images on MPS devices - reported on Discord and #6545, fixed with special handling in #6549.
- Intermittent OOMs and black images on a P4000 GPU on Windows - reported in #6613, fixed in this commit.

On my system, I haven't experienced any issues with generation, but targeted testing of non-blocking ops did expose a race condition when moving tensors from CUDA to CPU.

One workaround is to use torch streams with manual sync points. Our application logic is complicated enough that this would be a lot of work and feels ripe for edge cases and missed spots.

Much safer is to fully revert non-blocking - which is what this change does.
2024-07-16 08:59:42 +10:00
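
For context, a minimal sketch (illustrative only, not InvokeAI's actual model manager code) of the difference between the non-blocking transfers being reverted here and the blocking transfers this commit restores:

```python
import torch

def move_blocking(t: torch.Tensor, device: torch.device) -> torch.Tensor:
    # Restored behavior: .to() without non_blocking waits for the copy to
    # finish, so the returned tensor is always safe to read immediately.
    return t.to(device)

def move_non_blocking(t: torch.Tensor, device: torch.device) -> torch.Tensor:
    # Reverted behavior: the copy is queued asynchronously. Reading the result
    # too early (notably after a CUDA -> CPU copy) can observe stale or
    # partially written data unless the caller adds explicit synchronization,
    # e.g. torch.cuda.synchronize() or stream/event sync points.
    return t.to(device, non_blocking=True)
```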
9f088d1bf5 Multiple small fixes 2024-07-16 00:51:25 +03:00
fd8d1c12d4 Remove 'del' operator overload 2024-07-16 00:43:32 +03:00
d623bd429b Fix conditionings logic 2024-07-16 00:31:56 +03:00
28e79c4c5e chore: ruff
Looks like an upstream change to ruff resulted in this file being a violation.
2024-07-15 14:05:04 +10:00
499e4d4fde Add preview extension to check logic 2024-07-13 00:45:04 +03:00
e961dd1dec Remove remains of priority logic 2024-07-13 00:44:21 +03:00
7e00526999 Remove overrides logic for now 2024-07-13 00:28:56 +03:00
3a9dda9177 Renames 2024-07-12 22:44:00 +03:00
bd8ae5d896 Simplify guidance modes 2024-07-12 22:01:37 +03:00
87e96e1be2 Rename modifiers to callbacks, convert order to int, partially unify injection points 2024-07-12 22:01:05 +03:00
0bc60378d3 Slightly rework conversion of conditioning to UNet kwargs 2024-07-12 20:43:32 +03:00
9cc852cf7f Base code from draft PR 2024-07-12 20:31:26 +03:00
ab775726b7 Add tiling support to the SpandrelImageToImage node. 2024-07-10 14:25:19 -04:00
650902dc29 Fix broken unit test caused by non-existent model path. 2024-07-10 13:59:17 -04:00
7b5d4935b4 Merge branch 'main' into ryan/spandrel-upscale 2024-07-09 13:47:11 -04:00
af63c538ed Demote error log to warning for models treated as having size 0. 2024-07-09 08:35:43 -04:00
0ce6ec634d Do not assign the result of SpandrelImageToImageModel.load_from_file(...) during probe to ensure that the model is immediately gc'd. 2024-07-05 14:05:12 -04:00
35f8781ea2 Fix static type errors with SCHEDULER_NAME_VALUES. Also avoid bi-directional cross-directory imports, which contribute to circular import issues. 2024-07-05 07:38:35 -07:00
36202d6d25 Delete unused duplicate libc_util.py file. The active version is at invokeai/backend/model_manager/libc_util.py. 2024-07-04 10:30:40 -04:00
1d449097cc Apply ruff rule to disallow all relative imports. 2024-07-04 09:35:37 -04:00
9da5925287 Add ruff rule to disallow relative parent imports. 2024-07-04 09:35:37 -04:00
414750a45d Update calc_model_size_by_data(...) to handle all expected model types, and to log an error if an unexpected model type is received. 2024-07-04 09:08:25 -04:00
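
For illustration, a rough sketch of how a size-by-data calculation typically works for torch modules (hypothetical helper name; the real calc_model_size_by_data also covers non-nn.Module model types and logs unexpected ones):

```python
import torch

def calc_module_size_bytes(model: torch.nn.Module) -> int:
    # Illustrative sketch: estimate a model's memory footprint by summing the
    # byte sizes of all parameters and buffers.
    size = 0
    for tensor in list(model.parameters()) + list(model.buffers()):
        size += tensor.numel() * tensor.element_size()
    return size
```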
a405f14ea2 Fix SpandrelImageToImageModel size calculation for the model cache. 2024-07-03 16:38:16 -04:00
114320ee69 (minor) typo 2024-07-03 16:28:21 -04:00
6161aa73af Move pil_to_tensor() and tensor_to_pil() utilities to the SpandrelImageToImage class. 2024-07-03 16:28:21 -04:00
1ab20f43c8 Tidy spandrel model probe logic, and document the reasons behind the current implementation. 2024-07-03 16:28:21 -04:00
29c8ddfb88 WIP - A bunch of boilerplate to support Spandrel Image-to-Image models throughout the model manager and the frontend. 2024-07-03 16:28:21 -04:00
95079dc7d4 Use a ModelIdentifierField to identify the spandrel model in the UpscaleSpandrelInvocation. 2024-07-03 16:28:21 -04:00
2a1514272f Set the dtype correctly for SpandrelImageToImageModels when they are loaded. 2024-07-03 16:28:21 -04:00
59ce9cf41c WIP - Begin to integrate SpandrelImageToImageModel type into the model manager. 2024-07-03 16:28:21 -04:00
e6abea7bc5 (minor) Remove redundant else clause on a for-loop with no break statement. 2024-07-03 16:28:21 -04:00
c335f92345 (minor) simplify startswith(...) syntax. 2024-07-03 16:28:21 -04:00
e4813f800a Update calc_model_size_by_data(...) to handle all expected model types, and to log an error if an unexpected model type is received. 2024-07-02 21:51:45 -04:00
3752509066 Expose the VAE tile_size on the VAE encode and decode invocations. 2024-07-02 09:07:03 -04:00
79640ba14e Add context manager for overriding VAE tiling params. 2024-07-02 09:07:03 -04:00
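
A minimal sketch of what an override context manager for VAE tiling params can look like (hypothetical helper; attribute names assume a diffusers AutoencoderKL, and the actual InvokeAI implementation may patch different fields):

```python
from contextlib import contextmanager

from diffusers import AutoencoderKL

@contextmanager
def patch_vae_tiling_params(vae: AutoencoderKL, tile_sample_min_size: int):
    # Temporarily override the VAE's tile size, restoring the original on exit.
    original = vae.tile_sample_min_size
    vae.tile_sample_min_size = tile_sample_min_size
    try:
        yield
    finally:
        vae.tile_sample_min_size = original
```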
5df2a79549 Update starter models 2024-06-28 17:49:45 +10:00
10b9088312 update controlnet starter models 2024-06-28 17:49:45 +10:00
3e0fb45dd7 Load single-file checkpoints directly without conversion (#6510)
* use model_class.load_singlefile() instead of converting; works, but performance is poor

* adjust the convert api - not right just yet

* working, needs sql migrator update

* rename migration_11 before conflict merge with main

* Update invokeai/backend/model_manager/load/model_loaders/stable_diffusion.py

Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>

* Update invokeai/backend/model_manager/load/model_loaders/stable_diffusion.py

Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>

* implement lightweight version-by-version config migration

* simplified config schema migration code

* associate sdxl config with sdxl VAEs

* remove use of original_config_file in load_single_file()

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
2024-06-27 17:31:28 -04:00
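
For reference, a minimal sketch of direct single-file loading via diffusers' public from_single_file API (the PR's load_singlefile() wrapper is InvokeAI-internal and may differ; the checkpoint path below is hypothetical):

```python
from diffusers import StableDiffusionPipeline

# Load a .safetensors/.ckpt checkpoint in place, without first converting it
# to a diffusers folder layout on disk.
pipe = StableDiffusionPipeline.from_single_file(
    "/models/checkpoints/v1-5-pruned-emaonly.safetensors"
)
```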
14775cc9c4 ruff format 2024-06-27 09:45:13 -04:00
c7562dd6c0 fix(backend): mps should not use non_blocking
We can get black outputs when moving tensors from CPU to MPS. It appears MPS to CPU is fine. See:
- https://github.com/pytorch/pytorch/issues/107455
- https://discuss.pytorch.org/t/should-we-set-non-blocking-to-true/38234/28

Changes:
- Add properties for each device on `TorchDevice` as a convenience.
- Add `get_non_blocking` static method on `TorchDevice`. This utility takes a torch device and returns the flag to be used for non_blocking when moving a tensor to the device provided.
- Update model patching and caching APIs to use this new utility.

Fixes: #6545
2024-06-27 19:15:23 +10:00
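
A minimal sketch of a helper along the lines described above (illustrative only; the real TorchDevice method may differ):

```python
import torch

class TorchDevice:
    @staticmethod
    def get_non_blocking(to_device: torch.device) -> bool:
        # Non-blocking copies onto MPS can yield black/garbage outputs (see the
        # linked PyTorch issues), so only allow non_blocking when the
        # destination is not an MPS device.
        return to_device.type != "mps"

# Usage: tensor.to(device, non_blocking=TorchDevice.get_non_blocking(device))
```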
9a3b8c6fcb Fix handling of init_timestep in StableDiffusionGeneratorPipeline and improve its documentation. 2024-06-26 12:51:51 -04:00
bd74b84cc5 Revert "Remove the redundant init_timestep parameter that was being passed around. It is simply the first element of the timesteps array."
This reverts commit fa40061eca.
2024-06-26 12:51:51 -04:00
dc23bebebf Run ruff 2024-06-26 21:46:59 +10:00
38b6f90c02 Update prevention exception message 2024-06-26 21:46:59 +10:00
cd9dfefe3c Fix inpainting mask shape assertions. 2024-06-25 11:31:52 -07:00
e1af78c702 Make the tile_overlap input to MultiDiffusion *strictly* control the amount of overlap rather than being a lower bound. 2024-06-25 11:31:52 -07:00