InvokeAI/invokeai
psychedelicious 38343917f8 fix(backend): revert non-blocking device transfer
In #6490 we enabled non-blocking torch device transfers throughout the model manager's memory management code. When using this torch feature, torch attempts to wait until the tensor transfer has completed before allowing any access to the tensor. Theoretically, that should make this a safe feature to use.

This provides a small performance improvement but causes race conditions in some situations. Specific platforms/systems are affected, and complicated data dependencies can make this unsafe.

- Intermittent black images on MPS devices - reported on Discord and in #6545, fixed with special handling in #6549.
- Intermittent OOMs and black images on a P4000 GPU on Windows - reported in #6613, fixed in this commit.

On my system, I haven't experienced any issues with generation, but targeted testing of non-blocking ops did expose a race condition when moving tensors from CUDA to CPU.

One workaround is to use torch streams with manual sync points. Our application logic is complicated enough that this would be a lot of work and feels ripe for edge cases and missed spots.

It is much safer to fully revert non-blocking transfers, which is what this change does.
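
For illustration only, here is a minimal sketch of what a blocking transfer looks like at a call site. The helper name `to_device_blocking` is hypothetical and is not taken from the model manager code; it assumes the argument is a `torch.Tensor` (or anything with the same `.to()` signature).

```python
def to_device_blocking(tensor: "torch.Tensor", device: "torch.device | str") -> "torch.Tensor":
    """Move `tensor` to `device` with an ordinary, blocking copy.

    Hypothetical helper for illustration; not InvokeAI's actual code.
    """
    # non_blocking=False is torch's default and is spelled out here only to
    # contrast with the reverted behaviour. With non_blocking=True, a
    # CUDA -> CPU copy can return before the data has arrived on the host,
    # so a reader of the result may see incomplete data unless it
    # synchronizes first.
    return tensor.to(device, non_blocking=False)
```

With torch installed, `to_device_blocking(torch.ones(4), "cpu")` behaves like a plain `tensor.to("cpu")`. The trade-off is the one described above: the small performance gain is given up in exchange for not needing manual sync points.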
2024-07-16 08:59:42 +10:00
app chore: update default workflows 2024-07-15 14:05:04 +10:00
assets feat(api): chore: pydantic & fastapi upgrade 2023-10-17 14:59:25 +11:00
backend fix(backend): revert non-blocking device transfer 2024-07-16 08:59:42 +10:00
configs feat(mm): support sdxl ckpt inpainting models 2024-04-28 12:57:27 +10:00
frontend fix(ui): boards cut off when search open 2024-07-15 14:07:20 +10:00
invocation_api Fix static type errors with SCHEDULER_NAME_VALUES. And, avoid bi-directional cross-directory imports, which contribute to circular import issues. 2024-07-05 07:38:35 -07:00
version chore: bump version to v4.2.6 2024-07-15 14:16:31 +10:00
__init__.py