InvokeAI/ldm/modules
Mihail Dumitrescu · e0951f28cf · 2022-09-17 20:19:21 +03:00
Refactor attention.CrossAttention to remove duplicate code and apply optimizations

- Apply a ~6% speedup by moving `* self.scale` earlier, onto a smaller tensor.
- When there is enough VRAM, don't allocate a useless zeros tensor.
- Switch between cuda/mps/cpu based on `q.device.type` to allow cleaner per-architecture optimizations in the future.
- For cuda and cpu, keep VRAM usage and the faster slicing behavior consistent.
- For cpu, use smaller slices; tested ~20% faster on an i7, from 9.8 down to 7.7 s/it.
- Fix the `=` typo to `self.mem_total >= 8` in `einsum_op_mps_v2`, as per the #582 discussion.

The sketches below illustrate the first three changes.
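First, the `* self.scale` move. A minimal sketch of the idea (function names and shapes are illustrative, not the actual CrossAttention methods): scaling q, a (b, n_q, d) tensor, costs far fewer elementwise multiplies than scaling the (b, n_q, n_k) similarity matrix, and the result is identical because the scalar commutes through the matrix product.

```python
import torch
from torch import einsum

def attention_scale_late(q, k, v, scale):
    # Baseline: compute q @ k^T, then scale the full (b, n_q, n_k) matrix.
    sim = einsum('b i d, b j d -> b i j', q, k) * scale
    return einsum('b i j, b j d -> b i d', sim.softmax(dim=-1), v)

def attention_scale_early(q, k, v, scale):
    # Optimized: scale q first. q has shape (b, n_q, d) and d << n_k in
    # typical cross-attention, so the multiply touches far fewer elements.
    q = q * scale
    sim = einsum('b i d, b j d -> b i j', q, k)
    return einsum('b i j, b j d -> b i d', sim.softmax(dim=-1), v)
```

Both functions return the same tensor up to floating-point rounding, e.g. `torch.allclose(attention_scale_late(q, k, v, 0.125), attention_scale_early(q, k, v, 0.125), atol=1e-6)`.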
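Second, the zeros tensor. A sketch under the assumption that sliced attention previously preallocated its output buffer unconditionally; here the buffer only exists on the sliced path (the helper name `einsum_op_sliced` is hypothetical):

```python
import torch
from torch import einsum

def einsum_op_sliced(q, k, v, slice_size):
    # When one slice covers every query row, return the result directly --
    # no zeros buffer is ever allocated.
    if slice_size >= q.shape[1]:
        attn = einsum('b i d, b j d -> b i j', q, k).softmax(dim=-1)
        return einsum('b i j, b j d -> b i d', attn, v)
    # Otherwise preallocate the output and fill it slice by slice along
    # the query dimension to bound peak memory.
    r = torch.zeros(q.shape[0], q.shape[1], v.shape[2], device=q.device, dtype=q.dtype)
    for i in range(0, q.shape[1], slice_size):
        end = i + slice_size
        attn = einsum('b i d, b j d -> b i j', q[:, i:end], k).softmax(dim=-1)
        r[:, i:end] = einsum('b i j, b j d -> b i d', attn, v)
    return r
```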
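Third, the dispatch on `q.device.type`. A hypothetical dispatcher reusing `einsum_op_sliced` from the previous sketch; the slice-size choices are stand-ins for the per-backend tuning the commit describes (the real code uses methods such as `einsum_op_mps_v2`, named in the commit message):

```python
def einsum_op(q, k, v, mem_total_gb):
    # Route on the query tensor's device so each architecture can be
    # tuned independently later.
    if q.device.type == 'cuda':
        slice_size = q.shape[1]                # stand-in: derive from free VRAM
    elif q.device.type == 'mps':
        # Corrected guard from the commit: self.mem_total >= 8 (was a typo).
        slice_size = q.shape[1] if mem_total_gb >= 8 else max(1, q.shape[1] // 8)
    else:
        # cpu path: smaller slices, measured ~20% faster on an i7 in the commit.
        slice_size = max(1, q.shape[1] // 32)
    return einsum_op_sliced(q, k, v, slice_size)
```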
diffusionmodules/ · Remove pointless del statements in diffusionmodules.model. (#520) · 2022-09-12 17:39:06 -04:00
distributions/ · prettified all the code using "blue" at the urging of @tildebyte · 2022-08-26 03:15:42 -04:00
encoders/ · add support for Apple hardware using MPS acceleration · 2022-08-31 00:33:23 -04:00
image_degradation/ · prettified all the code using "blue" at the urging of @tildebyte · 2022-08-26 03:15:42 -04:00
losses/ · prettified all the code using "blue" at the urging of @tildebyte · 2022-08-26 03:15:42 -04:00
attention.py · Refactor attention.CrossAttention to remove duplicate code and apply optimizations · 2022-09-17 20:19:21 +03:00
ema.py · prettified all the code using "blue" at the urging of @tildebyte · 2022-08-26 03:15:42 -04:00
embedding_manager.py · respect --outdir again; fix issue #628 · 2022-09-16 19:58:45 -04:00
x_transformer.py · prettified all the code using "blue" at the urging of @tildebyte · 2022-08-26 03:15:42 -04:00