Commit Graph

17 Commits

Author SHA1 Message Date
Damian at mba
056cb0d8a8 sliced cross-attention wrangler works 2022-10-19 21:08:03 +02:00
Damian at mba
37a204324b go back to using InvokeAI attention 2022-10-19 21:08:03 +02:00
Damian at mba
1fc1f8bf05 cross-attention working with placeholder {} syntax 2022-10-19 21:06:42 +02:00
Damian at mba
8ff507b03b runs but doesn't work properly - see below for test prompt
test prompt:
"a cat sitting on a car {a dog sitting on a car}" -W 384 -H 256 -s 10 -S 12346 -A k_euler
note that substition of dog for cat is currently hard-coded (ksampler.py
	line 43-44)
2022-10-19 21:06:42 +02:00
Mihail Dumitrescu
e0951f28cf Refactor attention.CrossAttention to remove duplicate code and apply optimizations
Apply ~6% speedup by moving * self.scale to earlier on a smaller tensor.
When we have enough VRAM don't make a useless zeros tensor.
Switch between cuda/mps/cpu based on q.device.type to allow cleaner per architecture future optimizations.
For cuda and cpu keep VRAM usage and faster slicing consistent.
For cpu use smaller slices. Tested ~20% faster on i7, 9.8 to 7.7 s/it.
Fix = typo to self.mem_total >= 8 in einsum_op_mps_v2 as per #582 discussion.
2022-09-17 20:19:21 +03:00
Mihai
dd3fff1d3e
~7% speedup by switch to += in ldm.modules.attention. (#569)
Tested on 8GB eGPU nvidia setup so YMMV.
Re-land with .clone() fix, context #508
2022-09-14 18:10:33 -04:00
Any-Winter-4079
d0a71dc361
Update attention.py for 16-32GB M1 performance (#540)
Code cleanup and attention.py einsum_ops update for M1 16-32GB performance.
Expected: On par with fastest ever from 8 to 128GB for 512x512. Allows large images.
2022-09-13 10:53:45 -04:00
Lincoln Stein
9fa1f31bf2 fix opencv and realesrgan dependencies in mac install 2022-09-12 07:07:05 -04:00
Any-Winter-4079
9cdf3aca7d Update attention.py
Performance improvements to generate larger images in M1 #431

Update attention.py

Added dtype=r1.dtype to softmax
2022-09-11 22:36:58 -04:00
Lincoln Stein
7708f4fb98 slight efficiency gain by using += in attention.py 2022-09-11 16:03:54 -04:00
Lincoln Stein
70aa674e9e merge PR #495 - keep using float16 in ldm.modules.attention 2022-09-11 10:34:06 -04:00
Lincoln Stein
10db192cc4 changes to dogettx optimizations to run on m1
* Author @any-winter-4079
* Author @dogettx
Thanks to many individuals who contributed time and hardware to
benchmarking and debugging these changes.
2022-09-09 09:51:41 -04:00
Lincoln Stein
29ab3c2028
disable neonpixel optimizations on M1 hardware (#414)
* disable neonpixel optimizations on M1 hardware

* fix typo that was causing random noise images on m1
2022-09-07 13:28:11 -04:00
Lincoln Stein
720e5cd651
Refactoring simplet2i (#387)
* start refactoring -not yet functional

* first phase of refactor done - not sure weighted prompts working

* Second phase of refactoring. Everything mostly working.
* The refactoring has moved all the hard-core inference work into
ldm.dream.generator.*, where there are submodules for txt2img and
img2img. inpaint will go in there as well.
* Some additional refactoring will be done soon, but relatively
minor work.

* fix -save_orig flag to actually work

* add @neonsecret attention.py memory optimization

* remove unneeded imports

* move token logging into conditioning.py

* add placeholder version of inpaint; porting in progress

* fix crash in img2img

* inpainting working; not tested on variations

* fix crashes in img2img

* ported attention.py memory optimization #117 from basujindal branch

* added @torch_no_grad() decorators to img2img, txt2img, inpaint closures

* Final commit prior to PR against development
* fixup crash when generating intermediate images in web UI
* rename ldm.simplet2i to ldm.generate
* add backward-compatibility simplet2i shell with deprecation warning

* add back in mps exception, addresses @vargol comment in #354

* replaced Conditioning class with exported functions

* fix wrong type of with_variations attribute during intialization

* changed "image_iterator()" to "get_make_image()"

* raise NotImplementedError for calling get_make_image() in parent class

* Update ldm/generate.py

better error message

Co-authored-by: Kevin Gibbons <bakkot@gmail.com>

* minor stylistic fixes and assertion checks from code review

* moved get_noise() method into img2img class

* break get_noise() into two methods, one for txt2img and the other for img2img

* inpainting works on non-square images now

* make get_noise() an abstract method in base class

* much improved inpainting

Co-authored-by: Kevin Gibbons <bakkot@gmail.com>
2022-09-05 20:40:10 -04:00
Lincoln Stein
bdb0651eb2 add support for Apple hardware using MPS acceleration 2022-08-31 00:33:23 -04:00
Lincoln Stein
4f02b72c9c prettified all the code using "blue" at the urging of @tildebyte 2022-08-26 03:15:42 -04:00
ablattmann
e66308c7f2 add code 2021-12-21 03:23:41 +01:00