Commit Graph

32 Commits

Author SHA1 Message Date
Mihai
071f65a892
Enable even larger images with one simple torch.nn.functional.silu import (#653)
Fixes:
File "stable-diffusion/ldm/modules/diffusionmodules/model.py", line 37, in nonlinearity
    return x*torch.sigmoid(x)
RuntimeError: CUDA out of memory. Tried to allocate 1.56 GiB [..]

Now up to 1536x1280 is possible on 8GB VRAM.
Also remove unused SiLU class.
2022-09-17 18:03:52 -04:00
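A minimal sketch of the change described in 071f65a892, assuming the old nonlinearity was exactly the x*torch.sigmoid(x) line shown in the traceback. The function names below are illustrative and are not taken from the repository. The sketch only checks that the two forms agree numerically; it does not measure the memory saving reported in the commit.

```python
import torch
import torch.nn.functional as F

def nonlinearity_old(x):
    # Materializes sigmoid(x) as a separate tensor before the multiply.
    return x * torch.sigmoid(x)

def nonlinearity_new(x):
    # Single call to the SiLU (swish) op provided by PyTorch.
    return F.silu(x)

torch.manual_seed(0)
x = torch.randn(4, 8)
old_out = nonlinearity_old(x)
new_out = nonlinearity_new(x)
```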
Mihail Dumitrescu
e0951f28cf Refactor attention.CrossAttention to remove duplicate code and apply optimizations
Apply ~6% speedup by moving * self.scale to earlier on a smaller tensor.
When we have enough VRAM don't make a useless zeros tensor.
Switch between cuda/mps/cpu based on q.device.type to allow cleaner per architecture future optimizations.
For cuda and cpu keep VRAM usage and faster slicing consistent.
For cpu use smaller slices. Tested ~20% faster on i7, 9.8 to 7.7 s/it.
Fix = typo to self.mem_total >= 8 in einsum_op_mps_v2 as per #582 discussion.
2022-09-17 20:19:21 +03:00
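The "* self.scale earlier on a smaller tensor" point in e0951f28cf can be sketched as follows. This is an illustration under assumptions, not the repository code: the function names, tensor shapes, and einsum subscripts are hypothetical. The idea is that scaling q touches b*n*d elements, while scaling the score matrix touches b*n*n elements, and n (token count) is usually much larger than d (head dimension).

```python
import torch

def scores_scale_late(q, k, scale):
    # Scale the full (b, n, n) score matrix after the matmul.
    return torch.einsum("bid,bjd->bij", q, k) * scale

def scores_scale_early(q, k, scale):
    # Scale q, which is only (b, n, d), before the matmul.
    return torch.einsum("bid,bjd->bij", q * scale, k)

torch.manual_seed(0)
b, n, d = 2, 64, 8          # hypothetical sizes, chosen so that n > d
q = torch.randn(b, n, d)
k = torch.randn(b, n, d)
scale = d ** -0.5

late = scores_scale_late(q, k, scale)
early = scores_scale_early(q, k, scale)
```

Both variants produce the same scores up to floating-point rounding; the speedup figure quoted in the commit is the author's measurement and is not reproduced here.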
Lincoln Stein
df4c80f177 respect --outdir again; fix issue #628 2022-09-16 19:58:45 -04:00
Mihai
dd3fff1d3e
~7% speedup by switch to += in ldm.modules.attention. (#569)
Tested on 8GB eGPU nvidia setup so YMMV.
Re-land with .clone() fix, context #508
2022-09-14 18:10:33 -04:00
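A hedged sketch of why an in-place += can need a .clone(), as mentioned in dd3fff1d3e. The log does not show where the clone was placed in ldm.modules.attention, so the helper names and the placement below are assumptions for illustration only.

```python
import torch

def residual_out_of_place(x, block):
    # Allocates a new tensor for the sum; the caller's x is untouched.
    return block(x) + x

def residual_in_place(x, block):
    # Writes the sum into x's existing storage. No new tensor is allocated
    # for the result, but the caller's tensor is modified as well.
    x += block(x)
    return x

def double(t):
    # Stand-in for an attention block (hypothetical).
    return t * 2

original = torch.ones(3)
safe = residual_in_place(original.clone(), double)   # clone protects `original`

aliased_input = torch.ones(3)
aliased = residual_in_place(aliased_input, double)   # mutates aliased_input
```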
Any-Winter-4079
d0a71dc361
Update attention.py for 16-32GB M1 performance (#540)
Code cleanup and attention.py einsum_ops update for M1 16-32GB performance.
Expected: On par with fastest ever from 8 to 128GB for 512x512. Allows large images.
2022-09-13 10:53:45 -04:00
Mihai
dedf8a3692
Remove pointless del statements in diffusionmodules.model. (#520) 2022-09-12 17:39:06 -04:00
Mihai
0bc6779361
Disable autocast for cpu to fix error. Remove unused precision arg. (#518)
When running on just cpu (intel), a call to torch.layer_norm would error with RuntimeError: expected scalar type BFloat16 but found Float
Fix buggy device handling in model.py.
Tested with scripts/dream.py --full_precision on just cpu on intel laptop. Works but slow at ~10s/it.
2022-09-12 16:55:21 -04:00
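One way to express "disable autocast for cpu" from 0bc6779361 is to pick the context manager from the device type. This is a sketch under assumptions: choose_precision_scope is a hypothetical helper name, and the actual fix in the repository may be structured differently.

```python
from contextlib import nullcontext
import torch

def choose_precision_scope(device_type):
    """Return a context-manager factory: autocast on CUDA, a no-op elsewhere."""
    if device_type == "cuda":
        return torch.autocast
    return nullcontext

device_type = "cpu"
scope = choose_precision_scope(device_type)
x = torch.randn(2, 4)
# On CPU the scope is a no-op, so layer_norm runs in plain float32.
with scope(device_type):
    y = torch.nn.functional.layer_norm(x, (4,))
```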
Lincoln Stein
9fa1f31bf2 fix opencv and realesrgan dependencies in mac install 2022-09-12 07:07:05 -04:00
Any-Winter-4079
25d9ccc509 Update model.py 2022-09-11 22:37:45 -04:00
Any-Winter-4079
9cdf3aca7d Update attention.py
Performance improvements to generate larger images in M1 #431

Update attention.py

Added dtype=r1.dtype to softmax
2022-09-11 22:36:58 -04:00
Lincoln Stein
7708f4fb98 slight efficiency gain by using += in attention.py 2022-09-11 16:03:54 -04:00
chromaticist
4951e66103
Adding support for .bin files from huggingface concepts (#498)
* Adding support for .bin files from huggingface concepts

* Updating documentation to include huggingface .bin info
2022-09-11 15:44:26 -04:00
Lincoln Stein
70aa674e9e merge PR #495 - keep using float16 in ldm.modules.attention 2022-09-11 10:34:06 -04:00
Lincoln Stein
10db192cc4 changes to dogettx optimizations to run on m1
* Author @any-winter-4079
* Author @dogettx
Thanks to many individuals who contributed time and hardware to
benchmarking and debugging these changes.
2022-09-09 09:51:41 -04:00
Lincoln Stein
653144694f
work around unexplained crash when timesteps=1000 (#440)
* work around unexplained crash when timesteps=1000

* this fix seems to work
2022-09-08 20:41:37 -04:00
Lincoln Stein
29ab3c2028
disable neonpixel optimizations on M1 hardware (#414)
* disable neonpixel optimizations on M1 hardware

* fix typo that was causing random noise images on m1
2022-09-07 13:28:11 -04:00
Lincoln Stein
720e5cd651
Refactoring simplet2i (#387)
* start refactoring -not yet functional

* first phase of refactor done - not sure weighted prompts working

* Second phase of refactoring. Everything mostly working.
* The refactoring has moved all the hard-core inference work into
ldm.dream.generator.*, where there are submodules for txt2img and
img2img. inpaint will go in there as well.
* Some additional refactoring will be done soon, but relatively
minor work.

* fix -save_orig flag to actually work

* add @neonsecret attention.py memory optimization

* remove unneeded imports

* move token logging into conditioning.py

* add placeholder version of inpaint; porting in progress

* fix crash in img2img

* inpainting working; not tested on variations

* fix crashes in img2img

* ported attention.py memory optimization #117 from basujindal branch

* added @torch_no_grad() decorators to img2img, txt2img, inpaint closures

* Final commit prior to PR against development
* fixup crash when generating intermediate images in web UI
* rename ldm.simplet2i to ldm.generate
* add backward-compatibility simplet2i shell with deprecation warning

* add back in mps exception, addresses @vargol comment in #354

* replaced Conditioning class with exported functions

* fix wrong type of with_variations attribute during initialization

* changed "image_iterator()" to "get_make_image()"

* raise NotImplementedError for calling get_make_image() in parent class

* Update ldm/generate.py

better error message

Co-authored-by: Kevin Gibbons <bakkot@gmail.com>

* minor stylistic fixes and assertion checks from code review

* moved get_noise() method into img2img class

* break get_noise() into two methods, one for txt2img and the other for img2img

* inpainting works on non-square images now

* make get_noise() an abstract method in base class

* much improved inpainting

Co-authored-by: Kevin Gibbons <bakkot@gmail.com>
2022-09-05 20:40:10 -04:00
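Several bullets in 720e5cd651 describe a class layout: get_make_image() raises NotImplementedError in the parent class, get_noise() is abstract in the base class, and the image-making closures are wrapped in a no-grad decorator. The sketch below illustrates that pattern only. The class names, signatures, and tensor shapes are hypothetical, and the decorator is written as torch.no_grad(), which is the PyTorch spelling.

```python
from abc import ABC, abstractmethod
import torch

class GeneratorSketch(ABC):
    def get_make_image(self, weight):
        # Parent class refuses; subclasses supply the real closure.
        raise NotImplementedError("get_make_image() must be implemented in a subclass")

    @abstractmethod
    def get_noise(self, width, height):
        ...

class Txt2ImgSketch(GeneratorSketch):
    def get_noise(self, width, height):
        # Hypothetical latent shape: 4 channels at 1/8 resolution.
        return torch.randn(1, 4, height // 8, width // 8)

    def get_make_image(self, weight):
        @torch.no_grad()
        def make_image(x_T):
            # Stand-in for sampling; no autograd graph is recorded here.
            return x_T * weight
        return make_image

weight = torch.ones(1, requires_grad=True)
gen = Txt2ImgSketch()
make_image = gen.get_make_image(weight)
image = make_image(gen.get_noise(64, 64))
```

Because the closure runs under torch.no_grad(), the result does not track gradients even though weight does.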
Lincoln Stein
bdb0651eb2 add support for Apple hardware using MPS acceleration 2022-08-31 00:33:23 -04:00
Paul Sajna
555f13e469
Merge branch 'main' into half-precision-embeddings 2022-08-26 08:33:46 -07:00
Paul Sajna
9b5101cd8d support full-precision embeddings in half precision mode 2022-08-26 08:30:58 -07:00
Lincoln Stein
4f02b72c9c prettified all the code using "blue" at the urging of @tildebyte 2022-08-26 03:15:42 -04:00
Sean McLellan
84989f0d05 Remote token output on startup 2022-08-23 22:39:10 -04:00
Sean McLellan
611ccb991e Remove another duplicate file 2022-08-23 18:31:41 -04:00
Sean McLellan
8952196bbf Add personalization 2022-08-23 18:26:28 -04:00
Lincoln Stein
a20827697c adjusted instructions for the released stable-diffusion-v1 weights 2022-08-22 15:33:27 -04:00
Lincoln Stein
831bbd7a54 improved error reporting when a missing online dependency can't be downloaded 2022-08-17 18:06:30 -04:00
Lincoln Stein
a7532b386a simplified instructions to preload Bert and kornia prerequisites; fixed --grid and --batch handling; added timing information after image generation 2022-08-17 12:00:00 -04:00
Lincoln Stein
d6124c44a3 added customized patches and updated the README 2022-08-16 21:34:37 -04:00
Robin Rombach
2ff270f4e0 stable diffusion 2022-08-10 16:30:49 +02:00
rromb
f13bf9bf46 add vqgan loss with codebook statistic eval 2022-02-21 15:06:50 +01:00
ablattmann
171cf29fb5 add configs for training unconditional/class-conditional ldms 2021-12-22 15:57:23 +01:00
ablattmann
e66308c7f2 add code 2021-12-21 03:23:41 +01:00