Readme update (#253)

* check if torch.backends has mps before calling it

* Updated Mac Readme with latest debugging info

Co-authored-by: James Reynolds <magnsuviri@me.com>
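
The first bullet above describes guarding the MPS lookup. A minimal sketch of that kind of check follows. The helper name `pick_device` is hypothetical and not taken from this commit, and the module is passed in as a parameter so the logic can be tried without PyTorch installed.

```python
from types import SimpleNamespace


def pick_device(torch_module):
    """Return "mps", "cuda", or "cpu" for a torch-like module (hypothetical helper)."""
    backends = getattr(torch_module, "backends", None)
    # Older PyTorch builds have no torch.backends.mps attribute at all,
    # so look it up defensively instead of calling it directly.
    mps = getattr(backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    cuda = getattr(torch_module, "cuda", None)
    if cuda is not None and cuda.is_available():
        return "cuda"
    return "cpu"


# Stand-in for an old build that lacks torch.backends.mps.
old_torch = SimpleNamespace(
    backends=SimpleNamespace(),
    cuda=SimpleNamespace(is_available=lambda: False),
)
print(pick_device(old_torch))  # cpu
```

With a real install, the intended use would be `import torch` followed by `device = pick_device(torch)`.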
James Reynolds 2022-08-31 09:27:13 -06:00 committed by GitHub
parent 58e3562652
commit 0d2e6f90c8

@ -1,20 +1,19 @@
# Apple Silicon Mac Users # Apple Silicon Mac Users
Several people have gotten Stable Diffusion to work on Apple Silicon Several people have gotten Stable Diffusion to work on Apple Silicon
Macs using Anaconda. I've gathered up most of their instructions and Macs using Anaconda, miniforge, etc. I've gathered up most of their instructions and
put them in this fork (and readme). I haven't tested anything besides put them in this fork (and readme). Things have moved really fast and so these
Anaconda, and I've read about issues with things like miniforge, so if instructions change often. Hopefully things will settle down a little.
you have an issue that isn't dealt with in this fork then head on over
to the [Apple There are several places where people are discussing Apple
Silicon](https://github.com/CompVis/stable-diffusion/issues/25) issue MPS functionality: [the original CompVis
on GitHub (that page is so long that GitHub hides most of it by issue](https://github.com/CompVis/stable-diffusion/issues/25), and generally on
default, so you need to find the hidden part and expand it to view the [lstein's fork](https://github.com/lstein/stable-diffusion/).
whole thing). This fork would not have been possible without the work
done by the people on that issue.
You have to have macOS 12.3 Monterey or later. Anything earlier than that won't work. You have to have macOS 12.3 Monterey or later. Anything earlier than that won't work.
BTW, I haven't tested any of this on Intel Macs. BTW, I haven't tested any of this on Intel Macs but I have read that one person
got it to work.
How to: How to:
@ -27,38 +26,41 @@ ln -s /path/to/ckpt/sd-v1-1.ckpt models/ldm/stable-diffusion-v1/model.ckpt
conda env create -f environment-mac.yaml conda env create -f environment-mac.yaml
conda activate ldm conda activate ldm
python scripts/preload_models.py
python scripts/orig_scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms
``` ```
These instructions are identical to the main repo except I added We have not gotten lstein's dream.py to work yet.
environment-mac.yaml because Mac doesn't have cudatoolkit.
After you follow all the instructions and run txt2img.py you might get several errors. Here are the errors I've seen and found solutions for. After you follow all the instructions and run txt2img.py you might get several errors. Here are the errors I've seen and found solutions for.
### Is it slow?
Be sure to specify 1 sample and 1 iteration.
python ./scripts/txt2img.py --prompt "ocean" --ddim_steps 5 --n_samples 1 --n_iter 1
### Doesn't work anymore? ### Doesn't work anymore?
We are using PyTorch nightly, which includes support for MPS. I don't PyTorch nightly includes support for MPS. Because of this, this setup is
know exactly how Anaconda does updates, but I woke up one morning and inherently unstable. One morning I woke up and it no longer worked no matter
Stable Diffusion crashed and I couldn't think of anything I did that what I did until I switched to miniforge. However, I have another Mac that works
would've changed anything the night before, when it worked. A day and just fine with Anaconda. If you can't get it to work, please search a little
a half later I finally got it working again. I don't know what changed first because many of the errors will get posted and solved. If you can't find
overnight. PyTorch-nightly changes overnight but I'm pretty sure I a solution please [create an issue](https://github.com/lstein/stable-diffusion/issues).
didn't manually update it. Either way, things are probably going to be
bumpy on Apple Silicon until PyTorch releases a firm version that we
can lock to.
To manually update to the latest version of PyTorch nightly (which could fix issues), run this command. One debugging step is to update to the latest version of PyTorch nightly.
conda install pytorch torchvision torchaudio -c pytorch-nightly conda install pytorch torchvision torchaudio -c pytorch-nightly
## Debugging? Or you can clean everything up.
Tired of waiting for your renders to finish before you can see if it conda clean --yes --all
works? Reduce the steps! The picture won't look like anything but if it
finishes, hey, it works! This could also help you figure out if you've
got a memory problem, because I'm betting 1 step doesn't use much
memory.
python ./scripts/txt2img.py --prompt "ocean" --ddim_steps 1 Or you can reset Anaconda.
conda update --force-reinstall -y -n base -c defaults conda
### "No module named cv2" (or some other module) ### "No module named cv2" (or some other module)
@ -83,6 +85,23 @@ globally.
You might also need to install Rust (I mention this again below). You might also need to install Rust (I mention this again below).
### Debugging?
Tired of waiting for your renders to finish before you can see if it
works? Reduce the steps! The image quality will be horrible but at least you'll
get quick feedback.
python ./scripts/txt2img.py --prompt "ocean" --ddim_steps 5 --n_samples 1 --n_iter 1
### MAC: 'torch._C' has no attribute '_cuda_resetPeakMemoryStats' #234
We haven't gotten dream.py to work on Mac yet.
### OSError: Can't load tokenizer for 'openai/clip-vit-large-patch14'...
python scripts/preload_models.py
### "The operator [name] is not current implemented for the MPS device." (sic) ### "The operator [name] is not current implemented for the MPS device." (sic)
Example error. Example error.
@ -92,9 +111,7 @@ Example error.
NotImplementedError: The operator 'aten::index.Tensor' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on [https://github.com/pytorch/pytorch/issues/77764](https://github.com/pytorch/pytorch/issues/77764). As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS. NotImplementedError: The operator 'aten::index.Tensor' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on [https://github.com/pytorch/pytorch/issues/77764](https://github.com/pytorch/pytorch/issues/77764). As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
``` ```
Just do what it says: The lstein branch includes this fix in [environment-mac.yaml](https://github.com/lstein/stable-diffusion/blob/main/environment-mac.yaml).
export PYTORCH_ENABLE_MPS_FALLBACK=1
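
If you would rather set the variable from Python than from the shell, something like the following may work. This is a sketch only: the helper name `enable_mps_fallback` is made up for illustration, and it assumes the variable has to be in place before PyTorch is imported.

```python
import os


def enable_mps_fallback(env=os.environ):
    """Set PYTORCH_ENABLE_MPS_FALLBACK=1 unless the caller already chose a value."""
    env.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")
    return env["PYTORCH_ENABLE_MPS_FALLBACK"]


# Call this before `import torch`. Assumption: the variable is read when
# PyTorch is loaded, so setting it afterwards may have no effect.
enable_mps_fallback()
```

Using `setdefault` leaves any value you already exported in the shell untouched.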
### "Could not build wheels for tokenizers" ### "Could not build wheels for tokenizers"
@ -104,6 +121,8 @@ I have not seen this error because I had Rust installed on my computer before I
### How come `--seed` doesn't work? ### How come `--seed` doesn't work?
First this:
> Completely reproducible results are not guaranteed across PyTorch > Completely reproducible results are not guaranteed across PyTorch
releases, individual commits, or different platforms. Furthermore, releases, individual commits, or different platforms. Furthermore,
results may not be reproducible between CPU and GPU executions, even results may not be reproducible between CPU and GPU executions, even
@ -111,7 +130,8 @@ when using identical seeds.
[PyTorch docs](https://pytorch.org/docs/stable/notes/randomness.html) [PyTorch docs](https://pytorch.org/docs/stable/notes/randomness.html)
There is an [open issue](https://github.com/pytorch/pytorch/issues/78035) (as of August 2022) in pytorch regarding gradient inconsistency. I am guessing that's what is causing this. Second, we might have a fix that at least gets a consistent seed sort of. We're
still working on it.
### libiomp5.dylib error? ### libiomp5.dylib error?
@ -137,6 +157,8 @@ sort). [There's more
suggestions](https://stackoverflow.com/questions/53014306/error-15-initializing-libiomp5-dylib-but-found-libiomp5-dylib-already-initial), suggestions](https://stackoverflow.com/questions/53014306/error-15-initializing-libiomp5-dylib-but-found-libiomp5-dylib-already-initial),
like uninstalling tensorflow and reinstalling. I haven't tried them. like uninstalling tensorflow and reinstalling. I haven't tried them.
Since I switched to miniforge I haven't seen the error.
### Not enough memory. ### Not enough memory.
This seems to be a common problem and is probably the underlying This seems to be a common problem and is probably the underlying
@ -174,10 +196,10 @@ Actually, this could be happening because there's not enough RAM. You could try
### My images come out black ### My images come out black
I haven't solved this issue. I just throw away my black We might have this fixed; we are still testing.
images. There's a [similar
issue](https://github.com/CompVis/stable-diffusion/issues/69) on CUDA There's a [similar issue](https://github.com/CompVis/stable-diffusion/issues/69)
GPUs where the images come out green. Maybe it's the same issue? on CUDA GPUs where the images come out green. Maybe it's the same issue?
Someone in that issue says to use "--precision full", but this fork Someone in that issue says to use "--precision full", but this fork
actually disables that flag. I don't know why, someone else provided actually disables that flag. I don't know why, someone else provided
that code and I don't know what it does. Maybe the `model.half()` that code and I don't know what it does. Maybe the `model.half()`
@ -204,25 +226,4 @@ What? Intel? On an Apple Silicon?
The processor must support the Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) instructions. The processor must support the Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) instructions.
The processor must support the Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions. The processor must support the Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
This fixed it for me: This was actually the issue that I couldn't solve until I switched to miniforge.
conda clean --yes --all
### Still slow?
I changed the defaults of n_samples and n_iter to 1 so that it uses
less RAM and makes fewer images so it will be faster the first time you
use it. I don't actually know what n_samples does internally, but I
know it consumes a lot more RAM. The n_iter flag just loops around the
image creation code, so it shouldn't consume more RAM (it should be
faster if you're going to do multiple images because the libraries and
model will already be loaded--use a prompt file to get this speed
boost).
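
A rough model of that trade-off, assuming n_samples acts as the batch size (which is how it appears to be used in txt2img.py) and n_iter repeats the sampling pass. The function `plan_batches` is purely illustrative and does not exist in the repo.

```python
def plan_batches(n_samples, n_iter):
    """Return (total_images, images_held_at_once) for a simplified sampling loop."""
    total = 0
    peak = 0
    for _ in range(n_iter):   # outer loop: repeats the whole sampling pass
        batch = n_samples     # one batch of n_samples images is generated together
        peak = max(peak, batch)
        total += batch
    return total, peak


print(plan_batches(n_samples=1, n_iter=4))  # (4, 1)
print(plan_batches(n_samples=4, n_iter=1))  # (4, 4)
```

Both calls produce four images, but the second holds four in flight at once, which is where the extra RAM would go under this model.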
These flags are the default sample and iter settings in this fork/branch:
~~~~
python scripts/txt2img.py --prompt "ocean" --n_samples=1 --n_iter=1
~~~~