Merge branch 'main' into fit-init-img

add a --fit option to limit the size of the initial image to the
maximum boundaries specified by width and height.
Lincoln Stein 2022-09-01 14:09:46 -04:00
commit e6b2c15fc5
5 changed files with 90 additions and 69 deletions


@@ -12,8 +12,7 @@ issue](https://github.com/CompVis/stable-diffusion/issues/25), and generally on

You have to have macOS 12.3 Monterey or later. Anything earlier than that won't work.

Tested on a 2022 MacBook M2 Air with 10-core GPU and 24 GB unified memory.

How to:
@@ -22,24 +21,23 @@ git clone https://github.com/lstein/stable-diffusion.git
cd stable-diffusion
mkdir -p models/ldm/stable-diffusion-v1/
PATH_TO_CKPT="$HOME/Documents/stable-diffusion-v-1-4-original" # or wherever yours is.
ln -s "$PATH_TO_CKPT/sd-v1-4.ckpt" models/ldm/stable-diffusion-v1/model.ckpt
CONDA_SUBDIR=osx-arm64 conda env create -f environment-mac.yaml
conda activate ldm
python scripts/preload_models.py
python scripts/dream.py --full_precision # half-precision requires autocast and won't work
```
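If the model later fails to load, a quick sanity check is to confirm the symlink actually resolves to the checkpoint (paths as set up above; `ls -L` follows the link):

```
ls -Llh models/ldm/stable-diffusion-v1/model.ckpt
# should show a file of roughly 4 GB, not "No such file or directory"
```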
After you follow all the instructions and run dream.py you might get several errors. Here are the errors I've seen and found solutions for.
### Is it slow?

Be sure to specify 1 sample and 1 iteration.

python ./scripts/orig_scripts/txt2img.py --prompt "ocean" --ddim_steps 5 --n_samples 1 --n_iter 1
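If you are running the newer dream.py instead, the equivalent quick test looks something like the sketch below. The `-s` (steps) and `-n` (iterations) prompt flags are assumptions based on the repo's CLI, so check `--help` if yours differ:

```
python scripts/dream.py --full_precision
# then at the dream> prompt:
# dream> "ocean" -s 5 -n 1
```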
### Doesn't work anymore?
@@ -94,10 +92,6 @@ get quick feedback.

python ./scripts/txt2img.py --prompt "ocean" --ddim_steps 5 --n_samples 1 --n_iter 1
### OSError: Can't load tokenizer for 'openai/clip-vit-large-patch14'...

python scripts/preload_models.py
@@ -108,7 +102,7 @@ Example error.

```
...
NotImplementedError: The operator 'aten::_index_put_impl_' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on [https://github.com/pytorch/pytorch/issues/77764](https://github.com/pytorch/pytorch/issues/77764). As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
```

The lstein branch includes this fix in [environment-mac.yaml](https://github.com/lstein/stable-diffusion/blob/main/environment-mac.yaml).
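If your environment predates that fix, you can set the variable by hand before launching. This is exactly the stopgap the error message suggests: a CPU fallback, slower than native MPS but enough to unblock the op:

```
export PYTORCH_ENABLE_MPS_FALLBACK=1
python scripts/dream.py --full_precision
```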
@@ -137,27 +131,18 @@ still working on it.

OMP: Error #15: Initializing libiomp5.dylib, but found libomp.dylib already initialized.

You are likely using an Intel package by mistake. Be sure to run conda with
the environment variable `CONDA_SUBDIR=osx-arm64`, like so:

`CONDA_SUBDIR=osx-arm64 conda install ...`

This error happens with Anaconda on Macs when the Intel-only `mkl` is pulled in by
a dependency. [nomkl](https://stackoverflow.com/questions/66224879/what-is-the-nomkl-python-package-used-for)
is a metapackage designed to prevent this by making it impossible to install
`mkl`, but if your environment is already broken it may not work.

Do *not* use `os.environ['KMP_DUPLICATE_LIB_OK']='True'` or equivalents, as this
masks the underlying issue of using Intel packages.
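If your environment already pulled in Intel packages, the simplest recovery is usually to rebuild it ARM-only. A sketch, assuming the env is named `ldm` as in this repo (`conda config --env --set subdir osx-arm64` pins the setting inside the env so later installs stay ARM-only):

```
conda deactivate
conda env remove -n ldm
CONDA_SUBDIR=osx-arm64 conda env create -f environment-mac.yaml
conda activate ldm
conda config --env --set subdir osx-arm64
```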
### Not enough memory.
@@ -226,4 +211,8 @@ What? Intel? On an Apple Silicon?

The processor must support the Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) instructions.
The processor must support the Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.

This is due to the Intel `mkl` package getting picked up when you try to install
something that depends on it: Rosetta can translate some Intel instructions but
not the specialized ones here. To avoid this, make sure to use the environment
variable `CONDA_SUBDIR=osx-arm64`, which restricts the Conda environment to only
use ARM packages, and use `nomkl` as described above.
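To check whether `mkl` slipped into your environment in the first place, something like this should tell you (env name `ldm` assumed):

```
conda list -n ldm | grep -i mkl
# ideally this prints only nomkl; a line for mkl itself means Intel packages got in
```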


@@ -740,7 +740,7 @@ and [Tesseract Cat](https://github.com/TesseractCat)

Original portions of the software are Copyright (c) 2020 Lincoln D. Stein (https://github.com/lstein)

# Further Reading

Please see the original README for more information on this software
and underlying algorithm, located in the file [README-CompViz.md](README-CompViz.md).


@@ -1,33 +1,57 @@
name: ldm
channels:
  - pytorch-nightly
  - conda-forge
dependencies:
  - python==3.9.13
  - pip==22.2.2
  # pytorch-nightly, left unpinned
  - pytorch
  - torchmetrics
  - torchvision
  # I suggest to keep the other deps sorted for convenience.
  # If you wish to upgrade to 3.10, try to run this:
  #
  # ```shell
  # CONDA_CMD=conda
  # sed -E 's/python==3.9.13/python==3.10.5/;s/ldm/ldm-3.10/;21,99s/- ([^=]+)==.+/- \1/' environment-mac.yaml > /tmp/environment-mac-updated.yml
  # CONDA_SUBDIR=osx-arm64 $CONDA_CMD env create -f /tmp/environment-mac-updated.yml && $CONDA_CMD list -n ldm-3.10 | awk ' {print " - " $1 "==" $2;} '
  # ```
  #
  # Unfortunately, as of 2022-08-31, this fails at the pip stage.
  - albumentations==1.2.1
  - coloredlogs==15.0.1
  - einops==0.4.1
  - grpcio==1.46.4
  - humanfriendly
  - imageio-ffmpeg==0.4.7
  - imageio==2.21.2
  - imgaug==0.4.0
  - kornia==0.6.7
  - mpmath==1.2.1
  - nomkl
  - numpy==1.23.2
  - omegaconf==2.1.1
  - onnx==1.12.0
  - onnxruntime==1.12.1
  - opencv==4.6.0
  - pudb==2022.1
  - pytorch-lightning==1.6.5
  - scipy==1.9.1
  - streamlit==1.12.2
  - sympy==1.10.1
  - tensorboard==2.9.0
  - transformers==4.21.2
  - pip:
    - invisible-watermark
    - test-tube
    - tokenizers
    - torch-fidelity
    - -e git+https://github.com/huggingface/diffusers.git@v0.2.4#egg=diffusers
    - -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers
    - -e git+https://github.com/openai/CLIP.git@main#egg=clip
    - -e git+https://github.com/lstein/k-diffusion.git@master#egg=k-diffusion
    - -e .
variables:


@@ -27,7 +27,6 @@ from ldm.models.diffusion.ddim import DDIMSampler

from ldm.models.diffusion.plms import PLMSSampler
from ldm.models.diffusion.ksampler import KSampler
from ldm.dream.pngwriter import PngWriter
from ldm.dream.devices import choose_torch_device

"""Simplified text to image API for stable diffusion/latent diffusion
@@ -159,7 +158,7 @@ class T2I:

        # for VRAM usage statistics
        self.session_peakmem = torch.cuda.max_memory_allocated() if self.device == 'cuda' else None
        if seed is None:
            self.seed = self._new_seed()
        else:
@@ -178,7 +177,8 @@ class T2I:

        outputs = []
        for image, seed in results:
            name = f'{prefix}.{seed}.png'
            path = pngwriter.save_image_and_prompt_to_png(
                image, f'{prompt} -S{seed}', name)
            outputs.append([path, seed])
        return outputs
@@ -276,7 +276,8 @@ class T2I:

        self._set_sampler()
        tic = time.time()
        if torch.cuda.is_available():
            torch.cuda.reset_peak_memory_stats()
        results = list()
        try:
@@ -487,7 +488,8 @@ class T2I:

        uc = self.model.get_learned_conditioning([''])
        # get weighted sub-prompts
        weighted_subprompts = T2I._split_weighted_subprompts(
            prompt, skip_normalize)
        if len(weighted_subprompts) > 1:
            # i dont know if this is correct.. but it works
@@ -530,7 +532,7 @@ class T2I:

        if self.model is None:
            seed_everything(self.seed)
            try:
                config = OmegaConf.load(self.config)
                self.device = self._get_device()
                model = self._load_model_from_config(config, self.weights)
                if self.embedding_path is not None:
@@ -673,18 +675,20 @@ class T2I:

            $              # else, if no ':' then match end of line
            )              # end non-capture group
        """, re.VERBOSE)
        parsed_prompts = [(match.group("prompt").replace("\\:", ":"), float(
            match.group("weight") or 1)) for match in re.finditer(prompt_parser, text)]
        if skip_normalize:
            return parsed_prompts
        weight_sum = sum(map(lambda x: x[1], parsed_prompts))
        if weight_sum == 0:
            print(
                "Warning: Subprompt weights add up to zero. Discarding and using even weights instead.")
            equal_weight = 1 / len(parsed_prompts)
            return [(x[0], equal_weight) for x in parsed_prompts]
        return [(x[0], x[1] / weight_sum) for x in parsed_prompts]

    # shows how the prompt is tokenized
    # usually tokens have '</w>' to indicate end-of-word,
    # but for readability it has been replaced with ' '
    def _log_tokenization(self, text):
        if not self.log_tokenization:
@@ -721,4 +725,8 @@ class T2I:

            height = h
            width = w
            resize_needed = True
        if (width * height) > (self.width * self.height):
            print(">> This input is larger than your defaults. If you run out of memory, please use a smaller image.")
        return width, height, resize_needed
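For context, the size check added above complements this commit's `--fit` option (see the commit message): instead of merely warning, `--fit` scales an oversized init image down to fit within the requested width and height. A hedged usage sketch; the `-I`, `-W`, and `-H` prompt flags are assumptions based on the repo's CLI conventions:

```
python scripts/dream.py --full_precision
# at the dream> prompt (flags assumed; --fit is what this commit adds):
# dream> "a misty landscape" -I ./big-photo.png --fit -W 512 -H 512
```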


@@ -200,7 +200,7 @@ def main():

    config = OmegaConf.load(f"{opt.config}")
    model = load_model_from_config(config, f"{opt.ckpt}")
    device = torch.device(choose_torch_device())
    model = model.to(device)
    if opt.plms: