Merge branch 'lstein:main' into main

commit 2b7f32502c
Authored by James Reynolds, 2022-09-01 19:41:14 -06:00, committed by GitHub
13 changed files with 391 additions and 314 deletions


@ -12,8 +12,7 @@ issue](https://github.com/CompVis/stable-diffusion/issues/25), and generally on
You have to have macOS 12.3 Monterey or later. Anything earlier than that won't work.
-BTW, I haven't tested any of this on Intel Macs but I have read that one person
-got it to work.
Tested on a 2022 Macbook M2 Air with 10-core GPU and 24 GB unified memory.
How to:
@ -22,24 +21,23 @@ git clone https://github.com/lstein/stable-diffusion.git
cd stable-diffusion
mkdir -p models/ldm/stable-diffusion-v1/
-ln -s /path/to/ckpt/sd-v1-1.ckpt models/ldm/stable-diffusion-v1/model.ckpt
PATH_TO_CKPT="$HOME/Documents/stable-diffusion-v-1-4-original" # or wherever yours is.
ln -s "$PATH_TO_CKPT/sd-v1-4.ckpt" models/ldm/stable-diffusion-v1/model.ckpt
-conda env create -f environment-mac.yaml
CONDA_SUBDIR=osx-arm64 conda env create -f environment-mac.yaml
conda activate ldm
python scripts/preload_models.py
-python scripts/orig_scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms
python scripts/dream.py --full_precision # half-precision requires autocast and won't work
```
-We have not gotten lstein's dream.py to work yet.
-After you follow all the instructions and run txt2img.py you might get several errors. Here's the errors I've seen and found solutions for.
After you follow all the instructions and run dream.py you might get several errors. Here are the errors I've seen and found solutions for.
### Is it slow?
Be sure to specify 1 sample and 1 iteration.
-python ./scripts/txt2img.py --prompt "ocean" --ddim_steps 5 --n_samples 1 --n_iter 1
python ./scripts/orig_scripts/txt2img.py --prompt "ocean" --ddim_steps 5 --n_samples 1 --n_iter 1
### Doesn't work anymore?
@ -94,10 +92,6 @@ get quick feedback.
python ./scripts/txt2img.py --prompt "ocean" --ddim_steps 5 --n_samples 1 --n_iter 1
-### MAC: torch._C' has no attribute '_cuda_resetPeakMemoryStats' #234
-We haven't gotten dream.py to work on Mac yet.
### OSError: Can't load tokenizer for 'openai/clip-vit-large-patch14'...
python scripts/preload_models.py
@ -108,7 +102,7 @@ Example error.
```
...
NotImplementedError: The operator 'aten::_index_put_impl_' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on [https://github.com/pytorch/pytorch/issues/77764](https://github.com/pytorch/pytorch/issues/77764). As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
```
The lstein branch includes this fix in [environment-mac.yaml](https://github.com/lstein/stable-diffusion/blob/main/environment-mac.yaml).
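If you prefer to set the variable yourself rather than rely on the environment file, a minimal sketch (illustrative only, not from the repository) looks like this; the variable must be set before `torch` is imported:

```python
# Illustrative only: enable the CPU fallback for operators MPS does not implement yet.
# The environment variable must be set before torch is imported.
import os
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
x = torch.arange(10, device=device)
print(x[x > 5])  # boolean indexing falls back to the CPU instead of raising NotImplementedError
```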
@ -137,27 +131,18 @@ still working on it.
OMP: Error #15: Initializing libiomp5.dylib, but found libomp.dylib already initialized.
-There are several things you can do. First, you could use something
-besides Anaconda like miniforge. I read a lot of things online telling
-people to use something else, but I am stuck with Anaconda for other
-reasons.
You are likely using an Intel package by mistake. Be sure to run conda with
the environment variable `CONDA_SUBDIR=osx-arm64`, like so:
-Or you can try this.
`CONDA_SUBDIR=osx-arm64 conda install ...`
-export KMP_DUPLICATE_LIB_OK=True
This error happens with Anaconda on Macs when the Intel-only `mkl` is pulled in by
a dependency. [nomkl](https://stackoverflow.com/questions/66224879/what-is-the-nomkl-python-package-used-for)
is a metapackage designed to prevent this, by making it impossible to install
`mkl`, but if your environment is already broken it may not work.
-Or this (which takes forever on my computer and didn't work anyway).
Do *not* use `os.environ['KMP_DUPLICATE_LIB_OK']='True'` or equivalents as this
masks the underlying issue of using Intel packages.
-conda install nomkl
-This error happens with Anaconda on Macs, and
-[nomkl](https://stackoverflow.com/questions/66224879/what-is-the-nomkl-python-package-used-for)
-is supposed to fix the issue (it isn't a module but a fix of some
-sort). [There's more
-suggestions](https://stackoverflow.com/questions/53014306/error-15-initializing-libiomp5-dylib-but-found-libiomp5-dylib-already-initial),
-like uninstalling tensorflow and reinstalling. I haven't tried them.
-Since I switched to miniforge I haven't seen the error.
### Not enough memory.
@ -226,4 +211,8 @@ What? Intel? On an Apple Silicon?
The processor must support the Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) instructions.
The processor must support the Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
-This was actually the issue that I couldn't solve until I switched to miniforge.
This is due to the Intel `mkl` package getting picked up when you try to install
something that depends on it -- Rosetta can translate some Intel instructions but
not the specialized ones here. To avoid this, make sure to use the environment
variable `CONDA_SUBDIR=osx-arm64`, which restricts the Conda environment to only
use ARM packages, and use `nomkl` as described above.
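A quick, purely illustrative way to confirm that no Intel-only packages slipped in is to check the interpreter architecture and numpy's build info:

```python
# Illustrative check (not part of the repo): a native arm64 interpreter and a
# non-MKL numpy build are what you want after installing with CONDA_SUBDIR=osx-arm64.
import platform
import numpy as np

print(platform.machine())   # 'arm64' is good; 'x86_64' means you are running under Rosetta
np.show_config()            # the BLAS/LAPACK info printed here should not mention mkl
```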


@ -740,7 +740,7 @@ and [Tesseract Cat](https://github.com/TesseractCat)
Original portions of the software are Copyright (c) 2020 Lincoln D. Stein (https://github.com/lstein)
# Further Reading
Please see the original README for more information on this software
and underlying algorithm, located in the file [README-CompViz.md](README-CompViz.md).


@ -1,34 +1,58 @@
name: ldm
channels:
-  - apple
-  - conda-forge
  - pytorch-nightly
-  - defaults
  - conda-forge
dependencies:
-  - python=3.10.4
-  - pip=22.1.2
  - python==3.9.13
  - pip==22.2.2
  # pytorch-nightly, left unpinned
  - pytorch
  - torchmetrics
  - torchvision
-  - numpy=1.23.1
  # I suggest to keep the other deps sorted for convenience.
  # If you wish to upgrade to 3.10, try to run this:
  #
  # ```shell
  # CONDA_CMD=conda
  # sed -E 's/python==3.9.13/python==3.10.5/;s/ldm/ldm-3.10/;21,99s/- ([^=]+)==.+/- \1/' environment-mac.yaml > /tmp/environment-mac-updated.yml
  # CONDA_SUBDIR=osx-arm64 $CONDA_CMD env create -f /tmp/environment-mac-updated.yml && $CONDA_CMD list -n ldm-3.10 | awk ' {print " - " $1 "==" $2;} '
  # ```
  #
  # Unfortunately, as of 2022-08-31, this fails at the pip stage.
  - albumentations==1.2.1
  - coloredlogs==15.0.1
  - einops==0.4.1
  - grpcio==1.46.4
  - humanfriendly
  - imageio-ffmpeg==0.4.7
  - imageio==2.21.2
  - imgaug==0.4.0
  - kornia==0.6.7
  - mpmath==1.2.1
  - nomkl
  - numpy==1.23.2
  - omegaconf==2.1.1
  - onnx==1.12.0
  - onnxruntime==1.12.1
  - opencv==4.6.0
  - pudb==2022.1
  - pytorch-lightning==1.6.5
  - scipy==1.9.1
  - streamlit==1.12.2
  - sympy==1.10.1
  - tensorboard==2.9.0
  - transformers==4.21.2
  - pip:
-    - albumentations==0.4.6
-    - opencv-python==4.6.0.66
-    - pudb==2019.2
-    - imageio==2.9.0
-    - imageio-ffmpeg==0.4.2
-    - pytorch-lightning==1.4.2
-    - omegaconf==2.1.1
-    - test-tube>=0.7.5
-    - streamlit==1.12.0
-    - pillow==9.2.0
-    - einops==0.3.0
-    - torch-fidelity==0.3.0
-    - transformers==4.19.2
-    - torchmetrics==0.6.0
-    - kornia==0.6.0
    - invisible-watermark
    - test-tube
    - tokenizers
    - torch-fidelity
    - -e git+https://github.com/huggingface/diffusers.git@v0.2.4#egg=diffusers
-    - -e git+https://github.com/openai/CLIP.git@main#egg=clip
    - -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers
-    - -e git+https://github.com/lstein/k-diffusion.git@master#egg=k-diffusion
    - -e git+https://github.com/openai/CLIP.git@main#egg=clip
    - -e git+https://github.com/Birch-san/k-diffusion.git@mps#egg=k_diffusion
    - -e .
variables:
  PYTORCH_ENABLE_MPS_FALLBACK: 1
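After creating the environment, a short sanity check (an illustrative snippet, not part of the repository) confirms that the nightly build exposes MPS and that the fallback variable above is active:

```python
# Illustrative sanity check: confirm the MPS backend is usable and that the
# CPU-fallback variable from the `variables:` section is exported.
import os
import torch

print(torch.__version__)                              # should be a recent nightly build
print(torch.backends.mps.is_built())                  # PyTorch compiled with MPS support
print(torch.backends.mps.is_available())              # macOS 12.3+ with a supported GPU
print(os.environ.get("PYTORCH_ENABLE_MPS_FALLBACK"))  # '1' when the variable is set
```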


@ -8,4 +8,10 @@ def choose_torch_device() -> str:
return 'mps'
return 'cpu'
def choose_autocast_device(device) -> str:
'''Returns an autocast compatible device from a torch device'''
device_type = device.type # this returns 'mps' on M1
# autocast only supports cuda or cpu
if device_type not in ('cuda','cpu'):
return 'cpu'
return device_type
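For context, a small sketch of how the new helper pairs with `torch.autocast`; the import path is assumed from the file shown above:

```python
# Sketch only: autocast currently understands 'cuda' and 'cpu', so on Apple
# Silicon choose_autocast_device() deliberately falls back to 'cpu'.
import torch
from ldm.dream.devices import choose_torch_device, choose_autocast_device  # assumed path

device = torch.device(choose_torch_device())      # 'mps' on an M1/M2 Mac
autocast_device = choose_autocast_device(device)  # 'cpu' here, since autocast lacks MPS support

with torch.autocast(autocast_device):
    x = torch.ones(4, device=device) * 2.0
```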


@ -8,11 +8,10 @@ class InitImageResizer():
def resize(self,width=None,height=None) -> Image:
"""
-Return a copy of the image resized to width x height.
-The aspect ratio is maintained, with any excess space
-filled using black borders (i.e. letterboxed). If
-neither width nor height are provided, then returns
-a copy of the original image. If one or the other is
Return a copy of the image resized to fit within
a box width x height. The aspect ratio is
maintained. If neither width nor height are provided,
then returns a copy of the original image. If one or the other is
provided, then the other will be calculated from the
aspect ratio.
@ -21,38 +20,34 @@ class InitImageResizer():
""" """
im = self.image im = self.image
if not(width or height): ar = im.width/float(im.height)
return im.copy()
ar = im.width/im.height
# Infer missing values from aspect ratio # Infer missing values from aspect ratio
if not height: # height missing if not(width or height): # both missing
width = im.width
height = im.height
elif not height: # height missing
height = int(width/ar) height = int(width/ar)
if not width: # width missing elif not width: # width missing
width = int(height*ar) width = int(height*ar)
# rw and rh are the resizing width and height for the image # rw and rh are the resizing width and height for the image
# they maintain the aspect ratio, but may not completelyl fill up # they maintain the aspect ratio, but may not completelyl fill up
# the requested destination size # the requested destination size
(rw,rh) = (width,int(width/ar)) if im.width>=im.height else (int(height*ar),width) (rw,rh) = (width,int(width/ar)) if im.width>=im.height else (int(height*ar),height)
#round everything to multiples of 64 #round everything to multiples of 64
width,height,rw,rh = map( width,height,rw,rh = map(
lambda x: x-x%64, (width,height,rw,rh) lambda x: x-x%64, (width,height,rw,rh)
) )
# resize the original image so that it fits inside the dest # no resize necessary, but return a copy
if im.width == width and im.height == height:
return im.copy()
# otherwise resize the original image so that it fits inside the bounding box
resized_image = self.image.resize((rw,rh),resample=Image.Resampling.LANCZOS) resized_image = self.image.resize((rw,rh),resample=Image.Resampling.LANCZOS)
return resized_image
# create new destination image of specified dimensions
# and paste the resized image into it centered appropriately
new_image = Image.new('RGB',(width,height))
new_image.paste(resized_image,((width-rw)//2,(height-rh)//2))
print(f'>> Resized image size to {width}x{height}')
return new_image
def make_grid(image_list, rows=None, cols=None): def make_grid(image_list, rows=None, cols=None):
image_cnt = len(image_list) image_cnt = len(image_list)
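To make the new behaviour concrete, a standalone sketch (the helper name is made up; the arithmetic mirrors the code above) of the dimensions `resize()` now produces:

```python
# Standalone sketch of the resize() dimension logic above: fit to the aspect
# ratio, then truncate everything to a multiple of 64.
def sketch_resize_dims(img_w, img_h, width=None, height=None):
    ar = img_w / float(img_h)
    if not (width or height):        # both missing: keep the original size
        width, height = img_w, img_h
    elif not height:                 # height missing: derive it from the width
        height = int(width / ar)
    elif not width:                  # width missing: derive it from the height
        width = int(height * ar)
    rw, rh = (width, int(width / ar)) if img_w >= img_h else (int(height * ar), height)
    return tuple(x - x % 64 for x in (rw, rh))

print(sketch_resize_dims(1000, 500, width=512))  # -> (512, 256), no letterboxing
```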


@ -61,6 +61,8 @@ class PromptFormatter:
switches.append(f'-A{opt.sampler_name or t2i.sampler_name}')
if opt.init_img:
switches.append(f'-I{opt.init_img}')
if opt.fit:
switches.append(f'--fit')
if opt.strength and opt.init_img is not None:
switches.append(f'-f{opt.strength or t2i.strength}')
if opt.gfpgan_strength:


@ -70,6 +70,7 @@ class DreamServer(BaseHTTPRequestHandler):
steps = int(post_data['steps'])
width = int(post_data['width'])
height = int(post_data['height'])
fit = 'fit' in post_data
cfgscale = float(post_data['cfgscale'])
sampler_name = post_data['sampler']
gfpgan_strength = float(post_data['gfpgan_strength']) if gfpgan_model_exists else 0
@ -80,7 +81,7 @@ class DreamServer(BaseHTTPRequestHandler):
seed = self.model.seed if int(post_data['seed']) == -1 else int(post_data['seed'])
self.canceled.clear()
print(f">> Request to generate with prompt: {prompt}")
# In order to handle upscaled images, the PngWriter needs to maintain state
# across images generated by each call to prompt2img(), so we define it in
# the outer scope of image_done()
@ -177,10 +178,13 @@ class DreamServer(BaseHTTPRequestHandler):
init_img = "./img2img-tmp.png", init_img = "./img2img-tmp.png",
strength = strength, strength = strength,
iterations = iterations, iterations = iterations,
cfg_scale = cfgscale, cfg_scale = cfgscale,
seed = seed, seed = seed,
steps = steps, steps = steps,
sampler_name = sampler_name, sampler_name = sampler_name,
width = width,
height = height,
fit = fit,
gfpgan_strength=gfpgan_strength, gfpgan_strength=gfpgan_strength,
upscale = upscale, upscale = upscale,
step_callback=image_progress, step_callback=image_progress,
@ -192,8 +196,6 @@ class DreamServer(BaseHTTPRequestHandler):
print(f"Canceled.") print(f"Canceled.")
return return
print(f"Prompt generated!")
class ThreadingDreamServer(ThreadingHTTPServer): class ThreadingDreamServer(ThreadingHTTPServer):
def __init__(self, server_address): def __init__(self, server_address):
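End to end, the new field can be exercised with a request like the following hedged sketch; the URL, port and JSON transport are assumptions about how DreamServer is typically deployed, while the field names come from the `post_data[...]` lookups above:

```python
# Hedged sketch: POST a generation request that sets the new "fit" flag.
# Server address and JSON encoding are assumptions; field names match the handler above.
import json
from urllib import request

payload = {
    "prompt": "a photograph of an astronaut riding a horse",
    "iterations": 1, "steps": 50, "cfgscale": 7.5, "sampler": "k_lms",
    "width": 512, "height": 512, "seed": -1,
    "initimg": None,            # base64-encoded image data would go here for img2img
    "strength": 0.75,
    "fit": True,                # present in post_data -> init image resized to fit width x height
    "gfpgan_strength": 0.0, "upscale_level": "", "upscale_strength": 0.75,
    "progress_images": False,
}
req = request.Request(
    "http://localhost:9090/",   # assumed default address of the dream web server
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# request.urlopen(req) would stream back progress and result metadata
```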


@ -14,7 +14,7 @@ model_path = os.path.join(opt.gfpgan_dir, opt.gfpgan_model_path)
gfpgan_model_exists = os.path.isfile(model_path)
def _run_gfpgan(image, strength, prompt, seed, upsampler_scale=4):
print(f'>> GFPGAN - Restoring Faces: {prompt} : seed:{seed}')
gfpgan = None
with warnings.catch_warnings():
warnings.filterwarnings('ignore', category=DeprecationWarning)
@ -41,12 +41,12 @@ def _run_gfpgan(image, strength, prompt, seed, upsampler_scale=4):
except Exception:
import traceback
print('>> Error loading GFPGAN:', file=sys.stderr)
print(traceback.format_exc(), file=sys.stderr)
if gfpgan is None:
print(
f'>> GFPGAN not initialized, it must be loaded via the --gfpgan argument'
)
return image
@ -129,7 +129,7 @@ def _load_gfpgan_bg_upsampler(bg_upsampler, upsampler_scale, bg_tile=400):
def real_esrgan_upscale(image, strength, upsampler_scale, prompt, seed):
print(
f'>> Real-ESRGAN Upscaling: {prompt} : seed:{seed} : scale:{upsampler_scale}x'
)
with warnings.catch_warnings():
@ -143,7 +143,7 @@ def real_esrgan_upscale(image, strength, upsampler_scale, prompt, seed):
except Exception:
import traceback
print('>> Error loading Real-ESRGAN:', file=sys.stderr)
print(traceback.format_exc(), file=sys.stderr)
output, img_mode = upsampler.enhance(


@ -8,6 +8,7 @@ import torch
import numpy as np
import random
import os
import traceback
from omegaconf import OmegaConf
from PIL import Image
from tqdm import tqdm, trange
@ -22,12 +23,13 @@ import time
import re
import sys
from ldm.util import instantiate_from_config
from ldm.models.diffusion.ddim import DDIMSampler
from ldm.models.diffusion.plms import PLMSSampler
from ldm.models.diffusion.ksampler import KSampler
from ldm.dream.pngwriter import PngWriter
-from ldm.dream.devices import choose_torch_device
from ldm.dream.image_util import InitImageResizer
from ldm.dream.devices import choose_autocast_device, choose_torch_device
"""Simplified text to image API for stable diffusion/latent diffusion
@ -113,51 +115,59 @@ class T2I:
""" """
def __init__( def __init__(
self, self,
iterations=1, iterations=1,
steps=50, steps=50,
seed=None, seed=None,
cfg_scale=7.5, cfg_scale=7.5,
weights='models/ldm/stable-diffusion-v1/model.ckpt', weights='models/ldm/stable-diffusion-v1/model.ckpt',
config='configs/stable-diffusion/v1-inference.yaml', config='configs/stable-diffusion/v1-inference.yaml',
grid=False, grid=False,
width=512, width=512,
height=512, height=512,
sampler_name='k_lms', sampler_name='k_lms',
latent_channels=4, latent_channels=4,
downsampling_factor=8, downsampling_factor=8,
ddim_eta=0.0, # deterministic ddim_eta=0.0, # deterministic
precision='autocast', precision='autocast',
full_precision=False, full_precision=False,
strength=0.75, # default in scripts/img2img.py strength=0.75, # default in scripts/img2img.py
embedding_path=None, embedding_path=None,
# just to keep track of this parameter when regenerating prompt device_type = 'cuda',
latent_diffusion_weights=False, # just to keep track of this parameter when regenerating prompt
device='cuda', # needs to be replaced when new configuration system implemented.
latent_diffusion_weights=False,
): ):
self.iterations = iterations self.iterations = iterations
self.width = width self.width = width
self.height = height self.height = height
self.steps = steps self.steps = steps
self.cfg_scale = cfg_scale self.cfg_scale = cfg_scale
self.weights = weights self.weights = weights
self.config = config self.config = config
self.sampler_name = sampler_name self.sampler_name = sampler_name
self.latent_channels = latent_channels self.latent_channels = latent_channels
self.downsampling_factor = downsampling_factor self.downsampling_factor = downsampling_factor
self.grid = grid self.grid = grid
self.ddim_eta = ddim_eta self.ddim_eta = ddim_eta
self.precision = precision self.precision = precision
self.full_precision = full_precision self.full_precision = full_precision
self.strength = strength self.strength = strength
self.embedding_path = embedding_path self.embedding_path = embedding_path
self.model = None # empty for now self.device_type = device_type
self.sampler = None self.model = None # empty for now
self.sampler = None
self.device = None
self.latent_diffusion_weights = latent_diffusion_weights self.latent_diffusion_weights = latent_diffusion_weights
self.device = device
if device_type == 'cuda' and not torch.cuda.is_available():
device_type = choose_torch_device()
print(">> cuda not available, using device", device_type)
self.device = torch.device(device_type)
# for VRAM usage statistics # for VRAM usage statistics
self.session_peakmem = torch.cuda.max_memory_allocated() if self.device == 'cuda' else None device_type = choose_torch_device()
self.session_peakmem = torch.cuda.max_memory_allocated() if device_type == 'cuda' else None
if seed is None: if seed is None:
self.seed = self._new_seed() self.seed = self._new_seed()
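Because of that fallback, constructing a `T2I` with the default `device_type` still works on machines without CUDA; a brief illustrative example (the module path is assumed from this file):

```python
# Illustrative: on Apple Silicon there is no CUDA, so the constructor above
# resolves device_type via choose_torch_device() instead.
from ldm.simplet2i import T2I  # assumed import path for this file

t2i = T2I(full_precision=True)  # device_type defaults to 'cuda'...
print(t2i.device)               # ...but ends up as device(type='mps') on an M1/M2 Mac
```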
@ -194,29 +204,29 @@ class T2I:
return self.prompt2png(prompt, outdir, **kwargs)
def prompt2image(
self,
# these are common
prompt,
iterations = None,
steps = None,
seed = None,
cfg_scale = None,
ddim_eta = None,
skip_normalize = False,
image_callback = None,
step_callback = None,
width = None,
height = None,
# these are specific to img2img
init_img = None,
fit = False,
strength = None,
gfpgan_strength = 0,
save_original = False,
upscale = None,
-variants=None,
sampler_name = None,
log_tokenization = False,
**args,
): # eat up additional cruft
""" """
ldm.prompt2image() is the common entry point for txt2img() and img2img() ldm.prompt2image() is the common entry point for txt2img() and img2img()
@ -232,7 +242,6 @@ class T2I:
strength // strength for noising/unnoising init_img. 0.0 preserves image exactly, 1.0 replaces it completely
gfpgan_strength // strength for GFPGAN. 0.0 preserves image exactly, 1.0 replaces it completely
ddim_eta // image randomness (eta=0.0 means the same seed always produces the same image)
-variants // if >0, the 1st generated image will be passed back to img2img to generate the requested number of variants
step_callback // a function or method that will be called each step
image_callback // a function or method that will be called each time an image is generated
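A hedged usage sketch of the img2img path with the new `fit` flag (paths and prompt are placeholders; the keyword names come from the signature above):

```python
# Hedged sketch of prompt2image() with the new "fit" option for img2img.
# Assumes `t2i` is a loaded T2I instance (see the earlier sketch).
results = t2i.prompt2image(
    prompt='a photograph of an astronaut riding a horse',
    init_img='./init.png',   # placeholder path
    fit=True,                # resize the init image to fit inside width x height
    width=512,
    height=512,
    strength=0.75,
    steps=50,
)
for image, seed in results:  # assumes the documented list of (image, seed) tuples
    image.save(f'astronaut-{seed}.png')
```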
@ -251,14 +260,15 @@ class T2I:
to create the requested output directory, select a unique informative name for each image, and
write the prompt into the PNG metadata.
"""
# TODO: convert this into a getattr() loop
steps = steps or self.steps
seed = seed or self.seed
width = width or self.width
height = height or self.height
cfg_scale = cfg_scale or self.cfg_scale
ddim_eta = ddim_eta or self.ddim_eta
iterations = iterations or self.iterations
strength = strength or self.strength
self.log_tokenization = log_tokenization
model = (
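The TODO above suggests replacing the hand-written fall-backs with a loop; one possible, untested shape (not code from the repository):

```python
# Illustrative sketch of the getattr() loop the TODO hints at: resolve each
# argument against the instance default when the caller passed None.
def resolve_defaults(t2i, **overrides):
    names = ('steps', 'seed', 'width', 'height',
             'cfg_scale', 'ddim_eta', 'iterations', 'strength')
    return {name: overrides[name] if overrides.get(name) is not None else getattr(t2i, name)
            for name in names}

# resolve_defaults(t2i, steps=30) -> {'steps': 30, 'seed': t2i.seed, ...}
```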
@ -269,9 +279,7 @@ class T2I:
0.0 <= strength <= 1.0
), 'can only work with strength in [0.0, 1.0]'
-if not(width == self.width and height == self.height):
width, height, _ = self._resolution_check(width, height, log=True)
scope = autocast if self.precision == 'autocast' else nullcontext
if sampler_name and (sampler_name != self.sampler_name):
@ -279,7 +287,8 @@ class T2I:
self._set_sampler()
tic = time.time()
-torch.cuda.torch.cuda.reset_peak_memory_stats()
if torch.cuda.is_available():
torch.cuda.reset_peak_memory_stats()
results = list()
try:
@ -295,6 +304,7 @@ class T2I:
init_img=init_img,
width=width,
height=height,
fit=fit,
strength=strength,
callback=step_callback,
)
@ -311,7 +321,8 @@ class T2I:
callback=step_callback,
)
-with scope(self.device.type), self.model.ema_scope():
device_type = choose_autocast_device(self.device)
with scope(device_type), self.model.ema_scope():
for n in trange(iterations, desc='Generating'):
seed_everything(seed)
image = next(images_iterator)
@ -345,7 +356,7 @@ class T2I:
)
except Exception as e:
print(
f'>> Error running RealESRGAN - Your image was not upscaled.\n{e}'
)
if image_callback is not None:
if save_original:
@ -358,19 +369,19 @@ class T2I:
except KeyboardInterrupt:
print('*interrupted*')
print(
'>> Partial results will be returned; if --grid was requested, nothing will be returned.'
)
except RuntimeError as e:
-print(str(e))
print(traceback.format_exc(), file=sys.stderr)
print('>> Are you sure your system has an adequate NVIDIA GPU?')
toc = time.time()
print('>> Usage stats:')
print(
f'>> {len(results)} image(s) generated in', '%4.2fs' % (toc - tic)
)
print(
f'>> Max VRAM used for this generation:',
'%4.2fG' % (torch.cuda.max_memory_allocated() / 1e9),
)
@ -379,7 +390,7 @@ class T2I:
self.session_peakmem, torch.cuda.max_memory_allocated()
)
print(
f'>> Max VRAM used since script start: ',
'%4.2fG' % (self.session_peakmem / 1e9),
)
return results
@ -425,18 +436,19 @@ class T2I:
@torch.no_grad()
def _img2img(
self,
prompt,
precision_scope,
steps,
cfg_scale,
ddim_eta,
skip_normalize,
init_img,
width,
height,
fit,
strength,
callback, # Currently not implemented for img2img
):
"""
An infinite iterator of images from the prompt and the initial image
@ -445,13 +457,13 @@ class T2I:
# PLMS sampler not supported yet, so ignore previous sampler
if self.sampler_name != 'ddim':
print(
f">> sampler '{self.sampler_name}' is not yet supported. Using DDIM sampler"
)
sampler = DDIMSampler(self.model, device=self.device)
else:
sampler = self.sampler
-init_image = self._load_img(init_img, width, height).to(self.device)
init_image = self._load_img(init_img, width, height,fit).to(self.device)
with precision_scope(self.device.type):
init_latent = self.model.get_first_stage_encoding(
self.model.encode_first_stage(init_image)
@ -462,7 +474,6 @@ class T2I:
)
t_enc = int(strength * steps)
-# print(f"target t_enc is {t_enc} steps")
while True:
uc, c = self._get_uc_and_c(prompt, skip_normalize)
@ -513,7 +524,7 @@ class T2I:
x_samples = torch.clamp((x_samples + 1.0) / 2.0, min=0.0, max=1.0)
if len(x_samples) != 1:
raise Exception(
f'>> expected to get a single image, but got {len(x_samples)}')
x_sample = 255.0 * rearrange(
x_samples[0].cpu().numpy(), 'c h w -> h w c'
)
@ -523,17 +534,12 @@ class T2I:
self.seed = random.randrange(0, np.iinfo(np.uint32).max)
return self.seed
-def _get_device(self):
-device_type = choose_torch_device()
-return torch.device(device_type)
def load_model(self):
"""Load and initialize the model from configuration variables passed at object creation time"""
if self.model is None:
seed_everything(self.seed)
try:
config = OmegaConf.load(self.config)
-self.device = self._get_device()
model = self._load_model_from_config(config, self.weights)
if self.embedding_path is not None:
model.embedding_manager.load(
@ -542,12 +548,10 @@ class T2I:
self.model = model.to(self.device)
# model.to doesn't change the cond_stage_model.device used to move the tokenizer output, so set it here
self.model.cond_stage_model.device = self.device
-except AttributeError:
except AttributeError as e:
-import traceback
print(f'>> Error loading model. {str(e)}', file=sys.stderr)
-print(
-'Error loading model. Only the CUDA backend is supported', file=sys.stderr)
print(traceback.format_exc(), file=sys.stderr)
-raise SystemExit
raise SystemExit from e
self._set_sampler()
@ -582,7 +586,7 @@ class T2I:
print(msg)
def _load_model_from_config(self, config, ckpt):
print(f'>> Loading model from {ckpt}')
pl_sd = torch.load(ckpt, map_location='cpu')
# if "global_step" in pl_sd:
# print(f"Global Step: {pl_sd['global_step']}")
@ -597,41 +601,63 @@ class T2I:
)
else:
print(
'>> Using half precision math. Call with --full_precision to use more accurate but VRAM-intensive full precision.'
)
model.half()
return model
-def _load_img(self, path, width, height):
-print(f'image path = {path}, cwd = {os.getcwd()}')
def _load_img(self, path, width, height, fit=False):
with Image.open(path) as img:
image = img.convert('RGB')
print(
f'>> loaded input image of size {image.width}x{image.height} from {path}'
)
-from ldm.dream.image_util import InitImageResizer
-if width == self.width and height == self.height:
-new_image_width, new_image_height, resize_needed = self._resolution_check(
-image.width, image.height)
-else:
-if height == self.height:
-new_image_width, new_image_height, resize_needed = self._resolution_check(
-width, image.height)
-if width == self.width:
-new_image_width, new_image_height, resize_needed = self._resolution_check(
-image.width, height)
-else:
-image = InitImageResizer(image).resize(width, height)
-resize_needed = False
-if resize_needed:
-image = InitImageResizer(image).resize(
-new_image_width, new_image_height)
# The logic here is:
# 1. If "fit" is true, then the image will be fit into the bounding box defined
# by width and height. It will do this in a way that preserves the init image's
# aspect ratio while preventing letterboxing. This means that if there is
# leftover horizontal space after rescaling the image to fit in the bounding box,
# the generated image's width will be reduced to the rescaled init image's width.
# Similarly for the vertical space.
# 2. Otherwise, if "fit" is false, then the image will be scaled, preserving its
# aspect ratio, to the nearest multiple of 64. Large images may generate an
# unexpected OOM error.
if fit:
image = self._fit_image(image,(width,height))
else:
image = self._squeeze_image(image)
image = np.array(image).astype(np.float32) / 255.0
image = image[None].transpose(0, 3, 1, 2)
image = torch.from_numpy(image)
return 2.0 * image - 1.0
def _squeeze_image(self,image):
x,y,resize_needed = self._resolution_check(image.width,image.height)
if resize_needed:
return InitImageResizer(image).resize(x,y)
return image
def _fit_image(self,image,max_dimensions):
w,h = max_dimensions
print(
f'>> image will be resized to fit inside a box {w}x{h} in size.'
)
if image.width > image.height:
h = None # by setting h to none, we tell InitImageResizer to fit into the width and calculate height
elif image.height > image.width:
w = None # ditto for w
else:
pass
image = InitImageResizer(image).resize(w,h) # note that InitImageResizer does the multiple of 64 truncation internally
print(
f'>> after adjusting image dimensions to be multiples of 64, init image is {image.width}x{image.height}'
)
return image
# TO DO: Move this and related weighted subprompt code into its own module.
def _split_weighted_subprompts(text, skip_normalize=False):
"""
grabs all text up to the first occurrence of ':'
@ -701,7 +727,7 @@ class T2I:
f'>> Provided width and height must be multiples of 64. Auto-resizing to {w}x{h}'
)
height = h
width = w
resize_needed = True
if (width * height) > (self.width * self.height):


@ -9,6 +9,7 @@ import sys
import copy
import warnings
import time
from ldm.dream.devices import choose_torch_device
import ldm.dream.readline
from ldm.dream.pngwriter import PngWriter, PromptFormatter
from ldm.dream.server import DreamServer, ThreadingDreamServer
@ -60,7 +61,7 @@ def main():
# this is solely for recreating the prompt
latent_diffusion_weights=opt.laion400m,
embedding_path=opt.embedding_path,
-device=opt.device,
device_type=opt.device
)
# make sure the output directory exists
@ -88,7 +89,7 @@ def main():
tic = time.time()
t2i.load_model()
print(
f'>> model loaded in', '%4.2fs' % (time.time() - tic)
)
if not infile:
@ -347,6 +348,8 @@ def create_argv_parser():
dest='full_precision',
action='store_true',
help='Use slower full precision math for calculations',
# MPS only functions with full precision, see https://github.com/lstein/stable-diffusion/issues/237
default=choose_torch_device() == 'mps',
)
parser.add_argument(
'-g',
@ -376,13 +379,6 @@ def create_argv_parser():
type=str,
help='Path to a pre-trained embedding manager checkpoint - can only be set on command line',
)
-parser.add_argument(
-'--device',
-'-d',
-type=str,
-default='cuda',
-help='Device to run Stable Diffusion on. Defaults to cuda `torch.cuda.current_device()` if avalible',
-)
parser.add_argument(
'--prompt_as_dir',
'-p',
@ -426,6 +422,13 @@ def create_argv_parser():
default='model',
help='Indicates the Stable Diffusion model to use.',
)
parser.add_argument(
'--device',
'-d',
type=str,
default='cuda',
help="device to run stable diffusion on. defaults to cuda `torch.cuda.current_device()` if available"
)
return parser
@ -483,6 +486,13 @@ def create_cmd_parser():
type=str,
help='Path to input image for img2img mode (supersedes width and height)',
)
parser.add_argument(
'-T',
'-fit',
'--fit',
action='store_true',
help='If specified, will resize the input image to fit within the dimensions of width x height (512x512 default)',
)
parser.add_argument(
'-f',
'--strength',


@ -200,7 +200,7 @@ def main():
config = OmegaConf.load(f"{opt.config}")
model = load_model_from_config(config, f"{opt.ckpt}")
-device = choose_torch_device()
device = torch.device(choose_torch_device())
model = model.to(device)
if opt.plms:
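A small illustration of why the string from `choose_torch_device()` is wrapped in `torch.device` (the helper's return type is shown in the devices file above):

```python
# Illustration only: choose_torch_device() returns a plain string such as 'mps',
# but downstream code expects a torch.device with a .type attribute.
import torch

device = torch.device('mps')  # what torch.device(choose_torch_device()) produces on Apple Silicon
print(device.type)            # 'mps' -- usable by code that branches on device.type
```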


@ -8,13 +8,15 @@
margin-top: 20vh;
margin-left: auto;
margin-right: auto;
-max-width: 800px;
max-width: 1024px;
text-align: center;
}
fieldset {
border: none;
}
div {
padding: 10px 10px 10px 10px;
}
#fieldset-search {
display: flex;
}
@ -78,3 +80,18 @@ label {
cursor: pointer;
color: red;
}
#txt2img {
background-color: #DCDCDC;
}
#img2img {
background-color: #F5F5F5;
}
#gfpgan {
background-color: #DCDCDC;
}
#progress-section {
background-color: #F5F5F5;
}
#about {
background-color: #DCDCDC;
}


@ -14,78 +14,84 @@
<h2 id="header">Stable Diffusion Dream Server</h2>
<form id="generate-form" method="post" action="#">
<div id="txt2img">
<fieldset id="fieldset-search">
<input type="text" id="prompt" name="prompt">
<input type="submit" id="submit" value="Generate">
</fieldset>
<fieldset id="fieldset-config">
<label for="iterations">Images to generate:</label>
<input value="1" type="number" id="iterations" name="iterations" size="4">
<label for="steps">Steps:</label>
<input value="50" type="number" id="steps" name="steps">
<label for="cfgscale">Cfg Scale:</label>
<input value="7.5" type="number" id="cfgscale" name="cfgscale" step="any">
<label for="sampler">Sampler:</label>
<select id="sampler" name="sampler" value="k_lms">
<option value="ddim">DDIM</option>
<option value="plms">PLMS</option>
<option value="k_lms" selected>KLMS</option>
<option value="k_dpm_2">KDPM_2</option>
<option value="k_dpm_2_a">KDPM_2A</option>
<option value="k_euler">KEULER</option>
<option value="k_euler_a">KEULER_A</option>
<option value="k_heun">KHEUN</option>
</select>
<br>
<label title="Set to multiple of 64" for="width">Width:</label>
<select id="width" name="width" value="512">
<option value="64">64</option> <option value="128">128</option>
<option value="192">192</option> <option value="256">256</option>
<option value="320">320</option> <option value="384">384</option>
<option value="448">448</option> <option value="512" selected>512</option>
<option value="576">576</option> <option value="640">640</option>
<option value="704">704</option> <option value="768">768</option>
<option value="832">832</option> <option value="896">896</option>
<option value="960">960</option> <option value="1024">1024</option>
</select>
<label title="Set to multiple of 64" for="height">Height:</label>
<select id="height" name="height" value="512">
<option value="64">64</option> <option value="128">128</option>
<option value="192">192</option> <option value="256">256</option>
<option value="320">320</option> <option value="384">384</option>
<option value="448">448</option> <option value="512" selected>512</option>
<option value="576">576</option> <option value="640">640</option>
<option value="704">704</option> <option value="768">768</option>
<option value="832">832</option> <option value="896">896</option>
<option value="960">960</option> <option value="1024">1024</option>
</select>
<label title="Set to -1 for random seed" for="seed">Seed:</label>
<input value="-1" type="number" id="seed" name="seed">
<button type="button" id="reset-seed">&olarr;</button>
<input type="checkbox" name="progress_images" id="progress_images">
<label for="progress_images">Display in-progress images (slows down generation):</label>
<button type="button" id="reset-all">Reset to Defaults</button>
</div>
<div id="img2img">
<label title="Upload an image to use img2img" for="initimg">Initial image:</label>
<input type="file" id="initimg" name="initimg" accept=".jpg, .jpeg, .png">
<br>
<label for="strength">Img2Img Strength:</label>
<input value="0.75" type="number" id="strength" name="strength" step="0.01" min="0" max="1">
<input type="checkbox" id="fit" name="fit" checked>
<label title="Rescale image to fit within requested width and height" for="fit">Fit to width/height:</label>
</div>
<div id="gfpgan">
<label title="Strength of the gfpgan (face fixing) algorithm." for="gfpgan_strength">GPFGAN Strength (0 to disable):</label>
<input value="0.8" min="0" max="1" type="number" id="gfpgan_strength" name="gfpgan_strength" step="0.05">
<label title="Upscaling to perform using ESRGAN." for="upscale_level">Upscaling Level</label>
<select id="upscale_level" name="upscale_level" value="">
<option value="" selected>None</option>
<option value="2">2x</option>
<option value="4">4x</option>
</select>
<label title="Strength of the esrgan (upscaling) algorithm." for="upscale_strength">Upscale Strength:</label>
<input value="0.75" min="0" max="1" type="number" id="upscale_strength" name="upscale_strength" step="0.05">
</div>
</fieldset>
</form>
<div id="about">For news and support for this web service, visit our <a href="http://github.com/lstein/stable-diffusion">GitHub site</a></div>
<br>
<div id="progress-section">
<progress id="progress-bar" value="0" max="1"></progress>
<span id="cancel-button" title="Cancel">&#10006;</span>