Merge branch 'lstein:main' into main

James Reynolds 2022-09-01 19:41:14 -06:00 committed by GitHub
commit 2b7f32502c
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
13 changed files with 391 additions and 314 deletions


@ -12,8 +12,7 @@ issue](https://github.com/CompVis/stable-diffusion/issues/25), and generally on
You have to have macOS 12.3 Monterey or later. Anything earlier than that won't work.
BTW, I haven't tested any of this on Intel Macs but I have read that one person
got it to work.
Tested on a 2022 MacBook Air (M2) with 10-core GPU and 24 GB unified memory.
How to:
@ -22,24 +21,23 @@ git clone https://github.com/lstein/stable-diffusion.git
cd stable-diffusion
mkdir -p models/ldm/stable-diffusion-v1/
ln -s /path/to/ckpt/sd-v1-1.ckpt models/ldm/stable-diffusion-v1/model.ckpt
PATH_TO_CKPT="$HOME/Documents/stable-diffusion-v-1-4-original" # or wherever yours is.
ln -s "$PATH_TO_CKPT/sd-v1-4.ckpt" models/ldm/stable-diffusion-v1/model.ckpt
conda env create -f environment-mac.yaml
CONDA_SUBDIR=osx-arm64 conda env create -f environment-mac.yaml
conda activate ldm
python scripts/preload_models.py
python scripts/orig_scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms
python scripts/dream.py --full_precision # half-precision requires autocast and won't work
```
We have not gotten lstein's dream.py to work yet.
After you follow all the instructions and run txt2img.py you might get several errors. Here are the errors I've seen and found solutions for.
After you follow all the instructions and run dream.py you might get several errors. Here are the errors I've seen and found solutions for.
### Is it slow?
Be sure to specify 1 sample and 1 iteration.
python ./scripts/txt2img.py --prompt "ocean" --ddim_steps 5 --n_samples 1 --n_iter 1
python ./scripts/orig_scripts/txt2img.py --prompt "ocean" --ddim_steps 5 --n_samples 1 --n_iter 1
### Doesn't work anymore?
@ -94,10 +92,6 @@ get quick feedback.
python ./scripts/txt2img.py --prompt "ocean" --ddim_steps 5 --n_samples 1 --n_iter 1
### MAC: torch._C' has no attribute '_cuda_resetPeakMemoryStats' #234
We haven't gotten dream.py to work on Mac yet.
### OSError: Can't load tokenizer for 'openai/clip-vit-large-patch14'...
python scripts/preload_models.py
@ -108,7 +102,7 @@ Example error.
```
...
NotImplementedError: The operator 'aten::index.Tensor' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on [https://github.com/pytorch/pytorch/issues/77764](https://github.com/pytorch/pytorch/issues/77764). As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
NotImplementedError: The operator 'aten::_index_put_impl_' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on [https://github.com/pytorch/pytorch/issues/77764](https://github.com/pytorch/pytorch/issues/77764). As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
```
The lstein branch includes this fix in [environment-mac.yaml](https://github.com/lstein/stable-diffusion/blob/main/environment-mac.yaml).
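For scripts launched outside that Conda environment, the same variable could also be set from Python. The following is a minimal sketch, not part of the commit; it assumes the variable should be in place before `torch` is imported, and the helper name `enable_mps_fallback` is hypothetical.

```python
import os


def enable_mps_fallback() -> str:
    """Set PYTORCH_ENABLE_MPS_FALLBACK unless the user already exported it.

    Returns the value that is in effect afterwards.
    """
    # setdefault() leaves an existing value untouched, so an explicit
    # `export PYTORCH_ENABLE_MPS_FALLBACK=0` in the shell still wins.
    return os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")


value = enable_mps_fallback()
print(f"PYTORCH_ENABLE_MPS_FALLBACK={value}")
# import torch  # assumption: import torch only after the variable is set
```

Using `setdefault` rather than a plain assignment is a design choice here: it keeps the fallback opt-out for anyone who deliberately disabled it.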
@ -137,27 +131,18 @@ still working on it.
OMP: Error #15: Initializing libiomp5.dylib, but found libomp.dylib already initialized.
There are several things you can do. First, you could use something
besides Anaconda like miniforge. I read a lot of things online telling
people to use something else, but I am stuck with Anaconda for other
reasons.
You are likely using an Intel package by mistake. Be sure to run conda with
the environment variable `CONDA_SUBDIR=osx-arm64`, like so:
Or you can try this.
`CONDA_SUBDIR=osx-arm64 conda install ...`
export KMP_DUPLICATE_LIB_OK=True
This error happens with Anaconda on Macs when the Intel-only `mkl` is pulled in by
a dependency. [nomkl](https://stackoverflow.com/questions/66224879/what-is-the-nomkl-python-package-used-for)
is a metapackage designed to prevent this, by making it impossible to install
`mkl`, but if your environment is already broken it may not work.
Or this (which takes forever on my computer and didn't work anyway).
conda install nomkl
This error happens with Anaconda on Macs, and
[nomkl](https://stackoverflow.com/questions/66224879/what-is-the-nomkl-python-package-used-for)
is supposed to fix the issue (it isn't a module but a fix of some
sort). [There's more
suggestions](https://stackoverflow.com/questions/53014306/error-15-initializing-libiomp5-dylib-but-found-libiomp5-dylib-already-initial),
like uninstalling tensorflow and reinstalling. I haven't tried them.
Since I switched to miniforge I haven't seen the error.
Do *not* use `os.environ['KMP_DUPLICATE_LIB_OK']='True'` or equivalents as this
masks the underlying issue of using Intel packages.
### Not enough memory.
@ -226,4 +211,8 @@ What? Intel? On an Apple Silicon?
The processor must support the Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) instructions.
The processor must support the Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
This was actually the issue that I couldn't solve until I switched to miniforge.
This is due to the Intel `mkl` package getting picked up when you try to install
something that depends on it. Rosetta can translate some Intel instructions, but
not the specialized ones here. To avoid this, make sure to use the environment
variable `CONDA_SUBDIR=osx-arm64`, which restricts the Conda environment to
ARM packages only, and use `nomkl` as described above.
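One way to confirm that the active interpreter is a native ARM build rather than an Intel one running under Rosetta is to inspect the reported machine type. This is a small sketch, not part of the README; the function name `is_native_arm64` is hypothetical, and the check relies on `platform.machine()` reporting `x86_64` for Intel builds.

```python
import platform
from typing import Optional


def is_native_arm64(machine: Optional[str] = None) -> bool:
    """Return True if the given (or current) machine type is ARM64."""
    if machine is None:
        # Reports e.g. 'arm64' for a native Apple Silicon Python and
        # 'x86_64' for an Intel Python, including one run via Rosetta.
        machine = platform.machine()
    return machine.lower() in ("arm64", "aarch64")


if is_native_arm64():
    print("Native ARM64 Python detected.")
else:
    print("Not an ARM64 Python; check that CONDA_SUBDIR=osx-arm64 was used.")
```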


@ -743,4 +743,4 @@ Original portions of the software are Copyright (c) 2020 Lincoln D. Stein (https
# Further Reading
Please see the original README for more information on this software
and underlying algorithm, located in the file README-CompViz.md.
and underlying algorithm, located in the file [README-CompViz.md](README-CompViz.md).


@ -1,34 +1,58 @@
name: ldm
channels:
- apple
- conda-forge
- pytorch-nightly
- defaults
- conda-forge
dependencies:
- python=3.10.4
- pip=22.1.2
- python==3.9.13
- pip==22.2.2
# pytorch-nightly, left unpinned
- pytorch
- torchmetrics
- torchvision
- numpy=1.23.1
- pip:
- albumentations==0.4.6
- opencv-python==4.6.0.66
- pudb==2019.2
- imageio==2.9.0
- imageio-ffmpeg==0.4.2
- pytorch-lightning==1.4.2
# I suggest keeping the other deps sorted for convenience.
# If you wish to upgrade to 3.10, try to run this:
#
# ```shell
# CONDA_CMD=conda
# sed -E 's/python==3.9.13/python==3.10.5/;s/ldm/ldm-3.10/;21,99s/- ([^=]+)==.+/- \1/' environment-mac.yaml > /tmp/environment-mac-updated.yml
# CONDA_SUBDIR=osx-arm64 $CONDA_CMD env create -f /tmp/environment-mac-updated.yml && $CONDA_CMD list -n ldm-3.10 | awk ' {print " - " $1 "==" $2;} '
# ```
#
# Unfortunately, as of 2022-08-31, this fails at the pip stage.
- albumentations==1.2.1
- coloredlogs==15.0.1
- einops==0.4.1
- grpcio==1.46.4
- humanfriendly
- imageio-ffmpeg==0.4.7
- imageio==2.21.2
- imgaug==0.4.0
- kornia==0.6.7
- mpmath==1.2.1
- nomkl
- numpy==1.23.2
- omegaconf==2.1.1
- test-tube>=0.7.5
- streamlit==1.12.0
- pillow==9.2.0
- einops==0.3.0
- torch-fidelity==0.3.0
- transformers==4.19.2
- torchmetrics==0.6.0
- kornia==0.6.0
- -e git+https://github.com/openai/CLIP.git@main#egg=clip
- onnx==1.12.0
- onnxruntime==1.12.1
- opencv==4.6.0
- pudb==2022.1
- pytorch-lightning==1.6.5
- scipy==1.9.1
- streamlit==1.12.2
- sympy==1.10.1
- tensorboard==2.9.0
- transformers==4.21.2
- pip:
- invisible-watermark
- test-tube
- tokenizers
- torch-fidelity
- -e git+https://github.com/huggingface/diffusers.git@v0.2.4#egg=diffusers
- -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers
- -e git+https://github.com/lstein/k-diffusion.git@master#egg=k-diffusion
- -e git+https://github.com/openai/CLIP.git@main#egg=clip
- -e git+https://github.com/Birch-san/k-diffusion.git@mps#egg=k_diffusion
- -e .
variables:
PYTORCH_ENABLE_MPS_FALLBACK: 1


@ -8,4 +8,10 @@ def choose_torch_device() -> str:
        return 'mps'
    return 'cpu'

def choose_autocast_device(device) -> str:
    '''Returns an autocast compatible device from a torch device'''
    device_type = device.type # this returns 'mps' on M1
    # autocast only supports cuda or cpu
    if device_type not in ('cuda','cpu'):
        return 'cpu'
    return device_type
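The mapping performed by `choose_autocast_device` can be illustrated without installing torch. The sketch below is an assumption-laden stand-in: it takes the device type as a plain string instead of a `torch.device`, and the names `AUTOCAST_DEVICE_TYPES` and `autocast_device_type` are hypothetical.

```python
# Device types the hunk above treats as autocast-compatible.
AUTOCAST_DEVICE_TYPES = ("cuda", "cpu")


def autocast_device_type(device_type: str) -> str:
    """Map any device type string to one that autocast accepts.

    Anything other than 'cuda' or 'cpu' (for example 'mps') falls back
    to 'cpu', mirroring the behaviour of choose_autocast_device.
    """
    if device_type not in AUTOCAST_DEVICE_TYPES:
        return "cpu"
    return device_type


for name in ("cuda", "cpu", "mps"):
    print(name, "->", autocast_device_type(name))
```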


@ -8,11 +8,10 @@ class InitImageResizer():
def resize(self,width=None,height=None) -> Image:
"""
Return a copy of the image resized to width x height.
The aspect ratio is maintained, with any excess space
filled using black borders (i.e. letterboxed). If
neither width nor height are provided, then returns
a copy of the original image. If one or the other is
Return a copy of the image resized to fit within
a box width x height. The aspect ratio is
maintained. If neither width nor height are provided,
then returns a copy of the original image. If one or the other is
provided, then the other will be calculated from the
aspect ratio.
@ -21,38 +20,34 @@ class InitImageResizer():
"""
im = self.image
if not(width or height):
return im.copy()
ar = im.width/im.height
ar = im.width/float(im.height)
# Infer missing values from aspect ratio
if not height: # height missing
if not(width or height): # both missing
width = im.width
height = im.height
elif not height: # height missing
height = int(width/ar)
if not width: # width missing
elif not width: # width missing
width = int(height*ar)
# rw and rh are the resizing width and height for the image
# they maintain the aspect ratio, but may not completely fill up
# the requested destination size
(rw,rh) = (width,int(width/ar)) if im.width>=im.height else (int(height*ar),width)
(rw,rh) = (width,int(width/ar)) if im.width>=im.height else (int(height*ar),height)
#round everything to multiples of 64
width,height,rw,rh = map(
lambda x: x-x%64, (width,height,rw,rh)
)
# resize the original image so that it fits inside the dest
# no resize necessary, but return a copy
if im.width == width and im.height == height:
return im.copy()
# otherwise resize the original image so that it fits inside the bounding box
resized_image = self.image.resize((rw,rh),resample=Image.Resampling.LANCZOS)
# create new destination image of specified dimensions
# and paste the resized image into it centered appropriately
new_image = Image.new('RGB',(width,height))
new_image.paste(resized_image,((width-rw)//2,(height-rh)//2))
print(f'>> Resized image size to {width}x{height}')
return new_image
return resized_image
def make_grid(image_list, rows=None, cols=None):
image_cnt = len(image_list)
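The size arithmetic in the new `resize` can be followed in isolation. This is a sketch under stated assumptions: it works on plain integers rather than a PIL image, it skips the early return for images that already match the target, and the function name `fit_dimensions` is hypothetical.

```python
from typing import Optional, Tuple


def fit_dimensions(
    im_width: int,
    im_height: int,
    width: Optional[int] = None,
    height: Optional[int] = None,
) -> Tuple[int, int]:
    """Return the (width, height) the image would be resized to."""
    ar = im_width / float(im_height)

    # Infer missing values from the aspect ratio.
    if not (width or height):   # both missing: keep the original size
        width, height = im_width, im_height
    elif not height:            # height missing
        height = int(width / ar)
    elif not width:             # width missing
        width = int(height * ar)

    # Landscape and square images are driven by the width,
    # portrait images by the height.
    if im_width >= im_height:
        rw, rh = width, int(width / ar)
    else:
        rw, rh = int(height * ar), height

    # Round down to multiples of 64.
    return rw - rw % 64, rh - rh % 64


print(fit_dimensions(1000, 500, width=512))
print(fit_dimensions(500, 1000, height=512))
print(fit_dimensions(1000, 500))
```

Because the result is truncated to multiples of 64 after scaling, the final aspect ratio can differ slightly from that of the source image.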


@ -61,6 +61,8 @@ class PromptFormatter:
switches.append(f'-A{opt.sampler_name or t2i.sampler_name}')
if opt.init_img:
switches.append(f'-I{opt.init_img}')
if opt.fit:
switches.append(f'--fit')
if opt.strength and opt.init_img is not None:
switches.append(f'-f{opt.strength or t2i.strength}')
if opt.gfpgan_strength:


@ -70,6 +70,7 @@ class DreamServer(BaseHTTPRequestHandler):
steps = int(post_data['steps'])
width = int(post_data['width'])
height = int(post_data['height'])
fit = 'fit' in post_data
cfgscale = float(post_data['cfgscale'])
sampler_name = post_data['sampler']
gfpgan_strength = float(post_data['gfpgan_strength']) if gfpgan_model_exists else 0
@ -80,7 +81,7 @@ class DreamServer(BaseHTTPRequestHandler):
seed = self.model.seed if int(post_data['seed']) == -1 else int(post_data['seed'])
self.canceled.clear()
print(f"Request to generate with prompt: {prompt}")
print(f">> Request to generate with prompt: {prompt}")
# In order to handle upscaled images, the PngWriter needs to maintain state
# across images generated by each call to prompt2img(), so we define it in
# the outer scope of image_done()
@ -181,6 +182,9 @@ class DreamServer(BaseHTTPRequestHandler):
seed = seed,
steps = steps,
sampler_name = sampler_name,
width = width,
height = height,
fit = fit,
gfpgan_strength=gfpgan_strength,
upscale = upscale,
step_callback=image_progress,
@ -192,8 +196,6 @@ class DreamServer(BaseHTTPRequestHandler):
print(f"Canceled.")
return
print(f"Prompt generated!")
class ThreadingDreamServer(ThreadingHTTPServer):
def __init__(self, server_address):


@ -14,7 +14,7 @@ model_path = os.path.join(opt.gfpgan_dir, opt.gfpgan_model_path)
gfpgan_model_exists = os.path.isfile(model_path)
def _run_gfpgan(image, strength, prompt, seed, upsampler_scale=4):
print(f'\n* GFPGAN - Restoring Faces: {prompt} : seed:{seed}')
print(f'>> GFPGAN - Restoring Faces: {prompt} : seed:{seed}')
gfpgan = None
with warnings.catch_warnings():
warnings.filterwarnings('ignore', category=DeprecationWarning)
@ -41,12 +41,12 @@ def _run_gfpgan(image, strength, prompt, seed, upsampler_scale=4):
except Exception:
import traceback
print('Error loading GFPGAN:', file=sys.stderr)
print('>> Error loading GFPGAN:', file=sys.stderr)
print(traceback.format_exc(), file=sys.stderr)
if gfpgan is None:
print(
f'GFPGAN not initialized, it must be loaded via the --gfpgan argument'
f'>> GFPGAN not initialized, it must be loaded via the --gfpgan argument'
)
return image
@ -129,7 +129,7 @@ def _load_gfpgan_bg_upsampler(bg_upsampler, upsampler_scale, bg_tile=400):
def real_esrgan_upscale(image, strength, upsampler_scale, prompt, seed):
print(
f'\n* Real-ESRGAN Upscaling: {prompt} : seed:{seed} : scale:{upsampler_scale}x'
f'>> Real-ESRGAN Upscaling: {prompt} : seed:{seed} : scale:{upsampler_scale}x'
)
with warnings.catch_warnings():
@ -143,7 +143,7 @@ def real_esrgan_upscale(image, strength, upsampler_scale, prompt, seed):
except Exception:
import traceback
print('Error loading Real-ESRGAN:', file=sys.stderr)
print('>> Error loading Real-ESRGAN:', file=sys.stderr)
print(traceback.format_exc(), file=sys.stderr)
output, img_mode = upsampler.enhance(


@ -8,6 +8,7 @@ import torch
import numpy as np
import random
import os
import traceback
from omegaconf import OmegaConf
from PIL import Image
from tqdm import tqdm, trange
@ -27,7 +28,8 @@ from ldm.models.diffusion.ddim import DDIMSampler
from ldm.models.diffusion.plms import PLMSSampler
from ldm.models.diffusion.ksampler import KSampler
from ldm.dream.pngwriter import PngWriter
from ldm.dream.devices import choose_torch_device
from ldm.dream.image_util import InitImageResizer
from ldm.dream.devices import choose_autocast_device, choose_torch_device
"""Simplified text to image API for stable diffusion/latent diffusion
@ -131,9 +133,10 @@ class T2I:
full_precision=False,
strength=0.75, # default in scripts/img2img.py
embedding_path=None,
device_type = 'cuda',
# just to keep track of this parameter when regenerating prompt
# needs to be replaced when new configuration system implemented.
latent_diffusion_weights=False,
device='cuda',
):
self.iterations = iterations
self.width = width
@ -151,13 +154,20 @@ class T2I:
self.full_precision = full_precision
self.strength = strength
self.embedding_path = embedding_path
self.device_type = device_type
self.model = None # empty for now
self.sampler = None
self.device = None
self.latent_diffusion_weights = latent_diffusion_weights
self.device = device
if device_type == 'cuda' and not torch.cuda.is_available():
device_type = choose_torch_device()
print(">> cuda not available, using device", device_type)
self.device = torch.device(device_type)
# for VRAM usage statistics
self.session_peakmem = torch.cuda.max_memory_allocated() if self.device == 'cuda' else None
device_type = choose_torch_device()
self.session_peakmem = torch.cuda.max_memory_allocated() if device_type == 'cuda' else None
if seed is None:
self.seed = self._new_seed()
@ -209,11 +219,11 @@ class T2I:
height = None,
# these are specific to img2img
init_img = None,
fit = False,
strength = None,
gfpgan_strength= 0,
save_original = False,
upscale = None,
variants=None,
sampler_name = None,
log_tokenization= False,
**args,
@ -232,7 +242,6 @@ class T2I:
strength // strength for noising/unnoising init_img. 0.0 preserves image exactly, 1.0 replaces it completely
gfpgan_strength // strength for GFPGAN. 0.0 preserves image exactly, 1.0 replaces it completely
ddim_eta // image randomness (eta=0.0 means the same seed always produces the same image)
variants // if >0, the 1st generated image will be passed back to img2img to generate the requested number of variants
step_callback // a function or method that will be called each step
image_callback // a function or method that will be called each time an image is generated
@ -251,6 +260,7 @@ class T2I:
to create the requested output directory, select a unique informative name for each image, and
write the prompt into the PNG metadata.
"""
# TODO: convert this into a getattr() loop
steps = steps or self.steps
seed = seed or self.seed
width = width or self.width
@ -269,9 +279,7 @@ class T2I:
0.0 <= strength <= 1.0
), 'can only work with strength in [0.0, 1.0]'
if not(width == self.width and height == self.height):
width, height, _ = self._resolution_check(width, height, log=True)
scope = autocast if self.precision == 'autocast' else nullcontext
if sampler_name and (sampler_name != self.sampler_name):
@ -279,7 +287,8 @@ class T2I:
self._set_sampler()
tic = time.time()
torch.cuda.torch.cuda.reset_peak_memory_stats()
if torch.cuda.is_available():
torch.cuda.reset_peak_memory_stats()
results = list()
try:
@ -295,6 +304,7 @@ class T2I:
init_img=init_img,
width=width,
height=height,
fit=fit,
strength=strength,
callback=step_callback,
)
@ -311,7 +321,8 @@ class T2I:
callback=step_callback,
)
with scope(self.device.type), self.model.ema_scope():
device_type = choose_autocast_device(self.device)
with scope(device_type), self.model.ema_scope():
for n in trange(iterations, desc='Generating'):
seed_everything(seed)
image = next(images_iterator)
@ -345,7 +356,7 @@ class T2I:
)
except Exception as e:
print(
f'Error running RealESRGAN - Your image was not upscaled.\n{e}'
f'>> Error running RealESRGAN - Your image was not upscaled.\n{e}'
)
if image_callback is not None:
if save_original:
@ -358,19 +369,19 @@ class T2I:
except KeyboardInterrupt:
print('*interrupted*')
print(
'Partial results will be returned; if --grid was requested, nothing will be returned.'
'>> Partial results will be returned; if --grid was requested, nothing will be returned.'
)
except RuntimeError as e:
print(str(e))
print('Are you sure your system has an adequate NVIDIA GPU?')
print(traceback.format_exc(), file=sys.stderr)
print('>> Are you sure your system has an adequate NVIDIA GPU?')
toc = time.time()
print('Usage stats:')
print('>> Usage stats:')
print(
f' {len(results)} image(s) generated in', '%4.2fs' % (toc - tic)
f'>> {len(results)} image(s) generated in', '%4.2fs' % (toc - tic)
)
print(
f' Max VRAM used for this generation:',
f'>> Max VRAM used for this generation:',
'%4.2fG' % (torch.cuda.max_memory_allocated() / 1e9),
)
@ -379,7 +390,7 @@ class T2I:
self.session_peakmem, torch.cuda.max_memory_allocated()
)
print(
f' Max VRAM used since script start: ',
f'>> Max VRAM used since script start: ',
'%4.2fG' % (self.session_peakmem / 1e9),
)
return results
@ -435,6 +446,7 @@ class T2I:
init_img,
width,
height,
fit,
strength,
callback, # Currently not implemented for img2img
):
@ -445,13 +457,13 @@ class T2I:
# PLMS sampler not supported yet, so ignore previous sampler
if self.sampler_name != 'ddim':
print(
f"sampler '{self.sampler_name}' is not yet supported. Using DDIM sampler"
f">> sampler '{self.sampler_name}' is not yet supported. Using DDIM sampler"
)
sampler = DDIMSampler(self.model, device=self.device)
else:
sampler = self.sampler
init_image = self._load_img(init_img, width, height).to(self.device)
init_image = self._load_img(init_img, width, height,fit).to(self.device)
with precision_scope(self.device.type):
init_latent = self.model.get_first_stage_encoding(
self.model.encode_first_stage(init_image)
@ -462,7 +474,6 @@ class T2I:
)
t_enc = int(strength * steps)
# print(f"target t_enc is {t_enc} steps")
while True:
uc, c = self._get_uc_and_c(prompt, skip_normalize)
@ -513,7 +524,7 @@ class T2I:
x_samples = torch.clamp((x_samples + 1.0) / 2.0, min=0.0, max=1.0)
if len(x_samples) != 1:
raise Exception(
f'expected to get a single image, but got {len(x_samples)}')
f'>> expected to get a single image, but got {len(x_samples)}')
x_sample = 255.0 * rearrange(
x_samples[0].cpu().numpy(), 'c h w -> h w c'
)
@ -523,17 +534,12 @@ class T2I:
self.seed = random.randrange(0, np.iinfo(np.uint32).max)
return self.seed
def _get_device(self):
device_type = choose_torch_device()
return torch.device(device_type)
def load_model(self):
"""Load and initialize the model from configuration variables passed at object creation time"""
if self.model is None:
seed_everything(self.seed)
try:
config = OmegaConf.load(self.config)
self.device = self._get_device()
model = self._load_model_from_config(config, self.weights)
if self.embedding_path is not None:
model.embedding_manager.load(
@ -542,12 +548,10 @@ class T2I:
self.model = model.to(self.device)
# model.to doesn't change the cond_stage_model.device used to move the tokenizer output, so set it here
self.model.cond_stage_model.device = self.device
except AttributeError:
import traceback
print(
'Error loading model. Only the CUDA backend is supported', file=sys.stderr)
except AttributeError as e:
print(f'>> Error loading model. {str(e)}', file=sys.stderr)
print(traceback.format_exc(), file=sys.stderr)
raise SystemExit
raise SystemExit from e
self._set_sampler()
@ -582,7 +586,7 @@ class T2I:
print(msg)
def _load_model_from_config(self, config, ckpt):
print(f'Loading model from {ckpt}')
print(f'>> Loading model from {ckpt}')
pl_sd = torch.load(ckpt, map_location='cpu')
# if "global_step" in pl_sd:
# print(f"Global Step: {pl_sd['global_step']}")
@ -597,41 +601,63 @@ class T2I:
)
else:
print(
'Using half precision math. Call with --full_precision to use more accurate but VRAM-intensive full precision.'
'>> Using half precision math. Call with --full_precision to use more accurate but VRAM-intensive full precision.'
)
model.half()
return model
def _load_img(self, path, width, height):
print(f'image path = {path}, cwd = {os.getcwd()}')
def _load_img(self, path, width, height, fit=False):
with Image.open(path) as img:
image = img.convert('RGB')
print(
f'loaded input image of size {image.width}x{image.height} from {path}')
f'>> loaded input image of size {image.width}x{image.height} from {path}'
)
from ldm.dream.image_util import InitImageResizer
if width == self.width and height == self.height:
new_image_width, new_image_height, resize_needed = self._resolution_check(
image.width, image.height)
# The logic here is:
# 1. If "fit" is true, then the image will be fit into the bounding box defined
# by width and height. It will do this in a way that preserves the init image's
# aspect ratio while preventing letterboxing. This means that if there is
# leftover horizontal space after rescaling the image to fit in the bounding box,
# the generated image's width will be reduced to the rescaled init image's width.
# Similarly for the vertical space.
# 2. Otherwise, if "fit" is false, then the image will be scaled, preserving its
# aspect ratio, to the nearest multiple of 64. Large images may generate an
# unexpected OOM error.
if fit:
image = self._fit_image(image,(width,height))
else:
if height == self.height:
new_image_width, new_image_height, resize_needed = self._resolution_check(
width, image.height)
if width == self.width:
new_image_width, new_image_height, resize_needed = self._resolution_check(
image.width, height)
else:
image = InitImageResizer(image).resize(width, height)
resize_needed = False
if resize_needed:
image = InitImageResizer(image).resize(
new_image_width, new_image_height)
image = self._squeeze_image(image)
image = np.array(image).astype(np.float32) / 255.0
image = image[None].transpose(0, 3, 1, 2)
image = torch.from_numpy(image)
return 2.0 * image - 1.0
    def _squeeze_image(self,image):
        x,y,resize_needed = self._resolution_check(image.width,image.height)
        if resize_needed:
            return InitImageResizer(image).resize(x,y)
        return image

    def _fit_image(self,image,max_dimensions):
        w,h = max_dimensions
        print(
            f'>> image will be resized to fit inside a box {w}x{h} in size.'
        )
        if image.width > image.height:
            h = None # by setting h to none, we tell InitImageResizer to fit into the width and calculate height
        elif image.height > image.width:
            w = None # ditto for w
        else:
            pass
        image = InitImageResizer(image).resize(w,h) # note that InitImageResizer does the multiple of 64 truncation internally
        print(
            f'>> after adjusting image dimensions to be multiples of 64, init image is {image.width}x{image.height}'
        )
        return image
# TO DO: Move this and related weighted subprompt code into its own module.
def _split_weighted_subprompts(text, skip_normalize=False):
"""
grabs all text up to the first occurrence of ':'
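The way `_fit_image` decides which side of the bounding box to hand to `InitImageResizer.resize` can be shown on its own. A minimal sketch, assuming plain integer sizes in place of a PIL image; the name `fit_box_arguments` is hypothetical.

```python
from typing import Optional, Tuple


def fit_box_arguments(
    im_width: int, im_height: int, box: Tuple[int, int]
) -> Tuple[Optional[int], Optional[int]]:
    """Return the (w, h) pair that would be passed on to resize().

    A None entry means "derive this side from the aspect ratio".
    """
    w, h = box
    if im_width > im_height:
        h = None   # landscape: fit to the box width, derive the height
    elif im_height > im_width:
        w = None   # portrait: fit to the box height, derive the width
    # square images keep both box dimensions
    return w, h


print(fit_box_arguments(1000, 500, (512, 512)))
print(fit_box_arguments(500, 1000, (512, 512)))
print(fit_box_arguments(800, 800, (512, 512)))
```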


@ -9,6 +9,7 @@ import sys
import copy
import warnings
import time
from ldm.dream.devices import choose_torch_device
import ldm.dream.readline
from ldm.dream.pngwriter import PngWriter, PromptFormatter
from ldm.dream.server import DreamServer, ThreadingDreamServer
@ -60,7 +61,7 @@ def main():
# this is solely for recreating the prompt
latent_diffusion_weights=opt.laion400m,
embedding_path=opt.embedding_path,
device=opt.device,
device_type=opt.device
)
# make sure the output directory exists
@ -88,7 +89,7 @@ def main():
tic = time.time()
t2i.load_model()
print(
f'model loaded in', '%4.2fs' % (time.time() - tic)
f'>> model loaded in', '%4.2fs' % (time.time() - tic)
)
if not infile:
@ -347,6 +348,8 @@ def create_argv_parser():
dest='full_precision',
action='store_true',
help='Use slower full precision math for calculations',
# MPS only functions with full precision, see https://github.com/lstein/stable-diffusion/issues/237
default=choose_torch_device() == 'mps',
)
parser.add_argument(
'-g',
@ -376,13 +379,6 @@ def create_argv_parser():
type=str,
help='Path to a pre-trained embedding manager checkpoint - can only be set on command line',
)
parser.add_argument(
'--device',
'-d',
type=str,
default='cuda',
help='Device to run Stable Diffusion on. Defaults to cuda `torch.cuda.current_device()` if avalible',
)
parser.add_argument(
'--prompt_as_dir',
'-p',
@ -426,6 +422,13 @@ def create_argv_parser():
default='model',
help='Indicates the Stable Diffusion model to use.',
)
parser.add_argument(
'--device',
'-d',
type=str,
default='cuda',
help="device to run stable diffusion on. defaults to cuda `torch.cuda.current_device()` if available"
)
return parser
@ -483,6 +486,13 @@ def create_cmd_parser():
type=str,
help='Path to input image for img2img mode (supersedes width and height)',
)
parser.add_argument(
'-T',
'-fit',
'--fit',
action='store_true',
help='If specified, will resize the input image to fit within the dimensions of width x height (512x512 default)',
)
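The behaviour of a `store_true` switch like `--fit` can be tried in a standalone parser. This is only a sketch: the `-W`/`-H` options, their defaults, and the `build_parser` name are assumptions added for illustration and are not taken from this hunk.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="fit-sketch")
    # Assumed width/height options, included only to give --fit context.
    parser.add_argument("-W", "--width", type=int, default=512)
    parser.add_argument("-H", "--height", type=int, default=512)
    # store_true: the attribute is False unless the switch is present.
    parser.add_argument(
        "-T",
        "-fit",
        "--fit",
        action="store_true",
        help="Resize the input image to fit within width x height",
    )
    return parser


opt = build_parser().parse_args(["--fit", "-W", "640"])
print(opt.fit, opt.width, opt.height)
```

Since the switch takes no value, omitting it is the only way to turn fitting off on the command line.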
parser.add_argument(
'-f',
'--strength',


@ -200,7 +200,7 @@ def main():
config = OmegaConf.load(f"{opt.config}")
model = load_model_from_config(config, f"{opt.ckpt}")
device = choose_torch_device()
device = torch.device(choose_torch_device())
model = model.to(device)
if opt.plms:


@ -8,13 +8,15 @@
margin-top: 20vh;
margin-left: auto;
margin-right: auto;
max-width: 800px;
max-width: 1024px;
text-align: center;
}
fieldset {
border: none;
}
div {
padding: 10px 10px 10px 10px;
}
#fieldset-search {
display: flex;
}
@ -78,3 +80,18 @@ label {
cursor: pointer;
color: red;
}
#txt2img {
background-color: #DCDCDC;
}
#img2img {
background-color: #F5F5F5;
}
#gfpgan {
background-color: #DCDCDC;
}
#progress-section {
background-color: #F5F5F5;
}
#about {
background-color: #DCDCDC;
}


@ -14,6 +14,7 @@
<h2 id="header">Stable Diffusion Dream Server</h2>
<form id="generate-form" method="post" action="#">
<div id="txt2img">
<fieldset id="fieldset-search">
<input type="text" id="prompt" name="prompt">
<input type="submit" id="submit" value="Generate">
@ -62,15 +63,19 @@
<label title="Set to -1 for random seed" for="seed">Seed:</label>
<input value="-1" type="number" id="seed" name="seed">
<button type="button" id="reset-seed">&olarr;</button>
<input type="checkbox" name="progress_images" id="progress_images">
<label for="progress_images">Display in-progress images (slows down generation):</label>
<button type="button" id="reset-all">Reset to Defaults</button>
</div>
<div id="img2img">
<label title="Upload an image to use img2img" for="initimg">Initial image:</label>
<input type="file" id="initimg" name="initimg" accept=".jpg, .jpeg, .png">
<br>
<label for="strength">Img2Img Strength:</label>
<input value="0.75" type="number" id="strength" name="strength" step="0.01" min="0" max="1">
<label title="Upload an image to use img2img" for="initimg">Init:</label>
<input type="file" id="initimg" name="initimg" accept=".jpg, .jpeg, .png">
<button type="button" id="reset-all">Reset to Defaults</button>
<br>
<label for="progress_images">Display in-progress images (slows down generation):</label>
<input type="checkbox" name="progress_images" id="progress_images">
<input type="checkbox" id="fit" name="fit" checked>
<label title="Rescale image to fit within requested width and height" for="fit">Fit to width/height:</label>
</div>
<div id="gfpgan">
<label title="Strength of the gfpgan (face fixing) algorithm." for="gfpgan_strength">GFPGAN Strength (0 to disable):</label>
<input value="0.8" min="0" max="1" type="number" id="gfpgan_strength" name="gfpgan_strength" step="0.05">
@ -86,6 +91,7 @@
</fieldset>
</form>
<div id="about">For news and support for this web service, visit our <a href="http://github.com/lstein/stable-diffusion">GitHub site</a></div>
<br>
<div id="progress-section">
<progress id="progress-bar" value="0" max="1"></progress>
<span id="cancel-button" title="Cancel">&#10006;</span>