Replace --full_precision with --precision that works even if not specified

Allowed values are 'auto', 'float32', 'autocast', and 'float16'. If the flag is not specified or is set to 'auto', a working precision is selected automatically based on the torch device.
Context: #526
Deprecated --full_precision / -F

Tested on both CUDA and CPU by calling scripts/dream.py without arguments and verifying that the auto configuration works. With --precision set to auto, float32, autocast, or float16 it performs as expected, either working or failing with a reasonable error. Also checked Img2Img.
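In sketch form, 'auto' resolves like this (a minimal restatement of the new choose_precision helper further down in this commit; pick_precision is just an illustrative name, and the GTX 1650/1660 exclusion follows the code comment that those cards have issues with half precision):

```python
# Sketch of the 'auto' resolution: float16 on CUDA, except on GTX 1650/1660
# cards (known half-precision issues); float32 everywhere else.
import torch

def pick_precision(device: torch.device) -> str:
    if device.type == 'cuda':
        name = torch.cuda.get_device_name(device)
        if 'GeForce GTX 1660' not in name and 'GeForce GTX 1650' not in name:
            return 'float16'
    return 'float32'
```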
Mihail Dumitrescu 2022-09-17 20:56:25 +03:00 committed by Lincoln Stein
parent 30de9fcfae
commit d176fb07cd
18 changed files with 108 additions and 82 deletions

View File

@@ -5,8 +5,7 @@ SAMPLES_DIR=${OUT_DIR}
 python scripts/dream.py \
   --from_file ${PROMPT_FILE} \
   --outdir ${OUT_DIR} \
-  --sampler plms \
-  --full_precision
+  --sampler plms

 # original output by CompVis/stable-diffusion
 IMAGE1=".dev_scripts/images/v1_4_astronaut_rides_horse_plms_step50_seed42.png"

View File

@@ -85,9 +85,9 @@ jobs:
         fi
         # Utterly hacky, but I don't know how else to do this
         if [[ ${{ github.ref }} == 'refs/heads/master' ]]; then
-          time ${{ steps.vars.outputs.PYTHON_BIN }} scripts/dream.py --from_file tests/preflight_prompts.txt --full_precision
+          time ${{ steps.vars.outputs.PYTHON_BIN }} scripts/dream.py --from_file tests/preflight_prompts.txt
         elif [[ ${{ github.ref }} == 'refs/heads/development' ]]; then
-          time ${{ steps.vars.outputs.PYTHON_BIN }} scripts/dream.py --from_file tests/dev_prompts.txt --full_precision
+          time ${{ steps.vars.outputs.PYTHON_BIN }} scripts/dream.py --from_file tests/dev_prompts.txt
         fi
         mkdir -p outputs/img-samples
     - name: Archive results

View File

@@ -86,17 +86,14 @@ You wil need one of the following:
 - At least 6 GB of free disk space for the machine learning model, Python, and all its dependencies.

-> Note
->
-> If you have an Nvidia 10xx series card (e.g. the 1080ti), please run the dream script in
-> full-precision mode as shown below.
+#### Note

-Similarly, specify full-precision mode on Apple M1 hardware.
-
-To run in full-precision mode, start `dream.py` with the `--full_precision` flag:
+Precision is auto configured based on the device. If however you encounter
+errors like 'expected type Float but found Half' or 'not implemented for Half'
+you can try starting `dream.py` with the `--precision=float32` flag:

 ```bash
-(ldm) ~/stable-diffusion$ python scripts/dream.py --full_precision
+(ldm) ~/stable-diffusion$ python scripts/dream.py --precision=float32
 ```

 ### Features

@@ -125,6 +122,11 @@ To run in full-precision mode, start `dream.py` with the `--full_precision` flag
 ### Latest Changes

+- vNEXT (TODO 2022)
+  - Deprecated `--full_precision` / `-F`. Simply omit it and `dream.py` will auto
+    configure. To switch away from auto use the new flag like `--precision=float32`.
+
 - v1.14 (11 September 2022)
   - Memory optimizations for small-RAM cards. 512x512 now possible on 4 GB GPUs.

View File

@@ -74,7 +74,7 @@ prompt arguments] (#list-of-prompt-arguments). Others
 | --prompt_as_dir | -p | False | Name output directories using the prompt text. |
 | --from_file <path> | | None | Read list of prompts from a file. Use "-" to read from standard input |
 | --model <modelname> | | stable-diffusion-1.4 | Loads model specified in configs/models.yaml. Currently one of "stable-diffusion-1.4" or "laion400m" |
-| --full_precision | -F | False | Run in slower full-precision mode. Needed for Macintosh M1/M2 hardware and some older video cards. |
+| --precision <pname> | | auto | Set to a specific precision. Rare but you may need to switch to 'float32' on some video cards. |
 | --web | | False | Start in web server mode |
 | --host <ip addr> | | localhost | Which network interface web server should listen on. Set to 0.0.0.0 to listen on any. |
 | --port <port> | | 9090 | Which port web server should listen for requests on. |

View File

@@ -57,9 +57,7 @@ Once the model is trained, specify the trained .pt or .bin file when starting
 dream using

 ```bash
-python3 ./scripts/dream.py \
-  --embedding_path /path/to/embedding.pt \
-  --full_precision
+python3 ./scripts/dream.py --embedding_path /path/to/embedding.pt
 ```

 Then, to utilize your subject at the dream prompt

View File

@@ -62,15 +62,12 @@ You wil need one of the following:
 ### Note

-If you are have a Nvidia 10xx series card (e.g. the 1080ti), please run the dream script in
-full-precision mode as shown below.
-
-Similarly, specify full-precision mode on Apple M1 hardware.
-
-To run in full-precision mode, start `dream.py` with the `--full_precision` flag:
+Precision is auto configured based on the device. If however you encounter
+errors like 'expected type Float but found Half' or 'not implemented for Half'
+you can try starting `dream.py` with the `--precision=float32` flag:

 ```bash
-(ldm) ~/stable-diffusion$ python scripts/dream.py --full_precision
+(ldm) ~/stable-diffusion$ python scripts/dream.py --precision=float32
 ```

 ## Features

@@ -98,6 +95,11 @@ To run in full-precision mode, start `dream.py` with the `--full_precision` flag
 ## Latest Changes

+### vNEXT <small>(TODO 2022)</small>
+
+- Deprecated `--full_precision` / `-F`. Simply omit it and `dream.py` will auto
+  configure. To switch away from auto use the new flag like `--precision=float32`.
+
 ### v1.14 <small>(11 September 2022)</small>

 - Memory optimizations for small-RAM cards. 512x512 now possible on 4 GB GPUs.

View File

@@ -97,7 +97,7 @@ conda activate ldm
 python scripts/preload_models.py

 # run SD!
-python scripts/dream.py --full_precision  # half-precision requires autocast and won't work
+python scripts/dream.py

 # or run the web interface!
 python scripts/dream.py --web

@@ -453,5 +453,3 @@ Abort trap: 6
     warnings.warn('resource_tracker: There appear to be %d '
 ```
-
-Macs do not support `autocast/mixed-precision`, so you need to supply
-`--full_precision` to use float32 everywhere.

View File

@@ -100,6 +100,13 @@ SAMPLER_CHOICES = [
     'plms',
 ]

+PRECISION_CHOICES = [
+    'auto',
+    'float32',
+    'autocast',
+    'float16',
+]
+
 # is there a way to pick this up during git commits?
 APP_ID = 'lstein/stable-diffusion'
 APP_VERSION = 'v1.15'

@@ -322,7 +329,16 @@ class Args(object):
             '--full_precision',
             dest='full_precision',
             action='store_true',
-            help='Use more memory-intensive full precision math for calculations',
+            help='Deprecated way to set --precision=float32',
+        )
+        model_group.add_argument(
+            '--precision',
+            dest='precision',
+            type=str,
+            choices=PRECISION_CHOICES,
+            metavar='PRECISION',
+            help=f'Set model precision. Defaults to auto selected based on device. Options: {", ".join(PRECISION_CHOICES)}',
+            default='auto',
         )
         file_group.add_argument(
             '--from_file',
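For a quick check of how the new flag parses, here is a self-contained argparse sketch (standalone, not the repo's Args class; the definitions mirror the hunk above):

```python
# Standalone sketch mirroring the argument definitions above (not the Args class).
import argparse

PRECISION_CHOICES = ['auto', 'float32', 'autocast', 'float16']

parser = argparse.ArgumentParser()
parser.add_argument('--full_precision', '-F', dest='full_precision',
                    action='store_true',
                    help='Deprecated way to set --precision=float32')
parser.add_argument('--precision', dest='precision', type=str,
                    choices=PRECISION_CHOICES, metavar='PRECISION', default='auto',
                    help=f'Set model precision. Options: {", ".join(PRECISION_CHOICES)}')

assert parser.parse_args([]).precision == 'auto'   # works even if not specified
assert parser.parse_args(['--precision=float32']).precision == 'float32'
```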

View File

@@ -1,6 +1,6 @@
 import torch
 from torch import autocast
-from contextlib import contextmanager, nullcontext
+from contextlib import nullcontext

 def choose_torch_device() -> str:
     '''Convenience routine for guessing which GPU device to run model on'''
@@ -10,15 +10,18 @@ def choose_torch_device() -> str:
         return 'mps'
     return 'cpu'

-def choose_autocast_device(device):
-    '''Returns an autocast compatible device from a torch device'''
-    device_type = device.type  # this returns 'mps' on M1
-    # autocast only for cuda, but GTX 16xx have issues with it
-    if device_type == 'cuda':
-        device_name = torch.cuda.get_device_name()
-        if 'GeForce GTX 1660' in device_name or 'GeForce GTX 1650' in device_name:
-            return device_type,nullcontext
-        else:
-            return device_type,autocast
-    else:
-        return 'cpu',nullcontext
+def choose_precision(device) -> str:
+    '''Returns an appropriate precision for the given torch device'''
+    if device.type == 'cuda':
+        device_name = torch.cuda.get_device_name(device)
+        if not ('GeForce GTX 1660' in device_name or 'GeForce GTX 1650' in device_name):
+            return 'float16'
+    return 'float32'
+
+def choose_autocast(precision):
+    '''Returns an autocast context or nullcontext for the given precision string'''
+    # float16 currently requires autocast to avoid errors like:
+    # 'expected scalar type Half but found Float'
+    if precision == 'autocast' or precision == 'float16':
+        return autocast
+    return nullcontext
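Taken together, the two helpers are meant to be used like this (a usage sketch assuming a standard torch install; the body of the `with` block is a placeholder):

```python
# Usage sketch for choose_precision/choose_autocast; the inference call is a placeholder.
import torch
from ldm.dream.devices import choose_torch_device, choose_precision, choose_autocast

device = torch.device(choose_torch_device())  # 'cuda', 'mps', or 'cpu'
precision = choose_precision(device)          # 'float16' on most CUDA cards, else 'float32'
scope = choose_autocast(precision)            # autocast for 'autocast'/'float16', else nullcontext

with scope(device.type):                      # same pattern the generators use below
    pass  # model inference would run here
```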

View File

@@ -9,13 +9,14 @@ from tqdm import tqdm, trange
 from PIL import Image
 from einops import rearrange, repeat
 from pytorch_lightning import seed_everything
-from ldm.dream.devices import choose_autocast_device
+from ldm.dream.devices import choose_autocast

 downsampling = 8

 class Generator():
-    def __init__(self,model):
+    def __init__(self, model, precision):
         self.model = model
+        self.precision = precision
         self.seed = None
         self.latent_channels = model.channels
         self.downsampling_factor = downsampling  # BUG: should come from model or config
@@ -38,7 +39,7 @@ class Generator():
     def generate(self,prompt,init_image,width,height,iterations=1,seed=None,
                  image_callback=None, step_callback=None,
                  **kwargs):
-        device_type,scope = choose_autocast_device(self.model.device)
+        scope = choose_autocast(self.precision)
         make_image = self.get_make_image(
             prompt,
             init_image = init_image,
@@ -51,7 +52,7 @@ class Generator():
         results = []
         seed = seed if seed else self.new_seed()
         seed, initial_noise = self.generate_initial_noise(seed, width, height)
-        with scope(device_type), self.model.ema_scope():
+        with scope(self.model.device.type), self.model.ema_scope():
             for n in trange(iterations, desc='Generating'):
                 x_T = None
                 if self.variation_amount > 0:

View File

@@ -11,8 +11,8 @@ from ldm.models.diffusion.ddim import DDIMSampler
 from ldm.dream.generator.img2img import Img2Img

 class Embiggen(Generator):
-    def __init__(self,model):
-        super().__init__(model)
+    def __init__(self, model, precision):
+        super().__init__(model, precision)
         self.init_latent = None

     @torch.no_grad()

View File

@@ -4,15 +4,15 @@ ldm.dream.generator.img2img descends from ldm.dream.generator
 import torch
 import numpy as np
-from ldm.dream.devices import choose_autocast_device
+from ldm.dream.devices import choose_autocast
 from ldm.dream.generator.base import Generator
 from ldm.models.diffusion.ddim import DDIMSampler

 class Img2Img(Generator):
-    def __init__(self,model):
-        super().__init__(model)
+    def __init__(self, model, precision):
+        super().__init__(model, precision)
         self.init_latent = None  # by get_noise()

     @torch.no_grad()
     def get_make_image(self,prompt,sampler,steps,cfg_scale,ddim_eta,
                        conditioning,init_image,strength,step_callback=None,**kwargs):
@@ -32,8 +32,8 @@ class Img2Img(Generator):
             ddim_num_steps=steps, ddim_eta=ddim_eta, verbose=False
         )

-        device_type,scope = choose_autocast_device(self.model.device)
-        with scope(device_type):
+        scope = choose_autocast(self.precision)
+        with scope(self.model.device.type):
             self.init_latent = self.model.get_first_stage_encoding(
                 self.model.encode_first_stage(init_image)
             )  # move to latent space

View File

@@ -5,15 +5,15 @@ ldm.dream.generator.inpaint descends from ldm.dream.generator
 import torch
 import numpy as np
 from einops import rearrange, repeat
-from ldm.dream.devices import choose_autocast_device
+from ldm.dream.devices import choose_autocast
 from ldm.dream.generator.img2img import Img2Img
 from ldm.models.diffusion.ddim import DDIMSampler

 class Inpaint(Img2Img):
-    def __init__(self,model):
+    def __init__(self, model, precision):
         self.init_latent = None
-        super().__init__(model)
+        super().__init__(model, precision)

     @torch.no_grad()
     def get_make_image(self,prompt,sampler,steps,cfg_scale,ddim_eta,
                        conditioning,init_image,mask_image,strength,
@@ -38,8 +38,8 @@ class Inpaint(Img2Img):
             ddim_num_steps=steps, ddim_eta=ddim_eta, verbose=False
         )

-        device_type,scope = choose_autocast_device(self.model.device)
-        with scope(device_type):
+        scope = choose_autocast(self.precision)
+        with scope(self.model.device.type):
             self.init_latent = self.model.get_first_stage_encoding(
                 self.model.encode_first_stage(init_image)
             )  # move to latent space

View File

@@ -7,9 +7,9 @@ import numpy as np
 from ldm.dream.generator.base import Generator

 class Txt2Img(Generator):
-    def __init__(self,model):
-        super().__init__(model)
+    def __init__(self, model, precision):
+        super().__init__(model, precision)

     @torch.no_grad()
     def get_make_image(self,prompt,sampler,steps,cfg_scale,ddim_eta,
                        conditioning,width,height,step_callback=None,**kwargs):

View File

@@ -29,7 +29,7 @@ from ldm.models.diffusion.plms import PLMSSampler
 from ldm.models.diffusion.ksampler import KSampler
 from ldm.dream.pngwriter import PngWriter
 from ldm.dream.image_util import InitImageResizer
-from ldm.dream.devices import choose_torch_device
+from ldm.dream.devices import choose_torch_device, choose_precision
 from ldm.dream.conditioning import get_uc_and_c

 def fix_func(orig):
@@ -104,7 +104,7 @@ gr = Generate(
 # these values are set once and shouldn't be changed
 conf = path to configuration file ('configs/models.yaml')
 model = symbolic name of the model in the configuration file
-full_precision = False
+precision = float precision to be used

 # this value is sticky and maintained between generation calls
 sampler_name = ['ddim', 'k_dpm_2_a', 'k_dpm_2', 'k_euler_a', 'k_euler', 'k_heun', 'k_lms', 'plms'] // k_lms
@@ -130,6 +130,7 @@ class Generate:
             sampler_name = 'k_lms',
             ddim_eta = 0.0,  # deterministic
             full_precision = False,
+            precision = 'auto',
             # these are deprecated; if present they override values in the conf file
             weights = None,
             config = None,
@@ -145,7 +146,7 @@ class Generate:
         self.cfg_scale = 7.5
         self.sampler_name = sampler_name
         self.ddim_eta = 0.0  # same seed always produces same image
-        self.full_precision = True if choose_torch_device() == 'mps' else full_precision
+        self.precision = precision
         self.strength = 0.75
         self.seamless = False
         self.embedding_path = embedding_path
@@ -162,6 +163,14 @@ class Generate:
         # it wasn't actually doing anything. This logic could be reinstated.
         device_type = choose_torch_device()
         self.device = torch.device(device_type)
+        if full_precision:
+            if self.precision != 'auto':
+                raise ValueError('Remove --full_precision / -F if using --precision')
+            print('Please remove deprecated --full_precision / -F')
+            print('If auto config does not work you can use --precision=float32')
+            self.precision = 'float32'
+        if self.precision == 'auto':
+            self.precision = choose_precision(self.device)

         # for VRAM usage statistics
         self.session_peakmem = torch.cuda.max_memory_allocated() if self._has_cuda else None
@@ -440,25 +449,25 @@ class Generate:
     def _make_img2img(self):
         if not self.generators.get('img2img'):
             from ldm.dream.generator.img2img import Img2Img
-            self.generators['img2img'] = Img2Img(self.model)
+            self.generators['img2img'] = Img2Img(self.model, self.precision)
         return self.generators['img2img']

     def _make_embiggen(self):
         if not self.generators.get('embiggen'):
             from ldm.dream.generator.embiggen import Embiggen
-            self.generators['embiggen'] = Embiggen(self.model)
+            self.generators['embiggen'] = Embiggen(self.model, self.precision)
         return self.generators['embiggen']

     def _make_txt2img(self):
         if not self.generators.get('txt2img'):
             from ldm.dream.generator.txt2img import Txt2Img
-            self.generators['txt2img'] = Txt2Img(self.model)
+            self.generators['txt2img'] = Txt2Img(self.model, self.precision)
         return self.generators['txt2img']

     def _make_inpaint(self):
         if not self.generators.get('inpaint'):
             from ldm.dream.generator.inpaint import Inpaint
-            self.generators['inpaint'] = Inpaint(self.model)
+            self.generators['inpaint'] = Inpaint(self.model, self.precision)
         return self.generators['inpaint']

     def load_model(self):
@@ -469,7 +478,7 @@ class Generate:
             model = self._load_model_from_config(self.config, self.weights)
             if self.embedding_path is not None:
                 model.embedding_manager.load(
-                    self.embedding_path, self.full_precision
+                    self.embedding_path, self.precision == 'float32' or self.precision == 'autocast'
                 )
             self.model = model.to(self.device)
             # model.to doesn't change the cond_stage_model.device used to move the tokenizer output, so set it here
@@ -619,16 +628,13 @@ class Generate:
         sd = pl_sd['state_dict']
         model = instantiate_from_config(c.model)
         m, u = model.load_state_dict(sd, strict=False)
-        if self.full_precision:
-            print(
-                '>> Using slower but more accurate full-precision math (--full_precision)'
-            )
+        if self.precision == 'float16':
+            print('Using faster float16 precision')
+            model.to(torch.float16)
         else:
-            print(
-                '>> Using half precision math. Call with --full_precision to use more accurate but VRAM-intensive full precision.'
-            )
-            model.half()
+            print('Using more accurate float32 precision')
         model.to(self.device)
         model.eval()
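Read in isolation, the precision hand-off added to `Generate.__init__` above amounts to this small function (a restatement for clarity, not repo code; the messages and error text are copied from the hunk):

```python
# Restatement of the deprecation/auto-resolution logic in Generate.__init__ above.
from ldm.dream.devices import choose_precision

def resolve_precision(precision, full_precision, device):
    if full_precision:
        if precision != 'auto':
            raise ValueError('Remove --full_precision / -F if using --precision')
        print('Please remove deprecated --full_precision / -F')
        print('If auto config does not work you can use --precision=float32')
        precision = 'float32'
    if precision == 'auto':
        precision = choose_precision(device)
    return precision
```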

View File

@@ -54,6 +54,7 @@ def main():
             sampler_name = opt.sampler_name,
             embedding_path = opt.embedding_path,
             full_precision = opt.full_precision,
+            precision = opt.precision,
         )
     except (FileNotFoundError, IOError, KeyError) as e:
         print(f'{e}. Aborting.')

View File

@@ -119,7 +119,7 @@ def main():
     #     "height": height,
     #     "sampler_name": opt.sampler_name,
     #     "weights": weights,
-    #     "full_precision": opt.full_precision,
+    #     "precision": opt.precision,
     #     "config": config,
     #     "grid": opt.grid,
     #     "latent_diffusion_weights": opt.laion400m,

View File

@@ -23,14 +23,14 @@ class Container(containers.DeclarativeContainer):
         model = config.model,
         sampler_name = config.sampler_name,
         embedding_path = config.embedding_path,
-        full_precision = config.full_precision
+        precision = config.precision
         # config = config.model.config,
         # width = config.model.width,
         # height = config.model.height,
         # sampler_name = config.model.sampler_name,
         # weights = config.model.weights,
-        # full_precision = config.model.full_precision,
+        # precision = config.model.precision,
         # grid = config.model.grid,
         # seamless = config.model.seamless,
         # embedding_path = config.model.embedding_path,