InvokeAI/invokeai/backend/util/devices.py

from __future__ import annotations

from contextlib import nullcontext
from typing import Optional, Union

import torch
from torch import autocast

from invokeai.app.services.config import InvokeAIAppConfig

CPU_DEVICE = torch.device("cpu")
CUDA_DEVICE = torch.device("cuda")
MPS_DEVICE = torch.device("mps")
config = InvokeAIAppConfig.get_config()


def choose_torch_device() -> torch.device:
    """Choose the torch device to run models on, honoring the app config."""
    if config.use_cpu:  # legacy setting - force CPU
        return CPU_DEVICE
    elif config.device == "auto":
        # Prefer CUDA, then Apple MPS, and fall back to CPU.
        if torch.cuda.is_available():
            return CUDA_DEVICE
        if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
            return MPS_DEVICE
        else:
            return CPU_DEVICE
    else:
        return torch.device(config.device)


def choose_precision(device: torch.device) -> str:
    """Return the precision to use on `device`: "float16", "bfloat16" or "float32"."""
    if device.type == "cuda":
        device_name = torch.cuda.get_device_name(device)
        # GTX 1650/1660 cards are excluded from half precision and fall through to float32.
        if not ("GeForce GTX 1660" in device_name or "GeForce GTX 1650" in device_name):
            if config.precision == "bfloat16":
                return "bfloat16"
            else:
                return "float16"
    elif device.type == "mps":
        return "float16"
    return "float32"


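def torch_dtype(device: Optional[torch.device] = None) -> torch.dtype:
    """Return the torch dtype matching the precision chosen for `device`.

    If `device` is None, the device is picked by `choose_torch_device()`.
    """
    device = device or choose_torch_device()
    precision = choose_precision(device)
    if precision == "float16":
        return torch.float16
    if precision == "bfloat16":
        return torch.bfloat16
    else:
        # choose_precision() returns "float32" in every other case
        return torch.float32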


def choose_autocast(precision: str):
    """Return the `autocast` class or `nullcontext` class for the given precision string.

    The caller instantiates the returned class to get a context manager.
    """
    # float16 currently requires autocast to avoid errors like:
    # 'expected scalar type Half but found Float'
    if precision in ("autocast", "float16"):
        return autocast
    return nullcontext


def normalize_device(device: Union[str, torch.device]) -> torch.device:
    """Ensure device has a device index defined, if appropriate."""
    device = torch.device(device)
    if device.index is None:
        # cuda might be the only torch backend that currently uses the device index?
        # I don't see anything like `current_device` for cpu or mps.
        if device.type == "cuda":
            device = torch.device(device.type, torch.cuda.current_device())
    return device