from __future__ import annotations

from contextlib import nullcontext
from typing import Literal, Optional, Union

import torch
from torch import autocast

from invokeai.app.services.config import InvokeAIAppConfig
from invokeai.app.services.config.config_default import get_config

CPU_DEVICE = torch.device("cpu")
CUDA_DEVICE = torch.device("cuda")
MPS_DEVICE = torch.device("mps")


def choose_torch_device() -> torch.device:
    """Convenience routine for guessing which GPU device to run a model on."""
    config = get_config()
    if config.device == "auto":
        if torch.cuda.is_available():
            return torch.device("cuda")
        if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
            return torch.device("mps")
        else:
            return CPU_DEVICE
    else:
        return torch.device(config.device)
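
# A minimal usage sketch, assuming a config with device="auto" (the result then
# depends on the hardware the process sees; `model` is a hypothetical nn.Module):
#
#     device = choose_torch_device()
#     model.to(device)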


def get_torch_device_name() -> str:
    """Return a human-readable name for the device chosen by choose_torch_device()."""
    device = choose_torch_device()
    return torch.cuda.get_device_name(device) if device.type == "cuda" else device.type.upper()
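
# For example (hardware-dependent):
#
#     print(get_torch_device_name())  # e.g. "NVIDIA GeForce RTX 4090", "MPS", or "CPU"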


# We are in transition here from using a single global AppConfig to allowing multiple
# configurations. It is strongly recommended to pass the app_config to this function.
def choose_precision(
    device: torch.device, app_config: Optional[InvokeAIAppConfig] = None
) -> Literal["float32", "float16", "bfloat16"]:
    """Return an appropriate precision for the given torch device."""
    app_config = app_config or get_config()
    if device.type == "cuda":
        device_name = torch.cuda.get_device_name(device)
        # GTX 1660 and 1650 cards are known to produce bad results with float16,
        # so they fall through to the float32 default below.
        if not ("GeForce GTX 1660" in device_name or "GeForce GTX 1650" in device_name):
            if app_config.precision == "float32":
                return "float32"
            elif app_config.precision == "bfloat16":
                return "bfloat16"
            else:
                return "float16"
    elif device.type == "mps":
        return "float16"
    return "float32"


# We are in transition here from using a single global AppConfig to allowing multiple
# configurations. It is strongly recommended to pass the app_config to this function.
def torch_dtype(
    device: Optional[torch.device] = None,
    app_config: Optional[InvokeAIAppConfig] = None,
) -> torch.dtype:
    """Return the torch.dtype matching choose_precision() for the given device."""
    device = device or choose_torch_device()
    precision = choose_precision(device, app_config)
    if precision == "float16":
        return torch.float16
    if precision == "bfloat16":
        return torch.bfloat16
    else:
        # "auto", "autocast", "float32"
        return torch.float32
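
# For example:
#
#     torch_dtype(torch.device("cpu"))   # torch.float32
#     torch_dtype(torch.device("cuda"))  # torch.float16 on most cards, unless the
#                                        # config forces float32 or bfloat16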


def choose_autocast(precision: str):
    """Return an autocast context or nullcontext for the given precision string."""
    # float16 currently requires autocast to avoid errors like:
    # 'expected scalar type Half but found Float'
    if precision == "autocast" or precision == "float16":
        return autocast
    return nullcontext
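
# Both torch.autocast and nullcontext accept a single positional argument, so the
# returned factory can be called uniformly; a usage sketch (`model` and `x` are
# hypothetical):
#
#     scope = choose_autocast(precision)
#     with scope(device.type):
#         out = model(x)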


def normalize_device(device: Union[str, torch.device]) -> torch.device:
    """Ensure device has a device index defined, if appropriate."""
    device = torch.device(device)
    if device.index is None:
        # cuda might be the only torch backend that currently uses the device index?
        # I don't see anything like `current_device` for cpu or mps.
        if device.type == "cuda":
            device = torch.device(device.type, torch.cuda.current_device())
    return device
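
# For example, on a single-GPU machine:
#
#     normalize_device("cuda")               # device(type='cuda', index=0)
#     normalize_device(torch.device("cpu"))  # device(type='cpu'), unchanged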