InvokeAI/invokeai/backend/util/devices.py

from __future__ import annotations

from contextlib import nullcontext
from typing import Literal, Optional, Union

import torch
from torch import autocast

from invokeai.app.services.config.config_default import PRECISION, get_config

CPU_DEVICE = torch.device("cpu")
CUDA_DEVICE = torch.device("cuda")
MPS_DEVICE = torch.device("mps")


def choose_torch_device() -> torch.device:
    """Convenience routine for guessing which GPU device to run the model on."""
    config = get_config()
    if config.device == "auto":
        if torch.cuda.is_available():
            return torch.device("cuda")
        if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
            return torch.device("mps")
        else:
            return CPU_DEVICE
    else:
        return torch.device(config.device)
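
# Note (illustrative; the exact accepted values depend on the config schema): setting
# `device` to an explicit value such as "cpu", "cuda", or "mps" in the InvokeAI config
# (invokeai.yaml) bypasses the auto-detection above, and the string is passed straight
# to torch.device().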


def get_torch_device_name() -> str:
    """Return a display name for the active device: the GPU model name for CUDA, else the device type."""
    device = choose_torch_device()
    return torch.cuda.get_device_name(device) if device.type == "cuda" else device.type.upper()


def choose_precision(device: torch.device) -> Literal["float32", "float16", "bfloat16"]:
    """Return an appropriate precision for the given torch device."""
    app_config = get_config()
    if device.type == "cuda":
        device_name = torch.cuda.get_device_name(device)
        if "GeForce GTX 1660" in device_name or "GeForce GTX 1650" in device_name:
            # These GPUs have limited support for float16
            return "float32"
        elif app_config.precision == "auto" or app_config.precision == "autocast":
            # Default to float16 for CUDA devices
            return "float16"
        else:
            # Use the user-defined precision
            return app_config.precision
    elif device.type == "mps":
        if app_config.precision == "auto" or app_config.precision == "autocast":
            # Default to float16 for MPS devices
            return "float16"
        else:
            # Use the user-defined precision
            return app_config.precision
    # CPU / safe fallback
    return "float32"


def torch_dtype(device: Optional[torch.device] = None) -> torch.dtype:
    """Return the torch dtype matching the precision chosen for the device."""
    device = device or choose_torch_device()
    precision = choose_precision(device)
    if precision == "float16":
        return torch.float16
    if precision == "bfloat16":
        return torch.bfloat16
    else:
        # "auto", "autocast", "float32"
        return torch.float32


def choose_autocast(precision: PRECISION):
    """Returns an autocast context or nullcontext for the given precision string"""
    # float16 currently requires autocast to avoid errors like:
    # 'expected scalar type Half but found Float'
    if precision == "autocast" or precision == "float16":
        return autocast
    return nullcontext
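
# Usage sketch (illustrative, not taken from this file): choose_autocast() returns a
# context-manager factory rather than an instance, so callers instantiate it with the
# device type, e.g.:
#
#     scope = choose_autocast(get_config().precision)
#     with scope(choose_torch_device().type):
#         ...  # run inference that is safe under mixed precision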


def normalize_device(device: Union[str, torch.device]) -> torch.device:
    """Ensure device has a device index defined, if appropriate."""
    device = torch.device(device)
    if device.index is None:
        # cuda might be the only torch backend that currently uses the device index?
        # I don't see anything like `current_device` for cpu or mps.
        if device.type == "cuda":
            device = torch.device(device.type, torch.cuda.current_device())
    return device
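

if __name__ == "__main__":
    # Illustrative sketch (not part of the original module): the helpers above are
    # typically combined to pick a device and a matching dtype before allocating
    # tensors or loading model weights.
    device = normalize_device(choose_torch_device())
    dtype = torch_dtype(device)
    sample = torch.zeros(4, 4, dtype=dtype, device=device)  # stand-in for real weights
    print(f"device={get_torch_device_name()} dtype={dtype} tensor_device={sample.device}")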