Mirror of https://github.com/invoke-ai/InvokeAI (synced 2024-08-30 20:32:17 +00:00)

Commit 231e665675: Merge branch 'main' into feat/refactor_generation_backend
@@ -161,7 +161,7 @@ the command `npm install -g yarn` if needed)

_For Windows/Linux with an NVIDIA GPU:_

```terminal
-pip install "InvokeAI[xformers]" --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu117
+pip install "InvokeAI[xformers]" --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu118
```

_For Linux with an AMD GPU:_
@@ -471,7 +471,7 @@ Then type the following commands:

=== "NVIDIA System"

    ```bash
-    pip install torch torchvision --force-reinstall --extra-index-url https://download.pytorch.org/whl/cu117
+    pip install torch torchvision --force-reinstall --extra-index-url https://download.pytorch.org/whl/cu118
    pip install xformers
    ```
@@ -148,7 +148,7 @@ manager, please follow these steps:

=== "CUDA (NVidia)"

    ```bash
-    pip install "InvokeAI[xformers]" --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu117
+    pip install "InvokeAI[xformers]" --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu118
    ```

=== "ROCm (AMD)"
@@ -312,7 +312,7 @@ installation protocol (important!)

=== "CUDA (NVidia)"

    ```bash
-    pip install -e .[xformers] --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu117
+    pip install -e .[xformers] --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu118
    ```

=== "ROCm (AMD)"
@@ -356,7 +356,7 @@ you can do so using this unsupported recipe:

mkdir ~/invokeai
conda create -n invokeai python=3.10
conda activate invokeai
-pip install InvokeAI[xformers] --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu117
+pip install InvokeAI[xformers] --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu118
invokeai-configure --root ~/invokeai
invokeai --root ~/invokeai --web
```
@@ -34,11 +34,11 @@ directly from NVIDIA. **Do not try to install Ubuntu's
nvidia-cuda-toolkit package. It is out of date and will cause
conflicts among the NVIDIA driver and binaries.**

-Go to [CUDA Toolkit 11.7
-Downloads](https://developer.nvidia.com/cuda-11-7-0-download-archive),
-and use the target selection wizard to choose your operating system,
-hardware platform, and preferred installation method (e.g. "local"
-versus "network").
+Go to [CUDA Toolkit
+Downloads](https://developer.nvidia.com/cuda-downloads), and use the
+target selection wizard to choose your operating system, hardware
+platform, and preferred installation method (e.g. "local" versus
+"network").

This will provide you with a downloadable install file or, depending
on your choices, a recipe for downloading and running a install shell
@@ -61,7 +61,7 @@ Runtime Site](https://developer.nvidia.com/nvidia-container-runtime)

When installing torch and torchvision manually with `pip`, remember to provide
the argument `--extra-index-url
-https://download.pytorch.org/whl/cu117` as described in the [Manual
+https://download.pytorch.org/whl/cu118` as described in the [Manual
Installation Guide](020_INSTALL_MANUAL.md).

## :simple-amd: ROCm
@@ -28,18 +28,21 @@ command line, then just be sure to activate it's virtual environment.

Then run the following three commands:

```sh
-pip install xformers==0.0.16rc425
-pip install triton
+pip install xformers~=0.0.19
+pip install triton # WON'T WORK ON WINDOWS
python -m xformers.info output
```

The first command installs `xformers`, the second installs the
`triton` training accelerator, and the third prints out the `xformers`
-installation status. If all goes well, you'll see a report like the
+installation status. On Windows, please omit the `triton` package,
+which is not available on that platform.
+
+If all goes well, you'll see a report like the
following:

```sh
-xFormers 0.0.16rc425
+xFormers 0.0.20
memory_efficient_attention.cutlassF: available
memory_efficient_attention.cutlassB: available
memory_efficient_attention.flshattF: available
@@ -48,22 +51,28 @@ memory_efficient_attention.smallkF: available
memory_efficient_attention.smallkB: available
memory_efficient_attention.tritonflashattF: available
memory_efficient_attention.tritonflashattB: available
indexing.scaled_index_addF: available
indexing.scaled_index_addB: available
indexing.index_select: available
swiglu.dual_gemm_silu: available
swiglu.gemm_fused_operand_sum: available
swiglu.fused.p.cpp: available
is_triton_available: True
is_functorch_available: False
-pytorch.version: 1.13.1+cu117
+pytorch.version: 2.0.1+cu118
pytorch.cuda: available
-gpu.compute_capability: 8.6
-gpu.name: NVIDIA RTX A2000 12GB
+gpu.compute_capability: 8.9
+gpu.name: NVIDIA GeForce RTX 4070
build.info: available
-build.cuda_version: 1107
-build.python_version: 3.10.9
-build.torch_version: 1.13.1+cu117
+build.cuda_version: 1108
+build.python_version: 3.10.11
+build.torch_version: 2.0.1+cu118
build.env.TORCH_CUDA_ARCH_LIST: 5.0+PTX 6.0 6.1 7.0 7.5 8.0 8.6
build.env.XFORMERS_BUILD_TYPE: Release
build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS: None
build.env.NVCC_FLAGS: None
-build.env.XFORMERS_PACKAGE_FROM: wheel-v0.0.16rc425
+build.env.XFORMERS_PACKAGE_FROM: wheel-v0.0.20
+build.nvcc_version: 11.8.89
source.privacy: open source
```
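A minimal Python cross-check of the same information, assuming torch and (optionally) xformers are already installed in the active InvokeAI virtual environment; the version strings in the comments are only examples.

```python
# Assumes torch (and optionally xformers) are installed in the active venv.
import torch

print("torch:", torch.__version__)        # e.g. 2.0.1+cu118
print("CUDA available:", torch.cuda.is_available())
print("CUDA build:", torch.version.cuda)  # e.g. 11.8

try:
    import xformers

    print("xformers:", xformers.__version__)  # e.g. 0.0.20
except ImportError:
    print("xformers is not installed")
```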
@@ -83,14 +92,14 @@ installed from source. These instructions were written for a system
running Ubuntu 22.04, but other Linux distributions should be able to
adapt this recipe.

-#### 1. Install CUDA Toolkit 11.7
+#### 1. Install CUDA Toolkit 11.8

You will need the CUDA developer's toolkit in order to compile and
install xFormers. **Do not try to install Ubuntu's nvidia-cuda-toolkit
package.** It is out of date and will cause conflicts among the NVIDIA
driver and binaries. Instead install the CUDA Toolkit package provided
-by NVIDIA itself. Go to [CUDA Toolkit 11.7
-Downloads](https://developer.nvidia.com/cuda-11-7-0-download-archive)
+by NVIDIA itself. Go to [CUDA Toolkit 11.8
+Downloads](https://developer.nvidia.com/cuda-11-8-0-download-archive)
and use the target selection wizard to choose your platform and Linux
distribution. Select an installer type of "runfile (local)" at the
last step.
@@ -101,17 +110,17 @@ example, the install script recipe for Ubuntu 22.04 running on a
x86_64 system is:

```
-wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
-sudo sh cuda_11.7.0_515.43.04_linux.run
+wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
+sudo sh cuda_11.8.0_520.61.05_linux.run
```

Rather than cut-and-paste this example, We recommend that you walk
through the toolkit wizard in order to get the most up to date
installer for your system.

-#### 2. Confirm/Install pyTorch 1.13 with CUDA 11.7 support
+#### 2. Confirm/Install pyTorch 2.01 with CUDA 11.8 support

-If you are using InvokeAI 2.3 or higher, these will already be
+If you are using InvokeAI 3.0.2 or higher, these will already be
installed. If not, you can check whether you have the needed libraries
using a quick command. Activate the invokeai virtual environment,
either by entering the "developer's console", or manually with a
@@ -124,7 +133,7 @@ Then run the command:
python -c 'exec("import torch\nprint(torch.__version__)")'
```

-If it prints __1.13.1+cu117__ you're good. If not, you can install the
+If it prints __1.13.1+cu118__ you're good. If not, you can install the
most up to date libraries with this command:

```sh
@@ -348,7 +348,7 @@ class InvokeAiInstance:

        introduction()

-        from invokeai.frontend.install import invokeai_configure
+        from invokeai.frontend.install.invokeai_configure import invokeai_configure

        # NOTE: currently the config script does its own arg parsing! this means the command-line switches
        # from the installer will also automatically propagate down to the config script.
@@ -463,10 +463,10 @@ def get_torch_source() -> (Union[str, None], str):
    url = "https://download.pytorch.org/whl/cpu"

    if device == "cuda":
-        url = "https://download.pytorch.org/whl/cu117"
+        url = "https://download.pytorch.org/whl/cu118"
        optional_modules = "[xformers,onnx-cuda]"
    if device == "cuda_and_dml":
-        url = "https://download.pytorch.org/whl/cu117"
+        url = "https://download.pytorch.org/whl/cu118"
        optional_modules = "[xformers,onnx-directml]"

    # in all other cases, Torch wheels should be coming from PyPi as of Torch 1.13
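A minimal sketch of how a `(url, optional_modules)` pair like the one selected above could be turned into a pip command line; the helper name and call below are illustrative, not the installer's actual code.

```python
# Hypothetical helper: shows how an extra-index URL plus optional modules
# (as chosen in get_torch_source above) map onto a pip invocation.
def build_pip_command(package: str, optional_modules: str, url: str | None) -> list[str]:
    cmd = ["pip", "install", f"{package}{optional_modules}", "--use-pep517"]
    if url:
        cmd += ["--extra-index-url", url]
    return cmd


print(build_pip_command("InvokeAI", "[xformers,onnx-cuda]", "https://download.pytorch.org/whl/cu118"))
# ['pip', 'install', 'InvokeAI[xformers,onnx-cuda]', '--use-pep517',
#  '--extra-index-url', 'https://download.pytorch.org/whl/cu118']
```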
@@ -104,8 +104,12 @@ async def update_model(
        ):  # model manager moved model path during rename - don't overwrite it
            info.path = new_info.get("path")

+        # replace empty string values with None/null to avoid phenomenon of vae: ''
+        info_dict = info.dict()
+        info_dict = {x: info_dict[x] if info_dict[x] else None for x in info_dict.keys()}
+
        ApiDependencies.invoker.services.model_manager.update_model(
-            model_name=model_name, base_model=base_model, model_type=model_type, model_attributes=info.dict()
+            model_name=model_name, base_model=base_model, model_type=model_type, model_attributes=info_dict
        )

        model_raw = ApiDependencies.invoker.services.model_manager.list_model(
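A small illustration of the empty-string sanitization added above; the field names and values here are made up.

```python
# Illustrative field names/values only: empty strings (e.g. vae: '') become
# None before the attributes reach the model manager.
info_dict = {"model_name": "SDXL base", "vae": "", "description": None}
sanitized = {x: info_dict[x] if info_dict[x] else None for x in info_dict.keys()}
print(sanitized)  # {'model_name': 'SDXL base', 'vae': None, 'description': None}
```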
@@ -2,6 +2,7 @@ from typing import Literal, Optional, Union

from pydantic import Field

+from ...version import __version__
from invokeai.app.invocations.baseinvocation import (
    BaseInvocation,
    BaseInvocationOutput,
@@ -23,6 +24,7 @@ class LoRAMetadataField(BaseModelExcludeNull):
class CoreMetadata(BaseModelExcludeNull):
    """Core generation metadata for an image generated in InvokeAI."""

+    app_version: str = Field(default=__version__, description="The version of InvokeAI used to generate this image")
    generation_mode: str = Field(
        description="The generation mode that output this image",
    )
@@ -21,7 +21,6 @@ from argparse import Namespace
from enum import Enum
from pathlib import Path
from shutil import get_terminal_size
-from typing import get_type_hints
from urllib import request

import npyscreen
@@ -396,13 +395,23 @@ Use cursor arrows to make a checkbox selection, and space to toggle.
            max_width=80,
            scroll_exit=True,
        )
-        self.max_cache_size = self.add_widget_intelligent(
-            IntTitleSlider,
+        self.nextrely += 1
+        self.add_widget_intelligent(
+            npyscreen.TitleFixedText,
            name="RAM cache size (GB). Make this at least large enough to hold a single full model.",
-            value=old_opts.max_cache_size,
-            out_of=MAX_RAM,
-            lowest=3,
-            begin_entry_at=6,
+            begin_entry_at=0,
+            editable=False,
+            color="CONTROL",
            scroll_exit=True,
        )
+        self.nextrely -= 1
+        self.max_cache_size = self.add_widget_intelligent(
+            npyscreen.Slider,
+            value=clip(old_opts.max_cache_size, range=(3.0, MAX_RAM), step=0.5),
+            out_of=round(MAX_RAM),
+            lowest=0.0,
+            step=0.5,
+            relx=8,
+            scroll_exit=True,
+        )
        if HAS_CUDA:
@@ -418,7 +427,7 @@ Use cursor arrows to make a checkbox selection, and space to toggle.
            self.nextrely -= 1
            self.max_vram_cache_size = self.add_widget_intelligent(
                npyscreen.Slider,
-                value=old_opts.max_vram_cache_size,
+                value=clip(old_opts.max_vram_cache_size, range=(0, MAX_VRAM), step=0.25),
                out_of=round(MAX_VRAM * 2) / 2,
                lowest=0.0,
                relx=8,
@@ -596,6 +605,16 @@ def default_user_selections(program_opts: Namespace) -> InstallSelections:
    )


+# -------------------------------------
+def clip(value: float, range: tuple[float, float], step: float) -> float:
+    minimum, maximum = range
+    if value < minimum:
+        value = minimum
+    if value > maximum:
+        value = maximum
+    return round(value / step) * step
+
+
# -------------------------------------
def initialize_rootdir(root: Path, yes_to_all: bool = False):
    logger.info("Initializing InvokeAI runtime directory")
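The new `clip()` helper exercised in isolation; the 32 GB and 12 GB maxima below are illustrative stand-ins for MAX_RAM and MAX_VRAM.

```python
def clip(value: float, range: tuple[float, float], step: float) -> float:
    # Same definition as the function added above.
    minimum, maximum = range
    if value < minimum:
        value = minimum
    if value > maximum:
        value = maximum
    return round(value / step) * step


# Illustrative limits: a stale 2 GB RAM-cache value is pulled up to the 3 GB
# floor, and a fractional VRAM value is snapped to the nearest 0.25 GB.
print(clip(2.0, range=(3.0, 32.0), step=0.5))   # 3.0
print(clip(5.3, range=(0.0, 12.0), step=0.25))  # 5.25
```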
@@ -591,7 +591,6 @@ script, which will perform a full upgrade in place.""",
    # TODO: revisit - don't rely on invokeai.yaml to exist yet!
    dest_is_setup = (dest_root / "models/core").exists() and (dest_root / "databases").exists()
    if not dest_is_setup:
-        import invokeai.frontend.install.invokeai_configure
        from invokeai.backend.install.invokeai_configure import initialize_rootdir

        initialize_rootdir(dest_root, True)
@@ -143,7 +143,7 @@ class ModelPatcher:
                    # with torch.autocast(device_type="cpu"):
                    layer.to(dtype=torch.float32)
                    layer_scale = layer.alpha / layer.rank if (layer.alpha and layer.rank) else 1.0
-                    layer_weight = layer.get_weight() * lora_weight * layer_scale
+                    layer_weight = layer.get_weight(original_weights[module_key]) * lora_weight * layer_scale

                    if module.weight.shape != layer_weight.shape:
                        # TODO: debug on lycoris
@@ -361,7 +361,8 @@ class ONNXModelPatcher:

                layer.to(dtype=torch.float32)
                layer_key = layer_key.replace(prefix, "")
-                layer_weight = layer.get_weight().detach().cpu().numpy() * lora_weight
+                # TODO: rewrite to pass original tensor weight(required by ia3)
+                layer_weight = layer.get_weight(None).detach().cpu().numpy() * lora_weight
                if layer_key is blended_loras:
                    blended_loras[layer_key] += layer_weight
                else:
@@ -526,7 +526,7 @@ class ModelManager(object):
        # Does the config explicitly override the submodel?
        if submodel_type is not None and hasattr(model_config, submodel_type):
            submodel_path = getattr(model_config, submodel_type)
-            if submodel_path is not None:
+            if submodel_path is not None and len(submodel_path) > 0:
                model_path = getattr(model_config, submodel_type)
                is_submodel_override = True
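A minimal sketch, separate from the ModelManager code, of the three config states the added length check distinguishes.

```python
# Illustrative only: the submodel attribute on a model config can be missing,
# an empty string (now treated as "no override"), or a real path.
def is_submodel_override(submodel_path) -> bool:
    return submodel_path is not None and len(submodel_path) > 0

print(is_submodel_override(None))                          # False
print(is_submodel_override(""))                            # False ('' previously passed the `is not None` check)
print(is_submodel_override("sdxl/vae/sdxl-vae-fp16-fix"))  # True
```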
@@ -17,6 +17,7 @@ from .models import (
    SilenceWarnings,
    InvalidModelException,
)
+from .util import lora_token_vector_length
from .models.base import read_checkpoint_meta


@@ -315,38 +316,16 @@ class LoRACheckpointProbe(CheckpointProbeBase):

    def get_base_type(self) -> BaseModelType:
        checkpoint = self.checkpoint
+        token_vector_length = lora_token_vector_length(checkpoint)

-        # SD-2 models are very hard to probe. These probes are brittle and likely to fail in the future
-        # There are also some "SD-2 LoRAs" that have identical keys and shapes to SD-1 and will be
-        # misclassified as SD-1
-        key = "lora_te_text_model_encoder_layers_0_mlp_fc1.lora_down.weight"
-        if key in checkpoint and checkpoint[key].shape[0] == 320:
-            return BaseModelType.StableDiffusion2
-
-        key = "lora_unet_output_blocks_5_1_transformer_blocks_1_ff_net_2.lora_up.weight"
-        if key in checkpoint:
-            return BaseModelType.StableDiffusionXL
-
-        key1 = "lora_te_text_model_encoder_layers_0_mlp_fc1.lora_down.weight"
-        key2 = "lora_te_text_model_encoder_layers_0_self_attn_k_proj.lora_down.weight"
-        key3 = "lora_te_text_model_encoder_layers_0_self_attn_k_proj.hada_w1_a"
-
-        lora_token_vector_length = (
-            checkpoint[key1].shape[1]
-            if key1 in checkpoint
-            else checkpoint[key2].shape[1]
-            if key2 in checkpoint
-            else checkpoint[key3].shape[0]
-            if key3 in checkpoint
-            else None
-        )
-
-        if lora_token_vector_length == 768:
+        if token_vector_length == 768:
            return BaseModelType.StableDiffusion1
-        elif lora_token_vector_length == 1024:
+        elif token_vector_length == 1024:
            return BaseModelType.StableDiffusion2
+        elif token_vector_length == 2048:
+            return BaseModelType.StableDiffusionXL
        else:
-            raise InvalidModelException(f"Unknown LoRA type")
+            raise InvalidModelException(f"Unknown LoRA type: {self.checkpoint_path}")


class TextualInversionCheckpointProbe(CheckpointProbeBase):
@@ -122,41 +122,7 @@ class LoRALayerBase:
        self.rank = None  # set in layer implementation
        self.layer_key = layer_key

-    def forward(
-        self,
-        module: torch.nn.Module,
-        input_h: Any,  # for real looks like Tuple[torch.nn.Tensor] but not sure
-        multiplier: float,
-    ):
-        if type(module) == torch.nn.Conv2d:
-            op = torch.nn.functional.conv2d
-            extra_args = dict(
-                stride=module.stride,
-                padding=module.padding,
-                dilation=module.dilation,
-                groups=module.groups,
-            )
-
-        else:
-            op = torch.nn.functional.linear
-            extra_args = {}
-
-        weight = self.get_weight()
-
-        bias = self.bias if self.bias is not None else 0
-        scale = self.alpha / self.rank if (self.alpha and self.rank) else 1.0
-        return (
-            op(
-                *input_h,
-                (weight + bias).view(module.weight.shape),
-                None,
-                **extra_args,
-            )
-            * multiplier
-            * scale
-        )
-
-    def get_weight(self):
+    def get_weight(self, orig_weight: torch.Tensor):
        raise NotImplementedError()

    def calc_size(self) -> int:
@@ -197,7 +163,7 @@ class LoRALayer(LoRALayerBase):

        self.rank = self.down.shape[0]

-    def get_weight(self):
+    def get_weight(self, orig_weight: torch.Tensor):
        if self.mid is not None:
            up = self.up.reshape(self.up.shape[0], self.up.shape[1])
            down = self.down.reshape(self.down.shape[0], self.down.shape[1])
@@ -260,7 +226,7 @@ class LoHALayer(LoRALayerBase):

        self.rank = self.w1_b.shape[0]

-    def get_weight(self):
+    def get_weight(self, orig_weight: torch.Tensor):
        if self.t1 is None:
            weight = (self.w1_a @ self.w1_b) * (self.w2_a @ self.w2_b)

@@ -342,7 +308,7 @@ class LoKRLayer(LoRALayerBase):
        else:
            self.rank = None  # unscaled

-    def get_weight(self):
+    def get_weight(self, orig_weight: torch.Tensor):
        w1 = self.w1
        if w1 is None:
            w1 = self.w1_a @ self.w1_b
@@ -410,7 +376,7 @@ class FullLayer(LoRALayerBase):

        self.rank = None  # unscaled

-    def get_weight(self):
+    def get_weight(self, orig_weight: torch.Tensor):
        return self.weight

    def calc_size(self) -> int:
@@ -428,6 +394,45 @@ class FullLayer(LoRALayerBase):
        self.weight = self.weight.to(device=device, dtype=dtype)


+class IA3Layer(LoRALayerBase):
+    # weight: torch.Tensor
+    # on_input: torch.Tensor
+
+    def __init__(
+        self,
+        layer_key: str,
+        values: dict,
+    ):
+        super().__init__(layer_key, values)
+
+        self.weight = values["weight"]
+        self.on_input = values["on_input"]
+
+        self.rank = None  # unscaled
+
+    def get_weight(self, orig_weight: torch.Tensor):
+        weight = self.weight
+        if not self.on_input:
+            weight = weight.reshape(-1, 1)
+        return orig_weight * weight
+
+    def calc_size(self) -> int:
+        model_size = super().calc_size()
+        model_size += self.weight.nelement() * self.weight.element_size()
+        model_size += self.on_input.nelement() * self.on_input.element_size()
+        return model_size
+
+    def to(
+        self,
+        device: Optional[torch.device] = None,
+        dtype: Optional[torch.dtype] = None,
+    ):
+        super().to(device=device, dtype=dtype)
+
+        self.weight = self.weight.to(device=device, dtype=dtype)
+        self.on_input = self.on_input.to(device=device, dtype=dtype)
+
+
# TODO: rename all methods used in model logic with Info postfix and remove here Raw postfix
class LoRAModelRaw:  # (torch.nn.Module):
    _name: str
@@ -547,11 +552,15 @@ class LoRAModelRaw:  # (torch.nn.Module):
            elif "lokr_w1_b" in values or "lokr_w1" in values:
                layer = LoKRLayer(layer_key, values)

+            # diff
            elif "diff" in values:
                layer = FullLayer(layer_key, values)

+            # ia3
+            elif "weight" in values and "on_input" in values:
+                layer = IA3Layer(layer_key, values)
+
            else:
-                # TODO: ia3/... format
                print(f">> Encountered unknown lora layer module in {model.name}: {layer_key} - {list(values.keys())}")
                raise Exception("Unknown lora format!")
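A minimal sketch of why `get_weight()` now receives the original module weight: an IA3 layer rescales the existing weight element-wise instead of returning an additive low-rank delta. Tensor shapes and values below are illustrative only.

```python
import torch

# Illustrative shapes: IA3 stores a per-output-channel scale, so the patched
# weight is the original weight multiplied element-wise by that scale.
# LoRA-style layers ignore orig_weight and return an additive delta instead.
orig_weight = torch.randn(4, 8)    # stand-in for module.weight
ia3_scale = torch.rand(4)          # stand-in for IA3Layer.weight with on_input=False

weight = ia3_scale.reshape(-1, 1)  # same reshape IA3Layer.get_weight() performs
patched = orig_weight * weight     # what get_weight(orig_weight) returns
print(patched.shape)               # torch.Size([4, 8])
```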
invokeai/backend/model_management/util.py (new file, 75 lines)

@@ -0,0 +1,75 @@
+# Copyright (c) 2023 The InvokeAI Development Team
+"""Utilities used by the Model Manager"""
+
+
+def lora_token_vector_length(checkpoint: dict) -> int:
+    """
+    Given a checkpoint in memory, return the lora token vector length
+
+    :param checkpoint: The checkpoint
+    """
+
+    def _get_shape_1(key, tensor, checkpoint):
+        lora_token_vector_length = None
+
+        if "." not in key:
+            return lora_token_vector_length  # wrong key format
+        model_key, lora_key = key.split(".", 1)
+
+        # check lora/locon
+        if lora_key == "lora_down.weight":
+            lora_token_vector_length = tensor.shape[1]
+
+        # check loha (don't worry about hada_t1/hada_t2 as it used only in 4d shapes)
+        elif lora_key in ["hada_w1_b", "hada_w2_b"]:
+            lora_token_vector_length = tensor.shape[1]
+
+        # check lokr (don't worry about lokr_t2 as it used only in 4d shapes)
+        elif "lokr_" in lora_key:
+            if model_key + ".lokr_w1" in checkpoint:
+                _lokr_w1 = checkpoint[model_key + ".lokr_w1"]
+            elif model_key + "lokr_w1_b" in checkpoint:
+                _lokr_w1 = checkpoint[model_key + ".lokr_w1_b"]
+            else:
+                return lora_token_vector_length  # unknown format
+
+            if model_key + ".lokr_w2" in checkpoint:
+                _lokr_w2 = checkpoint[model_key + ".lokr_w2"]
+            elif model_key + "lokr_w2_b" in checkpoint:
+                _lokr_w2 = checkpoint[model_key + ".lokr_w2_b"]
+            else:
+                return lora_token_vector_length  # unknown format
+
+            lora_token_vector_length = _lokr_w1.shape[1] * _lokr_w2.shape[1]
+
+        elif lora_key == "diff":
+            lora_token_vector_length = tensor.shape[1]
+
+        # ia3 can be detected only by shape[0] in text encoder
+        elif lora_key == "weight" and "lora_unet_" not in model_key:
+            lora_token_vector_length = tensor.shape[0]
+
+        return lora_token_vector_length
+
+    lora_token_vector_length = None
+    lora_te1_length = None
+    lora_te2_length = None
+    for key, tensor in checkpoint.items():
+        if key.startswith("lora_unet_") and ("_attn2_to_k." in key or "_attn2_to_v." in key):
+            lora_token_vector_length = _get_shape_1(key, tensor, checkpoint)
+        elif key.startswith("lora_te") and "_self_attn_" in key:
+            tmp_length = _get_shape_1(key, tensor, checkpoint)
+            if key.startswith("lora_te_"):
+                lora_token_vector_length = tmp_length
+            elif key.startswith("lora_te1_"):
+                lora_te1_length = tmp_length
+            elif key.startswith("lora_te2_"):
+                lora_te2_length = tmp_length
+
+        if lora_te1_length is not None and lora_te2_length is not None:
+            lora_token_vector_length = lora_te1_length + lora_te2_length
+
+        if lora_token_vector_length is not None:
+            break
+
+    return lora_token_vector_length
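A hedged usage sketch of the new helper; the state-dict key and tensor below are a hypothetical, minimal SD-1-style LoRA.

```python
import torch

from invokeai.backend.model_management.util import lora_token_vector_length

# Hypothetical, minimal SD-1-style LoRA state dict: one text-encoder
# self-attention key whose lora_down weight has a 768-wide input dimension.
fake_checkpoint = {
    "lora_te_text_model_encoder_layers_0_self_attn_k_proj.lora_down.weight": torch.zeros(4, 768),
}

print(lora_token_vector_length(fake_checkpoint))
# 768 -> probed as StableDiffusion1 (1024 -> StableDiffusion2, 2048 -> StableDiffusionXL)
```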
@@ -1,6 +1,3 @@
"""
Initialization file for invokeai.frontend.config
"""
-from .invokeai_configure import main as invokeai_configure
-from .invokeai_update import main as invokeai_update
-from .model_install import main as invokeai_model_install
@@ -1,4 +1,4 @@
"""
Wrapper for invokeai.backend.configure.invokeai_configure
"""
-from ...backend.install.invokeai_configure import main
+from ...backend.install.invokeai_configure import main as invokeai_configure
@@ -382,7 +382,8 @@ def run_cli(args: Namespace):

def main():
    args = _parse_args()
-    config.parse_args(["--root", str(args.root_dir)])
+    if args.root_dir:
+        config.parse_args(["--root", str(args.root_dir)])

    try:
        if args.front_end:
File diff suppressed because one or more lines are too long

@@ -1,4 +1,4 @@
-import{B as m,g7 as Je,A as y,a5 as Ka,g8 as Xa,af as va,aj as d,g9 as b,ga as t,gb as Ya,gc as h,gd as ua,ge as Ja,gf as Qa,aL as Za,gg as et,ad as rt,gh as at}from"./index-dd054634.js";import{s as fa,n as o,t as tt,o as ha,p as ot,q as ma,v as ga,w as ya,x as it,y as Sa,z as pa,A as xr,B as nt,D as lt,E as st,F as xa,G as $a,H as ka,J as dt,K as _a,L as ct,M as bt,N as vt,O as ut,Q as wa,R as ft,S as ht,T as mt,U as gt,V as yt,W as St,e as pt,X as xt}from"./menu-b42141e3.js";var za=String.raw,Ca=za`
+import{B as m,g7 as Je,A as y,a5 as Ka,g8 as Xa,af as va,aj as d,g9 as b,ga as t,gb as Ya,gc as h,gd as ua,ge as Ja,gf as Qa,aL as Za,gg as et,ad as rt,gh as at}from"./index-815faab3.js";import{s as fa,n as o,t as tt,o as ha,p as ot,q as ma,v as ga,w as ya,x as it,y as Sa,z as pa,A as xr,B as nt,D as lt,E as st,F as xa,G as $a,H as ka,J as dt,K as _a,L as ct,M as bt,N as vt,O as ut,Q as wa,R as ft,S as ht,T as mt,U as gt,V as yt,W as St,e as pt,X as xt}from"./menu-e9f8a36e.js";var za=String.raw,Ca=za`
:root,
:host {
  --chakra-vh: 100vh;

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long
invokeai/frontend/web/dist/index.html (vendored, 2 lines changed)
@@ -12,7 +12,7 @@
        margin: 0;
      }
    </style>
-    <script type="module" crossorigin src="./assets/index-dd054634.js"></script>
+    <script type="module" crossorigin src="./assets/index-815faab3.js"></script>
  </head>

  <body dir="ltr">
@@ -1,55 +1,58 @@
import { modelChanged } from 'features/parameters/store/generationSlice';
import { setActiveTab } from 'features/ui/store/uiSlice';
-import { forEach } from 'lodash-es';
import { NON_REFINER_BASE_MODELS } from 'services/api/constants';
-import {
-  MainModelConfigEntity,
-  modelsApi,
-} from 'services/api/endpoints/models';
+import { mainModelsAdapter, modelsApi } from 'services/api/endpoints/models';
import { startAppListening } from '..';

export const addTabChangedListener = () => {
  startAppListening({
    actionCreator: setActiveTab,
-    effect: (action, { getState, dispatch }) => {
+    effect: async (action, { getState, dispatch }) => {
      const activeTabName = action.payload;
      if (activeTabName === 'unifiedCanvas') {
-        // grab the models from RTK Query cache
-        const { data } = modelsApi.endpoints.getMainModels.select(
-          NON_REFINER_BASE_MODELS
-        )(getState());
+        const currentBaseModel = getState().generation.model?.base_model;

-        if (!data) {
-          // no models yet, so we can't do anything
-          dispatch(modelChanged(null));
+        if (currentBaseModel && ['sd-1', 'sd-2'].includes(currentBaseModel)) {
+          // if we're already on a valid model, no change needed
          return;
        }

-        // need to filter out all the invalid canvas models (currently, this is just sdxl)
-        const validCanvasModels: MainModelConfigEntity[] = [];
+        try {
+          // just grab fresh models
+          const modelsRequest = dispatch(
+            modelsApi.endpoints.getMainModels.initiate(NON_REFINER_BASE_MODELS)
+          );
+          const models = await modelsRequest.unwrap();
+          // cancel this cache subscription
+          modelsRequest.unsubscribe();

-        forEach(data.entities, (entity) => {
-          if (!entity) {
+          if (!models.ids.length) {
+            // no valid canvas models
            dispatch(modelChanged(null));
            return;
          }
-          if (['sd-1', 'sd-2'].includes(entity.base_model)) {
-            validCanvasModels.push(entity);
+
+          // need to filter out all the invalid canvas models (currently sdxl & refiner)
+          const validCanvasModels = mainModelsAdapter
+            .getSelectors()
+            .selectAll(models)
+            .filter((model) => ['sd-1', 'sd-2'].includes(model.base_model));
+
+          const firstValidCanvasModel = validCanvasModels[0];
+
+          if (!firstValidCanvasModel) {
+            // no valid canvas models
+            dispatch(modelChanged(null));
+            return;
          }
-        });

-        // this could still be undefined even tho TS doesn't say so
-        const firstValidCanvasModel = validCanvasModels[0];
+          const { base_model, model_name, model_type } = firstValidCanvasModel;

-        if (!firstValidCanvasModel) {
-          // uh oh, we have no models that are valid for canvas
+          dispatch(modelChanged({ base_model, model_name, model_type }));
+        } catch {
+          // network request failed, bail
          dispatch(modelChanged(null));
          return;
        }
-
-        // only store the model name and base model in redux
-        const { base_model, model_name, model_type } = firstValidCanvasModel;
-
-        dispatch(modelChanged({ base_model, model_name, model_type }));
      }
    },
  });
@@ -54,12 +54,7 @@ const ParamLoRASelect = () => {
      });
    });

-    // Sort Alphabetically
-    data.sort((a, b) =>
-      a.label && b.label ? (a.label?.localeCompare(b.label) ? 1 : -1) : -1
-    );
-
-    return data.sort((a, b) => (a.disabled && !b.disabled ? -1 : 1));
+    return data.sort((a, b) => (a.disabled && !b.disabled ? 1 : -1));
  }, [loras, loraModels, currentMainModel?.base_model]);

  const handleChange = useCallback(
@@ -365,12 +365,19 @@ export const systemSlice = createSlice({
      state.statusTranslationKey = 'common.statusConnected';
      state.progressImage = null;

+      let errorDescription = undefined;
+
+      if (action.payload?.status === 422) {
+        errorDescription = 'Validation Error';
+      } else if (action.payload?.error) {
+        errorDescription = action.payload?.error as string;
+      }
+
      state.toastQueue.push(
        makeToast({
          title: t('toast.serverError'),
          status: 'error',
-          description:
-            action.payload?.status === 422 ? 'Validation Error' : undefined,
+          description: errorDescription,
        })
      );
    });
@@ -60,6 +60,9 @@ type InvokedSessionThunkConfig = {
const isErrorWithStatus = (error: unknown): error is { status: number } =>
  isObject(error) && 'status' in error;

+const isErrorWithDetail = (error: unknown): error is { detail: string } =>
+  isObject(error) && 'detail' in error;
+
/**
 * `SessionsService.invokeSession()` thunk
 */
@@ -85,7 +88,15 @@ export const sessionInvoked = createAsyncThunk<
        error: (error as any).body.detail,
      });
    }
-    return rejectWithValue({ arg, status: response.status, error });
+    if (isErrorWithDetail(error) && response.status === 403) {
+      return rejectWithValue({
+        arg,
+        status: response.status,
+        error: error.detail
+      });
+    }
+    if (error)
+      return rejectWithValue({ arg, status: response.status, error });
  }
});
@@ -1 +1 @@
-__version__ = "3.0.2rc1"
+__version__ = "3.0.2"
@@ -118,7 +118,7 @@ dependencies = [
[project.scripts]

# legacy entrypoints; provided for backwards compatibility
-"configure_invokeai.py" = "invokeai.frontend.install:invokeai_configure"
+"configure_invokeai.py" = "invokeai.frontend.install.invokeai_configure:invokeai_configure"
"textual_inversion.py" = "invokeai.frontend.training:invokeai_textual_inversion"

# shortcut commands to start cli and web
@@ -130,12 +130,12 @@ dependencies = [
"invokeai-web" = "invokeai.app.api_app:invoke_api"

# full commands
-"invokeai-configure" = "invokeai.frontend.install:invokeai_configure"
+"invokeai-configure" = "invokeai.frontend.install.invokeai_configure:invokeai_configure"
"invokeai-merge" = "invokeai.frontend.merge:invokeai_merge_diffusers"
"invokeai-ti" = "invokeai.frontend.training:invokeai_textual_inversion"
-"invokeai-model-install" = "invokeai.frontend.install:invokeai_model_install"
+"invokeai-model-install" = "invokeai.frontend.install.model_install:main"
"invokeai-migrate3" = "invokeai.backend.install.migrate_to_3:main"
-"invokeai-update" = "invokeai.frontend.install:invokeai_update"
+"invokeai-update" = "invokeai.frontend.install.invokeai_update:main"
"invokeai-metadata" = "invokeai.frontend.CLI.sd_metadata:print_metadata"
"invokeai-node-cli" = "invokeai.app.cli_app:invoke_cli"
"invokeai-node-web" = "invokeai.app.api_app:invoke_api"
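For reference, a console-script entry such as `"invokeai-configure" = "invokeai.frontend.install.invokeai_configure:invokeai_configure"` resolves to importing the named module and calling the named attribute; a minimal sketch of the equivalent call:

```python
# Roughly what the generated "invokeai-configure" console script does with the
# updated entry point (module path : callable attribute).
from invokeai.frontend.install.invokeai_configure import invokeai_configure

if __name__ == "__main__":
    invokeai_configure()
```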
scripts/create_checkpoint_template.py (new executable file, 34 lines)
@@ -0,0 +1,34 @@
+#!/usr/bin/env python
+"""
+Read a checkpoint/safetensors file and write out a template .json file containing
+its metadata for use in fast model probing.
+"""
+
+import sys
+import argparse
+import json
+
+from pathlib import Path
+
+from invokeai.backend.model_management.models.base import read_checkpoint_meta
+
+parser = argparse.ArgumentParser(description="Create a .json template from checkpoint/safetensors model")
+parser.add_argument("--checkpoint", "--in", type=Path, help="Path to the input checkpoint/safetensors file")
+parser.add_argument("--template", "--out", type=Path, help="Path to the output .json file")
+
+opt = parser.parse_args()
+ckpt = read_checkpoint_meta(opt.checkpoint)
+while "state_dict" in ckpt:
+    ckpt = ckpt["state_dict"]
+
+tmpl = {}
+
+for key, tensor in ckpt.items():
+    tmpl[key] = list(tensor.shape)
+
+try:
+    with open(opt.template, "w") as f:
+        json.dump(tmpl, f)
+    print(f"Template written out as {opt.template}")
+except Exception as e:
+    print(f"An exception occurred while writing template: {str(e)}")
scripts/verify_checkpoint_template.py (new executable file, 37 lines)
@@ -0,0 +1,37 @@
+#!/usr/bin/env python
+"""
+Read a checkpoint/safetensors file and compare it to a template .json.
+Returns True if their metadata match.
+"""
+
+import sys
+import argparse
+import json
+
+from pathlib import Path
+
+from invokeai.backend.model_management.models.base import read_checkpoint_meta
+
+parser = argparse.ArgumentParser(description="Compare a checkpoint/safetensors file to a JSON metadata template.")
+parser.add_argument("--checkpoint", "--in", type=Path, help="Path to the input checkpoint/safetensors file")
+parser.add_argument("--template", "--out", type=Path, help="Path to the template .json file to match against")
+
+opt = parser.parse_args()
+ckpt = read_checkpoint_meta(opt.checkpoint)
+while "state_dict" in ckpt:
+    ckpt = ckpt["state_dict"]
+
+checkpoint_metadata = {}
+
+for key, tensor in ckpt.items():
+    checkpoint_metadata[key] = list(tensor.shape)
+
+with open(opt.template, "r") as f:
+    template = json.load(f)
+
+if checkpoint_metadata == template:
+    print("True")
+    sys.exit(0)
+else:
+    print("False")
+    sys.exit(-1)
@@ -7,6 +7,7 @@ from invokeai.backend import ModelManager, BaseModelType, ModelType, SubModelType

BASIC_MODEL_NAME = ("SDXL base", BaseModelType.StableDiffusionXL, ModelType.Main)
VAE_OVERRIDE_MODEL_NAME = ("SDXL with VAE", BaseModelType.StableDiffusionXL, ModelType.Main)
+VAE_NULL_OVERRIDE_MODEL_NAME = ("SDXL with empty VAE", BaseModelType.StableDiffusionXL, ModelType.Main)


@pytest.fixture
@@ -36,3 +37,11 @@ def test_get_model_path_for_overridden_vae(model_manager: ModelManager, datadir: Path):
    expected_vae_path = datadir / "models" / "sdxl" / "vae" / "sdxl-vae-fp16-fix"
    assert vae_model_path == expected_vae_path
    assert is_override
+
+
+def test_get_model_path_for_null_overridden_vae(model_manager: ModelManager, datadir: Path):
+    model_config = model_manager._get_model_config(
+        VAE_NULL_OVERRIDE_MODEL_NAME[1], VAE_NULL_OVERRIDE_MODEL_NAME[0], VAE_NULL_OVERRIDE_MODEL_NAME[2]
+    )
+    vae_model_path, is_override = model_manager._get_model_path(model_config, SubModelType.Vae)
+    assert not is_override
@@ -13,3 +13,10 @@ sdxl/main/SDXL with VAE:
  vae: sdxl/vae/sdxl-vae-fp16-fix/
  variant: normal
  format: diffusers
+
+sdxl/main/SDXL with empty VAE:
+  path: sdxl/main/SDXL base 1_0
+  description: SDXL with customized VAE
+  vae: ''
+  variant: normal
+  format: diffusers