mirror of
https://github.com/invoke-ai/InvokeAI
synced 2024-08-30 20:32:17 +00:00
4536e4a8b6
* add basic functionality for model metadata fetching from hf and civitai
* add storage
* start unit tests
* add unit tests and documentation
* add missing dependency for pytests
* remove redundant fetch; add modified/published dates; updated docs
* add code to select diffusers files based on the variant type
* implement Civitai installs
* make huggingface parallel downloading work
* add unit tests for model installation manager
  - Fixed race condition on selection of download destination path
  - Add fixtures common to several model_manager_2 unit tests
  - Added dummy model files for testing diffusers and safetensors downloading/probing
  - Refactored code for selecting proper variant from list of huggingface repo files
  - Regrouped ordering of methods in model_install_default.py
* improve Civitai model downloading
  - Provide a better error message when Civitai requires an access token (doesn't give a 403 forbidden, but redirects to the HTML of an authorization page -- arrgh)
  - Handle case of Civitai providing a primary download link plus additional links for VAEs, config files, etc
* add routes for retrieving metadata and tags
* code tidying and documentation
* fix ruff errors
* add file needed to maintain test root directory in repo for unit tests
* fix self->cls in classmethod
* add pydantic plugin for mypy
* use TestSession instead of requests.Session to prevent any internet activity
  improve logging
  fix error message formatting
  fix logging again
  fix forward vs reverse slash issue in Windows install tests
* Several fixes of problems detected during PR review:
  - Implement cancel_model_install_job and get_model_install_job routes to allow for better control of model download and install.
  - Fix thread deadlock that occurred after cancelling an install.
  - Remove unneeded pytest_plugins section from tests/conftest.py
  - Remove unused _in_terminal_state() from model_install_default.
  - Remove outdated documentation from several spots.
  - Add workaround for Civitai API results which don't return correct URL for the default model.
* fix docs and tests to match get_job_by_source() rather than get_job()
* Update invokeai/backend/model_manager/metadata/fetch/huggingface.py
  Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
* Call CivitaiMetadata.model_validate_json() directly
  Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
* Second round of revisions suggested by @ryanjdick:
  - Fix type mismatch in `list_all_metadata()` route.
  - Do not have a default value for the model install job id
  - Remove static class variable declarations from non Pydantic classes
  - Change `id` field to `model_id` for the sqlite3 `model_tags` table.
  - Changed AFTER DELETE triggers to ON DELETE CASCADE for the metadata and tags tables.
  - Made the `id` field of the `model_metadata` table into a primary key to achieve uniqueness.
* Code cleanup suggested in PR review:
  - Narrowed the declaration of the `parts` attribute of the download progress event
  - Removed auto-conversion of str to Url in Url-containing sources
  - Fixed handling of `InvalidModelConfigException`
  - Made unknown sources raise `NotImplementedError` rather than `Exception`
  - Improved status reporting on cached HuggingFace access tokens
* Multiple fixes:
  - `job.total_size` returns a valid size for locally installed models
  - new route `list_models` returns a paged summary of model, name, description, tags and other essential info
  - fix a few type errors
* consolidated all invokeai root pytest fixtures into a single location
* Update invokeai/backend/model_manager/metadata/metadata_store.py
  Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
* Small tweaks in response to review comments:
  - Remove flake8 configuration from pyproject.toml
  - Use `id` rather than `modelId` for huggingface `ModelInfo` object
  - Use `last_modified` rather than `LastModified` for huggingface `ModelInfo` object
  - Add `sha256` field to file metadata downloaded from huggingface
  - Add `Invoker` argument to the model installer `start()` and `stop()` routines (but made it optional in order to facilitate use of the service outside the API)
  - Removed redundant `PRAGMA foreign_keys` from metadata store initialization code.
* Additional tweaks and minor bug fixes
  - Fix calculation of aggregate diffusers model size to only count the size of files, not files + directories (which gives different unit test results on different filesystems).
  - Refactor _get_metadata() and _get_download_urls() to have distinct code paths for Civitai, HuggingFace and URL sources.
  - Forward the `inplace` flag from the source to the job and added unit test for this.
  - Attach cached model metadata to the job rather than to the model install service.
* fix unit test that was breaking on windows due to CR/LF changing size of test json files
* fix ruff formatting
* a few last minor fixes before merging:
  - Turn job `error` and `error_type` into properties derived from the exception.
  - Add TODO comment about the reason for handling temporary directory destruction manually rather than using tempfile.tmpdir().
* add unit tests for reporting HTTP download errors

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
203 lines
8.5 KiB
Python
# Copyright (c) 2023 Lincoln D. Stein and the InvokeAI Development Team

"""This module defines core text-to-image model metadata fields.

Metadata comprises any descriptive information that is not essential
for getting the model to run. For example "author" is metadata, while
"type", "base" and "format" are not. The latter fields are part of the
model's config, as defined in invokeai.backend.model_manager.config.

Note that the "name" and "description" are also present in `config`
records. This is intentional. The config record fields are intended to
be editable by the user as a form of customization. The metadata
versions of these fields are intended to be kept in sync with the
remote repo.
"""

from datetime import datetime
from enum import Enum
from pathlib import Path
from typing import Any, Dict, List, Literal, Optional, Set, Tuple, Union

from huggingface_hub import configure_http_backend, hf_hub_url
from pydantic import BaseModel, Field, TypeAdapter
from pydantic.networks import AnyHttpUrl
from requests.sessions import Session
from typing_extensions import Annotated

from invokeai.backend.model_manager import ModelRepoVariant

from ..util import select_hf_files


class UnknownMetadataException(Exception):
    """Raised when no metadata is available for a model."""


class CommercialUsage(str, Enum):
    """Type of commercial usage allowed."""

    No = "None"
    Image = "Image"
    Rent = "Rent"
    RentCivit = "RentCivit"
    Sell = "Sell"


class LicenseRestrictions(BaseModel):
    """Broad categories of licensing restrictions."""

    AllowNoCredit: bool = Field(
        description="if true, model can be redistributed without crediting author", default=False
    )
    AllowDerivatives: bool = Field(description="if true, derivatives of this model can be redistributed", default=False)
    AllowDifferentLicense: bool = Field(
        description="if true, derivatives of this model can be redistributed under a different license", default=False
    )
    # The default must be a CommercialUsage member to match the declared type;
    # an empty set (default_factory=set) would never compare equal to CommercialUsage.No.
    AllowCommercialUse: CommercialUsage = Field(
        description="Type of commercial use allowed, or CommercialUsage.No if no commercial use is allowed.",
        default=CommercialUsage.No,
    )


class RemoteModelFile(BaseModel):
    """Information about a downloadable file that forms part of a model."""

    url: AnyHttpUrl = Field(description="The url to download this model file")
    path: Path = Field(description="The path to the file, relative to the model root")
    size: int = Field(description="The size of this file, in bytes")
    sha256: Optional[str] = Field(description="SHA256 hash of this model (not always available)", default=None)


class ModelMetadataBase(BaseModel):
    """Base class for model metadata information."""

    name: str = Field(description="model's name")
    author: str = Field(description="model's author")
    tags: Set[str] = Field(description="tags provided by model source")


class BaseMetadata(ModelMetadataBase):
    """Adds typing data for discriminated union."""

    type: Literal["basemetadata"] = "basemetadata"


class ModelMetadataWithFiles(ModelMetadataBase):
    """Base class for metadata that contains a list of downloadable model file(s)."""

    files: List[RemoteModelFile] = Field(description="model files and their sizes", default_factory=list)

    def download_urls(
        self,
        variant: Optional[ModelRepoVariant] = None,
        subfolder: Optional[Path] = None,
        session: Optional[Session] = None,
    ) -> List[RemoteModelFile]:
        """
        Return the list of remote files (each with its URL) needed to download the model.

        :param variant: Return files needed to reconstruct the indicated variant (e.g. ModelRepoVariant('fp16'))
        :param subfolder: Return files in the designated subfolder only
        :param session: A requests.Session object for offline testing

        Note that the "variant" and "subfolder" concepts currently only apply to HuggingFace.
        However Civitai does have fields for the precision and format of its models, and may
        provide variant selection criteria in the future.
        """
        return self.files


class CivitaiMetadata(ModelMetadataWithFiles):
    """Extended metadata fields provided by Civitai."""

    type: Literal["civitai"] = "civitai"
    id: int = Field(description="Civitai version identifier")
    version_name: str = Field(description="Version identifier, such as 'V2-alpha'")
    version_id: int = Field(description="Civitai model version identifier")
    created: datetime = Field(description="date the model was created")
    updated: datetime = Field(description="date the model was last modified")
    published: datetime = Field(description="date the model was published to Civitai")
    description: str = Field(description="text description of model; may contain HTML")
    version_description: str = Field(
        description="text description of the model's revision; usually change history; may contain HTML"
    )
    nsfw: bool = Field(description="whether the model tends to generate NSFW content", default=False)
    restrictions: LicenseRestrictions = Field(description="license terms", default_factory=LicenseRestrictions)
    trained_words: Set[str] = Field(description="words to trigger the model", default_factory=set)
    download_url: AnyHttpUrl = Field(description="download URL for this model")
    base_model_trained_on: str = Field(description="base model on which this model was trained (currently not an enum)")
    thumbnail_url: Optional[AnyHttpUrl] = Field(description="a thumbnail image for this model", default=None)
    weight_minmax: Tuple[float, float] = Field(
        description="minimum and maximum slider values for a LoRA or other secondary model", default=(-1.0, +2.0)
    )  # note: For future use

    @property
    def credit_required(self) -> bool:
        """Return True if you must give credit for derivatives of this model and images generated from it."""
        return not self.restrictions.AllowNoCredit

    @property
    def allow_commercial_use(self) -> bool:
        """Return True if commercial use is allowed."""
        return self.restrictions.AllowCommercialUse != CommercialUsage("None")

    @property
    def allow_derivatives(self) -> bool:
        """Return True if derivatives of this model can be redistributed."""
        return self.restrictions.AllowDerivatives

    @property
    def allow_different_license(self) -> bool:
        """Return True if derivatives of this model can use a different license."""
        return self.restrictions.AllowDifferentLicense
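
# A minimal sketch of how the license properties above resolve, written without
# pydantic so that it runs on its own. `Restrictions` and the two free functions
# are hypothetical stand-ins for `LicenseRestrictions` and the `CivitaiMetadata`
# properties; only the `CommercialUsage` members are copied from this module.
# It assumes the commercial-use field holds a single enum member.

```python
from dataclasses import dataclass
from enum import Enum


class CommercialUsage(str, Enum):
    """Copy of the enum defined in this module, repeated so the sketch is self-contained."""

    No = "None"
    Image = "Image"
    Rent = "Rent"
    RentCivit = "RentCivit"
    Sell = "Sell"


@dataclass
class Restrictions:
    """Hypothetical, pydantic-free stand-in for LicenseRestrictions."""

    AllowNoCredit: bool = False
    AllowCommercialUse: CommercialUsage = CommercialUsage.No


def credit_required(restrictions: Restrictions) -> bool:
    # Credit is required unless the license explicitly waives it.
    return not restrictions.AllowNoCredit


def allow_commercial_use(restrictions: Restrictions) -> bool:
    # Lookup by value: CommercialUsage("None") is the member CommercialUsage.No.
    return restrictions.AllowCommercialUse != CommercialUsage("None")


strict = Restrictions()
permissive = Restrictions(AllowNoCredit=True, AllowCommercialUse=CommercialUsage.Image)
print(credit_required(strict), allow_commercial_use(strict))  # True False
print(credit_required(permissive), allow_commercial_use(permissive))  # False True
```

# Because the enum mixes in `str`, its members also compare equal to their raw
# string values, which is convenient when the values arrive as JSON strings.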


class HuggingFaceMetadata(ModelMetadataWithFiles):
    """Extended metadata fields provided by HuggingFace."""

    type: Literal["huggingface"] = "huggingface"
    id: str = Field(description="huggingface model id")
    tag_dict: Dict[str, Any]
    last_modified: datetime = Field(description="date of last commit to repo")

    def download_urls(
        self,
        variant: Optional[ModelRepoVariant] = None,
        subfolder: Optional[Path] = None,
        session: Optional[Session] = None,
    ) -> List[RemoteModelFile]:
        """
        Return list of downloadable files, filtering by variant and subfolder, if any.

        :param variant: Return model files needed to reconstruct the indicated variant
        :param subfolder: Return model files from the designated subfolder only
        :param session: A requests.Session object used for internet-free testing

        Note that there is special variant-filtering behavior here:
        When the fp16 variant is requested and not available, the
        full-precision model is returned.
        """
        session = session or Session()
        configure_http_backend(backend_factory=lambda: session)  # used in testing

        paths = select_hf_files.filter_files(
            [x.path for x in self.files], variant, subfolder
        )  # all files in the model
        prefix = f"{subfolder}/" if subfolder else ""

        # the next step reads model_index.json to determine which subdirectories belong
        # to the model
        if Path(f"{prefix}model_index.json") in paths:
            url = hf_hub_url(self.id, filename="model_index.json", subfolder=subfolder)
            resp = session.get(url)
            resp.raise_for_status()
            submodels = resp.json()
            paths = [Path(subfolder or "", x) for x in paths if Path(x).parent.as_posix() in submodels]
            paths.insert(0, Path(f"{prefix}model_index.json"))

        return [x for x in self.files if x.path in paths]
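
# The model_index.json step in `download_urls()` can be illustrated offline.
# The sketch below is not the method itself: `filter_by_model_index` is a
# hypothetical helper, the index contents and file names are made up, and it
# assumes all candidate paths are given relative to the repo root. It keeps
# model_index.json plus any file whose parent directory is named as a key in
# the index, which is the idea the method's comment describes.

```python
from pathlib import Path
from typing import Any, Dict, List, Optional


def filter_by_model_index(
    paths: List[Path], model_index: Dict[str, Any], subfolder: Optional[Path] = None
) -> List[Path]:
    """Keep model_index.json and the files that live in a submodel directory it names."""
    prefix = Path(subfolder) if subfolder else None
    index_file = prefix / "model_index.json" if prefix else Path("model_index.json")
    if index_file not in paths:
        return list(paths)  # no index file, so there is nothing to filter on

    kept = [index_file]
    for path in paths:
        if prefix is not None:
            if prefix not in path.parents:
                continue  # outside the requested subfolder
            relative = path.relative_to(prefix)
        else:
            relative = path
        # Top-level files have parent "." and are dropped, as are unlisted directories.
        if relative.parent.as_posix() in model_index:
            kept.append(path)
    return kept


# Illustrative index: keys other than "_class_name" name submodel directories.
index = {
    "_class_name": "StableDiffusionPipeline",
    "unet": ["diffusers", "UNet2DConditionModel"],
    "vae": ["diffusers", "AutoencoderKL"],
}
candidates = [
    Path("model_index.json"),
    Path("README.md"),
    Path("unet/config.json"),
    Path("vae/config.json"),
    Path("safety_checker/model.safetensors"),
]
for kept_path in filter_by_model_index(candidates, index):
    print(kept_path.as_posix())
```

# With these inputs the README and the unlisted safety_checker directory are
# dropped, and model_index.json is listed first.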


AnyModelRepoMetadata = Annotated[Union[BaseMetadata, HuggingFaceMetadata, CivitaiMetadata], Field(discriminator="type")]
AnyModelRepoMetadataValidator = TypeAdapter(AnyModelRepoMetadata)
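
# A minimal sketch of how a discriminated union like `AnyModelRepoMetadata`
# picks the concrete class from the "type" field. The `Sketch*` classes are
# cut-down, hypothetical stand-ins for the metadata classes in this module, and
# the payload values are made up. It assumes pydantic v2 is installed.

```python
from typing import Literal, Union

from pydantic import BaseModel, Field, TypeAdapter
from typing_extensions import Annotated


class SketchBase(BaseModel):
    type: Literal["basemetadata"] = "basemetadata"
    name: str


class SketchHuggingFace(BaseModel):
    type: Literal["huggingface"] = "huggingface"
    name: str
    id: str  # repo id string


class SketchCivitai(BaseModel):
    type: Literal["civitai"] = "civitai"
    name: str
    id: int  # numeric identifier


AnySketch = Annotated[Union[SketchBase, SketchHuggingFace, SketchCivitai], Field(discriminator="type")]
sketch_validator = TypeAdapter(AnySketch)

# The "type" value alone decides which class validates the rest of the payload.
hf = sketch_validator.validate_python({"type": "huggingface", "name": "demo", "id": "example-org/example-model"})
civitai = sketch_validator.validate_json('{"type": "civitai", "name": "demo-lora", "id": 12345}')
print(type(hf).__name__, type(civitai).__name__)
```

# The same pattern lets a stored JSON blob be turned back into the right
# metadata class without the caller knowing its origin in advance.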