Sometimes, diffusers model components (tokenizer, unet, etc.) have multiple weights files in the same directory.
In this situation, we assume the files are different versions of the same weights. For example, we may have multiple
formats (`.bin`, `.safetensors`) with different precisions. When downloading model files, we want to select only
the best of these files for the requested format and precision/variant.
The previous logic assumed that each model weights file would have the same base filename, but this assumption was
not always true. The logic is revised score each file and choose the best scoring file, resulting in only a single
file being downloaded for each submodel/subdirectory.
* UI in MM to create trigger phrases
* add scheduler and vaePrecision to config
* UI for configuring default settings for models'
* hook MM default model settings up to API
* add button to set default settings in parameters
* pull out trigger phrases
* back-end for default settings
* lint
* remove log;
gi
* ruff
* ruff format
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
- Use memory view for hashlib algorithms (closer to python 3.11's filehash API in hashlib)
- Remove `sha1_fast` (realized it doesn't even hash the whole file, it just does the first block)
- Add support for custom file filters
- Update docstrings
- Update tests
- When installing, model keys are now calculated from the model contents.
- .safetensors, .ckpt and other single file models are hashed with sha1
- The contents of diffusers directories are hashed using imohash (faster)
fixup yaml->sql db migration script to assign deterministic key
- this commit also detects and assigns the correct image encoder for
ip adapter models.
Model metadata includes the main model, VAE and refiner model.
These used full model configs, as returned by the server, as their metadata type.
LoRA and control adapter metadata only use the metadata identifier.
This created a difference in handling. After parsing a model/vae/refiner, we have its name and can display it. But for LoRAs and control adapters, we only have the model key and must query for the full model config to get the name.
This change makes main model/vae/refiner metadata only have the model key, like LoRAs and control adapters.
The render function is now async so fetching can occur within it. All metadata fields with models now only contain the identifier, and fetch the model name to render their values.
When we retrieve a list of models, upsert that data into the `getModelConfig` and `getModelConfigByAttrs` query caches.
With this change, calls to those two queries are almost always going to be free, because their caches will already have all models in them. The exception is queries for models that no longer exist.
Add concepts for metadata handlers. Handlers include parsers, recallers and validators for different metadata types:
- Parsers parse a raw metadata object of any shape to a structured object.
- Recallers load the parsed metadata into state. Recallers are optional, as some metadata types don't need to be loaded into state.
- Validators provide an additional layer of validation before recalling the metadata. This is needed because a metadata object may be valid, but not able to be recalled due to some other requirement, like base model compatibility. Validators are optional.
Sometimes metadata is not a single object but a list of items - like LoRAs. Metadata handlers may implement an optional set of "item" handlers which operate on individual items in the list.
Parsers and validators are async to allow fetching additional data, like a model config. Recallers are synchronous.
The these handlers are composed into a public API, exported as a `handlers` object. Besides the handlers functions, a metadata handler set includes:
- A function to get the label of the metadata type.
- An optional function to render the value of the metadata type.
- An optional function to render the _item_ value of the metadata type.
Gets the first model that matches the given name, base and type. Raises 404 if there isn't one.
This will be used for backwards compatibility with old metadata.
This was done in the frontend before but it's something the backend should handle.
The logic compares the found model paths to the path and source of all installed models. It excludes core models.
Refactor of metadata recall handling. This is in preparation for a backwards compatibility layer for models.
- Create helpers to fetch a model outside react (e.g. not in a hook)
- Created helpers to parse model metadata
- Renamed a lot of types that were confusing and/or had naming collisions
The setup of `ModelConfigBase` means autogenerated types have critical fields flagged as nullable (like `key` and `base`). Need to manually flag them as required.
- Support extended HF repoid syntax in TUI. This allows
installation of subfolders and safetensors files, as in
`XpucT/Deliberate::Deliberate_v5.safetensors`
- Add `error` and `error_traceback` properties to the install
job objects.
- Rename the `heuristic_import` route to `heuristic_install`.
- Fix the example `config` input in the `heuristic_install` route.
Notable updates:
- Minor version of RTK includes customizable selectors for RTK Query, so we can remove the patch that was added to ensure only the LRU memoize function was used for perf reasons. Updated to use the LRU memoize function.
- Major version of react-resizable-panels. No breaking changes, works great, and you can now resize all panels when dragging at the intersection point of panels. Cool!
- Minor (?) version of nanostores. `action` API is removed, we were using it in one spot. Fixed.
- @invoke-ai/eslint-config-react has all deps bumped and now has its dependent plugins/configs listed as normal dependencies (as opposed to peer deps). This means we can remove those packages from explicit dev deps.
- Use a single listener for all of the to keep them in one spot
- Use the bulk download item name as a toast id so we can update the existing toasts
- Update handling to work with other environments
- Move all bulk download handling from components to listener
Double underscores are used in the app but it doesn't actually do or convey anything that single underscores don't already do. Considered unpythonic except for actual dunder/magic methods.
Consolidate graph processing logic into session processor.
With graphs as the unit of work, and the session queue distributing graphs, we no longer need the invocation queue or processor.
Instead, the session processor dequeues the next session and processes it in a simple loop, greatly simplifying the app.
- Remove `graph_execution_manager` service.
- Remove `queue` (invocation queue) service.
- Remove `processor` (invocation processor) service.
- Remove queue-related logic from `Invoker`. It now only starts and stops the services, providing them with access to other services.
- Remove unused `invocation_retrieval_error` and `session_retrieval_error` events, these are no longer needed.
- Clean up stats service now that it is less coupled to the rest of the app.
- Refactor cancellation logic - cancellations now originate from session queue (i.e. HTTP cancel endpoint) and are emitted as events. Processor gets the events and sets the canceled event. Access to this event is provided to the invocation context for e.g. the step callback.
- Remove `sessions` router; it provided access to `graph_executions` but that no longer exists.
`GraphInvocation` is a node that can contain a whole graph. It is removed for a number of reasons:
1. This feature was unused (the UI doesn't support it) and there is no plan for it to be used.
The use-case it served is known in other node execution engines as "node groups" or "blocks" - a self-contained group of nodes, which has group inputs and outputs. This is a planned feature that will be handled client-side.
2. It adds substantial complexity to the graph processing logic. It's probably not enough to have a measurable performance impact but it does make it harder to work in the graph logic.
3. It allows for graphs to be recursive, and the improved invocations union handling does not play well with it. Actually, it works fine within `graph.py` but not in the tests for some reason. I do not understand why. There's probably a workaround, but I took this as encouragement to remove `GraphInvocation` from the app since we don't use it.
The change to `Graph.nodes` and `GraphExecutionState.results` validation requires some fanagling to get the OpenAPI schema generation to work. See new comments for a details.
We use pydantic to validate a union of valid invocations when instantiating a graph.
Previously, we constructed the union while creating the `Graph` class. This introduces a dependency on the order of imports.
For example, consider a setup where we have 3 invocations in the app:
- Python executes the module where `FirstInvocation` is defined, registering `FirstInvocation`.
- Python executes the module where `SecondInvocation` is defined, registering `SecondInvocation`.
- Python executes the module where `Graph` is defined. A union of invocations is created and used to define the `Graph.nodes` field. The union contains `FirstInvocation` and `SecondInvocation`.
- Python executes the module where `ThirdInvocation` is defined, registering `ThirdInvocation`.
- A graph is created that includes `ThirdInvocation`. Pydantic validates the graph using the union, which does not know about `ThirdInvocation`, raising a `ValidationError` about an unknown invocation type.
This scenario has been particularly problematic in tests, where we may create invocations dynamically. The test files have to be structured in such a way that the imports happen in the right order. It's a major pain.
This PR refactors the validation of graph nodes to resolve this issue:
- `BaseInvocation` gets a new method `get_typeadapter`. This builds a pydantic `TypeAdapter` for the union of all registered invocations, caching it after the first call.
- `Graph.nodes`'s type is widened to `dict[str, BaseInvocation]`. This actually is a nice bonus, because we get better type hints whenever we reference `some_graph.nodes`.
- A "plain" field validator takes over the validation logic for `Graph.nodes`. "Plain" validators totally override pydantic's own validation logic. The validator grabs the `TypeAdapter` from `BaseInvocation`, then validates each node with it. The validation is identical to the previous implementation - we get the same errors.
`BaseInvocationOutput` gets the same treatment.
- Replace AnyModelLoader with ModelLoaderRegistry
- Fix type check errors in multiple files
- Remove apparently unneeded `get_model_config_enum()` method from model manager
- Remove last vestiges of old model manager
- Updated tests and documentation
resolve conflict with seamless.py
- Rename old "model_management" directory to "model_management_OLD" in order to catch
dangling references to original model manager.
- Caught and fixed most dangling references (still checking)
- Rename lora, textual_inversion and model_patcher modules
- Introduce a RawModel base class to simplfy the Union returned by the
model loaders.
- Tidy up the model manager 2-related tests. Add useful fixtures, and
a finalizer to the queue and installer fixtures that will stop the
services and release threads.
- ModelMetadataStoreService is now injected into ModelRecordStoreService
(these two services are really joined at the hip, and should someday be merged)
- ModelRecordStoreService is now injected into ModelManagerService
- Reduced timeout value for the various installer and download wait*() methods
- Introduced a Mock modelmanager for testing
- Removed bare print() statement with _logger in the install helper backend.
- Removed unused code from model loader init file
- Made `locker` a private variable in the `LoadedModel` object.
- Fixed up model merge frontend (will be deprecated anyway!)
- Update most model identifiers to be `{key: string}` instead of name/base/type. Doesn't change the model select components yet.
- Update model _parameters_, stored in redux, to be `{key: string, base: BaseModel}` - we need to store the base model to be able to check model compatibility. May want to store the whole config? Not sure...
- Replace legacy model manager service with the v2 manager.
- Update invocations to use new load interface.
- Fixed many but not all type checking errors in the invocations. Most
were unrelated to model manager
- Updated routes. All the new routes live under the route tag
`model_manager_v2`. To avoid confusion with the old routes,
they have the URL prefix `/api/v2/models`. The old routes
have been de-registered.
- Added a pytest for the loader.
- Updated documentation in contributing/MODEL_MANAGER.md