Adds full support for managing model-to-model relationships in the UI and backend.
Introduces RelatedModels subpanel for linking and unlinking models in model management.
- Adds REST API routes for adding, removing, and retrieving model relationships.
- New database migration: creates model_relationships table for bidirectional links.
- New service layer (model_relationships) for relationship management.
- Updated frontend: Related models float to top of LoRA/Main grouped model comboboxes for quick access.
- Added 'Show Only Related' toggle badge to MainModelPicker filter bar
**Amended commit to remove changes to ParamMainModelSelect.tsx and MainModelPicker.tsx to avoid conflict with upstream deletion/ rewrite**
Previously, custom node loading occurred _during module imports_. A consequence of this is that when a custom node import fails (e.g. its type clobbers an existing node), the app fails to start up.
In fact, any time we import basically anything from the app, we trigger custom node imports! Not good.
This logic is now in its own function, called as the API app starts up.
If a custom node load fails for any reason, it no longer prevents the app from starting up.
One other bonus we get from this is that we can now ensure custom nodes are loaded _after_ core nodes.
Any clobbering that may occur while loading custom nodes is now guaranteed to be a custom node clobbering a core node's type - and not the other way round.
Uvicorn's logging is rather verbose. This change adds a `log_level_network` config setting to independently control uvicorn's log outputs. The setting defaults to warning.
The change hides the helpful startup message that says the host and port we are running on.
For example: `Uvicorn running on http://0.0.0.0:9090 (Press CTRL+C to quit`
The ASGI lifespan handler is updated to log an equivalent message on startup, regardless of log level settings.
Besides being helpful, the launcher relies on a message like this to launch the app. So, previously, if the user set their log level to anything above info (e.g. warning or error), the launcher would fail to open the app. This change prevents that edge case.
This pops up every now and then and I could never figure it out. A user figured it out in #6936. The cause is appending a query string to the app URL.
For example:
```sh
http://127.0.0.1:9090/?__theme=dark
```
The query string breaking the static file serving, which prevents our translations from loading correctly. Instead of the JSON translations, FastAPI sends the index HTML page. The UI then errors when attempting to parse the translation JSON.
The query string ?__theme=dark is used by Gradio to force dark mode. I believe the users with this issue are doing the same thing the user in #6936 did (just change the port number on an existing bookmark) or their browser history/bookmark includes the query string.
Though this is technically a user-caused problem (we cannot prevent the user from using a malformed URL), we can work around it. When query string is used on the root path, we can redirect the browser to the root path without the query string.
This is done via very simple middleware.
Closes#6696Closes#6817Closes#6828Closes#6936Closes#6983
Around the time we (I) implemented pydantic events, I noticed a short pause between progress images every 4 or 5 steps when generating with SDXL. It didn't happen with SD1.5, but I did notice that with SD1.5, we'd get 4 or 5 progress events simultaneously. I'd expect one event every ~25ms, matching my it/s with SD1.5. Mysterious!
Digging in, I found an issue is related to our use of a synchronous queue for events. When the event queue is empty, we must call `asyncio.sleep` before checking again. We were sleeping for 100ms.
Said another way, every time we clear the event queue, we have to wait 100ms before another event can be dispatched, even if it is put on the queue immediately after we start waiting. In practice, this means our events get buffered into batches, dispatched once every 100ms.
This explains why I was getting batches of 4 or 5 SD1.5 progress events at once, but not the intermittent SDXL delay.
But this 100ms wait has another effect when the events are put on the queue in intervals that don't perfectly line up with the 100ms wait. This is most noticeable when the time between events is >100ms, and can add up to 100ms delay before the event is dispatched.
For example, say the queue is empty and we start a 100ms wait. Then, immediately after - like 0.01ms later - we push an event on to the queue. We still need to wait another 99.9ms before that event will be dispatched. That's the SDXL delay.
The easy fix is to reduce the sleep to something like 0.01 seconds, but this feels kinda dirty. Can't we just wait on the queue and dispatch every event immediately? Not with the normal synchronous queue - but we can with `asyncio.Queue`.
I switched the events queue to use `asyncio.Queue` (as seen in this commit), which lets us asynchronous wait on the queue in a loop.
Unfortunately, I ran into another issue - events now felt like their timing was inconsistent, but in a different way than with the 100ms sleep. The time between pushing events on the queue and dispatching them was not consistently ~0ms as I'd expect - it was highly variable from ~0ms up to ~100ms.
This is resolved by passing the asyncio loop directly into the events service and using its methods to create the task and interact with the queue. I don't fully understand why this resolved the issue, because either way we are interacting with the same event loop (as shown by `asyncio.get_running_loop()`). I suppose there's some scheduling magic happening.
For some reason, I started getting this indefinite hang when the app checks if port 9090 is available. After some fiddling around, I found that adding a timeout resolves the issue.
I confirmed that the util still works by starting the app on 9090, then starting a second instance. The second instance correctly saw 9090 in use and moved to 9091.
Some tech debt related to dynamic pydantic schemas for invocations became problematic. Including the invocations and results in the event schemas was breaking pydantic's handling of ref schemas. I don't really understand why - I think it's a pydantic bug in a remote edge case that we are hitting.
After many failed attempts I landed on this implementation, which is actually much tidier than what was in there before.
- Create pydantic-enabled types for `AnyInvocation` and `AnyInvocationOutput` and use these in place of the janky dynamic unions. Actually, they are kinda the same, but better encapsulated. Use these in `Graph`, `GraphExecutionState`, `InvocationEventBase` and `InvocationCompleteEvent`.
- Revise the custom openapi function to work with the new models.
- Split out the custom openapi function to a separate file. Add a `post_transform` callback so consumers can customize the output schema.
- Update makefile scripts.
We don't need to use the payload schema registry. All our events are dispatched as pydantic models, which are already validated on instantiation.
We do want to add all events to the OpenAPI schema, and we referred to the payload schema registry for this. To get all events, add a simple helper to EventBase. This is functionally identical to using the schema registry.
Our events handling and implementation has a couple pain points:
- Adding or removing data from event payloads requires changes wherever the events are dispatched from.
- We have no type safety for events and need to rely on string matching and dict access when interacting with events.
- Frontend types for socket events must be manually typed. This has caused several bugs.
`fastapi-events` has a neat feature where you can create a pydantic model as an event payload, give it an `__event_name__` attr, and then dispatch the model directly.
This allows us to eliminate a layer of indirection and some unpleasant complexity:
- Event handler callbacks get type hints for their event payloads, and can use `isinstance` on them if needed.
- Event payload construction is now the responsibility of the event itself (a pydantic model), not the service. Every event model has a `build` class method, encapsulating this logic. The build methods are provided as few args as possible. For example, `InvocationStartedEvent.build()` gets the invocation instance and queue item, and can choose the data it wants to include in the event payload.
- Frontend event types may be autogenerated from the OpenAPI schema. We use the payload registry feature of `fastapi-events` to collect all payload models into one place, making it trivial to keep our schema and frontend types in sync.
This commit moves the backend over to this improved event handling setup.
* introduce new abstraction layer for GPU devices
* add unit test for device abstraction
* fix ruff
* convert TorchDeviceSelect into a stateless class
* move logic to select context-specific execution device into context API
* add mock hardware environments to pytest
* remove dangling mocker fixture
* fix unit test for running on non-CUDA systems
* remove unimplemented get_execution_device() call
* remove autocast precision
* Multiple changes:
1. Remove TorchDeviceSelect.get_execution_device(), as well as calls to
context.models.get_execution_device().
2. Rename TorchDeviceSelect to TorchDevice
3. Added back the legacy public API defined in `invocation_api`, including
choose_precision().
4. Added a config file migration script to accommodate removal of precision=autocast.
* add deprecation warnings to choose_torch_device() and choose_precision()
* fix test crash
* remove app_config argument from choose_torch_device() and choose_torch_dtype()
---------
Co-authored-by: Lincoln Stein <lstein@gmail.com>
HF login, legacy yaml confs, and default init file are all handled during app setup.
All directories are created as they are needed by the app.
No need to check for a valid root dir - we will make it if it doesn't exist.
We have two problems with how argparse is being utilized:
- We parse CLI args as the `api_app.py` file is read. This causes a problem pytest, which has an incompatible set of CLI args. Some tests import the FastAPI app, which triggers the config to parse CLI args, which receives the pytest args and fails.
- We've repeatedly had problems when something that uses the config is imported before the CLI args are parsed. When this happens, the root dir may not be set correctly, so we attempt to operate on incorrect paths.
To resolve these issues, we need to lift CLI arg parsing outside of the application code, but still let the application access the CLI args. We can create a external app entrypoint to do this.
- `InvokeAIArgs` is a simple helper class that parses CLI args and stores the result.
- `run_app()` is the new entrypoint. It first parses CLI args, then runs `invoke_api` to start the app.
The `invokeai-web` project script and `invokeai-web.py` dev script now call `run_app()` instead of `invoke_api()`.
The first time `get_config()` is called to get the singleton config object, it retrieves the args from `InvokeAIArgs`, sets the root dir if provided, then merges settings in from `invokeai.yaml`.
CLI arg parsing is now safely insulated from application code, but still accessible. And we don't need to worry about import order having an impact on anything, because by the time the app is running, we have already parsed CLI args. Whew!
- `write_file` requires an destination file path
- `read_config` -> `merge_from_file`, if no path is provided, reads from `self.init_file_path`
- update app, tests to use new methods
- fix configurator, was overwriting config file data unexpectedly
Having this all in the `get_config` function makes testing hard. Move these two functions to their own methods, and call them on app startup explicitly.
Consolidate graph processing logic into session processor.
With graphs as the unit of work, and the session queue distributing graphs, we no longer need the invocation queue or processor.
Instead, the session processor dequeues the next session and processes it in a simple loop, greatly simplifying the app.
- Remove `graph_execution_manager` service.
- Remove `queue` (invocation queue) service.
- Remove `processor` (invocation processor) service.
- Remove queue-related logic from `Invoker`. It now only starts and stops the services, providing them with access to other services.
- Remove unused `invocation_retrieval_error` and `session_retrieval_error` events, these are no longer needed.
- Clean up stats service now that it is less coupled to the rest of the app.
- Refactor cancellation logic - cancellations now originate from session queue (i.e. HTTP cancel endpoint) and are emitted as events. Processor gets the events and sets the canceled event. Access to this event is provided to the invocation context for e.g. the step callback.
- Remove `sessions` router; it provided access to `graph_executions` but that no longer exists.
The change to `Graph.nodes` and `GraphExecutionState.results` validation requires some fanagling to get the OpenAPI schema generation to work. See new comments for a details.
- Replace AnyModelLoader with ModelLoaderRegistry
- Fix type check errors in multiple files
- Remove apparently unneeded `get_model_config_enum()` method from model manager
- Remove last vestiges of old model manager
- Updated tests and documentation
resolve conflict with seamless.py
- Replace legacy model manager service with the v2 manager.
- Update invocations to use new load interface.
- Fixed many but not all type checking errors in the invocations. Most
were unrelated to model manager
- Updated routes. All the new routes live under the route tag
`model_manager_v2`. To avoid confusion with the old routes,
they have the URL prefix `/api/v2/models`. The old routes
have been de-registered.
- Added a pytest for the loader.
- Updated documentation in contributing/MODEL_MANAGER.md
The previous method wasn't totally foolproof, and locales/assets were cached.
To solve this once and for all (famous last words, I know), we can subclass `StaticFiles` and use maximally strict no-caching headers to disable caching on all static files.
- Add various brand images, organise images
- Create favicon for docs pages (light blue version of key logo)
- Rename app title to `Invoke - Community Edition`
* add base definition of download manager
* basic functionality working
* add unit tests for download queue
* add documentation and FastAPI route
* fix docs
* add missing test dependency; fix import ordering
* fix file path length checking on windows
* fix ruff check error
* move release() into the __del__ method
* disable testing of stderr messages due to issues with pytest capsys fixture
* fix unsorted imports
* harmonized implementation of start() and stop() calls in download and & install modules
* Update invokeai/app/services/download/download_base.py
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
* replace test datadir fixture with tmp_path
* replace DownloadJobBase->DownloadJob in download manager documentation
* make source and dest arguments to download_queue.download() an AnyHttpURL and Path respectively
* fix pydantic typecheck errors in the download unit test
* ruff formatting
* add "job cancelled" as an event rather than an exception
* fix ruff errors
* Update invokeai/app/services/download/download_default.py
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
* use threading.Event to stop service worker threads; handle unfinished job edge cases
* remove dangling STOP job definition
* fix ruff complaint
* fix ruff check again
* avoid race condition when start() and stop() are called simultaneously from different threads
* avoid race condition in stop() when a job becomes active while shutting down
---------
Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
Co-authored-by: Kent Keirsey <31807370+hipsterusername@users.noreply.github.com>