The previous super-minimal implementation had a major issue - the saved workflow didn't take into account batched field values. When generating with multiple iterations or dynamic prompts, the same workflow with the first prompt, seed, etc was stored in each image.
As a result, when the batch results in multiple queue items, only one of the images has the correct workflow - the others are mismatched.
To work around this, we can store the _graph_ in the image metadata (alongside the workflow, if generated via workflow editor). When loading a workflow from an image, we can choose to load the workflow or the graph, preferring the workflow.
Internally, we need to update images router image-saving services. The changes are minimal.
To avoid pydantic errors deserializing the graph, when we extract it from the image, we will leave it as stringified JSON and let the frontend's more sophisticated and flexible parsing handle it. The worklow is also changed to just return stringified JSON, so the API is consistent.
This stateful class provides abstractions for building a graph. It exposes graph methods like adding and removing nodes and edges.
The methods are documented, tested, and strongly typed.
Currently translated at 98.5% (1192 of 1210 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.5% (1192 of 1210 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.5% (1192 of 1210 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.5% (1192 of 1210 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Currently translated at 98.3% (1189 of 1209 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.3% (1189 of 1209 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Currently translated at 100.0% (1209 of 1209 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1209 of 1209 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1188 of 1188 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1185 of 1185 strings)
Co-authored-by: Васянатор <ilabulanov339@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ru/
Translation: InvokeAI/Web UI
Currently translated at 71.9% (839 of 1166 strings)
Co-authored-by: Alexander Eichhorn <pfannkuchensack@einfach-doof.de>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
Currently translated at 97.3% (1154 of 1185 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1174 of 1174 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1173 of 1173 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1166 of 1166 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1165 of 1165 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1149 of 1149 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1147 of 1147 strings)
Co-authored-by: Васянатор <ilabulanov339@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ru/
Translation: InvokeAI/Web UI
Currently translated at 96.0% (1138 of 1185 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1156 of 1174 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.3% (1155 of 1174 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1129 of 1147 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Make the Invoke button show a loading spinner while queueing.
The queue mutations need to be awaited else the `isLoading` state doesn't work as expected. I feel like I should understand why, but I don't...
This allows comboboxes for models to have more granular groupings. For example, Control Adapter models can be grouped by base model & model type.
Before:
- `SD-1`
- `SDXL`
After:
- `SD-1 / ControlNet`
- `SD-1 / T2I Adapter`
- `SDXL / ControlNet`
- `SDXL / T2I Adapter`
When a control adapter processor config is changed, if we were already processing an image, that batch is immediately canceled. This prevents the processed image from getting stuck in a weird state if you change or reset the processor at the right (err, wrong?) moment.
- Update internal state for control adapters to track processor batches, instead of just having a flag indicating if the image is processing. Add a slice migration to not break the user's existing app state.
- Update preprocessor listener with more sophisticated logic to handle canceling the batch and resetting the processed image when the config changes or is reset.
- Fixed error handling that erroneously showed "failed to queue graph" errors when an active listener instance is canceled, need to check the abort signal.
This is largely an internal change, and it should have been this way from the start - less tip-toeing around layer types. The user-facing change is when you click an IP Adapter layer, it is highlighted. That's it.
Turns out, it's more efficient to just use the bbox logic for empty mask calculations. We already track if if the bbox needs updating, so this calculation does minimal work.
The dedicated calculation wasn't able to use the bbox tracking so it ran far more often than the bbox calculation.
Removed the "fast" bbox calculation logic, bc the new logic means we are continually updating the bbox in the background - not only when the user switches to the move tool and/or selects a layer.
The bbox calculation logic is split out from the bbox rendering logic to support this.
Result - better perf overall, with the empty mask handling retained.
Mask vector data includes additive (brush, rect) shapes and subtractive (eraser) shapes. A different composite operation is used to draw a shape, depending on whether it is additive or subtractive.
This means that a mask may have vector objects, but once rendered, is _visually_ empty (fully transparent). The only way determine if a mask is visually empty is to render it and check every pixel.
When we generate and save layer metadata, these fully erased masks are still used. Generating with an empty mask is a no-op in the backend, so we want to avoid this and not pollute graphs/metadata.
Previously, we did that pixel-based when calculating the bbox, which we only did when using the move tool, and only for the selected layer.
This change introduces a simpler function to check if a mask is transparent, and if so, deletes all its objects to reset it. This allows us skip these no-op layers entirely.
This check is debounced to 300 ms, trailing edge only.
When layer metadata is stored, the layer IDs are included. When recalling the metadata, we need to assign fresh IDs, else we can end up with multiple layers with the same ID, which of course causes all sorts of issues.
- Viewer only exists on Generation tab
- Viewer defaults to open
- When clicking the Control Layers tab on the left panel, close the viewer (i.e. open the CL editor)
- Do not switch to editor when adding layers (this is handled by clicking the Control Layers tab)
- Do not open viewer when single-clicking images in gallery
- _Do_ open viewer when _double_-clicking images in gallery
- Do not change viewer state when switching between app tabs (this no longer makes sense; the viewer only exists on generation tab)
- Change the button to a drop down menu that states what you are currently doing, e.g. Viewing vs Editing
There are unresolved platform-specific issues with this component, and its utility is debatable.
Should be easy to just revert this commit to add it back in the future if desired.
There are a number of bugs with `framer-motion` that can result in sync issues with AnimatePresence and the conditionally rendered component.
You can see this if you rapidly click an accordion, occasionally it gets out of sync and is closed when it should be open.
This is a bigger problem with the viewer where the user may hold down the `z` key. It's trivial to get it to lock up.
For now, just remove the animation entirely.
Upstream issues for reference:
https://github.com/framer/motion/issues/2023https://github.com/framer/motion/issues/2618https://github.com/framer/motion/issues/2554
- Rects snap to stage edge when within a threshold (10 screen pixels)
- When mouse leaves stage, set last mousedown pos to null, preventing nonfunctional rect outlines
Partially addresses #6306.
There's a technical challenge to fully address the issue - mouse event are not fired when the mouse is outside the stage. While we could draw the rect even if the mouse leaves, we cannot update the rect's dimensions on mouse move, or complete the drawing on mouse up.
To fully address the issue, we'd need to a way to forward window events back to the stage, or at least handle window events. We can explore this later.
When invoking with control layers, we were creating and uploading the mask images on every enqueue, even when the mask didn't change. The mask image can be cached to greatly reduce the number of uploads.
With this change, we are a bit smarter about the mask images:
- Check if there is an uploaded mask image name
- If so, attempt to retrieve its DTO. Typically it will be in the RTKQ cache, so there is no network request, but it will make a network request if not cached to confirm the image actually exists on the server.
- If we don't have an uploaded mask image name, or the request fails, we go ahead and upload the generated blob
- Update the layer's state with a reference to this uploaded image for next time
- Continue as before
Any time we modify the mask (drawing/erasing, resetting the layer), we invalidate that cached image name (set it to null).
We now only upload images when we need to and generation starts faster.
- Rework styling
- Replace "CurrentImageDisplay" entirely
- Add a super short fade to reduce jarring transition
- Make the viewer a singleton component, overlaid on everything else - reduces change when switching tabs
- Works on txt2img, canvas and workflows tabs, img2img has its own side-by-side view
- In workflow editor, the is closeable only if you are in edit mode, else it's always there
- Press `i` to open
- Press `esc` to close
- Selecting an image or changing image selection opens the viewer
- When generating, if auto-switch to new image is enabled, the viewer opens when an image comes in
To support this change, I organized and restructured some tab stuff.
When recalling metadata and/or using control image dimensions, it was possible to set a width or height that was not a multiple of 8, resulting in generation failures.
Added a `clamp` option to the w/h actions to fix this. The option is used for all untrusted sources - everything except for the w/h number inputs, which clamp the values themselves.
Firefox v125.0.3 and below has a bug where `mouseenter` events are fired continually during mouse moves. The issue isn't present on FF v126.0b6 Developer Edition. It's not clear if the issue is present on FF nightly, and we're not sure if it will actually be fixed in the stable v126 release.
The control layers drawing logic relied on on `mouseenter` events to create new lines, and `mousemove` to extend existing lines. On the affected version of FF, all line extensions are turned into new lines, resulting in very poor performance, noncontiguous lines, and way-too-big internal state.
To resolve this, the drawing handling was updated to not use `mouseenter` at all. As a bonus, resolving this issue has resulted in simpler logic for drawing on the canvas.
- Add set of metadata handlers for the control layers CAs
- Use these conditionally depending on the active tab - when recalling on txt2img, the CAs go to control layers, else they go to the old CA area.
These changes were left over from the previous attempt to handle control adapters in control layers with the same logic. Control Layers are now handled totally separately, so these changes may be reverted.
When typing in a number into the w/h number inputs, if the number is less than the step, it appears the value of 0 is used. This is unexpected; it means Chakra isn't clamping the value correctly (or maybe our wrapper isn't clamping it).
Add checks to never bail if the width or height value from the number input component is 0.
- Revise control adapter config types
- Recreate all control adapter mutations in control layers slice
- Bit of renaming along the way - typing 'RegionalGuidanceLayer' over and over again was getting tedious
Konva doesn't react to changes to window zoom/scale. If you open the tab at, say, 90%, then bump to 100%, the pixel ratio of the canvas doesn't change. This results in lower-quality renders on the canvas (generation is unaffected).
- Shift+C: Reset selected layer mask (same as canvas)
- Shift+D: Delete selected layer (cannot be Del, that deletes an image in gallery)
- Shift+A: Add layer (cannot be Ctrl+Shift+N, that opens a new window)
- Ctrl/Cmd+Wheel: Brush size (same as canvas)
Trying a lot of different things as I iterated, so this is smooshed into one big commit... too hard to split it now.
- Iterated on IP adapter handling and UI. Unfortunately there is an bug related to undo/redo. The IP adapter state is split across the `controlAdapters` slice and the `regionalPrompts` slice, but only the `regionalPrompts` slice supports undo/redo. If you delete the IP adapter and then undo/redo to a history state where it existed, you'll get an error. The fix is likely to merge the slices... Maybe there's a workaround.
- Iterated on UI. I think the layers are OK now.
- Removed ability to disable RP globally for now. It's enabled if you have enabled RP layers.
- Many minor tweaks and fixes.
- Keep track of whether the bbox needs to be recalculated (e.g. had lines/points added)
- Keep track of whether the bbox has eraser strokes - if yes, we need to do the full pixel-perfect bbox calculation, otherwise we can use the faster getClientRect
- Use comparison rather than Math.min/max in bbox calculation (slightly faster)
- Return `null` if no pixel data at all in bbox
Adds an additional negative conditioning using the inverted mask of the positive conditioning and the positive prompt. May be useful for mutually exclusive regions.
Currently translated at 98.4% (1122 of 1140 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1120 of 1138 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1115 of 1133 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Handful of intertwined fixes.
- Create and use helper function to reset staging area.
- Clear staging area when queue items are canceled, failed, cleared, etc. Fixes a bug where the bbox ends up offset and images are put into the wrong spot.
- Fix a number of similar bugs where canvas would "forget" it had pending generations, but they continued to generate. Canvas needs to track batches that should be displayed in it using `state.canvas.batchIds`, and this was getting cleared without actually canceling those batches.
- Disable the `discard current image` button on canvas if there is only one image. Prevents accidentally canceling all canvas batches if you spam the button.
This is intended for debug usage, so it's hidden away in the workflow library `...` menu. Hold shift to see the button for it.
- Paste a graph (from a network request, for example) and then click the convert button to convert it to a workflow.
- Disable auto layout to stack the nodes with an offset (try it out). If you change this, you must re-convert to get the changes.
- Edit the workflow JSON if you need to tweak something before loading it.
This data is already in the template but it wasn't ever used.
One big place where this improves UX is the noise node. Previously, the UI let you change width and height in increments of 1, despite the template requiring a multiple of 8. It now works in multiples of 8.
Retrieving the DTO happens as part of the metadata parsing, not recall. This way, we don't show the option to recall a nonexistent image.
This matches the flow for other metadata entities like models - we don't show the model recall button if the model isn't available.
Currently translated at 73.3% (826 of 1126 strings)
Co-authored-by: Alexander Eichhorn <pfannkuchensack@einfach-doof.de>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
The valid values for this parameter changed when inpainting changed to gradient denoise. The generation slice's redux migration wasn't updated, resulting in a generation error until you change the setting or reset web UI.
- Add and use more performant `deepClone` method for deep copying throughout the UI.
Benchmarks indicate the Really Fast Deep Clone library (`rfdc`) is the best all-around way to deep-clone large objects.
This is particularly relevant in canvas. When drawing or otherwise manipulating canvas objects, we need to do a lot of deep cloning of the canvas layer state objects.
Previously, we were using lodash's `cloneDeep`.
I did some fairly realistic benchmarks with a handful of deep-cloning algorithms/libraries (including the native `structuredClone`). I used a snapshot of the canvas state as the data to be copied:
On Chromium, `rfdc` is by far the fastest, over an order of magnitude faster than `cloneDeep`.
On FF, `fastest-json-copy` and `recursiveDeepCopy` are even faster, but are rather limited in data types. `rfdc`, while only half as fast as the former 2, is still nearly an order of magnitude faster than `cloneDeep`.
On Safari, `structuredClone` is the fastest, about 2x as fast as `cloneDeep`. `rfdc` is only 30% faster than `cloneDeep`.
`rfdc`'s peak memory usage is about 10% more than `cloneDeep` on Chrome. I couldn't get memory measurements from FF and Safari, but let's just assume the memory usage is similar relative to the other algos.
Overall, `rfdc` is the best choice for a single algo for all browsers. It's definitely the best for Chromium, by far the most popular desktop browser and thus our primary target.
A future enhancement might be to detect the browser and use that to determine which algorithm to use.
There were two ways the canvas history could grow too large (past the `MAX_HISTORY` setting):
- Sometimes, when pushing to history, we didn't `shift` an item out when we exceeded the max history size.
- If the max history size was exceeded by more than one item, we still only `shift`, which removes one item.
These issue could appear after an extended canvas session, resulting in a memory leak and recurring major GCs/browser performance issues.
To fix these issues, a helper function is added for both past and future layer states, which uses slicing to ensure history never grows too large.
Currently translated at 98.3% (1106 of 1124 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.3% (1104 of 1122 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Currently translated at 72.4% (813 of 1122 strings)
Co-authored-by: Alexander Eichhorn <pfannkuchensack@einfach-doof.de>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
Setting to 'auto' works only for InvokeAI config and auto detects the SD model but will override if user explicitly sets it. If auto used with checkpoint models, we raise an error. Checkpoints will always need to set to non-auto.
Updating should always be done via the installer. We initially planned to only deprecate the updater, but given the scale of changes for v4, there's no point in waiting to remove it entirely.
Loading default workflows sometimes requires we mutate the workflow object in order to change the category or ID of the workflow.
This happens in `invokeai/frontend/web/src/features/nodes/util/workflow/validateWorkflow.ts`
The data we get back from the query hooks is frozen and sealed by redux, because they are part of redux state. We need to clone the workflow before operating on it.
It's not clear how this ever worked in the past, because redux state has always been frozen and sealed.
Currently translated at 98.2% (1102 of 1122 strings)
translationBot(ui): update translation (Italian)
Currently translated at 97.9% (1099 of 1122 strings)
translationBot(ui): update translation (Italian)
Currently translated at 97.9% (1099 of 1122 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
With the change to model identifiers from v3 to v4, if a user had persisted redux state with the old format, we could get unexpected runtime errors when rehydrating state if we try to access model attributes that no longer exist.
For example, the CLIP Skip component does this:
```ts
CLIP_SKIP_MAP[model.base].maxClip
```
In v3, models had a `base_type` attribute, but it is renamed to `base` in v4. This code therefore causes a runtime error:
- `model.base` is `undefined`
- `CLIP_SKIP_MAP[undefined]` is also undefined
- `undefined.maxClip` is a runtime error!
Resolved by adding a migration for the redux slices that have model identifiers. The migration simply resets the slice or the part of the slice that is affected, when it's simple to do a partial reset.
Closes#6000
- Display a toast on UI launch if the HF token is invalid
- Show form in MM if token is invalid or unable to be verified, let user set the token via this form
This allows users to create simple "profiles" via separate `invokeai.yaml` files.
- Remove `InvokeAIAppConfig.set_root()`, it's extraneous
- Remove `InvokeAIAppConfig.merge_from_file()`, it's extraneous
- Add `--config` to the app arg parser, add `InvokeAIAppConfig._config_file`, and consume in the config singleton getter
- `InvokeAIAppConfig.init_file_path` -> `InvokeAIAppConfig.config_file_path`
This flag acts as a proxy for the `get_config()` function to determine if the full application is running.
If it was, the config will set the root, do HF login, etc.
If not (e.g. it's called by an external script), all that stuff will be skipped.
When consolidating all the model queries I messed up the query tags. Fixed now, so that when a model is installed, removed, or changed, the list refreshes.
Currently translated at 52.5% (576 of 1096 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 52.0% (570 of 1096 strings)
Co-authored-by: Gohsuke Shimada <ghoskay@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
Currently translated at 98.2% (1077 of 1096 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.2% (1077 of 1096 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Currently translated at 99.0% (1518 of 1533 strings)
translationBot(ui): update translation (Russian)
Currently translated at 99.0% (1518 of 1533 strings)
Co-authored-by: Васянатор <ilabulanov339@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ru/
Translation: InvokeAI/Web UI
Currently translated at 97.8% (1510 of 1543 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.1% (1503 of 1532 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.1% (1503 of 1532 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
In order to allow for null and undefined metadata values, this hook returned a symbol to indicate that parsing failed or was pending.
For values where the parsed value will never be null or undefined, it is useful get the value or null (instead of a symbol).
We have two problems with how argparse is being utilized:
- We parse CLI args as the `api_app.py` file is read. This causes a problem pytest, which has an incompatible set of CLI args. Some tests import the FastAPI app, which triggers the config to parse CLI args, which receives the pytest args and fails.
- We've repeatedly had problems when something that uses the config is imported before the CLI args are parsed. When this happens, the root dir may not be set correctly, so we attempt to operate on incorrect paths.
To resolve these issues, we need to lift CLI arg parsing outside of the application code, but still let the application access the CLI args. We can create a external app entrypoint to do this.
- `InvokeAIArgs` is a simple helper class that parses CLI args and stores the result.
- `run_app()` is the new entrypoint. It first parses CLI args, then runs `invoke_api` to start the app.
The `invokeai-web` project script and `invokeai-web.py` dev script now call `run_app()` instead of `invoke_api()`.
The first time `get_config()` is called to get the singleton config object, it retrieves the args from `InvokeAIArgs`, sets the root dir if provided, then merges settings in from `invokeai.yaml`.
CLI arg parsing is now safely insulated from application code, but still accessible. And we don't need to worry about import order having an impact on anything, because by the time the app is running, we have already parsed CLI args. Whew!
- Remove OmegaConf. It functioned as an intermediary data format, between YAML/argparse and pydantic. It's not necessary - we can parse YAML or CLI args directly with pydantic.
- Remove dynamic CLI args. Only `root` is explicitly supported. This greatly simplifies config handling. Configuration is done by editing the YAML file. Frequently-used args can be added if there is a demand.
- A separate arg parser is created to handle the slimmed-down CLI args. It's run immediately in the `invokeai-web` script to handle `--version` and `--help`. It is also used inside the singleton config getter (see below).
- Remove categories from the config. Our settings model is mostly flat. Handling categories adds complexity for both us and users - we have to handle transforming a flat config to categorized config (and vice-versa), while users have to be careful with indentation in their YAML file.
- Add a `meta` key to the config file. Currently, this holds the config schema version only. It is not a part of the config object itself.
- Remove legacy settings that are no longer referenced, or were effectively no-op settings when referenced in code.
- Implement simple migration logic to for v3 configs. If migration is successful, the v3 config file is backed up to `invokeai.yaml.bak` and the new config written to `invokeai.yaml`.
- Previously, the singleton config was accessed by calling `InvokeAIAppConfig.get_config()`. This returned an instance of `InvokeAIAppConfig`, which _also_ has the `get_config` function. This created to a confusing situation where you weren't sure if you needed to call `get_config` or just use the config object. This method is replaced by a standalone `get_config` function which returns a singleton config object.
- Wrap CLI arg parsing (for `root`) and loading/migrating `invokeai.yaml` into the new `get_config()` function.
- Move `generate_config_docstrings` into standalone utility function.
- Make `root` a private attr (`_root`). This reduces the temptation to directly modify and or use this sensitive field and ensures it is neither serialized nor read from input data. Use `root_path` to access the resolved root path, or `set_root` to set the root to something.