Currently translated at 100.0% (1209 of 1209 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1209 of 1209 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1188 of 1188 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1185 of 1185 strings)
Co-authored-by: Васянатор <ilabulanov339@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ru/
Translation: InvokeAI/Web UI
Currently translated at 71.9% (839 of 1166 strings)
Co-authored-by: Alexander Eichhorn <pfannkuchensack@einfach-doof.de>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
Currently translated at 97.3% (1154 of 1185 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1174 of 1174 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1173 of 1173 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1166 of 1166 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1165 of 1165 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1149 of 1149 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1147 of 1147 strings)
Co-authored-by: Васянатор <ilabulanov339@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ru/
Translation: InvokeAI/Web UI
Currently translated at 96.0% (1138 of 1185 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1156 of 1174 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.3% (1155 of 1174 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1129 of 1147 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Make the Invoke button show a loading spinner while queueing.
The queue mutations need to be awaited else the `isLoading` state doesn't work as expected. I feel like I should understand why, but I don't...
This allows comboboxes for models to have more granular groupings. For example, Control Adapter models can be grouped by base model & model type.
Before:
- `SD-1`
- `SDXL`
After:
- `SD-1 / ControlNet`
- `SD-1 / T2I Adapter`
- `SDXL / ControlNet`
- `SDXL / T2I Adapter`
When a control adapter processor config is changed, if we were already processing an image, that batch is immediately canceled. This prevents the processed image from getting stuck in a weird state if you change or reset the processor at the right (err, wrong?) moment.
- Update internal state for control adapters to track processor batches, instead of just having a flag indicating if the image is processing. Add a slice migration to not break the user's existing app state.
- Update preprocessor listener with more sophisticated logic to handle canceling the batch and resetting the processed image when the config changes or is reset.
- Fixed error handling that erroneously showed "failed to queue graph" errors when an active listener instance is canceled, need to check the abort signal.
This is largely an internal change, and it should have been this way from the start - less tip-toeing around layer types. The user-facing change is when you click an IP Adapter layer, it is highlighted. That's it.
Turns out, it's more efficient to just use the bbox logic for empty mask calculations. We already track if if the bbox needs updating, so this calculation does minimal work.
The dedicated calculation wasn't able to use the bbox tracking so it ran far more often than the bbox calculation.
Removed the "fast" bbox calculation logic, bc the new logic means we are continually updating the bbox in the background - not only when the user switches to the move tool and/or selects a layer.
The bbox calculation logic is split out from the bbox rendering logic to support this.
Result - better perf overall, with the empty mask handling retained.
Mask vector data includes additive (brush, rect) shapes and subtractive (eraser) shapes. A different composite operation is used to draw a shape, depending on whether it is additive or subtractive.
This means that a mask may have vector objects, but once rendered, is _visually_ empty (fully transparent). The only way determine if a mask is visually empty is to render it and check every pixel.
When we generate and save layer metadata, these fully erased masks are still used. Generating with an empty mask is a no-op in the backend, so we want to avoid this and not pollute graphs/metadata.
Previously, we did that pixel-based when calculating the bbox, which we only did when using the move tool, and only for the selected layer.
This change introduces a simpler function to check if a mask is transparent, and if so, deletes all its objects to reset it. This allows us skip these no-op layers entirely.
This check is debounced to 300 ms, trailing edge only.
When layer metadata is stored, the layer IDs are included. When recalling the metadata, we need to assign fresh IDs, else we can end up with multiple layers with the same ID, which of course causes all sorts of issues.
- Viewer only exists on Generation tab
- Viewer defaults to open
- When clicking the Control Layers tab on the left panel, close the viewer (i.e. open the CL editor)
- Do not switch to editor when adding layers (this is handled by clicking the Control Layers tab)
- Do not open viewer when single-clicking images in gallery
- _Do_ open viewer when _double_-clicking images in gallery
- Do not change viewer state when switching between app tabs (this no longer makes sense; the viewer only exists on generation tab)
- Change the button to a drop down menu that states what you are currently doing, e.g. Viewing vs Editing
There are unresolved platform-specific issues with this component, and its utility is debatable.
Should be easy to just revert this commit to add it back in the future if desired.
There are a number of bugs with `framer-motion` that can result in sync issues with AnimatePresence and the conditionally rendered component.
You can see this if you rapidly click an accordion, occasionally it gets out of sync and is closed when it should be open.
This is a bigger problem with the viewer where the user may hold down the `z` key. It's trivial to get it to lock up.
For now, just remove the animation entirely.
Upstream issues for reference:
https://github.com/framer/motion/issues/2023https://github.com/framer/motion/issues/2618https://github.com/framer/motion/issues/2554
- Rects snap to stage edge when within a threshold (10 screen pixels)
- When mouse leaves stage, set last mousedown pos to null, preventing nonfunctional rect outlines
Partially addresses #6306.
There's a technical challenge to fully address the issue - mouse event are not fired when the mouse is outside the stage. While we could draw the rect even if the mouse leaves, we cannot update the rect's dimensions on mouse move, or complete the drawing on mouse up.
To fully address the issue, we'd need to a way to forward window events back to the stage, or at least handle window events. We can explore this later.
When invoking with control layers, we were creating and uploading the mask images on every enqueue, even when the mask didn't change. The mask image can be cached to greatly reduce the number of uploads.
With this change, we are a bit smarter about the mask images:
- Check if there is an uploaded mask image name
- If so, attempt to retrieve its DTO. Typically it will be in the RTKQ cache, so there is no network request, but it will make a network request if not cached to confirm the image actually exists on the server.
- If we don't have an uploaded mask image name, or the request fails, we go ahead and upload the generated blob
- Update the layer's state with a reference to this uploaded image for next time
- Continue as before
Any time we modify the mask (drawing/erasing, resetting the layer), we invalidate that cached image name (set it to null).
We now only upload images when we need to and generation starts faster.
- Rework styling
- Replace "CurrentImageDisplay" entirely
- Add a super short fade to reduce jarring transition
- Make the viewer a singleton component, overlaid on everything else - reduces change when switching tabs
- Works on txt2img, canvas and workflows tabs, img2img has its own side-by-side view
- In workflow editor, the is closeable only if you are in edit mode, else it's always there
- Press `i` to open
- Press `esc` to close
- Selecting an image or changing image selection opens the viewer
- When generating, if auto-switch to new image is enabled, the viewer opens when an image comes in
To support this change, I organized and restructured some tab stuff.
When recalling metadata and/or using control image dimensions, it was possible to set a width or height that was not a multiple of 8, resulting in generation failures.
Added a `clamp` option to the w/h actions to fix this. The option is used for all untrusted sources - everything except for the w/h number inputs, which clamp the values themselves.
Firefox v125.0.3 and below has a bug where `mouseenter` events are fired continually during mouse moves. The issue isn't present on FF v126.0b6 Developer Edition. It's not clear if the issue is present on FF nightly, and we're not sure if it will actually be fixed in the stable v126 release.
The control layers drawing logic relied on on `mouseenter` events to create new lines, and `mousemove` to extend existing lines. On the affected version of FF, all line extensions are turned into new lines, resulting in very poor performance, noncontiguous lines, and way-too-big internal state.
To resolve this, the drawing handling was updated to not use `mouseenter` at all. As a bonus, resolving this issue has resulted in simpler logic for drawing on the canvas.
- Add set of metadata handlers for the control layers CAs
- Use these conditionally depending on the active tab - when recalling on txt2img, the CAs go to control layers, else they go to the old CA area.
These changes were left over from the previous attempt to handle control adapters in control layers with the same logic. Control Layers are now handled totally separately, so these changes may be reverted.
There were some invalid constraints with the processors - minimum of 0 for resolution or multiple of 64 for resolution.
Made minimum 1px and no multiple ofs.
When typing in a number into the w/h number inputs, if the number is less than the step, it appears the value of 0 is used. This is unexpected; it means Chakra isn't clamping the value correctly (or maybe our wrapper isn't clamping it).
Add checks to never bail if the width or height value from the number input component is 0.
- Revise control adapter config types
- Recreate all control adapter mutations in control layers slice
- Bit of renaming along the way - typing 'RegionalGuidanceLayer' over and over again was getting tedious
Konva doesn't react to changes to window zoom/scale. If you open the tab at, say, 90%, then bump to 100%, the pixel ratio of the canvas doesn't change. This results in lower-quality renders on the canvas (generation is unaffected).
`PC_PATH_MAX` doesn't exist for (some?) external drives on macOS. We need error handling when retrieving this value.
Also added error handling for `PC_NAME_MAX` just in case. This does work for me for external drives on macOS, though.
Closes#6277
There are only a couple SDXL inpainting models, and my tests indicate they are not as good as SD1.5 inpainting, but at least we support them now.
- Add the config file. This matches what is used in A1111. The only difference from the non-inpainting SDXL config is the number of in-channels.
- Update the legacy config maps to use this config file.
Pending:
- Move model install calls into model manager and create passthrus in invocation_context.
- Consider splitting load_model_from_url() into a call to get the path and a call to load the path.
- Use the our adaptation of the HWC3 function with better types
- Extraction some of the util functions, name them better, add comments
- Improve type annotations
- Remove unreachable codepaths
- Shift+C: Reset selected layer mask (same as canvas)
- Shift+D: Delete selected layer (cannot be Del, that deletes an image in gallery)
- Shift+A: Add layer (cannot be Ctrl+Shift+N, that opens a new window)
- Ctrl/Cmd+Wheel: Brush size (same as canvas)
Trying a lot of different things as I iterated, so this is smooshed into one big commit... too hard to split it now.
- Iterated on IP adapter handling and UI. Unfortunately there is an bug related to undo/redo. The IP adapter state is split across the `controlAdapters` slice and the `regionalPrompts` slice, but only the `regionalPrompts` slice supports undo/redo. If you delete the IP adapter and then undo/redo to a history state where it existed, you'll get an error. The fix is likely to merge the slices... Maybe there's a workaround.
- Iterated on UI. I think the layers are OK now.
- Removed ability to disable RP globally for now. It's enabled if you have enabled RP layers.
- Many minor tweaks and fixes.
- Keep track of whether the bbox needs to be recalculated (e.g. had lines/points added)
- Keep track of whether the bbox has eraser strokes - if yes, we need to do the full pixel-perfect bbox calculation, otherwise we can use the faster getClientRect
- Use comparison rather than Math.min/max in bbox calculation (slightly faster)
- Return `null` if no pixel data at all in bbox
Adds an additional negative conditioning using the inverted mask of the positive conditioning and the positive prompt. May be useful for mutually exclusive regions.
* introduce new abstraction layer for GPU devices
* add unit test for device abstraction
* fix ruff
* convert TorchDeviceSelect into a stateless class
* move logic to select context-specific execution device into context API
* add mock hardware environments to pytest
* remove dangling mocker fixture
* fix unit test for running on non-CUDA systems
* remove unimplemented get_execution_device() call
* remove autocast precision
* Multiple changes:
1. Remove TorchDeviceSelect.get_execution_device(), as well as calls to
context.models.get_execution_device().
2. Rename TorchDeviceSelect to TorchDevice
3. Added back the legacy public API defined in `invocation_api`, including
choose_precision().
4. Added a config file migration script to accommodate removal of precision=autocast.
* add deprecation warnings to choose_torch_device() and choose_precision()
* fix test crash
* remove app_config argument from choose_torch_device() and choose_torch_dtype()
---------
Co-authored-by: Lincoln Stein <lstein@gmail.com>
Currently translated at 98.4% (1122 of 1140 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1120 of 1138 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1115 of 1133 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
When using refiner with a mask (i.e. inpainting), we don't have noise provided as an input to the node.
This situation uniquely hits a code path that wasn't reviewed when gradient denoising was implemented.
That code path does two things wrong:
- It lerp'd the input latents. This was fixed in 5a1f4cb1ce.
- It added noise to the latents an extra time. This is fixed in this change.
We don't need to add noise in `latents_from_embeddings` because we do it just a lines later in `AddsMaskGuidance`.
- Remove the extraneous call to `add_noise`
- Make `seed` a required arg. We never call the function without seed anyways. If we refactor this in the future, it will be clearer that we need to look at how seed is handled.
- Move the call to create the noise to a deeper conditional, just before we call `AddsMaskGuidance`. The created noise tensor is now only used in that function, no need to create it every time.
Note: Whether or not having both noise and latents as inputs on the node is correct is a separate conversation. This change just fixes the issue with the current setup.
`LatentsField` objects have an optional `seed` field. This should only be populated when the latents are noise, generated from a seed.
`DenoiseLatentsInvocation` needs a seed value for scheduler initialization. It's used in a few places, and there is some logic for determining the seed to use with a series of fallbacks:
- Use the seed from the noise (a `LatentsField` object)
- Use the seed from the latents (a `LatentsField` object - normally it won't have a seed)
- Use `0` as a final fallback
In `DenoisLatentsInvocation`, we set the seed in the `LatentsOutput`, even though the output latents are not noise.
This is normally fine, but when we use refiner, we re-use the those same latents for the refiner denoise. This causes that characteristic same-seed-fried look on the refiner pass.
Simple fix - do not set the field in the output latents.
Handful of intertwined fixes.
- Create and use helper function to reset staging area.
- Clear staging area when queue items are canceled, failed, cleared, etc. Fixes a bug where the bbox ends up offset and images are put into the wrong spot.
- Fix a number of similar bugs where canvas would "forget" it had pending generations, but they continued to generate. Canvas needs to track batches that should be displayed in it using `state.canvas.batchIds`, and this was getting cleared without actually canceling those batches.
- Disable the `discard current image` button on canvas if there is only one image. Prevents accidentally canceling all canvas batches if you spam the button.
This is intended for debug usage, so it's hidden away in the workflow library `...` menu. Hold shift to see the button for it.
- Paste a graph (from a network request, for example) and then click the convert button to convert it to a workflow.
- Disable auto layout to stack the nodes with an offset (try it out). If you change this, you must re-convert to get the changes.
- Edit the workflow JSON if you need to tweak something before loading it.
- Allow user-defined precision on MPS.
- Use more explicit logic to handle all possible cases.
- Add comments.
- Remove the app_config args (they were effectively unused, just get the config using the singleton getter util)
This data is already in the template but it wasn't ever used.
One big place where this improves UX is the noise node. Previously, the UI let you change width and height in increments of 1, despite the template requiring a multiple of 8. It now works in multiples of 8.
Retrieving the DTO happens as part of the metadata parsing, not recall. This way, we don't show the option to recall a nonexistent image.
This matches the flow for other metadata entities like models - we don't show the model recall button if the model isn't available.
Currently translated at 73.3% (826 of 1126 strings)
Co-authored-by: Alexander Eichhorn <pfannkuchensack@einfach-doof.de>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
The previous algorithm errored if the image wasn't divisible by the tile size. I've reimplemented it from scratch to mitigate this issue.
The new algorithm is simpler. We create a pool of tiles, then use them to create an image composed completely of tiles. If there is any awkwardly sized space on the edge of the image, the tiles are cropped to fit.
Finally, paste the original image over the tile image.
I've added a jupyter notebook to do a smoke test of infilling methods, and 10 test images.
The other infill algorithms can be easily tested with the notebook on the same images, though I didn't set that up yet.
Tested and confirmed this gives results just as good as the earlier infill, though of course they aren't the same due to the change in the algorithm.
We have had a few bugs with v4 related to file encodings, especially on Windows.
Windows uses its own character encodings instead of `utf-8`, often `cp1252`. Some characters cannot be decoded using `utf-8`, causing `UnicodeDecodeError`.
There are a couple places where this can cause problems:
- In the installer bootstrap, we install or upgrade `pip` and decode the result, using `subprocess`.
The input to this includes the user's home dir. In #6105, the user had one of the problematic characters in their username. `subprocess` attempts and fails to decode the username, which crashes the installer.
To fix this, we need to use `locale.getpreferredencoding()` when executing the command.
- Similarly, in the model install service and config class, we attempt to load a yaml config file. If a problematic character is in the path to the file (which often includes the user's home dir), we can get the same error.
One example is #6129 in which the models.yaml migration fails.
To fix this, we need to open the file with `locale.getpreferredencoding()`.
Compare the installed paths to determine if the model is already installed. Fixes an issue where installed models showed up as uninstalled or vice-versa. Related to relative vs absolute path handling.
Renaming the model file to the model name introduces unnecessary contraints on model names.
For example, a model name can technically be any length, but a model _filename_ cannot be too long.
There are also constraints on valid characters for filenames which shouldn't be applied to model record names.
I believe the old behaviour is a holdover from the old system.
Setting to 'auto' works only for InvokeAI config and auto detects the SD model but will override if user explicitly sets it. If auto used with checkpoint models, we raise an error. Checkpoints will always need to set to non-auto.
The valid values for this parameter changed when inpainting changed to gradient denoise. The generation slice's redux migration wasn't updated, resulting in a generation error until you change the setting or reset web UI.
- Add and use more performant `deepClone` method for deep copying throughout the UI.
Benchmarks indicate the Really Fast Deep Clone library (`rfdc`) is the best all-around way to deep-clone large objects.
This is particularly relevant in canvas. When drawing or otherwise manipulating canvas objects, we need to do a lot of deep cloning of the canvas layer state objects.
Previously, we were using lodash's `cloneDeep`.
I did some fairly realistic benchmarks with a handful of deep-cloning algorithms/libraries (including the native `structuredClone`). I used a snapshot of the canvas state as the data to be copied:
On Chromium, `rfdc` is by far the fastest, over an order of magnitude faster than `cloneDeep`.
On FF, `fastest-json-copy` and `recursiveDeepCopy` are even faster, but are rather limited in data types. `rfdc`, while only half as fast as the former 2, is still nearly an order of magnitude faster than `cloneDeep`.
On Safari, `structuredClone` is the fastest, about 2x as fast as `cloneDeep`. `rfdc` is only 30% faster than `cloneDeep`.
`rfdc`'s peak memory usage is about 10% more than `cloneDeep` on Chrome. I couldn't get memory measurements from FF and Safari, but let's just assume the memory usage is similar relative to the other algos.
Overall, `rfdc` is the best choice for a single algo for all browsers. It's definitely the best for Chromium, by far the most popular desktop browser and thus our primary target.
A future enhancement might be to detect the browser and use that to determine which algorithm to use.
There were two ways the canvas history could grow too large (past the `MAX_HISTORY` setting):
- Sometimes, when pushing to history, we didn't `shift` an item out when we exceeded the max history size.
- If the max history size was exceeded by more than one item, we still only `shift`, which removes one item.
These issue could appear after an extended canvas session, resulting in a memory leak and recurring major GCs/browser performance issues.
To fix these issues, a helper function is added for both past and future layer states, which uses slicing to ensure history never grows too large.