- Commands, invocations and their parameters will now autocomplete using
introspection.
- Two types of parameter *arguments* will also autocomplete:
- --sampler_name will autocomplete the scheduler name
- --model will autocomplete the model name
- There don't seem to be commands for reading/writing image files yet,
so path autocompletion is not implemented
A long-standing issue with importing legacy checkpoints (both ckpt and
safetensors) is that the user has to identify the correct config file,
either by providing its path or by selecting which type of model the
checkpoint is (e.g. "v1 inpainting"). In addition, some users wish to
provide custom VAEs for use with the model. Currently this is done in
the WebUI by importing the model, editing it, and then typing in the
path to the VAE.
## Model configuration file selection
To improve the user experience, the model manager's `heuristic_import()`
method has been enhanced as follows:
1. When initially called, the caller can pass a config file path, in
which case it will be used.
2. If no config file provided, the method looks for a .yaml file in the
same directory as the model which bears the same basename. e.g.
```
my-new-model.safetensors
my-new-model.yaml
```
The yaml file is then used as the configuration file for importation and
conversion.
3. If no such file is found, then the method opens up the checkpoint and
probes it to determine whether it is V1, V1-inpaint or V2. If it is a V1
format, then the appropriate v1-inference.yaml config file is used.
Unfortunately there are two V2 variants that cannot be distinguished by
introspection.
4. If the probe algorithm is unable to determine the model type, then
its last-ditch effort is to execute an optional callback function that
can be provided by the caller. This callback, named
`config_file_callback` receives the path to the legacy checkpoint and
returns the path to the config file to use. The CLI uses to put up a
multiple choice prompt to the user. The WebUI **could** use this to
prompt the user to choose from a radio-button selection.
5. If the config file cannot be determined, then the import is
abandoned.
## Custom VAE Selection
The user can attach a custom VAE to the imported and converted model by
copying the desired VAE into the same directory as the file to be
imported, and giving it the same basename. E.g.:
```
my-new-model.safetensors
my-new-model.vae.pt
```
For this to work, the VAE must end with ".vae.pt", ".vae.ckpt", or
".vae.safetensors". The indicated VAE will be converted into diffusers
format and stored with the converted models file, so the ".pt" file can
be deleted after conversion.
No facility is currently provided to swap a diffusers VAE at import
time, but this can be done after the fact using the WebUI and CLI's
model editing functions.
Note that this is the same fix that was applied to the 2.3 branch in
#3043 . This applies to `main`.
## Enable the on-the-fly conversion of models based on SD 2.0/2.1 into
diffusers
This commit fixes bugs related to the on-the-fly conversion and loading
of legacy checkpoint models built on SD-2.0 base.
- When legacy checkpoints built on SD-2.0 models were converted
on-the-fly using --ckpt_convert, generation would crash with a precision
incompatibility error. This problem has been found and fixed.
This commit fixes bugs related to the on-the-fly conversion and loading of
legacy checkpoint models built on SD-2.0 base.
- When legacy checkpoints built on SD-2.0 models were converted
on-the-fly using --ckpt_convert, generation would crash with a
precision incompatibility error.
The Pytorch ROCm version in the documentation in outdated (`rocm5.2`)
which leads to errors during the installation of InvokeAI.
This PR updates the documentation with the latest Pytorch ROCm `5.4.2`
version.
A long-standing issue with importing legacy checkpoints (both ckpt and
safetensors) is that the user has to identify the correct config file,
either by providing its path or by selecting which type of model the
checkpoint is (e.g. "v1 inpainting"). In addition, some users wish to
provide custom VAEs for use with the model. Currently this is done in
the WebUI by importing the model, editing it, and then typing in the
path to the VAE.
To improve the user experience, the model manager's
`heuristic_import()` method has been enhanced as follows:
1. When initially called, the caller can pass a config file path, in
which case it will be used.
2. If no config file provided, the method looks for a .yaml file in the
same directory as the model which bears the same basename. e.g.
```
my-new-model.safetensors
my-new-model.yaml
```
The yaml file is then used as the configuration file for
importation and conversion.
3. If no such file is found, then the method opens up the checkpoint
and probes it to determine whether it is V1, V1-inpaint or V2.
If it is a V1 format, then the appropriate v1-inference.yaml config
file is used. Unfortunately there are two V2 variants that cannot be
distinguished by introspection.
4. If the probe algorithm is unable to determine the model type, then its
last-ditch effort is to execute an optional callback function that can
be provided by the caller. This callback, named `config_file_callback`
receives the path to the legacy checkpoint and returns the path to the
config file to use. The CLI uses to put up a multiple choice prompt to
the user. The WebUI **could** use this to prompt the user to choose
from a radio-button selection.
5. If the config file cannot be determined, then the import is abandoned.
The user can attach a custom VAE to the imported and converted model
by copying the desired VAE into the same directory as the file to be
imported, and giving it the same basename. E.g.:
```
my-new-model.safetensors
my-new-model.vae.pt
```
For this to work, the VAE must end with ".vae.pt", ".vae.ckpt", or
".vae.safetensors". The indicated VAE will be converted into diffusers
format and stored with the converted models file, so the ".pt" file
can be deleted after conversion.
No facility is currently provided to swap a diffusers VAE at import
time, but this can be done after the fact using the WebUI and CLI's
model editing functions.
keeping `main` up to date with my api nodes branch:
- bd7e515290: [nodes] Add cancelation to
the API @Kyle0654
- 5fe38f7: fix(backend): simple typing fixes
- just picking some low-hanging fruit to improve IDE hinting
- c34ac91: fix(nodes): fix cancel; fix callback for img2img, inpaint
- makes nodes cancel immediate, use fix progress images on nodes, fix
callbacks for img2img/inpaint
- 4221cf7: fix(nodes): fix schema generation for output classes
- did this previously for some other class; needed to not have node
outputs be optional
Some schedulers report not only the noisy latents at the current
timestep, but also their estimate so far of what the de-noised latents
will be.
It makes for a more legible preview than the noisy latents do.
I think this is a huge improvement, but there are a few considerations:
- Need to not spook @JPPhoto by changing how previews look.
- Some schedulers (most notably **DPM Solver++**) don't provide this
data, and it falls back to the current behavior there. That's not
terrible, but seeing such a big difference in how _previews_ look from
one scheduler to the next might mislead people into thinking there's a
bigger difference in their overall effectiveness than there really is.
My fear of configuration-option-overwhelm leaves me inclined to _not_
add a configuration option for this, but we could.
- Commands, invocations and their parameters will now autocomplete
using introspection.
- Two types of parameter *arguments* will also autocomplete:
- --sampler_name will autocomplete the scheduler name
- --model will autocomplete the model name
- There don't seem to be commands for reading/writing image files yet, so
path autocompletion is not implemented
- resolve conflicts with generate.py invocation
- remove unused symbols that pyflakes complains about
- add **untested** code for passing intermediate latent image to the
step callback in the format expected.
This PR fixes#2951 and restores the step_callback argument in the
refactored generate() method. Note that this issue states that
"something is still wrong because steps and step are zero." However,
I think this is confusion over the call signature of the callback, which
since the diffusers merge has been `callback(state:PipelineIntermediateState)`
This is the test script that I used to determine that `step` is being passed
correctly:
```
from pathlib import Path
from invokeai.backend import ModelManager, PipelineIntermediateState
from invokeai.backend.globals import global_config_dir
from invokeai.backend.generator import Txt2Img
def my_callback(state:PipelineIntermediateState, total_steps:int):
print(f'callback(step={state.step}/{total_steps})')
def main():
manager = ModelManager(Path(global_config_dir()) / "models.yaml")
model = manager.get_model('stable-diffusion-1.5')
print ('=== TXT2IMG TEST ===')
steps=30
output = next(Txt2Img(model).generate(prompt='banana sushi',
iterations=None,
steps=steps,
step_callback=lambda x: my_callback(x,steps)
)
)
print(f'image={output.image}, seed={output.seed}, steps={output.params.steps}')
if __name__=='__main__':
main()
```
- When a legacy checkpoint model is loaded via --convert_ckpt and its
models.yaml stanza refers to a custom VAE path (using the 'vae:' key),
the custom VAE will be converted and used within the diffusers model.
Otherwise the VAE contained within the legacy model will be used.
- Note that the checkpoint import functions in the CLI or Web UIs
continue to default to the standard stabilityai/sd-vae-ft-mse VAE. This
can be fixed after the fact by editing VAE key using either the CLI or
Web UI.
- Fixes issue #2917
The mkdocs-workflow has been failing over the past week due to
permission denied errors. I *think* this is the result of not passing
the GitHub API token to the workflow, and this is a speculative fix for
the issue.
- This PR turns on pickle scanning before a legacy checkpoint file is
loaded from disk within the checkpoint_to_diffusers module.
- Also miscellaneous diagnostic message cleanup.
- See also #3011 for a similar patch to the 2.3 branch.
Currently translated at 100.0% (504 of 504 strings)
translationBot(ui): update translation (Spanish)
Currently translated at 100.0% (501 of 501 strings)
Co-authored-by: gallegonovato <fran-carro@hotmail.es>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/es/
Translation: InvokeAI/Web UI
Currently translated at 100.0% (504 of 504 strings)
translationBot(ui): update translation (Italian)
Currently translated at 100.0% (501 of 501 strings)
translationBot(ui): update translation (Italian)
Currently translated at 100.0% (500 of 500 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
This PR fixes#2951 and restores the step_callback argument in the
refactored generate() method. Note that this issue states that
"something is still wrong because steps and step are zero." However,
I think this is confusion over the call signature of the callback, which
since the diffusers merge has been `callback(state:PipelineIntermediateState)`
This is the test script that I used to determine that `step` is being passed
correctly:
```
from pathlib import Path
from invokeai.backend import ModelManager, PipelineIntermediateState
from invokeai.backend.globals import global_config_dir
from invokeai.backend.generator import Txt2Img
def my_callback(state:PipelineIntermediateState, total_steps:int):
print(f'callback(step={state.step}/{total_steps})')
def main():
manager = ModelManager(Path(global_config_dir()) / "models.yaml")
model = manager.get_model('stable-diffusion-1.5')
print ('=== TXT2IMG TEST ===')
steps=30
output = next(Txt2Img(model).generate(prompt='banana sushi',
iterations=None,
steps=steps,
step_callback=lambda x: my_callback(x,steps)
)
)
print(f'image={output.image}, seed={output.seed}, steps={output.params.steps}')
if __name__=='__main__':
main()
```
This PR corrects a bug in which embeddings were not being applied when a
non-diffusers model was loaded.
- Fixes#2954
- Also improves diagnostic reporting during embedding loading.
- This PR turns on pickle scanning before a legacy checkpoint file
is loaded from disk within the checkpoint_to_diffusers module.
- Also miscellaneous diagnostic message cleanup.
- When a legacy checkpoint model is loaded via --convert_ckpt and its
models.yaml stanza refers to a custom VAE path (using the 'vae:'
key), the custom VAE will be converted and used within the diffusers
model. Otherwise the VAE contained within the legacy model will be
used.
- Note that the heuristic_import() method, which imports arbitrary
legacy files on disk and URLs, will continue to default to the
the standard stabilityai/sd-vae-ft-mse VAE. This can be fixed after
the fact by editing the models.yaml stanza using the Web or CLI
UIs.
- Fixes issue #2917
- 86932469e76f1315ee18bfa2fc52b588241dace1 add image_to_dataURL util
- 0c2611059711b45bb6142d30b1d1343ac24268f3 make fast latents method
static
- this method doesn't really need `self` and should be able to be called
without instantiating `Generator`
- 2360bfb6558ea511e9c9576f3d4b5535870d84b4 fix schema gen for
GraphExecutionState
- `GraphExecutionState` uses `default_factory` in its fields; the result
is the OpenAPI schema marks those fields as optional, which propagates
to the generated API client, which means we need a lot of unnecessary
type guards to use this data type. the [simple
fix](https://github.com/pydantic/pydantic/discussions/4577) is to add
config to explicitly say all class properties are required. looks this
this will be resolved in a future pydantic release
- 3cd7319cfdb0f07c6bb12d62d7d02efe1ab12675 fix step callback and fast
latent generation on nodes. have this working in UI. depends on the
small change in #2957
Update `compel` to 1.0.0.
This fixes#2832.
It also changes the way downweighting is applied. In particular,
downweighting should now be much better and more controllable.
From the [compel
changelog](https://github.com/damian0815/compel#changelog):
> Downweighting now works by applying an attention mask to remove the
downweighted tokens, rather than literally removing them from the
sequence. This behaviour is the default, but the old behaviour can be
re-enabled by passing `downweight_mode=DownweightMode.REMOVE` on init of
the `Compel` instance.
>
> Formerly, downweighting a token worked by both multiplying the
weighting of the token's embedding, and doing an inverse-weighted blend
with a copy of the token sequence that had the downweighted tokens
removed. The intuition is that as weight approaches zero, the tokens
being downweighted should be actually removed from the sequence.
However, removing the tokens resulted in the positioning of all
downstream tokens becoming messed up. The blend ended up blending a lot
more than just the tokens in question.
>
> As of v1.0.0, taking advice from @keturn and @bonlime
(https://github.com/damian0815/compel/issues/7) the procedure is by
default different. Downweighting still involves a blend but what is
blended is a version of the token sequence with the downweighted tokens
masked out, rather than removed. This correctly preserves positioning
embeddings of the other tokens.
* Update root component to allow optional children that will render as
dynamic header of UI
* Export additional components (logo & themeChanger) for use in said
dynamic header (more to come here)
# The Problem
Pickle files (.pkl, .ckpt, etc) are extremely unsafe as they can be
trivially crafted to execute arbitrary code when parsed using
`torch.load`
Right now the conventional wisdom among ML researchers and users is to
simply `not run untrusted pickle files ever` and instead only use
Safetensor files, which cannot be injected with arbitrary code. This is
very good advice.
Unfortunately, **I have discovered a vulnerability inside of InvokeAI
that allows an attacker to disguise a pickle file as a safetensor and
have the payload execute within InvokeAI.**
# How It Works
Within `model_manager.py` and `convert_ckpt_to_diffusers.py` there are
if-statements that decide which `load` method to use based on the file
extension of the model file. The logic (written in a slightly more
readable format than it exists in the codebase) is as follows:
```
if Path(file).suffix == '.safetensors':
safetensor_load(file)
else:
unsafe_pickle_load(file)
```
A malicious actor would only need to create an infected .ckpt file, and
then rename the extension to something that does not pass the `==
'.safetensors'` check, but still appears to a user to be a safetensors
file.
For example, this might be something like `.Safetensors`,
`.SAFETENSORS`, `SafeTensors`, etc.
InvokeAI will happily import the file in the Model Manager and execute
the payload.
# Proof of Concept
1. Create a malicious pickle file.
(https://gist.github.com/CodeZombie/27baa20710d976f45fb93928cbcfe368)
2. Rename the `.ckpt` extension to some variation of `.Safetensors`,
ensuring there is a capital letter anywhere in the extension (eg.
`malicious_pickle.SAFETENSORS`)
3. Import the 'model' like you would normally with any other safetensors
file with the Model Manager.
4. Upon trying to select the model in the web ui, it will be loaded (or
attempt to be converted to a Diffuser) with `torch.load` and the payload
will execute.

# The Fix
This pull request changes the logic InvokeAI uses to decide which model
loader to use so that the safe behavior is the default. Instead of
loading as a pickle if the extension is not exactly `.safetensors`, it
will now **always** load as a safetensors file unless the extension is
**exactly** `.ckpt`.
# Notes:
I think support for pickle files should be totally dropped ASAP as a
matter of security, but I understand that there are reasons this would
be difficult.
In the meantime, I think `RestrictedUnpickler` or something similar
should be implemented as a replacement for `torch.load`, as this
significantly reduces the amount of Python methods that an attacker has
to work with when crafting malicious payloads
inside a pickle file.
Automatic1111 already uses this with some success.
(https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/master/modules/safe.py)
- The value of png_compression was always 6, despite the value provided
to the --png_compression argument. This fixes the bug.
- It also fixes an inconsistency between the maximum range of
png_compression and the help text.
- Closes#2945
- The value of png_compression was always 6, despite the value provided to the
--png_compression argument. This fixes the bug.
- It also fixes an inconsistency between the maximum range of png_compression
and the help text.
- Closes#2945
Prior to this commit, all models would be loaded with the extremely unsafe `torch.load` method, except those with the exact extension `.safetensors`. Even a change in casing (eg. `saFetensors`, `Safetensors`, etc) would cause the file to be loaded with torch.load instead of the much safer `safetensors.toch.load_file`.
If a malicious actor renamed an infected `.ckpt` to something like `.SafeTensors` or `.SAFETENSORS` an unsuspecting user would think they are loading a safe .safetensor, but would in fact be parsing an unsafe pickle file, and executing an attacker's payload. This commit fixes this vulnerability by reversing the loading-method decision logic to only use the unsafe `torch.load` when the file extension is exactly `.ckpt`.
#2931 was caused by new code that held onto the PRNG in `get_make_image`
and used it in `make_image` for img2img and inpainting. This
functionality has been moved elsewhere so that we can generate multiple
images again.
fix(ui): remove old scrollbar css
fix(ui): make guidepopover lazy
feat(ui): wip resizable drawer
feat(ui): wip resizable drawer
feat(ui): add scroll-linked shadow
feat(ui): organize files
Align Scrollbar next to content
Move resizable drawer underneath the progress bar
Add InvokeLogo to unpinned & align
Adds Invoke Logo to Unpinned Parameters panel and aligns to make it feel seamless.
# Remove node dependencies on generate.py
This is a draft PR in which I am replacing `generate.py` with a cleaner,
more structured interface to the underlying image generation routines.
The basic code pattern to generate an image using the new API is this:
```
from invokeai.backend import ModelManager, Txt2Img, Img2Img
manager = ModelManager('/data/lstein/invokeai-main/configs/models.yaml')
model = manager.get_model('stable-diffusion-1.5')
txt2img = Txt2Img(model)
outputs = txt2img.generate(prompt='banana sushi', steps=12, scheduler='k_euler_a', iterations=5)
# generate() returns an iterator
for next_output in outputs:
print(next_output.image, next_output.seed)
outputs = Img2Img(model).generate(prompt='strawberry` sushi', init_img='./banana_sushi.png')
output = next(outputs)
output.image.save('strawberries.png')
```
### model management
The `ModelManager` handles model selection and initialization. Its
`get_model()` method will return a `dict` with the following keys:
`model`, `model_name`,`hash`, `width`, and `height`, where `model` is
the actual StableDiffusionGeneratorPIpeline. If `get_model()` is called
without a model name, it will return whatever is defined as the default
in `models.yaml`, or the first entry if no default is designated.
### InvokeAIGenerator
The abstract base class `InvokeAIGenerator` is subclassed into into
`Txt2Img`, `Img2Img`, `Inpaint` and `Embiggen`. The constructor for
these classes takes the model dict returned by
`model_manager.get_model()` and optionally an
`InvokeAIGeneratorBasicParams` object, which encapsulates all the
parameters in common among `Txt2Img`, `Img2Img` etc. If you don't
provide the basic params, a reasonable set of defaults will be chosen.
Any of these parameters can be overridden at `generate()` time.
These classes are defined in `invokeai.backend.generator`, but they are
also exported by `invokeai.backend` as shown in the example below.
```
from invokeai.backend import InvokeAIGeneratorBasicParams, Img2Img
params = InvokeAIGeneratorBasicParams(
perlin = 0.15
steps = 30
scheduler = 'k_lms'
)
img2img = Img2Img(model, params)
outputs = img2img.generate(scheduler='k_heun')
```
Note that we were able to override the basic params in the call to
`generate()`
The `generate()` method will returns an iterator over a series of
`InvokeAIGeneratorOutput` objects. These objects contain the PIL image,
the seed, the model name and hash, and attributes for all the parameters
used to generate the object (you can also get these as a dict). The
`iterations` argument controls how many objects will be returned,
defaulting to 1. Pass `None` to get an infinite iterator.
Given the proposed use of `compel` to generate a templated series of
prompts, I thought the API would benefit from a style that lets you loop
over the output results indefinitely. I did consider returning a single
`InvokeAIGeneratorOutput` object in the event that `iterations=1`, but I
think it's dangerous for a method to return different types of result
under different circumstances.
Changing the model is as easy as this:
```
model = manager.get_model('inkspot-2.0`)
txt2img = Txt2Img(model)
```
### Node and legacy support
With respect to `Nodes`, I have written `model_manager_initializer` and
`restoration_services` modules that return `model_manager` and
`restoration` services respectively. The latter is used by the face
reconstruction and upscaling nodes. There is no longer any reference to
`Generate` in the `app` tree.
I have confirmed that `txt2img` and `img2img` work in the nodes client.
I have not tested `embiggen` or `inpaint` yet. pytests are passing, with
some warnings that I don't think are related to what I did.
The legacy WebUI and CLI are still working off `Generate` (which has not
yet been removed from the source tree) and fully functional.
I've finished all the tasks on my TODO list:
- [x] Update the pytests, which are failing due to dangling references
to `generate`
- [x] Rewrite the `reconstruct.py` and `upscale.py` nodes to call
directly into the postprocessing modules rather than going through
`Generate`
- [x] Update the pytests, which are failing due to dangling references
to `generate`
Prior to the folder restructure, the `paths` for `test-invoke-pip` did
not include the UI's path `invokeai/frontend/`:
```yaml
paths:
- 'pyproject.toml'
- 'ldm/**'
- 'invokeai/backend/**'
- 'invokeai/configs/**'
- 'invokeai/frontend/dist/**'
```
After the restructure, more code was moved into the `invokeai/frontend/`
folder, and `paths` was updated:
```yaml
paths:
- 'pyproject.toml'
- 'invokeai/**'
- 'invokeai/backend/**'
- 'invokeai/configs/**'
- 'invokeai/frontend/web/dist/**'
```
Now, the second path includes the UI. The UI now needs to be excluded,
and must be excluded prior to `invokeai/frontend/web/dist/**` being
included.
On `test-invoke-pip-skip`, we need to do a bit of logic juggling to
invert the folder selection. First, include the web folder, then exclude
everying around it and finally exclude the `dist/` folder
Currently translated at 100.0% (500 of 500 strings)
translationBot(ui): update translation (Italian)
Currently translated at 100.0% (500 of 500 strings)
translationBot(ui): update translation (Italian)
Currently translated at 100.0% (482 of 482 strings)
translationBot(ui): update translation (Italian)
Currently translated at 100.0% (480 of 480 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Currently translated at 100.0% (500 of 500 strings)
translationBot(ui): update translation (Spanish)
Currently translated at 100.0% (482 of 482 strings)
translationBot(ui): update translation (Spanish)
Currently translated at 100.0% (480 of 480 strings)
Co-authored-by: gallegonovato <fran-carro@hotmail.es>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/es/
Translation: InvokeAI/Web UI
Cause of the problem was inadvertent activation of the safety checker.
When conversion occurs on disk, the safety checker is disabled during loading.
However, when converting in RAM, the safety checker was not removed, resulting
in it activating even when user specified --no-nsfw_checker.
This PR fixes the problem by detecting when the caller has requested the InvokeAi
StableDiffusionGeneratorPipeline class to be returned and setting safety checker
to None. Do not do this with diffusers models destined for disk because then they
will be incompatible with the merge script!!
Closes#2836
Some schedulers report not only the noisy latents at the current timestep,
but also their estimate so far of what the de-noised latents will be.
It makes for a more legible preview than the noisy latents do.
Reverts invoke-ai/InvokeAI#2903
@mauwii has a point here. It looks like triggering on a comment results
in an action for each of the stale issues, even ones that have been
previously dealt with. I'd like to revert this back to the original
behavior of running once every time the cron job executes.
What's the original motivation for having more frequent labeling of the
issues?
I found it to be a chore to remove labels manually in order to
"un-stale" issues. This is contrary to the bot message which says
commenting should remove "stale" status. On the current `cron` schedule,
there may be a delay of up to 24 hours before the label is removed. This
PR will trigger the workflow on issue comments in addition to the
schedule.
Also adds a condition to not run this job on PRs (Github treats issues
and PRs equivalently in this respect), and rewords the messages for
clarity.
This ought to be working but i don't know how it's supposed to behave so
i haven't been able to verify. At least, I know the numbers are getting
pushed all the way to the SD unet, i just have been unable to verify if
what's coming out is what is expected. Please test.
You'll `need to pip install -e .` after switching to the branch, because
it's currently pulling from a non-main `compel` branch. Once it's
verified as working as intended i'll promote the compel branch to pypi.
# Overview
Adding a few accessibility items (I think 9 total items). Mostly
`aria-label`, but also a `<VisuallyHidden>` to the left-side nav tab
icons. Tried to match existing copy that was being used. Feedback
welcome
* Fix img2img and inpainting code so a strength of 1 behaves the same as txt2img.
* Make generated images identical to their txt2img counterparts when strength is 1.
Updates the CLI to define CLI commands as Pydantic objects, similar to
how Invocations (nodes) work. For example:
```py
class HelpCommand(BaseCommand):
"""Shows help"""
type: Literal['help'] = 'help'
def run(self, context: CliContext) -> None:
context.parser.print_help()
```
*looks like this #2814 was reverted accidentally. instead of trying to
revert the revert, this PR can simply be re-accepted and will fix the
ui.*
- Migrate UI from SCSS to Chakra's CSS-in-JS system
- better dx
- more capable theming
- full RTL language support (we now have Arabic and Hebrew)
- general cleanup of the whole UI's styling
- Tidy npm packages and update scripts, necessitates update to github
actions
To test this PR in dev mode, you will need to do a `yarn install` as a
lot has changed.
thanks to @blessedcoolant for helping out on this, it was a big effort.
There are actually two Stable Diffusion v2 legacy checkpoint
configurations:
1. "epsilon" prediction type for Stable Diffusion v2 Base
2. "v-prediction" type for Stable Diffusion v2-768
This commit adds the configuration file needed for epsilon prediction
type models as well as the UI that prompts the user to select the
appropriate configuration file when the code can't do so automatically.
To avoid `git blame` recording all the autoformatting changes under the
name 'lstein', this PR adds a `.git-blame-ignore-revs` that will ignore
any provenance changes that occurred during the recent refactor merge.
This fixes the crash that was occurring when trying to load a legacy
checkpoint file.
Note that this PR includes commits from #2867 to avoid diffusers files
from re-downloading at startup time.
There are actually two Stable Diffusion v2 legacy checkpoint
configurations:
1) "epsilon" prediction type for Stable Diffusion v2 Base
2) "v-prediction" type for Stable Diffusion v2-768
This commit adds the configuration file needed for epsilon prediction
type models as well as the UI that prompts the user to select the
appropriate configuration file when the code can't do so
automatically.
# Migrate to new HF diffusers cache location
This PR adjusts the model cache directory to use the layout of
`diffusers 0.14`. This will automatically migrate any diffusers models
located in `INVOKEAI_ROOT/models/diffusers` to
`INVOKEAI_ROOT/models/hub`, and cache new downloaded diffusers files
into the same location.
As before, if environment variable `HF_HOME` is set, then both
HuggingFace `from_pretrained()` calls as well as all InvokeAI methods
will use `HF_HOME/hub` as their cache.
- Migrate UI from SCSS to Chakra's CSS-in-JS system
- better dx
- more capable theming
- full RTL language support (we now have Arabic and Hebrew)
- general cleanup of the whole UI's styling
- Tidy npm packages and update scripts, necessitates update to github
actions
To test this PR in dev mode, you will need to do a `yarn install` as a
lot has changed.
thanks to @blessedcoolant for helping out on this, it was a big effort.
This removes modules that appear to be no longer used by any code under
the `invokeai` package now that the `ckpt_generator` is gone.
There are a few small changes in here to code that was referencing code
in a conditional branch for ckpt, or to swap out a ⚡ function for a
🤗 one, but only as much was strictly necessary to get things to
run. We'll follow with more clean-up to get lingering `if isinstance` or
`except AttributeError` branches later.
build(ui): fix husky path
build(ui): fix hmr issue, remove emotion cache
build(ui): clean up package.json
build(ui): update gh action and npm scripts
feat(ui): wip port lightbox to chakra theme
feat(ui): wip use chakra theme tokens
feat(ui): Add status text to main loading spinner
feat(ui): wip chakra theme tweaking
feat(ui): simply iaisimplemenu button
feat(ui): wip chakra theming
feat(ui): Theme Management
feat(ui): Add Ocean Blue Theme
feat(ui): wip lightbox
fix(ui): fix lightbox mouse
feat(ui): set default theme variants
feat(ui): model manager chakra theme
chore(ui): lint
feat(ui): remove last scss
feat(ui): fix switch theme
feat(ui): Theme Cleanup
feat(ui): Stylize Search Models Found List
feat(ui): hide scrollbars
feat(ui): fix floating button position
feat(ui): Scrollbar Styling
fix broken scripts
This PR fixes the following scripts:
1) Scripts that can be executed within the repo's scripts directory.
Note that these are for development testing and are not intended
to be exposed to the user.
configure_invokeai.py - configuration
dream.py - the legacy CLI
images2prompt.py - legacy "dream prompt" retriever
invoke-new.py - new nodes-based CLI
invoke.py - the legacy CLI under another name
make_models_markdown_table.py - a utility used during the release/doc process
pypi_helper.py - another utility used during the release process
sd-metadata.py - retrieve JSON-formatted metadata from a PNG file
2) Scripts that are installed by pip install. They get placed into the venv's
PATH and are intended to be the official entry points:
invokeai-node-cli - new nodes-based CLI
invokeai-node-web - new nodes-based web server
invokeai - legacy CLI
invokeai-configure - install time configuration script
invokeai-merge - model merging script
invokeai-ti - textual inversion script
invokeai-model-install - model installer
invokeai-update - update script
invokeai-metadata" - retrieve JSON-formatted metadata from PNG files
protect invocations against black autoformatting
deps: upgrade to diffusers 0.14, safetensors 0.3, transformers 4.26, accelerate 0.16
Things to check for in this version:
- `diffusers` cache location is now more consistent with other
huggingface-hub using code (i.e. `transformers`) as of
https://github.com/huggingface/diffusers/pull/2005. I think ultimately
this should make @damian0815 (and other folks with multiple
diffusers-using projects) happier, but it's worth taking a look to make
sure the way @lstein set things up to respect `HF_HOME` is still
functioning as intended.
- I've gone ahead and updated `transformers` to the current version
(4.26), but I have a vague memory that we were holding it back at some
point? Need to look that up and see if that's the case and why.
This PR fixes the following scripts:
1) Scripts that can be executed within the repo's scripts directory.
Note that these are for development testing and are not intended
to be exposed to the user.
```
configure_invokeai.py - configuration
dream.py - the legacy CLI
images2prompt.py - legacy "dream prompt" retriever
invoke-new.py - new nodes-based CLI
invoke.py - the legacy CLI under another name
make_models_markdown_table.py - a utility used during the release/doc process
pypi_helper.py - another utility used during the release process
sd-metadata.py - retrieve JSON-formatted metadata from a PNG file
```
2) Scripts that are installed by pip install. They get placed into the
venv's
PATH and are intended to be the official entry points:
```
invokeai-node-cli - new nodes-based CLI
invokeai-node-web - new nodes-based web server
invokeai - legacy CLI
invokeai-configure - install time configuration script
invokeai-merge - model merging script
invokeai-ti - textual inversion script
invokeai-model-install - model installer
invokeai-update - update script
invokeai-metadata" - retrieve JSON-formatted metadata from PNG files
```
Fix error when using txt2img
ModuleNotFoundError: No module named 'invokeai.backend.models'
and
ModuleNotFoundError: No module named
'invokeai.backend.generator.diffusers_pipeline'
This PR fixes the following scripts:
1) Scripts that can be executed within the repo's scripts directory.
Note that these are for development testing and are not intended
to be exposed to the user.
configure_invokeai.py - configuration
dream.py - the legacy CLI
images2prompt.py - legacy "dream prompt" retriever
invoke-new.py - new nodes-based CLI
invoke.py - the legacy CLI under another name
make_models_markdown_table.py - a utility used during the release/doc process
pypi_helper.py - another utility used during the release process
sd-metadata.py - retrieve JSON-formatted metadata from a PNG file
2) Scripts that are installed by pip install. They get placed into the venv's
PATH and are intended to be the official entry points:
invokeai-node-cli - new nodes-based CLI
invokeai-node-web - new nodes-based web server
invokeai - legacy CLI
invokeai-configure - install time configuration script
invokeai-merge - model merging script
invokeai-ti - textual inversion script
invokeai-model-install - model installer
invokeai-update - update script
invokeai-metadata" - retrieve JSON-formatted metadata from PNG files
To avoid `git blame` recording all the autoformatting changes
under the name 'lstein', this PR adds a `.git-blame-ignore-revs`
that will ignore any provenance changes that occurred during the
recent refactor merge.
# All python code has been moved under `invokeai`. All vestiges of `ldm`
and `ldm.invoke` are now gone.
***You will need to run `pip install -e .` before the code will work
again!***
Everything seems to be functional, but extensive testing is advised.
A guide to where the files have gone is forthcoming.
This is the first phase of a big shifting of files and directories
in the source tree.
You will need to run `pip install -e .` before the code will work again!
Here's what's in the current commit:
1) Remove a lot of dead code that dealt with checkpoint and safetensor loading.
2) Entire ckpt_generator hierarchy is now gone!
3) ldm.invoke.generator.* => invokeai.generator.*
4) ldm.model.* => invokeai.model.*
5) ldm.invoke.model_manager => invokeai.model.model_manager
6) In addition, a number of frequently-accessed classes can be imported
from the invokeai.model and invokeai.generator modules:
from invokeai.generator import ( Generator, PipelineIntermediateState,
StableDiffusionGeneratorPipeline, infill_methods)
from invokeai.models import ( ModelManager, SDLegacyType
InvokeAIDiffuserComponent, AttentionMapSaver,
DDIMSampler, KSampler, PLMSSampler,
PostprocessingSettings )
* [nodes] Add better error handling to processor and CLI
* [nodes] Use more explicit name for marking node execution error
* [nodes] Update the processor call to error
This should make caching way easier and therefore speed up the image
(re-)creation a lot.
Other small improvements:
- reorder .dockerignore
- rename amd flavor to rocm to align with cuda flavor
- use `user:group` for definitions
- add `--platform=${TARGETPLATFORM}` to base
This PR adds the core of the node-based invocation system first
discussed in https://github.com/invoke-ai/InvokeAI/discussions/597 and
implements it through a basic CLI and API. This supersedes #1047, which
was too far behind to rebase.
## Architecture
### Invocations
The core of the new system is **invocations**, found in
`/ldm/invoke/app/invocations`. These represent individual nodes of
execution, each with inputs and outputs. Core invocations are already
implemented (`txt2img`, `img2img`, `upscale`, `face_restore`) as well as
a debug invocation (`show_image`). To implement a new invocation, all
that is required is to add a new implementation in this folder (there is
a markdown document describing the specifics, though it is slightly
out-of-date).
### Sessions
Invocations and links between them are maintained in a **session**.
These can be queued for invocation (either the next ready node, or all
nodes). Some notes:
* Sessions may be added to at any time (including after invocation), but
may not be modified.
* Links are always added with a node, and are always links from existing
nodes to the new node. These links can be relative "history" links, e.g.
`-1` to link from a previously executed node, and can link either
specific outputs, or can opportunistically link all matching outputs by
name and type by using `*`.
* There are no iteration/looping constructs. Most needs for this could
be solved by either duplicating nodes or cloning sessions. This is open
for discussion, but is a difficult problem to solve in a way that
doesn't make the code even more complex/confusing (especially regarding
node ids and history).
### Services
These make up the core the invocation system, found in
`/ldm/invoke/app/services`. One of the key design philosophies here is
that most components should be replaceable when possible. For example,
if someone wants to use cloud storage for their images, they should be
able to replace the image storage service easily.
The services are broken down as follows (several of these are
intentionally implemented with an initial simple/naïve approach):
* Invoker: Responsible for creating and executing **sessions** and
managing services used to do so.
* Session Manager: Manages session history. An on-disk implementation is
provided, which stores sessions as json files on disk, and caches
recently used sessions for quick access.
* Image Storage: Stores images of multiple types. An on-disk
implementation is provided, which stores images on disk and retains
recently used images in an in-memory cache.
* Invocation Queue: Used to queue invocations for execution. An
in-memory implementation is provided.
* Events: An event system, primarily used with socket.io to support
future web UI integration.
## Apps
Apps are available through the `/scripts/invoke-new.py` script (to-be
integrated/renamed).
### CLI
```
python scripts/invoke-new.py
```
Implements a simple CLI. The CLI creates a single session, and
automatically links all inputs to the previous node's output. Commands
are automatically generated from all invocations, with command options
being automatically generated from invocation inputs. Help is also
available for the cli and for each command, and is very verbose.
Additionally, the CLI supports command piping for single-line entry of
multiple commands. Example:
```
> txt2img --prompt "a cat eating sushi" --steps 20 --seed 1234 | upscale | show_image
```
### API
```
python scripts/invoke-new.py --api --host 0.0.0.0
```
Implements an API using FastAPI with Socket.io support for signaling.
API documentation is available at `http://localhost:9090/docs` or
`http://localhost:9090/redoc`. This includes OpenAPI schema for all
available invocations, session interaction APIs, and image APIs.
Socket.io signals are per-session, and can be subscribed to by session
id. These aren't currently auto-documented, though the code for event
emission is centralized in `/ldm/invoke/app/services/events.py`.
A very simple test html and script are available at
`http://localhost:9090/static/test.html` This demonstrates creating a
session from a graph, invoking it, and receiving signals from Socket.io.
## What's left?
* There are a number of features not currently covered by invocations. I
kept the set of invocations small during core development in order to
simplify refactoring as I went. Now that the invocation code has
stabilized, I'd love some help filling those out!
* There's no image metadata generated. It would be fairly
straightforward (and would make good sense) to serialize either a
session and node reference into an image, or the entire node into the
image. There are a lot of questions to answer around source images,
linked images, etc. though. This history is all stored in the session as
well, and with complex sessions, the metadata in an image may lose its
value. This needs some further discussion.
* We need a list of features (both current and future) that would be
difficult to implement without looping constructs so we can have a good
conversation around it. I'm really hoping we can avoid needing
looping/iteration in the graph execution, since it'll necessitate
separating an execution of a graph into its own concept/system, and will
further complicate the system.
* The API likely needs further filling out to support the UI. I think
using the new API for the current UI is possible, and potentially
interesting, since it could work like the new/demo CLI in a "single
operation at a time" workflow. I don't know how compatible that will be
with our UI goals though. It would be nice to support only a single API
though.
* Deeper separation of systems. I intentionally tried to not touch
Generate or other systems too much, but a lot could be gained by
breaking those apart. Even breaking apart Args into two pieces (command
line arguments and the parser for the current CLI) would make it easier
to maintain. This is probably in the future though.
author Kyle Schouviller <kyle0654@hotmail.com> 1669872800 -0800
committer Kyle Schouviller <kyle0654@hotmail.com> 1676240900 -0800
Adding base node architecture
Fix type annotation errors
Runs and generates, but breaks in saving session
Fix default model value setting. Fix deprecation warning.
Fixed node api
Adding markdown docs
Simplifying Generate construction in apps
[nodes] A few minor changes (#2510)
* Pin api-related requirements
* Remove confusing extra CORS origins list
* Adds response models for HTTP 200
[nodes] Adding graph_execution_state to soon replace session. Adding tests with pytest.
Minor typing fixes
[nodes] Fix some small output query hookups
[node] Fixing some additional typing issues
[nodes] Move and expand graph code. Add base item storage and sqlite implementation.
Update startup to match new code
[nodes] Add callbacks to item storage
[nodes] Adding an InvocationContext object to use for invocations to provide easier extensibility
[nodes] New execution model that handles iteration
[nodes] Fixing the CLI
[nodes] Adding a note to the CLI
[nodes] Split processing thread into separate service
[node] Add error message on node processing failure
Removing old files and duplicated packages
Adding python-multipart
- Add curated set of starter models based on team discussion. The final
list of starter models can be found in
`invokeai/configs/INITIAL_MODELS.yaml`
- To test model installation, I selected and installed all the models on
the list. This led to my discovering that when there are no more starter
models to display, the console front end crashes. So I made a fix to
this in which the entire starter model selection is no longer shown.
- Update model table in 050_INSTALL_MODELS.md
- Add guide to dealing with low-memory situations
- Version is now `v2.3.1`
- add new script `scripts/make_models_markdown_table.py` that parses
INITIAL_MODELS.yaml and creates markdown table for the model installation
documentation file
- update 050_INSTALLING_MODELS.md with above table, and add a warning
about additional license terms that apply to some of the models.
- Final list can be found in invokeai/configs/INITIAL_MODELS.yaml
- After installing all the models, I discovered a bug in the file
selection form that caused a crash when no remaining uninstalled
models remained. So had to fix this.
The sample_to_image method in `ldm.invoke.generator.base` was still
using ckpt-era code. As a result when the WebUI was set to show
"accurate" intermediate images, there'd be a crash. This PR corrects the
problem.
- Closes#2784
- Closes#2775
- Discord member @marcus.llewellyn reported that some civitai
2.1-derived checkpoints were not converting properly (probably
dreambooth-generated):
https://discord.com/channels/1020123559063990373/1078386197589655582/1078387806122025070
- @blessedcoolant tracked this down to a missing key that was used to
derive vector length of the CLIP model used by fetching the second
dimension of the tensor at "cond_stage_model.model.text_projection".
- On inspection, I found that the same second dimension can be recovered
from key 'cond_stage_model.model.ln_final.bias', and use that instead. I
hope this is correct; tested on multiple v1, v2 and inpainting models
and they converted correctly.
- While debugging this, I found and fixed several other issues:
- model download script was not pre-downloading the OpenCLIP
text_encoder or text_tokenizer. This is fixed.
- got rid of legacy code in `ckpt_to_diffuser.py` and replaced with
calls into `model_manager`
- more consistent status reporting in the CLI.
without this change, the project can be installed on 3.9 but not used
this also fixes the container images
Maybe we should re-enable Python 3.9 checks which would have prevented
this.
- Discord member @marcus.llewellyn reported that some civitai 2.1-derived checkpoints were
not converting properly (probably dreambooth-generated):
https://discord.com/channels/1020123559063990373/1078386197589655582/1078387806122025070
- @blessedcoolant tracked this down to a missing key that was used to
derive vector length of the CLIP model used by fetching the second
dimension of the tensor at "cond_stage_model.model.text_projection".
His proposed solution was to hardcode a value of 1024.
- On inspection, I found that the same second dimension can be
recovered from key 'cond_stage_model.model.ln_final.bias', and use
that instead. I hope this is correct; tested on multiple v1, v2 and
inpainting models and they converted correctly.
- While debugging this, I found and fixed several other issues:
- model download script was not pre-downloading the OpenCLIP
text_encoder or text_tokenizer. This is fixed.
- got rid of legacy code in `ckpt_to_diffuser.py` and replaced
with calls into `model_manager`
- more consistent status reporting in the CLI.
Root directory finding algorithm is:
2) use --root argument
2) use INVOKEAI_ROOT environment variable
3) use VIRTUAL_ENV environment variable
4) use ~/invokeai
Since developers are liable to put virtual environments in their
favorite places, not necessarily in the invokeai root directory, this PR
adds a sanity check that looks for the existence of
`VIRTUAL_ENV/invokeai.init`, and moves on to (4) if not found.
# This will constitute v2.3.1+rc2
## Windows installer enhancements
1. resize installer window to give more room for configure and download
forms
2. replace '\' with '/' in directory names to allow user to
drag-and-drop
folders into the dialogue boxes that accept directories.
3. similar change in CLI for the !import_model and !convert_model
commands
4. better error reporting when a model download fails due to network
errors
5. put the launcher scripts into a loop so that menu reappears after
invokeai, merge script, etc exits. User can quit with "Q".
6. do not try to download fp16 of sd-ft-mse-vae, since it doesn't exist.
7. cleaned up status reporting when installing models
8. Detect when install failed for some reason and print helpful error
message rather than stack trace.
9. Detect window size and resize to minimum acceptable values to provide
better display of configure and install forms.
10. Fix a bug in the CLI which prevented diffusers imported by their
repo_ids
from being correctly registered in the current session (though they
install
correctly)
11. Capitalize the "i" in Imported in the autogenerated descriptions.
Root directory finding algorithm is:
2) use --root argument
2) use INVOKEAI_ROOT environment variable
3) use VIRTUAL_ENV environment variable
4) use ~/invokeai
Since developer's are liable to put virtual environments in their
favorite places, not necessarily in the invokeai root directory, this
PR adds a sanity check that looks for the existence of
VIRTUAL_ENV/invokeai.init, and moves to (4) if not found.
- Fix a bug in the CLI which prevented diffusers imported by their repo_ids
from being correctly registered in the current session (though they install
correctly)
- Capitalize the "i" in Imported in the autogenerated descriptions.
1. resize installer window to give more room for configure and download forms
2. replace '\' with '/' in directory names to allow user to drag-and-drop
folders into the dialogue boxes that accept directories.
3. similar change in CLI for the !import_model and !convert_model commands
4. better error reporting when a model download fails due to network errors
5. put the launcher scripts into a loop so that menu reappears after
invokeai, merge script, etc exits. User can quit with "Q".
6. do not try to download fp16 of sd-ft-mse-vae, since it doesn't exist.
7. cleaned up status reporting when installing models
- Detect when install failed for some reason and print helpful error
message rather than stack trace.
- Detect window size and resize to minimum acceptable values to provide
better display of configure and install forms.
Currently translated at 81.4% (382 of 469 strings)
translationBot(ui): update translation (Russian)
Currently translated at 81.6% (382 of 468 strings)
Co-authored-by: Sergey Krashevich <svk@svk.su>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ru/
Translation: InvokeAI/Web UI
## Major Changes
The invokeai-configure script has now been refactored. The work of
selecting and downloading initial models at install time is now done by
a script named `invokeai-model-install` (module name is
`ldm.invoke.config.model_install`)
Screen 1 - adjust startup options:

Screen 2 - select SD models:

The calling arguments for `invokeai-configure` have not changed, so
nothing should break. After initializing the root directory, the script
calls `invokeai-model-install` to let the user select the starting
models to install.
`invokeai-model-install puts up a console GUI with checkboxes to
indicate which models to install. It respects the `--default_only` and
`--yes` arguments so that CI will continue to work. Here are the various
effects you can achieve:
`invokeai-configure`
This will use console-based UI to initialize invokeai.init,
download support models, and choose and download SD models
`invokeai-configure --yes`
Without activating the GUI, populate invokeai.init with default values,
download support models and download the "recommended" SD models
`invokeai-configure --default_only`
Activate the GUI for changing init options, but don't show the SD
download
form, and automatically download the default SD model (currently SD-1.5)
`invokeai-model-install`
Select and install models. This can be used to download arbitrary
models from the Internet, install HuggingFace models using their
repo_id,
or watch a directory for models to load at startup time
`invokeai-model-install --yes`
Import the recommended SD models without a GUI
`invokeai-model-install --default_only`
As above, but only import the default model
## Flexible Model Imports
The console GUI allows the user to import arbitrary models into InvokeAI
using:
1. A HuggingFace Repo_id
2. A URL (http/https/ftp) that points to a checkpoint or safetensors
file
3. A local path on disk pointing to a checkpoint/safetensors file or
diffusers directory
4. A directory to be scanned for all checkpoint/safetensors files to be
imported
The UI allows the user to specify multiple models to bulk import. The
user can specify whether to import the ckpt/safetensors as-is, or
convert to `diffusers`. The user can also designate a directory to be
scanned at startup time for checkpoint/safetensors files.
## Backend Changes
To support the model selection GUI PR introduces a new method in
`ldm.invoke.model_manager` called `heuristic_import(). This accepts a
string-like object which can be a repo_id, URL, local path or directory.
It will figure out what the object is and import it. It interrogates the
contents of checkpoint and safetensors files to determine what type of
SD model they are -- v1.x, v2.x or v1.x inpainting.
## Installer
I am attaching a zip file of the installer if you would like to try the
process from end to end.
[InvokeAI-installer-v2.3.0.zip](https://github.com/invoke-ai/InvokeAI/files/10785474/InvokeAI-installer-v2.3.0.zip)
motivation: i want to be doing future prompting development work in the
`compel` lib (https://github.com/damian0815/compel) - which is currently
pip installable with `pip install compel`.
-At some point pathlib was added to the list of imported modules and
this broken the os.path code that assembled the sample data set.
-Now fixed by replacing os.path calls with Path methods
-At some point pathlib was added to the list of imported modules and this
broken the os.path code that assembled the sample data set.
-Now fixed by replacing os.path calls with Path methods
- Disable responsive resizing below starting dimensions (you can make
form larger, but not smaller than what it was at startup)
- Fix bug that caused multiple --ckpt_convert entries (and similar) to
be written to init file.
This bug is related to the format in which we stored prompts for some time: an array of weighted subprompts.
This caused some strife when recalling a prompt if the prompt had colons in it, due to our recently introduced handling of negative prompts.
Currently there is no need to store a prompt as anything other than a string, so we revert to doing that.
Compatibility with structured prompts is maintained via helper hook.
Lots of earlier embeds use a common trigger token such as * or the
hebrew letter shan. Previously, the textual inversion manager would
refuse to load the second and subsequent embeddings that used a
previously-claimed trigger. Now, when this case is encountered, the
trigger token is replaced by <filename> and the user is informed of the
fact.
1. Fixed display crash when the number of installed models is less than
the number of desired columns to display them.
2. Added --ckpt_convert option to init file.
Enhancements:
1. Directory-based imports will not attempt to import components of diffusers models.
2. Diffuser directory imports now supported
3. Files that end with .ckpt that are not Stable Diffusion models (such as VAEs) are
skipped during import.
Bugs identified in Psychedelicious's review:
1. The invokeai-configure form now tracks the current contents of `invokeai.init` correctly.
2. The autoencoders are no longer treated like installable models, but instead are
mandatory support models. They will no longer appear in `models.yaml`
Bugs identified in Damian's review:
1. If invokeai-model-install is started before the root directory is initialized, it will
call invokeai-configure to fix the matter.
2. Fix bug that was causing empty `models.yaml` under certain conditions.
3. Made import textbox smaller
4. Hide the "convert to diffusers" options if nothing to import.
In theory, this reduces peak memory consumption by doing the conditioned
and un-conditioned predictions one after the other instead of in a
single mini-batch.
In practice, it doesn't reduce the reported "Max VRAM used for this
generation" for me, even without xformers. (But it does slow things down
by a good 18%.)
That suggests to me that the peak memory usage is during VAE decoding,
not the diffusion unet, but ymmv. It does [improve things for gogurt's
16 GB
M1](https://github.com/invoke-ai/InvokeAI/pull/2732#issuecomment-1436187407),
so it seems worthwhile.
To try it out, use the `--sequential_guidance` option:
2dded68267/ldm/invoke/args.py (L487-L492)
- Adds an update action to launcher script
- This action calls new python script `invokeai-update`, which prompts
user to update to latest release version, main development version, or
an arbitrary git tag or branch name.
- It then uses `pip` to update to whatever tag was specified.
The user interface (such as it is) looks like this:

- The TI script was looping over all files in the training image
directory, regardless of whether they were image files or not. This PR
adds a check for image file extensions.
-
- Closes#2715
- Fixes longstanding bug in the token vector size code which caused .pt
files to be assigned the wrong token vector length. These were then
tossed out during directory scanning.
- Fixes longstanding bug in the token vector size code which caused
.pt files to be assigned the wrong token vector length. These
were then tossed out during directory scanning.
- Fixed the test for token length; tested on several .pt and .bin files
- Also added a __main__ entrypoint for CLI.py, to make pdb debugging a
bit more convenient.
When selecting the last model of the third model-list in the
model-merging-TUI it crashed because the code forgot about the "None"
element.
Additionally it seems that it accidentally always took the wrong model
as third model if selected?
This simple fix resolves both issues.
Added symmetry to Invoke based on discussions with @damian0815. This can currently only be activated via the CLI with the `--h_symmetry_time_pct` and `--v_symmetry_time_pct` options. Those take values from 0.0-1.0, exclusive, indicating the percentage through generation at which symmetry is applied as a one-time operation. To have symmetry in either axis applied after the first step, use a very low value like 0.001.
- not sure why, but at some pont --ckpt_convert (which converts legacy checkpoints)
into diffusers in memory, stopped working due to float16/float32 issues.
- this commit repairs the problem
- also removed some debugging messages I found in passing
- Fixed the test for token length; tested on several .pt and .bin files
- Also added a __main__ entrypoint for CLI.py, to make pdb debugging a bit
more convenient.
- You can now achieve several effects:
`invokeai-configure`
This will use console-based UI to initialize invokeai.init,
download support models, and choose and download SD models
`invokeai-configure --yes`
Without activating the GUI, populate invokeai.init with default values,
download support models and download the "recommended" SD models
`invokeai-configure --default_only`
As above, but only download the default SD model (currently SD-1.5)
`invokeai-model-install`
Select and install models. This can be used to download arbitrary
models from the Internet, install HuggingFace models using their repo_id,
or watch a directory for models to load at startup time
`invokeai-model-install --yes`
Import the recommended SD models without a GUI
`invokeai-model-install --default_only`
As above, but only import the default model
A few bugs fixed.
- After the recent update to the Cancel Button, it was no longer
respecting sizing in Floating Mode and the Beta Canvas. Fixed that.
- After the recent dependency update, useHotkeys was bugging out for the
fullscreen hotkey `f`. Realized this was happening because the hotkey
was initialized in two places -- in both the gallery and the parameter
floating button. Removed it from both those places and moved it to the
InvokeTabs component. It makes sense to reside it here because it is a
global hotkey.
- Also added index `0` to the default Accordion index in state in order
to ensure that the main accordions stay open. Conveniently this works
great on all tabs. We have all the primary options in accordions so they
stay open. And as for advanced settings, the first one is always Seed
which is an important accordion, so it opens up by default.
Think there may be some more bugs. Looking in to them.
After upgrading the deps, the full screen hotkey started to bug out. I believe this was happening because it was triggered in two different components causing it to run twice. Removed it from both floating buttons and moved it to the Invoke tab. Makes sense to keep it there as it is a global hotkey.
After the recent changes the Cancel button wasn't maintaining min height in floating mode. Also the new button group was not scaling in width correctly on the Canvas Beta UI. Fixed both.
- Adds a translation status badge
- Adds a blurb about contributing a translation (we want Weblate to be
the source of truth for translations, and to avoid updating translations
directly here)
- Upgraded all dependencies
- Removed beta TS 5.0 as it conflicted with some packages
- Added types for `Array.prototype.findLast` and
`Array.prototype.findLastIndex` (these definitions are provided in TS
5.0
- Fixed fixed type import syntax in a few components
- Re-patched `redux-deep-persist` and tested to ensure the patch still
works
The husky pre-commit command was `npx run lint`, but it should run
`lint-staged`. Also, `npx` wasn't working for me. Changed the command to
`npm run lint-staged` and it all works. Extended the `lint-staged`
triggers to hit `json`, `scss` and `html`.
When encountering a bad embedding, InvokeAI was asking about reconfiguring models. This is because the embedding load error was never handled - it now is.
- Upgraded all dependencies
- Removed beta TS 5.0 as it conflicted with some packages
- Added types for `Array.prototype.findLast` and `Array.prototype.findLastIndex` (these definitions are provided in TS 5.0
- Fixed fixed type import syntax in a few components
- Re-patched `redux-deep-persist` and tested to ensure the patch still works
Model Manager lags a bit if you have a lot of models.
Basically added a fake delay to rendering the model list so the modal
has time to load first. Hacky but if it works it works.
## What was the problem/requirement? (What/Why)
Frequently, I wish to cancel the processing of images, but also want the
current image to finalize before I do. To work around this, I need to
wait until the current one finishes before pressing the cancel.
## What was the solution? (How)
* Implemented a button that allows to "Cancel after current iteration,"
which stores a state in the UI that will attempt to cancel the
processing after the current image finishes
* If the button is pressed again, while it is spinning and before the
next iteration happens, this will stop the scheduling of the cancel, and
behave as if the button was never pressed.
### Minor
* Added `.yarn` to `.gitignore` as this was an output folder produced
from following Frontend's README
### Revision 2
#### Major
* Changed from a standalone button to a context menu next to the
original cancel button. Pressing the context menu will give the
drop-down option to select which type of cancel method the user prefers,
and they can press that button for canceling in the specified type
* Moved states to system state for cross-screen and toggled cancel types
management
* Added in distribution for the target yarn version (allowing any
version of yarn to compile successfully), and updated the README to
ensure `--immutable` is passed for onboarding developers
#### Minor
* Updated `.gitignore` to ignore specific yarn folders, as specified by
their team -
https://yarnpkg.com/getting-started/qa#which-files-should-be-gitignored
## How were these changes tested?
* `yarn dev` => Server started successfully
* Manual testing on the development server to ensure the button behaved
as expected
* `yarn run build` => Success
### Artifacts
#### Revision 1
* Video showing the UI changes in action
https://user-images.githubusercontent.com/89283782/218347722-3a15ce61-2d8c-4c38-b681-e7a3e79dd595.mov
* Images showing the basic UI changes


#### Revision 2
* Video showing the UI changes in action
https://user-images.githubusercontent.com/89283782/219901217-048d2912-9b61-4415-85fd-9e8fedb00c79.mov
* Images showing the basic UI changes
(Default state)

(Drop-down context menu active)

(Scheduled cancel selected and running)

(Scheduled cancel started)

## Notes
* Using `SystemState`'s `currentStatus` variable, when the value is
`common:statusIterationComplete` is an alternative to this approach (and
would be more optimal as it should prevent the next iteration from even
starting), but since the names are within the translations, rather than
an enum or other type, this method of tracking the current iteration was
used instead.
* `isLoading` on `IAIIconButton` caused the Icon Button to also be
disabled, so the current solution works around that with conditionally
rendering the icon of the button instead of passing that value.
* I don't have context on the development expectation for `dist` folder
interactions (and couldn't find any documentation outside of the
`.gitignore` mentioning that the folder should remain. Let me know if
they need to be modified a certain way.
- The checkpoint conversion script was generating diffusers models with
the safety checker set to null. This resulted in models that could not
be merged with ones that have the safety checker activated.
- This PR fixes the issue by incorporating the safety checker into all
1.x-derived checkpoints, regardless of user's nsfw_checker setting.
- The checkpoint conversion script was generating diffusers models
with the safety checker set to null. This resulted in models
that could not be merged with ones that have the safety checker
activated.
- This PR fixes the issue by incorporating the safety checker into
all 1.x-derived checkpoints, regardless of user's nsfw_checker setting.
Also tighten up the typing of `device` attributes in general.
Fixes
> ValueError: Expected a torch.device with a specified index or an
integer, but got:cuda
Weblate's first PR was it attempting to fix some translation issues we
had overlooked!
It wanted to remove some keys which it did not see in our translation
source due to typos.
This PR instead corrects the key names to resolve the issues.
# Weblate Translation
After doing a full integration test of 3 translation service providers
on my fork of InvokeAI, we have chosen
[Weblate](https://hosted.weblate.org). The other two viable options were
[Crowdin](https://crowdin.com/) and
[Transifex](https://www.transifex.com/).
Weblate was the choice because its hosted service provides a very solid
UX / DX, can scale as much as we may ever need, is FOSS itself, and
generously offers free hosted service to other libre projects like ours.
## How it works
Weblate hosts its own fork of our repo and establishes a kind of
unidirectional relationship between our repo and its fork.
### InvokeAI --> Weblate
The `invoke-ai/InvokeAI` repo has had the Weblate GitHub app added to
it. This app watches for changes to our translation source
(`invokeai/frontend/public/locales/en.json`) and then updates the
Weblate fork. The Weblate UI then knows there are new strings to be
translated, or changes to be made.
### Translation
Our translators can then update the translations on the Weblate UI. The
plan now is to invite individual community members who have expressed
interest in maintaining a language or two and give them access to the
app. We can also open the doors to the general public if desired.
### Weblate --> InvokeAI
When a translation is ready or changed, the system will make a PR to
`main`. We have a substantial degree of control over this and will
likely manually trigger these PRs instead of letting them fire off
automatically.
Once a PR is merged, we will still need to rebuild the web UI. I think
we can set things up so that we only need the rebuild when a totally new
language is added, but for now, we will stick to this relatively simple
setup.
## This PR
This PR sets up the web UI's translation stuff to work with Weblate:
- merged each locale into a single file
- updated the i18next config and UI to work with this simpler file
structure
- update our eslint and prettier rules to ensure the locale files have
the same format as what Weblate outputs (`tabWidth: 4`)
- added a thank you to Weblate in our README
Once this is merged, I'll link Weblate to `main` and do a couple tests
to ensure it is all working as expected.
This fixes a few cosmetic bugs in the merge models console GUI:
1) Fix the minimum and maximum ranges on alpha. Was 0.05 to 0.95. Now
0.01 to 0.99.
2) Don't show the 'add_difference' interpolation method when 2 models
selected, or the other three methods when three models selected
## Convert v2 models in CLI
- This PR introduces a CLI prompt for the proper configuration file to
use when converting a ckpt file, in order to support both inpainting
and v2 models files.
- When user tries to directly !import a v2 model, it prints out a proper
warning that v2 ckpts are not directly supported and converts it into a
diffusers model automatically.
The user interaction looks like this:
```
(stable-diffusion-1.5) invoke> !import_model /home/lstein/graphic-art.ckpt
Short name for this model [graphic-art]: graphic-art-test
Description for this model [Imported model graphic-art]: Imported model graphic-art
What type of model is this?:
[1] A model based on Stable Diffusion 1.X
[2] A model based on Stable Diffusion 2.X
[3] An inpainting model based on Stable Diffusion 1.X
[4] Something else
Your choice: [1] 2
```
In addition, this PR enhances the bulk checkpoint import function. If a
directory path is passed to `!import_model` then it will be scanned for
`.ckpt` and `.safetensors` files. The user will be prompted to import
all the files found, or select which ones to import.
Addresses
https://discord.com/channels/1020123559063990373/1073730061380894740/1073954728544845855
- fix alpha slider to show values from 0.01 to 0.99
- fix interpolation list to show 'difference' method for 3 models,
- and weighted_sum, sigmoid and inverse_sigmoid methods for 2
Porting over as many usable options to slider as possible.
- Ported Face Restoration settings to Sliders.
- Ported Upscale Settings to Sliders.
- Ported Variation Amount to Sliders.
- Ported Noise Threshold to Sliders <-- Optimized slider so the values
actually make sense.
- Ported Perlin Noise to Sliders.
- Added a suboption hook for the High Res Strength Slider.
- Fixed a couple of small issues with the Slider component.
- Ported Main Options to Sliders.
- Corrected error that caused --full-precision argument to be ignored
when models downloaded using the --yes argument.
- Improved autodetection of v1 inpainting files; no longer relies on the
file having 'inpaint' in the name.
* new OffloadingDevice loads one model at a time, on demand
* fixup! new OffloadingDevice loads one model at a time, on demand
* fix(prompt_to_embeddings): call the text encoder directly instead of its forward method
allowing any associated hooks to run with it.
* more attempts to get things on the right device from the offloader
* more attempts to get things on the right device from the offloader
* make offloading methods an explicit part of the pipeline interface
* inlining some calls where device is only used once
* ensure model group is ready after pipeline.to is called
* fixup! Strategize slicing based on free [V]RAM (#2572)
* doc(offloading): docstrings for offloading.ModelGroup
* doc(offloading): docstrings for offloading-related pipeline methods
* refactor(offloading): s/SimpleModelGroup/FullyLoadedModelGroup
* refactor(offloading): s/HotSeatModelGroup/LazilyLoadedModelGroup
to frame it is the same terms as "FullyLoadedModelGroup"
---------
Co-authored-by: Damian Stewart <null@damianstewart.com>
- filter paths for `build-container.yml` and `test-invoke-pip.yml`
- add workflow to pass required checks on PRs with `paths-ignore`
- this triggers if `test-invoke-pip.yml` does not
- fix "CI checks on main link" in `/README.md`
- filter paths for `build-container.yml` and `test-invoke-pip.yml`
- add workflow to pass required checks on PRs with `paths-ignore`
- this triggers if `test-invoke-pip.yml` does not
- fix "CI checks on main link" in `/README.md`
Assuming that mixing `"literal strings"` and `{'JSX expressions'}`
throughout the code is not for a explicit reason but just a result IDE
autocompletion, I changed all props to be consistent with the
conventional style of using simple string literals where it is
sufficient.
This is a somewhat trivial change, but it makes the code a little more
readable and uniform
- quashed multiple bugs in model conversion and importing
- found old issue in handling of resume of interrupted downloads
- will require extensive testing
### WebUI Model Conversion
**Model Search Updates**
- Model Search now has a radio group that allows users to pick the type
of model they are importing. If they know their model has a custom
config file, they can assign it right here. Based on their pick, the
model config data is automatically populated. And this same information
is used when converting the model to `diffusers`.

- Files named `model.safetensors` and
`diffusion_pytorch_model.safetensors` are excluded from the search
because these are naming conventions used by diffusers models and they
will end up showing on the list because our conversion saves safetensors
and not bin files.
**Model Conversion UI**
- The **Convert To Diffusers** button can be found on the Edit page of
any **Checkpoint Model**.

- When converting the model, the entire process is handled
automatically. The corresponding config while at the time of the Ckpt
addition is used in the process.
- Users are presented with the choice on where to save the diffusers
converted model - same location as the ckpt, InvokeAI models root folder
or a completely custom location.

- When the model is converted, the checkpoint entry is replaced with the
diffusers model entry. A user can readd the ckpt if they wish to.
---
More or less done. Might make some minor UX improvements as I refine
things.
Tensors with diffusers no longer have to be multiples of 8. This broke Perlin noise generation. We now generate noise for the next largest multiple of 8 and return a cropped result. Fixes#2674.
`generator` now asks `InvokeAIDiffuserComponent` to do postprocessing work on latents after every step. Thresholding - now implemented as replacing latents outside of the threshold with random noise - is called at this point. This postprocessing step is also where we can hook up symmetry and other image latent manipulations in the future.
Note: code at this layer doesn't need to worry about MPS as relevant torch functions are wrapped and made MPS-safe by `generator.py`.
1. Now works with sites that produce lots of redirects, such as CIVITAI
2. Derive name of destination model file from HTTP Content-Disposition header,
if present.
3. Swap \\ for / in file paths provided by users, to hopefully fix issues with
Windows.
This PR adds a new attributer to ldm.generate, `embedding_trigger_strings`:
```
gen = Generate(...)
strings = gen.embedding_trigger_strings
strings = gen.embedding_trigger_strings()
```
The trigger strings will change when the model is updated to show only
those strings which are compatible with the current
model. Dynamically-downloaded triggers from the HF Concepts Library
will only show up after they are used for the first time. However, the
full list of concepts available for download can be retrieved
programatically like this:
```
from ldm.invoke.concepts_lib import HuggingFAceConceptsLibrary
concepts = HuggingFaceConceptsLibrary()
trigger_strings = concepts.list_concepts()
```
I have added the arabic locale files. There need to be some
modifications to the code in order to detect the language direction and
add it to the current document body properties.
For example we can use this:
import { appWithTranslation, useTranslation } from "next-i18next";
import React, { useEffect } from "react";
const { t, i18n } = useTranslation();
const direction = i18n.dir();
useEffect(() => {
document.body.dir = direction;
}, [direction]);
This should be added to the app file. It uses next-i18next to
automatically get the current language and sets the body text direction
(ltr or rtl) depending on the selected language.
## Provide informative error messages when TI and Merge scripts have
insufficient space for console UI
- The invokeai-ti and invokeai-merge scripts will crash if there is not
enough space in the console to fit the user interface (even after
responsive formatting).
- This PR intercepts the errors and prints a useful error message
advising user to make window larger.
1. The invokeai-configure script has now been refactored. The work of
selecting and downloading initial models at install time is now done
by a script named invokeai-initial-models (module
name is ldm.invoke.config.initial_model_select)
The calling arguments for invokeai-configure have not changed, so
nothing should break. After initializing the root directory, the
script calls invokeai-initial-models to let the user select the
starting models to install.
2. invokeai-initial-models puts up a console GUI with checkboxes to
indicate which models to install. It respects the --default_only
and --yes arguments so that CI will continue to work.
3. User can now edit the VAE assigned to diffusers models in the CLI.
4. Fixed a bug that caused a crash during model loading when the VAE
is set to None, rather than being empty.
- The invokeai-ti and invokeai-merge scripts will crash if there is not enough space
in the console to fit the user interface (even after responsive formatting).
- This PR intercepts the errors and prints a useful error message advising user to
make window larger.
- fix unused variables and f-strings found by pyflakes
- use global_converted_ckpts_dir() to find location of diffusers
- fixed bug in model_manager that was causing the description of converted
models to read "Optimized version of {model_name}'
Strategize slicing based on free [V]RAM when not using xformers. Free [V]RAM is evaluated at every generation. When there's enough memory, the entire generation occurs without slicing. If there is not enough free memory, we use diffusers' sliced attention.
- Adds an update action to launcher script
- This action calls new python script `invokeai-update`, which prompts
user to update to latest release version, main development version,
or an arbitrary git tag or branch name.
- It then uses `pip` to update to whatever tag was specified.
Some of the core features of this PR include:
- optional push image to dockerhub (will be skipped in repos which
didn't set it up)
- stop using the root user at runtime
- trigger builds also for update/docker/* and update/ci/docker/*
- always cache image from current branch and main branch
- separate caches for container flavors
- updated comments with instructions in build.sh and run.sh
This commit cleans up the code that did bulk imports of legacy model
files. The code has been refactored, and the user is now offered the
option of importing all the model files found in the directory, or
selecting which ones to import.
Users can now pick the folder to save their diffusers converted model. It can either be the same folder as the ckpt, or the invoke root models folder or a totally custom location.
Fixed a couple of bugs:
1. The original config file for the ckpt file is derived from the entry in
`models.yaml` rather than relying on the user to select. The implication
of this is that V2 ckpt models need to be assigned `v2-inference-v.yaml`
when they are first imported. Otherwise they won't convert right. Note
that currently V2 ckpts are imported with `v1-inference.yaml`, which
isn't right either.
2. Fixed a backslash in the output diffusers path, which was causing
load failures on Linux.
Remaining issues:
1. The radio buttons for selecting the model type are
nonfunctional. It feels to me like these should be moved into the
dialogue for importing ckpt/safetensors files, because this is
where the algorithm needs help from the user.
2. The output diffusers model is written into the same directory as
the input ckpt file. The CLI does it differently and stores the
diffusers model in `ROOTDIR/models/converted-ckpts`. We should
settle on one way or the other.
Converted the picker options to a Radio Group and also updated the backend to use the appropriate config if it is a v2 model that needs to be converted.
- This PR introduces a CLI prompt for the proper configuration file to
use when converting a ckpt file, in order to support both inpainting
and v2 models files.
- When user tries to directly !import a v2 model, it prints out a proper
warning that v2 ckpts are not directly supported.
## What was the problem/requirement? (What/Why)
* Windows location for the Python environment activate location is
currently incorrect
* Due to this, this command will fail for Windows-based users
* The contributing link within the `Developer Install` sections leads to
a [404](https://invoke-ai.github.io/index.md#Contributing)
* `Developer Install`'s numbered list currently lists 1, 1, 2, . . .
## What was the solution? (How)
* Changed the location of Windows script based on actual location -
[reference](https://docs.python.org/3/library/venv.html)
* Moved the link to point to one directory higher -- the main index.md
* Minor format adjustments to allow for the numbered list to appear as
expected
## How were these changes tested?
* `mkdocs serve` => Verified on local server that the changes reflected
as expected
## Notes
Contributing mentions to set the upstream towards the `development`
branch, but that branch has been untouched for several months, so I've
pointed to the `main` branch. Let me know if we need to switch to a
different one.
…odels
- If CLI asked to convert the currently loaded model, the model would
crash on the first rendering. CLI will now refuse to convert a model
loaded in memory (probably a good idea in any case).
- CLI will offer the `v1-inpainting-inference.yaml` as the configuration
file when importing an inpainting a .ckpt or .safetensors file that has
"inpainting" in the name. Otherwise it offers `v1-inference.yaml` as the
default.
rather than bypassing any path with diffusers in it, im specifically bypassing model.safetensors and diffusion_pytorch_model.safetensors both of which should be diffusers files in most cases.
- If CLI asked to convert the currently loaded model, the model would crash
on the first rendering. CLI will now refuse to convert a model loaded
in memory (probably a good idea in any case).
- CLI will offer the `v1-inpainting-inference.yaml` as the configuration
file when importing an inpainting a .ckpt or .safetensors file that
has "inpainting" in the name. Otherwise it offers `v1-inference.yaml`
as the default.
Found a couple of places where the formatting was messed up. I also
added a "Quick Start Guide" to the README for people who encounter
InvokeAI through PyPi. It features the PyPi install!
pulling in denoising support from upstream (its already there, invoke
just isn't using it). I've enabled this as a command line argument as
construction of the ESRGAN handler happens once. Ideally this would be a
UI option that could be adjusted for each upscaling task. Unfortunately
that is beyond my current level of InvokeAI-foo.
Upstream reference is here, starting on line 99 "use dni to control the
denoise strength"
https://github.com/xinntao/Real-ESRGAN/blob/master/inference_realesrgan.py
- `eslint` and `prettier` configs
- `husky` to format and lint via pre-commit hook
- `babel-plugin-transform-imports` to treeshake `lodash` and other packages if needed
Lints and formats codebase.
`options` slice was huge and managed a mix of generation parameters and general app settings. It has been split up:
- Generation parameters are now in `generationSlice`.
- Postprocessing parameters are now in `postprocessingSlice`
- UI related things are now in `uiSlice`
There is probably more to be done, like `gallerySlice` perhaps should only manage internal gallery state, and not if the gallery is displayed.
Full-slice selectors have been made for each slice.
Other organisational tweaks.
Previously conversions of .ckpt and .safetensors files to diffusers
models were failing with channel mismatch errors. This is corrected
with this PR.
- The model_manager convert_and_import() method now accepts the path
to the checkpoint file's configuration file, using the parameter
`original_config_file`. For inpainting files this should be set to
the full path to `v1-inpainting-inference.yaml`.
- If no configuration file is provided in the call, then the presence
of an inpainting file will be inferred at the
`ldm.ckpt_to_diffuser.convert_ckpt_to_diffUser()` level by looking
for the string "inpaint" in the path. AUTO1111 does something
similar to this, but it is brittle and not recommended.
- This PR also changes the model manager model_names() method to return
the model names in case folded sort order.
label:What version did you experience this issue on?
description:|
Please share the version of Invoke AI that you experienced the issue on. If this is not the latest version, please update first to confirm the issue still exists. If you are testing main, please include the commit hash instead.
stale-issue-message:"There has been no activity in this issue for ${{ env.DAYS_BEFORE_ISSUE_STALE }} days. If this issue is still being experienced, please reply with an updated confirmation that the issue is still being experienced with the latest release."
close-issue-message:"Due to inactivity, this issue was automatically closed. If you are still experiencing the issue, please recreate the issue."
[![CI checks on main badge]][CI checks on main link] [![latest commit to main badge]][latest commit to main link]
[![github open issues badge]][github open issues link] [![github open prs badge]][github open prs link]
[![github open issues badge]][github open issues link] [![github open prs badge]][github open prs link] [![translation status badge]][translation status link]
[CI checks on main badge]: https://flat.badgen.net/github/checks/invoke-ai/InvokeAI/main?label=CI%20status%20on%20main&cache=900&icon=github
[CI checks on main link]:https://github.com/invoke-ai/InvokeAI/actions/workflows/test-invoke-conda.yml
[CI checks on main link]:https://github.com/invoke-ai/InvokeAI/actions?query=branch%3Amain
[translation status badge]: https://hosted.weblate.org/widgets/invokeai/-/svg-badge.svg
[translation status link]: https://hosted.weblate.org/engage/invokeai/
</div>
InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. InvokeAI offers an industry leading Web Interface, interactive Command Line Interface, and also serves as the foundation for multiple commercial products.
**Quick links**: [[How to Install](#installation)] [<ahref="https://discord.gg/ZmtBAhwWhy">Discord Server</a>] [<ahref="https://invoke-ai.github.io/InvokeAI/">Documentation and Tutorials</a>] [<ahref="https://github.com/invoke-ai/InvokeAI/">Code and Downloads</a>] [<ahref="https://github.com/invoke-ai/InvokeAI/issues">Bug Reports</a>] [<ahref="https://github.com/invoke-ai/InvokeAI/discussions">Discussion, Ideas & Q&A</a>]
**Quick links**: [[How to Install](https://invoke-ai.github.io/InvokeAI/#installation)] [<ahref="https://discord.gg/ZmtBAhwWhy">Discord Server</a>] [<ahref="https://invoke-ai.github.io/InvokeAI/">Documentation and Tutorials</a>] [<ahref="https://github.com/invoke-ai/InvokeAI/">Code and Downloads</a>] [<ahref="https://github.com/invoke-ai/InvokeAI/issues">Bug Reports</a>] [<ahref="https://github.com/invoke-ai/InvokeAI/discussions">Discussion, Ideas & Q&A</a>]
_Note: InvokeAI is rapidly evolving. Please use the
[Issues](https://github.com/invoke-ai/InvokeAI/issues) tab to report bugs and make feature
@ -41,38 +43,136 @@ requests. Be sure to use the provided templates. They will help us diagnose issu
### Automatic Installer (suggested for 1st time users)
1. Go to the bottom of the [Latest Release Page](https://github.com/invoke-ai/InvokeAI/releases/latest)
2. Download the .zip file for your OS (Windows/macOS/Linux).
3. Unzip the file.
4. If you are on Windows, double-click on the `install.bat` script. On macOS, open a Terminal window, drag the file `install.sh` from Finder into the Terminal, and press return. On Linux, run `install.sh`.
5. Wait a while, until it is done.
6. The folder where you ran the installer from will now be filled with lots of files. If you are on Windows, double-click on the `invoke.bat` file. On macOS, open a Terminal window, drag `invoke.sh` from the folder into the Terminal, and press return. On Linux, run `invoke.sh`
7. Press 2 to open the "browser-based UI", press enter/return, wait a minute or two for Stable Diffusion to start up, then open your browser and go to http://localhost:9090.
8. Type `banana sushi` in the box on the top left and click `Invoke`
4. If you are on Windows, double-click on the `install.bat` script. On
macOS, open a Terminal window, drag the file `install.sh` from Finder
into the Terminal, and press return. On Linux, run `install.sh`.
## Table of Contents
5. You'll be asked to confirm the location of the folder in which
to install InvokeAI and its image generation model files. Pick a
location with at least 15 GB of free memory. More if you plan on
InvokeAI is supported across Linux, Windows and macOS. Linux
users can use either an Nvidia-based card (with CUDA support) or an
AMD card (using the ROCm driver).
#### System
### System
You will need one of the following:
@ -98,11 +198,11 @@ We do not recommend the GTX 1650 or 1660 series video cards. They are
unable to run in half-precision mode and do not have sufficient VRAM
to render 512x512 images.
#### Memory
### Memory
- At least 12 GB Main Memory RAM.
#### Disk
### Disk
- At least 12 GB of free disk space for the machine learning model, Python, and all its dependencies.
@ -152,13 +252,15 @@ Notes](https://github.com/invoke-ai/InvokeAI/releases) and the
Please check out our **[Q&A](https://invoke-ai.github.io/InvokeAI/help/TROUBLESHOOT/#faq)** to get solutions for common installation
problems and other issues.
# Contributing
## Contributing
Anyone who wishes to contribute to this project, whether documentation, features, bug fixes, code
cleanup, testing, or code reviews, is very much encouraged to do so.
To join, just raise your hand on the InvokeAI Discord server (#dev-chat) or the GitHub discussion board.
If you'd like to help with translation, please see our [translation guide](docs/other/TRANSLATION.md).
If you are unfamiliar with how
to contribute to GitHub projects, here is a
[Getting Started Guide](https://opensource.com/article/19/7/create-pull-request-github). A full set of contribution guidelines, along with templates, are in progress. You can **make your pull request against the "main" branch**.
@ -175,6 +277,8 @@ This fork is a combined effort of various people from across the world.
[Check out the list of all these amazing people](https://invoke-ai.github.io/InvokeAI/other/CONTRIBUTORS/). We thank them for
their time, hard work and effort.
Thanks to [Weblate](https://weblate.org/) for generously providing translation services to this project.
### Support
For support, please use this repository's GitHub Issues tracking service, or join the Discord.
Applications are built on top of the invoke framework. They should construct `invoker` and then interact through it. They should avoid interacting directly with core code in order to support a variety of configurations.
### Web UI
The Web UI is built on top of an HTTP API built with [FastAPI](https://fastapi.tiangolo.com/) and [Socket.IO](https://socket.io/). The frontend code is found in `/frontend` and the backend code is found in `/ldm/invoke/app/api_app.py` and `/ldm/invoke/app/api/`. The code is further organized as such:
| Component | Description |
| --- | --- |
| api_app.py | Sets up the API app, annotates the OpenAPI spec with additional data, and runs the API |
| dependencies | Creates all invoker services and the invoker, and provides them to the API |
| events | An eventing system that could in the future be adapted to support horizontal scale-out |
| sockets | The Socket.IO interface - handles listening to and emitting session events (events are defined in the events service module) |
| routers | API definitions for different areas of API functionality |
### CLI
The CLI is built automatically from invocation metadata, and also supports invocation piping and auto-linking. Code is available in `/ldm/invoke/app/cli_app.py`.
## Invoke
The Invoke framework provides the interface to the underlying AI systems and is built with flexibility and extensibility in mind. There are four major concepts: invoker, sessions, invocations, and services.
### Invoker
The invoker (`/ldm/invoke/app/services/invoker.py`) is the primary interface through which applications interact with the framework. Its primary purpose is to create, manage, and invoke sessions. It also maintains two sets of services:
- **invocation services**, which are used by invocations to interact with core functionality.
- **invoker services**, which are used by the invoker to manage sessions and manage the invocation queue.
### Sessions
Invocations and links between them form a graph, which is maintained in a session. Sessions can be queued for invocation, which will execute their graph (either the next ready invocation, or all invocations). Sessions also maintain execution history for the graph (including storage of any outputs). An invocation may be added to a session at any time, and there is capability to add and entire graph at once, as well as to automatically link new invocations to previous invocations. Invocations can not be deleted or modified once added.
The session graph does not support looping. This is left as an application problem to prevent additional complexity in the graph.
### Invocations
Invocations represent individual units of execution, with inputs and outputs. All invocations are located in `/ldm/invoke/app/invocations`, and are all automatically discovered and made available in the applications. These are the primary way to expose new functionality in Invoke.AI, and the [implementation guide](INVOCATIONS.md) explains how to add new invocations.
### Services
Services provide invocations access AI Core functionality and other necessary functionality (e.g. image storage). These are available in `/ldm/invoke/app/services`. As a general rule, new services should provide an interface as an abstract base class, and may provide a lightweight local implementation by default in their module. The goal for all services should be to enable the usage of different implementations (e.g. using cloud storage for image storage), but should not load any module dependencies unless that implementation has been used (i.e. don't import anything that won't be used, especially if it's expensive to import).
## AI Core
The AI Core is represented by the rest of the code base (i.e. the code outside of `/ldm/invoke/app/`).
Invocations represent a single operation, its inputs, and its outputs. These operations and their outputs can be chained together to generate and modify images.
## Creating a new invocation
To create a new invocation, either find the appropriate module file in `/ldm/invoke/app/invocations` to add your invocation to, or create a new one in that folder. All invocations in that folder will be discovered and made available to the CLI and API automatically. Invocations make use of [typing](https://docs.python.org/3/library/typing.html) and [pydantic](https://pydantic-docs.helpmanual.io/) for validation and integration into the CLI and API.
All invocations must derive from `BaseInvocation`. They should have a docstring that declares what they do in a single, short line. They should also have a `type` with a type hint that's `Literal["command_name"]`, where `command_name` is what the user will type on the CLI or use in the API to create this invocation. The `command_name` must be unique. The `type` must be assigned to the value of the literal in the type hint.
Inputs consist of three parts: a name, a type hint, and a `Field` with default, description, and validation information. For example:
| Part | Value | Description |
| ---- | ----- | ----------- |
| Name | `strength` | This field is referred to as `strength` |
| Type Hint | `float` | This field must be of type `float` |
| Field | `Field(default=0.75, gt=0, le=1, description="The strength")` | The default value is `0.75`, the value must be in the range (0,1], and help text will show "The strength" for this field. |
Notice that `image` has type `Union[ImageField,None]`. The `Union` allows this field to be parsed with `None` as a value, which enables linking to previous invocations. All fields should either provide a default value or allow `None` as a value, so that they can be overwritten with a linked output from another invocation.
The special type `ImageField` is also used here. All images are passed as `ImageField`, which protects them from pydantic validation errors (since images only ever come from links).
Finally, note that for all linking, the `type` of the linked fields must match. If the `name` also matches, then the field can be **automatically linked** to a previous invocation by name and matching.
The `invoke` function is the last portion of an invocation. It is provided an `InvocationContext` which contains services to perform work as well as a `session_id` for use as needed. It should return a class with output values that derives from `BaseInvocationOutput`.
Before being called, the invocation will have all of its fields set from defaults, inputs, and finally links (overriding in that order).
Assume that this invocation may be running simultaneously with other invocations, may be running on another machine, or in other interesting scenarios. If you need functionality, please provide it as a service in the `InvocationServices` class, and make sure it can be overridden.
### Outputs
```py
classImageOutput(BaseInvocationOutput):
"""Base class for invocations that output an image"""
Output classes look like an invocation class without the invoke method. Prefer to use an existing output class if available, and prefer to name inputs the same as outputs when possible, to promote automatic invocation linking.
@ -214,6 +214,8 @@ Here are the invoke> command that apply to txt2img:
| `--variation <float>` | `-v<float>` | `0.0` | Add a bit of noise (0.0=none, 1.0=high) to the image in order to generate a series of variations. Usually used in combination with `-S<seed>` and `-n<int>` to generate a series a riffs on a starting image. See [Variations](./VARIATIONS.md). |
| `--with_variations <pattern>` | | `None` | Combine two or more variations. See [Variations](./VARIATIONS.md) for now to use this. |
| `--save_intermediates <n>` | | `None` | Save the image from every nth step into an "intermediates" folder inside the output directory |
| `--h_symmetry_time_pct <float>` | | `None` | Create symmetry along the X axis at the desired percent complete of the generation process. (Must be between 0.0 and 1.0; set to a very small number like 0.0001 for just after the first step of generation.) |
| `--v_symmetry_time_pct <float>` | | `None` | Create symmetry along the Y axis at the desired percent complete of the generation process. (Must be between 0.0 and 1.0; set to a very small number like 0.0001 for just after the first step of generation.) |
You will need a GPU to perform training in a reasonable length of
time, and at least 12 GB of VRAM. We recommend using the [`xformers`
library](../installation/070_INSTALL_XFORMERS) to accelerate the
library](../installation/070_INSTALL_XFORMERS.md) to accelerate the
training process further. During training, about ~8 GB is temporarily
needed in order to store intermediate models, checkpoints and logs.
@ -250,6 +250,24 @@ invokeai-ti \
--only_save_embeds
```
## Using Embeddings
After training completes, the resultant embeddings will be saved into your `$INVOKEAI_ROOT/embeddings/<trigger word>/learned_embeds.bin`.
These will be automatically loaded when you start InvokeAI.
Add the trigger word, surrounded by angle brackets, to use that embedding. For example, if your trigger word was `terence`, use `<terence>` in prompts. This is the same syntax used by the HuggingFace concepts library.
**Note:**`.pt` embeddings do not require the angle brackets.
## Troubleshooting
### `Cannot load embedding for <trigger>. It was trained on a model with token dimension 1024, but the current model has token dimension 768`
Messages like this indicate you trained the embedding on a different base model than the currently selected one.
For example, in the error above, the training was done on SD2.1 (768x768) but it was used on SD1.5 (512x512).
## Reading
For more information on textual inversion, please see the following
|stable-diffusion-1.5 | runwayml/stable-diffusion-v1-5 | Most recent version of base Stable Diffusion model | https://huggingface.co/runwayml/stable-diffusion-v1-5 |
| stable-diffusion-1.4 | runwayml/stable-diffusion-v1-4 | Previous version of base Stable Diffusion model | https://huggingface.co/runwayml/stable-diffusion-v1-4 |
| stable-diffusion-2.1-base |stabilityai/stable-diffusion-2-1-base | Stable Diffusion version 2.1 trained on 512 pixel images | https://huggingface.co/stabilityai/stable-diffusion-2-1-base |
| stable-diffusion-2.1-768 |stabilityai/stable-diffusion-2-1 | Stable Diffusion version 2.1 trained on 768 pixel images | https://huggingface.co/stabilityai/stable-diffusion-2-1 |
| dreamlike-diffusion-1.0 | dreamlike-art/dreamlike-diffusion-1.0 | An SD 1.5 model finetuned on high quality art | https://huggingface.co/dreamlike-art/dreamlike-diffusion-1.0 |
| dreamlike-photoreal-2.0 | dreamlike-art/dreamlike-photoreal-2.0 | A photorealistic model trained on 768 pixel images| https://huggingface.co/dreamlike-art/dreamlike-photoreal-2.0 |
| openjourney-4.0 | prompthero/openjourney | An SD 1.5 model finetuned on Midjourney images prompt with "mdjrny-v4 style" | https://huggingface.co/prompthero/openjourney |
| nitro-diffusion-1.0 | nitrosocke/Nitro-Diffusion | An SD 1.5 model finetuned on three styles, prompt with "archer style", "arcane style" or "modern disney style" | https://huggingface.co/nitrosocke/Nitro-Diffusion|
| trinart-2.0 | naclbit/trinart_stable_diffusion_v2 | An SD 1.5 model finetuned with ~40,000 assorted high resolution manga/anime-style pictures | https://huggingface.co/naclbit/trinart_stable_diffusion_v2|
| trinart-characters-2_0 | naclbit/trinart_derrida_characters_v2_stable_diffusion | An SD1.5 model finetuned with 19.2M manga/anime-style pictures | https://huggingface.co/naclbit/trinart_derrida_characters_v2_stable_diffusion|
|Model Name | HuggingFace Repo ID | Description | URL |
|---------- | ---------- | ----------- | --- |
|stable-diffusion-1.5|runwayml/stable-diffusion-v1-5|Stable Diffusion version 1.5 diffusers model (4.27 GB)|https://huggingface.co/runwayml/stable-diffusion-v1-5 |
|sd-inpainting-1.5|runwayml/stable-diffusion-inpainting|RunwayML SD 1.5 model optimized for inpainting, diffusers version (4.27 GB)|https://huggingface.co/runwayml/stable-diffusion-inpainting |
|stable-diffusion-2.1|stabilityai/stable-diffusion-2-1|Stable Diffusion version 2.1 diffusers model, trained on 768 pixel images (5.21 GB)|https://huggingface.co/stabilityai/stable-diffusion-2-1 |
|sd-inpainting-2.0|stabilityai/stable-diffusion-2-1|Stable Diffusion version 2.0 inpainting model (5.21 GB)|https://huggingface.co/stabilityai/stable-diffusion-2-1 |
|analog-diffusion-1.0|wavymulder/Analog-Diffusion|An SD-1.5 model trained on diverse analog photographs (2.13 GB)|https://huggingface.co/wavymulder/Analog-Diffusion |
|deliberate-1.0|XpucT/Deliberate|Versatile model that produces detailed images up to 768px (4.27 GB)|https://huggingface.co/XpucT/Deliberate |
|dreamlike-photoreal-2.0|dreamlike-art/dreamlike-photoreal-2.0|A photorealistic model trained on 768 pixel images based on SD 1.5 (2.13 GB)|https://huggingface.co/dreamlike-art/dreamlike-photoreal-2.0 |
|inkpunk-1.0|Envvi/Inkpunk-Diffusion|Stylized illustrations inspired by Gorillaz, FLCL and Shinkawa; prompt with "nvinkpunk" (4.27 GB)|https://huggingface.co/Envvi/Inkpunk-Diffusion|
|openjourney-4.0|prompthero/openjourney|An SD 1.5 model finetuned on Midjourney; prompt with "mdjrny-v4 style" (2.13 GB)|https://huggingface.co/prompthero/openjourney |
|portrait-plus-1.0|wavymulder/portraitplus|An SD-1.5 model trained on close range portraits of people; prompt with "portrait+" (2.13 GB)|https://huggingface.co/wavymulder/portraitplus |
|seek-art-mega-1.0|coreco/seek.art_MEGA|A general use SD-1.5 "anything" model that supports multiple styles (2.1 GB)|https://huggingface.co/coreco/seek.art_MEGA |
|trinart-2.0|naclbit/trinart_stable_diffusion_v2|An SD-1.5 model finetuned with ~40K assorted high resolution manga/anime-style images (2.13 GB)|https://huggingface.co/naclbit/trinart_stable_diffusion_v2 |
|waifu-diffusion-1.4|hakurei/waifu-diffusion|An SD-1.5 model trained on 680k anime/manga-style images (2.13 GB)|https://huggingface.co/hakurei/waifu-diffusion |
Note that these files are covered by an "Ethical AI" license which forbids
certain uses. When you initially download them, you are asked to
accept the license terms.
Note that these files are covered by an "Ethical AI" license which
forbids certain uses. When you initially download them, you are asked
to accept the license terms. In addition, some of these models carry
additional license terms that limit their use in commercial
applications or on public servers. Be sure to familiarize yourself
with the model terms by visiting the URLs in the table above.
## Community-Contributed Models
@ -80,6 +86,13 @@ only `.safetensors` and `.ckpt` models, but they can be easily loaded
into InvokeAI and/or converted into optimized `diffusers` models. Be
aware that CIVITAI hosts many models that generate NSFW content.
!!! note
InvokeAI 2.3.x does not support directly importing and
running Stable Diffusion version 2 checkpoint models. You may instead
convert them into `diffusers` models using the conversion methods
described below.
## Installation
There are multiple ways to install and manage models:
@ -90,7 +103,7 @@ There are multiple ways to install and manage models:
models files.
3. The web interface (WebUI) has a GUI for importing and managing
models.
models.
### Installation via `invokeai-configure`
@ -106,7 +119,7 @@ confirm that the files are complete.
You can install a new model, including any of the community-supported ones, via
the command-line client's `!import_model` command.
#### Installing `.ckpt` and `.safetensors` models
#### Installing individual `.ckpt` and `.safetensors` models
If the model is already downloaded to your local disk, use
`!import_model /path/to/file.ckpt` to load it. For example:
InvokeAI uses [Weblate](https://weblate.org) for translation. Weblate is a FOSS project providing a scalable translation service. Weblate automates the tedious parts of managing translation of a growing project, and the service is generously provided at no cost to FOSS projects like InvokeAI.
## Contributing
If you'd like to contribute by adding or updating a translation, please visit our [Weblate project](https://hosted.weblate.org/engage/invokeai/). You'll need to sign in with your GitHub account (a number of other accounts are supported, including Google).
Once signed in, select a language and then the Web UI component. From here you can Browse and Translate strings from English to your chosen language. Zen mode offers a simpler translation experience.
Your changes will be attributed to you in the automated PR process; you don't need to do anything else.
## Help & Questions
Please check Weblate's [documentation](https://docs.weblate.org/en/latest/index.html) or ping @psychedelicious or @blessedcoolant on Discord if you have any questions.
## Thanks
Thanks to the InvokeAI community for their efforts to translate the project!
File diff suppressed because it is too large
Load Diff
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.