Currently hooks registers multiple time for some modules.
As result - lora applies multiple time to this modules on generation and
images looks weird.
If have any other minds how to fix it better - feel free to push.
# Minor fixes to the 2.3 branch
This is a proposed `2.3.5.post2` to correct the updater problems in
2.3.5.post1 and make transition to 3.0.0 easier.
## Updating fixed
The invokeai-update script will now recognize when the user previously
installed xformers and modifies the pip install command so as to include
xformers as an extra that needs to be updated. This will prevent the
problems experienced during the upgrade to `2.3.5.post1` in which torch
was updated but xformers wasn't.
## VAE autoconversion improved
In addition to looking for instances in which a user has entered a VAE
ckpt into the "vae" field directly, the model manager now also handles
the case in which the user entered a ckpt (rather than a diffusers) into
the path field. These two cases now both work:
```
vae: models/ldm/stable-diffusion-1/vae-ft-mse-840000-ema-pruned.ckpt
```
and
```
vae:
path: models/ldm/stable-diffusion-1/vae-ft-mse-840000-ema-pruned.ckpt
```
In addition, if a 32-bit checkpoint VAE is encountered and user is using
half precision, the VAE is now converted to 16 bits on the fly.
This PR makes the following minor fixes to the 2.3 branch:
1. The invokeai-update script will now recognize when the user
previously installed xformers and modifies the pip install command
so as to include xformers as an extra that needs to be updated.
2. In addition to looking for instances in which a user has entered a
VAE ckpt into the "vae" field directly, it also handles the case in
which the user entered a ckpt into the path field. These two cases
now work:
vae: models/ldm/stable-diffusion-1/vae-ft-mse-840000-ema-pruned.ckpt
and
vae:
path: models/ldm/stable-diffusion-1/vae-ft-mse-840000-ema-pruned.ckpt
3. If a 32-bit checkpoint VAE is encountered and user is using half precision,
the VAE is now converted to 16 bits on the fly.
This draft PR implements a system in which if a diffusers model is
loaded, and the model manager detects that the user tried to assign a
legacy checkpoint VAE to the model, the checkpoint will be converted to
a diffusers VAE in RAM.
It is draft because it has not been carefully tested yet, and there are
some edge cases that are not handled properly.
This is a partial fix for #3330 . It prevents the concepts library from
hitting HuggingFace to download concept library terms every time a model
is changed.
It does not address the issue that the web interface downloads the
concepts even if it is not going to use them.
# Version 2.3.5
This will be the 2.3.5 release once it is merged into the `v2.3` branch.
Changes on the RC branch are:
- Bump version number
- Fix bug in LoRA path determination (do it at runtime, not at module
load time, or root will get confused); closes#3293.
- Remove dangling debug statement.
# Distinguish LoRA/LyCORIS files based on what version of SD they were
built on top of
- Attempting to run a prompt with a LoRA based on SD v1.X against a
model based on v2.X will now throw an `IncompatibleModelException`. To
import this exception:
`from ldm.modules.lora_manager import IncompatibleModelException` (maybe
this should be defined in ModelManager?)
- Enhance `LoraManager.list_loras()` to accept an optional integer
argument, `token_vector_length`. This will filter the returned LoRA
models to return only those that match the indicated length. Use:
```
768 => for models based on SD v1.X
1024 => for models based on SD v2.X
```
Note that this filtering requires each LoRA file to be opened by
`torch.safetensors`. It will take ~8s to scan a directory of 40 files.
- Added new static methods to `ldm.modules.kohya_lora_manager`:
- check_model_compatibility()
- vector_length_from_checkpoint()
- vector_length_from_checkpoint_file()
- You can now create subdirectories within the `loras` directory and
organize the model files.
- Moved all postinstallation config file and model munging code out of
the CLI and into a separate script named `invokeai-postinstall`
- Fixed two calls to `shutil.copytree()` so that they don't try to
preserve the file mode of the copied files. This is necessary to run
correctly in a Nix environment (see thread at
https://discord.com/channels/1020123559063990373/1091716696965918732/1095662756738371615)
- Update the installer so that an existing virtual environment will be
updated, not overwritten.
- Pin npyscreen version to see if this fixes issues people have had with
installing this module.
Load LoRAs during compel's text embedding encode pass in case there are
requested LoRAs which also want to patch the text encoder.
Also generally cleanup the attention processor patching stuff. It's
still a mess, but at least now it's a *stateless* mess.
Both @mauwii and @keturn have been offline for some time. I am
temporarily removing them from CODEOWNERS so that they will not be
responsible for code reviews until they wish to/are able to re-engage
fully.
Note that I have volunteered @GreggHelt2 to be a codeowner of the
generation backend code, replacing @keturn . Let me know if you're
uncomfortable with this.
- If the user enters a VAE .ckpt path into the VAE field of a
diffusers model, the VAE will be automatically converted behind
the scenes into a diffusers version, then loaded.
- This commit is untested (done on an airplane).
- Pin diffusers to 0.14
- Small fix to LoRA loading routine that was preventing placement of
LoRA files in subdirectories.
- Bump version to 2.3.4.post1
- Pin diffusers to 0.14
- Small fix to LoRA loading routine that was preventing placement of
LoRA files in subdirectories.
- Bump version to 2.3.4.post1
- Moved all postinstallation config file and model munging code out
of the CLI and into a separate script named `invokeai-postinstall`
- Fixed two calls to `shutil.copytree()` so that they don't try to preserve
the file mode of the copied files. This is necessary to run correctly
in a Nix environment
(see thread at https://discord.com/channels/1020123559063990373/1091716696965918732/1095662756738371615)
- Update the installer so that an existing virtual environment will be
updated, not overwritten.
- Pin npyscreen version to see if this fixes issues people have had with
installing this module.
- Previous PR to truncate long filenames won't work on windows due to
lack of support for os.pathconf(). This works around the limitation by
hardcoding the value for PC_NAME_MAX when pathconf is unavailable.
- The `multiprocessing` send() and recv() methods weren't working
properly on Windows due to issues involving `utf-8` encoding and
pickling/unpickling. Changed these calls to `send_bytes()` and
`recv_bytes()` , which seems to fix the issue.
Not fully tested on Windows since I lack a GPU machine to test on, but
is working on CPU.
- Attempting to run a prompt with a LoRA based on SD v1.X against a
model based on v2.X will now throw an
`IncompatibleModelException`. To import this exception:
`from ldm.modules.lora_manager import IncompatibleModelException`
(maybe this should be defined in ModelManager?)
- Enhance `LoraManager.list_loras()` to accept an optional integer
argument, `token_vector_length`. This will filter the returned LoRA
models to return only those that match the indicated length. Use:
```
768 => for models based on SD v1.X
1024 => for models based on SD v2.X
```
Note that this filtering requires each LoRA file to be opened
by `torch.safetensors`. It will take ~8s to scan a directory of
40 files.
- Added new static methods to `ldm.modules.kohya_lora_manager`:
- check_model_compatibility()
- vector_length_from_checkpoint()
- vector_length_from_checkpoint_file()
- Previously the user's preferred precision was used to select which
version branch of a diffusers model would be downloaded. Half-precision
would try to download the 'fp16' branch if it existed.
- Turns out that with waifu-diffusion this logic doesn't work, as
'fp16' gets you waifu-diffusion v1.3, while 'main' gets you
waifu-diffusion v1.4. Who knew?
- This PR adds a new optional "revision" field to `models.yaml`. This
can be used to override the diffusers branch version. In the case of
Waifu diffusion, INITIAL_MODELS.yaml now specifies the "main" branch.
- This PR also quenches the NSFW nag that downloading diffusers sometimes
triggers.
- Closes#3160
- Previous PR to truncate long filenames won't work on windows
due to lack of support for os.pathconf(). This works around the
limitation by hardcoding the value for PC_NAME_MAX when pathconf
is unavailable.
Some quick bug fixes related to the UI for the 2.3.4. release.
**Features:**
- Added the ability to now add Textual Inversions to the Negative Prompt
using the UI.
- Added the ability to clear Textual Inversions and Loras from Prompt
and Negative Prompt with a single click.
- Textual Inversions now have status pips - indicating whether they are
used in the Main Prompt, Negative Prompt or both.
**Fixes**
- Fixes#3138
- Fixes#3144
- Fixed `usePrompt` not updating the Lora and TI count in prompt /
negative prompt.
- Fixed the TI regex not respecting names in substrings.
- Fixed trailing spaces when adding and removing loras and TI's.
- Fixed an issue with the TI regex not respecting the `<` and `>` used
by HuggingFace concepts.
- Some other minor bug fixes.
NOTE: This PR works with `diffusers` models **only**. As a result
InvokeAI is now converting all legacy checkpoint/safetensors files into
diffusers models on the fly. This introduces a bit of extra delay when
loading legacy models. You can avoid this by converting the files to
diffusers either at import time, or after the fact.
# Instructions:
1. Download LoRA .safetensors files of your choice and place in
`INVOKEAIROOT/loras`. Unlike the draft version of this PR, the file
names can now contain underscores and hyphens. Names with arbitrary
unicode characters are not supported.
2. Add `withLora(lora-file-basename,weight)` to your prompt. The weight
is optional and will default to 1.0. A few examples, assuming that a
LoRA file named `loras/sushi.safetensors` is present:
```
family sitting at dinner table eating sushi withLora(sushi,0.9)
family sitting at dinner table eating sushi withLora(sushi, 0.75)
family sitting at dinner table eating sushi withLora(sushi)
```
Multiple `withLora()` prompt fragments are allowed. The weight can be
arbitrarily large, but the useful range is roughly 0.5 to 1.0. Higher
weights make the LoRA's influence stronger. The last version of the
syntax, which uses the default weight of 1.0, is waiting on the next
version of the Compel library to be released and may not work at this
time.
In my limited testing, I found it useful to reduce the CFG to avoid
oversharpening. Also I got better results when running the LoRA on top
of the model on which it was based during training.
Don't try to load a SD 1.x-trained LoRA into a SD 2.x model, and vice
versa. You will get a nasty stack trace. This needs to be cleaned up.
3. You can change the location of the `loras` directory by passing the
`--lora_directory` option to `invokeai.
Documentation can be found in docs/features/LORAS.md.
Note that this PR incorporates the unmerged 2.3.3 PR code (#3058) and
bumps the version number up to 2.3.4a0.
A zillion thanks to @felorhik, @neecapp and many others for this
implementation. @blessedcoolant and I just did a little tidying up.
- Remove unused (and probably dangerous) `unload_applied_loras()` method
- Remove unused `LoraManager.loras_to_load` attribute
- Change default LoRA weight to 0.75 when using WebUI to add a LoRA to prompt.
Now, for python 3.9 installer run upgrade pip command like this:
`pip install --upgrade pip`
And because pip binary locked as running process this lead to error(at
least on windows):
```
ERROR: Could not install packages due to an OSError: [WinError 5] Access is denied: 'e:\invokeai\.venv\scripts\pip.exe'
Check the permissions.
```
To prevent this recomended command to upgrade pip is:
`python -m pip install --upgrade pip`
Which not locking pip file.
- removed app directory (a 3.0 feature), so app tests had to go too
- fixed regular expression in the concepts lib which was causing deprecation warnings
(note that this is actually release candidate 7, but I made the mistake
of including an old rc number in the branch and can't easily change it)
## Updating Root directory
- Introduced new mechanism for updating the root directory when
necessary. Currently only used to update the invoke.sh script using new
dialog colors.
- Fixed ROCm torch module version number
## Loading legacy 2.0/2.1 models
- Due to not converting the torch.dtype precision correctly, the
`load_pipeline_from_original_stable_diffusion_ckpt()` was returning
models of dtype float32 regardless of the precision setting. This caused
a precision mismatch crash.
- Problem now fixed (also see #3057 for the same fix to `main`)
## Support for a fourth textual inversion embedding file format
- This variant, exemplified by "easynegative.safetensors" has a single
'embparam' key containing a Tensor.
- Also refactored code to make it easier to read.
- Handle both pickle and safetensor formats.
## Persistent model selection
- To be consistent with WebUI parameter behavior, the currently selected
model is saved on exit and restored on restart for both WebUI and CLI
## Bug fixes
- Name of VAE cache directory was "hug", not "hub". This is fixed.
## VAE fixes
- Allow custom VAEs to be assigned to a legacy model by placing a
like-named vae file adjacent to the checkpoint file.
- The custom VAE will be picked up and incorporated into the diffusers
model if the user chooses to convert/optimize.
## Custom config file loading
- Some of the civitai models instruct users to place a custom .yaml file
adjacent to the checkpoint file. This generally wasn't working because
some of the .yaml files use FrozenCLIPEmbedder rather than
WeightedFrozenCLIPEmbedder, and our FrozenCLIPEmbedder class doesn't
handle the `personalization_config` section used by the the textual
inversion manager. Other .yaml files don't have the
`personalization_config` section at all. Both these issues are
fixed.#1685
## Consistent pytorch version
- There was an inconsistency between the pytorch version requirement in
`pyproject.toml` and the requirement in the installer (which does a
little jiggery-pokery to load torch with the right CUDA/ROCm version
prior to the main pip install. This was causing torch to be installed,
then uninstalled, and reinstalled with a different version number. This
is now fixed.
Instructions:
1. Download LoRA .safetensors files of your choice and place in
`INVOKEAIROOT/loras`. Unlike the draft version of this, the file
names can contain underscores and alphanumerics. Names with
arbitrary unicode characters are not supported.
2. Add `withLora(lora-file-basename,weight)` to your prompt. The
weight is optional and will default to 1.0. A few examples, assuming
that a LoRA file named `loras/sushi.safetensors` is present:
```
family sitting at dinner table eating sushi withLora(sushi,0.9)
family sitting at dinner table eating sushi withLora(sushi, 0.75)
family sitting at dinner table eating sushi withLora(sushi)
```
Multiple `withLora()` prompt fragments are allowed. The weight can be
arbitrarily large, but the useful range is roughly 0.5 to 1.0. Higher
weights make the LoRA's influence stronger.
In my limited testing, I found it useful to reduce the CFG to avoid
oversharpening. Also I got better results when running the LoRA on top
of the model on which it was based during training.
Don't try to load a SD 1.x-trained LoRA into a SD 2.x model, and vice
versa. You will get a nasty stack trace. This needs to be cleaned up.
3. You can change the location of the `loras` directory by passing the
`--lora_directory` option to `invokeai.
Documentation can be found in docs/features/LORAS.md.
- Allow invokeai-update to update using a release, tag or branch.
- Allow CLI's root directory update routine to update directory
contents regardless of whether current version is released.
- In model importation routine, clarify wording of instructions when user is
asked to choose the type of model being imported.
- This variant, exemplified by "easynegative.safetensors" has a single
'embparam' key containing a Tensor.
- Also refactored code to make it easier to read.
- Handle both pickle and safetensor formats.
This commit fixes bugs related to the on-the-fly conversion and loading of
legacy checkpoint models built on SD-2.0 base.
- When legacy checkpoints built on SD-2.0 models were converted
on-the-fly using --ckpt_convert, generation would crash with a
precision incompatibility error.
- In addition, broken logic was causing some 2.0-derived ckpt files to
be converted into diffusers and then processed through the legacy
generation routines - not good.
- installer now installs the pretty dialog-based console launcher
- added dialogrc for custom colors
- add updater to download new launcher when users do an update
- This variant, exemplified by "easynegative.safetensors" has a single
'embparam' key containing a Tensor.
- Also refactored code to make it easier to read.
- Handle both pickle and safetensor formats.
- Imported V2 legacy models will now autoconvert into diffusers at load
time regardless of setting of --ckpt_convert.
- model manager `heuristic_import()` function now looks for side-by-side
yaml and vae files for custom configuration and VAE respectively.
Example of this:
illuminati-v1.1.safetensors illuminati-v1.1.vae.safetensors
illuminati-v1.1.yaml
When the user tries to import `illuminati-v1.1.safetensors`, the yaml
file will be used for its configuration, and the VAE will be used for
its VAE. Conversion to diffusers will happen if needed, and the yaml
file will be used to determine which V2 format (if any) to apply.
NOTE that the changes to `ckpt_to_diffusers.py` were previously reviewed
by @JPPhoto on the `main` branch and approved.
- Imported V2 legacy models will now autoconvert into diffusers
at load time regardless of setting of --ckpt_convert.
- model manager `heuristic_import()` function now looks for
side-by-side yaml and vae files for custom configuration and VAE
respectively.
Example of this:
illuminati-v1.1.safetensors
illuminati-v1.1.vae.safetensors
illuminati-v1.1.yaml
When the user tries to import `illuminati-v1.1.safetensors`, the yaml
file will be used for its configuration, and the VAE will be used for
its VAE. Conversion to diffusers will happen if needed, and the yaml
file will be used to determine which V2 format (if any) to apply.
Several related security fixes:
1. Port #2946 from main to 2.3.2 branch - this closes a hole that allows
a pickle checkpoint file to masquerade as a safetensors file.
2. Add pickle scanning to the checkpoint to diffusers conversion script.
3. Pickle scan VAE non-safetensors files
4. Avoid running scanner twice on same file during the probing and
conversion process.
5. Clean up diagnostic messages.
- The command `invokeai-batch --invoke` was created a time-stamped
logfile with colons in its name, which is a Windows no-no. This corrects
the problem by writing the timestamp out as "13-06-2023_8-35-10"
- Closes#3005
- Since 2.3.2 invokeai stores the next PNG file's numeric prefix in a
file named `.next_prefix` in the outputs directory. This avoids the
overhead of doing a directory listing to find out what file number comes
next.
- The code uses advisory locking to prevent corruption of this file in
the event that multiple invokeai's try to access it simultaneously, but
some users have experienced corruption of the file nevertheless.
- This PR addresses the problem by detecting a potentially corrupted
`.next_prefix` file and falling back to the directory listing method. A
fixed version of the file is then written out.
- Closes#3001
This PR addresses issues raised by #3008.
1. Update documentation to indicate the correct maximum batch size for
TI training when xformers is and isn't used.
2. Update textual inversion code so that the default for batch size is
aware of xformer availability.
3. Add documentation for how to launch TI with distributed learning.
Lots of little bugs have been squashed since 2.3.2 and a new minor point
release is imminent. This PR updates the version number in preparation
for a RC.
- Since 2.3.2 invokeai stores the next PNG file's numeric prefix in a
file named `.next_prefix` in the outputs directory. This avoids the
overhead of doing a directory listing to find out what file number
comes next.
- The code uses advisory locking to prevent corruption of this file in
the event that multiple invokeai's try to access it simultaneously,
but some users have experienced corruption of the file nevertheless.
- This PR addresses the problem by detecting a potentially corrupted
`.next_prefix` file and falling back to the directory listing method.
A fixed version of the file is then written out.
- Closes#3001
Lots of little bugs have been squashed since 2.3.2 and a new minor
point release is imminent. This PR updates the version number in
preparation for a RC.
- `invokeai-batch --invoke` was created a time-stamped logfile with colons in its
name, which is a Windows no-no. This corrects the problem by writing
the timestamp out as "13-06-2023_8-35-10"
- Closes#3005
This PR addresses issues raised by #3008.
1. Update documentation to indicate the correct maximum batch size for
TI training when xformers is and isn't used.
2. Update textual inversion code so that the default for batch size
is aware of xformer availability.
3. Add documentation for how to launch TI with distributed learning.
Two related security fixes:
1. Port #2946 from main to 2.3.2 branch - this closes a hole that
allows a pickle checkpoint file to masquerade as a safetensors
file.
2. Add pickle scanning to the checkpoint to diffusers conversion
script. This will be ported to main in a separate PR.
This commit enhances support for V2 variant (epsilon and v-predict)
import and conversion to diffusers, by prompting the user to select the
proper config file during startup time autoimport as well as in the
invokeai installer script. Previously the user was only prompted when
doing an `!import` from the command line or when using the WebUI Model
Manager.
This commit enhances support for V2 variant (epsilon and v-predict)
import and conversion to diffusers, by prompting the user to select
the proper config file during startup time autoimport as well as
in the invokeai installer script..
At some point `pyproject.toml` was modified to remove the
invokeai-update and invokeai-model-install scripts. This PR fixes the
issue.
If this was an intentional change, let me know and we'll discuss.
# Support SD version 2 "epsilon" and "v-predict" inference
configurations in v2.3
This is a port of the `main` PR #2870 back into V2.3. It allows both
"epsilon" inference V2 models (e.g. "v2-base") and "v-predict" models
(e.g. "V2-768") to be imported and converted into correct diffusers
models. This depends on picking the right configuration file to use, and
since there is no intrinsic difference between the two types of models,
when we detect that a V2 model is being imported, we fall back to asking
the user to select the model type.
This PR ports the `main` PR #2871 to the v2.3 branch. This adjusts the
global diffusers model cache to work with the 0.14 diffusers layout of
placing models in HF_HOME/hub rather than HF_HOME/diffusers. It also
implements the one-time migration action to the new layout.
This PR ports the `main` PR #2871 to the v2.3 branch. This adjusts
the global diffusers model cache to work with the 0.14 diffusers
layout of placing models in HF_HOME/hub rather than HF_HOME/diffusers.
# Programatically generate a large number of images varying by prompt
and other image generation parameters
This is a little standalone script named `dynamic_prompting.py` that
enables the generation of dynamic prompts. Using YAML syntax, you
specify a template of prompt phrases and lists of generation parameters,
and the script will generate a cross product of prompts and generation
settings for you. You can save these prompts to disk for later use, or
pipe them to the invokeai CLI to generate the images on the fly.
Typical uses are testing step and CFG values systematically while
holding the seed and prompt constant, testing out various artist's
styles, and comparing the results of the same prompt across different
models.
A typical template will look like this:
```
model: stable-diffusion-1.5
steps: 30;50;10
seed: 50
dimensions: 512x512
cfg:
- 7
- 12
sampler:
- k_euler_a
- k_lms
prompt:
style:
- greg rutkowski
- gustav klimt
location:
- the mountains
- a desert
object:
- luxurious dwelling
- crude tent
template: a {object} in {location}, in the style of {style}
```
This will generate 96 different images, each of which varies by one of
the dimensions specified in the template. For example, the prompt axis
will generate a cross product list like:
```
a luxurious dwelling in the mountains, in the style of greg rutkowski
a luxurious dwelling in the mountains, in the style of gustav klimt
a luxious dwelling in a desert, in the style of greg rutkowski
... etc
```
A typical usage would be:
```
python scripts/dynamic_prompts.py --invoke --outdir=/tmp/scanning my_template.yaml
```
This will populate `/tmp/scanning` with each of the requested images,
and also generate a `log.md` file which you can open with an e-book
reader to show something like this:

Full instructions can be obtained using the `--instructions` switch, and
an example template can be printed out using `--example`:
```
python scripts/dynamic_prompts.py --instructions
python scripts/dynamic_prompts.py --example > my_first_template.yaml
```
When a legacy ckpt model was converted into diffusers in RAM, the
built-in NSFW checker was not being disabled, in contrast to models
converted and saved to disk. Because InvokeAI does its NSFW checking as
a separate post-processing step (in order to generate blurred images
rather than black ones), this defeated the
--nsfw and --no-nsfw switches.
This closes#2836 and #2580.
Note - this fix will be applied to `main` as a separate PR.
At some point `pyproject.toml` was modified to remove the
invokeai-update script, which in turn breaks the update
function in the launcher scripts. This PR fixes the
issue.
If this was an intentional change, let me know and we'll discuss.
When a legacy ckpt model was converted into diffusers in RAM, the
built-in NSFW checker was not being disabled, in contrast to models
converted and saved to disk. Because InvokeAI does its NSFW checking
as a separate post-processing step (in order to generate blurred
images rather than black ones), this defeated the
--nsfw and --no-nsfw switches.
This closes#2836 and #2580.
This is a different source/base branch from
https://github.com/invoke-ai/InvokeAI/pull/2823 but is otherwise the
same content. `yarn build` was ran on this clean branch.
## What was the problem/requirement? (What/Why)
As part of a [change in
2.3.0](d74c4009cb),
the high resolution fix was no longer being applied when 'Use all' was
selected. This effectively meant that users had to manually analyze
images to ensure that the parameters were set to match.
~~Additionally, and never actually working, Upscaling and Face
Restoration parameters were also not pulling through with the action,
causing a similar usability issue.~~ See:
https://github.com/invoke-ai/InvokeAI/pull/2823#issuecomment-1445530362
## What was the solution? (How)
This change adds a new reducer to the `postprocessingSlice` file,
mimicking the `generationSlice` reducer to assign all parameters
appropriate for the post processing options. This reducer assigns:
* Hi-res's toggle button only if the type is `txt2img`, since `img2img`
hi-res was removed previously
* ~~Upscaling's toggle button, scale, denoising strength, and upscale
strength~~
* ~~Face Restoration's toggle button, type, strength, and fidelity (if
present/applicable)~~
### Minor
* Added `endOfLine: 'crlf'` to prettier's config to prevent all files
from being checked out on Windows due to difference of line endings (and
git not picking up those changes as modifications, causing ghost
modified files from Git)
### Revision 2:
* Removed out upscaling and face restoration pulling of parameters
### Revision 3:
* More defensive coding for the `hires_fix` not present (assume false)
### Out of Scope
* Hi-res strength (applied as img2img strength in the initial image that
is generated) is not in the metadata of the final image and can't be
reconstructed easily
* Upscaling and face restoration have some peculiarities for multi-post
processing outside of the UI, which complicates it enough to scope out
of this PR.
## How were these changes tested?
* `yarn dev` => Server started successfully
* Manual testing on the development server to ensure parameters pulled
correctly
* `yarn build` => Success
## Notes
As with `generationSlice`, this code assumes `action.payload.image` is
valid and doesn't do a formal check on it to ensure it is valid.
- Crash would occur at the end of this sequence:
- launch CLI
- !convert <URL pointing to a legacy ckpt file>
- Answer "Y" when asked to delete original .ckpt file
- This commit modifies model_manager.heuristic_import() to silently
delete the downloaded legacy file after it has been converted into a
diffusers model. The user is no longer asked to approve deletion.
NB: This should be cherry-picked into main once refactor is done.
For your consideration, here is a revised set of codeowners for the v2.3
branch. The previous set had the bad property that both @blessedcoolant
and @lstein were codeowners of everything, meaning that we had the
superpower of being able to put in a PR and get full approval if any
other member of the team (not a codeowner) approved.
The proposed file is a bit more sensible but needs many eyes on it.
Please take a look and make improvements. I wasn't sure where to put
some people, such as @netsvetaev or @GreggHelt2
I don't think it makes sense to tinker with the `main` CODEOWNERS until
the "Big Freeze" code reorganization happens.
I subscribed everyone to this PR. Apologies
- Crash would occur at the end of this sequence:
- launch CLI
- !convert <URL pointing to a legacy ckpt file>
- Answer "Y" when asked to delete original .ckpt file
- This commit modifies model_manager.heuristic_import()
to silently delete the downloaded legacy file after
it has been converted into a diffusers model. The user
is no longer asked to approve deletion.
NB: This should be cherry-picked into main once refactor
is done.
Simple script to generate a file of InvokeAI prompts and settings
that scan across steps and other parameters.
To use, create a file named "template.yaml" (or similar) formatted like this
>>> cut here <<<
steps: "30:50:1"
seed: 50
cfg:
- 7
- 8
- 12
sampler:
- ddim
- k_lms
prompt:
- a sunny meadow in the mountains
- a gathering storm in the mountains
>>> cut here <<<
Create sections named "steps", "seed", "cfg", "sampler" and "prompt".
- Each section can have a constant value such as this:
steps: 50
- Or a range of numeric values in the format:
steps: "<start>:<stop>:<step>"
- Or a list of values in the format:
- value1
- value2
- value3
Be careful to: 1) put quotation marks around numeric ranges; 2) put a
space between the "-" and the value in a list of values; and 3) use spaces,
not tabs, at the beginnings of indented lines.
When you run this script, capture the output into a text file like this:
python generate_param_scan.py template.yaml > output_prompts.txt
"output_prompts.txt" will now contain an expansion of all the list
values you provided. You can examine it in a text editor such as
Notepad.
Now start the CLI, and feed the expanded prompt file to it using the
"!replay" command:
!replay output_prompts.txt
Alternatively, you can directly feed the output of this script
by issuing a command like this from the developer's console:
python generate_param_scan.py template.yaml | invokeai
You can use the web interface to view the resulting images and their
metadata.
- stable-diffusion-2.1-base base model from
stabilityai/stable-diffusion-2-1-base
- stable-diffusion-2.1-768 768 pixel model from
stabilityai/stable-diffusion-2-1-768
- sd-inpainting-2.0 512 pixel inpainting model from
runwayml/stable-diffusion-inpainting
This PR also bumps the version number up to v2.3.1.post2
- stable-diffusion-2.1-base
base model from stabilityai/stable-diffusion-2-1-base
- stable-diffusion-2.1-768
768 pixel model from stabilityai/stable-diffusion-2-1-768
- sd-inpainting-2.0
512 pixel inpainting model from runwayml/stable-diffusion-inpainting
author Kyle Schouviller <kyle0654@hotmail.com> 1669872800 -0800
committer Kyle Schouviller <kyle0654@hotmail.com> 1676240900 -0800
Adding base node architecture
Fix type annotation errors
Runs and generates, but breaks in saving session
Fix default model value setting. Fix deprecation warning.
Fixed node api
Adding markdown docs
Simplifying Generate construction in apps
[nodes] A few minor changes (#2510)
* Pin api-related requirements
* Remove confusing extra CORS origins list
* Adds response models for HTTP 200
[nodes] Adding graph_execution_state to soon replace session. Adding tests with pytest.
Minor typing fixes
[nodes] Fix some small output query hookups
[node] Fixing some additional typing issues
[nodes] Move and expand graph code. Add base item storage and sqlite implementation.
Update startup to match new code
[nodes] Add callbacks to item storage
[nodes] Adding an InvocationContext object to use for invocations to provide easier extensibility
[nodes] New execution model that handles iteration
[nodes] Fixing the CLI
[nodes] Adding a note to the CLI
[nodes] Split processing thread into separate service
[node] Add error message on node processing failure
Removing old files and duplicated packages
Adding python-multipart
black:
- extend-exclude legacy scripts
- config for python 3.9 as long as we support it
isort:
- set atomic to true to only apply if no syntax errors are introduced
- config for python 3.9 as long as we support it
- extend_skib_glob legacy scripts
- filter_files
- match line_length with black
- remove_redundant_aliases
- skip_gitignore
- set src paths
- include virtual_env to detect third party modules
- better order of hooks
- add flake8-comprehensions and flake8-simplify
- remove unecesarry hooks which are covered by previous hooks
- add hooks
- check-executables-have-shebangs
- check-shebang-scripts-are-executable
- Updated Spanish translation
- Updated Portuguese (Brazil) translation
- Fix a number of translation issues and add missing strings
- Fix vertical symmetry and symmetry steps issue when generation steps
is adjusted
This PR adds the core of the node-based invocation system first
discussed in https://github.com/invoke-ai/InvokeAI/discussions/597 and
implements it through a basic CLI and API. This supersedes #1047, which
was too far behind to rebase.
## Architecture
### Invocations
The core of the new system is **invocations**, found in
`/ldm/invoke/app/invocations`. These represent individual nodes of
execution, each with inputs and outputs. Core invocations are already
implemented (`txt2img`, `img2img`, `upscale`, `face_restore`) as well as
a debug invocation (`show_image`). To implement a new invocation, all
that is required is to add a new implementation in this folder (there is
a markdown document describing the specifics, though it is slightly
out-of-date).
### Sessions
Invocations and links between them are maintained in a **session**.
These can be queued for invocation (either the next ready node, or all
nodes). Some notes:
* Sessions may be added to at any time (including after invocation), but
may not be modified.
* Links are always added with a node, and are always links from existing
nodes to the new node. These links can be relative "history" links, e.g.
`-1` to link from a previously executed node, and can link either
specific outputs, or can opportunistically link all matching outputs by
name and type by using `*`.
* There are no iteration/looping constructs. Most needs for this could
be solved by either duplicating nodes or cloning sessions. This is open
for discussion, but is a difficult problem to solve in a way that
doesn't make the code even more complex/confusing (especially regarding
node ids and history).
### Services
These make up the core the invocation system, found in
`/ldm/invoke/app/services`. One of the key design philosophies here is
that most components should be replaceable when possible. For example,
if someone wants to use cloud storage for their images, they should be
able to replace the image storage service easily.
The services are broken down as follows (several of these are
intentionally implemented with an initial simple/naïve approach):
* Invoker: Responsible for creating and executing **sessions** and
managing services used to do so.
* Session Manager: Manages session history. An on-disk implementation is
provided, which stores sessions as json files on disk, and caches
recently used sessions for quick access.
* Image Storage: Stores images of multiple types. An on-disk
implementation is provided, which stores images on disk and retains
recently used images in an in-memory cache.
* Invocation Queue: Used to queue invocations for execution. An
in-memory implementation is provided.
* Events: An event system, primarily used with socket.io to support
future web UI integration.
## Apps
Apps are available through the `/scripts/invoke-new.py` script (to-be
integrated/renamed).
### CLI
```
python scripts/invoke-new.py
```
Implements a simple CLI. The CLI creates a single session, and
automatically links all inputs to the previous node's output. Commands
are automatically generated from all invocations, with command options
being automatically generated from invocation inputs. Help is also
available for the cli and for each command, and is very verbose.
Additionally, the CLI supports command piping for single-line entry of
multiple commands. Example:
```
> txt2img --prompt "a cat eating sushi" --steps 20 --seed 1234 | upscale | show_image
```
### API
```
python scripts/invoke-new.py --api --host 0.0.0.0
```
Implements an API using FastAPI with Socket.io support for signaling.
API documentation is available at `http://localhost:9090/docs` or
`http://localhost:9090/redoc`. This includes OpenAPI schema for all
available invocations, session interaction APIs, and image APIs.
Socket.io signals are per-session, and can be subscribed to by session
id. These aren't currently auto-documented, though the code for event
emission is centralized in `/ldm/invoke/app/services/events.py`.
A very simple test html and script are available at
`http://localhost:9090/static/test.html` This demonstrates creating a
session from a graph, invoking it, and receiving signals from Socket.io.
## What's left?
* There are a number of features not currently covered by invocations. I
kept the set of invocations small during core development in order to
simplify refactoring as I went. Now that the invocation code has
stabilized, I'd love some help filling those out!
* There's no image metadata generated. It would be fairly
straightforward (and would make good sense) to serialize either a
session and node reference into an image, or the entire node into the
image. There are a lot of questions to answer around source images,
linked images, etc. though. This history is all stored in the session as
well, and with complex sessions, the metadata in an image may lose its
value. This needs some further discussion.
* We need a list of features (both current and future) that would be
difficult to implement without looping constructs so we can have a good
conversation around it. I'm really hoping we can avoid needing
looping/iteration in the graph execution, since it'll necessitate
separating an execution of a graph into its own concept/system, and will
further complicate the system.
* The API likely needs further filling out to support the UI. I think
using the new API for the current UI is possible, and potentially
interesting, since it could work like the new/demo CLI in a "single
operation at a time" workflow. I don't know how compatible that will be
with our UI goals though. It would be nice to support only a single API
though.
* Deeper separation of systems. I intentionally tried to not touch
Generate or other systems too much, but a lot could be gained by
breaking those apart. Even breaking apart Args into two pieces (command
line arguments and the parser for the current CLI) would make it easier
to maintain. This is probably in the future though.
author Kyle Schouviller <kyle0654@hotmail.com> 1669872800 -0800
committer Kyle Schouviller <kyle0654@hotmail.com> 1676240900 -0800
Adding base node architecture
Fix type annotation errors
Runs and generates, but breaks in saving session
Fix default model value setting. Fix deprecation warning.
Fixed node api
Adding markdown docs
Simplifying Generate construction in apps
[nodes] A few minor changes (#2510)
* Pin api-related requirements
* Remove confusing extra CORS origins list
* Adds response models for HTTP 200
[nodes] Adding graph_execution_state to soon replace session. Adding tests with pytest.
Minor typing fixes
[nodes] Fix some small output query hookups
[node] Fixing some additional typing issues
[nodes] Move and expand graph code. Add base item storage and sqlite implementation.
Update startup to match new code
[nodes] Add callbacks to item storage
[nodes] Adding an InvocationContext object to use for invocations to provide easier extensibility
[nodes] New execution model that handles iteration
[nodes] Fixing the CLI
[nodes] Adding a note to the CLI
[nodes] Split processing thread into separate service
[node] Add error message on node processing failure
Removing old files and duplicated packages
Adding python-multipart
I had inadvertently un-safe-d our translation types when migrating to Weblate.
This PR fixes that, and a number of translation string bugs that went unnoticed due to the lack of type safety,
- Add curated set of starter models based on team discussion. The final
list of starter models can be found in
`invokeai/configs/INITIAL_MODELS.yaml`
- To test model installation, I selected and installed all the models on
the list. This led to my discovering that when there are no more starter
models to display, the console front end crashes. So I made a fix to
this in which the entire starter model selection is no longer shown.
- Update model table in 050_INSTALL_MODELS.md
- Add guide to dealing with low-memory situations
- Version is now `v2.3.1`
- add new script `scripts/make_models_markdown_table.py` that parses
INITIAL_MODELS.yaml and creates markdown table for the model installation
documentation file
- update 050_INSTALLING_MODELS.md with above table, and add a warning
about additional license terms that apply to some of the models.
- Final list can be found in invokeai/configs/INITIAL_MODELS.yaml
- After installing all the models, I discovered a bug in the file
selection form that caused a crash when no remaining uninstalled
models remained. So had to fix this.
The sample_to_image method in `ldm.invoke.generator.base` was still
using ckpt-era code. As a result when the WebUI was set to show
"accurate" intermediate images, there'd be a crash. This PR corrects the
problem.
- Closes#2784
- Closes#2775
- Discord member @marcus.llewellyn reported that some civitai
2.1-derived checkpoints were not converting properly (probably
dreambooth-generated):
https://discord.com/channels/1020123559063990373/1078386197589655582/1078387806122025070
- @blessedcoolant tracked this down to a missing key that was used to
derive vector length of the CLIP model used by fetching the second
dimension of the tensor at "cond_stage_model.model.text_projection".
- On inspection, I found that the same second dimension can be recovered
from key 'cond_stage_model.model.ln_final.bias', and use that instead. I
hope this is correct; tested on multiple v1, v2 and inpainting models
and they converted correctly.
- While debugging this, I found and fixed several other issues:
- model download script was not pre-downloading the OpenCLIP
text_encoder or text_tokenizer. This is fixed.
- got rid of legacy code in `ckpt_to_diffuser.py` and replaced with
calls into `model_manager`
- more consistent status reporting in the CLI.
without this change, the project can be installed on 3.9 but not used
this also fixes the container images
Maybe we should re-enable Python 3.9 checks which would have prevented
this.
- Discord member @marcus.llewellyn reported that some civitai 2.1-derived checkpoints were
not converting properly (probably dreambooth-generated):
https://discord.com/channels/1020123559063990373/1078386197589655582/1078387806122025070
- @blessedcoolant tracked this down to a missing key that was used to
derive vector length of the CLIP model used by fetching the second
dimension of the tensor at "cond_stage_model.model.text_projection".
His proposed solution was to hardcode a value of 1024.
- On inspection, I found that the same second dimension can be
recovered from key 'cond_stage_model.model.ln_final.bias', and use
that instead. I hope this is correct; tested on multiple v1, v2 and
inpainting models and they converted correctly.
- While debugging this, I found and fixed several other issues:
- model download script was not pre-downloading the OpenCLIP
text_encoder or text_tokenizer. This is fixed.
- got rid of legacy code in `ckpt_to_diffuser.py` and replaced
with calls into `model_manager`
- more consistent status reporting in the CLI.
Root directory finding algorithm is:
2) use --root argument
2) use INVOKEAI_ROOT environment variable
3) use VIRTUAL_ENV environment variable
4) use ~/invokeai
Since developers are liable to put virtual environments in their
favorite places, not necessarily in the invokeai root directory, this PR
adds a sanity check that looks for the existence of
`VIRTUAL_ENV/invokeai.init`, and moves on to (4) if not found.
# This will constitute v2.3.1+rc2
## Windows installer enhancements
1. resize installer window to give more room for configure and download
forms
2. replace '\' with '/' in directory names to allow user to
drag-and-drop
folders into the dialogue boxes that accept directories.
3. similar change in CLI for the !import_model and !convert_model
commands
4. better error reporting when a model download fails due to network
errors
5. put the launcher scripts into a loop so that menu reappears after
invokeai, merge script, etc exits. User can quit with "Q".
6. do not try to download fp16 of sd-ft-mse-vae, since it doesn't exist.
7. cleaned up status reporting when installing models
8. Detect when install failed for some reason and print helpful error
message rather than stack trace.
9. Detect window size and resize to minimum acceptable values to provide
better display of configure and install forms.
10. Fix a bug in the CLI which prevented diffusers imported by their
repo_ids
from being correctly registered in the current session (though they
install
correctly)
11. Capitalize the "i" in Imported in the autogenerated descriptions.
Root directory finding algorithm is:
2) use --root argument
2) use INVOKEAI_ROOT environment variable
3) use VIRTUAL_ENV environment variable
4) use ~/invokeai
Since developer's are liable to put virtual environments in their
favorite places, not necessarily in the invokeai root directory, this
PR adds a sanity check that looks for the existence of
VIRTUAL_ENV/invokeai.init, and moves to (4) if not found.
- Fix a bug in the CLI which prevented diffusers imported by their repo_ids
from being correctly registered in the current session (though they install
correctly)
- Capitalize the "i" in Imported in the autogenerated descriptions.
1. resize installer window to give more room for configure and download forms
2. replace '\' with '/' in directory names to allow user to drag-and-drop
folders into the dialogue boxes that accept directories.
3. similar change in CLI for the !import_model and !convert_model commands
4. better error reporting when a model download fails due to network errors
5. put the launcher scripts into a loop so that menu reappears after
invokeai, merge script, etc exits. User can quit with "Q".
6. do not try to download fp16 of sd-ft-mse-vae, since it doesn't exist.
7. cleaned up status reporting when installing models
- Detect when install failed for some reason and print helpful error
message rather than stack trace.
- Detect window size and resize to minimum acceptable values to provide
better display of configure and install forms.
Currently translated at 81.4% (382 of 469 strings)
translationBot(ui): update translation (Russian)
Currently translated at 81.6% (382 of 468 strings)
Co-authored-by: Sergey Krashevich <svk@svk.su>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ru/
Translation: InvokeAI/Web UI
## Major Changes
The invokeai-configure script has now been refactored. The work of
selecting and downloading initial models at install time is now done by
a script named `invokeai-model-install` (module name is
`ldm.invoke.config.model_install`)
Screen 1 - adjust startup options:

Screen 2 - select SD models:

The calling arguments for `invokeai-configure` have not changed, so
nothing should break. After initializing the root directory, the script
calls `invokeai-model-install` to let the user select the starting
models to install.
`invokeai-model-install puts up a console GUI with checkboxes to
indicate which models to install. It respects the `--default_only` and
`--yes` arguments so that CI will continue to work. Here are the various
effects you can achieve:
`invokeai-configure`
This will use console-based UI to initialize invokeai.init,
download support models, and choose and download SD models
`invokeai-configure --yes`
Without activating the GUI, populate invokeai.init with default values,
download support models and download the "recommended" SD models
`invokeai-configure --default_only`
Activate the GUI for changing init options, but don't show the SD
download
form, and automatically download the default SD model (currently SD-1.5)
`invokeai-model-install`
Select and install models. This can be used to download arbitrary
models from the Internet, install HuggingFace models using their
repo_id,
or watch a directory for models to load at startup time
`invokeai-model-install --yes`
Import the recommended SD models without a GUI
`invokeai-model-install --default_only`
As above, but only import the default model
## Flexible Model Imports
The console GUI allows the user to import arbitrary models into InvokeAI
using:
1. A HuggingFace Repo_id
2. A URL (http/https/ftp) that points to a checkpoint or safetensors
file
3. A local path on disk pointing to a checkpoint/safetensors file or
diffusers directory
4. A directory to be scanned for all checkpoint/safetensors files to be
imported
The UI allows the user to specify multiple models to bulk import. The
user can specify whether to import the ckpt/safetensors as-is, or
convert to `diffusers`. The user can also designate a directory to be
scanned at startup time for checkpoint/safetensors files.
## Backend Changes
To support the model selection GUI PR introduces a new method in
`ldm.invoke.model_manager` called `heuristic_import(). This accepts a
string-like object which can be a repo_id, URL, local path or directory.
It will figure out what the object is and import it. It interrogates the
contents of checkpoint and safetensors files to determine what type of
SD model they are -- v1.x, v2.x or v1.x inpainting.
## Installer
I am attaching a zip file of the installer if you would like to try the
process from end to end.
[InvokeAI-installer-v2.3.0.zip](https://github.com/invoke-ai/InvokeAI/files/10785474/InvokeAI-installer-v2.3.0.zip)
motivation: i want to be doing future prompting development work in the
`compel` lib (https://github.com/damian0815/compel) - which is currently
pip installable with `pip install compel`.
-At some point pathlib was added to the list of imported modules and
this broken the os.path code that assembled the sample data set.
-Now fixed by replacing os.path calls with Path methods
-At some point pathlib was added to the list of imported modules and this
broken the os.path code that assembled the sample data set.
-Now fixed by replacing os.path calls with Path methods
- Disable responsive resizing below starting dimensions (you can make
form larger, but not smaller than what it was at startup)
- Fix bug that caused multiple --ckpt_convert entries (and similar) to
be written to init file.
This bug is related to the format in which we stored prompts for some time: an array of weighted subprompts.
This caused some strife when recalling a prompt if the prompt had colons in it, due to our recently introduced handling of negative prompts.
Currently there is no need to store a prompt as anything other than a string, so we revert to doing that.
Compatibility with structured prompts is maintained via helper hook.
Lots of earlier embeds use a common trigger token such as * or the
hebrew letter shan. Previously, the textual inversion manager would
refuse to load the second and subsequent embeddings that used a
previously-claimed trigger. Now, when this case is encountered, the
trigger token is replaced by <filename> and the user is informed of the
fact.
1. Fixed display crash when the number of installed models is less than
the number of desired columns to display them.
2. Added --ckpt_convert option to init file.
Enhancements:
1. Directory-based imports will not attempt to import components of diffusers models.
2. Diffuser directory imports now supported
3. Files that end with .ckpt that are not Stable Diffusion models (such as VAEs) are
skipped during import.
Bugs identified in Psychedelicious's review:
1. The invokeai-configure form now tracks the current contents of `invokeai.init` correctly.
2. The autoencoders are no longer treated like installable models, but instead are
mandatory support models. They will no longer appear in `models.yaml`
Bugs identified in Damian's review:
1. If invokeai-model-install is started before the root directory is initialized, it will
call invokeai-configure to fix the matter.
2. Fix bug that was causing empty `models.yaml` under certain conditions.
3. Made import textbox smaller
4. Hide the "convert to diffusers" options if nothing to import.
In theory, this reduces peak memory consumption by doing the conditioned
and un-conditioned predictions one after the other instead of in a
single mini-batch.
In practice, it doesn't reduce the reported "Max VRAM used for this
generation" for me, even without xformers. (But it does slow things down
by a good 18%.)
That suggests to me that the peak memory usage is during VAE decoding,
not the diffusion unet, but ymmv. It does [improve things for gogurt's
16 GB
M1](https://github.com/invoke-ai/InvokeAI/pull/2732#issuecomment-1436187407),
so it seems worthwhile.
To try it out, use the `--sequential_guidance` option:
2dded68267/ldm/invoke/args.py (L487-L492)
- Adds an update action to launcher script
- This action calls new python script `invokeai-update`, which prompts
user to update to latest release version, main development version, or
an arbitrary git tag or branch name.
- It then uses `pip` to update to whatever tag was specified.
The user interface (such as it is) looks like this:

- The TI script was looping over all files in the training image
directory, regardless of whether they were image files or not. This PR
adds a check for image file extensions.
-
- Closes#2715
- Fixes longstanding bug in the token vector size code which caused .pt
files to be assigned the wrong token vector length. These were then
tossed out during directory scanning.
- Fixes longstanding bug in the token vector size code which caused
.pt files to be assigned the wrong token vector length. These
were then tossed out during directory scanning.
- Fixed the test for token length; tested on several .pt and .bin files
- Also added a __main__ entrypoint for CLI.py, to make pdb debugging a
bit more convenient.
When selecting the last model of the third model-list in the
model-merging-TUI it crashed because the code forgot about the "None"
element.
Additionally it seems that it accidentally always took the wrong model
as third model if selected?
This simple fix resolves both issues.
Added symmetry to Invoke based on discussions with @damian0815. This can currently only be activated via the CLI with the `--h_symmetry_time_pct` and `--v_symmetry_time_pct` options. Those take values from 0.0-1.0, exclusive, indicating the percentage through generation at which symmetry is applied as a one-time operation. To have symmetry in either axis applied after the first step, use a very low value like 0.001.
- not sure why, but at some pont --ckpt_convert (which converts legacy checkpoints)
into diffusers in memory, stopped working due to float16/float32 issues.
- this commit repairs the problem
- also removed some debugging messages I found in passing
- Fixed the test for token length; tested on several .pt and .bin files
- Also added a __main__ entrypoint for CLI.py, to make pdb debugging a bit
more convenient.
- You can now achieve several effects:
`invokeai-configure`
This will use console-based UI to initialize invokeai.init,
download support models, and choose and download SD models
`invokeai-configure --yes`
Without activating the GUI, populate invokeai.init with default values,
download support models and download the "recommended" SD models
`invokeai-configure --default_only`
As above, but only download the default SD model (currently SD-1.5)
`invokeai-model-install`
Select and install models. This can be used to download arbitrary
models from the Internet, install HuggingFace models using their repo_id,
or watch a directory for models to load at startup time
`invokeai-model-install --yes`
Import the recommended SD models without a GUI
`invokeai-model-install --default_only`
As above, but only import the default model
A few bugs fixed.
- After the recent update to the Cancel Button, it was no longer
respecting sizing in Floating Mode and the Beta Canvas. Fixed that.
- After the recent dependency update, useHotkeys was bugging out for the
fullscreen hotkey `f`. Realized this was happening because the hotkey
was initialized in two places -- in both the gallery and the parameter
floating button. Removed it from both those places and moved it to the
InvokeTabs component. It makes sense to reside it here because it is a
global hotkey.
- Also added index `0` to the default Accordion index in state in order
to ensure that the main accordions stay open. Conveniently this works
great on all tabs. We have all the primary options in accordions so they
stay open. And as for advanced settings, the first one is always Seed
which is an important accordion, so it opens up by default.
Think there may be some more bugs. Looking in to them.
After upgrading the deps, the full screen hotkey started to bug out. I believe this was happening because it was triggered in two different components causing it to run twice. Removed it from both floating buttons and moved it to the Invoke tab. Makes sense to keep it there as it is a global hotkey.
After the recent changes the Cancel button wasn't maintaining min height in floating mode. Also the new button group was not scaling in width correctly on the Canvas Beta UI. Fixed both.
- Adds a translation status badge
- Adds a blurb about contributing a translation (we want Weblate to be
the source of truth for translations, and to avoid updating translations
directly here)
- Upgraded all dependencies
- Removed beta TS 5.0 as it conflicted with some packages
- Added types for `Array.prototype.findLast` and
`Array.prototype.findLastIndex` (these definitions are provided in TS
5.0
- Fixed fixed type import syntax in a few components
- Re-patched `redux-deep-persist` and tested to ensure the patch still
works
The husky pre-commit command was `npx run lint`, but it should run
`lint-staged`. Also, `npx` wasn't working for me. Changed the command to
`npm run lint-staged` and it all works. Extended the `lint-staged`
triggers to hit `json`, `scss` and `html`.
When encountering a bad embedding, InvokeAI was asking about reconfiguring models. This is because the embedding load error was never handled - it now is.
- Upgraded all dependencies
- Removed beta TS 5.0 as it conflicted with some packages
- Added types for `Array.prototype.findLast` and `Array.prototype.findLastIndex` (these definitions are provided in TS 5.0
- Fixed fixed type import syntax in a few components
- Re-patched `redux-deep-persist` and tested to ensure the patch still works
Model Manager lags a bit if you have a lot of models.
Basically added a fake delay to rendering the model list so the modal
has time to load first. Hacky but if it works it works.
## What was the problem/requirement? (What/Why)
Frequently, I wish to cancel the processing of images, but also want the
current image to finalize before I do. To work around this, I need to
wait until the current one finishes before pressing the cancel.
## What was the solution? (How)
* Implemented a button that allows to "Cancel after current iteration,"
which stores a state in the UI that will attempt to cancel the
processing after the current image finishes
* If the button is pressed again, while it is spinning and before the
next iteration happens, this will stop the scheduling of the cancel, and
behave as if the button was never pressed.
### Minor
* Added `.yarn` to `.gitignore` as this was an output folder produced
from following Frontend's README
### Revision 2
#### Major
* Changed from a standalone button to a context menu next to the
original cancel button. Pressing the context menu will give the
drop-down option to select which type of cancel method the user prefers,
and they can press that button for canceling in the specified type
* Moved states to system state for cross-screen and toggled cancel types
management
* Added in distribution for the target yarn version (allowing any
version of yarn to compile successfully), and updated the README to
ensure `--immutable` is passed for onboarding developers
#### Minor
* Updated `.gitignore` to ignore specific yarn folders, as specified by
their team -
https://yarnpkg.com/getting-started/qa#which-files-should-be-gitignored
## How were these changes tested?
* `yarn dev` => Server started successfully
* Manual testing on the development server to ensure the button behaved
as expected
* `yarn run build` => Success
### Artifacts
#### Revision 1
* Video showing the UI changes in action
https://user-images.githubusercontent.com/89283782/218347722-3a15ce61-2d8c-4c38-b681-e7a3e79dd595.mov
* Images showing the basic UI changes


#### Revision 2
* Video showing the UI changes in action
https://user-images.githubusercontent.com/89283782/219901217-048d2912-9b61-4415-85fd-9e8fedb00c79.mov
* Images showing the basic UI changes
(Default state)

(Drop-down context menu active)

(Scheduled cancel selected and running)

(Scheduled cancel started)

## Notes
* Using `SystemState`'s `currentStatus` variable, when the value is
`common:statusIterationComplete` is an alternative to this approach (and
would be more optimal as it should prevent the next iteration from even
starting), but since the names are within the translations, rather than
an enum or other type, this method of tracking the current iteration was
used instead.
* `isLoading` on `IAIIconButton` caused the Icon Button to also be
disabled, so the current solution works around that with conditionally
rendering the icon of the button instead of passing that value.
* I don't have context on the development expectation for `dist` folder
interactions (and couldn't find any documentation outside of the
`.gitignore` mentioning that the folder should remain. Let me know if
they need to be modified a certain way.
- The checkpoint conversion script was generating diffusers models with
the safety checker set to null. This resulted in models that could not
be merged with ones that have the safety checker activated.
- This PR fixes the issue by incorporating the safety checker into all
1.x-derived checkpoints, regardless of user's nsfw_checker setting.
- The checkpoint conversion script was generating diffusers models
with the safety checker set to null. This resulted in models
that could not be merged with ones that have the safety checker
activated.
- This PR fixes the issue by incorporating the safety checker into
all 1.x-derived checkpoints, regardless of user's nsfw_checker setting.
Also tighten up the typing of `device` attributes in general.
Fixes
> ValueError: Expected a torch.device with a specified index or an
integer, but got:cuda
Weblate's first PR was it attempting to fix some translation issues we
had overlooked!
It wanted to remove some keys which it did not see in our translation
source due to typos.
This PR instead corrects the key names to resolve the issues.
# Weblate Translation
After doing a full integration test of 3 translation service providers
on my fork of InvokeAI, we have chosen
[Weblate](https://hosted.weblate.org). The other two viable options were
[Crowdin](https://crowdin.com/) and
[Transifex](https://www.transifex.com/).
Weblate was the choice because its hosted service provides a very solid
UX / DX, can scale as much as we may ever need, is FOSS itself, and
generously offers free hosted service to other libre projects like ours.
## How it works
Weblate hosts its own fork of our repo and establishes a kind of
unidirectional relationship between our repo and its fork.
### InvokeAI --> Weblate
The `invoke-ai/InvokeAI` repo has had the Weblate GitHub app added to
it. This app watches for changes to our translation source
(`invokeai/frontend/public/locales/en.json`) and then updates the
Weblate fork. The Weblate UI then knows there are new strings to be
translated, or changes to be made.
### Translation
Our translators can then update the translations on the Weblate UI. The
plan now is to invite individual community members who have expressed
interest in maintaining a language or two and give them access to the
app. We can also open the doors to the general public if desired.
### Weblate --> InvokeAI
When a translation is ready or changed, the system will make a PR to
`main`. We have a substantial degree of control over this and will
likely manually trigger these PRs instead of letting them fire off
automatically.
Once a PR is merged, we will still need to rebuild the web UI. I think
we can set things up so that we only need the rebuild when a totally new
language is added, but for now, we will stick to this relatively simple
setup.
## This PR
This PR sets up the web UI's translation stuff to work with Weblate:
- merged each locale into a single file
- updated the i18next config and UI to work with this simpler file
structure
- update our eslint and prettier rules to ensure the locale files have
the same format as what Weblate outputs (`tabWidth: 4`)
- added a thank you to Weblate in our README
Once this is merged, I'll link Weblate to `main` and do a couple tests
to ensure it is all working as expected.
This fixes a few cosmetic bugs in the merge models console GUI:
1) Fix the minimum and maximum ranges on alpha. Was 0.05 to 0.95. Now
0.01 to 0.99.
2) Don't show the 'add_difference' interpolation method when 2 models
selected, or the other three methods when three models selected
## Convert v2 models in CLI
- This PR introduces a CLI prompt for the proper configuration file to
use when converting a ckpt file, in order to support both inpainting
and v2 models files.
- When user tries to directly !import a v2 model, it prints out a proper
warning that v2 ckpts are not directly supported and converts it into a
diffusers model automatically.
The user interaction looks like this:
```
(stable-diffusion-1.5) invoke> !import_model /home/lstein/graphic-art.ckpt
Short name for this model [graphic-art]: graphic-art-test
Description for this model [Imported model graphic-art]: Imported model graphic-art
What type of model is this?:
[1] A model based on Stable Diffusion 1.X
[2] A model based on Stable Diffusion 2.X
[3] An inpainting model based on Stable Diffusion 1.X
[4] Something else
Your choice: [1] 2
```
In addition, this PR enhances the bulk checkpoint import function. If a
directory path is passed to `!import_model` then it will be scanned for
`.ckpt` and `.safetensors` files. The user will be prompted to import
all the files found, or select which ones to import.
Addresses
https://discord.com/channels/1020123559063990373/1073730061380894740/1073954728544845855
- fix alpha slider to show values from 0.01 to 0.99
- fix interpolation list to show 'difference' method for 3 models,
- and weighted_sum, sigmoid and inverse_sigmoid methods for 2
Porting over as many usable options to slider as possible.
- Ported Face Restoration settings to Sliders.
- Ported Upscale Settings to Sliders.
- Ported Variation Amount to Sliders.
- Ported Noise Threshold to Sliders <-- Optimized slider so the values
actually make sense.
- Ported Perlin Noise to Sliders.
- Added a suboption hook for the High Res Strength Slider.
- Fixed a couple of small issues with the Slider component.
- Ported Main Options to Sliders.
- Corrected error that caused --full-precision argument to be ignored
when models downloaded using the --yes argument.
- Improved autodetection of v1 inpainting files; no longer relies on the
file having 'inpaint' in the name.
* new OffloadingDevice loads one model at a time, on demand
* fixup! new OffloadingDevice loads one model at a time, on demand
* fix(prompt_to_embeddings): call the text encoder directly instead of its forward method
allowing any associated hooks to run with it.
* more attempts to get things on the right device from the offloader
* more attempts to get things on the right device from the offloader
* make offloading methods an explicit part of the pipeline interface
* inlining some calls where device is only used once
* ensure model group is ready after pipeline.to is called
* fixup! Strategize slicing based on free [V]RAM (#2572)
* doc(offloading): docstrings for offloading.ModelGroup
* doc(offloading): docstrings for offloading-related pipeline methods
* refactor(offloading): s/SimpleModelGroup/FullyLoadedModelGroup
* refactor(offloading): s/HotSeatModelGroup/LazilyLoadedModelGroup
to frame it is the same terms as "FullyLoadedModelGroup"
---------
Co-authored-by: Damian Stewart <null@damianstewart.com>
- filter paths for `build-container.yml` and `test-invoke-pip.yml`
- add workflow to pass required checks on PRs with `paths-ignore`
- this triggers if `test-invoke-pip.yml` does not
- fix "CI checks on main link" in `/README.md`
- filter paths for `build-container.yml` and `test-invoke-pip.yml`
- add workflow to pass required checks on PRs with `paths-ignore`
- this triggers if `test-invoke-pip.yml` does not
- fix "CI checks on main link" in `/README.md`
Assuming that mixing `"literal strings"` and `{'JSX expressions'}`
throughout the code is not for a explicit reason but just a result IDE
autocompletion, I changed all props to be consistent with the
conventional style of using simple string literals where it is
sufficient.
This is a somewhat trivial change, but it makes the code a little more
readable and uniform
- quashed multiple bugs in model conversion and importing
- found old issue in handling of resume of interrupted downloads
- will require extensive testing
### WebUI Model Conversion
**Model Search Updates**
- Model Search now has a radio group that allows users to pick the type
of model they are importing. If they know their model has a custom
config file, they can assign it right here. Based on their pick, the
model config data is automatically populated. And this same information
is used when converting the model to `diffusers`.

- Files named `model.safetensors` and
`diffusion_pytorch_model.safetensors` are excluded from the search
because these are naming conventions used by diffusers models and they
will end up showing on the list because our conversion saves safetensors
and not bin files.
**Model Conversion UI**
- The **Convert To Diffusers** button can be found on the Edit page of
any **Checkpoint Model**.

- When converting the model, the entire process is handled
automatically. The corresponding config while at the time of the Ckpt
addition is used in the process.
- Users are presented with the choice on where to save the diffusers
converted model - same location as the ckpt, InvokeAI models root folder
or a completely custom location.

- When the model is converted, the checkpoint entry is replaced with the
diffusers model entry. A user can readd the ckpt if they wish to.
---
More or less done. Might make some minor UX improvements as I refine
things.
Tensors with diffusers no longer have to be multiples of 8. This broke Perlin noise generation. We now generate noise for the next largest multiple of 8 and return a cropped result. Fixes#2674.
`generator` now asks `InvokeAIDiffuserComponent` to do postprocessing work on latents after every step. Thresholding - now implemented as replacing latents outside of the threshold with random noise - is called at this point. This postprocessing step is also where we can hook up symmetry and other image latent manipulations in the future.
Note: code at this layer doesn't need to worry about MPS as relevant torch functions are wrapped and made MPS-safe by `generator.py`.
1. Now works with sites that produce lots of redirects, such as CIVITAI
2. Derive name of destination model file from HTTP Content-Disposition header,
if present.
3. Swap \\ for / in file paths provided by users, to hopefully fix issues with
Windows.
This PR adds a new attributer to ldm.generate, `embedding_trigger_strings`:
```
gen = Generate(...)
strings = gen.embedding_trigger_strings
strings = gen.embedding_trigger_strings()
```
The trigger strings will change when the model is updated to show only
those strings which are compatible with the current
model. Dynamically-downloaded triggers from the HF Concepts Library
will only show up after they are used for the first time. However, the
full list of concepts available for download can be retrieved
programatically like this:
```
from ldm.invoke.concepts_lib import HuggingFAceConceptsLibrary
concepts = HuggingFaceConceptsLibrary()
trigger_strings = concepts.list_concepts()
```
I have added the arabic locale files. There need to be some
modifications to the code in order to detect the language direction and
add it to the current document body properties.
For example we can use this:
import { appWithTranslation, useTranslation } from "next-i18next";
import React, { useEffect } from "react";
const { t, i18n } = useTranslation();
const direction = i18n.dir();
useEffect(() => {
document.body.dir = direction;
}, [direction]);
This should be added to the app file. It uses next-i18next to
automatically get the current language and sets the body text direction
(ltr or rtl) depending on the selected language.
## Provide informative error messages when TI and Merge scripts have
insufficient space for console UI
- The invokeai-ti and invokeai-merge scripts will crash if there is not
enough space in the console to fit the user interface (even after
responsive formatting).
- This PR intercepts the errors and prints a useful error message
advising user to make window larger.
1. The invokeai-configure script has now been refactored. The work of
selecting and downloading initial models at install time is now done
by a script named invokeai-initial-models (module
name is ldm.invoke.config.initial_model_select)
The calling arguments for invokeai-configure have not changed, so
nothing should break. After initializing the root directory, the
script calls invokeai-initial-models to let the user select the
starting models to install.
2. invokeai-initial-models puts up a console GUI with checkboxes to
indicate which models to install. It respects the --default_only
and --yes arguments so that CI will continue to work.
3. User can now edit the VAE assigned to diffusers models in the CLI.
4. Fixed a bug that caused a crash during model loading when the VAE
is set to None, rather than being empty.
- The invokeai-ti and invokeai-merge scripts will crash if there is not enough space
in the console to fit the user interface (even after responsive formatting).
- This PR intercepts the errors and prints a useful error message advising user to
make window larger.
- fix unused variables and f-strings found by pyflakes
- use global_converted_ckpts_dir() to find location of diffusers
- fixed bug in model_manager that was causing the description of converted
models to read "Optimized version of {model_name}'
Strategize slicing based on free [V]RAM when not using xformers. Free [V]RAM is evaluated at every generation. When there's enough memory, the entire generation occurs without slicing. If there is not enough free memory, we use diffusers' sliced attention.
- Adds an update action to launcher script
- This action calls new python script `invokeai-update`, which prompts
user to update to latest release version, main development version,
or an arbitrary git tag or branch name.
- It then uses `pip` to update to whatever tag was specified.
Some of the core features of this PR include:
- optional push image to dockerhub (will be skipped in repos which
didn't set it up)
- stop using the root user at runtime
- trigger builds also for update/docker/* and update/ci/docker/*
- always cache image from current branch and main branch
- separate caches for container flavors
- updated comments with instructions in build.sh and run.sh
This commit cleans up the code that did bulk imports of legacy model
files. The code has been refactored, and the user is now offered the
option of importing all the model files found in the directory, or
selecting which ones to import.
Users can now pick the folder to save their diffusers converted model. It can either be the same folder as the ckpt, or the invoke root models folder or a totally custom location.
Fixed a couple of bugs:
1. The original config file for the ckpt file is derived from the entry in
`models.yaml` rather than relying on the user to select. The implication
of this is that V2 ckpt models need to be assigned `v2-inference-v.yaml`
when they are first imported. Otherwise they won't convert right. Note
that currently V2 ckpts are imported with `v1-inference.yaml`, which
isn't right either.
2. Fixed a backslash in the output diffusers path, which was causing
load failures on Linux.
Remaining issues:
1. The radio buttons for selecting the model type are
nonfunctional. It feels to me like these should be moved into the
dialogue for importing ckpt/safetensors files, because this is
where the algorithm needs help from the user.
2. The output diffusers model is written into the same directory as
the input ckpt file. The CLI does it differently and stores the
diffusers model in `ROOTDIR/models/converted-ckpts`. We should
settle on one way or the other.
Converted the picker options to a Radio Group and also updated the backend to use the appropriate config if it is a v2 model that needs to be converted.
- This PR introduces a CLI prompt for the proper configuration file to
use when converting a ckpt file, in order to support both inpainting
and v2 models files.
- When user tries to directly !import a v2 model, it prints out a proper
warning that v2 ckpts are not directly supported.
## What was the problem/requirement? (What/Why)
* Windows location for the Python environment activate location is
currently incorrect
* Due to this, this command will fail for Windows-based users
* The contributing link within the `Developer Install` sections leads to
a [404](https://invoke-ai.github.io/index.md#Contributing)
* `Developer Install`'s numbered list currently lists 1, 1, 2, . . .
## What was the solution? (How)
* Changed the location of Windows script based on actual location -
[reference](https://docs.python.org/3/library/venv.html)
* Moved the link to point to one directory higher -- the main index.md
* Minor format adjustments to allow for the numbered list to appear as
expected
## How were these changes tested?
* `mkdocs serve` => Verified on local server that the changes reflected
as expected
## Notes
Contributing mentions to set the upstream towards the `development`
branch, but that branch has been untouched for several months, so I've
pointed to the `main` branch. Let me know if we need to switch to a
different one.
…odels
- If CLI asked to convert the currently loaded model, the model would
crash on the first rendering. CLI will now refuse to convert a model
loaded in memory (probably a good idea in any case).
- CLI will offer the `v1-inpainting-inference.yaml` as the configuration
file when importing an inpainting a .ckpt or .safetensors file that has
"inpainting" in the name. Otherwise it offers `v1-inference.yaml` as the
default.
rather than bypassing any path with diffusers in it, im specifically bypassing model.safetensors and diffusion_pytorch_model.safetensors both of which should be diffusers files in most cases.
- If CLI asked to convert the currently loaded model, the model would crash
on the first rendering. CLI will now refuse to convert a model loaded
in memory (probably a good idea in any case).
- CLI will offer the `v1-inpainting-inference.yaml` as the configuration
file when importing an inpainting a .ckpt or .safetensors file that
has "inpainting" in the name. Otherwise it offers `v1-inference.yaml`
as the default.
Found a couple of places where the formatting was messed up. I also
added a "Quick Start Guide" to the README for people who encounter
InvokeAI through PyPi. It features the PyPi install!
pulling in denoising support from upstream (its already there, invoke
just isn't using it). I've enabled this as a command line argument as
construction of the ESRGAN handler happens once. Ideally this would be a
UI option that could be adjusted for each upscaling task. Unfortunately
that is beyond my current level of InvokeAI-foo.
Upstream reference is here, starting on line 99 "use dni to control the
denoise strength"
https://github.com/xinntao/Real-ESRGAN/blob/master/inference_realesrgan.py
- `eslint` and `prettier` configs
- `husky` to format and lint via pre-commit hook
- `babel-plugin-transform-imports` to treeshake `lodash` and other packages if needed
Lints and formats codebase.
`options` slice was huge and managed a mix of generation parameters and general app settings. It has been split up:
- Generation parameters are now in `generationSlice`.
- Postprocessing parameters are now in `postprocessingSlice`
- UI related things are now in `uiSlice`
There is probably more to be done, like `gallerySlice` perhaps should only manage internal gallery state, and not if the gallery is displayed.
Full-slice selectors have been made for each slice.
Other organisational tweaks.
Previously conversions of .ckpt and .safetensors files to diffusers
models were failing with channel mismatch errors. This is corrected
with this PR.
- The model_manager convert_and_import() method now accepts the path
to the checkpoint file's configuration file, using the parameter
`original_config_file`. For inpainting files this should be set to
the full path to `v1-inpainting-inference.yaml`.
- If no configuration file is provided in the call, then the presence
of an inpainting file will be inferred at the
`ldm.ckpt_to_diffuser.convert_ckpt_to_diffUser()` level by looking
for the string "inpaint" in the path. AUTO1111 does something
similar to this, but it is brittle and not recommended.
- This PR also changes the model manager model_names() method to return
the model names in case folded sort order.
[![CI checks on main badge]][CI checks on main link] [![latest commit to main badge]][latest commit to main link]
[![github open issues badge]][github open issues link] [![github open prs badge]][github open prs link]
[![github open issues badge]][github open issues link] [![github open prs badge]][github open prs link] [![translation status badge]][translation status link]
[CI checks on main badge]: https://flat.badgen.net/github/checks/invoke-ai/InvokeAI/main?label=CI%20status%20on%20main&cache=900&icon=github
[CI checks on main link]:https://github.com/invoke-ai/InvokeAI/actions/workflows/test-invoke-conda.yml
[CI checks on main link]:https://github.com/invoke-ai/InvokeAI/actions?query=branch%3Amain
[translation status badge]: https://hosted.weblate.org/widgets/invokeai/-/svg-badge.svg
[translation status link]: https://hosted.weblate.org/engage/invokeai/
</div>
InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. InvokeAI offers an industry leading Web Interface, interactive Command Line Interface, and also serves as the foundation for multiple commercial products.
**Quick links**: [[How to Install](#installation)] [<ahref="https://discord.gg/ZmtBAhwWhy">Discord Server</a>] [<ahref="https://invoke-ai.github.io/InvokeAI/">Documentation and Tutorials</a>] [<ahref="https://github.com/invoke-ai/InvokeAI/">Code and Downloads</a>] [<ahref="https://github.com/invoke-ai/InvokeAI/issues">Bug Reports</a>] [<ahref="https://github.com/invoke-ai/InvokeAI/discussions">Discussion, Ideas & Q&A</a>]
**Quick links**: [[How to Install](https://invoke-ai.github.io/InvokeAI/#installation)] [<ahref="https://discord.gg/ZmtBAhwWhy">Discord Server</a>] [<ahref="https://invoke-ai.github.io/InvokeAI/">Documentation and Tutorials</a>] [<ahref="https://github.com/invoke-ai/InvokeAI/">Code and Downloads</a>] [<ahref="https://github.com/invoke-ai/InvokeAI/issues">Bug Reports</a>] [<ahref="https://github.com/invoke-ai/InvokeAI/discussions">Discussion, Ideas & Q&A</a>]
_Note: InvokeAI is rapidly evolving. Please use the
[Issues](https://github.com/invoke-ai/InvokeAI/issues) tab to report bugs and make feature
@ -41,38 +43,136 @@ requests. Be sure to use the provided templates. They will help us diagnose issu
### Automatic Installer (suggested for 1st time users)
1. Go to the bottom of the [Latest Release Page](https://github.com/invoke-ai/InvokeAI/releases/latest)
2. Download the .zip file for your OS (Windows/macOS/Linux).
3. Unzip the file.
4. If you are on Windows, double-click on the `install.bat` script. On macOS, open a Terminal window, drag the file `install.sh` from Finder into the Terminal, and press return. On Linux, run `install.sh`.
5. Wait a while, until it is done.
6. The folder where you ran the installer from will now be filled with lots of files. If you are on Windows, double-click on the `invoke.bat` file. On macOS, open a Terminal window, drag `invoke.sh` from the folder into the Terminal, and press return. On Linux, run `invoke.sh`
7. Press 2 to open the "browser-based UI", press enter/return, wait a minute or two for Stable Diffusion to start up, then open your browser and go to http://localhost:9090.
8. Type `banana sushi` in the box on the top left and click `Invoke`
4. If you are on Windows, double-click on the `install.bat` script. On
macOS, open a Terminal window, drag the file `install.sh` from Finder
into the Terminal, and press return. On Linux, run `install.sh`.
## Table of Contents
5. You'll be asked to confirm the location of the folder in which
to install InvokeAI and its image generation model files. Pick a
location with at least 15 GB of free memory. More if you plan on
InvokeAI is supported across Linux, Windows and macOS. Linux
users can use either an Nvidia-based card (with CUDA support) or an
AMD card (using the ROCm driver).
#### System
### System
You will need one of the following:
@ -98,11 +198,11 @@ We do not recommend the GTX 1650 or 1660 series video cards. They are
unable to run in half-precision mode and do not have sufficient VRAM
to render 512x512 images.
#### Memory
### Memory
- At least 12 GB Main Memory RAM.
#### Disk
### Disk
- At least 12 GB of free disk space for the machine learning model, Python, and all its dependencies.
@ -152,13 +252,15 @@ Notes](https://github.com/invoke-ai/InvokeAI/releases) and the
Please check out our **[Q&A](https://invoke-ai.github.io/InvokeAI/help/TROUBLESHOOT/#faq)** to get solutions for common installation
problems and other issues.
# Contributing
## Contributing
Anyone who wishes to contribute to this project, whether documentation, features, bug fixes, code
cleanup, testing, or code reviews, is very much encouraged to do so.
To join, just raise your hand on the InvokeAI Discord server (#dev-chat) or the GitHub discussion board.
If you'd like to help with translation, please see our [translation guide](docs/other/TRANSLATION.md).
If you are unfamiliar with how
to contribute to GitHub projects, here is a
[Getting Started Guide](https://opensource.com/article/19/7/create-pull-request-github). A full set of contribution guidelines, along with templates, are in progress. You can **make your pull request against the "main" branch**.
@ -175,6 +277,8 @@ This fork is a combined effort of various people from across the world.
[Check out the list of all these amazing people](https://invoke-ai.github.io/InvokeAI/other/CONTRIBUTORS/). We thank them for
their time, hard work and effort.
Thanks to [Weblate](https://weblate.org/) for generously providing translation services to this project.
### Support
For support, please use this repository's GitHub Issues tracking service, or join the Discord.
Applications are built on top of the invoke framework. They should construct `invoker` and then interact through it. They should avoid interacting directly with core code in order to support a variety of configurations.
### Web UI
The Web UI is built on top of an HTTP API built with [FastAPI](https://fastapi.tiangolo.com/) and [Socket.IO](https://socket.io/). The frontend code is found in `/frontend` and the backend code is found in `/ldm/invoke/app/api_app.py` and `/ldm/invoke/app/api/`. The code is further organized as such:
| Component | Description |
| --- | --- |
| api_app.py | Sets up the API app, annotates the OpenAPI spec with additional data, and runs the API |
| dependencies | Creates all invoker services and the invoker, and provides them to the API |
| events | An eventing system that could in the future be adapted to support horizontal scale-out |
| sockets | The Socket.IO interface - handles listening to and emitting session events (events are defined in the events service module) |
| routers | API definitions for different areas of API functionality |
### CLI
The CLI is built automatically from invocation metadata, and also supports invocation piping and auto-linking. Code is available in `/ldm/invoke/app/cli_app.py`.
## Invoke
The Invoke framework provides the interface to the underlying AI systems and is built with flexibility and extensibility in mind. There are four major concepts: invoker, sessions, invocations, and services.
### Invoker
The invoker (`/ldm/invoke/app/services/invoker.py`) is the primary interface through which applications interact with the framework. Its primary purpose is to create, manage, and invoke sessions. It also maintains two sets of services:
- **invocation services**, which are used by invocations to interact with core functionality.
- **invoker services**, which are used by the invoker to manage sessions and manage the invocation queue.
### Sessions
Invocations and links between them form a graph, which is maintained in a session. Sessions can be queued for invocation, which will execute their graph (either the next ready invocation, or all invocations). Sessions also maintain execution history for the graph (including storage of any outputs). An invocation may be added to a session at any time, and there is capability to add and entire graph at once, as well as to automatically link new invocations to previous invocations. Invocations can not be deleted or modified once added.
The session graph does not support looping. This is left as an application problem to prevent additional complexity in the graph.
### Invocations
Invocations represent individual units of execution, with inputs and outputs. All invocations are located in `/ldm/invoke/app/invocations`, and are all automatically discovered and made available in the applications. These are the primary way to expose new functionality in Invoke.AI, and the [implementation guide](INVOCATIONS.md) explains how to add new invocations.
### Services
Services provide invocations access AI Core functionality and other necessary functionality (e.g. image storage). These are available in `/ldm/invoke/app/services`. As a general rule, new services should provide an interface as an abstract base class, and may provide a lightweight local implementation by default in their module. The goal for all services should be to enable the usage of different implementations (e.g. using cloud storage for image storage), but should not load any module dependencies unless that implementation has been used (i.e. don't import anything that won't be used, especially if it's expensive to import).
## AI Core
The AI Core is represented by the rest of the code base (i.e. the code outside of `/ldm/invoke/app/`).
Invocations represent a single operation, its inputs, and its outputs. These operations and their outputs can be chained together to generate and modify images.
## Creating a new invocation
To create a new invocation, either find the appropriate module file in `/ldm/invoke/app/invocations` to add your invocation to, or create a new one in that folder. All invocations in that folder will be discovered and made available to the CLI and API automatically. Invocations make use of [typing](https://docs.python.org/3/library/typing.html) and [pydantic](https://pydantic-docs.helpmanual.io/) for validation and integration into the CLI and API.
All invocations must derive from `BaseInvocation`. They should have a docstring that declares what they do in a single, short line. They should also have a `type` with a type hint that's `Literal["command_name"]`, where `command_name` is what the user will type on the CLI or use in the API to create this invocation. The `command_name` must be unique. The `type` must be assigned to the value of the literal in the type hint.
Inputs consist of three parts: a name, a type hint, and a `Field` with default, description, and validation information. For example:
| Part | Value | Description |
| ---- | ----- | ----------- |
| Name | `strength` | This field is referred to as `strength` |
| Type Hint | `float` | This field must be of type `float` |
| Field | `Field(default=0.75, gt=0, le=1, description="The strength")` | The default value is `0.75`, the value must be in the range (0,1], and help text will show "The strength" for this field. |
Notice that `image` has type `Union[ImageField,None]`. The `Union` allows this field to be parsed with `None` as a value, which enables linking to previous invocations. All fields should either provide a default value or allow `None` as a value, so that they can be overwritten with a linked output from another invocation.
The special type `ImageField` is also used here. All images are passed as `ImageField`, which protects them from pydantic validation errors (since images only ever come from links).
Finally, note that for all linking, the `type` of the linked fields must match. If the `name` also matches, then the field can be **automatically linked** to a previous invocation by name and matching.
The `invoke` function is the last portion of an invocation. It is provided an `InvocationContext` which contains services to perform work as well as a `session_id` for use as needed. It should return a class with output values that derives from `BaseInvocationOutput`.
Before being called, the invocation will have all of its fields set from defaults, inputs, and finally links (overriding in that order).
Assume that this invocation may be running simultaneously with other invocations, may be running on another machine, or in other interesting scenarios. If you need functionality, please provide it as a service in the `InvocationServices` class, and make sure it can be overridden.
### Outputs
```py
classImageOutput(BaseInvocationOutput):
"""Base class for invocations that output an image"""
Output classes look like an invocation class without the invoke method. Prefer to use an existing output class if available, and prefer to name inputs the same as outputs when possible, to promote automatic invocation linking.
@ -214,6 +214,8 @@ Here are the invoke> command that apply to txt2img:
| `--variation <float>` | `-v<float>` | `0.0` | Add a bit of noise (0.0=none, 1.0=high) to the image in order to generate a series of variations. Usually used in combination with `-S<seed>` and `-n<int>` to generate a series a riffs on a starting image. See [Variations](./VARIATIONS.md). |
| `--with_variations <pattern>` | | `None` | Combine two or more variations. See [Variations](./VARIATIONS.md) for now to use this. |
| `--save_intermediates <n>` | | `None` | Save the image from every nth step into an "intermediates" folder inside the output directory |
| `--h_symmetry_time_pct <float>` | | `None` | Create symmetry along the X axis at the desired percent complete of the generation process. (Must be between 0.0 and 1.0; set to a very small number like 0.0001 for just after the first step of generation.) |
| `--v_symmetry_time_pct <float>` | | `None` | Create symmetry along the Y axis at the desired percent complete of the generation process. (Must be between 0.0 and 1.0; set to a very small number like 0.0001 for just after the first step of generation.) |
After training completes, the resultant embeddings will be saved into your `$INVOKEAI_ROOT/embeddings/<trigger word>/learned_embeds.bin`.
These will be automatically loaded when you start InvokeAI.
Add the trigger word, surrounded by angle brackets, to use that embedding. For example, if your trigger word was `terence`, use `<terence>` in prompts. This is the same syntax used by the HuggingFace concepts library.
**Note:**`.pt` embeddings do not require the angle brackets.
## Troubleshooting
### `Cannot load embedding for <trigger>. It was trained on a model with token dimension 1024, but the current model has token dimension 768`
Messages like this indicate you trained the embedding on a different base model than the currently selected one.
For example, in the error above, the training was done on SD2.1 (768x768) but it was used on SD1.5 (512x512).
## Reading
For more information on textual inversion, please see the following
Here you can find the documentation for InvokeAI's various features.
- The Basics
## The Basics
### * The [Web User Interface](WEB.md)
Guide to the Web interface. Also see the [WebUI Hotkeys Reference Guide](WEBUIHOTKEYS.md)
- The [Web User Interface](WEB.md)
### * The [Unified Canvas](UNIFIED_CANVAS.md)
Build complex scenes by combine and modifying multiple images in a stepwise
fashion. This feature combines img2img, inpainting and outpainting in
a single convenient digital artist-optimized user interface.
Guide to the Web interface. Also see the
[WebUI Hotkeys Reference Guide](WEBUIHOTKEYS.md)
### * The [Command Line Interface (CLI)](CLI.md)
Scriptable access to InvokeAI's features.
- The [Unified Canvas](UNIFIED_CANVAS.md)
## Image Generation
### * [Prompt Engineering](PROMPTS.md)
Get the images you want with the InvokeAI prompt engineering language.
Build complex scenes by combine and modifying multiple images in a
stepwise fashion. This feature combines img2img, inpainting and
outpainting in a single convenient digital artist-optimized user
interface.
## * [Post-Processing](POSTPROCESS.md)
Restore mangled faces and make images larger with upscaling. Also see the [Embiggen Upscaling Guide](EMBIGGEN.md).
- The [Command Line Interface (CLI)](CLI.md)
## * The [Concepts Library](CONCEPTS.md)
Add custom subjects and styles using HuggingFace's repository of embeddings.
Scriptable access to InvokeAI's features.
### * [Image-to-Image Guide for the CLI](IMG2IMG.md)
Use a seed image to build new creations in the CLI.
- [Visual Manual for InvokeAI](https://docs.google.com/presentation/d/e/2PACX-1vSE90aC7bVVg0d9KXVMhy-Wve-wModgPFp7AGVTOCgf4xE03SnV24mjdwldolfCr59D_35oheHe4Cow/pub?start=false&loop=true&delayms=60000) (contributed by Statcomm)
### * [Inpainting Guide for the CLI](INPAINTING.md)
Selectively erase and replace portions of an existing image in the CLI.
- Image Generation
### * [Outpainting Guide for the CLI](OUTPAINTING.md)
Extend the borders of the image with an "outcrop" function within the CLI.
- [Prompt Engineering](PROMPTS.md)
### * [Generating Variations](VARIATIONS.md)
Have an image you like and want to generate many more like it? Variations
are the ticket.
Get the images you want with the InvokeAI prompt engineering language.
- [WebUI Unified Canvas for Img2Img, inpainting and outpainting](features/UNIFIED_CANVAS.md)
- [Visual Manual for InvokeAI v2.3.1](https://docs.google.com/presentation/d/e/2PACX-1vSE90aC7bVVg0d9KXVMhy-Wve-wModgPFp7AGVTOCgf4xE03SnV24mjdwldolfCr59D_35oheHe4Cow/pub?start=false&loop=true&delayms=60000) (contributed by Statcomm)
<!-- separator -->
<!-- separator -->
### The InvokeAI Command Line Interface
- [Command Line Interace Reference Guide](features/CLI.md)
<!-- separator -->
### Image Management
- [Image2Image](features/IMG2IMG.md)
- [Inpainting](features/INPAINTING.md)
- [Outpainting](features/OUTPAINTING.md)
- [Adding custom styles and subjects](features/CONCEPTS.md)
- [Using LoRA models](features/LORAS.md)
- [Upscaling and Face Reconstruction](features/POSTPROCESS.md)
- [Not Safe for Work (NSFW) Checker](features/NSFW.md)
<!-- seperator -->
### Prompt Engineering
- [Prompt Syntax](features/PROMPTS.md)
- [Generating Variations](features/VARIATIONS.md)
- [Prompt Syntax](features/PROMPTS.md)
- [Generating Variations](features/VARIATIONS.md)
## :octicons-log-16: Latest Changes
### v2.3.3 <small>(29 March 2023)</small>
#### Bug Fixes
1. When using legacy checkpoints with an external VAE, the VAE file is now scanned for malware prior to loading. Previously only the main model weights file was scanned.
2. Textual inversion will select an appropriate batchsize based on whether `xformers` is active, and will default to `xformers` enabled if the library is detected.
3. The batch script log file names have been fixed to be compatible with Windows.
4. Occasional corruption of the `.next_prefix` file (which stores the next output file name in sequence) on Windows systems is now detected and corrected.
5. An infinite loop when opening the developer's console from within the `invoke.sh` script has been corrected.
#### Enhancements
1. It is now possible to load and run several community-contributed SD-2.0 based models, including the infamous "Illuminati" model.
2. The "NegativePrompts" embedding file, and others like it, can now be loaded by placing it in the InvokeAI `embeddings` directory.
3. If no `--model` is specified at launch time, InvokeAI will remember the last model used and restore it the next time it is launched.
4. On Linux systems, the `invoke.sh` launcher now uses a prettier console-based interface. To take advantage of it, install the `dialog` package using your package manager (e.g. `sudo apt install dialog`).
5. When loading legacy models (safetensors/ckpt) you can specify a custom config file and/or a VAE by placing like-named files in the same directory as the model following this example:
```
my-favorite-model.ckpt
my-favorite-model.yaml
my-favorite-model.vae.pt # or my-favorite-model.vae.safetensors
```
### v2.3.2 <small>(13 March 2023)</small>
#### Bugfixes
Since version 2.3.1 the following bugs have been fixed:
1. Black images appearing for potential NSFW images when generating with legacy checkpoint models and both `--no-nsfw_checker` and `--ckpt_convert` turned on.
2. Black images appearing when generating from models fine-tuned on Stable-Diffusion-2-1-base. When importing V2-derived models, you may be asked to select whether the model was derived from a "base" model (512 pixels) or the 768-pixel SD-2.1 model.
3. The "Use All" button was not restoring the Hi-Res Fix setting on the WebUI
4. When using the model installer console app, models failed to import correctly when importing from directories with spaces in their names. A similar issue with the output directory was also fixed.
5. Crashes that occurred during model merging.
6. Restore previous naming of Stable Diffusion base and 768 models.
7. Upgraded to latest versions of `diffusers`, `transformers`, `safetensors` and `accelerate` libraries upstream. We hope that this will fix the `assertion NDArray > 2**32` issue that MacOS users have had when generating images larger than 768x768 pixels. Please report back.
As part of the upgrade to `diffusers`, the location of the diffusers-based models has changed from `models/diffusers` to `models/hub`. When you launch InvokeAI for the first time, it will prompt you to OK a one-time move. This should be quick and harmless, but if you have modified your `models/diffusers` directory in some way, for example using symlinks, you may wish to cancel the migration and make appropriate adjustments.
#### New "Invokeai-batch" script
2.3.2 introduces a new command-line only script called
`invokeai-batch` that can be used to generate hundreds of images from
prompts and settings that vary systematically. This can be used to try
the same prompt across multiple combinations of models, steps, CFG
settings and so forth. It also allows you to template prompts and
generate a combinatorial list like: ``` a shack in the mountains,
photograph a shack in the mountains, watercolor a shack in the
mountains, oil painting a chalet in the mountains, photograph a chalet
in the mountains, watercolor a chalet in the mountains, oil painting a
shack in the desert, photograph ... ```
If you have a system with multiple GPUs, or a single GPU with lots of
VRAM, you can parallelize generation across the combinatorial set,
reducing wait times and using your system's resources efficiently
(make sure you have good GPU cooling).
To try `invokeai-batch` out. Launch the "developer's console" using
the `invoke` launcher script, or activate the invokeai virtual
environment manually. From the console, give the command
`invokeai-batch --help` in order to learn how the script works and
create your first template file for dynamic prompt generation.
### v2.3.1 <small>(26 February 2023)</small>
This is primarily a bugfix release, but it does provide several new features that will improve the user experience.
#### Enhanced support for model management
InvokeAI now makes it convenient to add, remove and modify models. You can individually import models that are stored on your local system, scan an entire folder and its subfolders for models and import them automatically, and even directly import models from the internet by providing their download URLs. You also have the option of designating a local folder to scan for new models each time InvokeAI is restarted.
There are three ways of accessing the model management features:
1. ***From the WebUI***, click on the cube to the right of the model selection menu. This will bring up a form that allows you to import models individually from your local disk or scan a directory for models to import.
Choose option (5) _download and install models_ from the `invoke` launcher script to start a new console-based application for model management. You can use this to select from a curated set of starter models, or import checkpoint, safetensors, and diffusers models from a local disk or the internet. The example below shows importing two checkpoint URLs from popular SD sites and a HuggingFace diffusers model using its Repository ID. It also shows how to designate a folder to be scanned at startup time for new models to import.
Command-line users can start this app using the command `invokeai-model-install`.
The `!install_model` and `!convert_model` commands have been enhanced to allow entering of URLs and local directories to scan and import. The first command installs .ckpt and .safetensors files as-is. The second one converts them into the faster diffusers format before installation.
Internally InvokeAI is able to probe the contents of a .ckpt or .safetensors file to distinguish among v1.x, v2.x and inpainting models. This means that you do **not** need to include "inpaint" in your model names to use an inpainting model. Note that Stable Diffusion v2.x models will be autoconverted into a diffusers model the first time you use it.
Please see [INSTALLING MODELS](https://invoke-ai.github.io/InvokeAI/installation/050_INSTALLING_MODELS/) for more information on model management.
#### An Improved Installer Experience
The installer now launches a console-based UI for setting and changing commonly-used startup options:
After selecting the desired options, the installer installs several support models needed by InvokeAI's face reconstruction and upscaling features and then launches the interface for selecting and installing models shown earlier. At any time, you can edit the startup options by launching `invoke.sh`/`invoke.bat` and entering option (6) _change InvokeAI startup options_
Command-line users can launch the new configure app using `invokeai-configure`.
This release also comes with a renewed updater. To do an update without going through a whole reinstallation, launch `invoke.sh` or `invoke.bat` and choose option (9) _update InvokeAI_ . This will bring you to a screen that prompts you to update to the latest released version, to the most current development version, or any released or unreleased version you choose by selecting the tag or branch of the desired version.
Command-line users can run this interface by typing `invokeai-configure`
#### Image Symmetry Options
There are now features to generate horizontal and vertical symmetry during generation. The way these work is to wait until a selected step in the generation process and then to turn on a mirror image effect. In addition to generating some cool images, you can also use this to make side-by-side comparisons of how an image will look with more or fewer steps. Access this option from the WebUI by selecting _Symmetry_ from the image generation settings, or within the CLI by using the options `--h_symmetry_time_pct` and `--v_symmetry_time_pct` (these can be abbreviated to `--h_sym` and `--v_sym` like all other options).
This release introduces a beta version of the WebUI Unified Canvas. To try it out, open up the settings dialogue in the WebUI (gear icon) and select _Use Canvas Beta Layout_:
Refresh the screen and go to to Unified Canvas (left side of screen, third icon from the top). The new layout is designed to provide more space to work in and to keep the image controls close to the image itself:
#### Model conversion and merging within the WebUI
The WebUI now has an intuitive interface for model merging, as well as for permanent conversion of models from legacy .ckpt/.safetensors formats into diffusers format. These options are also available directly from the `invoke.sh`/`invoke.bat` scripts.
#### An easier way to contribute translations to the WebUI
We have migrated our translation efforts to [Weblate](https://hosted.weblate.org/engage/invokeai/), a FOSS translation product. Maintaining the growing project's translations is now far simpler for the maintainers and community. Please review our brief [translation guide](https://github.com/invoke-ai/InvokeAI/blob/v2.3.1/docs/other/TRANSLATION.md) for more information on how to contribute.
#### Numerous internal bugfixes and performance issues
This releases quashes multiple bugs that were reported in 2.3.0. Major internal changes include upgrading to `diffusers 0.13.0`, and using the `compel` library for prompt parsing. See [Detailed Change Log](#full-change-log) for a detailed list of bugs caught and squished.
#### Summary of InvokeAI command line scripts (all accessible via the launcher menu)
| `invokeai-model-install` | Model installer with console forms-based front end |
| `invokeai-ti --gui` | Textual inversion, with a console forms-based front end |
| `invokeai-merge --gui` | Model merging, with a console forms-based front end |
| `invokeai-configure` | Startup configuration; can also be used to reinstall support models |
| `invokeai-update` | InvokeAI software updater |
### v2.3.0 <small>(9 February 2023)</small>
#### Migration to Stable Diffusion `diffusers` models
Previous versions of InvokeAI supported the original model file format introduced with Stable Diffusion 1.4. In the original format, known variously as "checkpoint", or "legacy" format, there is a single large weights file ending with `.ckpt` or `.safetensors`. Though this format has served the community well, it has a number of disadvantages, including file size, slow loading times, and a variety of non-standard variants that require special-case code to handle. In addition, because checkpoint files are actually a bundle of multiple machine learning sub-models, it is hard to swap different sub-models in and out, or to share common sub-models. A new format, introduced by the StabilityAI company in collaboration with HuggingFace, is called `diffusers` and consists of a directory of individual models. The most immediate benefit of `diffusers` is that they load from disk very quickly. A longer term benefit is that in the near future `diffusers` models will be able to share common sub-models, dramatically reducing disk space when you have multiple fine-tune models derived from the same base.
Previous versions of InvokeAI supported the original model file format
introduced with Stable Diffusion 1.4. In the original format, known variously as
"checkpoint", or "legacy" format, there is a single large weights file ending
with `.ckpt` or `.safetensors`. Though this format has served the community
well, it has a number of disadvantages, including file size, slow loading times,
and a variety of non-standard variants that require special-case code to handle.
In addition, because checkpoint files are actually a bundle of multiple machine
learning sub-models, it is hard to swap different sub-models in and out, or to
share common sub-models. A new format, introduced by the StabilityAI company in
collaboration with HuggingFace, is called `diffusers` and consists of a
directory of individual models. The most immediate benefit of `diffusers` is
that they load from disk very quickly. A longer term benefit is that in the near
future `diffusers` models will be able to share common sub-models, dramatically
reducing disk space when you have multiple fine-tune models derived from the
same base.
When you perform a new install of version 2.3.0, you will be offered the option to install the `diffusers` versions of a number of popular SD models, including Stable Diffusion versions 1.5 and 2.1 (including the 768x768 pixel version of 2.1). These will act and work just like the checkpoint versions. Do not be concerned if you already have a lot of ".ckpt" or ".safetensors" models on disk! InvokeAI 2.3.0 can still load these and generate images from them without any extra intervention on your part.
When you perform a new install of version 2.3.0, you will be offered the option
to install the `diffusers` versions of a number of popular SD models, including
Stable Diffusion versions 1.5 and 2.1 (including the 768x768 pixel version of
2.1). These will act and work just like the checkpoint versions. Do not be
concerned if you already have a lot of ".ckpt" or ".safetensors" models on disk!
InvokeAI 2.3.0 can still load these and generate images from them without any
extra intervention on your part.
To take advantage of the optimized loading times of `diffusers` models, InvokeAI offers options to convert legacy checkpoint models into optimized `diffusers` models. If you use the `invokeai` command line interface, the relevant commands are:
To take advantage of the optimized loading times of `diffusers` models, InvokeAI
offers options to convert legacy checkpoint models into optimized `diffusers`
models. If you use the `invokeai` command line interface, the relevant commands
are:
* `!convert_model` -- Take the path to a local checkpoint file or a URL that is pointing to one, convert it into a `diffusers` model, and import it into InvokeAI's models registry file.
* `!optimize_model` -- If you already have a checkpoint model in your InvokeAI models file, this command will accept its short name and convert it into a like-named `diffusers` model, optionally deleting the original checkpoint file.
* `!import_model` -- Take the local path of either a checkpoint file or a `diffusers` model directory and import it into InvokeAI's registry file. You may also provide the ID of any diffusers model that has been published on the [HuggingFace models repository](https://huggingface.co/models?pipeline_tag=text-to-image&sort=downloads) and it will be downloaded and installed automatically.
- `!convert_model` -- Take the path to a local checkpoint file or a URL that
is pointing to one, convert it into a `diffusers` model, and import it into
InvokeAI's models registry file.
- `!optimize_model` -- If you already have a checkpoint model in your InvokeAI
models file, this command will accept its short name and convert it into a
like-named `diffusers` model, optionally deleting the original checkpoint
file.
- `!import_model` -- Take the local path of either a checkpoint file or a
`diffusers` model directory and import it into InvokeAI's registry file. You
may also provide the ID of any diffusers model that has been published on
and it will be downloaded and installed automatically.
The WebGUI offers similar functionality for model management.
For advanced users, new command-line options provide additional functionality. Launching `invokeai` with the argument `--autoconvert <path to directory>` takes the path to a directory of checkpoint files, automatically converts them into `diffusers` models and imports them. Each time the script is launched, the directory will be scanned for new checkpoint files to be loaded. Alternatively, the `--ckpt_convert` argument will cause any checkpoint or safetensors model that is already registered with InvokeAI to be converted into a `diffusers` model on the fly, allowing you to take advantage of future diffusers-only features without explicitly converting the model and saving it to disk.
For advanced users, new command-line options provide additional functionality.
Launching `invokeai` with the argument `--autoconvert <path to directory>` takes
the path to a directory of checkpoint files, automatically converts them into
`diffusers` models and imports them. Each time the script is launched, the
directory will be scanned for new checkpoint files to be loaded. Alternatively,
the `--ckpt_convert` argument will cause any checkpoint or safetensors model
that is already registered with InvokeAI to be converted into a `diffusers`
model on the fly, allowing you to take advantage of future diffusers-only
features without explicitly converting the model and saving it to disk.
Please see [INSTALLING MODELS](https://invoke-ai.github.io/InvokeAI/installation/050_INSTALLING_MODELS/) for more information on model management in both the command-line and Web interfaces.
for more information on model management in both the command-line and Web
interfaces.
#### Support for the `XFormers` Memory-Efficient Crossattention Package
On CUDA (Nvidia) systems, version 2.3.0 supports the `XFormers` library. Once installed, the`xformers` package dramatically reduces the memory footprint of loaded Stable Diffusion models files and modestly increases image generation speed. `xformers` will be installed and activated automatically if you specify a CUDA system at install time.
On CUDA (Nvidia) systems, version 2.3.0 supports the `XFormers` library. Once
installed, the`xformers` package dramatically reduces the memory footprint of
loaded Stable Diffusion models files and modestly increases image generation
speed. `xformers` will be installed and activated automatically if you specify a
CUDA system at install time.
The caveat with using `xformers` is that it introduces slightly non-deterministic behavior, and images generated using the same seed and other settings will be subtly different between invocations. Generally the changes are unnoticeable unless you rapidly shift back and forth between images, but to disable `xformers` and restore fully deterministic behavior, you may launch InvokeAI using the `--no-xformers` option. This is most conveniently done by opening the file `invokeai/invokeai.init` with a text editor, and adding the line `--no-xformers` at the bottom.
The caveat with using `xformers` is that it introduces slightly
non-deterministic behavior, and images generated using the same seed and other
settings will be subtly different between invocations. Generally the changes are
unnoticeable unless you rapidly shift back and forth between images, but to
disable `xformers` and restore fully deterministic behavior, you may launch
InvokeAI using the `--no-xformers` option. This is most conveniently done by
opening the file `invokeai/invokeai.init` with a text editor, and adding the
line `--no-xformers` at the bottom.
#### A Negative Prompt Box in the WebUI
There is now a separate text input box for negative prompts in the WebUI. This is convenient for stashing frequently-used negative prompts ("mangled limbs, bad anatomy"). The `[negative prompt]` syntax continues to work in the main prompt box as well.
There is now a separate text input box for negative prompts in the WebUI. This
is convenient for stashing frequently-used negative prompts ("mangled limbs, bad
anatomy"). The `[negative prompt]` syntax continues to work in the main prompt
box as well.
To see exactly how your prompts are being parsed, launch `invokeai` with the `--log_tokenization` option. The console window will then display the tokenization process for both positive and negative prompts.
To see exactly how your prompts are being parsed, launch `invokeai` with the
`--log_tokenization` option. The console window will then display the
tokenization process for both positive and negative prompts.
#### Model Merging
Version 2.3.0 offers an intuitive user interface for merging up to three Stable Diffusion models using an intuitive user interface. Model merging allows you to mix the behavior of models to achieve very interesting effects. To use this, each of the models must already be imported into InvokeAI and saved in `diffusers` format, then launch the merger using a new menu item in the InvokeAI launcher script (`invoke.sh`, `invoke.bat`) or directly from the command line with `invokeai-merge --gui`. You will be prompted to select the models to merge, the proportions in which to mix them, and the mixing algorithm. The script will create a new merged `diffusers` model and import it into InvokeAI for your use.
Version 2.3.0 offers an intuitive user interface for merging up to three Stable
Diffusion models using an intuitive user interface. Model merging allows you to
mix the behavior of models to achieve very interesting effects. To use this,
each of the models must already be imported into InvokeAI and saved in
`diffusers` format, then launch the merger using a new menu item in the InvokeAI
launcher script (`invoke.sh`, `invoke.bat`) or directly from the command line
with `invokeai-merge --gui`. You will be prompted to select the models to merge,
the proportions in which to mix them, and the mixing algorithm. The script will
create a new merged `diffusers` model and import it into InvokeAI for your use.
See [MODEL MERGING](https://invoke-ai.github.io/InvokeAI/features/MODEL_MERGING/) for more details.
Textual Inversion (TI) is a technique for training a Stable Diffusion model to emit a particular subject or style when triggered by a keyword phrase. You can perform TI training by placing a small number of images of the subject or style in a directory, and choosing a distinctive trigger phrase, such as "pointillist-style". After successful training, The subject or style will be activated by including `<pointillist-style>` in your prompt.
Textual Inversion (TI) is a technique for training a Stable Diffusion model to
emit a particular subject or style when triggered by a keyword phrase. You can
perform TI training by placing a small number of images of the subject or style
in a directory, and choosing a distinctive trigger phrase, such as
"pointillist-style". After successful training, The subject or style will be
activated by including `<pointillist-style>` in your prompt.
Previous versions of InvokeAI were able to perform TI, but it required using a command-line script with dozens of obscure command-line arguments. Version 2.3.0 features an intuitive TI frontend that will build a TI model on top of any `diffusers` model. To access training you can launch from a new item in the launcher script or from the command line using `invokeai-ti --gui`.
Previous versions of InvokeAI were able to perform TI, but it required using a
command-line script with dozens of obscure command-line arguments. Version 2.3.0
features an intuitive TI frontend that will build a TI model on top of any
`diffusers` model. To access training you can launch from a new item in the
launcher script or from the command line using `invokeai-ti --gui`.
See [TEXTUAL INVERSION](https://invoke-ai.github.io/InvokeAI/features/TEXTUAL_INVERSION/) for further details.
The InvokeAI installer has been upgraded in order to provide a smoother and hopefully more glitch-free experience. In addition, InvokeAI is now packaged as a PyPi project, allowing developers and power-users to install InvokeAI with the command `pip install InvokeAI --use-pep517`. Please see [Installation](#installation) for details.
The InvokeAI installer has been upgraded in order to provide a smoother and
hopefully more glitch-free experience. In addition, InvokeAI is now packaged as
a PyPi project, allowing developers and power-users to install InvokeAI with the
command `pip install InvokeAI --use-pep517`. Please see
[Installation](#installation) for details.
Developers should be aware that the `pip` installation procedure has been simplified and that the `conda` method is no longer supported at all. Accordingly, the `environments_and_requirements` directory has been deleted from the repository.
Developers should be aware that the `pip` installation procedure has been
simplified and that the `conda` method is no longer supported at all.
Accordingly, the `environments_and_requirements` directory has been deleted from
the repository.
#### Command-line name changes
All of InvokeAI's functionality, including the WebUI, command-line interface, textual inversion training and model merging, can all be accessed from the `invoke.sh` and `invoke.bat` launcher scripts. The menu of options has been expanded to add the new functionality. For the convenience of developers and power users, we have normalized the names of the InvokeAI command-line scripts:
All of InvokeAI's functionality, including the WebUI, command-line interface,
textual inversion training and model merging, can all be accessed from the
`invoke.sh` and `invoke.bat` launcher scripts. The menu of options has been
expanded to add the new functionality. For the convenience of developers and
power users, we have normalized the names of the InvokeAI command-line scripts:
* `invokeai` -- Command-line client
* `invokeai --web` -- Web GUI
* `invokeai-merge --gui` -- Model merging script with graphical front end
* `invokeai-ti --gui` -- Textual inversion script with graphical front end
* `invokeai-configure` -- Configuration tool for initializing the `invokeai` directory and selecting popular starter models.
- `invokeai` -- Command-line client
- `invokeai --web` -- Web GUI
- `invokeai-merge --gui` -- Model merging script with graphical front end
- `invokeai-ti --gui` -- Textual inversion script with graphical front end
- `invokeai-configure` -- Configuration tool for initializing the `invokeai`
directory and selecting popular starter models.
For backward compatibility, the old command names are also recognized, including `invoke.py` and `configure-invokeai.py`. However, these are deprecated and will eventually be removed.
For backward compatibility, the old command names are also recognized, including
`invoke.py` and `configure-invokeai.py`. However, these are deprecated and will
eventually be removed.
Developers should be aware that the locations of the script's source code has been moved. The new locations are:
Developers should be aware that the locations of the script's source code has
been moved. The new locations are:
Developers are strongly encouraged to perform an "editable" install of InvokeAI using `pip install -e . --use-pep517` in the Git repository, and then to call the scripts using their 2.3.0 names, rather than executing the scripts directly. Developers should also be aware that the several important data files have been relocated into a new directory named `invokeai`. This includes the WebGUI's `frontend` and `backend` directories, and the `INITIAL_MODELS.yaml` files used by the installer to select starter models. Eventually all InvokeAI modules will be in subdirectories of `invokeai`.
|stable-diffusion-1.5 | runwayml/stable-diffusion-v1-5 | Most recent version of base Stable Diffusion model | https://huggingface.co/runwayml/stable-diffusion-v1-5 |
| stable-diffusion-1.4 | runwayml/stable-diffusion-v1-4 | Previous version of base Stable Diffusion model | https://huggingface.co/runwayml/stable-diffusion-v1-4 |
| stable-diffusion-2.1-base |stabilityai/stable-diffusion-2-1-base | Stable Diffusion version 2.1 trained on 512 pixel images | https://huggingface.co/stabilityai/stable-diffusion-2-1-base |
| stable-diffusion-2.1-768 |stabilityai/stable-diffusion-2-1 | Stable Diffusion version 2.1 trained on 768 pixel images | https://huggingface.co/stabilityai/stable-diffusion-2-1 |
| dreamlike-diffusion-1.0 | dreamlike-art/dreamlike-diffusion-1.0 | An SD 1.5 model finetuned on high quality art | https://huggingface.co/dreamlike-art/dreamlike-diffusion-1.0 |
| dreamlike-photoreal-2.0 | dreamlike-art/dreamlike-photoreal-2.0 | A photorealistic model trained on 768 pixel images| https://huggingface.co/dreamlike-art/dreamlike-photoreal-2.0 |
| openjourney-4.0 | prompthero/openjourney | An SD 1.5 model finetuned on Midjourney images prompt with "mdjrny-v4 style" | https://huggingface.co/prompthero/openjourney |
| nitro-diffusion-1.0 | nitrosocke/Nitro-Diffusion | An SD 1.5 model finetuned on three styles, prompt with "archer style", "arcane style" or "modern disney style" | https://huggingface.co/nitrosocke/Nitro-Diffusion|
| trinart-2.0 | naclbit/trinart_stable_diffusion_v2 | An SD 1.5 model finetuned with ~40,000 assorted high resolution manga/anime-style pictures | https://huggingface.co/naclbit/trinart_stable_diffusion_v2|
| trinart-characters-2_0 | naclbit/trinart_derrida_characters_v2_stable_diffusion | An SD1.5 model finetuned with 19.2M manga/anime-style pictures | https://huggingface.co/naclbit/trinart_derrida_characters_v2_stable_diffusion|
|Model Name | HuggingFace Repo ID | Description | URL |
|---------- | ---------- | ----------- | --- |
|stable-diffusion-1.5|runwayml/stable-diffusion-v1-5|Stable Diffusion version 1.5 diffusers model (4.27 GB)|https://huggingface.co/runwayml/stable-diffusion-v1-5 |
|sd-inpainting-1.5|runwayml/stable-diffusion-inpainting|RunwayML SD 1.5 model optimized for inpainting, diffusers version (4.27 GB)|https://huggingface.co/runwayml/stable-diffusion-inpainting |
|stable-diffusion-2.1|stabilityai/stable-diffusion-2-1|Stable Diffusion version 2.1 diffusers model, trained on 768 pixel images (5.21 GB)|https://huggingface.co/stabilityai/stable-diffusion-2-1 |
|sd-inpainting-2.0|stabilityai/stable-diffusion-2-1|Stable Diffusion version 2.0 inpainting model (5.21 GB)|https://huggingface.co/stabilityai/stable-diffusion-2-1 |
|analog-diffusion-1.0|wavymulder/Analog-Diffusion|An SD-1.5 model trained on diverse analog photographs (2.13 GB)|https://huggingface.co/wavymulder/Analog-Diffusion |
|deliberate-1.0|XpucT/Deliberate|Versatile model that produces detailed images up to 768px (4.27 GB)|https://huggingface.co/XpucT/Deliberate |
|dreamlike-photoreal-2.0|dreamlike-art/dreamlike-photoreal-2.0|A photorealistic model trained on 768 pixel images based on SD 1.5 (2.13 GB)|https://huggingface.co/dreamlike-art/dreamlike-photoreal-2.0 |
|inkpunk-1.0|Envvi/Inkpunk-Diffusion|Stylized illustrations inspired by Gorillaz, FLCL and Shinkawa; prompt with "nvinkpunk" (4.27 GB)|https://huggingface.co/Envvi/Inkpunk-Diffusion|
|openjourney-4.0|prompthero/openjourney|An SD 1.5 model finetuned on Midjourney; prompt with "mdjrny-v4 style" (2.13 GB)|https://huggingface.co/prompthero/openjourney |
|portrait-plus-1.0|wavymulder/portraitplus|An SD-1.5 model trained on close range portraits of people; prompt with "portrait+" (2.13 GB)|https://huggingface.co/wavymulder/portraitplus |
|seek-art-mega-1.0|coreco/seek.art_MEGA|A general use SD-1.5 "anything" model that supports multiple styles (2.1 GB)|https://huggingface.co/coreco/seek.art_MEGA |
|trinart-2.0|naclbit/trinart_stable_diffusion_v2|An SD-1.5 model finetuned with ~40K assorted high resolution manga/anime-style images (2.13 GB)|https://huggingface.co/naclbit/trinart_stable_diffusion_v2 |
|waifu-diffusion-1.4|hakurei/waifu-diffusion|An SD-1.5 model trained on 680k anime/manga-style images (2.13 GB)|https://huggingface.co/hakurei/waifu-diffusion |
Note that these files are covered by an "Ethical AI" license which forbids
certain uses. When you initially download them, you are asked to
accept the license terms.
Note that these files are covered by an "Ethical AI" license which
forbids certain uses. When you initially download them, you are asked
to accept the license terms. In addition, some of these models carry
additional license terms that limit their use in commercial
applications or on public servers. Be sure to familiarize yourself
with the model terms by visiting the URLs in the table above.
## Community-Contributed Models
@ -80,33 +87,70 @@ only `.safetensors` and `.ckpt` models, but they can be easily loaded
into InvokeAI and/or converted into optimized `diffusers` models. Be
aware that CIVITAI hosts many models that generate NSFW content.
!!! note
InvokeAI 2.3.x does not support directly importing and
running Stable Diffusion version 2 checkpoint models. If you
try to import them, they will be automatically
converted into `diffusers` models on the fly. This adds about 20s
to loading time. To avoid this overhead, you are encouraged to
use one of the conversion methods described below to convert them
permanently.
## Installation
There are multiple ways to install and manage models:
1. The `invokeai-configure` script which will download and install them for you.
1. The `invokeai-model-install` script which will download and install them for you.
2. The command-line tool (CLI) has commands that allows you to import, configure and modify
models files.
3. The web interface (WebUI) has a GUI for importing and managing
models.
models.
### Installation via `invokeai-configure`
### Installation via `invokeai-model-install`
From the `invoke` launcher, choose option (6) "re-run the configure
script to download new models." This will launch the same script that
prompted you to select models at install time. You can use this to add
models that you skipped the first time around. It is all right to
specify a model that was previously downloaded; the script will just
confirm that the files are complete.
From the `invoke` launcher, choose option (5) "Download and install
models." This will launch the same script that prompted you to select
models at install time. You can use this to add models that you
skipped the first time around. It is all right to specify a model that
was previously downloaded; the script will just confirm that the files
are complete.
This script allows you to load 3d party models. Look for a large text
entry box labeled "IMPORT LOCAL AND REMOTE MODELS." In this box, you
can cut and paste one or more of any of the following:
1. A URL that points to a downloadable .ckpt or .safetensors file.
2. A file path pointing to a .ckpt or .safetensors file.
3. A diffusers model repo_id (from HuggingFace) in the format
"owner/repo_name".
4. A directory path pointing to a diffusers model directory.
5. A directory path pointing to a directory containing a bunch of
.ckpt and .safetensors files. All will be imported.
You can enter multiple items into the textbox, each one on a separate
line. You can paste into the textbox using ctrl-shift-V or by dragging
and dropping a file/directory from the desktop into the box.
The script also lets you designate a directory that will be scanned
for new model files each time InvokeAI starts up. These models will be
added automatically.
Lastly, the script gives you a checkbox option to convert legacy models
into diffusers, or to run the legacy model directly. If you choose to
convert, the original .ckpt/.safetensors file will **not** be deleted,
but a new diffusers directory will be created, using twice your disk
space. However, the diffusers version will load faster, and will be
compatible with InvokeAI 3.0.
### Installation via the CLI
You can install a new model, including any of the community-supported ones, via
the command-line client's `!import_model` command.
#### Installing `.ckpt` and `.safetensors` models
#### Installing individual `.ckpt` and `.safetensors` models
If the model is already downloaded to your local disk, use
`!import_model /path/to/file.ckpt` to load it. For example:
* `--model <modelname>` -- Start up with the indicated model loaded
* `--ckpt_convert` -- When a checkpoint/safetensors model is loaded, convert it into a `diffusers` model in memory. This does not permanently save the converted model to disk.
* `--autoconvert <path/to/directory>` -- Scan the indicated directory path for new checkpoint/safetensors files, convert them into `diffusers` models, and import them into InvokeAI.
Here is an example of providing an argument on the command line using
*`--model <model name>` -- Start up with the indicated model loaded
*`--ckpt_convert` -- When a checkpoint/safetensors model is loaded, convert it into a `diffusers` model in memory. This does not permanently save the converted model to disk.
*`--autoconvert <path/to/directory>` -- Scan the indicated directory path for new checkpoint/safetensors files, convert them into `diffusers` models, and import them into InvokeAI.
Here is an example of providing an argument on the command line using
InvokeAI uses [Weblate](https://weblate.org) for translation. Weblate is a FOSS project providing a scalable translation service. Weblate automates the tedious parts of managing translation of a growing project, and the service is generously provided at no cost to FOSS projects like InvokeAI.
## Contributing
If you'd like to contribute by adding or updating a translation, please visit our [Weblate project](https://hosted.weblate.org/engage/invokeai/). You'll need to sign in with your GitHub account (a number of other accounts are supported, including Google).
Once signed in, select a language and then the Web UI component. From here you can Browse and Translate strings from English to your chosen language. Zen mode offers a simpler translation experience.
Your changes will be attributed to you in the automated PR process; you don't need to do anything else.
## Help & Questions
Please check Weblate's [documentation](https://docs.weblate.org/en/latest/index.html) or ping @psychedelicious or @blessedcoolant on Discord if you have any questions.
## Thanks
Thanks to the InvokeAI community for their efforts to translate the project!
# Coauthored by Lincoln Stein, Eugene Brodsky and Joshua Kimsey
# Copyright 2023, The InvokeAI Development Team
####
# This launch script assumes that:
# 1. it is located in the runtime directory,
@ -11,65 +16,168 @@
set -eu
# ensure we're in the correct folder in case user's CWD is somewhere else
# Ensure we're in the correct folder in case user's CWD is somewhere else
scriptdir=$(dirname "$0")
cd"$scriptdir"
. .venv/bin/activate
exportINVOKEAI_ROOT="$scriptdir"
PARAMS=$@
# set required env var for torch on mac MPS
# Check to see if dialog is installed (it seems to be fairly standard, but good tocheck regardless) and if the user has passed the --no-tui argument to disable the dialog TUI
tui=true
ifcommand -v dialog &>/dev/null;then
# This must use $@ to properly loop through the arguments passed by the user
for arg in "$@";do
if["$arg"=="--no-tui"];then
tui=false
# Remove the --no-tui argument to avoid errors later on when passing arguments to InvokeAI
PARAMS=$(echo"$PARAMS"| sed 's/--no-tui//')
break
fi
done
else
tui=false
fi
# Set required env var for torch on mac MPS
if["$(uname -s)"=="Darwin"];then
exportPYTORCH_ENABLE_MPS_FALLBACK=1
fi
if["$0" !="bash"];then
echo"Do you want to generate images using the"
echo"1. command-line"
echo"2. browser-based UI"
echo"3. run textual inversion training"
echo"4. merge models (diffusers type only)"
echo"5. open the developer console"
echo"6. re-run the configure script to download new models"
echo"7. command-line help "
echo""
read -p "Please enter 1, 2, 3, 4, 5, 6 or 7: [2] " yn
choice=${yn:='2'}
case$choice in
1)
echo"Starting the InvokeAI command-line..."
exec invokeai $@
;;
2)
echo"Starting the InvokeAI browser-based UI..."
exec invokeai --web $@
;;
3)
echo"Starting Textual Inversion:"
exec invokeai-ti --gui $@
;;
4)
echo"Merging Models:"
exec invokeai-merge --gui $@
;;
5)
echo"Developer Console:"
file_name=$(basename "${BASH_SOURCE[0]}")
bash --init-file "$file_name"
;;
6)
exec invokeai-configure --root ${INVOKEAI_ROOT}
;;
7)
exec invokeai --help
;;
*)
echo"Invalid selection"
exit;;
# Primary function for the case statement to determine user input
do_choice(){
case$1 in
1)
clear
printf"Generate images with a browser-based interface\n"
invokeai --web $PARAMS
;;
2)
clear
printf"Generate images using a command-line interface\n"
File diff suppressed because one or more lines are too long
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.