This feature was added to prevent the CI Macintosh tests from erroring
out when patchmatch is unable to retrieve its shared library from
GitHub assets.
* add whole <style token> to vocab for concept library embeddings
* add ability to load multiple concept .bin files
* make --log_tokenization respect custom tokens
* start working on concept downloading system
* preliminary support for dynamic loading and merging of multiple embedded models
- The embedding_manager is now enhanced with ldm.invoke.concepts_lib,
which handles dynamic downloading and caching of embedded models from
the Hugging Face concepts library (https://huggingface.co/sd-concepts-library)
- Downloading of an embedded model is triggered by the presence of one or more
<concept> tags in the prompt.
- Once the embedded model is downloaded, its trigger phrase will be loaded
into the embedding manager and the prompt's <concept> tag will be replaced
with the <trigger_phrase>.
- The downloaded model stays on disk for fast loading later.
- The CLI autocomplete will complete partial <concept> tags for you. Type a
'<' and hit tab to get all ~700 concepts.
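A minimal sketch of how completion of partial <concept> tags might work, assuming the concept names are already available as a plain list. The function name `complete_concept` and the sample names are hypothetical and are not taken from the actual CLI code. Completions are closed with ">" so that they form complete tags.

```python
def complete_concept(partial, concept_names):
    """Return every '<name>' tag that starts with the text typed so far."""
    if not partial.startswith("<"):
        return []  # only text that opens a tag is completed
    prefix = partial[1:]
    return sorted(f"<{name}>" for name in concept_names if name.startswith(prefix))


# Hypothetical sample; the real list would come from the concepts library.
names = ["hoi4-leaders", "midjourney-style", "moebius"]
print(complete_concept("<", names))   # every known concept
print(complete_concept("<m", names))  # only names beginning with 'm'
```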
BUGS AND LIMITATIONS:
- MODEL NAME VS TRIGGER PHRASE
You must use the name of the concept embed model from the SD
library, and not the trigger phrase itself. Usually these are the
same, but not always. For example, the model named "hoi4-leaders"
corresponds to the trigger "<HOI4-Leader>".
One reason for this design choice is that there is no apparent
constraint on the uniqueness of the trigger phrases and one trigger
phrase may map onto multiple models. So we use the model name
instead.
The second reason is that there is no way I know of to search
Hugging Face for models with certain trigger phrases. So we'd have
to download all 700 models to index the phrases.
The drawback is that this may confuse users who want to reuse
prompts from distributions that use the trigger phrase directly.
Usually this will work, but not always.
- WON'T WORK ON A FIREWALLED SYSTEM
If the host running IAI has no internet connection, it can't
download the concept libraries. I will add a script that allows
users to preload a list of concept models.
- BUG IN PROMPT REPLACEMENT WHEN MODEL NOT FOUND
There's a small bug that occurs when the user provides an invalid
model name. The <concept> gets replaced with <None> in the prompt.
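One possible way to avoid the <None> substitution is to leave the tag untouched when no trigger phrase is found. The sketch below assumes a plain dict mapping model names to trigger phrases. `replace_concepts` and the dict are hypothetical stand-ins, not the embedding manager's real interface.

```python
import re

# Matches '<', then any run of characters other than '<' or '>', then '>'.
CONCEPT_TAG = re.compile(r"<([^<>]+)>")


def replace_concepts(prompt, triggers):
    """Swap each <model-name> tag for its trigger phrase; keep unknown tags as typed."""
    def substitute(match):
        trigger = triggers.get(match.group(1))
        # Fall back to the original tag instead of emitting '<None>'.
        return trigger if trigger is not None else match.group(0)

    return CONCEPT_TAG.sub(substitute, prompt)


# Hypothetical mapping, built from the example named above.
triggers = {"hoi4-leaders": "<HOI4-Leader>"}
print(replace_concepts("a portrait in the style of <hoi4-leaders>", triggers))
print(replace_concepts("a portrait of <no-such-model>", triggers))
```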
* fix loading .pt embeddings; allow multi-vector embeddings; warn on dupes
* simplify replacement logic and remove cuda assumption
* download list of concepts from Hugging Face
* remove misleading customization of '*' placeholder
The existing code as-is did not do anything, and it is unclear what it was supposed to do.
The obvious alternative -- using 'placeholder_strings' instead of
'placeholder_tokens' to match model.params.personalization_config.params.placeholder_strings --
caused a crash. I think this is because the passed string also needed to be handed over
on init of the PersonalizedBase as the 'placeholder_token' argument.
This is weird config dict magic and I don't want to touch it. Put a
breakpoint in personalized.py line 116 (top of PersonalizedBase.__init__) if
you want to have a crack at it yourself.
* address all the issues raised by damian0815 in review of PR #1526
* actually resize the token_embeddings
* multiple improvements to the concept loader based on code reviews
1. Activated the --embedding_directory option (alias --embedding_path)
to load a single embedding or an entire directory of embeddings at
startup time.
2. Can turn off automatic loading of embeddings using --no-embeddings.
3. Embedding checkpoints are scanned with the pickle scanner.
4. More informative error messages when a concept can't be loaded due
either to a 404 not found error or a network error.
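The distinction between a 404 and a general network failure could be drawn roughly as follows. This is a sketch using the standard library's `urllib.error` exceptions; `describe_download_error` and the message wording are hypothetical, and the real loader may use a different HTTP client.

```python
import urllib.error


def describe_download_error(concept_name, exc):
    """Turn a download exception into a message that says what went wrong."""
    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        if exc.code == 404:
            return f"Concept '{concept_name}' was not found in the concepts library (404)."
        return f"Server returned HTTP {exc.code} while fetching '{concept_name}'."
    if isinstance(exc, urllib.error.URLError):
        return f"Network error while fetching '{concept_name}': {exc.reason}"
    return f"Unexpected error while fetching '{concept_name}': {exc}"


# Exceptions are constructed by hand here so the example runs without a network.
not_found = urllib.error.HTTPError("https://example.invalid/x", 404, "Not Found", None, None)
print(describe_download_error("no-such-model", not_found))
print(describe_download_error("moebius", urllib.error.URLError("network unreachable")))
```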
* autocomplete terms end with ">" now
* fix startup error and network unreachable
1. If the .invokeai file does not contain the --root and --outdir options,
invoke.py will now fix it.
2. Catch and handle network problems when downloading Hugging Face textual
inversion concepts.
* fix misformatted error string
Co-authored-by: Damian Stewart <d@damianstewart.com>
- If there is not already a `.invokeai` file in the user's home directory
the first time invoke.py runs, it will create an empty one with comments
showing how to customize it.
- preload_models.py has been renamed load_models.py. I've left a
shell legacy version with the previous name to avoid breaking any
code.
- The load_models.py script now takes an optional --root argument,
which points to an install directory for the models, scripts, config
files, and the default outputs directory. In the future, the
embeddings manager directory will also be stored here.
- If no --root is provided, and no init file or environment variable
is present, load_models.py will install to '.' by default, which is
the current behavior. (This has *not* been tested thoroughly.)
- The location of the root directory is stored in the file .invokeai
in the user's home directory ($HOME on Linux/Mac, or HOMEPATH on
Windows). The load_models.py script creates this file if it
does not already exist.
- invoke.py and load_models.py use the following search path to find
the install directory:
1. Contents of the environment variable INVOKEAI_ROOT
2. The --root=XXXXX option in ~/.invokeai
3. The --root option passed on the script command line.
4. As a last gasp, the current working directory (".")
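A rough sketch of that search, assuming the first source that yields a value wins, in the order the list gives. The helper names `root_from_init_file` and `find_root` are hypothetical, only the `--root=XXXXX` spelling is recognized in the init file, and the real scripts may weigh the sources differently.

```python
import os
import shlex
from pathlib import Path


def root_from_init_file(init_file):
    """Return the value of a --root=... option in the init file, or None."""
    if not init_file.exists():
        return None
    for line in init_file.read_text().splitlines():
        # comments=True ignores anything after '#'
        for token in shlex.split(line, comments=True):
            if token.startswith("--root="):
                return token.split("=", 1)[1]
    return None


def find_root(cli_root=None, init_file=None, environ=None):
    """Try each source in the listed order and return the first one that is set."""
    environ = os.environ if environ is None else environ
    init_file = Path.home() / ".invokeai" if init_file is None else Path(init_file)
    candidates = [
        environ.get("INVOKEAI_ROOT"),    # 1. environment variable
        root_from_init_file(init_file),  # 2. --root=... in the init file
        cli_root,                        # 3. --root from the command line
        ".",                             # 4. current working directory
    ]
    chosen = next(c for c in candidates if c)
    return Path(chosen).expanduser()


print(find_root(cli_root="~/invokeai", environ={}, init_file="/nonexistent/.invokeai"))
```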
Running `python scripts/load_models.py --root ~/invokeai` will
create a directory structured like this (shortened for clarity):
~/invokeai
├── configs
│ ├── models.yaml
│ └── stable-diffusion
│ ├── v1-finetune.yaml
│ ├── v1-finetune_style.yaml
│ ├── v1-inference.yaml
│ ├── v1-inpainting-inference.yaml
│ └── v1-m1-finetune.yaml
├── models
│ ├── CompVis
│ ├── bert-base-uncased
│ ├── clipseg
│ ├── codeformer
│ ├── gfpgan
│ ├── ldm
│ │ └── stable-diffusion-v1
│ │ ├── sd-v1-5-inpainting.ckpt
│ │ └── vae-ft-mse-840000-ema-pruned.ckpt
│ └── openai
├── outputs
└── scripts
├── dream.py
├── images2prompt.py
├── invoke.py
├── legacy_api.py
├── load_models.py
├── merge_embeddings.py
├── orig_scripts
│ ├── download_first_stages.sh
│ ├── train_searcher.py
│ └── txt2img.py
├── preload_models.py
└── sd-metadata.py
1. You can now run invoke.py anywhere! Just copy it to one of your
bin directories, or put the ~/invokeai/scripts onto your PATH.
2. git pulls will no longer fight with you over models.yaml
3. It keeps end users out of the source code repo and will create
a path for us to do installs from invokeai.tar.gz.
This commit does several things that improve the customizability of the CLI `outcrop` command:
1. When outcropping an image you can now add a `--new_prompt` option, to specify a new prompt to be applied to the outpainted region instead of the prompt used to generate the image.
2. Similarly you can provide a new seed using `--seed` (or `-S`). A seed less than zero will pick one randomly.
3. The metadata written into the outcropped file is now more informative about what was previously stored.
4. This PR also fixes the crash that happened when trying to outcrop an image that does not contain InvokeAI metadata.
Other changes:
- add error checking suggested by @Kyle0654
- add special case in invoke.py to allow -1 to be passed as seed.
This now only occurs for postprocessing commands. Previously, -1
caused previous seed to be used, and this still applies to generate
operations.
- When outcropping an image you can now add a `--new_prompt` option, to specify
a new prompt to be used instead of the original one used to generate the image.
- Similarly you can provide a new seed using `--seed` (or `-S`). A seed less
than zero will pick one randomly.
- This PR also fixes the crash that happened when trying to outcrop an image
that does not contain InvokeAI metadata.
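The seed handling for postprocessing commands might look something like the sketch below, which follows the rule that a negative seed such as -1 requests a random one. `resolve_seed`, the reuse of the original seed when no `--seed` is given, and the upper bound `max_seed` are assumptions made for illustration.

```python
import random


def resolve_seed(requested, original_seed, max_seed=2**32 - 1):
    """Choose the seed for a postprocessing command such as outcrop."""
    if requested is None:
        return original_seed                 # assumed: no --seed means reuse the image's seed
    if requested < 0:
        return random.randint(0, max_seed)   # negative: draw a fresh random seed
    return requested                         # any other value is used as given


print(resolve_seed(None, 42))
print(resolve_seed(1234, 42))
print(resolve_seed(-1, 42))
```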
- Place preferred startup command switches in a file named
"invokeai.init". The file can consist of a single line of switches
such as "--web --steps=28", a series of switches on each
line, or any combination of the two.
Example:
```
--web
--host=0.0.0.0
--steps=28
--grid
-f 0.6 -C 11.0 -A k_euler_a
```
- The following options, which were previously only available within
the CLI, are now available on the command line as well:
--steps
--strength
--cfg_scale
--width
--height
--fit
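Because the init file may hold one switch per line, several per line, or a mix, reading it amounts to tokenizing each line and concatenating the results. A minimal sketch using the standard library's `shlex` follows. `parse_init_text` is a hypothetical name, and treating `#` as a comment marker is an assumption.

```python
import shlex


def parse_init_text(text):
    """Flatten the lines of an init file into a single list of switches."""
    switches = []
    for line in text.splitlines():
        # comments=True drops anything after '#'; blank lines yield no tokens.
        switches.extend(shlex.split(line, comments=True))
    return switches


example = """\
--web
--host=0.0.0.0
--steps=28
--grid
-f 0.6 -C 11.0 -A k_euler_a
"""
print(parse_init_text(example))
```

The resulting list can be placed in front of the real command-line arguments before they reach the argument parser, so that switches typed on the command line come later and can override the file.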
- ldm.generate.Generate() now takes an argument named `max_load_models`.
This is an integer that limits the model cache size. When the cache
reaches the limit, it will start purging older models from cache.
- CLI takes an argument --max_load_models, which defaults to 2. This will keep
one model in GPU and the other in CPU and switch back and forth
quickly.
- To not cache models at all, pass --max_load_models=1
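The size-limited cache can be pictured with the sketch below. It assumes "older" means least recently used, and it does not model the GPU/CPU split. `ModelCache` and its `loader` callback are hypothetical and are not the actual model cache code.

```python
from collections import OrderedDict


class ModelCache:
    """Keep at most max_load_models models, evicting the least recently used."""

    def __init__(self, loader, max_load_models=2):
        if max_load_models < 1:
            raise ValueError("max_load_models must be at least 1")
        self.loader = loader
        self.max_load_models = max_load_models
        self.models = OrderedDict()

    def get(self, name):
        if name in self.models:
            self.models.move_to_end(name)      # mark as most recently used
            return self.models[name]
        model = self.loader(name)
        self.models[name] = model
        while len(self.models) > self.max_load_models:
            self.models.popitem(last=False)    # purge the oldest entry
        return model


# A stand-in loader that just records which models were loaded.
loaded = []
cache = ModelCache(lambda name: loaded.append(name) or name.upper(), max_load_models=2)
for name in ["a", "b", "a", "c"]:
    cache.get(name)
print(list(cache.models), loaded)
```

With `max_load_models=1` only the model currently in use is kept, which matches the "no caching" setting described above.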
The Args object would crash when trying to retrieve metadata from
an image file that did not contain InvokeAI-generated metadata, such
as a JPG. This corrects that and returns dummy values (seed of zero,
prompt of '') to avoid downstream breakage.
This was a difficult merge because both PR #1108 and #1243 made
changes to obscure parts of the diffusion code.
- prompt weighting, merging and cross-attention working
- cross-attention does not work with runwayML inpainting
model, but weighting and merging are tested and working
- CLI command parsing code rewritten in order to get embedded
quotes right
- --hires now works with runwayML inpainting
- --embiggen does not work with runwayML and will give an error
- Added an --invert option to invert masks applied to inpainting
- Updated documentation
- change default model back to 1.4
- remove --fnformat from canonicalized dream prompt arguments
(not needed for image reproducibility)
- add -tm to canonicalized dream prompt arguments
(definitely needed for image reproducibility)
Now you can activate the Hugging Face `diffusers` library safety check
for NSFW and other potentially disturbing imagery.
To turn on the safety check, pass --safety_checker at the command
line. For developers, the flag is `safety_checker=True` passed to
ldm.generate.Generate(). Once the safety checker is turned on, it
cannot be turned off unless you reinitialize a new Generate object.
When the safety checker is active, suspect images will be blurred and
a warning icon is added. There is also a warning message printed in
the CLI, but it can be a little hard to see because of its positioning
in the output stream.
There is a slight but noticeable delay when the safety checker runs.
Note that invisible watermarking is *not* currently implemented. The
watermark code distributed by the CompVis distribution uses a library
that does not seem to be able to retrieve the watermarks it creates,
and it does not appear that Hugging Face `diffusers` or other SD
distributions are doing any watermarking.
- code for committing config changes to models.yaml now in module
rather than in invoke script
- model marked "default" is now loaded if model not specified on
command line
- uncache changed models when edited, so that they reload properly
- removed laion from models.yaml and added stable-diffusion-1.5
On the command line, the new option is --text_mask or -tm.
Example:
```
invoke> a baseball -I /path/to/still_life.png -tm orange
```
This will find the orange fruit in the still life painting and replace
it with an image of a baseball.