This commit does several things that improve the customizability of the CLI `outcrop` command:
1. When outcropping an image you can now add a `--new_prompt` option, to specify a new prompt to be applied to the outpainted region instead of the prompt used to generate the image.
2. Similarly you can provide a new seed using `--seed` (or `-S`). A seed less than zero will pick one randomly.
3. The metadata written into the outcropped file is now more informative about what was previously stored.
4. This PR also fixes the crash that happened when trying to outcrop an image that does not contain InvokeAI metadata.
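For example, an outcrop invocation might now look something like
`!fix outputs/000011.123456.png --outcrop top 64 --new_prompt "a snow-covered mountain range" -S 31415`
(the `!fix ... --outcrop` form follows the existing outcrop syntax; the file name and values here are purely illustrative).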
Other changes:
- add error checking suggested by @Kyle0654
- add a special case in invoke.py so that -1 can be passed as a seed
for postprocessing commands. Previously, -1 always caused the previous
seed to be reused; that behavior still applies to generate operations.
- Works best with runwayML inpainting model
- Numerous code changes required to propagate seed to final metadata.
Original code predicated on the image being generated within InvokeAI.
- ldm.generate.Generate() now takes an argument named `max_load_models`.
This is an integer that limits the model cache size. When the cache
reaches the limit, the oldest models are purged from the cache.
- The CLI takes an argument --max_load_models, which defaults to 2. This keeps
one model in GPU and the other in CPU RAM, allowing quick switching
back and forth between them.
- To not cache models at all, pass --max_load_models=1
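As a rough illustration of the Python-level usage (only the `max_load_models` argument is documented here; the rest of the `Generate()` call is assumed to work with its defaults):
~~~
from ldm.generate import Generate

# Keep at most two models cached; loading a third purges the oldest.
# max_load_models=1 disables caching entirely.
gr = Generate(max_load_models=2)
~~~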
This was a difficult merge because PRs #1108 and #1243 both made
changes to obscure parts of the diffusion code.
- prompt weighting, merging and cross-attention working
- cross-attention does not work with runwayML inpainting
model, but weighting and merging are tested and working
- CLI command parsing code rewritten in order to get embedded
quotes right
- --hires now works with runwayML inpainting
- --embiggen does not work with runwayML and will give an error
- Added an --invert option to invert masks applied to inpainting
- Updated documentation
Now you can activate the Hugging Face `diffusers` library safety check
for NSFW and other potentially disturbing imagery.
To turn on the safety check, pass --safety_checker at the command
line. For developers, the flag is `safety_checker=True` passed to
ldm.generate.Generate(). Once the safety checker is turned on, it
cannot be turned off unless you reinitialize a new Generate object.
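A minimal sketch of the developer-facing usage (only the `safety_checker` flag is documented here; the surrounding `Generate()`/`prompt2image()` calls are shown with assumed defaults):
~~~
from ldm.generate import Generate

# Enable the safety checker at construction time; it cannot be
# disabled again on this instance.
gr = Generate(safety_checker=True)

# Suspect results from subsequent generations come back blurred,
# with a warning icon added and a warning printed to the console.
results = gr.prompt2image('a peaceful meadow at dawn', steps=30)
~~~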
When the safety checker is active, suspect images are blurred and a
warning icon is added. A warning message is also printed in the CLI,
but it can be hard to see because of its position in the output
stream.
There is a slight but noticeable delay when the safety checker runs.
Note that invisible watermarking is *not* currently implemented. The
watermark code in the CompVis distribution uses a library that does
not seem to be able to retrieve the watermarks it creates, and it does
not appear that Hugging Face `diffusers` or other SD distributions are
doing any watermarking.
To add a VAE autoencoder to an existing model:
1. Download the appropriate autoencoder and put it into
models/ldm/stable-diffusion-v1
Note that you MUST use a VAE that was written for the
original CompVis Stable Diffusion codebase. For v1.4,
that would be the file named vae-ft-mse-840000-ema-pruned.ckpt
that you can download from https://huggingface.co/stabilityai/sd-vae-ft-mse-original
2. Edit configs/models.yaml to contain the following stanza, modifying `weights`
and `vae` as required to match the weights and vae model file names. There is
no requirement to rename the VAE file.
~~~
stable-diffusion-1.4:
  weights: models/ldm/stable-diffusion-v1/sd-v1-4.ckpt
  description: Stable Diffusion v1.4
  config: configs/stable-diffusion/v1-inference.yaml
  vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
  width: 512
  height: 512
~~~
3. Alternatively, from within the `invoke.py` CLI, you may use the command
`!edit_model stable-diffusion-1.4` to bring up a simple editor that will
allow you to add the path to the VAE.
4. If you are just installing InvokeAI for the first time, you can also
use `!import_model models/ldm/stable-diffusion-v1/sd-v1-4.ckpt` instead
to create the configuration from scratch.
5. That's it!
- code for committing config changes to models.yaml now in module
rather than in invoke script
- model marked "default" is now loaded if no model is specified on the
command line
- uncache changed models when edited, so that they reload properly
- removed laion400m from models.yaml and added stable-diffusion-1.5
- The !mask command takes an image path, a text prompt, and
(optionally) a masking threshold. It creates a mask over the region
indicated by the prompt, and outputs several files that show which
regions will be masked by the chosen prompt and threshold.
- The mask images should not be passed directly to img2img because
they are designed for visualization only. Instead, use the
--text_mask option to pass the selected prompt and threshold.
- See docs/features/INPAINTING.md for details.
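For example (illustrative values, with `-tm` assumed as the short form of `--text_mask`; see docs/features/INPAINTING.md for the authoritative syntax): `!mask ./outputs/portrait.png -tm hair 0.5` previews the mask, and a subsequent img2img invocation with `--text_mask hair 0.5` applies it.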
- In CLI: the argument is --png_compression <0..9> (-z<0..9>)
- In API, pass `compress_level` to PngWriter.save_image_and_prompt_to_png()
Compression ranges from 0 (no compression) to 9 (maximum compression).
The default value is 6, which is the Pillow package's default.
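A rough sketch of the API-level call (only the method name and the `compress_level` keyword are documented here; the module path and the other arguments are assumptions for illustration):
~~~
from PIL import Image
from ldm.dream.pngwriter import PngWriter   # module path assumed

writer = PngWriter('./outputs')
image = Image.new('RGB', (512, 512))        # placeholder image

# compress_level ranges from 0 (none) to 9 (maximum); 6 is the default.
writer.save_image_and_prompt_to_png(image, 'a test prompt -S42', 'test.png',
                                    compress_level=9)
~~~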
This addresses an issue first raised in #652.
- --inpaint_replace 0.X will cause inpainting to ignore what is under
the masked region with a strength ranging from 0 (don't ignore at all)
to 1.0 (ignore completely)
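For example, adding `--inpaint_replace 0.8` to an inpainting invocation largely discards the original content under the mask, while `--inpaint_replace 0.0` keeps it fully.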
- sync with upstream development
- update docs
- Error checks for invalid model
- Add !del_model command to invoke.py
- Add del_model() method to model_cache
- Autocompleter kept in sync with model addition/removal.
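For example, `invoke> !del_model waifu-1.3` removes the model registered under that name (the model name here is illustrative).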
At step counts greater than ~75, the ksamplers start producing noisy
images when using the Karras noise schedule. This PR reverts to using
the model's own noise schedule, which eliminates the problem at the
cost of slowing convergence at lower step counts.
This PR also introduces a new CLI `--save_intermediates <n>` argument,
which will save every nth intermediate image into a subdirectory
named `intermediates/<image_prefix>`.
Addresses issue #1083.
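For example, adding `--save_intermediates 5` to a generation command saves every 5th intermediate image into the `intermediates/<image_prefix>` subdirectory described above.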
- !import_model <path/to/model/weights> will import a new model,
prompt the user for its name and description, write it to the
models.yaml file, and load it.
- !edit_model <model_name> will bring up a previously-defined model
and prompt the user to edit its descriptive fields.
Example of !import_model
<pre>
invoke> <b>!import_model models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt</b>
>> Model import in process. Please enter the values needed to configure this model:
Name for this model: <b>waifu-diffusion</b>
Description of this model: <b>Waifu Diffusion v1.3</b>
Configuration file for this model: <b>configs/stable-diffusion/v1-inference.yaml</b>
Default image width: <b>512</b>
Default image height: <b>512</b>
>> New configuration:
waifu-diffusion:
     config: configs/stable-diffusion/v1-inference.yaml
     description: Waifu Diffusion v1.3
     height: 512
     weights: models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt
     width: 512
OK to import [n]? <b>y</b>
>> Caching model stable-diffusion-1.4 in system RAM
>> Loading waifu-diffusion from models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using faster float16 precision
</pre>
Example of !edit_model
<pre>
invoke> <b>!edit_model waifu-diffusion</b>
>> Editing model waifu-diffusion from configuration file ./configs/models.yaml
description: <b>Waifu diffusion v1.4beta</b>
weights: models/ldm/stable-diffusion-v1/<b>model-epoch10-float16.ckpt</b>
config: configs/stable-diffusion/v1-inference.yaml
width: 512
height: 512
>> New configuration:
waifu-diffusion:
     config: configs/stable-diffusion/v1-inference.yaml
     description: Waifu diffusion v1.4beta
     weights: models/ldm/stable-diffusion-v1/model-epoch10-float16.ckpt
     height: 512
     width: 512
OK to import [n]? y
>> Caching model stable-diffusion-1.4 in system RAM
>> Loading waifu-diffusion from models/ldm/stable-diffusion-v1/model-epoch10-float16.ckpt
...
</pre>
- This PR enables two new commands in the invoke.py script:
  !models -- list the available models and their cache status
  !switch <model> -- switch to the indicated model
Example:
invoke> !models
laion400m not loaded Latent Diffusion LAION400M model
stable-diffusion-1.4 active Stable Diffusion inference model version 1.4
waifu-1.3 cached Waifu anime model version 1.3
invoke> !switch waifu-1.3
>> Caching model stable-diffusion-1.4 in system RAM
>> Retrieving model waifu-1.3 from system RAM cache
The name and descriptions of the models are taken from
`config/models.yaml`. A future enhancement to `model_cache.py` will be
to enable new model stanzas to be added to the file
programmatically. This will be useful for the WebGUI.
More details:
- Use fast switching algorithm described in PR #948
- Models are selected using their configuration stanza name
given in models.yaml.
- To avoid filling up CPU RAM with cached models, this PR
implements an LRU cache that monitors available CPU RAM.
- The caching code allows the minimum value of available RAM
to be adjusted, but invoke.py does not currently have a
command-line argument that allows you to set it. The
minimum free RAM is arbitrarily set to 2 GB.
- Add optional description field to configs/models.yaml
Unrelated fixes:
- Added ">>" to CompViz model loading messages in order to make user experience
more consistent.
- When generating an image larger than the default size, the warning about
possible VRAM exhaustion is now printed only once.
- Fixed bug that was causing help message to be printed twice. This involved
moving the import line for the web backend into the section where it is
called.
Coauthored by: @ArDiouscuros
- rename dream.py to invoke.py
- create a compatibility script named dream.py that exec()s invoke.py
- redo documentation
- change help message in args
- this does **not** rename the libraries, which are still ldm.dream.util, etc