- The plms sampler now works with custom inpainting model
- Quashed bug that was causing generation on normal models to fail (oops!)
- Can now generate non-square images with custom inpainting model
Credits for advice and assistance during porting:
@any-winter-4079 (http://github.com/any-winter-4079)
@db3000 (Danny Beer http://github.com/db3000)
- The plms sampler now works with custom inpainting model
- Quashed bug that was causing generation on normal models to fail (oops!)
- Can now generate non-square images with custom inpainting model
This is still a work in progress but seems functional. It supports
inpainting, txt2img and img2img on the ddim and k* samplers (plms
still needs work, but I know what to do).
To test this, get the file `sd-v1-5-inpainting.ckpt' from
https://huggingface.co/runwayml/stable-diffusion-inpainting and place it
at `models/ldm/stable-diffusion-v1/sd-v1-5-inpainting.ckpt`
Launch invoke.py with --model inpainting-1.5 and proceed as usual.
Caveats:
1. The inpainting model takes about 800 Mb more memory than the standard
1.5 model. This model will not work on 4 GB cards.
2. The inpainting model is temperamental. It wants you to describe the
entire scene and not just the masked area to replace. So if you want
to replace the parrot on a man's shoulder with a crow, the prompt
"crow" may fail. Try "man with a crow on shoulder" instead. The
symptom of a failed inpainting is that the area will be erased and
replaced with background.
3. This has not been tested well. Please report bugs.
- This is a merge of the final version of PR #1218 "Inpainting
Improvements"
Various merge conflicts made it easier to commit directly.
Author: Kyle0654
Co-Author: lstein
1. If tensors are passed to inpaint as init_image and/or init_mask, then
the post-generation image fixup code will be skipped.
2. Post-generation image fixup will work with either a black and white "L"
or "RGB" mask, or an "RGBA" mask.
- pass a PIL.Image to img2img and inpaint rather than tensor
- To support clipseg, inpaint needs to accept an "L" or "1" format
mask. Made the appropriate change.
To add a VAE autoencoder to an existing model:
1. Download the appropriate autoencoder and put it into
models/ldm/stable-diffusion
Note that you MUST use a VAE that was written for the
original CompViz Stable Diffusion codebase. For v1.4,
that would be the file named vae-ft-mse-840000-ema-pruned.ckpt
that you can download from https://huggingface.co/stabilityai/sd-vae-ft-mse-original
2. Edit config/models.yaml to contain the following stanza, modifying `weights`
and `vae` as required to match the weights and vae model file names. There is
no requirement to rename the VAE file.
~~~
stable-diffusion-1.4:
weights: models/ldm/stable-diffusion-v1/sd-v1-4.ckpt
description: Stable Diffusion v1.4
config: configs/stable-diffusion/v1-inference.yaml
vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
width: 512
height: 512
~~~
3. Alternatively from within the `invoke.py` CLI, you may use the command
`!editmodel stable-diffusion-1.4` to bring up a simple editor that will
allow you to add the path to the VAE.
4. If you are just installing InvokeAI for the first time, you can also
use `!import_model models/ldm/stable-diffusion/sd-v1.4.ckpt` instead
to create the configuration from scratch.
5. That's it!
1. If tensors are passed to inpaint as init_image and/or init_mask, then
the post-generation image fixup code will be skipped.
2. Post-generation image fixup will work with either a black and white "L"
or "RGB" mask, or an "RGBA" mask.
- pass a PIL.Image to img2img and inpaint rather than tensor
- To support clipseg, inpaint needs to accept an "L" or "1" format
mask. Made the appropriate change.
To add a VAE autoencoder to an existing model:
1. Download the appropriate autoencoder and put it into
models/ldm/stable-diffusion
Note that you MUST use a VAE that was written for the
original CompViz Stable Diffusion codebase. For v1.4,
that would be the file named vae-ft-mse-840000-ema-pruned.ckpt
that you can download from https://huggingface.co/stabilityai/sd-vae-ft-mse-original
2. Edit config/models.yaml to contain the following stanza, modifying `weights`
and `vae` as required to match the weights and vae model file names. There is
no requirement to rename the VAE file.
~~~
stable-diffusion-1.4:
weights: models/ldm/stable-diffusion-v1/sd-v1-4.ckpt
description: Stable Diffusion v1.4
config: configs/stable-diffusion/v1-inference.yaml
vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
width: 512
height: 512
~~~
3. Alternatively from within the `invoke.py` CLI, you may use the command
`!editmodel stable-diffusion-1.4` to bring up a simple editor that will
allow you to add the path to the VAE.
4. If you are just installing InvokeAI for the first time, you can also
use `!import_model models/ldm/stable-diffusion/sd-v1.4.ckpt` instead
to create the configuration from scratch.
5. That's it!
Ironically, the black and white mask file generated by the
`invoke> !mask` command could not be passed as the mask to
`img2img`. This is now fixed and the documentation updated.
- remove unsupported testtubelogger, use csvlogger instead
- fix logic for parsing --gpus option so that it won't crash if
trailing comma absent
- change trainer accelerator from unsupported 'ddp' to 'auto'
- code for committing config changes to models.yaml now in module
rather than in invoke script
- model marked "default" is now loaded if model not specified on
command line
- uncache changed models when edited, so that they reload properly
- removed liaon from models.yaml and added stable-diffusion-1.5
- The !mask command takes an image path, a text prompt, and
(optionally) a masking threshold. It creates a mask over the region
indicated by the prompt, and outputs several files that show which
regions will be masked by the chosen prompt and threshold.
- The mask images should not be passed directly to img2img because
they are designed for visualization only. Instead, use the
--text_mask option to pass the selected prompt and threshold.
- See docs/features/INPAINTING.md for details.