From 7b46d5f823c39dbd9e7c83c08da0216a23719abd Mon Sep 17 00:00:00 2001
From: Lincoln Stein
Date: Thu, 27 Oct 2022 18:42:51 -0400
Subject: [PATCH] complete inpaint/outpaint documentation

- still need to write INSTALLING-MODELS.md documentation.
---
 docs/features/INPAINTING.md  | 84 ++++++++++++++++++++++++++--------
 docs/features/OUTPAINTING.md | 88 ++++++++++++++++++------------------
 2 files changed, 108 insertions(+), 64 deletions(-)

diff --git a/docs/features/INPAINTING.md b/docs/features/INPAINTING.md
index fb4506abea..0308243b2f 100644
--- a/docs/features/INPAINTING.md
+++ b/docs/features/INPAINTING.md
@@ -149,33 +149,77 @@ region directly:
 invoke> medusa with cobras -I ./test-pictures/curly.png -tm hair -C20
 ```
 
-## Outpainting
+## Using the RunwayML inpainting model
 
-Outpainting is the same as inpainting, except that the painting occurs
-in the regions outside of the original image. To outpaint using the
-`invoke.py` command line script, prepare an image in which the borders
-to be extended are pure black. Add an alpha channel (if there isn't one
-already), and make the borders completely transparent and the interior
-completely opaque. If you wish to modify the interior as well, you may
-create transparent holes in the transparency layer, which `img2img` will
-paint into as usual.
+The [RunwayML Inpainting Model
+v1.5](https://huggingface.co/runwayml/stable-diffusion-inpainting) is
+a specialized version of [Stable Diffusion
+v1.5](https://huggingface.co/spaces/runwayml/stable-diffusion-v1-5)
+that contains extra channels specifically designed to enhance
+inpainting and outpainting. While it can do regular `txt2img` and
+`img2img`, it really shines when filling in missing regions. It has an
+almost uncanny ability to blend the new regions with existing ones in
+a semantically coherent way.
 
-Pass the image as the argument to the `-I` switch as you would for
-regular inpainting. You'll likely be delighted by the results.
+To install the inpainting model, follow the
+[instructions](INSTALLING-MODELS.md) for installing a new model. You
+may use either the CLI (`invoke.py` script) or directly edit the
+`configs/models.yaml` configuration file to do this. The main thing to
+watch out for is that the model `config` option must be set up to use
+`v1-inpainting-inference.yaml` rather than the `v1-inference.yaml`
+file that is used by Stable Diffusion 1.4 and 1.5.
 
-### Tips
+After installation, your `models.yaml` should contain an entry that
+looks like this one:
 
-1. Do not try to expand the image too much at once. Generally it is best
-   to expand the margins in 64-pixel increments. 128 pixels often works,
-   but your mileage may vary depending on the nature of the image you are
-   trying to outpaint into.
+    inpainting-1.5:
+      weights: models/ldm/stable-diffusion-v1/sd-v1-5-inpainting.ckpt
+      description: SD inpainting v1.5
+      config: configs/stable-diffusion/v1-inpainting-inference.yaml
+      vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
+      width: 512
+      height: 512
 
-2. There are a series of switches that can be used to adjust how the
-   inpainting algorithm operates. In particular, you can use these to
-   minimize the seam that sometimes appears between the original image
-   and the extended part. These switches are:
+As shown in the example, you may include a VAE fine-tuning weights
+file as well. This is strongly recommended.
+
+To use the custom inpainting model, launch `invoke.py` with the
+argument `--model inpainting-1.5` or alternatively from within the
+script use the `!switch inpainting-1.5` command to load and switch to
+the inpainting model.
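+
+For example (a sketch; the `scripts/invoke.py` path is an assumption,
+so adjust it to match your installation):
+
+```bash
+# load the inpainting model at launch time
+python scripts/invoke.py --model inpainting-1.5
+
+# or switch to it from within a running CLI session
+invoke> !switch inpainting-1.5
+```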
+
+You can now do inpainting and outpainting exactly as described above,
+but there will (likely) be a noticeable improvement in
+coherence. Txt2img and Img2img will work as well.
+
+There are a few caveats to be aware of:
+
+1. The inpainting model is larger than the standard model, and will
+   use nearly 4 GB of GPU VRAM. This makes it unlikely to run on
+   a 4 GB graphics card.
+
+2. When operating in Img2img mode, the inpainting model is much less
+   steerable than the standard model. It is great for making small
+   changes, such as changing the pattern of a fabric, or slightly
+   changing a subject's expression or hair, but the model will
+   resist making the dramatic alterations that the standard
+   model lets you do.
+
+3. While the `--hires` option works fine with the inpainting model,
+   some special features, such as `--embiggen`, are disabled.
+
+4. Prompt weighting (`banana++ sushi`) and merging work well with
+   the inpainting model, but prompt swapping
+   (`a ("fluffy cat").swap("smiling dog") eating a hotdog`)
+   will not have any effect due to the way the model is set up.
+   You may use text masking (with `-tm thing-to-mask`) as an
+   effective replacement.
+
+5. The model tends to oversharpen the image if you use high step or
+   CFG values. If you need to use many steps, use the standard model.
+
+6. The `--strength` (`-f`) option has no effect on the inpainting
+   model due to its fundamental differences from the standard
+   model. It will always take the full number of steps you specify.
 
 ## Troubleshooting
diff --git a/docs/features/OUTPAINTING.md b/docs/features/OUTPAINTING.md
index 1f1e1dbdfa..a6de893811 100644
--- a/docs/features/OUTPAINTING.md
+++ b/docs/features/OUTPAINTING.md
@@ -15,13 +15,52 @@
 InvokeAI supports two versions of outpainting, one called "outpaint"
 and the other "outcrop." They work slightly differently and each has
 its advantages and drawbacks.
 
+### Outpainting
+
+Outpainting is the same as inpainting, except that the painting occurs
+in the regions outside of the original image. To outpaint using the
+`invoke.py` command line script, prepare an image in which the borders
+to be extended are pure black. Add an alpha channel (if there isn't one
+already), and make the borders completely transparent and the interior
+completely opaque. If you wish to modify the interior as well, you may
+create transparent holes in the transparency layer, which `img2img` will
+paint into as usual.
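+
+If you prefer to script this preparation step, here is a minimal
+sketch of how it can be done with Pillow (the file names and the
+128-pixel top margin are placeholder assumptions, not part of
+InvokeAI itself):
+
+```python
+from PIL import Image
+
+# Open the source image and make sure it carries an alpha channel.
+img = Image.open("source.png").convert("RGBA")
+width, height = img.size
+
+# Create a canvas extended by 128 pixels at the top. Unfilled pixels
+# are fully transparent black, which is what outpainting expects.
+extended = Image.new("RGBA", (width, height + 128), (0, 0, 0, 0))
+extended.paste(img, (0, 128))  # the original interior stays opaque
+
+extended.save("transparent_img.png")
+```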
+
+Pass the image as the argument to the `-I` switch as you would for
+regular inpainting:
+
+    invoke> a stream by a river -I /path/to/transparent_img.png
+
+You'll likely be delighted by the results.
+
+### Tips
+
+1. Do not try to expand the image too much at once. Generally it is best
+   to expand the margins in 64-pixel increments. 128 pixels often works,
+   but your mileage may vary depending on the nature of the image you are
+   trying to outpaint into.
+
+2. There are a series of switches that can be used to adjust how the
+   inpainting algorithm operates. In particular, you can use these to
+   minimize the seam that sometimes appears between the original image
+   and the extended part. These switches are (usage example below):
+
+    --seam_size SEAM_SIZE Size of the mask around the seam between original and outpainted image (0)
+    --seam_blur SEAM_BLUR The amount to blur the seam inwards (0)
+    --seam_strength STRENGTH The img2img strength to use when filling the seam (0.7)
+    --seam_steps SEAM_STEPS The number of steps to use to fill the seam (10)
+    --tile_size TILE_SIZE The tile size to use for filling outpaint areas (32)
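+
+   For example, to soften the transition with a wider, more blurred
+   seam (the values are illustrative starting points only; tune them
+   to your image):
+
+       invoke> a stream by a river -I /path/to/transparent_img.png --seam_size 16 --seam_blur 8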
+
 ### Outcrop
 
-The `outcrop` extension allows you to extend the image in 64 pixel
-increments in any dimension. You can apply the module to any image
-previously-generated by InvokeAI. Note that it will **not** work with
-arbitrary photographs or Stable Diffusion images created by other
-implementations.
+The `outcrop` extension gives you a convenient `!fix` postprocessing
+command that extends a previously-generated image in 64 pixel
+increments in any direction. It works on any image generated by
+InvokeAI, and also on arbitrary PNG photographs, but not currently
+on JPG or other formats. Outcropping is particularly effective when
+combined with the [runwayML custom inpainting
+model](INPAINTING.md#using-the-runwayml-inpainting-model).
 
 Consider this image:
 
@@ -64,42 +103,3 @@
 you'll get a slightly different result. You can run it
 repeatedly until you get an image you like. Unfortunately `!fix` does
 not currently respect the `-n` (`--iterations`) argument.
 
-## Outpaint
-
-The `outpaint` extension does the same thing, but with subtle
-differences. Starting with the same image, here is how we would add an
-additional 64 pixels to the top of the image:
-
-```bash
-invoke> !fix images/curly.png --out_direction top 64
-```
-
-(you can abbreviate `--out_direction` as `-D`.
-
-The result is shown here:
-
-![curly_woman_outpaint](../assets/outpainting/curly-outpaint.png)
-
-Although the effect is similar, there are significant differences from
-outcropping:
-
-- You can only specify one direction to extend at a time.
-- The image is **not** resized. Instead, the image is shifted by the specified
-number of pixels. If you look carefully, you'll see that less of the lady's
-torso is visible in the image.
-- Because the image dimensions remain the same, there's no rounding
-to multiples of 64.
-- Attempting to outpaint larger areas will frequently give rise to ugly
-  ghosting effects.
-- For best results, try increasing the step number.
-- If you don't specify a pixel value in `-D`, it will default to half
-  of the whole image, which is likely not what you want.
-
-!!! tip
-
-    Neither `outpaint` nor `outcrop` are perfect, but we continue to tune
-    and improve them. If one doesn't work, try the other. You may also
-    wish to experiment with other `img2img` arguments, such as `-C`, `-f`
-    and `-s`.