Mirror of https://github.com/invoke-ai/InvokeAI (synced 2024-08-30 20:32:17 +00:00)

Commit a0b6654f6a — updated postprocessing, prompts, img2img and web docs
Parent: 00cb8a0c64
@@ -494,7 +494,7 @@ sections describe what's new for InvokeAI.
   [Manual Installation](installation/020_INSTALL_MANUAL.md).
 - The ability to save frequently-used startup options (model to load, steps,
   sampler, etc) in a `.invokeai` file. See
-  [Client](features/CLI.md)
+  [Client](deprecated/CLI.md)
 - Support for AMD GPU cards (non-CUDA) on Linux machines.
 - Multiple bugs and edge cases squashed.
 
@@ -617,7 +617,7 @@ sections describe what's new for InvokeAI.
 - `dream.py` script renamed `invoke.py`. A `dream.py` script wrapper remains for
   backward compatibility.
 - Completely new WebGUI - launch with `python3 scripts/invoke.py --web`
-- Support for [inpainting](features/INPAINTING.md) and
+- Support for [inpainting](deprecated/INPAINTING.md) and
   [outpainting](features/OUTPAINTING.md)
 - img2img runs on all k\* samplers
 - Support for
@@ -629,7 +629,7 @@ sections describe what's new for InvokeAI.
   using facial reconstruction, ESRGAN upscaling, outcropping (similar to DALL-E
   infinite canvas), and "embiggen" upscaling. See the `!fix` command.
 - New `--hires` option on `invoke>` line allows
-  [larger images to be created without duplicating elements](features/CLI.md#this-is-an-example-of-txt2img),
+  [larger images to be created without duplicating elements](deprecated/CLI.md#this-is-an-example-of-txt2img),
   at the cost of some performance.
 - New `--perlin` and `--threshold` options allow you to add and control
   variation during image generation (see
@@ -638,7 +638,7 @@ sections describe what's new for InvokeAI.
   of images and tweaking of previous settings.
 - Command-line completion in `invoke.py` now works on Windows, Linux and Mac
   platforms.
-- Improved [command-line completion behavior](features/CLI.md) New commands
+- Improved [command-line completion behavior](deprecated/CLI.md) New commands
   added:
   - List command-line history with `!history`
   - Search command-line history with `!search`
BIN  docs/assets/features/restoration-montage.png (new file, 4.0 MiB; binary file not shown)
BIN  docs/assets/features/upscale-dialog.png (new file, 310 KiB; binary file not shown)
BIN  docs/assets/features/upscaling-montage.png (new file, 8.3 MiB; binary file not shown)
@@ -205,14 +205,14 @@ Here are the invoke> command that apply to txt2img:
 | `--seamless` | | `False` | Activate seamless tiling for interesting effects |
 | `--seamless_axes` | | `x,y` | Specify which axes to use circular convolution on. |
 | `--log_tokenization` | `-t` | `False` | Display a color-coded list of the parsed tokens derived from the prompt |
-| `--skip_normalization` | `-x` | `False` | Weighted subprompts will not be normalized. See [Weighted Prompts](./OTHER.md#weighted-prompts) |
+| `--skip_normalization` | `-x` | `False` | Weighted subprompts will not be normalized. See [Weighted Prompts](../features/OTHER.md#weighted-prompts) |
 | `--upscale <int> <float>` | `-U <int> <float>` | `-U 1 0.75` | Upscale image by magnification factor (2, 4), and set strength of upscaling (0.0-1.0). If strength not set, will default to 0.75. |
 | `--facetool_strength <float>` | `-G <float> ` | `-G0` | Fix faces (defaults to using the GFPGAN algorithm); argument indicates how hard the algorithm should try (0.0-1.0) |
 | `--facetool <name>` | `-ft <name>` | `-ft gfpgan` | Select face restoration algorithm to use: gfpgan, codeformer |
 | `--codeformer_fidelity` | `-cf <float>` | `0.75` | Used along with CodeFormer. Takes values between 0 and 1. 0 produces high quality but low accuracy. 1 produces high accuracy but low quality |
 | `--save_original` | `-save_orig` | `False` | When upscaling or fixing faces, this will cause the original image to be saved rather than replaced. |
-| `--variation <float>` | `-v<float>` | `0.0` | Add a bit of noise (0.0=none, 1.0=high) to the image in order to generate a series of variations. Usually used in combination with `-S<seed>` and `-n<int>` to generate a series a riffs on a starting image. See [Variations](./VARIATIONS.md). |
-| `--with_variations <pattern>` | | `None` | Combine two or more variations. See [Variations](./VARIATIONS.md) for now to use this. |
+| `--variation <float>` | `-v<float>` | `0.0` | Add a bit of noise (0.0=none, 1.0=high) to the image in order to generate a series of variations. Usually used in combination with `-S<seed>` and `-n<int>` to generate a series a riffs on a starting image. See [Variations](../features/VARIATIONS.md). |
+| `--with_variations <pattern>` | | `None` | Combine two or more variations. See [Variations](../features/VARIATIONS.md) for now to use this. |
 | `--save_intermediates <n>` | | `None` | Save the image from every nth step into an "intermediates" folder inside the output directory |
 | `--h_symmetry_time_pct <float>` | | `None` | Create symmetry along the X axis at the desired percent complete of the generation process. (Must be between 0.0 and 1.0; set to a very small number like 0.0001 for just after the first step of generation.) |
 | `--v_symmetry_time_pct <float>` | | `None` | Create symmetry along the Y axis at the desired percent complete of the generation process. (Must be between 0.0 and 1.0; set to a very small number like 0.0001 for just after the first step of generation.) |
@@ -257,7 +257,7 @@ additional options:
   by `-M`. You may also supply just a single initial image with the areas
   to overpaint made transparent, but you must be careful not to destroy
   the pixels underneath when you create the transparent areas. See
-  [Inpainting](./INPAINTING.md) for details.
+  [Inpainting](INPAINTING.md) for details.
 
 inpainting accepts all the arguments used for txt2img and img2img, as well as
 the --mask (-M) and --text_mask (-tm) arguments:
@@ -297,7 +297,7 @@ invoke> a piece of cake -I /path/to/breakfast.png -tm bagel 0.6
 
 You can load and use hundreds of community-contributed Textual
 Inversion models just by typing the appropriate trigger phrase. Please
-see [Concepts Library](CONCEPTS.md) for more details.
+see [Concepts Library](../features/CONCEPTS.md) for more details.
 
 ## Other Commands
 
@@ -4,86 +4,13 @@ title: Image-to-Image
 
 # :material-image-multiple: Image-to-Image
 
-Both the Web and command-line interfaces provide an "img2img" feature
-that lets you seed your creations with an initial drawing or
-photo. This is a really cool feature that tells stable diffusion to
-build the prompt on top of the image you provide, preserving the
-original's basic shape and layout.
-
-See the [WebUI Guide](WEB.md) for a walkthrough of the img2img feature
-in the InvokeAI web server. This document describes how to use img2img
-in the command-line tool.
-
-## Basic Usage
-
-Launch the command-line client by launching `invoke.sh`/`invoke.bat`
-and choosing option (1). Alternative, activate the InvokeAI
-environment and issue the command `invokeai`.
-
-Once the `invoke> ` prompt appears, you can start an img2img render by
-pointing to a seed file with the `-I` option as shown here:
-
-!!! example ""
-
-    ```commandline
-    tree on a hill with a river, nature photograph, national geographic -I./test-pictures/tree-and-river-sketch.png -f 0.85
-    ```
-
-    <figure markdown>
-
-    | original image | generated image |
-    | :------------: | :-------------: |
-    | { width=320 } | { width=320 } |
-
-    </figure>
-
-The `--init_img` (`-I`) option gives the path to the seed picture. `--strength`
-(`-f`) controls how much the original will be modified, ranging from `0.0` (keep
-the original intact), to `1.0` (ignore the original completely). The default is
-`0.75`, and ranges from `0.25-0.90` give interesting results. Other relevant
-options include `-C` (classification free guidance scale), and `-s` (steps).
-Unlike `txt2img`, adding steps will continuously change the resulting image and
-it will not converge.
-
-You may also pass a `-v<variation_amount>` option to generate `-n<iterations>`
-count variants on the original image. This is done by passing the first
-generated image back into img2img the requested number of times. It generates
-interesting variants.
-
-Note that the prompt makes a big difference. For example, this slight variation
-on the prompt produces a very different image:
-
-<figure markdown>
-{ width=320 }
-<caption markdown>photograph of a tree on a hill with a river</caption>
-</figure>
-
-!!! tip
-
-    When designing prompts, think about how the images scraped from the internet were
-    captioned. Very few photographs will be labeled "photograph" or "photorealistic."
-    They will, however, be captioned with the publication, photographer, camera model,
-    or film settings.
-
-If the initial image contains transparent regions, then Stable Diffusion will
-only draw within the transparent regions, a process called
-[`inpainting`](./INPAINTING.md#creating-transparent-regions-for-inpainting).
-However, for this to work correctly, the color information underneath the
-transparent needs to be preserved, not erased.
-
-!!! warning "**IMPORTANT ISSUE** "
-
-    `img2img` does not work properly on initial images smaller
-    than 512x512. Please scale your image to at least 512x512 before using it.
-    Larger images are not a problem, but may run out of VRAM on your GPU card. To
-    fix this, use the --fit option, which downscales the initial image to fit within
-    the box specified by width x height:
-
-    ```
-    tree on a hill with a river, national geographic -I./test-pictures/big-sketch.png -H512 -W512 --fit
-    ```
-
-## How does it actually work, though?
-
+InvokeAI provides an "img2img" feature that lets you seed your
+creations with an initial drawing or photo. This is a really cool
+feature that tells stable diffusion to build the prompt on top of the
+image you provide, preserving the original's basic shape and layout.
+
+For a walkthrough of using Image-to-Image in the Web UI, see [InvokeAI
+Web Server](./WEB.md#image-to-image).
+
 The main difference between `img2img` and `prompt2img` is the starting point.
 While `prompt2img` always starts with pure gaussian noise and progressively
@@ -99,10 +26,6 @@ seed `1592514025` develops something like this:
 
 !!! example ""
 
-    ```bash
-    invoke> "fire" -s10 -W384 -H384 -S1592514025
-    ```
-
 <figure markdown>
 { width=720 }
 </figure>
@@ -157,17 +80,8 @@ Diffusion has less chance to refine itself, so the result ends up inheriting all
 the problems of my bad drawing.
 
-If you want to try this out yourself, all of these are using a seed of
-`1592514025` with a width/height of `384`, step count `10`, the default sampler
-(`k_lms`), and the single-word prompt `"fire"`:
-
-```bash
-invoke> "fire" -s10 -W384 -H384 -S1592514025 -I /tmp/fire-drawing.png --strength 0.7
-```
-
-The code for rendering intermediates is on my (damian0815's) branch
-[document-img2img](https://github.com/damian0815/InvokeAI/tree/document-img2img) -
-run `invoke.py` and check your `outputs/img-samples/intermediates` folder while
-generating an image.
+If you want to try this out yourself, all of these are using a seed of
+`1592514025` with a width/height of `384`, step count `10`, the
+`k_lms` sampler, and the single-word prompt `"fire"`.
 
 ### Compensating for the reduced step count
 
@@ -180,10 +94,6 @@ give each generation 20 steps.
 Here's strength `0.4` (note step count `50`, which is `20 ÷ 0.4` to make sure SD
 does `20` steps from my image):
 
-```bash
-invoke> "fire" -s50 -W384 -H384 -S1592514025 -I /tmp/fire-drawing.png -f 0.4
-```
-
 <figure markdown>
 
 </figure>
@@ -191,10 +101,6 @@ invoke> "fire" -s50 -W384 -H384 -S1592514025 -I /tmp/fire-drawing.png -f 0.4
 and here is strength `0.7` (note step count `30`, which is roughly `20 ÷ 0.7` to
 make sure SD does `20` steps from my image):
 
-```commandline
-invoke> "fire" -s30 -W384 -H384 -S1592514025 -I /tmp/fire-drawing.png -f 0.7
-```
-
 <figure markdown>
 
 </figure>
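The step-count arithmetic in the img2img hunks above (request `20 ÷ strength` total steps so that Stable Diffusion actually performs 20 steps from the seed image) can be sketched as a small helper. This is an illustrative sketch only; the function name is not part of InvokeAI:

```python
import math

def requested_steps(effective_steps: int, strength: float) -> int:
    """img2img only runs the last `strength` fraction of the denoising
    schedule, so to get `effective_steps` real steps of refinement you
    must request roughly effective_steps / strength total steps."""
    return math.ceil(effective_steps / strength)

print(requested_steps(20, 0.4))  # 50 -> matches `-s50 -f 0.4`
print(requested_steps(20, 0.7))  # 29 -> roughly the `-s30 -f 0.7` used above
```

At strength 1.0 no compensation is needed, since the full schedule runs.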
@@ -8,12 +8,6 @@ title: Postprocessing
 
 This extension provides the ability to restore faces and upscale images.
 
-Face restoration and upscaling can be applied at the time you generate the
-images, or at any later time against a previously-generated PNG file, using the
-[!fix](#fixing-previously-generated-images) command.
-[Outpainting and outcropping](OUTPAINTING.md) can only be applied after the
-fact.
-
 ## Face Fixing
 
 The default face restoration module is GFPGAN. The default upscale is
@@ -23,8 +17,7 @@ Real-ESRGAN. For an alternative face restoration module, see
 As of version 1.14, environment.yaml will install the Real-ESRGAN package into
 the standard install location for python packages, and will put GFPGAN into a
 subdirectory of "src" in the InvokeAI directory. Upscaling with Real-ESRGAN
-should "just work" without further intervention. Simply pass the `--upscale`
-(`-U`) option on the `invoke>` command line, or indicate the desired scale on
+should "just work" without further intervention. Simply indicate the desired scale on
 the popup in the Web GUI.
 
 **GFPGAN** requires a series of downloadable model files to work. These are
@@ -41,48 +34,75 @@ reconstruction.
 
 ### Upscaling
 
-`-U : <upscaling_factor> <upscaling_strength>`
-
-The upscaling prompt argument takes two values. The first value is a scaling
-factor and should be set to either `2` or `4` only. This will either scale the
-image 2x or 4x respectively using different models.
-
-You can set the scaling stength between `0` and `1.0` to control intensity of
-the of the scaling. This is handy because AI upscalers generally tend to smooth
-out texture details. If you wish to retain some of those for natural looking
-results, we recommend using values between `0.5 to 0.8`.
-
-If you do not explicitly specify an upscaling_strength, it will default to 0.75.
+Open the upscaling dialog by clicking on the "expand" icon located
+above the image display area in the Web UI:
+
+<figure markdown>
+
+</figure>
+
+There are three different upscaling parameters that you can
+adjust. The first is the scale itself, either 2x or 4x.
+
+The second is the "Denoising Strength." Higher values will smooth out
+the image and remove digital chatter, but may lose fine detail at
+higher values.
+
+Third, "Upscale Strength" allows you to adjust how the You can set the
+scaling stength between `0` and `1.0` to control the intensity of the
+scaling. AI upscalers generally tend to smooth out texture details. If
+you wish to retain some of those for natural looking results, we
+recommend using values between `0.5 to 0.8`.
+
+[This figure](../assets/features/upscaling-montage.png) illustrates
+the effects of denoising and strength. The original image was 512x512,
+4x scaled to 2048x2048. The "original" version on the upper left was
+scaled using simple pixel averaging. The remainder use the ESRGAN
+upscaling algorithm at different levels of denoising and strength.
+
+<figure markdown>
+{ width=720 }
+</figure>
+
+Both denoising and strength default to 0.75.
 
 ### Face Restoration
 
-`-G : <facetool_strength>`
-
-This prompt argument controls the strength of the face restoration that is being
-applied. Similar to upscaling, values between `0.5 to 0.8` are recommended.
-
-You can use either one or both without any conflicts. In cases where you use
-both, the image will be first upscaled and then the face restoration process
-will be executed to ensure you get the highest quality facial features.
-
-`--save_orig`
-
-When you use either `-U` or `-G`, the final result you get is upscaled or face
-modified. If you want to save the original Stable Diffusion generation, you can
-use the `-save_orig` prompt argument to save the original unaffected version
-too.
-
-### Example Usage
-
-```bash
-invoke> "superman dancing with a panda bear" -U 2 0.6 -G 0.4
-```
-
-This also works with img2img:
-
-```bash
-invoke> "a man wearing a pineapple hat" -I path/to/your/file.png -U 2 0.5 -G 0.6
-```
+InvokeAI offers alternative two face restoration algorithms,
+[GFPGAN](https://github.com/TencentARC/GFPGAN) and
+[CodeFormer](https://huggingface.co/spaces/sczhou/CodeFormer). These
+algorithms improve the appearance of faces, particularly eyes and
+mouths. Issues with faces are less common with the latest set of
+Stable Diffusion models than with the original 1.4 release, but the
+restoration algorithms can still make a noticeable improvement in
+certain cases. You can also apply restoration to old photographs you
+upload.
+
+To access face restoration, click the "smiley face" icon in the
+toolbar above the InvokeAI image panel. You will be presented with a
+dialog that offers a choice between the two algorithm and sliders that
+allow you to adjust their parameters. Alternatively, you may open the
+left-hand accordion panel labeled "Face Restoration" and have the
+restoration algorithm of your choice applied to generated images
+automatically.
+
+Like upscaling, there are a number of parameters that adjust the face
+restoration output. GFPGAN has a single parameter, `strength`, which
+controls how much the algorithm is allowed to adjust the
+image. CodeFormer has two parameters, `strength`, and `fidelity`,
+which together control the quality of the output image as described in
+the [CodeFormer project
+page](https://shangchenzhou.com/projects/CodeFormer/). Default values
+are 0.75 for both parameters, which achieves a reasonable balance
+between changing the image too much and not enough.
+
+[This figure](../assets/features/restoration-montage.png) illustrates
+the effects of adjusting GFPGAN and CodeFormer parameters.
+
+<figure markdown>
+{ width=720 }
+</figure>
 
 !!! note
 
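The constraints that the old text documents for the `-U <upscaling_factor> <upscaling_strength>` argument (factor must be 2 or 4, strength between 0.0 and 1.0, defaulting to 0.75) can be sketched as a small validator. This helper is purely illustrative and is not InvokeAI code:

```python
def parse_upscale(factor: int, strength: float = 0.75):
    """Validate `-U <upscaling_factor> <upscaling_strength>` values
    per the documented constraints."""
    if factor not in (2, 4):
        raise ValueError("upscaling factor must be 2 or 4")
    if not 0.0 <= strength <= 1.0:
        raise ValueError("upscaling strength must be between 0.0 and 1.0")
    return factor, strength

print(parse_upscale(2, 0.6))  # (2, 0.6)
print(parse_upscale(4))       # (4, 0.75) -- strength defaults to 0.75
```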
@@ -95,69 +115,8 @@ invoke> "a man wearing a pineapple hat" -I path/to/your/file.png -U 2 0.5 -G 0.6
 process is complete. While the image generation is taking place, you will still be able to preview
 the base images.
 
-If you wish to stop during the image generation but want to upscale or face
-restore a particular generated image, pass it again with the same prompt and
-generated seed along with the `-U` and `-G` prompt arguments to perform those
-actions.
-
-## CodeFormer Support
-
-This repo also allows you to perform face restoration using
-[CodeFormer](https://github.com/sczhou/CodeFormer).
-
-In order to setup CodeFormer to work, you need to download the models like with
-GFPGAN. You can do this either by running `invokeai-configure` or by manually
-downloading the
-[model file](https://github.com/sczhou/CodeFormer/releases/download/v0.1.0/codeformer.pth)
-and saving it to `ldm/invoke/restoration/codeformer/weights` folder.
-
-You can use `-ft` prompt argument to swap between CodeFormer and the default
-GFPGAN. The above mentioned `-G` prompt argument will allow you to control the
-strength of the restoration effect.
-
-### CodeFormer Usage
-
-The following command will perform face restoration with CodeFormer instead of
-the default gfpgan.
-
-`<prompt> -G 0.8 -ft codeformer`
-
-### Other Options
-
-- `-cf` - cf or CodeFormer Fidelity takes values between `0` and `1`. 0 produces
-  high quality results but low accuracy and 1 produces lower quality results but
-  higher accuacy to your original face.
-
-The following command will perform face restoration with CodeFormer. CodeFormer
-will output a result that is closely matching to the input face.
-
-`<prompt> -G 1.0 -ft codeformer -cf 0.9`
-
-The following command will perform face restoration with CodeFormer. CodeFormer
-will output a result that is the best restoration possible. This may deviate
-slightly from the original face. This is an excellent option to use in
-situations when there is very little facial data to work with.
-
-`<prompt> -G 1.0 -ft codeformer -cf 0.1`
-
-## Fixing Previously-Generated Images
-
-It is easy to apply face restoration and/or upscaling to any
-previously-generated file. Just use the syntax
-`!fix path/to/file.png <options>`. For example, to apply GFPGAN at strength 0.8
-and upscale 2X for a file named `./outputs/img-samples/000044.2945021133.png`,
-just run:
-
-```bash
-invoke> !fix ./outputs/img-samples/000044.2945021133.png -G 0.8 -U 2
-```
-
-A new file named `000044.2945021133.fixed.png` will be created in the output
-directory. Note that the `!fix` command does not replace the original file,
-unlike the behavior at generate time.
-
 ## How to disable
 
 If, for some reason, you do not wish to load the GFPGAN and/or ESRGAN libraries,
 you can disable them on the invoke.py command line with the `--no_restore` and
-`--no_upscale` options, respectively.
+`--no_esrgan` options, respectively.
@ -4,77 +4,12 @@ title: Prompting-Features
|
|||||||
|
|
||||||
# :octicons-command-palette-24: Prompting-Features
|
# :octicons-command-palette-24: Prompting-Features
|
||||||
|
|
||||||
## **Reading Prompts from a File**
|
|
||||||
|
|
||||||
You can automate `invoke.py` by providing a text file with the prompts you want
|
|
||||||
to run, one line per prompt. The text file must be composed with a text editor
|
|
||||||
(e.g. Notepad) and not a word processor. Each line should look like what you
|
|
||||||
would type at the invoke> prompt:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
"a beautiful sunny day in the park, children playing" -n4 -C10
|
|
||||||
"stormy weather on a mountain top, goats grazing" -s100
|
|
||||||
"innovative packaging for a squid's dinner" -S137038382
|
|
||||||
```
|
|
||||||
|
|
||||||
Then pass this file's name to `invoke.py` when you invoke it:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
python scripts/invoke.py --from_file "/path/to/prompts.txt"
|
|
||||||
```
|
|
||||||
|
|
||||||
You may also read a series of prompts from standard input by providing
|
|
||||||
a filename of `-`. For example, here is a python script that creates a
|
|
||||||
matrix of prompts, each one varying slightly:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
#!/usr/bin/env python
|
|
||||||
|
|
||||||
adjectives = ['sunny','rainy','overcast']
|
|
||||||
samplers = ['k_lms','k_euler_a','k_heun']
|
|
||||||
cfg = [7.5, 9, 11]
|
|
||||||
|
|
||||||
for adj in adjectives:
|
|
||||||
for samp in samplers:
|
|
||||||
for cg in cfg:
|
|
||||||
print(f'a {adj} day -A{samp} -C{cg}')
|
|
||||||
```
|
|
||||||
|
|
||||||
Its output looks like this (abbreviated):
|
|
||||||
|
|
||||||
```bash
|
|
||||||
a sunny day -Aklms -C7.5
|
|
||||||
a sunny day -Aklms -C9
|
|
||||||
a sunny day -Aklms -C11
|
|
||||||
a sunny day -Ak_euler_a -C7.5
|
|
||||||
a sunny day -Ak_euler_a -C9
|
|
||||||
...
|
|
||||||
a overcast day -Ak_heun -C9
|
|
||||||
a overcast day -Ak_heun -C11
|
|
||||||
```
|
|
||||||
|
|
||||||
To feed it to invoke.py, pass the filename of "-"
|
|
||||||
|
|
||||||
```bash
|
|
||||||
python matrix.py | python scripts/invoke.py --from_file -
|
|
||||||
```
|
|
||||||
|
|
||||||
When the script is finished, each of the 27 combinations
|
|
||||||
of adjective, sampler and CFG will be executed.

The command-line interface provides `!fetch` and `!replay` commands,
which allow you to read the prompts from a single previously-generated
image or a whole directory of them, write the prompts to a file, and
then replay them. Or you can create your own file of prompts and feed
them to the command-line client from within an interactive session.
See [Command-Line Interface](CLI.md) for details.

---

## **Negative and Unconditioned Prompts**

Any words between a pair of square brackets will instruct Stable
Diffusion to attempt to ban the concept from the generated image. The
same effect is achieved by placing words in the "Negative Prompts"
textbox in the Web UI.

```text
this is a test prompt [not really] to make you understand [cool] how this works.
```
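To make the bracket syntax concrete, here is a small illustrative parser that separates the positive text from the bracketed negative terms. This is a simplified sketch of how the syntax reads, not InvokeAI's actual prompt parser:

```python
import re

def split_negatives(prompt: str):
    """Split a prompt into (positive text, list of bracketed negative terms)."""
    negatives = re.findall(r'\[([^\]]*)\]', prompt)
    positive = re.sub(r'\s*\[[^\]]*\]', '', prompt).strip()
    return positive, negatives

pos, neg = split_negatives(
    "this is a test prompt [not really] to make you understand [cool] how this works.")
# pos -> "this is a test prompt to make you understand how this works."
# neg -> ["not really", "cool"]
```

Everything inside `[...]` is lifted out as a negative term; the remaining text is what the model is asked to draw.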

Here's a prompt that depicts what it does.

original prompt:

`#!bash "A fantastical translucent pony made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve"`

`#!bash parameters: steps=20, dimensions=512x768, CFG=7.5, Scheduler=k_euler_a, seed=1654590180`

<figure markdown>

That image has a woman, so if we want the horse without a rider, we can
influence the image not to have a woman by putting [woman] in the prompt, like
this:

`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman]"`

(same parameters as above)

<figure markdown>

That's nice - but say we also don't want the image to be quite so blue. We can
add "blue" to the list of negative prompts, so it's now [woman blue]:

`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman blue]"`

(same parameters as above)

<figure markdown>

Getting close - but there's no sense in having a saddle when our horse doesn't
have a rider, so we'll add one more negative prompt: [woman blue saddle].

`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman blue saddle]"`

(same parameters as above)
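If you are sweeping negative prompts from a script, the escalation above ([woman], then [woman blue], then [woman blue saddle]) can be generated mechanically and fed to `--from_file -`. This is an illustrative sketch, not an InvokeAI API:

```python
base = ('A fantastical translucent poney made of water and foam, '
        'ethereal, radiant')  # abbreviated form of the prompt above

negatives = []
for term in ['woman', 'blue', 'saddle']:
    negatives.append(term)
    # each pass bans one more concept, keeping the same seed for comparison
    print(f'"{base} [{" ".join(negatives)}]" -S 1654590180')
```

Holding the seed fixed while growing the bracketed list is what makes the three images directly comparable.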

<figure markdown>

The `prompt2prompt` code is based on
[bloc97's colab](https://github.com/bloc97/CrossAttentionControl).

Note that `prompt2prompt` is not currently working with the runwayML inpainting
model, and may never work due to the way this model is set up. If you attempt to
use `prompt2prompt` you will get the original image back. However, since this
model is so good at inpainting, a good substitute is to use the `clipseg` text
masking option:

```bash
invoke> a fluffy cat eating a hotdog
Outputs:
[1010] outputs/000025.2182095108.png: a fluffy cat eating a hotdog
invoke> a smiling dog eating a hotdog -I 000025.2182095108.png -tm cat
```

### Escaping parentheses () and speech marks ""

If the model you are using has parentheses () or speech marks "" as part of its
syntax, you will need to "escape" these using a backslash, so that the prompt
parser treats them as literal characters.

Indeed, removing the word "hybrid" produces an image that is more like what we'd
expect.

In conclusion, prompt blending is great for exploring creative space,
but takes some trial and error to achieve the desired effect.

---

See the [Unified Canvas Guide](UNIFIED_CANVAS.md).

## Parting remarks

This concludes the walkthrough, but there are several more features that you can
explore. Please check out the [Command Line Interface](CLI.md) documentation for
further explanation of the advanced features that were not covered here.

The WebUI is under rapid development. Check back regularly for updates!

## Reference

### Additional Options

See below for additional documentation related to each feature:

- [Core Prompt Settings](./CLI.md)
- [Variations](./VARIATIONS.md)
- [Upscaling](./POSTPROCESS.md#upscaling)
- [Image to Image](./IMG2IMG.md)
- [Inpainting](./INPAINTING.md)
- [Other](./OTHER.md)

#### Invocation Gallery

---

### * [Prompt Engineering](PROMPTS.md)

Get the images you want with the InvokeAI prompt engineering language.

### * [Post-Processing](POSTPROCESS.md)

Restore mangled faces and make images larger with upscaling. Also see the [Embiggen Upscaling Guide](EMBIGGEN.md).

### * The [Concepts Library](CONCEPTS.md)

Add custom subjects and styles using HuggingFace's repository of embeddings.

### * [Image-to-Image Guide](IMG2IMG.md)

Use a seed image to build new creations in the CLI.

### * [Inpainting Guide for the CLI](INPAINTING.md)

Selectively erase and replace portions of an existing image in the CLI.

### * [Outpainting Guide for the CLI](OUTPAINTING.md)

Extend the borders of the image with an "outcrop" function within the CLI.

### * [Generating Variations](VARIATIONS.md)

Have an image you like and want to generate many more like it? Variations
are the ticket.

---

### Image Management

- [Image2Image](features/IMG2IMG.md)
- [Inpainting](features/INPAINTING.md)
- [Outpainting](features/OUTPAINTING.md)
- [Adding custom styles and subjects](features/CONCEPTS.md)
- [Upscaling and Face Reconstruction](features/POSTPROCESS.md)
- [Embiggen upscaling](features/EMBIGGEN.md)
- [Other Features](features/OTHER.md)

<!-- separator -->