refactor how postprocessors work

- similar call structures for outpainting, outcropping and face restoration modules
- added documentation for outcropping
- post-processing steps now leave a provenance chain (of sorts) in the sd-metadata field:

~~~
scripts/sd-metadata.py outputs/img-samples/curly.942491079.upscale.png
outputs/img-samples/curly.942491079.upscale.png:
 {
    "model": "stable diffusion",
    "model_id": "stable-diffusion-1.4",
    "model_hash": "fe4efff1e174c627256e44ec2991ba279b3816e364b49f9be2abc0b3ff3f8556",
    "app_id": "lstein/stable-diffusion",
    "app_version": "v1.15",
    "image": {
        "height": 512,
        "width": 512,
        "steps": 50,
        "cfg_scale": 7.5,
        "seed": 942491079,
        "prompt": [
            {
                "prompt": "pretty curly-haired redhead woman",
                "weight": 1.0
            }
        ],
        "postprocessing": [
            {
                "tool": "outcrop",
                "dream_command": "!fix \"test-pictures/curly.png\" -s 50 -S 942491079 -W 512 -H 512 -C 7.5 -A k_lms -c top 64 right 64"
            },
            {
                "tool": "gfpgan",
                "dream_command": "!fix \"outputs/img-samples/curly.942491079.outcrop-02.png\" -s 50 -S 942491079 -W 512 -H 512 -C 7.5 -A k_lms -G 0.8"
            },
            {
                "tool": "upscale",
                "dream_command": "!fix \"outputs/img-samples/curly.942491079.gfpgan.png\" -s 50 -S 942491079 -W 512 -H 512 -C 7.5 -A k_lms -U 4.0 0.75"
            }
        ],
        "sampler": "k_lms",
        "variations": [],
        "type": "txt2img"
    }
}
~~~
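
For scripting, the provenance chain can be read straight out of the PNG. Here is a minimal sketch, assuming the metadata lives in the image's `sd-metadata` text chunk (the same field that `scripts/sd-metadata.py` dumps above):

~~~
import json
from PIL import Image

def provenance(path):
    """Return the list of !fix commands recorded in a PNG (sketch)."""
    meta = json.loads(Image.open(path).text['sd-metadata'])
    return [step['dream_command']
            for step in meta['image'].get('postprocessing', [])]

for cmd in provenance('outputs/img-samples/curly.942491079.upscale.png'):
    print(cmd)
~~~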
Lincoln Stein, 2022-10-03 16:53:12 -04:00
commit 4c482fe24a (parent 609983ffa8)
13 changed files with 218 additions and 170 deletions


@@ -108,7 +108,7 @@ you can try starting `dream.py` with the `--precision=float32` flag:
 - [Image To Image](docs/features/IMG2IMG.md)
 - [Inpainting Support](docs/features/INPAINTING.md)
 - [Outpainting Support](docs/features/OUTPAINTING.md)
-- [GFPGAN and Real-ESRGAN Support](docs/features/UPSCALE.md)
+- [Upscaling, face-restoration and outpainting](docs/features/POSTPROCESS.md)
 - [Seamless Tiling](docs/features/OTHER.md#seamless-tiling)
 - [Google Colab](docs/features/OTHER.md#google-colab)
 - [Web Server](docs/features/WEB.md)

[Binary changes: three new example images for the outpainting docs added under docs/assets/outpainting/ (curly.png, curly-outcrop.png and curly-outpaint.png; 500, 422 and 428 KiB), and the two old elven_princess example images removed (572 and 538 KiB).]


@@ -4,75 +4,95 @@ title: Outpainting
 # :octicons-paintbrush-16: Outpainting

-## Continous outpainting
+## Outpainting and outcropping

-This extension uses the inpainting code to extend an existing image to
-any direction of "top", "right", "bottom" or "left". To use it you
-need to provide an initial image with -I and an extension direction
-with -D (direction). When extending using outpainting a higher img2img
-strength value of 0.83 is the default.
+Outpainting is a process by which the AI generates parts of the image
+that are outside its original frame. It can be used to fix up images
+in which the subject is off center, or when some detail (often the top
+of someone's head!) is cut off.

-The code is not foolproof. Sometimes it will do a good job extending
-the image, and other times it will generate ghost images and other
-artifacts. In addition, the code works best on images that were
-generated by dream.py, because it will be able to recover the original
-prompt that generated the file and "understand" what you are trying to
-achieve.
+InvokeAI supports two versions of outpainting, one called "outpaint"
+and the other "outcrop." They work slightly differently and each has
+its advantages and drawbacks.

-### Basic Usage
+### Outcrop

-To illustrate, consider this image generated with the prompt "fantasy
-portrait of eleven princess." It's nice, but rather annoying that the
-top of the head has been cropped off.
+The `outcrop` extension allows you to extend the image in 64-pixel
+increments in any dimension. You can apply the module to any image
+previously generated by InvokeAI. Note that it will **not** work with
+arbitrary photographs or Stable Diffusion images created by other
+implementations.

-![elven_princess](../assets/outpainting/elven_princess.png)
+Consider this image:

-We can fix that using the `!fix` command!
+![curly_woman](../assets/outpainting/curly.png)
+
+Pretty nice, but it's annoying that the top of her head is cut
+off. She's also a bit off center. Let's fix that!

 ~~~~
-dream> !fix my_images/elven_princess.png -D top 50
+dream> !fix images/curly.png --outcrop top 64 right 64
 ~~~~

-This is telling dream.py to open up a rectangle 50 pixels high at the
-top of the image and outpaint into it. The result is:
+This is saying to apply the `outcrop` extension by extending the top
+of the image by 64 pixels, and the right of the image by the same
+amount. You can use any combination of top|left|right|bottom, and
+specify any number of pixels to extend. You can also abbreviate
+`--outcrop` to `-c`.

-![elven_princess.fixed](../assets/outpainting/elven_princess.outpainted.png)
+The result looks like this:

-Viola! You can similarly specify `bottom`, `left` or `right` to
-outpaint into these margins.
+![curly_woman_outcrop](../assets/outpainting/curly-outcrop.png)

-There are some limitations to be aware of:
+The new image is actually slightly larger than the original (576x576,
+because 64 pixels were added to the top and right sides).

-1. You cannot change the size of the image rectangle. In the example,
-notice that the whole image is shifted downwards by 50 pixels, rather
-than the top being extended upwards.
+A number of caveats:

-2. Attempting to outpaint larger areas will frequently give rise to ugly
-ghosting effects.
+1. Although you can specify any pixel values, they will be rounded up
+to the nearest multiple of 64. Smaller values are better. Larger
+extensions are more likely to generate artifacts. However, if you wish
+you can run the `!fix` command repeatedly to cautiously expand the
+image.

-3. For best results, try increasing the step number.
+2. The extension is stochastic, meaning that each time you run it
+you'll get a slightly different result. You can run it repeatedly
+until you get an image you like. Unfortunately `!fix` does not
+currently respect the `-n` (`--iterations`) argument.

-4. If you don't specify a pixel value in -D, it will default to half
-of the whole image, which is likely not what you want.
+### Outpaint

-You can do more with `!fix` including upscaling and facial
-reconstruction of previously-generated images. See
-[./UPSCALE.md#fixing-previously-generated-images] for the details.
+The `outpaint` extension does the same thing, but with subtle
+differences. Starting with the same image, here is how we would add an
+additional 64 pixels to the top of the image:

-### Advanced Usage
+~~~
+dream> !fix images/curly.png --out_direction top 64
+~~~

-For more control over the outpaintihg process, you can provide the
-`-D` option at image generation time. This allows you to apply all the
-controls, including the ability to resize the image and apply face-fixing
-and upscaling. For example:
+(You can abbreviate `--out_direction` as `-D`.)

-~~~~
-dream> man with cat on shoulder -I./images/man.png -D bottom 100 -W960 -H960 -fit
-~~~~
+The result is shown here:

-Or even shorter, since the prompt is read from the metadata of the old image:
+![curly_woman_outpaint](../assets/outpainting/curly-outpaint.png)

-~~~~
-dream> -I./images/man.png -D bottom 100 -W960 -H960 -fit -U2 -G1
-~~~~
+Although the effect is similar, there are significant differences from
+outcropping:
+
+1. You can only specify one direction to extend at a time.
+
+2. The image is **not** resized. Instead, the image is shifted by the specified
+number of pixels. If you look carefully, you'll see that less of the lady's
+torso is visible in the image.
+
+3. Because the image dimensions remain the same, there's no rounding
+to multiples of 64.
+
+4. Attempting to outpaint larger areas will frequently give rise to ugly
+ghosting effects.
+
+5. For best results, try increasing the step number.
+
+6. If you don't specify a pixel value in -D, it will default to half
+of the whole image, which is likely not what you want.
+
+Neither `outpaint` nor `outcrop` is perfect, but we continue to tune
+and improve them. If one doesn't work, try the other. You may also
+wish to experiment with other `img2img` arguments, such as `-C`, `-f`
+and `-s`.
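
The rounding rule in outcrop caveat 1 above is plain ceiling arithmetic; a one-line sketch (the exact rounding code is not part of this diff):

~~~
import math

def round_up_to_64(requested: int) -> int:
    # e.g. 50 -> 64, 100 -> 128; image dimensions stay multiples of 64
    return 64 * math.ceil(requested / 64)
~~~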


@@ -1,14 +1,18 @@
 ---
-title: Upscale
+title: Postprocessing
 ---

 ## Intro

-The script provides the ability to restore faces and upscale. You can apply
-these operations at the time you generate the images, or at any time to a
-previously-generated PNG file, using the
-[!fix](#fixing-previously-generated-images) command.
+This extension provides the ability to restore faces and upscale
+images.
+
+Face restoration and upscaling can be applied at the time you generate
+the images, or at any later time against a previously-generated PNG
+file, using the [!fix](#fixing-previously-generated-images)
+command. [Outpainting and outcropping](OUTPAINTING.md) can only be
+applied after the fact.

 ## Face Fixing
@@ -158,9 +162,9 @@ situations when there is very little facial data to work with.
 ## Fixing Previously-Generated Images

 It is easy to apply face restoration and/or upscaling to any
-previously-generated file. Just use the syntax
-`!fix path/to/file.png <options>`. For example, to apply GFPGAN at strength 0.8
-and upscale 2X for a file named `./outputs/img-samples/000044.2945021133.png`,
+previously-generated file. Just use the syntax `!fix path/to/file.png
+<options>`. For example, to apply GFPGAN at strength 0.8 and upscale
+2X for a file named `./outputs/img-samples/000044.2945021133.png`,
 just run:

 ```


@@ -647,8 +647,8 @@ class Args(object):
             '--outcrop',
             nargs='+',
             type=str,
-            metavar=('direction:pixels'),
-            help='Outcrop the image "direction:pixels direction:pixels..." where direction is (top|left|bottom|right)'
+            metavar=('direction','pixels'),
+            help='Outcrop the image with one or more direction/pixel pairs: -c top 64 bottom 128 left 64 right 64',
         )
         postprocessing_group.add_argument(
             '-ft',
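
The new `nargs='+'` declaration collects the direction/pixel pairs as one flat list of strings, which the caller pairs up later. A standalone demo of that parsing behavior (toy parser, not the real `Args` wiring):

~~~
import argparse

p = argparse.ArgumentParser()
p.add_argument('-c', '--outcrop', nargs='+', type=str,
               metavar=('direction', 'pixels'))

print(p.parse_args('-c top 64 right 64'.split()).outcrop)
# ['top', '64', 'right', '64']
~~~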


@@ -21,7 +21,6 @@ class Img2Img(Generator):
         """
         self.perlin = perlin

-        print(f'DEBUG: init_image = {init_image}')
         sampler.make_schedule(
             ddim_num_steps=steps, ddim_eta=ddim_eta, verbose=False
         )


@@ -1,40 +1,42 @@
 import warnings
 import math
-from ldm.dream.conditioning import get_uc_and_c
 from PIL import Image, ImageFilter

-class Outcrop():
+class Outcrop(object):
     def __init__(
         self,
         image,
-        generator,  # current generator object
+        generate,   # current generate object
     ):
         self.image = image
-        self.generator = generator
+        self.generate = generate

-    def extend(
+    def process (
         self,
         extents:dict,
-        opt,
+        opt,       # current options
+        orig_opt,  # ones originally used to generate the image
         image_callback = None,
         prefix = None
     ):
+        # grow and mask the image
         extended_image = self._extend_all(extents)

         # switch samplers temporarily
-        curr_sampler = self.generator.sampler
-        self.generator.sampler_name = opt.sampler_name
-        self.generator._set_sampler()
+        curr_sampler = self.generate.sampler
+        self.generate.sampler_name = opt.sampler_name
+        self.generate._set_sampler()

         def wrapped_callback(img,seed,**kwargs):
-            image_callback(img,opt.seed,use_prefix=prefix,**kwargs)
+            image_callback(img,orig_opt.seed,use_prefix=prefix,**kwargs)

-        result= self.generator.prompt2image(
-            opt.prompt,
-            sampler = self.generator.sampler,
+        result= self.generate.prompt2image(
+            orig_opt.prompt,
+            # seed = orig_opt.seed,  # uncomment to make it deterministic
+            sampler = self.generate.sampler,
             steps = opt.steps,
             cfg_scale = opt.cfg_scale,
-            ddim_eta = self.generator.ddim_eta,
+            ddim_eta = self.generate.ddim_eta,
             width = extended_image.width,
             height = extended_image.height,
             init_img = extended_image,
@@ -43,7 +45,7 @@ class Outcrop():
         )

         # swap sampler back
-        self.generator.sampler = curr_sampler
+        self.generate.sampler = curr_sampler
         return result

     def _extend_all(
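
One thing to note about `process()` above: the sampler is swapped back only on the success path, so an exception inside `prompt2image` would leave the generator on the wrong sampler. A `try/finally` context manager would make the swap exception-safe (hypothetical helper, not part of this commit):

~~~
from contextlib import contextmanager

@contextmanager
def temporary_sampler(generate, sampler_name):
    """Switch samplers, restoring the original even if generation fails."""
    saved = generate.sampler
    generate.sampler_name = sampler_name
    generate._set_sampler()
    try:
        yield generate.sampler
    finally:
        generate.sampler = saved
~~~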


@@ -0,0 +1,94 @@
import warnings
import math
from PIL import Image, ImageFilter

class Outpaint(object):
    def __init__(self, image, generate):
        self.image = image
        self.generate = generate

    def process(self, opt, old_opt, image_callback = None, prefix = None):
        image = self._create_outpaint_image(self.image, opt.out_direction)

        seed = old_opt.seed
        prompt = old_opt.prompt

        print(f'DEBUG: old seed={seed}, old prompt = {prompt}')

        def wrapped_callback(img,seed,**kwargs):
            image_callback(img,seed,use_prefix=prefix,**kwargs)

        return self.generate.prompt2image(
            prompt,
            seed           = seed,
            sampler        = self.generate.sampler,
            steps          = opt.steps,
            cfg_scale      = opt.cfg_scale,
            ddim_eta       = self.generate.ddim_eta,
            width          = opt.width,
            height         = opt.height,
            init_img       = image,
            strength       = 0.83,
            image_callback = wrapped_callback,
            prefix         = prefix,
        )

    def _create_outpaint_image(self, image, direction_args):
        assert len(direction_args) in [1, 2], 'Direction (-D) must have exactly one or two arguments.'

        if len(direction_args) == 1:
            direction = direction_args[0]
            pixels = None
        elif len(direction_args) == 2:
            direction = direction_args[0]
            pixels = int(direction_args[1])

        assert direction in ['top', 'left', 'bottom', 'right'], 'Direction (-D) must be one of "top", "left", "bottom", "right"'

        image = image.convert("RGBA")
        # we always extend top, but rotate to extend along the requested side
        if direction == 'left':
            image = image.transpose(Image.Transpose.ROTATE_270)
        elif direction == 'bottom':
            image = image.transpose(Image.Transpose.ROTATE_180)
        elif direction == 'right':
            image = image.transpose(Image.Transpose.ROTATE_90)

        pixels = image.height//2 if pixels is None else int(pixels)
        assert 0 < pixels < image.height, 'Direction (-D) pixels length must be in the range 0 - image.size'

        # the top part of the image is taken from the source image mirrored
        # coordinates (0,0) are the upper left corner of an image
        top = image.transpose(Image.Transpose.FLIP_TOP_BOTTOM).convert("RGBA")
        top = top.crop((0, top.height - pixels, top.width, top.height))

        # setting all alpha of the top part to 0
        alpha = top.getchannel("A")
        alpha.paste(0, (0, 0, top.width, top.height))
        top.putalpha(alpha)

        # taking the bottom from the original image
        bottom = image.crop((0, 0, image.width, image.height - pixels))

        new_img = image.copy()
        new_img.paste(top, (0, 0))
        new_img.paste(bottom, (0, pixels))

        # create a 10% dither in the middle
        dither = min(image.height//10, pixels)
        for x in range(0, image.width, 2):
            for y in range(pixels - dither, pixels + dither):
                (r, g, b, a) = new_img.getpixel((x, y))
                new_img.putpixel((x, y), (r, g, b, 0))

        # let's rotate back again
        if direction == 'left':
            new_img = new_img.transpose(Image.Transpose.ROTATE_90)
        elif direction == 'bottom':
            new_img = new_img.transpose(Image.Transpose.ROTATE_180)
        elif direction == 'right':
            new_img = new_img.transpose(Image.Transpose.ROTATE_270)

        return new_img
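
A quick way to sanity-check the geometry above, run from inside the repo (the `generate` argument is not used by `_create_outpaint_image`, so `None` is enough for this demo):

~~~
from PIL import Image
from ldm.dream.restoration.outpaint import Outpaint

src = Image.new('RGB', (512, 512), 'gray')
out = Outpaint(src, None)._create_outpaint_image(src, ['top', '64'])

print(out.size)                 # (512, 512) -- the canvas is NOT enlarged
print(out.getpixel((0, 0))[3])  # 0 -- the new top strip is fully transparent
~~~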


@@ -281,7 +281,6 @@ class Generate:
         # these are specific to embiggen (which also relies on img2img args)
         embiggen = None,
         embiggen_tiles = None,
-        out_direction = None,
         # these are specific to GFPGAN/ESRGAN
         facetool = None,
         gfpgan_strength = 0,
@@ -405,7 +404,6 @@
             width,
             height,
             fit=fit,
-            out_direction=out_direction,
         )
         if (init_image is not None) and (mask_image is not None):
             generator = self._make_inpaint()
@@ -519,9 +517,9 @@
             seed = 42

         # try to reuse the same filename prefix as the original file.
-        # note that this is hacky
+        # we take everything up to the first period
         prefix = None
-        m = re.search('(\d+)\.',os.path.basename(image_path))
+        m = re.match('^([^.]+)\.',os.path.basename(image_path))
         if m:
             prefix = m.groups()[0]
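
A worked example of the new prefix rule, using the filename from the metadata example above:

~~~
import os, re

name = os.path.basename('outputs/img-samples/curly.942491079.upscale.png')
print(re.match(r'^([^.]+)\.', name).group(1))  # 'curly', everything before the first '.'
~~~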
@@ -559,13 +557,12 @@
             extend_instructions = {}
             for direction,pixels in _pairwise(opt.outcrop):
                 extend_instructions[direction]=int(pixels)
-            generator = Outcrop(
-                image,
-                self,
-            )
-            return generator.extend(
+            restorer = Outcrop(image,self,)
+            return restorer.process (
                 extend_instructions,
-                args,
+                opt = opt,
+                orig_opt = args,
                 image_callback = callback,
                 prefix = prefix,
             )
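
`_pairwise` itself is not shown in this diff; a plausible implementation matching the usage above (sketch):

~~~
def _pairwise(iterable):
    # ['top', '64', 'right', '64'] -> ('top', '64'), ('right', '64')
    it = iter(iterable)
    return zip(it, it)

print({d: int(p) for d, p in _pairwise(['top', '64', 'right', '64'])})
# {'top': 64, 'right': 64}
~~~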
@@ -593,24 +590,15 @@
                 image_callback = callback,
             )
         elif tool == 'outpaint':
-            oldargs = metadata_from_png(image_path)
-            opt.strength = 0.83
-            opt.init_img = image_path
-            return self.prompt2image(
-                oldargs.prompt,
-                out_direction = opt.out_direction,
-                sampler = self.sampler,
-                steps = opt.steps,
-                cfg_scale = opt.cfg_scale,
-                ddim_eta = self.ddim_eta,
-                conditioning= (uc,c),
-                width = opt.width,
-                height = opt.height,
-                init_img = image_path, # not the Image! (sigh)
-                strength = opt.strength,
+            from ldm.dream.restoration.outpaint import Outpaint
+            restorer = Outpaint(image,self)
+            return restorer.process(
+                opt,
+                args,
                 image_callback = callback,
-                prefix = prefix,
+                prefix = prefix
             )
         elif tool is None:
             print(f'* please provide at least one postprocessing option, such as -G or -U')
             return None
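
This dispatch is the heart of the refactor: `Outcrop`, `Outpaint` and the face-restoration/upscale tools are now all constructed and invoked the same way. Distilled from the diff, each restorer roughly fits this shape (an illustrative protocol, not a class that exists in the code):

~~~
class RestorerSketch:
    def __init__(self, image, generate):
        self.image = image        # PIL image to be fixed
        self.generate = generate  # the Generate instance that does the sampling

    def process(self, opt, orig_opt, image_callback=None, prefix=None):
        # opt:      options for this !fix invocation (steps, cfg_scale, ...)
        # orig_opt: options recovered from the original image's metadata
        raise NotImplementedError
~~~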
@@ -626,7 +614,6 @@
             width,
             height,
             fit=False,
-            out_direction=None,
         ):
         init_image = None
         init_mask = None
@@ -637,7 +624,9 @@
                 img,
                 width,
                 height,
             ) # this returns an Image
+            init_image = self._create_init_image(image) # this returns a torch tensor
+
             # if image has a transparent area and no mask was provided, then try to generate mask
             if self._has_transparency(image) and not mask:
@@ -909,67 +898,7 @@
         image = 2.0 * image - 1.0
         return image.to(self.device)

-    # TODO: outpainting is a post-processing application and should be made to behave
-    # like the other ones.
-    def _create_outpaint_image(self, image, direction_args):
-        assert len(direction_args) in [1, 2], 'Direction (-D) must have exactly one or two arguments.'
-        if len(direction_args) == 1:
-            direction = direction_args[0]
-            pixels = None
-        elif len(direction_args) == 2:
-            direction = direction_args[0]
-            pixels = int(direction_args[1])
-        assert direction in ['top', 'left', 'bottom', 'right'], 'Direction (-D) must be one of "top", "left", "bottom", "right"'
-        image = image.convert("RGBA")
-        # we always extend top, but rotate to extend along the requested side
-        if direction == 'left':
-            image = image.transpose(Image.Transpose.ROTATE_270)
-        elif direction == 'bottom':
-            image = image.transpose(Image.Transpose.ROTATE_180)
-        elif direction == 'right':
-            image = image.transpose(Image.Transpose.ROTATE_90)
-        pixels = image.height//2 if pixels is None else int(pixels)
-        assert 0 < pixels < image.height, 'Direction (-D) pixels length must be in the range 0 - image.size'
-        # the top part of the image is taken from the source image mirrored
-        # coordinates (0,0) are the upper left corner of an image
-        top = image.transpose(Image.Transpose.FLIP_TOP_BOTTOM).convert("RGBA")
-        top = top.crop((0, top.height - pixels, top.width, top.height))
-        # setting all alpha of the top part to 0
-        alpha = top.getchannel("A")
-        alpha.paste(0, (0, 0, top.width, top.height))
-        top.putalpha(alpha)
-        # taking the bottom from the original image
-        bottom = image.crop((0, 0, image.width, image.height - pixels))
-        new_img = image.copy()
-        new_img.paste(top, (0, 0))
-        new_img.paste(bottom, (0, pixels))
-        # create a 10% dither in the middle
-        dither = min(image.height//10, pixels)
-        for x in range(0, image.width, 2):
-            for y in range(pixels - dither, pixels + dither):
-                (r, g, b, a) = new_img.getpixel((x, y))
-                new_img.putpixel((x, y), (r, g, b, 0))
-        # let's rotate back again
-        if direction == 'left':
-            new_img = new_img.transpose(Image.Transpose.ROTATE_90)
-        elif direction == 'bottom':
-            new_img = new_img.transpose(Image.Transpose.ROTATE_180)
-        elif direction == 'right':
-            new_img = new_img.transpose(Image.Transpose.ROTATE_270)
-        return new_img
-
-    def _create_init_mask(self, image, width, height, fit=True):
+    def _create_init_mask(self, image):
         # convert into a black/white mask
         image = self._image_to_mask(image)
         image = image.convert('RGB')