diff --git a/README.md b/README.md
index 1525e3f51e..ffd2e0c542 100644
--- a/README.md
+++ b/README.md
@@ -108,7 +108,7 @@ you can try starting `dream.py` with the `--precision=float32` flag:
 - [Image To Image](docs/features/IMG2IMG.md)
 - [Inpainting Support](docs/features/INPAINTING.md)
 - [Outpainting Support](docs/features/OUTPAINTING.md)
-- [GFPGAN and Real-ESRGAN Support](docs/features/UPSCALE.md)
+- [Upscaling, face-restoration and outpainting](docs/features/POSTPROCESS.md)
 - [Seamless Tiling](docs/features/OTHER.md#seamless-tiling)
 - [Google Colab](docs/features/OTHER.md#google-colab)
 - [Web Server](docs/features/WEB.md)
diff --git a/docs/assets/outpainting/curly-outcrop.png b/docs/assets/outpainting/curly-outcrop.png
new file mode 100644
index 0000000000..ae8d8dacd3
Binary files /dev/null and b/docs/assets/outpainting/curly-outcrop.png differ
diff --git a/docs/assets/outpainting/curly-outpaint.png b/docs/assets/outpainting/curly-outpaint.png
new file mode 100644
index 0000000000..9f4a2ee431
Binary files /dev/null and b/docs/assets/outpainting/curly-outpaint.png differ
diff --git a/docs/assets/outpainting/curly.png b/docs/assets/outpainting/curly.png
new file mode 100644
index 0000000000..d9a4cb257e
Binary files /dev/null and b/docs/assets/outpainting/curly.png differ
diff --git a/docs/assets/outpainting/elven_princess.outpainted.png b/docs/assets/outpainting/elven_princess.outpainted.png
deleted file mode 100644
index 98f98564df..0000000000
Binary files a/docs/assets/outpainting/elven_princess.outpainted.png and /dev/null differ
diff --git a/docs/assets/outpainting/elven_princess.png b/docs/assets/outpainting/elven_princess.png
deleted file mode 100644
index aa5f00ccf7..0000000000
Binary files a/docs/assets/outpainting/elven_princess.png and /dev/null differ
diff --git a/docs/features/OUTPAINTING.md b/docs/features/OUTPAINTING.md
index 9f72a5cb3c..952bbc97fc 100644
--- a/docs/features/OUTPAINTING.md
+++ b/docs/features/OUTPAINTING.md
@@ -4,75 +4,95 @@ title: Outpainting

 # :octicons-paintbrush-16: Outpainting

-## Continous outpainting
+## Outpainting and outcropping

-This extension uses the inpainting code to extend an existing image to
-any direction of "top", "right", "bottom" or "left". To use it you
-need to provide an initial image with -I and an extension direction
-with -D (direction). When extending using outpainting a higher img2img
-strength value of 0.83 is the default.
+Outpainting is a process by which the AI generates parts of the image
+that are outside its original frame. It can be used to fix up images
+in which the subject is off-center, or when some detail (often the top
+of someone's head!) is cut off.

-The code is not foolproof. Sometimes it will do a good job extending
-the image, and other times it will generate ghost images and other
-artifacts. In addition, the code works best on images that were
-generated by dream.py, because it will be able to recover the original
-prompt that generated the file and "understand" what you are trying to
-achieve.
+InvokeAI supports two versions of outpainting, one called "outpaint"
+and the other "outcrop." They work slightly differently and each has
+its advantages and drawbacks.

-### Basic Usage
+### Outcrop

-To illustrate, consider this image generated with the prompt "fantasy
-portrait of eleven princess." It's nice, but rather annoying that the
-top of the head has been cropped off.
+The `outcrop` extension allows you to extend the image in 64 pixel
+increments in any dimension. You can apply the module to any image
+previously generated by InvokeAI. Note that it will **not** work with
+arbitrary photographs or Stable Diffusion images created by other
+implementations.

-![elven_princess](../assets/outpainting/elven_princess.png)
+Consider this image:

-We can fix that using the `!fix` command!
+![curly_woman](../assets/outpainting/curly.png)
+
+Pretty nice, but it's annoying that the top of her head is cut
+off. She's also a bit off-center. Let's fix that!

 ~~~~
-dream> !fix my_images/elven_princess.png -D top 50
+dream> !fix images/curly.png --outcrop top 64 right 64
 ~~~~

-This is telling dream.py to open up a rectangle 50 pixels high at the
-top of the image and outpaint into it. The result is:
+This tells dream.py to apply the `outcrop` extension, extending the
+top of the image by 64 pixels and the right of the image by the same
+amount. You can use any combination of `top`, `left`, `right` and
+`bottom`, and specify any number of pixels to extend. You can also
+abbreviate `--outcrop` to `-c`.

-![elven_princess.fixed](../assets/outpainting/elven_princess.outpainted.png)
+The result looks like this:

-Viola! You can similarly specify `bottom`, `left` or `right` to
-outpaint into these margins.
+![curly_woman_outcrop](../assets/outpainting/curly-outcrop.png)

-There are some limitations to be aware of:
+The new image is actually slightly larger than the original (576x576,
+because 64 pixels were added to the top and right sides).

-1. You cannot change the size of the image rectangle. In the example,
-   notice that the whole image is shifted downwards by 50 pixels, rather
-   than the top being extended upwards.
+A number of caveats:

-2. Attempting to outpaint larger areas will frequently give rise to ugly
+1. Although you can specify any pixel values, they will be rounded up
+to the nearest multiple of 64. Smaller values are better. Larger
+extensions are more likely to generate artefacts. However, if you wish
+you can run the `!fix` command repeatedly to cautiously expand the
+image.
+
+2. The extension is stochastic, meaning that each time you run it
+you'll get a slightly different result. You can run it repeatedly
+until you get an image you like. Unfortunately `!fix` does not
+currently respect the `-n` (`--iterations`) argument.
+
+### Outpaint
+
+The `outpaint` extension does the same thing, but with subtle
+differences. Starting with the same image, here is how we would add an
+additional 64 pixels to the top of the image:
+
+~~~~
+dream> !fix images/curly.png --out_direction top 64
+~~~~
+
+(You can abbreviate `--out_direction` to `-D`.)
+
+The result is shown here:
+
+![curly_woman_outpaint](../assets/outpainting/curly-outpaint.png)
+
+Although the effect is similar, there are significant differences from
+outcropping:
+
+1. You can only specify one direction to extend at a time.
+2. The image is **not** resized. Instead, the image is shifted by the specified
+number of pixels. If you look carefully, you'll see that less of the lady's
+torso is visible in the image.
+3. Because the image dimensions remain the same, there's no rounding
+to multiples of 64.
+4. Attempting to outpaint larger areas will frequently give rise to ugly
 ghosting effects.
-
-3. For best results, try increasing the step number.
-
-4. If you don't specify a pixel value in -D, it will default to half
+5. For best results, try increasing the step number.
+6. If you don't specify a pixel value in `-D`, it will default to half
 of the whole image, which is likely not what you want.
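+
+For example, the following command (reusing the image from the example
+above; substitute one of your own) extends the bottom by an explicit
+128 pixels instead of the half-image default:
+
+~~~~
+dream> !fix images/curly.png --out_direction bottom 128
+~~~~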
-You can do more with `!fix` including upscaling and facial
-reconstruction of previously-generated images. See
-[./UPSCALE.md#fixing-previously-generated-images] for the details.
-
-### Advanced Usage
-
-For more control over the outpaintihg process, you can provide the
-`-D` option at image generation time. This allows you to apply all the
-controls, including the ability to resize the image and apply face-fixing
-and upscaling. For example:
-
-~~~~
-dream> man with cat on shoulder -I./images/man.png -D bottom 100 -W960 -H960 -fit
-~~~~
-
-Or even shorter, since the prompt is read from the metadata of the old image:
-
-~~~~
-dream> -I./images/man.png -D bottom 100 -W960 -H960 -fit -U2 -G1
-~~~~
+Neither `outpaint` nor `outcrop` is perfect, but we continue to tune
+and improve them. If one doesn't work, try the other. You may also
+wish to experiment with other `img2img` arguments, such as `-C`, `-f`
+and `-s`.
diff --git a/docs/features/UPSCALE.md b/docs/features/POSTPROCESS.md
similarity index 91%
rename from docs/features/UPSCALE.md
rename to docs/features/POSTPROCESS.md
index 10f7c375d7..cd4fd7e9e6 100644
--- a/docs/features/UPSCALE.md
+++ b/docs/features/POSTPROCESS.md
@@ -1,14 +1,18 @@
 ---
-title: Upscale
+title: Postprocessing
 ---

 ## Intro

-The script provides the ability to restore faces and upscale. You can apply
-these operations at the time you generate the images, or at any time to a
-previously-generated PNG file, using the
-[!fix](#fixing-previously-generated-images) command.
+This extension provides the ability to restore faces and upscale
+images.
+
+Face restoration and upscaling can be applied at the time you generate
+the images, or at any later time against a previously-generated PNG
+file, using the [!fix](#fixing-previously-generated-images)
+command. [Outpainting and outcropping](OUTPAINTING.md) can only be
+applied after the fact.

 ## Face Fixing

@@ -158,9 +162,9 @@ situations when there is very little facial data to work with.
 ## Fixing Previously-Generated Images

 It is easy to apply face restoration and/or upscaling to any
-previously-generated file. Just use the syntax
-`!fix path/to/file.png <options>`. For example, to apply GFPGAN at strength 0.8
-and upscale 2X for a file named `./outputs/img-samples/000044.2945021133.png`,
+previously-generated file. Just use the syntax `!fix path/to/file.png
+<options>`. For example, to apply GFPGAN at strength 0.8 and upscale
+2X for a file named `./outputs/img-samples/000044.2945021133.png`,
 just run:

 ```
diff --git a/ldm/dream/args.py b/ldm/dream/args.py
index 8fdbbf7c5a..c4485d7c23 100644
--- a/ldm/dream/args.py
+++ b/ldm/dream/args.py
@@ -647,8 +647,8 @@ class Args(object):
         '--outcrop',
         nargs='+',
         type=str,
-        metavar=('direction:pixels'),
-        help='Outcrop the image "direction:pixels direction:pixels..." where direction is (top|left|bottom|right)'
+        metavar=('direction','pixels'),
+        help='Outcrop the image with one or more direction/pixel pairs: -c top 64 bottom 128 left 64 right 64',
     )
     postprocessing_group.add_argument(
         '-ft',
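Not part of the patch: the flat argument list collected by `--outcrop`
is later grouped into direction/pixel pairs by a private `_pairwise`
helper in `ldm/generate.py` (see its use further below). A minimal
sketch of that grouping, with `pairwise` as a hypothetical stand-in:

```python
# Illustrative sketch only: `pairwise` is a stand-in for the private
# `_pairwise` helper that ldm/generate.py calls on opt.outcrop.
def pairwise(values):
    """Group a flat list like ['top', '64', 'right', '64'] into
    consecutive (direction, pixels) pairs."""
    it = iter(values)
    return zip(it, it)

extents = {direction: int(pixels)
           for direction, pixels in pairwise(['top', '64', 'right', '64'])}
assert extents == {'top': 64, 'right': 64}
```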
diff --git a/ldm/dream/generator/img2img.py b/ldm/dream/generator/img2img.py
index d8d797cd28..09750b3748 100644
--- a/ldm/dream/generator/img2img.py
+++ b/ldm/dream/generator/img2img.py
@@ -21,7 +21,6 @@ class Img2Img(Generator):
         """
         self.perlin = perlin

-        print(f'DEBUG: init_image = {init_image}')
         sampler.make_schedule(
             ddim_num_steps=steps, ddim_eta=ddim_eta, verbose=False
         )
diff --git a/ldm/dream/restoration/outcrop.py b/ldm/dream/restoration/outcrop.py
index ace36d6120..017d9de7e1 100644
--- a/ldm/dream/restoration/outcrop.py
+++ b/ldm/dream/restoration/outcrop.py
@@ -1,40 +1,42 @@
 import warnings
 import math
-from ldm.dream.conditioning import get_uc_and_c
 from PIL import Image, ImageFilter

-class Outcrop():
+class Outcrop(object):
     def __init__(
             self,
             image,
-            generator, # current generator object
+            generate,  # current generate object
     ):
         self.image = image
-        self.generator = generator
+        self.generate = generate

-    def extend(
+    def process (
             self,
             extents:dict,
-            opt,
+            opt,       # current options
+            orig_opt,  # ones originally used to generate the image
             image_callback = None,
             prefix = None
     ):
+        # grow and mask the image
         extended_image = self._extend_all(extents)

         # switch samplers temporarily
-        curr_sampler = self.generator.sampler
-        self.generator.sampler_name = opt.sampler_name
-        self.generator._set_sampler()
+        curr_sampler = self.generate.sampler
+        self.generate.sampler_name = opt.sampler_name
+        self.generate._set_sampler()

         def wrapped_callback(img,seed,**kwargs):
-            image_callback(img,opt.seed,use_prefix=prefix,**kwargs)
+            image_callback(img,orig_opt.seed,use_prefix=prefix,**kwargs)

-        result= self.generator.prompt2image(
-            opt.prompt,
-            sampler = self.generator.sampler,
+        result= self.generate.prompt2image(
+            orig_opt.prompt,
+#            seed    = orig_opt.seed, # uncomment to make it deterministic
+            sampler = self.generate.sampler,
             steps = opt.steps,
             cfg_scale = opt.cfg_scale,
-            ddim_eta = self.generator.ddim_eta,
+            ddim_eta = self.generate.ddim_eta,
             width = extended_image.width,
             height = extended_image.height,
             init_img = extended_image,
@@ -43,7 +45,7 @@ class Outcrop():
         )

         # swap sampler back
-        self.generator.sampler = curr_sampler
+        self.generate.sampler = curr_sampler
         return result

     def _extend_all(
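The sampler save/swap/restore around `prompt2image()` above is done
inline. Purely as an illustration (not part of the patch), the same
pattern can be packaged as a context manager so the original sampler
comes back even if generation raises; this hypothetical helper assumes
only the `sampler`/`sampler_name`/`_set_sampler()` interface the patch
itself uses:

```python
from contextlib import contextmanager

@contextmanager
def temporary_sampler(generate, sampler_name):
    # Save the current sampler, switch to the requested one, and
    # guarantee the original is restored afterwards.
    saved = generate.sampler
    generate.sampler_name = sampler_name
    generate._set_sampler()
    try:
        yield generate
    finally:
        generate.sampler = saved
```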
diff --git a/ldm/dream/restoration/outpaint.py b/ldm/dream/restoration/outpaint.py
new file mode 100644
index 0000000000..525e158779
--- /dev/null
+++ b/ldm/dream/restoration/outpaint.py
@@ -0,0 +1,94 @@
+import warnings
+import math
+from PIL import Image, ImageFilter
+
+class Outpaint(object):
+    def __init__(self, image, generate):
+        self.image = image
+        self.generate = generate
+
+    def process(self, opt, old_opt, image_callback = None, prefix = None):
+        image = self._create_outpaint_image(self.image, opt.out_direction)
+
+        seed = old_opt.seed
+        prompt = old_opt.prompt
+
+        print(f'DEBUG: old seed={seed}, old prompt = {prompt}')
+
+        def wrapped_callback(img,seed,**kwargs):
+            image_callback(img,seed,use_prefix=prefix,**kwargs)
+
+
+        return self.generate.prompt2image(
+            prompt,
+            seed           = seed,
+            sampler        = self.generate.sampler,
+            steps          = opt.steps,
+            cfg_scale      = opt.cfg_scale,
+            ddim_eta       = self.generate.ddim_eta,
+            width          = opt.width,
+            height         = opt.height,
+            init_img       = image,
+            strength       = 0.83,
+            image_callback = wrapped_callback,
+            prefix         = prefix,
+        )
+
+    def _create_outpaint_image(self, image, direction_args):
+        assert len(direction_args) in [1, 2], 'Direction (-D) must have exactly one or two arguments.'
+
+        if len(direction_args) == 1:
+            direction = direction_args[0]
+            pixels = None
+        elif len(direction_args) == 2:
+            direction = direction_args[0]
+            pixels = int(direction_args[1])
+
+        assert direction in ['top', 'left', 'bottom', 'right'], 'Direction (-D) must be one of "top", "left", "bottom", "right"'
+
+        image = image.convert("RGBA")
+        # we always extend top, but rotate to extend along the requested side
+        if direction == 'left':
+            image = image.transpose(Image.Transpose.ROTATE_270)
+        elif direction == 'bottom':
+            image = image.transpose(Image.Transpose.ROTATE_180)
+        elif direction == 'right':
+            image = image.transpose(Image.Transpose.ROTATE_90)
+
+        pixels = image.height//2 if pixels is None else int(pixels)
+        assert 0 < pixels < image.height, 'Direction (-D) pixels length must be in the range 0 - image.size'
+
+        # the top part of the image is taken from the source image mirrored
+        # coordinates (0,0) are the upper left corner of an image
+        top = image.transpose(Image.Transpose.FLIP_TOP_BOTTOM).convert("RGBA")
+        top = top.crop((0, top.height - pixels, top.width, top.height))
+
+        # setting all alpha of the top part to 0
+        alpha = top.getchannel("A")
+        alpha.paste(0, (0, 0, top.width, top.height))
+        top.putalpha(alpha)
+
+        # taking the bottom from the original image
+        bottom = image.crop((0, 0, image.width, image.height - pixels))
+
+        new_img = image.copy()
+        new_img.paste(top, (0, 0))
+        new_img.paste(bottom, (0, pixels))
+
+        # create a 10% dither in the middle
+        dither = min(image.height//10, pixels)
+        for x in range(0, image.width, 2):
+            for y in range(pixels - dither, pixels + dither):
+                (r, g, b, a) = new_img.getpixel((x, y))
+                new_img.putpixel((x, y), (r, g, b, 0))
+
+        # let's rotate back again
+        if direction == 'left':
+            new_img = new_img.transpose(Image.Transpose.ROTATE_90)
+        elif direction == 'bottom':
+            new_img = new_img.transpose(Image.Transpose.ROTATE_180)
+        elif direction == 'right':
+            new_img = new_img.transpose(Image.Transpose.ROTATE_270)
+
+        return new_img
+
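To see the core move of `_create_outpaint_image` in isolation (mirror
a strip of the source into the new region, then zero its alpha so the
sampler treats it as area to fill), here is a small self-contained
sketch. It is illustrative only, not from the patch, and handles just
the "top" case without the rotation and dithering steps:

```python
from PIL import Image

src = Image.new("RGBA", (128, 128), (200, 180, 160, 255))  # stand-in image
pixels = 32                                                # strip to add

# mirror the source vertically and keep a strip as the new top region
strip = src.transpose(Image.Transpose.FLIP_TOP_BOTTOM)
strip = strip.crop((0, strip.height - pixels, strip.width, strip.height))
strip.putalpha(0)  # fully transparent: marks the region to be generated

# shift the original down and lay the transparent strip on top
out = src.copy()
out.paste(strip, (0, 0))
out.paste(src.crop((0, 0, src.width, src.height - pixels)), (0, pixels))
print(out.size)  # unchanged -- outpaint shifts rather than resizes
```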
diff --git a/ldm/generate.py b/ldm/generate.py
index e093fa8bf7..a215b8057c 100644
--- a/ldm/generate.py
+++ b/ldm/generate.py
@@ -281,7 +281,6 @@ class Generate:
         # these are specific to embiggen (which also relies on img2img args)
         embiggen = None,
         embiggen_tiles = None,
-        out_direction = None,
         # these are specific to GFPGAN/ESRGAN
         facetool = None,
         gfpgan_strength = 0,
@@ -405,7 +404,6 @@ class Generate:
                 width,
                 height,
                 fit=fit,
-                out_direction=out_direction,
             )
             if (init_image is not None) and (mask_image is not None):
                 generator = self._make_inpaint()
@@ -519,9 +517,9 @@ class Generate:
             seed = 42

         # try to reuse the same filename prefix as the original file.
-        # note that this is hacky
+        # we take everything up to the first period
         prefix = None
-        m = re.search('(\d+)\.',os.path.basename(image_path))
+        m = re.match('^([^.]+)\.',os.path.basename(image_path))
         if m:
             prefix = m.groups()[0]
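The effect of this regex change, shown standalone (an illustrative
snippet, not part of the patch): the old pattern only recovered a
prefix from filenames with a run of digits before a period, while the
new one takes everything up to the first period:

```python
import re

for name in ('000044.2945021133.png', 'curly.png'):
    old = re.search(r'(\d+)\.', name)    # first run of digits before a '.'
    new = re.match(r'^([^.]+)\.', name)  # everything up to the first '.'
    print(old.group(1) if old else None, new.group(1) if new else None)

# prints:
#   000044 000044   -- both work for dream.py-style numeric names
#   None curly      -- only the new pattern recovers a prefix here
```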
@@ -559,13 +557,12 @@ class Generate:
         extend_instructions = {}
         for direction,pixels in _pairwise(opt.outcrop):
             extend_instructions[direction]=int(pixels)
-        generator = Outcrop(
-            image,
-            self,
-        )
-        return generator.extend(
+
+        restorer = Outcrop(image,self,)
+        return restorer.process (
             extend_instructions,
-            args,
+            opt = opt,
+            orig_opt = args,
             image_callback = callback,
             prefix = prefix,
         )
@@ -593,24 +590,15 @@ class Generate:
                 image_callback = callback,
             )
         elif tool == 'outpaint':
-            oldargs = metadata_from_png(image_path)
-            opt.strength = 0.83
-            opt.init_img = image_path
-            return self.prompt2image(
-                oldargs.prompt,
-                out_direction = opt.out_direction,
-                sampler = self.sampler,
-                steps = opt.steps,
-                cfg_scale = opt.cfg_scale,
-                ddim_eta = self.ddim_eta,
-                conditioning= (uc,c),
-                width = opt.width,
-                height = opt.height,
-                init_img = image_path, # not the Image! (sigh)
-                strength = opt.strength,
+            from ldm.dream.restoration.outpaint import Outpaint
+            restorer = Outpaint(image,self)
+            return restorer.process(
+                opt,
+                args,
                 image_callback = callback,
-                prefix = prefix,
-            )
+                prefix = prefix
+            )
+
         elif tool is None:
             print(f'* please provide at least one postprocessing option, such as -G or -U')
             return None
@@ -626,7 +614,6 @@ class Generate:
             width,
             height,
             fit=False,
-            out_direction=None,
     ):
         init_image = None
         init_mask = None
@@ -637,7 +624,9 @@ class Generate:
                 img,
                 width,
                 height,
-            ) # this returns an Image
+            )  # this returns an Image
+
+            init_image = self._create_init_image(image) # this returns a torch tensor

         # if image has a transparent area and no mask was provided, then try to generate mask
@@ -909,67 +898,7 @@ class Generate:
         image = 2.0 * image - 1.0
         return image.to(self.device)

-    # TODO: outpainting is a post-processing application and should be made to behave
-    # like the other ones.
-    def _create_outpaint_image(self, image, direction_args):
-        assert len(direction_args) in [1, 2], 'Direction (-D) must have exactly one or two arguments.'
-
-        if len(direction_args) == 1:
-            direction = direction_args[0]
-            pixels = None
-        elif len(direction_args) == 2:
-            direction = direction_args[0]
-            pixels = int(direction_args[1])
-
-        assert direction in ['top', 'left', 'bottom', 'right'], 'Direction (-D) must be one of "top", "left", "bottom", "right"'
-
-        image = image.convert("RGBA")
-        # we always extend top, but rotate to extend along the requested side
-        if direction == 'left':
-            image = image.transpose(Image.Transpose.ROTATE_270)
-        elif direction == 'bottom':
-            image = image.transpose(Image.Transpose.ROTATE_180)
-        elif direction == 'right':
-            image = image.transpose(Image.Transpose.ROTATE_90)
-
-        pixels = image.height//2 if pixels is None else int(pixels)
-        assert 0 < pixels < image.height, 'Direction (-D) pixels length must be in the range 0 - image.size'
-
-        # the top part of the image is taken from the source image mirrored
-        # coordinates (0,0) are the upper left corner of an image
-        top = image.transpose(Image.Transpose.FLIP_TOP_BOTTOM).convert("RGBA")
-        top = top.crop((0, top.height - pixels, top.width, top.height))
-
-        # setting all alpha of the top part to 0
-        alpha = top.getchannel("A")
-        alpha.paste(0, (0, 0, top.width, top.height))
-        top.putalpha(alpha)
-
-        # taking the bottom from the original image
-        bottom = image.crop((0, 0, image.width, image.height - pixels))
-
-        new_img = image.copy()
-        new_img.paste(top, (0, 0))
-        new_img.paste(bottom, (0, pixels))
-
-        # create a 10% dither in the middle
-        dither = min(image.height//10, pixels)
-        for x in range(0, image.width, 2):
-            for y in range(pixels - dither, pixels + dither):
-                (r, g, b, a) = new_img.getpixel((x, y))
-                new_img.putpixel((x, y), (r, g, b, 0))
-
-        # let's rotate back again
-        if direction == 'left':
-            new_img = new_img.transpose(Image.Transpose.ROTATE_90)
-        elif direction == 'bottom':
-            new_img = new_img.transpose(Image.Transpose.ROTATE_180)
-        elif direction == 'right':
-            new_img = new_img.transpose(Image.Transpose.ROTATE_270)
-
-        return new_img
-
-    def _create_init_mask(self, image, width, height, fit=True):
+    def _create_init_mask(self, image):
         # convert into a black/white mask
         image = self._image_to_mask(image)
         image = image.convert('RGB')
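Taken together, the two dispatch branches in `ldm/generate.py` now
share one shape: construct the restorer with `(image, generate)`, then
call `process()` with the current options plus the options originally
used to generate the image. A condensed sketch of that flow
(illustrative only; `apply_postprocessor` is a hypothetical driver,
not a function in the patch):

```python
from ldm.dream.restoration.outcrop import Outcrop
from ldm.dream.restoration.outpaint import Outpaint

def apply_postprocessor(tool, image, generate, opt, orig_opt,
                        callback=None, prefix=None):
    if tool == 'outcrop':
        it = iter(opt.outcrop)  # e.g. ['top', '64', 'right', '64']
        extents = {d: int(p) for d, p in zip(it, it)}
        return Outcrop(image, generate).process(
            extents, opt=opt, orig_opt=orig_opt,
            image_callback=callback, prefix=prefix)
    if tool == 'outpaint':
        return Outpaint(image, generate).process(
            opt, orig_opt, image_callback=callback, prefix=prefix)
    raise ValueError(f'unknown postprocessing tool {tool!r}')
```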