cleanup inpainting and img2img

- add a `--inpaint_replace` option that fills masked regions with
  latent noise. This allows radical changes to inpainted regions
  at the cost of losing context.
- fix up readline, arg processing and metadata writing to accommodate
  this change
- fixed bug in storage and retrieval of variations, discovered incidentally
  during testing
- update documentation
Lincoln Stein 2022-10-02 16:37:36 -04:00
parent f6bc13736a
commit 6f93dc7712
6 changed files with 91 additions and 20 deletions

View File

@ -6,21 +6,29 @@ title: Inpainting
## **Creating Transparent Regions for Inpainting**

Inpainting is really cool. To do it, you start with an initial image
and use a photoeditor to make one or more regions transparent
(i.e. they have a "hole" in them). You then provide the path to this
image at the dream> command line using the `-I` switch. Stable
Diffusion will only paint within the transparent region.

There's a catch. In the current implementation, you have to prepare
the initial image correctly so that the underlying colors are
preserved under the transparent area. Many image editing
applications will by default erase the color information under the
transparent pixels and replace them with white or black, which will
lead to suboptimal inpainting. It often helps to apply incomplete
transparency, such as any value between 1 and 99%.

You also must take care to export the PNG file in such a way that the
color information is preserved. There is often an option in the export
dialog that lets you specify this.

If your photoeditor is erasing the underlying color information,
`dream.py` will give you a big fat warning. If you can't find a way to
coax your photoeditor to retain color values under transparent areas,
then you can combine the `-I` and `-M` switches to provide both the
original unedited image and the masked (partially transparent) image:
```bash
invoke> "man with cat on shoulder" -I./images/man.png -M./images/man-transparent.png
```
@ -28,6 +36,26 @@ invoke> "man with cat on shoulder" -I./images/man.png -M./images/man-transparent
We are hoping to get rid of the need for this workaround in an upcoming release.
### Inpainting is not changing the masked region enough!

One of the things to understand about how inpainting works is that it
is equivalent to running img2img on just the masked (transparent)
area. img2img builds on top of the existing image data, and therefore
will attempt to preserve colors, shapes and textures to the best of
its ability. Unfortunately this means that if you want to make a
dramatic change in the inpainted region, for example replacing a red
wall with a blue one, the algorithm will fight you.

You have a couple of options. The first is to increase the values of
the requested steps (`-sXXX`), strength (`-f0.XX`), and/or
classifier-free guidance (`-CXX.X`). If this is not working for you, a
more extreme step is to provide the `--inpaint_replace` option. This
causes the algorithm to entirely ignore the data underneath the masked
region and to treat this area like a blank canvas. This will enable
you to replace colored regions entirely, but beware that the masked
region will not blend in with the surrounding unmasked regions as
well.
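The weighted blend that `--inpaint_replace` performs can be sketched per latent element (a simplified illustration of the idea, not the project's actual tensor code; the function and variable names here are made up):

```python
def inpaint_replace_blend(z, mask, noise, inpaint_replace):
    """Blend latent noise into the masked (transparent) region.

    mask is 1.0 where the original image is kept and 0.0 inside the
    hole; inpaint_replace=0.0 leaves the latents under the hole
    untouched, while inpaint_replace=1.0 swaps them entirely for noise.
    """
    inverted = 1.0 - mask  # 1.0 where the hole is
    masked = (1.0 - inpaint_replace) * inverted * z + inpaint_replace * inverted * noise
    return z * mask + masked

# a kept pixel is untouched; the hole interpolates between latent and noise
kept     = inpaint_replace_blend(5.0, 1.0, -1.0, 1.0)   # -> 5.0
replaced = inpaint_replace_blend(5.0, 0.0, -1.0, 1.0)   # -> -1.0
halfway  = inpaint_replace_blend(5.0, 0.0, -1.0, 0.5)   # -> 2.0
```

At 1.0 the masked area starts from pure noise, which is why it can change radically but may not blend with its surroundings.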
---

## Recipe for GIMP

View File

@ -260,6 +260,8 @@ class Generate:
            codeformer_fidelity = None,
            save_original = False,
            upscale = None,
            # this is specific to inpainting and causes more extreme inpainting
            inpaint_replace = 0.0,
            # Set this True to handle KeyboardInterrupt internally
            catch_interrupts = False,
            hires_fix = False,
@ -358,6 +360,7 @@ class Generate:
                f'variation weights must be in [0.0, 1.0]: got {[weight for _, weight in with_variations]}'
        width, height, _ = self._resolution_check(width, height, log=True)
        assert inpaint_replace >= 0.0 and inpaint_replace <= 1.0, 'inpaint_replace must be between 0.0 and 1.0'
        if sampler_name and (sampler_name != self.sampler_name):
            self.sampler_name = sampler_name
@ -385,6 +388,8 @@ class Generate:
                height,
                fit=fit,
            )
            # TODO: Hacky selection of operation to perform. Needs to be refactored.
            if (init_image is not None) and (mask_image is not None):
                generator = self._make_inpaint()
            elif (embiggen != None or embiggen_tiles != None):
@ -399,6 +404,7 @@ class Generate:
                generator.set_variation(
                    self.seed, variation_amount, with_variations
                )
            results = generator.generate(
                prompt,
                iterations=iterations,
@ -420,6 +426,7 @@ class Generate:
                perlin=perlin,
                embiggen=embiggen,
                embiggen_tiles=embiggen_tiles,
                inpaint_replace=inpaint_replace,
            )
        if init_color:
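The TODO-marked selection above dispatches on which inputs are present. A standalone sketch of that order (the img2img/txt2img fallbacks are assumed from surrounding code and are not shown in the hunk):

```python
def choose_generator(init_image, mask_image, embiggen, embiggen_tiles):
    # dispatch order mirrors the hunk: inpaint first, then embiggen;
    # the img2img/txt2img fallbacks are assumptions, not part of the diff
    if (init_image is not None) and (mask_image is not None):
        return 'inpaint'
    if (embiggen is not None) or (embiggen_tiles is not None):
        return 'embiggen'
    if init_image is not None:
        return 'img2img'
    return 'txt2img'
```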

View File

@ -239,6 +239,8 @@ class Args(object):
            switches.append(f'--init_color {a["init_color"]}')
            if a['strength'] and a['strength']>0:
                switches.append(f'-f {a["strength"]}')
            if a['inpaint_replace']:
                switches.append(f'--inpaint_replace')
        else:
            switches.append(f'-A {a["sampler_name"]}')
@ -266,11 +268,12 @@ class Args(object):
        # outpainting parameters
        if a['out_direction']:
            switches.append(f'-D {" ".join([str(u) for u in a["out_direction"]])}')
        # LS: slight semantic drift which needs addressing in the future:
        # 1. Variations come out of the stored metadata as a packed string with the keyword "variations"
        # 2. However, they come out of the CLI (and probably web) with the keyword "with_variations" and
        #    in broken-out form. Variation (1) should be changed to comply with (2)
        if a['with_variations'] and len(a['with_variations'])>0:
            formatted_variations = ','.join(f'{seed}:{weight}' for seed, weight in (a["with_variations"]))
            switches.append(f'-V {formatted_variations}')
        if 'variations' in a and len(a['variations'])>0:
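The `-V` switch packs (seed, weight) pairs into a `seed:weight,seed:weight` string, as the join above shows. A round-trip sketch (the helper names are illustrative, not functions from the project):

```python
def pack_variations(pairs):
    # [(seed, weight), ...] -> "seed:weight,seed:weight"
    return ','.join(f'{seed}:{weight}' for seed, weight in pairs)

def unpack_variations(packed):
    # inverse of pack_variations: seeds back to ints, weights to floats
    result = []
    for part in packed.split(','):
        seed, weight = part.split(':')
        result.append((int(seed), float(weight)))
    return result

packed = pack_variations([(12345, 0.2), (67890, 0.5)])
# packed == "12345:0.2,67890:0.5"
```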
@ -694,6 +697,13 @@ class Args(object):
        metavar=('direction','pixels'),
        help='Outcrop the image with one or more direction/pixel pairs: -c top 64 bottom 128 left 64 right 64',
    )
    img2img_group.add_argument(
        '-r',
        '--inpaint_replace',
        type=float,
        default=0.0,
        help='when inpainting, adjust how aggressively to replace the part of the picture under the mask, from 0.0 (a gentle merge) to 1.0 (replace entirely)',
    )
    postprocessing_group.add_argument(
        '-ft',
        '--facetool',
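Because the new option takes `type=float` with a `default` of 0.0, it behaves as shown in this standalone argparse sketch (a minimal parser mirroring the declaration above, not the project's actual `Args` class):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    '-r', '--inpaint_replace',
    type=float,
    default=0.0,
    help='0.0 (gentle merge) to 1.0 (replace entirely)',
)

opt = parser.parse_args(['-r', '0.75'])
# opt.inpaint_replace is the float 0.75; omitting the flag yields 0.0
```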
@ -800,7 +810,8 @@ def metadata_dumps(opt,
    # remove any image keys not mentioned in RFC #266
    rfc266_img_fields = ['type','postprocessing','sampler','prompt','seed','variations','steps',
                         'cfg_scale','threshold','perlin','step_number','width','height','extra','strength',
                         'init_img','init_mask']

    rfc_dict = {}
@ -821,11 +832,15 @@ def metadata_dumps(opt,
    # 'variations' should always exist and be an array, empty or consisting of {'seed': seed, 'weight': weight} pairs
    rfc_dict['variations'] = [{'seed':x[0],'weight':x[1]} for x in opt.with_variations] if opt.with_variations else []

    # if variations are present then we need to replace 'seed' with 'orig_seed'
    if hasattr(opt,'first_seed'):
        rfc_dict['seed'] = opt.first_seed

    if opt.init_img:
        rfc_dict['type'] = 'img2img'
        rfc_dict['strength_steps'] = rfc_dict.pop('strength')
        rfc_dict['orig_hash'] = calculate_init_img_hash(opt.init_img)
        rfc_dict['inpaint_replace'] = opt.inpaint_replace
    else:
        rfc_dict['type'] = 'txt2img'
        rfc_dict.pop('strength')

View File

@ -5,6 +5,7 @@ including img2img, txt2img, and inpaint
import torch
import numpy as np
import random
import os
from tqdm import tqdm, trange
from PIL import Image
from einops import rearrange, repeat
@ -168,3 +169,14 @@ class Generator():
        return v2

    # this is a handy routine for debugging use. Given a generated sample,
    # convert it into a PNG image and store it at the indicated path
    def save_sample(self, sample, filepath):
        image = self.sample_to_image(sample)
        dirname = os.path.dirname(filepath) or '.'
        if not os.path.exists(dirname):
            print(f'** creating directory {dirname}')
            os.makedirs(dirname, exist_ok=True)
        image.save(filepath,'PNG')
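The directory-creating behavior of `save_sample` can be exercised with the standard library alone (a sketch that mirrors only the path handling, leaving out the sample-to-image decode; the helper name is made up):

```python
import os
import tempfile

def ensure_parent_dir(filepath):
    # mirror of save_sample's directory handling: create the parent
    # directory of the target path if it does not already exist
    dirname = os.path.dirname(filepath) or '.'
    if not os.path.exists(dirname):
        print(f'** creating directory {dirname}')
        os.makedirs(dirname, exist_ok=True)
    return dirname

base = tempfile.mkdtemp()
target = os.path.join(base, 'samples', 'step_10.png')
created = ensure_parent_dir(target)
# the samples/ subdirectory now exists even though it did not before
```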

View File

@ -18,7 +18,7 @@ class Inpaint(Img2Img):
    @torch.no_grad()
    def get_make_image(self,prompt,sampler,steps,cfg_scale,ddim_eta,
                       conditioning,init_image,mask_image,strength,
                       step_callback=None,inpaint_replace=False,**kwargs):
        """
        Returns a function returning an image derived from the prompt and
        the initial image + mask. Return value depends on the seed at
@ -58,6 +58,14 @@ class Inpaint(Img2Img):
            noise=x_T
        )

        # to replace masked area with latent noise, weighted by inpaint_replace strength
        if inpaint_replace > 0.0:
            print(f'>> inpaint will replace what was under the mask with a strength of {inpaint_replace}')
            l_noise = self.get_noise(kwargs['width'],kwargs['height'])
            inverted_mask = 1.0-mask_image # there will be 1s where the mask is
            masked_region = (1.0-inpaint_replace) * inverted_mask * z_enc + inpaint_replace * inverted_mask * l_noise
            z_enc = z_enc * mask_image + masked_region

        # decode it
        samples = sampler.decode(
            z_enc,

View File

@ -52,6 +52,7 @@ COMMANDS = (
    '--skip_normalize','-x',
    '--log_tokenization','-t',
    '--hires_fix',
    '--inpaint_replace','-r',
    '!fix','!fetch','!history','!search','!clear',
    '!models','!switch','!import_model','!edit_model'
)
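Adding `--inpaint_replace` to the `COMMANDS` tuple makes it visible to tab completion. A minimal sketch of readline-style prefix completion over such a tuple (the completer logic is illustrative, not the project's actual readline code):

```python
COMMANDS = (
    '--skip_normalize', '-x',
    '--log_tokenization', '-t',
    '--hires_fix',
    '--inpaint_replace', '-r',
    '!fix', '!fetch', '!history', '!search', '!clear',
    '!models', '!switch', '!import_model', '!edit_model',
)

def complete(text, state):
    # readline-style completer: return the state-th command that
    # starts with the typed prefix, or None when matches run out
    matches = [c for c in COMMANDS if c.startswith(text)]
    return matches[state] if state < len(matches) else None

# complete('--in', 0) yields '--inpaint_replace'
```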