mirror of
https://github.com/invoke-ai/InvokeAI
synced 2024-08-30 20:32:17 +00:00
cleanup inpainting and img2img
- add a `--inpaint_replace` option that fills masked regions with latent noise. This allows radical changes to inpainted regions at the cost of losing context.
- fix up readline, arg processing and metadata writing to accommodate this change
- fixed bug in storage and retrieval of variations, discovered incidentally during testing
- update documentation
This commit is contained in:
parent f6bc13736a
commit 6f93dc7712
@@ -6,21 +6,29 @@ title: Inpainting

## **Creating Transparent Regions for Inpainting**

Inpainting is really cool. To do it, you start with an initial image and use a photoeditor to make one or more regions transparent (i.e. they have a "hole" in them). You then provide the path to this image at the dream> command line using the `-I` switch. Stable Diffusion will only paint within the transparent region.

There's a catch. In the current implementation, you have to prepare the initial image correctly so that the underlying colors are preserved under the transparent area. Many image editing applications will by default erase the color information under the transparent pixels and replace them with white or black, which will lead to suboptimal inpainting. It often helps to apply incomplete transparency, such as any value between 1 and 99%.

You also must take care to export the PNG file in such a way that the color information is preserved. There is often an option in the export dialog that lets you specify this.

If your photoeditor is erasing the underlying color information, `dream.py` will give you a big fat warning. If you can't find a way to coax your photoeditor to retain color values under transparent areas, then you can combine the `-I` and `-M` switches to provide both the original unedited image and the masked (partially transparent) image:

```bash
invoke> "man with cat on shoulder" -I./images/man.png -M./images/man-transparent.png
```
@@ -28,6 +36,26 @@ invoke> "man with cat on shoulder" -I./images/man.png -M./images/man-transparent

We are hoping to get rid of the need for this workaround in an upcoming release.

### Inpainting is not changing the masked region enough!

One of the things to understand about how inpainting works is that it is equivalent to running img2img on just the masked (transparent) area. img2img builds on top of the existing image data, and therefore will attempt to preserve colors, shapes and textures to the best of its ability. Unfortunately this means that if you want to make a dramatic change in the inpainted region, for example replacing a red wall with a blue one, the algorithm will fight you.

You have a couple of options. The first is to increase the values of the requested steps (`-sXXX`), strength (`-f0.XX`), and/or condition-free guidance (`-CXX.X`). If this is not working for you, a more extreme step is to provide the `--inpaint_replace` option. This causes the algorithm to entirely ignore the data underneath the masked region and to treat this area like a blank canvas. This will enable you to replace colored regions entirely, but beware that the masked region will not blend in with the surrounding unmasked regions as well.
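As a hedged illustration (the prompt, file names, and numeric values below are invented; only the switches come from the text above, and the argument definition later in this commit shows `--inpaint_replace` taking a strength between 0.0 and 1.0), a session pushing for a dramatic change might look like:

```bash
invoke> "a blue wall" -I./images/room.png -M./images/room-mask.png -s100 -f0.9 -C15 --inpaint_replace 1.0
```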
---

## Recipe for GIMP
@@ -260,6 +260,8 @@ class Generate:
codeformer_fidelity = None,
save_original = False,
upscale = None,
# this is specific to inpainting and causes more extreme inpainting
inpaint_replace = 0.0,
# Set this True to handle KeyboardInterrupt internally
catch_interrupts = False,
hires_fix = False,
@@ -358,6 +360,7 @@ class Generate:
f'variation weights must be in [0.0, 1.0]: got {[weight for _, weight in with_variations]}'

width, height, _ = self._resolution_check(width, height, log=True)
assert inpaint_replace >=0.0 and inpaint_replace <= 1.0,'inpaint_replace must be between 0.0 and 1.0'

if sampler_name and (sampler_name != self.sampler_name):
    self.sampler_name = sampler_name
@@ -385,6 +388,8 @@ class Generate:
height,
fit=fit,
)

# TODO: Hacky selection of operation to perform. Needs to be refactored.
if (init_image is not None) and (mask_image is not None):
    generator = self._make_inpaint()
elif (embiggen != None or embiggen_tiles != None):
@@ -399,6 +404,7 @@ class Generate:
generator.set_variation(
    self.seed, variation_amount, with_variations
)

results = generator.generate(
    prompt,
    iterations=iterations,
@@ -420,6 +426,7 @@ class Generate:
    perlin=perlin,
    embiggen=embiggen,
    embiggen_tiles=embiggen_tiles,
    inpaint_replace=inpaint_replace,
)

if init_color:
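Not part of the diff: a hedged sketch of how these new parameters might be exercised from Python, assuming the keyword arguments above belong to the `Generate` object's `prompt2image()` entry point (the import path, method name, and return shape are assumptions; only parameter names such as `init_img`, `init_mask`, `strength`, `cfg_scale`, `steps`, and `inpaint_replace` appear in this commit):

```python
# Hedged usage sketch -- module path, method name and return shape are assumptions,
# not taken from this commit; the keyword arguments mirror the switches documented above.
from ldm.generate import Generate

gr = Generate()
results = gr.prompt2image(
    "man with cat on shoulder",
    init_img="./images/man.png",               # -I: original unedited image
    init_mask="./images/man-transparent.png",  # -M: partially transparent mask
    steps=50,                                  # -s
    cfg_scale=7.5,                             # -C
    strength=0.75,                             # -f
    inpaint_replace=1.0,                       # treat the masked area as a blank canvas
)
# assumed to be a list of (image, seed) pairs
for image, seed in results:
    image.save(f'inpaint-{seed}.png')
```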
@@ -239,6 +239,8 @@ class Args(object):
switches.append(f'--init_color {a["init_color"]}')
if a['strength'] and a['strength']>0:
    switches.append(f'-f {a["strength"]}')
if a['inpaint_replace']:
    switches.append(f'--inpaint_replace')
else:
    switches.append(f'-A {a["sampler_name"]}')
@@ -266,11 +268,12 @@ class Args(object):
# outpainting parameters
if a['out_direction']:
    switches.append(f'-D {" ".join([str(u) for u in a["out_direction"]])}')

# LS: slight semantic drift which needs addressing in the future:
# 1. Variations come out of the stored metadata as a packed string with the keyword "variations"
# 2. However, they come out of the CLI (and probably web) with the keyword "with_variations" and
#    in broken-out form. Variation (1) should be changed to comply with (2)
if a['with_variations'] and len(a['with_variations'])>0:
    formatted_variations = ','.join(f'{seed}:{weight}' for seed, weight in (a["with_variations"]))
    switches.append(f'-V {formatted_variations}')
if 'variations' in a and len(a['variations'])>0:
@@ -694,6 +697,13 @@ class Args(object):
    metavar=('direction','pixels'),
    help='Outcrop the image with one or more direction/pixel pairs: -c top 64 bottom 128 left 64 right 64',
)
img2img_group.add_argument(
    '-r',
    '--inpaint_replace',
    type=float,
    default=0.0,
    help='when inpainting, adjust how aggressively to replace the part of the picture under the mask, from 0.0 (a gentle merge) to 1.0 (replace entirely)',
)
postprocessing_group.add_argument(
    '-ft',
    '--facetool',
@@ -800,7 +810,8 @@ def metadata_dumps(opt,

# remove any image keys not mentioned in RFC #266
rfc266_img_fields = ['type','postprocessing','sampler','prompt','seed','variations','steps',
                     'cfg_scale','threshold','perlin','step_number','width','height','extra','strength',
                     'init_img','init_mask']

rfc_dict ={}
@@ -821,11 +832,15 @@ def metadata_dumps(opt,
# 'variations' should always exist and be an array, empty or consisting of {'seed': seed, 'weight': weight} pairs
rfc_dict['variations'] = [{'seed':x[0],'weight':x[1]} for x in opt.with_variations] if opt.with_variations else []

# if variations are present then we need to replace 'seed' with 'orig_seed'
if hasattr(opt,'first_seed'):
    rfc_dict['seed'] = opt.first_seed

if opt.init_img:
    rfc_dict['type'] = 'img2img'
    rfc_dict['strength_steps'] = rfc_dict.pop('strength')
    rfc_dict['orig_hash'] = calculate_init_img_hash(opt.init_img)
    rfc_dict['inpaint_replace'] = opt.inpaint_replace
else:
    rfc_dict['type'] = 'txt2img'
    rfc_dict.pop('strength')
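Not part of the diff: to make the effect on stored metadata concrete, a hypothetical image-level dict that `metadata_dumps()` might now produce for an inpainted result could look like the sketch below (the keys come from the code above; every value is invented for illustration):

```python
# Hypothetical RFC #266-style image metadata for one inpainted result
# (all values are illustrative; only the keys are grounded in the code above)
example_rfc_dict = {
    'type':            'img2img',
    'prompt':          'man with cat on shoulder',
    'seed':            3145348970,        # replaced by first_seed when variations are in play
    'variations':      [{'seed': 42, 'weight': 0.3}],
    'steps':           50,
    'cfg_scale':       7.5,
    'strength_steps':  0.75,              # renamed from 'strength' for img2img/inpaint
    'init_img':        './images/man.png',
    'init_mask':       './images/man-transparent.png',
    'orig_hash':       '<hash of init_img>',
    'inpaint_replace': 1.0,
    'width':           512,
    'height':          512,
}
```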
@@ -5,6 +5,7 @@ including img2img, txt2img, and inpaint
import torch
import numpy as np
import random
import os
from tqdm import tqdm, trange
from PIL import Image
from einops import rearrange, repeat
@@ -168,3 +169,14 @@ class Generator():

    return v2

# this is a handy routine for debugging use. Given a generated sample,
# convert it into a PNG image and store it at the indicated path
def save_sample(self, sample, filepath):
    image = self.sample_to_image(sample)
    dirname = os.path.dirname(filepath) or '.'
    if not os.path.exists(dirname):
        print(f'** creating directory {dirname}')
        os.makedirs(dirname, exist_ok=True)
    image.save(filepath,'PNG')
@@ -18,7 +18,7 @@ class Inpaint(Img2Img):
@torch.no_grad()
def get_make_image(self,prompt,sampler,steps,cfg_scale,ddim_eta,
                   conditioning,init_image,mask_image,strength,
                   step_callback=None,inpaint_replace=False,**kwargs):
    """
    Returns a function returning an image derived from the prompt and
    the initial image + mask. Return value depends on the seed at
@@ -58,6 +58,14 @@ class Inpaint(Img2Img):
    noise=x_T
)

# to replace masked area with latent noise, weighted by inpaint_replace strength
if inpaint_replace > 0.0:
    print(f'>> inpaint will replace what was under the mask with a strength of {inpaint_replace}')
    l_noise = self.get_noise(kwargs['width'],kwargs['height'])
    inverted_mask = 1.0-mask_image  # there will be 1s where the mask is
    masked_region = (1.0-inpaint_replace) * inverted_mask * z_enc + inpaint_replace * inverted_mask * l_noise
    z_enc = z_enc * mask_image + masked_region

# decode it
samples = sampler.decode(
    z_enc,
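Not part of the diff: the replacement step above is a per-element convex combination in latent space. A minimal, self-contained sketch of the same arithmetic (tensor shapes and the helper name are illustrative, not taken from the repository) follows:

```python
import torch

def replace_masked_latents(z_enc, mask, noise, inpaint_replace=1.0):
    """Blend latent noise into the region being inpainted.

    Following the convention in the diff above, `mask` is 1.0 where the
    original latents should be kept and 0.0 over the transparent area.
    """
    inverted_mask = 1.0 - mask                        # 1s over the area to inpaint
    masked_region = (1.0 - inpaint_replace) * inverted_mask * z_enc \
                    + inpaint_replace * inverted_mask * noise
    return z_enc * mask + masked_region

# toy demonstration on a fake 1x4x8x8 latent
z     = torch.randn(1, 4, 8, 8)
noise = torch.randn(1, 4, 8, 8)
mask  = torch.ones(1, 1, 8, 8)
mask[..., 4:, 4:] = 0.0            # pretend the lower-right quadrant is transparent
blended = replace_masked_latents(z, mask, noise, inpaint_replace=1.0)
assert torch.allclose(blended[..., :4, :], z[..., :4, :])   # unmasked rows are untouched
```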
@@ -52,6 +52,7 @@ COMMANDS = (
'--skip_normalize','-x',
'--log_tokenization','-t',
'--hires_fix',
'--inpaint_replace','-r',
'!fix','!fetch','!history','!search','!clear',
'!models','!switch','!import_model','!edit_model'
)