cleanup inpainting and img2img

- add a `--inpaint_replace` option that fills masked regions with
  latent noise. This allows radical changes to inpainted regions
  at the cost of losing context.
- fix up readline, arg processing and metadata writing to accommodate
  this change
- fixed bug in storage and retrieval of variations, discovered incidentally
  during testing
- update documentation
Lincoln Stein 2022-10-02 16:37:36 -04:00
parent f6bc13736a
commit 6f93dc7712
6 changed files with 91 additions and 20 deletions

View File

@ -6,21 +6,29 @@ title: Inpainting
## **Creating Transparent Regions for Inpainting**

Inpainting is really cool. To do it, you start with an initial image
and use a photoeditor to make one or more regions transparent
(i.e. they have a "hole" in them). You then provide the path to this
image at the dream> command line using the `-I` switch. Stable
Diffusion will only paint within the transparent region.

There's a catch. In the current implementation, you have to prepare
the initial image correctly so that the underlying colors are
preserved under the transparent area. Many image editing
applications will by default erase the color information under the
transparent pixels and replace them with white or black, which will
lead to suboptimal inpainting. It often helps to apply incomplete
transparency, such as any value between 1 and 99%.

You also must take care to export the PNG file in such a way that the
color information is preserved. There is often an option in the export
dialog that lets you specify this.

If your photoeditor is erasing the underlying color information,
`dream.py` will give you a big fat warning. If you can't find a way to
coax your photoeditor to retain color values under transparent areas,
then you can combine the `-I` and `-M` switches to provide both the
original unedited image and the masked (partially transparent) image:
```bash
invoke> "man with cat on shoulder" -I./images/man.png -M./images/man-transparent.png
```
@ -28,6 +36,26 @@ invoke> "man with cat on shoulder" -I./images/man.png -M./images/man-transparent
We are hoping to get rid of the need for this workaround in an upcoming release.
### Inpainting is not changing the masked region enough!

One of the things to understand about how inpainting works is that it
is equivalent to running img2img on just the masked (transparent)
area. img2img builds on top of the existing image data, and therefore
will attempt to preserve colors, shapes and textures to the best of
its ability. Unfortunately this means that if you want to make a
dramatic change in the inpainted region, for example replacing a red
wall with a blue one, the algorithm will fight you.

You have a couple of options. The first is to increase the values of
the requested steps (`-sXXX`), strength (`-f0.XX`), and/or
classifier-free guidance (`-CXX.X`). If this is not working for you, a
more extreme step is to provide the `--inpaint_replace` option. This
causes the algorithm to entirely ignore the data underneath the masked
region and to treat this area like a blank canvas. This will enable
you to replace colored regions entirely, but beware that the masked
region will not blend in with the surrounding unmasked regions as
well.
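The weighted blend that `--inpaint_replace` performs can be sketched per latent element (a simplified illustration of the idea, not the project's actual tensor code; the function and variable names here are made up):

```python
def inpaint_replace_blend(z, mask, noise, inpaint_replace):
    """Blend latent noise into the masked (transparent) region.

    mask is 1.0 where the original image is kept and 0.0 inside the
    hole; inpaint_replace=0.0 leaves the latents under the hole
    untouched, while inpaint_replace=1.0 swaps them entirely for noise.
    """
    inverted = 1.0 - mask  # 1.0 where the hole is
    masked = (1.0 - inpaint_replace) * inverted * z + inpaint_replace * inverted * noise
    return z * mask + masked

# a kept pixel is untouched; the hole interpolates between latent and noise
kept     = inpaint_replace_blend(5.0, 1.0, -1.0, 1.0)   # -> 5.0
replaced = inpaint_replace_blend(5.0, 0.0, -1.0, 1.0)   # -> -1.0
halfway  = inpaint_replace_blend(5.0, 0.0, -1.0, 0.5)   # -> 2.0
```

At 1.0 the masked area starts from pure noise, which is why it can change radically but may not blend with its surroundings.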
---

## Recipe for GIMP

View File

@ -260,6 +260,8 @@ class Generate:
            codeformer_fidelity = None,
            save_original = False,
            upscale = None,
            # this is specific to inpainting and causes more extreme inpainting
            inpaint_replace = 0.0,
            # Set this True to handle KeyboardInterrupt internally
            catch_interrupts = False,
            hires_fix = False,
@ -358,6 +360,7 @@ class Generate:
                f'variation weights must be in [0.0, 1.0]: got {[weight for _, weight in with_variations]}'
        width, height, _ = self._resolution_check(width, height, log=True)
        assert inpaint_replace >= 0.0 and inpaint_replace <= 1.0, 'inpaint_replace must be between 0.0 and 1.0'
        if sampler_name and (sampler_name != self.sampler_name):
            self.sampler_name = sampler_name
@ -385,6 +388,8 @@ class Generate:
                height,
                fit=fit,
            )
            # TODO: Hacky selection of operation to perform. Needs to be refactored.
            if (init_image is not None) and (mask_image is not None):
                generator = self._make_inpaint()
            elif (embiggen != None or embiggen_tiles != None):
@ -399,6 +404,7 @@ class Generate:
                generator.set_variation(
                    self.seed, variation_amount, with_variations
                )
            results = generator.generate(
                prompt,
                iterations=iterations,
@ -420,6 +426,7 @@ class Generate:
                perlin=perlin,
                embiggen=embiggen,
                embiggen_tiles=embiggen_tiles,
                inpaint_replace=inpaint_replace,
            )
        if init_color:
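The TODO-marked selection above dispatches on which inputs are present. A standalone sketch of that order (the img2img/txt2img fallbacks are assumed from surrounding code and are not shown in the hunk):

```python
def choose_generator(init_image, mask_image, embiggen, embiggen_tiles):
    # dispatch order mirrors the hunk: inpaint first, then embiggen;
    # the img2img/txt2img fallbacks are assumptions, not part of the diff
    if (init_image is not None) and (mask_image is not None):
        return 'inpaint'
    if (embiggen is not None) or (embiggen_tiles is not None):
        return 'embiggen'
    if init_image is not None:
        return 'img2img'
    return 'txt2img'
```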

View File

@ -239,6 +239,8 @@ class Args(object):
            switches.append(f'--init_color {a["init_color"]}')
            if a['strength'] and a['strength']>0:
                switches.append(f'-f {a["strength"]}')
            if a['inpaint_replace']:
                switches.append(f'--inpaint_replace')
        else:
            switches.append(f'-A {a["sampler_name"]}')
@ -266,11 +268,12 @@ class Args(object):
        # outpainting parameters
        if a['out_direction']:
            switches.append(f'-D {" ".join([str(u) for u in a["out_direction"]])}')
        # LS: slight semantic drift which needs addressing in the future:
        # 1. Variations come out of the stored metadata as a packed string with the keyword "variations"
        # 2. However, they come out of the CLI (and probably web) with the keyword "with_variations" and
        #    in broken-out form. Variation (1) should be changed to comply with (2)
        if a['with_variations'] and len(a['with_variations'])>0:
            formatted_variations = ','.join(f'{seed}:{weight}' for seed, weight in (a["with_variations"]))
            switches.append(f'-V {formatted_variations}')
        if 'variations' in a and len(a['variations'])>0:
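The `-V` switch packs (seed, weight) pairs into a `seed:weight,seed:weight` string, as the join above shows. A round-trip sketch (the helper names are illustrative, not functions from the project):

```python
def pack_variations(pairs):
    # [(seed, weight), ...] -> "seed:weight,seed:weight"
    return ','.join(f'{seed}:{weight}' for seed, weight in pairs)

def unpack_variations(packed):
    # inverse of pack_variations: seeds back to ints, weights to floats
    result = []
    for part in packed.split(','):
        seed, weight = part.split(':')
        result.append((int(seed), float(weight)))
    return result

packed = pack_variations([(12345, 0.2), (67890, 0.5)])
# packed == "12345:0.2,67890:0.5"
```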
@ -694,6 +697,13 @@ class Args(object):
        metavar=('direction','pixels'),
        help='Outcrop the image with one or more direction/pixel pairs: -c top 64 bottom 128 left 64 right 64',
    )
    img2img_group.add_argument(
        '-r',
        '--inpaint_replace',
        type=float,
        default=0.0,
        help='when inpainting, adjust how aggressively to replace the part of the picture under the mask, from 0.0 (a gentle merge) to 1.0 (replace entirely)',
    )
    postprocessing_group.add_argument(
        '-ft',
        '--facetool',
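Because the new option takes `type=float` with a `default` of 0.0, it behaves as shown in this standalone argparse sketch (a minimal parser mirroring the declaration above, not the project's actual `Args` class):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    '-r', '--inpaint_replace',
    type=float,
    default=0.0,
    help='0.0 (gentle merge) to 1.0 (replace entirely)',
)

opt = parser.parse_args(['-r', '0.75'])
# opt.inpaint_replace is the float 0.75; omitting the flag yields 0.0
```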
@ -800,7 +810,8 @@ def metadata_dumps(opt,
    # remove any image keys not mentioned in RFC #266
    rfc266_img_fields = ['type','postprocessing','sampler','prompt','seed','variations','steps',
                         'cfg_scale','threshold','perlin','step_number','width','height','extra','strength',
                         'init_img','init_mask']

    rfc_dict = {}
@ -821,11 +832,15 @@ def metadata_dumps(opt,
    # 'variations' should always exist and be an array, empty or consisting of {'seed': seed, 'weight': weight} pairs
    rfc_dict['variations'] = [{'seed':x[0],'weight':x[1]} for x in opt.with_variations] if opt.with_variations else []

    # if variations are present then we need to replace 'seed' with 'orig_seed'
    if hasattr(opt,'first_seed'):
        rfc_dict['seed'] = opt.first_seed

    if opt.init_img:
        rfc_dict['type'] = 'img2img'
        rfc_dict['strength_steps'] = rfc_dict.pop('strength')
        rfc_dict['orig_hash'] = calculate_init_img_hash(opt.init_img)
        rfc_dict['inpaint_replace'] = opt.inpaint_replace
    else:
        rfc_dict['type'] = 'txt2img'
        rfc_dict.pop('strength')

View File

@ -5,6 +5,7 @@ including img2img, txt2img, and inpaint
import torch
import numpy as np
import random
import os
from tqdm import tqdm, trange
from PIL import Image
from einops import rearrange, repeat
@ -168,3 +169,14 @@ class Generator():
        return v2

    # this is a handy routine for debugging use. Given a generated sample,
    # convert it into a PNG image and store it at the indicated path
    def save_sample(self, sample, filepath):
        image = self.sample_to_image(sample)
        dirname = os.path.dirname(filepath) or '.'
        if not os.path.exists(dirname):
            print(f'** creating directory {dirname}')
            os.makedirs(dirname, exist_ok=True)
        image.save(filepath,'PNG')
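The directory-creating behavior of `save_sample` can be exercised with the standard library alone (a sketch that mirrors only the path handling, leaving out the sample-to-image decode; the helper name is made up):

```python
import os
import tempfile

def ensure_parent_dir(filepath):
    # mirror of save_sample's directory handling: create the parent
    # directory of the target path if it does not already exist
    dirname = os.path.dirname(filepath) or '.'
    if not os.path.exists(dirname):
        print(f'** creating directory {dirname}')
        os.makedirs(dirname, exist_ok=True)
    return dirname

base = tempfile.mkdtemp()
target = os.path.join(base, 'samples', 'step_10.png')
created = ensure_parent_dir(target)
# the samples/ subdirectory now exists even though it did not before
```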

View File

@ -18,7 +18,7 @@ class Inpaint(Img2Img):
    @torch.no_grad()
    def get_make_image(self,prompt,sampler,steps,cfg_scale,ddim_eta,
                       conditioning,init_image,mask_image,strength,
                       step_callback=None,inpaint_replace=False,**kwargs):
        """
        Returns a function returning an image derived from the prompt and
        the initial image + mask. Return value depends on the seed at
@ -58,6 +58,14 @@ class Inpaint(Img2Img):
            noise=x_T
        )

        # to replace masked area with latent noise, weighted by inpaint_replace strength
        if inpaint_replace > 0.0:
            print(f'>> inpaint will replace what was under the mask with a strength of {inpaint_replace}')
            l_noise = self.get_noise(kwargs['width'],kwargs['height'])
            inverted_mask = 1.0-mask_image # there will be 1s where the mask is
            masked_region = (1.0-inpaint_replace) * inverted_mask * z_enc + inpaint_replace * inverted_mask * l_noise
            z_enc = z_enc * mask_image + masked_region

        # decode it
        samples = sampler.decode(
            z_enc,

View File

@ -52,6 +52,7 @@ COMMANDS = (
    '--skip_normalize','-x',
    '--log_tokenization','-t',
    '--hires_fix',
    '--inpaint_replace','-r',
    '!fix','!fetch','!history','!search','!clear',
    '!models','!switch','!import_model','!edit_model'
)
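Adding `--inpaint_replace` to the `COMMANDS` tuple makes it visible to tab completion. A minimal sketch of readline-style prefix completion over such a tuple (the completer logic is illustrative, not the project's actual readline code):

```python
COMMANDS = (
    '--skip_normalize', '-x',
    '--log_tokenization', '-t',
    '--hires_fix',
    '--inpaint_replace', '-r',
    '!fix', '!fetch', '!history', '!search', '!clear',
    '!models', '!switch', '!import_model', '!edit_model',
)

def complete(text, state):
    # readline-style completer: return the state-th command that
    # starts with the typed prefix, or None when matches run out
    matches = [c for c in COMMANDS if c.startswith(text)]
    return matches[state] if state < len(matches) else None

# complete('--in', 0) yields '--inpaint_replace'
```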