Merge branch 'development' into development

This commit is contained in:
Peter Baylies 2022-09-12 16:34:10 -04:00 committed by GitHub
commit 4a5a228fd8
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
8 changed files with 582 additions and 5 deletions

View File

@ -27,7 +27,6 @@ report bugs and make feature requests. Be sure to use the provided
templates. They will help aid diagnose issues faster._
# **Table of Contents**
1. [Installation](#installation)
2. [Major Features](#features)
3. [Changelog](#latest-changes)
@ -86,6 +85,8 @@ To run in full-precision mode, start `dream.py` with the
- ## [GFPGAN and Real-ESRGAN Support](docs/features/UPSCALE.md)
- ## [Embiggen upscaling](docs/features/EMBIGGEN.md)
- ## [Seamless Tiling](docs/features/OTHER.md#seamless-tiling)
- ## [Google Colab](docs/features/OTHER.md#google-colab)
@ -136,7 +137,7 @@ To run in full-precision mode, start `dream.py` with the
- Works on M1 Apple hardware.
- Multiple bug fixes.
For older changelogs, please visit **[CHANGELOGS](docs/CHANGELOG.md)**.
For older changelogs, please visit **[CHANGELOGS](docs/CHANGELOG.md)**.
# Troubleshooting

134
docs/features/EMBIGGEN.md Normal file
View File

@ -0,0 +1,134 @@
# **Embiggen -- upscale your images on limited memory machines**
GFPGAN and Real-ESRGAN are both memory intensive. In order to avoid
crashes and memory overloads during the Stable Diffusion process,
these effects are applied after Stable Diffusion has completed its
work.
In single image generations, you will see the output right away but
when you are using multiple iterations, the images will first be
generated and then upscaled and face restored after that process is
complete. While the image generation is taking place, you will still
be able to preview the base images.
If you wish to stop during the image generation but want to upscale or
face restore a particular generated image, pass it again with the same
prompt and generated seed along with the `-U` and `-G` prompt
arguments to perform those actions.
## Embiggen
If you wanted to be able to do more (pixels) without running out of VRAM,
or you want to upscale with details that couldn't possibly appear
without the context of a prompt, this is the feature to try out.
Embiggen automates the process of taking an init image, upscaling it,
cutting it into smaller tiles that slightly overlap, running all the
tiles through img2img to refine details with respect to the prompt,
and "stitching" the tiles back together into a cohesive image.
It automatically computes how many tiles are needed, and so it can be fed
*ANY* size init image and perform Img2Img on it (though it will be run only
one tile at a time, which can cause problems, see the Note at the end).
If you're familiar with "GoBig" (ala [progrock-stable](https://github.com/lowfuel/progrock-stable))
it's similar to that, except it can work up to an arbitrarily large size
(instead of just 2x), with tile overlaps configurable as a ratio, and
has extra logic to re-run any number of the tile sub-sections of the image
if for example a small part of a huge run got messed up.
**Usage**
`-embiggen <scaling_factor> <esrgan_strength> <overlap_ratio OR overlap_pixels>`
Takes a scaling factor relative to the size of the `--init_img` (`-I`), followed by
ESRGAN upscaling strength (0 - 1.0), followed by minimum amount of overlap
between tiles as a decimal ratio (0 - 1.0) *OR* a number of pixels.
The scaling factor is how much larger than the `--init_img` the output
should be, and will multiply both x and y axis, so an image that is a
scaling factor of 3.0 has 3*3= 9 times as many pixels, and will take
(at least) 9 times as long (see overlap for why it might be
longer). If the `--init_img` is already the right size `-embiggen 1`,
and it can also be less than one if the init_img is too big.
Esrgan_strength defaults to 0.75, and the overlap_ratio defaults to
0.25, both are optional.
Unlike Img2Img, the `--width` (`-W`) and `--height` (`-H`) arguments
do not control the size of the image as a whole, but the size of the
tiles used to Embiggen the image.
ESRGAN is used to upscale the `--init_img` prior to cutting it into
tiles/pieces to run through img2img and then stitch back
together. Embiggen can be run without ESRGAN; just set the strength to
zero (e.g. `-embiggen 1.75 0`). The output of Embiggen can also be
upscaled after it's finished (`-U`).
The overlap is the minimum that tiles will overlap with adjacent
tiles, specified as either a ratio or a number of pixels. How much the
tiles overlap determines the likelihood the tiling will be noticable,
really small overlaps (e.g. a couple of pixels) may produce noticeable
grid-like fuzzy distortions in the final stitched image. Though, as
the overlapping space doesn't contribute to making the image bigger,
and the larger the overlap the more tiles (and the more time) it will
take to finish.
Because the overlapping parts of tiles don't "contribute" to
increasing size, every tile after the first in a row or column
effectively only covers an extra `1 - overlap_ratio` on each axis. If
the input/`--init_img` is same size as a tile, the ideal (for time)
scaling factors with the default overlap (0.25) are 1.75, 2.5, 3.25,
4.0 etc..
`-embiggen_tiles <spaced list of tiles>`
An advanced usage useful if you only want to alter parts of the image
while running Embiggen. It takes a list of tiles by number to run and
replace onto the initial image e.g. `1 3 5`. It's useful for either
fixing problem spots from a previous Embiggen run, or selectively
altering the prompt for sections of an image - for creative or
coherency reasons.
Tiles are numbered starting with one, and left-to-right,
top-to-bottom. So, if you are generating a 3x3 tiled image, the
middle row would be `4 5 6`.
**Example Usage**
Running Embiggen with 512x512 tiles on an existing image, scaling up by a factor of 2.5x;
and doing the same again (default ESRGAN strength is 0.75, default overlap between tiles is 0.25):
```
dream > a photo of a forest at sunset -s 100 -W 512 -H 512 -I outputs/forest.png -f 0.4 -embiggen 2.5
dream > a photo of a forest at sunset -s 100 -W 512 -H 512 -I outputs/forest.png -f 0.4 -embiggen 2.5 0.75 0.25
```
If your starting image was also 512x512 this should have taken 9 tiles.
If there weren't enough clouds in the sky of that forest you just made
(and that image is about 1280 pixels (512*2.5) wide A.K.A. three
512x512 tiles with 0.25 overlaps wide) we can replace that top row of
tiles:
```
dream> a photo of puffy clouds over a forest at sunset -s 100 -W 512 -H 512 -I outputs/000002.seed.png -f 0.5 -embiggen_tiles 1 2 3
```
**Note**
Because the same prompt is used on all the tiled images, and the model
doesn't have the context of anything outside the tile being run - it
can end up creating repeated pattern (also called 'motifs') across all
the tiles based on that prompt. The best way to combat this is
lowering the `--strength` (`-f`) to stay more true to the init image,
and increasing the number of steps so there is more compute-time to
create the detail. Anecdotally `--strength` 0.35-0.45 works pretty
well on most things. It may also work great in some examples even with
the `--strength` set high for patterns, landscapes, or subjects that
are more abstract. Because this is (relatively) fast, you can also
always create a few Embiggen'ed images and manually composite them to
preserve the best parts from each.
Author: [Travco](https://github.com/travco)

View File

@ -0,0 +1,403 @@
'''
ldm.dream.generator.embiggen descends from ldm.dream.generator
and generates with ldm.dream.generator.img2img
'''
import torch
import numpy as np
from PIL import Image
from ldm.dream.generator.base import Generator
from ldm.models.diffusion.ddim import DDIMSampler
from ldm.dream.generator.img2img import Img2Img
class Embiggen(Generator):
def __init__(self,model):
super().__init__(model)
self.init_latent = None
@torch.no_grad()
def get_make_image(
self,
prompt,
sampler,
steps,
cfg_scale,
ddim_eta,
conditioning,
init_img,
strength,
width,
height,
embiggen,
embiggen_tiles,
step_callback=None,
**kwargs
):
"""
Returns a function returning an image derived from the prompt and multi-stage twice-baked potato layering over the img2img on the initial image
Return value depends on the seed at the time you call it
"""
# Construct embiggen arg array, and sanity check arguments
if embiggen == None: # embiggen can also be called with just embiggen_tiles
embiggen = [1.0] # If not specified, assume no scaling
elif embiggen[0] < 0 :
embiggen[0] = 1.0
print('>> Embiggen scaling factor cannot be negative, fell back to the default of 1.0 !')
if len(embiggen) < 2:
embiggen.append(0.75)
elif embiggen[1] > 1.0 or embiggen[1] < 0 :
embiggen[1] = 0.75
print('>> Embiggen upscaling strength for ESRGAN must be between 0 and 1, fell back to the default of 0.75 !')
if len(embiggen) < 3:
embiggen.append(0.25)
elif embiggen[2] < 0 :
embiggen[2] = 0.25
print('>> Overlap size for Embiggen must be a positive ratio between 0 and 1 OR a number of pixels, fell back to the default of 0.25 !')
# Convert tiles from their user-freindly count-from-one to count-from-zero, because we need to do modulo math
# and then sort them, because... people.
if embiggen_tiles:
embiggen_tiles = list(map(lambda n: n-1, embiggen_tiles))
embiggen_tiles.sort()
# Prep img2img generator, since we wrap over it
gen_img2img = Img2Img(self.model)
# Open original init image (not a tensor) to manipulate
initsuperimage = Image.open(init_img)
with Image.open(init_img) as img:
initsuperimage = img.convert('RGB')
# Size of the target super init image in pixels
initsuperwidth, initsuperheight = initsuperimage.size
# Increase by scaling factor if not already resized, using ESRGAN as able
if embiggen[0] != 1.0:
initsuperwidth = round(initsuperwidth*embiggen[0])
initsuperheight = round(initsuperheight*embiggen[0])
if embiggen[1] > 0: # No point in ESRGAN upscaling if strength is set zero
from ldm.gfpgan.gfpgan_tools import (
real_esrgan_upscale,
)
print(f'>> ESRGAN upscaling init image prior to cutting with Embiggen with strength {embiggen[1]}')
if embiggen[0] > 2:
initsuperimage = real_esrgan_upscale(
initsuperimage,
embiggen[1], # upscale strength
4, # upscale scale
self.seed,
)
else:
initsuperimage = real_esrgan_upscale(
initsuperimage,
embiggen[1], # upscale strength
2, # upscale scale
self.seed,
)
# We could keep recursively re-running ESRGAN for a requested embiggen[0] larger than 4x
# but from personal experiance it doesn't greatly improve anything after 4x
# Resize to target scaling factor resolution
initsuperimage = initsuperimage.resize((initsuperwidth, initsuperheight), Image.Resampling.LANCZOS)
# Use width and height as tile widths and height
# Determine buffer size in pixels
if embiggen[2] < 1:
if embiggen[2] < 0:
embiggen[2] = 0
overlap_size_x = round(embiggen[2] * width)
overlap_size_y = round(embiggen[2] * height)
else:
overlap_size_x = round(embiggen[2])
overlap_size_y = round(embiggen[2])
# With overall image width and height known, determine how many tiles we need
def ceildiv(a, b):
return -1 * (-a // b)
# X and Y needs to be determined independantly (we may have savings on one based on the buffer pixel count)
# (initsuperwidth - width) is the area remaining to the right that we need to layers tiles to fill
# (width - overlap_size_x) is how much new we can fill with a single tile
emb_tiles_x = 1
emb_tiles_y = 1
if (initsuperwidth - width) > 0:
emb_tiles_x = ceildiv(initsuperwidth - width, width - overlap_size_x) + 1
if (initsuperheight - height) > 0:
emb_tiles_y = ceildiv(initsuperheight - height, height - overlap_size_y) + 1
# Sanity
assert emb_tiles_x > 1 or emb_tiles_y > 1, f'ERROR: Based on the requested dimensions of {initsuperwidth}x{initsuperheight} and tiles of {width}x{height} you don\'t need to Embiggen! Check your arguments.'
# Prep alpha layers --------------
# https://stackoverflow.com/questions/69321734/how-to-create-different-transparency-like-gradient-with-python-pil
# agradientL is Left-side transparent
agradientL = Image.linear_gradient('L').rotate(90).resize((overlap_size_x, height))
# agradientT is Top-side transparent
agradientT = Image.linear_gradient('L').resize((width, overlap_size_y))
# radial corner is the left-top corner, made full circle then cut to just the left-top quadrant
agradientC = Image.new('L', (256, 256))
for y in range(256):
for x in range(256):
#Find distance to lower right corner (numpy takes arrays)
distanceToLR = np.sqrt([(255 - x) ** 2 + (255 - y) ** 2])[0]
#Clamp values to max 255
if distanceToLR > 255:
distanceToLR = 255
#Place the pixel as invert of distance
agradientC.putpixel((x, y), int(255 - distanceToLR))
# Create alpha layers default fully white
alphaLayerL = Image.new("L", (width, height), 255)
alphaLayerT = Image.new("L", (width, height), 255)
alphaLayerLTC = Image.new("L", (width, height), 255)
# Paste gradients into alpha layers
alphaLayerL.paste(agradientL, (0, 0))
alphaLayerT.paste(agradientT, (0, 0))
alphaLayerLTC.paste(agradientL, (0, 0))
alphaLayerLTC.paste(agradientT, (0, 0))
alphaLayerLTC.paste(agradientC.resize((overlap_size_x, overlap_size_y)), (0, 0))
if embiggen_tiles:
# Individual unconnected sides
alphaLayerR = Image.new("L", (width, height), 255)
alphaLayerR.paste(agradientL.rotate(180), (width - overlap_size_x, 0))
alphaLayerB = Image.new("L", (width, height), 255)
alphaLayerB.paste(agradientT.rotate(180), (0, height - overlap_size_y))
alphaLayerTB = Image.new("L", (width, height), 255)
alphaLayerTB.paste(agradientT, (0, 0))
alphaLayerTB.paste(agradientT.rotate(180), (0, height - overlap_size_y))
alphaLayerLR = Image.new("L", (width, height), 255)
alphaLayerLR.paste(agradientL, (0, 0))
alphaLayerLR.paste(agradientL.rotate(180), (width - overlap_size_x, 0))
# Sides and corner Layers
alphaLayerRBC = Image.new("L", (width, height), 255)
alphaLayerRBC.paste(agradientL.rotate(180), (width - overlap_size_x, 0))
alphaLayerRBC.paste(agradientT.rotate(180), (0, height - overlap_size_y))
alphaLayerRBC.paste(agradientC.rotate(180).resize((overlap_size_x, overlap_size_y)), (width - overlap_size_x, height - overlap_size_y))
alphaLayerLBC = Image.new("L", (width, height), 255)
alphaLayerLBC.paste(agradientL, (0, 0))
alphaLayerLBC.paste(agradientT.rotate(180), (0, height - overlap_size_y))
alphaLayerLBC.paste(agradientC.rotate(90).resize((overlap_size_x, overlap_size_y)), (0, height - overlap_size_y))
alphaLayerRTC = Image.new("L", (width, height), 255)
alphaLayerRTC.paste(agradientL.rotate(180), (width - overlap_size_x, 0))
alphaLayerRTC.paste(agradientT, (0, 0))
alphaLayerRTC.paste(agradientC.rotate(270).resize((overlap_size_x, overlap_size_y)), (width - overlap_size_x, 0))
# All but X layers
alphaLayerABT = Image.new("L", (width, height), 255)
alphaLayerABT.paste(alphaLayerLBC, (0, 0))
alphaLayerABT.paste(agradientL.rotate(180), (width - overlap_size_x, 0))
alphaLayerABT.paste(agradientC.rotate(180).resize((overlap_size_x, overlap_size_y)), (width - overlap_size_x, height - overlap_size_y))
alphaLayerABL = Image.new("L", (width, height), 255)
alphaLayerABL.paste(alphaLayerRTC, (0, 0))
alphaLayerABL.paste(agradientT.rotate(180), (0, height - overlap_size_y))
alphaLayerABL.paste(agradientC.rotate(180).resize((overlap_size_x, overlap_size_y)), (width - overlap_size_x, height - overlap_size_y))
alphaLayerABR = Image.new("L", (width, height), 255)
alphaLayerABR.paste(alphaLayerLBC, (0, 0))
alphaLayerABR.paste(agradientT, (0, 0))
alphaLayerABR.paste(agradientC.resize((overlap_size_x, overlap_size_y)), (0, 0))
alphaLayerABB = Image.new("L", (width, height), 255)
alphaLayerABB.paste(alphaLayerRTC, (0, 0))
alphaLayerABB.paste(agradientL, (0, 0))
alphaLayerABB.paste(agradientC.resize((overlap_size_x, overlap_size_y)), (0, 0))
# All-around layer
alphaLayerAA = Image.new("L", (width, height), 255)
alphaLayerAA.paste(alphaLayerABT, (0, 0))
alphaLayerAA.paste(agradientT, (0, 0))
alphaLayerAA.paste(agradientC.resize((overlap_size_x, overlap_size_y)), (0, 0))
alphaLayerAA.paste(agradientC.rotate(270).resize((overlap_size_x, overlap_size_y)), (width - overlap_size_x, 0))
# Clean up temporary gradients
del agradientL
del agradientT
del agradientC
def make_image(x_T):
# Make main tiles -------------------------------------------------
if embiggen_tiles:
print(f'>> Making {len(embiggen_tiles)} Embiggen tiles...')
else:
print(f'>> Making {(emb_tiles_x * emb_tiles_y)} Embiggen tiles ({emb_tiles_x}x{emb_tiles_y})...')
emb_tile_store = []
for tile in range(emb_tiles_x * emb_tiles_y):
# Determine if this is a re-run and replace
if embiggen_tiles and not tile in embiggen_tiles:
continue
# Get row and column entries
emb_row_i = tile // emb_tiles_x
emb_column_i = tile % emb_tiles_x
# Determine bounds to cut up the init image
# Determine upper-left point
if emb_column_i + 1 == emb_tiles_x:
left = initsuperwidth - width
else:
left = round(emb_column_i * (width - overlap_size_x))
if emb_row_i + 1 == emb_tiles_y:
top = initsuperheight - height
else:
top = round(emb_row_i * (height - overlap_size_y))
right = left + width
bottom = top + height
# Cropped image of above dimension (does not modify the original)
newinitimage = initsuperimage.crop((left, top, right, bottom))
# DEBUG:
# newinitimagepath = init_img[0:-4] + f'_emb_Ti{tile}.png'
# newinitimage.save(newinitimagepath)
if embiggen_tiles:
print(f'Making tile #{tile + 1} ({embiggen_tiles.index(tile) + 1} of {len(embiggen_tiles)} requested)')
else:
print(f'Starting {tile + 1} of {(emb_tiles_x * emb_tiles_y)} tiles')
# create a torch tensor from an Image
newinitimage = np.array(newinitimage).astype(np.float32) / 255.0
newinitimage = newinitimage[None].transpose(0, 3, 1, 2)
newinitimage = torch.from_numpy(newinitimage)
newinitimage = 2.0 * newinitimage - 1.0
newinitimage = newinitimage.to(self.model.device)
tile_results = gen_img2img.generate(
prompt,
iterations = 1,
seed = self.seed,
sampler = sampler,
steps = steps,
cfg_scale = cfg_scale,
conditioning = conditioning,
ddim_eta = ddim_eta,
image_callback = None, # called only after the final image is generated
step_callback = step_callback, # called after each intermediate image is generated
width = width,
height = height,
init_img = init_img, # img2img doesn't need this, but it might in the future
init_image = newinitimage, # notice that init_image is different from init_img
mask_image = None,
strength = strength,
)
emb_tile_store.append(tile_results[0][0])
# DEBUG (but, also has other uses), worth saving if you want tiles without a transparency overlap to manually composite
# emb_tile_store[-1].save(init_img[0:-4] + f'_emb_To{tile}.png')
del newinitimage
# Sanity check we have them all
if len(emb_tile_store) == (emb_tiles_x * emb_tiles_y) or (embiggen_tiles != [] and len(emb_tile_store) == len(embiggen_tiles)):
outputsuperimage = Image.new("RGBA", (initsuperwidth, initsuperheight))
if embiggen_tiles:
outputsuperimage.alpha_composite(initsuperimage.convert('RGBA'), (0, 0))
for tile in range(emb_tiles_x * emb_tiles_y):
if embiggen_tiles:
if tile in embiggen_tiles:
intileimage = emb_tile_store.pop(0)
else:
continue
else:
intileimage = emb_tile_store[tile]
intileimage = intileimage.convert('RGBA')
# Get row and column entries
emb_row_i = tile // emb_tiles_x
emb_column_i = tile % emb_tiles_x
if emb_row_i == 0 and emb_column_i == 0 and not embiggen_tiles:
left = 0
top = 0
else:
# Determine upper-left point
if emb_column_i + 1 == emb_tiles_x:
left = initsuperwidth - width
else:
left = round(emb_column_i * (width - overlap_size_x))
if emb_row_i + 1 == emb_tiles_y:
top = initsuperheight - height
else:
top = round(emb_row_i * (height - overlap_size_y))
# Handle gradients for various conditions
# Handle emb_rerun case
if embiggen_tiles:
# top of image
if emb_row_i == 0:
if emb_column_i == 0:
if (tile+1) in embiggen_tiles: # Look-ahead right
if (tile+emb_tiles_x) not in embiggen_tiles: # Look-ahead down
intileimage.putalpha(alphaLayerB)
# Otherwise do nothing on this tile
elif (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down only
intileimage.putalpha(alphaLayerR)
else:
intileimage.putalpha(alphaLayerRBC)
elif emb_column_i == emb_tiles_x - 1:
if (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down
intileimage.putalpha(alphaLayerL)
else:
intileimage.putalpha(alphaLayerLBC)
else:
if (tile+1) in embiggen_tiles: # Look-ahead right
if (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down
intileimage.putalpha(alphaLayerL)
else:
intileimage.putalpha(alphaLayerLBC)
elif (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down only
intileimage.putalpha(alphaLayerLR)
else:
intileimage.putalpha(alphaLayerABT)
# bottom of image
elif emb_row_i == emb_tiles_y - 1:
if emb_column_i == 0:
if (tile+1) in embiggen_tiles: # Look-ahead right
intileimage.putalpha(alphaLayerT)
else:
intileimage.putalpha(alphaLayerRTC)
elif emb_column_i == emb_tiles_x - 1:
# No tiles to look ahead to
intileimage.putalpha(alphaLayerLTC)
else:
if (tile+1) in embiggen_tiles: # Look-ahead right
intileimage.putalpha(alphaLayerLTC)
else:
intileimage.putalpha(alphaLayerABB)
# vertical middle of image
else:
if emb_column_i == 0:
if (tile+1) in embiggen_tiles: # Look-ahead right
if (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down
intileimage.putalpha(alphaLayerT)
else:
intileimage.putalpha(alphaLayerTB)
elif (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down only
intileimage.putalpha(alphaLayerRTC)
else:
intileimage.putalpha(alphaLayerABL)
elif emb_column_i == emb_tiles_x - 1:
if (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down
intileimage.putalpha(alphaLayerLTC)
else:
intileimage.putalpha(alphaLayerABR)
else:
if (tile+1) in embiggen_tiles: # Look-ahead right
if (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down
intileimage.putalpha(alphaLayerLTC)
else:
intileimage.putalpha(alphaLayerABR)
elif (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down only
intileimage.putalpha(alphaLayerABB)
else:
intileimage.putalpha(alphaLayerAA)
# Handle normal tiling case (much simpler - since we tile left to right, top to bottom)
else:
if emb_row_i == 0 and emb_column_i >= 1:
intileimage.putalpha(alphaLayerL)
elif emb_row_i >= 1 and emb_column_i == 0:
intileimage.putalpha(alphaLayerT)
else:
intileimage.putalpha(alphaLayerLTC)
# Layer tile onto final image
outputsuperimage.alpha_composite(intileimage, (left, top))
else:
print(f'Error: could not find all Embiggen output tiles in memory? Something must have gone wrong with img2img generation.')
# after internal loops and patching up return Embiggen image
return outputsuperimage
# end of function declaration
return make_image

View File

@ -1,5 +1,5 @@
'''
ldm.dream.generator.txt2img descends from ldm.dream.generator
ldm.dream.generator.img2img descends from ldm.dream.generator
'''
import torch

View File

@ -73,6 +73,10 @@ class PromptFormatter:
switches.append(f'-G{opt.gfpgan_strength}')
if opt.upscale:
switches.append(f'-U {" ".join([str(u) for u in opt.upscale])}')
if opt.embiggen:
switches.append(f'-embiggen {" ".join([str(u) for u in opt.embiggen])}')
if opt.embiggen_tiles:
switches.append(f'-embiggen_tiles {" ".join([str(u) for u in opt.embiggen_tiles])}')
if opt.variation_amount > 0:
switches.append(f'-v{opt.variation_amount}')
if opt.with_variations:

View File

@ -207,6 +207,9 @@ class Generate:
init_mask = None,
fit = False,
strength = None,
# these are specific to embiggen (which also relies on img2img args)
embiggen = None,
embiggen_tiles = None,
# these are specific to GFPGAN/ESRGAN
gfpgan_strength= 0,
save_original = False,
@ -232,6 +235,10 @@ class Generate:
image_callback // a function or method that will be called each time an image is generated
with_variations // a weighted list [(seed_1, weight_1), (seed_2, weight_2), ...] of variations which should be applied before doing any generation
variation_amount // optional 0-1 value to slerp from -S noise to random noise (allows variations on an image)
threshold // optional value to add thresholding to latent values for k-diffusion samplers (0 disables)
perlin // optional 0-1 value to add a percentage of perlin noise to the initial noise
embiggen // scale factor relative to the size of the --init_img (-I), followed by ESRGAN upscaling strength (0-1.0), followed by minimum amount of overlap between tiles as a decimal ratio (0 - 1.0) or number of pixels
embiggen_tiles // list of tiles by number in order to process and replace onto the image e.g. `0 2 4`
To use the step callback, define a function that receives two arguments:
- Image GPU data
@ -276,6 +283,9 @@ class Generate:
assert (
0.0 <= variation_amount <= 1.0
), '-v --variation_amount must be in [0.0, 1.0]'
assert (
(embiggen == None and embiggen_tiles == None) or ((embiggen != None or embiggen_tiles != None) and init_img != None)
), 'Embiggen requires an init/input image to be specified'
# check this logic - doesn't look right
if len(with_variations) > 0 or variation_amount > 1.0:
@ -312,6 +322,8 @@ class Generate:
if (init_image is not None) and (mask_image is not None):
generator = self._make_inpaint()
elif (embiggen != None or embiggen_tiles != None):
generator = self._make_embiggen()
elif init_image is not None:
generator = self._make_img2img()
else:
@ -331,11 +343,14 @@ class Generate:
step_callback = step_callback, # called after each intermediate image is generated
width = width,
height = height,
init_img = init_img, # embiggen needs to manipulate from the unmodified init_img
init_image = init_image, # notice that init_image is different from init_img
mask_image = mask_image,
strength = strength,
threshold = threshold,
perlin = perlin,
embiggen = embiggen,
embiggen_tiles = embiggen_tiles,
)
if upscale is not None or gfpgan_strength > 0:
@ -408,6 +423,12 @@ class Generate:
from ldm.dream.generator.img2img import Img2Img
self.generators['img2img'] = Img2Img(self.model)
return self.generators['img2img']
def _make_embiggen(self):
if not self.generators.get('embiggen'):
from ldm.dream.generator.embiggen import Embiggen
self.generators['embiggen'] = Embiggen(self.model)
return self.generators['embiggen']
def _make_txt2img(self):
if not self.generators.get('txt2img'):

View File

@ -11,7 +11,7 @@ opencv-python==4.6.0.66
pillow==9.2.0
pudb==2019.2
torch==1.12.1
torchvision==0.12.0
torchvision==0.13.0
pytorch-lightning==1.4.2
streamlit==1.12.0
test-tube>=0.7.5

View File

@ -631,7 +631,7 @@ def create_cmd_parser():
nargs='+',
default=None,
type=float,
help='Scale factor (2, 4) for upscaling followed by upscaling strength (0-1.0). If strength not specified, defaults to 0.75'
help='Scale factor (2, 4) for upscaling final output followed by upscaling strength (0-1.0). If strength not specified, defaults to 0.75'
)
parser.add_argument(
'-save_orig',
@ -639,6 +639,20 @@ def create_cmd_parser():
action='store_true',
help='Save original. Use it when upscaling to save both versions.',
)
parser.add_argument(
'-embiggen',
nargs='+',
default=None,
type=float,
help='Embiggen tiled img2img for higher resolution and detail without extra VRAM usage. Takes scale factor relative to the size of the --init_img (-I), followed by ESRGAN upscaling strength (0-1.0), followed by minimum amount of overlap between tiles as a decimal ratio (0 - 1.0) or number of pixels. ESRGAN strength defaults to 0.75, and overlap defaults to 0.25 . ESRGAN is used to upscale the init prior to cutting it into tiles/pieces to run through img2img and then stitch back togeather.',
)
parser.add_argument(
'-embiggen_tiles',
nargs='+',
default=None,
type=int,
help='If while doing Embiggen we are altering only parts of the image, takes a list of tiles by number to process and replace onto the image e.g. `1 3 5`, useful for redoing problematic spots from a prior Embiggen run',
)
# variants is going to be superseded by a generalized "prompt-morph" function
# parser.add_argument('-v','--variants',type=int,help="in img2img mode, the first generated image will get passed back to img2img to generate the requested number of variants")
parser.add_argument(