mirror of
https://github.com/invoke-ai/InvokeAI
synced 2024-08-30 20:32:17 +00:00
Merge branch 'development' into development
This commit is contained in:
commit
4a5a228fd8
@ -27,7 +27,6 @@ report bugs and make feature requests. Be sure to use the provided
|
||||
templates. They will help aid diagnose issues faster._
|
||||
|
||||
# **Table of Contents**
|
||||
|
||||
1. [Installation](#installation)
|
||||
2. [Major Features](#features)
|
||||
3. [Changelog](#latest-changes)
|
||||
@ -86,6 +85,8 @@ To run in full-precision mode, start `dream.py` with the
|
||||
|
||||
- ## [GFPGAN and Real-ESRGAN Support](docs/features/UPSCALE.md)
|
||||
|
||||
- ## [Embiggen upscaling](docs/features/EMBIGGEN.md)
|
||||
|
||||
- ## [Seamless Tiling](docs/features/OTHER.md#seamless-tiling)
|
||||
|
||||
- ## [Google Colab](docs/features/OTHER.md#google-colab)
|
||||
@ -136,7 +137,7 @@ To run in full-precision mode, start `dream.py` with the
|
||||
- Works on M1 Apple hardware.
|
||||
- Multiple bug fixes.
|
||||
|
||||
For older changelogs, please visit **[CHANGELOGS](docs/CHANGELOG.md)**.
|
||||
For older changelogs, please visit **[CHANGELOGS](docs/CHANGELOG.md)**.
|
||||
|
||||
# Troubleshooting
|
||||
|
||||
|
134
docs/features/EMBIGGEN.md
Normal file
134
docs/features/EMBIGGEN.md
Normal file
@ -0,0 +1,134 @@
|
||||
# **Embiggen -- upscale your images on limited memory machines**
|
||||
|
||||
GFPGAN and Real-ESRGAN are both memory intensive. In order to avoid
|
||||
crashes and memory overloads during the Stable Diffusion process,
|
||||
these effects are applied after Stable Diffusion has completed its
|
||||
work.
|
||||
|
||||
In single image generations, you will see the output right away but
|
||||
when you are using multiple iterations, the images will first be
|
||||
generated and then upscaled and face restored after that process is
|
||||
complete. While the image generation is taking place, you will still
|
||||
be able to preview the base images.
|
||||
|
||||
If you wish to stop during the image generation but want to upscale or
|
||||
face restore a particular generated image, pass it again with the same
|
||||
prompt and generated seed along with the `-U` and `-G` prompt
|
||||
arguments to perform those actions.
|
||||
|
||||
## Embiggen
|
||||
|
||||
If you wanted to be able to do more (pixels) without running out of VRAM,
|
||||
or you want to upscale with details that couldn't possibly appear
|
||||
without the context of a prompt, this is the feature to try out.
|
||||
|
||||
Embiggen automates the process of taking an init image, upscaling it,
|
||||
cutting it into smaller tiles that slightly overlap, running all the
|
||||
tiles through img2img to refine details with respect to the prompt,
|
||||
and "stitching" the tiles back together into a cohesive image.
|
||||
|
||||
It automatically computes how many tiles are needed, and so it can be fed
|
||||
*ANY* size init image and perform Img2Img on it (though it will be run only
|
||||
one tile at a time, which can cause problems, see the Note at the end).
|
||||
|
||||
If you're familiar with "GoBig" (ala [progrock-stable](https://github.com/lowfuel/progrock-stable))
|
||||
it's similar to that, except it can work up to an arbitrarily large size
|
||||
(instead of just 2x), with tile overlaps configurable as a ratio, and
|
||||
has extra logic to re-run any number of the tile sub-sections of the image
|
||||
if for example a small part of a huge run got messed up.
|
||||
|
||||
**Usage**
|
||||
|
||||
`-embiggen <scaling_factor> <esrgan_strength> <overlap_ratio OR overlap_pixels>`
|
||||
|
||||
Takes a scaling factor relative to the size of the `--init_img` (`-I`), followed by
|
||||
ESRGAN upscaling strength (0 - 1.0), followed by minimum amount of overlap
|
||||
between tiles as a decimal ratio (0 - 1.0) *OR* a number of pixels.
|
||||
|
||||
The scaling factor is how much larger than the `--init_img` the output
|
||||
should be, and will multiply both x and y axis, so an image that is a
|
||||
scaling factor of 3.0 has 3*3= 9 times as many pixels, and will take
|
||||
(at least) 9 times as long (see overlap for why it might be
|
||||
longer). If the `--init_img` is already the right size `-embiggen 1`,
|
||||
and it can also be less than one if the init_img is too big.
|
||||
|
||||
Esrgan_strength defaults to 0.75, and the overlap_ratio defaults to
|
||||
0.25, both are optional.
|
||||
|
||||
|
||||
Unlike Img2Img, the `--width` (`-W`) and `--height` (`-H`) arguments
|
||||
do not control the size of the image as a whole, but the size of the
|
||||
tiles used to Embiggen the image.
|
||||
|
||||
ESRGAN is used to upscale the `--init_img` prior to cutting it into
|
||||
tiles/pieces to run through img2img and then stitch back
|
||||
together. Embiggen can be run without ESRGAN; just set the strength to
|
||||
zero (e.g. `-embiggen 1.75 0`). The output of Embiggen can also be
|
||||
upscaled after it's finished (`-U`).
|
||||
|
||||
The overlap is the minimum that tiles will overlap with adjacent
|
||||
tiles, specified as either a ratio or a number of pixels. How much the
|
||||
tiles overlap determines the likelihood the tiling will be noticable,
|
||||
really small overlaps (e.g. a couple of pixels) may produce noticeable
|
||||
grid-like fuzzy distortions in the final stitched image. Though, as
|
||||
the overlapping space doesn't contribute to making the image bigger,
|
||||
and the larger the overlap the more tiles (and the more time) it will
|
||||
take to finish.
|
||||
|
||||
Because the overlapping parts of tiles don't "contribute" to
|
||||
increasing size, every tile after the first in a row or column
|
||||
effectively only covers an extra `1 - overlap_ratio` on each axis. If
|
||||
the input/`--init_img` is same size as a tile, the ideal (for time)
|
||||
scaling factors with the default overlap (0.25) are 1.75, 2.5, 3.25,
|
||||
4.0 etc..
|
||||
|
||||
`-embiggen_tiles <spaced list of tiles>`
|
||||
|
||||
An advanced usage useful if you only want to alter parts of the image
|
||||
while running Embiggen. It takes a list of tiles by number to run and
|
||||
replace onto the initial image e.g. `1 3 5`. It's useful for either
|
||||
fixing problem spots from a previous Embiggen run, or selectively
|
||||
altering the prompt for sections of an image - for creative or
|
||||
coherency reasons.
|
||||
|
||||
Tiles are numbered starting with one, and left-to-right,
|
||||
top-to-bottom. So, if you are generating a 3x3 tiled image, the
|
||||
middle row would be `4 5 6`.
|
||||
|
||||
**Example Usage**
|
||||
|
||||
Running Embiggen with 512x512 tiles on an existing image, scaling up by a factor of 2.5x;
|
||||
and doing the same again (default ESRGAN strength is 0.75, default overlap between tiles is 0.25):
|
||||
|
||||
```
|
||||
dream > a photo of a forest at sunset -s 100 -W 512 -H 512 -I outputs/forest.png -f 0.4 -embiggen 2.5
|
||||
dream > a photo of a forest at sunset -s 100 -W 512 -H 512 -I outputs/forest.png -f 0.4 -embiggen 2.5 0.75 0.25
|
||||
```
|
||||
|
||||
If your starting image was also 512x512 this should have taken 9 tiles.
|
||||
|
||||
If there weren't enough clouds in the sky of that forest you just made
|
||||
(and that image is about 1280 pixels (512*2.5) wide A.K.A. three
|
||||
512x512 tiles with 0.25 overlaps wide) we can replace that top row of
|
||||
tiles:
|
||||
|
||||
```
|
||||
dream> a photo of puffy clouds over a forest at sunset -s 100 -W 512 -H 512 -I outputs/000002.seed.png -f 0.5 -embiggen_tiles 1 2 3
|
||||
```
|
||||
|
||||
**Note**
|
||||
|
||||
Because the same prompt is used on all the tiled images, and the model
|
||||
doesn't have the context of anything outside the tile being run - it
|
||||
can end up creating repeated pattern (also called 'motifs') across all
|
||||
the tiles based on that prompt. The best way to combat this is
|
||||
lowering the `--strength` (`-f`) to stay more true to the init image,
|
||||
and increasing the number of steps so there is more compute-time to
|
||||
create the detail. Anecdotally `--strength` 0.35-0.45 works pretty
|
||||
well on most things. It may also work great in some examples even with
|
||||
the `--strength` set high for patterns, landscapes, or subjects that
|
||||
are more abstract. Because this is (relatively) fast, you can also
|
||||
always create a few Embiggen'ed images and manually composite them to
|
||||
preserve the best parts from each.
|
||||
|
||||
Author: [Travco](https://github.com/travco)
|
403
ldm/dream/generator/embiggen.py
Normal file
403
ldm/dream/generator/embiggen.py
Normal file
@ -0,0 +1,403 @@
|
||||
'''
|
||||
ldm.dream.generator.embiggen descends from ldm.dream.generator
|
||||
and generates with ldm.dream.generator.img2img
|
||||
'''
|
||||
|
||||
import torch
|
||||
import numpy as np
|
||||
from PIL import Image
|
||||
from ldm.dream.generator.base import Generator
|
||||
from ldm.models.diffusion.ddim import DDIMSampler
|
||||
from ldm.dream.generator.img2img import Img2Img
|
||||
|
||||
class Embiggen(Generator):
|
||||
def __init__(self,model):
|
||||
super().__init__(model)
|
||||
self.init_latent = None
|
||||
|
||||
@torch.no_grad()
|
||||
def get_make_image(
|
||||
self,
|
||||
prompt,
|
||||
sampler,
|
||||
steps,
|
||||
cfg_scale,
|
||||
ddim_eta,
|
||||
conditioning,
|
||||
init_img,
|
||||
strength,
|
||||
width,
|
||||
height,
|
||||
embiggen,
|
||||
embiggen_tiles,
|
||||
step_callback=None,
|
||||
**kwargs
|
||||
):
|
||||
"""
|
||||
Returns a function returning an image derived from the prompt and multi-stage twice-baked potato layering over the img2img on the initial image
|
||||
Return value depends on the seed at the time you call it
|
||||
"""
|
||||
# Construct embiggen arg array, and sanity check arguments
|
||||
if embiggen == None: # embiggen can also be called with just embiggen_tiles
|
||||
embiggen = [1.0] # If not specified, assume no scaling
|
||||
elif embiggen[0] < 0 :
|
||||
embiggen[0] = 1.0
|
||||
print('>> Embiggen scaling factor cannot be negative, fell back to the default of 1.0 !')
|
||||
if len(embiggen) < 2:
|
||||
embiggen.append(0.75)
|
||||
elif embiggen[1] > 1.0 or embiggen[1] < 0 :
|
||||
embiggen[1] = 0.75
|
||||
print('>> Embiggen upscaling strength for ESRGAN must be between 0 and 1, fell back to the default of 0.75 !')
|
||||
if len(embiggen) < 3:
|
||||
embiggen.append(0.25)
|
||||
elif embiggen[2] < 0 :
|
||||
embiggen[2] = 0.25
|
||||
print('>> Overlap size for Embiggen must be a positive ratio between 0 and 1 OR a number of pixels, fell back to the default of 0.25 !')
|
||||
|
||||
# Convert tiles from their user-freindly count-from-one to count-from-zero, because we need to do modulo math
|
||||
# and then sort them, because... people.
|
||||
if embiggen_tiles:
|
||||
embiggen_tiles = list(map(lambda n: n-1, embiggen_tiles))
|
||||
embiggen_tiles.sort()
|
||||
|
||||
# Prep img2img generator, since we wrap over it
|
||||
gen_img2img = Img2Img(self.model)
|
||||
|
||||
# Open original init image (not a tensor) to manipulate
|
||||
initsuperimage = Image.open(init_img)
|
||||
|
||||
with Image.open(init_img) as img:
|
||||
initsuperimage = img.convert('RGB')
|
||||
|
||||
# Size of the target super init image in pixels
|
||||
initsuperwidth, initsuperheight = initsuperimage.size
|
||||
|
||||
# Increase by scaling factor if not already resized, using ESRGAN as able
|
||||
if embiggen[0] != 1.0:
|
||||
initsuperwidth = round(initsuperwidth*embiggen[0])
|
||||
initsuperheight = round(initsuperheight*embiggen[0])
|
||||
if embiggen[1] > 0: # No point in ESRGAN upscaling if strength is set zero
|
||||
from ldm.gfpgan.gfpgan_tools import (
|
||||
real_esrgan_upscale,
|
||||
)
|
||||
print(f'>> ESRGAN upscaling init image prior to cutting with Embiggen with strength {embiggen[1]}')
|
||||
if embiggen[0] > 2:
|
||||
initsuperimage = real_esrgan_upscale(
|
||||
initsuperimage,
|
||||
embiggen[1], # upscale strength
|
||||
4, # upscale scale
|
||||
self.seed,
|
||||
)
|
||||
else:
|
||||
initsuperimage = real_esrgan_upscale(
|
||||
initsuperimage,
|
||||
embiggen[1], # upscale strength
|
||||
2, # upscale scale
|
||||
self.seed,
|
||||
)
|
||||
# We could keep recursively re-running ESRGAN for a requested embiggen[0] larger than 4x
|
||||
# but from personal experiance it doesn't greatly improve anything after 4x
|
||||
# Resize to target scaling factor resolution
|
||||
initsuperimage = initsuperimage.resize((initsuperwidth, initsuperheight), Image.Resampling.LANCZOS)
|
||||
|
||||
# Use width and height as tile widths and height
|
||||
# Determine buffer size in pixels
|
||||
if embiggen[2] < 1:
|
||||
if embiggen[2] < 0:
|
||||
embiggen[2] = 0
|
||||
overlap_size_x = round(embiggen[2] * width)
|
||||
overlap_size_y = round(embiggen[2] * height)
|
||||
else:
|
||||
overlap_size_x = round(embiggen[2])
|
||||
overlap_size_y = round(embiggen[2])
|
||||
|
||||
# With overall image width and height known, determine how many tiles we need
|
||||
def ceildiv(a, b):
|
||||
return -1 * (-a // b)
|
||||
|
||||
# X and Y needs to be determined independantly (we may have savings on one based on the buffer pixel count)
|
||||
# (initsuperwidth - width) is the area remaining to the right that we need to layers tiles to fill
|
||||
# (width - overlap_size_x) is how much new we can fill with a single tile
|
||||
emb_tiles_x = 1
|
||||
emb_tiles_y = 1
|
||||
if (initsuperwidth - width) > 0:
|
||||
emb_tiles_x = ceildiv(initsuperwidth - width, width - overlap_size_x) + 1
|
||||
if (initsuperheight - height) > 0:
|
||||
emb_tiles_y = ceildiv(initsuperheight - height, height - overlap_size_y) + 1
|
||||
# Sanity
|
||||
assert emb_tiles_x > 1 or emb_tiles_y > 1, f'ERROR: Based on the requested dimensions of {initsuperwidth}x{initsuperheight} and tiles of {width}x{height} you don\'t need to Embiggen! Check your arguments.'
|
||||
|
||||
# Prep alpha layers --------------
|
||||
# https://stackoverflow.com/questions/69321734/how-to-create-different-transparency-like-gradient-with-python-pil
|
||||
# agradientL is Left-side transparent
|
||||
agradientL = Image.linear_gradient('L').rotate(90).resize((overlap_size_x, height))
|
||||
# agradientT is Top-side transparent
|
||||
agradientT = Image.linear_gradient('L').resize((width, overlap_size_y))
|
||||
# radial corner is the left-top corner, made full circle then cut to just the left-top quadrant
|
||||
agradientC = Image.new('L', (256, 256))
|
||||
for y in range(256):
|
||||
for x in range(256):
|
||||
#Find distance to lower right corner (numpy takes arrays)
|
||||
distanceToLR = np.sqrt([(255 - x) ** 2 + (255 - y) ** 2])[0]
|
||||
#Clamp values to max 255
|
||||
if distanceToLR > 255:
|
||||
distanceToLR = 255
|
||||
#Place the pixel as invert of distance
|
||||
agradientC.putpixel((x, y), int(255 - distanceToLR))
|
||||
|
||||
# Create alpha layers default fully white
|
||||
alphaLayerL = Image.new("L", (width, height), 255)
|
||||
alphaLayerT = Image.new("L", (width, height), 255)
|
||||
alphaLayerLTC = Image.new("L", (width, height), 255)
|
||||
# Paste gradients into alpha layers
|
||||
alphaLayerL.paste(agradientL, (0, 0))
|
||||
alphaLayerT.paste(agradientT, (0, 0))
|
||||
alphaLayerLTC.paste(agradientL, (0, 0))
|
||||
alphaLayerLTC.paste(agradientT, (0, 0))
|
||||
alphaLayerLTC.paste(agradientC.resize((overlap_size_x, overlap_size_y)), (0, 0))
|
||||
|
||||
if embiggen_tiles:
|
||||
# Individual unconnected sides
|
||||
alphaLayerR = Image.new("L", (width, height), 255)
|
||||
alphaLayerR.paste(agradientL.rotate(180), (width - overlap_size_x, 0))
|
||||
alphaLayerB = Image.new("L", (width, height), 255)
|
||||
alphaLayerB.paste(agradientT.rotate(180), (0, height - overlap_size_y))
|
||||
alphaLayerTB = Image.new("L", (width, height), 255)
|
||||
alphaLayerTB.paste(agradientT, (0, 0))
|
||||
alphaLayerTB.paste(agradientT.rotate(180), (0, height - overlap_size_y))
|
||||
alphaLayerLR = Image.new("L", (width, height), 255)
|
||||
alphaLayerLR.paste(agradientL, (0, 0))
|
||||
alphaLayerLR.paste(agradientL.rotate(180), (width - overlap_size_x, 0))
|
||||
|
||||
# Sides and corner Layers
|
||||
alphaLayerRBC = Image.new("L", (width, height), 255)
|
||||
alphaLayerRBC.paste(agradientL.rotate(180), (width - overlap_size_x, 0))
|
||||
alphaLayerRBC.paste(agradientT.rotate(180), (0, height - overlap_size_y))
|
||||
alphaLayerRBC.paste(agradientC.rotate(180).resize((overlap_size_x, overlap_size_y)), (width - overlap_size_x, height - overlap_size_y))
|
||||
alphaLayerLBC = Image.new("L", (width, height), 255)
|
||||
alphaLayerLBC.paste(agradientL, (0, 0))
|
||||
alphaLayerLBC.paste(agradientT.rotate(180), (0, height - overlap_size_y))
|
||||
alphaLayerLBC.paste(agradientC.rotate(90).resize((overlap_size_x, overlap_size_y)), (0, height - overlap_size_y))
|
||||
alphaLayerRTC = Image.new("L", (width, height), 255)
|
||||
alphaLayerRTC.paste(agradientL.rotate(180), (width - overlap_size_x, 0))
|
||||
alphaLayerRTC.paste(agradientT, (0, 0))
|
||||
alphaLayerRTC.paste(agradientC.rotate(270).resize((overlap_size_x, overlap_size_y)), (width - overlap_size_x, 0))
|
||||
|
||||
# All but X layers
|
||||
alphaLayerABT = Image.new("L", (width, height), 255)
|
||||
alphaLayerABT.paste(alphaLayerLBC, (0, 0))
|
||||
alphaLayerABT.paste(agradientL.rotate(180), (width - overlap_size_x, 0))
|
||||
alphaLayerABT.paste(agradientC.rotate(180).resize((overlap_size_x, overlap_size_y)), (width - overlap_size_x, height - overlap_size_y))
|
||||
alphaLayerABL = Image.new("L", (width, height), 255)
|
||||
alphaLayerABL.paste(alphaLayerRTC, (0, 0))
|
||||
alphaLayerABL.paste(agradientT.rotate(180), (0, height - overlap_size_y))
|
||||
alphaLayerABL.paste(agradientC.rotate(180).resize((overlap_size_x, overlap_size_y)), (width - overlap_size_x, height - overlap_size_y))
|
||||
alphaLayerABR = Image.new("L", (width, height), 255)
|
||||
alphaLayerABR.paste(alphaLayerLBC, (0, 0))
|
||||
alphaLayerABR.paste(agradientT, (0, 0))
|
||||
alphaLayerABR.paste(agradientC.resize((overlap_size_x, overlap_size_y)), (0, 0))
|
||||
alphaLayerABB = Image.new("L", (width, height), 255)
|
||||
alphaLayerABB.paste(alphaLayerRTC, (0, 0))
|
||||
alphaLayerABB.paste(agradientL, (0, 0))
|
||||
alphaLayerABB.paste(agradientC.resize((overlap_size_x, overlap_size_y)), (0, 0))
|
||||
|
||||
# All-around layer
|
||||
alphaLayerAA = Image.new("L", (width, height), 255)
|
||||
alphaLayerAA.paste(alphaLayerABT, (0, 0))
|
||||
alphaLayerAA.paste(agradientT, (0, 0))
|
||||
alphaLayerAA.paste(agradientC.resize((overlap_size_x, overlap_size_y)), (0, 0))
|
||||
alphaLayerAA.paste(agradientC.rotate(270).resize((overlap_size_x, overlap_size_y)), (width - overlap_size_x, 0))
|
||||
|
||||
# Clean up temporary gradients
|
||||
del agradientL
|
||||
del agradientT
|
||||
del agradientC
|
||||
|
||||
def make_image(x_T):
|
||||
# Make main tiles -------------------------------------------------
|
||||
if embiggen_tiles:
|
||||
print(f'>> Making {len(embiggen_tiles)} Embiggen tiles...')
|
||||
else:
|
||||
print(f'>> Making {(emb_tiles_x * emb_tiles_y)} Embiggen tiles ({emb_tiles_x}x{emb_tiles_y})...')
|
||||
|
||||
emb_tile_store = []
|
||||
for tile in range(emb_tiles_x * emb_tiles_y):
|
||||
# Determine if this is a re-run and replace
|
||||
if embiggen_tiles and not tile in embiggen_tiles:
|
||||
continue
|
||||
# Get row and column entries
|
||||
emb_row_i = tile // emb_tiles_x
|
||||
emb_column_i = tile % emb_tiles_x
|
||||
# Determine bounds to cut up the init image
|
||||
# Determine upper-left point
|
||||
if emb_column_i + 1 == emb_tiles_x:
|
||||
left = initsuperwidth - width
|
||||
else:
|
||||
left = round(emb_column_i * (width - overlap_size_x))
|
||||
if emb_row_i + 1 == emb_tiles_y:
|
||||
top = initsuperheight - height
|
||||
else:
|
||||
top = round(emb_row_i * (height - overlap_size_y))
|
||||
right = left + width
|
||||
bottom = top + height
|
||||
|
||||
# Cropped image of above dimension (does not modify the original)
|
||||
newinitimage = initsuperimage.crop((left, top, right, bottom))
|
||||
# DEBUG:
|
||||
# newinitimagepath = init_img[0:-4] + f'_emb_Ti{tile}.png'
|
||||
# newinitimage.save(newinitimagepath)
|
||||
|
||||
if embiggen_tiles:
|
||||
print(f'Making tile #{tile + 1} ({embiggen_tiles.index(tile) + 1} of {len(embiggen_tiles)} requested)')
|
||||
else:
|
||||
print(f'Starting {tile + 1} of {(emb_tiles_x * emb_tiles_y)} tiles')
|
||||
|
||||
# create a torch tensor from an Image
|
||||
newinitimage = np.array(newinitimage).astype(np.float32) / 255.0
|
||||
newinitimage = newinitimage[None].transpose(0, 3, 1, 2)
|
||||
newinitimage = torch.from_numpy(newinitimage)
|
||||
newinitimage = 2.0 * newinitimage - 1.0
|
||||
newinitimage = newinitimage.to(self.model.device)
|
||||
|
||||
tile_results = gen_img2img.generate(
|
||||
prompt,
|
||||
iterations = 1,
|
||||
seed = self.seed,
|
||||
sampler = sampler,
|
||||
steps = steps,
|
||||
cfg_scale = cfg_scale,
|
||||
conditioning = conditioning,
|
||||
ddim_eta = ddim_eta,
|
||||
image_callback = None, # called only after the final image is generated
|
||||
step_callback = step_callback, # called after each intermediate image is generated
|
||||
width = width,
|
||||
height = height,
|
||||
init_img = init_img, # img2img doesn't need this, but it might in the future
|
||||
init_image = newinitimage, # notice that init_image is different from init_img
|
||||
mask_image = None,
|
||||
strength = strength,
|
||||
)
|
||||
|
||||
emb_tile_store.append(tile_results[0][0])
|
||||
# DEBUG (but, also has other uses), worth saving if you want tiles without a transparency overlap to manually composite
|
||||
# emb_tile_store[-1].save(init_img[0:-4] + f'_emb_To{tile}.png')
|
||||
del newinitimage
|
||||
|
||||
# Sanity check we have them all
|
||||
if len(emb_tile_store) == (emb_tiles_x * emb_tiles_y) or (embiggen_tiles != [] and len(emb_tile_store) == len(embiggen_tiles)):
|
||||
outputsuperimage = Image.new("RGBA", (initsuperwidth, initsuperheight))
|
||||
if embiggen_tiles:
|
||||
outputsuperimage.alpha_composite(initsuperimage.convert('RGBA'), (0, 0))
|
||||
for tile in range(emb_tiles_x * emb_tiles_y):
|
||||
if embiggen_tiles:
|
||||
if tile in embiggen_tiles:
|
||||
intileimage = emb_tile_store.pop(0)
|
||||
else:
|
||||
continue
|
||||
else:
|
||||
intileimage = emb_tile_store[tile]
|
||||
intileimage = intileimage.convert('RGBA')
|
||||
# Get row and column entries
|
||||
emb_row_i = tile // emb_tiles_x
|
||||
emb_column_i = tile % emb_tiles_x
|
||||
if emb_row_i == 0 and emb_column_i == 0 and not embiggen_tiles:
|
||||
left = 0
|
||||
top = 0
|
||||
else:
|
||||
# Determine upper-left point
|
||||
if emb_column_i + 1 == emb_tiles_x:
|
||||
left = initsuperwidth - width
|
||||
else:
|
||||
left = round(emb_column_i * (width - overlap_size_x))
|
||||
if emb_row_i + 1 == emb_tiles_y:
|
||||
top = initsuperheight - height
|
||||
else:
|
||||
top = round(emb_row_i * (height - overlap_size_y))
|
||||
# Handle gradients for various conditions
|
||||
# Handle emb_rerun case
|
||||
if embiggen_tiles:
|
||||
# top of image
|
||||
if emb_row_i == 0:
|
||||
if emb_column_i == 0:
|
||||
if (tile+1) in embiggen_tiles: # Look-ahead right
|
||||
if (tile+emb_tiles_x) not in embiggen_tiles: # Look-ahead down
|
||||
intileimage.putalpha(alphaLayerB)
|
||||
# Otherwise do nothing on this tile
|
||||
elif (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down only
|
||||
intileimage.putalpha(alphaLayerR)
|
||||
else:
|
||||
intileimage.putalpha(alphaLayerRBC)
|
||||
elif emb_column_i == emb_tiles_x - 1:
|
||||
if (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down
|
||||
intileimage.putalpha(alphaLayerL)
|
||||
else:
|
||||
intileimage.putalpha(alphaLayerLBC)
|
||||
else:
|
||||
if (tile+1) in embiggen_tiles: # Look-ahead right
|
||||
if (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down
|
||||
intileimage.putalpha(alphaLayerL)
|
||||
else:
|
||||
intileimage.putalpha(alphaLayerLBC)
|
||||
elif (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down only
|
||||
intileimage.putalpha(alphaLayerLR)
|
||||
else:
|
||||
intileimage.putalpha(alphaLayerABT)
|
||||
# bottom of image
|
||||
elif emb_row_i == emb_tiles_y - 1:
|
||||
if emb_column_i == 0:
|
||||
if (tile+1) in embiggen_tiles: # Look-ahead right
|
||||
intileimage.putalpha(alphaLayerT)
|
||||
else:
|
||||
intileimage.putalpha(alphaLayerRTC)
|
||||
elif emb_column_i == emb_tiles_x - 1:
|
||||
# No tiles to look ahead to
|
||||
intileimage.putalpha(alphaLayerLTC)
|
||||
else:
|
||||
if (tile+1) in embiggen_tiles: # Look-ahead right
|
||||
intileimage.putalpha(alphaLayerLTC)
|
||||
else:
|
||||
intileimage.putalpha(alphaLayerABB)
|
||||
# vertical middle of image
|
||||
else:
|
||||
if emb_column_i == 0:
|
||||
if (tile+1) in embiggen_tiles: # Look-ahead right
|
||||
if (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down
|
||||
intileimage.putalpha(alphaLayerT)
|
||||
else:
|
||||
intileimage.putalpha(alphaLayerTB)
|
||||
elif (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down only
|
||||
intileimage.putalpha(alphaLayerRTC)
|
||||
else:
|
||||
intileimage.putalpha(alphaLayerABL)
|
||||
elif emb_column_i == emb_tiles_x - 1:
|
||||
if (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down
|
||||
intileimage.putalpha(alphaLayerLTC)
|
||||
else:
|
||||
intileimage.putalpha(alphaLayerABR)
|
||||
else:
|
||||
if (tile+1) in embiggen_tiles: # Look-ahead right
|
||||
if (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down
|
||||
intileimage.putalpha(alphaLayerLTC)
|
||||
else:
|
||||
intileimage.putalpha(alphaLayerABR)
|
||||
elif (tile+emb_tiles_x) in embiggen_tiles: # Look-ahead down only
|
||||
intileimage.putalpha(alphaLayerABB)
|
||||
else:
|
||||
intileimage.putalpha(alphaLayerAA)
|
||||
# Handle normal tiling case (much simpler - since we tile left to right, top to bottom)
|
||||
else:
|
||||
if emb_row_i == 0 and emb_column_i >= 1:
|
||||
intileimage.putalpha(alphaLayerL)
|
||||
elif emb_row_i >= 1 and emb_column_i == 0:
|
||||
intileimage.putalpha(alphaLayerT)
|
||||
else:
|
||||
intileimage.putalpha(alphaLayerLTC)
|
||||
# Layer tile onto final image
|
||||
outputsuperimage.alpha_composite(intileimage, (left, top))
|
||||
else:
|
||||
print(f'Error: could not find all Embiggen output tiles in memory? Something must have gone wrong with img2img generation.')
|
||||
|
||||
# after internal loops and patching up return Embiggen image
|
||||
return outputsuperimage
|
||||
# end of function declaration
|
||||
return make_image
|
@ -1,5 +1,5 @@
|
||||
'''
|
||||
ldm.dream.generator.txt2img descends from ldm.dream.generator
|
||||
ldm.dream.generator.img2img descends from ldm.dream.generator
|
||||
'''
|
||||
|
||||
import torch
|
||||
|
@ -73,6 +73,10 @@ class PromptFormatter:
|
||||
switches.append(f'-G{opt.gfpgan_strength}')
|
||||
if opt.upscale:
|
||||
switches.append(f'-U {" ".join([str(u) for u in opt.upscale])}')
|
||||
if opt.embiggen:
|
||||
switches.append(f'-embiggen {" ".join([str(u) for u in opt.embiggen])}')
|
||||
if opt.embiggen_tiles:
|
||||
switches.append(f'-embiggen_tiles {" ".join([str(u) for u in opt.embiggen_tiles])}')
|
||||
if opt.variation_amount > 0:
|
||||
switches.append(f'-v{opt.variation_amount}')
|
||||
if opt.with_variations:
|
||||
|
@ -207,6 +207,9 @@ class Generate:
|
||||
init_mask = None,
|
||||
fit = False,
|
||||
strength = None,
|
||||
# these are specific to embiggen (which also relies on img2img args)
|
||||
embiggen = None,
|
||||
embiggen_tiles = None,
|
||||
# these are specific to GFPGAN/ESRGAN
|
||||
gfpgan_strength= 0,
|
||||
save_original = False,
|
||||
@ -232,6 +235,10 @@ class Generate:
|
||||
image_callback // a function or method that will be called each time an image is generated
|
||||
with_variations // a weighted list [(seed_1, weight_1), (seed_2, weight_2), ...] of variations which should be applied before doing any generation
|
||||
variation_amount // optional 0-1 value to slerp from -S noise to random noise (allows variations on an image)
|
||||
threshold // optional value to add thresholding to latent values for k-diffusion samplers (0 disables)
|
||||
perlin // optional 0-1 value to add a percentage of perlin noise to the initial noise
|
||||
embiggen // scale factor relative to the size of the --init_img (-I), followed by ESRGAN upscaling strength (0-1.0), followed by minimum amount of overlap between tiles as a decimal ratio (0 - 1.0) or number of pixels
|
||||
embiggen_tiles // list of tiles by number in order to process and replace onto the image e.g. `0 2 4`
|
||||
|
||||
To use the step callback, define a function that receives two arguments:
|
||||
- Image GPU data
|
||||
@ -276,6 +283,9 @@ class Generate:
|
||||
assert (
|
||||
0.0 <= variation_amount <= 1.0
|
||||
), '-v --variation_amount must be in [0.0, 1.0]'
|
||||
assert (
|
||||
(embiggen == None and embiggen_tiles == None) or ((embiggen != None or embiggen_tiles != None) and init_img != None)
|
||||
), 'Embiggen requires an init/input image to be specified'
|
||||
|
||||
# check this logic - doesn't look right
|
||||
if len(with_variations) > 0 or variation_amount > 1.0:
|
||||
@ -312,6 +322,8 @@ class Generate:
|
||||
|
||||
if (init_image is not None) and (mask_image is not None):
|
||||
generator = self._make_inpaint()
|
||||
elif (embiggen != None or embiggen_tiles != None):
|
||||
generator = self._make_embiggen()
|
||||
elif init_image is not None:
|
||||
generator = self._make_img2img()
|
||||
else:
|
||||
@ -331,11 +343,14 @@ class Generate:
|
||||
step_callback = step_callback, # called after each intermediate image is generated
|
||||
width = width,
|
||||
height = height,
|
||||
init_img = init_img, # embiggen needs to manipulate from the unmodified init_img
|
||||
init_image = init_image, # notice that init_image is different from init_img
|
||||
mask_image = mask_image,
|
||||
strength = strength,
|
||||
threshold = threshold,
|
||||
perlin = perlin,
|
||||
embiggen = embiggen,
|
||||
embiggen_tiles = embiggen_tiles,
|
||||
)
|
||||
|
||||
if upscale is not None or gfpgan_strength > 0:
|
||||
@ -408,6 +423,12 @@ class Generate:
|
||||
from ldm.dream.generator.img2img import Img2Img
|
||||
self.generators['img2img'] = Img2Img(self.model)
|
||||
return self.generators['img2img']
|
||||
|
||||
def _make_embiggen(self):
|
||||
if not self.generators.get('embiggen'):
|
||||
from ldm.dream.generator.embiggen import Embiggen
|
||||
self.generators['embiggen'] = Embiggen(self.model)
|
||||
return self.generators['embiggen']
|
||||
|
||||
def _make_txt2img(self):
|
||||
if not self.generators.get('txt2img'):
|
||||
|
@ -11,7 +11,7 @@ opencv-python==4.6.0.66
|
||||
pillow==9.2.0
|
||||
pudb==2019.2
|
||||
torch==1.12.1
|
||||
torchvision==0.12.0
|
||||
torchvision==0.13.0
|
||||
pytorch-lightning==1.4.2
|
||||
streamlit==1.12.0
|
||||
test-tube>=0.7.5
|
||||
|
@ -631,7 +631,7 @@ def create_cmd_parser():
|
||||
nargs='+',
|
||||
default=None,
|
||||
type=float,
|
||||
help='Scale factor (2, 4) for upscaling followed by upscaling strength (0-1.0). If strength not specified, defaults to 0.75'
|
||||
help='Scale factor (2, 4) for upscaling final output followed by upscaling strength (0-1.0). If strength not specified, defaults to 0.75'
|
||||
)
|
||||
parser.add_argument(
|
||||
'-save_orig',
|
||||
@ -639,6 +639,20 @@ def create_cmd_parser():
|
||||
action='store_true',
|
||||
help='Save original. Use it when upscaling to save both versions.',
|
||||
)
|
||||
parser.add_argument(
|
||||
'-embiggen',
|
||||
nargs='+',
|
||||
default=None,
|
||||
type=float,
|
||||
help='Embiggen tiled img2img for higher resolution and detail without extra VRAM usage. Takes scale factor relative to the size of the --init_img (-I), followed by ESRGAN upscaling strength (0-1.0), followed by minimum amount of overlap between tiles as a decimal ratio (0 - 1.0) or number of pixels. ESRGAN strength defaults to 0.75, and overlap defaults to 0.25 . ESRGAN is used to upscale the init prior to cutting it into tiles/pieces to run through img2img and then stitch back togeather.',
|
||||
)
|
||||
parser.add_argument(
|
||||
'-embiggen_tiles',
|
||||
nargs='+',
|
||||
default=None,
|
||||
type=int,
|
||||
help='If while doing Embiggen we are altering only parts of the image, takes a list of tiles by number to process and replace onto the image e.g. `1 3 5`, useful for redoing problematic spots from a prior Embiggen run',
|
||||
)
|
||||
# variants is going to be superseded by a generalized "prompt-morph" function
|
||||
# parser.add_argument('-v','--variants',type=int,help="in img2img mode, the first generated image will get passed back to img2img to generate the requested number of variants")
|
||||
parser.add_argument(
|
||||
|
Loading…
Reference in New Issue
Block a user