Merge branch 'development' into postprocessing-commands

Lincoln Stein
2022-09-20 18:48:42 -04:00
committed by GitHub
134 changed files with 6352 additions and 4810 deletions

@ -2,6 +2,8 @@
title: Changelog
---
# :octicons-log-16: Changelog
## v1.13 <small>(in process)</small>
- Supports a Google Colab notebook for a standalone server running on Google

@ -1,27 +1,34 @@
---
title: CLI
hide:
- toc
---
# :material-bash: CLI
## **Interactive Command Line Interface**
The `dream.py` script, located in `scripts/dream.py`, provides an interactive interface to image
generation similar to the "dream mothership" bot that Stability AI provided on its Discord server.
Unlike the txt2img.py and img2img.py scripts provided in the original CompViz/stable-diffusion
source code repository, the time-consuming initialization of the AI model initialization only
happens once. After that image generation from the command-line interface is very fast.
Unlike the `txt2img.py` and `img2img.py` scripts provided in the original
[CompVis/stable-diffusion](https://github.com/CompVis/stable-diffusion) source code repository, the
time-consuming initialization of the AI model only happens once. After that, image
generation from the command-line interface is very fast.
The script uses the readline library to allow for in-line editing, command history (up and down
arrows), autocompletion, and more. To help keep track of which prompts generated which images, the
The script uses the readline library to allow for in-line editing, command history (++up++ and
++down++), autocompletion, and more. To help keep track of which prompts generated which images, the
script writes a log file of image names and prompts to the selected output directory.
In addition, as of version 1.02, it also writes the prompt into the PNG file's metadata where it can
be retrieved using scripts/images2prompt.py
be retrieved using `scripts/images2prompt.py`.
The script is confirmed to work on Linux, Windows and Mac systems.
_Note:_ This script runs from the command-line or can be used as a Web application. The Web GUI is
currently rudimentary, but a much better replacement is on its way.
!!! note
This script runs from the command-line or can be used as a Web application. The Web GUI is
currently rudimentary, but a much better replacement is on its way.
```bash
(ldm) ~/stable-diffusion$ python3 ./scripts/dream.py
@ -47,185 +54,197 @@ dream> q
00011.png: "there's a fly in my soup" -n6 -g -S 2685670268
```
<p align='center'>
<img src="../assets/dream-py-demo.png"/>
</p>
![dream-py-demo](../assets/dream-py-demo.png)
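Each result line in the session above pairs an image name with the prompt and switches that produced it. As a minimal sketch, assuming the log file written to the output directory uses the same `<image>: "<prompt>" <switches>` layout (the source does not confirm this), such a line could be split back into its parts with Python's standard `shlex` module. The helper name `parse_result_line` is hypothetical and is not part of `dream.py`.

```python
import shlex


def parse_result_line(line: str) -> tuple[str, str, list[str]]:
    """Split '<image>: "<prompt>" <switches...>' into its three parts.

    Assumes the layout shown in the example session; the real log file
    format may differ.
    """
    image_name, _, command = line.partition(": ")
    # shlex honours the double quotes around the prompt, so an apostrophe
    # inside the prompt does not split it.
    tokens = shlex.split(command)
    if not tokens:
        raise ValueError(f"no prompt found in line: {line!r}")
    return image_name, tokens[0], tokens[1:]


if __name__ == "__main__":
    example = '00011.png: "there\'s a fly in my soup" -n6 -g -S 2685670268'
    print(parse_result_line(example))
```

Feeding the recovered prompt and switches back to the `dream>` prompt is one way to regenerate an image from a logged seed.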
The `dream>` prompt's arguments are pretty much identical to those used in the Discord bot, except
you don't need to type "!dream" (it doesn't hurt if you do). A significant change is that creation
of individual images is now the default unless --grid (-g) is given. A full list is given in [List
of prompt arguments] (#list-of-prompt-arguments).
of individual images is now the default unless `--grid` (`-g`) is given. A full list is given in
[List of prompt arguments](#list-of-prompt-arguments).
## Arguments
The script itself also recognizes a series of command-line switches that will change important
global defaults, such as the directory for image outputs and the location of the model weight files.
## List of arguments recognized at the command line
### List of arguments recognized at the command line
These command-line arguments can be passed to dream.py when you first run it from the Windows, Mac
These command-line arguments can be passed to `dream.py` when you first run it from the Windows, Mac
or Linux command line. Some set defaults that can be overridden on a per-prompt basis (see [List of
prompt arguments](#list-of-prompt-arguments)). Others
| Argument | Shortcut | Default | Description |
| :---------------------- | :---------: | ------------------------------------------------ | ---------------------------------------------------------------------------------------------------- |
| --help | -h | | Print a concise help message. |
| --outdir <path> | -o<path> | outputs/img_samples | Location for generated images. |
| --prompt_as_dir | -p | False | Name output directories using the prompt text. |
| --from_file <path> | | None | Read list of prompts from a file. Use "-" to read from standard input |
| --model <modelname> | | stable-diffusion-1.4 | Loads model specified in configs/models.yaml. Currently one of "stable-diffusion-1.4" or "laion400m" |
| --full_precision | -F | False | Run in slower full-precision mode. Needed for Macintosh M1/M2 hardware and some older video cards. |
| --web | | False | Start in web server mode |
| --host <ip addr> | | localhost | Which network interface web server should listen on. Set to 0.0.0.0 to listen on any. |
| --port <port> | | 9090 | Which port web server should listen for requests on. |
| --config <path> | | configs/models.yaml | Configuration file for models and their weights. |
| --iterations <int> | -n<int> | 1 | How many images to generate per prompt. |
| --grid | -g | False | Save all image series as a grid rather than individually. |
| --sampler <sampler> | -A<sampler> | k_lms | Sampler to use. Use -h to get list of available samplers. |
| --seamless | | False | Create interesting effects by tiling elements of the image. |
| --embedding_path <path> | | None | Path to pre-trained embedding manager checkpoints, for custom models |
| --gfpgan_dir | | src/gfpgan | Path to where GFPGAN is installed. |
| --gfpgan_model_path | | experiments/pretrained_models<br>/GFPGANv1.3.pth | Path to GFPGAN model file, relative to --gfpgan_dir. |
| --device <device> | -d<device> | torch.cuda.current_device() | Device to run SD on, e.g. "cuda:0" |
| Argument <img width="240" align="right"/> | Shortcut <img width="100" align="right"/> | Default <img width="320" align="right"/> | Description |
| ----------------------------------------- | ----------------------------------------- | ---------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
| `--help` | `-h` | | Print a concise help message. |
| `--outdir <path>` | `-o<path>` | `outputs/img_samples` | Location for generated images. |
| `--prompt_as_dir` | `-p` | `False` | Name output directories using the prompt text. |
| `--from_file <path>` | | `None` | Read list of prompts from a file. Use `-` to read from standard input |
| `--model <modelname>` | | `stable-diffusion-1.4` | Loads model specified in configs/models.yaml. Currently one of "stable-diffusion-1.4" or "laion400m" |
| `--full_precision` | `-F` | `False` | Run in slower full-precision mode. Needed for Macintosh M1/M2 hardware and some older video cards. |
| `--web` | | `False` | Start in web server mode |
| `--host <ip addr>` | | `localhost` | Which network interface web server should listen on. Set to 0.0.0.0 to listen on any. |
| `--port <port>` | | `9090` | Which port web server should listen for requests on. |
| `--config <path>` | | `configs/models.yaml` | Configuration file for models and their weights. |
| `--iterations <int>` | `-n<int>` | `1` | How many images to generate per prompt. |
| `--grid` | `-g` | `False` | Save all image series as a grid rather than individually. |
| `--sampler <sampler>` | `-A<sampler>` | `k_lms` | Sampler to use. Use `-h` to get list of available samplers. |
| `--seamless` | | `False` | Create interesting effects by tiling elements of the image. |
| `--embedding_path <path>` | | `None` | Path to pre-trained embedding manager checkpoints, for custom models |
| `--gfpgan_dir` | | `src/gfpgan` | Path to where GFPGAN is installed. |
| `--gfpgan_model_path` | | `experiments/pretrained_models/GFPGANv1.3.pth` | Path to GFPGAN model file, relative to `--gfpgan_dir`. |
| `--device <device>` | `-d<device>` | `torch.cuda.current_device()` | Device to run SD on, e.g. "cuda:0" |
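To make the relationship between long switches, shortcuts, and defaults concrete, here is a small illustrative sketch using Python's standard `argparse` module. It declares only four of the switches from the table, with the defaults listed there. It is an assumption-laden illustration, not the parser that `dream.py` actually uses, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare an illustrative subset of the switches from the table above."""
    parser = argparse.ArgumentParser(description="illustrative subset only")
    # Each switch gets its long form, its shortcut, and the documented default.
    parser.add_argument("--outdir", "-o", default="outputs/img_samples")
    parser.add_argument("--iterations", "-n", type=int, default=1)
    parser.add_argument("--grid", "-g", action="store_true")
    parser.add_argument("--sampler", "-A", default="k_lms")
    return parser


if __name__ == "__main__":
    # Shortcuts may carry their value directly, as in -n6 or -o<path>.
    args = build_parser().parse_args(["-n6", "-g", "-o/tmp/images"])
    print(args.iterations, args.grid, args.outdir, args.sampler)
```

Switches that are left out fall back to their defaults, which is why `--sampler` still resolves to `k_lms` in the example call.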
#### deprecated
These arguments are deprecated but still work:
| Argument | Shortcut | Default | Description |
| ---------------- | -------- | ------- | --------------------------------------------------------------- |
| --weights <path> | | None | Pth to weights file; use `--model stable-diffusion-1.4` instead |
| --laion400m | -l | False | Use older LAION400m weights; use `--model=laion400m` instead |
<figure markdown>
### **A note on path names:**
| Argument | Shortcut | Default | Description |
| ------------------ | -------- | ------- | --------------------------------------------------------------- |
| `--weights <path>` | | `None` | Path to weights file; use `--model stable-diffusion-1.4` instead |
| `--laion400m` | `-l` | `False` | Use older LAION400m weights; use `--model=laion400m` instead |
On Windows systems, you may run into problems when passing the dream script standard backslashed
path names because the Python interpreter treats "\" as an escape. You can either double your
slashes (ick): `C:\\\\path\\\\to\\\\my\\\\file`, or use Linux/Mac style forward slashes (better):
`C:/path/to/my/file`.
</figure>
!!! note
On Windows systems, you may run into problems when passing the dream script standard backslashed
path names because the Python interpreter treats `\` as an escape. You can either double your
slashes (ick): `C:\\path\\to\\my\\file`, or use Linux/Mac style forward slashes (better):
`C:/path/to/my/file`.
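If you build path arguments in a Python helper script of your own, the forward-slash form recommended above can be produced with the standard `pathlib` module. This is only a sketch; the function name `to_forward_slashes` is made up for illustration and is not part of the dream script.

```python
from pathlib import PureWindowsPath


def to_forward_slashes(windows_path: str) -> str:
    """Rewrite a backslashed Windows path using forward slashes."""
    return PureWindowsPath(windows_path).as_posix()


if __name__ == "__main__":
    # The raw string (r"...") stops Python from reading "\t" as a tab here.
    print(to_forward_slashes(r"C:\path\to\my\file"))
```

`PureWindowsPath` applies Windows path rules on any operating system, so the conversion behaves the same on Linux and Mac.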
### List of prompt arguments
After the dream.py script initializes, it will present you with a **dream>** prompt. Here you can
enter information to generate images from text (txt2img), to embellish an existing image or sketch
(img2img), or to selectively alter chosen regions of the image (inpainting).
After the `dream.py` script initializes, it will present you with a **`dream>`** prompt. Here you
can enter information to generate images from text (txt2img), to embellish an existing image or
sketch (img2img), or to selectively alter chosen regions of the image (inpainting).
### This is an example of txt2img
#### txt2img
```bash
dream> "waterfall and rainbow" -W640 -H480
```
!!! example
This will create the requested image with the dimensions 640 (width) and 480 (height).
```bash
dream> "waterfall and rainbow" -W640 -H480
```
This will create the requested image with the dimensions 640 (width) and 480 (height).
These are the `dream>` prompt arguments that apply to txt2img:
| Argument | Shortcut | Default | Description |
| --------------------------- | ---------------- | ------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| "my prompt" | | | Text prompt to use. The quotation marks are optional. |
| --width <int> | -W<int> | 512 | Width of generated image |
| --height <int> | -H<int> | 512 | Height of generated image |
| --iterations <int> | -n<int> | 1 | How many images to generate from this prompt |
| --steps <int> | -s<int> | 50 | How many steps of refinement to apply |
| --cfg_scale <float> | -C<float> | 7.5 | How hard to try to match the prompt to the generated image; any number greater than 0.0 works, but the useful range is roughly 5.0 to 20.0 |
| --seed <int> | -S<int> | None | Set the random seed for the next series of images. This can be used to recreate an image generated previously. |
| --sampler <sampler> | -A<sampler> | k_lms | Sampler to use. Use -h to get list of available samplers. |
| --grid | -g | False | Turn on grid mode to return a single image combining all the images generated by this prompt |
| --individual | -i | True | Turn off grid mode (deprecated; leave off --grid instead) |
| --outdir <path> | -o<path> | outputs/img_samples | Temporarily change the location of these images |
| --seamless | | False | Activate seamless tiling for interesting effects |
| --log_tokenization | -t | False | Display a color-coded list of the parsed tokens derived from the prompt |
| --skip_normalization | -x | False | Weighted subprompts will not be normalized. See [Weighted Prompts](./OTHER.md#weighted-prompts) |
| --upscale <int> <float> | -U <int> <float> | -U 1 0.75 | Upscale image by magnification factor (2, 4), and set strength of upscaling (0.0-1.0). If strength not set, will default to 0.75. |
| --gfpgan_strength <float> | -G <float> | -G0 | Fix faces using the GFPGAN algorithm; argument indicates how hard the algorithm should try (0.0-1.0) |
| --save_original | -save_orig | False | When upscaling or fixing faces, this will cause the original image to be saved rather than replaced. |
| --variation <float> | -v<float> | 0.0 | Add a bit of noise (0.0=none, 1.0=high) to the image in order to generate a series of variations. Usually used in combination with -S<seed> and -n<int> to generate a series a riffs on a starting image. See [Variations](./VARIATIONS.md). |
| --with_variations <pattern> | -V<pattern> | None | Combine two or more variations. See [Variations](./VARIATIONS.md) for now to use this. |
| Argument <img width="680" align="right"/> | Shortcut <img width="420" align="right"/> | Default <img width="480" align="right"/> | Description |
| ----------------------------------------- | ----------------------------------------- | ---------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `"my prompt"` | | | Text prompt to use. The quotation marks are optional. |
| `--width <int>` | `-W<int>` | `512` | Width of generated image |
| `--height <int>` | `-H<int>` | `512` | Height of generated image |
| `--iterations <int>` | `-n<int>` | `1` | How many images to generate from this prompt |
| `--steps <int>` | `-s<int>` | `50` | How many steps of refinement to apply |
| `--cfg_scale <float>` | `-C<float>` | `7.5` | How hard to try to match the prompt to the generated image; any number greater than 0.0 works, but the useful range is roughly 5.0 to 20.0 |
| `--seed <int>` | `-S<int>` | `None` | Set the random seed for the next series of images. This can be used to recreate an image generated previously. |
| `--sampler <sampler>` | `-A<sampler>` | `k_lms` | Sampler to use. Use `-h` to get list of available samplers. |
| `--grid` | `-g` | `False` | Turn on grid mode to return a single image combining all the images generated by this prompt |
| `--individual` | `-i` | `True` | Turn off grid mode (deprecated; leave off `--grid` instead) |
| `--outdir <path>` | `-o<path>` | `outputs/img_samples` | Temporarily change the location of these images |
| `--seamless` | | `False` | Activate seamless tiling for interesting effects |
| `--log_tokenization` | `-t` | `False` | Display a color-coded list of the parsed tokens derived from the prompt |
| `--skip_normalization` | `-x` | `False` | Weighted subprompts will not be normalized. See [Weighted Prompts](./OTHER.md#weighted-prompts) |
| `--upscale <int> <float>` | `-U <int> <float>` | `-U 1 0.75` | Upscale image by magnification factor (2, 4), and set strength of upscaling (0.0-1.0). If strength not set, will default to 0.75. |
| `--gfpgan_strength <float>` | `-G <float>` | `-G0` | Fix faces using the GFPGAN algorithm; argument indicates how hard the algorithm should try (0.0-1.0) |
| `--save_original` | `-save_orig` | `False` | When upscaling or fixing faces, this will cause the original image to be saved rather than replaced. |
| `--variation <float>` | `-v<float>` | `0.0` | Add a bit of noise (0.0=none, 1.0=high) to the image in order to generate a series of variations. Usually used in combination with `-S<seed>` and `-n<int>` to generate a series of riffs on a starting image. See [Variations](./VARIATIONS.md). |
| `--with_variations <pattern>` | `-V<pattern>` | `None` | Combine two or more variations. See [Variations](./VARIATIONS.md) for how to use this. |
Note that the width and height of the image must be multiples of 64. You can provide different
values, but they will be rounded down to the nearest multiple of 64.
!!! note
### This is an example of img2img
The width and height of the image must be multiples of 64. You can provide different
values, but they will be rounded down to the nearest multiple of 64.
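The rounding rule described in this note is plain integer arithmetic. The sketch below shows the calculation only; it is not taken from the script's source, and `round_down_to_multiple_of_64` is a hypothetical helper name.

```python
def round_down_to_multiple_of_64(value: int) -> int:
    """Return the largest multiple of 64 that does not exceed value."""
    return (value // 64) * 64


if __name__ == "__main__":
    for requested in (512, 640, 480, 500):
        print(requested, "->", round_down_to_multiple_of_64(requested))
```

Under this rule a requested dimension of 640 is kept as is, while 480 is not a multiple of 64 and would be reduced to 448.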
```bash
dream> waterfall and rainbow -I./vacation-photo.png -W640 -H480 --fit
```
#### img2img
This will modify the indicated vacation photograph by making it more like the prompt. Results will
vary greatly depending on what is in the image. We also ask to --fit the image into a box no bigger
than 640x480. Otherwise the image size will be identical to the provided photo and you may run out
of memory if it is large.
!!! example
Repeated chaining of img2img on an image can result in significant color shifts
in the output, especially if run with lower strength. Color correction can be
run against a reference image to fix this issue. Use the original input image to the
chain as the the reference image for each step in the chain.
```bash
dream> "waterfall and rainbow" -I./vacation-photo.png -W640 -H480 --fit
```
This will modify the indicated vacation photograph by making it more like the prompt. Results will
vary greatly depending on what is in the image. We also ask to `--fit` the image into a box no bigger
than 640x480. Otherwise the image size will be identical to the provided photo and you may run out
of memory if it is large.
Repeated chaining of img2img on an image can result in significant color shifts in the output,
especially if run with lower strength. Color correction can be run against a reference image to fix
this issue. Use the original input image to the chain as the reference image for each step in
the chain.
In addition to the command-line options recognized by txt2img, img2img accepts additional options:
| Argument | Shortcut | Default | Description |
| ------------------ | --------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
| --init_img <path> | -I<path> | None | Path to the initialization image |
| --init_color <path> | | None | Path to reference image for color correction |
| --fit | -F | False | Scale the image to fit into the specified -H and -W dimensions |
| --strength <float> | -s<float> | 0.75 | How hard to try to match the prompt to the initial image. Ranges from 0.0-0.99, with higher values replacing the initial image completely. |
| Argument <img width="160" align="right"/> | Shortcut | Default | Description |
| ----------------------------------------- | ----------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
| `--init_img <path>` | `-I<path>` | `None` | Path to the initialization image |
| `--init_color <path>` | | `None` | Path to reference image for color correction |
| `--fit` | `-F` | `False` | Scale the image to fit into the specified -H and -W dimensions |
| `--strength <float>` | `-f<float>` | `0.75` | How hard to try to match the prompt to the initial image. Ranges from 0.0-0.99, with higher values replacing the initial image completely. |
### This is an example of inpainting
#### Inpainting
```bash
dream> "waterfall and rainbow" -I./vacation-photo.png -M./vacation-mask.png -W640 -H480 --fit
```
!!! example
This will do the same thing as img2img, but image alterations will only occur within transparent
areas defined by the mask file specified by -M. You may also supply just a single initial image with
the areas to overpaint made transparent, but you must be careful not to destroy the pixels
underneath when you create the transparent areas. See [Inpainting](./INPAINTING.md) for details.
```bash
dream> "waterfall and rainbow" -I./vacation-photo.png -M./vacation-mask.png -W640 -H480 --fit
```
inpainting accepts all the arguments used for txt2img and img2img, as well as the --mask (-M)
This will do the same thing as img2img, but image alterations will only occur within transparent
areas defined by the mask file specified by `-M`. You may also supply just a single initial image with
the areas to overpaint made transparent, but you must be careful not to destroy the pixels
underneath when you create the transparent areas. See [Inpainting](./INPAINTING.md) for details.
Inpainting accepts all the arguments used for txt2img and img2img, as well as the `--init_mask` (`-M`)
argument:
| Argument | Shortcut | Default | Description |
| ------------------ | -------- | ------- | ------------------------------------------------------------------------------------------------ |
| --init_mask <path> | -M<path> | None | Path to an image the same size as the initial_image, with areas for inpainting made transparent. |
| Argument <img width="100" align="right"/> | Shortcut | Default | Description |
| ----------------------------------------- | ---------- | ------- | ------------------------------------------------------------------------------------------------ |
| `--init_mask <path>` | `-M<path>` | `None` | Path to an image the same size as the initial_image, with areas for inpainting made transparent. |
## Command-line editing and completion
If you are on a Macintosh or Linux machine, the command-line offers convenient history tracking,
editing, and command completion.
- To scroll through previous commands and potentially edit/reuse them, use the up and down cursor
keys.
- To edit the current command, use the left and right cursor keys to position the cursor, and then
backspace, delete or insert characters.
- To move to the very beginning of the command, type CTRL-A (or command-A on the Mac)
- To move to the end of the command, type CTRL-E.
- To scroll through previous commands and potentially edit/reuse them, use the ++up++ and ++down++
cursor keys.
- To edit the current command, use the ++left++ and ++right++ cursor keys to position the cursor,
and then use ++backspace++ or ++delete++ to remove characters, or type to insert them.
- To move to the very beginning of the command, type ++ctrl+a++ (or ++command+a++ on the Mac).
- To move to the end of the command, type ++ctrl+e++.
- To cut a section of the command, position the cursor where you want to start cutting and type
CTRL-K.
- To paste a cut section back in, position the cursor where you want to paste, and type CTRL-Y
++ctrl+k++.
- To paste a cut section back in, position the cursor where you want to paste, and type ++ctrl+y++.
Windows users can get similar, but more limited, functionality if they launch dream.py with the
Windows users can get similar, but more limited, functionality if they launch `dream.py` with the
"winpty" program:
```
> winpty python scripts\dream.py
```batch
winpty python scripts\dream.py
```
On the Mac and Linux platforms, when you exit dream.py, the last 1000 lines of your command-line
history will be saved. When you restart dream.py, you can access the saved history using the
up-arrow key.
On the Mac and Linux platforms, when you exit `dream.py`, the last 1000 lines of your command-line
history will be saved. When you restart `dream.py`, you can access the saved history using the
++up++ key.
In addition, limited command-line completion is installed. In various contexts, you can start typing
your command and press tab. A list of potential completions will be presented to you. You can then
type a little more, hit tab again, and eventually autocomplete what you want.
When specifying file paths using the one-letter shortcuts, the CLI will attempt to complete
pathnames for you. This is most handy for the -I (init image) and -M (init mask) paths. To initiate
completion, start the path with a slash ("/") or "./". For example:
pathnames for you. This is most handy for the `-I` (init image) and `-M` (init mask) paths. To
initiate completion, start the path with a slash `/` or `./`, for example:
```
dream> zebra with a mustache -I./test-pictures<TAB>
```bash
dream> "zebra with a mustache" -I./test-pictures<TAB>
-I./test-pictures/Lincoln-and-Parrot.png -I./test-pictures/zebra.jpg -I./test-pictures/madonna.png
-I./test-pictures/bad-sketch.png -I./test-pictures/man_with_eagle/
```
You can then type "z", hit tab again, and it will autofill to "zebra.jpg".
You can then type ++z++, hit ++tab++ again, and it will autofill to `zebra.jpg`.
More text completion features (such as autocompleting seeds) are on their way.

@ -1,4 +1,10 @@
# **Embiggen -- upscale your images on limited memory machines**
---
title: Embiggen
---
# :material-loupe: Embiggen
**upscale your images on limited memory machines**
GFPGAN and Real-ESRGAN are both memory intensive. In order to avoid
crashes and memory overloads during the Stable Diffusion process,
@ -16,7 +22,7 @@ face restore a particular generated image, pass it again with the same
prompt and generated seed along with the `-U` and `-G` prompt
arguments to perform those actions.
## Embiggen
## Embiggen
If you want to be able to do more (pixels) without running out of VRAM,
or you want to upscale with details that couldn't possibly appear
@ -37,7 +43,7 @@ it's similar to that, except it can work up to an arbitrarily large size
has extra logic to re-run any number of the tile sub-sections of the image
if for example a small part of a huge run got messed up.
**Usage**
## Usage
`-embiggen <scaling_factor> <esrgan_strength> <overlap_ratio OR overlap_pixels>`
@ -94,12 +100,12 @@ Tiles are numbered starting with one, and left-to-right,
top-to-bottom. So, if you are generating a 3x3 tiled image, the
middle row would be `4 5 6`.
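The numbering scheme just described (starting at one, left-to-right, then top-to-bottom) can be written as a short calculation. This is an illustrative sketch; `tiles_in_row` is a hypothetical helper and not part of the Embiggen code.

```python
def tiles_in_row(row: int, columns: int) -> list[int]:
    """Return the tile numbers in a given row of the tiled image.

    Rows and tile numbers both start at one.
    """
    if row < 1 or columns < 1:
        raise ValueError("row and columns must be at least 1")
    first = (row - 1) * columns + 1
    return list(range(first, first + columns))


if __name__ == "__main__":
    # Middle row of a 3x3 tiled image.
    print(tiles_in_row(2, 3))
```

The resulting numbers are what you would pass to `-embiggen_tiles` to re-run only that row.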
**Example Usage**
## Example Usage
Running Embiggen with 512x512 tiles on an existing image, scaling up by a factor of 2.5x,
followed by the same command with the defaults written out (default ESRGAN strength is 0.75, default overlap between tiles is 0.25):
```
```bash
dream> a photo of a forest at sunset -s 100 -W 512 -H 512 -I outputs/forest.png -f 0.4 -embiggen 2.5
dream> a photo of a forest at sunset -s 100 -W 512 -H 512 -I outputs/forest.png -f 0.4 -embiggen 2.5 0.75 0.25
```
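The two commands above are equivalent because the omitted values fall back to the stated defaults. A minimal sketch of that fallback, assuming the values are taken positionally in the order given in the usage line, might look like this. `fill_embiggen_defaults` is a hypothetical name, not a function from the script.

```python
DEFAULT_ESRGAN_STRENGTH = 0.75  # default stated in the text above
DEFAULT_OVERLAP = 0.25          # default stated in the text above


def fill_embiggen_defaults(values: list[float]) -> tuple[float, float, float]:
    """Expand 1 to 3 positional values into (scaling, strength, overlap)."""
    if not 1 <= len(values) <= 3:
        raise ValueError("expected between one and three values")
    scaling_factor = values[0]
    esrgan_strength = values[1] if len(values) > 1 else DEFAULT_ESRGAN_STRENGTH
    overlap = values[2] if len(values) > 2 else DEFAULT_OVERLAP
    return scaling_factor, esrgan_strength, overlap


if __name__ == "__main__":
    print(fill_embiggen_defaults([2.5]))
    print(fill_embiggen_defaults([2.5, 0.75, 0.25]))
```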
@ -111,23 +117,23 @@ If there weren't enough clouds in the sky of that forest you just made
512x512 tiles with 0.25 overlaps wide) we can replace that top row of
tiles:
```
```bash
dream> a photo of puffy clouds over a forest at sunset -s 100 -W 512 -H 512 -I outputs/000002.seed.png -f 0.5 -embiggen_tiles 1 2 3
```
**Note**
!!! note
Because the same prompt is used on all the tiled images, and the model
doesn't have the context of anything outside the tile being run - it
can end up creating repeated pattern (also called 'motifs') across all
the tiles based on that prompt. The best way to combat this is
lowering the `--strength` (`-f`) to stay more true to the init image,
and increasing the number of steps so there is more compute-time to
create the detail. Anecdotally `--strength` 0.35-0.45 works pretty
well on most things. It may also work great in some examples even with
the `--strength` set high for patterns, landscapes, or subjects that
are more abstract. Because this is (relatively) fast, you can also
always create a few Embiggen'ed images and manually composite them to
preserve the best parts from each.
Because the same prompt is used on all the tiled images, and the model
doesn't have the context of anything outside the tile being run, it
can end up creating repeated patterns (also called 'motifs') across all
the tiles based on that prompt. The best way to combat this is
lowering the `--strength` (`-f`) to stay more true to the init image,
and increasing the number of steps so there is more compute-time to
create the detail. Anecdotally `--strength` 0.35-0.45 works pretty
well on most things. It may also work great in some examples even with
the `--strength` set high for patterns, landscapes, or subjects that
are more abstract. Because this is (relatively) fast, you can also
always create a few Embiggen'ed images and manually composite them to
preserve the best parts from each.
Author: [Travco](https://github.com/travco)
Author: [Travco](https://github.com/travco)

@ -2,7 +2,8 @@
title: Image-to-Image
---
## **IMG2IMG**
# :material-image-multiple: **IMG2IMG**
This script also provides an `img2img` feature that lets you seed your creations with an initial
drawing or photo. This is a really cool feature that tells Stable Diffusion to build the prompt on
top of the image you provide, preserving the original's basic shape and layout. To use it, provide

@ -2,6 +2,8 @@
title: Inpainting
---
# :octicons-paintbrush-16: Inpainting
## **Creating Transparent Regions for Inpainting**
Inpainting is really cool. To do it, you start with an initial image and use a photo editor to make
@ -26,6 +28,8 @@ dream> "man with cat on shoulder" -I./images/man.png -M./images/man-transparent.
We are hoping to get rid of the need for this workaround in an upcoming release.
---
## Recipe for GIMP
[GIMP](https://www.gimp.org/) is a popular Linux photo-editing tool.
@ -34,7 +38,7 @@ We are hoping to get rid of the need for this workaround in an upcoming release.
2. Layer->Transparency->Add Alpha Channel
3. Use the lasso tool to select the region to mask
4. Choose Select -> Float to create a floating selection
5. Open the Layers toolbar (^L) and select "Floating Selection"
5. Open the Layers toolbar (++ctrl+l++) and select "Floating Selection"
6. Set opacity to 0%
7. Export as PNG
8. In the export dialogue, Make sure the "Save colour values from
@ -44,37 +48,41 @@ We are hoping to get rid of the need for this workaround in an upcoming release.
## Recipe for Adobe Photoshop
1. Open image in Photoshop
<p align='left'>
<img src="../assets/step1.png"/>
</p>
<figure markdown>
![step1](../assets/step1.png)
</figure>
2. Use any of the selection tools (Marquee, Lasso, or Wand) to select the area you desire to inpaint.
<p align='left'>
<img src="../assets/step2.png"/>
</p>
3. Because we'll be applying a mask over the area we want to preserve, you should now select the inverse by using the Shift + Ctrl + I shortcut, or right clicking and using the "Select Inverse" option.
<figure markdown>
![step2](../assets/step2.png)
</figure>
4. You'll now create a mask by selecting the image layer, and Masking the selection. Make sure that you don't delete any of the underlying image, or your inpainting results will be dramatically impacted.
<p align='left'>
<img src="../assets/step4.png"/>
</p>
3. Because we'll be applying a mask over the area we want to preserve, you should now select the inverse by using the ++shift+ctrl+i++ shortcut, or right-clicking and using the "Select Inverse" option.
4. You'll now create a mask by selecting the image layer, and Masking the selection. Make sure that you don't delete any of the underlying image, or your inpainting results will be dramatically impacted.
<figure markdown>
![step4](../assets/step4.png)
</figure>
5. Make sure to hide any background layers that are present. You should see the mask applied to your image layer, and the image on your canvas should display the checkered background.
<p align='left'>
<img src="../assets/step5.png"/>
</p>
<p align='left'>
<img src="../assets/step6.png"/>
</p>
<figure markdown>
![step5](../assets/step5.png)
</figure>
6. Save the image as a transparent PNG by using the "Save a Copy" option in the File menu, or using the Alt + Ctrl + S keyboard shortcut.
6. Save the image as a transparent PNG by using the "Save a Copy" option in the File menu, or using the ++alt+ctrl+s++ keyboard shortcut.
<figure markdown>
![step6](../assets/step6.png)
</figure>
7. After following the inpainting instructions above (either through the CLI or the Web UI), marvel at your newfound ability to selectively dream. Lookin' good!
<p align='left'>
<img src="../assets/step7.png"/>
</p>
8. In the export dialogue, make sure the "Save colour values from transparent pixels" checkbox is
selected.
<figure markdown>
![step7](../assets/step7.png)
</figure>
8. In the export dialogue, make sure the "Save colour values from transparent pixels" checkbox is selected.

View File

@ -2,6 +2,8 @@
title: Others
---
# :fontawesome-regular-share-from-square: Others
## **Google Colab**
Stable Diffusion AI Notebook: <a

View File

@ -1,4 +1,8 @@
# Prompting Features
---
title: Prompting Features
---
# :octicons-command-palette-24: Prompting Features
## **Reading Prompts from a File**
@ -54,43 +58,41 @@ In the above statement, the words 'not really cool` will be ignored by Stable Di
Here's a prompt that depicts what it does.
original prompt:
original prompt:
```bash
"A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180
```
`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180`
![step1](../assets/variation_walkthru/step1.png)
<figure markdown>
![step1](../assets/negative_prompt_walkthru/step1.png)
</figure>
That image has a woman, so if we want the horse without a rider, we can influence the image not to have a woman by putting [woman] in the prompt, like this:
```bash
"A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180
```
`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180`
![step2](../assets/variation_walkthru/step2.png)
<figure markdown>
![step2](../assets/negative_prompt_walkthru/step2.png)
</figure>
That's nice - but say we also don't want the image to be quite so blue. We can add "blue" to the list of negative prompts, so it's now [woman blue]:
```bash
"A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman blue]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180
```
![step3](../assets/variation_walkthru/step3.png)
`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman blue]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180`
<figure markdown>
![step3](../assets/negative_prompt_walkthru/step3.png)
</figure>
Getting close - but there's no sense in having a saddle when our horse doesn't have a rider, so we'll add one more negative prompt: [woman blue saddle].
```bash
"A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman blue saddle]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180
```
`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman blue saddle]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180`
![step4](../assets/variation_walkthru/step4.png)
<figure markdown>
![step4](../assets/negative_prompt_walkthru/step4.png)
</figure>
!!! notes "Notes about this feature:"
Notes about this feature:
* The only requirement for words to be ignored is that they are in between a pair of square brackets.
* You can provide multiple words within the same bracket.
* You can provide multiple brackets with multiple words in different places of your prompt. That works just fine.
* To improve typical anatomy problems, you can add negative prompts like [bad anatomy, extra legs, extra arms, extra fingers, poorly drawn hands, poorly drawn feet, disfigured, out of frame, tiling, bad art, deformed, mutated].
* The only requirement for words to be ignored is that they are in between a pair of square brackets.
* You can provide multiple words within the same bracket.
* You can provide multiple brackets with multiple words in different places of your prompt. That works just fine.
* To improve typical anatomy problems, you can add negative prompts like `[bad anatomy, extra legs, extra arms, extra fingers, poorly drawn hands, poorly drawn feet, disfigured, out of frame, tiling, bad art, deformed, mutated]`.
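
To make the bracket rules above concrete, here is a minimal Python sketch that separates bracketed terms from the rest of a prompt. It is only an illustration of the syntax described in the notes: the helper name `split_negative_terms` is hypothetical, and this is not claimed to be how `dream.py` parses prompts internally.

```python
import re
from typing import List, Tuple


def split_negative_terms(prompt: str) -> Tuple[str, List[str]]:
    """Hypothetical helper: split a prompt into its plain text and bracketed groups."""
    # Collect the contents of every [ ... ] pair, wherever it appears in the prompt.
    negatives = [group.strip() for group in re.findall(r"\[([^\[\]]*)\]", prompt)]
    # Remove the bracketed groups and collapse the leftover whitespace.
    positive = " ".join(re.sub(r"\[[^\[\]]*\]", " ", prompt).split())
    return positive, negatives


positive, negatives = split_negative_terms(
    "a translucent pony made of water [woman blue] digital painting [saddle]"
)
print(positive)
print(negatives)
```

Multiple brackets in different places are all collected, and each bracket may hold several words, which mirrors the second and third notes above.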

View File

@ -2,6 +2,8 @@
title: TEXTUAL_INVERSION
---
# :material-file-document-plus-outline: TEXTUAL_INVERSION
## **Personalizing Text-to-Image Generation**
You may personalize the generated images to provide your own styles or objects
@ -39,7 +41,7 @@ and one with the init word provided.
On an RTX3090, the process for SD will take ~1h @1.6 iterations/sec.
!!! Info _Note_
!!! note
According to the associated paper, the optimal number of
images is 3-5. Your model may not converge if you use more images than
@ -57,9 +59,7 @@ Once the model is trained, specify the trained .pt or .bin file when starting
dream using
```bash
python3 ./scripts/dream.py \
--embedding_path /path/to/embedding.pt \
--full_precision
python3 ./scripts/dream.py --embedding_path /path/to/embedding.pt
```
Then, to utilize your subject at the dream prompt

View File

@ -4,14 +4,16 @@ title: Upscale
## **Intro**
The script provides the ability to restore faces and upscale.
The script provides the ability to restore faces and upscale. You can apply these operations
at the time you generate the images, or at any time to a previously-generated PNG file, using
the [!fix](#fixing-previously-generated-images) command.
You can enable these features by passing `--restore` and `--esrgan` to your launch script to enable
face restoration modules and upscaling modules respectively.
# :material-image-size-select-large: Upscale
## **GFPGAN and Real-ESRGAN Support**
## **Face Fixing**
The default face restoration module is GFPGAN and the default upscaling module is ESRGAN.
The default face restoration module is GFPGAN. The default upscaling module is Real-ESRGAN. For an alternative
face restoration module, see [CodeFormer Support] below.
As of version 1.14, environment.yaml will install the Real-ESRGAN package into the standard install
location for python packages, and will put GFPGAN into a subdirectory of "src" in the
@ -36,11 +38,13 @@ this package which asked you to install GFPGAN in a sibling directory, you may u
`--gfpgan_dir` argument with `dream.py` to set a custom path to your GFPGAN directory. _There are
other GFPGAN related boot arguments if you wish to customize further._
**Note: Internet connection needed:** Users whose GPU machines are isolated from the Internet (e.g.
on a University cluster) should be aware that the first time you run dream.py with GFPGAN and
Real-ESRGAN turned on, it will try to download model files from the Internet. To rectify this, you
may run `python3 scripts/preload_models.py` after you have installed GFPGAN and all its
dependencies.
!!! warning "Internet connection needed"
Users whose GPU machines are isolated from the Internet (e.g.
on a University cluster) should be aware that the first time you run dream.py with GFPGAN and
Real-ESRGAN turned on, it will try to download model files from the Internet. To rectify this, you
may run `python3 scripts/preload_models.py` after you have installed GFPGAN and all its
dependencies.
## **Usage**
@ -89,16 +93,16 @@ This also works with img2img:
dream> a man wearing a pineapple hat -I path/to/your/file.png -U 2 0.5 -G 0.6
```
### **Note**
!!! note
GFPGAN and Real-ESRGAN are both memory intensive. In order to avoid crashes and memory overloads
during the Stable Diffusion process, these effects are applied after Stable Diffusion has completed
its work.
GFPGAN and Real-ESRGAN are both memory intensive. In order to avoid crashes and memory overloads
during the Stable Diffusion process, these effects are applied after Stable Diffusion has completed
its work.
In single image generations, you will see the output right away but when you are using multiple
iterations, the images will first be generated and then upscaled and face restored after that
process is complete. While the image generation is taking place, you will still be able to preview
the base images.
In single image generations, you will see the output right away but when you are using multiple
iterations, the images will first be generated and then upscaled and face restored after that
process is complete. While the image generation is taking place, you will still be able to preview
the base images.
If you wish to stop during the image generation but want to upscale or face restore a particular
generated image, pass it again with the same prompt and generated seed along with the `-U` and `-G`
@ -139,3 +143,22 @@ that is the best restoration possible. This may deviate slightly from the origin
excellent option to use in situations when there is very little facial data to work with.
`<prompt> -G 1.0 -ft codeformer -cf 0.1`
## Fixing Previously-Generated Images
It is easy to apply face restoration and/or upscaling to any previously-generated file. Just use the
syntax `!fix path/to/file.png <options>`. For example, to apply GFPGAN at strength 0.8 and upscale 2X
for a file named `./outputs/img-samples/000044.2945021133.png`, just run:
~~~~
dream> !fix ./outputs/img-samples/000044.2945021133.png -G 0.8 -U 2
~~~~
A new file named `000044.2945021133.fixed.png` will be created in the output directory. Note that
the `!fix` command does not replace the original file, unlike the behavior at generate time.
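
As a small illustration of the naming pattern shown above, the following Python sketch derives the `.fixed.png` file name from an input path. The helper `fixed_name` is hypothetical and only reproduces the name from the example; it does not call or describe the actual `dream.py` code, and it does not decide which directory the file is written to.

```python
from pathlib import Path


def fixed_name(original: str) -> str:
    """Hypothetical helper: insert '.fixed' before the extension of a file name."""
    path = Path(original)
    # stem keeps everything before the final extension, e.g. '000044.2945021133'
    return f"{path.stem}.fixed{path.suffix}"


print(fixed_name("./outputs/img-samples/000044.2945021133.png"))
```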
**Disabling:**
If, for some reason, you do not wish to load the GFPGAN and/or ESRGAN libraries, you can disable them
on the dream.py command line with the `--no_restore` and `--no_upscale` options, respectively.

View File

@ -2,6 +2,10 @@
title: Variations
---
# :material-tune-variant: Variations
## Intro
Release 1.13 of SD-Dream adds support for image variations.
You are able to do the following:
@ -29,7 +33,7 @@ This will be indicated as `prompt` in the examples below.
First we let SD create a series of images in the usual way, in this case
requesting six iterations:
```
```bash
dream> lucy lawless as xena, warrior princess, character portrait, high resolution -n6
...
Outputs:
@ -41,9 +45,10 @@ Outputs:
./outputs/Xena/000001.3357757885.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -S3357757885
```
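
If you want to pick the seed out of an output line like the one above programmatically, a minimal Python sketch might look like this. It assumes the line format shown in this walkthrough (a trailing `-S<seed>` switch); the function name `seed_from_log_line` is hypothetical and not part of the distribution.

```python
import re
from typing import Optional


def seed_from_log_line(line: str) -> Optional[int]:
    """Hypothetical helper: return the integer after '-S', or None if absent."""
    # The match is case-sensitive, so the lowercase '-s50' (steps) is not picked up.
    match = re.search(r"(?:^|\s)-S\s*(\d+)", line)
    return int(match.group(1)) if match else None


line = './outputs/Xena/000001.3357757885.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -S3357757885'
print(seed_from_log_line(line))
```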
The one with seed 3357757885 looks nice:
![var1](../assets/variation_walkthru/000001.3357757885.png)
<figure markdown>
![var1](../assets/variation_walkthru/000001.3357757885.png)
<figcaption>Seed 3357757885 looks nice</figcaption>
</figure>
---
@ -75,15 +80,21 @@ used to generate it.
This gives us a series of closely-related variations, including the two shown
here.
![var2](../assets/variation_walkthru/000002.3647897225.png)
<figure markdown>
![var2](../assets/variation_walkthru/000002.3647897225.png)
<figcaption>subseed 3647897225</figcaption>
</figure>
![var3](../assets/variation_walkthru/000002.1614299449.png)
<figure markdown>
![var3](../assets/variation_walkthru/000002.1614299449.png)
<figcaption>subseed 1614299449</figcaption>
</figure>
I like the expression on Xena's face in the first one (subseed 3647897225), and
the armor on her shoulder in the second one (subseed 1614299449). Can we combine
them to get the best of both worlds?
We combine the two variations using `-V` (--with_variations). Again, we must
We combine the two variations using `-V` (`--with_variations`). Again, we must
provide the seed for the originally-chosen image in order for this to work.
```bash
@ -95,7 +106,9 @@ Outputs:
Here we are providing equal weights (0.1 and 0.1) for both the subseeds. The
resulting image is close, but not exactly what I wanted:
![var4](../assets/variation_walkthru/000003.1614299449.png)
<figure markdown>
![var4](../assets/variation_walkthru/000003.1614299449.png)
</figure>
We could either try combining the images with different weights, or we can
generate more variations around the almost-but-not-quite image. We do the
@ -116,7 +129,10 @@ Outputs:
This produces six images, all slight variations on the combination of the chosen
two images. Here's the one I like best:
![var5](../assets/variation_walkthru/000004.3747154981.png)
<figure markdown>
![var5](../assets/variation_walkthru/000004.3747154981.png)
<figcaption>000004.3747154981.png</figcaption>
</figure>
As you can see, this is a very powerful tool, which when combined with subprompt
weighting, gives you great control over the content and quality of your

View File

@ -2,8 +2,10 @@
title: Barebones Web Server
---
# :material-web: Barebones Web Server
As of version 1.10, this distribution comes with a bare bones web server (see
screenshot). To use it, run the `dream.py` script by adding the `**--web**`
screenshot). To use it, run the `dream.py` script by adding the `--web`
option.
```bash