From 37c921dfe2aa25342934a101bf83eea4c0f5cfb7 Mon Sep 17 00:00:00 2001
From: Lincoln Stein
Date: Sun, 11 Sep 2022 14:26:41 -0400
Subject: [PATCH] documentation enhancements

---
 docs/features/CLI.md     | 186 ++++++++++++++++++++++++++++++++++++++-
 docs/features/IMG2IMG.md |  27 ++++--
 docs/features/OTHER.md   |  24 +++--
 3 files changed, 219 insertions(+), 18 deletions(-)

diff --git a/docs/features/CLI.md b/docs/features/CLI.md
index e475da7d64..ee23cca448 100644
--- a/docs/features/CLI.md
+++ b/docs/features/CLI.md
@@ -41,10 +41,188 @@ dream> q

-The `dream>` prompt's arguments are pretty much identical to those used in the Discord bot, except you don't need to type "!dream" (it doesn't
-hurt if you do). A significant change is that creation of individual images is now the default unless --grid (-g) is given.
+The `dream>` prompt's arguments are pretty much identical to those
+used in the Discord bot, except you don't need to type "!dream" (it
+doesn't hurt if you do). A significant change is that creation of
+individual images is now the default unless --grid (-g) is given.
+A full list is given in
+[List of prompt arguments](#list-of-prompt-arguments).

-For backward compatibility, the -i switch is recognized. For command-line help type -h (or --help) at the dream> prompt.
+# Arguments

-The script itself also recognizes a series of command-line switches that will change important global defaults, such as the directory for
+The script itself also recognizes a series of command-line switches
+that will change important global defaults, such as the directory for
 image outputs and the location of the model weight files.
+
+## List of arguments recognized at the command line:
+
+These command-line arguments can be passed to dream.py when you first
+run it from the Windows, Mac or Linux command line. Some set defaults
+that can be overridden on a per-prompt basis (see [List of prompt
+arguments](#list-of-prompt-arguments)). Others can be set only at launch.
+
+| Argument | Shortcut | Default | Description |
+|--------------------|------------|---------------------|--------------|
+| --help | -h | | Print a concise help message. |
+| --outdir | -o | outputs/img_samples | Location for generated images. |
+| --prompt_as_dir | -p | False | Name output directories using the prompt text. |
+| --from_file | | None | Read list of prompts from a file. Use "-" to read from standard input. |
+| --model | | stable-diffusion-1.4 | Loads model specified in configs/models.yaml. Currently one of "stable-diffusion-1.4" or "laion400m". |
+| --full_precision | -F | False | Run in slower full-precision mode. Needed for Macintosh M1/M2 hardware and some older video cards. |
+| --web | | False | Start in web server mode. |
+| --host | | localhost | Which network interface web server should listen on. Set to 0.0.0.0 to listen on any. |
+| --port | | 9090 | Which port web server should listen for requests on. |
+| --config | | configs/models.yaml | Configuration file for models and their weights. |
+| --iterations | -n | 1 | How many images to generate per prompt. |
+| --grid | -g | False | Save all image series as a grid rather than individually. |
+| --sampler | -A | k_lms | Sampler to use. Use -h to get list of available samplers. |
+| --seamless | | False | Create interesting effects by tiling elements of the image. |
+| --embedding_path | | None | Path to pre-trained embedding manager checkpoints, for custom models. |
+| --gfpgan_dir | | src/gfpgan | Path to where GFPGAN is installed. |
+| --gfpgan_model_path | | experiments/pretrained_models/GFPGANv1.3.pth | Path to GFPGAN model file, relative to --gfpgan_dir. |
+| --device | -d | torch.cuda.current_device() | Device to run SD on, e.g. "cuda:0". |
+
+These arguments are deprecated but still work:
+
+| Argument | Shortcut | Default | Description |
+|--------------------|------------|---------------------|--------------|
+| --weights | | None | Path to weights file; use `--model stable-diffusion-1.4` instead |
+| --laion400m | -l | False | Use older LAION400m weights; use `--model=laion400m` instead |
+
+**A note on path names:** On Windows systems, you may run into
+ problems when passing the dream script standard backslashed path
+ names because the Python interpreter treats "\" as an escape.
+ You can either double your backslashes (ick): `C:\\path\\to\\my\\file`, or
+ use Linux/Mac style forward slashes (better): `C:/path/to/my/file`.
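As a rough illustration of why single backslashes cause trouble, the sketch below assumes the prompt line is split with shell-like rules, here Python's `shlex.split`. Whether dream.py tokenizes its input exactly this way is an assumption, and the `tokenize` helper is hypothetical.

```python
import shlex


def tokenize(command):
    """Split a prompt line with POSIX shell rules (assumed, not confirmed, behaviour)."""
    return shlex.split(command)


# Raw strings are used so Python itself does not interpret the backslashes;
# each string holds exactly what a user would type at the prompt.
single = tokenize(r"-I C:\path\to\my\file")        # backslashes are consumed as escapes
doubled = tokenize(r"-I C:\\path\\to\\my\\file")   # each doubled pair yields one backslash
forward = tokenize("-I C:/path/to/my/file")        # forward slashes pass through untouched

print(single)
print(doubled)
print(forward)
```

Under this assumption the single-backslash path loses its separators, while the doubled and forward-slash forms both reach the script as usable paths.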
+
+## List of prompt arguments
+
+After the dream.py script initializes, it will present you with a
+**dream>** prompt. Here you can enter information to generate images
+from text (txt2img), to embellish an existing image or sketch
+(img2img), or to selectively alter chosen regions of the image
+(inpainting).
+
+### This is an example of txt2img:
+
+~~~~
+dream> waterfall and rainbow -W640 -H480
+~~~~
+
+This will create the requested image with the dimensions 640 (width)
+and 480 (height).
+
+Here are the dream> commands that apply to txt2img:
+
+| Argument | Shortcut | Default | Description |
+|--------------------|------------|---------------------|--------------|
+| "my prompt" | | | Text prompt to use. The quotation marks are optional. |
+| --width | -W | 512 | Width of generated image |
+| --height | -H | 512 | Height of generated image |
+| --iterations | -n | 1 | How many images to generate from this prompt |
+| --steps | -s | 50 | How many steps of refinement to apply |
+| --cfg_scale | -C | 7.5 | How "hard to try" to match image to prompt |
+| --seed | -S | None | Set the random seed for the next series of images. This can be used to recreate an image generated previously. |
+| --sampler | -A | k_lms | Sampler to use. Use -h to get list of available samplers. |
+| --grid | -g | False | Turn on grid mode to return a single image combining all the images generated by this prompt |
+| --individual | -i | True | Turn off grid mode (deprecated; leave off --grid instead) |
+| --outdir | -o | outputs/img_samples | Temporarily change the location of these images |
+| --seamless | | False | Activate seamless tiling for interesting effects |
+| --log_tokenization | -t | False | Display a color-coded list of the parsed tokens derived from the prompt |
+| --skip_normalization | -x | False | Weighted subprompts will not be normalized. See [Weighted Prompts](./OTHER.md#weighted-prompts) |
+| --upscale | -U | -U 1 0.75 | Upscale image by magnification factor (2, 4), and set strength of upscaling (0.0-1.0). If strength not set, will default to 0.75. |
+| --gfpgan_strength | -G | -G0 | Fix faces using the GFPGAN algorithm; argument indicates how hard the algorithm should try (0.0-1.0) |
+| --save_original | -save_orig | False | When upscaling or fixing faces, this will cause the original image to be saved rather than replaced. |
+| --variation | -v | 0.0 | Add a bit of noise (0.0=none, 1.0=high) to the image in order to generate a series of variations. Usually used in combination with -S and -n to generate a series of riffs on a starting image. See [Variations](./VARIATIONS.md). |
+| --with_variations | -V | None | Combine two or more variations. See [Variations](./VARIATIONS.md) for how to use this. |
+
+Note that the width and height of the image must be multiples of
+64. You can provide different values, but they will be rounded down to
+the nearest multiple of 64.
+
+
+### This is an example of img2img:
+
+~~~~
+dream> waterfall and rainbow -I./vacation-photo.png -W640 -H480 --fit
+~~~~
+
+This will modify the indicated vacation photograph by making it more
+like the prompt. Results will vary greatly depending on what is in the
+image. We also ask to --fit the image into a box no bigger than
+640x480. Otherwise the image size will be identical to the provided
+photo and you may run out of memory if it is large.
+
+In addition to the command-line options recognized by txt2img, img2img
+accepts additional options:
+
+| Argument | Shortcut | Default | Description |
+|--------------------|------------|---------------------|--------------|
+| --init_img | -I | None | Path to the initialization image |
+| --fit | -F | False | Scale the image to fit into the specified -H and -W dimensions |
+| --strength | -f | 0.75 | How much to modify the initial image to match the prompt. Ranges from 0.0 to 0.99, with higher values replacing more of the initial image. |
+
+### This is an example of inpainting:
+
+~~~~
+dream> waterfall and rainbow -I./vacation-photo.png -M./vacation-mask.png -W640 -H480 --fit
+~~~~
+
+This will do the same thing as img2img, but image alterations will
+only occur within transparent areas defined by the mask file specified
+by -M. You may also supply just a single initial image with the areas
+to overpaint made transparent, but you must be careful not to destroy
+the pixels underneath when you create the transparent areas. See
+[Inpainting](./INPAINTING.md) for details.
+
+Inpainting accepts all the arguments used for txt2img and img2img, as
+well as the --init_mask (-M) argument:
+
+| Argument | Shortcut | Default | Description |
+|--------------------|------------|---------------------|--------------|
+| --init_mask | -M | None | Path to an image the same size as the initial image, with areas for inpainting made transparent. |
+
+
+# Command-line editing and completion
+
+If you are on a Macintosh or Linux machine, the command line offers
+convenient history tracking, editing, and command completion.
+
+- To scroll through previous commands and potentially edit/reuse them, use the up and down cursor keys.
+- To edit the current command, use the left and right cursor keys to position the cursor, and then backspace, delete or insert characters.
+- To move to the very beginning of the command, type CTRL-A (or command-A on the Mac).
+- To move to the end of the command, type CTRL-E.
+- To cut a section of the command, position the cursor where you want to start cutting and type CTRL-K.
+- To paste a cut section back in, position the cursor where you want to paste, and type CTRL-Y.
+
+Windows users can get similar, but more limited, functionality if they
+launch dream.py with the "winpty" program:
+
+~~~
+> winpty python scripts\dream.py
+~~~
+
+On the Mac and Linux platforms, when you exit dream.py, the last 1000
+lines of your command-line history will be saved. When you restart
+dream.py, you can access the saved history using the up-arrow key.
+
+In addition, limited command-line completion is installed. In various
+contexts, you can start typing your command and press tab. A list of
+potential completions will be presented to you. You can then type a
+little more, hit tab again, and eventually autocomplete what you want.
+
+When specifying file paths using the one-letter shortcuts, the CLI
+will attempt to complete pathnames for you. This is most handy for the
+-I (init image) and -M (init mask) paths. To initiate completion, start
+the path with a slash ("/") or "./". For example:
+
+~~~
+dream> zebra with a mustache -I./test-pictures
+-I./test-pictures/Lincoln-and-Parrot.png -I./test-pictures/zebra.jpg -I./test-pictures/madonna.png
+-I./test-pictures/bad-sketch.png -I./test-pictures/man_with_eagle/
+~~~
+
+You can then type "z", hit tab again, and it will autofill to "zebra.jpg".
+
+More text completion features (such as autocompleting seeds) are on their way.
+
diff --git a/docs/features/IMG2IMG.md b/docs/features/IMG2IMG.md
index 64dab6b80c..ac560f6984 100644
--- a/docs/features/IMG2IMG.md
+++ b/docs/features/IMG2IMG.md
@@ -1,15 +1,30 @@
 # **Image-to-Image**

-This script also provides an img2img feature that lets you seed your creations with an initial drawing or photo. This is a really cool feature that tells stable diffusion to build the prompt on top of the image you provide, preserving the original's basic shape and layout. To use it, provide the `--init_img` option as shown here:
+This script also provides an img2img feature that lets you seed your
+creations with an initial drawing or photo. This is a really cool
+feature that tells stable diffusion to build the prompt on top of the
+image you provide, preserving the original's basic shape and
+layout. To use it, provide the `--init_img` option as shown here:

 ```
 dream> "waterfall and rainbow" --init_img=./init-images/crude_drawing.png --strength=0.5 -s100 -n4
 ```

-The `--init_img (-I)` option gives the path to the seed picture. `--strength (-f)` controls how much the original will be modified, ranging from `0.0` (keep the original intact), to `1.0` (ignore
-the original completely). The default is `0.75`, and ranges from `0.25-0.75` give interesting results.
+The `--init_img (-I)` option gives the path to the seed
+picture. `--strength (-f)` controls how much the original will be
+modified, ranging from `0.0` (keep the original intact), to `1.0`
+(ignore the original completely). The default is `0.75`, and values
+in the range `0.25-0.75` give interesting results.

-You may also pass a `-v` option to generate count variants on the original image. This is done by passing the first generated image back into img2img the requested number of times. It generates interesting variants.
+You may also pass a `-v` option to generate count variants on
+the original image. This is done by passing the first generated image
+back into img2img the requested number of times. It generates
+interesting variants.

-If the initial image contains transparent regions, then Stable Diffusion will only draw within the transparent regions, a process
-called "inpainting". However, for this to work correctly, the color information underneath the transparent needs to be preserved, not erased. See [Creating Transparent Images For Inpainting](./INPAINTING.md#creating-transparent-regions-for-inpainting) for details.
+If the initial image contains transparent regions, then Stable
+Diffusion will only draw within the transparent regions, a process
+called "inpainting". However, for this to work correctly, the color
+information underneath the transparent regions needs to be preserved,
+not erased. See [Creating Transparent Images For
+Inpainting](./INPAINTING.md#creating-transparent-regions-for-inpainting)
+for details.
diff --git a/docs/features/OTHER.md b/docs/features/OTHER.md
index f8217691a0..3853b185ed 100644
--- a/docs/features/OTHER.md
+++ b/docs/features/OTHER.md
@@ -83,8 +83,11 @@ For example consider this prompt:
 tabby cat:0.25 white duck:0.75 hybrid
 ```

-This will tell the sampler to invest 25% of its effort on the tabby cat aspect of the image and 75% on the white duck aspect (surprisingly, this example actually works). The prompt weights can
-use any combination of integers and floating point numbers, and they do not need to add up to 1.
+This will tell the sampler to invest 25% of its effort on the tabby
+cat aspect of the image and 75% on the white duck aspect
+(surprisingly, this example actually works). The prompt weights can
+use any combination of integers and floating point numbers, and they
+do not need to add up to 1.

 ---

@@ -93,22 +96,27 @@ use any combination of integers and floating point numbers, and they do not need
 For programmers who wish to incorporate stable-diffusion into other products, this repository includes a simplified API for text to image generation, which lets you create images from a prompt in just three lines of code:

 ```
-from ldm.simplet2i import T2I
-model = T2I()
-outputs = model.txt2img("a unicorn in manhattan")
+from ldm.generate import Generate
+g = Generate()
+outputs = g.txt2img("a unicorn in manhattan")
 ```

 Outputs is a list of lists in the format [filename1,seed1],[filename2,seed2]...].

-Please see ldm/simplet2i.py for more information. A set of example scripts is coming RSN.
+Please see ldm/generate.py for more information. A set of example scripts is coming RSN.

 ---

 ## **Preload Models**

-In situations where you have limited internet connectivity or are blocked behind a firewall, you can use the preload script to preload the required files for Stable Diffusion to run.
+In situations where you have limited internet connectivity or are
+blocked behind a firewall, you can use the preload script to preload
+the required files for Stable Diffusion to run.

-The preload script `scripts/preload_models.py` needs to be run once at least while connected to the internet. In the following runs, it will load up the cached versions of the required files from the `.cache` directory of the system.
+The preload script `scripts/preload_models.py` needs to be run at least
+once while connected to the internet. On subsequent runs, it will
+load the cached versions of the required files from the `.cache`
+directory of the system.

 ```
 (ldm) ~/stable-diffusion$ python3 ./scripts/preload_models.py