<p>
<img src="../assets/dream-py-demo.png"/>
</p>

The `dream>` prompt's arguments are pretty much identical to those used in the Discord bot, except you don't need to type "!dream" (it doesn't hurt if you do). A significant change is that creation of individual images is now the default unless `--grid` (`-g`) is given. A full list is given in [List of prompt arguments](#list-of-prompt-arguments).

For backward compatibility, the -i switch is recognized. For command-line help, type -h (or --help) at the dream> prompt.

# Arguments

The script itself also recognizes a series of command-line switches that will change important global defaults, such as the directory for image outputs and the location of the model weight files.

## List of arguments recognized at the command line:

These command-line arguments can be passed to dream.py when you first run it from the Windows, Mac or Linux command line. Some set defaults that can be overridden on a per-prompt basis (see [List of prompt arguments](#list-of-prompt-arguments)). Others apply globally and can only be changed by relaunching the script.

| Argument | Shortcut | Default | Description |
|--------------------|------------|---------------------|--------------|
| --help | -h | | Print a concise help message. |
| --outdir <path> | -o<path> | outputs/img_samples | Location for generated images. |
| --prompt_as_dir | -p | False | Name output directories using the prompt text. |
| --from_file <path> | | None | Read list of prompts from a file. Use "-" to read from standard input. |
| --model <modelname> | | stable-diffusion-1.4 | Loads the model specified in configs/models.yaml. Currently one of "stable-diffusion-1.4" or "laion400m". |
| --full_precision | -F | False | Run in slower full-precision mode. Needed for Macintosh M1/M2 hardware and some older video cards. |
| --web | | False | Start in web server mode. |
| --host <ip addr> | | localhost | Which network interface the web server should listen on. Set to 0.0.0.0 to listen on any. |
| --port <port> | | 9090 | Which port the web server should listen on. |
| --config <path> | | configs/models.yaml | Configuration file for models and their weights. |
| --iterations <int> | -n<int> | 1 | How many images to generate per prompt. |
| --grid | -g | False | Save all image series as a grid rather than individually. |
| --sampler <sampler> | -A<sampler> | k_lms | Sampler to use. Use -h to get a list of available samplers. |
| --seamless | | False | Create interesting effects by tiling elements of the image. |
| --embedding_path <path> | | None | Path to pre-trained embedding manager checkpoints, for custom models. |
| --gfpgan_dir | | src/gfpgan | Path to where GFPGAN is installed. |
| --gfpgan_model_path | | experiments/pretrained_models/GFPGANv1.3.pth | Path to the GFPGAN model file, relative to --gfpgan_dir. |
| --device <device> | -d<device> | torch.cuda.current_device() | Device to run SD on, e.g. "cuda:0". |
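
For example, a hypothetical launch that redirects image output and forces full-precision mode (the directory name here is purely illustrative) might look like:

~~~
(ldm) ~/stable-diffusion$ python3 scripts/dream.py --outdir=outputs/experiments --full_precision
~~~
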
These arguments are deprecated but still work:

| Argument | Shortcut | Default | Description |
|--------------------|------------|---------------------|--------------|
| --weights <path> | | None | Path to weights file; use `--model stable-diffusion-1.4` instead. |
| --laion400m | -l | False | Use older LAION400m weights; use `--model=laion400m` instead. |

**A note on path names:** On Windows systems, you may run into problems when passing the dream script standard backslashed path names, because the Python interpreter treats "\" as an escape. You can either double your slashes (ick): C:\\path\\to\\my\\file, or use Linux/Mac style forward slashes (better): C:/path/to/my/file.

## List of prompt arguments

After the dream.py script initializes, it will present you with a **dream>** prompt. Here you can enter information to generate images from text (txt2img), to embellish an existing image or sketch (img2img), or to selectively alter chosen regions of the image (inpainting).

### This is an example of txt2img:

~~~~
dream> waterfall and rainbow -W640 -H480
~~~~

This will create the requested image with the dimensions 640 (width) and 480 (height).

Here are the dream> arguments that apply to txt2img:

| Argument | Shortcut | Default | Description |
|--------------------|------------|---------------------|--------------|
| "my prompt" | | | Text prompt to use. The quotation marks are optional. |
| --width <int> | -W<int> | 512 | Width of generated image. |
| --height <int> | -H<int> | 512 | Height of generated image. |
| --iterations <int> | -n<int> | 1 | How many images to generate from this prompt. |
| --steps <int> | -s<int> | 50 | How many steps of refinement to apply. |
| --cfg_scale <float> | -C<float> | 7.5 | How hard to try to match the image to the prompt. |
| --seed <int> | -S<int> | None | Set the random seed for the next series of images. This can be used to recreate an image generated previously. |
| --sampler <sampler> | -A<sampler> | k_lms | Sampler to use. Use -h to get a list of available samplers. |
| --grid | -g | False | Turn on grid mode to return a single image combining all the images generated by this prompt. |
| --individual | -i | True | Turn off grid mode (deprecated; leave off --grid instead). |
| --outdir <path> | -o<path> | outputs/img_samples | Temporarily change the location of these images. |
| --seamless | | False | Activate seamless tiling for interesting effects. |
| --log_tokenization | -t | False | Display a color-coded list of the parsed tokens derived from the prompt. |
| --skip_normalization | -x | False | Weighted subprompts will not be normalized. See [Weighted Prompts](./OTHER.md#weighted-prompts). |
| --upscale <int> <float> | -U <int> <float> | -U 1 0.75 | Upscale image by magnification factor (2, 4), and set strength of upscaling (0.0-1.0). If strength is not set, it defaults to 0.75. |
| --gfpgan_strength <float> | -G <float> | -G0 | Fix faces using the GFPGAN algorithm; the argument indicates how hard the algorithm should try (0.0-1.0). |
| --save_original | -save_orig | False | When upscaling or fixing faces, save the original image rather than replacing it. |
| --variation <float> | -v<float> | 0.0 | Add a bit of noise (0.0=none, 1.0=high) to the image in order to generate a series of variations. Usually used in combination with -S<seed> and -n<int> to generate a series of riffs on a starting image. See [Variations](./VARIATIONS.md). |
| --with_variations <pattern> | -V<pattern> | None | Combine two or more variations. See [Variations](./VARIATIONS.md) for how to use this. |
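
For example, the following hypothetical invocation combines several of these arguments, generating six images from one prompt at a fixed starting seed and collecting them into a grid:

~~~~
dream> blue sphere balanced on a red cube -n6 -S1234 -g
~~~~
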

Note that the width and height of the image must be multiples of 64. You can provide different values, but they will be rounded down to the nearest multiple of 64.

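For instance (hypothetical values), a request for a 750x500 image would actually produce a 704x448 one, since both dimensions are rounded down to the nearest multiple of 64:

~~~~
dream> old barn in winter -W750 -H500
~~~~
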
### This is an example of img2img:

~~~~
dream> waterfall and rainbow -I./vacation-photo.png -W640 -H480 --fit
~~~~

This will modify the indicated vacation photograph by making it more like the prompt. Results will vary greatly depending on what is in the image. We also ask to --fit the image into a box no bigger than 640x480. Otherwise the image size will be identical to that of the provided photo, and you may run out of memory if it is large.

In addition to the command-line options recognized by txt2img, img2img accepts additional options:

| Argument | Shortcut | Default | Description |
|--------------------|------------|---------------------|--------------|
| --init_img <path> | -I<path> | None | Path to the initialization image. |
| --fit | -F | False | Scale the image to fit into the specified -H and -W dimensions. |
| --strength <float> | -f<float> | 0.75 | How hard to try to match the prompt to the initial image. Ranges from 0.0-0.99, with higher values replacing the initial image completely. |
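
For example, a hypothetical invocation (the sketch file name is illustrative) that stays close to the original image with a low strength and fits the result into a 512x512 box might look like:

~~~~
dream> a watercolor painting of a lighthouse -I./rough-sketch.png -f0.4 --fit -W512 -H512
~~~~
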
### This is an example of inpainting:

~~~~
dream> waterfall and rainbow -I./vacation-photo.png -M./vacation-mask.png -W640 -H480 --fit
~~~~

This will do the same thing as img2img, but image alterations will only occur within transparent areas defined by the mask file specified by -M. You may also supply just a single initial image with the areas to overpaint made transparent, but you must be careful not to destroy the pixels underneath when you create the transparent areas. See [Inpainting](./INPAINTING.md) for details.

Inpainting accepts all the arguments used for txt2img and img2img, as well as the --mask (-M) argument:

| Argument | Shortcut | Default | Description |
|--------------------|------------|---------------------|--------------|
| --init_mask <path> | -M<path> | None | Path to an image the same size as the initial image, with areas for inpainting made transparent. |
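
As a sketch of how such a mask can be prepared (this assumes the Pillow library; the file names and coordinates are illustrative, not part of the script), copy the initial image and punch fully transparent holes over the regions to repaint:

```
from PIL import Image, ImageDraw

# Start from the initial image so the mask is guaranteed to be the same size
init = Image.open("vacation-photo.png").convert("RGBA")
mask = init.copy()

# Make a rectangular region fully transparent (alpha = 0); this is the
# area the inpainting pass will be allowed to repaint
draw = ImageDraw.Draw(mask)
draw.rectangle([100, 100, 300, 250], fill=(0, 0, 0, 0))

mask.save("vacation-mask.png")
```
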
# Command-line editing and completion

If you are on a Macintosh or Linux machine, the command line offers convenient history tracking, editing, and command completion.

- To scroll through previous commands and potentially edit/reuse them, use the up and down cursor keys.
- To edit the current command, use the left and right cursor keys to position the cursor, and then backspace, delete or insert characters.
- To move to the very beginning of the command, type CTRL-A (or command-A on the Mac).
- To move to the end of the command, type CTRL-E.
- To cut a section of the command, position the cursor where you want to start cutting and type CTRL-K.
- To paste a cut section back in, position the cursor where you want to paste, and type CTRL-Y.

Windows users can get similar, but more limited, functionality if they launch dream.py with the "winpty" program:

~~~
> winpty python scripts\dream.py
~~~

On the Mac and Linux platforms, when you exit dream.py, the last 1000 lines of your command-line history will be saved. When you restart dream.py, you can access the saved history using the up-arrow key.

In addition, limited command-line completion is installed. In various contexts, you can start typing your command and press tab. A list of potential completions will be presented to you. You can then type a little more, hit tab again, and eventually autocomplete what you want.

When specifying file paths using the one-letter shortcuts, the CLI will attempt to complete pathnames for you. This is most handy for the -I (init image) and -M (init mask) paths. To initiate completion, start the path with a slash ("/") or "./". For example:

~~~
dream> zebra with a mustache -I./test-pictures<TAB>
-I./test-pictures/Lincoln-and-Parrot.png -I./test-pictures/zebra.jpg -I./test-pictures/madonna.png
-I./test-pictures/bad-sketch.png -I./test-pictures/man_with_eagle/
~~~

You can then type "z", hit tab again, and it will autofill to "zebra.jpg".

More text completion features (such as autocompleting seeds) are on their way.

# **Image-to-Image**

This script also provides an img2img feature that lets you seed your creations with an initial drawing or photo. This is a really cool feature that tells stable diffusion to build the prompt on top of the image you provide, preserving the original's basic shape and layout. To use it, provide the `--init_img` option as shown here:

```
dream> "waterfall and rainbow" --init_img=./init-images/crude_drawing.png --strength=0.5 -s100 -n4
```

The `--init_img (-I)` option gives the path to the seed picture. `--strength (-f)` controls how much the original will be modified, ranging from `0.0` (keep the original intact) to `1.0` (ignore the original completely). The default is `0.75`, and values in the range `0.25-0.75` give interesting results.

You may also pass a `-v<count>` option to generate `count` variants on the original image. This is done by passing the first generated image back into img2img the requested number of times. It generates interesting variants.

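For example (reusing the hypothetical drawing above), the following would produce four variants of the seeded image:

```
dream> "waterfall and rainbow" --init_img=./init-images/crude_drawing.png -v4
```
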
If the initial image contains transparent regions, then Stable Diffusion will only draw within the transparent regions, a process called "inpainting". However, for this to work correctly, the color information underneath the transparent areas needs to be preserved, not erased. See [Creating Transparent Images For Inpainting](./INPAINTING.md#creating-transparent-regions-for-inpainting) for details.

For example consider this prompt:

```
tabby cat:0.25 white duck:0.75 hybrid
```

This will tell the sampler to invest 25% of its effort on the tabby cat aspect of the image and 75% on the white duck aspect (surprisingly, this example actually works). The prompt weights can use any combination of integers and floating point numbers, and they do not need to add up to 1.

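As a further (hypothetical) illustration that the weights need not sum to 1, this prompt asks the sampler to invest twice as much effort in the mountains as in the fog:

```
misty mountains:2 fog:1 sunrise:0.5
```
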
---

## **Simplified API**

For programmers who wish to incorporate stable-diffusion into other products, this repository includes a simplified API for text to image generation, which lets you create images from a prompt in just three lines of code:

```
from ldm.generate import Generate
g = Generate()
outputs = g.txt2img("a unicorn in manhattan")
```

Outputs is a list of lists in the format [[filename1,seed1],[filename2,seed2],...].
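
For instance (a sketch that assumes the return format described above), the results can be unpacked like this:

```
from ldm.generate import Generate

g = Generate()

# Each entry in the returned list is a [filename, seed] pair
for filename, seed in g.txt2img("a unicorn in manhattan"):
    print(f"{filename} was generated with seed {seed}")
```
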
Please see ldm/generate.py for more information. A set of example scripts is coming RSN.

---
## **Preload Models**

In situations where you have limited internet connectivity or are blocked behind a firewall, you can use the preload script to preload the required files for Stable Diffusion to run.

The preload script `scripts/preload_models.py` needs to be run at least once while connected to the internet. On subsequent runs, it will load the cached versions of the required files from the `.cache` directory of the system.

```
(ldm) ~/stable-diffusion$ python3 ./scripts/preload_models.py
```