finished CLI, IMG2IMG and WEB updates

2024-08-30 20:32:17 +00:00 · 2023-02-08 12:45:56 -05:00
parent 9d69843a9d
commit c6a2ba12e2
6 changed files with 201 additions and 212 deletions
--- a/docs/features/CLI.md
+++ b/docs/features/CLI.md
@ -6,38 +6,51 @@ title: Command-Line Interface

 ## **Interactive Command Line Interface**

-The `invoke.py` script, located in `scripts/`, provides an interactive interface
-to image generation similar to the "invoke mothership" bot that Stable AI
-provided on its Discord server.
+The InvokeAI command line interface (CLI) provides scriptable access
+to InvokeAI's features. Some advanced features are only available
+through the CLI, though they eventually find their way into the WebUI.

-Unlike the `txt2img.py` and `img2img.py` scripts provided in the original
-[CompVis/stable-diffusion](https://github.com/CompVis/stable-diffusion) source
-code repository, the time-consuming initialization of the AI model
-initialization only happens once. After that image generation from the
-command-line interface is very fast.
+The CLI is accessible from the `invoke.sh`/`invoke.bat` launcher by
+selecting option (1). Alternatively, it can be launched directly from
+the command line by activating the InvokeAI environment and giving the
+command:
+
+```bash
+invokeai
+```
+
+After some startup messages, you will be presented with the `invoke> `
+prompt. Here you can type prompts to generate images and issue other
+commands to load and manipulate generative models. The CLI has a large
+number of command-line options that control its behavior. To get a
+concise summary of the options, call `invokeai` with the `--help` argument:
+
+```bash
+invokeai --help
+```

 The script uses the readline library to allow for in-line editing, command
 history (++up++ and ++down++), autocompletion, and more. To help keep track of
 which prompts generated which images, the script writes a log file of image
 names and prompts to the selected output directory.

-In addition, as of version 1.02, it also writes the prompt into the PNG file's
-metadata where it can be retrieved using `scripts/images2prompt.py`
-
-The script is confirmed to work on Linux, Windows and Mac systems.
-
-!!! note
-
-    This script runs from the command-line or can be used as a Web application. The Web GUI is
-    currently rudimentary, but a much better replacement is on its way.
+Here is a typical session:

 ```bash
-(invokeai) ~/stable-diffusion$ python3 ./scripts/invoke.py
+PS1:C:\Users\fred> invokeai
 * Initializing, be patient...
-Loading model from models/ldm/text2img-large/model.ckpt
-(...more initialization messages...)
-
-* Initialization done! Awaiting your command...
+* Initializing, be patient...
+>> Initialization file /home/lstein/invokeai/invokeai.init found. Loading...
+>> Internet connectivity is True
+>> InvokeAI, version 2.3.0-rc5
+>> InvokeAI runtime directory is "/home/lstein/invokeai"
+>> GFPGAN Initialized
+>> CodeFormer Initialized
+>> ESRGAN Initialized
+>> Using device_type cuda
+>> xformers memory-efficient attention is available and enabled
+     (...more initialization messages...)
+* Initialization done! Awaiting your command (-h for help, 'q' to quit)
 invoke> ashley judd riding a camel -n2 -s150
 Outputs:
   outputs/img-samples/00009.png: "ashley judd riding a camel" -n2 -s150 -S 416354203
@ -47,27 +60,15 @@ invoke> "there's a fly in my soup" -n6 -g
    outputs/img-samples/00011.png: "there's a fly in my soup" -n6 -g -S 2685670268
    seeds for individual rows: [2685670268, 1216708065, 2335773498, 822223658, 714542046, 3395302430]
 invoke> q
-
-# this shows how to retrieve the prompt stored in the saved image's metadata
-(invokeai) ~/stable-diffusion$ python ./scripts/images2prompt.py outputs/img_samples/*.png
-00009.png: "ashley judd riding a camel" -s150 -S 416354203
-00010.png: "ashley judd riding a camel" -s150 -S 1362479620
-00011.png: "there's a fly in my soup" -n6 -g -S 2685670268
 ```

 ![invoke-py-demo](../assets/dream-py-demo.png)

-The `invoke>` prompt's arguments are pretty much identical to those used in the
-Discord bot, except you don't need to type `!invoke` (it doesn't hurt if you
-do). A significant change is that creation of individual images is now the
-default unless `--grid` (`-g`) is given. A full list is given in
-[List of prompt arguments](#list-of-prompt-arguments).
-
 ## Arguments

-The script itself also recognizes a series of command-line switches that will
-change important global defaults, such as the directory for image outputs and
-the location of the model weight files.
+The script recognizes a series of command-line switches that will
+change important global defaults, such as the directory for image
+outputs and the location of the model weight files.

 ### List of arguments recognized at the command line

@ -82,10 +83,14 @@ overridden on a per-prompt basis (see
 | `--outdir <path>`                         | `-o<path>`                                | `outputs/img_samples`                          | Location for generated images.                                                                       |
 | `--prompt_as_dir`                         | `-p`                                      | `False`                                        | Name output directories using the prompt text.                                                       |
 | `--from_file <path>`                      |                                           | `None`                                         | Read list of prompts from a file. Use `-` to read from standard input                                |
-| `--model <modelname>`                     |                                           | `stable-diffusion-1.4`                         | Loads model specified in configs/models.yaml. Currently one of "stable-diffusion-1.4" or "laion400m" |
-| `--full_precision`                        | `-F`                                      | `False`                                        | Run in slower full-precision mode. Needed for Macintosh M1/M2 hardware and some older video cards.   |
+| `--model <modelname>`                     |                                           | `stable-diffusion-1.5`                         | Loads the initial model specified in configs/models.yaml. |
+| `--ckpt_convert`                         |                                           | `False`                                        | If provided, both `.ckpt` and `.safetensors` files will be auto-converted into diffusers format in memory. |
+| `--autoconvert <path>`                    |                                           | `None`                                         | On startup, scan the indicated directory for new `.ckpt`/`.safetensors` files and automatically convert and import them. |
+| `--precision`                             |                                           | `fp16`                                         | Provide `fp32` for full-precision mode, `fp16` for half-precision. `fp32` is needed for Macintoshes and some NVIDIA cards. |
 | `--png_compression <0-9>`                 | `-z<0-9>`                                 | `6`                                            | Select level of compression for output files, from 0 (no compression) to 9 (max compression)         |
 | `--safety-checker`                        |                                           | `False`                                        | Activate safety checker for NSFW and other potentially disturbing imagery                            |
+| `--patchmatch`, `--no-patchmatch`                 |                                   | `--patchmatch`                                        | Load/Don't load the PatchMatch inpainting extension    |
+| `--xformers`, `--no-xformers`                 |                                   | `--xformers`                                        | Load/Don't load the Xformers memory-efficient attention module (CUDA only)    |
 | `--web`                                   |                                           | `False`                                        | Start in web server mode                                                                             |
 | `--host <ip addr>`                        |                                           | `localhost`                                    | Which network interface web server should listen on. Set to 0.0.0.0 to listen on any.                |
 | `--port <port>`                           |                                           | `9090`                                         | Which port web server should listen for requests on.                                                 |
@ -109,6 +114,7 @@ overridden on a per-prompt basis (see

    | Argument           |  Shortcut  |  Default            |  Description |
    |--------------------|------------|---------------------|--------------|
+    | `--full_precision`  |             | `False`                | Same as `--precision=fp32`|
    | `--weights <path>`   |            | `None`                | Path to weights file; use `--model stable-diffusion-1.4` instead |
    | `--laion400m`        | `-l`         | `False`               | Use older LAION400m weights; use `--model=laion400m` instead |

@ -336,8 +342,10 @@ useful for debugging the text masking process prior to inpainting with the

 ### Model selection and importation

-The CLI allows you to add new models on the fly, as well as to switch among them
-rapidly without leaving the script.
+The CLI allows you to add new models on the fly, as well as to switch
+among them rapidly without leaving the script. There are several
+different model formats, each described in the [Model Installation
+Guide](050_INSTALLING_MODELS.md).

 #### `!models`

@ -347,9 +355,9 @@ model is bold-faced
 Example:

 <pre>
-laion400m                 not loaded  <no description>
-<b>stable-diffusion-1.4          active  Stable Diffusion v1.4</b>
-waifu-diffusion           not loaded  Waifu Diffusion v1.3
+inpainting-1.5            not loaded  Stable Diffusion inpainting model
+<b>stable-diffusion-1.5          active  Stable Diffusion v1.5</b>
+waifu-diffusion           not loaded  Waifu Diffusion v1.4
 </pre>

 #### `!switch <model>`
@ -361,43 +369,30 @@ Note how the second column of the `!models` table changes to `cached` after a
 model is first loaded, and that the long initialization step is not needed when
 loading a cached model.

-<pre>
-invoke> !models
-laion400m                 not loaded  <no description>
-<b>stable-diffusion-1.4          cached  Stable Diffusion v1.4</b>
-waifu-diffusion               active  Waifu Diffusion v1.3
+#### `!import_model <hugging_face_repo_ID>`

-invoke> !switch waifu-diffusion
->> Caching model stable-diffusion-1.4 in system RAM
->> Loading waifu-diffusion from models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt
-   | LatentDiffusion: Running in eps-prediction mode
-   | DiffusionWrapper has 859.52 M params.
-   | Making attention of type 'vanilla' with 512 in_channels
-   | Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
-   | Making attention of type 'vanilla' with 512 in_channels
-   | Using faster float16 precision
->> Model loaded in 18.24s
->> Max VRAM used to load the model: 2.17G
->> Current VRAM usage:2.17G
->> Setting Sampler to k_lms
+This imports and installs a `diffusers`-style model that is stored on
+the [HuggingFace Web Site](https://huggingface.co). You can look up
+any [Stable Diffusion diffusers
+model](https://huggingface.co/models?library=diffusers) and install it
+with a command like the following:

-invoke> !models
-laion400m                 not loaded  <no description>
-stable-diffusion-1.4          cached  Stable Diffusion v1.4
-<b>waifu-diffusion               active  Waifu Diffusion v1.3</b>
+```bash
+!import_model prompthero/openjourney
+```

-invoke> !switch stable-diffusion-1.4
->> Caching model waifu-diffusion in system RAM
->> Retrieving model stable-diffusion-1.4 from system RAM cache
->> Setting Sampler to k_lms
+#### `!import_model <path/to/diffusers/directory>`

-invoke> !models
-laion400m                 not loaded  <no description>
-<b>stable-diffusion-1.4          active  Stable Diffusion v1.4</b>
-waifu-diffusion               cached  Waifu Diffusion v1.3
-</pre>
+If you have a copy of a `diffusers`-style model saved to disk, you can
+import it by passing the path to the model's top-level directory.

-#### `!import_model <path/to/model/weights>`
+#### `!import_model <url>`
+
+For a `.ckpt` or `.safetensors` file, if you have a direct download
+URL for the file, you can provide it to `!import_model` and the file
+will be downloaded and installed for you.
+
+#### `!import_model <path/to/model/weights.ckpt>`

 This command imports a new model weights file into InvokeAI, makes it available
 for image generation within the script, and writes out the configuration for the
@ -417,35 +412,12 @@ below, the bold-faced text shows what the user typed in with the exception of
 the width, height and configuration file paths, which were filled in
 automatically.

-Example:
+#### `!import_model <path/to/directory_of_models>`

-<pre>
-invoke> <b>!import_model models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt</b>
->> Model import in process. Please enter the values needed to configure this model:
-
-Name for this model: <b>waifu-diffusion</b>
-Description of this model: <b>Waifu Diffusion v1.3</b>
-Configuration file for this model: <b>configs/stable-diffusion/v1-inference.yaml</b>
-Default image width: <b>512</b>
-Default image height: <b>512</b>
->> New configuration:
-waifu-diffusion:
-  config: configs/stable-diffusion/v1-inference.yaml
-  description: Waifu Diffusion v1.3
-  height: 512
-  weights: models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt
-  width: 512
-OK to import [n]? <b>y</b>
->> Caching model stable-diffusion-1.4 in system RAM
->> Loading waifu-diffusion from models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt
-   | LatentDiffusion: Running in eps-prediction mode
-   | DiffusionWrapper has 859.52 M params.
-   | Making attention of type 'vanilla' with 512 in_channels
-   | Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
-   | Making attention of type 'vanilla' with 512 in_channels
-   | Using faster float16 precision
-invoke>
-</pre>
+If you provide the path of a directory that contains one or more
+`.ckpt` or `.safetensors` files, the CLI will scan the directory and
+interactively offer to import the models it finds there. Also see the
+`--autoconvert` command-line option.

 #### `!edit_model <name_of_model>`

@ -479,11 +451,6 @@ OK to import [n]? y
 ...
 </pre>

-======= invoke> !fix 000017.4829112.gfpgan-00.png --embiggen 3 ...lots of
-text... Outputs: [2] outputs/img-samples/000018.2273800735.embiggen-00.png: !fix
-"outputs/img-samples/000017.243781548.gfpgan-00.png" -s 50 -S 2273800735 -W 512
-H 512 -C 7.5 -A k_lms --embiggen 3.0 0.75 0.25 ```
-
 ### History processing

 The CLI provides a series of convenient commands for reviewing previous actions,
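
The `--from_file` switch listed in the `CLI.md` argument table above can be exercised with a small wrapper. The following is a minimal sketch and not part of this commit: it assumes one prompt per line (the table only says the file holds a list of prompts), uses the hypothetical file name `prompts.txt`, and prints the command instead of running it so that it works without an InvokeAI install.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical file name; assumed format is one prompt (plus per-prompt switches) per line.
prompt_file="prompts.txt"
cat > "$prompt_file" <<'EOF'
ashley judd riding a camel -n2 -s150
"there's a fly in my soup" -n6 -g
EOF

# Build the command as an array so that paths containing spaces stay intact.
cmd=(invokeai --from_file "$prompt_file" --outdir outputs/img_samples)

# Print the command rather than executing it.
echo "${cmd[@]}"
```

To launch the batch for real, replace the final `echo "${cmd[@]}"` with `"${cmd[@]}"`. Per the table, passing `-` instead of a path reads the prompts from standard input.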
--- a/docs/features/IMG2IMG.md
+++ b/docs/features/IMG2IMG.md
@ -4,13 +4,24 @@ title: Image-to-Image

 # :material-image-multiple: Image-to-Image

-## `img2img`
+Both the Web and command-line interfaces provide an "img2img" feature
+that lets you seed your creations with an initial drawing or
+photo. This is a really cool feature that tells stable diffusion to
+build the prompt on top of the image you provide, preserving the
+original's basic shape and layout.

-This script also provides an `img2img` feature that lets you seed your creations
-with an initial drawing or photo. This is a really cool feature that tells
-stable diffusion to build the prompt on top of the image you provide, preserving
-the original's basic shape and layout. To use it, provide the `--init_img`
-option as shown here:
+See the [WebUI Guide](WEB.md) for a walkthrough of the img2img feature
+in the InvokeAI web server. This document describes how to use img2img
+in the command-line tool.
+
+## Basic Usage
+
+Launch the command-line client by launching `invoke.sh`/`invoke.bat`
+and choosing option (1). Alternatively, activate the InvokeAI
+environment and issue the command `invokeai`.
+
+Once the `invoke> ` prompt appears, you can start an img2img render by
+pointing to a seed file with the `-I` option as shown here:

 !!! example ""

--- a/docs/features/WEB.md
+++ b/docs/features/WEB.md
@ -5,11 +5,14 @@ title: InvokeAI Web Server
 # :material-web: InvokeAI Web Server

 As of version 2.0.0, this distribution comes with a full-featured web server
-(see screenshot). To use it, run the `invoke.py` script by adding the `--web`
-option:
+(see screenshot).
+
+To use it, launch the `invoke.sh`/`invoke.bat` script and select
+option (2). Alternatively, with the InvokeAI environment active, run
+the `invokeai` script by adding the `--web` option:

 ```bash
-(invokeai) ~/InvokeAI$ python3 scripts/invoke.py --web
+invokeai --web
 ```

 You can then connect to the server by pointing your web browser at
@ -19,17 +22,23 @@ address of the host you are running it on, or the wildcard `0.0.0.0`. For
 example:

 ```bash
-(invokeai) ~/InvokeAI$ python3 scripts/invoke.py --web --host 0.0.0.0
+invoke.sh --host 0.0.0.0
 ```

-## Quick guided walkthrough of the WebGUI's features
+or

-While most of the WebGUI's features are intuitive, here is a guided walkthrough
+```bash
+invokeai --web --host 0.0.0.0
+```
+
+## Quick guided walkthrough of the WebUI's features
+
+While most of the WebUI's features are intuitive, here is a guided tour
 through its various components.

 ![Invoke Web Server - Major Components](../assets/invoke-web-server-1.png){:width="640px"}

-The screenshot above shows the Text to Image tab of the WebGUI. There are three
+The screenshot above shows the Text to Image tab of the WebUI. There are three
 main sections:

 1. A **control panel** on the left, which contains various settings for text to
@ -63,12 +72,14 @@ From top to bottom, these are:
 1. Text to Image - generate images from text
 2. Image to Image - from an uploaded starting image (drawing or photograph)
   generate a new one, modified by the text prompt
-3. Inpainting (pending) - Interactively erase portions of a starting image and
-   have the AI fill in the erased region from a text prompt.
-4. Outpainting (pending) - Interactively add blank space to the borders of a
-   starting image and fill in the background from a text prompt.
-5. Postprocessing (pending) - Interactively postprocess generated images using a
-   variety of filters.
+3. Unified Canvas - Interactively combine multiple images, extend them
+   with outpainting, and modify interior portions with inpainting:
+   erase portions of a starting image and have the AI fill in
+   the erased region from a text prompt.
+4. Workflow Management (not yet implemented) - this panel will allow you to create
+   pipelines of common operations and combine them into workflows.
+5. Training (not yet implemented) - this panel will provide an interface to [textual
+   inversion training](TEXTUAL_INVERSION.md) and fine tuning.

 The inpainting, outpainting and postprocessing tabs are currently in
 development. However, limited versions of their features can already be accessed
@ -76,18 +87,18 @@ through the Text to Image and Image to Image tabs.

 ## Walkthrough

-The following walkthrough will exercise most (but not all) of the WebGUI's
+The following walkthrough will exercise most (but not all) of the WebUI's
 feature set.

 ### Text to Image

-1. Launch the WebGUI using `python scripts/invoke.py --web` and connect to it
+1. Launch the WebUI using `invokeai --web` and connect to it
   with your browser by accessing `http://localhost:9090`. If the browser and
   server are running on different machines on your LAN, add the option
   `--host 0.0.0.0` to the launch command line and connect to the machine
   hosting the web server using its IP address or domain name.

-2. If all goes well, the WebGUI should come up and you'll see a green
+2. If all goes well, the WebUI should come up and you'll see a green
   `connected` message on the upper right.

 #### Basics
@ -234,7 +245,7 @@ walkthrough.

 2.  Drag-and-drop the Lincoln-and-Parrot image into the Image panel, or click
    the blank area to get an upload dialog. The image will load into an area
-    marked _Initial Image_. (The WebGUI will also load the most
+    marked _Initial Image_. (The WebUI will also load the most
    recently-generated image from the gallery into a section on the left, but
    this image will be replaced in the next step.)

@ -284,13 +295,17 @@ initial image" icons are located.

 ![Invoke Web Server - Use as Image Links](../assets/invoke-web-server-9.png){:width="640px"}

+### Unified Canvas
+
+See the [Unified Canvas Guide](UNIFIED_CANVAS.md)
+
 ## Parting remarks

 This concludes the walkthrough, but there are several more features that you can
 explore. Please check out the [Command Line Interface](CLI.md) documentation for
 further explanation of the advanced features that were not covered here.

-The WebGUI is only rapid development. Check back regularly for updates!
+The WebUI is under rapid development. Check back regularly for updates!

 ## Reference
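
The `--web`, `--host` and `--port` switches touched by the `WEB.md` changes above can be combined in a small launcher. This is a hedged sketch and not part of this commit: the environment variable names `INVOKEAI_HOST` and `INVOKEAI_PORT` are hypothetical, the defaults (`localhost`, `9090`) come from the argument table in `CLI.md`, and the script only prints the command and the URL to open.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical environment variables; the defaults mirror the documented ones.
host="${INVOKEAI_HOST:-localhost}"
port="${INVOKEAI_PORT:-9090}"

# Build the launch command as an array to keep each argument separate.
cmd=(invokeai --web --host "$host" --port "$port")

# 0.0.0.0 tells the server to listen on every interface; it is not an address
# to browse to, so show localhost for a browser on the same machine.
if [ "$host" = "0.0.0.0" ]; then
  browse_host="localhost"
else
  browse_host="$host"
fi
url="http://${browse_host}:${port}"

echo "launch: ${cmd[*]}"
echo "browse: ${url}"
```

When the browser runs on a different machine, substitute the server's IP address or domain name for `localhost` in the printed URL, as the walkthrough describes.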