add missing doc files

2024-08-30 20:32:17 +00:00 · 2022-10-09 11:38:39 -04:00
parent b1d43eae46
commit 5a22a83f4c
3 changed files with 302 additions and 42 deletions
--- a/README.md
+++ b/README.md
@@ -41,10 +41,13 @@ _This repository was formally known as lstein/stable-diffusion_
 [latest release link]: https://github.com/invoke-ai/InvokeAI/releases
 </div>

-This is a fork of [CompVis/stable-diffusion](https://github.com/CompVis/stable-diffusion), the open
-source text-to-image generator. It provides a streamlined process with various new features and
-options to aid the image generation process. It runs on Windows, Mac and Linux machines, and runs on
-GPU cards with as little as 4 GB or RAM.
+This is a fork of
+[CompVis/stable-diffusion](https://github.com/CompVis/stable-diffusion),
+the open source text-to-image generator. It provides a streamlined
+process with various new features and options to aid the image
+generation process. It runs on Windows, Mac and Linux machines, on
+GPU cards with as little as 4 GB of RAM. It provides both a polished
+Web interface and an easy-to-use command-line interface.

 _Note: This fork is rapidly evolving. Please use the
 [Issues](https://github.com/invoke-ai/InvokeAI/issues) tab to report bugs and make feature
@@ -109,6 +112,7 @@ you can try starting `invoke.py` with the `--precision=float32` flag:

 #### Major Features

+- [Web Server](docs/features/WEB.md)
 - [Interactive Command Line Interface](docs/features/CLI.md)
 - [Image To Image](docs/features/IMG2IMG.md)
 - [Inpainting Support](docs/features/INPAINTING.md)
@@ -116,7 +120,6 @@ you can try starting `invoke.py` with the `--precision=float32` flag:
 - [Upscaling, face-restoration and outpainting](docs/features/POSTPROCESS.md)
 - [Seamless Tiling](docs/features/OTHER.md#seamless-tiling)
 - [Google Colab](docs/features/OTHER.md#google-colab)
-- [Web Server](docs/features/WEB.md)
 - [Reading Prompts From File](docs/features/PROMPTS.md#reading-prompts-from-a-file)
 - [Shortcut: Reusing Seeds](docs/features/OTHER.md#shortcuts-reusing-seeds)
 - [Prompt Blending](docs/features/PROMPTS.md#prompt-blending)
--- a/docs/features/POSTPROCESS.md
+++ b/docs/features/POSTPROCESS.md
@@ -20,39 +20,33 @@ The default face restoration module is GFPGAN. The default upscale is
 Real-ESRGAN. For an alternative face restoration module, see [CodeFormer
 Support] below.

-As of version 1.14, environment.yaml will install the Real-ESRGAN package into
-the standard install location for python packages, and will put GFPGAN into a
-subdirectory of "src" in the InvokeAI directory. (The reason for this is
-that the standard GFPGAN distribution has a minor bug that adversely affects
-image color.) Upscaling with Real-ESRGAN should "just work" without further
-intervention. Simply pass the --upscale (-U) option on the invoke> command line,
-or indicate the desired scale on the popup in the Web GUI.
+As of version 1.14, environment.yaml will install the Real-ESRGAN
+package into the standard install location for python packages, and
+will put GFPGAN into a subdirectory of "src" in the InvokeAI
+directory. Upscaling with Real-ESRGAN should "just work" without
+further intervention. Simply pass the --upscale (-U) option on the
+invoke> command line, or indicate the desired scale on the popup in
+the Web GUI.

-For **GFPGAN** to work, there is one additional step needed. You will need to
-download and copy the GFPGAN
-[models file](https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.4.pth)
-into **src/gfpgan/experiments/pretrained_models**. On Mac and Linux systems,
-here's how you'd do it using **wget**:
+**GFPGAN** requires a series of downloadable model files to
+work. These are downloaded when you run `scripts/preload_models.py`. If
+GFPGAN is failing with an error, please run the following from the
+InvokeAI directory:

-```bash
-wget https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.4.pth -P src/gfpgan/experiments/pretrained_models/
-```
+~~~~
+python scripts/preload_models.py
+~~~~

-Make sure that you're in the InvokeAI directory when you do this.
+If you do not run this script in advance, the GFPGAN module will attempt
+to download the model files the first time you try to perform facial
+reconstruction.

-Alternatively, if you have GFPGAN installed elsewhere, or if you are using an
-earlier version of this package which asked you to install GFPGAN in a sibling
-directory, you may use the `--gfpgan_dir` argument with `invoke.py` to set a
-custom path to your GFPGAN directory. _There are other GFPGAN related boot
-arguments if you wish to customize further._
-
-!!! warning "Internet connection needed"
-
-    Users whose GPU machines are isolated from the Internet (e.g.
-    on a University cluster) should be aware that the first time you run invoke.py with GFPGAN and
-    Real-ESRGAN turned on, it will try to download model files from the Internet. To rectify this, you
-    may run `python3 scripts/preload_models.py` after you have installed GFPGAN and all its
-    dependencies.
+Alternatively, if you have GFPGAN installed elsewhere, or if you are
+using an earlier version of this package which asked you to install
+GFPGAN in a sibling directory, you may use the `--gfpgan_dir` argument
+with `invoke.py` to set a custom path to your GFPGAN directory. _There
+are other GFPGAN related boot arguments if you wish to customize
+further._

 ## Usage

@@ -124,15 +118,15 @@ actions.
 This repo also allows you to perform face restoration using
 [CodeFormer](https://github.com/sczhou/CodeFormer).

-In order to setup CodeFormer to work, you need to download the models like with
-GFPGAN. You can do this either by running `preload_models.py` or by manually
-downloading the
-[model file](https://github.com/sczhou/CodeFormer/releases/download/v0.1.0/codeformer.pth)
+To set up CodeFormer, you need to download its models,
+as with GFPGAN. You can do this either by running
+`preload_models.py` or by manually downloading the [model
+file](https://github.com/sczhou/CodeFormer/releases/download/v0.1.0/codeformer.pth)
 and saving it to `ldm/restoration/codeformer/weights` folder.

-You can use `-ft` prompt argument to swap between CodeFormer and the default
-GFPGAN. The above mentioned `-G` prompt argument will allow you to control the
-strength of the restoration effect.
+You can use `-ft` prompt argument to swap between CodeFormer and the
+default GFPGAN. The above mentioned `-G` prompt argument will allow
+you to control the strength of the restoration effect.

 ### Usage:

--- a/docs/features/WEB.md
+++ b/docs/features/WEB.md
@@ -20,10 +20,273 @@ wildcard `0.0.0.0`. For example:
 (ldm) ~/InvokeAI$ python3 scripts/invoke.py --web --host 0.0.0.0
 ```

+# Quick guided walkthrough of the WebGUI's features
+
+While most of the WebGUI's features are intuitive, here is a guided
+walkthrough of its various components.
+
+<img src="../assets/invoke-web-server-1.png" width=350>
+
+The screenshot above shows the Text to Image tab of the WebGUI. There
+are three main sections:
+
+1. A **control panel** on the left, which contains various settings
+for text to image generation. The most important part is the text
+field (currently showing `strawberry sushi`) for entering the text
+prompt, and the camera icon directly underneath that will render the
+image. We'll call this the *Invoke* button from now on.
+
+2. The **current image** section in the middle, which shows a large
+format version of the image you are currently working on. A series of
+buttons at the top ("image to image", "Use All", "Use Seed", etc.) lets
+you modify the image in various ways.
+
+3. A **gallery** section on the right that contains a history of the
+images you have generated. These images are read from and written to
+the directory specified at launch time in `--outdir`.
+
+In addition to these three elements, there are a series of icons for
+changing global settings, reporting bugs, and changing the theme on
+the upper right.
+
+There is also a series of icons to the left of the control panel (see
+highlighted area in the screenshot below) that select among tabs for
+performing different types of operations.
+
+<img src="../assets/invoke-web-server-2.png">
+
+From top to bottom, these are:
+
+1. Text to Image - generate images from text
+2. Image to Image - generate a new image from an uploaded starting image (drawing or photograph), modified by the text prompt
+3. Inpainting (pending) - interactively erase portions of a starting image and have the AI fill in the erased region from a text prompt
+4. Outpainting (pending) - interactively add blank space to the borders of a starting image and fill in the background from a text prompt
+5. Postprocessing (pending) - interactively postprocess generated images using a variety of filters
+
+The inpainting, outpainting and postprocessing tabs are currently in
+development. However, limited versions of their features can already
+be accessed through the Text to Image and Image to Image tabs.
+
+## Walkthrough
+
+The following walkthrough will exercise most (but not all) of the
+WebGUI's feature set.
+
+### Text to Image
+
+1. Launch the WebGUI using `python scripts/invoke.py --web` and
+connect to it with your browser by accessing
+`http://localhost:9090`. If the browser and server are running on
+different machines on your LAN, add the option `--host 0.0.0.0` to the
+launch command line and connect to the machine hosting the web server
+using its IP address or domain name.
+
+2. If all goes well, the WebGUI should come up and you'll see a green
+`connected` message on the upper right.
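
As a minimal sketch of step 1, the launch command and the browser URL can be assembled as shown here. The helper names `launch_command` and `web_url` are hypothetical and exist only for illustration; the `--web` and `--host 0.0.0.0` options and port 9090 are taken from the text above, and the LAN address in the demo is made up.

```python
from typing import List, Optional

DEFAULT_PORT = 9090  # port used in the walkthrough's example URL


def launch_command(lan_access: bool = False) -> List[str]:
    """Build the argv list for starting the WebGUI (hypothetical helper)."""
    cmd = ["python", "scripts/invoke.py", "--web"]
    if lan_access:
        # Bind to the wildcard address so other machines on the LAN can connect.
        cmd += ["--host", "0.0.0.0"]
    return cmd


def web_url(server: Optional[str] = None, port: int = DEFAULT_PORT) -> str:
    """Return the URL to open in the browser; defaults to the local machine."""
    return f"http://{server or 'localhost'}:{port}"


if __name__ == "__main__":
    print(" ".join(launch_command(lan_access=True)))
    print(web_url("192.168.1.20"))  # example LAN address, not from the docs
```

The list returned by `launch_command` could be passed to `subprocess.run` from the InvokeAI directory; it is kept as a plain list here so the sketch has no side effects.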
+
+#### Basics
+
+3. Generate an image by typing *strawberry sushi* into the large
+prompt field on the upper left and then clicking on the Invoke button
+(the one with the Camera icon). After a short wait, you'll see a large
+image of sushi in the image panel, and a new thumbnail in the gallery
+on the right.
+
+If you need more room on the screen, you can turn the gallery off
+by clicking on the **x** to the right of "Your Invocations". You can
+turn it back on later by clicking the image icon that appears in the
+gallery's place.
+
+The images are written into the directory indicated by the `--outdir`
+option provided at script launch time. By default, this is
+`outputs/img-samples` under the InvokeAI directory.
+
+4. Generate a batch of strawberry sushi images by raising the
+Images counter just below the Camera button. As each image is
+generated, it will be added to the gallery. You can switch the
+active image by clicking on the gallery thumbnails.
+
+5. Try playing with different settings, including image width and
+height, the Sampler, the Steps and the CFG scale.
+
+Image *Width* and *Height* do what you'd expect. However, be aware that
+larger images consume more VRAM and take longer to generate.
+
+The *Sampler* controls how the AI selects the image to display. Some
+samplers are more "creative" than others and will produce a wider
+range of variations (see next section). Some samplers run faster than
+others.
+
+*Steps* controls how many noising/denoising/sampling steps the AI will
+take. The higher this value, the more refined the image will be, but
+the longer the image will take to generate. A typical strategy is to
+generate images with a low number of steps in order to select one to
+work on further, and then regenerate it using a higher number of
+steps.
+
+The *CFG Scale* controls how hard the AI tries to match the generated
+image to the input prompt. You can go as high or low as you like, but
+generally values greater than 20 won't improve things much, and values
+lower than 5 will produce unexpected images. There are complex
+interactions between *Steps*, *CFG Scale* and the *Sampler*, so
+experiment to find out what works for you.
+
+6. To regenerate a previously-generated image, select the image you
+want and click *Use All*. This loads the text prompt and other
+original settings into the control panel. If you then press *Invoke*
+it will regenerate the image exactly. You can also selectively modify
+the prompt or other settings to tweak the image.
+
+Alternatively, you may click on *Use Seed* to load just the image's
+seed, and leave other settings unchanged.
+
+7. To regenerate a Stable Diffusion image that was generated by
+another SD package, you need to know its text prompt and its
+*Seed*. Copy-paste the prompt into the prompt box, unset the
+*Randomize Seed* control in the control panel, and copy-paste the
+desired *Seed* into its text field. When you Invoke, you will get
+something similar to the original image. It will not be exact unless
+you also set the correct values for the original sampler, CFG,
+steps and dimensions, but it will (usually) be close.
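
The point of steps 6 and 7 is that an image is only reproduced exactly when every setting matches, not just the prompt and seed. The sketch below makes that explicit by listing which settings differ between two runs. `GenerationSettings` and `mismatched` are hypothetical names, and the values in the demo are illustrative, not taken from the WebGUI.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class GenerationSettings:
    """Settings the walkthrough names as affecting the result (hypothetical container)."""
    prompt: str
    seed: int
    sampler: str
    cfg_scale: float
    steps: int
    width: int
    height: int


def mismatched(original: GenerationSettings, attempt: GenerationSettings) -> List[str]:
    """Return the names of the settings that differ between two runs."""
    return [
        f.name
        for f in fields(original)
        if getattr(original, f.name) != getattr(attempt, f.name)
    ]


if __name__ == "__main__":
    original = GenerationSettings("strawberry sushi", 1234, "ddim", 7.5, 50, 512, 512)
    attempt = GenerationSettings("strawberry sushi", 1234, "ddim", 7.5, 30, 512, 512)
    # An empty list means an exact regeneration can be expected;
    # anything else means the result will only be similar.
    print(mismatched(original, attempt))
```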
+
+#### Variations on a theme
+
+5. Let's try generating some variations. Select your favorite sushi
+image from the gallery to load it. Then select "Use All" from the list
+of buttons above. This will load up all the settings used to generate
+this image, including its unique seed.
+
+Go down to the Variations section of the Control Panel and set the
+button to On. Set Variation Amount to 0.2 to generate a modest
+number of variations on the image, and also set the Images counter
+to 4. Press the *Invoke* button. This will generate a series of related
+images. To obtain smaller variations, just lower the Variation
+Amount. You may also experiment with changing the Sampler. Some
+samplers generate more variability than others. *k_euler_a* is
+particularly creative, while *ddim* is pretty conservative.
+
+6. For even more variations, experiment with increasing the setting
+for *Perlin*. This adds a bit of noise to the image generation
+process. Note that values of Perlin noise greater than 0.15 produce
+poor images for several of the samplers.
+
+#### Facial reconstruction and upscaling
+
+Stable Diffusion frequently produces mangled faces, particularly when
+there are multiple figures in the same scene. Stable Diffusion has
+particular issues with generating realistic eyes. InvokeAI provides
+the ability to reconstruct faces using either the GFPGAN or CodeFormer
+libraries. For more information see [POSTPROCESS](POSTPROCESS.md).
+
+7. Invoke a prompt that generates a mangled face. A prompt that often
+gives this is "portrait of a lawyer, 3/4 shot" (this is not intended
+as a slur against lawyers!). Once you have an image that needs some
+touching up, load it into the Image panel, and press the button with
+the face icon (highlighted in the first screenshot below). A dialog
+box will appear. Leave *Strength* at 0.8 and press *Restore Faces*. If
+all goes well, the eyes and other aspects of the face will be improved
+(see the second screenshot).
+
+<img src="../assets/invoke-web-server-3.png">
+<img src="../assets/invoke-web-server-4.png">
+
+The facial reconstruction *Strength* field adjusts how aggressively
+the face library will try to alter the face. It can be as high as 1.0,
+but be aware that this often gives the face a soft, airbrushed look,
+losing some details. The default 0.8 is usually sufficient.
+
+8. "Upscaling" is the process of increasing the size of an image while
+retaining the sharpness. InvokeAI uses an external library called
+"Real-ESRGAN" to do this. To invoke upscaling, simply select an image
+and press the *HD* button above it. You can select between 2X and 4X
+upscaling, and adjust the upscaling strength, which has much the same
+meaning as in facial reconstruction. Try running this on one of your
+previously-generated images.
+
+9. Finally, you can run facial reconstruction and/or upscaling
+automatically after each Invocation. Go to the Advanced Options
+section of the Control Panel and turn on *Restore Face* and/or
+*Upscale*.
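
The same postprocessing can be requested from the `invoke>` command line, where POSTPROCESS.md names the `-G` (restoration strength), `-ft` (face tool) and `-U` (upscale) prompt arguments. The sketch below builds such a prompt line. The helper `with_postprocessing` is hypothetical, and the exact argument value formats (`-G 0.8`, `-U 2`, `-ft codeformer`) and the quoting of the prompt are assumptions; check them against the CLI documentation before relying on them.

```python
from typing import Optional


def with_postprocessing(
    prompt: str,
    face_strength: Optional[float] = None,
    upscale_factor: Optional[int] = None,
    use_codeformer: bool = False,
) -> str:
    """Append postprocessing arguments to a prompt (hypothetical helper)."""
    parts = [f'"{prompt}"']
    if face_strength is not None:
        # The walkthrough says the strength can be as high as 1.0.
        if not 0.0 <= face_strength <= 1.0:
            raise ValueError("face_strength must be between 0.0 and 1.0")
        parts.append(f"-G {face_strength}")
        if use_codeformer:
            # Assumed value format for swapping away from the default GFPGAN.
            parts.append("-ft codeformer")
    if upscale_factor is not None:
        # The WebGUI offers 2X and 4X upscaling.
        if upscale_factor not in (2, 4):
            raise ValueError("upscale_factor must be 2 or 4")
        parts.append(f"-U {upscale_factor}")
    return " ".join(parts)


if __name__ == "__main__":
    print(with_postprocessing("portrait of a lawyer, 3/4 shot", 0.8, 2))
```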
+
+### Image to Image
+
+InvokeAI lets you take an existing image and use it as the basis for a
+new creation. You can use any sort of image, including a photograph, a
+scanned sketch, or a digital drawing, as long as it is in PNG or JPEG
+format.
+
+For this tutorial, we'll use files named
+[Lincoln-and-Parrot-512.png](../assets/Lincoln-and-Parrot-512.png),
+and
+[Lincoln-and-Parrot-512-transparent.png](../assets/Lincoln-and-Parrot-512-transparent.png).
+Download these images to your local machine now to continue with the walkthrough.
+
+10. Click on the *Image to Image* tab icon, which is the second icon
+from the top on the left-hand side of the screen:
+
+<img src="../assets/invoke-web-server-5.png">
+
+This will bring you to a screen similar to the one shown here:
+
+<img src="../assets/invoke-web-server-6.png" width=350>
+
+Drag-and-drop the Lincoln-and-Parrot image into the Image panel, or
+click the blank area to get an upload dialog. The image will load into
+an area marked *Initial Image*. (The WebGUI will also load the most
+recently-generated image from the gallery into a section on the left,
+but this image will be replaced in the next step.)
+
+11. Go to the prompt box and type *old sea captain with raven on
+shoulder* and press Invoke. A derived image will appear to the right
+of the original one:
+
+<img src="../assets/invoke-web-server-7.png" width=350>
+
+12. Experiment with the different settings. The most influential one
+in Image to Image is *Image to Image Strength* located about midway
+down the control panel. By default it is set to 0.75, but can range
+from 0.0 to 0.99. The higher the value, the more of the original image
+the AI will replace. A value of 0 will leave the initial image
+completely unchanged, while 0.99 will replace it completely. However,
+the Sampler and CFG Scale also influence the final result. You can
+also generate variations in the same way as described in Text to
+Image.
+
+13. What if we only want to change certain part(s) of the image and
+leave the rest intact? This is called Inpainting, and a future version
+of the InvokeAI web server will provide an interactive painting canvas
+on which you can directly draw the areas you wish to Inpaint into. For
+now, you can achieve this effect by using an external photo editing
+tool to make one or more regions of the image transparent, as described
+in [INPAINTING.md](INPAINTING.md), and uploading the result.
+
+The file
+[Lincoln-and-Parrot-512-transparent.png](../assets/Lincoln-and-Parrot-512-transparent.png)
+is a version of the earlier image in which the area around the parrot
+has been replaced with transparency. Click on the "x" in the upper
+right of the Initial Image and upload the transparent version. Using
+the same prompt "old sea captain with raven on shoulder" try Invoking
+an image. This time, only the parrot will be replaced, leaving the
+rest of the original image intact:
+
+<img src="../assets/invoke-web-server-8.png" width=350>
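
To build intuition for the *Image to Image Strength* setting from step 12, the sketch below clamps a requested strength to the documented 0.0 to 0.99 range and estimates how many denoising steps would be applied to the initial image. The `int(strength * steps)` estimate is an assumption borrowed from the upstream CompVis `img2img` script; it is not confirmed here as the WebGUI's exact behavior, and both function names are hypothetical.

```python
MIN_STRENGTH = 0.0
MAX_STRENGTH = 0.99  # upper bound given in the walkthrough
DEFAULT_STRENGTH = 0.75


def clamp_strength(strength: float) -> float:
    """Force a requested strength into the range the WebGUI accepts."""
    return min(max(strength, MIN_STRENGTH), MAX_STRENGTH)


def approx_denoising_steps(strength: float, steps: int) -> int:
    """Estimate the denoising steps applied to the initial image.

    Assumption: steps scale linearly with strength, as in the upstream
    CompVis img2img script.
    """
    return int(clamp_strength(strength) * steps)


if __name__ == "__main__":
    for s in (0.0, 0.3, DEFAULT_STRENGTH, 1.5):
        print(s, clamp_strength(s), approx_denoising_steps(s, 50))
```

Under this assumption, a strength of 0 runs no denoising steps at all, which matches the statement that the initial image is left unchanged.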
+
+## Parting remarks
+
+This concludes the walkthrough, but there are several more features that you
+can explore. Please check out the [Command Line Interface](CLI.md)
+documentation for further explanation of the advanced features that
+were not covered here.
+
+The WebGUI is under rapid development. Check back regularly for
+updates!
+
+## Credits
+
 Kudos to [Psychedelicious](https://github.com/psychedelicious),
 [BlessedCoolant](https://github.com/blessedcoolant), [Tesseract
 Cat](https://github.com/TesseractCat),
 [dagf2101](https://github.com/dagf2101), and many others who
 contributed to this code.

-![Dream Web Server](../assets/invoke_web_server.png)