Merge-main-into-development (#1373)

To get the rid of the difference between main and development.

Since otherwise it will be a pain to start fixing the documentatino
(when the state between main and development is not the same ...)

Also this should fix the problem of all tests failing since environment
yamls get updated.
This commit is contained in:
Lincoln Stein 2022-11-04 12:08:44 -04:00 committed by GitHub
commit ef8b3ce639
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
21 changed files with 241 additions and 370 deletions

View File

@ -1,28 +0,0 @@
name: Deploy
on:
push:
branches:
- main
# pull_request:
# branches:
# - main
jobs:
build:
name: Deploy docs to GitHub Pages
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Build
uses: Tiryoh/actions-mkdocs@v0
with:
mkdocs_version: 'latest' # option
requirements: '/requirements-mkdocs.txt' # option
configfile: '/mkdocs.yml' # option
- name: Deploy
uses: peaceiris/actions-gh-pages@v3
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
publish_dir: ./site

View File

@ -4,7 +4,6 @@ on:
branches:
- 'main'
- 'development'
- 'release-candidate-2-1'
jobs:
mkdocs-material:

View File

@ -4,6 +4,7 @@ on:
branches:
- 'main'
- 'development'
- 'fix-gh-actions-fork'
pull_request:
branches:
- 'main'

2
.gitignore vendored
View File

@ -1,7 +1,7 @@
# ignore default image save location and model symbolic link
outputs/
models/ldm/stable-diffusion-v1/model.ckpt
ldm/invoke/restoration/codeformer/weights
**/restoration/codeformer/weights
# ignore user models config
configs/models.user.yaml

13
LICENSE
View File

@ -1,17 +1,6 @@
MIT License
Copyright (c) 2022 Lincoln Stein and InvokeAI Organization
This software is derived from a fork of the source code available from
https://github.com/pesser/stable-diffusion and
https://github.com/CompViz/stable-diffusion. They carry the following
copyrights:
Copyright (c) 2022 Machine Vision and Learning Group, LMU Munich
Copyright (c) 2022 Robin Rombach and Patrick Esser and contributors
Please see individual source code files for copyright and authorship
attributions.
Copyright (c) 2022 InvokeAI Team
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal

View File

@ -68,11 +68,11 @@ requests. Be sure to use the provided templates. They will help aid diagnose iss
This fork is supported across multiple platforms. You can find individual installation instructions
below.
- #### [Linux](docs/installation/INSTALL_LINUX.md)
- #### [Linux](https://invoke-ai.github.io/InvokeAI/installation/INSTALL_LINUX/)
- #### [Windows](docs/installation/INSTALL_WINDOWS.md)
- #### [Windows](https://invoke-ai.github.io/InvokeAI/installation/INSTALL_WINDOWS/)
- #### [Macintosh](docs/installation/INSTALL_MAC.md)
- #### [Macintosh](https://invoke-ai.github.io/InvokeAI/installation/INSTALL_MAC/)
### Hardware Requirements
@ -103,34 +103,33 @@ errors like 'expected type Float but found Half' or 'not implemented for Half'
you can try starting `invoke.py` with the `--precision=float32` flag:
```bash
(ldm) ~/stable-diffusion$ python scripts/invoke.py --precision=float32
(invokeai) ~/InvokeAI$ python scripts/invoke.py --precision=float32
```
### Features
#### Major Features
- [Web Server](docs/features/WEB.md)
- [Interactive Command Line Interface](docs/features/CLI.md)
- [Image To Image](docs/features/IMG2IMG.md)
- [Inpainting Support](docs/features/INPAINTING.md)
- [Outpainting Support](docs/features/OUTPAINTING.md)
- [Upscaling, face-restoration and outpainting](docs/features/POSTPROCESS.md)
- [Seamless Tiling](docs/features/OTHER.md#seamless-tiling)
- [Google Colab](docs/features/OTHER.md#google-colab)
- [Reading Prompts From File](docs/features/PROMPTS.md#reading-prompts-from-a-file)
- [Shortcut: Reusing Seeds](docs/features/OTHER.md#shortcuts-reusing-seeds)
- [Prompt Blending](docs/features/PROMPTS.md#prompt-blending)
- [Thresholding and Perlin Noise Initialization Options](/docs/features/OTHER.md#thresholding-and-perlin-noise-initialization-options)
- [Negative/Unconditioned Prompts](docs/features/PROMPTS.md#negative-and-unconditioned-prompts)
- [Variations](docs/features/VARIATIONS.md)
- [Personalizing Text-to-Image Generation](docs/features/TEXTUAL_INVERSION.md)
- [Simplified API for text to image generation](docs/features/OTHER.md#simplified-api)
- [Web Server](https://invoke-ai.github.io/InvokeAI/features/WEB/)
- [Interactive Command Line Interface](https://invoke-ai.github.io/InvokeAI/features/CLI/)
- [Image To Image](https://invoke-ai.github.io/InvokeAI/features/IMG2IMG/)
- [Inpainting Support](https://invoke-ai.github.io/InvokeAI/features/INPAINTING/)
- [Outpainting Support](https://invoke-ai.github.io/InvokeAI/features/OUTPAINTING/)
- [Upscaling, face-restoration and outpainting](https://invoke-ai.github.io/InvokeAI/features/POSTPROCESS/)
- [Reading Prompts From File](https://invoke-ai.github.io/InvokeAI/features/PROMPTS/#reading-prompts-from-a-file)
- [Prompt Blending](https://invoke-ai.github.io/InvokeAI/features/PROMPTS/#prompt-blending)
- [Thresholding and Perlin Noise Initialization Options](https://invoke-ai.github.io/InvokeAI/features/OTHER/#thresholding-and-perlin-noise-initialization-options)
- [Negative/Unconditioned Prompts](https://invoke-ai.github.io/InvokeAI/features/PROMPTS/#negative-and-unconditioned-prompts)
- [Variations](https://invoke-ai.github.io/InvokeAI/features/VARIATIONS/)
- [Personalizing Text-to-Image Generation](https://invoke-ai.github.io/InvokeAI/features/TEXTUAL_INVERSION/)
- [Simplified API for text to image generation](https://invoke-ai.github.io/InvokeAI/features/OTHER/#simplified-api)
#### Other Features
- [Creating Transparent Regions for Inpainting](docs/features/INPAINTING.md#creating-transparent-regions-for-inpainting)
- [Preload Models](docs/features/OTHER.md#preload-models)
- [Google Colab](https://invoke-ai.github.io/InvokeAI/features/OTHER/#google-colab)
- [Seamless Tiling](https://invoke-ai.github.io/InvokeAI/features/OTHER/#seamless-tiling)
- [Shortcut: Reusing Seeds](https://invoke-ai.github.io/InvokeAI/features/OTHER/#shortcuts-reusing-seeds)
- [Preload Models](https://invoke-ai.github.io/InvokeAI/features/OTHER/#preload-models)
### Latest Changes
@ -144,33 +143,33 @@ you can try starting `invoke.py` with the `--precision=float32` flag:
- `dream.py` script renamed `invoke.py`. A `dream.py` script wrapper remains
for backward compatibility.
- Completely new WebGUI - launch with `python3 scripts/invoke.py --web`
- Support for <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/INPAINTING.md">inpainting</a> and <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/OUTPAINTING.md">outpainting</a>
- Support for <a href="https://invoke-ai.github.io/InvokeAI/features/INPAINTING/">inpainting</a> and <a href="https://invoke-ai.github.io/InvokeAI/features/OUTPAINTING/">outpainting</a>
- img2img runs on all k* samplers
- Support for <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/PROMPTS.md#negative-and-unconditioned-prompts">negative prompts</a>
- Support for <a href="https://invoke-ai.github.io/InvokeAI/features/PROMPTS/#negative-and-unconditioned-prompts">negative prompts</a>
- Support for CodeFormer face reconstruction
- Support for Textual Inversion on Macintoshes
- Support in both WebGUI and CLI for <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/POSTPROCESS.md">post-processing of previously-generated images</a>
- Support in both WebGUI and CLI for <a href="https://invoke-ai.github.io/InvokeAI/features/POSTPROCESS/">post-processing of previously-generated images</a>
using facial reconstruction, ESRGAN upscaling, outcropping (similar to DALL-E infinite canvas),
and "embiggen" upscaling. See the `!fix` command.
- New `--hires` option on `invoke>` line allows <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/CLI.md#this-is-an-example-of-txt2img">larger images to be created without duplicating elements</a>, at the cost of some performance.
- New `--hires` option on `invoke>` line allows <a href="https://invoke-ai.github.io/InvokeAI/features/CLI/#txt2img">larger images to be created without duplicating elements</a>, at the cost of some performance.
- New `--perlin` and `--threshold` options allow you to add and control variation
during image generation (see <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/OTHER.md#thresholding-and-perlin-noise-initialization-options">Thresholding and Perlin Noise Initialization</a>
- Extensive metadata now written into PNG files, allowing reliable regeneration of images
and tweaking of previous settings.
- Command-line completion in `invoke.py` now works on Windows, Linux and Mac platforms.
- Improved <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/CLI.md">command-line completion behavior</a>.
- Improved <a href="https://invoke-ai.github.io/InvokeAI/features/CLI/">command-line completion behavior</a>.
New commands added:
* List command-line history with `!history`
* Search command-line history with `!search`
* Clear history with `!clear`
- List command-line history with `!history`
- Search command-line history with `!search`
- Clear history with `!clear`
- Deprecated `--full_precision` / `-F`. Simply omit it and `invoke.py` will auto
configure. To switch away from auto use the new flag like `--precision=float32`.
For older changelogs, please visit the **[CHANGELOG](docs/features/CHANGELOG.md)**.
For older changelogs, please visit the **[CHANGELOG](https://invoke-ai.github.io/InvokeAI/CHANGELOG#v114-11-september-2022)**.
### Troubleshooting
Please check out our **[Q&A](docs/help/TROUBLESHOOT.md)** to get solutions for common installation
Please check out our **[Q&A](https://invoke-ai.github.io/InvokeAI/help/TROUBLESHOOT/#faq)** to get solutions for common installation
problems and other issues.
# Contributing
@ -188,7 +187,7 @@ changes.
### Contributors
This fork is a combined effort of various people from across the world.
[Check out the list of all these amazing people](docs/other/CONTRIBUTORS.md). We thank them for
[Check out the list of all these amazing people](https://invoke-ai.github.io/InvokeAI/other/CONTRIBUTORS/). We thank them for
their time, hard work and effort.
### Support
@ -202,4 +201,4 @@ Original portions of the software are Copyright (c) 2020
### Further Reading
Please see the original README for more information on this software and underlying algorithm,
located in the file [README-CompViz.md](docs/other/README-CompViz.md).
located in the file [README-CompViz.md](https://invoke-ai.github.io/InvokeAI/other/README-CompViz/).

View File

@ -15,25 +15,25 @@ title: Changelog
- `dream.py` script renamed `invoke.py`. A `dream.py` script wrapper remains
for backward compatibility.
- Completely new WebGUI - launch with `python3 scripts/invoke.py --web`
- Support for <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/INPAINTING.md">inpainting</a> and <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/OUTPAINTING.md">outpainting</a>
- Support for [inpainting](features/INPAINTING.md) and [outpainting](features/OUTPAINTING.md)
- img2img runs on all k* samplers
- Support for <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/PROMPTS.md#negative-and-unconditioned-prompts">negative prompts</a>
- Support for [negative prompts](features/PROMPTS.md#negative-and-unconditioned-prompts)
- Support for CodeFormer face reconstruction
- Support for Textual Inversion on Macintoshes
- Support in both WebGUI and CLI for <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/POSTPROCESS.md">post-processing of previously-generated images</a>
- Support in both WebGUI and CLI for [post-processing of previously-generated images](features/POSTPROCESS.md)
using facial reconstruction, ESRGAN upscaling, outcropping (similar to DALL-E infinite canvas),
and "embiggen" upscaling. See the `!fix` command.
- New `--hires` option on `invoke>` line allows <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/CLI.m#this-is-an-example-of-txt2img">larger images to be created without duplicating elements</a>, at the cost of some performance.
- New `--hires` option on `invoke>` line allows [larger images to be created without duplicating elements](features/CLI.md#this-is-an-example-of-txt2img), at the cost of some performance.
- New `--perlin` and `--threshold` options allow you to add and control variation
during image generation (see <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/OTHER.md#thresholding-and-perlin-noise-initialization-options">Thresholding and Perlin Noise Initialization</a>
during image generation (see [Thresholding and Perlin Noise Initialization](features/OTHER.md#thresholding-and-perlin-noise-initialization-options))
- Extensive metadata now written into PNG files, allowing reliable regeneration of images
and tweaking of previous settings.
- Command-line completion in `invoke.py` now works on Windows, Linux and Mac platforms.
- Improved <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/CLI.m">command-line completion behavior</a>.
- Improved [command-line completion behavior](features/CLI.md)
New commands added:
* List command-line history with `!history`
* Search command-line history with `!search`
* Clear history with `!clear`
- List command-line history with `!history`
- Search command-line history with `!search`
- Clear history with `!clear`
- Deprecated `--full_precision` / `-F`. Simply omit it and `invoke.py` will auto
configure. To switch away from auto use the new flag like `--precision=float32`.

Binary file not shown.

After

Width:  |  Height:  |  Size: 635 KiB

View File

@ -1,143 +0,0 @@
---
title: Changelog
---
# :octicons-log-16: Changelog
## v1.13
- Supports a Google Colab notebook for a standalone server running on Google
hardware [Arturo Mendivil](https://github.com/artmen1516)
- WebUI supports GFPGAN/ESRGAN facial reconstruction and upscaling
[Kevin Gibbons](https://github.com/bakkot)
- WebUI supports incremental display of in-progress images during generation
[Kevin Gibbons](https://github.com/bakkot)
- Output directory can be specified on the invoke> command line.
- The grid was displaying duplicated images when not enough images to fill the
final row [Muhammad Usama](https://github.com/SMUsamaShah)
- Can specify --grid on invoke.py command line as the default.
- Miscellaneous internal bug and stability fixes.
---
## v1.12 <small>(28 August 2022)</small>
- Improved file handling, including ability to read prompts from standard input.
(kudos to [Yunsaki](https://github.com/yunsaki)
- The web server is now integrated with the invoke.py script. Invoke by adding
--web to the invoke.py command arguments.
- Face restoration and upscaling via GFPGAN and Real-ESGAN are now automatically
enabled if the GFPGAN directory is located as a sibling to Stable Diffusion.
VRAM requirements are modestly reduced. Thanks to both
[Blessedcoolant](https://github.com/blessedcoolant) and
[Oceanswave](https://github.com/oceanswave) for their work on this.
- You can now swap samplers on the invoke> command line.
[Blessedcoolant](https://github.com/blessedcoolant)
---
## v1.11 <small>(26 August 2022)</small>
- NEW FEATURE: Support upscaling and face enhancement using the GFPGAN module.
(kudos to [Oceanswave](https://github.com/Oceanswave))
- You now can specify a seed of -1 to use the previous image's seed, -2 to use
the seed for the image generated before that, etc. Seed memory only extends
back to the previous command, but will work on all images generated with the
-n# switch.
- Variant generation support temporarily disabled pending more general solution.
- Created a feature branch named **yunsaki-morphing-invoke** which adds
experimental support for iteratively modifying the prompt and its parameters.
Please
see[ Pull Request #86](https://github.com/lstein/stable-diffusion/pull/86) for
a synopsis of how this works. Note that when this feature is eventually added
to the main branch, it will may be modified significantly.
---
## v1.10 <small>(25 August 2022)</small>
- A barebones but fully functional interactive web server for online generation
of txt2img and img2img.
---
## v1.09 <small>(24 August 2022)</small>
- A new -v option allows you to generate multiple variants of an initial image
in img2img mode. (kudos to [Oceanswave](https://github.com/Oceanswave).
- [See this discussion in the PR for examples and details on use](https://github.com/lstein/stable-diffusion/pull/71#issuecomment-1226700810))
- Added ability to personalize text to image generation (kudos to
[Oceanswave](https://github.com/Oceanswave) and
[nicolai256](https://github.com/nicolai256))
- Enabled all of the samplers from k_diffusion
---
## v1.08 <small>(24 August 2022)</small>
- Escape single quotes on the invoke> command before trying to parse. This avoids
parse errors.
- Removed instruction to get Python3.8 as first step in Windows install.
Anaconda3 does it for you.
- Added bounds checks for numeric arguments that could cause crashes.
- Cleaned up the copyright and license agreement files.
---
## v1.07 <small>(23 August 2022)</small>
- Image filenames will now never fill gaps in the sequence, but will be assigned
the next higher name in the chosen directory. This ensures that the alphabetic
and chronological sort orders are the same.
---
## v1.06 <small>(23 August 2022)</small>
- Added weighted prompt support contributed by
[xraxra](https://github.com/xraxra)
- Example of using weighted prompts to tweak a demonic figure contributed by
[bmaltais](https://github.com/bmaltais)
---
## v1.05 <small>(22 August 2022 - after the drop)</small>
- Filenames now use the following formats: 000010.95183149.png -- Two files
produced by the same command (e.g. -n2), 000010.26742632.png -- distinguished
by a different seed.
000011.455191342.01.png -- Two files produced by the same command using
000011.455191342.02.png -- a batch size>1 (e.g. -b2). They have the same seed.
000011.4160627868.grid#1-4.png -- a grid of four images (-g); the whole grid
can be regenerated with the indicated key
- It should no longer be possible for one image to overwrite another
- You can use the "cd" and "pwd" commands at the invoke> prompt to set and
retrieve the path of the output directory.
## v1.04 <small>(22 August 2022 - after the drop)</small>
- Updated README to reflect installation of the released weights.
- Suppressed very noisy and inconsequential warning when loading the frozen CLIP
tokenizer.
## v1.03 <small>(22 August 2022)</small>
- The original txt2img and img2img scripts from the CompViz repository have been
moved into a subfolder named "orig_scripts", to reduce confusion.
## v1.02 <small>(21 August 2022)</small>
- A copy of the prompt and all of its switches and options is now stored in the
corresponding image in a tEXt metadata field named "Dream". You can read the
prompt using scripts/images2prompt.py, or an image editor that allows you to
explore the full metadata. **Please run "conda env update -f environment.yaml"
to load the k_lms dependencies!!**
## v1.01 <small>(21 August 2022)</small>
- added k_lms sampling. **Please run "conda env update -f environment.yaml" to
load the k_lms dependencies!!**
- use half precision arithmetic by default, resulting in faster execution and
lower memory requirements Pass argument --full_precision to invoke.py to get
slower but more accurate image generation

View File

@ -101,9 +101,7 @@ overridden on a per-prompt basis (see [List of prompt arguments](#list-of-prompt
| `--free_gpu_mem` | | `False` | Free GPU memory after sampling, to allow image decoding and saving in low VRAM conditions |
| `--precision` | | `auto` | Set model precision, default is selected by device. Options: auto, float32, float16, autocast |
!!! warning deprecated
These arguments are deprecated but still work:
!!! warning "These arguments are deprecated but still work"
<div align="center" markdown>
@ -132,7 +130,7 @@ from text ([txt2img](#txt2img)), to embellish an existing image or sketch
### txt2img
!!! example
!!! example ""
```bash
invoke> waterfall and rainbow -W640 -H480
@ -200,7 +198,7 @@ accepts additional options:
### inpainting
!!! example
!!! example ""
```bash
invoke> waterfall and rainbow -I./vacation-photo.png -M./vacation-mask.png -W640 -H480 --fit

View File

@ -17,15 +17,15 @@ tree on a hill with a river, nature photograph, national geographic -I./test-pic
This will take the original image shown here:
<div align="center" markdown>
<figure markdown>
<img src="https://user-images.githubusercontent.com/50542132/193946000-c42a96d8-5a74-4f8a-b4c3-5213e6cadcce.png" width=350>
</div>
</figure>
and generate a new image based on it as shown here:
<div align="center" markdown>
<figure markdown>
<img src="https://user-images.githubusercontent.com/111189/194135515-53d4c060-e994-4016-8121-7c685e281ac9.png" width=350>
</div>
</figure>
The `--init_img` (`-I`) option gives the path to the seed picture. `--strength` (`-f`) controls how much
the original will be modified, ranging from `0.0` (keep the original intact), to `1.0` (ignore the
@ -41,11 +41,10 @@ interesting variants.
Note that the prompt makes a big difference. For example, this slight variation on the prompt produces
a very different image:
`photograph of a tree on a hill with a river`
<div align="center" markdown>
<figure markdown>
<img src="https://user-images.githubusercontent.com/111189/194135220-16b62181-b60c-4248-8989-4834a8fd7fbd.png" width=350>
</div>
<caption markdown>photograph of a tree on a hill with a river</caption>
</figure>
!!! tip
@ -79,9 +78,9 @@ gaussian noise and progressively refines it over the requested number of steps,
invoke> "fire" -s10 -W384 -H384 -S1592514025
```
<div align="center" markdown>
<figure markdown>
![latent steps](../assets/img2img/000019.steps.png)
</div>
</figure>
Put simply: starting from a frame of fuzz/static, SD finds details in each frame that it thinks look like "fire" and brings them a little bit more into focus, gradually scrubbing out the fuzz until a clear image remains.
@ -91,21 +90,21 @@ Put simply: starting from a frame of fuzz/static, SD finds details in each frame
I want SD to draw a fire based on this hand-drawn image:
<div align="center" markdown>
<figure markdown>
![drawing of a fireplace](../assets/img2img/fire-drawing.png)
</div>
</figure>
Let's only do 10 steps, to make it easier to see what's happening. If strength is `0.7`, this is what the internal steps the algorithm has to take will look like:
<div align="center" markdown>
<figure markdown>
![gravity32](../assets/img2img/000032.steps.gravity.png)
</div>
</figure>
With strength `0.4`, the steps look more like this:
<div align="center" markdown>
<figure markdown>
![gravity30](../assets/img2img/000030.steps.gravity.png)
</div>
</figure>
Notice how much more fuzzy the starting image is for strength `0.7` compared to `0.4`, and notice also how much longer the sequence is with `0.7`:
@ -137,9 +136,9 @@ Here's strength `0.4` (note step count `50`, which is `20 ÷ 0.4` to make sure S
invoke> "fire" -s50 -W384 -H384 -S1592514025 -I /tmp/fire-drawing.png -f 0.4
```
<div align="center" markdown>
<figure markdown>
![000035.1592514025](../assets/img2img/000035.1592514025.png)
</div>
</figure>
and here is strength `0.7` (note step count `30`, which is roughly `20 ÷ 0.7` to make sure SD does `20` steps from my image):
@ -147,29 +146,38 @@ and here is strength `0.7` (note step count `30`, which is roughly `20 ÷ 0.7` t
invoke> "fire" -s30 -W384 -H384 -S1592514025 -I /tmp/fire-drawing.png -f 0.7
```
<div align="center" markdown>
<figure markdown>
![000046.1592514025](../assets/img2img/000046.1592514025.png)
</div>
</figure>
In both cases the image is nice and clean and "finished", but because at strength `0.7` Stable Diffusion has been give so much more freedom to improve on my badly-drawn flames, they've come out looking much better. You can really see the difference when looking at the latent steps. There's more noise on the first image with strength `0.7`:
<figure markdown>
![gravity46](../assets/img2img/000046.steps.gravity.png)
</figure>
than there is for strength `0.4`:
<figure markdown>
![gravity35](../assets/img2img/000035.steps.gravity.png)
</figure>
and that extra noise gives the algorithm more choices when it is evaluating how to denoise any particular pixel in the image.
Unfortunately, it seems that `img2img` is very sensitive to the step count. Here's strength `0.7` with a step count of `29` (SD did 19 steps from my image):
<div align="center" markdown>
<figure markdown>
![gravity45](../assets/img2img/000045.1592514025.png)
</div>
</figure>
By comparing the latents we can sort of see that something got interpreted differently enough on the third or fourth step to lead to a rather different interpretation of the flames.
<figure markdown>
![gravity46](../assets/img2img/000046.steps.gravity.png)
</figure>
<figure markdown>
![gravity45](../assets/img2img/000045.steps.gravity.png)
</figure>
This is the result of a difference in the de-noising "schedule" - basically the noise has to be cleaned by a certain degree each step or the model won't "converge" on the image properly (see [stable diffusion blog](https://huggingface.co/blog/stable_diffusion) for more about that). A different step count means a different schedule, which means things get interpreted slightly differently at every step.

View File

@ -257,28 +257,40 @@ surrounding unmasked regions as well.
1. Open image in Photoshop
<div align="center" markdown>![step1](../assets/step1.png)</div>
<figure markdown>
![step1](../assets/step1.png)
</figure>
2. Use any of the selection tools (Marquee, Lasso, or Wand) to select the area you desire to inpaint.
<div align="center" markdown>![step2](../assets/step2.png)</div>
<figure markdown>
![step2](../assets/step2.png)
</figure>
3. Because we'll be applying a mask over the area we want to preserve, you should now select the inverse by using the ++shift+ctrl+i++ shortcut, or right clicking and using the "Select Inverse" option.
4. You'll now create a mask by selecting the image layer, and Masking the selection. Make sure that you don't delete any of the underlying image, or your inpainting results will be dramatically impacted.
<div align="center" markdown>![step4](../assets/step4.png)</div>
<figure markdown>
![step4](../assets/step4.png)
</figure>
5. Make sure to hide any background layers that are present. You should see the mask applied to your image layer, and the image on your canvas should display the checkered background.
<div align="center" markdown>![step5](../assets/step5.png)</div>
<figure markdown>
![step5](../assets/step5.png)
</figure>
6. Save the image as a transparent PNG by using `File`-->`Save a Copy` from the menu bar, or by using the keyboard shortcut ++alt+ctrl+s++
<div align="center" markdown>![step6](../assets/step6.png)</div>
<figure markdown>
![step6](../assets/step6.png)
</figure>
7. After following the inpainting instructions above (either through the CLI or the Web UI), marvel at your newfound ability to selectively invoke. Lookin' good!
<div align="center" markdown>![step7](../assets/step7.png)</div>
<figure markdown>
![step7](../assets/step7.png)
</figure>
8. In the export dialogue, Make sure the "Save colour values from transparent pixels" checkbox is selected.

View File

@ -64,31 +64,32 @@ model](INPAINTING.md#using-the-runwayml-inpainting-model).
Consider this image:
<div align="center" markdown>
<figure markdown>
![curly_woman](../assets/outpainting/curly.png)
</div>
</figure>
Pretty nice, but it's annoying that the top of her head is cut
off. She's also a bit off center. Let's fix that!
```bash
invoke> !fix images/curly.png --outcrop top 64 right 64
invoke> !fix images/curly.png --outcrop top 128 right 64 bottom 64
```
This is saying to apply the `outcrop` extension by extending the top
of the image by 64 pixels, and the right of the image by the same
amount. You can use any combination of top|left|right|bottom, and
of the image by 128 pixels, and the right and bottom of the image by
64 pixels. You can use any combination of top|left|right|bottom, and
specify any number of pixels to extend. You can also abbreviate
`--outcrop` to `-c`.
The result looks like this:
<div align="center" markdown>
![curly_woman_outcrop](../assets/outpainting/curly-outcrop.png)
</div>
<figure markdown>
![curly_woman_outcrop](../assets/outpainting/curly-outcrop-2.png)
</figure>
The new image is actually slightly larger than the original (576x576,
because 64 pixels were added to the top and right sides.)
The new image is larger than the original (576x704)
because 64 pixels were added to the top and right sides. You will
need enough VRAM to process an image of this size.
A number of caveats:
@ -103,3 +104,53 @@ you'll get a slightly different result. You can run it repeatedly
until you get an image you like. Unfortunately `!fix` does not
currently respect the `-n` (`--iterations`) argument.
3. Your results will be _much_ better if you use the `inpaint-1.5`
model released by runwayML and installed by default by
`scripts/preload_models.py`. This model was trained specifically to
harmoniously fill in image gaps. The standard model will work as well,
but you may notice color discontinuities at the border.
4. When using the `inpaint-1.5` model, you may notice subtle changes
to the area within the original image. This is because the model
performs an encoding/decoding on the image as a whole. This does not
occur with the standard model.
## Outpaint
The `outpaint` extension does the same thing, but with subtle
differences. Starting with the same image, here is how we would add an
additional 64 pixels to the top of the image:
```bash
invoke> !fix images/curly.png --out_direction top 64
```
(you can abbreviate `--out_direction` as `-D`.
The result is shown here:
<figure markdown>
![curly_woman_outpaint](../assets/outpainting/curly-outpaint.png)
</figure>
Although the effect is similar, there are significant differences from
outcropping:
- You can only specify one direction to extend at a time.
- The image is **not** resized. Instead, the image is shifted by the specified
number of pixels. If you look carefully, you'll see that less of the lady's
torso is visible in the image.
- Because the image dimensions remain the same, there's no rounding
to multiples of 64.
- Attempting to outpaint larger areas will frequently give rise to ugly
ghosting effects.
- For best results, try increasing the step number.
- If you don't specify a pixel value in `-D`, it will default to half
of the whole image, which is likely not what you want.
!!! tip
Neither `outpaint` nor `outcrop` are perfect, but we continue to tune
and improve them. If one doesn't work, try the other. You may also
wish to experiment with other `img2img` arguments, such as `-C`, `-f`
and `-s`.

View File

@ -47,33 +47,33 @@ original prompt:
`#!bash "A fantastical translucent pony made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180`
<div align="center" markdown>
<figure markdown>
![step1](../assets/negative_prompt_walkthru/step1.png)
</div>
</figure>
That image has a woman, so if we want the horse without a rider, we can influence the image not to have a woman by putting [woman] in the prompt, like this:
`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180`
<div align="center" markdown>
<figure markdown>
![step2](../assets/negative_prompt_walkthru/step2.png)
</div>
</figure>
That's nice - but say we also don't want the image to be quite so blue. We can add "blue" to the list of negative prompts, so it's now [woman blue]:
`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman blue]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180`
<div align="center" markdown>
<figure markdown>
![step3](../assets/negative_prompt_walkthru/step3.png)
</div>
</figure>
Getting close - but there's no sense in having a saddle when our horse doesn't have a rider, so we'll add one more negative prompt: [woman blue saddle].
`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman blue saddle]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180`
<div align="center" markdown>
<figure markdown>
![step4](../assets/negative_prompt_walkthru/step4.png)
</div>
</figure>
!!! notes "Notes about this feature:"
@ -215,56 +215,56 @@ different results each time you run them.
---
<div align="center" markdown>
<figure markdown>
### "blue sphere, red cube, hybrid"
</div>
</figure>
This example doesn't use melding at all and represents the default way
of mixing concepts.
<div align="center" markdown>
<figure markdown>
![blue-sphere-red-cube-hyprid](../assets/prompt-blending/blue-sphere-red-cube-hybrid.png)
</div>
</figure>
It's interesting to see how the AI expressed the concept of "cube" as
the four quadrants of the enclosing frame. If you look closely, there
is depth there, so the enclosing frame is actually a cube.
<div align="center" markdown>
<figure markdown>
### "blue sphere:0.25 red cube:0.75 hybrid"
![blue-sphere-25-red-cube-75](../assets/prompt-blending/blue-sphere-0.25-red-cube-0.75-hybrid.png)
</div>
</figure>
Now that's interesting. We get neither a blue sphere nor a red cube,
but a red sphere embedded in a brick wall, which represents a melding
of concepts within the AI's "latent space" of semantic
representations. Where is Ludwig Wittgenstein when you need him?
<div align="center" markdown>
<figure markdown>
### "blue sphere:0.75 red cube:0.25 hybrid"
![blue-sphere-75-red-cube-25](../assets/prompt-blending/blue-sphere-0.75-red-cube-0.25-hybrid.png)
</div>
</figure>
Definitely more blue-spherey. The cube is gone entirely, but it's
really cool abstract art.
<div align="center" markdown>
<figure markdown>
### "blue sphere:0.5 red cube:0.5 hybrid"
![blue-sphere-5-red-cube-5-hybrid](../assets/prompt-blending/blue-sphere-0.5-red-cube-0.5-hybrid.png)
</div>
</figure>
Whoa...! I see blue and red, but no spheres or cubes. Is the word
"hybrid" summoning up the concept of some sort of scifi creature?
Let's find out.
<div align="center" markdown>
<figure markdown>
### "blue sphere:0.5 red cube:0.5"
![blue-sphere-5-red-cube-5](../assets/prompt-blending/blue-sphere-0.5-red-cube-0.5.png)
</div>
</figure>
Indeed, removing the word "hybrid" produces an image that is more like
what we'd expect.

View File

@ -86,74 +86,57 @@ You wil need one of the following:
- At least 12 GB of free disk space for the machine learning model, Python, and all its dependencies.
!!! note
!!! info
If you are have a Nvidia 10xx series card (e.g. the 1080ti), please run the invoke script in
full-precision mode as shown below.
Similarly, specify full-precision mode on Apple M1 hardware.
To run in full-precision mode, start `invoke.py` with the `--full_precision` flag:
Precision is auto configured based on the device. If however you encounter errors like
`expected type Float but found Half` or `not implemented for Half` you can try starting
`invoke.py` with the `--precision=float32` flag:
```bash
(invokeai) ~/InvokeAI$ python scripts/invoke.py --full_precision
```
## :octicons-log-16: Latest Changes
### v2.0.1 <small>(13 October 2022)</small>
- fix noisy images at high step count when using k* samplers
- dream.py script now calls invoke.py module directly rather than
via a new python process (which could break the environment)
### v2.0.0 <small>(9 October 2022)</small>
- `dream.py` script renamed `invoke.py`. A `dream.py` script wrapper remains
for backward compatibility.
- Completely new WebGUI - launch with `python3 scripts/invoke.py --web`
- Support for <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/INPAINTING.md">inpainting</a> and <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/OUTPAINTING.md">outpainting</a>
- Support for <a href="https://invoke-ai.github.io/InvokeAI/features/INPAINTING/">inpainting</a> and <a href="https://invoke-ai.github.io/InvokeAI/features/OUTPAINTING/">outpainting</a>
- img2img runs on all k* samplers
- Support for <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/PROMPTS.md#negative-and-unconditioned-prompts">negative prompts</a>
- Support for <a href="https://invoke-ai.github.io/InvokeAI/features/PROMPTS/#negative-and-unconditioned-prompts">negative prompts</a>
- Support for CodeFormer face reconstruction
- Support for Textual Inversion on Macintoshes
- Support in both WebGUI and CLI for <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/POSTPROCESS.md">post-processing of previously-generated images</a>
- Support in both WebGUI and CLI for <a href="https://invoke-ai.github.io/InvokeAI/features/POSTPROCESS/">post-processing of previously-generated images</a>
using facial reconstruction, ESRGAN upscaling, outcropping (similar to DALL-E infinite canvas),
and "embiggen" upscaling. See the `!fix` command.
- New `--hires` option on `invoke>` line allows <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/CLI.m#this-is-an-example-of-txt2img">larger images to be created without duplicating elements</a>, at the cost of some performance.
- New `--hires` option on `invoke>` line allows <a href="https://invoke-ai.github.io/InvokeAI/features/CLI/#txt2img">larger images to be created without duplicating elements</a>, at the cost of some performance.
- New `--perlin` and `--threshold` options allow you to add and control variation
during image generation (see <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/OTHER.md#thresholding-and-perlin-noise-initialization-options">Thresholding and Perlin Noise Initialization</a>
- Extensive metadata now written into PNG files, allowing reliable regeneration of images
and tweaking of previous settings.
- Command-line completion in `invoke.py` now works on Windows, Linux and Mac platforms.
- Improved <a href="https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/CLI.m">command-line completion behavior</a>.
- Improved <a href="https://invoke-ai.github.io/InvokeAI/features/CLI/">command-line completion behavior</a>.
New commands added:
* List command-line history with `!history`
* Search command-line history with `!search`
* Clear history with `!clear`
- List command-line history with `!history`
- Search command-line history with `!search`
- Clear history with `!clear`
- Deprecated `--full_precision` / `-F`. Simply omit it and `invoke.py` will auto
configure. To switch away from auto use the new flag like `--precision=float32`.
### v1.14 <small>(11 September 2022)</small>
- Memory optimizations for small-RAM cards. 512x512 now possible on 4 GB GPUs.
- Full support for Apple hardware with M1 or M2 chips.
- Add "seamless mode" for circular tiling of image. Generates beautiful effects.
([prixt](https://github.com/prixt)).
- Inpainting support.
- Improved web server GUI.
- Lots of code and documentation cleanups.
### v1.13 <small>(3 September 2022</small>
- Support image variations (see [VARIATIONS](features/VARIATIONS.md)
([Kevin Gibbons](https://github.com/bakkot) and many contributors and reviewers)
- Supports a Google Colab notebook for a standalone server running on Google hardware
[Arturo Mendivil](https://github.com/artmen1516)
- WebUI supports GFPGAN/ESRGAN facial reconstruction and upscaling
[Kevin Gibbons](https://github.com/bakkot)
- WebUI supports incremental display of in-progress images during generation
[Kevin Gibbons](https://github.com/bakkot)
- A new configuration file scheme that allows new models (including upcoming stable-diffusion-v1.5)
to be added without altering the code. ([David Wager](https://github.com/maddavid12))
- Can specify --grid on invoke.py command line as the default.
- Miscellaneous internal bug and stability fixes.
- Works on M1 Apple hardware.
- Multiple bug fixes.
For older changelogs, please visit the **[CHANGELOG](features/CHANGELOG.md)**.
For older changelogs, please visit the **[CHANGELOG](CHANGELOG.md#v114-11-september-2022)**.
## :material-target: Troubleshooting

View File

@ -72,7 +72,7 @@ in the wiki
7. Load the big stable diffusion weights files and a couple of smaller machine-learning models:
```bash
(invokeai) ~/InvokeAI$ python3 scripts/preload_models.py
(invokeai) ~/InvokeAI$ python scripts\preload_models.py
```
!!! note

View File

@ -22,7 +22,6 @@ dependencies:
- diffusers=0.6.0
- einops=0.4.1
- grpcio=1.46.4
- getpass_asterisk
- humanfriendly=10.0
- imageio=2.21.2
- imageio-ffmpeg=0.4.7
@ -52,6 +51,7 @@ dependencies:
- transformers=4.23.1
- torch-fidelity=0.3.0
- pip:
- getpass_asterisk
- dependency_injector==4.40.0
- realesrgan==0.2.5.0
- test-tube==0.7.5

View File

@ -1,6 +1,7 @@
name: invokeai
channels:
- pytorch
- conda-forge
- defaults
dependencies:
- python>=3.9
@ -9,7 +10,7 @@ dependencies:
- torchvision=0.13.1
- torchaudio=0.12.1
- pytorch=1.12.1
- cudatoolkit=11.3
- cudatoolkit=11.6
- pip:
- albumentations==0.4.3
- opencv-python==4.5.5.64
@ -39,6 +40,6 @@ dependencies:
- -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers
- -e git+https://github.com/Birch-san/k-diffusion.git@mps#egg=k_diffusion
- -e git+https://github.com/invoke-ai/Real-ESRGAN.git#egg=realesrgan
- -e git+https://github.com/TencentARC/GFPGAN.git#egg=gfpgan
- -e git+https://github.com/invoke-ai/GFPGAN.git#egg=gfpgan
- -e git+https://github.com/invoke-ai/clipseg.git@models-rename#egg=clipseg
- -e .

View File

@ -122,7 +122,7 @@ gr = Generate(
# these are deprecated - use conf and model instead
weights = path to model weights ('models/ldm/stable-diffusion-v1/model.ckpt')
config = path to model configuraiton ('configs/stable-diffusion/v1-inference.yaml')
config = path to model configuration ('configs/stable-diffusion/v1-inference.yaml')
)
"""

1
tests/pr_prompt.txt Normal file
View File

@ -0,0 +1 @@
banana sushi -Ak_lms -S42 -s10