Bring main back into a consistent state with other branches

- Due to misuse of rebase command, main was transiently
  in an inconsistent state.

- This repairs the damage, and adds a few post-release
  patches that ensure stable conda installs on Mac and Windows.
This commit is contained in:
Lincoln Stein
2022-11-03 15:44:06 -04:00
370 changed files with 32709 additions and 9854 deletions

View File

@ -45,7 +45,7 @@ Here's a prompt that depicts what it does.
original prompt:
`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180`
`#!bash "A fantastical translucent pony made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180`
<figure markdown>
![step1](../assets/negative_prompt_walkthru/step1.png)
@ -84,6 +84,109 @@ Getting close - but there's no sense in having a saddle when our horse doesn't h
---
## **Prompt Syntax Features**
The InvokeAI prompting language has the following features:
### Attention weighting
Append a word or phrase with `-` or `+`, or a weight between `0` and `2` (`1`=default), to decrease or increase "attention" (= a mix of per-token CFG weighting multiplier and, for `-`, a weighted blend with the prompt without the term).
The following syntax is recognised:
* single words without parentheses: `a tall thin man picking apricots+`
* single or multiple words with parentheses: `a tall thin man picking (apricots)+` `a tall thin man picking (apricots)-` `a tall thin man (picking apricots)+` `a tall thin man (picking apricots)-`
* more effect with more symbols `a tall thin man (picking apricots)++`
* nesting `a tall thin man (picking apricots+)++` (`apricots` effectively gets `+++`)
* all of the above with explicit numbers `a tall thin man picking (apricots)1.1` `a tall thin man (picking (apricots)1.3)1.1`. (`+` is equivalent to 1.1, `++` is pow(1.1,2), `+++` is pow(1.1,3), etc; `-` means 0.9, `--` means pow(0.9,2), etc.)
* attention also applies to `[unconditioning]` so `a tall thin man picking apricots [(ladder)0.01]` will *very gently* nudge SD away from trying to draw the man on a ladder
You can use this to increase or decrease the amount of something. Starting from this prompt of `a man picking apricots from a tree`, let's see what happens if we increase and decrease how much attention we want Stable Diffusion to pay to the word `apricots`:
![an AI generated image of a man picking apricots from a tree](../assets/prompt_syntax/apricots-0.png)
Using `-` to reduce apricot-ness:
| `a man picking apricots- from a tree` | `a man picking apricots-- from a tree` | `a man picking apricots--- from a tree` |
| -- | -- | -- |
| ![an AI generated image of a man picking apricots from a tree, with smaller apricots](../assets/prompt_syntax/apricots--1.png) | ![an AI generated image of a man picking apricots from a tree, with even smaller and fewer apricots](../assets/prompt_syntax/apricots--2.png) | ![an AI generated image of a man picking apricots from a tree, with very few very small apricots](../assets/prompt_syntax/apricots--3.png) |
Using `+` to increase apricot-ness:
| `a man picking apricots+ from a tree` | `a man picking apricots++ from a tree` | `a man picking apricots+++ from a tree` | `a man picking apricots++++ from a tree` | `a man picking apricots+++++ from a tree` |
| -- | -- | -- | -- | -- |
| ![an AI generated image of a man picking apricots from a tree, with larger, more vibrant apricots](../assets/prompt_syntax/apricots-1.png) | ![an AI generated image of a man picking apricots from a tree with even larger, even more vibrant apricots](../assets/prompt_syntax/apricots-2.png) | ![an AI generated image of a man picking apricots from a tree, but the man has been replaced by a pile of apricots](../assets/prompt_syntax/apricots-3.png) | ![an AI generated image of a man picking apricots from a tree, but the man has been replaced by a mound of giant melting-looking apricots](../assets/prompt_syntax/apricots-4.png) | ![an AI generated image of a man picking apricots from a tree, but the man and the leaves and parts of the ground have all been replaced by giant melting-looking apricots](../assets/prompt_syntax/apricots-5.png) |
You can also change the balance between different parts of a prompt. For example, below is a `mountain man`:
![an AI generated image of a mountain man](../assets/prompt_syntax/mountain-man.png)
And here he is with more mountain:
| `mountain+ man` | `mountain++ man` | `mountain+++ man` |
| -- | -- | -- |
| ![](../assets/prompt_syntax/mountain1-man.png) | ![](../assets/prompt_syntax/mountain2-man.png) | ![](../assets/prompt_syntax/mountain3-man.png) |
Or, alternatively, with more man:
| `mountain man+` | `mountain man++` | `mountain man+++` | `mountain man++++` |
| -- | -- | -- | -- |
| ![](../assets/prompt_syntax/mountain-man1.png) | ![](../assets/prompt_syntax/mountain-man2.png) | ![](../assets/prompt_syntax/mountain-man3.png) | ![](../assets/prompt_syntax/mountain-man4.png) |
### Blending between prompts
* `("a tall thin man picking apricots", "a tall thin man picking pears").blend(1,1)`
* The existing prompt blending using `:<weight>` will continue to be supported - `("a tall thin man picking apricots", "a tall thin man picking pears").blend(1,1)` is equivalent to `a tall thin man picking apricots:1 a tall thin man picking pears:1` in the old syntax.
* Attention weights can be nested inside blends.
* Non-normalized blends are supported by passing `no_normalize` as an additional argument to the blend weights, eg `("a tall thin man picking apricots", "a tall thin man picking pears").blend(1,-1,no_normalize)`. very fun to explore local maxima in the feature space, but also easy to produce garbage output.
See the section below on "Prompt Blending" for more information about how this works.
### Cross-Attention Control ('prompt2prompt')
Sometimes an image you generate is almost right, and you just want to
change one detail without affecting the rest. You could use a photo editor and inpainting
to overpaint the area, but that's a pain. Here's where `prompt2prompt`
comes in handy.
Generate an image with a given prompt, record the seed of the image,
and then use the `prompt2prompt` syntax to substitute words in the
original prompt for words in a new prompt. This works for `img2img` as well.
* `a ("fluffy cat").swap("smiling dog") eating a hotdog`.
* quotes optional: `a (fluffy cat).swap(smiling dog) eating a hotdog`.
* for single word substitutions parentheses are also optional: `a cat.swap(dog) eating a hotdog`.
* Supports options `s_start`, `s_end`, `t_start`, `t_end` (each 0-1) loosely corresponding to bloc97's `prompt_edit_spatial_start/_end` and `prompt_edit_tokens_start/_end` but with the math swapped to make it easier to intuitively understand.
* Example usage:`a (cat).swap(dog, s_end=0.3) eating a hotdog` - the `s_end` argument means that the "spatial" (self-attention) edit will stop having any effect after 30% (=0.3) of the steps have been done, leaving Stable Diffusion with 70% of the steps where it is free to decide for itself how to reshape the cat-form into a dog form.
* The numbers represent a percentage through the step sequence where the edits should happen. 0 means the start (noisy starting image), 1 is the end (final image).
* For img2img, the step sequence does not start at 0 but instead at (1-strength) - so if strength is 0.7, s_start and s_end must both be greater than 0.3 (1-0.7) to have any effect.
* Convenience option `shape_freedom` (0-1) to specify how much "freedom" Stable Diffusion should have to change the shape of the subject being swapped.
* `a (cat).swap(dog, shape_freedom=0.5) eating a hotdog`.
The `prompt2prompt` code is based off [bloc97's
colab](https://github.com/bloc97/CrossAttentionControl).
Note that `prompt2prompt` is not currently working with the runwayML
inpainting model, and may never work due to the way this model is set
up. If you attempt to use `prompt2prompt` you will get the original
image back. However, since this model is so good at inpainting, a
good substitute is to use the `clipseg` text masking option:
```
invoke> a fluffy cat eating a hotdot
Outputs:
[1010] outputs/000025.2182095108.png: a fluffy cat eating a hotdog
invoke> a smiling dog eating a hotdog -I 000025.2182095108.png -tm cat
```
### Escaping parantheses () and speech marks ""
If the model you are using has parentheses () or speech marks "" as
part of its syntax, you will need to "escape" these using a backslash,
so that`(my_keyword)` becomes `\(my_keyword\)`. Otherwise, the prompt
parser will attempt to interpret the parentheses as part of the prompt
syntax and it will get confused.
## **Prompt Blending**
You may blend together different sections of the prompt to explore the