From d12ae3bab0719c1e9f5aa925768407012538a97c Mon Sep 17 00:00:00 2001
From: Damian at mba
Date: Mon, 24 Oct 2022 14:58:38 +0200
Subject: [PATCH] documentation for new prompt syntax

---
 docs/features/PROMPTS.md | 42 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/docs/features/PROMPTS.md b/docs/features/PROMPTS.md
index b5ef26858b..8fdb97b7b8 100644
--- a/docs/features/PROMPTS.md
+++ b/docs/features/PROMPTS.md
@@ -84,6 +84,48 @@ Getting close - but there's no sense in having a saddle when our horse doesn't h

 ---

+## **Prompt Syntax Features**
+
+The InvokeAI prompting language has the following features:
+
+### Attention weighting
+Append `-` or `+`, or a weight between `0` and `2` (`1` = default), to a word or phrase to decrease or increase "attention" (a mix of a per-token CFG weighting multiplier and, for `-`, a weighted blend with the prompt without the term).
+
+The following will be recognised:
+ * single words without parentheses: `a tall thin man picking apricots+`
+ * single or multiple words with parentheses: `a tall thin man picking (apricots)+` `a tall thin man picking (apricots)-` `a tall thin man (picking apricots)+` `a tall thin man (picking apricots)-`
+ * more effect with more symbols: `a tall thin man (picking apricots)++`
+ * nesting: `a tall thin man (picking apricots+)++` (`apricots` effectively gets `+++`)
+ * all of the above with explicit numbers: `a tall thin man picking (apricots)1.1` `a tall thin man (picking (apricots)1.3)1.1`. (`+` is equivalent to 1.1, `++` is pow(1.1,2), `+++` is pow(1.1,3), etc.; `-` means 0.9, `--` means pow(0.9,2), etc.)
+ * attention also applies to `[unconditioning]`, so `a tall thin man picking apricots [(ladder)0.01]` will *very gently* nudge Stable Diffusion away from trying to draw the man on a ladder.
+
+### Blending between prompts
+
+* `("a tall thin man picking apricots", "a tall thin man picking pears").blend(1,1)`
+* The existing prompt blending using `:` will continue to be supported - `("a tall thin man picking apricots", "a tall thin man picking pears").blend(1,1)` is equivalent to `a tall thin man picking apricots:1 a tall thin man picking pears:1` in the old syntax.
+* Attention weights can be nested inside blends.
+* Non-normalized blends are supported by passing `no_normalize` as an additional argument to the blend weights, e.g. `("a tall thin man picking apricots", "a tall thin man picking pears").blend(1,-1,no_normalize)`. This is fun for exploring local maxima in the feature space, but it can also easily produce garbage output.
+
+See the section below on "Prompt Blending" for more information about how this works.
+
+### Cross-Attention Control ('prompt2prompt')
+
+Denoise with a given prompt and then re-use the attention→pixel maps to substitute words in the original prompt for words in a new prompt. Based on [bloc97's colab](https://github.com/bloc97/CrossAttentionControl).
+
+* `a ("fluffy cat").swap("smiling dog") eating a hotdog`.
+    * quotes are optional: `a (fluffy cat).swap(smiling dog) eating a hotdog`.
+    * for single-word substitutions, parentheses are also optional: `a cat.swap(dog) eating a hotdog`.
+* Supports options `s_start`, `s_end`, `t_start`, `t_end` (each 0-1), loosely corresponding to bloc97's `prompt_edit_spatial_start/_end` and `prompt_edit_tokens_start/_end` but with the math swapped to make them easier to understand intuitively.
+    * Example usage: `a (cat).swap(dog, s_end=0.3) eating a hotdog` - the `s_end` argument means that the "spatial" (self-attention) edit will stop having any effect after 30% (=0.3) of the steps have been done, leaving Stable Diffusion with 70% of the steps where it is free to decide for itself how to reshape the cat form into a dog form.
+    * The numbers represent how far through the step sequence the edits should happen, as a fraction. 0 means the start (noisy starting image), 1 is the end (final image).
+    * For img2img, the step sequence does not start at 0 but instead at (1-strength) - so if strength is 0.7, `s_start` and `s_end` must both be greater than 0.3 (1-0.7) to have any effect.
+* Convenience option `shape_freedom` (0-1) to specify how much "freedom" Stable Diffusion should have to change the shape of the subject being swapped.
+    * `a (cat).swap(dog, shape_freedom=0.5) eating a hotdog`.
+
+### Escaping parentheses () and speech marks ""
+
+If the model you are using has parentheses () or speech marks "" as part of its syntax, you will need to "escape" these using a backslash, so that `(my_keyword)` becomes `\(my_keyword\)`. Otherwise, the prompt parser will try to interpret the parentheses as part of the prompt syntax and get confused.
+
 ## **Prompt Blending**

 You may blend together different sections of the prompt to explore the