test prompt:
"a cat sitting on a car {a dog sitting on a car}" -W 384 -H 256 -s 10 -S 12346 -A k_euler
Note that the substitution of dog for cat is currently hard-coded (ksampler.py,
lines 43-44).
On the command line, the new option is --text_mask or -tm.
Example:
```
invoke> a baseball -I /path/to/still_life.png -tm orange
```
This will find the orange fruit in the still life painting and replace
it with an image of a baseball.
- In CLI: the argument is --png_compression <0..9> (-z<0..9>)
- In API, pass `compress_level` to PngWriter.save_image_and_prompt_to_png()
Compression ranges from 0 (no compression) to 9 (maximum compression).
Default value is 6 (as specified by Pillow package).
This addresses an issue first raised in #652.
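Pillow documents `compress_level` as a zlib compression level, so the size trade-off can be sketched with the standard-library `zlib` module alone. This is only an illustration, not the PngWriter code; the helper name `compressed_sizes` and the sample payload are hypothetical.

```python
import zlib


def compressed_sizes(payload: bytes, levels=(0, 6, 9)) -> dict:
    """Return the deflate-compressed size of payload at each zlib level."""
    return {level: len(zlib.compress(payload, level)) for level in levels}


# Hypothetical stand-in for raw pixel rows: highly repetitive bytes.
sample = bytes(range(256)) * 64
sizes = compressed_sizes(sample)

for level, size in sizes.items():
    print(f"level {level}: {size} bytes (raw: {len(sample)} bytes)")
```

Level 0 stores the data uncompressed plus a small framing overhead, which is why it is the fastest and largest setting.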
- --inpaint_replace 0.X will cause inpainting to ignore what is under
the masked region with a strength ranging from 0 (don't ignore at all)
to 1.0 (ignore completely)
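One way to picture the strength parameter is as a per-element blend between the existing latent and fresh noise inside the mask. The sketch below uses plain Python lists and is an assumption about the general idea, not the actual inpainting code; `replace_under_mask` and its arguments are hypothetical names.

```python
def replace_under_mask(latent, noise, mask, strength):
    """Blend noise into latent where mask is 1, weighted by strength in [0, 1].

    strength 0 keeps the latent untouched; strength 1 replaces it fully
    with noise inside the masked region. Unmasked values never change.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return [
        l * (1 - m * strength) + n * m * strength
        for l, n, m in zip(latent, noise, mask)
    ]


# Second element is masked, first is not.
print(replace_under_mask([1.0, 1.0], [5.0, 5.0], [0, 1], 0.5))
```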
- sync with upstream development
- update docs
- add a `--inpaint_replace` option that fills masked regions with
latent noise. This allows radical changes to inpainted regions
at the cost of losing context.
- fix up readline, arg processing and metadata writing to accommodate
this change
- fixed bug in storage and retrieval of variations, discovered incidentally
during testing
- update documentation
- Error checks for invalid model
- Add !del_model command to invoke.py
- Add del_model() method to model_cache
- Autocompleter kept in sync with model addition/subtraction.
At step counts greater than ~75, the ksamplers start producing noisy
images when using the Karras noise schedule. This PR reverts to using
the model's own noise schedule, which eliminates the problem at the
cost of slowing convergence at lower step counts.
This PR also introduces a new CLI `--save_intermediates <n>` argument,
which will save every nth intermediate image into a subdirectory
named `intermediates/<image_prefix>`.
Addresses issue #1083.
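The "every nth step" rule can be sketched as a small path helper. The function name, the zero-padded file name, and placing the subdirectory under the output directory are all assumptions made for illustration; only the `intermediates/<image_prefix>` layout comes from the description above.

```python
from pathlib import Path


def intermediate_path(outdir, image_prefix, step, every_n):
    """Return where to save this step's image, or None if the step is skipped."""
    if every_n <= 0 or step % every_n != 0:
        return None
    # Hypothetical file naming: zero-padded step number.
    return Path(outdir) / "intermediates" / image_prefix / f"{step:03d}.png"


for step in range(1, 7):
    print(step, intermediate_path("outputs", "000056.292144555", step, 2))
```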
- !import_model <path/to/model/weights> will import a new model,
prompt the user for its name and description, write it to the
models.yaml file, and load it.
- !edit_model <model_name> will bring up a previously-defined model
and prompt the user to edit its descriptive fields.
Example of !import_model
<pre>
invoke> <b>!import_model models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt</b>
>> Model import in process. Please enter the values needed to configure this model:
Name for this model: <b>waifu-diffusion</b>
Description of this model: <b>Waifu Diffusion v1.3</b>
Configuration file for this model: <b>configs/stable-diffusion/v1-inference.yaml</b>
Default image width: <b>512</b>
Default image height: <b>512</b>
>> New configuration:
waifu-diffusion:
config: configs/stable-diffusion/v1-inference.yaml
description: Waifu Diffusion v1.3
height: 512
weights: models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt
width: 512
OK to import [n]? <b>y</b>
>> Caching model stable-diffusion-1.4 in system RAM
>> Loading waifu-diffusion from models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using faster float16 precision
</pre>
Example of !edit_model
<pre>
invoke> <b>!edit_model waifu-diffusion</b>
>> Editing model waifu-diffusion from configuration file ./configs/models.yaml
description: <b>Waifu diffusion v1.4beta</b>
weights: models/ldm/stable-diffusion-v1/<b>model-epoch10-float16.ckpt</b>
config: configs/stable-diffusion/v1-inference.yaml
width: 512
height: 512
>> New configuration:
waifu-diffusion:
config: configs/stable-diffusion/v1-inference.yaml
description: Waifu diffusion v1.4beta
weights: models/ldm/stable-diffusion-v1/model-epoch10-float16.ckpt
height: 512
width: 512
OK to import [n]? y
>> Caching model stable-diffusion-1.4 in system RAM
>> Loading waifu-diffusion from models/ldm/stable-diffusion-v1/model-epoch10-float16.ckpt
...
</pre>
This commit "reverts" the new API changes by extracting the old
functionality into new files.
The work is based on the commit `803a51d5adca7e6e28491fc414fd3937bee7cb79`
PngWriter regains PromptFormatter because the old server used it.
`server_legacy.py` is the old server that `dream.py` used.
Finally, `legacy_api.py` is what `dream.py` used to be at the mentioned
commit.
One manually run test has been added in order to be able to test
compatibility with the old API, currently just testing that the API
endpoint works the same way + the image hash is the same as it used to
be before.
- This PR enables two new commands in the invoke.py script
!models -- list the available models and their cache status
!switch <model> -- switch to the indicated model
Example:
invoke> !models
laion400m not loaded Latent Diffusion LAION400M model
stable-diffusion-1.4 active Stable Diffusion inference model version 1.4
waifu-1.3 cached Waifu anime model version 1.3
invoke> !switch waifu-1.3
>> Caching model stable-diffusion-1.4 in system RAM
>> Retrieving model waifu-1.3 from system RAM cache
The name and descriptions of the models are taken from
`configs/models.yaml`. A future enhancement to `model_cache.py` will be
to enable new model stanzas to be added to the file
programmatically. This will be useful for the WebGUI.
More details:
- Use fast switching algorithm described in PR #948
- Models are selected using their configuration stanza name
given in models.yaml.
- To avoid filling up CPU RAM with cached models, this PR
implements an LRU cache that monitors available CPU RAM.
- The caching code allows the minimum value of available RAM
to be adjusted, but invoke.py does not currently have a
command-line argument that allows you to set it. The
minimum free RAM is arbitrarily set to 2 GB.
- Add optional description field to configs/models.yaml
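The RAM-aware LRU behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `model_cache.py`: the class name, the `free_ram_fn` hook (a callable returning free RAM in GB), and the rule of never evicting the model just added are all hypothetical.

```python
from collections import OrderedDict


class ModelLRUCache:
    """Keep models in RAM, evicting the least recently used when RAM runs low."""

    def __init__(self, min_free_ram_gb=2.0, free_ram_fn=None):
        self._models = OrderedDict()  # oldest entry first
        self.min_free_ram_gb = min_free_ram_gb
        # Hypothetical hook; defaults to "RAM is never short".
        self.free_ram_fn = free_ram_fn or (lambda: float("inf"))

    def put(self, name, model):
        self._models[name] = model
        self._models.move_to_end(name)  # mark as most recently used
        # Evict oldest entries until enough RAM is free, keeping the newest.
        while self.free_ram_fn() < self.min_free_ram_gb and len(self._models) > 1:
            self._models.popitem(last=False)

    def get(self, name):
        model = self._models.get(name)
        if model is not None:
            self._models.move_to_end(name)
        return model

    def names(self):
        return list(self._models)
```

Passing the free-RAM probe in as a callable keeps the eviction logic testable without touching real system memory.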
Unrelated fixes:
- Added ">>" to CompVis model loading messages to make the user experience
more consistent.
- When generating an image larger than the defaults, warn about possibly
filling VRAM only the first time.
- Fixed bug that was causing help message to be printed twice. This involved
moving the import line for the web backend into the section where it is
called.
Coauthored by: @ArDiouscuros
- the prompt argument comes before the optional arguments
- usage statement shows 'invoke>' rather than 'invoke.py'
- use pydoc pager to help display long help message
- rename dream.py to invoke.py
- create a compatibility script named dream.py that execs() invoke.py
- redo documentation
- change help message in args
- this does **not** rename the libraries, which are still ldm.dream.util, etc
This reverts commit 5f42d08945.
This fix was intended to solve issue #939, in which ESRGAN generates
dark images when upscaling 4X on certain GTX cards. However, the fix
apparently causes conflicts with some versions of the ESRGAN library,
and this fix will have to wait until after release of 2.0.
- txt2img2img back to using DDIM as img2img sampler; results produced
by some k* samplers are just not reliable enough for good user
experience
- img2img progress message clarifies why img2img steps taken != steps requested
- warn of potential problems when user tries to run img2img on a small init image
- Added support for pyreadline3 so that Windows users can benefit.
- Added the !search command to search the history for a matching string:
~~~
!search puppies
[20] puppies at the food bowl -Ak_lms
[54] house overrun by hungry puppies -C20 -s100
~~~
- Added the !clear command to clear the in-memory and on-disk
command history.
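The `!search` behaviour shown above amounts to a substring filter over numbered history entries. A minimal sketch, assuming 1-based numbering and plain substring matching (both assumptions; the function names are hypothetical):

```python
def search_history(history, needle):
    """Return (index, command) pairs for entries containing needle (1-based)."""
    return [(i, line) for i, line in enumerate(history, start=1) if needle in line]


def format_matches(matches):
    """Render matches in the "[n] command" style of the example output."""
    return [f"[{i}] {line}" for i, line in matches]


history = [
    "a cat sitting on a car",
    "puppies at the food bowl -Ak_lms",
    "house overrun by hungry puppies -C20 -s100",
]
for row in format_matches(search_history(history, "puppies")):
    print(row)
```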
- embiggen needs to use ddim sampler due to low step count
- --hires_fix option needs to be written to log and command string
- fix call signature of _init_image_mask()
- img2img confirmed working with all samplers
- inpainting working on ddim & plms. Changes to k-diffusion
module seem to be needed for inpainting support.
- switched k-diffuser noise schedule to original karras schedule,
which reduces the step number needed for good results
- If readline.set_auto_history() is not implemented, as in pyreadline3, will fall
back gracefully to automatic history saving. The only issue with this is that
!history commands will be recorded in the history.
- !fetch on missing file no longer crashes script
- !history is now one of the autocomplete commands
- .dream_history now stored in output directory rather than ~user directory.
An important limitation of the last feature is that the history is
loaded and saved to the .dream_history file in the --outdir directory
specified at script launch time. It is not swapped around when the
--outdir is changed during the session.
Add message about interpolation size
Fix crash if sampler not set to DDIM, change parameter name to hires_fix
Hi res mode fix duplicates with img2img scaling
- When --save_orig *not* provided during image generation with
upscaling/face fixing, an extra image file was being created. This
PR fixes the problem.
- Also generalizes the tab autocomplete for image paths such that
autocomplete searches the output directory for all path-modifying
options except for --outdir.
- normalized how filenames are written out when postprocessing invoked
- various fixes of bugs encountered during testing
- updated documentation
- updated help text
- Enhance tab completion functionality
- Each of the switches that read a filepath (e.g. --init_img) will trigger file path completion. The
-S switch will display a list of recently-used seeds.
- Added new !fetch command to retrieve the metadata from a previously-generated image and populate the
readline linebuffer with the appropriate editable command to regenerate.
- Added new !history command to display previous commands and reload them for modification.
- The !fetch and !fix commands both autocomplete *and* search automatically through the current
outdir for files.
- The completer maintains a list of recently used seeds and will try to autocomplete them.
- args.py will now attempt to return a metadata-containing Args
object using the following methods:
1. By looking for the 'sd-metadata' tag in the PNG info
2. By looking for the 'Dream' tag
3. As a last resort, fetch the seed from the filename and assume
defaults for all other options.
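The three-step fallback can be sketched as a function over the PNG text-chunk dictionary (for example, what Pillow exposes as `Image.info`). This is an illustration, not the args.py implementation: the function name and return shape are hypothetical, treating 'sd-metadata' as a JSON string is an assumption, and the `<prefix>.<seed>.png` file name pattern is inferred from example names such as `000056.292144555.png`.

```python
import json
import os
import re


def recover_settings(png_info, filename):
    """Try 'sd-metadata', then 'Dream', then the seed embedded in the filename."""
    if "sd-metadata" in png_info:
        # Assumption: the tag holds a JSON document.
        return {"source": "sd-metadata", "data": json.loads(png_info["sd-metadata"])}
    if "Dream" in png_info:
        # Assumption: the tag holds the original command string.
        return {"source": "Dream", "data": png_info["Dream"]}
    match = re.match(r"\d+\.(\d+)\.png$", os.path.basename(filename))
    if match:
        return {"source": "filename", "data": {"seed": int(match.group(1))}}
    return None


print(recover_settings({}, "./outputs/img-samples/000056.292144555.png"))
```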
Build the base generator in same place and way as other generators to reduce the chance of missed arguments in the future.
Fixes a crash when displaying in-progress images, though note the feature still doesn't work for other reasons.
1. Add ldm/dream/restoration/__init__.py file that was inadvertently not
committed earlier.
2. Add '.' to sys.path to address weird mac problem reported in #723
- Adapted from PR #489, author Dominic Letz [https://github.com/dominicletz]
- Too many upstream changes to merge, so frankensteined it in.
- Added support for !fix syntax
- Added documentation
- The seed printed needs to be the one generated prior to the
initial noising operation. To do this, I added a new "first_seed"
argument to the image callback in dream.py.
- Closes #641
- modify strength of embiggen to reduce tiling ghosts
- normalize naming of postprocessed files (could improve more to avoid
name collisions)
- move restoration modules under ldm.dream
- supports gfpgan, esrgan, codeformer and embiggen
- To use:
dream> !fix ./outputs/img-samples/000056.292144555.png -ft gfpgan -U2 -G0.8
dream> !fix ./outputs/img-samples/000056.292144555.png -ft codeformer -G 0.8
dream> !fix ./outputs/img-samples/000056.29214455.png -U4
dream> !fix ./outputs/img-samples/000056.292144555.png -embiggen 1.5
The first example invokes gfpgan to fix faces and esrgan to upscale.
The second example invokes codeformer to fix faces, with no upscaling.
The third example uses esrgan to upscale 4X.
The fourth example runs embiggen to enlarge 1.5X.
- This is very preliminary work. There are some anomalies to note:
1. The syntax is non-obvious. I would prefer something like:
!fix esrgan,gfpgan
!fix esrgan
!fix embiggen,codeformer
However, this will require refactoring the gfpgan and embiggen
code.
2. Images generated using gfpgan, esrgan or codeformer all are named
"xxxxxx.xxxxxx.postprocessed.png" and the original is saved.
However, the prefix is a new one that is not related to the
original.
3. Images generated using embiggen are named "xxxxx.xxxxxxx.png",
and once again the prefix is new. I'm not sure whether the
prefix should be aligned with the original file's prefix or not.
Probably not, but opinions welcome.
Allowed values are 'auto', 'float32', 'autocast', 'float16'. If not specified, or if set to 'auto', a working precision is automatically selected based on the torch device.
Context: #526
Deprecated --full_precision / -F
Tested on both cuda and cpu by calling scripts/dream.py without arguments and checked the auto configuration worked. With --precision=auto/float32/autocast/float16 it performs as expected, either working or failing with a reasonable error. Also checked Img2Img.
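The selection logic might look roughly like the sketch below. The mapping used for 'auto' (mixed precision on CUDA, full precision elsewhere) is an assumption for illustration only and is not taken from this change; `choose_precision` is a hypothetical name, and the device type is passed as a plain string so the example runs without torch.

```python
ALLOWED_PRECISIONS = ("auto", "float32", "autocast", "float16")


def choose_precision(device_type, requested="auto"):
    """Resolve the --precision value for a torch device type such as 'cuda'."""
    if requested not in ALLOWED_PRECISIONS:
        raise ValueError(f"precision must be one of {ALLOWED_PRECISIONS}")
    if requested != "auto":
        return requested  # an explicit choice always wins
    # Assumed default: only CUDA gets mixed precision automatically.
    return "autocast" if device_type == "cuda" else "float32"


print(choose_precision("cuda"))
print(choose_precision("cpu"))
print(choose_precision("cpu", "float16"))
```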
* Support color correction for img2img and inpainting, avoiding the shift to magenta seen when running images through img2img repeatedly.
* Fix docs for color correction
* add --init_color to prompt reconstruction
* For best results, the --init_color option should point to the *very first* image used in the sequence of img2img operations. Otherwise color correction will skew towards cyan.
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Fixes:
File "stable-diffusion/ldm/modules/diffusionmodules/model.py", line 37, in nonlinearity
return x*torch.sigmoid(x)
RuntimeError: CUDA out of memory. Tried to allocate 1.56 GiB [..]
Now up to 1536x1280 is possible on 8GB VRAM.
Also remove unused SiLU class.
Apply ~6% speedup by moving * self.scale to earlier on a smaller tensor.
When we have enough VRAM don't make a useless zeros tensor.
Switch between cuda/mps/cpu based on q.device.type to allow cleaner per architecture future optimizations.
For cuda and cpu keep VRAM usage and faster slicing consistent.
For cpu use smaller slices. Tested ~20% faster on i7, 9.8 to 7.7 s/it.
Fix = typo to self.mem_total >= 8 in einsum_op_mps_v2 as per #582 discussion.
- fixes no closing quote in pretty-printed dream_prompt string
- removes unnecessary -f switch when txt2img used
In addition, this commit does an experimental commenting-out of the
random.seed() call in the variation-generating part of ldm.dream.generator.base.
This fixes the problem of two calls that use the same seed and -v0.1
generating different images (#641). However, it does not fix the issue
of two images generated using the same seed and -VXXXXXX being
different.
* Implements rudimentary api
* Fixes blocking in API
* Adds UI to monorepo > src/frontend/
* Updates frontend/README
* Reverts conda env name to `ldm`
* Fixes environment yamls
* CORS config for testing
* Fixes LogViewer position
* API WID
* Adds actions to image viewer
* Increases vite chunkSizeWarningLimit to 1500
* Implements init image
* Implements state persistence in localStorage
* Improve progress data handling
* Final build
* Fixes mimetypes error on windows
* Adds error logging
* Fixes bugged img2img strength component
* Adds sourcemaps to dev build
* Fixes missing key
* Changes connection status indicator to text
* Adds ability to serve other hosts than localhost
* Adding Flask API server
* Removes source maps from config
* Fixes prop transfer
* Add missing packages and add CORS support
* Adding API doc
* Remove defaults from openapi doc
* Adds basic error handling for server config query
* Mostly working socket.io implementation.
* Fixes bug preventing mask upload
* Fixes bug with sampler name not written to metadata
* UI Overhaul, numerous fixes
Co-authored-by: Kyle Schouviller <kyle0654@hotmail.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Feature complete for #266 with exception of several small deviations:
1. initial image and model weight hashes use full sha256 hash rather than first 8 digits
2. Initialization parameters for post-processing steps not provided
3. Uses top-level "images" tags for both a single image and a grid of images. This change was suggested in a comment.
* Added scripts/sd_metadata.py to retrieve and print metadata from PNG files
* New ldm.dream.args.Args class is a namespace-like object which holds all defaults and can be modified during execution to hold current settings.
* Modified dream.py and server.py to accommodate Args class.
This change makes it so any API clients can show the same error as what
happens in the terminal where you run the API. Useful for various WebUIs
to display more helpful error messages to users.
Co-authored-by: CapableWeb <capableweb@domain.com>