* Testing change to LatentsToText to allow setting different cfg_scale values per diffusion step.
* Adding first attempt at float param easing node, using Penner easing functions.
* Core implementation of ControlNet and MultiControlNet.
* Added support for ControlNet and MultiControlNet to legacy non-nodal Txt2Img in backend/generator. Although backend/generator will likely disappear by v3.x, right now they are very useful for testing core ControlNet and MultiControlNet functionality while node codebase is rapidly evolving.
* Added example of using ControlNet with legacy Txt2Img generator
* Resolving rebase conflict
* Added first controlnet preprocessor node for canny edge detection.
* Initial port of controlnet node support from generator-based TextToImageInvocation node to latent-based TextToLatentsInvocation node
* Switching to ControlField for output from controlnet nodes.
* Resolving conflicts in rebase to origin/main
* Refactored ControlNet nodes so they subclass from PreprocessedControlInvocation, and only need to override run_processor(image) (instead of reimplementing invoke())
* changes to base class for controlnet nodes
* Added HED, LineArt, and OpenPose ControlNet nodes
* Added an additional "raw_processed_image" output port to controlnets, mainly so could route ImageField to a ShowImage node
* Added more preprocessor nodes for:
MidasDepth
ZoeDepth
MLSD
NormalBae
Pidi
LineartAnime
ContentShuffle
Removed pil_output options, ControlNet preprocessors should always output as PIL. Removed diagnostics and other general cleanup.
* Prep for splitting pre-processor and controlnet nodes
* Refactored controlnet nodes: split out controlnet stuff into separate node, stripped controlnet stuff form image processing/analysis nodes.
* Added resizing of controlnet image based on noise latent. Fixes a tensor mismatch issue.
* More rebase repair.
* Added support for using multiple control nets. Unfortunately this breaks direct usage of Control node output port ==> TextToLatent control input port -- passing through a Collect node is now required. Working on fixing this...
* Fixed use of ControlNet control_weight parameter
* Fixed lint-ish formatting error
* Core implementation of ControlNet and MultiControlNet.
* Added first controlnet preprocessor node for canny edge detection.
* Initial port of controlnet node support from generator-based TextToImageInvocation node to latent-based TextToLatentsInvocation node
* Switching to ControlField for output from controlnet nodes.
* Refactored controlnet node to output ControlField that bundles control info.
* changes to base class for controlnet nodes
* Added more preprocessor nodes for:
MidasDepth
ZoeDepth
MLSD
NormalBae
Pidi
LineartAnime
ContentShuffle
Removed pil_output options, ControlNet preprocessors should always output as PIL. Removed diagnostics and other general cleanup.
* Prep for splitting pre-processor and controlnet nodes
* Refactored controlnet nodes: split out controlnet stuff into separate node, stripped controlnet stuff form image processing/analysis nodes.
* Added resizing of controlnet image based on noise latent. Fixes a tensor mismatch issue.
* Cleaning up TextToLatent arg testing
* Cleaning up mistakes after rebase.
* Removed last bits of dtype and and device hardwiring from controlnet section
* Refactored ControNet support to consolidate multiple parameters into data struct. Also redid how multiple controlnets are handled.
* Added support for specifying which step iteration to start using
each ControlNet, and which step to end using each controlnet (specified as fraction of total steps)
* Cleaning up prior to submitting ControlNet PR. Mostly turning off diagnostic printing. Also fixed error when there is no controlnet input.
* Added dependency on controlnet-aux v0.0.3
* Commented out ZoeDetector. Will re-instate once there's a controlnet-aux release that supports it.
* Switched CotrolNet node modelname input from free text to default list of popular ControlNet model names.
* Fix to work with current stable release of controlnet_aux (v0.0.3). Turned of pre-processor params that were added post v0.0.3. Also change defaults for shuffle.
* Refactored most of controlnet code into its own method to declutter TextToLatents.invoke(), and make upcoming integration with LatentsToLatents easier.
* Cleaning up after ControlNet refactor in TextToLatentsInvocation
* Extended node-based ControlNet support to LatentsToLatentsInvocation.
* chore(ui): regen api client
* fix(ui): add value to conditioning field
* fix(ui): add control field type
* fix(ui): fix node ui type hints
* fix(nodes): controlnet input accepts list or single controlnet
* Moved to controlnet_aux v0.0.4, reinstated Zoe controlnet preprocessor. Also in pyproject.toml had to specify downgrade of timm to 0.6.13 _after_ controlnet-aux installs timm >= 0.9.2, because timm >0.6.13 breaks Zoe preprocessor.
* Core implementation of ControlNet and MultiControlNet.
* Added first controlnet preprocessor node for canny edge detection.
* Switching to ControlField for output from controlnet nodes.
* Resolving conflicts in rebase to origin/main
* Refactored ControlNet nodes so they subclass from PreprocessedControlInvocation, and only need to override run_processor(image) (instead of reimplementing invoke())
* changes to base class for controlnet nodes
* Added HED, LineArt, and OpenPose ControlNet nodes
* Added more preprocessor nodes for:
MidasDepth
ZoeDepth
MLSD
NormalBae
Pidi
LineartAnime
ContentShuffle
Removed pil_output options, ControlNet preprocessors should always output as PIL. Removed diagnostics and other general cleanup.
* Prep for splitting pre-processor and controlnet nodes
* Refactored controlnet nodes: split out controlnet stuff into separate node, stripped controlnet stuff form image processing/analysis nodes.
* Added resizing of controlnet image based on noise latent. Fixes a tensor mismatch issue.
* Added support for using multiple control nets. Unfortunately this breaks direct usage of Control node output port ==> TextToLatent control input port -- passing through a Collect node is now required. Working on fixing this...
* Fixed use of ControlNet control_weight parameter
* Core implementation of ControlNet and MultiControlNet.
* Added first controlnet preprocessor node for canny edge detection.
* Initial port of controlnet node support from generator-based TextToImageInvocation node to latent-based TextToLatentsInvocation node
* Switching to ControlField for output from controlnet nodes.
* Refactored controlnet node to output ControlField that bundles control info.
* changes to base class for controlnet nodes
* Added more preprocessor nodes for:
MidasDepth
ZoeDepth
MLSD
NormalBae
Pidi
LineartAnime
ContentShuffle
Removed pil_output options, ControlNet preprocessors should always output as PIL. Removed diagnostics and other general cleanup.
* Prep for splitting pre-processor and controlnet nodes
* Refactored controlnet nodes: split out controlnet stuff into separate node, stripped controlnet stuff form image processing/analysis nodes.
* Added resizing of controlnet image based on noise latent. Fixes a tensor mismatch issue.
* Cleaning up TextToLatent arg testing
* Cleaning up mistakes after rebase.
* Removed last bits of dtype and and device hardwiring from controlnet section
* Refactored ControNet support to consolidate multiple parameters into data struct. Also redid how multiple controlnets are handled.
* Added support for specifying which step iteration to start using
each ControlNet, and which step to end using each controlnet (specified as fraction of total steps)
* Cleaning up prior to submitting ControlNet PR. Mostly turning off diagnostic printing. Also fixed error when there is no controlnet input.
* Commented out ZoeDetector. Will re-instate once there's a controlnet-aux release that supports it.
* Switched CotrolNet node modelname input from free text to default list of popular ControlNet model names.
* Fix to work with current stable release of controlnet_aux (v0.0.3). Turned of pre-processor params that were added post v0.0.3. Also change defaults for shuffle.
* Refactored most of controlnet code into its own method to declutter TextToLatents.invoke(), and make upcoming integration with LatentsToLatents easier.
* Cleaning up after ControlNet refactor in TextToLatentsInvocation
* Extended node-based ControlNet support to LatentsToLatentsInvocation.
* chore(ui): regen api client
* fix(ui): fix node ui type hints
* fix(nodes): controlnet input accepts list or single controlnet
* Added Mediapipe image processor for use as ControlNet preprocessor.
Also hacked in ability to specify HF subfolder when loading ControlNet models from string.
* Fixed bug where MediapipFaceProcessorInvocation was ignoring max_faces and min_confidence params.
* Added nodes for float params: ParamFloatInvocation and FloatCollectionOutput. Also added FloatOutput.
* Added mediapipe install requirement. Should be able to remove once controlnet_aux package adds mediapipe to its requirements.
* Added float to FIELD_TYPE_MAP ins constants.ts
* Progress toward improvement in fieldTemplateBuilder.ts getFieldType()
* Fixed controlnet preprocessors and controlnet handling in TextToLatents to work with revised Image services.
* Cleaning up from merge, re-adding cfg_scale to FIELD_TYPE_MAP
* Making sure cfg_scale of type list[float] can be used in image metadata, to support param easing for cfg_scale
* Fixed math for per-step param easing.
* Added option to show plot of param value at each step
* Just cleaning up after adding param easing plot option, removing vestigial code.
* Modified control_weight ControlNet param to be polistmorphic --
can now be either a single float weight applied for all steps, or a list of floats of size total_steps, that specifies weight for each step.
* Added more informative error message when _validat_edge() throws an error.
* Just improving parm easing bar chart title to include easing type.
* Added requirement for easing-functions package
* Taking out some diagnostic prints.
* Added option to use both easing function and mirror of easing function together.
* Fixed recently introduced problem (when pulled in main), triggered by num_steps in StepParamEasingInvocation not having a default value -- just added default.
---------
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
1. Contents of autoscan directory field are restored after doing an installation.
2. Activate dialogue to choose V2 parameterization when importing from a directory.
3. Remove autoscan directory from init file when its checkbox is unselected.
4. Add widget cycling behavior to install models form.
- Also fixed up order in which logger is created in invokeai-web
so that handlers are installed after command-line options are
parsed (and not before!)
1. Model installer works correctly under Windows 11 Terminal
2. Fixed crash when configure script hands control off to installer
3. Kill install subprocess on keyboard interrupt
4. Command-line functionality for --yes configuration and model installation
restored.
5. New command-line features:
- install/delete lists of diffusers, LoRAS, controlnets and textual inversions
using repo ids, paths or URLs.
Help:
```
usage: invokeai-model-install [-h] [--diffusers [DIFFUSERS ...]] [--loras [LORAS ...]] [--controlnets [CONTROLNETS ...]] [--textual-inversions [TEXTUAL_INVERSIONS ...]] [--delete] [--full-precision | --no-full-precision]
[--yes] [--default_only] [--list-models {diffusers,loras,controlnets,tis}] [--config_file CONFIG_FILE] [--root_dir ROOT]
InvokeAI model downloader
options:
-h, --help show this help message and exit
--diffusers [DIFFUSERS ...]
List of URLs or repo_ids of diffusers to install/delete
--loras [LORAS ...] List of URLs or repo_ids of LoRA/LyCORIS models to install/delete
--controlnets [CONTROLNETS ...]
List of URLs or repo_ids of controlnet models to install/delete
--textual-inversions [TEXTUAL_INVERSIONS ...]
List of URLs or repo_ids of textual inversion embeddings to install/delete
--delete Delete models listed on command line rather than installing them
--full-precision, --no-full-precision
use 32-bit weights instead of faster 16-bit weights (default: False)
--yes, -y answer "yes" to all prompts
--default_only only install the default model
--list-models {diffusers,loras,controlnets,tis}
list installed models
--config_file CONFIG_FILE, -c CONFIG_FILE
path to configuration file to create
--root_dir ROOT path to root of install directory
```
- The invokeai.db database file has now been moved into
`INVOKEAIROOT/databases`. Using plural here for possible
future with more than one database file.
- Removed a few dangling debug messages that appeared during
testing.
- Rebuilt frontend to test web.
1. Separated the "starter models" and "more models" sections. This
gives us room to list all installed diffuserse models, not just
those that are on the starter list.
2. Support mouse-based paste into the textboxes with either middle
or right mouse buttons.
3. Support terminal-style cursor movement:
^A to move to beginning of line
^E to move to end of line
^K kill text to right and put in killring
^Y yank text back
4. Internal code cleanup.
Problem was that controlnet support involved adding **kwargs to method calls down in denoising loop, and AddsMaskLatents didn't accept **kwarg arg. So just changed to accept and pass on **kwargs.
The problem was the same seed was getting used for the seam painting pass, causing the fried look.
Same issue as if you do img2img on a txt2img with the same seed/prompt.
Thanks to @hipsterusername for teaming up to debug this. We got pretty deep into the weeds.
This commit makes InvokeAI 3.0 to be installable via PyPi.org and the
installer script.
Main changes.
1. Move static web pages into `invokeai/frontend/web` and modify the
API to look for them there. This allows pip to copy the files into the
distribution directory so that user no longer has to be in repo root
to launch.
2. Update invoke.sh and invoke.bat to launch the new web application
properly. This also changes the wording for launching the CLI from
"generate images" to "explore the InvokeAI node system," since I would
not recommend using the CLI to generate images routinely.
3. Fix a bug in the checkpoint converter script that was identified
during testing.
4. Better error reporting when checkpoint converter fails.
5. Rebuild front end.
- Make environment variable settings case InSenSiTive:
INVOKEAI_MAX_LOADED_MODELS and InvokeAI_Max_Loaded_Models
environment variables will both set `max_loaded_models`
- Updated realesrgan to use new config system.
- Updated textual_inversion_training to use new config system.
- Discovered a race condition when InvokeAIAppConfig is created
at module load time, which makes it impossible to customize
or replace the help message produced with --help on the command
line. To fix this, moved all instances of get_invokeai_config()
from module load time to object initialization time. Makes code
cleaner, too.
- Added `--from_file` argument to `invokeai-node-cli` and changed
github action to match. CI tests will hopefully work now.
- invokeai-configure updated to work with new config system
- migrate invokeai.init to invokeai.yaml during configure
- replace legacy invokeai with invokeai-node-cli
- add ability to run an invocation directly from invokeai-node-cli command line
- update CI tests to work with new invokeai syntax
1. If an external VAE is specified in config file, then
get_model(submodel=vae) will return the external VAE, not the one
burnt into the parent diffusers pipeline.
2. The mechanism in (1) is generalized such that you can now have
"unet:", "text_encoder:" and similar stanzas in the config file.
Valid formats of these subsections:
unet:
repo_id: foo/bar
unet:
path: /path/to/local/folder
unet:
repo_id: foo/bar
subfolder: unet
In the near future, these will also be used to attach external
parts to the pipeline, generalizing VAE behavior.
3. Accommodate callers (i.e. the WebUI) that are passing the
model key ("diffusers/stable-diffusion-1.5") to get_model()
instead of the tuple of model_name and model_type.
4. Fixed bug in VAE model attaching code.
5. Rebuilt web front end.
This commit adds invokeai.backend.util.logging, which provides support
for formatted console and logfile messages that follow the status
reporting conventions of earlier InvokeAI versions.
Examples:
### A critical error (logging.CRITICAL)
*** A non-fatal error (logging.ERROR)
** A warning (logging.WARNING)
>> Informational message (logging.INFO)
| Debugging message (logging.DEBUG)
This style logs everything through a single logging object and is
identical to using Python's `logging` module. The commonly-used
module-level logging functions are implemented as simple pass-thrus
to logging:
import invokeai.backend.util.logging as ialog
ialog.debug('this is a debugging message')
ialog.info('this is a informational message')
ialog.log(level=logging.CRITICAL, 'get out of dodge')
ialog.disable(level=logging.INFO)
ialog.basicConfig(filename='/var/log/invokeai.log')
Internally, the invokeai logging module creates a new default logger
named "invokeai" so that its logging does not interfere with other
module's use of the vanilla logging module. So `logging.error("foo")`
will go through the regular logging path and not add the additional
message decorations.
For more control, the logging module's object-oriented logging style
is also supported. The API is identical to the vanilla logging
usage. In fact, the only thing that has changed is that the
getLogger() method adds a custom formatter to the log messages.
import logging
from invokeai.backend.util.logging import InvokeAILogger
logger = InvokeAILogger.getLogger(__name__)
fh = logging.FileHandler('/var/invokeai.log')
logger.addHandler(fh)
logger.critical('this will be logged to both the console and the log file')
This commit adds invokeai.backend.util.logging, which provides support
for formatted console and logfile messages that follow the status
reporting conventions of earlier InvokeAI versions.
Examples:
### A critical error (logging.CRITICAL)
*** A non-fatal error (logging.ERROR)
** A warning (logging.WARNING)
>> Informational message (logging.INFO)
| Debugging message (logging.DEBUG)
- New method is ModelManager.get_sub_model(model_name:str,model_part:SDModelComponent)
To use:
```
from invokeai.backend import ModelManager, SDModelComponent as sdmc
manager = ModelManager('/path/to/models.yaml')
vae = manager.get_sub_model('stable-diffusion-1.5', sdmc.vae)
```
This commit fixes bugs related to the on-the-fly conversion and loading of
legacy checkpoint models built on SD-2.0 base.
- When legacy checkpoints built on SD-2.0 models were converted
on-the-fly using --ckpt_convert, generation would crash with a
precision incompatibility error.
A long-standing issue with importing legacy checkpoints (both ckpt and
safetensors) is that the user has to identify the correct config file,
either by providing its path or by selecting which type of model the
checkpoint is (e.g. "v1 inpainting"). In addition, some users wish to
provide custom VAEs for use with the model. Currently this is done in
the WebUI by importing the model, editing it, and then typing in the
path to the VAE.
To improve the user experience, the model manager's
`heuristic_import()` method has been enhanced as follows:
1. When initially called, the caller can pass a config file path, in
which case it will be used.
2. If no config file provided, the method looks for a .yaml file in the
same directory as the model which bears the same basename. e.g.
```
my-new-model.safetensors
my-new-model.yaml
```
The yaml file is then used as the configuration file for
importation and conversion.
3. If no such file is found, then the method opens up the checkpoint
and probes it to determine whether it is V1, V1-inpaint or V2.
If it is a V1 format, then the appropriate v1-inference.yaml config
file is used. Unfortunately there are two V2 variants that cannot be
distinguished by introspection.
4. If the probe algorithm is unable to determine the model type, then its
last-ditch effort is to execute an optional callback function that can
be provided by the caller. This callback, named `config_file_callback`
receives the path to the legacy checkpoint and returns the path to the
config file to use. The CLI uses to put up a multiple choice prompt to
the user. The WebUI **could** use this to prompt the user to choose
from a radio-button selection.
5. If the config file cannot be determined, then the import is abandoned.
The user can attach a custom VAE to the imported and converted model
by copying the desired VAE into the same directory as the file to be
imported, and giving it the same basename. E.g.:
```
my-new-model.safetensors
my-new-model.vae.pt
```
For this to work, the VAE must end with ".vae.pt", ".vae.ckpt", or
".vae.safetensors". The indicated VAE will be converted into diffusers
format and stored with the converted models file, so the ".pt" file
can be deleted after conversion.
No facility is currently provided to swap a diffusers VAE at import
time, but this can be done after the fact using the WebUI and CLI's
model editing functions.
- This PR adds support for embedding files that contain a single key
"emb_params". The only example I know of this format is the
"EasyNegative" embedding on HuggingFace, but there are certainly
others.
- This PR also adds support for loading embedding files that have been
saved in safetensors format.
- It also cleans up the code so that the logic of probing for and
selecting the right format parser is clear.
- resolve conflicts with generate.py invocation
- remove unused symbols that pyflakes complains about
- add **untested** code for passing intermediate latent image to the
step callback in the format expected.
This PR fixes#2951 and restores the step_callback argument in the
refactored generate() method. Note that this issue states that
"something is still wrong because steps and step are zero." However,
I think this is confusion over the call signature of the callback, which
since the diffusers merge has been `callback(state:PipelineIntermediateState)`
This is the test script that I used to determine that `step` is being passed
correctly:
```
from pathlib import Path
from invokeai.backend import ModelManager, PipelineIntermediateState
from invokeai.backend.globals import global_config_dir
from invokeai.backend.generator import Txt2Img
def my_callback(state:PipelineIntermediateState, total_steps:int):
print(f'callback(step={state.step}/{total_steps})')
def main():
manager = ModelManager(Path(global_config_dir()) / "models.yaml")
model = manager.get_model('stable-diffusion-1.5')
print ('=== TXT2IMG TEST ===')
steps=30
output = next(Txt2Img(model).generate(prompt='banana sushi',
iterations=None,
steps=steps,
step_callback=lambda x: my_callback(x,steps)
)
)
print(f'image={output.image}, seed={output.seed}, steps={output.params.steps}')
if __name__=='__main__':
main()
```
This PR fixes#2951 and restores the step_callback argument in the
refactored generate() method. Note that this issue states that
"something is still wrong because steps and step are zero." However,
I think this is confusion over the call signature of the callback, which
since the diffusers merge has been `callback(state:PipelineIntermediateState)`
This is the test script that I used to determine that `step` is being passed
correctly:
```
from pathlib import Path
from invokeai.backend import ModelManager, PipelineIntermediateState
from invokeai.backend.globals import global_config_dir
from invokeai.backend.generator import Txt2Img
def my_callback(state:PipelineIntermediateState, total_steps:int):
print(f'callback(step={state.step}/{total_steps})')
def main():
manager = ModelManager(Path(global_config_dir()) / "models.yaml")
model = manager.get_model('stable-diffusion-1.5')
print ('=== TXT2IMG TEST ===')
steps=30
output = next(Txt2Img(model).generate(prompt='banana sushi',
iterations=None,
steps=steps,
step_callback=lambda x: my_callback(x,steps)
)
)
print(f'image={output.image}, seed={output.seed}, steps={output.params.steps}')
if __name__=='__main__':
main()
```
- This PR turns on pickle scanning before a legacy checkpoint file
is loaded from disk within the checkpoint_to_diffusers module.
- Also miscellaneous diagnostic message cleanup.
- When a legacy checkpoint model is loaded via --convert_ckpt and its
models.yaml stanza refers to a custom VAE path (using the 'vae:'
key), the custom VAE will be converted and used within the diffusers
model. Otherwise the VAE contained within the legacy model will be
used.
- Note that the heuristic_import() method, which imports arbitrary
legacy files on disk and URLs, will continue to default to the
the standard stabilityai/sd-vae-ft-mse VAE. This can be fixed after
the fact by editing the models.yaml stanza using the Web or CLI
UIs.
- Fixes issue #2917
- The value of png_compression was always 6, despite the value provided to the
--png_compression argument. This fixes the bug.
- It also fixes an inconsistency between the maximum range of png_compression
and the help text.
- Closes#2945
Prior to this commit, all models would be loaded with the extremely unsafe `torch.load` method, except those with the exact extension `.safetensors`. Even a change in casing (eg. `saFetensors`, `Safetensors`, etc) would cause the file to be loaded with torch.load instead of the much safer `safetensors.toch.load_file`.
If a malicious actor renamed an infected `.ckpt` to something like `.SafeTensors` or `.SAFETENSORS` an unsuspecting user would think they are loading a safe .safetensor, but would in fact be parsing an unsafe pickle file, and executing an attacker's payload. This commit fixes this vulnerability by reversing the loading-method decision logic to only use the unsafe `torch.load` when the file extension is exactly `.ckpt`.