Gregg Helt c647056287
Feat/easy param (#3504)
* Testing change to LatentsToText to allow setting different cfg_scale values per diffusion step.

* Adding first attempt at float param easing node, using Penner easing functions.

* Core implementation of ControlNet and MultiControlNet.

* Added support for ControlNet and MultiControlNet to legacy non-nodal Txt2Img in backend/generator. Although backend/generator will likely disappear by v3.x, right now they are very useful for testing core ControlNet and MultiControlNet functionality while node codebase is rapidly evolving.

* Added example of using ControlNet with legacy Txt2Img generator

* Resolving rebase conflict

* Added first controlnet preprocessor node for canny edge detection.

* Initial port of controlnet node support from generator-based TextToImageInvocation node to latent-based TextToLatentsInvocation node

* Switching to ControlField for output from controlnet nodes.

* Resolving conflicts in rebase to origin/main

* Refactored ControlNet nodes so they subclass from PreprocessedControlInvocation, and only need to override run_processor(image) (instead of reimplementing invoke())

* changes to base class for controlnet nodes

* Added HED, LineArt, and OpenPose ControlNet nodes

* Added an additional "raw_processed_image" output port to controlnets, mainly so could route ImageField to a ShowImage node

* Added more preprocessor nodes for:
      MidasDepth
      ZoeDepth
      MLSD
      NormalBae
      Pidi
      LineartAnime
      ContentShuffle
Removed pil_output options, ControlNet preprocessors should always output as PIL. Removed diagnostics and other general cleanup.

* Prep for splitting pre-processor and controlnet nodes

* Refactored controlnet nodes: split out controlnet stuff into separate node, stripped controlnet stuff from image processing/analysis nodes.

* Added resizing of controlnet image based on noise latent. Fixes a tensor mismatch issue.

* More rebase repair.

* Added support for using multiple control nets. Unfortunately this breaks direct usage of Control node output port  ==> TextToLatent control input port -- passing through a Collect node is now required. Working on fixing this...

* Fixed use of ControlNet control_weight parameter

* Fixed lint-ish formatting error

* Refactored controlnet node to output ControlField that bundles control info.

* Cleaning up TextToLatent arg testing

* Cleaning up mistakes after rebase.

* Removed last bits of dtype and device hardwiring from controlnet section

* Refactored ControlNet support to consolidate multiple parameters into data struct. Also redid how multiple controlnets are handled.

* Added support for specifying which step iteration to start using
each ControlNet, and which step to end using each controlnet (specified as fraction of total steps)

* Cleaning up prior to submitting ControlNet PR. Mostly turning off diagnostic printing. Also fixed error when there is no controlnet input.

* Added dependency on controlnet-aux v0.0.3

* Commented out ZoeDetector. Will re-instate once there's a controlnet-aux release that supports it.

* Switched ControlNet node modelname input from free text to default list of popular ControlNet model names.

* Fix to work with current stable release of controlnet_aux (v0.0.3). Turned off pre-processor params that were added post v0.0.3. Also changed defaults for shuffle.

* Refactored most of controlnet code into its own method to declutter TextToLatents.invoke(), and make upcoming integration with LatentsToLatents easier.

* Cleaning up after ControlNet refactor in TextToLatentsInvocation

* Extended node-based ControlNet support to LatentsToLatentsInvocation.

* chore(ui): regen api client

* fix(ui): add value to conditioning field

* fix(ui): add control field type

* fix(ui): fix node ui type hints

* fix(nodes): controlnet input accepts list or single controlnet

* Moved to controlnet_aux v0.0.4, reinstated Zoe controlnet preprocessor. Also in pyproject.toml  had to specify downgrade of timm to 0.6.13 _after_ controlnet-aux installs timm >= 0.9.2, because timm >0.6.13 breaks Zoe preprocessor.

* Added Mediapipe image processor for use as ControlNet preprocessor.
Also hacked in ability to specify HF subfolder when loading ControlNet models from string.

* Fixed bug where MediapipFaceProcessorInvocation was ignoring max_faces and min_confidence params.

* Added nodes for float params: ParamFloatInvocation and FloatCollectionOutput. Also added FloatOutput.

* Added mediapipe install requirement. Should be able to remove once controlnet_aux package adds mediapipe to its requirements.

* Added float to FIELD_TYPE_MAP in constants.ts

* Progress toward improvement in fieldTemplateBuilder.ts  getFieldType()

* Fixed controlnet preprocessors and controlnet handling in TextToLatents to work with revised Image services.

* Cleaning up from merge, re-adding cfg_scale to FIELD_TYPE_MAP

* Making sure cfg_scale of type list[float] can be used in image metadata, to support param easing for cfg_scale

* Fixed math for per-step param easing.

* Added option to show plot of param value at each step

* Just cleaning up after adding param easing plot option, removing vestigial code.

* Modified control_weight ControlNet param to be polymorphic: it can now be either a single float weight applied to all steps, or a list of floats of size total_steps that specifies the weight for each step.

* Added more informative error message when _validat_edge() throws an error.

* Just improving param easing bar chart title to include easing type.

* Added requirement for easing-functions package

* Taking out some diagnostic prints.

* Added option to use both easing function and mirror of easing function together.

* Fixed recently introduced problem (when pulled in main), triggered by num_steps in StepParamEasingInvocation not having a default value -- just added default.

---------

Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
2023-06-11 16:27:44 +10:00
.dev_scripts Replace --full_precision with --precision that works even if not specified 2022-09-20 17:08:00 -04:00
.github Update CODEOWNERS 2023-05-26 08:59:10 -04:00
coverage combine pytest.ini with pyproject.toml 2023-03-05 17:00:08 +00:00
docker fix Dockerfile 2023-03-04 23:51:07 +01:00
docs docs(nodes): update INVOCATIONS.md 2023-06-07 18:44:43 +10:00
installer the "restore" env variable in .bat launcher confuses pydantic 2023-06-04 22:53:46 -04:00
invokeai Feat/easy param (#3504) 2023-06-11 16:27:44 +10:00
notebooks Merge dev into main for 2.2.0 (#1642) 2022-11-30 16:12:23 -05:00
scripts merge with main 2023-06-04 13:59:31 -04:00
tests feat(nodes): add tests for depth-first execution 2023-06-09 14:53:45 +10:00
.dockerignore fix Dockerfile 2023-03-04 23:51:07 +01:00
.editorconfig Merge dev into main for 2.2.0 (#1642) 2022-11-30 16:12:23 -05:00
.git-blame-ignore-revs add .git-blame-ignore-revs file to maintain provenance 2023-03-03 16:23:48 -05:00
.gitattributes Global replace [ \t]+$, add "GB" (#1751) 2022-12-19 16:36:39 +00:00
.gitignore partial port of invokeai-configure 2023-05-16 01:50:01 -04:00
.gitmodules remove src directory, which is gumming up conda installs; addresses issue #77 2022-08-25 10:43:05 -04:00
.prettierrc.yaml change printWidth for markdown files to 80 2022-09-17 02:23:00 +02:00
CODE_OF_CONDUCT.md Merge dev into main for 2.2.0 (#1642) 2022-11-30 16:12:23 -05:00
InvokeAI_Statement_of_Values.md Add @ebr to Contributors (#2095) 2022-12-21 14:33:08 -05:00
LICENSE adding license using GitHub template 2022-10-17 12:09:24 -04:00
LICENSE-ModelWeights.txt added assertion checks for out-of-bound arguments; added various copyright and license agreement files 2022-08-24 09:22:27 -04:00
mkdocs.yml (docs) add redirects for moved pages (#2063) 2022-12-18 08:04:58 +00:00
pyproject.toml Feat/easy param (#3504) 2023-06-11 16:27:44 +10:00
README.md docs: add note on README about migration 2023-04-27 11:05:32 +10:00
shell.nix nix: add shell.nix file 2022-10-25 07:08:31 -04:00
Stable_Diffusion_v1_Model_Card.md Global replace [ \t]+$, add "GB" (#1751) 2022-12-19 16:36:39 +00:00

project logo

InvokeAI: A Stable Diffusion Toolkit

Note: The UI is not fully functional on main. If you need a stable UI based on main, use the pre-nodes tag while we migrate to a new backend.

InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. InvokeAI offers an industry-leading Web Interface and an interactive Command Line Interface, and also serves as the foundation for multiple commercial products.

Quick links: [How to Install] [Discord Server] [Documentation and Tutorials] [Code and Downloads] [Bug Reports] [Discussion, Ideas & Q&A]

Note: InvokeAI is rapidly evolving. Please use the Issues tab to report bugs and make feature requests. Be sure to use the provided templates. They will help us diagnose issues faster.

canvas preview

Table of Contents

  1. Quick Start
  2. Installation
  3. Hardware Requirements
  4. Features
  5. Latest Changes
  6. Troubleshooting
  7. Contributing
  8. Contributors
  9. Support
  10. Further Reading

Getting Started with InvokeAI

For full installation and upgrade instructions, please see: InvokeAI Installation Overview

Automatic Installer (suggested for first-time users)

  1. Go to the bottom of the Latest Release Page.

  2. Download the .zip file for your OS (Windows/macOS/Linux).

  3. Unzip the file.

  4. If you are on Windows, double-click on the install.bat script. On macOS, open a Terminal window, drag the file install.sh from Finder into the Terminal, and press return. On Linux, run install.sh.

  5. You'll be asked to confirm the location of the folder in which to install InvokeAI and its image generation model files. Pick a location with at least 15 GB of free disk space, and more if you plan on installing lots of models.

  6. Wait while the installer does its thing. After installing the software, the installer will launch a script that lets you configure InvokeAI and select a set of starting image generation models.

  7. Find the folder that InvokeAI was installed into (it is not the same as the unpacked zip file directory!). The default location of this folder (if you didn't change it in step 5) is ~/invokeai on Linux/Mac systems, and C:\Users\YourName\invokeai on Windows. This directory will contain launcher scripts named invoke.sh and invoke.bat.

  8. On Windows systems, double-click on the invoke.bat file. On macOS, open a Terminal window, drag invoke.sh from the folder into the Terminal, and press return. On Linux, run invoke.sh.

  9. Press 2 to open the "browser-based UI", press enter/return, wait a minute or two for Stable Diffusion to start up, then open your browser and go to http://localhost:9090.

  10. Type banana sushi in the box on the top left and click Invoke.
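
If the browser cannot reach the address from step 9, it can help to check whether anything is listening on port 9090 yet. The following is a minimal sketch using only the Python standard library. It is not part of InvokeAI; the helper name port_is_open is hypothetical, and the host and port are taken from the URL in step 9.

```python
import socket


def port_is_open(host="localhost", port=9090, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution errors.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Something is listening on localhost:9090.")
    else:
        print("Nothing is listening on localhost:9090 yet.")
```

An open port only shows that some process accepted the connection. It does not confirm that the process is the InvokeAI web server or that it has finished starting up.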

Command-Line Installation (for users familiar with Terminals)

You must have Python 3.9 or 3.10 installed on your machine. Earlier or later versions are not supported.
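
The version requirement above can be checked before creating the virtual environment. This is a minimal sketch, not a tool shipped with InvokeAI; the helper name is_supported_python is hypothetical, and the supported range is copied from the sentence above.

```python
import sys

# Supported range taken from the requirement above: Python 3.9 or 3.10.
MIN_VERSION = (3, 9)
MAX_VERSION = (3, 10)


def is_supported_python(version=None):
    """Return True if the (major, minor) version is 3.9 or 3.10.

    `version` is any tuple starting with (major, minor); it defaults to
    the interpreter running this script.
    """
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported_python():
        print(f"Python {found} is supported.")
    else:
        print(f"Python {found} is not supported; install 3.9 or 3.10.")
```

Run the script with the same interpreter you intend to use for python -m venv, since several Python versions may be installed side by side.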

  1. Open a command-line window on your machine. PowerShell is recommended for Windows.

  2. Create a directory to install InvokeAI into. You'll need at least 15 GB of free space:

    mkdir invokeai
    
  3. Create a virtual environment named .venv inside this directory:

    cd invokeai
    python -m venv .venv --prompt InvokeAI
    
  4. Activate the virtual environment (do it every time you run InvokeAI):

    For Linux/Mac users:

    source .venv/bin/activate
    

    For Windows users:

    .venv\Scripts\activate
    
  5. Install the InvokeAI module and its dependencies. Choose the command suited for your platform & GPU.

    For Windows/Linux with an NVIDIA GPU:

    pip install "InvokeAI[xformers]" --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu117
    

    For Linux with an AMD GPU:

    pip install InvokeAI --use-pep517 --extra-index-url https://download.pytorch.org/whl/rocm5.4.2
    

    For non-GPU systems:

    pip install InvokeAI --use-pep517 --extra-index-url https://download.pytorch.org/whl/cpu
    

    For Macintoshes, either Intel or M1/M2:

    pip install InvokeAI --use-pep517
    
  6. Configure InvokeAI and install a starting set of image generation models (you only need to do this once):

    invokeai-configure
    
  7. Launch the web server (do it every time you run InvokeAI):

    invokeai --web
    
  8. Point your browser to http://localhost:9090 to bring up the web interface.

  9. Type banana sushi in the box on the top left and click Invoke.

Be sure to activate the virtual environment each time before re-launching InvokeAI, using source .venv/bin/activate or .venv\Scripts\activate.
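
The platform-specific choices in step 5 can be summarized as a small lookup. The sketch below is illustrative only and is not an InvokeAI utility: the function name pip_install_command and the gpu labels "nvidia" and "amd" are hypothetical, and treating every unlisted combination as a non-GPU system is an assumption. The package names, flags and index URLs are the ones listed in step 5.

```python
import platform
import shlex

PYTORCH_INDEX = "https://download.pytorch.org/whl"


def pip_install_command(system, gpu=None):
    """Build the pip command from step 5 as an argument list.

    system: value of platform.system() ("Windows", "Linux" or "Darwin").
    gpu: "nvidia", "amd" or None (hypothetical labels for this sketch).
    """
    base = ["pip", "install"]
    if system == "Darwin":
        # Macs (Intel or M1/M2) use the default package index.
        return base + ["InvokeAI", "--use-pep517"]
    if gpu == "nvidia" and system in ("Windows", "Linux"):
        return base + ["InvokeAI[xformers]", "--use-pep517",
                       "--extra-index-url", f"{PYTORCH_INDEX}/cu117"]
    if gpu == "amd" and system == "Linux":
        return base + ["InvokeAI", "--use-pep517",
                       "--extra-index-url", f"{PYTORCH_INDEX}/rocm5.4.2"]
    # Assumption: any other combination is treated as a non-GPU system.
    return base + ["InvokeAI", "--use-pep517",
                   "--extra-index-url", f"{PYTORCH_INDEX}/cpu"]


if __name__ == "__main__":
    # GPU detection is out of scope here; pass the label by hand.
    command = pip_install_command(platform.system(), gpu=None)
    # shlex.join applies POSIX-style quoting for display.
    print(shlex.join(command))
```

Returning a list rather than a string keeps the extras specifier InvokeAI[xformers] as a single argument, so the list can be passed to subprocess.run without shell quoting concerns.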

Detailed Installation Instructions

InvokeAI is supported across Linux, Windows and macOS. Linux users can use either an NVIDIA-based card (with CUDA support) or an AMD card (using the ROCm driver). For full installation and upgrade instructions, please see: InvokeAI Installation Overview

Hardware Requirements

InvokeAI is supported across Linux, Windows and macOS. Linux users can use either an NVIDIA-based card (with CUDA support) or an AMD card (using the ROCm driver).

System

You will need one of the following:

  • An NVIDIA-based graphics card with 4 GB or more of VRAM.
  • An Apple computer with an M1 chip.
  • An AMD-based graphics card with 4 GB or more of VRAM. (Linux only)

We do not recommend the GTX 1650 or 1660 series video cards. They are unable to run in half-precision mode and do not have sufficient VRAM to render 512x512 images.

Memory

  • At least 12 GB of main memory (RAM).

Disk

  • At least 12 GB of free disk space for the machine learning model, Python, and all its dependencies.
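
The disk requirement can be checked ahead of time with the Python standard library. This is a minimal sketch, not an InvokeAI tool: the helper names free_disk_gb and has_enough_disk are hypothetical, the 12 GB default is the figure stated above, and counting a gigabyte as 1024**3 bytes is an assumption.

```python
import shutil

# Assumption: 1 GB is counted as 1024**3 bytes.
BYTES_PER_GB = 1024 ** 3


def free_disk_gb(path="."):
    """Return the free space, in GB, on the filesystem holding `path`."""
    return shutil.disk_usage(path).free / BYTES_PER_GB


def has_enough_disk(path=".", required_gb=12):
    """Return True if `path` has at least `required_gb` GB free."""
    return free_disk_gb(path) >= required_gb


if __name__ == "__main__":
    # Check the directory where InvokeAI will be installed.
    print(f"Free space: {free_disk_gb('.'):.1f} GB")
    print("Enough for InvokeAI:", has_enough_disk("."))
```

Pass the intended install directory as path, because free space is reported per filesystem and the install location may sit on a different drive than the current directory.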

Features

Feature documentation can be reviewed by navigating to the InvokeAI Documentation page.

Web Server & UI

InvokeAI offers a locally hosted Web Server & React Frontend, with an industry-leading user experience. The Web-based UI allows for simple and intuitive workflows, and is responsive for use on mobile devices and tablets accessing the web server.

Unified Canvas

The Unified Canvas is a fully integrated canvas implementation with support for all core generation capabilities, in/outpainting, brush tools, and more. This creative tool unlocks the capability for artists to create with AI as a creative collaborator, and can be used to augment AI-generated imagery, sketches, photography, renders, and more.

Advanced Prompt Syntax

InvokeAI's advanced prompt syntax allows for token weighting, cross-attention control, and prompt blending, allowing for fine-tuned tweaking of your invocations and exploration of the latent space.

Command Line Interface

For users utilizing a terminal-based environment, or who want to take advantage of CLI features, InvokeAI offers an extensive and actively supported command-line interface that provides the full suite of generation functionality available in the tool.

Other features

  • Support for both ckpt and diffusers models
  • SD 2.0, 2.1 support
  • Noise Control & Thresholding
  • Popular Sampler Support
  • Upscaling & Face Restoration Tools
  • Embedding Manager & Support
  • Model Manager & Support

Coming Soon

  • Node-Based Architecture & UI
  • And more...

Latest Changes

For our latest changes, view our Release Notes and the CHANGELOG.

Troubleshooting

Please check out our Q&A to get solutions for common installation problems and other issues.

Contributing

Anyone who wishes to contribute to this project, whether documentation, features, bug fixes, code cleanup, testing, or code reviews, is very much encouraged to do so.

To join, just raise your hand on the InvokeAI Discord server (#dev-chat) or the GitHub discussion board.

If you'd like to help with translation, please see our translation guide.

If you are unfamiliar with how to contribute to GitHub projects, here is a Getting Started Guide. A full set of contribution guidelines, along with templates, is in progress. You can make your pull request against the "main" branch.

We hope you enjoy using our software as much as we enjoy creating it, and we hope that some of those of you who are reading this will elect to become part of our community.

Welcome to InvokeAI!

Contributors

This fork is a combined effort of various people from across the world. Check out the list of all these amazing people. We thank them for their time, hard work and effort.

Thanks to Weblate for generously providing translation services to this project.

Support

For support, please use this repository's GitHub Issues tracking service, or join the Discord.

Original portions of the software are Copyright (c) 2023 by respective contributors.