Ryan Dick 87261bdbc9
FLUX memory management improvements (#6791)
## Summary

This PR contains several improvements to memory management for FLUX
workflows.

It is now possible to achieve better FLUX model caching performance, but
this still requires users to manually configure their `ram`/`vram`
settings. E.g. a `vram` setting of 16.0 should allow for all quantized
FLUX models to be kept in memory on the GPU.

Changes:
- Check the size of a model on disk and free the requisite space in the
model cache before loading it. (This behaviour existed previously, but
was removed in https://github.com/invoke-ai/InvokeAI/pull/6072/files.
The removal does not appear to have been intentional.)
- Removed the hack to free 24GB of space in the cache before loading the
FLUX model.
- Split the T5 embedding and CLIP embedding steps into separate
functions so that the two models don't both have to be held in RAM at
the same time.
- Fix a bug in `InvokeLinear8bitLt` that was causing some tensors to be
left on the GPU when the model was offloaded to the CPU. (This class is
getting very messy due to the non-standard state_dict handling in
`bnb.nn.Linear8bitLt`.)
- Tidy up some dtype handling in FluxTextToImageInvocation to avoid
situations where we hold references to two copies of the same tensor
unnecessarily.
- (minor) Misc cleanup of ModelCache: improve docs and remove unused
vars.

Future:
We should revisit our default ram/vram configs. The current defaults are
very conservative, and users could see major performance improvements
from tuning these values.

## QA Instructions

I tested the FLUX workflow with the following configurations and
verified that the cache hit rates and memory usage matched the expected
behaviour:
- `ram = 16` and `vram = 16`
- `ram = 16` and `vram = 1`
- `ram = 1` and `vram = 1`

Note that the changes in this PR are not isolated to FLUX. Since we now
check the size of models on disk, we may see slight changes in model
cache offload patterns for other models as well.
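
As a rough illustration of why checking on-disk size changes offload patterns, the following sketch frees space in a byte-budgeted LRU cache before a model is loaded. It is an assumption-laden toy and not InvokeAI's `ModelCache`: `ToyModelCache`, `size_on_disk`, `make_room`, and the `loader` callback are hypothetical names, and on-disk size is used as a simple proxy for in-memory size.

```python
from collections import OrderedDict
from pathlib import Path


def size_on_disk(path: Path) -> int:
    """Return the size in bytes of a file, or of all files under a directory."""
    if path.is_file():
        return path.stat().st_size
    return sum(p.stat().st_size for p in path.rglob("*") if p.is_file())


class ToyModelCache:
    """Hypothetical LRU cache with a byte budget (not InvokeAI's ModelCache)."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self._entries = OrderedDict()  # key -> (model, size_in_bytes)

    def used_bytes(self) -> int:
        return sum(size for _, size in self._entries.values())

    def cached_keys(self) -> list:
        return list(self._entries)

    def make_room(self, needed: int) -> list:
        """Evict least-recently-used entries until `needed` bytes fit."""
        evicted = []
        while self._entries and self.used_bytes() + needed > self.max_bytes:
            key, _ = self._entries.popitem(last=False)
            evicted.append(key)
        return evicted

    def load(self, key: str, path: Path, loader):
        if key in self._entries:
            self._entries.move_to_end(key)  # cache hit: mark as recently used
            return self._entries[key][0]
        size = size_on_disk(path)
        self.make_room(size)  # free space *before* the model is loaded
        model = loader(path)
        # A model larger than the whole budget is still cached in this toy.
        self._entries[key] = (model, size)
        return model
```

Because eviction is driven by the size of the incoming model, any model type that passes through such a cache can see different offload behaviour, not only FLUX.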

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
2024-08-29 15:17:45 -04:00
.dev_scripts Apply black 2023-07-27 10:54:01 -04:00
.github Update macos test vm to macOS-14 2024-08-26 20:17:50 -04:00
coverage combine pytest.ini with pyproject.toml 2023-03-05 17:00:08 +00:00
docker fix(docs): follow-up docker readme fixes 2024-08-22 11:19:07 -04:00
docs Warn on invalid model configs in the DB rather than crashing. 2024-07-11 21:05:55 -04:00
installer Fix invoke.sh not detecting symlinks 2024-08-16 10:40:59 +10:00
invokeai Tidy variable management and dtype handling in FluxTextToImageInvocation. 2024-08-29 19:08:18 +00:00
scripts fix(app): openapi schema generation 2024-05-30 12:03:03 +10:00
tests Update HF download logic to work for black-forest-labs/FLUX.1-schnell. 2024-08-26 20:17:50 -04:00
.dockerignore Update dockerignore, set venv to 3.10, pass cache to yarn vite buidl 2023-07-12 16:51:15 -04:00
.editorconfig Merge dev into main for 2.2.0 (#1642) 2022-11-30 16:12:23 -05:00
.git-blame-ignore-revs (meta) hide the 'black' formatting commit from git blame 2023-07-27 11:29:22 -04:00
.gitattributes Enforce Unix line endings in container (#4990) 2023-10-30 12:34:30 -04:00
.gitignore feat: no frontend build in repo 2023-12-11 12:30:13 +11:00
.gitmodules remove src directory, which is gumming up conda installs; addresses issue #77 2022-08-25 10:43:05 -04:00
.pre-commit-config.yaml Adding isort GHA and pre-commit hooks 2023-09-12 13:01:58 -04:00
.prettierrc.yaml feat: automated releases via github action 2024-02-29 21:57:20 -05:00
InvokeAI_Statement_of_Values.md Add @ebr to Contributors (#2095) 2022-12-21 14:33:08 -05:00
LICENSE Update LICENSE 2023-07-05 23:46:27 -04:00
LICENSE-SD1+SD2.txt updated LICENSE files and added information about watermarking 2023-07-26 17:27:33 -04:00
LICENSE-SDXL.txt updated LICENSE files and added information about watermarking 2023-07-26 17:27:33 -04:00
Makefile fix(app): openapi schema generation 2024-05-30 12:03:03 +10:00
README.md docs: overhaul Docker documentation, add to main README 2024-07-09 09:47:29 -04:00
Stable_Diffusion_v1_Model_Card.md Global replace [ \t]+$, add "GB" (#1751) 2022-12-19 16:36:39 +00:00
flake.lock Add Nix Flake for development, which uses Python virtualenv. 2023-07-31 19:14:30 +10:00
flake.nix fix: flake: add opencv with CUDA, new patchmatch dependency. 2023-08-01 23:56:41 +10:00
mkdocs.yml docs: merge INSTALL_TROUBLESHOOTING into FAQ 2024-03-27 18:59:55 +05:30
pyproject.toml build: remove broken scripts 2024-08-27 22:01:45 +10:00

README.md


Invoke - Professional Creative AI Tools for Visual Media

To learn more about Invoke, or implement our Business solutions, visit invoke.com


Invoke is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. Invoke offers an industry leading web-based UI, and serves as the foundation for multiple commercial products.

Invoke is available in two editions:

| Community Edition | Professional Edition |
| --- | --- |
| For users looking for a locally installed, self-hosted and self-managed service | For users or teams looking for a cloud-hosted, fully managed service |
| - Free to use under a commercially-friendly license | - Monthly subscription fee with three different plan levels |
| - Download and install on compatible hardware | - Offers additional benefits, including multi-user support, improved model training, and more |
| - Includes all core studio features: generate, refine, iterate on images, and build workflows | - Hosted in the cloud for easy, secure model access and scalability |
| Quick Start -> Installation and Updates | More Information -> www.invoke.com/pricing |


Documentation

Quick Links
Installation and Updates - Documentation and Tutorials - Bug Reports - Contributing

Quick Start

  1. Download and unzip the installer from the bottom of the latest release.

  2. Run the installer script.

    • Windows: Double-click on the install.bat script.
    • macOS: Open a Terminal window, drag the file install.sh from Finder into the Terminal, and press enter.
    • Linux: Run install.sh.
  3. When prompted, enter a location for the install and select your GPU type.

  4. Once the install finishes, find the directory you selected during install. The default location is C:\Users\Username\invokeai for Windows or ~/invokeai for Linux/macOS.

  5. Run the launcher script (invoke.bat for Windows, invoke.sh for macOS and Linux) the same way you ran the installer script in step 2.

  6. Select option 1 to start the application. Once it starts up, open your browser and go to http://localhost:9090.

  7. Open the model manager tab to install a starter model and then you'll be ready to generate.

More detail, including hardware requirements and manual install instructions, is available in the installation documentation.

Docker Container

We publish official container images in GitHub Container Registry: https://github.com/invoke-ai/InvokeAI/pkgs/container/invokeai. Both CUDA and ROCm images are available. Check the above link for relevant tags.

[!IMPORTANT] Ensure that Docker is set up to use the GPU. Refer to NVIDIA or AMD documentation.

Generate!

Run the container, modifying the command as necessary:

docker run --runtime=nvidia --gpus=all --publish 9090:9090 ghcr.io/invoke-ai/invokeai

Then open http://localhost:9090 and install some models using the Model Manager tab to begin generating.

For ROCm, add --device /dev/kfd --device /dev/dri to the docker run command.

Persist your data

You will likely want to persist your workspace outside of the container. Use the --volume /home/myuser/invokeai:/invokeai flag to mount some local directory (using its absolute path) to the /invokeai path inside the container. Your generated images and models will reside there. You can use this directory with other InvokeAI installations, or switch between runtime directories as needed.
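
As a small illustration, the command from this section can be assembled programmatically with the volume flag included. This is only a sketch: `docker_run_command` is a hypothetical helper, it covers just the NVIDIA variant shown above, and it assumes a POSIX-style host path.

```python
import shlex
from pathlib import PurePosixPath


def docker_run_command(host_dir: str) -> list:
    """Build the docker run command with a workspace mounted at /invokeai."""
    # The mount requires an absolute host path (POSIX-style path assumed).
    if not PurePosixPath(host_dir).is_absolute():
        raise ValueError(f"host_dir must be an absolute path, got {host_dir!r}")
    return [
        "docker", "run",
        "--runtime=nvidia", "--gpus=all",
        "--publish", "9090:9090",
        "--volume", f"{host_dir}:/invokeai",
        "ghcr.io/invoke-ai/invokeai",
    ]


# Print a copy-pasteable command line for the example directory used above.
print(shlex.join(docker_run_command("/home/myuser/invokeai")))
```

Returning an argument list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting issues.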

DIY

Build your own image and customize the environment to match your needs using our docker-compose stack. See README.md in the docker directory.

Troubleshooting, FAQ and Support

Please review our FAQ for solutions to common installation problems and other issues.

For more help, please join our Discord.

Features

Full details on features can be found in our documentation.

Web Server & UI

Invoke runs a locally hosted web server & React UI with an industry-leading user experience.

Unified Canvas

The Unified Canvas is a fully integrated canvas implementation with support for all core generation capabilities, in/out-painting, brush tools, and more. This creative tool unlocks the capability for artists to create with AI as a creative collaborator, and can be used to augment AI-generated imagery, sketches, photography, renders, and more.

Workflows & Nodes

Invoke offers a fully featured workflow management solution, enabling users to combine the power of node-based workflows with the ease of a UI. This allows for customizable generation pipelines to be developed and shared by users looking to create specific workflows to support their production use-cases.

Invoke features an organized gallery system for easily storing, accessing, and remixing your content in the Invoke workspace. Images can be dragged and dropped onto any image-based UI element in the application, and rich metadata within the image allows for easy recall of key prompts or settings used in your workflow.

Other features

  • Support for both ckpt and diffusers models
  • SD1.5, SD2.0, and SDXL support
  • Upscaling Tools
  • Embedding Manager & Support
  • Model Manager & Support
  • Workflow creation & management
  • Node-Based Architecture

Contributing

Anyone who wishes to contribute to this project - whether documentation, features, bug fixes, code cleanup, testing, or code reviews - is very much encouraged to do so.

Get started with contributing by reading our contribution documentation and joining the #dev-chat channel or the GitHub discussion board.

We hope you enjoy using Invoke as much as we enjoy creating it, and we hope you will elect to become part of our community.

Thanks

Invoke is a combined effort of passionate and talented people from across the world. We thank them for their time, hard work and effort.

Original portions of the software are Copyright © 2024 by respective contributors.