mirror of https://github.com/invoke-ai/InvokeAI synced 2025-07-25 21:05:37 +00:00

Ryan Dick 87261bdbc9 FLUX memory management improvements (#6791)

## Summary

This PR contains several improvements to memory management for FLUX
workflows.

It is now possible to achieve better FLUX model caching performance, but
this still requires users to manually configure their `ram`/`vram`
settings. E.g. a `vram` setting of 16.0 should allow for all quantized
FLUX models to be kept in memory on the GPU.

Changes:
- Check the size of a model on disk and free the requisite space in the
model cache before loading it. (This behaviour existed previously, but
was removed in https://github.com/invoke-ai/InvokeAI/pull/6072/files.
The removal did not seem to be intentional).
- Removed the hack to free 24GB of space in the cache before loading the
FLUX model.
- Split the T5 embedding and CLIP embedding steps into separate
functions so that the two models don't both have to be held in RAM at
the same time.
- Fix a bug in `InvokeLinear8bitLt` that was causing some tensors to be
left on the GPU when the model was offloaded to the CPU. (This class is
getting very messy due to the non-standard state_dict handling in
`bnb.nn.Linear8bitLt`.)
- Tidy up some dtype handling in FluxTextToImageInvocation to avoid
situations where we hold references to two copies of the same tensor
unnecessarily.
- (minor) Misc cleanup of ModelCache: improve docs and remove unused
vars.
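
The T5/CLIP split in the list above can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in (the `loaded` context manager, the fake encoders, the `resident` bookkeeping); it is not InvokeAI's API, only a picture of why two separate functions let each model leave RAM before the next one is loaded.

```python
from contextlib import contextmanager

# Hypothetical bookkeeping, for illustration only (not InvokeAI's model cache).
resident = set()  # names of models currently held in RAM
peak = 0          # largest number of models resident at the same time


@contextmanager
def loaded(name):
    """Keep a (fake) model resident only for the duration of the with-block."""
    global peak
    resident.add(name)
    peak = max(peak, len(resident))
    try:
        # The "model" is just a function that tags the prompt with its name.
        yield lambda prompt: f"{name}({prompt})"
    finally:
        resident.discard(name)


def t5_encode(prompt):
    with loaded("t5") as encoder:
        return encoder(prompt)


def clip_encode(prompt):
    with loaded("clip") as encoder:
        return encoder(prompt)


def encode_prompt(prompt):
    # T5 has already been released by the time CLIP is loaded.
    t5_embedding = t5_encode(prompt)
    clip_embedding = clip_encode(prompt)
    return t5_embedding, clip_embedding


result = encode_prompt("a cat")
```

With both encoders loaded inside a single function, `peak` would reach 2; with the split it stays at 1.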

Future:
We should revisit our default ram/vram configs. The current defaults are
very conservative, and users could see major performance improvements
from tuning these values.

## QA Instructions

I tested the FLUX workflow with the following configurations and
verified that the cache hit rates and memory usage matched the expected
behaviour:
- `ram = 16` and `vram = 16`
- `ram = 16` and `vram = 1`
- `ram = 1` and `vram = 1`

Note that the changes in this PR are not isolated to FLUX. Since we now
check the size of models on disk, we may see slight changes in model
cache offload patterns for other models as well.
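
As a rough sketch of the idea of checking a model's size on disk and freeing that much cache space before loading, assuming a simple least-recently-used eviction policy: the class name, method names, and model sizes below are illustrative assumptions, not the actual `ModelCache` implementation.

```python
from collections import OrderedDict
from pathlib import Path


def size_on_disk(path):
    """Total size in bytes of a model file or a model directory."""
    p = Path(path)
    if p.is_file():
        return p.stat().st_size
    return sum(f.stat().st_size for f in p.rglob("*") if f.is_file())


class SizeAwareCache:
    """Toy LRU cache that frees room for a model before loading it."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self._entries = OrderedDict()  # key -> size in bytes, oldest first

    def used_bytes(self):
        return sum(self._entries.values())

    def make_room(self, needed_bytes):
        """Evict least-recently-used entries until needed_bytes would fit."""
        evicted = []
        while self._entries and self.used_bytes() + needed_bytes > self.max_bytes:
            key, _ = self._entries.popitem(last=False)
            evicted.append(key)
        return evicted

    def load(self, key, needed_bytes):
        """Register a model, returning the keys evicted to make room for it."""
        if key in self._entries:
            self._entries.move_to_end(key)  # cache hit: mark as recently used
            return []
        evicted = self.make_room(needed_bytes)
        self._entries[key] = needed_bytes
        return evicted


GB = 2**30
cache = SizeAwareCache(max_bytes=16 * GB)
first = cache.load("t5", 9 * GB)            # sizes are made up for the example
second = cache.load("clip", 1 * GB)
third = cache.load("transformer", 12 * GB)  # does not fit until "t5" is evicted
```

In a real loader the `needed_bytes` argument would come from something like `size_on_disk(model_path)`. Because eviction is now driven by each model's size rather than a fixed amount, offload patterns can shift for any model type, not only FLUX.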

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_

2024-08-29 15:17:45 -04:00

.dev_scripts

Apply black

2023-07-27 10:54:01 -04:00

.github

Update macos test vm to macOS-14

2024-08-26 20:17:50 -04:00

coverage

combine pytest.ini with pyproject.toml

2023-03-05 17:00:08 +00:00

docker

fix(docs): follow-up docker readme fixes

2024-08-22 11:19:07 -04:00

docs

Warn on invalid model configs in the DB rather than crashing.

2024-07-11 21:05:55 -04:00

installer

Fix invoke.sh not detecting symlinks

2024-08-16 10:40:59 +10:00

invokeai

Tidy variable management and dtype handling in FluxTextToImageInvocation.

2024-08-29 19:08:18 +00:00

scripts

fix(app): openapi schema generation

2024-05-30 12:03:03 +10:00

tests

Update HF download logic to work for black-forest-labs/FLUX.1-schnell.

2024-08-26 20:17:50 -04:00

.dockerignore

Update dockerignore, set venv to 3.10, pass cache to yarn vite buidl

2023-07-12 16:51:15 -04:00

.editorconfig

Merge dev into main for 2.2.0 (#1642)

2022-11-30 16:12:23 -05:00

.git-blame-ignore-revs

(meta) hide the 'black' formatting commit from git blame

2023-07-27 11:29:22 -04:00

.gitattributes

Enforce Unix line endings in container (#4990)

2023-10-30 12:34:30 -04:00

.gitignore

feat: no frontend build in repo

2023-12-11 12:30:13 +11:00

.gitmodules

remove src directory, which is gumming up conda installs; addresses issue #77

2022-08-25 10:43:05 -04:00

.pre-commit-config.yaml

Adding isort GHA and pre-commit hooks

2023-09-12 13:01:58 -04:00

.prettierrc.yaml

feat: automated releases via github action

2024-02-29 21:57:20 -05:00

flake.lock

Add Nix Flake for development, which uses Python virtualenv.

2023-07-31 19:14:30 +10:00

flake.nix

fix: flake: add opencv with CUDA, new patchmatch dependency.

2023-08-01 23:56:41 +10:00

InvokeAI_Statement_of_Values.md

Add @ebr to Contributors (#2095)

2022-12-21 14:33:08 -05:00

LICENSE

Update LICENSE

2023-07-05 23:46:27 -04:00

LICENSE-SD1+SD2.txt

updated LICENSE files and added information about watermarking

2023-07-26 17:27:33 -04:00

LICENSE-SDXL.txt

updated LICENSE files and added information about watermarking

2023-07-26 17:27:33 -04:00

Makefile

fix(app): openapi schema generation

2024-05-30 12:03:03 +10:00

mkdocs.yml

docs: merge INSTALL_TROUBLESHOOTING into FAQ

2024-03-27 18:59:55 +05:30

pyproject.toml

build: remove broken scripts

2024-08-27 22:01:45 +10:00

README.md

docs: overhaul Docker documentation, add to main README

2024-07-09 09:47:29 -04:00

Stable_Diffusion_v1_Model_Card.md

Global replace [ \t]+$, add "GB" (#1751)

2022-12-19 16:36:39 +00:00

README.md

Invoke - Professional Creative AI Tools for Visual Media

To learn more about Invoke, or implement our Business solutions, visit invoke.com

Invoke is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. Invoke offers an industry-leading web-based UI, and serves as the foundation for multiple commercial products.

Invoke is available in two editions:

| Community Edition | Professional Edition |
| --- | --- |
| For users looking for a locally installed, self-hosted and self-managed service | For users or teams looking for a cloud-hosted, fully managed service |
| Free to use under a commercially-friendly license | Monthly subscription fee with three different plan levels |
| Download and install on compatible hardware | Offers additional benefits, including multi-user support, improved model training, and more |
| Includes all core studio features: generate, refine, iterate on images, and build workflows | Hosted in the cloud for easy, secure model access and scalability |
| Quick Start -> Installation and Updates | More Information -> www.invoke.com/pricing |

Documentation

Quick Links
Installation and Updates - Documentation and Tutorials - Bug Reports - Contributing

Quick Start

1. Download and unzip the installer from the bottom of the latest release.
2. Run the installer script.
   - Windows: Double-click on the install.bat script.
   - macOS: Open a Terminal window, drag the file install.sh from Finder into the Terminal, and press enter.
   - Linux: Run install.sh.
3. When prompted, enter a location for the install and select your GPU type.
4. Once the install finishes, find the directory you selected during install. The default location is C:\Users\Username\invokeai for Windows or ~/invokeai for Linux/macOS.
5. Run the launcher script (invoke.bat for Windows, invoke.sh for macOS and Linux) the same way you ran the installer script in step 2.
6. Select option 1 to start the application. Once it starts up, open your browser and go to http://localhost:9090.
7. Open the model manager tab to install a starter model and then you'll be ready to generate.

More details, including hardware requirements and manual install instructions, are available in the installation documentation.

Docker Container

We publish official container images in the GitHub Container Registry: https://github.com/invoke-ai/InvokeAI/pkgs/container/invokeai. Both CUDA and ROCm images are available. Check the above link for relevant tags.

Important

Ensure that Docker is set up to use the GPU. Refer to NVIDIA or AMD documentation.

Generate!

Run the container, modifying the command as necessary:

docker run --runtime=nvidia --gpus=all --publish 9090:9090 ghcr.io/invoke-ai/invokeai

Then open http://localhost:9090 and install some models using the Model Manager tab to begin generating.

For ROCm, add --device /dev/kfd --device /dev/dri to the docker run command.

Persist your data

You will likely want to persist your workspace outside of the container. Use the --volume /home/myuser/invokeai:/invokeai flag to mount some local directory (using its absolute path) to the /invokeai path inside the container. Your generated images and models will reside there. You can use this directory with other InvokeAI installations, or switch between runtime directories as needed.
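
For scripted launches, the `docker run` flags described in this section can be assembled programmatically. The following is a small sketch, not an official tool: the function name `docker_run_command` is hypothetical, the host is assumed to use POSIX paths, and the check simply enforces the absolute-path requirement mentioned above.

```python
import shlex
from pathlib import PurePosixPath

IMAGE = "ghcr.io/invoke-ai/invokeai"


def docker_run_command(host_dir):
    """Build the docker run command with the workspace mounted at /invokeai."""
    # The --volume flag needs the host directory as an absolute path.
    if not PurePosixPath(host_dir).is_absolute():
        raise ValueError(f"host directory must be an absolute path: {host_dir!r}")
    args = [
        "docker", "run",
        "--runtime=nvidia", "--gpus=all",
        "--publish", "9090:9090",
        "--volume", f"{host_dir}:/invokeai",
        IMAGE,
    ]
    # shlex.join quotes any argument that needs it for a POSIX shell.
    return shlex.join(args)


command = docker_run_command("/home/myuser/invokeai")
print(command)
```

Passing a relative path such as `invokeai` raises `ValueError` instead of producing a command with an invalid mount.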

DIY

Build your own image and customize the environment to match your needs using our docker-compose stack. See README.md in the docker directory.

Troubleshooting, FAQ and Support

Please review our FAQ for solutions to common installation problems and other issues.

For more help, please join our Discord.

Features

Full details on features can be found in our documentation.

Web Server & UI

Invoke runs a locally hosted web server & React UI with an industry-leading user experience.

Unified Canvas

The Unified Canvas is a fully integrated canvas implementation with support for all core generation capabilities, in/out-painting, brush tools, and more. This creative tool unlocks the capability for artists to create with AI as a creative collaborator, and can be used to augment AI-generated imagery, sketches, photography, renders, and more.

Workflows & Nodes

Invoke offers a fully featured workflow management solution, enabling users to combine the power of node-based workflows with the ease of a UI. This allows for customizable generation pipelines to be developed and shared by users looking to create specific workflows to support their production use-cases.

Board & Gallery Management

Invoke features an organized gallery system for easily storing, accessing, and remixing your content in the Invoke workspace. Images can be dragged and dropped onto any image-based UI element in the application, and rich metadata within the image allows for easy recall of key prompts or settings used in your workflow.

Other features

Support for both ckpt and diffusers models
SD1.5, SD2.0, and SDXL support
Upscaling Tools
Embedding Manager & Support
Model Manager & Support
Workflow creation & management
Node-Based Architecture

Contributing

Anyone who wishes to contribute to this project - whether documentation, features, bug fixes, code cleanup, testing, or code reviews - is very much encouraged to do so.

Get started with contributing by reading our contribution documentation, joining the #dev-chat channel, or visiting the GitHub discussion board.

We hope you enjoy using Invoke as much as we enjoy creating it, and we hope you will elect to become part of our community.

Thanks

Invoke is a combined effort of passionate and talented people from across the world. We thank them for their time, hard work and effort.