Merge branch 'main' into feat/onnx
9
.github/workflows/close-inactive-issues.yml
vendored
@ -1,11 +1,11 @@
|
||||
name: Close inactive issues
|
||||
on:
|
||||
schedule:
|
||||
- cron: "00 6 * * *"
|
||||
- cron: "00 4 * * *"
|
||||
|
||||
env:
|
||||
DAYS_BEFORE_ISSUE_STALE: 14
|
||||
DAYS_BEFORE_ISSUE_CLOSE: 28
|
||||
DAYS_BEFORE_ISSUE_STALE: 30
|
||||
DAYS_BEFORE_ISSUE_CLOSE: 14
|
||||
|
||||
jobs:
|
||||
close-issues:
|
||||
@ -14,7 +14,7 @@ jobs:
|
||||
issues: write
|
||||
pull-requests: write
|
||||
steps:
|
||||
- uses: actions/stale@v5
|
||||
- uses: actions/stale@v8
|
||||
with:
|
||||
days-before-issue-stale: ${{ env.DAYS_BEFORE_ISSUE_STALE }}
|
||||
days-before-issue-close: ${{ env.DAYS_BEFORE_ISSUE_CLOSE }}
|
||||
@ -23,5 +23,6 @@ jobs:
|
||||
close-issue-message: "Due to inactivity, this issue was automatically closed. If you are still experiencing the issue, please recreate the issue."
|
||||
days-before-pr-stale: -1
|
||||
days-before-pr-close: -1
|
||||
exempt-issue-labels: "Active Issue"
|
||||
repo-token: ${{ secrets.GITHUB_TOKEN }}
|
||||
operations-per-run: 500
|
||||
|
2
.github/workflows/mkdocs-material.yml
vendored
@ -2,7 +2,7 @@ name: mkdocs-material
|
||||
on:
|
||||
push:
|
||||
branches:
|
||||
- 'refs/heads/v2.3'
|
||||
- 'refs/heads/main'
|
||||
|
||||
permissions:
|
||||
contents: write
|
||||
|
59
README.md
@ -36,15 +36,6 @@
|
||||
|
||||
</div>
|
||||
|
||||
_**Note: This is an alpha release. Bugs are expected and not all
|
||||
features are fully implemented. Please use the GitHub [Issues
|
||||
pages](https://github.com/invoke-ai/InvokeAI/issues?q=is%3Aissue+is%3Aopen)
|
||||
to report unexpected problems. Also note that InvokeAI root directory
|
||||
which contains models, outputs and configuration files, has changed
|
||||
between the 2.x and 3.x release. If you wish to use your v2.3 root
|
||||
directory with v3.0, please follow the directions in [Migrating a 2.3
|
||||
root directory to 3.0](#migrating-to-3).**_
|
||||
|
||||
InvokeAI is a leading creative engine built to empower professionals
|
||||
and enthusiasts alike. Generate and create stunning visual media using
|
||||
the latest AI-driven technologies. InvokeAI offers an industry leading
|
||||
@ -264,19 +255,24 @@ old models directory (which contains the models selected at install
|
||||
time) will be renamed `models.orig` and can be deleted once you have
|
||||
confirmed that the migration was successful.
|
||||
|
||||
If you wish, you can pass the 2.3 root directory to both `--from` and
|
||||
`--to` in order to update in place. Warning: this directory will no
|
||||
longer be usable with InvokeAI 2.3.
|
||||
|
||||
#### Migrating in place
|
||||
|
||||
For the adventurous, you may do an in-place upgrade from 2.3 to 3.0
|
||||
without touching the command line. The recipe is as follows>
|
||||
without touching the command line. ***This recipe does not work on
|
||||
Windows platforms due to a bug in the Windows version of the 2.3
|
||||
upgrade script.** See the next section for a Windows recipe.
|
||||
|
||||
##### For Mac and Linux Users:
|
||||
|
||||
1. Launch the InvokeAI launcher script in your current v2.3 root directory.
|
||||
|
||||
2. Select option [9] "Update InvokeAI" to bring up the updater dialog.
|
||||
|
||||
3a. During the alpha release phase, select option [3] and manually
|
||||
enter the tag name `v3.0.0+a2`.
|
||||
|
||||
3b. Once 3.0 is released, select option [1] to upgrade to the latest release.
|
||||
3. Select option [1] to upgrade to the latest release.
|
||||
|
||||
4. Once the upgrade is finished you will be returned to the launcher
|
||||
menu. Select option [7] "Re-run the configure script to fix a broken
|
||||
@ -295,14 +291,33 @@ worked, you can safely remove these files. Alternatively you can
|
||||
restore a working v2.3 directory by removing the new files and
|
||||
restoring the ".orig" files' original names.
|
||||
|
||||
##### For Windows Users:
|
||||
|
||||
Windows Users can upgrade with the
|
||||
|
||||
1. Enter the 2.3 root directory you wish to upgrade
|
||||
2. Launch `invoke.sh` or `invoke.bat`
|
||||
3. Select the "Developer's console" option [8]
|
||||
4. Type the following commands
|
||||
|
||||
```
|
||||
pip install "invokeai @ https://github.com/invoke-ai/InvokeAI/archive/refs/tags/v3.0.0" --use-pep517 --upgrade
|
||||
invokeai-configure --root .
|
||||
```
|
||||
(Replace `v3.0.0` with the current release number if this document is out of date).
|
||||
|
||||
The first command will install and upgrade new software to run
|
||||
InvokeAI. The second will prepare the 2.3 directory for use with 3.0.
|
||||
You may now launch the WebUI in the usual way, by selecting option [1]
|
||||
from the launcher script
|
||||
|
||||
#### Migration Caveats
|
||||
|
||||
The migration script will migrate your invokeai settings and models,
|
||||
including textual inversion models, LoRAs and merges that you may have
|
||||
installed previously. However it does **not** migrate the generated
|
||||
images stored in your 2.3-format outputs directory. The released
|
||||
version of 3.0 is expected to have an interface for importing an
|
||||
entire directory of image files as a batch.
|
||||
images stored in your 2.3-format outputs directory. You will need to
|
||||
manually import selected images into the 3.0 gallery via drag-and-drop.
|
||||
|
||||
## Hardware Requirements
|
||||
|
||||
@ -314,9 +329,12 @@ AMD card (using the ROCm driver).
|
||||
|
||||
You will need one of the following:
|
||||
|
||||
- An NVIDIA-based graphics card with 4 GB or more VRAM memory.
|
||||
- An NVIDIA-based graphics card with 4 GB or more VRAM memory. 6-8 GB
|
||||
of VRAM is highly recommended for rendering using the Stable
|
||||
Diffusion XL models
|
||||
- An Apple computer with an M1 chip.
|
||||
- An AMD-based graphics card with 4GB or more VRAM memory. (Linux only)
|
||||
- An AMD-based graphics card with 4GB or more VRAM memory (Linux
|
||||
only), 6-8 GB for XL rendering.
|
||||
|
||||
We do not recommend the GTX 1650 or 1660 series video cards. They are
|
||||
unable to run in half-precision mode and do not have sufficient VRAM
|
||||
@ -349,13 +367,12 @@ Invoke AI provides an organized gallery system for easily storing, accessing, an
|
||||
### Other features
|
||||
|
||||
- *Support for both ckpt and diffusers models*
|
||||
- *SD 2.0, 2.1 support*
|
||||
- *SD 2.0, 2.1, XL support*
|
||||
- *Upscaling Tools*
|
||||
- *Embedding Manager & Support*
|
||||
- *Model Manager & Support*
|
||||
- *Node-Based Architecture*
|
||||
- *Node-Based Plug-&-Play UI (Beta)*
|
||||
- *SDXL Support* (Coming soon)
|
||||
|
||||
### Latest Changes
|
||||
|
||||
|
BIN
docs/assets/control-panel-2.png
Normal file
After Width: | Height: | Size: 415 KiB |
BIN
docs/assets/installing-models/model-installer-controlnet.png
Normal file
After Width: | Height: | Size: 57 KiB |
BIN
docs/assets/invoke-control-panel-1.png
Normal file
After Width: | Height: | Size: 37 KiB |
Before Width: | Height: | Size: 101 KiB After Width: | Height: | Size: 22 KiB |
Before Width: | Height: | Size: 29 KiB After Width: | Height: | Size: 16 KiB |
Before Width: | Height: | Size: 148 KiB After Width: | Height: | Size: 76 KiB |
Before Width: | Height: | Size: 637 KiB After Width: | Height: | Size: 729 KiB |
BIN
docs/assets/lora-example-0.png
Normal file
After Width: | Height: | Size: 530 KiB |
BIN
docs/assets/lora-example-1.png
Normal file
After Width: | Height: | Size: 24 KiB |
BIN
docs/assets/lora-example-2.png
Normal file
After Width: | Height: | Size: 8.5 KiB |
BIN
docs/assets/lora-example-3.png
Normal file
After Width: | Height: | Size: 409 KiB |
BIN
docs/assets/nodes/groupsallscale.png
Normal file
After Width: | Height: | Size: 490 KiB |
BIN
docs/assets/nodes/groupsconditioning.png
Normal file
After Width: | Height: | Size: 335 KiB |
BIN
docs/assets/nodes/groupscontrol.png
Normal file
After Width: | Height: | Size: 217 KiB |
BIN
docs/assets/nodes/groupsimgvae.png
Normal file
After Width: | Height: | Size: 244 KiB |
BIN
docs/assets/nodes/groupsiterate.png
Normal file
After Width: | Height: | Size: 948 KiB |
BIN
docs/assets/nodes/groupslora.png
Normal file
After Width: | Height: | Size: 292 KiB |
BIN
docs/assets/nodes/groupsmultigenseeding.png
Normal file
After Width: | Height: | Size: 420 KiB |
BIN
docs/assets/nodes/groupsnoise.png
Normal file
After Width: | Height: | Size: 179 KiB |
BIN
docs/assets/nodes/groupsrandseed.png
Normal file
After Width: | Height: | Size: 216 KiB |
BIN
docs/assets/nodes/nodescontrol.png
Normal file
After Width: | Height: | Size: 439 KiB |
BIN
docs/assets/nodes/nodesi2i.png
Normal file
After Width: | Height: | Size: 563 KiB |
BIN
docs/assets/nodes/nodest2i.png
Normal file
After Width: | Height: | Size: 353 KiB |
1
docs/assets/sdxl-graphs/sdxl-base-example1.json
Normal file
1
docs/assets/sdxl-graphs/sdxl-base-refine-example1.json
Normal file
BIN
docs/assets/send-to-icon.png
Normal file
After Width: | Height: | Size: 41 KiB |
BIN
docs/assets/upscaling.png
Normal file
After Width: | Height: | Size: 637 KiB |
@ -1,42 +1,38 @@
|
||||
# How to Contribute
|
||||
|
||||
## Welcome to Invoke AI
|
||||
|
||||
We're thrilled to have you here and we're excited for you to contribute.
|
||||
|
||||
Invoke AI originated as a project built by the community, and that vision carries forward today as we aim to build the best pro-grade tools available. We work together to incorporate the latest in AI/ML research, making these tools available in over 20 languages to artists and creatives around the world as part of our fully permissive OSS project designed for individual users to self-host and use.
|
||||
|
||||
Here are some guidelines to help you get started:
|
||||
|
||||
### Technical Prerequisites
|
||||
## Contributing to Invoke AI
|
||||
Anyone who wishes to contribute to InvokeAI, whether features, bug fixes, code cleanup, testing, code reviews, documentation or translation is very much encouraged to do so.
|
||||
|
||||
Front-end: You'll need a working knowledge of React and TypeScript.
|
||||
To join, just raise your hand on the InvokeAI Discord server (#dev-chat) or the GitHub discussion board.
|
||||
|
||||
Back-end: Depending on the scope of your contribution, you may need to know SQLite, FastAPI, Python, and Socketio. Also, a good majority of the backend logic involved in processing images is built in a modular way using a concept called "Nodes", which are isolated functions that carry out individual, discrete operations. This design allows for easy contributions of novel pipelines and capabilities.
|
||||
### Areas of contribution:
|
||||
|
||||
### How to Submit Contributions
|
||||
#### Development
|
||||
If you’d like to help with development, please see our [development guide](contribution_guides/development.md). If you’re unfamiliar with contributing to open source projects, there is a tutorial contained within the development guide.
|
||||
|
||||
To start contributing, please follow these steps:
|
||||
#### Documentation
|
||||
If you’d like to help with documentation, please see our [documentation guide](contribution_guides/documenation.md).
|
||||
|
||||
1. Familiarize yourself with our roadmap and open projects to see where your skills and interests align. These documents can serve as a source of inspiration.
|
||||
2. Open a Pull Request (PR) with a clear description of the feature you're adding or the problem you're solving. Make sure your contribution aligns with the project's vision.
|
||||
3. Adhere to general best practices. This includes assuming interoperability with other nodes, keeping the scope of your functions as small as possible, and organizing your code according to our architecture documents.
|
||||
#### Translation
|
||||
If you'd like to help with translation, please see our [translation guide](docs/contributing/.contribution_guides/translation.md).
|
||||
|
||||
### Types of Contributions We're Looking For
|
||||
#### Tutorials
|
||||
Please reach out to @imic or @hipsterusername on [Discord](https://discord.gg/ZmtBAhwWhy) to help create tutorials for InvokeAI.
|
||||
|
||||
We welcome all contributions that improve the project. Right now, we're especially looking for:
|
||||
We hope you enjoy using our software as much as we enjoy creating it, and we hope that some of those of you who are reading this will elect to become part of our contributor community.
|
||||
|
||||
1. Quality of life (QOL) enhancements on the front-end.
|
||||
2. New backend capabilities added through nodes.
|
||||
3. Incorporating additional optimizations from the broader open-source software community.
|
||||
|
||||
### Communication and Decision-making Process
|
||||
### Contributors
|
||||
|
||||
Project maintainers and code owners review PRs to ensure they align with the project's goals. They may provide design or architectural guidance, suggestions on user experience, or provide more significant feedback on the contribution itself. Expect to receive feedback on your submissions, and don't hesitate to ask questions or propose changes.
|
||||
This project is a combined effort of dedicated people from across the world. [Check out the list of all these amazing people](https://invoke-ai.github.io/InvokeAI/other/CONTRIBUTORS/). We thank them for their time, hard work and effort.
|
||||
|
||||
For more robust discussions, or if you're planning to add capabilities not currently listed on our roadmap, please reach out to us on our Discord server. That way, we can ensure your proposed contribution aligns with the project's direction before you start writing code.
|
||||
### Code of Conduct
|
||||
|
||||
### Code of Conduct and Contribution Expectations
|
||||
|
||||
We want everyone in our community to have a positive experience. To facilitate this, we've established a code of conduct and a statement of values that we expect all contributors to adhere to. Please take a moment to review these documents—they're essential to maintaining a respectful and inclusive environment.
|
||||
The InvokeAI community is a welcoming place, and we want your help in maintaining that. Please review our [Code of Conduct](https://github.com/invoke-ai/InvokeAI/blob/main/CODE_OF_CONDUCT.md) to learn more - it's essential to maintaining a respectful and inclusive environment.
|
||||
|
||||
By making a contribution to this project, you certify that:
|
||||
|
||||
@ -49,6 +45,12 @@ This disclaimer is not a license and does not grant any rights or permissions. Y
|
||||
|
||||
This disclaimer is provided "as is" without warranty of any kind, whether expressed or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, or non-infringement. In no event shall the authors or copyright holders be liable for any claim, damages, or other liability, whether in an action of contract, tort, or otherwise, arising from, out of, or in connection with the contribution or the use or other dealings in the contribution.
|
||||
|
||||
### Support
|
||||
|
||||
For support, please use this repository's [GitHub Issues](https://github.com/invoke-ai/InvokeAI/issues), or join the [Discord](https://discord.gg/ZmtBAhwWhy).
|
||||
|
||||
Original portions of the software are Copyright (c) 2023 by respective contributors.
|
||||
|
||||
---
|
||||
|
||||
Remember, your contributions help make this project great. We're excited to see what you'll bring to our community!
|
||||
|
91
docs/contributing/contribution_guides/development.md
Normal file
@ -0,0 +1,91 @@
|
||||
# Development
|
||||
|
||||
## **What do I need to know to help?**
|
||||
|
||||
If you are looking to help to with a code contribution, InvokeAI uses several different technologies under the hood: Python (Pydantic, FastAPI, diffusers) and Typescript (React, Redux Toolkit, ChakraUI, Mantine, Konva). Familiarity with StableDiffusion and image generation concepts is helpful, but not essential.
|
||||
|
||||
For more information, please review our area specific documentation:
|
||||
|
||||
* #### [InvokeAI Architecure](../ARCHITECTURE.md)
|
||||
* #### [Frontend Documentation](development_guides/contributingToFrontend.md)
|
||||
* #### [Node Documentation](../INVOCATIONS.md)
|
||||
* #### [Local Development](../LOCAL_DEVELOPMENT.md)
|
||||
|
||||
If you don't feel ready to make a code contribution yet, no problem! You can also help out in other ways, such as [documentation](documentation.md) or [translation](translation.md).
|
||||
|
||||
There are two paths to making a development contribution:
|
||||
|
||||
1. Choosing an open issue to address. Open issues can be found in the [Issues](https://github.com/invoke-ai/InvokeAI/issues?q=is%3Aissue+is%3Aopen) section of the InvokeAI repository. These are tagged by the issue type (bug, enhancement, etc.) along with the “good first issues” tag denoting if they are suitable for first time contributors.
|
||||
1. Additional items can be found on our roadmap <******************************link to roadmap>******************************. The roadmap is organized in terms of priority, and contains features of varying size and complexity. If there is an inflight item you’d like to help with, reach out to the contributor assigned to the item to see how you can help.
|
||||
2. Opening a new issue or feature to add. **Please make sure you have searched through existing issues before creating new ones.**
|
||||
|
||||
*Regardless of what you choose, please post in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord before you start development in order to confirm that the issue or feature is aligned with the current direction of the project. We value our contributors time and effort and want to ensure that no one’s time is being misspent.*
|
||||
|
||||
## Best Practices:
|
||||
* Keep your pull requests small. Smaller pull requests are more likely to be accepted and merged
|
||||
* Comments! Commenting your code helps reviwers easily understand your contribution
|
||||
* Use Python and Typescript’s typing systems, and consider using an editor with [LSP](https://microsoft.github.io/language-server-protocol/) support to streamline development
|
||||
* Make all communications public. This ensure knowledge is shared with the whole community
|
||||
|
||||
## **How do I make a contribution?**
|
||||
|
||||
Never made an open source contribution before? Wondering how contributions work in our project? Here's a quick rundown!
|
||||
|
||||
Before starting these steps, ensure you have your local environment [configured for development](../LOCAL_DEVELOPMENT.md).
|
||||
|
||||
1. Find a [good first issue](https://github.com/invoke-ai/InvokeAI/contribute) that you are interested in addressing or a feature that you would like to add. Then, reach out to our team in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord to ensure you are setup for success.
|
||||
2. Fork the [InvokeAI](https://github.com/invoke-ai/InvokeAI) repository to your GitHub profile. This means that you will have a copy of the repository under **your-GitHub-username/InvokeAI**.
|
||||
3. Clone the repository to your local machine using:
|
||||
|
||||
```bash
|
||||
git clone https://github.com/your-GitHub-username/InvokeAI.git
|
||||
```
|
||||
|
||||
If you're unfamiliar with using Git through the commandline, [GitHub Desktop](https://desktop.github.com) is a easy-to-use alternative with a UI. You can do all the same steps listed here, but through the interface.
|
||||
|
||||
4. Create a new branch for your fix using:
|
||||
|
||||
```bash
|
||||
git checkout -b branch-name-here
|
||||
```
|
||||
|
||||
5. Make the appropriate changes for the issue you are trying to address or the feature that you want to add.
|
||||
6. Add the file contents of the changed files to the "snapshot" git uses to manage the state of the project, also known as the index:
|
||||
|
||||
```bash
|
||||
git add insert-paths-of-changed-files-here
|
||||
```
|
||||
|
||||
7. Store the contents of the index with a descriptive message.
|
||||
|
||||
```bash
|
||||
git commit -m "Insert a short message of the changes made here"
|
||||
```
|
||||
|
||||
8. Push the changes to the remote repository using
|
||||
|
||||
```markdown
|
||||
git push origin branch-name-here
|
||||
```
|
||||
|
||||
9. Submit a pull request to the **main** branch of the InvokeAI repository.
|
||||
10. Title the pull request with a short description of the changes made and the issue or bug number associated with your change. For example, you can title an issue like so "Added more log outputting to resolve #1234".
|
||||
11. In the description of the pull request, explain the changes that you made, any issues you think exist with the pull request you made, and any questions you have for the maintainer. It's OK if your pull request is not perfect (no pull request is), the reviewer will be able to help you fix any problems and improve it!
|
||||
12. Wait for the pull request to be reviewed by other collaborators.
|
||||
13. Make changes to the pull request if the reviewer(s) recommend them.
|
||||
14. Celebrate your success after your pull request is merged!
|
||||
|
||||
If you’d like to learn more about contributing to Open Source projects, here is a [Getting Started Guide](https://opensource.com/article/19/7/create-pull-request-github).
|
||||
|
||||
## **Where can I go for help?**
|
||||
|
||||
If you need help, you can ask questions in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord.
|
||||
|
||||
For frontend related work, **@pyschedelicious** is the best person to reach out to.
|
||||
|
||||
For backend related work, please reach out to **@blessedcoolant**, **@lstein**, **@StAlKeR7779** or **@pyschedelicious**.
|
||||
|
||||
## **What does the Code of Conduct mean for me?**
|
||||
|
||||
Our [Code of Conduct](CODE_OF_CONDUCT.md) means that you are responsible for treating everyone on the project with respect and courtesy regardless of their identity. If you are the victim of any inappropriate behavior or comments as described in our Code of Conduct, we are here for you and will do the best to ensure that the abuser is reprimanded appropriately, per our code.
|
||||
|
@ -0,0 +1,75 @@
|
||||
# Contributing to the Frontend
|
||||
|
||||
# InvokeAI Web UI
|
||||
|
||||
- [InvokeAI Web UI](https://github.com/invoke-ai/InvokeAI/tree/main/invokeai/frontend/web/docs#invokeai-web-ui)
|
||||
- [Stack](https://github.com/invoke-ai/InvokeAI/tree/main/invokeai/frontend/web/docs#stack)
|
||||
- [Contributing](https://github.com/invoke-ai/InvokeAI/tree/main/invokeai/frontend/web/docs#contributing)
|
||||
- [Dev Environment](https://github.com/invoke-ai/InvokeAI/tree/main/invokeai/frontend/web/docs#dev-environment)
|
||||
- [Production builds](https://github.com/invoke-ai/InvokeAI/tree/main/invokeai/frontend/web/docs#production-builds)
|
||||
|
||||
The UI is a fairly straightforward Typescript React app, with the Unified Canvas being more complex.
|
||||
|
||||
Code is located in `invokeai/frontend/web/` for review.
|
||||
|
||||
## Stack
|
||||
|
||||
State management is Redux via [Redux Toolkit](https://github.com/reduxjs/redux-toolkit). We lean heavily on RTK:
|
||||
|
||||
- `createAsyncThunk` for HTTP requests
|
||||
- `createEntityAdapter` for fetching images and models
|
||||
- `createListenerMiddleware` for workflows
|
||||
|
||||
The API client and associated types are generated from the OpenAPI schema. See API_CLIENT.md.
|
||||
|
||||
Communication with server is a mix of HTTP and [socket.io](https://github.com/socketio/socket.io-client) (with a simple socket.io redux middleware to help).
|
||||
|
||||
[Chakra-UI](https://github.com/chakra-ui/chakra-ui) & [Mantine](https://github.com/mantinedev/mantine) for components and styling.
|
||||
|
||||
[Konva](https://github.com/konvajs/react-konva) for the canvas, but we are pushing the limits of what is feasible with it (and HTML canvas in general). We plan to rebuild it with [PixiJS](https://github.com/pixijs/pixijs) to take advantage of WebGL's improved raster handling.
|
||||
|
||||
[Vite](https://vitejs.dev/) for bundling.
|
||||
|
||||
Localisation is via [i18next](https://github.com/i18next/react-i18next), but translation happens on our [Weblate](https://hosted.weblate.org/engage/invokeai/) project. Only the English source strings should be changed on this repo.
|
||||
|
||||
## Contributing
|
||||
|
||||
Thanks for your interest in contributing to the InvokeAI Web UI!
|
||||
|
||||
We encourage you to ping @psychedelicious and @blessedcoolant on [Discord](https://discord.gg/ZmtBAhwWhy) if you want to contribute, just to touch base and ensure your work doesn't conflict with anything else going on. The project is very active.
|
||||
|
||||
### Dev Environment
|
||||
|
||||
**Setup**
|
||||
|
||||
1. Install [node](https://nodejs.org/en/download/). You can confirm node is installed with:
|
||||
```bash
|
||||
node --version
|
||||
```
|
||||
2. Install [yarn classic](https://classic.yarnpkg.com/lang/en/) and confirm it is installed by running this:
|
||||
```bash
|
||||
npm install --global yarn
|
||||
yarn --version
|
||||
```
|
||||
|
||||
From `invokeai/frontend/web/` run `yarn install` to get everything set up.
|
||||
|
||||
Start everything in dev mode:
|
||||
1. Ensure your virtual environment is running
|
||||
2. Start the dev server: `yarn dev`
|
||||
3. Start the InvokeAI Nodes backend: `python scripts/invokeai-web.py # run from the repo root`
|
||||
4. Point your browser to the dev server address e.g. [http://localhost:5173/](http://localhost:5173/)
|
||||
|
||||
### VSCode Remote Dev
|
||||
|
||||
We've noticed an intermittent issue with the VSCode Remote Dev port forwarding. If you use this feature of VSCode, you may intermittently click the Invoke button and then get nothing until the request times out. Suggest disabling the IDE's port forwarding feature and doing it manually via SSH:
|
||||
|
||||
`ssh -L 9090:localhost:9090 -L 5173:localhost:5173 user@host`
|
||||
|
||||
### Production builds
|
||||
|
||||
For a number of technical and logistical reasons, we need to commit UI build artefacts to the repo.
|
||||
|
||||
If you submit a PR, there is a good chance we will ask you to include a separate commit with a build of the app.
|
||||
|
||||
To build for production, run `yarn build`.
|
13
docs/contributing/contribution_guides/documentation.md
Normal file
@ -0,0 +1,13 @@
|
||||
# Documentation
|
||||
|
||||
Documentation is an important part of any open source project. It provides a clear and concise way to communicate how the software works, how to use it, and how to troubleshoot issues. Without proper documentation, it can be difficult for users to understand the purpose and functionality of the project.
|
||||
|
||||
## Contributing
|
||||
|
||||
All documentation is maintained in the InvokeAI GitHub repository. If you come across documentation that is out of date or incorrect, please submit a pull request with the necessary changes.
|
||||
|
||||
When updating or creating documentation, please keep in mind InvokeAI is a tool for everyone, not just those who have familiarity with generative art.
|
||||
|
||||
## Help & Questions
|
||||
|
||||
Please ping @imic1 or @hipsterusername in the [Discord](https://discord.com/channels/1020123559063990373/1049495067846524939) if you have any questions.
|
19
docs/contributing/contribution_guides/translation.md
Normal file
@ -0,0 +1,19 @@
|
||||
# Translation
|
||||
|
||||
InvokeAI uses [Weblate](https://weblate.org/) for translation. Weblate is a FOSS project providing a scalable translation service. Weblate automates the tedious parts of managing translation of a growing project, and the service is generously provided at no cost to FOSS projects like InvokeAI.
|
||||
|
||||
## Contributing
|
||||
|
||||
If you'd like to contribute by adding or updating a translation, please visit our [Weblate project](https://hosted.weblate.org/engage/invokeai/). You'll need to sign in with your GitHub account (a number of other accounts are supported, including Google).
|
||||
|
||||
Once signed in, select a language and then the Web UI component. From here you can Browse and Translate strings from English to your chosen language. Zen mode offers a simpler translation experience.
|
||||
|
||||
Your changes will be attributed to you in the automated PR process; you don't need to do anything else.
|
||||
|
||||
## Help & Questions
|
||||
|
||||
Please check Weblate's [documentation](https://docs.weblate.org/en/latest/index.html) or ping @Harvestor on [Discord](https://discord.com/channels/1020123559063990373/1049495067846524939) if you have any questions.
|
||||
|
||||
## Thanks
|
||||
|
||||
Thanks to the InvokeAI community for their efforts to translate the project!
|
11
docs/contributing/contribution_guides/tutorials.md
Normal file
@ -0,0 +1,11 @@
|
||||
# Tutorials
|
||||
|
||||
Tutorials help new & existing users expand their abilty to use InvokeAI to the full extent of our features and services.
|
||||
|
||||
Currently, we have a set of tutorials available on our [YouTube channel](https://www.youtube.com/@invokeai), but as InvokeAI continues to evolve with new updates, we want to ensure that we are giving our users the resources they need to succeed.
|
||||
|
||||
Tutorials can be in the form of videos or article walkthroughs on a subject of your choice. We recommend focusing tutorials on the key image generation methods, or on a specific component within one of the image generation methods.
|
||||
|
||||
## Contributing
|
||||
|
||||
Please reach out to @imic or @hipsterusername on [Discord](https://discord.gg/ZmtBAhwWhy) to help create tutorials for InvokeAI.
|
@ -1,8 +1,8 @@
|
||||
---
|
||||
title: Concepts
|
||||
title: Textual Inversion Embeddings and LoRAs
|
||||
---
|
||||
|
||||
# :material-library-shelves: The Hugging Face Concepts Library and Importing Textual Inversion files
|
||||
# :material-library-shelves: Textual Inversions and LoRAs
|
||||
|
||||
With the advances in research, many new capabilities are available to customize the knowledge and understanding of novel concepts not originally contained in the base model.
|
||||
|
||||
@ -64,21 +64,25 @@ select the embedding you'd like to use. This UI has type-ahead support, so you c
|
||||
|
||||
## Using LoRAs
|
||||
|
||||
LoRA files are models that customize the output of Stable Diffusion image generation.
|
||||
Larger than embeddings, but much smaller than full models, they augment SD with improved
|
||||
understanding of subjects and artistic styles.
|
||||
LoRA files are models that customize the output of Stable Diffusion
|
||||
image generation. Larger than embeddings, but much smaller than full
|
||||
models, they augment SD with improved understanding of subjects and
|
||||
artistic styles.
|
||||
|
||||
Unlike TI files, LoRAs do not introduce novel vocabulary into the model's known tokens. Instead,
|
||||
LoRAs augment the model's weights that are applied to generate imagery. LoRAs may be supplied
|
||||
with a "trigger" word that they have been explicitly trained on, or may simply apply their
|
||||
effect without being triggered.
|
||||
Unlike TI files, LoRAs do not introduce novel vocabulary into the
|
||||
model's known tokens. Instead, LoRAs augment the model's weights that
|
||||
are applied to generate imagery. LoRAs may be supplied with a
|
||||
"trigger" word that they have been explicitly trained on, or may
|
||||
simply apply their effect without being triggered.
|
||||
|
||||
LoRAs are typically stored in .safetensors files, which are the most secure way to store and transmit
|
||||
these types of weights. You may install any number of `.safetensors` LoRA files simply by copying them into
|
||||
the `lora` directory of the corresponding InvokeAI models directory (usually `invokeai`
|
||||
in your home directory). For example, you can simply move a Stable Diffusion 1.5 LoRA file to
|
||||
the `sd-1/lora` folder.
|
||||
LoRAs are typically stored in .safetensors files, which are the most
|
||||
secure way to store and transmit these types of weights. You may
|
||||
install any number of `.safetensors` LoRA files simply by copying them
|
||||
into the `autoimport/lora` directory of the corresponding InvokeAI models
|
||||
directory (usually `invokeai` in your home directory).
|
||||
|
||||
To use these when generating, open the LoRA menu item in the options panel, select the LoRAs you want to apply
|
||||
and ensure that they have the appropriate weight recommended by the model provider. Typically, most LoRAs perform best at a weight of .75-1.
|
||||
To use these when generating, open the LoRA menu item in the options
|
||||
panel, select the LoRAs you want to apply and ensure that they have
|
||||
the appropriate weight recommended by the model provider. Typically,
|
||||
most LoRAs perform best at a weight of .75-1.
|
||||
|
||||
|
@ -136,19 +136,16 @@ command-line options by giving the `--help` argument:
|
||||
|
||||
```
|
||||
(.venv) > invokeai-web --help
|
||||
usage: InvokeAI [-h] [--host HOST] [--port PORT] [--allow_origins [ALLOW_ORIGINS ...]] [--allow_credentials | --no-allow_credentials]
|
||||
[--allow_methods [ALLOW_METHODS ...]] [--allow_headers [ALLOW_HEADERS ...]] [--esrgan | --no-esrgan]
|
||||
[--internet_available | --no-internet_available] [--log_tokenization | --no-log_tokenization]
|
||||
[--nsfw_checker | --no-nsfw_checker] [--patchmatch | --no-patchmatch] [--restore | --no-restore]
|
||||
[--always_use_cpu | --no-always_use_cpu] [--free_gpu_mem | --no-free_gpu_mem] [--max_cache_size MAX_CACHE_SIZE]
|
||||
[--max_vram_cache_size MAX_VRAM_CACHE_SIZE] [--precision {auto,float16,float32,autocast}]
|
||||
[--sequential_guidance | --no-sequential_guidance] [--xformers_enabled | --no-xformers_enabled]
|
||||
[--tiled_decode | --no-tiled_decode] [--root ROOT] [--autoimport_dir AUTOIMPORT_DIR] [--lora_dir LORA_DIR]
|
||||
[--embedding_dir EMBEDDING_DIR] [--controlnet_dir CONTROLNET_DIR] [--conf_path CONF_PATH] [--models_dir MODELS_DIR]
|
||||
[--legacy_conf_dir LEGACY_CONF_DIR] [--db_dir DB_DIR] [--outdir OUTDIR] [--from_file FROM_FILE]
|
||||
[--use_memory_db | --no-use_memory_db] [--model MODEL] [--log_handlers [LOG_HANDLERS ...]]
|
||||
[--log_format {plain,color,syslog,legacy}] [--log_level {debug,info,warning,error,critical}]
|
||||
...
|
||||
usage: InvokeAI [-h] [--host HOST] [--port PORT] [--allow_origins [ALLOW_ORIGINS ...]] [--allow_credentials | --no-allow_credentials] [--allow_methods [ALLOW_METHODS ...]]
|
||||
[--allow_headers [ALLOW_HEADERS ...]] [--esrgan | --no-esrgan] [--internet_available | --no-internet_available] [--log_tokenization | --no-log_tokenization]
|
||||
[--nsfw_checker | --no-nsfw_checker] [--invisible_watermark | --no-invisible_watermark] [--patchmatch | --no-patchmatch] [--restore | --no-restore]
|
||||
[--always_use_cpu | --no-always_use_cpu] [--free_gpu_mem | --no-free_gpu_mem] [--max_loaded_models MAX_LOADED_MODELS] [--max_cache_size MAX_CACHE_SIZE]
|
||||
[--max_vram_cache_size MAX_VRAM_CACHE_SIZE] [--gpu_mem_reserved GPU_MEM_RESERVED] [--precision {auto,float16,float32,autocast}]
|
||||
[--sequential_guidance | --no-sequential_guidance] [--xformers_enabled | --no-xformers_enabled] [--tiled_decode | --no-tiled_decode] [--root ROOT]
|
||||
[--autoimport_dir AUTOIMPORT_DIR] [--lora_dir LORA_DIR] [--embedding_dir EMBEDDING_DIR] [--controlnet_dir CONTROLNET_DIR] [--conf_path CONF_PATH]
|
||||
[--models_dir MODELS_DIR] [--legacy_conf_dir LEGACY_CONF_DIR] [--db_dir DB_DIR] [--outdir OUTDIR] [--from_file FROM_FILE]
|
||||
[--use_memory_db | --no-use_memory_db] [--model MODEL] [--log_handlers [LOG_HANDLERS ...]] [--log_format {plain,color,syslog,legacy}]
|
||||
[--log_level {debug,info,warning,error,critical}] [--version | --no-version]
|
||||
```
|
||||
|
||||
## The Configuration Settings
|
||||
@ -179,6 +176,7 @@ These configuration settings allow you to enable and disable various InvokeAI fe
|
||||
| `internet_available` | `true` | When a resource is not available locally, try to fetch it via the internet |
|
||||
| `log_tokenization` | `false` | Before each text2image generation, print a color-coded representation of the prompt to the console; this can help understand why a prompt is not working as expected |
|
||||
| `nsfw_checker` | `true` | Activate the NSFW checker to blur out risque images |
|
||||
| `invisible_watermark` | `true` | Write an invisible watermark 'InvokeAI' into generated images for use by AI image detectors |
|
||||
| `patchmatch` | `true` | Activate the "patchmatch" algorithm for improved inpainting |
|
||||
| `restore` | `true` | Activate the facial restoration features (DEPRECATED; restoration features will be removed in 3.0.0) |
|
||||
|
||||
|
@ -8,20 +8,64 @@ title: ControlNet
|
||||
|
||||
ControlNet
|
||||
|
||||
ControlNet is a powerful set of features developed by the open-source community (notably, Stanford researcher [**@ilyasviel**](https://github.com/lllyasviel)) that allows you to apply a secondary neural network model to your image generation process in Invoke.
|
||||
ControlNet is a powerful set of features developed by the open-source
|
||||
community (notably, Stanford researcher
|
||||
[**@ilyasviel**](https://github.com/lllyasviel)) that allows you to
|
||||
apply a secondary neural network model to your image generation
|
||||
process in Invoke.
|
||||
|
||||
With ControlNet, you can get more control over the output of your image generation, providing you with a way to direct the network towards generating images that better fit your desired style or outcome.
|
||||
With ControlNet, you can get more control over the output of your
|
||||
image generation, providing you with a way to direct the network
|
||||
towards generating images that better fit your desired style or
|
||||
outcome.
|
||||
|
||||
|
||||
### How it works
|
||||
|
||||
ControlNet works by analyzing an input image, pre-processing that image to identify relevant information that can be interpreted by each specific ControlNet model, and then inserting that control information into the generation process. This can be used to adjust the style, composition, or other aspects of the image to better achieve a specific result.
|
||||
ControlNet works by analyzing an input image, pre-processing that
|
||||
image to identify relevant information that can be interpreted by each
|
||||
specific ControlNet model, and then inserting that control information
|
||||
into the generation process. This can be used to adjust the style,
|
||||
composition, or other aspects of the image to better achieve a
|
||||
specific result.
|
||||
|
||||
|
||||
### Models
|
||||
|
||||
As part of the model installation, ControlNet models can be selected including a variety of pre-trained models that have been added to achieve different effects or styles in your generated images. Further ControlNet models may require additional code functionality to also be incorporated into Invoke's Invocations folder. You should expect to follow any installation instructions for ControlNet models loaded outside the default models provided by Invoke. The default models include:
|
||||
InvokeAI provides access to a series of ControlNet models that provide
|
||||
different effects or styles in your generated images. Currently
|
||||
InvokeAI only supports "diffuser" style ControlNet models. These are
|
||||
folders that contain the files `config.json` and/or
|
||||
`diffusion_pytorch_model.safetensors` and
|
||||
`diffusion_pytorch_model.fp16.safetensors`. The name of the folder is
|
||||
the name of the model.
|
||||
|
||||
***InvokeAI does not currently support checkpoint-format
|
||||
ControlNets. These come in the form of a single file with the
|
||||
extension `.safetensors`.***
|
||||
|
||||
Diffuser-style ControlNet models are available at HuggingFace
|
||||
(http://huggingface.co) and accessed via their repo IDs (identifiers
|
||||
in the format "author/modelname"). The easiest way to install them is
|
||||
to use the InvokeAI model installer application. Use the
|
||||
`invoke.sh`/`invoke.bat` launcher to select item [5] and then navigate
|
||||
to the CONTROLNETS section. Select the models you wish to install and
|
||||
press "APPLY CHANGES". You may also enter additional HuggingFace
|
||||
repo_ids in the "Additional models" textbox:
|
||||
|
||||
![Model Installer -
|
||||
Controlnetl](../assets/installing-models/model-installer-controlnet.png){:width="640px"}
|
||||
|
||||
Command-line users can launch the model installer using the command
|
||||
`invokeai-model-install`.
|
||||
|
||||
_Be aware that some ControlNet models require additional code
|
||||
functionality in order to work properly, so just installing a
|
||||
third-party ControlNet model may not have the desired effect._ Please
|
||||
read and follow the documentation for installing a third party model
|
||||
not currently included among InvokeAI's default list.
|
||||
|
||||
The models currently supported include:
|
||||
|
||||
**Canny**:
|
||||
|
||||
|
@ -61,11 +61,13 @@ A noise scheduler (eg. DPM++ 2M Karras) schedules the subtraction of noise from
|
||||
| ImageInverseLerp | Inverse linear interpolation of all pixels of an image |
|
||||
| ImageLerp | Linear interpolation of all pixels of an image |
|
||||
| ImageMultiply | Multiplies two images together using `PIL.ImageChops.Multiply()` |
|
||||
| ImageNSFWBlurInvocation | Detects and blurs images that may contain sexually explicit content |
|
||||
| ImagePaste | Pastes an image into another image |
|
||||
| ImageProcessor | Base class for invocations that reprocess images for ControlNet |
|
||||
| ImageResize | Resizes an image to specific dimensions |
|
||||
| ImageScale | Scales an image by a factor |
|
||||
| ImageToLatents | Scales latents by a given factor |
|
||||
| ImageWatermarkInvocation | Adds an invisible watermark to images |
|
||||
| InfillColor | Infills transparent areas of an image with a solid color |
|
||||
| InfillPatchMatch | Infills transparent areas of an image using the PatchMatch algorithm |
|
||||
| InfillTile | Infills transparent areas of an image with tiles of the image |
|
||||
@ -116,49 +118,49 @@ There are several node grouping concepts that can be examined with a narrow focu
|
||||
|
||||
As described, an initial noise tensor is necessary for the latent diffusion process. As a result, all non-image *ToLatents nodes require a noise node input.
|
||||
|
||||
<img width="654" alt="groupsnoise" src="https://github.com/ymgenesis/InvokeAI/assets/25252829/2e8d297e-ad55-4d27-bc93-c119dad2a2c5">
|
||||
![groupsnoise](../assets/nodes/groupsnoise.png)
|
||||
|
||||
### Conditioning
|
||||
|
||||
As described, conditioning is necessary for the latent diffusion process, whether empty or not. As a result, all non-image *ToLatents nodes require positive and negative conditioning inputs. Conditioning is reliant on a CLIP tokenizer provided by the Model Loader node.
|
||||
|
||||
<img width="1024" alt="groupsconditioning" src="https://github.com/ymgenesis/InvokeAI/assets/25252829/f8f7ad8a-8d9c-418e-b5ad-1437b774b27e">
|
||||
![groupsconditioning](../assets/nodes/groupsconditioning.png)
|
||||
|
||||
### Image Space & VAE
|
||||
|
||||
The ImageToLatents node doesn't require a noise node input, but requires a VAE input to convert the image from image space into latent space. In reverse, the LatentsToImage node requires a VAE input to convert from latent space back into image space.
|
||||
|
||||
<img width="637" alt="groupsimgvae" src="https://github.com/ymgenesis/InvokeAI/assets/25252829/dd99969c-e0a8-4f78-9b17-3ffe179cef9a">
|
||||
![groupsimgvae](../assets/nodes/groupsimgvae.png)
|
||||
|
||||
### Defined & Random Seeds
|
||||
|
||||
It is common to want to use both the same seed (for continuity) and random seeds (for variance). To define a seed, simply enter it into the 'Seed' field on a noise node. Conversely, the RandomInt node generates a random integer between 'Low' and 'High', and can be used as input to the 'Seed' edge point on a noise node to randomize your seed.
|
||||
|
||||
<img width="922" alt="groupsrandseed" src="https://github.com/ymgenesis/InvokeAI/assets/25252829/af55bc20-60f6-438e-aba5-3ec871443710">
|
||||
![groupsrandseed](../assets/nodes/groupsrandseed.png)
|
||||
|
||||
### Control
|
||||
|
||||
Control means to guide the diffusion process to adhere to a defined input or structure. Control can be provided as input to non-image *ToLatents nodes from ControlNet nodes. ControlNet nodes usually require an image processor which converts an input image for use with ControlNet.
|
||||
|
||||
<img width="805" alt="groupscontrol" src="https://github.com/ymgenesis/InvokeAI/assets/25252829/cc9c5de7-23a7-46c8-bbad-1f3609d999a6">
|
||||
![groupscontrol](../assets/nodes/groupscontrol.png)
|
||||
|
||||
### LoRA
|
||||
|
||||
The Lora Loader node lets you load a LoRA (say that ten times fast) and pass it as output to both the Prompt (Compel) and non-image *ToLatents nodes. A model's CLIP tokenizer is passed through the LoRA into Prompt (Compel), where it affects conditioning. A model's U-Net is also passed through the LoRA into a non-image *ToLatents node, where it affects noise prediction.
|
||||
|
||||
<img width="993" alt="groupslora" src="https://github.com/ymgenesis/InvokeAI/assets/25252829/630962b0-d914-4505-b3ea-ccae9b0269da">
|
||||
![groupslora](../assets/nodes/groupslora.png)
|
||||
|
||||
### Scaling
|
||||
|
||||
Use the ImageScale, ScaleLatents, and Upscale nodes to upscale images and/or latent images. The chosen method differs across contexts. However, be aware that latents are already noisy and compressed at their original resolution; scaling an image could produce more detailed results.
|
||||
|
||||
<img width="644" alt="groupsallscale" src="https://github.com/ymgenesis/InvokeAI/assets/25252829/99314f05-dd9f-4b6d-b378-31de55346a13">
|
||||
![groupsallscale](../assets/nodes/groupsallscale.png)
|
||||
|
||||
### Iteration + Multiple Images as Input
|
||||
|
||||
Iteration is a common concept in any processing, and means to repeat a process with given input. In nodes, you're able to use the Iterate node to iterate through collections usually gathered by the Collect node. The Iterate node has many potential uses, from processing a collection of images one after another, to varying seeds across multiple image generations and more. This screenshot demonstrates how to collect several images and pass them out one at a time.
|
||||
|
||||
<img width="788" alt="groupsiterate" src="https://github.com/ymgenesis/InvokeAI/assets/25252829/4af5ca27-82c9-4018-8c5b-024d3ee0a121">
|
||||
![groupsiterate](../assets/nodes/groupsiterate.png)
|
||||
|
||||
### Multiple Image Generation + Random Seeds
|
||||
|
||||
@ -166,7 +168,7 @@ Multiple image generation in the node editor is done using the RandomRange node.
|
||||
|
||||
To control seeds across generations takes some care. The first row in the screenshot will generate multiple images with different seeds, but using the same RandomRange parameters across invocations will result in the same group of random seeds being used across the images, producing repeatable results. In the second row, adding the RandomInt node as input to RandomRange's 'Seed' edge point will ensure that seeds are varied across all images across invocations, producing varied results.
|
||||
|
||||
<img width="1027" alt="groupsmultigenseeding" src="https://github.com/ymgenesis/InvokeAI/assets/25252829/518d1b2b-fed1-416b-a052-ab06552521b3">
|
||||
![groupsmultigenseeding](../assets/nodes/groupsmultigenseeding.png)
|
||||
|
||||
## Examples
|
||||
|
||||
@ -174,7 +176,7 @@ With our knowledge of node grouping and the diffusion process, let’s break dow
|
||||
|
||||
### Basic text-to-image Node Graph
|
||||
|
||||
<img width="875" alt="nodest2i" src="https://github.com/ymgenesis/InvokeAI/assets/25252829/17c67720-c376-4db8-94f0-5e00381a61ee">
|
||||
![nodest2i](../assets/nodes/nodest2i.png)
|
||||
|
||||
- Model Loader: A necessity to generating images (as we’ve read above). We choose our model from the dropdown. It outputs a U-Net, CLIP tokenizer, and VAE.
|
||||
- Prompt (Compel): Another necessity. Two prompt nodes are created. One will output positive conditioning (what you want, ‘dog’), one will output negative (what you don’t want, ‘cat’). They both input the CLIP tokenizer that the Model Loader node outputs.
|
||||
@ -184,7 +186,7 @@ With our knowledge of node grouping and the diffusion process, let’s break dow
|
||||
|
||||
### Basic image-to-image Node Graph
|
||||
|
||||
<img width="998" alt="nodesi2i" src="https://github.com/ymgenesis/InvokeAI/assets/25252829/3f2c95d5-cee7-4415-9b79-b46ee60a92fe">
|
||||
![nodesi2i](../assets/nodes/nodesi2i.png)
|
||||
|
||||
- Model Loader: Choose a model from the dropdown.
|
||||
- Prompt (Compel): Two prompt nodes. One positive (dog), one negative (dog). Same CLIP inputs from the Model Loader node as before.
|
||||
@ -195,7 +197,7 @@ With our knowledge of node grouping and the diffusion process, let’s break dow
|
||||
|
||||
### Basic ControlNet Node Graph
|
||||
|
||||
<img width="703" alt="nodescontrol" src="https://github.com/ymgenesis/InvokeAI/assets/25252829/b02ded86-ceb4-44a2-9910-e19ad184d471">
|
||||
![nodescontrol](../assets/nodes/nodescontrol.png)
|
||||
|
||||
- Model Loader
|
||||
- Prompt (Compel)
|
||||
|
@ -16,21 +16,24 @@ Output Example:
|
||||
|
||||
---
|
||||
|
||||
## **Seamless Tiling**
|
||||
## **Invisible Watermark**
|
||||
|
||||
The seamless tiling mode causes generated images to seamlessly tile
|
||||
with itself creating repetitive wallpaper-like patterns. To use it,
|
||||
activate the Seamless Tiling option in the Web GUI and then select
|
||||
whether to tile on the X (horizontal) and/or Y (vertical) axes. Tiling
|
||||
will then be active for the next set of generations.
|
||||
In keeping with the principles for responsible AI generation, and to
|
||||
help AI researchers avoid synthetic images contaminating their
|
||||
training sets, InvokeAI adds an invisible watermark to each of the
|
||||
final images it generates. The watermark consists of the text
|
||||
"InvokeAI" and can be viewed using the
|
||||
[invisible-watermarks](https://github.com/ShieldMnt/invisible-watermark)
|
||||
tool.
|
||||
|
||||
A nice prompt to test seamless tiling with is:
|
||||
Watermarking is controlled using the `invisible-watermark` setting in
|
||||
`invokeai.yaml`. To turn it off, add the following line under the `Features`
|
||||
category.
|
||||
|
||||
```
|
||||
pond garden with lotus by claude monet"
|
||||
invisible_watermark: false
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## **Weighted Prompts**
|
||||
|
||||
@ -39,34 +42,10 @@ priority to them, by adding `:<percent>` to the end of the section you wish to u
|
||||
example consider this prompt:
|
||||
|
||||
```bash
|
||||
tabby cat:0.25 white duck:0.75 hybrid
|
||||
(tabby cat):0.25 (white duck):0.75 hybrid
|
||||
```
|
||||
|
||||
This will tell the sampler to invest 25% of its effort on the tabby cat aspect of the image and 75%
|
||||
on the white duck aspect (surprisingly, this example actually works). The prompt weights can use any
|
||||
combination of integers and floating point numbers, and they do not need to add up to 1.
|
||||
|
||||
## **Thresholding and Perlin Noise Initialization Options**
|
||||
|
||||
Under the Noise section of the Web UI, you will find two options named
|
||||
Perlin Noise and Noise Threshold. [Perlin
|
||||
noise](https://en.wikipedia.org/wiki/Perlin_noise) is a type of
|
||||
structured noise used to simulate terrain and other natural
|
||||
textures. The slider controls the percentage of perlin noise that will
|
||||
be mixed into the image at the beginning of generation. Adding a little
|
||||
perlin noise to a generation will alter the image substantially.
|
||||
|
||||
The noise threshold limits the range of the latent values during
|
||||
sampling and helps combat the oversharpening seem with higher CFG
|
||||
scale values.
|
||||
|
||||
For better intuition into what these options do in practice:
|
||||
|
||||
![here is a graphic demonstrating them both](../assets/truncation_comparison.jpg)
|
||||
|
||||
In generating this graphic, perlin noise at initialization was
|
||||
programmatically varied going across on the diagram by values 0.0,
|
||||
0.1, 0.2, 0.4, 0.5, 0.6, 0.8, 0.9, 1.0; and the threshold was varied
|
||||
going down from 0, 1, 2, 3, 4, 5, 10, 20, 100. The other options are
|
||||
fixed using the prompt "a portrait of a beautiful young lady" a CFG of
|
||||
20, 100 steps, and a seed of 1950357039.
|
||||
|
@ -4,15 +4,19 @@ title: InvokeAI Web Server
|
||||
|
||||
# :material-web: InvokeAI Web Server
|
||||
|
||||
As of version 2.0.0, this distribution comes with a full-featured web server
|
||||
(see screenshot).
|
||||
## Quick guided walkthrough of the WebUI's features
|
||||
|
||||
To use it, launch the `invoke.sh`/`invoke.bat` script and select
|
||||
option (2). Alternatively, with the InvokeAI environment active, run
|
||||
the `invokeai` script by adding the `--web` option:
|
||||
While most of the WebUI's features are intuitive, here is a guided walkthrough
|
||||
through its various components.
|
||||
|
||||
### Launching the WebUI
|
||||
|
||||
To run the InvokeAI web server, start the `invoke.sh`/`invoke.bat`
|
||||
script and select option (1). Alternatively, with the InvokeAI
|
||||
environment active, run `invokeai-web`:
|
||||
|
||||
```bash
|
||||
invokeai --web
|
||||
invokeai-web
|
||||
```
|
||||
|
||||
You can then connect to the server by pointing your web browser at
|
||||
@ -28,33 +32,32 @@ invoke.sh --host 0.0.0.0
|
||||
or
|
||||
|
||||
```bash
|
||||
invokeai --web --host 0.0.0.0
|
||||
invokeai-web --host 0.0.0.0
|
||||
```
|
||||
|
||||
## Quick guided walkthrough of the WebUI's features
|
||||
|
||||
While most of the WebUI's features are intuitive, here is a guided walkthrough
|
||||
through its various components.
|
||||
### The InvokeAI Web Interface
|
||||
|
||||
![Invoke Web Server - Major Components](../assets/invoke-web-server-1.png){:width="640px"}
|
||||
|
||||
The screenshot above shows the Text to Image tab of the WebUI. There are three
|
||||
main sections:
|
||||
|
||||
1. A **control panel** on the left, which contains various settings for text to
|
||||
image generation. The most important part is the text field (currently
|
||||
showing `strawberry sushi`) for entering the text prompt, and the camera icon
|
||||
directly underneath that will render the image. We'll call this the _Invoke_
|
||||
button from now on.
|
||||
1. A **control panel** on the left, which contains various settings
|
||||
for text to image generation. The most important part is the text
|
||||
field (currently showing `fantasy painting, horned demon`) for
|
||||
entering the positive text prompt, another text field right below it for an
|
||||
optional negative text prompt (concepts to exclude), and a _Invoke_ button
|
||||
to begin the image rendering process.
|
||||
|
||||
2. The **current image** section in the middle, which shows a large format
|
||||
version of the image you are currently working on. A series of buttons at the
|
||||
top ("image to image", "Use All", "Use Seed", etc) lets you modify the image
|
||||
in various ways.
|
||||
2. The **current image** section in the middle, which shows a large
|
||||
format version of the image you are currently working on. A series
|
||||
of buttons at the top lets you modify and manipulate the image in
|
||||
various ways.
|
||||
|
||||
3. A \*_gallery_ section on the left that contains a history of the images you
|
||||
3. A **gallery** section on the left that contains a history of the images you
|
||||
have generated. These images are read and written to the directory specified
|
||||
at launch time in `--outdir`.
|
||||
in the `INVOKEAIROOT/invokeai.yaml` initialization file, usually a directory
|
||||
named `outputs` in `INVOKEAIROOT`.
|
||||
|
||||
In addition to these three elements, there are a series of icons for changing
|
||||
global settings, reporting bugs, and changing the theme on the upper right.
|
||||
@ -76,15 +79,11 @@ From top to bottom, these are:
|
||||
with outpainting,and modify interior portions of the image with
|
||||
inpainting, erase portions of a starting image and have the AI fill in
|
||||
the erased region from a text prompt.
|
||||
4. Node Editor - this panel allows you to create
|
||||
4. Node Editor - (experimental) this panel allows you to create
|
||||
pipelines of common operations and combine them into workflows.
|
||||
5. Model Manager - this panel allows you to import and configure new
|
||||
models using URLs, local paths, or HuggingFace diffusers repo_ids.
|
||||
|
||||
The inpainting, outpainting and postprocessing tabs are currently in
|
||||
development. However, limited versions of their features can already be accessed
|
||||
through the Text to Image and Image to Image tabs.
|
||||
|
||||
## Walkthrough
|
||||
|
||||
The following walkthrough will exercise most (but not all) of the WebUI's
|
||||
@ -92,43 +91,54 @@ feature set.
|
||||
|
||||
### Text to Image
|
||||
|
||||
1. Launch the WebUI using `python scripts/invoke.py --web` and connect to it
|
||||
with your browser by accessing `http://localhost:9090`. If the browser and
|
||||
server are running on different machines on your LAN, add the option
|
||||
`--host 0.0.0.0` to the launch command line and connect to the machine
|
||||
hosting the web server using its IP address or domain name.
|
||||
1. Launch the WebUI using launcher option [1] and connect to it with
|
||||
your browser by accessing `http://localhost:9090`. If the browser
|
||||
and server are running on different machines on your LAN, add the
|
||||
option `--host 0.0.0.0` to the `invoke.sh` launch command line and connect to
|
||||
the machine hosting the web server using its IP address or domain
|
||||
name.
|
||||
|
||||
2. If all goes well, the WebUI should come up and you'll see a green
|
||||
`connected` message on the upper right.
|
||||
2. If all goes well, the WebUI should come up and you'll see a green dot
|
||||
meaning `connected` on the upper right.
|
||||
|
||||
![Invoke Web Server - Control Panel](../assets/invoke-control-panel-1.png){ align=right width=300px }
|
||||
|
||||
#### Basics
|
||||
|
||||
1. Generate an image by typing _strawberry sushi_ into the large prompt field
|
||||
on the upper left and then clicking on the Invoke button (the one with the
|
||||
Camera icon). After a short wait, you'll see a large image of sushi in the
|
||||
1. Generate an image by typing _bluebird_ into the large prompt field
|
||||
on the upper left and then clicking on the Invoke button or pressing
|
||||
the return button.
|
||||
After a short wait, you'll see a large image of a bluebird in the
|
||||
image panel, and a new thumbnail in the gallery on the right.
|
||||
|
||||
If you need more room on the screen, you can turn the gallery off by
|
||||
clicking on the **x** to the right of "Your Invocations". You can turn it
|
||||
back on later by clicking the image icon that appears in the gallery's
|
||||
place.
|
||||
If you need more room on the screen, you can turn the gallery off
|
||||
by typing the **g** hotkey. You can turn it back on later by clicking the
|
||||
image icon that appears in the gallery's place. The list of hotkeys can
|
||||
be found by clicking on the keyboard icon above the image gallery.
|
||||
|
||||
The images are written into the directory indicated by the `--outdir` option
|
||||
provided at script launch time. By default, this is `outputs/img-samples`
|
||||
under the InvokeAI directory.
|
||||
|
||||
2. Generate a bunch of strawberry sushi images by increasing the number of
|
||||
requested images by adjusting the Images counter just below the Camera
|
||||
2. Generate a bunch of bluebird images by increasing the number of
|
||||
requested images by adjusting the Images counter just below the Invoke
|
||||
button. As each is generated, it will be added to the gallery. You can
|
||||
switch the active image by clicking on the gallery thumbnails.
|
||||
|
||||
If you'd like to watch the image generation progress, click the hourglass
|
||||
icon above the main image area. As generation progresses, you'll see
|
||||
increasingly detailed versions of the ultimate image.
|
||||
|
||||
3. Try playing with different settings, including image width and height, the
|
||||
Sampler, the Steps and the CFG scale.
|
||||
3. Try playing with different settings, including changing the main
|
||||
model, the image width and height, the Scheduler, the Steps and
|
||||
the CFG scale.
|
||||
|
||||
The _Model_ changes the main model. Thousands of custom models are
|
||||
now available, which generate a variety of image styles and
|
||||
subjects. While InvokeAI comes with a few starter models, it is
|
||||
easy to import new models into the application. See [Installing
|
||||
Models](../installation/050_INSTALLING_MODELS.md) for more details.
|
||||
|
||||
Image _Width_ and _Height_ do what you'd expect. However, be aware that
|
||||
larger images consume more VRAM memory and take longer to generate.
|
||||
|
||||
The _Sampler_ controls how the AI selects the image to display. Some
|
||||
The _Scheduler_ controls how the AI selects the image to display. Some
|
||||
samplers are more "creative" than others and will produce a wider range of
|
||||
variations (see next section). Some samplers run faster than others.
|
||||
|
||||
@ -142,17 +152,27 @@ feature set.
|
||||
to the input prompt. You can go as high or low as you like, but generally
|
||||
values greater than 20 won't improve things much, and values lower than 5
|
||||
will produce unexpected images. There are complex interactions between
|
||||
_Steps_, _CFG Scale_ and the _Sampler_, so experiment to find out what works
|
||||
_Steps_, _CFG Scale_ and the _Scheduler_, so experiment to find out what works
|
||||
for you.
|
||||
|
||||
The _Seed_ controls the series of values returned by InvokeAI's
|
||||
random number generator. Each unique seed value will generate a different
|
||||
image. To regenerate a previous image, simply use the original image's
|
||||
seed value. A slider to the right of the _Seed_ field will change the
|
||||
seed each time an image is generated.
|
||||
|
||||
4. To regenerate a previously-generated image, select the image you want and
|
||||
click _Use All_. This loads the text prompt and other original settings into
|
||||
the control panel. If you then press _Invoke_ it will regenerate the image
|
||||
exactly. You can also selectively modify the prompt or other settings to
|
||||
tweak the image.
|
||||
![Invoke Web Server - Control Panel 2](../assets/control-panel-2.png){ align=right width=400px }
|
||||
|
||||
Alternatively, you may click on _Use Seed_ to load just the image's seed,
|
||||
and leave other settings unchanged.
|
||||
4. To regenerate a previously-generated image, select the image you
|
||||
want and click the asterisk ("*") button at the top of the
|
||||
image. This loads the text prompt and other original settings into
|
||||
the control panel. If you then press _Invoke_ it will regenerate
|
||||
the image exactly. You can also selectively modify the prompt or
|
||||
other settings to tweak the image.
|
||||
|
||||
Alternatively, you may click on the "sprouting plant icon" to load
|
||||
just the image's seed, and leave other settings unchanged or the
|
||||
quote icon to load just the positive and negative prompts.
|
||||
|
||||
5. To regenerate a Stable Diffusion image that was generated by another SD
|
||||
package, you need to know its text prompt and its _Seed_. Copy-paste the
|
||||
@ -161,62 +181,22 @@ feature set.
|
||||
you Invoke, you will get something similar to the original image. It will
|
||||
not be exact unless you also set the correct values for the original
|
||||
sampler, CFG, steps and dimensions, but it will (usually) be close.
|
||||
|
||||
6. To save an image, right click on it to bring up a menu that will
|
||||
let you download the image, save it to a named image gallery, and
|
||||
copy it to the clipboard, among other things.
|
||||
|
||||
#### Variations on a theme
|
||||
#### Upscaling
|
||||
|
||||
1. Let's try generating some variations. Select your favorite sushi image from
|
||||
the gallery to load it. Then select "Use All" from the list of buttons
|
||||
above. This will load up all the settings used to generate this image,
|
||||
including its unique seed.
|
||||
![Invoke Web Server - Upscaling](../assets/upscaling.png){ align=right width=400px }
|
||||
|
||||
Go down to the Variations section of the Control Panel and set the button to
|
||||
On. Set Variation Amount to 0.2 to generate a modest number of variations on
|
||||
the image, and also set the Image counter to `4`. Press the `invoke` button.
|
||||
This will generate a series of related images. To obtain smaller variations,
|
||||
just lower the Variation Amount. You may also experiment with changing the
|
||||
Sampler. Some samplers generate more variability than others. _k_euler_a_ is
|
||||
particularly creative, while _ddim_ is pretty conservative.
|
||||
|
||||
2. For even more variations, experiment with increasing the setting for
|
||||
_Perlin_. This adds a bit of noise to the image generation process. Note
|
||||
that values of Perlin noise greater than 0.15 produce poor images for
|
||||
several of the samplers.
|
||||
|
||||
#### Facial reconstruction and upscaling
|
||||
|
||||
Stable Diffusion frequently produces mangled faces, particularly when there are
|
||||
multiple figures in the same scene. Stable Diffusion has particular issues with
|
||||
generating reallistic eyes. InvokeAI provides the ability to reconstruct faces
|
||||
using either the GFPGAN or CodeFormer libraries. For more information see
|
||||
[POSTPROCESS](POSTPROCESS.md).
|
||||
|
||||
1. Invoke a prompt that generates a mangled face. A prompt that often gives
|
||||
this is "portrait of a lawyer, 3/4 shot" (this is not intended as a slur
|
||||
against lawyers!) Once you have an image that needs some touching up, load
|
||||
it into the Image panel, and press the button with the face icon
|
||||
(highlighted in the first screenshot below). A dialog box will appear. Leave
|
||||
_Strength_ at 0.8 and press \*Restore Faces". If all goes well, the eyes and
|
||||
other aspects of the face will be improved (see the second screenshot)
|
||||
|
||||
![Invoke Web Server - Original Image](../assets/invoke-web-server-3.png)
|
||||
|
||||
![Invoke Web Server - Retouched Image](../assets/invoke-web-server-4.png)
|
||||
|
||||
The facial reconstruction _Strength_ field adjusts how aggressively the face
|
||||
library will try to alter the face. It can be as high as 1.0, but be aware
|
||||
that this often softens the face airbrush style, losing some details. The
|
||||
default 0.8 is usually sufficient.
|
||||
|
||||
2. "Upscaling" is the process of increasing the size of an image while
|
||||
retaining the sharpness. InvokeAI uses an external library called "ESRGAN"
|
||||
to do this. To invoke upscaling, simply select an image and press the _HD_
|
||||
button above it. You can select between 2X and 4X upscaling, and adjust the
|
||||
upscaling strength, which has much the same meaning as in facial
|
||||
reconstruction. Try running this on one of your previously-generated images.
|
||||
|
||||
3. Finally, you can run facial reconstruction and/or upscaling automatically
|
||||
after each Invocation. Go to the Advanced Options section of the Control
|
||||
Panel and turn on _Restore Face_ and/or _Upscale_.
|
||||
"Upscaling" is the process of increasing the size of an image while
|
||||
retaining the sharpness. InvokeAI uses an external library called
|
||||
"ESRGAN" to do this. To invoke upscaling, simply select an image
|
||||
and press the "expanding arrows" button above it. You can select
|
||||
between 2X and 4X upscaling, and adjust the upscaling strength,
|
||||
which has much the same meaning as in facial reconstruction. Try
|
||||
running this on one of your previously-generated images.
|
||||
|
||||
### Image to Image
|
||||
|
||||
@ -224,24 +204,14 @@ InvokeAI lets you take an existing image and use it as the basis for a new
|
||||
creation. You can use any sort of image, including a photograph, a scanned
|
||||
sketch, or a digital drawing, as long as it is in PNG or JPEG format.
|
||||
|
||||
For this tutorial, we'll use files named
|
||||
[Lincoln-and-Parrot-512.png](../assets/Lincoln-and-Parrot-512.png), and
|
||||
[Lincoln-and-Parrot-512-transparent.png](../assets/Lincoln-and-Parrot-512-transparent.png).
|
||||
Download these images to your local machine now to continue with the
|
||||
walkthrough.
|
||||
For this tutorial, we'll use the file named
|
||||
[Lincoln-and-Parrot-512.png](../assets/Lincoln-and-Parrot-512.png).
|
||||
|
||||
1. Click on the _Image to Image_ tab icon, which is the second icon from the
|
||||
top on the left-hand side of the screen:
|
||||
1. Click on the _Image to Image_ tab icon, which is the second icon
|
||||
from the top on the left-hand side of the screen. This will bring
|
||||
you to a screen similar to the one shown here:
|
||||
|
||||
<figure markdown>
|
||||
![Invoke Web Server - Image to Image Icon](../assets/invoke-web-server-5.png)
|
||||
</figure>
|
||||
|
||||
This will bring you to a screen similar to the one shown here:
|
||||
|
||||
<figure markdown>
|
||||
![Invoke Web Server - Image to Image Tab](../assets/invoke-web-server-6.png){:width="640px"}
|
||||
</figure>
|
||||
![Invoke Web Server - Image to Image Tab](../assets/invoke-web-server-6.png){ width="640px" }
|
||||
|
||||
2. Drag-and-drop the Lincoln-and-Parrot image into the Image panel, or click
|
||||
the blank area to get an upload dialog. The image will load into an area
|
||||
@ -255,120 +225,99 @@ walkthrough.
|
||||
![Invoke Web Server - Image to Image example](../assets/invoke-web-server-7.png){:width="640px"}
|
||||
|
||||
4. Experiment with the different settings. The most influential one in Image to
|
||||
Image is _Image to Image Strength_ located about midway down the control
|
||||
Image is _Denoising Strength_ located about midway down the control
|
||||
panel. By default it is set to 0.75, but can range from 0.0 to 0.99. The
|
||||
higher the value, the more of the original image the AI will replace. A
|
||||
value of 0 will leave the initial image completely unchanged, while 0.99
|
||||
will replace it completely. However, the Sampler and CFG Scale also
|
||||
will replace it completely. However, the _Scheduler_ and _CFG Scale_ also
|
||||
influence the final result. You can also generate variations in the same way
|
||||
as described in Text to Image.
|
||||
|
||||
5. What if we only want to change certain part(s) of the image and leave the
|
||||
rest intact? This is called Inpainting, and a future version of the InvokeAI
|
||||
web server will provide an interactive painting canvas on which you can
|
||||
directly draw the areas you wish to Inpaint into. For now, you can achieve
|
||||
this effect by using an external photoeditor tool to make one or more
|
||||
regions of the image transparent as described in [INPAINTING.md] and
|
||||
uploading that.
|
||||
|
||||
The file
|
||||
[Lincoln-and-Parrot-512-transparent.png](../assets/Lincoln-and-Parrot-512-transparent.png)
|
||||
is a version of the earlier image in which the area around the parrot has
|
||||
been replaced with transparency. Click on the "x" in the upper right of the
|
||||
Initial Image and upload the transparent version. Using the same prompt "old
|
||||
sea captain with raven on shoulder" try Invoking an image. This time, only
|
||||
the parrot will be replaced, leaving the rest of the original image intact:
|
||||
|
||||
<figure markdown>
|
||||
![Invoke Web Server - Inpainting](../assets/invoke-web-server-8.png){:width="640px"}
|
||||
</figure>
|
||||
5. What if we only want to change certain part(s) of the image and
|
||||
leave the rest intact? This is called Inpainting, and you can do
|
||||
it in the [Unified Canvas](UNIFIED_CANVAS.md). The Unified Canvas
|
||||
also allows you to extend borders of the image and fill in the
|
||||
blank areas, a process called outpainting.
|
||||
|
||||
6. Would you like to modify a previously-generated image using the Image to
|
||||
Image facility? Easy! While in the Image to Image panel, hover over any of
|
||||
the gallery images to see a little menu of icons pop up. Click the picture
|
||||
icon to instantly send the selected image to Image to Image as the initial
|
||||
image.
|
||||
Image facility? Easy! While in the Image to Image panel, drag and drop any
|
||||
image in the gallery into the Initial Image area, and it will be ready for
|
||||
use. You can do the same thing with the main image display. Click on the
|
||||
_Send to_ icon to get a menu of
|
||||
commands and choose "Send to Image to Image".
|
||||
|
||||
![Send To Icon](../assets/send-to-icon.png)
|
||||
|
||||
You can do the same from the Text to Image tab by clicking on the picture icon
|
||||
above the central image panel. The screenshot below shows where the "use as
|
||||
initial image" icons are located.
|
||||
### Textual Inversion, LoRA and ControlNet
|
||||
|
||||
![Invoke Web Server - Use as Image Links](../assets/invoke-web-server-9.png){:width="640px"}
|
||||
InvokeAI supports several different types of model files that
|
||||
extending the capabilities of the main model by adding artistic
|
||||
styles, special effects, or subjects. By mixing and matching textual
|
||||
inversion, LoRA and ControlNet models, you can achieve many
|
||||
interesting and beautiful effects.
|
||||
|
||||
### Unified Canvas
|
||||
We will give an example using a LoRA model named "Ink Scenery". This
|
||||
LoRA, which can be downloaded from Civitai (civitai.com), is
|
||||
specialized to paint landscapes that look like they were made with
|
||||
dripping india ink. To install this LoRA, we first download it and
|
||||
put it into the `autoimport/lora` folder located inside the
|
||||
`invokeai` root directory. After restarting the web server, the
|
||||
LoRA will now become available for use.
|
||||
|
||||
See the [Unified Canvas Guide](UNIFIED_CANVAS.md)
|
||||
To see this LoRA at work, we'll first generate an image without it
|
||||
using the standard `stable-diffusion-v1-5` model. Choose this
|
||||
model and enter the prompt "mountains, ink". Here is a typical
|
||||
generated image, a mountain range rendered in ink and watercolor
|
||||
wash:
|
||||
|
||||
## Reference
|
||||
![Ink Scenery without LoRA](../assets/lora-example-0.png){ width=512px }
|
||||
|
||||
### Additional Options
|
||||
Now let's install and activate the Ink Scenery LoRA. Go to
|
||||
https://civitai.com/models/78605/ink-scenery-or and download the LoRA
|
||||
model file to `invokeai/autoimport/lora` and restart the web
|
||||
server. (Alternatively, you can use [InvokeAI's Web Model
|
||||
Manager](../installation/050_INSTALLING_MODELS.md) to download and
|
||||
install the LoRA directly by typing its URL into the _Import
|
||||
Models_->_Location_ field).
|
||||
|
||||
| parameter <img width=160 align="right"> | effect |
|
||||
| --------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
|
||||
| `--web_develop` | Starts the web server in development mode. |
|
||||
| `--web_verbose` | Enables verbose logging |
|
||||
| `--cors [CORS ...]` | Additional allowed origins, comma-separated |
|
||||
| `--host HOST` | Web server: Host or IP to listen on. Set to 0.0.0.0 to accept traffic from other devices on your network. |
|
||||
| `--port PORT` | Web server: Port to listen on |
|
||||
| `--certfile CERTFILE` | Web server: Path to certificate file to use for SSL. Use together with --keyfile |
|
||||
| `--keyfile KEYFILE` | Web server: Path to private key file to use for SSL. Use together with --certfile' |
|
||||
| `--gui` | Start InvokeAI GUI - This is the "desktop mode" version of the web app. It uses Flask to create a desktop app experience of the webserver. |
|
||||
Scroll down the control panel until you get to the LoRA accordion
|
||||
section, and open it:
|
||||
|
||||
### Web Specific Features
|
||||
![LoRA Section](../assets/lora-example-1.png){ width=512px }
|
||||
|
||||
The web experience offers an incredibly easy-to-use experience for interacting
|
||||
with the InvokeAI toolkit. For detailed guidance on individual features, see the
|
||||
Feature-specific help documents available in this directory. Note that the
|
||||
latest functionality available in the CLI may not always be available in the Web
|
||||
interface.
|
||||
Click the popup menu and select "Ink scenery". (If it isn't there, then
|
||||
the model wasn't installed to the right place, or perhaps you forgot
|
||||
to restart the web server.) The LoRA section will change to look like this:
|
||||
|
||||
#### Dark Mode & Light Mode
|
||||
![LoRA Section Loaded](../assets/lora-example-2.png){ width=512px }
|
||||
|
||||
The InvokeAI interface is available in a nano-carbon black & purple Dark Mode,
|
||||
and a "burn your eyes out Nosferatu" Light Mode. These can be toggled by
|
||||
clicking the Sun/Moon icons at the top right of the interface.
|
||||
Note that there is now a slider control for _Ink scenery_. The slider
|
||||
controls how much influence the LoRA model will have on the generated
|
||||
image.
|
||||
|
||||
![InvokeAI Web Server - Dark Mode](../assets/invoke_web_dark.png)
|
||||
Run the "mountains, ink" prompt again and observe the change in style:
|
||||
|
||||
![InvokeAI Web Server - Light Mode](../assets/invoke_web_light.png)
|
||||
![Ink Scenery](../assets/lora-example-3.png){ width=512px }
|
||||
|
||||
#### Invocation Toolbar
|
||||
Try adjusting the weight slider for larger and smaller weights and
|
||||
generate the image after each adjustment. The higher the weight, the
|
||||
more influence the LoRA will have.
|
||||
|
||||
The left side of the InvokeAI interface is available for customizing the prompt
|
||||
and the settings used for invoking your new image. Typing your prompt into the
|
||||
open text field and clicking the Invoke button will produce the image based on
|
||||
the settings configured in the toolbar.
|
||||
To remove the LoRA completely, just click on its trash can icon.
|
||||
|
||||
See below for additional documentation related to each feature:
|
||||
Multiple LoRAs can be added simultaneously and combined with textual
|
||||
inversions and ControlNet models. Please see [Textual Inversions and
|
||||
LoRAs](CONCEPTS.md) and [Using ControlNet](CONTROLNET.md) for details.
|
||||
|
||||
- [Variations](./VARIATIONS.md)
|
||||
- [Upscaling](./POSTPROCESS.md#upscaling)
|
||||
- [Image to Image](./IMG2IMG.md)
|
||||
- [Other](./OTHER.md)
|
||||
## Summary
|
||||
|
||||
#### Invocation Gallery
|
||||
|
||||
The currently selected --outdir (or the default outputs folder) will display all
|
||||
previously generated files on load. As new invocations are generated, these will
|
||||
be dynamically added to the gallery, and can be previewed by selecting them.
|
||||
Each image also has a simple set of actions (e.g., Delete, Use Seed, Use All
|
||||
Parameters, etc.) that can be accessed by hovering over the image.
|
||||
|
||||
#### Image Workspace
|
||||
|
||||
When an image from the Invocation Gallery is selected, or is generated, the
|
||||
image will be displayed within the center of the interface. A quickbar of common
|
||||
image interactions are displayed along the top of the image, including:
|
||||
|
||||
- Use image in the `Image to Image` workflow
|
||||
- Initialize Face Restoration on the selected file
|
||||
- Initialize Upscaling on the selected file
|
||||
- View File metadata and details
|
||||
- Delete the file
|
||||
This walkthrough just skims the surface of the many things InvokeAI
|
||||
can do. Please see [Features](index.md) for more detailed reference
|
||||
guides.
|
||||
|
||||
## Acknowledgements
|
||||
|
||||
A huge shout-out to the core team working to make this vision a reality,
|
||||
A huge shout-out to the core team working to make the Web GUI a reality,
|
||||
including [psychedelicious](https://github.com/psychedelicious),
|
||||
[Kyle0654](https://github.com/Kyle0654) and
|
||||
[blessedcoolant](https://github.com/blessedcoolant).
|
||||
|
@ -17,8 +17,12 @@ a single convenient digital artist-optimized user interface.
|
||||
### * [Prompt Engineering](PROMPTS.md)
|
||||
Get the images you want with the InvokeAI prompt engineering language.
|
||||
|
||||
## * The [Concepts Library](CONCEPTS.md)
|
||||
Add custom subjects and styles using HuggingFace's repository of embeddings.
|
||||
### * The [LoRA, LyCORIS and Textual Inversion Models](CONCEPTS.md)
|
||||
Add custom subjects and styles using a variety of fine-tuned models.
|
||||
|
||||
### * [ControlNet](CONTROLNET.md)
|
||||
Learn how to install and use ControlNet models for fine control over
|
||||
image output.
|
||||
|
||||
### * [Image-to-Image Guide](IMG2IMG.md)
|
||||
Use a seed image to build new creations in the CLI.
|
||||
@ -29,26 +33,28 @@ are the ticket.
|
||||
|
||||
## Model Management
|
||||
|
||||
## * [Model Installation](../installation/050_INSTALLING_MODELS.md)
|
||||
### * [Model Installation](../installation/050_INSTALLING_MODELS.md)
|
||||
Learn how to import third-party models and switch among them. This
|
||||
guide also covers optimizing models to load quickly.
|
||||
|
||||
## * [Merging Models](MODEL_MERGING.md)
|
||||
### * [Merging Models](MODEL_MERGING.md)
|
||||
Teach an old model new tricks. Merge 2-3 models together to create a
|
||||
new model that combines characteristics of the originals.
|
||||
|
||||
## * [Textual Inversion](TRAINING.md)
|
||||
### * [Textual Inversion](TRAINING.md)
|
||||
Personalize models by adding your own style or subjects.
|
||||
|
||||
# Other Features
|
||||
## Other Features
|
||||
|
||||
## * [The NSFW Checker](NSFW.md)
|
||||
### * [The NSFW Checker](NSFW.md)
|
||||
Prevent InvokeAI from displaying unwanted racy images.
|
||||
|
||||
## * [Controlling Logging](LOGGING.md)
|
||||
### * [Controlling Logging](LOGGING.md)
|
||||
Control how InvokeAI logs status messages.
|
||||
|
||||
## * [Miscellaneous](OTHER.md)
|
||||
<!-- OUT OF DATE
|
||||
### * [Miscellaneous](OTHER.md)
|
||||
Run InvokeAI on Google Colab, generate images with repeating patterns,
|
||||
batch process a file of prompts, increase the "creativity" of image
|
||||
generation by adding initial noise, and more!
|
||||
-->
|
||||
|
@ -24,7 +24,7 @@ title: Home
|
||||
|
||||
[![CI checks on main badge]][ci checks on main link]
|
||||
[![CI checks on dev badge]][ci checks on dev link]
|
||||
[![latest commit to dev badge]][latest commit to dev link]
|
||||
<!-- [![latest commit to dev badge]][latest commit to dev link] -->
|
||||
|
||||
[![github open issues badge]][github open issues link]
|
||||
[![github open prs badge]][github open prs link]
|
||||
@ -54,10 +54,10 @@ title: Home
|
||||
[github stars badge]:
|
||||
https://flat.badgen.net/github/stars/invoke-ai/InvokeAI?icon=github
|
||||
[github stars link]: https://github.com/invoke-ai/InvokeAI/stargazers
|
||||
[latest commit to dev badge]:
|
||||
<!-- [latest commit to dev badge]:
|
||||
https://flat.badgen.net/github/last-commit/invoke-ai/InvokeAI/development?icon=github&color=yellow&label=last%20dev%20commit&cache=900
|
||||
[latest commit to dev link]:
|
||||
https://github.com/invoke-ai/InvokeAI/commits/development
|
||||
https://github.com/invoke-ai/InvokeAI/commits/main -->
|
||||
[latest release badge]:
|
||||
https://flat.badgen.net/github/release/invoke-ai/InvokeAI/development?icon=github
|
||||
[latest release link]: https://github.com/invoke-ai/InvokeAI/releases
|
||||
@ -82,6 +82,25 @@ Q&A</a>]
|
||||
|
||||
This fork is rapidly evolving. Please use the [Issues tab](https://github.com/invoke-ai/InvokeAI/issues) to report bugs and make feature requests. Be sure to use the provided templates. They will help aid diagnose issues faster.
|
||||
|
||||
## :octicons-package-dependencies-24: Installation
|
||||
|
||||
This fork is supported across Linux, Windows and Macintosh. Linux users can use
|
||||
either an Nvidia-based card (with CUDA support) or an AMD card (using the ROCm
|
||||
driver).
|
||||
|
||||
### [Installation Getting Started Guide](installation)
|
||||
#### **[Automated Installer](installation/010_INSTALL_AUTOMATED.md)**
|
||||
✅ This is the recommended installation method for first-time users.
|
||||
#### [Manual Installation](installation/020_INSTALL_MANUAL.md)
|
||||
This method is recommended for experienced users and developers
|
||||
#### [Docker Installation](installation/040_INSTALL_DOCKER.md)
|
||||
This method is recommended for those familiar with running Docker containers
|
||||
### Other Installation Guides
|
||||
- [PyPatchMatch](installation/060_INSTALL_PATCHMATCH.md)
|
||||
- [XFormers](installation/070_INSTALL_XFORMERS.md)
|
||||
- [CUDA and ROCm Drivers](installation/030_INSTALL_CUDA_AND_ROCM.md)
|
||||
- [Installing New Models](installation/050_INSTALLING_MODELS.md)
|
||||
|
||||
## :fontawesome-solid-computer: Hardware Requirements
|
||||
|
||||
### :octicons-cpu-24: System
|
||||
@ -107,24 +126,6 @@ images in full-precision mode:
|
||||
- At least 18 GB of free disk space for the machine learning model, Python, and
|
||||
all its dependencies.
|
||||
|
||||
## :octicons-package-dependencies-24: Installation
|
||||
|
||||
This fork is supported across Linux, Windows and Macintosh. Linux users can use
|
||||
either an Nvidia-based card (with CUDA support) or an AMD card (using the ROCm
|
||||
driver).
|
||||
|
||||
### [Installation Getting Started Guide](installation)
|
||||
#### [Automated Installer](installation/010_INSTALL_AUTOMATED.md)
|
||||
This method is recommended for 1st time users
|
||||
#### [Manual Installation](installation/020_INSTALL_MANUAL.md)
|
||||
This method is recommended for experienced users and developers
|
||||
#### [Docker Installation](installation/040_INSTALL_DOCKER.md)
|
||||
This method is recommended for those familiar with running Docker containers
|
||||
### Other Installation Guides
|
||||
- [PyPatchMatch](installation/060_INSTALL_PATCHMATCH.md)
|
||||
- [XFormers](installation/070_INSTALL_XFORMERS.md)
|
||||
- [CUDA and ROCm Drivers](installation/030_INSTALL_CUDA_AND_ROCM.md)
|
||||
- [Installing New Models](installation/050_INSTALLING_MODELS.md)
|
||||
|
||||
## :octicons-gift-24: InvokeAI Features
|
||||
|
||||
@ -145,6 +146,7 @@ This method is recommended for those familiar with running Docker containers
|
||||
### Model Management
|
||||
- [Installing](installation/050_INSTALLING_MODELS.md)
|
||||
- [Model Merging](features/MODEL_MERGING.md)
|
||||
- [ControlNet Models](features/CONTROLNET.md)
|
||||
- [Style/Subject Concepts and Embeddings](features/CONCEPTS.md)
|
||||
- [Not Safe for Work (NSFW) Checker](features/NSFW.md)
|
||||
<!-- seperator -->
|
||||
@ -221,14 +223,10 @@ get solutions for common installation problems and other issues.
|
||||
|
||||
Anyone who wishes to contribute to this project, whether documentation,
|
||||
features, bug fixes, code cleanup, testing, or code reviews, is very much
|
||||
encouraged to do so. If you are unfamiliar with how to contribute to GitHub
|
||||
projects, here is a
|
||||
[Getting Started Guide](https://opensource.com/article/19/7/create-pull-request-github).
|
||||
encouraged to do so.
|
||||
|
||||
A full set of contribution guidelines, along with templates, are in progress,
|
||||
but for now the most important thing is to **make your pull request against the
|
||||
"development" branch**, and not against "main". This will help keep public
|
||||
breakage to a minimum and will allow you to propose more radical changes.
|
||||
[Please take a look at our Contribution documentation to learn more about contributing to InvokeAI.
|
||||
](contributing/CONTRIBUTING.md)
|
||||
|
||||
## :octicons-person-24: Contributors
|
||||
|
||||
|
@ -124,9 +124,9 @@ experimental versions later.
|
||||
[latest release](https://github.com/invoke-ai/InvokeAI/releases/latest),
|
||||
and look for a file named:
|
||||
|
||||
- InvokeAI-installer-v2.X.X.zip
|
||||
- InvokeAI-installer-v3.X.X.zip
|
||||
|
||||
where "2.X.X" is the latest released version. The file is located
|
||||
where "3.X.X" is the latest released version. The file is located
|
||||
at the very bottom of the release page, under **Assets**.
|
||||
|
||||
4. **Unpack the installer**: Unpack the zip file into a convenient directory. This will create a new
|
||||
|
@ -15,7 +15,7 @@ See the [troubleshooting
|
||||
section](010_INSTALL_AUTOMATED.md#troubleshooting) of the automated
|
||||
install guide for frequently-encountered installation issues.
|
||||
|
||||
## Main Application
|
||||
## Installation options
|
||||
|
||||
1. [Automated Installer](010_INSTALL_AUTOMATED.md)
|
||||
|
||||
@ -24,6 +24,9 @@ install guide for frequently-encountered installation issues.
|
||||
"developer console" which will help us debug problems with you and
|
||||
give you to access experimental features.
|
||||
|
||||
|
||||
✅ This is the recommended option for first time users.
|
||||
|
||||
2. [Manual Installation](020_INSTALL_MANUAL.md)
|
||||
|
||||
In this method you will manually run the commands needed to install
|
||||
|
52
docs/nodes/communityNodes.md
Normal file
@ -0,0 +1,52 @@
|
||||
# Community Nodes
|
||||
|
||||
These are nodes that have been developed by the community, for the community. If you're not sure what a node is, you can learn more about nodes [here](overview.md).
|
||||
|
||||
If you'd like to submit a node for the community, please refer to the [node creation overview](./overview.md#contributing-nodes).
|
||||
|
||||
To download a node, simply download the `.py` node file from the link and add it to the `invokeai/app/invocations/` folder in your Invoke AI install location. Along with the node, an example node graph should be provided to help you get started with the node.
|
||||
|
||||
To use a community node graph, download the the `.json` node graph file and load it into Invoke AI via the **Load Nodes** button on the Node Editor.
|
||||
|
||||
## Disclaimer
|
||||
|
||||
The nodes linked below have been developed and contributed by members of the Invoke AI community. While we strive to ensure the quality and safety of these contributions, we do not guarantee the reliability or security of the nodes. If you have issues or concerns with any of the nodes below, please raise it on GitHub or in the Discord.
|
||||
|
||||
## List of Nodes
|
||||
|
||||
### Face Mask
|
||||
|
||||
**Description:** This node autodetects a face in the image using MediaPipe and masks it by making it transparent. Via outpainting you can swap faces with other faces, or invert the mask and swap things around the face with other things. Additionally, you can supply X and Y offset values to scale/change the shape of the mask for finer control. The node also outputs an all-white mask in the same dimensions as the input image. This is needed by the inpaint node (and unified canvas) for outpainting.
|
||||
|
||||
**Node Link:** https://github.com/ymgenesis/InvokeAI/blob/facemaskmediapipe/invokeai/app/invocations/facemask.py
|
||||
|
||||
**Example Node Graph:** https://www.mediafire.com/file/gohn5sb1bfp8use/21-July_2023-FaceMask.json/file
|
||||
|
||||
**Output Examples**
|
||||
|
||||
![2e3168cb-af6a-475d-bfac-c7b7fd67b4c2](https://github.com/ymgenesis/InvokeAI/assets/25252829/a5ad7d44-2ada-4b3c-a56e-a21f8244a1ac)
|
||||
![2_annotated](https://github.com/ymgenesis/InvokeAI/assets/25252829/53416c8a-a23b-4d76-bb6d-3cfd776e0096)
|
||||
![2fe2150c-fd08-4e26-8c36-f0610bf441bb](https://github.com/ymgenesis/InvokeAI/assets/25252829/b0f7ecfe-f093-4147-a904-b9f131b41dc9)
|
||||
![831b6b98-4f0f-4360-93c8-69a9c1338cbe](https://github.com/ymgenesis/InvokeAI/assets/25252829/fc7b0622-e361-4155-8a76-082894d084f0)
|
||||
|
||||
--------------------------------
|
||||
### Super Cool Node Template
|
||||
|
||||
**Description:** This node allows you to do super cool things with InvokeAI.
|
||||
|
||||
**Node Link:** https://github.com/invoke-ai/InvokeAI/fake_node.py
|
||||
|
||||
**Example Node Graph:** https://github.com/invoke-ai/InvokeAI/fake_node_graph.json
|
||||
|
||||
**Output Examples**
|
||||
|
||||
![Invoke AI](https://invoke-ai.github.io/InvokeAI/assets/invoke_ai_banner.png)
|
||||
|
||||
### Ideal Size
|
||||
|
||||
**Description:** This node calculates an ideal image size for a first pass of a multi-pass upscaling. The aim is to avoid duplication that results from choosing a size larger than the model is capable of.
|
||||
|
||||
**Node Link:** https://github.com/JPPhoto/ideal-size-node
|
||||
|
||||
## Help
|
||||
If you run into any issues with a node, please post in the [InvokeAI Discord](https://discord.gg/ZmtBAhwWhy).
|
42
docs/nodes/overview.md
Normal file
@ -0,0 +1,42 @@
|
||||
# Nodes
|
||||
|
||||
## What are Nodes?
|
||||
An Node is simply a single operation that takes in some inputs and gives
|
||||
out some outputs. We can then chain multiple nodes together to create more
|
||||
complex functionality. All InvokeAI features are added through nodes.
|
||||
|
||||
This means nodes can be used to easily extend the image generation capabilities of InvokeAI, and allow you build workflows to suit your needs.
|
||||
|
||||
You can read more about nodes and the node editor [here](../features/NODES.md).
|
||||
|
||||
|
||||
## Downloading Nodes
|
||||
To download a new node, visit our list of [Community Nodes](communityNodes.md). These are nodes that have been created by the community, for the community.
|
||||
|
||||
|
||||
## Contributing Nodes
|
||||
|
||||
To learn about creating a new node, please visit our [Node creation documenation](../contributing/INVOCATIONS.md).
|
||||
|
||||
Once you’ve created a node and confirmed that it behaves as expected locally, follow these steps:
|
||||
* Make sure the node is contained in a new Python (.py) file
|
||||
* Submit a pull request with a link to your node in GitHub against the `nodes` branch to add the node to the [Community Nodes](Community Nodes) list
|
||||
* Make sure you are following the template below and have provided all relevant details about the node and what it does.
|
||||
* A maintainer will review the pull request and node. If the node is aligned with the direction of the project, you might be asked for permission to include it in the core project.
|
||||
|
||||
### Community Node Template
|
||||
|
||||
```markdown
|
||||
--------------------------------
|
||||
### Super Cool Node Template
|
||||
|
||||
**Description:** This node allows you to do super cool things with InvokeAI.
|
||||
|
||||
**Node Link:** https://github.com/invoke-ai/InvokeAI/fake_node.py
|
||||
|
||||
**Example Node Graph:** https://github.com/invoke-ai/InvokeAI/fake_node_graph.json
|
||||
|
||||
**Output Examples**
|
||||
|
||||
![InvokeAI](https://invoke-ai.github.io/InvokeAI/assets/invoke_ai_banner.png)
|
||||
```
|
@ -17,67 +17,267 @@ We thank them for all of their time and hard work.
|
||||
|
||||
* @lstein (Lincoln Stein) - Co-maintainer
|
||||
* @blessedcoolant - Co-maintainer
|
||||
* @hipsterusername (Kent Keirsey) - Product Manager
|
||||
* @psychedelicious - Web Team Leader
|
||||
* @hipsterusername (Kent Keirsey) - Co-maintainer, CEO, Positive Vibes
|
||||
* @psychedelicious (Spencer Mabrito) - Web Team Leader
|
||||
* @Kyle0654 (Kyle Schouviller) - Node Architect and General Backend Wizard
|
||||
* @damian0815 - Attention Systems and Gameplay Engineer
|
||||
* @mauwii (Matthias Wild) - Continuous integration and product maintenance engineer
|
||||
* @Netsvetaev (Artur Netsvetaev) - UI/UX Developer
|
||||
* @tildebyte - General gadfly and resident (self-appointed) know-it-all
|
||||
* @keturn - Lead for Diffusers port
|
||||
* @damian0815 - Attention Systems and Compel Maintainer
|
||||
* @ebr (Eugene Brodsky) - Cloud/DevOps/Sofware engineer; your friendly neighbourhood cluster-autoscaler
|
||||
* @jpphoto (Jonathan Pollack) - Inference and rendering engine optimization
|
||||
* @genomancer (Gregg Helt) - Model training and merging
|
||||
* @genomancer (Gregg Helt) - Controlnet support
|
||||
* @StAlKeR7779 (Sergey Borisov) - Torch stack, ONNX, model management, optimization
|
||||
* @cheerio (Mary Rogers) - Lead Engineer & Web App Development
|
||||
* @brandon (Brandon Rising) - Platform, Infrastructure, Backend Systems
|
||||
* @ryanjdick (Ryan Dick) - Machine Learning & Training
|
||||
* @millu (Millun Atluri) - Community Manager, Documentation, Node-wrangler
|
||||
* @chainchompa (Jennifer Player) - Web Development & Chain-Chomping
|
||||
* @keturn (Kevin Turner) - Diffusers
|
||||
* @gogurt enjoyer - Discord moderator and end user support
|
||||
* @whosawhatsis - Discord moderator and end user support
|
||||
* @dwinrger - Discord moderator and end user support
|
||||
* @526christian - Discord moderator and end user support
|
||||
|
||||
## **Contributions by**
|
||||
## **Full List of Contributors by Commit Name**
|
||||
|
||||
- [Sean McLellan](https://github.com/Oceanswave)
|
||||
- [Kevin Gibbons](https://github.com/bakkot)
|
||||
- [Tesseract Cat](https://github.com/TesseractCat)
|
||||
- [blessedcoolant](https://github.com/blessedcoolant)
|
||||
- [David Ford](https://github.com/david-ford)
|
||||
- [yunsaki](https://github.com/yunsaki)
|
||||
- [James Reynolds](https://github.com/magnusviri)
|
||||
- [David Wager](https://github.com/maddavid123)
|
||||
- [Jason Toffaletti](https://github.com/toffaletti)
|
||||
- [tildebyte](https://github.com/tildebyte)
|
||||
- [Cragin Godley](https://github.com/cgodley)
|
||||
- [BlueAmulet](https://github.com/BlueAmulet)
|
||||
- [Benjamin Warner](https://github.com/warner-benjamin)
|
||||
- [Cora Johnson-Roberson](https://github.com/corajr)
|
||||
- [veprogames](https://github.com/veprogames)
|
||||
- [JigenD](https://github.com/JigenD)
|
||||
- [Niek van der Maas](https://github.com/Niek)
|
||||
- [Henry van Megen](https://github.com/hvanmegen)
|
||||
- [Håvard Gulldahl](https://github.com/havardgulldahl)
|
||||
- [greentext2](https://github.com/greentext2)
|
||||
- [Simon Vans-Colina](https://github.com/simonvc)
|
||||
- [Gabriel Rotbart](https://github.com/gabrielrotbart)
|
||||
- [Eric Khun](https://github.com/erickhun)
|
||||
- [Brent Ozar](https://github.com/BrentOzar)
|
||||
- [nderscore](https://github.com/nderscore)
|
||||
- [Mikhail Tishin](https://github.com/tishin)
|
||||
- [Tom Elovi Spruce](https://github.com/ilovecomputers)
|
||||
- [spezialspezial](https://github.com/spezialspezial)
|
||||
- [Yosuke Shinya](https://github.com/shinya7y)
|
||||
- [Andy Pilate](https://github.com/Cubox)
|
||||
- [Muhammad Usama](https://github.com/SMUsamaShah)
|
||||
- [Arturo Mendivil](https://github.com/artmen1516)
|
||||
- [Paul Sajna](https://github.com/sajattack)
|
||||
- [Samuel Husso](https://github.com/shusso)
|
||||
- [nicolai256](https://github.com/nicolai256)
|
||||
- [Mihai](https://github.com/mh-dm)
|
||||
- [Any Winter](https://github.com/any-winter-4079)
|
||||
- [Doggettx](https://github.com/doggettx)
|
||||
- [Matthias Wild](https://github.com/mauwii)
|
||||
- [Kyle Schouviller](https://github.com/kyle0654)
|
||||
- [rabidcopy](https://github.com/rabidcopy)
|
||||
- [Dominic Letz](https://github.com/dominicletz)
|
||||
- [Dmitry T.](https://github.com/ArDiouscuros)
|
||||
- [Kent Keirsey](https://github.com/hipsterusername)
|
||||
- [psychedelicious](https://github.com/psychedelicious)
|
||||
- [damian0815](https://github.com/damian0815)
|
||||
- [Eugene Brodsky](https://github.com/ebr)
|
||||
- AbdBarho
|
||||
- ablattmann
|
||||
- AdamOStark
|
||||
- Adam Rice
|
||||
- Airton Silva
|
||||
- Alexander Eichhorn
|
||||
- Alexandre D. Roberge
|
||||
- Andreas Rozek
|
||||
- Andre LaBranche
|
||||
- Andy Bearman
|
||||
- Andy Luhrs
|
||||
- Andy Pilate
|
||||
- Any-Winter-4079
|
||||
- apolinario
|
||||
- ArDiouscuros
|
||||
- Armando C. Santisbon
|
||||
- Arthur Holstvoogd
|
||||
- artmen1516
|
||||
- Artur
|
||||
- Arturo Mendivil
|
||||
- Ben Alkov
|
||||
- Benjamin Warner
|
||||
- Bernard Maltais
|
||||
- blessedcoolant
|
||||
- blhook
|
||||
- BlueAmulet
|
||||
- Bouncyknighter
|
||||
- Brandon Rising
|
||||
- Brent Ozar
|
||||
- Brian Racer
|
||||
- bsilvereagle
|
||||
- c67e708d
|
||||
- CapableWeb
|
||||
- Carson Katri
|
||||
- Chloe
|
||||
- Chris Dawson
|
||||
- Chris Hayes
|
||||
- Chris Jones
|
||||
- chromaticist
|
||||
- Claus F. Strasburger
|
||||
- cmdr2
|
||||
- cody
|
||||
- Conor Reid
|
||||
- Cora Johnson-Roberson
|
||||
- coreco
|
||||
- cosmii02
|
||||
- cpacker
|
||||
- Cragin Godley
|
||||
- creachec
|
||||
- Damian Stewart
|
||||
- Daniel Manzke
|
||||
- Danny Beer
|
||||
- Dan Sully
|
||||
- David Burnett
|
||||
- David Ford
|
||||
- David Regla
|
||||
- David Wager
|
||||
- Daya Adianto
|
||||
- db3000
|
||||
- Denis Olshin
|
||||
- Dennis
|
||||
- Dominic Letz
|
||||
- DrGunnarMallon
|
||||
- Edward Johan
|
||||
- elliotsayes
|
||||
- Elrik
|
||||
- ElrikUnderlake
|
||||
- Eric Khun
|
||||
- Eric Wolf
|
||||
- Eugene Brodsky
|
||||
- ExperimentalCyborg
|
||||
- Fabian Bahl
|
||||
- Fabio 'MrWHO' Torchetti
|
||||
- fattire
|
||||
- Felipe Nogueira
|
||||
- Félix Sanz
|
||||
- figgefigge
|
||||
- Gabriel Mackievicz Telles
|
||||
- gabrielrotbart
|
||||
- gallegonovato
|
||||
- Gérald LONLAS
|
||||
- GitHub Actions Bot
|
||||
- gogurtenjoyer
|
||||
- greentext2
|
||||
- Gregg Helt
|
||||
- H4rk
|
||||
- Håvard Gulldahl
|
||||
- henry
|
||||
- Henry van Megen
|
||||
- hipsterusername
|
||||
- hj
|
||||
- Hosted Weblate
|
||||
- Iman Karim
|
||||
- ismail ihsan bülbül
|
||||
- Ivan Efimov
|
||||
- jakehl
|
||||
- Jakub Kolčář
|
||||
- JamDon2
|
||||
- James Reynolds
|
||||
- Jan Skurovec
|
||||
- Jari Vetoniemi
|
||||
- Jason Toffaletti
|
||||
- Jaulustus
|
||||
- Jeff Mahoney
|
||||
- jeremy
|
||||
- Jeremy Clark
|
||||
- JigenD
|
||||
- Jim Hays
|
||||
- Johan Roxendal
|
||||
- Johnathon Selstad
|
||||
- Jonathan
|
||||
- Joseph Dries III
|
||||
- JPPhoto
|
||||
- jspraul
|
||||
- Justin Wong
|
||||
- Juuso V
|
||||
- Kaspar Emanuel
|
||||
- Katsuyuki-Karasawa
|
||||
- Kent Keirsey
|
||||
- Kevin Coakley
|
||||
- Kevin Gibbons
|
||||
- Kevin Schaul
|
||||
- Kevin Turner
|
||||
- krummrey
|
||||
- Kyle Lacy
|
||||
- Kyle Schouviller
|
||||
- Lawrence Norton
|
||||
- LemonDouble
|
||||
- Leo Pasanen
|
||||
- Lincoln Stein
|
||||
- LoganPederson
|
||||
- Lynne Whitehorn
|
||||
- majick
|
||||
- Marco Labarile
|
||||
- Martin Kristiansen
|
||||
- Mary Hipp Rogers
|
||||
- mastercaster9000
|
||||
- Matthias Wild
|
||||
- michaelk71
|
||||
- mickr777
|
||||
- Mihai
|
||||
- Mihail Dumitrescu
|
||||
- Mikhail Tishin
|
||||
- Millun Atluri
|
||||
- Minjune Song
|
||||
- mitien
|
||||
- mofuzz
|
||||
- Muhammad Usama
|
||||
- Name
|
||||
- _nderscore
|
||||
- Netzer R
|
||||
- Nicholas Koh
|
||||
- Nicholas Körfer
|
||||
- nicolai256
|
||||
- Niek van der Maas
|
||||
- noodlebox
|
||||
- Nuno Coração
|
||||
- ofirkris
|
||||
- Olivier Louvignes
|
||||
- owenvincent
|
||||
- Patrick Esser
|
||||
- Patrick Tien
|
||||
- Patrick von Platen
|
||||
- Paul Sajna
|
||||
- pejotr
|
||||
- Peter Baylies
|
||||
- Peter Lin
|
||||
- plucked
|
||||
- prixt
|
||||
- psychedelicious
|
||||
- Rainer Bernhardt
|
||||
- Riccardo Giovanetti
|
||||
- Rich Jones
|
||||
- rmagur1203
|
||||
- Rob Baines
|
||||
- Robert Bolender
|
||||
- Robin Rombach
|
||||
- Rohan Barar
|
||||
- rpagliuca
|
||||
- rromb
|
||||
- Rupesh Sreeraman
|
||||
- Ryan Cao
|
||||
- Saifeddine
|
||||
- Saifeddine ALOUI
|
||||
- SammCheese
|
||||
- Sammy
|
||||
- sammyf
|
||||
- Samuel Husso
|
||||
- Scott Lahteine
|
||||
- Sean McLellan
|
||||
- Sebastian Aigner
|
||||
- Sergey Borisov
|
||||
- Sergey Krashevich
|
||||
- Shapor Naghibzadeh
|
||||
- Shawn Zhong
|
||||
- Simon Vans-Colina
|
||||
- skunkworxdark
|
||||
- slashtechno
|
||||
- spezialspezial
|
||||
- ssantos
|
||||
- StAlKeR7779
|
||||
- Stephan Koglin-Fischer
|
||||
- SteveCaruso
|
||||
- Steve Martinelli
|
||||
- Steven Frank
|
||||
- System X - Files
|
||||
- Taylor Kems
|
||||
- techicode
|
||||
- techybrain-dev
|
||||
- tesseractcat
|
||||
- thealanle
|
||||
- Thomas
|
||||
- tildebyte
|
||||
- Tim Cabbage
|
||||
- Tom
|
||||
- Tom Elovi Spruce
|
||||
- Tom Gouville
|
||||
- tomosuto
|
||||
- Travco
|
||||
- Travis Palmer
|
||||
- tyler
|
||||
- unknown
|
||||
- user1
|
||||
- Vedant Madane
|
||||
- veprogames
|
||||
- wa.code
|
||||
- wfng92
|
||||
- whosawhatsis
|
||||
- Will
|
||||
- William Becher
|
||||
- William Chong
|
||||
- xra
|
||||
- Yeung Yiu Hung
|
||||
- ymgenesis
|
||||
- Yorzaren
|
||||
- Yosuke Shinya
|
||||
- yun saki
|
||||
- Zadagu
|
||||
- zeptofine
|
||||
- 冯不游
|
||||
- 唐澤 克幸
|
||||
|
||||
## **Original CompVis Authors**
|
||||
|
||||
|
@ -58,7 +58,8 @@ class ApiDependencies:
|
||||
|
||||
@staticmethod
|
||||
def initialize(config: InvokeAIAppConfig, event_handler_id: int, logger: Logger = logger):
|
||||
logger.debug(f"InvokeAI version {__version__}")
|
||||
logger.info(f"InvokeAI version {__version__}")
|
||||
logger.info(f"Root directory = {str(config.root_path)}")
|
||||
logger.debug(f"Internet connectivity is {config.internet_available}")
|
||||
|
||||
events = FastAPIEventService(event_handler_id)
|
||||
|
@ -1,9 +1,32 @@
|
||||
import typing
|
||||
from enum import Enum
|
||||
from fastapi import Body
|
||||
from fastapi.routing import APIRouter
|
||||
from pathlib import Path
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
from invokeai.backend.image_util.patchmatch import PatchMatch
|
||||
from invokeai.backend.image_util.safety_checker import SafetyChecker
|
||||
from invokeai.backend.image_util.invisible_watermark import InvisibleWatermark
|
||||
from invokeai.app.invocations.upscale import ESRGAN_MODELS
|
||||
|
||||
from invokeai.version import __version__
|
||||
|
||||
from ..dependencies import ApiDependencies
|
||||
from invokeai.backend.util.logging import logging
|
||||
|
||||
class LogLevel(int, Enum):
|
||||
NotSet = logging.NOTSET
|
||||
Debug = logging.DEBUG
|
||||
Info = logging.INFO
|
||||
Warning = logging.WARNING
|
||||
Error = logging.ERROR
|
||||
Critical = logging.CRITICAL
|
||||
|
||||
class Upscaler(BaseModel):
|
||||
upscaling_method: str = Field(description="Name of upscaling method")
|
||||
upscaling_models: list[str] = Field(description="List of upscaling models for this method")
|
||||
|
||||
app_router = APIRouter(prefix="/v1/app", tags=["app"])
|
||||
|
||||
|
||||
@ -17,6 +40,9 @@ class AppConfig(BaseModel):
|
||||
"""App Config Response"""
|
||||
|
||||
infill_methods: list[str] = Field(description="List of available infill methods")
|
||||
upscaling_methods: list[Upscaler] = Field(description="List of upscaling methods")
|
||||
nsfw_methods: list[str] = Field(description="List of NSFW checking methods")
|
||||
watermarking_methods: list[str] = Field(description="List of invisible watermark methods")
|
||||
|
||||
|
||||
@app_router.get(
|
||||
@ -33,4 +59,51 @@ async def get_config() -> AppConfig:
|
||||
infill_methods = ['tile']
|
||||
if PatchMatch.patchmatch_available():
|
||||
infill_methods.append('patchmatch')
|
||||
return AppConfig(infill_methods=infill_methods)
|
||||
|
||||
|
||||
upscaling_models = []
|
||||
for model in typing.get_args(ESRGAN_MODELS):
|
||||
upscaling_models.append(str(Path(model).stem))
|
||||
upscaler = Upscaler(
|
||||
upscaling_method = 'esrgan',
|
||||
upscaling_models = upscaling_models
|
||||
)
|
||||
|
||||
nsfw_methods = []
|
||||
if SafetyChecker.safety_checker_available():
|
||||
nsfw_methods.append('nsfw_checker')
|
||||
|
||||
watermarking_methods = []
|
||||
if InvisibleWatermark.invisible_watermark_available():
|
||||
watermarking_methods.append('invisible_watermark')
|
||||
|
||||
return AppConfig(
|
||||
infill_methods=infill_methods,
|
||||
upscaling_methods=[upscaler],
|
||||
nsfw_methods=nsfw_methods,
|
||||
watermarking_methods=watermarking_methods,
|
||||
)
|
||||
|
||||
@app_router.get(
|
||||
"/logging",
|
||||
operation_id="get_log_level",
|
||||
responses={200: {"description" : "The operation was successful"}},
|
||||
response_model = LogLevel,
|
||||
)
|
||||
async def get_log_level(
|
||||
) -> LogLevel:
|
||||
"""Returns the log level"""
|
||||
return LogLevel(ApiDependencies.invoker.services.logger.level)
|
||||
|
||||
@app_router.post(
|
||||
"/logging",
|
||||
operation_id="set_log_level",
|
||||
responses={200: {"description" : "The operation was successful"}},
|
||||
response_model = LogLevel,
|
||||
)
|
||||
async def set_log_level(
|
||||
level: LogLevel = Body(description="New log verbosity level"),
|
||||
) -> LogLevel:
|
||||
"""Sets the log verbosity level"""
|
||||
ApiDependencies.invoker.services.logger.setLevel(level)
|
||||
return LogLevel(ApiDependencies.invoker.services.logger.level)
|
||||
|
@ -24,11 +24,14 @@ async def create_board_image(
|
||||
):
|
||||
"""Creates a board_image"""
|
||||
try:
|
||||
result = ApiDependencies.invoker.services.board_images.add_image_to_board(board_id=board_id, image_name=image_name)
|
||||
result = ApiDependencies.invoker.services.board_images.add_image_to_board(
|
||||
board_id=board_id, image_name=image_name
|
||||
)
|
||||
return result
|
||||
except Exception as e:
|
||||
raise HTTPException(status_code=500, detail="Failed to add to board")
|
||||
|
||||
|
||||
|
||||
@board_images_router.delete(
|
||||
"/",
|
||||
operation_id="remove_board_image",
|
||||
@ -43,27 +46,10 @@ async def remove_board_image(
|
||||
):
|
||||
"""Deletes a board_image"""
|
||||
try:
|
||||
result = ApiDependencies.invoker.services.board_images.remove_image_from_board(board_id=board_id, image_name=image_name)
|
||||
result = ApiDependencies.invoker.services.board_images.remove_image_from_board(
|
||||
board_id=board_id, image_name=image_name
|
||||
)
|
||||
return result
|
||||
except Exception as e:
|
||||
raise HTTPException(status_code=500, detail="Failed to update board")
|
||||
|
||||
|
||||
|
||||
@board_images_router.get(
|
||||
"/{board_id}",
|
||||
operation_id="list_board_images",
|
||||
response_model=OffsetPaginatedResults[ImageDTO],
|
||||
)
|
||||
async def list_board_images(
|
||||
board_id: str = Path(description="The id of the board"),
|
||||
offset: int = Query(default=0, description="The page offset"),
|
||||
limit: int = Query(default=10, description="The number of boards per page"),
|
||||
) -> OffsetPaginatedResults[ImageDTO]:
|
||||
"""Gets a list of images for a board"""
|
||||
|
||||
results = ApiDependencies.invoker.services.board_images.get_images_for_board(
|
||||
board_id,
|
||||
)
|
||||
return results
|
||||
|
||||
|
@ -1,16 +1,28 @@
|
||||
from typing import Optional, Union
|
||||
|
||||
from fastapi import Body, HTTPException, Path, Query
|
||||
from fastapi.routing import APIRouter
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
from invokeai.app.services.board_record_storage import BoardChanges
|
||||
from invokeai.app.services.image_record_storage import OffsetPaginatedResults
|
||||
from invokeai.app.services.models.board_record import BoardDTO
|
||||
|
||||
|
||||
from ..dependencies import ApiDependencies
|
||||
|
||||
boards_router = APIRouter(prefix="/v1/boards", tags=["boards"])
|
||||
|
||||
|
||||
class DeleteBoardResult(BaseModel):
|
||||
board_id: str = Field(description="The id of the board that was deleted.")
|
||||
deleted_board_images: list[str] = Field(
|
||||
description="The image names of the board-images relationships that were deleted."
|
||||
)
|
||||
deleted_images: list[str] = Field(
|
||||
description="The names of the images that were deleted."
|
||||
)
|
||||
|
||||
|
||||
@boards_router.post(
|
||||
"/",
|
||||
operation_id="create_board",
|
||||
@ -69,25 +81,42 @@ async def update_board(
|
||||
raise HTTPException(status_code=500, detail="Failed to update board")
|
||||
|
||||
|
||||
@boards_router.delete("/{board_id}", operation_id="delete_board")
|
||||
@boards_router.delete(
|
||||
"/{board_id}", operation_id="delete_board", response_model=DeleteBoardResult
|
||||
)
|
||||
async def delete_board(
|
||||
board_id: str = Path(description="The id of board to delete"),
|
||||
include_images: Optional[bool] = Query(
|
||||
description="Permanently delete all images on the board", default=False
|
||||
),
|
||||
) -> None:
|
||||
) -> DeleteBoardResult:
|
||||
"""Deletes a board"""
|
||||
try:
|
||||
if include_images is True:
|
||||
deleted_images = ApiDependencies.invoker.services.board_images.get_all_board_image_names_for_board(
|
||||
board_id=board_id
|
||||
)
|
||||
ApiDependencies.invoker.services.images.delete_images_on_board(
|
||||
board_id=board_id
|
||||
)
|
||||
ApiDependencies.invoker.services.boards.delete(board_id=board_id)
|
||||
return DeleteBoardResult(
|
||||
board_id=board_id,
|
||||
deleted_board_images=[],
|
||||
deleted_images=deleted_images,
|
||||
)
|
||||
else:
|
||||
deleted_board_images = ApiDependencies.invoker.services.board_images.get_all_board_image_names_for_board(
|
||||
board_id=board_id
|
||||
)
|
||||
ApiDependencies.invoker.services.boards.delete(board_id=board_id)
|
||||
return DeleteBoardResult(
|
||||
board_id=board_id,
|
||||
deleted_board_images=deleted_board_images,
|
||||
deleted_images=[],
|
||||
)
|
||||
except Exception as e:
|
||||
# TODO: Does this need any exception handling at all?
|
||||
pass
|
||||
raise HTTPException(status_code=500, detail="Failed to delete board")
|
||||
|
||||
|
||||
@boards_router.get(
|
||||
@ -115,3 +144,19 @@ async def list_boards(
|
||||
status_code=400,
|
||||
detail="Invalid request: Must provide either 'all' or both 'offset' and 'limit'",
|
||||
)
|
||||
|
||||
|
||||
@boards_router.get(
|
||||
"/{board_id}/image_names",
|
||||
operation_id="list_all_board_image_names",
|
||||
response_model=list[str],
|
||||
)
|
||||
async def list_all_board_image_names(
|
||||
board_id: str = Path(description="The id of the board"),
|
||||
) -> list[str]:
|
||||
"""Gets a list of images for a board"""
|
||||
|
||||
image_names = ApiDependencies.invoker.services.board_images.get_all_board_image_names_for_board(
|
||||
board_id,
|
||||
)
|
||||
return image_names
|
||||
|
@ -1,8 +1,7 @@
|
||||
import io
|
||||
from typing import Optional
|
||||
|
||||
from fastapi import (Body, HTTPException, Path, Query, Request, Response,
|
||||
UploadFile)
|
||||
from fastapi import Body, HTTPException, Path, Query, Request, Response, UploadFile
|
||||
from fastapi.responses import FileResponse
|
||||
from fastapi.routing import APIRouter
|
||||
from PIL import Image
|
||||
@ -11,9 +10,11 @@ from invokeai.app.invocations.metadata import ImageMetadata
|
||||
from invokeai.app.models.image import ImageCategory, ResourceOrigin
|
||||
from invokeai.app.services.image_record_storage import OffsetPaginatedResults
|
||||
from invokeai.app.services.item_storage import PaginatedResults
|
||||
from invokeai.app.services.models.image_record import (ImageDTO,
|
||||
ImageRecordChanges,
|
||||
ImageUrlsDTO)
|
||||
from invokeai.app.services.models.image_record import (
|
||||
ImageDTO,
|
||||
ImageRecordChanges,
|
||||
ImageUrlsDTO,
|
||||
)
|
||||
|
||||
from ..dependencies import ApiDependencies
|
||||
|
||||
@ -39,9 +40,15 @@ async def upload_image(
|
||||
response: Response,
|
||||
image_category: ImageCategory = Query(description="The category of the image"),
|
||||
is_intermediate: bool = Query(description="Whether this is an intermediate image"),
|
||||
board_id: Optional[str] = Query(
|
||||
default=None, description="The board to add this image to, if any"
|
||||
),
|
||||
session_id: Optional[str] = Query(
|
||||
default=None, description="The session ID associated with this upload, if any"
|
||||
),
|
||||
crop_visible: Optional[bool] = Query(
|
||||
default=False, description="Whether to crop the image"
|
||||
),
|
||||
) -> ImageDTO:
|
||||
"""Uploads an image"""
|
||||
if not file.content_type.startswith("image"):
|
||||
@ -51,6 +58,9 @@ async def upload_image(
|
||||
|
||||
try:
|
||||
pil_image = Image.open(io.BytesIO(contents))
|
||||
if crop_visible:
|
||||
bbox = pil_image.getbbox()
|
||||
pil_image = pil_image.crop(bbox)
|
||||
except:
|
||||
# Error opening the image
|
||||
raise HTTPException(status_code=415, detail="Failed to read image")
|
||||
@ -61,6 +71,7 @@ async def upload_image(
|
||||
image_origin=ResourceOrigin.EXTERNAL,
|
||||
image_category=image_category,
|
||||
session_id=session_id,
|
||||
board_id=board_id,
|
||||
is_intermediate=is_intermediate,
|
||||
)
|
||||
|
||||
@ -85,6 +96,18 @@ async def delete_image(
|
||||
pass
|
||||
|
||||
|
||||
@images_router.post("/clear-intermediates", operation_id="clear_intermediates")
|
||||
async def clear_intermediates() -> int:
|
||||
"""Clears all intermediates"""
|
||||
|
||||
try:
|
||||
count_deleted = ApiDependencies.invoker.services.images.delete_intermediates()
|
||||
return count_deleted
|
||||
except Exception as e:
|
||||
raise HTTPException(status_code=500, detail="Failed to clear intermediates")
|
||||
pass
|
||||
|
||||
|
||||
@images_router.patch(
|
||||
"/{image_name}",
|
||||
operation_id="update_image",
|
||||
@ -119,6 +142,7 @@ async def get_image_dto(
|
||||
except Exception as e:
|
||||
raise HTTPException(status_code=404)
|
||||
|
||||
|
||||
@images_router.get(
|
||||
"/{image_name}/metadata",
|
||||
operation_id="get_image_metadata",
|
||||
@ -234,16 +258,17 @@ async def get_image_urls(
|
||||
)
|
||||
async def list_image_dtos(
|
||||
image_origin: Optional[ResourceOrigin] = Query(
|
||||
default=None, description="The origin of images to list"
|
||||
default=None, description="The origin of images to list."
|
||||
),
|
||||
categories: Optional[list[ImageCategory]] = Query(
|
||||
default=None, description="The categories of image to include"
|
||||
default=None, description="The categories of image to include."
|
||||
),
|
||||
is_intermediate: Optional[bool] = Query(
|
||||
default=None, description="Whether to list intermediate images"
|
||||
default=None, description="Whether to list intermediate images."
|
||||
),
|
||||
board_id: Optional[str] = Query(
|
||||
default=None, description="The board id to filter by"
|
||||
default=None,
|
||||
description="The board id to filter by. Use 'none' to find images without a board.",
|
||||
),
|
||||
offset: int = Query(default=0, description="The page offset"),
|
||||
limit: int = Query(default=10, description="The number of images per page"),
|
||||
|
@ -298,7 +298,7 @@ async def search_for_models(
|
||||
)->List[pathlib.Path]:
|
||||
if not search_path.is_dir():
|
||||
raise HTTPException(status_code=404, detail=f"The search path '{search_path}' does not exist or is not directory")
|
||||
return ApiDependencies.invoker.services.model_manager.search_for_models([search_path])
|
||||
return ApiDependencies.invoker.services.model_manager.search_for_models(search_path)
|
||||
|
||||
@models_router.get(
|
||||
"/ckpt_confs",
|
||||
@ -315,20 +315,21 @@ async def list_ckpt_configs(
|
||||
return ApiDependencies.invoker.services.model_manager.list_checkpoint_configs()
|
||||
|
||||
|
||||
@models_router.get(
|
||||
@models_router.post(
|
||||
"/sync",
|
||||
operation_id="sync_to_config",
|
||||
responses={
|
||||
201: { "description": "synchronization successful" },
|
||||
},
|
||||
status_code = 201,
|
||||
response_model = None
|
||||
response_model = bool
|
||||
)
|
||||
async def sync_to_config(
|
||||
)->None:
|
||||
)->bool:
|
||||
"""Call after making changes to models.yaml, autoimport directories or models directory to synchronize
|
||||
in-memory data structures with disk data structures."""
|
||||
return ApiDependencies.invoker.services.model_manager.sync_to_config()
|
||||
ApiDependencies.invoker.services.model_manager.sync_to_config()
|
||||
return True
|
||||
|
||||
@models_router.put(
|
||||
"/merge/{base_model}",
|
||||
@ -373,50 +374,3 @@ async def merge_models(
|
||||
except ValueError as e:
|
||||
raise HTTPException(status_code=400, detail=str(e))
|
||||
return response
|
||||
|
||||
# The rename operation is now supported by update_model and no longer needs to be
|
||||
# a standalone route.
|
||||
# @models_router.post(
|
||||
# "/rename/{base_model}/{model_type}/{model_name}",
|
||||
# operation_id="rename_model",
|
||||
# responses= {
|
||||
# 201: {"description" : "The model was renamed successfully"},
|
||||
# 404: {"description" : "The model could not be found"},
|
||||
# 409: {"description" : "There is already a model corresponding to the new name"},
|
||||
# },
|
||||
# status_code=201,
|
||||
# response_model=ImportModelResponse
|
||||
# )
|
||||
# async def rename_model(
|
||||
# base_model: BaseModelType = Path(description="Base model"),
|
||||
# model_type: ModelType = Path(description="The type of model"),
|
||||
# model_name: str = Path(description="current model name"),
|
||||
# new_name: Optional[str] = Query(description="new model name", default=None),
|
||||
# new_base: Optional[BaseModelType] = Query(description="new model base", default=None),
|
||||
# ) -> ImportModelResponse:
|
||||
# """ Rename a model"""
|
||||
|
||||
# logger = ApiDependencies.invoker.services.logger
|
||||
|
||||
# try:
|
||||
# result = ApiDependencies.invoker.services.model_manager.rename_model(
|
||||
# base_model = base_model,
|
||||
# model_type = model_type,
|
||||
# model_name = model_name,
|
||||
# new_name = new_name,
|
||||
# new_base = new_base,
|
||||
# )
|
||||
# logger.debug(result)
|
||||
# logger.info(f'Successfully renamed {model_name}=>{new_name}')
|
||||
# model_raw = ApiDependencies.invoker.services.model_manager.list_model(
|
||||
# model_name=new_name or model_name,
|
||||
# base_model=new_base or base_model,
|
||||
# model_type=model_type
|
||||
# )
|
||||
# return parse_obj_as(ImportModelResponse, model_raw)
|
||||
# except ModelNotFoundException as e:
|
||||
# logger.error(str(e))
|
||||
# raise HTTPException(status_code=404, detail=str(e))
|
||||
# except ValueError as e:
|
||||
# logger.error(str(e))
|
||||
# raise HTTPException(status_code=409, detail=str(e))
|
||||
|
@ -4,6 +4,7 @@ import sys
|
||||
from inspect import signature
|
||||
|
||||
import uvicorn
|
||||
import socket
|
||||
|
||||
from fastapi import FastAPI
|
||||
from fastapi.middleware.cors import CORSMiddleware
|
||||
@ -193,9 +194,25 @@ app.mount("/",
|
||||
)
|
||||
|
||||
def invoke_api():
|
||||
def find_port(port: int):
|
||||
"""Find a port not in use starting at given port"""
|
||||
# Taken from https://waylonwalker.com/python-find-available-port/, thanks Waylon!
|
||||
# https://github.com/WaylonWalker
|
||||
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
|
||||
if s.connect_ex(("localhost", port)) == 0:
|
||||
return find_port(port=port + 1)
|
||||
else:
|
||||
return port
|
||||
|
||||
from invokeai.backend.install.check_root import check_invokeai_root
|
||||
check_invokeai_root(app_config) # note, may exit with an exception if root not set up
|
||||
|
||||
port = find_port(app_config.port)
|
||||
if port != app_config.port:
|
||||
logger.warn(f"Port {app_config.port} in use, using port {port}")
|
||||
# Start our own event loop for eventing usage
|
||||
loop = asyncio.new_event_loop()
|
||||
config = uvicorn.Config(app=app, host=app_config.host, port=app_config.port, loop=loop)
|
||||
config = uvicorn.Config(app=app, host=app_config.host, port=port, loop=loop)
|
||||
# Use access_log to turn off logging
|
||||
server = uvicorn.Server(config)
|
||||
loop.run_until_complete(server.serve())
|
||||
|
Before Width: | Height: | Size: 33 KiB After Width: | Height: | Size: 33 KiB |
@ -103,7 +103,7 @@ class CompelInvocation(BaseInvocation):
|
||||
def _lora_loader():
|
||||
for lora in self.clip.loras:
|
||||
lora_info = context.services.model_manager.get_model(
|
||||
**lora.dict(exclude={"weight"}))
|
||||
**lora.dict(exclude={"weight"}), context=context)
|
||||
yield (lora_info.context.model, lora.weight)
|
||||
del lora_info
|
||||
return
|
||||
@ -179,16 +179,16 @@ class CompelInvocation(BaseInvocation):
|
||||
class SDXLPromptInvocationBase:
|
||||
def run_clip_raw(self, context, clip_field, prompt, get_pooled):
|
||||
tokenizer_info = context.services.model_manager.get_model(
|
||||
**clip_field.tokenizer.dict(),
|
||||
**clip_field.tokenizer.dict(), context=context,
|
||||
)
|
||||
text_encoder_info = context.services.model_manager.get_model(
|
||||
**clip_field.text_encoder.dict(),
|
||||
**clip_field.text_encoder.dict(), context=context,
|
||||
)
|
||||
|
||||
def _lora_loader():
|
||||
for lora in clip_field.loras:
|
||||
lora_info = context.services.model_manager.get_model(
|
||||
**lora.dict(exclude={"weight"}))
|
||||
**lora.dict(exclude={"weight"}), context=context)
|
||||
yield (lora_info.context.model, lora.weight)
|
||||
del lora_info
|
||||
return
|
||||
@ -204,6 +204,7 @@ class SDXLPromptInvocationBase:
|
||||
model_name=name,
|
||||
base_model=clip_field.text_encoder.base_model,
|
||||
model_type=ModelType.TextualInversion,
|
||||
context=context,
|
||||
).context.model
|
||||
)
|
||||
except ModelNotFoundException:
|
||||
@ -248,16 +249,16 @@ class SDXLPromptInvocationBase:
|
||||
|
||||
def run_clip_compel(self, context, clip_field, prompt, get_pooled):
|
||||
tokenizer_info = context.services.model_manager.get_model(
|
||||
**clip_field.tokenizer.dict(),
|
||||
**clip_field.tokenizer.dict(), context=context,
|
||||
)
|
||||
text_encoder_info = context.services.model_manager.get_model(
|
||||
**clip_field.text_encoder.dict(),
|
||||
**clip_field.text_encoder.dict(), context=context,
|
||||
)
|
||||
|
||||
def _lora_loader():
|
||||
for lora in clip_field.loras:
|
||||
lora_info = context.services.model_manager.get_model(
|
||||
**lora.dict(exclude={"weight"}))
|
||||
**lora.dict(exclude={"weight"}), context=context)
|
||||
yield (lora_info.context.model, lora.weight)
|
||||
del lora_info
|
||||
return
|
||||
@ -273,6 +274,7 @@ class SDXLPromptInvocationBase:
|
||||
model_name=name,
|
||||
base_model=clip_field.text_encoder.base_model,
|
||||
model_type=ModelType.TextualInversion,
|
||||
context=context,
|
||||
).context.model
|
||||
)
|
||||
except ModelNotFoundException:
|
||||
|
@ -20,7 +20,7 @@ from ...backend.model_management import BaseModelType, ModelType
|
||||
from ..models.image import ImageCategory, ImageField, ResourceOrigin
|
||||
from .baseinvocation import (BaseInvocation, BaseInvocationOutput,
|
||||
InvocationConfig, InvocationContext)
|
||||
from .image import ImageOutput, PILInvocationConfig
|
||||
from ..models.image import ImageOutput, PILInvocationConfig
|
||||
|
||||
CONTROLNET_DEFAULT_MODELS = [
|
||||
###########################################
|
||||
@ -85,8 +85,8 @@ CONTROLNET_DEFAULT_MODELS = [
|
||||
CONTROLNET_NAME_VALUES = Literal[tuple(CONTROLNET_DEFAULT_MODELS)]
|
||||
CONTROLNET_MODE_VALUES = Literal[tuple(
|
||||
["balanced", "more_prompt", "more_control", "unbalanced"])]
|
||||
# crop and fill options not ready yet
|
||||
# CONTROLNET_RESIZE_VALUES = Literal[tuple(["just_resize", "crop_resize", "fill_resize"])]
|
||||
CONTROLNET_RESIZE_VALUES = Literal[tuple(
|
||||
["just_resize", "crop_resize", "fill_resize", "just_resize_simple",])]
|
||||
|
||||
|
||||
class ControlNetModelField(BaseModel):
|
||||
@ -111,7 +111,8 @@ class ControlField(BaseModel):
|
||||
description="When the ControlNet is last applied (% of total steps)")
|
||||
control_mode: CONTROLNET_MODE_VALUES = Field(
|
||||
default="balanced", description="The control mode to use")
|
||||
# resize_mode: CONTROLNET_RESIZE_VALUES = Field(default="just_resize", description="The resize mode to use")
|
||||
resize_mode: CONTROLNET_RESIZE_VALUES = Field(
|
||||
default="just_resize", description="The resize mode to use")
|
||||
|
||||
@validator("control_weight")
|
||||
def validate_control_weight(cls, v):
|
||||
@ -161,6 +162,7 @@ class ControlNetInvocation(BaseInvocation):
|
||||
end_step_percent: float = Field(default=1, ge=0, le=1,
|
||||
description="When the ControlNet is last applied (% of total steps)")
|
||||
control_mode: CONTROLNET_MODE_VALUES = Field(default="balanced", description="The control mode used")
|
||||
resize_mode: CONTROLNET_RESIZE_VALUES = Field(default="just_resize", description="The resize mode used")
|
||||
# fmt: on
|
||||
|
||||
class Config(InvocationConfig):
|
||||
@ -187,6 +189,7 @@ class ControlNetInvocation(BaseInvocation):
|
||||
begin_step_percent=self.begin_step_percent,
|
||||
end_step_percent=self.end_step_percent,
|
||||
control_mode=self.control_mode,
|
||||
resize_mode=self.resize_mode,
|
||||
),
|
||||
)
|
||||
|
||||
|
@ -4,61 +4,21 @@ from typing import Literal, Optional
|
||||
|
||||
import numpy
|
||||
from PIL import Image, ImageFilter, ImageOps, ImageChops
|
||||
from pydantic import BaseModel, Field
|
||||
from pydantic import Field
|
||||
from pathlib import Path
|
||||
from typing import Union
|
||||
|
||||
from ..models.image import ImageCategory, ImageField, ResourceOrigin
|
||||
from invokeai.app.invocations.metadata import CoreMetadata
|
||||
from ..models.image import (
|
||||
ImageCategory, ImageField, ResourceOrigin,
|
||||
PILInvocationConfig, ImageOutput, MaskOutput,
|
||||
)
|
||||
from .baseinvocation import (
|
||||
BaseInvocation,
|
||||
BaseInvocationOutput,
|
||||
InvocationContext,
|
||||
InvocationConfig,
|
||||
)
|
||||
|
||||
|
||||
class PILInvocationConfig(BaseModel):
|
||||
"""Helper class to provide all PIL invocations with additional config"""
|
||||
|
||||
class Config(InvocationConfig):
|
||||
schema_extra = {
|
||||
"ui": {
|
||||
"tags": ["PIL", "image"],
|
||||
},
|
||||
}
|
||||
|
||||
|
||||
class ImageOutput(BaseInvocationOutput):
|
||||
"""Base class for invocations that output an image"""
|
||||
|
||||
# fmt: off
|
||||
type: Literal["image_output"] = "image_output"
|
||||
image: ImageField = Field(default=None, description="The output image")
|
||||
width: int = Field(description="The width of the image in pixels")
|
||||
height: int = Field(description="The height of the image in pixels")
|
||||
# fmt: on
|
||||
|
||||
class Config:
|
||||
schema_extra = {"required": ["type", "image", "width", "height"]}
|
||||
|
||||
|
||||
class MaskOutput(BaseInvocationOutput):
|
||||
"""Base class for invocations that output a mask"""
|
||||
|
||||
# fmt: off
|
||||
type: Literal["mask"] = "mask"
|
||||
mask: ImageField = Field(default=None, description="The output mask")
|
||||
width: int = Field(description="The width of the mask in pixels")
|
||||
height: int = Field(description="The height of the mask in pixels")
|
||||
# fmt: on
|
||||
|
||||
class Config:
|
||||
schema_extra = {
|
||||
"required": [
|
||||
"type",
|
||||
"mask",
|
||||
]
|
||||
}
|
||||
|
||||
from invokeai.backend.image_util.safety_checker import SafetyChecker
|
||||
from invokeai.backend.image_util.invisible_watermark import InvisibleWatermark
|
||||
|
||||
class LoadImageInvocation(BaseInvocation):
|
||||
"""Load an image and provide it as output."""
|
||||
@ -397,7 +357,6 @@ class ImageConvertInvocation(BaseInvocation, PILInvocationConfig):
|
||||
height=image_dto.height,
|
||||
)
|
||||
|
||||
|
||||
class ImageBlurInvocation(BaseInvocation, PILInvocationConfig):
|
||||
"""Blurs an image"""
|
||||
|
||||
@ -602,7 +561,6 @@ class ImageLerpInvocation(BaseInvocation, PILInvocationConfig):
|
||||
height=image_dto.height,
|
||||
)
|
||||
|
||||
|
||||
class ImageInverseLerpInvocation(BaseInvocation, PILInvocationConfig):
|
||||
"""Inverse linear interpolation of all pixels of an image"""
|
||||
|
||||
@ -650,3 +608,97 @@ class ImageInverseLerpInvocation(BaseInvocation, PILInvocationConfig):
|
||||
width=image_dto.width,
|
||||
height=image_dto.height,
|
||||
)
|
||||
|
||||
class ImageNSFWBlurInvocation(BaseInvocation, PILInvocationConfig):
|
||||
"""Add blur to NSFW-flagged images"""
|
||||
|
||||
# fmt: off
|
||||
type: Literal["img_nsfw"] = "img_nsfw"
|
||||
|
||||
# Inputs
|
||||
image: Optional[ImageField] = Field(default=None, description="The image to check")
|
||||
metadata: Optional[CoreMetadata] = Field(default=None, description="Optional core metadata to be written to the image")
|
||||
# fmt: on
|
||||
|
||||
class Config(InvocationConfig):
|
||||
schema_extra = {
|
||||
"ui": {
|
||||
"title": "Blur NSFW Images",
|
||||
"tags": ["image", "nsfw", "checker"]
|
||||
},
|
||||
}
|
||||
|
||||
def invoke(self, context: InvocationContext) -> ImageOutput:
|
||||
image = context.services.images.get_pil_image(self.image.image_name)
|
||||
|
||||
logger = context.services.logger
|
||||
logger.debug("Running NSFW checker")
|
||||
if SafetyChecker.has_nsfw_concept(image):
|
||||
logger.info("A potentially NSFW image has been detected. Image will be blurred.")
|
||||
blurry_image = image.filter(filter=ImageFilter.GaussianBlur(radius=32))
|
||||
caution = self._get_caution_img()
|
||||
blurry_image.paste(caution,(0,0),caution)
|
||||
image = blurry_image
|
||||
|
||||
image_dto = context.services.images.create(
|
||||
image=image,
|
||||
image_origin=ResourceOrigin.INTERNAL,
|
||||
image_category=ImageCategory.GENERAL,
|
||||
node_id=self.id,
|
||||
session_id=context.graph_execution_state_id,
|
||||
is_intermediate=self.is_intermediate,
|
||||
metadata=self.metadata.dict() if self.metadata else None,
|
||||
)
|
||||
|
||||
return ImageOutput(
|
||||
image=ImageField(image_name=image_dto.image_name),
|
||||
width=image_dto.width,
|
||||
height=image_dto.height,
|
||||
)
|
||||
|
||||
def _get_caution_img(self)->Image:
|
||||
import invokeai.app.assets.images as image_assets
|
||||
caution = Image.open(Path(image_assets.__path__[0]) / 'caution.png')
|
||||
return caution.resize((caution.width // 2, caution.height //2))
|
||||
|
||||
class ImageWatermarkInvocation(BaseInvocation, PILInvocationConfig):
|
||||
""" Add an invisible watermark to an image """
|
||||
|
||||
# fmt: off
|
||||
type: Literal["img_watermark"] = "img_watermark"
|
||||
|
||||
# Inputs
|
||||
image: Optional[ImageField] = Field(default=None, description="The image to check")
|
||||
text: str = Field(default='InvokeAI', description="Watermark text")
|
||||
metadata: Optional[CoreMetadata] = Field(default=None, description="Optional core metadata to be written to the image")
|
||||
# fmt: on
|
||||
|
||||
class Config(InvocationConfig):
|
||||
schema_extra = {
|
||||
"ui": {
|
||||
"title": "Add Invisible Watermark",
|
||||
"tags": ["image", "watermark", "invisible"]
|
||||
},
|
||||
}
|
||||
|
||||
def invoke(self, context: InvocationContext) -> ImageOutput:
|
||||
image = context.services.images.get_pil_image(self.image.image_name)
|
||||
new_image = InvisibleWatermark.add_watermark(image, self.text)
|
||||
image_dto = context.services.images.create(
|
||||
image=new_image,
|
||||
image_origin=ResourceOrigin.INTERNAL,
|
||||
image_category=ImageCategory.GENERAL,
|
||||
node_id=self.id,
|
||||
session_id=context.graph_execution_state_id,
|
||||
is_intermediate=self.is_intermediate,
|
||||
metadata=self.metadata.dict() if self.metadata else None,
|
||||
)
|
||||
|
||||
return ImageOutput(
|
||||
image=ImageField(image_name=image_dto.image_name),
|
||||
width=image_dto.width,
|
||||
height=image_dto.height,
|
||||
)
|
||||
|
||||
|
||||
|
||||
|
@ -22,8 +22,8 @@ from ...backend.stable_diffusion.diffusers_pipeline import (
|
||||
from ...backend.stable_diffusion.diffusion.shared_invokeai_diffusion import \
|
||||
PostprocessingSettings
|
||||
from ...backend.stable_diffusion.schedulers import SCHEDULER_MAP
|
||||
from ...backend.util.devices import choose_torch_device, torch_dtype
|
||||
from ...backend.model_management import ModelPatcher
|
||||
from ...backend.util.devices import choose_torch_device, torch_dtype, choose_precision
|
||||
from ..models.image import ImageCategory, ImageField, ResourceOrigin
|
||||
from .baseinvocation import (BaseInvocation, BaseInvocationOutput,
|
||||
InvocationConfig, InvocationContext)
|
||||
@ -31,6 +31,7 @@ from .compel import ConditioningField
|
||||
from .controlnet_image_processors import ControlField
|
||||
from .image import ImageOutput
|
||||
from .model import ModelInfo, UNetField, VaeField
|
||||
from invokeai.app.util.controlnet_utils import prepare_control_image
|
||||
|
||||
from diffusers.models.attention_processor import (
|
||||
AttnProcessor2_0,
|
||||
@ -40,6 +41,9 @@ from diffusers.models.attention_processor import (
|
||||
)
|
||||
|
||||
|
||||
DEFAULT_PRECISION = choose_precision(choose_torch_device())
|
||||
|
||||
|
||||
class LatentsField(BaseModel):
|
||||
"""A latents field used for passing latents between invocations"""
|
||||
|
||||
@ -286,7 +290,7 @@ class TextToLatentsInvocation(BaseInvocation):
|
||||
# and add in batch_size, num_images_per_prompt?
|
||||
# and do real check for classifier_free_guidance?
|
||||
# prepare_control_image should return torch.Tensor of shape(batch_size, 3, height, width)
|
||||
control_image = model.prepare_control_image(
|
||||
control_image = prepare_control_image(
|
||||
image=input_image,
|
||||
do_classifier_free_guidance=do_classifier_free_guidance,
|
||||
width=control_width_resize,
|
||||
@ -296,13 +300,18 @@ class TextToLatentsInvocation(BaseInvocation):
|
||||
device=control_model.device,
|
||||
dtype=control_model.dtype,
|
||||
control_mode=control_info.control_mode,
|
||||
resize_mode=control_info.resize_mode,
|
||||
)
|
||||
control_item = ControlNetData(
|
||||
model=control_model, image_tensor=control_image,
|
||||
model=control_model,
|
||||
image_tensor=control_image,
|
||||
weight=control_info.control_weight,
|
||||
begin_step_percent=control_info.begin_step_percent,
|
||||
end_step_percent=control_info.end_step_percent,
|
||||
control_mode=control_info.control_mode,
|
||||
# any resizing needed should currently be happening in prepare_control_image(),
|
||||
# but adding resize_mode to ControlNetData in case needed in the future
|
||||
resize_mode=control_info.resize_mode,
|
||||
)
|
||||
control_data.append(control_item)
|
||||
# MultiControlNetModel has been refactored out, just need list[ControlNetData]
|
||||
@ -493,8 +502,8 @@ class LatentsToImageInvocation(BaseInvocation):
|
||||
vae: VaeField = Field(default=None, description="Vae submodel")
|
||||
tiled: bool = Field(
|
||||
default=False,
|
||||
description="Decode latents by overlaping tiles(less memory consumption)")
|
||||
fp32: bool = Field(False, description="Decode in full precision")
|
||||
description="Decode latents by overlapping tiles(less memory consumption)")
|
||||
fp32: bool = Field(DEFAULT_PRECISION=='float32', description="Decode in full precision")
|
||||
metadata: Optional[CoreMetadata] = Field(default=None, description="Optional core metadata to be written to the image")
|
||||
|
||||
# Schema customisation
|
||||
@ -599,7 +608,7 @@ class ResizeLatentsInvocation(BaseInvocation):
|
||||
antialias: bool = Field(
|
||||
default=False,
|
||||
description="Whether or not to antialias (applied in bilinear and bicubic modes only)")
|
||||
|
||||
|
||||
class Config(InvocationConfig):
|
||||
schema_extra = {
|
||||
"ui": {
|
||||
@ -645,7 +654,7 @@ class ScaleLatentsInvocation(BaseInvocation):
|
||||
antialias: bool = Field(
|
||||
default=False,
|
||||
description="Whether or not to antialias (applied in bilinear and bicubic modes only)")
|
||||
|
||||
|
||||
class Config(InvocationConfig):
|
||||
schema_extra = {
|
||||
"ui": {
|
||||
@ -688,7 +697,7 @@ class ImageToLatentsInvocation(BaseInvocation):
|
||||
tiled: bool = Field(
|
||||
default=False,
|
||||
description="Encode latents by overlaping tiles(less memory consumption)")
|
||||
fp32: bool = Field(False, description="Decode in full precision")
|
||||
fp32: bool = Field(DEFAULT_PRECISION=='float32', description="Decode in full precision")
|
||||
|
||||
|
||||
# Schema customisation
|
||||
@ -756,7 +765,7 @@ class ImageToLatentsInvocation(BaseInvocation):
|
||||
dtype=vae.dtype
|
||||
) # FIXME: uses torch.randn. make reproducible!
|
||||
|
||||
latents = 0.18215 * latents
|
||||
latents = vae.config.scaling_factor * latents
|
||||
latents = latents.to(dtype=orig_dtype)
|
||||
|
||||
name = f"{context.graph_execution_state_id}__{self.id}"
|
||||
|
@ -2,16 +2,18 @@ from typing import Literal, Optional, Union
|
||||
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
from invokeai.app.invocations.baseinvocation import (BaseInvocation,
|
||||
BaseInvocationOutput, InvocationConfig,
|
||||
InvocationContext)
|
||||
from invokeai.app.invocations.baseinvocation import (
|
||||
BaseInvocation,
|
||||
BaseInvocationOutput,
|
||||
InvocationConfig,
|
||||
InvocationContext,
|
||||
)
|
||||
from invokeai.app.invocations.controlnet_image_processors import ControlField
|
||||
from invokeai.app.invocations.model import (LoRAModelField, MainModelField,
|
||||
VAEModelField)
|
||||
|
||||
from invokeai.app.invocations.model import LoRAModelField, MainModelField, VAEModelField
|
||||
|
||||
class LoRAMetadataField(BaseModel):
|
||||
"""LoRA metadata for an image generated in InvokeAI."""
|
||||
|
||||
lora: LoRAModelField = Field(description="The LoRA model")
|
||||
weight: float = Field(description="The weight of the LoRA model")
|
||||
|
||||
@ -19,7 +21,9 @@ class LoRAMetadataField(BaseModel):
|
||||
class CoreMetadata(BaseModel):
|
||||
"""Core generation metadata for an image generated in InvokeAI."""
|
||||
|
||||
generation_mode: str = Field(description="The generation mode that output this image",)
|
||||
generation_mode: str = Field(
|
||||
description="The generation mode that output this image",
|
||||
)
|
||||
positive_prompt: str = Field(description="The positive prompt parameter")
|
||||
negative_prompt: str = Field(description="The negative prompt parameter")
|
||||
width: int = Field(description="The width parameter")
|
||||
@ -29,10 +33,20 @@ class CoreMetadata(BaseModel):
|
||||
cfg_scale: float = Field(description="The classifier-free guidance scale parameter")
|
||||
steps: int = Field(description="The number of steps used for inference")
|
||||
scheduler: str = Field(description="The scheduler used for inference")
|
||||
clip_skip: int = Field(description="The number of skipped CLIP layers",)
|
||||
clip_skip: int = Field(
|
||||
description="The number of skipped CLIP layers",
|
||||
)
|
||||
model: MainModelField = Field(description="The main model used for inference")
|
||||
controlnets: list[ControlField]= Field(description="The ControlNets used for inference")
|
||||
controlnets: list[ControlField] = Field(
|
||||
description="The ControlNets used for inference"
|
||||
)
|
||||
loras: list[LoRAMetadataField] = Field(description="The LoRAs used for inference")
|
||||
vae: Union[VAEModelField, None] = Field(
|
||||
default=None,
|
||||
description="The VAE used for decoding, if the main model's default was not used",
|
||||
)
|
||||
|
||||
# Latents-to-Latents
|
||||
strength: Union[float, None] = Field(
|
||||
default=None,
|
||||
description="The strength used for latents-to-latents",
|
||||
@ -40,9 +54,34 @@ class CoreMetadata(BaseModel):
|
||||
init_image: Union[str, None] = Field(
|
||||
default=None, description="The name of the initial image"
|
||||
)
|
||||
vae: Union[VAEModelField, None] = Field(
|
||||
|
||||
# SDXL
|
||||
positive_style_prompt: Union[str, None] = Field(
|
||||
default=None, description="The positive style prompt parameter"
|
||||
)
|
||||
negative_style_prompt: Union[str, None] = Field(
|
||||
default=None, description="The negative style prompt parameter"
|
||||
)
|
||||
|
||||
# SDXL Refiner
|
||||
refiner_model: Union[MainModelField, None] = Field(
|
||||
default=None, description="The SDXL Refiner model used"
|
||||
)
|
||||
refiner_cfg_scale: Union[float, None] = Field(
|
||||
default=None,
|
||||
description="The VAE used for decoding, if the main model's default was not used",
|
||||
description="The classifier-free guidance scale parameter used for the refiner",
|
||||
)
|
||||
refiner_steps: Union[int, None] = Field(
|
||||
default=None, description="The number of steps used for the refiner"
|
||||
)
|
||||
refiner_scheduler: Union[str, None] = Field(
|
||||
default=None, description="The scheduler used for the refiner"
|
||||
)
|
||||
refiner_aesthetic_store: Union[float, None] = Field(
|
||||
default=None, description="The aesthetic score used for the refiner"
|
||||
)
|
||||
refiner_start: Union[float, None] = Field(
|
||||
default=None, description="The start value used for refiner denoising"
|
||||
)
|
||||
|
||||
|
||||
@ -71,7 +110,9 @@ class MetadataAccumulatorInvocation(BaseInvocation):
|
||||
|
||||
type: Literal["metadata_accumulator"] = "metadata_accumulator"
|
||||
|
||||
generation_mode: str = Field(description="The generation mode that output this image",)
|
||||
generation_mode: str = Field(
|
||||
description="The generation mode that output this image",
|
||||
)
|
||||
positive_prompt: str = Field(description="The positive prompt parameter")
|
||||
negative_prompt: str = Field(description="The negative prompt parameter")
|
||||
width: int = Field(description="The width parameter")
|
||||
@ -81,9 +122,13 @@ class MetadataAccumulatorInvocation(BaseInvocation):
|
||||
cfg_scale: float = Field(description="The classifier-free guidance scale parameter")
|
||||
steps: int = Field(description="The number of steps used for inference")
|
||||
scheduler: str = Field(description="The scheduler used for inference")
|
||||
clip_skip: int = Field(description="The number of skipped CLIP layers",)
|
||||
clip_skip: int = Field(
|
||||
description="The number of skipped CLIP layers",
|
||||
)
|
||||
model: MainModelField = Field(description="The main model used for inference")
|
||||
controlnets: list[ControlField]= Field(description="The ControlNets used for inference")
|
||||
controlnets: list[ControlField] = Field(
|
||||
description="The ControlNets used for inference"
|
||||
)
|
||||
loras: list[LoRAMetadataField] = Field(description="The LoRAs used for inference")
|
||||
strength: Union[float, None] = Field(
|
||||
default=None,
|
||||
@ -97,36 +142,44 @@ class MetadataAccumulatorInvocation(BaseInvocation):
|
||||
description="The VAE used for decoding, if the main model's default was not used",
|
||||
)
|
||||
|
||||
# SDXL
|
||||
positive_style_prompt: Union[str, None] = Field(
|
||||
default=None, description="The positive style prompt parameter"
|
||||
)
|
||||
negative_style_prompt: Union[str, None] = Field(
|
||||
default=None, description="The negative style prompt parameter"
|
||||
)
|
||||
|
||||
# SDXL Refiner
|
||||
refiner_model: Union[MainModelField, None] = Field(
|
||||
default=None, description="The SDXL Refiner model used"
|
||||
)
|
||||
refiner_cfg_scale: Union[float, None] = Field(
|
||||
default=None,
|
||||
description="The classifier-free guidance scale parameter used for the refiner",
|
||||
)
|
||||
refiner_steps: Union[int, None] = Field(
|
||||
default=None, description="The number of steps used for the refiner"
|
||||
)
|
||||
refiner_scheduler: Union[str, None] = Field(
|
||||
default=None, description="The scheduler used for the refiner"
|
||||
)
|
||||
refiner_aesthetic_store: Union[float, None] = Field(
|
||||
default=None, description="The aesthetic score used for the refiner"
|
||||
)
|
||||
refiner_start: Union[float, None] = Field(
|
||||
default=None, description="The start value used for refiner denoising"
|
||||
)
|
||||
|
||||
class Config(InvocationConfig):
|
||||
schema_extra = {
|
||||
"ui": {
|
||||
"title": "Metadata Accumulator",
|
||||
"tags": ["image", "metadata", "generation"]
|
||||
"tags": ["image", "metadata", "generation"],
|
||||
},
|
||||
}
|
||||
|
||||
|
||||
def invoke(self, context: InvocationContext) -> MetadataAccumulatorOutput:
|
||||
"""Collects and outputs a CoreMetadata object"""
|
||||
|
||||
return MetadataAccumulatorOutput(
|
||||
metadata=CoreMetadata(
|
||||
generation_mode=self.generation_mode,
|
||||
positive_prompt=self.positive_prompt,
|
||||
negative_prompt=self.negative_prompt,
|
||||
width=self.width,
|
||||
height=self.height,
|
||||
seed=self.seed,
|
||||
rand_device=self.rand_device,
|
||||
cfg_scale=self.cfg_scale,
|
||||
steps=self.steps,
|
||||
scheduler=self.scheduler,
|
||||
model=self.model,
|
||||
strength=self.strength,
|
||||
init_image=self.init_image,
|
||||
vae=self.vae,
|
||||
controlnets=self.controlnets,
|
||||
loras=self.loras,
|
||||
clip_skip=self.clip_skip,
|
||||
)
|
||||
)
|
||||
return MetadataAccumulatorOutput(metadata=CoreMetadata(**self.dict()))
|
||||
|
@ -119,8 +119,8 @@ class NoiseInvocation(BaseInvocation):
|
||||
|
||||
@validator("seed", pre=True)
|
||||
def modulo_seed(cls, v):
|
||||
"""Returns the seed modulo SEED_MAX to ensure it is within the valid range."""
|
||||
return v % SEED_MAX
|
||||
"""Returns the seed modulo (SEED_MAX + 1) to ensure it is within the valid range."""
|
||||
return v % (SEED_MAX + 1)
|
||||
|
||||
def invoke(self, context: InvocationContext) -> NoiseOutput:
|
||||
noise = get_noise(
|
||||
|
@ -6,6 +6,7 @@ from typing import List, Literal, Optional, Union
|
||||
from pydantic import Field, validator
|
||||
|
||||
from ...backend.model_management import ModelType, SubModelType
|
||||
from invokeai.app.util.step_callback import stable_diffusion_xl_step_callback
|
||||
from .baseinvocation import (BaseInvocation, BaseInvocationOutput,
|
||||
InvocationConfig, InvocationContext)
|
||||
|
||||
@ -137,7 +138,7 @@ class SDXLRefinerModelLoaderInvocation(BaseInvocation):
|
||||
"ui": {
|
||||
"title": "SDXL Refiner Model Loader",
|
||||
"tags": ["model", "loader", "sdxl_refiner"],
|
||||
"type_hints": {"model": "model"},
|
||||
"type_hints": {"model": "refiner_model"},
|
||||
},
|
||||
}
|
||||
|
||||
@ -243,10 +244,31 @@ class SDXLTextToLatentsInvocation(BaseInvocation):
|
||||
},
|
||||
}
|
||||
|
||||
def dispatch_progress(
|
||||
self,
|
||||
context: InvocationContext,
|
||||
source_node_id: str,
|
||||
sample,
|
||||
step,
|
||||
total_steps,
|
||||
) -> None:
|
||||
stable_diffusion_xl_step_callback(
|
||||
context=context,
|
||||
node=self.dict(),
|
||||
source_node_id=source_node_id,
|
||||
sample=sample,
|
||||
step=step,
|
||||
total_steps=total_steps,
|
||||
)
|
||||
|
||||
# based on
|
||||
# https://github.com/huggingface/diffusers/blob/3ebbaf7c96801271f9e6c21400033b6aa5ffcf29/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion.py#L375
|
||||
@torch.no_grad()
|
||||
def invoke(self, context: InvocationContext) -> LatentsOutput:
|
||||
graph_execution_state = context.services.graph_execution_manager.get(
|
||||
context.graph_execution_state_id
|
||||
)
|
||||
source_node_id = graph_execution_state.prepared_source_mapping[self.id]
|
||||
latents = context.services.latents.get(self.noise.latents_name)
|
||||
|
||||
positive_cond_data = context.services.latents.get(self.positive_conditioning.conditioning_name)
|
||||
@ -273,7 +295,7 @@ class SDXLTextToLatentsInvocation(BaseInvocation):
|
||||
|
||||
|
||||
unet_info = context.services.model_manager.get_model(
|
||||
**self.unet.unet.dict()
|
||||
**self.unet.unet.dict(), context=context
|
||||
)
|
||||
do_classifier_free_guidance = True
|
||||
cross_attention_kwargs = None
|
||||
@ -341,6 +363,7 @@ class SDXLTextToLatentsInvocation(BaseInvocation):
|
||||
# call the callback, if provided
|
||||
if i == len(timesteps) - 1 or ((i + 1) > num_warmup_steps and (i + 1) % scheduler.order == 0):
|
||||
progress_bar.update()
|
||||
self.dispatch_progress(context, source_node_id, latents, i, num_inference_steps)
|
||||
#if callback is not None and i % callback_steps == 0:
|
||||
# callback(i, t, latents)
|
||||
else:
|
||||
@ -409,6 +432,7 @@ class SDXLTextToLatentsInvocation(BaseInvocation):
|
||||
# call the callback, if provided
|
||||
if i == len(timesteps) - 1 or ((i + 1) > num_warmup_steps and (i + 1) % scheduler.order == 0):
|
||||
progress_bar.update()
|
||||
self.dispatch_progress(context, source_node_id, latents, i, num_inference_steps)
|
||||
#if callback is not None and i % callback_steps == 0:
|
||||
# callback(i, t, latents)
|
||||
|
||||
@ -439,8 +463,8 @@ class SDXLLatentsToLatentsInvocation(BaseInvocation):
|
||||
unet: UNetField = Field(default=None, description="UNet submodel")
|
||||
latents: Optional[LatentsField] = Field(description="Initial latents")
|
||||
|
||||
denoising_start: float = Field(default=0.0, ge=0, lt=1, description="")
|
||||
denoising_end: float = Field(default=1.0, gt=0, le=1, description="")
|
||||
denoising_start: float = Field(default=0.0, ge=0, le=1, description="")
|
||||
denoising_end: float = Field(default=1.0, ge=0, le=1, description="")
|
||||
|
||||
#control: Union[ControlField, list[ControlField]] = Field(default=None, description="The control to use")
|
||||
#seamless: bool = Field(default=False, description="Whether or not to generate an image that can tile without seams", )
|
||||
@ -473,10 +497,31 @@ class SDXLLatentsToLatentsInvocation(BaseInvocation):
|
||||
},
|
||||
}
|
||||
|
||||
def dispatch_progress(
|
||||
self,
|
||||
context: InvocationContext,
|
||||
source_node_id: str,
|
||||
sample,
|
||||
step,
|
||||
total_steps,
|
||||
) -> None:
|
||||
stable_diffusion_xl_step_callback(
|
||||
context=context,
|
||||
node=self.dict(),
|
||||
source_node_id=source_node_id,
|
||||
sample=sample,
|
||||
step=step,
|
||||
total_steps=total_steps,
|
||||
)
|
||||
|
||||
# based on
|
||||
# https://github.com/huggingface/diffusers/blob/3ebbaf7c96801271f9e6c21400033b6aa5ffcf29/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion.py#L375
|
||||
@torch.no_grad()
|
||||
def invoke(self, context: InvocationContext) -> LatentsOutput:
|
||||
graph_execution_state = context.services.graph_execution_manager.get(
|
||||
context.graph_execution_state_id
|
||||
)
|
||||
source_node_id = graph_execution_state.prepared_source_mapping[self.id]
|
||||
latents = context.services.latents.get(self.latents.latents_name)
|
||||
|
||||
positive_cond_data = context.services.latents.get(self.positive_conditioning.conditioning_name)
|
||||
@ -504,13 +549,13 @@ class SDXLLatentsToLatentsInvocation(BaseInvocation):
|
||||
num_inference_steps = num_inference_steps - t_start
|
||||
|
||||
# apply noise(if provided)
|
||||
if self.noise is not None:
|
||||
if self.noise is not None and timesteps.shape[0] > 0:
|
||||
noise = context.services.latents.get(self.noise.latents_name)
|
||||
latents = scheduler.add_noise(latents, noise, timesteps[:1])
|
||||
del noise
|
||||
|
||||
unet_info = context.services.model_manager.get_model(
|
||||
**self.unet.unet.dict()
|
||||
**self.unet.unet.dict(), context=context,
|
||||
)
|
||||
do_classifier_free_guidance = True
|
||||
cross_attention_kwargs = None
|
||||
@ -579,6 +624,7 @@ class SDXLLatentsToLatentsInvocation(BaseInvocation):
|
||||
# call the callback, if provided
|
||||
if i == len(timesteps) - 1 or ((i + 1) > num_warmup_steps and (i + 1) % scheduler.order == 0):
|
||||
progress_bar.update()
|
||||
self.dispatch_progress(context, source_node_id, latents, i, num_inference_steps)
|
||||
#if callback is not None and i % callback_steps == 0:
|
||||
# callback(i, t, latents)
|
||||
else:
|
||||
@ -647,6 +693,7 @@ class SDXLLatentsToLatentsInvocation(BaseInvocation):
|
||||
# call the callback, if provided
|
||||
if i == len(timesteps) - 1 or ((i + 1) > num_warmup_steps and (i + 1) % scheduler.order == 0):
|
||||
progress_bar.update()
|
||||
self.dispatch_progress(context, source_node_id, latents, i, num_inference_steps)
|
||||
#if callback is not None and i % callback_steps == 0:
|
||||
# callback(i, t, latents)
|
||||
|
||||
|
@ -1,9 +1,80 @@
|
||||
from enum import Enum
|
||||
from typing import Optional, Tuple
|
||||
from typing import Optional, Tuple, Literal
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
from invokeai.app.util.metaenum import MetaEnum
|
||||
from ..invocations.baseinvocation import (
|
||||
BaseInvocationOutput,
|
||||
InvocationConfig,
|
||||
)
|
||||
|
||||
class ImageField(BaseModel):
|
||||
"""An image field used for passing image objects between invocations"""
|
||||
|
||||
image_name: Optional[str] = Field(default=None, description="The name of the image")
|
||||
|
||||
class Config:
|
||||
schema_extra = {"required": ["image_name"]}
|
||||
|
||||
|
||||
class ColorField(BaseModel):
|
||||
r: int = Field(ge=0, le=255, description="The red component")
|
||||
g: int = Field(ge=0, le=255, description="The green component")
|
||||
b: int = Field(ge=0, le=255, description="The blue component")
|
||||
a: int = Field(ge=0, le=255, description="The alpha component")
|
||||
|
||||
def tuple(self) -> Tuple[int, int, int, int]:
|
||||
return (self.r, self.g, self.b, self.a)
|
||||
|
||||
|
||||
class ProgressImage(BaseModel):
|
||||
"""The progress image sent intermittently during processing"""
|
||||
|
||||
width: int = Field(description="The effective width of the image in pixels")
|
||||
height: int = Field(description="The effective height of the image in pixels")
|
||||
dataURL: str = Field(description="The image data as a b64 data URL")
|
||||
|
||||
class PILInvocationConfig(BaseModel):
|
||||
"""Helper class to provide all PIL invocations with additional config"""
|
||||
|
||||
class Config(InvocationConfig):
|
||||
schema_extra = {
|
||||
"ui": {
|
||||
"tags": ["PIL", "image"],
|
||||
},
|
||||
}
|
||||
|
||||
class ImageOutput(BaseInvocationOutput):
|
||||
"""Base class for invocations that output an image"""
|
||||
|
||||
# fmt: off
|
||||
type: Literal["image_output"] = "image_output"
|
||||
image: ImageField = Field(default=None, description="The output image")
|
||||
width: int = Field(description="The width of the image in pixels")
|
||||
height: int = Field(description="The height of the image in pixels")
|
||||
# fmt: on
|
||||
|
||||
class Config:
|
||||
schema_extra = {"required": ["type", "image", "width", "height"]}
|
||||
|
||||
|
||||
class MaskOutput(BaseInvocationOutput):
|
||||
"""Base class for invocations that output a mask"""
|
||||
|
||||
# fmt: off
|
||||
type: Literal["mask"] = "mask"
|
||||
mask: ImageField = Field(default=None, description="The output mask")
|
||||
width: int = Field(description="The width of the mask in pixels")
|
||||
height: int = Field(description="The height of the mask in pixels")
|
||||
# fmt: on
|
||||
|
||||
class Config:
|
||||
schema_extra = {
|
||||
"required": [
|
||||
"type",
|
||||
"mask",
|
||||
]
|
||||
}
|
||||
|
||||
class ResourceOrigin(str, Enum, metaclass=MetaEnum):
|
||||
"""The origin of a resource (eg image).
|
||||
@ -63,28 +134,3 @@ class InvalidImageCategoryException(ValueError):
|
||||
super().__init__(message)
|
||||
|
||||
|
||||
class ImageField(BaseModel):
|
||||
"""An image field used for passing image objects between invocations"""
|
||||
|
||||
image_name: Optional[str] = Field(default=None, description="The name of the image")
|
||||
|
||||
class Config:
|
||||
schema_extra = {"required": ["image_name"]}
|
||||
|
||||
|
||||
class ColorField(BaseModel):
|
||||
r: int = Field(ge=0, le=255, description="The red component")
|
||||
g: int = Field(ge=0, le=255, description="The green component")
|
||||
b: int = Field(ge=0, le=255, description="The blue component")
|
||||
a: int = Field(ge=0, le=255, description="The alpha component")
|
||||
|
||||
def tuple(self) -> Tuple[int, int, int, int]:
|
||||
return (self.r, self.g, self.b, self.a)
|
||||
|
||||
|
||||
class ProgressImage(BaseModel):
|
||||
"""The progress image sent intermittently during processing"""
|
||||
|
||||
width: int = Field(description="The effective width of the image in pixels")
|
||||
height: int = Field(description="The effective height of the image in pixels")
|
||||
dataURL: str = Field(description="The image data as a b64 data URL")
|
||||
|
@ -32,11 +32,11 @@ class BoardImageRecordStorageBase(ABC):
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def get_images_for_board(
|
||||
def get_all_board_image_names_for_board(
|
||||
self,
|
||||
board_id: str,
|
||||
) -> OffsetPaginatedResults[ImageRecord]:
|
||||
"""Gets images for a board."""
|
||||
) -> list[str]:
|
||||
"""Gets all board images for a board, as a list of the image names."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
@ -211,6 +211,26 @@ class SqliteBoardImageRecordStorage(BoardImageRecordStorageBase):
|
||||
items=images, offset=offset, limit=limit, total=count
|
||||
)
|
||||
|
||||
def get_all_board_image_names_for_board(self, board_id: str) -> list[str]:
|
||||
try:
|
||||
self._lock.acquire()
|
||||
self._cursor.execute(
|
||||
"""--sql
|
||||
SELECT image_name
|
||||
FROM board_images
|
||||
WHERE board_id = ?;
|
||||
""",
|
||||
(board_id,),
|
||||
)
|
||||
result = cast(list[sqlite3.Row], self._cursor.fetchall())
|
||||
image_names = list(map(lambda r: r[0], result))
|
||||
return image_names
|
||||
except sqlite3.Error as e:
|
||||
self._conn.rollback()
|
||||
raise e
|
||||
finally:
|
||||
self._lock.release()
|
||||
|
||||
def get_board_for_image(
|
||||
self,
|
||||
image_name: str,
|
||||
|
@ -38,11 +38,11 @@ class BoardImagesServiceABC(ABC):
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def get_images_for_board(
|
||||
def get_all_board_image_names_for_board(
|
||||
self,
|
||||
board_id: str,
|
||||
) -> OffsetPaginatedResults[ImageDTO]:
|
||||
"""Gets images for a board."""
|
||||
) -> list[str]:
|
||||
"""Gets all board images for a board, as a list of the image names."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
@ -98,30 +98,13 @@ class BoardImagesService(BoardImagesServiceABC):
|
||||
) -> None:
|
||||
self._services.board_image_records.remove_image_from_board(board_id, image_name)
|
||||
|
||||
def get_images_for_board(
|
||||
def get_all_board_image_names_for_board(
|
||||
self,
|
||||
board_id: str,
|
||||
) -> OffsetPaginatedResults[ImageDTO]:
|
||||
image_records = self._services.board_image_records.get_images_for_board(
|
||||
) -> list[str]:
|
||||
return self._services.board_image_records.get_all_board_image_names_for_board(
|
||||
board_id
|
||||
)
|
||||
image_dtos = list(
|
||||
map(
|
||||
lambda r: image_record_to_dto(
|
||||
r,
|
||||
self._services.urls.get_image_url(r.image_name),
|
||||
self._services.urls.get_image_url(r.image_name, True),
|
||||
board_id,
|
||||
),
|
||||
image_records.items,
|
||||
)
|
||||
)
|
||||
return OffsetPaginatedResults[ImageDTO](
|
||||
items=image_dtos,
|
||||
offset=image_records.offset,
|
||||
limit=image_records.limit,
|
||||
total=image_records.total,
|
||||
)
|
||||
|
||||
def get_board_for_image(
|
||||
self,
|
||||
@ -136,7 +119,7 @@ def board_record_to_dto(
|
||||
) -> BoardDTO:
|
||||
"""Converts a board record to a board DTO."""
|
||||
return BoardDTO(
|
||||
**board_record.dict(exclude={'cover_image_name'}),
|
||||
**board_record.dict(exclude={"cover_image_name"}),
|
||||
cover_image_name=cover_image_name,
|
||||
image_count=image_count,
|
||||
)
|
||||
|
@ -28,7 +28,6 @@ InvokeAI:
|
||||
always_use_cpu: false
|
||||
free_gpu_mem: false
|
||||
Features:
|
||||
nsfw_checker: true
|
||||
restore: true
|
||||
esrgan: true
|
||||
patchmatch: true
|
||||
@ -92,18 +91,18 @@ Typical usage at the top level file:
|
||||
|
||||
from invokeai.app.services.config import InvokeAIAppConfig
|
||||
|
||||
# get global configuration and print its nsfw_checker value
|
||||
# get global configuration and print its cache size
|
||||
conf = InvokeAIAppConfig.get_config()
|
||||
conf.parse_args()
|
||||
print(conf.nsfw_checker)
|
||||
print(conf.max_cache_size)
|
||||
|
||||
Typical usage in a backend module:
|
||||
|
||||
from invokeai.app.services.config import InvokeAIAppConfig
|
||||
|
||||
# get global configuration and print its nsfw_checker value
|
||||
# get global configuration and print its cache size value
|
||||
conf = InvokeAIAppConfig.get_config()
|
||||
print(conf.nsfw_checker)
|
||||
print(conf.max_cache_size)
|
||||
|
||||
|
||||
Computed properties:
|
||||
@ -277,7 +276,7 @@ class InvokeAISettings(BaseSettings):
|
||||
@classmethod
|
||||
def _excluded_from_yaml(self)->List[str]:
|
||||
# combination of deprecated parameters and internal ones that shouldn't be exposed as invokeai.yaml options
|
||||
return ['type','initconf', 'gpu_mem_reserved', 'max_loaded_models', 'version', 'from_file', 'model', 'restore']
|
||||
return ['type','initconf', 'gpu_mem_reserved', 'max_loaded_models', 'version', 'from_file', 'model', 'restore', 'root', 'nsfw_checker']
|
||||
|
||||
class Config:
|
||||
env_file_encoding = 'utf-8'
|
||||
@ -364,7 +363,6 @@ setting environment variables INVOKEAI_<setting>.
|
||||
esrgan : bool = Field(default=True, description="Enable/disable upscaling code", category='Features')
|
||||
internet_available : bool = Field(default=True, description="If true, attempt to download models on the fly; otherwise only use local models", category='Features')
|
||||
log_tokenization : bool = Field(default=False, description="Enable logging of parsed prompt tokens.", category='Features')
|
||||
nsfw_checker : bool = Field(default=True, description="Enable/disable the NSFW checker", category='Features')
|
||||
patchmatch : bool = Field(default=True, description="Enable/disable patchmatch inpaint code", category='Features')
|
||||
restore : bool = Field(default=True, description="Enable/disable face restoration code (DEPRECATED)", category='DEPRECATED')
|
||||
|
||||
@ -374,16 +372,17 @@ setting environment variables INVOKEAI_<setting>.
|
||||
max_cache_size : float = Field(default=6.0, gt=0, description="Maximum memory amount used by model cache for rapid switching", category='Memory/Performance')
|
||||
max_vram_cache_size : float = Field(default=2.75, ge=0, description="Amount of VRAM reserved for model storage", category='Memory/Performance')
|
||||
gpu_mem_reserved : float = Field(default=2.75, ge=0, description="DEPRECATED: use max_vram_cache_size. Amount of VRAM reserved for model storage", category='DEPRECATED')
|
||||
precision : Literal[tuple(['auto','float16','float32','autocast'])] = Field(default='float16',description='Floating point precision', category='Memory/Performance')
|
||||
nsfw_checker : bool = Field(default=True, description="DEPRECATED: use Web settings to enable/disable", category='DEPRECATED')
|
||||
precision : Literal[tuple(['auto','float16','float32','autocast'])] = Field(default='auto',description='Floating point precision', category='Memory/Performance')
|
||||
sequential_guidance : bool = Field(default=False, description="Whether to calculate guidance in serial instead of in parallel, lowering memory requirements", category='Memory/Performance')
|
||||
xformers_enabled : bool = Field(default=True, description="Enable/disable memory-efficient attention", category='Memory/Performance')
|
||||
tiled_decode : bool = Field(default=False, description="Whether to enable tiled VAE decode (reduces memory consumption with some performance penalty)", category='Memory/Performance')
|
||||
|
||||
root : Path = Field(default=_find_root(), description='InvokeAI runtime root directory', category='Paths')
|
||||
autoimport_dir : Path = Field(default='autoimport/main', description='Path to a directory of models files to be imported on startup.', category='Paths')
|
||||
lora_dir : Path = Field(default='autoimport/lora', description='Path to a directory of LoRA/LyCORIS models to be imported on startup.', category='Paths')
|
||||
embedding_dir : Path = Field(default='autoimport/embedding', description='Path to a directory of Textual Inversion embeddings to be imported on startup.', category='Paths')
|
||||
controlnet_dir : Path = Field(default='autoimport/controlnet', description='Path to a directory of ControlNet embeddings to be imported on startup.', category='Paths')
|
||||
autoimport_dir : Path = Field(default='autoimport', description='Path to a directory of models files to be imported on startup.', category='Paths')
|
||||
lora_dir : Path = Field(default=None, description='Path to a directory of LoRA/LyCORIS models to be imported on startup.', category='Paths')
|
||||
embedding_dir : Path = Field(default=None, description='Path to a directory of Textual Inversion embeddings to be imported on startup.', category='Paths')
|
||||
controlnet_dir : Path = Field(default=None, description='Path to a directory of ControlNet embeddings to be imported on startup.', category='Paths')
|
||||
conf_path : Path = Field(default='configs/models.yaml', description='Path to models definition file', category='Paths')
|
||||
models_dir : Path = Field(default='models', description='Path to the models directory', category='Paths')
|
||||
legacy_conf_dir : Path = Field(default='configs/stable-diffusion', description='Path to directory of legacy checkpoint config files', category='Paths')
|
||||
@ -397,7 +396,7 @@ setting environment variables INVOKEAI_<setting>.
|
||||
log_handlers : List[str] = Field(default=["console"], description='Log handler. Valid options are "console", "file=<path>", "syslog=path|address:host:port", "http=<url>"', category="Logging")
|
||||
# note - would be better to read the log_format values from logging.py, but this creates circular dependencies issues
|
||||
log_format : Literal[tuple(['plain','color','syslog','legacy'])] = Field(default="color", description='Log format. Use "plain" for text-only, "color" for colorized output, "legacy" for 2.3-style logging and "syslog" for syslog-style', category="Logging")
|
||||
log_level : Literal[tuple(["debug","info","warning","error","critical"])] = Field(default="debug", description="Emit logging messages at this level or higher", category="Logging")
|
||||
log_level : Literal[tuple(["debug","info","warning","error","critical"])] = Field(default="info", description="Emit logging messages at this level or higher", category="Logging")
|
||||
|
||||
version : bool = Field(default=False, description="Show InvokeAI version and exit", category="Other")
|
||||
#fmt: on
|
||||
@ -446,7 +445,7 @@ setting environment variables INVOKEAI_<setting>.
|
||||
Path to the runtime root directory
|
||||
'''
|
||||
if self.root:
|
||||
return Path(self.root).expanduser()
|
||||
return Path(self.root).expanduser().absolute()
|
||||
else:
|
||||
return self.find_root()
|
||||
|
||||
@ -525,6 +524,16 @@ setting environment variables INVOKEAI_<setting>.
|
||||
"""Return true if patchmatch true"""
|
||||
return self.patchmatch
|
||||
|
||||
@property
|
||||
def nsfw_checker(self)->bool:
|
||||
""" NSFW node is always active and disabled from Web UIe"""
|
||||
return True
|
||||
|
||||
@property
|
||||
def invisible_watermark(self)->bool:
|
||||
""" invisible watermark node is always active and disabled from Web UIe"""
|
||||
return True
|
||||
|
||||
@staticmethod
|
||||
def find_root()->Path:
|
||||
'''
|
||||
|
@ -1,4 +1,5 @@
|
||||
from ..invocations.latent import LatentsToImageInvocation, TextToLatentsInvocation
|
||||
from ..invocations.image import ImageNSFWBlurInvocation
|
||||
from ..invocations.noise import NoiseInvocation
|
||||
from ..invocations.compel import CompelInvocation
|
||||
from ..invocations.params import ParamIntInvocation
|
||||
@ -24,6 +25,7 @@ def create_text_to_image() -> LibraryGraph:
|
||||
'5': CompelInvocation(id='5'),
|
||||
'6': TextToLatentsInvocation(id='6'),
|
||||
'7': LatentsToImageInvocation(id='7'),
|
||||
'8': ImageNSFWBlurInvocation(id='8'),
|
||||
},
|
||||
edges=[
|
||||
Edge(source=EdgeConnection(node_id='width', field='a'), destination=EdgeConnection(node_id='3', field='width')),
|
||||
@ -33,6 +35,7 @@ def create_text_to_image() -> LibraryGraph:
|
||||
Edge(source=EdgeConnection(node_id='6', field='latents'), destination=EdgeConnection(node_id='7', field='latents')),
|
||||
Edge(source=EdgeConnection(node_id='4', field='conditioning'), destination=EdgeConnection(node_id='6', field='positive_conditioning')),
|
||||
Edge(source=EdgeConnection(node_id='5', field='conditioning'), destination=EdgeConnection(node_id='6', field='negative_conditioning')),
|
||||
Edge(source=EdgeConnection(node_id='7', field='image'), destination=EdgeConnection(node_id='8', field='image')),
|
||||
]
|
||||
),
|
||||
exposed_inputs=[
|
||||
@ -43,7 +46,7 @@ def create_text_to_image() -> LibraryGraph:
|
||||
ExposedNodeInput(node_path='seed', field='a', alias='seed'),
|
||||
],
|
||||
exposed_outputs=[
|
||||
ExposedNodeOutput(node_path='7', field='image', alias='image')
|
||||
ExposedNodeOutput(node_path='8', field='image', alias='image')
|
||||
])
|
||||
|
||||
|
||||
|
@ -3,7 +3,13 @@
|
||||
from typing import Any, Optional
|
||||
from invokeai.app.models.image import ProgressImage
|
||||
from invokeai.app.util.misc import get_timestamp
|
||||
from invokeai.app.services.model_manager_service import BaseModelType, ModelType, SubModelType, ModelInfo
|
||||
from invokeai.app.services.model_manager_service import (
|
||||
BaseModelType,
|
||||
ModelType,
|
||||
SubModelType,
|
||||
ModelInfo,
|
||||
)
|
||||
|
||||
|
||||
class EventServiceBase:
|
||||
session_event: str = "session_event"
|
||||
@ -38,7 +44,9 @@ class EventServiceBase:
|
||||
graph_execution_state_id=graph_execution_state_id,
|
||||
node=node,
|
||||
source_node_id=source_node_id,
|
||||
progress_image=progress_image.dict() if progress_image is not None else None,
|
||||
progress_image=progress_image.dict()
|
||||
if progress_image is not None
|
||||
else None,
|
||||
step=step,
|
||||
total_steps=total_steps,
|
||||
),
|
||||
@ -67,6 +75,7 @@ class EventServiceBase:
|
||||
graph_execution_state_id: str,
|
||||
node: dict,
|
||||
source_node_id: str,
|
||||
error_type: str,
|
||||
error: str,
|
||||
) -> None:
|
||||
"""Emitted when an invocation has completed"""
|
||||
@ -76,6 +85,7 @@ class EventServiceBase:
|
||||
graph_execution_state_id=graph_execution_state_id,
|
||||
node=node,
|
||||
source_node_id=source_node_id,
|
||||
error_type=error_type,
|
||||
error=error,
|
||||
),
|
||||
)
|
||||
@ -102,13 +112,13 @@ class EventServiceBase:
|
||||
),
|
||||
)
|
||||
|
||||
def emit_model_load_started (
|
||||
self,
|
||||
graph_execution_state_id: str,
|
||||
model_name: str,
|
||||
base_model: BaseModelType,
|
||||
model_type: ModelType,
|
||||
submodel: SubModelType,
|
||||
def emit_model_load_started(
|
||||
self,
|
||||
graph_execution_state_id: str,
|
||||
model_name: str,
|
||||
base_model: BaseModelType,
|
||||
model_type: ModelType,
|
||||
submodel: SubModelType,
|
||||
) -> None:
|
||||
"""Emitted when a model is requested"""
|
||||
self.__emit_session_event(
|
||||
@ -123,13 +133,13 @@ class EventServiceBase:
|
||||
)
|
||||
|
||||
def emit_model_load_completed(
|
||||
self,
|
||||
graph_execution_state_id: str,
|
||||
model_name: str,
|
||||
base_model: BaseModelType,
|
||||
model_type: ModelType,
|
||||
submodel: SubModelType,
|
||||
model_info: ModelInfo,
|
||||
self,
|
||||
graph_execution_state_id: str,
|
||||
model_name: str,
|
||||
base_model: BaseModelType,
|
||||
model_type: ModelType,
|
||||
submodel: SubModelType,
|
||||
model_info: ModelInfo,
|
||||
) -> None:
|
||||
"""Emitted when a model is correctly loaded (returns model info)"""
|
||||
self.__emit_session_event(
|
||||
@ -141,7 +151,41 @@ class EventServiceBase:
|
||||
model_type=model_type,
|
||||
submodel=submodel,
|
||||
hash=model_info.hash,
|
||||
location=model_info.location,
|
||||
location=str(model_info.location),
|
||||
precision=str(model_info.precision),
|
||||
),
|
||||
)
|
||||
|
||||
def emit_session_retrieval_error(
|
||||
self,
|
||||
graph_execution_state_id: str,
|
||||
error_type: str,
|
||||
error: str,
|
||||
) -> None:
|
||||
"""Emitted when session retrieval fails"""
|
||||
self.__emit_session_event(
|
||||
event_name="session_retrieval_error",
|
||||
payload=dict(
|
||||
graph_execution_state_id=graph_execution_state_id,
|
||||
error_type=error_type,
|
||||
error=error,
|
||||
),
|
||||
)
|
||||
|
||||
def emit_invocation_retrieval_error(
|
||||
self,
|
||||
graph_execution_state_id: str,
|
||||
node_id: str,
|
||||
error_type: str,
|
||||
error: str,
|
||||
) -> None:
|
||||
"""Emitted when invocation retrieval fails"""
|
||||
self.__emit_session_event(
|
||||
event_name="invocation_retrieval_error",
|
||||
payload=dict(
|
||||
graph_execution_state_id=graph_execution_state_id,
|
||||
node_id=node_id,
|
||||
error_type=error_type,
|
||||
error=error,
|
||||
),
|
||||
)
|
||||
|
@ -10,7 +10,10 @@ from pydantic.generics import GenericModel
|
||||
|
||||
from invokeai.app.models.image import ImageCategory, ResourceOrigin
|
||||
from invokeai.app.services.models.image_record import (
|
||||
ImageRecord, ImageRecordChanges, deserialize_image_record)
|
||||
ImageRecord,
|
||||
ImageRecordChanges,
|
||||
deserialize_image_record,
|
||||
)
|
||||
|
||||
T = TypeVar("T", bound=BaseModel)
|
||||
|
||||
@ -97,8 +100,8 @@ class ImageRecordStorageBase(ABC):
|
||||
@abstractmethod
|
||||
def get_many(
|
||||
self,
|
||||
offset: int = 0,
|
||||
limit: int = 10,
|
||||
offset: Optional[int] = None,
|
||||
limit: Optional[int] = None,
|
||||
image_origin: Optional[ResourceOrigin] = None,
|
||||
categories: Optional[list[ImageCategory]] = None,
|
||||
is_intermediate: Optional[bool] = None,
|
||||
@ -119,6 +122,11 @@ class ImageRecordStorageBase(ABC):
|
||||
"""Deletes many image records."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def delete_intermediates(self) -> list[str]:
|
||||
"""Deletes all intermediate image records, returning a list of deleted image names."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def save(
|
||||
self,
|
||||
@ -322,8 +330,8 @@ class SqliteImageRecordStorage(ImageRecordStorageBase):
|
||||
|
||||
def get_many(
|
||||
self,
|
||||
offset: int = 0,
|
||||
limit: int = 10,
|
||||
offset: Optional[int] = None,
|
||||
limit: Optional[int] = None,
|
||||
image_origin: Optional[ResourceOrigin] = None,
|
||||
categories: Optional[list[ImageCategory]] = None,
|
||||
is_intermediate: Optional[bool] = None,
|
||||
@ -377,11 +385,15 @@ class SqliteImageRecordStorage(ImageRecordStorageBase):
|
||||
|
||||
query_params.append(is_intermediate)
|
||||
|
||||
if board_id is not None:
|
||||
# board_id of "none" is reserved for images without a board
|
||||
if board_id == "none":
|
||||
query_conditions += """--sql
|
||||
AND board_images.board_id IS NULL
|
||||
"""
|
||||
elif board_id is not None:
|
||||
query_conditions += """--sql
|
||||
AND board_images.board_id = ?
|
||||
"""
|
||||
|
||||
query_params.append(board_id)
|
||||
|
||||
query_pagination = """--sql
|
||||
@ -392,8 +404,12 @@ class SqliteImageRecordStorage(ImageRecordStorageBase):
|
||||
images_query += query_conditions + query_pagination + ";"
|
||||
# Add all the parameters
|
||||
images_params = query_params.copy()
|
||||
images_params.append(limit)
|
||||
images_params.append(offset)
|
||||
|
||||
if limit is not None:
|
||||
images_params.append(limit)
|
||||
if offset is not None:
|
||||
images_params.append(offset)
|
||||
|
||||
# Build the list of images, deserializing each row
|
||||
self._cursor.execute(images_query, images_params)
|
||||
result = cast(list[sqlite3.Row], self._cursor.fetchall())
|
||||
@ -450,6 +466,32 @@ class SqliteImageRecordStorage(ImageRecordStorageBase):
|
||||
finally:
|
||||
self._lock.release()
|
||||
|
||||
|
||||
def delete_intermediates(self) -> list[str]:
|
||||
try:
|
||||
self._lock.acquire()
|
||||
self._cursor.execute(
|
||||
"""--sql
|
||||
SELECT image_name FROM images
|
||||
WHERE is_intermediate = TRUE;
|
||||
"""
|
||||
)
|
||||
result = cast(list[sqlite3.Row], self._cursor.fetchall())
|
||||
image_names = list(map(lambda r: r[0], result))
|
||||
self._cursor.execute(
|
||||
"""--sql
|
||||
DELETE FROM images
|
||||
WHERE is_intermediate = TRUE;
|
||||
"""
|
||||
)
|
||||
self._conn.commit()
|
||||
return image_names
|
||||
except sqlite3.Error as e:
|
||||
self._conn.rollback()
|
||||
raise ImageRecordDeleteException from e
|
||||
finally:
|
||||
self._lock.release()
|
||||
|
||||
def save(
|
||||
self,
|
||||
image_name: str,
|
||||
|
@ -6,22 +6,33 @@ from typing import TYPE_CHECKING, Optional
|
||||
from PIL.Image import Image as PILImageType
|
||||
|
||||
from invokeai.app.invocations.metadata import ImageMetadata
|
||||
from invokeai.app.models.image import (ImageCategory,
|
||||
InvalidImageCategoryException,
|
||||
InvalidOriginException, ResourceOrigin)
|
||||
from invokeai.app.services.board_image_record_storage import \
|
||||
BoardImageRecordStorageBase
|
||||
from invokeai.app.services.graph import Graph
|
||||
from invokeai.app.models.image import (
|
||||
ImageCategory,
|
||||
InvalidImageCategoryException,
|
||||
InvalidOriginException,
|
||||
ResourceOrigin,
|
||||
)
|
||||
from invokeai.app.services.board_image_record_storage import BoardImageRecordStorageBase
|
||||
from invokeai.app.services.image_file_storage import (
|
||||
ImageFileDeleteException, ImageFileNotFoundException,
|
||||
ImageFileSaveException, ImageFileStorageBase)
|
||||
ImageFileDeleteException,
|
||||
ImageFileNotFoundException,
|
||||
ImageFileSaveException,
|
||||
ImageFileStorageBase,
|
||||
)
|
||||
from invokeai.app.services.image_record_storage import (
|
||||
ImageRecordDeleteException, ImageRecordNotFoundException,
|
||||
ImageRecordSaveException, ImageRecordStorageBase, OffsetPaginatedResults)
|
||||
ImageRecordDeleteException,
|
||||
ImageRecordNotFoundException,
|
||||
ImageRecordSaveException,
|
||||
ImageRecordStorageBase,
|
||||
OffsetPaginatedResults,
|
||||
)
|
||||
from invokeai.app.services.item_storage import ItemStorageABC
|
||||
from invokeai.app.services.models.image_record import (ImageDTO, ImageRecord,
|
||||
ImageRecordChanges,
|
||||
image_record_to_dto)
|
||||
from invokeai.app.services.models.image_record import (
|
||||
ImageDTO,
|
||||
ImageRecord,
|
||||
ImageRecordChanges,
|
||||
image_record_to_dto,
|
||||
)
|
||||
from invokeai.app.services.resource_name import NameServiceBase
|
||||
from invokeai.app.services.urls import UrlServiceBase
|
||||
from invokeai.app.util.metadata import get_metadata_graph_from_raw_session
|
||||
@ -41,6 +52,7 @@ class ImageServiceABC(ABC):
|
||||
image_category: ImageCategory,
|
||||
node_id: Optional[str] = None,
|
||||
session_id: Optional[str] = None,
|
||||
board_id: Optional[str] = None,
|
||||
is_intermediate: bool = False,
|
||||
metadata: Optional[dict] = None,
|
||||
) -> ImageDTO:
|
||||
@ -109,6 +121,11 @@ class ImageServiceABC(ABC):
|
||||
"""Deletes an image."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def delete_intermediates(self) -> int:
|
||||
"""Deletes all intermediate images."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def delete_images_on_board(self, board_id: str):
|
||||
"""Deletes all images on a board."""
|
||||
@ -158,6 +175,7 @@ class ImageService(ImageServiceABC):
|
||||
image_category: ImageCategory,
|
||||
node_id: Optional[str] = None,
|
||||
session_id: Optional[str] = None,
|
||||
board_id: Optional[str] = None,
|
||||
is_intermediate: bool = False,
|
||||
metadata: Optional[dict] = None,
|
||||
) -> ImageDTO:
|
||||
@ -198,11 +216,13 @@ class ImageService(ImageServiceABC):
|
||||
metadata=metadata,
|
||||
session_id=session_id,
|
||||
)
|
||||
|
||||
if board_id is not None:
|
||||
self._services.board_image_records.add_image_to_board(
|
||||
board_id=board_id, image_name=image_name
|
||||
)
|
||||
self._services.image_files.save(
|
||||
image_name=image_name, image=image, metadata=metadata, graph=graph
|
||||
)
|
||||
|
||||
image_dto = self.get_dto(image_name)
|
||||
|
||||
return image_dto
|
||||
@ -213,7 +233,7 @@ class ImageService(ImageServiceABC):
|
||||
self._services.logger.error("Failed to save image file")
|
||||
raise
|
||||
except Exception as e:
|
||||
self._services.logger.error("Problem saving image record and file")
|
||||
self._services.logger.error(f"Problem saving image record and file: {str(e)}")
|
||||
raise e
|
||||
|
||||
def update(
|
||||
@ -378,16 +398,31 @@ class ImageService(ImageServiceABC):
|
||||
|
||||
def delete_images_on_board(self, board_id: str):
|
||||
try:
|
||||
images = self._services.board_image_records.get_images_for_board(board_id)
|
||||
image_name_list = list(
|
||||
map(
|
||||
lambda r: r.image_name,
|
||||
images.items,
|
||||
image_names = (
|
||||
self._services.board_image_records.get_all_board_image_names_for_board(
|
||||
board_id
|
||||
)
|
||||
)
|
||||
for image_name in image_name_list:
|
||||
for image_name in image_names:
|
||||
self._services.image_files.delete(image_name)
|
||||
self._services.image_records.delete_many(image_name_list)
|
||||
self._services.image_records.delete_many(image_names)
|
||||
except ImageRecordDeleteException:
|
||||
self._services.logger.error(f"Failed to delete image records")
|
||||
raise
|
||||
except ImageFileDeleteException:
|
||||
self._services.logger.error(f"Failed to delete image files")
|
||||
raise
|
||||
except Exception as e:
|
||||
self._services.logger.error("Problem deleting image records and files")
|
||||
raise e
|
||||
|
||||
def delete_intermediates(self) -> int:
|
||||
try:
|
||||
image_names = self._services.image_records.delete_intermediates()
|
||||
count = len(image_names)
|
||||
for image_name in image_names:
|
||||
self._services.image_files.delete(image_name)
|
||||
return count
|
||||
except ImageRecordDeleteException:
|
||||
self._services.logger.error(f"Failed to delete image records")
|
||||
raise
|
||||
|
@ -299,10 +299,11 @@ class ModelManagerService(ModelManagerServiceBase):
|
||||
else:
|
||||
config_file = config.root_dir / "configs/models.yaml"
|
||||
|
||||
logger.debug(f'config file={config_file}')
|
||||
logger.debug(f'Config file={config_file}')
|
||||
|
||||
device = torch.device(choose_torch_device())
|
||||
logger.debug(f'GPU device = {device}')
|
||||
device_name = torch.cuda.get_device_name() if device==torch.device('cuda') else ''
|
||||
logger.info(f'GPU device = {device} {device_name}')
|
||||
|
||||
precision = config.precision
|
||||
if precision == "auto":
|
||||
@ -599,7 +600,7 @@ class ModelManagerService(ModelManagerServiceBase):
|
||||
"""
|
||||
Return list of all models found in the designated directory.
|
||||
"""
|
||||
search = FindModels(directory,self.logger)
|
||||
search = FindModels([directory], self.logger)
|
||||
return search.list_models()
|
||||
|
||||
def sync_to_config(self):
|
||||
|
@ -39,21 +39,41 @@ class DefaultInvocationProcessor(InvocationProcessorABC):
|
||||
try:
|
||||
queue_item: InvocationQueueItem = self.__invoker.services.queue.get()
|
||||
except Exception as e:
|
||||
logger.debug("Exception while getting from queue: %s" % e)
|
||||
self.__invoker.services.logger.error("Exception while getting from queue:\n%s" % e)
|
||||
|
||||
if not queue_item: # Probably stopping
|
||||
# do not hammer the queue
|
||||
time.sleep(0.5)
|
||||
continue
|
||||
|
||||
graph_execution_state = (
|
||||
self.__invoker.services.graph_execution_manager.get(
|
||||
queue_item.graph_execution_state_id
|
||||
try:
|
||||
graph_execution_state = (
|
||||
self.__invoker.services.graph_execution_manager.get(
|
||||
queue_item.graph_execution_state_id
|
||||
)
|
||||
)
|
||||
)
|
||||
invocation = graph_execution_state.execution_graph.get_node(
|
||||
queue_item.invocation_id
|
||||
)
|
||||
except Exception as e:
|
||||
self.__invoker.services.logger.error("Exception while retrieving session:\n%s" % e)
|
||||
self.__invoker.services.events.emit_session_retrieval_error(
|
||||
graph_execution_state_id=queue_item.graph_execution_state_id,
|
||||
error_type=e.__class__.__name__,
|
||||
error=traceback.format_exc(),
|
||||
)
|
||||
continue
|
||||
|
||||
try:
|
||||
invocation = graph_execution_state.execution_graph.get_node(
|
||||
queue_item.invocation_id
|
||||
)
|
||||
except Exception as e:
|
||||
self.__invoker.services.logger.error("Exception while retrieving invocation:\n%s" % e)
|
||||
self.__invoker.services.events.emit_invocation_retrieval_error(
|
||||
graph_execution_state_id=queue_item.graph_execution_state_id,
|
||||
node_id=queue_item.invocation_id,
|
||||
error_type=e.__class__.__name__,
|
||||
error=traceback.format_exc(),
|
||||
)
|
||||
continue
|
||||
|
||||
# get the source node id to provide to clients (the prepared node id is not as useful)
|
||||
source_node_id = graph_execution_state.prepared_source_mapping[invocation.id]
|
||||
@ -114,11 +134,13 @@ class DefaultInvocationProcessor(InvocationProcessorABC):
|
||||
graph_execution_state
|
||||
)
|
||||
|
||||
self.__invoker.services.logger.error("Error while invoking:\n%s" % e)
|
||||
# Send error event
|
||||
self.__invoker.services.events.emit_invocation_error(
|
||||
graph_execution_state_id=graph_execution_state.id,
|
||||
node=invocation.dict(),
|
||||
source_node_id=source_node_id,
|
||||
error_type=e.__class__.__name__,
|
||||
error=error,
|
||||
)
|
||||
|
||||
@ -136,11 +158,12 @@ class DefaultInvocationProcessor(InvocationProcessorABC):
|
||||
try:
|
||||
self.__invoker.invoke(graph_execution_state, invoke_all=True)
|
||||
except Exception as e:
|
||||
logger.error("Error while invoking: %s" % e)
|
||||
self.__invoker.services.logger.error("Error while invoking:\n%s" % e)
|
||||
self.__invoker.services.events.emit_invocation_error(
|
||||
graph_execution_state_id=graph_execution_state.id,
|
||||
node=invocation.dict(),
|
||||
source_node_id=source_node_id,
|
||||
error_type=e.__class__.__name__,
|
||||
error=traceback.format_exc()
|
||||
)
|
||||
elif is_complete:
|
||||
|
342
invokeai/app/util/controlnet_utils.py
Normal file
@ -0,0 +1,342 @@
|
||||
import torch
|
||||
import numpy as np
|
||||
import cv2
|
||||
from PIL import Image
|
||||
from diffusers.utils import PIL_INTERPOLATION
|
||||
|
||||
from einops import rearrange
|
||||
from controlnet_aux.util import HWC3, resize_image
|
||||
|
||||
###################################################################
|
||||
# Copy of scripts/lvminthin.py from Mikubill/sd-webui-controlnet
|
||||
###################################################################
|
||||
# High Quality Edge Thinning using Pure Python
|
||||
# Written by Lvmin Zhangu
|
||||
# 2023 April
|
||||
# Stanford University
|
||||
# If you use this, please Cite "High Quality Edge Thinning using Pure Python", Lvmin Zhang, In Mikubill/sd-webui-controlnet.
|
||||
|
||||
lvmin_kernels_raw = [
|
||||
np.array([
|
||||
[-1, -1, -1],
|
||||
[0, 1, 0],
|
||||
[1, 1, 1]
|
||||
], dtype=np.int32),
|
||||
np.array([
|
||||
[0, -1, -1],
|
||||
[1, 1, -1],
|
||||
[0, 1, 0]
|
||||
], dtype=np.int32)
|
||||
]
|
||||
|
||||
lvmin_kernels = []
|
||||
lvmin_kernels += [np.rot90(x, k=0, axes=(0, 1)) for x in lvmin_kernels_raw]
|
||||
lvmin_kernels += [np.rot90(x, k=1, axes=(0, 1)) for x in lvmin_kernels_raw]
|
||||
lvmin_kernels += [np.rot90(x, k=2, axes=(0, 1)) for x in lvmin_kernels_raw]
|
||||
lvmin_kernels += [np.rot90(x, k=3, axes=(0, 1)) for x in lvmin_kernels_raw]
|
||||
|
||||
lvmin_prunings_raw = [
|
||||
np.array([
|
||||
[-1, -1, -1],
|
||||
[-1, 1, -1],
|
||||
[0, 0, -1]
|
||||
], dtype=np.int32),
|
||||
np.array([
|
||||
[-1, -1, -1],
|
||||
[-1, 1, -1],
|
||||
[-1, 0, 0]
|
||||
], dtype=np.int32)
|
||||
]
|
||||
|
||||
lvmin_prunings = []
|
||||
lvmin_prunings += [np.rot90(x, k=0, axes=(0, 1)) for x in lvmin_prunings_raw]
|
||||
lvmin_prunings += [np.rot90(x, k=1, axes=(0, 1)) for x in lvmin_prunings_raw]
|
||||
lvmin_prunings += [np.rot90(x, k=2, axes=(0, 1)) for x in lvmin_prunings_raw]
|
||||
lvmin_prunings += [np.rot90(x, k=3, axes=(0, 1)) for x in lvmin_prunings_raw]
|
||||
|
||||
|
||||
def remove_pattern(x, kernel):
|
||||
objects = cv2.morphologyEx(x, cv2.MORPH_HITMISS, kernel)
|
||||
objects = np.where(objects > 127)
|
||||
x[objects] = 0
|
||||
return x, objects[0].shape[0] > 0
|
||||
|
||||
|
||||
def thin_one_time(x, kernels):
|
||||
y = x
|
||||
is_done = True
|
||||
for k in kernels:
|
||||
y, has_update = remove_pattern(y, k)
|
||||
if has_update:
|
||||
is_done = False
|
||||
return y, is_done
|
||||
|
||||
|
||||
def lvmin_thin(x, prunings=True):
|
||||
y = x
|
||||
for i in range(32):
|
||||
y, is_done = thin_one_time(y, lvmin_kernels)
|
||||
if is_done:
|
||||
break
|
||||
if prunings:
|
||||
y, _ = thin_one_time(y, lvmin_prunings)
|
||||
return y
|
||||
|
||||
|
||||
def nake_nms(x):
|
||||
f1 = np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]], dtype=np.uint8)
|
||||
f2 = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]], dtype=np.uint8)
|
||||
f3 = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=np.uint8)
|
||||
f4 = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]], dtype=np.uint8)
|
||||
y = np.zeros_like(x)
|
||||
for f in [f1, f2, f3, f4]:
|
||||
np.putmask(y, cv2.dilate(x, kernel=f) == x, x)
|
||||
return y
|
||||
|
||||
|
||||
################################################################################
|
||||
# copied from Mikubill/sd-webui-controlnet external_code.py and modified for InvokeAI
|
||||
################################################################################
|
||||
# FIXME: not using yet, if used in the future will most likely require modification of preprocessors
|
||||
def pixel_perfect_resolution(
|
||||
image: np.ndarray,
|
||||
target_H: int,
|
||||
target_W: int,
|
||||
resize_mode: str,
|
||||
) -> int:
|
||||
"""
|
||||
Calculate the estimated resolution for resizing an image while preserving aspect ratio.
|
||||
|
||||
The function first calculates scaling factors for height and width of the image based on the target
|
||||
height and width. Then, based on the chosen resize mode, it either takes the smaller or the larger
|
||||
scaling factor to estimate the new resolution.
|
||||
|
||||
If the resize mode is OUTER_FIT, the function uses the smaller scaling factor, ensuring the whole image
|
||||
fits within the target dimensions, potentially leaving some empty space.
|
||||
|
||||
If the resize mode is not OUTER_FIT, the function uses the larger scaling factor, ensuring the target
|
||||
dimensions are fully filled, potentially cropping the image.
|
||||
|
||||
After calculating the estimated resolution, the function prints some debugging information.
|
||||
|
||||
Args:
|
||||
image (np.ndarray): A 3D numpy array representing an image. The dimensions represent [height, width, channels].
|
||||
target_H (int): The target height for the image.
|
||||
target_W (int): The target width for the image.
|
||||
resize_mode (ResizeMode): The mode for resizing.
|
||||
|
||||
Returns:
|
||||
int: The estimated resolution after resizing.
|
||||
"""
|
||||
raw_H, raw_W, _ = image.shape
|
||||
|
||||
k0 = float(target_H) / float(raw_H)
|
||||
k1 = float(target_W) / float(raw_W)
|
||||
|
||||
if resize_mode == "fill_resize":
|
||||
estimation = min(k0, k1) * float(min(raw_H, raw_W))
|
||||
else: # "crop_resize" or "just_resize" (or possibly "just_resize_simple"?)
|
||||
estimation = max(k0, k1) * float(min(raw_H, raw_W))
|
||||
|
||||
# print(f"Pixel Perfect Computation:")
|
||||
# print(f"resize_mode = {resize_mode}")
|
||||
# print(f"raw_H = {raw_H}")
|
||||
# print(f"raw_W = {raw_W}")
|
||||
# print(f"target_H = {target_H}")
|
||||
# print(f"target_W = {target_W}")
|
||||
# print(f"estimation = {estimation}")
|
||||
|
||||
return int(np.round(estimation))
|
||||
|
||||
|
||||
###########################################################################
|
||||
# Copied from detectmap_proc method in scripts/detectmap_proc.py in Mikubill/sd-webui-controlnet
|
||||
# modified for InvokeAI
|
||||
###########################################################################
|
||||
# def detectmap_proc(detected_map, module, resize_mode, h, w):
|
||||
def np_img_resize(
|
||||
np_img: np.ndarray,
|
||||
resize_mode: str,
|
||||
h: int,
|
||||
w: int,
|
||||
device: torch.device = torch.device('cpu')
|
||||
):
|
||||
# if 'inpaint' in module:
|
||||
# np_img = np_img.astype(np.float32)
|
||||
# else:
|
||||
# np_img = HWC3(np_img)
|
||||
np_img = HWC3(np_img)
|
||||
|
||||
def safe_numpy(x):
|
||||
# A very safe method to make sure that Apple/Mac works
|
||||
y = x
|
||||
|
||||
# below is very boring but do not change these. If you change these Apple or Mac may fail.
|
||||
y = y.copy()
|
||||
y = np.ascontiguousarray(y)
|
||||
y = y.copy()
|
||||
return y
|
||||
|
||||
def get_pytorch_control(x):
|
||||
# A very safe method to make sure that Apple/Mac works
|
||||
y = x
|
||||
|
||||
# below is very boring but do not change these. If you change these Apple or Mac may fail.
|
||||
y = torch.from_numpy(y)
|
||||
y = y.float() / 255.0
|
||||
y = rearrange(y, 'h w c -> 1 c h w')
|
||||
y = y.clone()
|
||||
# y = y.to(devices.get_device_for("controlnet"))
|
||||
y = y.to(device)
|
||||
y = y.clone()
|
||||
return y
|
||||
|
||||
def high_quality_resize(x: np.ndarray,
|
||||
size):
|
||||
# Written by lvmin
|
||||
# Super high-quality control map up-scaling, considering binary, seg, and one-pixel edges
|
||||
inpaint_mask = None
|
||||
if x.ndim == 3 and x.shape[2] == 4:
|
||||
inpaint_mask = x[:, :, 3]
|
||||
x = x[:, :, 0:3]
|
||||
|
||||
new_size_is_smaller = (size[0] * size[1]) < (x.shape[0] * x.shape[1])
|
||||
new_size_is_bigger = (size[0] * size[1]) > (x.shape[0] * x.shape[1])
|
||||
unique_color_count = np.unique(x.reshape(-1, x.shape[2]), axis=0).shape[0]
|
||||
is_one_pixel_edge = False
|
||||
is_binary = False
|
||||
if unique_color_count == 2:
|
||||
is_binary = np.min(x) < 16 and np.max(x) > 240
|
||||
if is_binary:
|
||||
xc = x
|
||||
xc = cv2.erode(xc, np.ones(shape=(3, 3), dtype=np.uint8), iterations=1)
|
||||
xc = cv2.dilate(xc, np.ones(shape=(3, 3), dtype=np.uint8), iterations=1)
|
||||
one_pixel_edge_count = np.where(xc < x)[0].shape[0]
|
||||
all_edge_count = np.where(x > 127)[0].shape[0]
|
||||
is_one_pixel_edge = one_pixel_edge_count * 2 > all_edge_count
|
||||
|
||||
if 2 < unique_color_count < 200:
|
||||
interpolation = cv2.INTER_NEAREST
|
||||
elif new_size_is_smaller:
|
||||
interpolation = cv2.INTER_AREA
|
||||
else:
|
||||
interpolation = cv2.INTER_CUBIC # Must be CUBIC because we now use nms. NEVER CHANGE THIS
|
||||
|
||||
y = cv2.resize(x, size, interpolation=interpolation)
|
||||
if inpaint_mask is not None:
|
||||
inpaint_mask = cv2.resize(inpaint_mask, size, interpolation=interpolation)
|
||||
|
||||
if is_binary:
|
||||
y = np.mean(y.astype(np.float32), axis=2).clip(0, 255).astype(np.uint8)
|
||||
if is_one_pixel_edge:
|
||||
y = nake_nms(y)
|
||||
_, y = cv2.threshold(y, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
|
||||
y = lvmin_thin(y, prunings=new_size_is_bigger)
|
||||
else:
|
||||
_, y = cv2.threshold(y, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
|
||||
y = np.stack([y] * 3, axis=2)
|
||||
|
||||
if inpaint_mask is not None:
|
||||
inpaint_mask = (inpaint_mask > 127).astype(np.float32) * 255.0
|
||||
inpaint_mask = inpaint_mask[:, :, None].clip(0, 255).astype(np.uint8)
|
||||
y = np.concatenate([y, inpaint_mask], axis=2)
|
||||
|
||||
return y
|
||||
|
||||
# if resize_mode == external_code.ResizeMode.RESIZE:
|
||||
if resize_mode == "just_resize": # RESIZE
|
||||
np_img = high_quality_resize(np_img, (w, h))
|
||||
np_img = safe_numpy(np_img)
|
||||
return get_pytorch_control(np_img), np_img
|
||||
|
||||
old_h, old_w, _ = np_img.shape
|
||||
old_w = float(old_w)
|
||||
old_h = float(old_h)
|
||||
k0 = float(h) / old_h
|
||||
k1 = float(w) / old_w
|
||||
|
||||
safeint = lambda x: int(np.round(x))
|
||||
|
||||
# if resize_mode == external_code.ResizeMode.OUTER_FIT:
|
||||
if resize_mode == "fill_resize": # OUTER_FIT
|
||||
k = min(k0, k1)
|
||||
borders = np.concatenate([np_img[0, :, :], np_img[-1, :, :], np_img[:, 0, :], np_img[:, -1, :]], axis=0)
|
||||
high_quality_border_color = np.median(borders, axis=0).astype(np_img.dtype)
|
||||
if len(high_quality_border_color) == 4:
|
||||
# Inpaint hijack
|
||||
high_quality_border_color[3] = 255
|
||||
high_quality_background = np.tile(high_quality_border_color[None, None], [h, w, 1])
|
||||
np_img = high_quality_resize(np_img, (safeint(old_w * k), safeint(old_h * k)))
|
||||
new_h, new_w, _ = np_img.shape
|
||||
pad_h = max(0, (h - new_h) // 2)
|
||||
pad_w = max(0, (w - new_w) // 2)
|
||||
high_quality_background[pad_h:pad_h + new_h, pad_w:pad_w + new_w] = np_img
|
||||
np_img = high_quality_background
|
||||
np_img = safe_numpy(np_img)
|
||||
return get_pytorch_control(np_img), np_img
|
||||
else: # resize_mode == "crop_resize" (INNER_FIT)
|
||||
k = max(k0, k1)
|
||||
np_img = high_quality_resize(np_img, (safeint(old_w * k), safeint(old_h * k)))
|
||||
new_h, new_w, _ = np_img.shape
|
||||
pad_h = max(0, (new_h - h) // 2)
|
||||
pad_w = max(0, (new_w - w) // 2)
|
||||
np_img = np_img[pad_h:pad_h + h, pad_w:pad_w + w]
|
||||
np_img = safe_numpy(np_img)
|
||||
return get_pytorch_control(np_img), np_img
|
||||
|
||||
def prepare_control_image(
|
||||
# image used to be Union[PIL.Image.Image, List[PIL.Image.Image], torch.Tensor, List[torch.Tensor]]
|
||||
# but now should be able to assume that image is a single PIL.Image, which simplifies things
|
||||
image: Image,
|
||||
# FIXME: need to fix hardwiring of width and height, change to basing on latents dimensions?
|
||||
# latents_to_match_resolution, # TorchTensor of shape (batch_size, 3, height, width)
|
||||
width=512, # should be 8 * latent.shape[3]
|
||||
height=512, # should be 8 * latent height[2]
|
||||
# batch_size=1, # currently no batching
|
||||
# num_images_per_prompt=1, # currently only single image
|
||||
device="cuda",
|
||||
dtype=torch.float16,
|
||||
do_classifier_free_guidance=True,
|
||||
control_mode="balanced",
|
||||
resize_mode="just_resize_simple",
|
||||
):
|
||||
# FIXME: implement "crop_resize_simple" and "fill_resize_simple", or pull them out
|
||||
if (resize_mode == "just_resize_simple" or
|
||||
resize_mode == "crop_resize_simple" or
|
||||
resize_mode == "fill_resize_simple"):
|
||||
image = image.convert("RGB")
|
||||
if (resize_mode == "just_resize_simple"):
|
||||
image = image.resize((width, height), resample=PIL_INTERPOLATION["lanczos"])
|
||||
elif (resize_mode == "crop_resize_simple"): # not yet implemented
|
||||
pass
|
||||
elif (resize_mode == "fill_resize_simple"): # not yet implemented
|
||||
pass
|
||||
nimage = np.array(image)
|
||||
nimage = nimage[None, :]
|
||||
nimage = np.concatenate([nimage], axis=0)
|
||||
# normalizing RGB values to [0,1] range (in PIL.Image they are [0-255])
|
||||
nimage = np.array(nimage).astype(np.float32) / 255.0
|
||||
nimage = nimage.transpose(0, 3, 1, 2)
|
||||
timage = torch.from_numpy(nimage)
|
||||
|
||||
# use fancy lvmin controlnet resizing
|
||||
elif (resize_mode == "just_resize" or resize_mode == "crop_resize" or resize_mode == "fill_resize"):
|
||||
nimage = np.array(image)
|
||||
timage, nimage = np_img_resize(
|
||||
np_img=nimage,
|
||||
resize_mode=resize_mode,
|
||||
h=height,
|
||||
w=width,
|
||||
# device=torch.device('cpu')
|
||||
device=device,
|
||||
)
|
||||
else:
|
||||
pass
|
||||
print("ERROR: invalid resize_mode ==> ", resize_mode)
|
||||
exit(1)
|
||||
|
||||
timage = timage.to(device=device, dtype=dtype)
|
||||
cfg_injection = (control_mode == "more_control" or control_mode == "unbalanced")
|
||||
if do_classifier_free_guidance and not cfg_injection:
|
||||
timage = torch.cat([timage] * 2)
|
||||
return timage
|
@ -14,8 +14,9 @@ def get_datetime_from_iso_timestamp(iso_timestamp: str) -> datetime.datetime:
|
||||
return datetime.datetime.fromisoformat(iso_timestamp)
|
||||
|
||||
|
||||
SEED_MAX = np.iinfo(np.int32).max
|
||||
SEED_MAX = np.iinfo(np.uint32).max
|
||||
|
||||
|
||||
def get_random_seed():
|
||||
return np.random.randint(0, SEED_MAX)
|
||||
rng = np.random.default_rng(seed=0)
|
||||
return int(rng.integers(0, SEED_MAX))
|
||||
|
@ -1,9 +1,30 @@
|
||||
import torch
|
||||
from PIL import Image
|
||||
from invokeai.app.models.exceptions import CanceledException
|
||||
from invokeai.app.models.image import ProgressImage
|
||||
from ..invocations.baseinvocation import InvocationContext
|
||||
from ...backend.util.util import image_to_dataURL
|
||||
from ...backend.generator.base import Generator
|
||||
from ...backend.stable_diffusion import PipelineIntermediateState
|
||||
from invokeai.app.services.config import InvokeAIAppConfig
|
||||
|
||||
|
||||
def sample_to_lowres_estimated_image(samples, latent_rgb_factors, smooth_matrix = None):
|
||||
latent_image = samples[0].permute(1, 2, 0) @ latent_rgb_factors
|
||||
|
||||
if smooth_matrix is not None:
|
||||
latent_image = latent_image.unsqueeze(0).permute(3, 0, 1, 2)
|
||||
latent_image = torch.nn.functional.conv2d(latent_image, smooth_matrix.reshape((1,1,3,3)), padding=1)
|
||||
latent_image = latent_image.permute(1, 2, 3, 0).squeeze(0)
|
||||
|
||||
latents_ubyte = (
|
||||
((latent_image + 1) / 2)
|
||||
.clamp(0, 1) # change scale from -1..1 to 0..1
|
||||
.mul(0xFF) # to 0..255
|
||||
.byte()
|
||||
).cpu()
|
||||
|
||||
return Image.fromarray(latents_ubyte.numpy())
|
||||
|
||||
|
||||
def stable_diffusion_step_callback(
|
||||
@ -37,7 +58,24 @@ def stable_diffusion_step_callback(
|
||||
# step = intermediate_state.step
|
||||
|
||||
# TODO: only output a preview image when requested
|
||||
image = Generator.sample_to_lowres_estimated_image(sample)
|
||||
|
||||
# origingally adapted from code by @erucipe and @keturn here:
|
||||
# https://discuss.huggingface.co/t/decoding-latents-to-rgb-without-upscaling/23204/7
|
||||
|
||||
# these updated numbers for v1.5 are from @torridgristle
|
||||
v1_5_latent_rgb_factors = torch.tensor(
|
||||
[
|
||||
# R G B
|
||||
[0.3444, 0.1385, 0.0670], # L1
|
||||
[0.1247, 0.4027, 0.1494], # L2
|
||||
[-0.3192, 0.2513, 0.2103], # L3
|
||||
[-0.1307, -0.1874, -0.7445], # L4
|
||||
],
|
||||
dtype=sample.dtype,
|
||||
device=sample.device,
|
||||
)
|
||||
|
||||
image = sample_to_lowres_estimated_image(sample, v1_5_latent_rgb_factors)
|
||||
|
||||
(width, height) = image.size
|
||||
width *= 8
|
||||
@ -53,3 +91,56 @@ def stable_diffusion_step_callback(
|
||||
step=intermediate_state.step,
|
||||
total_steps=node["steps"],
|
||||
)
|
||||
|
||||
def stable_diffusion_xl_step_callback(
|
||||
context: InvocationContext,
|
||||
node: dict,
|
||||
source_node_id: str,
|
||||
sample,
|
||||
step,
|
||||
total_steps,
|
||||
):
|
||||
if context.services.queue.is_canceled(context.graph_execution_state_id):
|
||||
raise CanceledException
|
||||
|
||||
sdxl_latent_rgb_factors = torch.tensor(
|
||||
[
|
||||
# R G B
|
||||
[ 0.3816, 0.4930, 0.5320],
|
||||
[-0.3753, 0.1631, 0.1739],
|
||||
[ 0.1770, 0.3588, -0.2048],
|
||||
[-0.4350, -0.2644, -0.4289],
|
||||
],
|
||||
dtype=sample.dtype,
|
||||
device=sample.device,
|
||||
)
|
||||
|
||||
sdxl_smooth_matrix = torch.tensor(
|
||||
[
|
||||
#[ 0.0478, 0.1285, 0.0478],
|
||||
#[ 0.1285, 0.2948, 0.1285],
|
||||
#[ 0.0478, 0.1285, 0.0478],
|
||||
[0.0358, 0.0964, 0.0358],
|
||||
[0.0964, 0.4711, 0.0964],
|
||||
[0.0358, 0.0964, 0.0358],
|
||||
],
|
||||
dtype=sample.dtype,
|
||||
device=sample.device,
|
||||
)
|
||||
|
||||
image = sample_to_lowres_estimated_image(sample, sdxl_latent_rgb_factors, sdxl_smooth_matrix)
|
||||
|
||||
(width, height) = image.size
|
||||
width *= 8
|
||||
height *= 8
|
||||
|
||||
dataURL = image_to_dataURL(image, image_format="JPEG")
|
||||
|
||||
context.services.events.emit_generator_progress(
|
||||
graph_execution_state_id=context.graph_execution_state_id,
|
||||
node=node,
|
||||
source_node_id=source_node_id,
|
||||
progress_image=ProgressImage(width=width, height=height, dataURL=dataURL),
|
||||
step=step,
|
||||
total_steps=total_steps,
|
||||
)
|
@ -12,4 +12,4 @@ from .model_management import (
|
||||
ModelManager, ModelCache, BaseModelType,
|
||||
ModelType, SubModelType, ModelInfo
|
||||
)
|
||||
from .safety_checker import SafetyChecker
|
||||
from .model_management.models import SilenceWarnings
|
||||
|
@ -28,7 +28,6 @@ from diffusers.schedulers import SchedulerMixin as Scheduler
|
||||
import invokeai.backend.util.logging as logger
|
||||
from ..image_util import configure_model_padding
|
||||
from ..util.util import rand_perlin_2d
|
||||
from ..safety_checker import SafetyChecker
|
||||
from ..stable_diffusion.diffusers_pipeline import StableDiffusionGeneratorPipeline
|
||||
from ..stable_diffusion.schedulers import SCHEDULER_MAP
|
||||
|
||||
@ -52,7 +51,6 @@ class InvokeAIGeneratorBasicParams:
|
||||
v_symmetry_time_pct: Optional[float]=None
|
||||
variation_amount: float = 0.0
|
||||
with_variations: list=field(default_factory=list)
|
||||
safety_checker: Optional[SafetyChecker]=None
|
||||
|
||||
@dataclass
|
||||
class InvokeAIGeneratorOutput:
|
||||
@ -240,7 +238,6 @@ class Generator:
|
||||
self.seed = None
|
||||
self.latent_channels = model.unet.config.in_channels
|
||||
self.downsampling_factor = downsampling # BUG: should come from model or config
|
||||
self.safety_checker = None
|
||||
self.perlin = 0.0
|
||||
self.threshold = 0
|
||||
self.variation_amount = 0
|
||||
@ -277,12 +274,10 @@ class Generator:
|
||||
perlin=0.0,
|
||||
h_symmetry_time_pct=None,
|
||||
v_symmetry_time_pct=None,
|
||||
safety_checker: SafetyChecker=None,
|
||||
free_gpu_mem: bool = False,
|
||||
**kwargs,
|
||||
):
|
||||
scope = nullcontext
|
||||
self.safety_checker = safety_checker
|
||||
self.free_gpu_mem = free_gpu_mem
|
||||
attention_maps_images = []
|
||||
attention_maps_callback = lambda saver: attention_maps_images.append(
|
||||
@ -329,9 +324,6 @@ class Generator:
|
||||
# Pass on the seed in case a layer beneath us needs to generate noise on its own.
|
||||
image = make_image(x_T, seed)
|
||||
|
||||
if self.safety_checker is not None:
|
||||
image = self.safety_checker.check(image)
|
||||
|
||||
results.append([image, seed, attention_maps_images])
|
||||
|
||||
if image_callback is not None:
|
||||
|
34
invokeai/backend/image_util/invisible_watermark.py
Normal file
@ -0,0 +1,34 @@
|
||||
"""
|
||||
This module defines a singleton object, "invisible_watermark" that
|
||||
wraps the invisible watermark model. It respects the global "invisible_watermark"
|
||||
configuration variable, that allows the watermarking to be supressed.
|
||||
"""
|
||||
import numpy as np
|
||||
import cv2
|
||||
from PIL import Image
|
||||
from imwatermark import WatermarkEncoder
|
||||
from invokeai.app.services.config import InvokeAIAppConfig
|
||||
import invokeai.backend.util.logging as logger
|
||||
config = InvokeAIAppConfig.get_config()
|
||||
|
||||
class InvisibleWatermark:
|
||||
"""
|
||||
Wrapper around InvisibleWatermark module.
|
||||
"""
|
||||
|
||||
@classmethod
|
||||
def invisible_watermark_available(self) -> bool:
|
||||
return config.invisible_watermark
|
||||
|
||||
@classmethod
|
||||
def add_watermark(self, image: Image, watermark_text:str) -> Image:
|
||||
if not self.invisible_watermark_available():
|
||||
return image
|
||||
logger.debug(f'Applying invisible watermark "{watermark_text}"')
|
||||
bgr = cv2.cvtColor(np.array(image.convert("RGB")), cv2.COLOR_RGB2BGR)
|
||||
encoder = WatermarkEncoder()
|
||||
encoder.set_watermark('bytes', watermark_text.encode('utf-8'))
|
||||
bgr_encoded = encoder.encode(bgr, 'dwtDct')
|
||||
return Image.fromarray(
|
||||
cv2.cvtColor(bgr_encoded, cv2.COLOR_BGR2RGB)
|
||||
).convert("RGBA")
|
63
invokeai/backend/image_util/safety_checker.py
Normal file
@ -0,0 +1,63 @@
|
||||
"""
|
||||
This module defines a singleton object, "safety_checker" that
|
||||
wraps the safety_checker model. It respects the global "nsfw_checker"
|
||||
configuration variable, that allows the checker to be supressed.
|
||||
"""
|
||||
import numpy as np
|
||||
from PIL import Image
|
||||
from invokeai.backend import SilenceWarnings
|
||||
from invokeai.app.services.config import InvokeAIAppConfig
|
||||
from invokeai.backend.util.devices import choose_torch_device
|
||||
import invokeai.backend.util.logging as logger
|
||||
config = InvokeAIAppConfig.get_config()
|
||||
|
||||
CHECKER_PATH = 'core/convert/stable-diffusion-safety-checker'
|
||||
|
||||
class SafetyChecker:
|
||||
"""
|
||||
Wrapper around SafetyChecker model.
|
||||
"""
|
||||
safety_checker = None
|
||||
feature_extractor = None
|
||||
tried_load: bool = False
|
||||
|
||||
@classmethod
|
||||
def _load_safety_checker(self):
|
||||
if self.tried_load:
|
||||
return
|
||||
|
||||
if config.nsfw_checker:
|
||||
try:
|
||||
from diffusers.pipelines.stable_diffusion.safety_checker import StableDiffusionSafetyChecker
|
||||
from transformers import AutoFeatureExtractor
|
||||
self.safety_checker = StableDiffusionSafetyChecker.from_pretrained(
|
||||
config.models_path / CHECKER_PATH
|
||||
)
|
||||
self.feature_extractor = AutoFeatureExtractor.from_pretrained(
|
||||
config.models_path / CHECKER_PATH)
|
||||
logger.info('NSFW checker initialized')
|
||||
except Exception as e:
|
||||
logger.warning(f'Could not load NSFW checker: {str(e)}')
|
||||
else:
|
||||
logger.info('NSFW checker loading disabled')
|
||||
self.tried_load = True
|
||||
|
||||
@classmethod
|
||||
def safety_checker_available(self) -> bool:
|
||||
self._load_safety_checker()
|
||||
return self.safety_checker is not None
|
||||
|
||||
@classmethod
|
||||
def has_nsfw_concept(self, image: Image) -> bool:
|
||||
if not self.safety_checker_available():
|
||||
return False
|
||||
|
||||
device = choose_torch_device()
|
||||
features = self.feature_extractor([image], return_tensors="pt")
|
||||
features.to(device)
|
||||
self.safety_checker.to(device)
|
||||
x_image = np.array(image).astype(np.float32) / 255.0
|
||||
x_image = x_image[None].transpose(0, 3, 1, 2)
|
||||
with SilenceWarnings():
|
||||
checked_image, has_nsfw_concept = self.safety_checker(images=x_image, clip_input=features.pixel_values)
|
||||
return has_nsfw_concept[0]
|
31
invokeai/backend/install/check_root.py
Normal file
@ -0,0 +1,31 @@
|
||||
"""
|
||||
Check that the invokeai_root is correctly configured and exit if not.
|
||||
"""
|
||||
import sys
|
||||
from invokeai.app.services.config import (
|
||||
InvokeAIAppConfig,
|
||||
)
|
||||
|
||||
def check_invokeai_root(config: InvokeAIAppConfig):
|
||||
try:
|
||||
assert config.model_conf_path.exists()
|
||||
assert config.db_path.exists()
|
||||
assert config.models_path.exists()
|
||||
for model in [
|
||||
'CLIP-ViT-bigG-14-laion2B-39B-b160k',
|
||||
'bert-base-uncased',
|
||||
'clip-vit-large-patch14',
|
||||
'sd-vae-ft-mse',
|
||||
'stable-diffusion-2-clip',
|
||||
'stable-diffusion-safety-checker']:
|
||||
assert (config.models_path / f'core/convert/{model}').exists()
|
||||
except:
|
||||
print()
|
||||
print('== STARTUP ABORTED ==')
|
||||
print('** One or more necessary files is missing from your InvokeAI root directory **')
|
||||
print('** Please rerun the configuration script to fix this problem. **')
|
||||
print('** From the launcher, selection option [7]. **')
|
||||
print('** From the command line, activate the virtual environment and run "invokeai-configure --yes --skip-sd-weights" **')
|
||||
input('Press any key to continue...')
|
||||
sys.exit(0)
|
||||
|
@ -23,6 +23,7 @@ from urllib import request
|
||||
|
||||
import npyscreen
|
||||
import transformers
|
||||
import omegaconf
|
||||
from diffusers import AutoencoderKL
|
||||
from diffusers.pipelines.stable_diffusion.safety_checker import StableDiffusionSafetyChecker
|
||||
from huggingface_hub import HfFolder
|
||||
@ -31,6 +32,7 @@ from omegaconf import OmegaConf
|
||||
from tqdm import tqdm
|
||||
from transformers import (
|
||||
CLIPTextModel,
|
||||
CLIPTextConfig,
|
||||
CLIPTokenizer,
|
||||
AutoFeatureExtractor,
|
||||
BertTokenizerFast,
|
||||
@ -44,6 +46,7 @@ from invokeai.backend.util.logging import InvokeAILogger
|
||||
from invokeai.frontend.install.model_install import addModelsForm, process_and_execute
|
||||
from invokeai.frontend.install.widgets import (
|
||||
CenteredButtonPress,
|
||||
FileBox,
|
||||
IntTitleSlider,
|
||||
set_min_terminal_size,
|
||||
CyclingForm,
|
||||
@ -53,6 +56,7 @@ from invokeai.frontend.install.widgets import (
|
||||
from invokeai.backend.install.legacy_arg_parsing import legacy_parser
|
||||
from invokeai.backend.install.model_install_backend import (
|
||||
hf_download_from_pretrained,
|
||||
hf_download_with_resume,
|
||||
InstallSelections,
|
||||
ModelInstall,
|
||||
)
|
||||
@ -202,6 +206,15 @@ def download_conversion_models():
|
||||
pipeline = CLIPTextModel.from_pretrained(repo_id, subfolder="text_encoder", **kwargs)
|
||||
pipeline.save_pretrained(target_dir / 'stable-diffusion-2-clip' / 'text_encoder', safe_serialization=True)
|
||||
|
||||
# sd-xl - tokenizer_2
|
||||
repo_id = "laion/CLIP-ViT-bigG-14-laion2B-39B-b160k"
|
||||
_, model_name = repo_id.split('/')
|
||||
pipeline = CLIPTokenizer.from_pretrained(repo_id, **kwargs)
|
||||
pipeline.save_pretrained(target_dir / model_name, safe_serialization=True)
|
||||
|
||||
pipeline = CLIPTextConfig.from_pretrained(repo_id, **kwargs)
|
||||
pipeline.save_pretrained(target_dir / model_name, safe_serialization=True)
|
||||
|
||||
# VAE
|
||||
logger.info('Downloading stable diffusion VAE')
|
||||
vae = AutoencoderKL.from_pretrained('stabilityai/sd-vae-ft-mse', **kwargs)
|
||||
@ -285,47 +298,6 @@ Use cursor arrows to make a checkbox selection, and space to toggle.
|
||||
color="CONTROL",
|
||||
)
|
||||
|
||||
self.nextrely += 1
|
||||
self.add_widget_intelligent(
|
||||
npyscreen.TitleFixedText,
|
||||
name="== BASIC OPTIONS ==",
|
||||
begin_entry_at=0,
|
||||
editable=False,
|
||||
color="CONTROL",
|
||||
scroll_exit=True,
|
||||
)
|
||||
self.nextrely -= 1
|
||||
self.add_widget_intelligent(
|
||||
npyscreen.FixedText,
|
||||
value="Select an output directory for images:",
|
||||
editable=False,
|
||||
color="CONTROL",
|
||||
)
|
||||
self.outdir = self.add_widget_intelligent(
|
||||
npyscreen.TitleFilename,
|
||||
name="(<tab> autocompletes, ctrl-N advances):",
|
||||
value=str(default_output_dir()),
|
||||
select_dir=True,
|
||||
must_exist=False,
|
||||
use_two_lines=False,
|
||||
labelColor="GOOD",
|
||||
begin_entry_at=40,
|
||||
scroll_exit=True,
|
||||
)
|
||||
self.nextrely += 1
|
||||
self.add_widget_intelligent(
|
||||
npyscreen.FixedText,
|
||||
value="Activate the NSFW checker to blur images showing potential sexual imagery:",
|
||||
editable=False,
|
||||
color="CONTROL",
|
||||
)
|
||||
self.nsfw_checker = self.add_widget_intelligent(
|
||||
npyscreen.Checkbox,
|
||||
name="NSFW checker",
|
||||
value=old_opts.nsfw_checker,
|
||||
relx=5,
|
||||
scroll_exit=True,
|
||||
)
|
||||
self.nextrely += 1
|
||||
label = """HuggingFace access token (OPTIONAL) for automatic model downloads. See https://huggingface.co/settings/tokens."""
|
||||
for line in textwrap.wrap(label,width=window_width-6):
|
||||
@ -345,15 +317,6 @@ Use cursor arrows to make a checkbox selection, and space to toggle.
|
||||
scroll_exit=True,
|
||||
)
|
||||
self.nextrely += 1
|
||||
self.add_widget_intelligent(
|
||||
npyscreen.TitleFixedText,
|
||||
name="== ADVANCED OPTIONS ==",
|
||||
begin_entry_at=0,
|
||||
editable=False,
|
||||
color="CONTROL",
|
||||
scroll_exit=True,
|
||||
)
|
||||
self.nextrely -= 1
|
||||
self.add_widget_intelligent(
|
||||
npyscreen.TitleFixedText,
|
||||
name="GPU Management",
|
||||
@ -409,21 +372,33 @@ Use cursor arrows to make a checkbox selection, and space to toggle.
|
||||
self.nextrely += 1
|
||||
self.add_widget_intelligent(
|
||||
npyscreen.FixedText,
|
||||
value="Directories containing textual inversion, controlnet and LoRA models (<tab> autocompletes, ctrl-N advances):",
|
||||
value="Folder to recursively scan for new checkpoints, ControlNets, LoRAs and TI models (<tab> autocompletes, ctrl-N advances):",
|
||||
editable=False,
|
||||
color="CONTROL",
|
||||
)
|
||||
self.outdir = self.add_widget_intelligent(
|
||||
FileBox,
|
||||
name="Output directory for images (<tab> autocompletes, ctrl-N advances):",
|
||||
value=str(default_output_dir()),
|
||||
select_dir=True,
|
||||
must_exist=False,
|
||||
use_two_lines=False,
|
||||
labelColor="GOOD",
|
||||
begin_entry_at=40,
|
||||
max_height=3,
|
||||
scroll_exit=True,
|
||||
)
|
||||
self.autoimport_dirs = {}
|
||||
for description, config_name, path in autoimport_paths(old_opts):
|
||||
self.autoimport_dirs[config_name] = self.add_widget_intelligent(
|
||||
npyscreen.TitleFilename,
|
||||
name=description+':',
|
||||
value=str(path),
|
||||
self.autoimport_dirs['autoimport_dir'] = self.add_widget_intelligent(
|
||||
FileBox,
|
||||
name=f'Autoimport Folder',
|
||||
value=str(config.root_path / config.autoimport_dir),
|
||||
select_dir=True,
|
||||
must_exist=False,
|
||||
use_two_lines=False,
|
||||
labelColor="GOOD",
|
||||
begin_entry_at=32,
|
||||
max_height = 3,
|
||||
scroll_exit=True
|
||||
)
|
||||
self.nextrely += 1
|
||||
@ -504,7 +479,6 @@ https://huggingface.co/spaces/CompVis/stable-diffusion-license
|
||||
|
||||
for attr in [
|
||||
"outdir",
|
||||
"nsfw_checker",
|
||||
"free_gpu_mem",
|
||||
"max_cache_size",
|
||||
"xformers_enabled",
|
||||
@ -540,7 +514,7 @@ class EditOptApplication(npyscreen.NPSAppManaged):
|
||||
"MAIN",
|
||||
editOptsForm,
|
||||
name="InvokeAI Startup Options",
|
||||
cycle_widgets=True,
|
||||
cycle_widgets=False,
|
||||
)
|
||||
if not (self.program_opts.skip_sd_weights or self.program_opts.default_only):
|
||||
self.model_select = self.addForm(
|
||||
@ -548,7 +522,7 @@ class EditOptApplication(npyscreen.NPSAppManaged):
|
||||
addModelsForm,
|
||||
name="Install Stable Diffusion Models",
|
||||
multipage=True,
|
||||
cycle_widgets=True,
|
||||
cycle_widgets=False,
|
||||
)
|
||||
|
||||
def new_opts(self):
|
||||
@ -560,15 +534,19 @@ def edit_opts(program_opts: Namespace, invokeai_opts: Namespace) -> argparse.Nam
|
||||
editApp.run()
|
||||
return editApp.new_opts()
|
||||
|
||||
|
||||
def default_startup_options(init_file: Path) -> Namespace:
|
||||
opts = InvokeAIAppConfig.get_config()
|
||||
if not init_file.exists():
|
||||
opts.nsfw_checker = True
|
||||
return opts
|
||||
|
||||
def default_user_selections(program_opts: Namespace) -> InstallSelections:
|
||||
installer = ModelInstall(config)
|
||||
|
||||
try:
|
||||
installer = ModelInstall(config)
|
||||
except omegaconf.errors.ConfigKeyError:
|
||||
logger.warning('Your models.yaml file is corrupt or out of date. Reinitializing')
|
||||
initialize_rootdir(config.root_path, True)
|
||||
installer = ModelInstall(config)
|
||||
|
||||
models = installer.all_models()
|
||||
return InstallSelections(
|
||||
install_models=[models[installer.default_model()].path or models[installer.default_model()].repo_id]
|
||||
@ -576,19 +554,8 @@ def default_user_selections(program_opts: Namespace) -> InstallSelections:
|
||||
else [models[x].path or models[x].repo_id for x in installer.recommended_models()]
|
||||
if program_opts.yes_to_all
|
||||
else list(),
|
||||
# scan_directory=None,
|
||||
# autoscan_on_startup=None,
|
||||
)
|
||||
|
||||
# -------------------------------------
|
||||
def autoimport_paths(config: InvokeAIAppConfig):
|
||||
return [
|
||||
('Checkpoints & diffusers models', 'autoimport_dir', config.root_path / config.autoimport_dir),
|
||||
('LoRA/LyCORIS models', 'lora_dir', config.root_path / config.lora_dir),
|
||||
('Controlnet models', 'controlnet_dir', config.root_path / config.controlnet_dir),
|
||||
('Textual Inversion Embeddings', 'embedding_dir', config.root_path / config.embedding_dir),
|
||||
]
|
||||
|
||||
# -------------------------------------
|
||||
def initialize_rootdir(root: Path, yes_to_all: bool = False):
|
||||
logger.info("** INITIALIZING INVOKEAI RUNTIME DIRECTORY **")
|
||||
@ -664,6 +631,9 @@ def write_opts(opts: Namespace, init_file: Path):
|
||||
with open(init_file,'w', encoding='utf-8') as file:
|
||||
file.write(new_config.to_yaml())
|
||||
|
||||
if hasattr(opts,'hf_token') and opts.hf_token:
|
||||
HfLogin(opts.hf_token)
|
||||
|
||||
# -------------------------------------
|
||||
def default_output_dir() -> Path:
|
||||
return config.root_path / "outputs"
|
||||
@ -689,7 +659,6 @@ def migrate_init_file(legacy_format:Path):
|
||||
|
||||
# a few places where the field names have changed and we have to
|
||||
# manually add in the new names/values
|
||||
new.nsfw_checker = old.safety_checker
|
||||
new.xformers_enabled = old.xformers
|
||||
new.conf_path = old.conf
|
||||
new.root = legacy_format.parent.resolve()
|
||||
|
@ -58,7 +58,15 @@ LEGACY_CONFIGS = {
|
||||
SchedulerPredictionType.Epsilon: 'v2-inpainting-inference.yaml',
|
||||
SchedulerPredictionType.VPrediction: 'v2-inpainting-inference-v.yaml',
|
||||
}
|
||||
}
|
||||
},
|
||||
|
||||
BaseModelType.StableDiffusionXL: {
|
||||
ModelVariantType.Normal: 'sd_xl_base.yaml',
|
||||
},
|
||||
|
||||
BaseModelType.StableDiffusionXLRefiner: {
|
||||
ModelVariantType.Normal: 'sd_xl_refiner.yaml',
|
||||
},
|
||||
}
|
||||
|
||||
@dataclass
|
||||
@ -329,6 +337,7 @@ class ModelInstall(object):
|
||||
description = str(description),
|
||||
model_format = info.format,
|
||||
)
|
||||
legacy_conf = None
|
||||
if info.model_type == ModelType.Main:
|
||||
attributes.update(dict(variant = info.variant_type,))
|
||||
if info.format=="checkpoint":
|
||||
@ -343,11 +352,17 @@ class ModelInstall(object):
|
||||
except KeyError:
|
||||
legacy_conf = Path(self.config.legacy_conf_dir, 'v1-inference.yaml') # best guess
|
||||
|
||||
attributes.update(
|
||||
dict(
|
||||
config = str(legacy_conf)
|
||||
)
|
||||
if info.model_type == ModelType.ControlNet and info.format=="checkpoint":
|
||||
possible_conf = path.with_suffix('.yaml')
|
||||
if possible_conf.exists():
|
||||
legacy_conf = str(self.relative_to_root(possible_conf))
|
||||
|
||||
if legacy_conf:
|
||||
attributes.update(
|
||||
dict(
|
||||
config = str(legacy_conf)
|
||||
)
|
||||
)
|
||||
return attributes
|
||||
|
||||
def relative_to_root(self, path: Path)->Path:
|
||||
|
@ -4,6 +4,6 @@ Initialization file for invokeai.backend.model_management
|
||||
from .model_manager import ModelManager, ModelInfo, AddModelResult, SchedulerPredictionType
|
||||
from .model_cache import ModelCache
|
||||
from .lora import ModelPatcher, ONNXModelPatcher
|
||||
from .models import BaseModelType, ModelType, SubModelType, ModelVariantType, ModelNotFoundException
|
||||
from .models import BaseModelType, ModelType, SubModelType, ModelVariantType, ModelNotFoundException, DuplicateModelException
|
||||
from .model_merge import ModelMerger, MergeInterpolationMethod
|
||||
|
||||
|
@ -485,7 +485,7 @@ class ModelPatcher:
|
||||
|
||||
@staticmethod
|
||||
def _lora_forward_hook(
|
||||
applied_loras: List[Tuple[LoraModel, float]],
|
||||
applied_loras: List[Tuple[LoRAModel, float]],
|
||||
layer_name: str,
|
||||
):
|
||||
|
||||
@ -530,7 +530,7 @@ class ModelPatcher:
|
||||
def apply_lora(
|
||||
cls,
|
||||
model: torch.nn.Module,
|
||||
loras: List[Tuple[LoraModel, float]],
|
||||
loras: List[Tuple[LoRAModel, float]],
|
||||
prefix: str,
|
||||
):
|
||||
original_weights = dict()
|
||||
|
@ -251,7 +251,9 @@ from .model_search import ModelSearch
|
||||
from .models import (
|
||||
BaseModelType, ModelType, SubModelType,
|
||||
ModelError, SchedulerPredictionType, MODEL_CLASSES,
|
||||
ModelConfigBase, ModelNotFoundException, InvalidModelException,
|
||||
ModelConfigBase,
|
||||
ModelNotFoundException, InvalidModelException,
|
||||
DuplicateModelException,
|
||||
)
|
||||
|
||||
# We are only starting to number the config file with release 3.
|
||||
@ -671,6 +673,7 @@ class ModelManager(object):
|
||||
|
||||
self.models[model_key] = model_config
|
||||
self.commit()
|
||||
|
||||
return AddModelResult(
|
||||
name = model_name,
|
||||
model_type = model_type,
|
||||
@ -838,7 +841,7 @@ class ModelManager(object):
|
||||
Returns the preamble for the config file.
|
||||
"""
|
||||
return textwrap.dedent(
|
||||
"""\
|
||||
"""
|
||||
# This file describes the alternative machine learning models
|
||||
# available to InvokeAI script.
|
||||
#
|
||||
@ -858,7 +861,7 @@ class ModelManager(object):
|
||||
loaded_files = set()
|
||||
new_models_found = False
|
||||
|
||||
self.logger.info(f'scanning {self.app_config.models_path} for new models')
|
||||
self.logger.info(f'Scanning {self.app_config.models_path} for new models')
|
||||
with Chdir(self.app_config.root_path):
|
||||
for model_key, model_config in list(self.models.items()):
|
||||
model_name, cur_base_model, cur_model_type = self.parse_key(model_key)
|
||||
@ -891,15 +894,18 @@ class ModelManager(object):
|
||||
model_name = model_path.name if model_path.is_dir() else model_path.stem
|
||||
model_key = self.create_key(model_name, cur_base_model, cur_model_type)
|
||||
|
||||
if model_key in self.models:
|
||||
raise Exception(f"Model with key {model_key} added twice")
|
||||
|
||||
if model_path.is_relative_to(self.app_config.root_path):
|
||||
model_path = model_path.relative_to(self.app_config.root_path)
|
||||
try:
|
||||
if model_key in self.models:
|
||||
raise DuplicateModelException(f"Model with key {model_key} added twice")
|
||||
|
||||
if model_path.is_relative_to(self.app_config.root_path):
|
||||
model_path = model_path.relative_to(self.app_config.root_path)
|
||||
|
||||
model_config: ModelConfigBase = model_class.probe_config(str(model_path))
|
||||
self.models[model_key] = model_config
|
||||
new_models_found = True
|
||||
except DuplicateModelException as e:
|
||||
self.logger.warning(e)
|
||||
except InvalidModelException:
|
||||
self.logger.warning(f"Not a valid model: {model_path}")
|
||||
except NotImplementedError as e:
|
||||
@ -938,20 +944,29 @@ class ModelManager(object):
|
||||
def models_found(self):
|
||||
return self.new_models_found
|
||||
|
||||
config = self.app_config
|
||||
|
||||
# LS: hacky
|
||||
# Patch in the SD VAE from core so that it is available for use by the UI
|
||||
try:
|
||||
self.heuristic_import({config.root_path / 'models/core/convert/sd-vae-ft-mse'})
|
||||
except:
|
||||
pass
|
||||
|
||||
installer = ModelInstall(config = self.app_config,
|
||||
model_manager = self,
|
||||
prediction_type_helper = ask_user_for_prediction_type,
|
||||
)
|
||||
config = self.app_config
|
||||
known_paths = {config.root_path / x['path'] for x in self.list_models()}
|
||||
directories = {config.root_path / x for x in [config.autoimport_dir,
|
||||
config.lora_dir,
|
||||
config.embedding_dir,
|
||||
config.controlnet_dir]
|
||||
config.controlnet_dir,
|
||||
] if x
|
||||
}
|
||||
scanner = ScanAndImport(directories, self.logger, ignore=known_paths, installer=installer)
|
||||
scanner.search()
|
||||
|
||||
return scanner.models_found()
|
||||
|
||||
def heuristic_import(self,
|
||||
|
@ -39,6 +39,7 @@ class ModelProbe(object):
|
||||
|
||||
CLASS2TYPE = {
|
||||
'StableDiffusionPipeline' : ModelType.Main,
|
||||
'StableDiffusionInpaintPipeline' : ModelType.Main,
|
||||
'StableDiffusionXLPipeline' : ModelType.Main,
|
||||
'StableDiffusionXLImg2ImgPipeline' : ModelType.Main,
|
||||
'AutoencoderKL' : ModelType.Vae,
|
||||
@ -252,10 +253,13 @@ class PipelineCheckpointProbe(CheckpointProbeBase):
|
||||
return BaseModelType.StableDiffusion1
|
||||
if key_name in state_dict and state_dict[key_name].shape[-1] == 1024:
|
||||
return BaseModelType.StableDiffusion2
|
||||
# TODO: Verify that this is correct! Need an XL checkpoint file for this.
|
||||
key_name = 'model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn2.to_k.weight'
|
||||
if key_name in state_dict and state_dict[key_name].shape[-1] == 2048:
|
||||
return BaseModelType.StableDiffusionXL
|
||||
raise InvalidModelException("Cannot determine base type")
|
||||
elif key_name in state_dict and state_dict[key_name].shape[-1] == 1280:
|
||||
return BaseModelType.StableDiffusionXLRefiner
|
||||
else:
|
||||
raise InvalidModelException("Cannot determine base type")
|
||||
|
||||
def get_scheduler_prediction_type(self)->SchedulerPredictionType:
|
||||
type = self.get_base_type()
|
||||
@ -401,7 +405,7 @@ class PipelineFolderProbe(FolderProbeBase):
|
||||
|
||||
in_channels = conf['in_channels']
|
||||
if in_channels == 9:
|
||||
return ModelVariantType.Inpainting
|
||||
return ModelVariantType.Inpaint
|
||||
elif in_channels == 5:
|
||||
return ModelVariantType.Depth
|
||||
elif in_channels == 4:
|
||||
|
@ -98,6 +98,6 @@ class FindModels(ModelSearch):
|
||||
|
||||
def list_models(self) -> List[Path]:
|
||||
self.search()
|
||||
return self.models_found
|
||||
return list(self.models_found)
|
||||
|
||||
|
||||
|
@ -2,7 +2,11 @@ import inspect
|
||||
from enum import Enum
|
||||
from pydantic import BaseModel
|
||||
from typing import Literal, get_origin
|
||||
from .base import BaseModelType, ModelType, SubModelType, ModelBase, ModelConfigBase, ModelVariantType, SchedulerPredictionType, ModelError, SilenceWarnings, ModelNotFoundException, InvalidModelException
|
||||
from .base import (
|
||||
BaseModelType, ModelType, SubModelType, ModelBase, ModelConfigBase,
|
||||
ModelVariantType, SchedulerPredictionType, ModelError, SilenceWarnings,
|
||||
ModelNotFoundException, InvalidModelException, DuplicateModelException
|
||||
)
|
||||
from .stable_diffusion import StableDiffusion1Model, StableDiffusion2Model
|
||||
from .sdxl import StableDiffusionXLModel
|
||||
from .vae import VaeModel
|
||||
|
@ -21,6 +21,10 @@ import onnx
|
||||
from onnx import numpy_helper
|
||||
from onnx.external_data_helper import set_external_data
|
||||
from onnxruntime import InferenceSession, OrtValue, SessionOptions, ExecutionMode, GraphOptimizationLevel, get_available_providers
|
||||
|
||||
class DuplicateModelException(Exception):
|
||||
pass
|
||||
|
||||
class InvalidModelException(Exception):
|
||||
pass
|
||||
|
||||
|
@ -1,7 +1,8 @@
|
||||
import os
|
||||
import torch
|
||||
from enum import Enum
|
||||
from typing import Optional
|
||||
from pathlib import Path
|
||||
from typing import Optional, Literal
|
||||
from .base import (
|
||||
ModelBase,
|
||||
ModelConfigBase,
|
||||
@ -15,6 +16,7 @@ from .base import (
|
||||
InvalidModelException,
|
||||
ModelNotFoundException,
|
||||
)
|
||||
from invokeai.app.services.config import InvokeAIAppConfig
|
||||
|
||||
class ControlNetModelFormat(str, Enum):
|
||||
Checkpoint = "checkpoint"
|
||||
@ -24,8 +26,12 @@ class ControlNetModel(ModelBase):
|
||||
#model_class: Type
|
||||
#model_size: int
|
||||
|
||||
class Config(ModelConfigBase):
|
||||
model_format: ControlNetModelFormat
|
||||
class DiffusersConfig(ModelConfigBase):
|
||||
model_format: Literal[ControlNetModelFormat.Diffusers]
|
||||
|
||||
class CheckpointConfig(ModelConfigBase):
|
||||
model_format: Literal[ControlNetModelFormat.Checkpoint]
|
||||
config: str
|
||||
|
||||
def __init__(self, model_path: str, base_model: BaseModelType, model_type: ModelType):
|
||||
assert model_type == ModelType.ControlNet
|
||||
@ -99,13 +105,51 @@ class ControlNetModel(ModelBase):
|
||||
|
||||
@classmethod
|
||||
def convert_if_required(
|
||||
cls,
|
||||
model_path: str,
|
||||
output_path: str,
|
||||
config: ModelConfigBase,
|
||||
base_model: BaseModelType,
|
||||
) -> str:
|
||||
if cls.detect_format(model_path) == ControlNetModelFormat.Checkpoint:
|
||||
return _convert_controlnet_ckpt_and_cache(
|
||||
model_path = model_path,
|
||||
model_config = config.config,
|
||||
output_path = output_path,
|
||||
base_model = base_model,
|
||||
)
|
||||
else:
|
||||
return model_path
|
||||
|
||||
@classmethod
|
||||
def _convert_controlnet_ckpt_and_cache(
|
||||
cls,
|
||||
model_path: str,
|
||||
output_path: str,
|
||||
config: ModelConfigBase, # empty config or config of parent model
|
||||
base_model: BaseModelType,
|
||||
) -> str:
|
||||
if cls.detect_format(model_path) != ControlNetModelFormat.Diffusers:
|
||||
raise NotImplementedError("Checkpoint controlnet models currently unsupported")
|
||||
else:
|
||||
return model_path
|
||||
model_config: ControlNetModel.CheckpointConfig,
|
||||
) -> str:
|
||||
"""
|
||||
Convert the controlnet from checkpoint format to diffusers format,
|
||||
cache it to disk, and return Path to converted
|
||||
file. If already on disk then just returns Path.
|
||||
"""
|
||||
app_config = InvokeAIAppConfig.get_config()
|
||||
weights = app_config.root_path / model_path
|
||||
output_path = Path(output_path)
|
||||
|
||||
# return cached version if it exists
|
||||
if output_path.exists():
|
||||
return output_path
|
||||
|
||||
# to avoid circular import errors
|
||||
from ..convert_ckpt_to_diffusers import convert_controlnet_to_diffusers
|
||||
convert_controlnet_to_diffusers(
|
||||
weights,
|
||||
output_path,
|
||||
original_config_file = app_config.root_path / model_config,
|
||||
image_size = 512,
|
||||
scan_needed = True,
|
||||
from_safetensors = weights.suffix == ".safetensors"
|
||||
)
|
||||
return output_path
|
||||
|
@ -10,6 +10,7 @@ from .base import (
|
||||
SubModelType,
|
||||
classproperty,
|
||||
InvalidModelException,
|
||||
ModelNotFoundException,
|
||||
)
|
||||
# TODO: naming
|
||||
from ..lora import LoRAModel as LoRAModelRaw
|
||||
|
@ -1,5 +1,6 @@
|
||||
import os
|
||||
import json
|
||||
import invokeai.backend.util.logging as logger
|
||||
from enum import Enum
|
||||
from pydantic import Field
|
||||
from typing import Literal, Optional
|
||||
@ -48,7 +49,7 @@ class StableDiffusionXLModel(DiffusersModel):
|
||||
if model_format == StableDiffusionXLModelFormat.Checkpoint:
|
||||
if ckpt_config_path:
|
||||
ckpt_config = OmegaConf.load(ckpt_config_path)
|
||||
ckpt_config["model"]["params"]["unet_config"]["params"]["in_channels"]
|
||||
in_channels = ckpt_config["model"]["params"]["unet_config"]["params"]["in_channels"]
|
||||
|
||||
else:
|
||||
checkpoint = read_checkpoint_meta(path)
|
||||
@ -108,7 +109,20 @@ class StableDiffusionXLModel(DiffusersModel):
|
||||
config: ModelConfigBase,
|
||||
base_model: BaseModelType,
|
||||
) -> str:
|
||||
# The convert script adapted from the diffusers package uses
|
||||
# strings for the base model type. To avoid making too many
|
||||
# source code changes, we simply translate here
|
||||
model_base_to_model_type = {BaseModelType.StableDiffusionXL: 'SDXL',
|
||||
BaseModelType.StableDiffusionXLRefiner: 'SDXL-Refiner',
|
||||
}
|
||||
if isinstance(config, cls.CheckpointConfig):
|
||||
raise NotImplementedError('conversion of SDXL checkpoint models to diffusers format is not yet supported')
|
||||
from invokeai.backend.model_management.models.stable_diffusion import _convert_ckpt_and_cache
|
||||
return _convert_ckpt_and_cache(
|
||||
version=base_model,
|
||||
model_config=config,
|
||||
output_path=output_path,
|
||||
model_type=model_base_to_model_type[base_model],
|
||||
use_safetensors=False, # corrupts sdxl models for some reason
|
||||
)
|
||||
else:
|
||||
return model_path
|
||||
|
@ -15,9 +15,12 @@ from .base import (
|
||||
classproperty,
|
||||
InvalidModelException,
|
||||
)
|
||||
from .sdxl import StableDiffusionXLModel
|
||||
import invokeai.backend.util.logging as logger
|
||||
from invokeai.app.services.config import InvokeAIAppConfig
|
||||
from omegaconf import OmegaConf
|
||||
|
||||
|
||||
class StableDiffusion1ModelFormat(str, Enum):
|
||||
Checkpoint = "checkpoint"
|
||||
Diffusers = "diffusers"
|
||||
@ -235,42 +238,17 @@ class StableDiffusion2Model(DiffusersModel):
|
||||
else:
|
||||
return model_path
|
||||
|
||||
def _select_ckpt_config(version: BaseModelType, variant: ModelVariantType):
|
||||
ckpt_configs = {
|
||||
BaseModelType.StableDiffusion1: {
|
||||
ModelVariantType.Normal: "v1-inference.yaml",
|
||||
ModelVariantType.Inpaint: "v1-inpainting-inference.yaml",
|
||||
},
|
||||
BaseModelType.StableDiffusion2: {
|
||||
ModelVariantType.Normal: "v2-inference-v.yaml", # best guess, as we can't differentiate with base(512)
|
||||
ModelVariantType.Inpaint: "v2-inpainting-inference.yaml",
|
||||
ModelVariantType.Depth: "v2-midas-inference.yaml",
|
||||
},
|
||||
# note that these .yaml files don't yet exist!
|
||||
BaseModelType.StableDiffusionXL: {
|
||||
ModelVariantType.Normal: "xl-inference-v.yaml",
|
||||
ModelVariantType.Inpaint: "xl-inpainting-inference.yaml",
|
||||
ModelVariantType.Depth: "xl-midas-inference.yaml",
|
||||
}
|
||||
}
|
||||
|
||||
app_config = InvokeAIAppConfig.get_config()
|
||||
try:
|
||||
config_path = app_config.legacy_conf_path / ckpt_configs[version][variant]
|
||||
if config_path.is_relative_to(app_config.root_path):
|
||||
config_path = config_path.relative_to(app_config.root_path)
|
||||
return str(config_path)
|
||||
|
||||
except:
|
||||
return None
|
||||
|
||||
|
||||
# TODO: rework
|
||||
# Note that convert_ckpt_to_diffuses does not currently support conversion of SDXL models
|
||||
# pass precision - currently defaulting to fp16
|
||||
def _convert_ckpt_and_cache(
|
||||
version: BaseModelType,
|
||||
model_config: Union[StableDiffusion1Model.CheckpointConfig, StableDiffusion2Model.CheckpointConfig],
|
||||
output_path: str,
|
||||
version: BaseModelType,
|
||||
model_config: Union[StableDiffusion1Model.CheckpointConfig,
|
||||
StableDiffusion2Model.CheckpointConfig,
|
||||
StableDiffusionXLModel.CheckpointConfig,
|
||||
],
|
||||
output_path: str,
|
||||
use_save_model: bool=False,
|
||||
**kwargs,
|
||||
) -> str:
|
||||
"""
|
||||
Convert the checkpoint model indicated in mconfig into a
|
||||
@ -289,6 +267,9 @@ def _convert_ckpt_and_cache(
|
||||
|
||||
# to avoid circular import errors
|
||||
from ..convert_ckpt_to_diffusers import convert_ckpt_to_diffusers
|
||||
from ...util.devices import choose_torch_device, torch_dtype
|
||||
|
||||
logger.info(f'Converting {weights} to diffusers format')
|
||||
with SilenceWarnings():
|
||||
convert_ckpt_to_diffusers(
|
||||
weights,
|
||||
@ -298,5 +279,43 @@ def _convert_ckpt_and_cache(
|
||||
original_config_file=config_file,
|
||||
extract_ema=True,
|
||||
scan_needed=True,
|
||||
from_safetensors = weights.suffix == ".safetensors",
|
||||
precision = torch_dtype(choose_torch_device()),
|
||||
**kwargs,
|
||||
)
|
||||
return output_path
|
||||
|
||||
def _select_ckpt_config(version: BaseModelType, variant: ModelVariantType):
|
||||
ckpt_configs = {
|
||||
BaseModelType.StableDiffusion1: {
|
||||
ModelVariantType.Normal: "v1-inference.yaml",
|
||||
ModelVariantType.Inpaint: "v1-inpainting-inference.yaml",
|
||||
},
|
||||
BaseModelType.StableDiffusion2: {
|
||||
ModelVariantType.Normal: "v2-inference-v.yaml", # best guess, as we can't differentiate with base(512)
|
||||
ModelVariantType.Inpaint: "v2-inpainting-inference.yaml",
|
||||
ModelVariantType.Depth: "v2-midas-inference.yaml",
|
||||
},
|
||||
BaseModelType.StableDiffusionXL: {
|
||||
ModelVariantType.Normal: "sd_xl_base.yaml",
|
||||
ModelVariantType.Inpaint: None,
|
||||
ModelVariantType.Depth: None,
|
||||
},
|
||||
BaseModelType.StableDiffusionXLRefiner: {
|
||||
ModelVariantType.Normal: "sd_xl_refiner.yaml",
|
||||
ModelVariantType.Inpaint: None,
|
||||
ModelVariantType.Depth: None,
|
||||
},
|
||||
}
|
||||
|
||||
app_config = InvokeAIAppConfig.get_config()
|
||||
try:
|
||||
config_path = app_config.legacy_conf_path / ckpt_configs[version][variant]
|
||||
if config_path.is_relative_to(app_config.root_path):
|
||||
config_path = config_path.relative_to(app_config.root_path)
|
||||
return str(config_path)
|
||||
|
||||
except:
|
||||
return None
|
||||
|
||||
|
||||
|
@ -1,77 +0,0 @@
|
||||
'''
|
||||
SafetyChecker class - checks images against the StabilityAI NSFW filter
|
||||
and blurs images that contain potential NSFW content.
|
||||
'''
|
||||
import diffusers
|
||||
import numpy as np
|
||||
import torch
|
||||
import traceback
|
||||
from diffusers.pipelines.stable_diffusion.safety_checker import (
|
||||
StableDiffusionSafetyChecker,
|
||||
)
|
||||
from pathlib import Path
|
||||
from PIL import Image, ImageFilter
|
||||
from transformers import AutoFeatureExtractor
|
||||
|
||||
import invokeai.assets.web as web_assets
|
||||
import invokeai.backend.util.logging as logger
|
||||
from invokeai.app.services.config import InvokeAIAppConfig
|
||||
from .util import CPU_DEVICE
|
||||
|
||||
config = InvokeAIAppConfig.get_config()
|
||||
|
||||
class SafetyChecker(object):
|
||||
CAUTION_IMG = "caution.png"
|
||||
|
||||
def __init__(self, device: torch.device):
|
||||
path = Path(web_assets.__path__[0]) / self.CAUTION_IMG
|
||||
caution = Image.open(path)
|
||||
self.caution_img = caution.resize((caution.width // 2, caution.height // 2))
|
||||
self.device = device
|
||||
|
||||
try:
|
||||
safety_model_id = config.models_path / 'core/convert/stable-diffusion-safety-checker'
|
||||
feature_extractor_id = config.models_path / 'core/convert/stable-diffusion-safety-checker-extractor'
|
||||
self.safety_checker = StableDiffusionSafetyChecker.from_pretrained(safety_model_id)
|
||||
self.safety_feature_extractor = AutoFeatureExtractor.from_pretrained(feature_extractor_id)
|
||||
except Exception:
|
||||
logger.error(
|
||||
"An error was encountered while installing the safety checker:"
|
||||
)
|
||||
print(traceback.format_exc())
|
||||
|
||||
def check(self, image: Image.Image):
|
||||
"""
|
||||
Check provided image against the StabilityAI safety checker and return
|
||||
|
||||
"""
|
||||
|
||||
self.safety_checker.to(self.device)
|
||||
features = self.safety_feature_extractor([image], return_tensors="pt")
|
||||
features.to(self.device)
|
||||
|
||||
# unfortunately checker requires the numpy version, so we have to convert back
|
||||
x_image = np.array(image).astype(np.float32) / 255.0
|
||||
x_image = x_image[None].transpose(0, 3, 1, 2)
|
||||
|
||||
diffusers.logging.set_verbosity_error()
|
||||
checked_image, has_nsfw_concept = self.safety_checker(
|
||||
images=x_image, clip_input=features.pixel_values
|
||||
)
|
||||
self.safety_checker.to(CPU_DEVICE) # offload
|
||||
if has_nsfw_concept[0]:
|
||||
logger.warning(
|
||||
"An image with potential non-safe content has been detected. A blurred image will be returned."
|
||||
)
|
||||
return self.blur(image)
|
||||
else:
|
||||
return image
|
||||
|
||||
def blur(self, input):
|
||||
blurry = input.filter(filter=ImageFilter.GaussianBlur(radius=32))
|
||||
try:
|
||||
if caution := self.caution_img:
|
||||
blurry.paste(caution, (0, 0), caution)
|
||||
except FileNotFoundError:
|
||||
pass
|
||||
return blurry
|
@ -219,6 +219,7 @@ class ControlNetData:
|
||||
begin_step_percent: float = Field(default=0.0)
|
||||
end_step_percent: float = Field(default=1.0)
|
||||
control_mode: str = Field(default="balanced")
|
||||
resize_mode: str = Field(default="just_resize")
|
||||
|
||||
|
||||
@dataclass
|
||||
@ -653,7 +654,7 @@ class StableDiffusionGeneratorPipeline(StableDiffusionPipeline):
|
||||
if cfg_injection:
|
||||
# Inferred ControlNet only for the conditional batch.
|
||||
# To apply the output of ControlNet to both the unconditional and conditional batches,
|
||||
# add 0 to the unconditional batch to keep it unchanged.
|
||||
# prepend zeros for unconditional batch
|
||||
down_samples = [torch.cat([torch.zeros_like(d), d]) for d in down_samples]
|
||||
mid_sample = torch.cat([torch.zeros_like(mid_sample), mid_sample])
|
||||
|
||||
@ -954,53 +955,3 @@ class StableDiffusionGeneratorPipeline(StableDiffusionPipeline):
|
||||
debug_image(
|
||||
img, f"latents {msg} {i+1}/{len(decoded)}", debug_status=True
|
||||
)
|
||||
|
||||
# Copied from diffusers pipeline_stable_diffusion_controlnet.py
|
||||
# Returns torch.Tensor of shape (batch_size, 3, height, width)
|
||||
@staticmethod
|
||||
def prepare_control_image(
|
||||
image,
|
||||
# FIXME: need to fix hardwiring of width and height, change to basing on latents dimensions?
|
||||
# latents,
|
||||
width=512, # should be 8 * latent.shape[3]
|
||||
height=512, # should be 8 * latent height[2]
|
||||
batch_size=1,
|
||||
num_images_per_prompt=1,
|
||||
device="cuda",
|
||||
dtype=torch.float16,
|
||||
do_classifier_free_guidance=True,
|
||||
control_mode="balanced"
|
||||
):
|
||||
|
||||
if not isinstance(image, torch.Tensor):
|
||||
if isinstance(image, PIL.Image.Image):
|
||||
image = [image]
|
||||
|
||||
if isinstance(image[0], PIL.Image.Image):
|
||||
images = []
|
||||
for image_ in image:
|
||||
image_ = image_.convert("RGB")
|
||||
image_ = image_.resize((width, height), resample=PIL_INTERPOLATION["lanczos"])
|
||||
image_ = np.array(image_)
|
||||
image_ = image_[None, :]
|
||||
images.append(image_)
|
||||
image = images
|
||||
image = np.concatenate(image, axis=0)
|
||||
image = np.array(image).astype(np.float32) / 255.0
|
||||
image = image.transpose(0, 3, 1, 2)
|
||||
image = torch.from_numpy(image)
|
||||
elif isinstance(image[0], torch.Tensor):
|
||||
image = torch.cat(image, dim=0)
|
||||
|
||||
image_batch_size = image.shape[0]
|
||||
if image_batch_size == 1:
|
||||
repeat_by = batch_size
|
||||
else:
|
||||
# image batch size is the same as prompt batch size
|
||||
repeat_by = num_images_per_prompt
|
||||
image = image.repeat_interleave(repeat_by, dim=0)
|
||||
image = image.to(device=device, dtype=dtype)
|
||||
cfg_injection = (control_mode == "more_control" or control_mode == "unbalanced")
|
||||
if do_classifier_free_guidance and not cfg_injection:
|
||||
image = torch.cat([image] * 2)
|
||||
return image
|
||||
|
@ -1,7 +1,7 @@
|
||||
# Copyright (c) 2023 Lincoln D. Stein and The InvokeAI Development Team
|
||||
|
||||
"""
|
||||
invokeai.util.logging
|
||||
invokeai.backend.util.logging
|
||||
|
||||
Logging class for InvokeAI that produces console messages
|
||||
|
||||
|
@ -1,4 +1,6 @@
|
||||
import math
|
||||
import torch
|
||||
import diffusers
|
||||
|
||||
|
||||
if torch.backends.mps.is_available():
|
||||
@ -61,3 +63,150 @@ def new_torch_interpolate(input, size=None, scale_factor=None, mode='nearest', a
|
||||
return _torch_interpolate(input, size, scale_factor, mode, align_corners, recompute_scale_factor, antialias)
|
||||
|
||||
torch.nn.functional.interpolate = new_torch_interpolate
|
||||
|
||||
# TODO: refactor it
|
||||
_SlicedAttnProcessor = diffusers.models.attention_processor.SlicedAttnProcessor
|
||||
class ChunkedSlicedAttnProcessor:
|
||||
r"""
|
||||
Processor for implementing sliced attention.
|
||||
|
||||
Args:
|
||||
slice_size (`int`, *optional*):
|
||||
The number of steps to compute attention. Uses as many slices as `attention_head_dim // slice_size`, and
|
||||
`attention_head_dim` must be a multiple of the `slice_size`.
|
||||
"""
|
||||
|
||||
def __init__(self, slice_size):
|
||||
assert isinstance(slice_size, int)
|
||||
slice_size = 1 # TODO: maybe implement chunking in batches too when enough memory
|
||||
self.slice_size = slice_size
|
||||
self._sliced_attn_processor = _SlicedAttnProcessor(slice_size)
|
||||
|
||||
def __call__(self, attn, hidden_states, encoder_hidden_states=None, attention_mask=None):
|
||||
if self.slice_size != 1 or attn.upcast_attention:
|
||||
return self._sliced_attn_processor(attn, hidden_states, encoder_hidden_states, attention_mask)
|
||||
|
||||
residual = hidden_states
|
||||
|
||||
input_ndim = hidden_states.ndim
|
||||
|
||||
if input_ndim == 4:
|
||||
batch_size, channel, height, width = hidden_states.shape
|
||||
hidden_states = hidden_states.view(batch_size, channel, height * width).transpose(1, 2)
|
||||
|
||||
batch_size, sequence_length, _ = (
|
||||
hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape
|
||||
)
|
||||
attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size)
|
||||
|
||||
if attn.group_norm is not None:
|
||||
hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2)
|
||||
|
||||
query = attn.to_q(hidden_states)
|
||||
dim = query.shape[-1]
|
||||
query = attn.head_to_batch_dim(query)
|
||||
|
||||
if encoder_hidden_states is None:
|
||||
encoder_hidden_states = hidden_states
|
||||
elif attn.norm_cross:
|
||||
encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states)
|
||||
|
||||
key = attn.to_k(encoder_hidden_states)
|
||||
value = attn.to_v(encoder_hidden_states)
|
||||
key = attn.head_to_batch_dim(key)
|
||||
value = attn.head_to_batch_dim(value)
|
||||
|
||||
batch_size_attention, query_tokens, _ = query.shape
|
||||
hidden_states = torch.zeros(
|
||||
(batch_size_attention, query_tokens, dim // attn.heads), device=query.device, dtype=query.dtype
|
||||
)
|
||||
|
||||
chunk_tmp_tensor = torch.empty(self.slice_size, query.shape[1], key.shape[1], dtype=query.dtype, device=query.device)
|
||||
|
||||
for i in range(batch_size_attention // self.slice_size):
|
||||
start_idx = i * self.slice_size
|
||||
end_idx = (i + 1) * self.slice_size
|
||||
|
||||
query_slice = query[start_idx:end_idx]
|
||||
key_slice = key[start_idx:end_idx]
|
||||
attn_mask_slice = attention_mask[start_idx:end_idx] if attention_mask is not None else None
|
||||
|
||||
self.get_attention_scores_chunked(attn, query_slice, key_slice, attn_mask_slice, hidden_states[start_idx:end_idx], value[start_idx:end_idx], chunk_tmp_tensor)
|
||||
|
||||
hidden_states = attn.batch_to_head_dim(hidden_states)
|
||||
|
||||
# linear proj
|
||||
hidden_states = attn.to_out[0](hidden_states)
|
||||
# dropout
|
||||
hidden_states = attn.to_out[1](hidden_states)
|
||||
|
||||
if input_ndim == 4:
|
||||
hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, channel, height, width)
|
||||
|
||||
if attn.residual_connection:
|
||||
hidden_states = hidden_states + residual
|
||||
|
||||
hidden_states = hidden_states / attn.rescale_output_factor
|
||||
|
||||
return hidden_states
|
||||
|
||||
|
||||
def get_attention_scores_chunked(self, attn, query, key, attention_mask, hidden_states, value, chunk):
|
||||
# batch size = 1
|
||||
assert query.shape[0] == 1
|
||||
assert key.shape[0] == 1
|
||||
assert value.shape[0] == 1
|
||||
assert hidden_states.shape[0] == 1
|
||||
|
||||
dtype = query.dtype
|
||||
if attn.upcast_attention:
|
||||
query = query.float()
|
||||
key = key.float()
|
||||
|
||||
#out_item_size = query.dtype.itemsize
|
||||
#if attn.upcast_attention:
|
||||
# out_item_size = torch.float32.itemsize
|
||||
out_item_size = query.element_size()
|
||||
if attn.upcast_attention:
|
||||
out_item_size = 4
|
||||
|
||||
chunk_size = 2 ** 29
|
||||
|
||||
out_size = query.shape[1] * key.shape[1] * out_item_size
|
||||
chunks_count = min(query.shape[1], math.ceil((out_size - 1) / chunk_size))
|
||||
chunk_step = max(1, int(query.shape[1] / chunks_count))
|
||||
|
||||
key = key.transpose(-1, -2)
|
||||
|
||||
def _get_chunk_view(tensor, start, length):
|
||||
if start + length > tensor.shape[1]:
|
||||
length = tensor.shape[1] - start
|
||||
#print(f"view: [{tensor.shape[0]},{tensor.shape[1]},{tensor.shape[2]}] - start: {start}, length: {length}")
|
||||
return tensor[:,start:start+length]
|
||||
|
||||
for chunk_pos in range(0, query.shape[1], chunk_step):
|
||||
if attention_mask is not None:
|
||||
torch.baddbmm(
|
||||
_get_chunk_view(attention_mask, chunk_pos, chunk_step),
|
||||
_get_chunk_view(query, chunk_pos, chunk_step),
|
||||
key,
|
||||
beta=1,
|
||||
alpha=attn.scale,
|
||||
out=chunk,
|
||||
)
|
||||
else:
|
||||
torch.baddbmm(
|
||||
torch.zeros((1,1,1), device=query.device, dtype=query.dtype),
|
||||
_get_chunk_view(query, chunk_pos, chunk_step),
|
||||
key,
|
||||
beta=0,
|
||||
alpha=attn.scale,
|
||||
out=chunk,
|
||||
)
|
||||
chunk = chunk.softmax(dim=-1)
|
||||
torch.bmm(chunk, value, out=_get_chunk_view(hidden_states, chunk_pos, chunk_step))
|
||||
|
||||
#del chunk
|
||||
|
||||
|
||||
diffusers.models.attention_processor.SlicedAttnProcessor = ChunkedSlicedAttnProcessor
|
||||
|