Compare commits: v3.0.2rc1...Model-Mgmt (23 commits)
| Author | SHA1 | Date | |
|---|---|---|---|
| | cd93278355 | | |
| | 78578b1faf | | |
| | e282d2ee7a | | |
| | 7bef16f6f2 | | |
| | 8c7063be1f | | |
| | 7996d96d01 | | |
| | c025d0521d | | |
| | 1654269125 | | |
| | cebc0600c2 | | |
| | 2da020490e | | |
| | bd3703f298 | | |
| | 2b067c0813 | | |
| | b8eea1f6e5 | | |
| | 9e3cd33a99 | | |
| | 0e0e5bdb3e | | |
| | ff287e6260 | | |
| | 5860b517a7 | | |
| | f53b125caa | | |
| | 545b41639e | | |
| | b52b9985bd | | |
| | 6263cb945c | | |
| | 5f92f290fc | | |
| | 22d2c2b3e3 | | |
@@ -20,13 +20,13 @@ def calc_images_mean_L1(image1_path, image2_path):

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("image1_path")
    parser.add_argument("image2_path")
    parser.add_argument('image1_path')
    parser.add_argument('image2_path')
    args = parser.parse_args()
    return args


if __name__ == "__main__":
if __name__ == '__main__':
    args = parse_args()
    mean_L1 = calc_images_mean_L1(args.image1_path, args.image2_path)
    print(mean_L1)
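The hunk above only touches the command-line wrapper; the body of `calc_images_mean_L1` is not shown. As a rough illustration of what a mean L1 comparison computes, here is a minimal sketch over flat pixel sequences. The helper name `mean_l1` and its list-based inputs are assumptions made for this example; the repository's function takes image paths and may be implemented differently.

```python
from typing import Sequence


def mean_l1(pixels_a: Sequence[float], pixels_b: Sequence[float]) -> float:
    """Mean absolute difference between two equally sized pixel sequences.

    Hypothetical helper for illustration only; not the repository's code.
    """
    if len(pixels_a) != len(pixels_b):
        raise ValueError("inputs must have the same number of values")
    if not pixels_a:
        raise ValueError("inputs must not be empty")
    # Sum the per-pixel absolute differences, then average them.
    total = sum(abs(a - b) for a, b in zip(pixels_a, pixels_b))
    return total / len(pixels_a)
```

Identical inputs give `0.0`; larger values mean the two images differ more on average.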
@@ -1,9 +1,25 @@
# use this file as a whitelist
*
!invokeai
!ldm
!pyproject.toml
!docker/docker-entrypoint.sh
!LICENSE

**/node_modules
**/__pycache__
**/*.egg-info
# ignore frontend/web but whitelist dist
invokeai/frontend/web/
!invokeai/frontend/web/dist/

# ignore invokeai/assets but whitelist invokeai/assets/web
invokeai/assets/
!invokeai/assets/web/

# Guard against pulling in any models that might exist in the directory tree
**/*.pt*
**/*.ckpt

# Byte-compiled / optimized / DLL files
**/__pycache__/
**/*.py[cod]

# Distribution / packaging
**/*.egg-info/
**/*.egg
@@ -1,2 +1 @@
b3dccfaeb636599c02effc377cdd8a87d658256c
218b6d0546b990fc449c876fb99f44b50c4daa35
.github/CODEOWNERS (26 changes)
@@ -1,16 +1,16 @@
# continuous integration
/.github/workflows/ @lstein @blessedcoolant
/.github/workflows/ @mauwii @lstein @blessedcoolant

# documentation
/docs/ @lstein @blessedcoolant @hipsterusername
/mkdocs.yml @lstein @blessedcoolant
/docs/ @lstein @mauwii @tildebyte @blessedcoolant
/mkdocs.yml @lstein @mauwii @blessedcoolant

# nodes
/invokeai/app/ @Kyle0654 @blessedcoolant @psychedelicious @brandonrising
/invokeai/app/ @Kyle0654 @blessedcoolant

# installation and configuration
/pyproject.toml @lstein @blessedcoolant
/docker/ @lstein @blessedcoolant
/pyproject.toml @mauwii @lstein @blessedcoolant
/docker/ @mauwii @lstein @blessedcoolant
/scripts/ @ebr @lstein
/installer/ @lstein @ebr
/invokeai/assets @lstein @ebr
@@ -18,17 +18,17 @@
/invokeai/version @lstein @blessedcoolant

# web ui
/invokeai/frontend @blessedcoolant @psychedelicious @lstein @maryhipp
/invokeai/backend @blessedcoolant @psychedelicious @lstein @maryhipp
/invokeai/frontend @blessedcoolant @psychedelicious @lstein
/invokeai/backend @blessedcoolant @psychedelicious @lstein

# generation, model management, postprocessing
/invokeai/backend @damian0815 @lstein @blessedcoolant @gregghelt2 @StAlKeR7779 @brandonrising
/invokeai/backend @keturn @damian0815 @lstein @blessedcoolant @jpphoto

# front ends
/invokeai/frontend/CLI @lstein
/invokeai/frontend/install @lstein @ebr
/invokeai/frontend/merge @lstein @blessedcoolant
/invokeai/frontend/training @lstein @blessedcoolant
/invokeai/frontend/web @psychedelicious @blessedcoolant @maryhipp
/invokeai/frontend/install @lstein @ebr @mauwii
/invokeai/frontend/merge @lstein @blessedcoolant @hipsterusername
/invokeai/frontend/training @lstein @blessedcoolant @hipsterusername
/invokeai/frontend/web @psychedelicious @blessedcoolant
.github/pull_request_template.md (51 changes)
@@ -1,51 +0,0 @@
## What type of PR is this? (check all applicable)

- [ ] Refactor
- [ ] Feature
- [ ] Bug Fix
- [ ] Optimization
- [ ] Documentation Update
- [ ] Community Node Submission


## Have you discussed this change with the InvokeAI team?
- [ ] Yes
- [ ] No, because:


## Have you updated all relevant documentation?
- [ ] Yes
- [ ] No


## Description


## Related Tickets & Documents

<!--
For pull requests that relate or close an issue, please include them
below.

For example having the text: "closes #1234" would connect the current pull
request to issue 1234. And when we merge the pull request, Github will
automatically close the issue.
-->

- Related Issue #
- Closes #

## QA Instructions, Screenshots, Recordings

<!--
Please provide steps on how to test changes, any hardware or
software specifications as well as any other pertinent information.
-->

## Added/updated tests?

- [ ] Yes
- [ ] No : _please replace this line with details on why tests
have not been included_

## [optional] Are there any post deployment tasks we need to perform?
.github/stale.yaml (19 changes)
@@ -1,19 +0,0 @@
# Number of days of inactivity before an issue becomes stale
daysUntilStale: 28
# Number of days of inactivity before a stale issue is closed
daysUntilClose: 14
# Issues with these labels will never be considered stale
exemptLabels:
  - pinned
  - security
# Label to use when marking an issue as stale
staleLabel: stale
# Comment to post when marking an issue as stale. Set to `false` to disable
markComment: >
  This issue has been automatically marked as stale because it has not had
  recent activity. It will be closed if no further activity occurs. Please
  update the ticket if this is still a problem on the latest release.
# Comment to post when closing a stale issue. Set to `false` to disable
closeComment: >
  Due to inactivity, this issue has been automatically closed. If this is
  still a problem on the latest release, please recreate the issue.
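To make the two thresholds in the deleted `stale.yaml` concrete, the sketch below works out when an untouched issue would be marked stale and then closed. It assumes the close countdown starts on the day the stale label is applied and that no further activity occurs; `stale_timeline` is a hypothetical helper written for this example, not part of any bot.

```python
from datetime import date, timedelta
from typing import Optional, Set, Tuple

DAYS_UNTIL_STALE = 28  # daysUntilStale in the config above
DAYS_UNTIL_CLOSE = 14  # daysUntilClose in the config above
EXEMPT_LABELS = {"pinned", "security"}  # exemptLabels in the config above


def stale_timeline(last_activity: date, labels: Set[str]) -> Optional[Tuple[date, date]]:
    """Return (stale_on, close_on), or None when an exempt label is present."""
    if labels & EXEMPT_LABELS:
        return None
    stale_on = last_activity + timedelta(days=DAYS_UNTIL_STALE)
    # Assumption: the close timer begins when the issue is marked stale.
    close_on = stale_on + timedelta(days=DAYS_UNTIL_CLOSE)
    return stale_on, close_on
```

Under these assumptions an issue last touched on 1 January would be marked stale on 29 January and closed on 12 February.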
.github/workflows/build-container.yml (84 changes)
@@ -3,20 +3,21 @@ on:
  push:
    branches:
      - 'main'
      - 'update/ci/docker/*'
      - 'update/docker/*'
      - 'dev/ci/docker/*'
      - 'dev/docker/*'
    paths:
      - 'pyproject.toml'
      - '.dockerignore'
      - 'invokeai/**'
      - 'docker/Dockerfile'
      - 'docker/docker-entrypoint.sh'
      - 'workflows/build-container.yml'
    tags:
      - 'v*'
      - 'v*.*.*'
  workflow_dispatch:

permissions:
  contents: write
  packages: write

jobs:
  docker:
@@ -24,27 +25,23 @@ jobs:
    strategy:
      fail-fast: false
      matrix:
        gpu-driver:
          - cuda
          - cpu
          - rocm
        flavor:
          - rocm
          - cuda
          - cpu
        include:
          - flavor: rocm
            pip-extra-index-url: 'https://download.pytorch.org/whl/rocm5.2'
          - flavor: cuda
            pip-extra-index-url: ''
          - flavor: cpu
            pip-extra-index-url: 'https://download.pytorch.org/whl/cpu'
    runs-on: ubuntu-latest
    name: ${{ matrix.gpu-driver }}
    name: ${{ matrix.flavor }}
    env:
      # torch/arm64 does not support GPU currently, so arm64 builds
      # would not be GPU-accelerated.
      # re-enable arm64 if there is sufficient demand.
      # PLATFORMS: 'linux/amd64,linux/arm64'
      PLATFORMS: 'linux/amd64'
      PLATFORMS: 'linux/amd64,linux/arm64'
      DOCKERFILE: 'docker/Dockerfile'
    steps:
      - name: Free up more disk space on the runner
        # https://github.com/actions/runner-images/issues/2840#issuecomment-1284059930
        run: |
          sudo rm -rf /usr/share/dotnet
          sudo rm -rf "$AGENT_TOOLSDIRECTORY"
          sudo swapoff /mnt/swapfile
          sudo rm -rf /mnt/swapfile

      - name: Checkout
        uses: actions/checkout@v3

@@ -55,7 +52,7 @@ jobs:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          images: |
            ghcr.io/${{ github.repository }}
            ${{ env.DOCKERHUB_REPOSITORY }}
            ${{ vars.DOCKERHUB_REPOSITORY }}
          tags: |
            type=ref,event=branch
            type=ref,event=tag
@@ -64,8 +61,8 @@ jobs:
            type=pep440,pattern={{major}}
            type=sha,enable=true,prefix=sha-,format=short
          flavor: |
            latest=${{ matrix.gpu-driver == 'cuda' && github.ref == 'refs/heads/main' }}
            suffix=-${{ matrix.gpu-driver }},onlatest=false
            latest=${{ matrix.flavor == 'cuda' && github.ref == 'refs/heads/main' }}
            suffix=-${{ matrix.flavor }},onlatest=false

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2
@@ -83,33 +80,34 @@ jobs:
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}

      # - name: Login to Docker Hub
      # if: github.event_name != 'pull_request' && vars.DOCKERHUB_REPOSITORY != ''
      # uses: docker/login-action@v2
      # with:
      # username: ${{ secrets.DOCKERHUB_USERNAME }}
      # password: ${{ secrets.DOCKERHUB_TOKEN }}
      - name: Login to Docker Hub
        if: github.event_name != 'pull_request' && vars.DOCKERHUB_REPOSITORY != ''
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: Build container
        id: docker_build
        uses: docker/build-push-action@v4
        with:
          context: .
          file: docker/Dockerfile
          file: ${{ env.DOCKERFILE }}
          platforms: ${{ env.PLATFORMS }}
          push: ${{ github.ref == 'refs/heads/main' || github.ref_type == 'tag' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          build-args: PIP_EXTRA_INDEX_URL=${{ matrix.pip-extra-index-url }}
          cache-from: |
            type=gha,scope=${{ github.ref_name }}-${{ matrix.gpu-driver }}
            type=gha,scope=main-${{ matrix.gpu-driver }}
          cache-to: type=gha,mode=max,scope=${{ github.ref_name }}-${{ matrix.gpu-driver }}
            type=gha,scope=${{ github.ref_name }}-${{ matrix.flavor }}
            type=gha,scope=main-${{ matrix.flavor }}
          cache-to: type=gha,mode=max,scope=${{ github.ref_name }}-${{ matrix.flavor }}

      # - name: Docker Hub Description
      # if: github.ref == 'refs/heads/main' || github.ref == 'refs/tags/*' && vars.DOCKERHUB_REPOSITORY != ''
      # uses: peter-evans/dockerhub-description@v3
      # with:
      # username: ${{ secrets.DOCKERHUB_USERNAME }}
      # password: ${{ secrets.DOCKERHUB_TOKEN }}
      # repository: ${{ vars.DOCKERHUB_REPOSITORY }}
      # short-description: ${{ github.event.repository.description }}
      - name: Docker Hub Description
        if: github.ref == 'refs/heads/main' || github.ref == 'refs/tags/*' && vars.DOCKERHUB_REPOSITORY != ''
        uses: peter-evans/dockerhub-description@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
          repository: ${{ vars.DOCKERHUB_REPOSITORY }}
          short-description: ${{ github.event.repository.description }}
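The `flavor` side of this workflow derives several values from the matrix entry and the Git ref: the pip index URL, the image tag suffix, whether `latest` is set, whether the image is pushed, and the cache scopes. The sketch below restates those expressions in Python as a reading aid. `build_settings` and `ref_name` are hypothetical helpers, and the handling of `github.ref_name` and `github.ref_type` is a simplified assumption (prefix stripping on the full ref).

```python
# Mirrors the `include` entries of the flavor matrix in the hunk above.
PIP_EXTRA_INDEX_URL = {
    "rocm": "https://download.pytorch.org/whl/rocm5.2",
    "cuda": "",
    "cpu": "https://download.pytorch.org/whl/cpu",
}


def ref_name(ref: str) -> str:
    """Approximate github.ref_name by stripping the branch or tag prefix."""
    for prefix in ("refs/heads/", "refs/tags/"):
        if ref.startswith(prefix):
            return ref[len(prefix):]
    return ref


def build_settings(flavor: str, ref: str) -> dict:
    """Evaluate the per-flavor workflow expressions for one matrix entry."""
    name = ref_name(ref)
    return {
        "build_arg": f"PIP_EXTRA_INDEX_URL={PIP_EXTRA_INDEX_URL[flavor]}",
        "tag_suffix": f"-{flavor}",
        # latest is only set for the cuda image built from main.
        "latest": flavor == "cuda" and ref == "refs/heads/main",
        # Assumption: a ref under refs/tags/ corresponds to ref_type == 'tag'.
        "push": ref == "refs/heads/main" or ref.startswith("refs/tags/"),
        "cache_from": [f"{name}-{flavor}", f"main-{flavor}"],
        "cache_to": f"{name}-{flavor}",
    }
```

For example, `build_settings("rocm", "refs/heads/dev/docker/x")` reads its cache from the branch scope first and falls back to the `main-rocm` scope, and it does not push.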
.github/workflows/close-inactive-issues.yml (9 changes)
@@ -1,11 +1,11 @@
name: Close inactive issues
on:
  schedule:
    - cron: "00 4 * * *"
    - cron: "00 6 * * *"

env:
  DAYS_BEFORE_ISSUE_STALE: 30
  DAYS_BEFORE_ISSUE_CLOSE: 14
  DAYS_BEFORE_ISSUE_STALE: 14
  DAYS_BEFORE_ISSUE_CLOSE: 28

jobs:
  close-issues:
@@ -14,7 +14,7 @@ jobs:
      issues: write
      pull-requests: write
    steps:
      - uses: actions/stale@v8
      - uses: actions/stale@v5
        with:
          days-before-issue-stale: ${{ env.DAYS_BEFORE_ISSUE_STALE }}
          days-before-issue-close: ${{ env.DAYS_BEFORE_ISSUE_CLOSE }}
@@ -23,6 +23,5 @@ jobs:
          close-issue-message: "Due to inactivity, this issue was automatically closed. If you are still experiencing the issue, please recreate the issue."
          days-before-pr-stale: -1
          days-before-pr-close: -1
          exempt-issue-labels: "Active Issue"
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          operations-per-run: 500
.github/workflows/lint-frontend.yml (4 changes)
@@ -2,6 +2,8 @@ name: Lint frontend

on:
  pull_request:
    paths:
      - 'invokeai/frontend/web/**'
    types:
      - 'ready_for_review'
      - 'opened'
@@ -9,6 +11,8 @@ on:
  push:
    branches:
      - 'main'
    paths:
      - 'invokeai/frontend/web/**'
  merge_group:
  workflow_dispatch:
.github/workflows/mkdocs-material.yml (13 changes)
@@ -2,7 +2,8 @@ name: mkdocs-material
on:
  push:
    branches:
      - 'refs/heads/main'
      - 'main'
      - 'development'

permissions:
  contents: write
@@ -11,10 +12,6 @@ jobs:
  mkdocs-material:
    if: github.event.pull_request.draft == false
    runs-on: ubuntu-latest
    env:
      REPO_URL: '${{ github.server_url }}/${{ github.repository }}'
      REPO_NAME: '${{ github.repository }}'
      SITE_URL: 'https://${{ github.repository_owner }}.github.io/InvokeAI'
    steps:
      - name: checkout sources
        uses: actions/checkout@v3
@@ -25,15 +22,11 @@ jobs:
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
          cache: pip
          cache-dependency-path: pyproject.toml

      - name: install requirements
        env:
          PIP_USE_PEP517: 1
        run: |
          python -m \
            pip install ".[docs]"
          pip install -r docs/requirements-mkdocs.txt

      - name: confirm buildability
        run: |
.github/workflows/style-checks.yml (27 changes)
@@ -1,27 +0,0 @@
name: style checks
# just formatting for now
# TODO: add isort and flake8 later

on:
  pull_request:
  push:
    branches: main

jobs:
  black:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Install dependencies with pip
        run: |
          pip install black

      # - run: isort --check-only .
      - run: black --check .
      # - run: flake8
.github/workflows/test-invoke-pip-skip.yml (66 changes, new file)
@@ -0,0 +1,66 @@
name: Test invoke.py pip
on:
  pull_request:
    paths:
      - '**'
      - '!pyproject.toml'
      - '!invokeai/**'
      - 'invokeai/frontend/web/**'
  merge_group:
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  matrix:
    if: github.event.pull_request.draft == false
    strategy:
      matrix:
        python-version:
          # - '3.9'
          - '3.10'
        pytorch:
          # - linux-cuda-11_6
          - linux-cuda-11_7
          - linux-rocm-5_2
          - linux-cpu
          - macos-default
          - windows-cpu
          # - windows-cuda-11_6
          # - windows-cuda-11_7
        include:
          # - pytorch: linux-cuda-11_6
          # os: ubuntu-22.04
          # extra-index-url: 'https://download.pytorch.org/whl/cu116'
          # github-env: $GITHUB_ENV
          - pytorch: linux-cuda-11_7
            os: ubuntu-22.04
            github-env: $GITHUB_ENV
          - pytorch: linux-rocm-5_2
            os: ubuntu-22.04
            extra-index-url: 'https://download.pytorch.org/whl/rocm5.2'
            github-env: $GITHUB_ENV
          - pytorch: linux-cpu
            os: ubuntu-22.04
            extra-index-url: 'https://download.pytorch.org/whl/cpu'
            github-env: $GITHUB_ENV
          - pytorch: macos-default
            os: macOS-12
            github-env: $GITHUB_ENV
          - pytorch: windows-cpu
            os: windows-2022
            github-env: $env:GITHUB_ENV
          # - pytorch: windows-cuda-11_6
          # os: windows-2022
          # extra-index-url: 'https://download.pytorch.org/whl/cu116'
          # github-env: $env:GITHUB_ENV
          # - pytorch: windows-cuda-11_7
          # os: windows-2022
          # extra-index-url: 'https://download.pytorch.org/whl/cu117'
          # github-env: $env:GITHUB_ENV
    name: ${{ matrix.pytorch }} on ${{ matrix.python-version }}
    runs-on: ${{ matrix.os }}
    steps:
      - run: 'echo "No build required"'
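The `paths` list in this skip workflow relies on ordered include and exclude patterns, where a later matching pattern overrides an earlier one. The sketch below approximates that rule with Python's `fnmatch` to show which changed files would trigger the skip workflow. It is an approximation under stated assumptions: `fnmatch` lets `*` match `/`, which GitHub's glob does not, so it only behaves the same for simple patterns like the ones here. `path_included` and `workflow_triggers` are hypothetical helper names.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List

# Patterns copied from the skip workflow above.
SKIP_WORKFLOW_PATHS = ["**", "!pyproject.toml", "!invokeai/**", "invokeai/frontend/web/**"]
# Patterns copied from the pull_request trigger of test-invoke-pip.yml.
MAIN_WORKFLOW_PATHS = ["pyproject.toml", "invokeai/**", "!invokeai/frontend/web/**"]


def path_included(path: str, patterns: List[str]) -> bool:
    """Apply patterns in order; the last pattern that matches decides."""
    included = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        if fnmatchcase(path, glob):
            included = not negated
    return included


def workflow_triggers(changed_files: Iterable[str], patterns: List[str]) -> bool:
    """A path filter is satisfied when at least one changed file is included."""
    return any(path_included(path, patterns) for path in changed_files)
```

With these two pattern lists, a backend file such as `invokeai/backend/model.py` is included only by the main workflow, while a docs or web frontend file is included only by the skip workflow, so the two filters appear to be complementary.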
.github/workflows/test-invoke-pip.yml (111 changes)
@@ -3,7 +3,15 @@ on:
  push:
    branches:
      - 'main'
    paths:
      - 'pyproject.toml'
      - 'invokeai/**'
      - '!invokeai/frontend/web/**'
  pull_request:
    paths:
      - 'pyproject.toml'
      - 'invokeai/**'
      - '!invokeai/frontend/web/**'
    types:
      - 'ready_for_review'
      - 'opened'
@@ -24,12 +32,19 @@ jobs:
          # - '3.9'
          - '3.10'
        pytorch:
          # - linux-cuda-11_6
          - linux-cuda-11_7
          - linux-rocm-5_2
          - linux-cpu
          - macos-default
          - windows-cpu
          # - windows-cuda-11_6
          # - windows-cuda-11_7
        include:
          # - pytorch: linux-cuda-11_6
          # os: ubuntu-22.04
          # extra-index-url: 'https://download.pytorch.org/whl/cu116'
          # github-env: $GITHUB_ENV
          - pytorch: linux-cuda-11_7
            os: ubuntu-22.04
            github-env: $GITHUB_ENV
@@ -47,6 +62,14 @@ jobs:
          - pytorch: windows-cpu
            os: windows-2022
            github-env: $env:GITHUB_ENV
          # - pytorch: windows-cuda-11_6
          # os: windows-2022
          # extra-index-url: 'https://download.pytorch.org/whl/cu116'
          # github-env: $env:GITHUB_ENV
          # - pytorch: windows-cuda-11_7
          # os: windows-2022
          # extra-index-url: 'https://download.pytorch.org/whl/cu117'
          # github-env: $env:GITHUB_ENV
    name: ${{ matrix.pytorch }} on ${{ matrix.python-version }}
    runs-on: ${{ matrix.os }}
    env:
@@ -56,23 +79,15 @@ jobs:
        id: checkout-sources
        uses: actions/checkout@v3

      - name: Check for changed python files
        id: changed-files
        uses: tj-actions/changed-files@v37
        with:
          files_yaml: |
            python:
              - 'pyproject.toml'
              - 'invokeai/**'
              - '!invokeai/frontend/web/**'
              - 'tests/**'

      - name: set test prompt to main branch validation
        if: steps.changed-files.outputs.python_any_changed == 'true'
        if: ${{ github.ref == 'refs/heads/main' }}
        run: echo "TEST_PROMPTS=tests/preflight_prompts.txt" >> ${{ matrix.github-env }}

      - name: set test prompt to Pull Request validation
        if: ${{ github.ref != 'refs/heads/main' }}
        run: echo "TEST_PROMPTS=tests/validate_pr_prompt.txt" >> ${{ matrix.github-env }}

      - name: setup python
        if: steps.changed-files.outputs.python_any_changed == 'true'
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
@@ -80,7 +95,6 @@ jobs:
          cache-dependency-path: pyproject.toml

      - name: install invokeai
        if: steps.changed-files.outputs.python_any_changed == 'true'
        env:
          PIP_EXTRA_INDEX_URL: ${{ matrix.extra-index-url }}
        run: >
@@ -88,42 +102,43 @@ jobs:
          --editable=".[test]"

      - name: run pytest
        if: steps.changed-files.outputs.python_any_changed == 'true'
        id: run-pytest
        run: pytest

      # - name: run invokeai-configure
      # env:
      # HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGINGFACE_TOKEN }}
      # run: >
      # invokeai-configure
      # --yes
      # --default_only
      # --full-precision
      # # can't use fp16 weights without a GPU
      - name: set INVOKEAI_OUTDIR
        run: >
          python -c
          "import os;from invokeai.backend.globals import Globals;OUTDIR=os.path.join(Globals.root,str('outputs'));print(f'INVOKEAI_OUTDIR={OUTDIR}')"
          >> ${{ matrix.github-env }}

      # - name: run invokeai
      # id: run-invokeai
      # env:
      # # Set offline mode to make sure configure preloaded successfully.
      # HF_HUB_OFFLINE: 1
      # HF_DATASETS_OFFLINE: 1
      # TRANSFORMERS_OFFLINE: 1
      # INVOKEAI_OUTDIR: ${{ github.workspace }}/results
      # run: >
      # invokeai
      # --no-patchmatch
      # --no-nsfw_checker
      # --precision=float32
      # --always_use_cpu
      # --use_memory_db
      # --outdir ${{ env.INVOKEAI_OUTDIR }}/${{ matrix.python-version }}/${{ matrix.pytorch }}
      # --from_file ${{ env.TEST_PROMPTS }}
      - name: run invokeai-configure
        id: run-preload-models
        env:
          HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGINGFACE_TOKEN }}
        run: >
          invokeai-configure
          --yes
          --default_only
          --full-precision
        # can't use fp16 weights without a GPU

      # - name: Archive results
      # env:
      # INVOKEAI_OUTDIR: ${{ github.workspace }}/results
      # uses: actions/upload-artifact@v3
      # with:
      # name: results
      # path: ${{ env.INVOKEAI_OUTDIR }}
      - name: run invokeai
        id: run-invokeai
        env:
          # Set offline mode to make sure configure preloaded successfully.
          HF_HUB_OFFLINE: 1
          HF_DATASETS_OFFLINE: 1
          TRANSFORMERS_OFFLINE: 1
        run: >
          invokeai
          --no-patchmatch
          --no-nsfw_checker
          --from_file ${{ env.TEST_PROMPTS }}
          --outdir ${{ env.INVOKEAI_OUTDIR }}/${{ matrix.python-version }}/${{ matrix.pytorch }}

      - name: Archive results
        id: archive-results
        uses: actions/upload-artifact@v3
        with:
          name: results
          path: ${{ env.INVOKEAI_OUTDIR }}
.gitignore (9 changes)
@@ -9,8 +9,6 @@ models/ldm/stable-diffusion-v1/model.ckpt
configs/models.user.yaml
config/models.user.yml
invokeai.init
.version
.last_model

# ignore the Anaconda/Miniconda installer used while building Docker image
anaconda.sh
@@ -34,10 +32,11 @@ __pycache__/
.Python
build/
develop-eggs/
# dist/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
@@ -78,7 +77,6 @@ cov.xml
.pytest.ini
cover/
junit/
notes/

# Translations
*.mo
@@ -201,9 +199,6 @@ checkpoints
# If it's a Mac
.DS_Store

invokeai/frontend/yarn.lock
invokeai/frontend/node_modules

# Let the frontend manage its own gitignore
!invokeai/frontend/web/*
@@ -1,10 +0,0 @@
# See https://pre-commit.com/ for usage and config
repos:
- repo: local
  hooks:
  - id: black
    name: black
    stages: [commit]
    language: system
    entry: black
    types: [python]
LICENSE (189 changes)
@@ -1,176 +1,21 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
MIT License

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
Copyright (c) 2022 InvokeAI Team

1. Definitions.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.

"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.

"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).

"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.

"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."

"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.

2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.

3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:

(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and

(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and

(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and

(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.

You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.

9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
|
||||
or other liability obligations and/or rights consistent with this
|
||||
License. However, in accepting such obligations, You may act only
|
||||
on Your own behalf and on Your sole responsibility, not on behalf
|
||||
of any other Contributor, and only if You agree to indemnify,
|
||||
defend, and hold each Contributor harmless for any liability
|
||||
incurred by, or claims asserted against, such Contributor by reason
|
||||
of your accepting any such warranty or additional liability.
|
||||
|
||||
|
||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
SOFTWARE.
|
||||
|
290
LICENSE-SDXL.txt
@ -1,290 +0,0 @@
|
||||
Copyright (c) 2023 Stability AI
|
||||
CreativeML Open RAIL++-M License dated July 26, 2023
|
||||
|
||||
Section I: PREAMBLE
|
||||
|
||||
Multimodal generative models are being widely adopted and used, and
|
||||
have the potential to transform the way artists, among other
|
||||
individuals, conceive and benefit from AI or ML technologies as a tool
|
||||
for content creation.
|
||||
|
||||
Notwithstanding the current and potential benefits that these
|
||||
artifacts can bring to society at large, there are also concerns about
|
||||
potential misuses of them, either due to their technical limitations
|
||||
or ethical considerations.
|
||||
|
||||
In short, this license strives for both the open and responsible
|
||||
downstream use of the accompanying model. When it comes to the open
|
||||
character, we took inspiration from open source permissive licenses
|
||||
regarding the grant of IP rights. Referring to the downstream
|
||||
responsible use, we added use-based restrictions not permitting the
|
||||
use of the model in very specific scenarios, in order for the licensor
|
||||
to be able to enforce the license in case potential misuses of the
|
||||
Model may occur. At the same time, we strive to promote open and
|
||||
responsible research on generative models for art and content
|
||||
generation.
|
||||
|
||||
Even though downstream derivative versions of the model could be
|
||||
released under different licensing terms, the latter will always have
|
||||
to include - at minimum - the same use-based restrictions as the ones
|
||||
in the original license (this license). We believe in the intersection
|
||||
between open and responsible AI development; thus, this agreement aims
|
||||
to strike a balance between both in order to enable responsible
|
||||
open-science in the field of AI.
|
||||
|
||||
This CreativeML Open RAIL++-M License governs the use of the model
|
||||
(and its derivatives) and is informed by the model card associated
|
||||
with the model.
|
||||
|
||||
NOW THEREFORE, You and Licensor agree as follows:
|
||||
|
||||
Definitions
|
||||
|
||||
"License" means the terms and conditions for use, reproduction, and
|
||||
Distribution as defined in this document.
|
||||
|
||||
"Data" means a collection of information and/or content extracted from
|
||||
the dataset used with the Model, including to train, pretrain, or
|
||||
otherwise evaluate the Model. The Data is not licensed under this
|
||||
License.
|
||||
|
||||
"Output" means the results of operating a Model as embodied in
|
||||
informational content resulting therefrom.
|
||||
|
||||
"Model" means any accompanying machine-learning based assemblies
|
||||
(including checkpoints), consisting of learnt weights, parameters
|
||||
(including optimizer states), corresponding to the model architecture
|
||||
as embodied in the Complementary Material, that have been trained or
|
||||
tuned, in whole or in part on the Data, using the Complementary
|
||||
Material.
|
||||
|
||||
"Derivatives of the Model" means all modifications to the Model, works
|
||||
based on the Model, or any other model which is created or initialized
|
||||
by transfer of patterns of the weights, parameters, activations or
|
||||
output of the Model, to the other model, in order to cause the other
|
||||
model to perform similarly to the Model, including - but not limited
|
||||
to - distillation methods entailing the use of intermediate data
|
||||
representations or methods based on the generation of synthetic data
|
||||
by the Model for training the other model.
|
||||
|
||||
"Complementary Material" means the accompanying source code and
|
||||
scripts used to define, run, load, benchmark or evaluate the Model,
|
||||
and used to prepare data for training or evaluation, if any. This
|
||||
includes any accompanying documentation, tutorials, examples, etc, if
|
||||
any.
|
||||
|
||||
"Distribution" means any transmission, reproduction, publication or
|
||||
other sharing of the Model or Derivatives of the Model to a third
|
||||
party, including providing the Model as a hosted service made
|
||||
available by electronic or other remote means - e.g. API-based or web
|
||||
access.
|
||||
|
||||
"Licensor" means the copyright owner or entity authorized by the
|
||||
copyright owner that is granting the License, including the persons or
|
||||
entities that may have rights in the Model and/or distributing the
|
||||
Model.
|
||||
|
||||
"You" (or "Your") means an individual or Legal Entity exercising
|
||||
permissions granted by this License and/or making use of the Model for
|
||||
whichever purpose and in any field of use, including usage of the
|
||||
Model in an end-use application - e.g. chatbot, translator, image
|
||||
generator.
|
||||
|
||||
"Third Parties" means individuals or legal entities that are not under
|
||||
common control with Licensor or You.
|
||||
|
||||
"Contribution" means any work of authorship, including the original
|
||||
version of the Model and any modifications or additions to that Model
|
||||
or Derivatives of the Model thereof, that is intentionally submitted
|
||||
to Licensor for inclusion in the Model by the copyright owner or by an
|
||||
individual or Legal Entity authorized to submit on behalf of the
|
||||
copyright owner. For the purposes of this definition, "submitted"
|
||||
means any form of electronic, verbal, or written communication sent to
|
||||
the Licensor or its representatives, including but not limited to
|
||||
communication on electronic mailing lists, source code control
|
||||
systems, and issue tracking systems that are managed by, or on behalf
|
||||
of, the Licensor for the purpose of discussing and improving the
|
||||
Model, but excluding communication that is conspicuously marked or
|
||||
otherwise designated in writing by the copyright owner as "Not a
|
||||
Contribution."
|
||||
|
||||
"Contributor" means Licensor and any individual or Legal Entity on
|
||||
behalf of whom a Contribution has been received by Licensor and
|
||||
subsequently incorporated within the Model.
|
||||
|
||||
Section II: INTELLECTUAL PROPERTY RIGHTS
|
||||
|
||||
Both copyright and patent grants apply to the Model, Derivatives of
|
||||
the Model and Complementary Material. The Model and Derivatives of the
|
||||
Model are subject to additional terms as described in
|
||||
|
||||
Section III.
|
||||
|
||||
Grant of Copyright License. Subject to the terms and conditions of
|
||||
this License, each Contributor hereby grants to You a perpetual,
|
||||
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
|
||||
copyright license to reproduce, prepare, publicly display, publicly
|
||||
perform, sublicense, and distribute the Complementary Material, the
|
||||
Model, and Derivatives of the Model.
|
||||
|
||||
Grant of Patent License. Subject to the terms and conditions of this
|
||||
License and where and as applicable, each Contributor hereby grants to
|
||||
You a perpetual, worldwide, non-exclusive, no-charge, royalty-free,
|
||||
irrevocable (except as stated in this paragraph) patent license to
|
||||
make, have made, use, offer to sell, sell, import, and otherwise
|
||||
transfer the Model and the Complementary Material, where such license
|
||||
applies only to those patent claims licensable by such Contributor
|
||||
that are necessarily infringed by their Contribution(s) alone or by
|
||||
combination of their Contribution(s) with the Model to which such
|
||||
Contribution(s) was submitted. If You institute patent litigation
|
||||
against any entity (including a cross-claim or counterclaim in a
|
||||
lawsuit) alleging that the Model and/or Complementary Material or a
|
||||
Contribution incorporated within the Model and/or Complementary
|
||||
Material constitutes direct or contributory patent infringement, then
|
||||
any patent licenses granted to You under this License for the Model
|
||||
and/or Work shall terminate as of the date such litigation is asserted
|
||||
or filed.
|
||||
|
||||
Section III: CONDITIONS OF USAGE, DISTRIBUTION AND REDISTRIBUTION
|
||||
|
||||
Distribution and Redistribution. You may host for Third Party remote
|
||||
access purposes (e.g. software-as-a-service), reproduce and distribute
|
||||
copies of the Model or Derivatives of the Model thereof in any medium,
|
||||
with or without modifications, provided that You meet the following
|
||||
conditions: Use-based restrictions as referenced in paragraph 5 MUST
|
||||
be included as an enforceable provision by You in any type of legal
|
||||
agreement (e.g. a license) governing the use and/or distribution of
|
||||
the Model or Derivatives of the Model, and You shall give notice to
|
||||
subsequent users You Distribute to, that the Model or Derivatives of
|
||||
the Model are subject to paragraph 5. This provision does not apply to
|
||||
the use of Complementary Material. You must give any Third Party
|
||||
recipients of the Model or Derivatives of the Model a copy of this
|
||||
License; You must cause any modified files to carry prominent notices
|
||||
stating that You changed the files; You must retain all copyright,
|
||||
patent, trademark, and attribution notices excluding those notices
|
||||
that do not pertain to any part of the Model, Derivatives of the
|
||||
Model. You may add Your own copyright statement to Your modifications
|
||||
and may provide additional or different license terms and conditions -
|
||||
respecting paragraph 4.a. - for use, reproduction, or Distribution of
|
||||
Your modifications, or for any such Derivatives of the Model as a
|
||||
whole, provided Your use, reproduction, and Distribution of the Model
|
||||
otherwise complies with the conditions stated in this License.
|
||||
|
||||
Use-based restrictions. The restrictions set forth in Attachment A are
|
||||
considered Use-based restrictions. Therefore You cannot use the Model
|
||||
and the Derivatives of the Model for the specified restricted
|
||||
uses. You may use the Model subject to this License, including only
|
||||
for lawful purposes and in accordance with the License. Use may
|
||||
include creating any content with, finetuning, updating, running,
|
||||
training, evaluating and/or reparametrizing the Model. You shall
|
||||
require all of Your users who use the Model or a Derivative of the
|
||||
Model to comply with the terms of this paragraph (paragraph 5).
|
||||
|
||||
The Output You Generate. Except as set forth herein, Licensor claims
|
||||
no rights in the Output You generate using the Model. You are
|
||||
accountable for the Output you generate and its subsequent uses. No
|
||||
use of the output can contravene any provision as stated in the
|
||||
License.
|
||||
|
||||
Section IV: OTHER PROVISIONS
|
||||
|
||||
Updates and Runtime Restrictions. To the maximum extent permitted by
|
||||
law, Licensor reserves the right to restrict (remotely or otherwise)
|
||||
usage of the Model in violation of this License.
|
||||
|
||||
Trademarks and related. Nothing in this License permits You to make
|
||||
use of Licensors’ trademarks, trade names, logos or to otherwise
|
||||
suggest endorsement or misrepresent the relationship between the
|
||||
parties; and any rights not expressly granted herein are reserved by
|
||||
the Licensors.
|
||||
|
||||
Disclaimer of Warranty. Unless required by applicable law or agreed to
|
||||
in writing, Licensor provides the Model and the Complementary Material
|
||||
(and each Contributor provides its Contributions) on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
||||
implied, including, without limitation, any warranties or conditions
|
||||
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
|
||||
PARTICULAR PURPOSE. You are solely responsible for determining the
|
||||
appropriateness of using or redistributing the Model, Derivatives of
|
||||
the Model, and the Complementary Material and assume any risks
|
||||
associated with Your exercise of permissions under this License.
|
||||
|
||||
Limitation of Liability. In no event and under no legal theory,
|
||||
whether in tort (including negligence), contract, or otherwise, unless
|
||||
required by applicable law (such as deliberate and grossly negligent
|
||||
acts) or agreed to in writing, shall any Contributor be liable to You
|
||||
for damages, including any direct, indirect, special, incidental, or
|
||||
consequential damages of any character arising as a result of this
|
||||
License or out of the use or inability to use the Model and the
|
||||
Complementary Material (including but not limited to damages for loss
|
||||
of goodwill, work stoppage, computer failure or malfunction, or any
|
||||
and all other commercial damages or losses), even if such Contributor
|
||||
has been advised of the possibility of such damages.
|
||||
|
||||
Accepting Warranty or Additional Liability. While redistributing the
|
||||
Model, Derivatives of the Model and the Complementary Material
|
||||
thereof, You may choose to offer, and charge a fee for, acceptance of
|
||||
support, warranty, indemnity, or other liability obligations and/or
|
||||
rights consistent with this License. However, in accepting such
|
||||
obligations, You may act only on Your own behalf and on Your sole
|
||||
responsibility, not on behalf of any other Contributor, and only if
|
||||
You agree to indemnify, defend, and hold each Contributor harmless for
|
||||
any liability incurred by, or claims asserted against, such
|
||||
Contributor by reason of your accepting any such warranty or
|
||||
additional liability.
|
||||
|
||||
If any provision of this License is held to be invalid, illegal or
|
||||
unenforceable, the remaining provisions shall be unaffected thereby
|
||||
and remain valid as if such provision had not been set forth herein.
|
||||
|
||||
|
||||
END OF TERMS AND CONDITIONS
|
||||
|
||||
Attachment A
|
||||
|
||||
Use Restrictions
|
||||
|
||||
You agree not to use the Model or Derivatives of the Model:
|
||||
|
||||
* In any way that violates any applicable national, federal, state,
|
||||
local or international law or regulation;
|
||||
|
||||
* For the purpose of exploiting, harming or attempting to exploit or
|
||||
harm minors in any way;
|
||||
|
||||
* To generate or disseminate verifiably false information and/or
|
||||
content with the purpose of harming others;
|
||||
|
||||
* To generate or disseminate personal identifiable information that
|
||||
can be used to harm an individual;
|
||||
|
||||
* To defame, disparage or otherwise harass others;
|
||||
|
||||
* For fully automated decision making that adversely impacts an
|
||||
individual’s legal rights or otherwise creates or modifies a
|
||||
binding, enforceable obligation;
|
||||
|
||||
* For any use intended to or which has the effect of discriminating
|
||||
against or harming individuals or groups based on online or offline
|
||||
social behavior or known or predicted personal or personality
|
||||
characteristics;
|
||||
|
||||
* To exploit any of the vulnerabilities of a specific group of persons
|
||||
based on their age, social, physical or mental characteristics, in
|
||||
order to materially distort the behavior of a person pertaining to
|
||||
that group in a manner that causes or is likely to cause that person
|
||||
or another person physical or psychological harm;
|
||||
|
||||
* For any use intended to or which has the effect of discriminating
|
||||
against individuals or groups based on legally protected
|
||||
characteristics or categories;
|
||||
|
||||
* To provide medical advice and medical results interpretation;
|
||||
|
||||
* To generate or disseminate information for the purpose to be used
|
||||
for administration of justice, law enforcement, immigration or
|
||||
asylum processes, such as predicting an individual will commit
|
||||
fraud/crime commitment (e.g. by text profiling, drawing causal
|
||||
relationships between assertions made in documents, indiscriminate
|
||||
and arbitrarily-targeted use).
|
||||
|
234
README.md
@ -1,11 +1,8 @@
|
||||
<div align="center">
|
||||
|
||||

|
||||
|
||||
# Invoke AI - Generative AI for Professional Creatives
|
||||
## Professional Creative Tools for Stable Diffusion, Custom-Trained Models, and more.
|
||||
To learn more about Invoke AI, get started instantly, or implement our Business solutions, visit [invoke.ai](https://invoke.ai)
|
||||

|
||||
|
||||
# InvokeAI: A Stable Diffusion Toolkit
|
||||
|
||||
[![discord badge]][discord link]
|
||||
|
||||
@ -36,23 +33,13 @@
|
||||
|
||||
</div>
|
||||
|
||||
InvokeAI is a leading creative engine built to empower professionals
|
||||
and enthusiasts alike. Generate and create stunning visual media using
|
||||
the latest AI-driven technologies. InvokeAI offers an industry leading
|
||||
Web Interface, interactive Command Line Interface, and also serves as
|
||||
the foundation for multiple commercial products.
|
||||
InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. InvokeAI offers an industry leading Web Interface, interactive Command Line Interface, and also serves as the foundation for multiple commercial products.
|
||||
|
||||
**Quick links**: [[How to
|
||||
Install](https://invoke-ai.github.io/InvokeAI/#installation)] [<a
|
||||
href="https://discord.gg/ZmtBAhwWhy">Discord Server</a>] [<a
|
||||
href="https://invoke-ai.github.io/InvokeAI/">Documentation and
|
||||
Tutorials</a>] [<a
|
||||
href="https://github.com/invoke-ai/InvokeAI/">Code and
|
||||
Downloads</a>] [<a
|
||||
href="https://github.com/invoke-ai/InvokeAI/issues">Bug Reports</a>]
|
||||
[<a
|
||||
href="https://github.com/invoke-ai/InvokeAI/discussions">Discussion,
|
||||
Ideas & Q&A</a>]
|
||||
**Quick links**: [[How to Install](https://invoke-ai.github.io/InvokeAI/#installation)] [<a href="https://discord.gg/ZmtBAhwWhy">Discord Server</a>] [<a href="https://invoke-ai.github.io/InvokeAI/">Documentation and Tutorials</a>] [<a href="https://github.com/invoke-ai/InvokeAI/">Code and Downloads</a>] [<a href="https://github.com/invoke-ai/InvokeAI/issues">Bug Reports</a>] [<a href="https://github.com/invoke-ai/InvokeAI/discussions">Discussion, Ideas & Q&A</a>]
|
||||
|
||||
_Note: InvokeAI is rapidly evolving. Please use the
|
||||
[Issues](https://github.com/invoke-ai/InvokeAI/issues) tab to report bugs and make feature
|
||||
requests. Be sure to use the provided templates. They will help us diagnose issues faster._
|
||||
|
||||
<div align="center">
|
||||
|
||||
@ -62,30 +49,22 @@ the foundation for multiple commercial products.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
Table of Contents 📝
|
||||
1. [Quick Start](#getting-started-with-invokeai)
2. [Installation](#detailed-installation-instructions)
3. [Hardware Requirements](#hardware-requirements)
4. [Features](#features)
5. [Latest Changes](#latest-changes)
6. [Troubleshooting](#troubleshooting)
7. [Contributing](#contributing)
8. [Contributors](#contributors)
9. [Support](#support)
10. [Further Reading](#further-reading)
|
||||
|
||||
**Getting Started**
|
||||
1. 🏁 [Quick Start](#quick-start)
|
||||
3. 🖥️ [Hardware Requirements](#hardware-requirements)
|
||||
|
||||
**More About Invoke**
|
||||
1. 🌟 [Features](#features)
|
||||
2. 📣 [Latest Changes](#latest-changes)
|
||||
3. 🛠️ [Troubleshooting](#troubleshooting)
|
||||
|
||||
**Supporting the Project**
|
||||
1. 🤝 [Contributing](#contributing)
|
||||
2. 👥 [Contributors](#contributors)
|
||||
3. 💕 [Support](#support)
|
||||
|
||||
## Quick Start
|
||||
## Getting Started with InvokeAI
|
||||
|
||||
For full installation and upgrade instructions, please see:
|
||||
[InvokeAI Installation Overview](https://invoke-ai.github.io/InvokeAI/installation/)
|
||||
|
||||
If upgrading from version 2.3, please read [Migrating a 2.3 root
|
||||
directory to 3.0](#migrating-to-3) first.
|
||||
|
||||
### Automatic Installer (suggested for 1st time users)
|
||||
|
||||
1. Go to the bottom of the [Latest Release Page](https://github.com/invoke-ai/InvokeAI/releases/latest)
|
||||
@ -94,8 +73,9 @@ directory to 3.0](#migrating-to-3) first.
|
||||
|
||||
3. Unzip the file.
|
||||
|
||||
4. **Windows:** double-click on the `install.bat` script. **macOS:** Open a Terminal window, drag the file `install.sh` from Finder
|
||||
into the Terminal, and press return. **Linux:** run `install.sh`.
|
||||
4. If you are on Windows, double-click on the `install.bat` script. On
|
||||
macOS, open a Terminal window, drag the file `install.sh` from Finder
|
||||
into the Terminal, and press return. On Linux, run `install.sh`.
|
||||
|
||||
5. You'll be asked to confirm the location of the folder in which
|
||||
to install InvokeAI and its image generation model files. Pick a
|
||||
@ -104,7 +84,7 @@ installing lots of models.
|
||||
|
||||
6. Wait while the installer does its thing. After installing the software,
|
||||
the installer will launch a script that lets you configure InvokeAI and
|
||||
select a set of starting image generation models.
|
||||
select a set of starting image generaiton models.
|
||||
|
||||
7. Find the folder that InvokeAI was installed into (it is not the
|
||||
same as the unpacked zip file directory!) The default location of this
|
||||
@ -121,12 +101,10 @@ and go to http://localhost:9090.
|
||||
|
||||
10. Type `banana sushi` in the box on the top left and click `Invoke`
|
||||
|
||||
### Command-Line Installation (for developers and users familiar with Terminals)
|
||||
### Command-Line Installation (for users familiar with Terminals)
|
||||
|
||||
You must have Python 3.9 through 3.11 installed on your machine. Earlier or
|
||||
later versions are not supported.
|
||||
Node.js also needs to be installed along with yarn (can be installed with
|
||||
the command `npm install -g yarn` if needed)
|
||||
You must have Python 3.9 or 3.10 installed on your machine. Earlier or later versions are
|
||||
not supported.
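
The version requirement above can be checked before installing. This is a minimal sketch, not part of InvokeAI: the helper name `is_supported` is hypothetical, and the upper bound is an assumption that depends on which side of this diff applies (3.11 on one side, 3.10 on the other).

```python
import sys

# Bounds taken from the README text above. The upper bound is an assumption:
# one side of the diff says 3.11, the other says 3.10.
MIN_SUPPORTED = (3, 9)
MAX_SUPPORTED = (3, 11)


def is_supported(version=None, lowest=MIN_SUPPORTED, highest=MAX_SUPPORTED):
    """Return True if the (major, minor) version lies within the inclusive range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return lowest <= major_minor <= highest


if __name__ == "__main__":
    status = "supported" if is_supported() else "not supported"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is {status}")
```

Pass `highest=(3, 10)` to check against the narrower range.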
|
||||
|
||||
1. Open a command-line window on your machine. The PowerShell is recommended for Windows.
|
||||
2. Create a directory to install InvokeAI into. You'll need at least 15 GB of free space:
|
||||
@ -167,14 +145,9 @@ the command `npm install -g yarn` if needed)
|
||||
_For Linux with an AMD GPU:_
|
||||
|
||||
```sh
|
||||
pip install InvokeAI --use-pep517 --extra-index-url https://download.pytorch.org/whl/rocm5.4.2
|
||||
pip install InvokeAI --use-pep517 --extra-index-url https://download.pytorch.org/whl/rocm5.2
|
||||
```
|
||||
|
||||
_For non-GPU systems:_
|
||||
```terminal
pip install InvokeAI --use-pep517 --extra-index-url https://download.pytorch.org/whl/cpu
```
|
||||
|
||||
_For Macintoshes, either Intel or M1/M2:_
|
||||
|
||||
```sh
|
||||
@ -184,24 +157,22 @@ the command `npm install -g yarn` if needed)
|
||||
6. Configure InvokeAI and install a starting set of image generation models (you only need to do this once):
|
||||
|
||||
```terminal
|
||||
invokeai-configure --root .
|
||||
invokeai-configure
|
||||
```
|
||||
Don't miss the dot at the end!
|
||||
|
||||
7. Launch the web server (do it every time you run InvokeAI):
|
||||
|
||||
```terminal
|
||||
invokeai-web
|
||||
invokeai --web
|
||||
```
|
||||
|
||||
8. Point your browser to http://localhost:9090 to bring up the web interface.
|
||||
|
||||
9. Type `banana sushi` in the box on the top left and click `Invoke`.
|
||||
|
||||
Be sure to activate the virtual environment each time before re-launching InvokeAI,
|
||||
using `source .venv/bin/activate` or `.venv\Scripts\activate`.
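
Whether a virtual environment is active can also be checked from Python. This is a minimal sketch, assuming the environment was created with the standard `venv` module, in which case `sys.prefix` differs from `sys.base_prefix`. The helper name `in_virtualenv` is hypothetical and not part of InvokeAI.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when the interpreter prefix differs from the base prefix."""
    # Defaults read the running interpreter; arguments exist so the
    # comparison can be exercised with explicit values.
    if prefix is None:
        prefix = sys.prefix
    if base_prefix is None:
        base_prefix = sys.base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment active; activate .venv first.")
```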
|
||||
|
||||
## Detailed Installation Instructions
|
||||
### Detailed Installation Instructions
|
||||
|
||||
This fork is supported across Linux, Windows and Macintosh. Linux
|
||||
users can use either an Nvidia-based card (with CUDA support) or an
|
||||
@ -209,111 +180,6 @@ AMD card (using the ROCm driver). For full installation and upgrade
|
||||
instructions, please see:
|
||||
[InvokeAI Installation Overview](https://invoke-ai.github.io/InvokeAI/installation/INSTALL_SOURCE/)
|
||||
|
||||
<a name="migrating-to-3"></a>
|
||||
### Migrating a v2.3 InvokeAI root directory
|
||||
|
||||
The InvokeAI root directory is where the InvokeAI startup file,
|
||||
installed models, and generated images are stored. It is ordinarily
|
||||
named `invokeai` and located in your home directory. The contents and
|
||||
layout of this directory has changed between versions 2.3 and 3.0 and
|
||||
cannot be used directly.
|
||||
|
||||
We currently recommend that you use the installer to create a new root
|
||||
directory named differently from the 2.3 one, e.g. `invokeai-3` and
|
||||
then use a migration script to copy your 2.3 models into the new
|
||||
location. However, if you choose, you can upgrade this directory in
|
||||
place. This section gives both recipes.
|
||||
|
||||
#### Creating a new root directory and migrating old models
|
||||
|
||||
This is the safer recipe because it leaves your old root directory in
|
||||
place to fall back on.
|
||||
|
||||
1. Follow the instructions above to create and install InvokeAI in a
|
||||
directory that has a different name from the 2.3 invokeai directory.
|
||||
In this example, we will use "invokeai-3"
|
||||
|
||||
2. When you are prompted to select models to install, select a minimal
|
||||
set of models, such as stable-diffusion-v1.5 only.
|
||||
|
||||
3. After installation is complete launch `invokeai.sh` (Linux/Mac) or
|
||||
`invokeai.bat` and select option 8 "Open the developers console". This
|
||||
will take you to the command line.
|
||||
|
||||
4. Issue the command `invokeai-migrate3 --from /path/to/v2.3-root --to
|
||||
/path/to/invokeai-3-root`. Provide the correct `--from` and `--to`
|
||||
paths for your v2.3 and v3.0 root directories respectively.
|
||||
|
||||
This will copy and convert your old models from 2.3 format to 3.0
|
||||
format and create a new `models` directory in the 3.0 directory. The
|
||||
old models directory (which contains the models selected at install
|
||||
time) will be renamed `models.orig` and can be deleted once you have
|
||||
confirmed that the migration was successful.
|
||||
|
||||
If you wish, you can pass the 2.3 root directory to both `--from` and
|
||||
`--to` in order to update in place. Warning: this directory will no
|
||||
longer be usable with InvokeAI 2.3.
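
The migration command above can be assembled programmatically, for example to log it before running it. This is a sketch under stated assumptions: the command name `invokeai-migrate3` and the `--from` and `--to` flags come from the text above, while the helper `build_migrate_command` and the example paths are hypothetical. The sketch only builds and prints the command; it does not run it.

```python
import shlex
from pathlib import Path


def build_migrate_command(old_root, new_root):
    """Return (argv, in_place) for the migration command described above."""
    old_root = Path(old_root).expanduser()
    new_root = Path(new_root).expanduser()
    # Passing the same directory to both flags means an in-place upgrade,
    # after which the directory no longer works with 2.3.
    in_place = old_root == new_root
    argv = ["invokeai-migrate3", "--from", str(old_root), "--to", str(new_root)]
    return argv, in_place


if __name__ == "__main__":
    # Example paths are placeholders; substitute your own root directories.
    argv, in_place = build_migrate_command("~/invokeai", "~/invokeai-3")
    if in_place:
        print("Warning: in-place migration; the 2.3 root will no longer be usable with 2.3.")
    print(shlex.join(argv))
```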
|
||||
|
||||
#### Migrating in place
|
||||
|
||||
For the adventurous, you may do an in-place upgrade from 2.3 to 3.0
|
||||
without touching the command line. ***This recipe does not work on
|
||||
Windows platforms due to a bug in the Windows version of the 2.3
|
||||
upgrade script.** See the next section for a Windows recipe.
|
||||
|
||||
##### For Mac and Linux Users:
|
||||
|
||||
1. Launch the InvokeAI launcher script in your current v2.3 root directory.

2. Select option [9] "Update InvokeAI" to bring up the updater dialog.

3. Select option [1] to upgrade to the latest release.

4. Once the upgrade is finished you will be returned to the launcher
menu. Select option [7] "Re-run the configure script to fix a broken
install or to complete a major upgrade".
|
||||
|
||||
This will run the configure script against the v2.3 directory and
|
||||
update it to the 3.0 format. The following files will be replaced:
|
||||
|
||||
- The invokeai.init file, replaced by invokeai.yaml
- The models directory
- The configs/models.yaml model index
|
||||
|
||||
The original versions of these files will be saved with the suffix
|
||||
".orig" appended to the end. Once you have confirmed that the upgrade
|
||||
worked, you can safely remove these files. Alternatively you can
|
||||
restore a working v2.3 directory by removing the new files and
|
||||
restoring the ".orig" files' original names.
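
The rollback described above can be sketched as a short script. Several things here are assumptions rather than documented behaviour: the exact backup names (each entry's name with `.orig` appended), the helper name `restore_orig`, and the decision to leave the new `invokeai.yaml` in place for manual removal. The function defaults to a dry run because the real run deletes the 3.0 versions of the listed entries.

```python
import shutil
from pathlib import Path

# Entries the text above says are replaced, relative to the root directory.
REPLACED_ENTRIES = ("invokeai.init", "models", "configs/models.yaml")


def restore_orig(root, dry_run=True):
    """Restore '<name>.orig' backups; return the entries that have a backup."""
    root = Path(root)
    restored = []
    for name in REPLACED_ENTRIES:
        target = root / name
        # Assumed backup naming: the original name with ".orig" appended.
        backup = target.with_name(target.name + ".orig")
        if not backup.exists():
            continue
        if not dry_run:
            # Remove the new version, then move the backup back into place.
            if target.is_dir():
                shutil.rmtree(target)
            elif target.exists():
                target.unlink()
            backup.rename(target)
        restored.append(name)
    return restored


if __name__ == "__main__":
    # Dry run against the current directory: lists what would be restored.
    print(restore_orig(Path.cwd()))
```

Call `restore_orig(root, dry_run=False)` only after checking the dry-run output.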
|
||||
|
||||
##### For Windows Users:
|
||||
|
||||
Windows Users can upgrade with the
|
||||
|
||||
1. Enter the 2.3 root directory you wish to upgrade
2. Launch `invoke.sh` or `invoke.bat`
3. Select the "Developer's console" option [8]
4. Type the following commands
|
||||
|
||||
```
pip install "invokeai @ https://github.com/invoke-ai/InvokeAI/archive/refs/tags/v3.0.0" --use-pep517 --upgrade
invokeai-configure --root .
```
|
||||
(Replace `v3.0.0` with the current release number if this document is out of date).
|
||||
|
||||
The first command will install and upgrade new software to run
|
||||
InvokeAI. The second will prepare the 2.3 directory for use with 3.0.
|
||||
You may now launch the WebUI in the usual way, by selecting option [1]
|
||||
from the launcher script.
|
||||
|
||||
#### Migration Caveats
|
||||
|
||||
The migration script will migrate your invokeai settings and models,
|
||||
including textual inversion models, LoRAs and merges that you may have
|
||||
installed previously. However, it does **not** migrate the generated
|
||||
images stored in your 2.3-format outputs directory. You will need to
|
||||
manually import selected images into the 3.0 gallery via drag-and-drop.
|
||||
|
||||
## Hardware Requirements
|
||||
|
||||
InvokeAI is supported across Linux, Windows and macOS. Linux
|
||||
@ -324,20 +190,21 @@ AMD card (using the ROCm driver).
|
||||
|
||||
You will need one of the following:
|
||||
|
||||
- An NVIDIA-based graphics card with 4 GB or more VRAM memory. 6-8 GB
|
||||
of VRAM is highly recommended for rendering using the Stable
|
||||
Diffusion XL models
|
||||
- An NVIDIA-based graphics card with 4 GB or more VRAM memory.
|
||||
- An Apple computer with an M1 chip.
|
||||
- An AMD-based graphics card with 4GB or more VRAM memory (Linux
|
||||
only), 6-8 GB for XL rendering.
|
||||
- An AMD-based graphics card with 4GB or more VRAM memory. (Linux only)
|
||||
|
||||
We do not recommend the GTX 1650 or 1660 series video cards. They are
|
||||
unable to run in half-precision mode and do not have sufficient VRAM
|
||||
to render 512x512 images.
|
||||
|
||||
**Memory** - At least 12 GB Main Memory RAM.
|
||||
### Memory
|
||||
|
||||
**Disk** - At least 12 GB of free disk space for the machine learning model, Python, and all its dependencies.
|
||||
- At least 12 GB of main memory (RAM).
|
||||
|
||||
### Disk
|
||||
|
||||
- At least 12 GB of free disk space for the machine learning model, Python, and all its dependencies.
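
A quick way to check the disk requirement before installing is sketched below. It is a minimal example using only the Python standard library; the function name `has_enough_disk` is hypothetical, and treating 12 GB as 12 × 10⁹ bytes is an assumption, since the text does not say whether decimal or binary gigabytes are meant.

```python
import shutil

REQUIRED_GB = 12  # figure quoted above; interpreted as 12 * 10**9 bytes (assumption)


def has_enough_disk(path: str = ".", required_gb: float = REQUIRED_GB) -> bool:
    """Return True if the filesystem holding `path` has the required free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 10**9


if __name__ == "__main__":
    print("enough disk:", has_enough_disk())
```

Point `path` at the directory where InvokeAI will be installed, since free space is reported per filesystem.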
|
||||
|
||||
## Features
|
||||
|
||||
@ -351,23 +218,28 @@ InvokeAI offers a locally hosted Web Server & React Frontend, with an industry l
|
||||
|
||||
The Unified Canvas is a fully integrated canvas implementation with support for all core generation capabilities, in/outpainting, brush tools, and more. This creative tool unlocks the capability for artists to create with AI as a creative collaborator, and can be used to augment AI-generated imagery, sketches, photography, renders, and more.
|
||||
|
||||
### *Node Architecture & Editor (Beta)*
|
||||
### *Advanced Prompt Syntax*
|
||||
|
||||
Invoke AI's backend is built on a graph-based execution architecture. This allows for customizable generation pipelines to be developed by professional users looking to create specific workflows to support their production use-cases, and will be extended in the future with additional capabilities.
|
||||
InvokeAI's advanced prompt syntax allows for token weighting, cross-attention control, and prompt blending, allowing for fine-tuned tweaking of your invocations and exploration of the latent space.
|
||||
|
||||
### *Board & Gallery Management*
|
||||
### *Command Line Interface*
|
||||
|
||||
Invoke AI provides an organized gallery system for easily storing, accessing, and remixing your content in the Invoke workspace. Images can be dragged and dropped onto any image-based UI element in the application, and rich metadata within the image allows for easy recall of key prompts or settings used in your workflow.
|
||||
For users utilizing a terminal-based environment, or who want to take advantage of CLI features, InvokeAI offers an extensive and actively supported command-line interface that provides the full suite of generation functionality available in the tool.
|
||||
|
||||
### Other features
|
||||
|
||||
- *Support for both ckpt and diffusers models*
|
||||
- *SD 2.0, 2.1, XL support*
|
||||
- *Upscaling Tools*
|
||||
- *SD 2.0, 2.1 support*
|
||||
- *Noise Control & Thresholding*
|
||||
- *Popular Sampler Support*
|
||||
- *Upscaling & Face Restoration Tools*
|
||||
- *Embedding Manager & Support*
|
||||
- *Model Manager & Support*
|
||||
- *Node-Based Architecture*
|
||||
- *Node-Based Plug-&-Play UI (Beta)*
|
||||
|
||||
### Coming Soon
|
||||
|
||||
- *Node-Based Architecture & UI*
|
||||
- And more...
|
||||
|
||||
### Latest Changes
|
||||
|
||||
@ -375,7 +247,7 @@ For our latest changes, view our [Release
|
||||
Notes](https://github.com/invoke-ai/InvokeAI/releases) and the
|
||||
[CHANGELOG](docs/CHANGELOG.md).
|
||||
|
||||
### Troubleshooting
|
||||
## Troubleshooting
|
||||
|
||||
Please check out our **[Q&A](https://invoke-ai.github.io/InvokeAI/help/TROUBLESHOOT/#faq)** to get solutions for common installation
|
||||
problems and other issues.
|
||||
@ -405,6 +277,8 @@ This fork is a combined effort of various people from across the world.
|
||||
[Check out the list of all these amazing people](https://invoke-ai.github.io/InvokeAI/other/CONTRIBUTORS/). We thank them for
|
||||
their time, hard work and effort.
|
||||
|
||||
Thanks to [Weblate](https://weblate.org/) for generously providing translation services to this project.
|
||||
|
||||
### Support
|
||||
|
||||
For support, please use this repository's GitHub Issues tracking service, or join the Discord.
|
||||
|
BIN
binary_installer/WinLongPathsEnabled.reg
Normal file
164
binary_installer/install.bat.in
Normal file
@ -0,0 +1,164 @@
|
||||
@echo off
|
||||
|
||||
@rem This script will install git (if not found on the PATH variable)
|
||||
@rem using micromamba (an 8mb static-linked single-file binary, conda replacement).
|
||||
@rem For users who already have git, this step will be skipped.
|
||||
|
||||
@rem Next, it'll download the project's source code.
|
||||
@rem Then it will download a self-contained, standalone Python and unpack it.
|
||||
@rem Finally, it'll create the Python virtual environment and preload the models.
|
||||
|
||||
@rem This enables a user to install this project without manually installing git or Python
|
||||
|
||||
@rem change to the script's directory
|
||||
PUSHD "%~dp0"
|
||||
|
||||
set "no_cache_dir=--no-cache-dir"
|
||||
if "%1" == "use-cache" (
|
||||
set "no_cache_dir="
|
||||
)
|
||||
|
||||
echo ***** Installing InvokeAI.. *****
|
||||
@rem Config
|
||||
set INSTALL_ENV_DIR=%cd%\installer_files\env
|
||||
@rem https://mamba.readthedocs.io/en/latest/installation.html
|
||||
set MICROMAMBA_DOWNLOAD_URL=https://github.com/cmdr2/stable-diffusion-ui/releases/download/v1.1/micromamba.exe
|
||||
set RELEASE_URL=https://github.com/invoke-ai/InvokeAI
|
||||
set RELEASE_SOURCEBALL=/archive/refs/heads/main.tar.gz
|
||||
set PYTHON_BUILD_STANDALONE_URL=https://github.com/indygreg/python-build-standalone/releases/download
|
||||
set PYTHON_BUILD_STANDALONE=20221002/cpython-3.10.7+20221002-x86_64-pc-windows-msvc-shared-install_only.tar.gz
|
||||
|
||||
set PACKAGES_TO_INSTALL=
|
||||
|
||||
call git --version >.tmp1 2>.tmp2
|
||||
if "%ERRORLEVEL%" NEQ "0" set PACKAGES_TO_INSTALL=%PACKAGES_TO_INSTALL% git
|
||||
|
||||
@rem Cleanup
|
||||
del /q .tmp1 .tmp2
|
||||
|
||||
@rem (if necessary) install git into a contained environment
|
||||
if "%PACKAGES_TO_INSTALL%" NEQ "" (
|
||||
@rem download micromamba
|
||||
echo ***** Downloading micromamba from %MICROMAMBA_DOWNLOAD_URL% to micromamba.exe *****
|
||||
|
||||
call curl -L "%MICROMAMBA_DOWNLOAD_URL%" > micromamba.exe
|
||||
|
||||
@rem test the mamba binary
|
||||
echo ***** Micromamba version: *****
|
||||
call micromamba.exe --version
|
||||
|
||||
@rem create the installer env
|
||||
if not exist "%INSTALL_ENV_DIR%" (
|
||||
call micromamba.exe create -y --prefix "%INSTALL_ENV_DIR%"
|
||||
)
|
||||
|
||||
echo ***** Packages to install:%PACKAGES_TO_INSTALL% *****
|
||||
|
||||
call micromamba.exe install -y --prefix "%INSTALL_ENV_DIR%" -c conda-forge %PACKAGES_TO_INSTALL%
|
||||
|
||||
if not exist "%INSTALL_ENV_DIR%" (
|
||||
echo ----- There was a problem while installing "%PACKAGES_TO_INSTALL%" using micromamba. Cannot continue. -----
|
||||
pause
|
||||
exit /b
|
||||
)
|
||||
)
|
||||
|
||||
del /q micromamba.exe
|
||||
|
||||
@rem For 'git' only
|
||||
set PATH=%INSTALL_ENV_DIR%\Library\bin;%PATH%
|
||||
|
||||
@rem Download/unpack/clean up InvokeAI release sourceball
|
||||
set err_msg=----- InvokeAI source download failed -----
|
||||
echo Trying to download "%RELEASE_URL%%RELEASE_SOURCEBALL%"
|
||||
curl -L %RELEASE_URL%%RELEASE_SOURCEBALL% --output InvokeAI.tgz
|
||||
if %errorlevel% neq 0 goto err_exit
|
||||
|
||||
set err_msg=----- InvokeAI source unpack failed -----
|
||||
tar -zxf InvokeAI.tgz
|
||||
if %errorlevel% neq 0 goto err_exit
|
||||
|
||||
del /q InvokeAI.tgz
|
||||
|
||||
set err_msg=----- InvokeAI source copy failed -----
|
||||
cd InvokeAI-*
|
||||
xcopy . .. /e /h
|
||||
if %errorlevel% neq 0 goto err_exit
|
||||
cd ..
|
||||
|
||||
@rem cleanup
|
||||
for /f %%i in ('dir /b InvokeAI-*') do rd /s /q %%i
|
||||
rd /s /q .dev_scripts .github docker-build tests
|
||||
del /q requirements.in requirements-mkdocs.txt shell.nix
|
||||
|
||||
echo ***** Unpacked InvokeAI source *****
|
||||
|
||||
@rem Download/unpack/clean up python-build-standalone
|
||||
set err_msg=----- Python download failed -----
|
||||
curl -L %PYTHON_BUILD_STANDALONE_URL%/%PYTHON_BUILD_STANDALONE% --output python.tgz
|
||||
if %errorlevel% neq 0 goto err_exit
|
||||
|
||||
set err_msg=----- Python unpack failed -----
|
||||
tar -zxf python.tgz
|
||||
if %errorlevel% neq 0 goto err_exit
|
||||
|
||||
del /q python.tgz
|
||||
|
||||
echo ***** Unpacked python-build-standalone *****
|
||||
|
||||
@rem create venv
|
||||
set err_msg=----- problem creating venv -----
|
||||
.\python\python -E -s -m venv .venv
|
||||
if %errorlevel% neq 0 goto err_exit
|
||||
call .venv\Scripts\activate.bat
|
||||
|
||||
echo ***** Created Python virtual environment *****
|
||||
|
||||
@rem Print venv's Python version
|
||||
set err_msg=----- problem calling venv's python -----
|
||||
echo We're running under
|
||||
.venv\Scripts\python --version
|
||||
if %errorlevel% neq 0 goto err_exit
|
||||
|
||||
set err_msg=----- pip update failed -----
|
||||
.venv\Scripts\python -m pip install %no_cache_dir% --no-warn-script-location --upgrade pip wheel
|
||||
if %errorlevel% neq 0 goto err_exit
|
||||
|
||||
echo ***** Updated pip and wheel *****
|
||||
|
||||
set err_msg=----- requirements file copy failed -----
|
||||
copy binary_installer\py3.10-windows-x86_64-cuda-reqs.txt requirements.txt
|
||||
if %errorlevel% neq 0 goto err_exit
|
||||
|
||||
set err_msg=----- main pip install failed -----
|
||||
.venv\Scripts\python -m pip install %no_cache_dir% --no-warn-script-location -r requirements.txt
|
||||
if %errorlevel% neq 0 goto err_exit
|
||||
|
||||
echo ***** Installed Python dependencies *****
|
||||
|
||||
set err_msg=----- InvokeAI setup failed -----
|
||||
.venv\Scripts\python -m pip install %no_cache_dir% --no-warn-script-location -e .
|
||||
if %errorlevel% neq 0 goto err_exit
|
||||
|
||||
copy binary_installer\invoke.bat.in .\invoke.bat
|
||||
echo ***** Installed invoke launcher script ******
|
||||
|
||||
@rem more cleanup
|
||||
rd /s /q binary_installer installer_files
|
||||
|
||||
@rem preload the models
|
||||
call .venv\Scripts\python ldm\invoke\config\invokeai_configure.py
|
||||
set err_msg=----- model download clone failed -----
|
||||
if %errorlevel% neq 0 goto err_exit
|
||||
deactivate
|
||||
|
||||
echo ***** Finished downloading models *****
|
||||
|
||||
echo All done! Execute the file invoke.bat in this directory to start InvokeAI
|
||||
pause
|
||||
exit
|
||||
|
||||
:err_exit
|
||||
echo %err_msg%
|
||||
pause
|
||||
exit
|
235
binary_installer/install.sh.in
Normal file
@ -0,0 +1,235 @@
|
||||
#!/usr/bin/env bash
|
||||
|
||||
# ensure we're in the correct folder in case user's CWD is somewhere else
|
||||
scriptdir=$(dirname "$0")
|
||||
cd "$scriptdir"
|
||||
|
||||
set -euo pipefail
|
||||
IFS=$'\n\t'
|
||||
|
||||
function _err_exit {
|
||||
if test "$1" -ne 0
|
||||
then
|
||||
echo -e "Error code $1; Error caught was '$2'"
|
||||
read -p "Press any key to exit..."
|
||||
exit
|
||||
fi
|
||||
}
|
||||
|
||||
# This script will install git (if not found on the PATH variable)
|
||||
# using micromamba (an 8mb static-linked single-file binary, conda replacement).
|
||||
# For users who already have git, this step will be skipped.
|
||||
|
||||
# Next, it'll download the project's source code.
|
||||
# Then it will download a self-contained, standalone Python and unpack it.
|
||||
# Finally, it'll create the Python virtual environment and preload the models.
|
||||
|
||||
# This enables a user to install this project without manually installing git or Python
|
||||
|
||||
echo -e "\n***** Installing InvokeAI into $(pwd)... *****\n"
|
||||
|
||||
export no_cache_dir="--no-cache-dir"
|
||||
if [ $# -ge 1 ]; then
|
||||
if [ "$1" = "use-cache" ]; then
|
||||
export no_cache_dir=""
|
||||
fi
|
||||
fi
|
||||
|
||||
|
||||
OS_NAME=$(uname -s)
|
||||
case "${OS_NAME}" in
|
||||
Linux*) OS_NAME="linux";;
|
||||
Darwin*) OS_NAME="darwin";;
|
||||
*) echo -e "\n----- Unknown OS: $OS_NAME! This script runs only on Linux or macOS -----\n" && exit
|
||||
esac
|
||||
|
||||
OS_ARCH=$(uname -m)
|
||||
case "${OS_ARCH}" in
|
||||
x86_64*) ;;
|
||||
arm64*) ;;
|
||||
*) echo -e "\n----- Unknown system architecture: $OS_ARCH! This script runs only on x86_64 or arm64 -----\n" && exit
|
||||
esac
|
||||
|
||||
# https://mamba.readthedocs.io/en/latest/installation.html
|
||||
MAMBA_OS_NAME=$OS_NAME
|
||||
MAMBA_ARCH=$OS_ARCH
|
||||
if [ "$OS_NAME" == "darwin" ]; then
|
||||
MAMBA_OS_NAME="osx"
|
||||
fi
|
||||
|
||||
if [ "$OS_ARCH" == "linux" ]; then
|
||||
MAMBA_ARCH="aarch64"
|
||||
fi
|
||||
|
||||
if [ "$OS_ARCH" == "x86_64" ]; then
|
||||
MAMBA_ARCH="64"
|
||||
fi
|
||||
|
||||
PY_ARCH=$OS_ARCH
|
||||
if [ "$OS_ARCH" == "arm64" ]; then
|
||||
PY_ARCH="aarch64"
|
||||
fi
|
||||
|
||||
# Compute device ('cd' segment of reqs files) detect goes here
|
||||
# This needs a ton of work
|
||||
# Suggestions:
|
||||
# - lspci
|
||||
# - check $PATH for nvidia-smi, get CUDA/GPU version from output
|
||||
# - Surely there's a similar utility for AMD?
|
||||
CD="cuda"
|
||||
if [ "$OS_NAME" == "darwin" ] && [ "$OS_ARCH" == "arm64" ]; then
|
||||
CD="mps"
|
||||
fi
|
||||
|
||||
# config
|
||||
INSTALL_ENV_DIR="$(pwd)/installer_files/env"
|
||||
MICROMAMBA_DOWNLOAD_URL="https://micro.mamba.pm/api/micromamba/${MAMBA_OS_NAME}-${MAMBA_ARCH}/latest"
|
||||
RELEASE_URL=https://github.com/invoke-ai/InvokeAI
|
||||
RELEASE_SOURCEBALL=/archive/refs/heads/main.tar.gz
|
||||
PYTHON_BUILD_STANDALONE_URL=https://github.com/indygreg/python-build-standalone/releases/download
|
||||
if [ "$OS_NAME" == "darwin" ]; then
|
||||
PYTHON_BUILD_STANDALONE=20221002/cpython-3.10.7+20221002-${PY_ARCH}-apple-darwin-install_only.tar.gz
|
||||
elif [ "$OS_NAME" == "linux" ]; then
|
||||
PYTHON_BUILD_STANDALONE=20221002/cpython-3.10.7+20221002-${PY_ARCH}-unknown-linux-gnu-install_only.tar.gz
|
||||
fi
|
||||
echo "INSTALLING $RELEASE_SOURCEBALL FROM $RELEASE_URL"
|
||||
|
||||
PACKAGES_TO_INSTALL=""
|
||||
|
||||
if ! hash "git" &>/dev/null; then PACKAGES_TO_INSTALL="$PACKAGES_TO_INSTALL git"; fi
|
||||
|
||||
# (if necessary) install git and conda into a contained environment
|
||||
if [ "$PACKAGES_TO_INSTALL" != "" ]; then
|
||||
# download micromamba
|
||||
echo -e "\n***** Downloading micromamba from $MICROMAMBA_DOWNLOAD_URL to micromamba *****\n"
|
||||
|
||||
curl -L "$MICROMAMBA_DOWNLOAD_URL" | tar -xvjO bin/micromamba > micromamba
|
||||
|
||||
chmod u+x ./micromamba
|
||||
|
||||
# test the mamba binary
|
||||
echo -e "\n***** Micromamba version: *****\n"
|
||||
./micromamba --version
|
||||
|
||||
# create the installer env
|
||||
if [ ! -e "$INSTALL_ENV_DIR" ]; then
|
||||
./micromamba create -y --prefix "$INSTALL_ENV_DIR"
|
||||
fi
|
||||
|
||||
echo -e "\n***** Packages to install:$PACKAGES_TO_INSTALL *****\n"
|
||||
|
||||
./micromamba install -y --prefix "$INSTALL_ENV_DIR" -c conda-forge "$PACKAGES_TO_INSTALL"
|
||||
|
||||
if [ ! -e "$INSTALL_ENV_DIR" ]; then
|
||||
echo -e "\n----- There was a problem while initializing micromamba. Cannot continue. -----\n"
|
||||
exit
|
||||
fi
|
||||
fi
|
||||
|
||||
rm -f micromamba
|
||||
|
||||
export PATH="$INSTALL_ENV_DIR/bin:$PATH"
|
||||
|
||||
# Download/unpack/clean up InvokeAI release sourceball
|
||||
_err_msg="\n----- InvokeAI source download failed -----\n"
|
||||
curl -L $RELEASE_URL/$RELEASE_SOURCEBALL --output InvokeAI.tgz
|
||||
_err_exit $? "$_err_msg"
|
||||
_err_msg="\n----- InvokeAI source unpack failed -----\n"
|
||||
tar -zxf InvokeAI.tgz
|
||||
_err_exit $? "$_err_msg"
|
||||
|
||||
rm -f InvokeAI.tgz
|
||||
|
||||
_err_msg="\n----- InvokeAI source copy failed -----\n"
|
||||
cd InvokeAI-*
|
||||
cp -r . ..
|
||||
_err_exit $? "$_err_msg"
|
||||
cd ..
|
||||
|
||||
# cleanup
|
||||
rm -rf InvokeAI-*/
|
||||
rm -rf .dev_scripts/ .github/ docker-build/ tests/ requirements.in requirements-mkdocs.txt shell.nix
|
||||
|
||||
echo -e "\n***** Unpacked InvokeAI source *****\n"
|
||||
|
||||
# Download/unpack/clean up python-build-standalone
|
||||
_err_msg="\n----- Python download failed -----\n"
|
||||
curl -L $PYTHON_BUILD_STANDALONE_URL/$PYTHON_BUILD_STANDALONE --output python.tgz
|
||||
_err_exit $? "$_err_msg"
|
||||
_err_msg="\n----- Python unpack failed -----\n"
|
||||
tar -zxf python.tgz
|
||||
_err_exit $? "$_err_msg"
|
||||
|
||||
rm -f python.tgz
|
||||
|
||||
echo -e "\n***** Unpacked python-build-standalone *****\n"
|
||||
|
||||
# create venv
|
||||
_err_msg="\n----- problem creating venv -----\n"
|
||||
|
||||
if [ "$OS_NAME" == "darwin" ]; then
|
||||
# patch sysconfig so that extensions can build properly
|
||||
# adapted from https://github.com/cashapp/hermit-packages/commit/fcba384663892f4d9cfb35e8639ff7a28166ee43
|
||||
PYTHON_INSTALL_DIR="$(pwd)/python"
|
||||
SYSCONFIG="$(echo python/lib/python*/_sysconfigdata_*.py)"
|
||||
TMPFILE="$(mktemp)"
|
||||
chmod +w "${SYSCONFIG}"
|
||||
cp "${SYSCONFIG}" "${TMPFILE}"
|
||||
sed "s,'/install,'${PYTHON_INSTALL_DIR},g" "${TMPFILE}" > "${SYSCONFIG}"
|
||||
rm -f "${TMPFILE}"
|
||||
fi
|
||||
|
||||
./python/bin/python3 -E -s -m venv .venv
|
||||
_err_exit $? "$_err_msg"
|
||||
source .venv/bin/activate
|
||||
|
||||
echo -e "\n***** Created Python virtual environment *****\n"
|
||||
|
||||
# Print venv's Python version
|
||||
_err_msg="\n----- problem calling venv's python -----\n"
|
||||
echo -e "We're running under"
|
||||
.venv/bin/python3 --version
|
||||
_err_exit $? "$_err_msg"
|
||||
|
||||
_err_msg="\n----- pip update failed -----\n"
|
||||
.venv/bin/python3 -m pip install $no_cache_dir --no-warn-script-location --upgrade pip
|
||||
_err_exit $? "$_err_msg"
|
||||
|
||||
echo -e "\n***** Updated pip *****\n"
|
||||
|
||||
_err_msg="\n----- requirements file copy failed -----\n"
|
||||
cp binary_installer/py3.10-${OS_NAME}-"${OS_ARCH}"-${CD}-reqs.txt requirements.txt
|
||||
_err_exit $? "$_err_msg"
|
||||
|
||||
_err_msg="\n----- main pip install failed -----\n"
|
||||
.venv/bin/python3 -m pip install $no_cache_dir --no-warn-script-location -r requirements.txt
|
||||
_err_exit $? "$_err_msg"
|
||||
|
||||
echo -e "\n***** Installed Python dependencies *****\n"
|
||||
|
||||
_err_msg="\n----- InvokeAI setup failed -----\n"
|
||||
.venv/bin/python3 -m pip install $no_cache_dir --no-warn-script-location -e .
|
||||
_err_exit $? "$_err_msg"
|
||||
|
||||
echo -e "\n***** Installed InvokeAI *****\n"
|
||||
|
||||
cp binary_installer/invoke.sh.in ./invoke.sh
|
||||
chmod a+rx ./invoke.sh
|
||||
echo -e "\n***** Installed invoke launcher script ******\n"
|
||||
|
||||
# more cleanup
|
||||
rm -rf binary_installer/ installer_files/
|
||||
|
||||
# preload the models
|
||||
.venv/bin/python3 scripts/configure_invokeai.py
|
||||
_err_msg="\n----- model download clone failed -----\n"
|
||||
_err_exit $? "$_err_msg"
|
||||
deactivate
|
||||
|
||||
echo -e "\n***** Finished downloading models *****\n"
|
||||
|
||||
echo "All done! Run the command"
|
||||
echo " $scriptdir/invoke.sh"
|
||||
echo "to start InvokeAI."
|
||||
read -p "Press any key to exit..."
|
||||
exit
|
36
binary_installer/invoke.bat.in
Normal file
@ -0,0 +1,36 @@
|
||||
@echo off
|
||||
|
||||
PUSHD "%~dp0"
|
||||
call .venv\Scripts\activate.bat
|
||||
|
||||
echo Do you want to generate images using the
|
||||
echo 1. command-line
|
||||
echo 2. browser-based UI
|
||||
echo OR
|
||||
echo 3. open the developer console
|
||||
set /p choice="Please enter 1, 2 or 3: "
|
||||
if /i "%choice%" == "1" (
|
||||
echo Starting the InvokeAI command-line.
|
||||
.venv\Scripts\python scripts\invoke.py %*
|
||||
) else if /i "%choice%" == "2" (
|
||||
echo Starting the InvokeAI browser-based UI.
|
||||
.venv\Scripts\python scripts\invoke.py --web %*
|
||||
) else if /i "%choice%" == "3" (
|
||||
echo Developer Console
|
||||
echo Python command is:
|
||||
where python
|
||||
echo Python version is:
|
||||
python --version
|
||||
echo *************************
|
||||
echo You are now in the system shell, with the local InvokeAI Python virtual environment activated,
|
||||
echo so that you can troubleshoot this InvokeAI installation as necessary.
|
||||
echo *************************
|
||||
echo *** Type `exit` to quit this shell and deactivate the Python virtual environment ***
|
||||
call cmd /k
|
||||
) else (
|
||||
echo Invalid selection
|
||||
pause
|
||||
exit /b
|
||||
)
|
||||
|
||||
deactivate
|
46
binary_installer/invoke.sh.in
Normal file
@ -0,0 +1,46 @@
|
||||
#!/usr/bin/env sh
|
||||
|
||||
set -eu
|
||||
|
||||
. .venv/bin/activate
|
||||
|
||||
# set required env var for torch on mac MPS
|
||||
if [ "$(uname -s)" = "Darwin" ]; then
|
||||
export PYTORCH_ENABLE_MPS_FALLBACK=1
|
||||
fi
|
||||
|
||||
echo "Do you want to generate images using the"
|
||||
echo "1. command-line"
|
||||
echo "2. browser-based UI"
|
||||
echo "OR"
|
||||
echo "3. open the developer console"
|
||||
echo "Please enter 1, 2, or 3:"
|
||||
read choice
|
||||
|
||||
case $choice in
|
||||
1)
|
||||
printf "\nStarting the InvokeAI command-line..\n";
|
||||
.venv/bin/python scripts/invoke.py $*;
|
||||
;;
|
||||
2)
|
||||
printf "\nStarting the InvokeAI browser-based UI..\n";
|
||||
.venv/bin/python scripts/invoke.py --web $*;
|
||||
;;
|
||||
3)
|
||||
printf "\nDeveloper Console:\n";
|
||||
printf "Python command is:\n\t";
|
||||
which python;
|
||||
printf "Python version is:\n\t";
|
||||
python --version;
|
||||
echo "*************************"
|
||||
echo "You are now in your user shell ($SHELL) with the local InvokeAI Python virtual environment activated,";
|
||||
echo "so that you can troubleshoot this InvokeAI installation as necessary.";
|
||||
printf "*************************\n"
|
||||
echo "*** Type \`exit\` to quit this shell and deactivate the Python virtual environment *** ";
|
||||
/usr/bin/env "$SHELL";
|
||||
;;
|
||||
*)
|
||||
echo "Invalid selection";
|
||||
exit
|
||||
;;
|
||||
esac
|
2097
binary_installer/py3.10-darwin-arm64-mps-reqs.txt
Normal file
2077
binary_installer/py3.10-darwin-x86_64-cpu-reqs.txt
Normal file
2103
binary_installer/py3.10-linux-x86_64-cuda-reqs.txt
Normal file
2109
binary_installer/py3.10-windows-x86_64-cuda-reqs.txt
Normal file
17
binary_installer/readme.txt
Normal file
@ -0,0 +1,17 @@
|
||||
InvokeAI
|
||||
|
||||
Project homepage: https://github.com/invoke-ai/InvokeAI
|
||||
|
||||
Installation on Windows:
|
||||
NOTE: You might need to enable Windows Long Paths. If you're not sure,
|
||||
then you almost certainly need to. Simply double-click the 'WinLongPathsEnabled.reg'
|
||||
file. Note that you will need to have admin privileges in order to
|
||||
do this.
|
||||
|
||||
Please double-click the 'install.bat' file (while keeping it inside the invokeAI folder).
|
||||
|
||||
Installation on Linux and Mac:
|
||||
Please open the terminal, and run './install.sh' (while keeping it inside the invokeAI folder).
|
||||
|
||||
After installation, please run the 'invoke.bat' file (on Windows) or 'invoke.sh'
|
||||
file (on Linux/Mac) to start InvokeAI.
|
33
binary_installer/requirements.in
Normal file
@ -0,0 +1,33 @@
|
||||
--prefer-binary
|
||||
--extra-index-url https://download.pytorch.org/whl/torch_stable.html
|
||||
--extra-index-url https://download.pytorch.org/whl/cu116
|
||||
--trusted-host https://download.pytorch.org
|
||||
accelerate~=0.15
|
||||
albumentations
|
||||
diffusers[torch]~=0.11
|
||||
einops
|
||||
eventlet
|
||||
flask_cors
|
||||
flask_socketio
|
||||
flaskwebgui==1.0.3
|
||||
getpass_asterisk
|
||||
imageio-ffmpeg
|
||||
pyreadline3
|
||||
realesrgan
|
||||
send2trash
|
||||
streamlit
|
||||
taming-transformers-rom1504
|
||||
test-tube
|
||||
torch-fidelity
|
||||
torch==1.12.1 ; platform_system == 'Darwin'
|
||||
torch==1.12.0+cu116 ; platform_system == 'Linux' or platform_system == 'Windows'
|
||||
torchvision==0.13.1 ; platform_system == 'Darwin'
|
||||
torchvision==0.13.0+cu116 ; platform_system == 'Linux' or platform_system == 'Windows'
|
||||
transformers
|
||||
picklescan
|
||||
https://github.com/openai/CLIP/archive/d50d76daa670286dd6cacf3bcd80b5e4823fc8e1.zip
|
||||
https://github.com/invoke-ai/clipseg/archive/1f754751c85d7d4255fa681f4491ff5711c1c288.zip
|
||||
https://github.com/invoke-ai/GFPGAN/archive/3f5d2397361199bc4a91c08bb7d80f04d7805615.zip ; platform_system=='Windows'
|
||||
https://github.com/invoke-ai/GFPGAN/archive/c796277a1cf77954e5fc0b288d7062d162894248.zip ; platform_system=='Linux' or platform_system=='Darwin'
|
||||
https://github.com/Birch-san/k-diffusion/archive/363386981fee88620709cf8f6f2eea167bd6cd74.zip
|
||||
https://github.com/invoke-ai/PyPatchMatch/archive/129863937a8ab37f6bbcec327c994c0f932abdbc.zip
|
@ -1,13 +0,0 @@
|
||||
## Make a copy of this file named `.env` and fill in the values below.
|
||||
## Any environment variables supported by InvokeAI can be specified here.
|
||||
|
||||
# INVOKEAI_ROOT is the path to a directory on the local filesystem where InvokeAI will store data.
|
||||
# Outputs will also be stored here by default.
|
||||
# This **must** be an absolute path.
|
||||
INVOKEAI_ROOT=
|
||||
|
||||
HUGGINGFACE_TOKEN=
|
||||
|
||||
## optional variables specific to the docker setup
|
||||
# GPU_DRIVER=cuda
|
||||
# CONTAINER_UID=1000
|
@ -1,129 +1,107 @@
|
||||
# syntax=docker/dockerfile:1.4
|
||||
# syntax=docker/dockerfile:1
|
||||
|
||||
## Builder stage
|
||||
ARG PYTHON_VERSION=3.9
|
||||
##################
|
||||
## base image ##
|
||||
##################
|
||||
FROM --platform=${TARGETPLATFORM} python:${PYTHON_VERSION}-slim AS python-base
|
||||
|
||||
FROM library/ubuntu:22.04 AS builder
|
||||
LABEL org.opencontainers.image.authors="mauwii@outlook.de"
|
||||
|
||||
ARG DEBIAN_FRONTEND=noninteractive
|
||||
RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
|
||||
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
|
||||
--mount=type=cache,target=/var/lib/apt,sharing=locked \
|
||||
apt update && apt-get install -y \
|
||||
git \
|
||||
python3.10-venv \
|
||||
python3-pip \
|
||||
build-essential
|
||||
# Prepare apt for buildkit cache
|
||||
RUN rm -f /etc/apt/apt.conf.d/docker-clean \
|
||||
&& echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' >/etc/apt/apt.conf.d/keep-cache
|
||||
|
||||
ENV INVOKEAI_SRC=/opt/invokeai
|
||||
ENV VIRTUAL_ENV=/opt/venv/invokeai
|
||||
# Install dependencies
|
||||
RUN \
|
||||
--mount=type=cache,target=/var/cache/apt,sharing=locked \
|
||||
--mount=type=cache,target=/var/lib/apt,sharing=locked \
|
||||
apt-get update \
|
||||
&& apt-get install -y \
|
||||
--no-install-recommends \
|
||||
libgl1-mesa-glx=20.3.* \
|
||||
libglib2.0-0=2.66.* \
|
||||
libopencv-dev=4.5.*
|
||||
|
||||
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
|
||||
ARG TORCH_VERSION=2.0.1
|
||||
ARG TORCHVISION_VERSION=0.15.2
|
||||
ARG GPU_DRIVER=cuda
|
||||
ARG TARGETPLATFORM="linux/amd64"
|
||||
# unused but available
|
||||
ARG BUILDPLATFORM
|
||||
# Set working directory and env
|
||||
ARG APPDIR=/usr/src
|
||||
ARG APPNAME=InvokeAI
|
||||
WORKDIR ${APPDIR}
|
||||
ENV PATH ${APPDIR}/${APPNAME}/bin:$PATH
|
||||
# Keeps Python from generating .pyc files in the container
|
||||
ENV PYTHONDONTWRITEBYTECODE 1
|
||||
# Turns off buffering for easier container logging
|
||||
ENV PYTHONUNBUFFERED 1
|
||||
# Don't fall back to legacy build system
|
||||
ENV PIP_USE_PEP517=1
|
||||
|
||||
WORKDIR ${INVOKEAI_SRC}
|
||||
#######################
|
||||
## build pyproject ##
|
||||
#######################
|
||||
FROM python-base AS pyproject-builder
|
||||
|
||||
# Install pytorch before all other pip packages
|
||||
# NOTE: there are no pytorch builds for arm64 + cuda, only cpu
|
||||
# x86_64/CUDA is default
|
||||
RUN --mount=type=cache,target=/root/.cache/pip \
|
||||
python3 -m venv ${VIRTUAL_ENV} &&\
|
||||
if [ "$TARGETPLATFORM" = "linux/arm64" ] || [ "$GPU_DRIVER" = "cpu" ]; then \
|
||||
extra_index_url_arg="--extra-index-url https://download.pytorch.org/whl/cpu"; \
|
||||
elif [ "$GPU_DRIVER" = "rocm" ]; then \
|
||||
extra_index_url_arg="--extra-index-url https://download.pytorch.org/whl/rocm5.4.2"; \
|
||||
else \
|
||||
extra_index_url_arg="--extra-index-url https://download.pytorch.org/whl/cu118"; \
|
||||
fi &&\
|
||||
pip install $extra_index_url_arg \
|
||||
torch==$TORCH_VERSION \
|
||||
torchvision==$TORCHVISION_VERSION
|
||||
# Install build dependencies
|
||||
RUN \
|
||||
--mount=type=cache,target=/var/cache/apt,sharing=locked \
|
||||
--mount=type=cache,target=/var/lib/apt,sharing=locked \
|
||||
apt-get update \
|
||||
&& apt-get install -y \
|
||||
--no-install-recommends \
|
||||
build-essential=12.9 \
|
||||
gcc=4:10.2.* \
|
||||
python3-dev=3.9.*
|
||||
|
||||
# Install the local package.
|
||||
# Editable mode helps use the same image for development:
|
||||
# the local working copy can be bind-mounted into the image
|
||||
# at path defined by ${INVOKEAI_SRC}
|
||||
COPY invokeai ./invokeai
|
||||
COPY pyproject.toml ./
|
||||
RUN --mount=type=cache,target=/root/.cache/pip \
|
||||
# xformers + triton fails to install on arm64
|
||||
if [ "$GPU_DRIVER" = "cuda" ] && [ "$TARGETPLATFORM" = "linux/amd64" ]; then \
|
||||
pip install -e ".[xformers]"; \
|
||||
else \
|
||||
pip install -e "."; \
|
||||
fi
|
||||
# Prepare pip for buildkit cache
|
||||
ARG PIP_CACHE_DIR=/var/cache/buildkit/pip
|
||||
ENV PIP_CACHE_DIR ${PIP_CACHE_DIR}
|
||||
RUN mkdir -p ${PIP_CACHE_DIR}
|
||||
|
||||
# #### Build the Web UI ------------------------------------
|
||||
# Create virtual environment
|
||||
RUN --mount=type=cache,target=${PIP_CACHE_DIR} \
|
||||
python3 -m venv "${APPNAME}" \
|
||||
--upgrade-deps
|
||||
|
||||
FROM node:18 AS web-builder
|
||||
WORKDIR /build
|
||||
COPY invokeai/frontend/web/ ./
|
||||
RUN --mount=type=cache,target=/usr/lib/node_modules \
|
||||
npm install --include dev
|
||||
RUN --mount=type=cache,target=/usr/lib/node_modules \
|
||||
yarn vite build
|
||||
# Install requirements
|
||||
COPY --link pyproject.toml .
|
||||
COPY --link invokeai/version/invokeai_version.py invokeai/version/__init__.py invokeai/version/
|
||||
ARG PIP_EXTRA_INDEX_URL
|
||||
ENV PIP_EXTRA_INDEX_URL ${PIP_EXTRA_INDEX_URL}
|
||||
RUN --mount=type=cache,target=${PIP_CACHE_DIR} \
|
||||
"${APPNAME}"/bin/pip install .
|
||||
|
||||
# Install pyproject.toml
|
||||
COPY --link . .
|
||||
RUN --mount=type=cache,target=${PIP_CACHE_DIR} \
|
||||
"${APPNAME}/bin/pip" install .
|
||||
|
||||
#### Runtime stage ---------------------------------------
|
||||
|
||||
FROM library/ubuntu:22.04 AS runtime
|
||||
|
||||
ARG DEBIAN_FRONTEND=noninteractive
|
||||
ENV PYTHONUNBUFFERED=1
|
||||
ENV PYTHONDONTWRITEBYTECODE=1
|
||||
|
||||
RUN apt update && apt install -y --no-install-recommends \
|
||||
git \
|
||||
curl \
|
||||
vim \
|
||||
tmux \
|
||||
ncdu \
|
||||
iotop \
|
||||
bzip2 \
|
||||
gosu \
|
||||
libglib2.0-0 \
|
||||
libgl1-mesa-glx \
|
||||
python3-venv \
|
||||
python3-pip \
|
||||
build-essential \
|
||||
libopencv-dev \
|
||||
libstdc++-10-dev &&\
|
||||
apt-get clean && apt-get autoclean
|
||||
|
||||
# globally add magic-wormhole
|
||||
# for ease of transferring data to and from the container
|
||||
# when running in sandboxed cloud environments; e.g. Runpod etc.
|
||||
RUN pip install magic-wormhole
|
||||
|
||||
ENV INVOKEAI_SRC=/opt/invokeai
|
||||
ENV VIRTUAL_ENV=/opt/venv/invokeai
|
||||
ENV INVOKEAI_ROOT=/invokeai
|
||||
ENV PATH="$VIRTUAL_ENV/bin:$INVOKEAI_SRC:$PATH"
|
||||
|
||||
# --link requires buildkit w/ dockerfile syntax 1.4
|
||||
COPY --link --from=builder ${INVOKEAI_SRC} ${INVOKEAI_SRC}
|
||||
COPY --link --from=builder ${VIRTUAL_ENV} ${VIRTUAL_ENV}
|
||||
COPY --link --from=web-builder /build/dist ${INVOKEAI_SRC}/invokeai/frontend/web/dist
|
||||
|
||||
# Link amdgpu.ids for ROCm builds
|
||||
# contributed by https://github.com/Rubonnek
|
||||
RUN mkdir -p "/opt/amdgpu/share/libdrm" &&\
|
||||
ln -s "/usr/share/libdrm/amdgpu.ids" "/opt/amdgpu/share/libdrm/amdgpu.ids"
|
||||
|
||||
WORKDIR ${INVOKEAI_SRC}
|
||||
|
||||
# build patchmatch
|
||||
RUN cd /usr/lib/$(uname -p)-linux-gnu/pkgconfig/ && ln -sf opencv4.pc opencv.pc
|
||||
# Build patchmatch
|
||||
RUN python3 -c "from patchmatch import patch_match"
|
||||
|
||||
# Create unprivileged user and make the local dir
|
||||
RUN useradd --create-home --shell /bin/bash -u 1000 --comment "container local user" invoke
|
||||
RUN mkdir -p ${INVOKEAI_ROOT} && chown -R invoke:invoke ${INVOKEAI_ROOT}
|
||||
#####################
|
||||
## runtime image ##
|
||||
#####################
|
||||
FROM python-base AS runtime
|
||||
|
||||
COPY docker/docker-entrypoint.sh ./
|
||||
ENTRYPOINT ["/opt/invokeai/docker-entrypoint.sh"]
|
||||
CMD ["invokeai-web", "--host", "0.0.0.0"]
|
||||
# Create a new user
|
||||
ARG UNAME=appuser
|
||||
RUN useradd \
|
||||
--no-log-init \
|
||||
-m \
|
||||
-U \
|
||||
"${UNAME}"
|
||||
|
||||
# Create volume directory
|
||||
ARG VOLUME_DIR=/data
|
||||
RUN mkdir -p "${VOLUME_DIR}" \
|
||||
&& chown -hR "${UNAME}:${UNAME}" "${VOLUME_DIR}"
|
||||
|
||||
# Setup runtime environment
|
||||
USER ${UNAME}:${UNAME}
|
||||
COPY --chown=${UNAME}:${UNAME} --from=pyproject-builder ${APPDIR}/${APPNAME} ${APPNAME}
|
||||
ENV INVOKEAI_ROOT ${VOLUME_DIR}
|
||||
ENV TRANSFORMERS_CACHE ${VOLUME_DIR}/.cache
|
||||
ENV INVOKE_MODEL_RECONFIGURE "--yes --default_only"
|
||||
EXPOSE 9090
|
||||
ENTRYPOINT [ "invokeai" ]
|
||||
CMD [ "--web", "--host", "0.0.0.0", "--port", "9090" ]
|
||||
VOLUME [ "${VOLUME_DIR}" ]
|
||||
|
@ -1,77 +0,0 @@
|
||||
# InvokeAI Containerized
|
||||
|
||||
All commands are to be run from the `docker` directory: `cd docker`
|
||||
|
||||
#### Linux
|
||||
|
||||
1. Ensure BuildKit is enabled in the Docker daemon settings (`/etc/docker/daemon.json`).
|
||||
2. Install the `docker compose` plugin using your package manager, or follow a [tutorial](https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-compose-on-ubuntu-22-04).
|
||||
- The deprecated `docker-compose` (hyphenated) CLI continues to work for now.
|
||||
3. Ensure the Docker daemon is able to access the GPU.
|
||||
- You may need to install [nvidia-container-toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
|
||||
|
||||
#### macOS
|
||||
|
||||
1. Ensure Docker has at least 16GB RAM
|
||||
2. Enable VirtioFS for file sharing
|
||||
3. Enable `docker compose` V2 support
|
||||
|
||||
This is done via the Docker Desktop preferences.
|
||||
|
||||
## Quickstart
|
||||
|
||||
|
||||
1. Make a copy of `env.sample` and name it `.env` (`cp env.sample .env` on Mac/Linux, or `copy env.sample .env` on Windows). Make changes as necessary. Set `INVOKEAI_ROOT` to an absolute path to:
|
||||
a. the desired location of the InvokeAI runtime directory, or
|
||||
b. an existing, v3.0.0 compatible runtime directory.
|
||||
1. `docker compose up`
|
||||
|
||||
The image will be built automatically if needed.
|
||||
|
||||
The runtime directory (holding models and outputs) will be created in the location specified by `INVOKEAI_ROOT`. The default location is `~/invokeai`. The runtime directory will be populated with the base configs and models necessary to start generating.
|
||||
|
||||
### Use a GPU
|
||||
|
||||
- Linux is *recommended* for GPU support in Docker.
|
||||
- WSL2 is *required* for Windows.
|
||||
- Only the `x86_64` architecture is supported.
|
||||
|
||||
The Docker daemon on the system must already be set up to use the GPU. On Linux, this involves installing `nvidia-docker-runtime` and configuring the `nvidia` runtime as the default. The steps will be different for AMD. Please see the Docker documentation for the most up-to-date instructions for using your GPU with Docker.
|
||||
|
||||
## Customize
|
||||
|
||||
Check the `.env.sample` file. It contains some environment variables for running in Docker. Copy it, name it `.env`, and fill it in with your own values. Next time you run `docker compose up`, your custom values will be used.
|
||||
|
||||
You can also set these values in `docker-compose.yml` directly, but `.env` will help avoid conflicts when code is updated.
|
||||
|
||||
Example (most values are optional):
|
||||
|
||||
```
|
||||
INVOKEAI_ROOT=/Volumes/WorkDrive/invokeai
|
||||
HUGGINGFACE_TOKEN=the_actual_token
|
||||
CONTAINER_UID=1000
|
||||
GPU_DRIVER=cuda
|
||||
```
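Before running `docker compose up`, it can help to sanity-check the values in `.env`. The sketch below is only an illustration: `parse_env_text` and `check_settings` are hypothetical helpers, not part of InvokeAI or Docker, and the parser handles plain `KEY=VALUE` lines only (Docker Compose's own parser also supports quoting and interpolation). It assumes POSIX-style paths, as on Linux and macOS.

```python
from pathlib import PurePosixPath


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments (hypothetical helper)."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_settings(settings):
    """Return a list of problems found; the README asks for an absolute INVOKEAI_ROOT."""
    problems = []
    root = settings.get("INVOKEAI_ROOT")
    if root is not None and not PurePosixPath(root).is_absolute():
        problems.append("INVOKEAI_ROOT should be an absolute path")
    return problems


sample = """\
INVOKEAI_ROOT=/Volumes/WorkDrive/invokeai
HUGGINGFACE_TOKEN=the_actual_token
CONTAINER_UID=1000
GPU_DRIVER=cuda
"""

settings = parse_env_text(sample)
print(check_settings(settings))  # prints []
```

A relative value such as `INVOKEAI_ROOT=invokeai` would be reported as a problem by this check.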
|
||||
|
||||
## Even Moar Customizing!
|
||||
|
||||
See the `docker-compose.yml` file. The `command` instruction can be uncommented and used to run arbitrary startup commands. Some examples are below.
|
||||
|
||||
### Reconfigure the runtime directory
|
||||
|
||||
This can be used to download additional models from the supported model list.
|
||||
|
||||
In conjunction with `INVOKEAI_ROOT`, it can also be used to initialize a runtime directory:
|
||||
|
||||
```
|
||||
command:
|
||||
- invokeai-configure
|
||||
- --yes
|
||||
```
|
||||
|
||||
Or install models:
|
||||
|
||||
```
|
||||
command:
|
||||
- invokeai-model-install
|
||||
```
|
@ -1,11 +1,51 @@
|
||||
#!/usr/bin/env bash
|
||||
set -e
|
||||
|
||||
build_args=""
|
||||
# If you want to build a specific flavor, set the CONTAINER_FLAVOR environment variable
|
||||
# e.g. CONTAINER_FLAVOR=cpu ./build.sh
|
||||
# Possible Values are:
|
||||
# - cpu
|
||||
# - cuda
|
||||
# - rocm
|
||||
# Don't forget to also set it when executing run.sh
|
||||
# if it is not set, the script will try to detect the flavor by itself.
|
||||
#
|
||||
# Doc can be found here:
|
||||
# https://invoke-ai.github.io/InvokeAI/installation/040_INSTALL_DOCKER/
|
||||
|
||||
[[ -f ".env" ]] && build_args=$(awk '$1 ~ /\=[^$]/ {print "--build-arg " $0 " "}' .env)
|
||||
SCRIPTDIR=$(dirname "${BASH_SOURCE[0]}")
|
||||
cd "$SCRIPTDIR" || exit 1
|
||||
|
||||
echo "docker-compose build args:"
|
||||
echo $build_args
|
||||
source ./env.sh
|
||||
|
||||
docker-compose build $build_args
|
||||
DOCKERFILE=${INVOKE_DOCKERFILE:-./Dockerfile}
|
||||
|
||||
# print the settings
|
||||
echo -e "You are using these values:\n"
|
||||
echo -e "Dockerfile:\t\t${DOCKERFILE}"
|
||||
echo -e "index-url:\t\t${PIP_EXTRA_INDEX_URL:-none}"
|
||||
echo -e "Volumename:\t\t${VOLUMENAME}"
|
||||
echo -e "Platform:\t\t${PLATFORM}"
|
||||
echo -e "Container Registry:\t${CONTAINER_REGISTRY}"
|
||||
echo -e "Container Repository:\t${CONTAINER_REPOSITORY}"
|
||||
echo -e "Container Tag:\t\t${CONTAINER_TAG}"
|
||||
echo -e "Container Flavor:\t${CONTAINER_FLAVOR}"
|
||||
echo -e "Container Image:\t${CONTAINER_IMAGE}\n"
|
||||
|
||||
# Create docker volume
|
||||
if [[ -n "$(docker volume ls -f name="${VOLUMENAME}" -q)" ]]; then
|
||||
echo -e "Volume already exists\n"
|
||||
else
|
||||
echo -n "creating docker volume "
|
||||
docker volume create "${VOLUMENAME}"
|
||||
fi
|
||||
|
||||
# Build Container
|
||||
docker build \
|
||||
--platform="${PLATFORM:-linux/amd64}" \
|
||||
--tag="${CONTAINER_IMAGE:-invokeai}" \
|
||||
${CONTAINER_FLAVOR:+--build-arg="CONTAINER_FLAVOR=${CONTAINER_FLAVOR}"} \
|
||||
${PIP_EXTRA_INDEX_URL:+--build-arg="PIP_EXTRA_INDEX_URL=${PIP_EXTRA_INDEX_URL}"} \
|
||||
${PIP_PACKAGE:+--build-arg="PIP_PACKAGE=${PIP_PACKAGE}"} \
|
||||
--file="${DOCKERFILE}" \
|
||||
..
|
||||
|
@ -1,48 +0,0 @@
|
||||
# Copyright (c) 2023 Eugene Brodsky https://github.com/ebr
|
||||
|
||||
version: '3.8'
|
||||
|
||||
services:
|
||||
invokeai:
|
||||
image: "local/invokeai:latest"
|
||||
# edit below to run on a container runtime other than nvidia-container-runtime.
|
||||
# not yet tested with rocm/AMD GPUs
|
||||
# Comment out the "deploy" section to run on CPU only
|
||||
deploy:
|
||||
resources:
|
||||
reservations:
|
||||
devices:
|
||||
- driver: nvidia
|
||||
count: 1
|
||||
capabilities: [gpu]
|
||||
build:
|
||||
context: ..
|
||||
dockerfile: docker/Dockerfile
|
||||
|
||||
# variables without a default will automatically inherit from the host environment
|
||||
environment:
|
||||
- INVOKEAI_ROOT
|
||||
- HF_HOME
|
||||
|
||||
# Create a .env file in the same directory as this docker-compose.yml file
|
||||
# and populate it with environment variables. See .env.sample
|
||||
env_file:
|
||||
- .env
|
||||
|
||||
ports:
|
||||
- "${INVOKEAI_PORT:-9090}:9090"
|
||||
volumes:
|
||||
- ${INVOKEAI_ROOT:-~/invokeai}:${INVOKEAI_ROOT:-/invokeai}
|
||||
- ${HF_HOME:-~/.cache/huggingface}:${HF_HOME:-/invokeai/.cache/huggingface}
|
||||
# - ${INVOKEAI_MODELS_DIR:-${INVOKEAI_ROOT:-/invokeai/models}}
|
||||
# - ${INVOKEAI_MODELS_CONFIG_PATH:-${INVOKEAI_ROOT:-/invokeai/configs/models.yaml}}
|
||||
tty: true
|
||||
stdin_open: true
|
||||
|
||||
# # Example of running alternative commands/scripts in the container
|
||||
# command:
|
||||
# - bash
|
||||
# - -c
|
||||
# - |
|
||||
# invokeai-model-install --yes --default-only --config_file ${INVOKEAI_ROOT}/config_custom.yaml
|
||||
# invokeai-nodes-web --host 0.0.0.0
|
@ -1,65 +0,0 @@
|
||||
#!/bin/bash
|
||||
set -e -o pipefail
|
||||
|
||||
### Container entrypoint
|
||||
# Runs the CMD as defined by the Dockerfile or passed to `docker run`
|
||||
# Can be used to configure the runtime dir
|
||||
# Bypass by using ENTRYPOINT or `--entrypoint`
|
||||
|
||||
### Set INVOKEAI_ROOT pointing to a valid runtime directory
|
||||
# Otherwise configure the runtime dir first.
|
||||
|
||||
### Configure the InvokeAI runtime directory (done by default):
|
||||
# docker run --rm -it <this image> --configure
|
||||
# or skip with --no-configure
|
||||
|
||||
### Set the CONTAINER_UID envvar to match your user.
|
||||
# Ensures files created in the container are owned by you:
|
||||
# docker run --rm -it -v /some/path:/invokeai -e CONTAINER_UID=$(id -u) <this image>
|
||||
# Default UID: 1000 chosen due to popularity on Linux systems. Possibly 501 on macOS.
|
||||
|
||||
USER_ID=${CONTAINER_UID:-1000}
|
||||
USER=invoke
|
||||
usermod -u ${USER_ID} ${USER} 1>/dev/null
|
||||
|
||||
configure() {
|
||||
# Configure the runtime directory
|
||||
if [[ -f ${INVOKEAI_ROOT}/invokeai.yaml ]]; then
|
||||
echo "${INVOKEAI_ROOT}/invokeai.yaml exists. InvokeAI is already configured."
|
||||
echo "To reconfigure InvokeAI, delete the above file."
|
||||
echo "======================================================================"
|
||||
else
|
||||
mkdir -p ${INVOKEAI_ROOT}
|
||||
chown --recursive ${USER} ${INVOKEAI_ROOT}
|
||||
gosu ${USER} invokeai-configure --yes --default_only
|
||||
fi
|
||||
}
|
||||
|
||||
## Skip attempting to configure.
|
||||
## Must be passed first, before any other args.
|
||||
if [[ $1 != "--no-configure" ]]; then
|
||||
configure
|
||||
else
|
||||
shift
|
||||
fi
|
||||
|
||||
### Set the $PUBLIC_KEY env var to enable SSH access.
|
||||
# We do not install openssh-server in the image by default to avoid bloat.
|
||||
# but it is useful to have the full SSH server e.g. on Runpod.
|
||||
# (use SCP to copy files to/from the image, etc)
|
||||
if [[ -v "PUBLIC_KEY" ]] && [[ ! -d "${HOME}/.ssh" ]]; then
|
||||
apt-get update
|
||||
apt-get install -y openssh-server
|
||||
pushd $HOME
|
||||
mkdir -p .ssh
|
||||
echo ${PUBLIC_KEY} > .ssh/authorized_keys
|
||||
chmod -R 700 .ssh
|
||||
popd
|
||||
service ssh start
|
||||
fi
|
||||
|
||||
|
||||
cd ${INVOKEAI_ROOT}
|
||||
|
||||
# Run the CMD as the Container User (not root).
|
||||
exec gosu ${USER} "$@"
|
54
docker/env.sh
Normal file
@ -0,0 +1,54 @@
|
||||
#!/usr/bin/env bash
|
||||
|
||||
# This file is used to set environment variables for the build.sh and run.sh scripts.
|
||||
|
||||
# Try to detect the container flavor if no PIP_EXTRA_INDEX_URL got specified
|
||||
if [[ -z "$PIP_EXTRA_INDEX_URL" ]]; then
|
||||
|
||||
# Activate virtual environment if not already activated and exists
|
||||
if [[ -z $VIRTUAL_ENV ]]; then
|
||||
[[ -e "$(dirname "${BASH_SOURCE[0]}")/../.venv/bin/activate" ]] \
|
||||
&& source "$(dirname "${BASH_SOURCE[0]}")/../.venv/bin/activate" \
|
||||
&& echo "Activated virtual environment: $VIRTUAL_ENV"
|
||||
fi
|
||||
|
||||
# Decide which container flavor to build if not specified
|
||||
if [[ -z "$CONTAINER_FLAVOR" ]] && python -c "import torch" &>/dev/null; then
|
||||
# Check for CUDA and ROCm
|
||||
CUDA_AVAILABLE=$(python -c "import torch;print(torch.cuda.is_available())")
|
||||
ROCM_AVAILABLE=$(python -c "import torch;print(torch.version.hip is not None)")
|
||||
if [[ "${CUDA_AVAILABLE}" == "True" ]]; then
|
||||
CONTAINER_FLAVOR="cuda"
|
||||
elif [[ "${ROCM_AVAILABLE}" == "True" ]]; then
|
||||
CONTAINER_FLAVOR="rocm"
|
||||
else
|
||||
CONTAINER_FLAVOR="cpu"
|
||||
fi
|
||||
fi
|
||||
|
||||
# Set PIP_EXTRA_INDEX_URL based on container flavor
|
||||
if [[ "$CONTAINER_FLAVOR" == "rocm" ]]; then
|
||||
PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/rocm"
|
||||
elif [[ "$CONTAINER_FLAVOR" == "cpu" ]]; then
|
||||
PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cpu"
|
||||
# elif [[ -z "$CONTAINER_FLAVOR" || "$CONTAINER_FLAVOR" == "cuda" ]]; then
|
||||
# PIP_PACKAGE=${PIP_PACKAGE-".[xformers]"}
|
||||
fi
|
||||
fi
|
||||
|
||||
# Variables shared by build.sh and run.sh
|
||||
REPOSITORY_NAME="${REPOSITORY_NAME-$(basename "$(git rev-parse --show-toplevel)")}"
|
||||
REPOSITORY_NAME="${REPOSITORY_NAME,,}"
|
||||
VOLUMENAME="${VOLUMENAME-"${REPOSITORY_NAME}_data"}"
|
||||
ARCH="${ARCH-$(uname -m)}"
|
||||
PLATFORM="${PLATFORM-linux/${ARCH}}"
|
||||
INVOKEAI_BRANCH="${INVOKEAI_BRANCH-$(git branch --show)}"
|
||||
CONTAINER_REGISTRY="${CONTAINER_REGISTRY-"ghcr.io"}"
|
||||
CONTAINER_REPOSITORY="${CONTAINER_REPOSITORY-"$(whoami)/${REPOSITORY_NAME}"}"
|
||||
CONTAINER_FLAVOR="${CONTAINER_FLAVOR-cuda}"
|
||||
CONTAINER_TAG="${CONTAINER_TAG-"${INVOKEAI_BRANCH##*/}-${CONTAINER_FLAVOR}"}"
|
||||
CONTAINER_IMAGE="${CONTAINER_REGISTRY}/${CONTAINER_REPOSITORY}:${CONTAINER_TAG}"
|
||||
CONTAINER_IMAGE="${CONTAINER_IMAGE,,}"
|
||||
|
||||
# enable docker buildkit
|
||||
export DOCKER_BUILDKIT=1
|
@ -1,8 +1,41 @@
|
||||
#!/usr/bin/env bash
|
||||
set -e
|
||||
|
||||
# How to use: https://invoke-ai.github.io/InvokeAI/installation/040_INSTALL_DOCKER/
|
||||
|
||||
SCRIPTDIR=$(dirname "${BASH_SOURCE[0]}")
|
||||
cd "$SCRIPTDIR" || exit 1
|
||||
|
||||
docker-compose up --build -d
|
||||
docker-compose logs -f
|
||||
source ./env.sh
|
||||
|
||||
# Create outputs directory if it does not exist
|
||||
[[ -d ./outputs ]] || mkdir ./outputs
|
||||
|
||||
echo -e "You are using these values:\n"
|
||||
echo -e "Volumename:\t${VOLUMENAME}"
|
||||
echo -e "Invokeai_tag:\t${CONTAINER_IMAGE}"
|
||||
echo -e "local Models:\t${MODELSPATH:-unset}\n"
|
||||
|
||||
docker run \
|
||||
--interactive \
|
||||
--tty \
|
||||
--rm \
|
||||
--platform="${PLATFORM}" \
|
||||
--name="${REPOSITORY_NAME}" \
|
||||
--hostname="${REPOSITORY_NAME}" \
|
||||
--mount type=volume,volume-driver=local,source="${VOLUMENAME}",target=/data \
|
||||
--mount type=bind,source="$(pwd)"/outputs/,target=/data/outputs/ \
|
||||
${MODELSPATH:+--mount="type=bind,source=${MODELSPATH},target=/data/models"} \
|
||||
${HUGGING_FACE_HUB_TOKEN:+--env="HUGGING_FACE_HUB_TOKEN=${HUGGING_FACE_HUB_TOKEN}"} \
|
||||
--publish=9090:9090 \
|
||||
--cap-add=sys_nice \
|
||||
${GPU_FLAGS:+--gpus="${GPU_FLAGS}"} \
|
||||
"${CONTAINER_IMAGE}" ${@:+$@}
|
||||
|
||||
echo -e "\nCleaning trash folder ..."
|
||||
for f in outputs/.Trash*; do
|
||||
if [ -e "$f" ]; then
|
||||
rm -Rf "$f"
|
||||
break
|
||||
fi
|
||||
done
|
||||
|
@ -1,60 +0,0 @@
|
||||
# InvokeAI - A Stable Diffusion Toolkit
|
||||
|
||||
Stable Diffusion distribution by InvokeAI: https://github.com/invoke-ai
|
||||
|
||||
The Docker image tracks the `main` branch of the InvokeAI project, which means it includes the latest features, but may contain some bugs.
|
||||
|
||||
Your working directory is mounted under the `/workspace` path inside the pod. The models are in `/workspace/invokeai/models`, and outputs are in `/workspace/invokeai/outputs`.
|
||||
|
||||
> **Only the /workspace directory will persist between pod restarts!**
|
||||
|
||||
> **If you _terminate_ (not just _stop_) the pod, the /workspace will be lost.**
|
||||
|
||||
## Quickstart
|
||||
|
||||
1. Launch a pod from this template. **It will take about 5-10 minutes to run through the initial setup**. Be patient.
|
||||
1. Wait for the application to load.
|
||||
- TIP: you know it's ready when the CPU usage goes idle
|
||||
- You can also check the logs for a line that says "_Point your browser at..._"
|
||||
1. Open the Invoke AI web UI: click the `Connect` => `connect over HTTP` button.
|
||||
1. Generate some art!
|
||||
|
||||
## Other things you can do
|
||||
|
||||
At any point you may edit the pod configuration and set an arbitrary Docker command. For example, you could run a command to download some models using `curl`, or fetch some images and place them into your outputs to continue a working session.
|
||||
|
||||
If you need to run *multiple commands*, define them in the Docker Command field like this:
|
||||
|
||||
`bash -c "cd ${INVOKEAI_ROOT}/outputs; wormhole receive 2-foo-bar; invoke.py --web --host 0.0.0.0"`
|
||||
|
||||
### Copying your data in and out of the pod
|
||||
|
||||
This image includes a couple of handy tools to help you get the data into the pod (such as your custom models or embeddings), and out of the pod (such as downloading your outputs). Here are your options for getting your data in and out of the pod:
|
||||
|
||||
- **SSH server**:
|
||||
1. Make sure to create and set your Public Key in the RunPod settings (follow the official instructions)
|
||||
1. Add an exposed port 22 (TCP) in the pod settings!
|
||||
1. When your pod restarts, you will see a new entry in the `Connect` dialog. Use this SSH server to `scp` or `sftp` your files as necessary, or SSH into the pod using the fully fledged SSH server.
|
||||
|
||||
- [**Magic Wormhole**](https://magic-wormhole.readthedocs.io/en/latest/welcome.html):
|
||||
1. On your computer, `pip install magic-wormhole` (see above instructions for details)
|
||||
1. Connect to the command line **using the "light" SSH client** or the browser-based console. _Currently there's a bug where `wormhole` isn't available when connected to "full" SSH server, as described above_.
|
||||
1. `wormhole send /workspace/invokeai/outputs` will send the entire `outputs` directory. You can also send individual files.
|
||||
1. Once packaged, you will see a `wormhole receive <123-some-words>` command. Copy it.
|
||||
1. Paste this command into the terminal on your local machine to securely download the payload.
|
||||
1. It works the same in reverse: you can `wormhole send` some models from your computer to the pod. Again, save your files somewhere in `/workspace` or they will be lost when the pod is stopped.
|
||||
|
||||
- **RunPod's Cloud Sync feature** may be used to sync the persistent volume to cloud storage. You could, for example, copy the entire `/workspace` to S3, add some custom models to it, and copy it back from S3 when launching new pod configurations. Follow the Cloud Sync instructions.
|
||||
|
||||
|
||||
### Disable the NSFW checker
|
||||
|
||||
The NSFW checker is enabled by default. To disable it, edit the pod configuration and set the following command:
|
||||
|
||||
```
|
||||
invoke --web --host 0.0.0.0 --no-nsfw_checker
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
Template ©2023 Eugene Brodsky [ebr](https://github.com/ebr)
|
@ -4,236 +4,6 @@ title: Changelog
|
||||
|
||||
# :octicons-log-16: **Changelog**
|
||||
|
||||
## v2.3.5 <small>(22 May 2023)</small>
|
||||
|
||||
This release (along with the post1 and post2 follow-on releases) expands support for additional LoRA and LyCORIS models, upgrades diffusers versions, and fixes a few bugs.
|
||||
|
||||
### LoRA and LyCORIS Support Improvement
|
||||
|
||||
A number of LoRA/LyCORIS fine-tune files (those which alter the text encoder as well as the unet model) were not having the desired effect in InvokeAI. This bug has now been fixed. Full documentation of LoRA support is available at InvokeAI LoRA Support.
|
||||
Previously, InvokeAI did not distinguish between LoRA/LyCORIS models based on Stable Diffusion v1.5 vs those based on v2.0 and 2.1, leading to a crash when an incompatible model was loaded. This has now been fixed. In addition, the web pulldown menus for LoRA and Textual Inversion selection have been enhanced to show only those files that are compatible with the currently-selected Stable Diffusion model.
|
||||
Support for the newer LoKR LyCORIS files has been added.
|
||||
|
||||
### Library Updates and Speed/Reproducibility Advancements
|
||||
The major enhancement in this version is that NVIDIA users no longer need to decide between speed and reproducibility. Previously, if you activated the Xformers library, you would see improvements in speed and memory usage, but multiple images generated with the same seed and other parameters would be slightly different from each other. This is no longer the case. Relative to 2.3.5 you will see improved performance when running without Xformers, and even better performance when Xformers is activated. In both cases, images generated with the same settings will be identical.
|
||||
|
||||
Here are the new library versions:
|
||||
| Library   | Version |
| --------- | ------- |
| Torch     | 2.0.0   |
| Diffusers | 0.16.1  |
| Xformers  | 0.0.19  |
| Compel    | 1.1.5   |
|
||||
Other Improvements
|
||||
|
||||
### Performance Improvements
|
||||
|
||||
When a model is loaded for the first time, InvokeAI calculates its checksum for incorporation into the PNG metadata. This process could take up to a minute on network-mounted disks and WSL mounts. This release noticeably speeds up the process.
|
||||
|
||||
### Bug Fixes
|
||||
|
||||
The "import models from directory" and "import from URL" functionality in the console-based model installer has now been fixed.
|
||||
When running the WebUI, we have reduced the number of times that InvokeAI reaches out to HuggingFace to fetch the list of embeddable Textual Inversion models. We have also caught and fixed a problem with the updater not correctly detecting when another instance of the updater is running.
|
||||
|
||||
|
||||
## v2.3.4 <small>(7 April 2023)</small>
|
||||
|
||||
What's New in 2.3.4
|
||||
|
||||
This feature release adds support for LoRA (Low-Rank Adaptation) and LyCORIS (Lora beYond Conventional) models, as well as some minor bug fixes.
|
||||
### LoRA and LyCORIS Support
|
||||
|
||||
LoRA files contain fine-tuning weights that enable particular styles, subjects or concepts to be applied to generated images. LyCORIS files are an extended variant of LoRA. InvokeAI supports the most common LoRA/LyCORIS format, which ends in the suffix .safetensors. You will find numerous LoRA and LyCORIS models for download at Civitai, and a small but growing number at Hugging Face. Full documentation of LoRA support is available at InvokeAI LoRA Support. (Pre-release note: this page will only be available after release.)
|
||||
|
||||
To use LoRA/LyCORIS models in InvokeAI:
|
||||
|
||||
Download the .safetensors files of your choice and place them in /path/to/invokeai/loras. This directory was not present in earlier versions of InvokeAI but will be created for you the first time you run the command-line or web client. You can also create the directory manually.
|
||||
|
||||
Add withLora(lora-file,weight) to your prompts. The weight is optional and will default to 1.0. A few examples, assuming that a LoRA file named loras/sushi.safetensors is present:
|
||||
|
||||
family sitting at dinner table eating sushi withLora(sushi,0.9)
|
||||
family sitting at dinner table eating sushi withLora(sushi, 0.75)
|
||||
family sitting at dinner table eating sushi withLora(sushi)
|
||||
|
||||
Multiple withLora() prompt fragments are allowed. The weight can be arbitrarily large, but the useful range is roughly 0.5 to 1.0. Higher weights make the LoRA's influence stronger. Negative weights are also allowed, which can lead to some interesting effects.
|
||||
|
||||
Generate as you usually would! If you find that the image is too "crisp" try reducing the overall CFG value or reducing individual LoRA weights. As is the case with all fine-tunes, you'll get the best results when running the LoRA on top of the model similar to, or identical with, the one that was used during the LoRA's training. Don't try to load a SD 1.x-trained LoRA into a SD 2.x model, and vice versa. This will trigger a non-fatal error message and generation will not proceed.
|
||||
|
||||
You can change the location of the loras directory by passing the --lora_directory option to `invokeai`.
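To make the `withLora(lora-file,weight)` syntax concrete, here is a minimal sketch of how such fragments could be pulled out of a prompt. It is not InvokeAI's actual prompt parser: the regular expression and the `extract_loras` name are assumptions made for illustration, based only on the rules stated above (optional weight defaulting to 1.0, optional space after the comma, negative weights allowed, multiple fragments allowed).

```python
import re

# Assumed pattern: withLora(name) or withLora(name, weight), weight may be negative.
WITH_LORA = re.compile(
    r"withLora\(\s*([^,()]+?)\s*(?:,\s*(-?\d+(?:\.\d+)?)\s*)?\)"
)


def extract_loras(prompt):
    """Return (name, weight) pairs for every withLora() fragment (hypothetical helper)."""
    loras = []
    for match in WITH_LORA.finditer(prompt):
        name = match.group(1)
        # The weight is optional and defaults to 1.0, as described in the release notes.
        weight = float(match.group(2)) if match.group(2) is not None else 1.0
        loras.append((name, weight))
    return loras


print(extract_loras("family sitting at dinner table eating sushi withLora(sushi,0.9)"))
# prints [('sushi', 0.9)]
print(extract_loras("family sitting at dinner table eating sushi withLora(sushi)"))
# prints [('sushi', 1.0)]
```

The name returned here is the bare file stem (`sushi`), which under the description above would correspond to `loras/sushi.safetensors`.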
|
||||
|
||||
### New WebUI LoRA and Textual Inversion Buttons
|
||||
|
||||
This version adds two new web interface buttons for inserting LoRA and Textual Inversion triggers into the prompt as shown in the screenshot below.
|
||||
|
||||
Clicking on one or the other of the buttons will bring up a menu of available LoRA/LyCORIS or Textual Inversion trigger terms. Select a menu item to insert the properly-formatted withLora() or <textual-inversion> prompt fragment into the positive prompt. The number in parentheses indicates the number of trigger terms currently in the prompt. You may click the button again and deselect the LoRA or trigger to remove it from the prompt, or simply edit the prompt directly.
|
||||
|
||||
Currently terms are inserted into the positive prompt textbox only. However, some textual inversion embeddings are designed to be used with negative prompts. To move a textual inversion trigger into the negative prompt, simply cut and paste it.
|
||||
|
||||
By default the Textual Inversion menu only shows locally installed models found at startup time in /path/to/invokeai/embeddings. However, InvokeAI has the ability to dynamically download and install additional Textual Inversion embeddings from the HuggingFace Concepts Library. You may choose to display the most popular of these (with five or more likes) in the Textual Inversion menu by going to Settings and turning on "Show Textual Inversions from HF Concepts Library." When this option is activated, the locally-installed TI embeddings will be shown first, followed by uninstalled terms from Hugging Face. See The Hugging Face Concepts Library and Importing Textual Inversion files for more information.
|
||||
### Minor features and fixes
|
||||
|
||||
This release changes model switching behavior so that the command-line and Web UIs save the last model used and restore it the next time they are launched. It also improves the behavior of the installer so that the pip utility is kept up to date.
|
||||
|
||||
### Known Bugs in 2.3.4
|
||||
|
||||
These are known bugs in the release.
|
||||
|
||||
The Ancestral DPMSolverMultistepScheduler (k_dpmpp_2a) sampler is not yet implemented for diffusers models and will disappear from the WebUI Sampler menu when a diffusers model is selected.
|
||||
Windows Defender will sometimes raise Trojan or backdoor alerts for the codeformer.pth face restoration model, as well as the CIDAS/clipseg and runwayml/stable-diffusion-v1.5 models. These are false positives and can be safely ignored. InvokeAI performs a malware scan on all models as they are loaded. For additional security, you should use safetensors models whenever they are available.
|
||||
|
||||
|
||||
## v2.3.3 <small>(28 March 2023)</small>
|
||||
|
||||
This is a bugfix and minor feature release.
|
||||
### Bugfixes
|
||||
|
||||
Since version 2.3.2 the following bugs have been fixed:
|
||||
Bugs
|
||||
|
||||
When using legacy checkpoints with an external VAE, the VAE file is now scanned for malware prior to loading. Previously only the main model weights file was scanned.
|
||||
Textual inversion will select an appropriate batchsize based on whether xformers is active, and will default to xformers enabled if the library is detected.
|
||||
The batch script log file names have been fixed to be compatible with Windows.
|
||||
Occasional corruption of the .next_prefix file (which stores the next output file name in sequence) on Windows systems is now detected and corrected.
|
||||
Support loading of legacy config files that have no personalization (textual inversion) section.
|
||||
An infinite loop when opening the developer's console from within the invoke.sh script has been corrected.
|
||||
Documentation fixes, including a recipe for detecting and fixing problems with the AMD GPU ROCm driver.
|
||||
|
||||
Enhancements
|
||||
|
||||
It is now possible to load and run several community-contributed SD-2.0 based models, including the often-requested "Illuminati" model.
|
||||
The "NegativePrompts" embedding file, and others like it, can now be loaded by placing it in the InvokeAI embeddings directory.
|
||||
If no --model is specified at launch time, InvokeAI will remember the last model used and restore it the next time it is launched.
|
||||
On Linux systems, the invoke.sh launcher now uses a prettier console-based interface. To take advantage of it, install the dialog package using your package manager (e.g. sudo apt install dialog).
|
||||
When loading legacy models (safetensors/ckpt) you can specify a custom config file and/or a VAE by placing like-named files in the same directory as the model following this example:
|
||||
|
||||
my-favorite-model.ckpt
|
||||
my-favorite-model.yaml
|
||||
my-favorite-model.vae.pt # or my-favorite-model.vae.safetensors
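The like-named-file convention above can be sketched as a small path computation. This is an illustration of the naming rule only, under the assumption that the sidecar names are derived from the model's file stem exactly as in the example. `sidecar_candidates` is a hypothetical helper, not an InvokeAI function.

```python
from pathlib import PurePosixPath


def sidecar_candidates(model_path):
    """List the config and VAE file names that would sit next to a legacy model."""
    model = PurePosixPath(model_path)
    stem = model.stem  # e.g. "my-favorite-model" for "my-favorite-model.ckpt"
    return {
        "config": [str(model.with_name(stem + ".yaml"))],
        "vae": [
            str(model.with_name(stem + ".vae.pt")),
            str(model.with_name(stem + ".vae.safetensors")),
        ],
    }


candidates = sidecar_candidates("models/my-favorite-model.ckpt")
print(candidates["config"])  # prints ['models/my-favorite-model.yaml']
print(candidates["vae"])
```

A loader following this convention would then check which of the candidate files actually exist on disk.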
|
||||
|
||||
### Known Bugs in 2.3.3
|
||||
|
||||
These are known bugs in the release.
|
||||
|
||||
The Ancestral DPMSolverMultistepScheduler (k_dpmpp_2a) sampler is not yet implemented for diffusers models and will disappear from the WebUI Sampler menu when a diffusers model is selected.
|
||||
Windows Defender will sometimes raise Trojan or backdoor alerts for the codeformer.pth face restoration model, as well as the CIDAS/clipseg and runwayml/stable-diffusion-v1.5 models. These are false positives and can be safely ignored. InvokeAI performs a malware scan on all models as they are loaded. For additional security, you should use safetensors models whenever they are available.
|
||||
|
||||
|
||||
## v2.3.2 <small>(11 March 2023)</small>
|
||||
This is a bugfix and minor feature release.
|
||||
|
||||
### Bugfixes
|
||||
|
||||
Since version 2.3.1 the following bugs have been fixed:
|
||||
|
||||
Black images appearing for potential NSFW images when generating with legacy checkpoint models and both --no-nsfw_checker and --ckpt_convert turned on.
|
||||
Black images appearing when generating from models fine-tuned on Stable-Diffusion-2-1-base. When importing V2-derived models, you may be asked to select whether the model was derived from a "base" model (512 pixels) or the 768-pixel SD-2.1 model.
|
||||
The "Use All" button was not restoring the Hi-Res Fix setting on the WebUI.
|
||||
When using the model installer console app, models failed to import correctly when importing from directories with spaces in their names. A similar issue with the output directory was also fixed.
|
||||
Crashes that occurred during model merging.
|
||||
Restore previous naming of Stable Diffusion base and 768 models.
|
||||
Upgraded to latest versions of diffusers, transformers, safetensors and accelerate libraries upstream. We hope that this will fix the assertion NDArray > 2**32 issue that MacOS users have had when generating images larger than 768x768 pixels. Please report back.
|
||||
|
||||
As part of the upgrade to diffusers, the location of the diffusers-based models has changed from models/diffusers to models/hub. When you launch InvokeAI for the first time, it will prompt you to OK a one-time move. This should be quick and harmless, but if you have modified your models/diffusers directory in some way, for example using symlinks, you may wish to cancel the migration and make appropriate adjustments.
|
||||
New "Invokeai-batch" script
|
||||
|
||||
### Invoke AI Batch
|
||||
2.3.2 introduces a new command-line only script called invokeai-batch that can be used to generate hundreds of images from prompts and settings that vary systematically. This can be used to try the same prompt across multiple combinations of models, steps, CFG settings and so forth. It also allows you to template prompts and generate a combinatorial list like:
|
||||
|
||||
a shack in the mountains, photograph
|
||||
a shack in the mountains, watercolor
|
||||
a shack in the mountains, oil painting
|
||||
a chalet in the mountains, photograph
|
||||
a chalet in the mountains, watercolor
|
||||
a chalet in the mountains, oil painting
|
||||
a shack in the desert, photograph
|
||||
...
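The combinatorial expansion shown above can be reproduced with a short Python sketch. It is not the invokeai-batch implementation and does not use its template file format; the three value lists are assumptions chosen to match the sample output.

```python
from itertools import product

# Hypothetical template values; invokeai-batch reads these from its own template file.
locations = ["mountains", "desert"]
buildings = ["shack", "chalet"]
styles = ["photograph", "watercolor", "oil painting"]

# product() varies the last list fastest, which matches the ordering shown above.
prompts = [
    f"a {building} in the {location}, {style}"
    for location, building, style in product(locations, buildings, styles)
]

for prompt in prompts:
    print(prompt)
```

Each additional list multiplies the number of prompts, so a template with a few options per slot quickly grows into hundreds of images.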
|
||||
|
||||
If you have a system with multiple GPUs, or a single GPU with lots of VRAM, you can parallelize generation across the combinatorial set, reducing wait times and using your system's resources efficiently (make sure you have good GPU cooling).
|
||||
|
||||
To try invokeai-batch out, launch the "developer's console" using the invoke launcher script, or activate the invokeai virtual environment manually. From the console, run invokeai-batch --help to learn how the script works and to create your first template file for dynamic prompt generation.
|
||||
|
||||
|
||||
### Known Bugs in 2.3.2
|
||||
|
||||
These are known bugs in the release.
|
||||
|
||||
The Ancestral DPMSolverMultistepScheduler (k_dpmpp_2a) sampler is not yet implemented for diffusers models and will disappear from the WebUI Sampler menu when a diffusers model is selected.
|
||||
Windows Defender will sometimes raise a Trojan alert for the codeformer.pth face restoration model. As far as we have been able to determine, this is a false positive and can be safely whitelisted.
|
||||
|
||||
|
||||
## v2.3.1 <small>(22 February 2023)</small>
|
||||
This is primarily a bugfix release, but it does provide several new features that will improve the user experience.
|
||||
|
||||
### Enhanced support for model management
|
||||
|
||||
InvokeAI now makes it convenient to add, remove and modify models. You can individually import models that are stored on your local system, scan an entire folder and its subfolders for models and import them automatically, and even directly import models from the internet by providing their download URLs. You also have the option of designating a local folder to scan for new models each time InvokeAI is restarted.
|
||||
|
||||
There are three ways of accessing the model management features:
|
||||
|
||||
From the WebUI, click on the cube to the right of the model selection menu. This will bring up a form that allows you to import models individually from your local disk or scan a directory for models to import.
|
||||
|
||||
Using the Model Installer App
|
||||
|
||||
Choose option (5) download and install models from the invoke launcher script to start a new console-based application for model management. You can use this to select from a curated set of starter models, or import checkpoint, safetensors, and diffusers models from a local disk or the internet. The example below shows importing two checkpoint URLs from popular SD sites and a HuggingFace diffusers model using its Repository ID. It also shows how to designate a folder to be scanned at startup time for new models to import.
|
||||
|
||||
Command-line users can start this app using the command invokeai-model-install.
|
||||
|
||||
Using the Command Line Client (CLI)
|
||||
|
||||
The !install_model and !convert_model commands have been enhanced to allow entering of URLs and local directories to scan and import. The first command installs .ckpt and .safetensors files as-is. The second one converts them into the faster diffusers format before installation.
|
||||
|
||||
Internally, InvokeAI is able to probe the contents of a .ckpt or .safetensors file to distinguish among v1.x, v2.x and inpainting models. This means that you do not need to include "inpaint" in your model names to use an inpainting model. Note that Stable Diffusion v2.x models will be autoconverted into diffusers models the first time you use them.
|
||||
|
||||
Please see INSTALLING MODELS for more information on model management.
|
||||
|
||||
### An Improved Installer Experience
|
||||
|
||||
The installer now launches a console-based UI for setting and changing commonly-used startup options:
|
||||
|
||||
After selecting the desired options, the installer installs several support models needed by InvokeAI's face reconstruction and upscaling features and then launches the interface for selecting and installing models shown earlier. At any time, you can edit the startup options by launching invoke.sh/invoke.bat and entering option (6) change InvokeAI startup options.
|
||||
|
||||
Command-line users can launch the new configure app using invokeai-configure.
|
||||
|
||||
This release also comes with a renewed updater. To do an update without going through a whole reinstallation, launch invoke.sh or invoke.bat and choose option (9) update InvokeAI . This will bring you to a screen that prompts you to update to the latest released version, to the most current development version, or any released or unreleased version you choose by selecting the tag or branch of the desired version.
|
||||
|
||||
Command-line users can run this interface by typing invokeai-update.
|
||||
|
||||
### Image Symmetry Options
|
||||
|
||||
There are now features to generate horizontal and vertical symmetry during generation. The way these work is to wait until a selected step in the generation process and then to turn on a mirror image effect. In addition to generating some cool images, you can also use this to make side-by-side comparisons of how an image will look with more or fewer steps. Access this option from the WebUI by selecting Symmetry from the image generation settings, or within the CLI by using the options --h_symmetry_time_pct and --v_symmetry_time_pct (these can be abbreviated to --h_sym and --v_sym like all other options).
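To make the idea concrete, here is a minimal sketch of the two pieces involved: deciding at which step mirroring starts, and mirroring a grid of values. It is not InvokeAI's implementation. It works on plain Python lists rather than latents, both function names are hypothetical, and it assumes the `time_pct` value is a fraction between 0 and 1.

```python
from typing import List


def symmetry_start_step(total_steps: int, time_pct: float) -> int:
    """Step index at which mirroring would begin (assumes time_pct is in [0, 1])."""
    return int(total_steps * time_pct)


def mirror_horizontally(rows: List[List[int]]) -> List[List[int]]:
    """Copy the left half of each row onto the right half, reversed."""
    mirrored = []
    for row in rows:
        half = len(row) // 2
        left = row[:half]
        middle = row[half:len(row) - half]  # centre column when the width is odd
        mirrored.append(left + middle + left[::-1])
    return mirrored


print(symmetry_start_step(40, 0.25))
print(mirror_horizontally([[1, 2, 3, 4], [5, 6, 7, 8]]))
```

Steps before the start index run normally, which is why a later start preserves more of the unmirrored composition.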
|
||||
|
||||
### A New Unified Canvas Look
|
||||
|
||||
This release introduces a beta version of the WebUI Unified Canvas. To try it out, open up the settings dialogue in the WebUI (gear icon) and select Use Canvas Beta Layout:
|
||||
|
||||
Refresh the screen and go to Unified Canvas (left side of screen, third icon from the top). The new layout is designed to provide more space to work in and to keep the image controls close to the image itself:
|
||||
|
||||
Model conversion and merging within the WebUI
|
||||
|
||||
The WebUI now has an intuitive interface for model merging, as well as for permanent conversion of models from legacy .ckpt/.safetensors formats into diffusers format. These options are also available directly from the invoke.sh/invoke.bat scripts.
|
||||
An easier way to contribute translations to the WebUI
|
||||
|
||||
We have migrated our translation efforts to Weblate, a FOSS translation product. Maintaining the growing project's translations is now far simpler for the maintainers and community. Please review our brief translation guide for more information on how to contribute.
|
||||
Numerous internal bugfixes and performance issues
|
||||
|
||||
### Bug Fixes
|
||||
This release quashes multiple bugs that were reported in 2.3.0. Major internal changes include upgrading to diffusers 0.13.0 and using the compel library for prompt parsing. See Detailed Change Log for a detailed list of bugs caught and squished.
|
||||
Summary of InvokeAI command line scripts (all accessible via the launcher menu)
|
||||
| Command                | Description                                                         |
| ---------------------- | ------------------------------------------------------------------- |
| invokeai               | Command line interface                                              |
| invokeai --web         | Web interface                                                       |
| invokeai-model-install | Model installer with console forms-based front end                  |
| invokeai-ti --gui      | Textual inversion, with a console forms-based front end             |
| invokeai-merge --gui   | Model merging, with a console forms-based front end                 |
| invokeai-configure     | Startup configuration; can also be used to reinstall support models |
| invokeai-update        | InvokeAI software updater                                           |
|
||||
|
||||
### Known Bugs in 2.3.1
|
||||
|
||||
These are known bugs in the release.
|
||||
MacOS users generating 768x768 pixel images or greater using diffusers models may experience a hard crash with assertion NDArray > 2**32 This appears to be an issu...
|
||||
|
||||
|
||||
|
||||
## v2.3.0 <small>(15 January 2023)</small>
|
||||
|
||||
**Transition to diffusers
|
||||
@ -494,7 +264,7 @@ sections describe what's new for InvokeAI.
|
||||
[Manual Installation](installation/020_INSTALL_MANUAL.md).
|
||||
- The ability to save frequently-used startup options (model to load, steps,
|
||||
sampler, etc) in a `.invokeai` file. See
|
||||
[Client](deprecated/CLI.md)
|
||||
[Client](features/CLI.md)
|
||||
- Support for AMD GPU cards (non-CUDA) on Linux machines.
|
||||
- Multiple bugs and edge cases squashed.
|
||||
|
||||
@ -617,6 +387,8 @@ sections describe what's new for InvokeAI.
|
||||
- `dream.py` script renamed `invoke.py`. A `dream.py` script wrapper remains for
|
||||
backward compatibility.
|
||||
- Completely new WebGUI - launch with `python3 scripts/invoke.py --web`
|
||||
- Support for [inpainting](features/INPAINTING.md) and
|
||||
[outpainting](features/OUTPAINTING.md)
|
||||
- img2img runs on all k\* samplers
|
||||
- Support for
|
||||
[negative prompts](features/PROMPTS.md#negative-and-unconditioned-prompts)
|
||||
@ -627,7 +399,7 @@ sections describe what's new for InvokeAI.
|
||||
using facial reconstruction, ESRGAN upscaling, outcropping (similar to DALL-E
|
||||
infinite canvas), and "embiggen" upscaling. See the `!fix` command.
|
||||
- New `--hires` option on `invoke>` line allows
|
||||
[larger images to be created without duplicating elements](deprecated/CLI.md#this-is-an-example-of-txt2img),
|
||||
[larger images to be created without duplicating elements](features/CLI.md#this-is-an-example-of-txt2img),
|
||||
at the cost of some performance.
|
||||
- New `--perlin` and `--threshold` options allow you to add and control
|
||||
variation during image generation (see
|
||||
@ -636,7 +408,7 @@ sections describe what's new for InvokeAI.
|
||||
of images and tweaking of previous settings.
|
||||
- Command-line completion in `invoke.py` now works on Windows, Linux and Mac
|
||||
platforms.
|
||||
- Improved [command-line completion behavior](deprecated/CLI.md) New commands
|
||||
- Improved [command-line completion behavior](features/CLI.md) New commands
|
||||
added:
|
||||
- List command-line history with `!history`
|
||||
- Search command-line history with `!search`
|
||||
|
@ -1,56 +0,0 @@
|
||||
# How to Contribute
|
||||
|
||||
## Welcome to Invoke AI
|
||||
Invoke AI originated as a project built by the community, and that vision carries forward today as we aim to build the best pro-grade tools available. We work together to incorporate the latest in AI/ML research, making these tools available in over 20 languages to artists and creatives around the world as part of our fully permissive OSS project designed for individual users to self-host and use.
|
||||
|
||||
|
||||
## Contributing to Invoke AI
|
||||
Anyone who wishes to contribute to InvokeAI, whether through features, bug fixes, code cleanup, testing, code reviews, documentation or translation, is very much encouraged to do so.
|
||||
|
||||
To join, just raise your hand on the InvokeAI Discord server (#dev-chat) or the GitHub discussion board.
|
||||
|
||||
### Areas of contribution:
|
||||
|
||||
#### Development
|
||||
If you’d like to help with development, please see our [development guide](contribution_guides/development.md). If you’re unfamiliar with contributing to open source projects, there is a tutorial contained within the development guide.
|
||||
|
||||
#### Documentation
|
||||
If you’d like to help with documentation, please see our [documentation guide](contribution_guides/documenation.md).
|
||||
|
||||
#### Translation
|
||||
If you'd like to help with translation, please see our [translation guide](docs/contributing/.contribution_guides/translation.md).
|
||||
|
||||
#### Tutorials
|
||||
Please reach out to @imic or @hipsterusername on [Discord](https://discord.gg/ZmtBAhwWhy) to help create tutorials for InvokeAI.
|
||||
|
||||
We hope you enjoy using our software as much as we enjoy creating it, and we hope that some of those of you who are reading this will elect to become part of our contributor community.
|
||||
|
||||
|
||||
### Contributors
|
||||
|
||||
This project is a combined effort of dedicated people from across the world. [Check out the list of all these amazing people](https://invoke-ai.github.io/InvokeAI/other/CONTRIBUTORS/). We thank them for their time, hard work and effort.
|
||||
|
||||
### Code of Conduct
|
||||
|
||||
The InvokeAI community is a welcoming place, and we want your help in maintaining that. Please review our [Code of Conduct](https://github.com/invoke-ai/InvokeAI/blob/main/CODE_OF_CONDUCT.md) to learn more - it's essential to maintaining a respectful and inclusive environment.
|
||||
|
||||
By making a contribution to this project, you certify that:
|
||||
|
||||
1. The contribution was created in whole or in part by you and you have the right to submit it under the open-source license indicated in this project’s GitHub repository; or
|
||||
2. The contribution is based upon previous work that, to the best of your knowledge, is covered under an appropriate open-source license and you have the right under that license to submit that work with modifications, whether created in whole or in part by you, under the same open-source license (unless you are permitted to submit under a different license); or
|
||||
3. The contribution was provided directly to you by some other person who certified (1) or (2) and you have not modified it; or
|
||||
4. You understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information you submit with it, including your sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open-source license(s) involved.
|
||||
|
||||
This disclaimer is not a license and does not grant any rights or permissions. You must obtain necessary permissions and licenses, including from third parties, before contributing to this project.
|
||||
|
||||
This disclaimer is provided "as is" without warranty of any kind, whether expressed or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, or non-infringement. In no event shall the authors or copyright holders be liable for any claim, damages, or other liability, whether in an action of contract, tort, or otherwise, arising from, out of, or in connection with the contribution or the use or other dealings in the contribution.
|
||||
|
||||
### Support
|
||||
|
||||
For support, please use this repository's [GitHub Issues](https://github.com/invoke-ai/InvokeAI/issues), or join the [Discord](https://discord.gg/ZmtBAhwWhy).
|
||||
|
||||
Original portions of the software are Copyright (c) 2023 by respective contributors.
|
||||
|
||||
---
|
||||
|
||||
Remember, your contributions help make this project great. We're excited to see what you'll bring to our community!
|
@ -1,790 +1,105 @@
|
||||
# Invocations
|
||||
|
||||
Features in InvokeAI are added in the form of modular node-like systems called
|
||||
**Invocations**.
|
||||
|
||||
An Invocation is simply a single operation that takes in some inputs and gives
|
||||
out some outputs. We can then chain multiple Invocations together to create more
|
||||
complex functionality.
|
||||
|
||||
## Invocations Directory
|
||||
|
||||
InvokeAI Invocations can be found in the `invokeai/app/invocations` directory.
|
||||
|
||||
You can add your new functionality to one of the existing Invocations in this
|
||||
directory or create a new file in this directory as per your needs.
|
||||
|
||||
**Note:** _All Invocations must be inside this directory for InvokeAI to
|
||||
recognize them as valid Invocations._
|
||||
|
||||
## Creating A New Invocation
|
||||
|
||||
In order to understand the process of creating a new Invocation, let us actually
|
||||
create one.
|
||||
|
||||
In our example, let us create an Invocation that will take in an image, resize
|
||||
it and output the resized image.
|
||||
|
||||
The first things we need to do when creating a new Invocation are:
|
||||
|
||||
- Create a new class that derives from a predefined parent class called
|
||||
`BaseInvocation`.
|
||||
- The name of every Invocation must end with the word `Invocation` in order for
|
||||
it to be recognized as an Invocation.
|
||||
- Every Invocation must have a `docstring` that describes what this Invocation
|
||||
does.
|
||||
- Every Invocation must have a unique `type` field defined which becomes its
|
||||
identifier.
|
||||
- Invocations are strictly typed. We make use of the native
|
||||
[typing](https://docs.python.org/3/library/typing.html) library and the
|
||||
installed [pydantic](https://pydantic-docs.helpmanual.io/) library for
|
||||
validation.
|
||||
|
||||
So let us do that.
|
||||
|
||||
```python
from typing import Literal
from .baseinvocation import BaseInvocation

class ResizeInvocation(BaseInvocation):
    '''Resizes an image'''
    type: Literal['resize'] = 'resize'
```
|
||||
|
||||
That's great.
|
||||
|
||||
Now we have set up the base of our new Invocation. Let us think about what inputs
|
||||
our Invocation takes.
|
||||
|
||||
- We need an `image` that we are going to resize.
|
||||
- We will need new `width` and `height` values to which we need to resize the
|
||||
image.
|
||||
|
||||
### **Inputs**
|
||||
|
||||
Every Invocation input is a pydantic `Field` and like everything else should be
|
||||
strictly typed and defined.
|
||||
|
||||
So let us create these inputs for our Invocation. First up, the `image` input we
|
||||
need. Generally, we can use standard variable types in Python but InvokeAI
|
||||
already has a custom `ImageField` type that handles all the stuff that is needed
|
||||
for image inputs.
|
||||
|
||||
But what is this `ImageField` ..? It is a special class type specifically
|
||||
written to handle how images are dealt with in InvokeAI. We will cover how to
|
||||
create your own custom field types later in this guide. For now, let's go ahead
|
||||
and use it.
|
||||
|
||||
```python
from typing import Literal, Union
from pydantic import Field

from .baseinvocation import BaseInvocation
from ..models.image import ImageField

class ResizeInvocation(BaseInvocation):
    '''Resizes an image'''
    type: Literal['resize'] = 'resize'

    # Inputs
    image: Union[ImageField, None] = Field(description="The input image", default=None)
```
|
||||
|
||||
Let us break down our input code.
|
||||
|
||||
```python
|
||||
image: Union[ImageField, None] = Field(description="The input image", default=None)
|
||||
```
|
||||
|
||||
| Part | Value | Description |
|
||||
| --------- | ---------------------------------------------------- | -------------------------------------------------------------------------------------------------- |
|
||||
| Name | `image` | The variable that will hold our image |
|
||||
| Type Hint | `Union[ImageField, None]` | The types for our field. Indicates that the image can either be an `ImageField` type or `None` |
|
||||
| Field | `Field(description="The input image", default=None)` | The image variable is a field which needs a description and a default value that we set to `None`. |
|
||||
|
||||
Great. Now let us create our other inputs for `width` and `height`
|
||||
|
||||
```python
from typing import Literal, Union
from pydantic import Field

from .baseinvocation import BaseInvocation
from ..models.image import ImageField

class ResizeInvocation(BaseInvocation):
    '''Resizes an image'''
    type: Literal['resize'] = 'resize'

    # Inputs
    image: Union[ImageField, None] = Field(description="The input image", default=None)
    width: int = Field(default=512, ge=64, le=2048, description="Width of the new image")
    height: int = Field(default=512, ge=64, le=2048, description="Height of the new image")
```
|
||||
|
||||
As you might have noticed, we added two new parameters to the field type for
|
||||
`width` and `height` called `ge` and `le`. These basically stand for _greater
|
||||
than or equal to_ and _less than or equal to_. There are various other param
|
||||
types for `Field` that you can find in the **pydantic** documentation.
|
||||
|
||||
**Note:** _Any time it is possible to define constraints for our field, we
|
||||
should do it so the frontend has more information on how to parse this field._
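To see what these constraints do in practice, here is a small standalone sketch that uses pydantic directly, outside of InvokeAI. The `SizeExample` model is hypothetical and exists only to demonstrate how `ge` and `le` reject out-of-range values.

```python
from pydantic import BaseModel, Field, ValidationError


class SizeExample(BaseModel):
    '''Hypothetical model used only to demonstrate Field constraints'''
    width: int = Field(default=512, ge=64, le=2048, description="Width of the new image")


# A value inside the allowed range is accepted.
ok = SizeExample(width=1024)
print(ok.width)

# A value below the `ge` bound raises a ValidationError.
rejected = False
try:
    SizeExample(width=32)
except ValidationError as err:
    rejected = True
    print("rejected with", len(err.errors()), "error(s)")
```

Because the bounds live on the field itself, invalid values are caught when the model is constructed, before any of your own code runs.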
|
||||
|
||||
Perfect. We now have our inputs. Let us do something with these.
|
||||
|
||||
### **Invoke Function**
|
||||
|
||||
The `invoke` function is where all the magic happens. This function provides you
|
||||
the `context` parameter that is of the type `InvocationContext` which will give
|
||||
you access to the current context of the generation and all the other services
|
||||
that are provided by InvokeAI.
|
||||
|
||||
Let us create this function first.
|
||||
|
||||
```python
from typing import Literal, Union
from pydantic import Field

from .baseinvocation import BaseInvocation, InvocationContext
from ..models.image import ImageField

class ResizeInvocation(BaseInvocation):
    '''Resizes an image'''
    type: Literal['resize'] = 'resize'

    # Inputs
    image: Union[ImageField, None] = Field(description="The input image", default=None)
    width: int = Field(default=512, ge=64, le=2048, description="Width of the new image")
    height: int = Field(default=512, ge=64, le=2048, description="Height of the new image")

    def invoke(self, context: InvocationContext):
        pass
```
|
||||
|
||||
### **Outputs**
|
||||
|
||||
The output of our Invocation will be whatever is returned by this `invoke`
|
||||
function. Like with our inputs, we need to strongly type and define our outputs
|
||||
too.
|
||||
|
||||
What is our output going to be? Another image. Normally you'd have to create a
|
||||
type for this but InvokeAI already offers you an `ImageOutput` type that handles
|
||||
all the necessary info related to image outputs. So let us use that.
|
||||
|
||||
We will cover how to create your own output types later in this guide.
|
||||
|
||||
```python
from typing import Literal, Union
from pydantic import Field

from .baseinvocation import BaseInvocation, InvocationContext
from ..models.image import ImageField
from .image import ImageOutput

class ResizeInvocation(BaseInvocation):
    '''Resizes an image'''
    type: Literal['resize'] = 'resize'

    # Inputs
    image: Union[ImageField, None] = Field(description="The input image", default=None)
    width: int = Field(default=512, ge=64, le=2048, description="Width of the new image")
    height: int = Field(default=512, ge=64, le=2048, description="Height of the new image")

    def invoke(self, context: InvocationContext) -> ImageOutput:
        pass
```
|
||||
|
||||
Perfect. Now that we have our Invocation set up, let us do what we want to do.
|
||||
|
||||
- We will first load the image. Generally we do this using the `PIL` library but
|
||||
we can use one of the services provided by InvokeAI to load the image.
|
||||
- We will resize the image using `PIL` to our input data.
|
||||
- We will output this image in the format we set above.
|
||||
|
||||
So let's do that.
|
||||
|
||||
```python
from typing import Literal, Union
from pydantic import Field

from .baseinvocation import BaseInvocation, InvocationContext
from ..models.image import ImageField, ResourceOrigin, ImageCategory
from .image import ImageOutput

class ResizeInvocation(BaseInvocation):
    '''Resizes an image'''
    type: Literal['resize'] = 'resize'

    # Inputs
    image: Union[ImageField, None] = Field(description="The input image", default=None)
    width: int = Field(default=512, ge=64, le=2048, description="Width of the new image")
    height: int = Field(default=512, ge=64, le=2048, description="Height of the new image")

    def invoke(self, context: InvocationContext) -> ImageOutput:
        # Load the image using InvokeAI's predefined Image Service.
        image = context.services.images.get_pil_image(self.image.image_origin, self.image.image_name)

        # Resizing the image
        # Because we used the above service, we already have a PIL image. So we can simply resize.
        resized_image = image.resize((self.width, self.height))

        # Preparing the image for output using InvokeAI's predefined Image Service.
        output_image = context.services.images.create(
            image=resized_image,
            image_origin=ResourceOrigin.INTERNAL,
            image_category=ImageCategory.GENERAL,
            node_id=self.id,
            session_id=context.graph_execution_state_id,
            is_intermediate=self.is_intermediate,
        )

        # Returning the Image
        return ImageOutput(
            image=ImageField(
                image_name=output_image.image_name,
                image_origin=output_image.image_origin,
            ),
            width=output_image.width,
            height=output_image.height,
        )
```
|
||||
|
||||
**Note:** Do not be overwhelmed by the `ImageOutput` process. InvokeAI has a
|
||||
certain way that the images need to be dispatched in order to be stored and read
|
||||
correctly. In 99% of the cases when dealing with an image output, you can simply
|
||||
copy-paste the template above.
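If you want to check the resize step on its own before wiring it into an Invocation, the following standalone sketch uses only `PIL`. The blank source image is a stand-in we create for the example; inside an Invocation the image would come from the image service instead.

```python
from PIL import Image

# A blank stand-in image; in an Invocation this comes from the image service.
source = Image.new("RGB", (800, 600), color=(0, 0, 0))

# resize() returns a new image and leaves the source untouched.
resized = source.resize((512, 512))

print(source.size)
print(resized.size)
```

Note that `resize` does not preserve the aspect ratio; it stretches the image to exactly the `width` and `height` you pass in.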
|
||||
|
||||
That's it. You made your own **Resize Invocation**.
|
||||
|
||||
## Result
|
||||
|
||||
Once you make your Invocation correctly, the rest of the process is fully
|
||||
automated for you.
|
||||
|
||||
When you launch InvokeAI, you can go to `http://localhost:9090/docs` and see
|
||||
your new Invocation show up there with all the relevant info.
|
||||
|
||||

|
||||
|
||||
When you launch the frontend UI, you can go to the Node Editor tab and find your
|
||||
new Invocation ready to be used.
|
||||
|
||||

|
||||
|
||||
# Advanced
|
||||
|
||||
## Custom Input Fields
|
||||
|
||||
Now that you know how to create your own Invocations, let us dive into slightly
|
||||
more advanced topics.
|
||||
|
||||
While creating your own Invocations, you might run into a scenario where the
|
||||
existing input types in InvokeAI do not meet your requirements. In such cases,
|
||||
you can create your own input types.
|
||||
|
||||
Let us create one as an example. Let us say we want to create a color input
|
||||
field that represents a color code. But before we start on that here are some
|
||||
general good practices to keep in mind.
|
||||
|
||||
**Good Practices**
|
||||
|
||||
- There is no naming convention for input fields but we highly recommend that
|
||||
you name it something appropriate like `ColorField`.
|
||||
- It is not mandatory but it is heavily recommended to add a relevant
|
||||
`docstring` to describe your input field.
|
||||
- Keep your field in the same file as the Invocation that it is made for or in
|
||||
another file where it is relevant.
|
||||
|
||||
All input types are classes that derive from the `BaseModel` type from `pydantic`.
|
||||
So let's create one.
|
||||
|
||||
```python
from pydantic import BaseModel

class ColorField(BaseModel):
    '''A field that holds the rgba values of a color'''
    pass
```
|
||||
|
||||
Perfect. Now let us create our custom inputs for our field. This is exactly
|
||||
like how you created input fields for your Invocation. All the same rules
|
||||
apply. Let us create four fields representing the _red(r)_, _green(g)_,
|
||||
_blue(b)_ and _alpha(a)_ channels of the color.
|
||||
|
||||
```python
class ColorField(BaseModel):
    '''A field that holds the rgba values of a color'''
    r: int = Field(ge=0, le=255, description="The red channel")
    g: int = Field(ge=0, le=255, description="The green channel")
    b: int = Field(ge=0, le=255, description="The blue channel")
    a: int = Field(ge=0, le=255, description="The alpha channel")
```
|
||||
|
||||
That's it. We now have a new input field type that we can use in our Invocations
|
||||
like this.
|
||||
|
||||
```python
|
||||
color: ColorField = Field(default=ColorField(r=0, g=0, b=0, a=0), description='Background color of an image')
|
||||
```
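Here is a standalone sketch of how such a field might be consumed. The `to_rgba_tuple` helper is hypothetical and not part of InvokeAI; it simply shows one way to turn the validated field into the 4-tuple form that imaging libraries such as `PIL` commonly accept.

```python
from typing import Tuple

from pydantic import BaseModel, Field, ValidationError


class ColorField(BaseModel):
    '''A field that holds the rgba values of a color'''
    r: int = Field(ge=0, le=255, description="The red channel")
    g: int = Field(ge=0, le=255, description="The green channel")
    b: int = Field(ge=0, le=255, description="The blue channel")
    a: int = Field(ge=0, le=255, description="The alpha channel")


def to_rgba_tuple(color: ColorField) -> Tuple[int, int, int, int]:
    """Hypothetical helper: unpack the channels into an (r, g, b, a) tuple."""
    return (color.r, color.g, color.b, color.a)


opaque_red = ColorField(r=255, g=0, b=0, a=255)
print(to_rgba_tuple(opaque_red))

# Channel values outside 0-255 are rejected when the field is created.
out_of_range_rejected = False
try:
    ColorField(r=300, g=0, b=0, a=255)
except ValidationError:
    out_of_range_rejected = True
```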
|
||||
|
||||
**Extra Config**
|
||||
|
||||
All input fields also take an additional `Config` class that you can use to do
|
||||
various advanced things such as setting required parameters.
|
||||
|
||||
Let us do that for our _ColorField_ and enforce all the values because we did
|
||||
not define any defaults for our fields.
|
||||
|
||||
```python
class ColorField(BaseModel):
    '''A field that holds the rgba values of a color'''
    r: int = Field(ge=0, le=255, description="The red channel")
    g: int = Field(ge=0, le=255, description="The green channel")
    b: int = Field(ge=0, le=255, description="The blue channel")
    a: int = Field(ge=0, le=255, description="The alpha channel")

    class Config:
        schema_extra = {"required": ["r", "g", "b", "a"]}
```
|
||||
|
||||
Now it becomes mandatory for the user to supply all the values required by our
|
||||
input field.
|
||||
|
||||
We will discuss the `Config` class in more detail later in this guide, along
with how you can use it to make your Invocations more robust.
|
||||
|
||||
## Custom Output Types
|
||||
|
||||
Like with custom inputs, sometimes you might find yourself needing custom
|
||||
outputs that InvokeAI does not provide. We can easily set one up.
|
||||
|
||||
Now that you are familiar with Invocations and Inputs, let us use that knowledge
|
||||
to put together a custom output type for an Invocation that returns _width_,
|
||||
_height_ and _background_color_ that we need to create a blank image.
|
||||
|
||||
- A custom output type is a class that derives from the parent class of
|
||||
`BaseInvocationOutput`.
|
||||
- It is not mandatory but we recommend using names ending with `Output` for
|
||||
output types. So we'll call our class `BlankImageOutput`
|
||||
- It is not mandatory but we highly recommend adding a `docstring` to describe
|
||||
what your output type is for.
|
||||
- Like Invocations, each output type should have a `type` variable that is
|
||||
**unique**
|
||||
|
||||
Now that we know the basic rules for creating a new output type, let us go ahead
|
||||
and make it.
|
||||
|
||||
```python
from typing import Literal
from pydantic import Field

from .baseinvocation import BaseInvocationOutput

class BlankImageOutput(BaseInvocationOutput):
    '''Base output type for creating a blank image'''
    type: Literal['blank_image_output'] = 'blank_image_output'

    # Outputs (ColorField is the custom field type created earlier in this guide)
    width: int = Field(description='Width of blank image')
    height: int = Field(description='Height of blank image')
    bg_color: ColorField = Field(description='Background color of blank image')

    class Config:
        schema_extra = {"required": ["type", "width", "height", "bg_color"]}
```
|
||||
|
||||
All set. We now have an output type that requires everything we need to create a
blank image. And if you noticed, we even used the `Config` class to ensure
the fields are required.
|
||||
|
||||
## Custom Configuration
|
||||
|
||||
As you might have noticed when making inputs and outputs, we used a class called
|
||||
`Config` from _pydantic_ to further customize them. Because our inputs and
|
||||
outputs essentially inherit from _pydantic_'s `BaseModel` class, all
|
||||
[configuration options](https://docs.pydantic.dev/latest/usage/schema/#schema-customization)
|
||||
that are valid for _pydantic_ classes are also valid for our inputs and outputs.
|
||||
You can do the same for your Invocations too but InvokeAI makes our life a
|
||||
little bit easier on that end.
|
||||
|
||||
InvokeAI provides a custom configuration class called `InvocationConfig`
specifically for configuring Invocations. It is exactly the same as the raw
`Config` class from _pydantic_, with some extras on top to help facilitate
parsing of the schema in the frontend UI.
|
||||
|
||||
At the moment, this `InvocationConfig` class adds the following features
related to the `ui`.
|
||||
|
||||
| Config Option | Field Type                                                                                                    | Example                                                                                                                            |
| ------------- | ------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------- |
| type_hints    | `Dict[str, Literal["integer", "float", "boolean", "string", "enum", "image", "latents", "model", "control"]]` | `type_hint: "model"` provides type hints related to the model, such as displaying a list of available models.                      |
| tags          | `List[str]`                                                                                                   | `tags: ['resize', 'image']` will classify your invocation under the tags of resize and image.                                      |
| title         | `str`                                                                                                         | `title: 'Resize Image'` will rename your invocation to this custom title rather than infer it from the name of the Invocation class. |
|
||||
|
||||
So let us update your `ResizeInvocation` with some extra configuration and see
|
||||
how that works.
|
||||
|
||||
```python
from typing import Literal, Union
from pydantic import Field

from .baseinvocation import BaseInvocation, InvocationContext, InvocationConfig
from ..models.image import ImageField, ResourceOrigin, ImageCategory
from .image import ImageOutput

class ResizeInvocation(BaseInvocation):
    '''Resizes an image'''
    type: Literal['resize'] = 'resize'

    # Inputs
    image: Union[ImageField, None] = Field(description="The input image", default=None)
    width: int = Field(default=512, ge=64, le=2048, description="Width of the new image")
    height: int = Field(default=512, ge=64, le=2048, description="Height of the new image")

    class Config(InvocationConfig):
        # `schema_extra` is a plain Python dict, so keys are quoted strings
        # and `title` is a single string rather than a list.
        schema_extra = {
            "ui": {
                "tags": ["resize", "image"],
                "title": "My Custom Resize",
            },
        }

    def invoke(self, context: InvocationContext) -> ImageOutput:
        # Load the image using InvokeAI's predefined Image Service.
        image = context.services.images.get_pil_image(self.image.image_origin, self.image.image_name)

        # Resizing the image
        # Because we used the above service, we already have a PIL image. So we can simply resize.
        resized_image = image.resize((self.width, self.height))

        # Preparing the image for output using InvokeAI's predefined Image Service.
        output_image = context.services.images.create(
            image=resized_image,
            image_origin=ResourceOrigin.INTERNAL,
            image_category=ImageCategory.GENERAL,
            node_id=self.id,
            session_id=context.graph_execution_state_id,
            is_intermediate=self.is_intermediate,
        )

        # Returning the Image
        return ImageOutput(
            image=ImageField(
                image_name=output_image.image_name,
                image_origin=output_image.image_origin,
            ),
            width=output_image.width,
            height=output_image.height,
        )
```
|
||||
|
||||
We have now customized our code to let the frontend know that our Invocation falls
under the `resize` and `image` categories. So when the user searches for these
particular words, our Invocation will show up too.
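The tag lookup described above can be pictured with a small sketch. This is not the actual frontend implementation (which lives in the TypeScript UI). `NODE_SCHEMAS` and `find_by_tag` are hypothetical names, and the dictionaries only mimic the shape of the `ui` entry in `schema_extra`.

```python
from typing import Dict, List

# Hypothetical, trimmed-down schema entries keyed by invocation `type`.
NODE_SCHEMAS: Dict[str, dict] = {
    "resize": {"ui": {"tags": ["resize", "image"], "title": "My Custom Resize"}},
    "upscale": {"ui": {"tags": ["upscaling", "image"]}},
    "noise": {},  # an invocation without any `ui` configuration
}


def find_by_tag(schemas: Dict[str, dict], term: str) -> List[str]:
    """Return the invocation types whose tags contain the search term."""
    term = term.lower()
    return sorted(
        name
        for name, schema in schemas.items()
        if term in [tag.lower() for tag in schema.get("ui", {}).get("tags", [])]
    )
```

Searching for `image` in this toy data returns both `resize` and `upscale`, while invocations without tags are simply never matched.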
|
||||
|
||||
We also set a custom title for our Invocation. So instead of being called
|
||||
`Resize`, it will be called `My Custom Resize`.
|
||||
|
||||
As simple as that.
|
||||
|
||||
As time goes by, InvokeAI will further improve and add more customizability for
|
||||
Invocation configuration. We will have more documentation regarding this at a
|
||||
later time.
|
||||
|
||||
# **[TODO]**
|
||||
|
||||
## Custom Components For Frontend
|
||||
|
||||
Every backend input type should have a corresponding frontend component so the
|
||||
UI knows what to render when you use a particular field type.
|
||||
|
||||
If you are using existing field types, we already have components for those. So
|
||||
you don't have to worry about creating anything new. But this might not always
|
||||
be the case. Sometimes you might want to create new field types and have the
|
||||
frontend UI deal with it in a different way.
|
||||
|
||||
This is where we venture into the world of React and JavaScript and create our
own new components for our Invocations. Do not fear the world of JS. It's
actually pretty straightforward.
|
||||
|
||||
Let us create a new component for our custom color field we created above. When
|
||||
we use a color field, let us say we want the UI to display a color picker for
|
||||
the user to pick from rather than entering values. That is what we will build
|
||||
now.
|
||||
|
||||
---
|
||||
|
||||
# OLD -- TO BE DELETED OR MOVED LATER
|
||||
|
||||
---
|
||||
Invocations represent a single operation, its inputs, and its outputs. These operations and their outputs can be chained together to generate and modify images.
|
||||
|
||||
## Creating a new invocation
|
||||
|
||||
To create a new invocation, either find the appropriate module file in
|
||||
`/ldm/invoke/app/invocations` to add your invocation to, or create a new one in
|
||||
that folder. All invocations in that folder will be discovered and made
|
||||
available to the CLI and API automatically. Invocations make use of
|
||||
[typing](https://docs.python.org/3/library/typing.html) and
|
||||
[pydantic](https://pydantic-docs.helpmanual.io/) for validation and integration
|
||||
into the CLI and API.
|
||||
|
||||
An invocation looks like this:
|
||||
|
||||
```py
|
||||
class UpscaleInvocation(BaseInvocation):
|
||||
"""Upscales an image."""
|
||||
|
||||
# fmt: off
|
||||
type: Literal["upscale"] = "upscale"
|
||||
type: Literal['upscale'] = 'upscale'
|
||||
|
||||
# Inputs
|
||||
image: Union[ImageField, None] = Field(description="The input image", default=None)
|
||||
strength: float = Field(default=0.75, gt=0, le=1, description="The strength")
|
||||
level: Literal[2, 4] = Field(default=2, description="The upscale level")
|
||||
# fmt: on
|
||||
|
||||
# Schema customisation
|
||||
class Config(InvocationConfig):
|
||||
schema_extra = {
|
||||
"ui": {
|
||||
"tags": ["upscaling", "image"],
|
||||
},
|
||||
}
|
||||
image: Union[ImageField,None] = Field(description="The input image")
|
||||
strength: float = Field(default=0.75, gt=0, le=1, description="The strength")
|
||||
level: Literal[2,4] = Field(default=2, description = "The upscale level")
|
||||
|
||||
def invoke(self, context: InvocationContext) -> ImageOutput:
|
||||
image = context.services.images.get_pil_image(
|
||||
self.image.image_origin, self.image.image_name
|
||||
)
|
||||
results = context.services.restoration.upscale_and_reconstruct(
|
||||
image_list=[[image, 0]],
|
||||
upscale=(self.level, self.strength),
|
||||
strength=0.0, # GFPGAN strength
|
||||
save_original=False,
|
||||
image_callback=None,
|
||||
image = context.services.images.get(self.image.image_type, self.image.image_name)
|
||||
results = context.services.generate.upscale_and_reconstruct(
|
||||
image_list = [[image, 0]],
|
||||
upscale = (self.level, self.strength),
|
||||
strength = 0.0, # GFPGAN strength
|
||||
save_original = False,
|
||||
image_callback = None,
|
||||
)
|
||||
|
||||
# Results are image and seed, unwrap for now
|
||||
# TODO: can this return multiple results?
|
||||
image_dto = context.services.images.create(
|
||||
image=results[0][0],
|
||||
image_origin=ResourceOrigin.INTERNAL,
|
||||
image_category=ImageCategory.GENERAL,
|
||||
node_id=self.id,
|
||||
session_id=context.graph_execution_state_id,
|
||||
is_intermediate=self.is_intermediate,
|
||||
)
|
||||
|
||||
image_type = ImageType.RESULT
|
||||
image_name = context.services.images.create_name(context.graph_execution_state_id, self.id)
|
||||
context.services.images.save(image_type, image_name, results[0][0])
|
||||
return ImageOutput(
|
||||
image=ImageField(
|
||||
image_name=image_dto.image_name,
|
||||
image_origin=image_dto.image_origin,
|
||||
),
|
||||
width=image_dto.width,
|
||||
height=image_dto.height,
|
||||
image = ImageField(image_type = image_type, image_name = image_name)
|
||||
)
|
||||
|
||||
```
|
||||
|
||||
Each portion is important to implement correctly.
|
||||
|
||||
### Class definition and type
|
||||
|
||||
```py
|
||||
class UpscaleInvocation(BaseInvocation):
|
||||
"""Upscales an image."""
|
||||
type: Literal['upscale'] = 'upscale'
|
||||
```
|
||||
|
||||
All invocations must derive from `BaseInvocation`. They should have a docstring
|
||||
that declares what they do in a single, short line. They should also have a
|
||||
`type` with a type hint that's `Literal["command_name"]`, where `command_name`
|
||||
is what the user will type on the CLI or use in the API to create this
|
||||
invocation. The `command_name` must be unique. The `type` must be assigned to
|
||||
the value of the literal in the type hint.
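The two rules above (the `type` default must equal its `Literal` value, and the command name must be unique) can be checked mechanically. The following is a minimal standard-library sketch. `BaseInvocation` here is a plain stand-in class rather than InvokeAI's real base class, and `command_name` and `build_registry` are hypothetical helpers.

```python
from typing import Dict, Iterable, Literal, get_args, get_type_hints


class BaseInvocation:
    """Stand-in for InvokeAI's BaseInvocation (assumption: a plain class)."""


class UpscaleInvocation(BaseInvocation):
    """Upscales an image."""
    type: Literal["upscale"] = "upscale"


class ResizeInvocation(BaseInvocation):
    """Resizes an image."""
    type: Literal["resize"] = "resize"


def command_name(cls: type) -> str:
    """Return the Literal value of `type`, checking it matches the default."""
    (literal_value,) = get_args(get_type_hints(cls)["type"])
    if cls.type != literal_value:
        raise ValueError(f"{cls.__name__}: default does not match the Literal")
    return literal_value


def build_registry(classes: Iterable[type]) -> Dict[str, type]:
    """Map command names to classes, rejecting duplicate names."""
    registry: Dict[str, type] = {}
    for cls in classes:
        name = command_name(cls)
        if name in registry:
            raise ValueError(f"duplicate command name: {name}")
        registry[name] = cls
    return registry
```

Passing the same class twice, or two classes that share a command name, raises a `ValueError` instead of silently overwriting an entry.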
|
||||
|
||||
### Inputs
|
||||
|
||||
```py
|
||||
# Inputs
|
||||
image: Union[ImageField,None] = Field(description="The input image")
|
||||
strength: float = Field(default=0.75, gt=0, le=1, description="The strength")
|
||||
level: Literal[2,4] = Field(default=2, description="The upscale level")
|
||||
```
|
||||
|
||||
Inputs consist of three parts: a name, a type hint, and a `Field` with default,
|
||||
description, and validation information. For example:
|
||||
|
||||
| Part      | Value                                                         | Description                                                                                                               |
| --------- | ------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------- |
| Name      | `strength`                                                    | This field is referred to as `strength`                                                                                   |
| Type Hint | `float`                                                       | This field must be of type `float`                                                                                        |
| Field     | `Field(default=0.75, gt=0, le=1, description="The strength")` | The default value is `0.75`, the value must be in the range (0,1], and help text will show "The strength" for this field. |
|
||||
|
||||
Notice that `image` has type `Union[ImageField,None]`. The `Union` allows this
|
||||
field to be parsed with `None` as a value, which enables linking to previous
|
||||
invocations. All fields should either provide a default value or allow `None` as
|
||||
a value, so that they can be overwritten with a linked output from another
|
||||
invocation.
|
||||
|
||||
The special type `ImageField` is also used here. All images are passed as
|
||||
`ImageField`, which protects them from pydantic validation errors (since images
|
||||
only ever come from links).
|
||||
|
||||
Finally, note that for all linking, the `type` of the linked fields must match.
|
||||
If the `name` also matches, then the field can be **automatically linked** to a
|
||||
previous invocation by name and matching.
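The name-and-type matching rule can be sketched as follows. This is only an illustration of the rule as stated, not InvokeAI's actual linking code. `ImageField` is a stand-in class and `linkable_fields` is a hypothetical helper.

```python
from typing import Dict, List


class ImageField:
    """Stand-in for InvokeAI's ImageField (assumption: a plain class)."""


def linkable_fields(outputs: Dict[str, type], inputs: Dict[str, type]) -> List[str]:
    """Names of input fields that match an output field by name and by type."""
    return sorted(
        name
        for name, field_type in inputs.items()
        if outputs.get(name) is field_type
    )


# Field name -> type, loosely modelled on the classes in this guide.
IMAGE_OUTPUT = {"image": ImageField, "width": int, "height": int}
UPSCALE_INPUTS = {"image": ImageField, "strength": float, "level": int}
RESIZE_INPUTS = {"image": ImageField, "width": int, "height": int}
```

Under these assumptions only `image` can be linked automatically into the upscale inputs, while all three resize inputs line up with the image output.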
|
||||
|
||||
### Config
|
||||
|
||||
```py
|
||||
# Schema customisation
|
||||
class Config(InvocationConfig):
|
||||
schema_extra = {
|
||||
"ui": {
|
||||
"tags": ["upscaling", "image"],
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
This is an optional configuration for the invocation. It inherits from
pydantic's model `Config` class, and it is used primarily to customize the
autogenerated OpenAPI schema.
|
||||
|
||||
The UI relies on the OpenAPI schema in two ways:
|
||||
|
||||
- An API client & Typescript types are generated from it. This happens at build
|
||||
time.
|
||||
- The node editor parses the schema into a template used by the UI to create the
|
||||
node editor UI. This parsing happens at runtime.
|
||||
|
||||
In this example, a `ui` key has been added to the `schema_extra` dict to provide
|
||||
some tags for the UI, to facilitate filtering nodes.
|
||||
|
||||
See the Schema Generation section below for more information.
|
||||
|
||||
### Invoke Function
|
||||
|
||||
```py
|
||||
def invoke(self, context: InvocationContext) -> ImageOutput:
|
||||
image = context.services.images.get_pil_image(
|
||||
self.image.image_origin, self.image.image_name
|
||||
)
|
||||
results = context.services.restoration.upscale_and_reconstruct(
|
||||
image_list=[[image, 0]],
|
||||
upscale=(self.level, self.strength),
|
||||
strength=0.0, # GFPGAN strength
|
||||
save_original=False,
|
||||
image_callback=None,
|
||||
image = context.services.images.get(self.image.image_type, self.image.image_name)
|
||||
results = context.services.generate.upscale_and_reconstruct(
|
||||
image_list = [[image, 0]],
|
||||
upscale = (self.level, self.strength),
|
||||
strength = 0.0, # GFPGAN strength
|
||||
save_original = False,
|
||||
image_callback = None,
|
||||
)
|
||||
|
||||
# Results are image and seed, unwrap for now
|
||||
# TODO: can this return multiple results?
|
||||
image_dto = context.services.images.create(
|
||||
image=results[0][0],
|
||||
image_origin=ResourceOrigin.INTERNAL,
|
||||
image_category=ImageCategory.GENERAL,
|
||||
node_id=self.id,
|
||||
session_id=context.graph_execution_state_id,
|
||||
is_intermediate=self.is_intermediate,
|
||||
)
|
||||
|
||||
image_type = ImageType.RESULT
|
||||
image_name = context.services.images.create_name(context.graph_execution_state_id, self.id)
|
||||
context.services.images.save(image_type, image_name, results[0][0])
|
||||
return ImageOutput(
|
||||
image=ImageField(
|
||||
image_name=image_dto.image_name,
|
||||
image_origin=image_dto.image_origin,
|
||||
),
|
||||
width=image_dto.width,
|
||||
height=image_dto.height,
|
||||
image = ImageField(image_type = image_type, image_name = image_name)
|
||||
)
|
||||
```
|
||||
|
||||
The `invoke` function is the last portion of an invocation. It is provided an
|
||||
`InvocationContext` which contains services to perform work as well as a
|
||||
`session_id` for use as needed. It should return a class with output values that
|
||||
derives from `BaseInvocationOutput`.
|
||||
|
||||
Before being called, the invocation will have all of its fields set from
|
||||
defaults, inputs, and finally links (overriding in that order).
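The override order can be illustrated with plain dictionaries. This is a sketch of the precedence rule only. `resolve_fields` is a hypothetical helper and not how InvokeAI populates fields internally.

```python
from typing import Any, Dict


def resolve_fields(
    defaults: Dict[str, Any],
    inputs: Dict[str, Any],
    links: Dict[str, Any],
) -> Dict[str, Any]:
    """Apply defaults first, then user inputs, then linked outputs."""
    resolved = dict(defaults)   # lowest priority
    resolved.update(inputs)     # explicit inputs override defaults
    resolved.update(links)      # links override everything else
    return resolved


fields = resolve_fields(
    defaults={"image": None, "strength": 0.75, "level": 2},
    inputs={"level": 4},
    links={"image": "image-from-previous-invocation"},
)
```

Here `strength` keeps its default, `level` takes the user-supplied value, and `image` is filled in from the link.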
|
||||
|
||||
Assume that this invocation may be running simultaneously with other
|
||||
invocations, may be running on another machine, or in other interesting
|
||||
scenarios. If you need functionality, please provide it as a service in the
|
||||
`InvocationServices` class, and make sure it can be overridden.
|
||||
|
||||
### Outputs
|
||||
|
||||
```py
|
||||
class ImageOutput(BaseInvocationOutput):
|
||||
"""Base class for invocations that output an image"""
|
||||
type: Literal['image'] = 'image'
|
||||
|
||||
# fmt: off
|
||||
type: Literal["image_output"] = "image_output"
|
||||
image: ImageField = Field(default=None, description="The output image")
|
||||
width: int = Field(description="The width of the image in pixels")
|
||||
height: int = Field(description="The height of the image in pixels")
|
||||
# fmt: on
|
||||
|
||||
class Config:
|
||||
schema_extra = {"required": ["type", "image", "width", "height"]}
|
||||
image: ImageField = Field(default=None, description="The output image")
|
||||
```
|
||||
|
||||
Output classes look like an invocation class without the invoke method. Prefer
|
||||
to use an existing output class if available, and prefer to name inputs the same
|
||||
as outputs when possible, to promote automatic invocation linking.
|
||||
|
||||
## Schema Generation
|
||||
|
||||
Invocation, output and related classes are used to generate an OpenAPI schema.
|
||||
|
||||
### Required Properties
|
||||
|
||||
The schema generation treats all properties with default values as optional. This
makes sense internally, but when using these classes via the generated
schema, we end up with e.g. the `ImageOutput` class having its `image` property
marked as optional.
|
||||
|
||||
We know that this property will always be present, so the additional logic
|
||||
needed to always check if the property exists adds a lot of extraneous cruft.
|
||||
|
||||
To fix this, we can leverage `pydantic`'s
|
||||
[schema customisation](https://docs.pydantic.dev/usage/schema/#schema-customization)
|
||||
to mark properties that we know will always be present as required.
|
||||
|
||||
Here's that `ImageOutput` class, without the needed schema customisation:
|
||||
|
||||
```python
|
||||
class ImageOutput(BaseInvocationOutput):
|
||||
"""Base class for invocations that output an image"""
|
||||
|
||||
# fmt: off
|
||||
type: Literal["image_output"] = "image_output"
|
||||
image: ImageField = Field(default=None, description="The output image")
|
||||
width: int = Field(description="The width of the image in pixels")
|
||||
height: int = Field(description="The height of the image in pixels")
|
||||
# fmt: on
|
||||
```
|
||||
|
||||
The OpenAPI schema that results from this `ImageOutput` will have the `type`,
|
||||
`image`, `width` and `height` properties marked as optional, even though we know
|
||||
they will always have a value.
|
||||
|
||||
```python
|
||||
class ImageOutput(BaseInvocationOutput):
|
||||
"""Base class for invocations that output an image"""
|
||||
|
||||
# fmt: off
|
||||
type: Literal["image_output"] = "image_output"
|
||||
image: ImageField = Field(default=None, description="The output image")
|
||||
width: int = Field(description="The width of the image in pixels")
|
||||
height: int = Field(description="The height of the image in pixels")
|
||||
# fmt: on
|
||||
|
||||
# Add schema customization
|
||||
class Config:
|
||||
schema_extra = {"required": ["type", "image", "width", "height"]}
|
||||
```
|
||||
|
||||
With the customization in place, the schema will now show these properties as
|
||||
required, obviating the need for extensive null checks in client code.
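What the customization changes in the generated schema can be sketched with plain dictionaries, without involving pydantic. The schema below is a hand-written stand-in rather than real generator output, and `mark_required` and `optional_properties` are hypothetical helpers.

```python
import copy
from typing import Any, Dict, List

# Hand-written stand-in for a generated schema with no `required` list.
IMAGE_OUTPUT_SCHEMA: Dict[str, Any] = {
    "title": "ImageOutput",
    "type": "object",
    "properties": {
        "type": {"type": "string"},
        "image": {"type": "object"},
        "width": {"type": "integer"},
        "height": {"type": "integer"},
    },
}


def mark_required(schema: Dict[str, Any], names: List[str]) -> Dict[str, Any]:
    """Return a copy of the schema with the given properties marked required."""
    patched = copy.deepcopy(schema)
    unknown = [name for name in names if name not in patched.get("properties", {})]
    if unknown:
        raise KeyError(f"unknown properties: {unknown}")
    patched["required"] = list(names)
    return patched


def optional_properties(schema: Dict[str, Any]) -> List[str]:
    """Properties a client would have to null-check."""
    required = set(schema.get("required", []))
    return sorted(name for name in schema.get("properties", {}) if name not in required)
```

Before patching, a client has to treat every property as possibly missing. After `mark_required(...)` with all four names, `optional_properties` returns an empty list.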
|
||||
|
||||
See this `pydantic` issue for discussion on this solution:
|
||||
<https://github.com/pydantic/pydantic/discussions/4577>
|
||||
|
@ -81,193 +81,3 @@ pytest --cov; open ./coverage/html/index.html
|
||||
<!--#TODO: get input from blessedcoolant here, for the moment inserted the frontend README via snippets extension.-->
|
||||
|
||||
--8<-- "invokeai/frontend/web/README.md"
|
||||
|
||||
## Developing InvokeAI in VSCode
|
||||
|
||||
VSCode offers some nice tools:
|
||||
|
||||
- python debugger
|
||||
- automatic `venv` activation
|
||||
- remote dev (e.g. run InvokeAI on a beefy linux desktop while you type in
|
||||
comfort on your macbook)
|
||||
|
||||
### Setup
|
||||
|
||||
You'll need the
|
||||
[Python](https://marketplace.visualstudio.com/items?itemName=ms-python.python)
|
||||
and
|
||||
[Pylance](https://marketplace.visualstudio.com/items?itemName=ms-python.vscode-pylance)
|
||||
extensions installed first.
|
||||
|
||||
It's also really handy to install the `Jupyter` extensions:
|
||||
|
||||
- [Jupyter](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter)
|
||||
- [Jupyter Cell Tags](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.vscode-jupyter-cell-tags)
|
||||
- [Jupyter Notebook Renderers](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter-renderers)
|
||||
- [Jupyter Slide Show](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.vscode-jupyter-slideshow)
|
||||
|
||||
#### InvokeAI workspace
|
||||
|
||||
Creating a VSCode workspace for working on InvokeAI is highly recommended. It
|
||||
can hold InvokeAI-specific settings and configs.
|
||||
|
||||
To make a workspace:
|
||||
|
||||
- Open the InvokeAI repo dir in VSCode
|
||||
- `File` > `Save Workspace As` > save it _outside_ the repo
|
||||
|
||||
#### Default python interpreter (i.e. automatic virtual environment activation)
|
||||
|
||||
- Use command palette to run command
|
||||
`Preferences: Open Workspace Settings (JSON)`
|
||||
- Add `python.defaultInterpreterPath` to `settings`, pointing to your `venv`'s
|
||||
python
|
||||
|
||||
Should look something like this:
|
||||
|
||||
```jsonc
|
||||
{
|
||||
// I like to have all InvokeAI-related folders in my workspace
|
||||
"folders": [
|
||||
{
|
||||
// repo root
|
||||
"path": "InvokeAI"
|
||||
},
|
||||
{
|
||||
// InvokeAI root dir, where `invokeai.yaml` lives
|
||||
"path": "/path/to/invokeai_root"
|
||||
}
|
||||
],
|
||||
"settings": {
|
||||
// Where your InvokeAI `venv`'s python executable lives
|
||||
"python.defaultInterpreterPath": "/path/to/invokeai_root/.venv/bin/python"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Now when you open the VSCode integrated terminal, or do anything that needs to
|
||||
run python, it will automatically be in your InvokeAI virtual environment.
|
||||
|
||||
Bonus: when you create and run a Jupyter notebook, you'll be prompted for the
python interpreter to run it in. This will default to your `venv` python, and so
you'll have access to the same python environment as the InvokeAI app.
|
||||
|
||||
This is _super_ handy.
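To confirm that the integrated terminal or a notebook really picked up the `venv` interpreter, a quick standard-library check like the following can help. `in_virtualenv` is a hypothetical helper name.

```python
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv while sys.base_prefix
    # still points at the interpreter the venv was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment active:", in_virtualenv())
```

If the printed interpreter path is not the one set in `python.defaultInterpreterPath`, the workspace setting was probably not applied.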
|
||||
|
||||
#### Debugging configs with `launch.json`
|
||||
|
||||
Debugging configs are managed in a `launch.json` file. Like most VSCode configs,
|
||||
these can be scoped to a workspace or folder.
|
||||
|
||||
Follow the [official guide](https://code.visualstudio.com/docs/python/debugging)
|
||||
to set up your `launch.json` and try it out.
|
||||
|
||||
Now we can create the InvokeAI debugging configs:
|
||||
|
||||
```jsonc
|
||||
{
|
||||
// Use IntelliSense to learn about possible attributes.
|
||||
// Hover to view descriptions of existing attributes.
|
||||
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
|
||||
"version": "0.2.0",
|
||||
"configurations": [
|
||||
{
|
||||
// Run the InvokeAI backend & serve the pre-built UI
|
||||
"name": "InvokeAI Web",
|
||||
"type": "python",
|
||||
"request": "launch",
|
||||
"program": "scripts/invokeai-web.py",
|
||||
"args": [
|
||||
// Your InvokeAI root dir (where `invokeai.yaml` lives)
|
||||
"--root",
|
||||
"/path/to/invokeai_root",
|
||||
// Access the app from anywhere on your local network
|
||||
"--host",
|
||||
"0.0.0.0"
|
||||
],
|
||||
"justMyCode": true
|
||||
},
|
||||
{
|
||||
// Run the nodes-based CLI
|
||||
"name": "InvokeAI CLI",
|
||||
"type": "python",
|
||||
"request": "launch",
|
||||
"program": "scripts/invokeai-cli.py",
|
||||
"justMyCode": true
|
||||
},
|
||||
{
|
||||
// Run tests
|
||||
"name": "InvokeAI Test",
|
||||
"type": "python",
|
||||
"request": "launch",
|
||||
"module": "pytest",
|
||||
"args": ["--capture=no"],
|
||||
"justMyCode": true
|
||||
},
|
||||
{
|
||||
// Run a single test
|
||||
"name": "InvokeAI Single Test",
|
||||
"type": "python",
|
||||
"request": "launch",
|
||||
"module": "pytest",
|
||||
"args": [
|
||||
// Change this to point to the specific test you are working on
|
||||
"tests/nodes/test_invoker.py"
|
||||
],
|
||||
"justMyCode": true
|
||||
},
|
||||
{
|
||||
// This is the default, useful to just run a single file
|
||||
"name": "Python: File",
|
||||
"type": "python",
|
||||
"request": "launch",
|
||||
"program": "${file}",
|
||||
"justMyCode": true
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
You'll see these configs in the debugging configs drop down. Running them will
|
||||
start InvokeAI with attached debugger, in the correct environment, and work just
|
||||
like the normal app.
|
||||
|
||||
Enjoy debugging InvokeAI with ease (not that we have any bugs of course).
|
||||
|
||||
#### Remote dev
|
||||
|
||||
This is very easy to set up and provides the same very smooth experience as
|
||||
local development. Environments and debugging, as set up above, just work,
|
||||
though you'd need to recreate the workspace and debugging configs on the remote.
|
||||
|
||||
Consult the
|
||||
[official guide](https://code.visualstudio.com/docs/remote/remote-overview) to
|
||||
get it set up.
|
||||
|
||||
Suggest using VSCode's included settings sync so that your remote dev host has
|
||||
all the same app settings and extensions automagically.
|
||||
|
||||
##### One remote dev gotcha
|
||||
|
||||
I've found the automatic port forwarding to be very flaky. You can disable it
|
||||
in `Preferences: Open Remote Settings (ssh: hostname)`. Search for
|
||||
`remote.autoForwardPorts` and untick the box.
|
||||
|
||||
To forward ports very reliably, use SSH on the remote dev client (e.g. your
|
||||
macbook). Here's how to forward both backend API port (`9090`) and the frontend
|
||||
live dev server port (`5173`):
|
||||
|
||||
```bash
|
||||
ssh \
|
||||
-L 9090:localhost:9090 \
|
||||
-L 5173:localhost:5173 \
|
||||
user@remote-dev-host
|
||||
```
|
||||
|
||||
The forwarding stops when you close the terminal window, so we suggest doing this
_outside_ the VSCode integrated terminal in case you need to restart VSCode for
an extension update or something similar.
|
||||
|
||||
Now, on your remote dev client, you can open `localhost:9090` and access the UI,
|
||||
now served from the remote dev host, just the same as if it was running on the
|
||||
client.
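If the UI does not load, it can help to check from the client whether the forwarded ports actually accept connections. The following is a small standard-library sketch. `port_is_open` is a hypothetical helper, and the port numbers are the ones mentioned above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 9090: backend API, 5173: frontend live dev server
    for port in (9090, 5173):
        state = "open" if port_is_open("localhost", port) else "closed"
        print(f"localhost:{port} is {state}")
```

A `closed` result for a port usually means the `ssh -L` session is no longer running or the service on the remote host has not been started.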
|
||||
|
@ -1,91 +0,0 @@
|
||||
# Development
|
||||
|
||||
## **What do I need to know to help?**
|
||||
|
||||
If you are looking to help with a code contribution, InvokeAI uses several different technologies under the hood: Python (Pydantic, FastAPI, diffusers) and TypeScript (React, Redux Toolkit, ChakraUI, Mantine, Konva). Familiarity with Stable Diffusion and image generation concepts is helpful, but not essential.
|
||||
|
||||
For more information, please review our area specific documentation:
|
||||
|
||||
* #### [InvokeAI Architecture](../ARCHITECTURE.md)
|
||||
* #### [Frontend Documentation](development_guides/contributingToFrontend.md)
|
||||
* #### [Node Documentation](../INVOCATIONS.md)
|
||||
* #### [Local Development](../LOCAL_DEVELOPMENT.md)
|
||||
|
||||
If you don't feel ready to make a code contribution yet, no problem! You can also help out in other ways, such as [documentation](documentation.md) or [translation](translation.md).
|
||||
|
||||
There are two paths to making a development contribution:
|
||||
|
||||
1. Choosing an open issue to address. Open issues can be found in the [Issues](https://github.com/invoke-ai/InvokeAI/issues?q=is%3Aissue+is%3Aopen) section of the InvokeAI repository. These are tagged by the issue type (bug, enhancement, etc.) along with the “good first issues” tag denoting if they are suitable for first time contributors.
|
||||
1. Additional items can be found on our [roadmap](https://github.com/orgs/invoke-ai/projects/7). The roadmap is organized in terms of priority, and contains features of varying size and complexity. If there is an inflight item you’d like to help with, reach out to the contributor assigned to the item to see how you can help.
|
||||
2. Opening a new issue or feature to add. **Please make sure you have searched through existing issues before creating new ones.**
|
||||
|
||||
*Regardless of what you choose, please post in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord before you start development in order to confirm that the issue or feature is aligned with the current direction of the project. We value our contributors' time and effort and want to ensure that no one’s time is being misspent.*
|
||||
|
||||
## Best Practices:
|
||||
* Keep your pull requests small. Smaller pull requests are more likely to be accepted and merged
|
||||
* Comments! Commenting your code helps reviewers easily understand your contribution
|
||||
* Use Python and TypeScript’s typing systems, and consider using an editor with [LSP](https://microsoft.github.io/language-server-protocol/) support to streamline development
|
||||
* Make all communications public. This ensures knowledge is shared with the whole community
|
||||
|
||||
## **How do I make a contribution?**
|
||||
|
||||
Never made an open source contribution before? Wondering how contributions work in our project? Here's a quick rundown!
|
||||
|
||||
Before starting these steps, ensure you have your local environment [configured for development](../LOCAL_DEVELOPMENT.md).
|
||||
|
||||
1. Find a [good first issue](https://github.com/invoke-ai/InvokeAI/contribute) that you are interested in addressing or a feature that you would like to add. Then, reach out to our team in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord to ensure you are set up for success.
|
||||
2. Fork the [InvokeAI](https://github.com/invoke-ai/InvokeAI) repository to your GitHub profile. This means that you will have a copy of the repository under **your-GitHub-username/InvokeAI**.
|
||||
3. Clone the repository to your local machine using:
|
||||
|
||||
```bash
|
||||
git clone https://github.com/your-GitHub-username/InvokeAI.git
|
||||
```
|
||||
|
||||
If you're unfamiliar with using Git through the command line, [GitHub Desktop](https://desktop.github.com) is an easy-to-use alternative with a UI. You can do all the same steps listed here, but through the interface.
|
||||
|
||||
4. Create a new branch for your fix using:
|
||||
|
||||
```bash
|
||||
git checkout -b branch-name-here
|
||||
```
|
||||
|
||||
5. Make the appropriate changes for the issue you are trying to address or the feature that you want to add.
|
||||
6. Add the file contents of the changed files to the "snapshot" git uses to manage the state of the project, also known as the index:
|
||||
|
||||
```bash
|
||||
git add insert-paths-of-changed-files-here
|
||||
```
|
||||
|
||||
7. Store the contents of the index with a descriptive message.
|
||||
|
||||
```bash
|
||||
git commit -m "Insert a short message of the changes made here"
|
||||
```
|
||||
|
||||
8. Push the changes to the remote repository using:
|
||||
|
||||
```bash
|
||||
git push origin branch-name-here
|
||||
```
|
||||
|
||||
9. Submit a pull request to the **main** branch of the InvokeAI repository.
|
||||
10. Title the pull request with a short description of the changes made and the issue or bug number associated with your change. For example, you can title a pull request like so: "Added more log outputting to resolve #1234".
|
||||
11. In the description of the pull request, explain the changes that you made, any issues you think exist with the pull request you made, and any questions you have for the maintainer. It's OK if your pull request is not perfect (no pull request is); the reviewer will be able to help you fix any problems and improve it!
|
||||
12. Wait for the pull request to be reviewed by other collaborators.
|
||||
13. Make changes to the pull request if the reviewer(s) recommend them.
|
||||
14. Celebrate your success after your pull request is merged!
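
The git commands from steps 3 to 8 can be collected in one place as a sketch. The function name `contribution_commands` and the sample arguments are hypothetical and only illustrate the order of operations; nothing here is an official InvokeAI tool, and the commands are built as argument lists rather than executed.

```python
import shlex
from typing import List


def contribution_commands(
    username: str, branch: str, paths: List[str], message: str
) -> List[List[str]]:
    """Build the git commands from steps 3-8 as argument lists (nothing is run)."""
    repo_url = f"https://github.com/{username}/InvokeAI.git"
    return [
        ["git", "clone", repo_url],            # step 3: clone your fork
        ["git", "checkout", "-b", branch],     # step 4: create a branch
        ["git", "add", *paths],                # step 6: stage the changed files
        ["git", "commit", "-m", message],      # step 7: commit with a message
        ["git", "push", "origin", branch],     # step 8: push the branch
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your own username, branch and files.
    commands = contribution_commands(
        "your-GitHub-username", "branch-name-here", ["path/to/changed_file.py"], "Short message"
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Note that every command after `git clone` has to be run from inside the cloned `InvokeAI` directory.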
|
||||
|
||||
If you’d like to learn more about contributing to Open Source projects, here is a [Getting Started Guide](https://opensource.com/article/19/7/create-pull-request-github).
|
||||
|
||||
## **Where can I go for help?**
|
||||
|
||||
If you need help, you can ask questions in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord.
|
||||
|
||||
For frontend related work, **@psychedelicious** is the best person to reach out to.
|
||||
|
||||
For backend related work, please reach out to **@blessedcoolant**, **@lstein**, **@StAlKeR7779** or **@psychedelicious**.
|
||||
|
||||
## **What does the Code of Conduct mean for me?**
|
||||
|
||||
Our [Code of Conduct](CODE_OF_CONDUCT.md) means that you are responsible for treating everyone on the project with respect and courtesy regardless of their identity. If you are the victim of any inappropriate behavior or comments as described in our Code of Conduct, we are here for you and will do our best to ensure that the abuser is reprimanded appropriately, per our code.
|
||||
|
@ -1,75 +0,0 @@
|
||||
# Contributing to the Frontend
|
||||
|
||||
# InvokeAI Web UI
|
||||
|
||||
- [InvokeAI Web UI](https://github.com/invoke-ai/InvokeAI/tree/main/invokeai/frontend/web/docs#invokeai-web-ui)
|
||||
- [Stack](https://github.com/invoke-ai/InvokeAI/tree/main/invokeai/frontend/web/docs#stack)
|
||||
- [Contributing](https://github.com/invoke-ai/InvokeAI/tree/main/invokeai/frontend/web/docs#contributing)
|
||||
- [Dev Environment](https://github.com/invoke-ai/InvokeAI/tree/main/invokeai/frontend/web/docs#dev-environment)
|
||||
- [Production builds](https://github.com/invoke-ai/InvokeAI/tree/main/invokeai/frontend/web/docs#production-builds)
|
||||
|
||||
The UI is a fairly straightforward TypeScript React app, with the Unified Canvas being more complex.
|
||||
|
||||
Code is located in `invokeai/frontend/web/` for review.
|
||||
|
||||
## Stack
|
||||
|
||||
State management is Redux via [Redux Toolkit](https://github.com/reduxjs/redux-toolkit). We lean heavily on RTK:
|
||||
|
||||
- `createAsyncThunk` for HTTP requests
|
||||
- `createEntityAdapter` for fetching images and models
|
||||
- `createListenerMiddleware` for workflows
|
||||
|
||||
The API client and associated types are generated from the OpenAPI schema. See API_CLIENT.md.
|
||||
|
||||
Communication with the server is a mix of HTTP and [socket.io](https://github.com/socketio/socket.io-client) (with a simple socket.io redux middleware to help).
|
||||
|
||||
[Chakra-UI](https://github.com/chakra-ui/chakra-ui) & [Mantine](https://github.com/mantinedev/mantine) for components and styling.
|
||||
|
||||
[Konva](https://github.com/konvajs/react-konva) for the canvas, but we are pushing the limits of what is feasible with it (and HTML canvas in general). We plan to rebuild it with [PixiJS](https://github.com/pixijs/pixijs) to take advantage of WebGL's improved raster handling.
|
||||
|
||||
[Vite](https://vitejs.dev/) for bundling.
|
||||
|
||||
Localisation is via [i18next](https://github.com/i18next/react-i18next), but translation happens on our [Weblate](https://hosted.weblate.org/engage/invokeai/) project. Only the English source strings should be changed on this repo.
|
||||
|
||||
## Contributing
|
||||
|
||||
Thanks for your interest in contributing to the InvokeAI Web UI!
|
||||
|
||||
We encourage you to ping @psychedelicious and @blessedcoolant on [Discord](https://discord.gg/ZmtBAhwWhy) if you want to contribute, just to touch base and ensure your work doesn't conflict with anything else going on. The project is very active.
|
||||
|
||||
### Dev Environment
|
||||
|
||||
**Setup**
|
||||
|
||||
1. Install [node](https://nodejs.org/en/download/). You can confirm node is installed with:
|
||||
```bash
|
||||
node --version
|
||||
```
|
||||
2. Install [yarn classic](https://classic.yarnpkg.com/lang/en/) and confirm it is installed by running this:
|
||||
```bash
|
||||
npm install --global yarn
|
||||
yarn --version
|
||||
```
|
||||
|
||||
From `invokeai/frontend/web/` run `yarn install` to get everything set up.
|
||||
|
||||
Start everything in dev mode:
|
||||
1. Ensure your virtual environment is running
|
||||
2. Start the dev server: `yarn dev`
|
||||
3. Start the InvokeAI Nodes backend: `python scripts/invokeai-web.py # run from the repo root`
|
||||
4. Point your browser to the dev server address e.g. [http://localhost:5173/](http://localhost:5173/)
|
||||
|
||||
### VSCode Remote Dev
|
||||
|
||||
We've noticed an intermittent issue with the VSCode Remote Dev port forwarding. If you use this feature of VSCode, you may occasionally click the Invoke button and get no response until the request times out. We suggest disabling the IDE's port forwarding feature and doing it manually via SSH:
|
||||
|
||||
`ssh -L 9090:localhost:9090 -L 5173:localhost:5173 user@host`
|
||||
|
||||
### Production builds
|
||||
|
||||
For a number of technical and logistical reasons, we need to commit UI build artefacts to the repo.
|
||||
|
||||
If you submit a PR, there is a good chance we will ask you to include a separate commit with a build of the app.
|
||||
|
||||
To build for production, run `yarn build`.
|
@ -1,13 +0,0 @@
|
||||
# Documentation
|
||||
|
||||
Documentation is an important part of any open source project. It provides a clear and concise way to communicate how the software works, how to use it, and how to troubleshoot issues. Without proper documentation, it can be difficult for users to understand the purpose and functionality of the project.
|
||||
|
||||
## Contributing
|
||||
|
||||
All documentation is maintained in the InvokeAI GitHub repository. If you come across documentation that is out of date or incorrect, please submit a pull request with the necessary changes.
|
||||
|
||||
When updating or creating documentation, please keep in mind InvokeAI is a tool for everyone, not just those who have familiarity with generative art.
|
||||
|
||||
## Help & Questions
|
||||
|
||||
Please ping @imic1 or @hipsterusername in the [Discord](https://discord.com/channels/1020123559063990373/1049495067846524939) if you have any questions.
|
@ -1,19 +0,0 @@
|
||||
# Translation
|
||||
|
||||
InvokeAI uses [Weblate](https://weblate.org/) for translation. Weblate is a FOSS project providing a scalable translation service. Weblate automates the tedious parts of managing translation of a growing project, and the service is generously provided at no cost to FOSS projects like InvokeAI.
|
||||
|
||||
## Contributing
|
||||
|
||||
If you'd like to contribute by adding or updating a translation, please visit our [Weblate project](https://hosted.weblate.org/engage/invokeai/). You'll need to sign in with your GitHub account (a number of other accounts are supported, including Google).
|
||||
|
||||
Once signed in, select a language and then the Web UI component. From here you can Browse and Translate strings from English to your chosen language. Zen mode offers a simpler translation experience.
|
||||
|
||||
Your changes will be attributed to you in the automated PR process; you don't need to do anything else.
|
||||
|
||||
## Help & Questions
|
||||
|
||||
Please check Weblate's [documentation](https://docs.weblate.org/en/latest/index.html) or ping @Harvestor on [Discord](https://discord.com/channels/1020123559063990373/1049495067846524939) if you have any questions.
|
||||
|
||||
## Thanks
|
||||
|
||||
Thanks to the InvokeAI community for their efforts to translate the project!
|
@ -1,11 +0,0 @@
|
||||
# Tutorials
|
||||
|
||||
Tutorials help new & existing users expand their ability to use InvokeAI to the full extent of our features and services.
|
||||
|
||||
Currently, we have a set of tutorials available on our [YouTube channel](https://www.youtube.com/@invokeai), but as InvokeAI continues to evolve with new updates, we want to ensure that we are giving our users the resources they need to succeed.
|
||||
|
||||
Tutorials can be in the form of videos or article walkthroughs on a subject of your choice. We recommend focusing tutorials on the key image generation methods, or on a specific component within one of the image generation methods.
|
||||
|
||||
## Contributing
|
||||
|
||||
Please reach out to @imic or @hipsterusername on [Discord](https://discord.gg/ZmtBAhwWhy) to help create tutorials for InvokeAI.
|
@ -1,589 +0,0 @@
|
||||
---
|
||||
title: Command-Line Interface
|
||||
---
|
||||
|
||||
# :material-bash: CLI
|
||||
|
||||
## **Interactive Command Line Interface**
|
||||
|
||||
The InvokeAI command line interface (CLI) provides scriptable access
|
||||
to InvokeAI's features. Some advanced features are only available
|
||||
through the CLI, though they eventually find their way into the WebUI.
|
||||
|
||||
The CLI is accessible from the `invoke.sh`/`invoke.bat` launcher by
|
||||
selecting option (1). Alternatively, it can be launched directly from
|
||||
the command line by activating the InvokeAI environment and giving the
|
||||
command:
|
||||
|
||||
```bash
|
||||
invokeai
|
||||
```
|
||||
|
||||
After some startup messages, you will be presented with the `invoke> `
|
||||
prompt. Here you can type prompts to generate images and issue other
|
||||
commands to load and manipulate generative models. The CLI has a large
|
||||
number of command-line options that control its behavior. To get a
|
||||
concise summary of the options, call `invokeai` with the `--help` argument:
|
||||
|
||||
```bash
|
||||
invokeai --help
|
||||
```
|
||||
|
||||
The script uses the readline library to allow for in-line editing, command
|
||||
history (++up++ and ++down++), autocompletion, and more. To help keep track of
|
||||
which prompts generated which images, the script writes a log file of image
|
||||
names and prompts to the selected output directory.
|
||||
|
||||
Here is a typical session:
|
||||
|
||||
```bash
|
||||
PS1:C:\Users\fred> invokeai
|
||||
* Initializing, be patient...
|
||||
* Initializing, be patient...
|
||||
>> Initialization file /home/lstein/invokeai/invokeai.init found. Loading...
|
||||
>> Internet connectivity is True
|
||||
>> InvokeAI, version 2.3.0-rc5
|
||||
>> InvokeAI runtime directory is "/home/lstein/invokeai"
|
||||
>> GFPGAN Initialized
|
||||
>> CodeFormer Initialized
|
||||
>> ESRGAN Initialized
|
||||
>> Using device_type cuda
|
||||
>> xformers memory-efficient attention is available and enabled
|
||||
(...more initialization messages...)
|
||||
* Initialization done! Awaiting your command (-h for help, 'q' to quit)
|
||||
invoke> ashley judd riding a camel -n2 -s150
|
||||
Outputs:
|
||||
outputs/img-samples/00009.png: "ashley judd riding a camel" -n2 -s150 -S 416354203
|
||||
outputs/img-samples/00010.png: "ashley judd riding a camel" -n2 -s150 -S 1362479620
|
||||
|
||||
invoke> "there's a fly in my soup" -n6 -g
|
||||
outputs/img-samples/00011.png: "there's a fly in my soup" -n6 -g -S 2685670268
|
||||
seeds for individual rows: [2685670268, 1216708065, 2335773498, 822223658, 714542046, 3395302430]
|
||||
invoke> q
|
||||
```
|
||||
|
||||

|
||||
|
||||
## Arguments
|
||||
|
||||
The script recognizes a series of command-line switches that will
|
||||
change important global defaults, such as the directory for image
|
||||
outputs and the location of the model weight files.
|
||||
|
||||
### List of arguments recognized at the command line
|
||||
|
||||
These command-line arguments can be passed to `invoke.py` when you first run it
|
||||
from the Windows, Mac or Linux command line. Some set defaults that can be
|
||||
overridden on a per-prompt basis (see
|
||||
[List of prompt arguments](#list-of-prompt-arguments)). Others apply for the whole session.
|
||||
|
||||
| Argument <img width="240" align="right"/> | Shortcut <img width="100" align="right"/> | Default <img width="320" align="right"/> | Description |
|
||||
| ----------------------------------------- | ----------------------------------------- | ---------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
|
||||
| `--help` | `-h` | | Print a concise help message. |
|
||||
| `--outdir <path>` | `-o<path>` | `outputs/img_samples` | Location for generated images. |
|
||||
| `--prompt_as_dir` | `-p` | `False` | Name output directories using the prompt text. |
|
||||
| `--from_file <path>` | | `None` | Read list of prompts from a file. Use `-` to read from standard input |
|
||||
| `--model <modelname>` | | `stable-diffusion-1.5` | Loads the initial model specified in configs/models.yaml. |
|
||||
| `--ckpt_convert`                          |                                           | `False`                                        | If provided, both .ckpt and .safetensors files will be auto-converted into diffusers format in memory |
|
||||
| `--autoconvert <path>`                    |                                           | `None`                                         | On startup, scan the indicated directory for new .ckpt/.safetensors files and automatically convert and import them |
|
||||
| `--precision` | | `fp16` | Provide `fp32` for full precision mode, `fp16` for half-precision. `fp32` needed for Macintoshes and some NVidia cards. |
|
||||
| `--png_compression <0-9>` | `-z<0-9>` | `6` | Select level of compression for output files, from 0 (no compression) to 9 (max compression) |
|
||||
| `--safety-checker` | | `False` | Activate safety checker for NSFW and other potentially disturbing imagery |
|
||||
| `--patchmatch`, `--no-patchmatch` | | `--patchmatch` | Load/Don't load the PatchMatch inpainting extension |
|
||||
| `--xformers`, `--no-xformers` | | `--xformers` | Load/Don't load the Xformers memory-efficient attention module (CUDA only) |
|
||||
| `--web` | | `False` | Start in web server mode |
|
||||
| `--host <ip addr>` | | `localhost` | Which network interface web server should listen on. Set to 0.0.0.0 to listen on any. |
|
||||
| `--port <port>` | | `9090` | Which port web server should listen for requests on. |
|
||||
| `--config <path>` | | `configs/models.yaml` | Configuration file for models and their weights. |
|
||||
| `--iterations <int>` | `-n<int>` | `1` | How many images to generate per prompt. |
|
||||
| `--width <int>` | `-W<int>` | `512` | Width of generated image |
|
||||
| `--height <int>`                          | `-H<int>`                                 | `512`                                          | Height of generated image                                                                            |
| `--steps <int>`                           | `-s<int>`                                 | `50`                                           | How many steps of refinement to apply                                                                |
|
||||
| `--strength <float>`                      | `-f<float>`                               | `0.75`                                         | For img2img: how hard to try to match the prompt to the initial image. Ranges from 0.0-0.99, with higher values replacing the initial image completely. |
|
||||
| `--fit` | `-F` | `False` | For img2img: scale the init image to fit into the specified -H and -W dimensions |
|
||||
| `--grid` | `-g` | `False` | Save all image series as a grid rather than individually. |
|
||||
| `--sampler <sampler>` | `-A<sampler>` | `k_lms` | Sampler to use. Use `-h` to get list of available samplers. |
|
||||
| `--seamless` | | `False` | Create interesting effects by tiling elements of the image. |
|
||||
| `--embedding_path <path>` | | `None` | Path to pre-trained embedding manager checkpoints, for custom models |
|
||||
| `--gfpgan_model_path` | | `experiments/pretrained_models/GFPGANv1.4.pth` | Path to GFPGAN model file. |
|
||||
| `--free_gpu_mem` | | `False` | Free GPU memory after sampling, to allow image decoding and saving in low VRAM conditions |
|
||||
| `--precision` | | `auto` | Set model precision, default is selected by device. Options: auto, float32, float16, autocast |
|
||||
|
||||
!!! warning "These arguments are deprecated but still work"
|
||||
|
||||
<div align="center" markdown>
|
||||
|
||||
| Argument | Shortcut | Default | Description |
|
||||
|--------------------|------------|---------------------|--------------|
|
||||
| `--full_precision` | | `False` | Same as `--precision=fp32`|
|
||||
| `--weights <path>` | | `None` | Path to weights file; use `--model stable-diffusion-1.4` instead |
|
||||
| `--laion400m` | `-l` | `False` | Use older LAION400m weights; use `--model=laion400m` instead |
|
||||
|
||||
</div>
|
||||
|
||||
!!! tip
|
||||
|
||||
On Windows systems, you may run into
|
||||
problems when passing the invoke script standard backslashed path
|
||||
names because the Python interpreter treats "\" as an escape.
|
||||
You can either double your slashes (ick): `C:\\path\\to\\my\\file`, or
|
||||
use Linux/Mac style forward slashes (better): `C:/path/to/my/file`.
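
If you build paths in a script before passing them to the CLI, the standard library can do the slash conversion for you. This is a minimal sketch; the helper name `to_forward_slashes` is hypothetical and not part of InvokeAI.

```python
from pathlib import PureWindowsPath


def to_forward_slashes(windows_path: str) -> str:
    """Convert a backslashed Windows path to its forward-slash form."""
    return PureWindowsPath(windows_path).as_posix()


if __name__ == "__main__":
    # A raw string (r"...") keeps Python from treating "\" as an escape.
    print(to_forward_slashes(r"C:\path\to\my\file"))  # prints C:/path/to/my/file
```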
|
||||
|
||||
## The .invokeai initialization file
|
||||
|
||||
To start up invoke.py with your preferred settings, place your desired
|
||||
startup options in a file in your home directory named `.invokeai`. The
|
||||
file should contain the startup options as you would type them on the
|
||||
command line (`--steps=10 --grid`), one argument per line, or a
|
||||
mixture of both using any of the accepted command switch formats:
|
||||
|
||||
!!! example "my unmodified initialization file"
|
||||
|
||||
```bash title="~/.invokeai" linenums="1"
|
||||
# InvokeAI initialization file
|
||||
# This is the InvokeAI initialization file, which contains command-line default values.
|
||||
# Feel free to edit. If anything goes wrong, you can re-initialize this file by deleting
|
||||
# or renaming it and then running invokeai-configure again.
|
||||
|
||||
# The --root option below points to the folder in which InvokeAI stores its models, configs and outputs.
|
||||
--root="/Users/mauwii/invokeai"
|
||||
|
||||
# the --outdir option controls the default location of image files.
|
||||
--outdir="/Users/mauwii/invokeai/outputs"
|
||||
|
||||
# You may place other frequently-used startup commands here, one or more per line.
|
||||
# Examples:
|
||||
# --web --host=0.0.0.0
|
||||
# --steps=20
|
||||
# -Ak_euler_a -C10.0
|
||||
```
|
||||
|
||||
!!! note
|
||||
|
||||
The initialization file only accepts the command line arguments.
|
||||
There are additional arguments that you can provide on the `invoke>` command
|
||||
line (such as `-n` or `--iterations`) that cannot be entered into this file.
|
||||
Also be alert for empty blank lines at the end of the file, which will cause
|
||||
an arguments error at startup time.
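
To see how a file in this format breaks down into individual arguments, here is a minimal sketch. It is not InvokeAI's actual parser: the function name `read_init_args` is hypothetical, and unlike the behaviour described in the note above, this sketch silently ignores blank lines.

```python
import shlex
from typing import List


def read_init_args(text: str) -> List[str]:
    """Split init-file text into arguments, ignoring comments and blank lines."""
    args: List[str] = []
    for line in text.splitlines():
        # comments=True makes shlex drop everything after an unquoted "#".
        args.extend(shlex.split(line, comments=True))
    return args


if __name__ == "__main__":
    sample = (
        "# InvokeAI initialization file\n"
        '--root="/Users/mauwii/invokeai"\n'
        "\n"
        "--web --host=0.0.0.0\n"
    )
    print(read_init_args(sample))
```

Because `shlex` follows shell quoting rules, the quotes around the `--root` value are removed and several options on one line become separate arguments.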
|
||||
|
||||
## List of prompt arguments
|
||||
|
||||
After the invoke.py script initializes, it will present you with an `invoke>`
|
||||
prompt. Here you can enter information to generate images from text
|
||||
([txt2img](#txt2img)), to embellish an existing image or sketch
|
||||
([img2img](#img2img)), or to selectively alter chosen regions of the image
|
||||
([inpainting](#inpainting)).
|
||||
|
||||
### txt2img
|
||||
|
||||
!!! example ""
|
||||
|
||||
```bash
|
||||
invoke> waterfall and rainbow -W640 -H480
|
||||
```
|
||||
|
||||
This will create the requested image with the dimensions 640 (width)
|
||||
and 480 (height).
|
||||
|
||||
Here are the `invoke>` commands that apply to txt2img:
|
||||
|
||||
| Argument <img width="680" align="right"/> | Shortcut <img width="420" align="right"/> | Default <img width="480" align="right"/> | Description |
|
||||
| ----------------------------------------- | ----------------------------------------- | ---------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
||||
| "my prompt" | | | Text prompt to use. The quotation marks are optional. |
|
||||
| `--width <int>` | `-W<int>` | `512` | Width of generated image |
|
||||
| `--height <int>` | `-H<int>` | `512` | Height of generated image |
|
||||
| `--iterations <int>` | `-n<int>` | `1` | How many images to generate from this prompt |
|
||||
| `--steps <int>` | `-s<int>` | `50` | How many steps of refinement to apply |
|
||||
| `--cfg_scale <float>` | `-C<float>` | `7.5` | How hard to try to match the prompt to the generated image; any number greater than 1.0 works, but the useful range is roughly 5.0 to 20.0 |
|
||||
| `--seed <int>` | `-S<int>` | `None` | Set the random seed for the next series of images. This can be used to recreate an image generated previously. |
|
||||
| `--sampler <sampler>` | `-A<sampler>` | `k_lms` | Sampler to use. Use -h to get list of available samplers. |
|
||||
| `--karras_max <int>`                      |                                           | `29`                                     | When using k\_\* samplers, set the maximum number of steps before shifting from using the Karras noise schedule (good for low step counts) to the LatentDiffusion noise schedule (good for high step counts). This value is sticky. [29]          |
|
||||
| `--hires_fix` | | | Larger images often have duplication artefacts. This option suppresses duplicates by generating the image at low res, and then using img2img to increase the resolution |
|
||||
| `--png_compression <0-9>` | `-z<0-9>` | `6` | Select level of compression for output files, from 0 (no compression) to 9 (max compression) |
|
||||
| `--grid` | `-g` | `False` | Turn on grid mode to return a single image combining all the images generated by this prompt |
|
||||
| `--individual` | `-i` | `True` | Turn off grid mode (deprecated; leave off --grid instead) |
|
||||
| `--outdir <path>` | `-o<path>` | `outputs/img_samples` | Temporarily change the location of these images |
|
||||
| `--seamless` | | `False` | Activate seamless tiling for interesting effects |
|
||||
| `--seamless_axes` | | `x,y` | Specify which axes to use circular convolution on. |
|
||||
| `--log_tokenization` | `-t` | `False` | Display a color-coded list of the parsed tokens derived from the prompt |
|
||||
| `--skip_normalization` | `-x` | `False` | Weighted subprompts will not be normalized. See [Weighted Prompts](../features/OTHER.md#weighted-prompts) |
|
||||
| `--upscale <int> <float>` | `-U <int> <float>` | `-U 1 0.75` | Upscale image by magnification factor (2, 4), and set strength of upscaling (0.0-1.0). If strength not set, will default to 0.75. |
|
||||
| `--facetool_strength <float>` | `-G <float> ` | `-G0` | Fix faces (defaults to using the GFPGAN algorithm); argument indicates how hard the algorithm should try (0.0-1.0) |
|
||||
| `--facetool <name>` | `-ft <name>` | `-ft gfpgan` | Select face restoration algorithm to use: gfpgan, codeformer |
|
||||
| `--codeformer_fidelity` | `-cf <float>` | `0.75` | Used along with CodeFormer. Takes values between 0 and 1. 0 produces high quality but low accuracy. 1 produces high accuracy but low quality |
|
||||
| `--save_original` | `-save_orig` | `False` | When upscaling or fixing faces, this will cause the original image to be saved rather than replaced. |
|
||||
| `--variation <float>`                     | `-v<float>`                               | `0.0`                                    | Add a bit of noise (0.0=none, 1.0=high) to the image in order to generate a series of variations. Usually used in combination with `-S<seed>` and `-n<int>` to generate a series of riffs on a starting image. See [Variations](../features/VARIATIONS.md). |
|
||||
| `--with_variations <pattern>`             |                                           | `None`                                   | Combine two or more variations. See [Variations](../features/VARIATIONS.md) for how to use this.                                                                                                                                                 |
|
||||
| `--save_intermediates <n>` | | `None` | Save the image from every nth step into an "intermediates" folder inside the output directory |
|
||||
| `--h_symmetry_time_pct <float>` | | `None` | Create symmetry along the X axis at the desired percent complete of the generation process. (Must be between 0.0 and 1.0; set to a very small number like 0.0001 for just after the first step of generation.) |
|
||||
| `--v_symmetry_time_pct <float>` | | `None` | Create symmetry along the Y axis at the desired percent complete of the generation process. (Must be between 0.0 and 1.0; set to a very small number like 0.0001 for just after the first step of generation.) |
|
||||
|
||||
!!! note
|
||||
|
||||
The width and height of the image must be multiples of 64. You can
|
||||
provide different values, but they will be rounded down to the nearest multiple
|
||||
of 64.
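
The rounding rule in the note above can be written out as a short sketch. The helper name `round_down_to_multiple` is hypothetical; it only illustrates the arithmetic described in the note and is not taken from InvokeAI's code.

```python
def round_down_to_multiple(value: int, multiple: int = 64) -> int:
    """Round value down to the nearest multiple of `multiple`."""
    return value - (value % multiple)


if __name__ == "__main__":
    for requested in (640, 500, 480):
        print(requested, "->", round_down_to_multiple(requested))
```

Under the rule as stated, 640 is kept as it is, while 500 and 480 would both become 448.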
|
||||
|
||||
!!! example "This is an example of img2img"
|
||||
|
||||
```bash
|
||||
invoke> waterfall and rainbow -I./vacation-photo.png -W640 -H480 --fit
|
||||
```
|
||||
|
||||
This will modify the indicated vacation photograph by making it more like the
|
||||
prompt. Results will vary greatly depending on what is in the image. We also ask
|
||||
to --fit the image into a box no bigger than 640x480. Otherwise the image size
|
||||
will be identical to the provided photo and you may run out of memory if it is
|
||||
large.
|
||||
|
||||
In addition to the command-line options recognized by txt2img, img2img accepts
|
||||
additional options:
|
||||
|
||||
| Argument <img width="160" align="right"/> | Shortcut | Default | Description |
|
||||
| ----------------------------------------- | ----------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
|
||||
| `--init_img <path>` | `-I<path>` | `None` | Path to the initialization image |
|
||||
| `--fit` | `-F` | `False` | Scale the image to fit into the specified -H and -W dimensions |
|
||||
| `--strength <float>`                      | `-f<float>` | `0.75`  | How hard to try to match the prompt to the initial image. Ranges from 0.0-0.99, with higher values replacing the initial image completely. |
|
||||
|
||||
### inpainting
|
||||
|
||||
!!! example ""
|
||||
|
||||
```bash
|
||||
invoke> waterfall and rainbow -I./vacation-photo.png -M./vacation-mask.png -W640 -H480 --fit
|
||||
```
|
||||
|
||||
This will do the same thing as img2img, but image alterations will
|
||||
only occur within transparent areas defined by the mask file specified
|
||||
by `-M`. You may also supply just a single initial image with the areas
|
||||
to overpaint made transparent, but you must be careful not to destroy
|
||||
the pixels underneath when you create the transparent areas. See
|
||||
[Inpainting](INPAINTING.md) for details.
|
||||
|
||||
Inpainting accepts all the arguments used for txt2img and img2img, as well as
|
||||
the `--init_mask` (`-M`), `--invert_mask` and `--text_mask` (`-tm`) arguments:
|
||||
|
||||
| Argument <img width="100" align="right"/> | Shortcut | Default | Description |
|
||||
| ----------------------------------------- | ------------------------ | ------- | ------------------------------------------------------------------------------------------------ |
|
||||
| `--init_mask <path>` | `-M<path>` | `None` | Path to an image the same size as the initial_image, with areas for inpainting made transparent. |
|
||||
| `--invert_mask ` | | False | If true, invert the mask so that transparent areas are opaque and vice versa. |
|
||||
| `--text_mask <prompt> [<float>]` | `-tm <prompt> [<float>]` | <none> | Create a mask from a text prompt describing part of the image |
|
||||
|
||||
The mask may either be an image with transparent areas, in which case the
|
||||
inpainting will occur in the transparent areas only, or a black and white image,
|
||||
in which case all black areas will be painted into.
|
||||
|
||||
`--text_mask` (short form `-tm`) is a way to generate a mask using a text
|
||||
description of the part of the image to replace. For example, if you have an
|
||||
image of a breakfast plate with a bagel, toast and scrambled eggs, you can
|
||||
selectively mask the bagel and replace it with a piece of cake this way:
|
||||
|
||||
```bash
|
||||
invoke> a piece of cake -I /path/to/breakfast.png -tm bagel
|
||||
```
|
||||
|
||||
The algorithm uses <a
|
||||
href="https://github.com/timojl/clipseg">clipseg</a> to classify different
|
||||
regions of the image. The classifier puts out a confidence score for each region
|
||||
it identifies. Generally regions that score above 0.5 are reliable, but if you
|
||||
are getting too much or too little masking you can adjust the threshold down (to
|
||||
get more mask), or up (to get less). In this example, by passing `-tm` a higher
|
||||
value, we are insisting on a more stringent classification.
|
||||
|
||||
```bash
|
||||
invoke> a piece of cake -I /path/to/breakfast.png -tm bagel 0.6
|
||||
```
|
||||
|
||||
### Custom Styles and Subjects
|
||||
|
||||
You can load and use hundreds of community-contributed Textual
|
||||
Inversion models just by typing the appropriate trigger phrase. Please
|
||||
see [Concepts Library](../features/CONCEPTS.md) for more details.
|
||||
|
||||
## Other Commands
|
||||
|
||||
The CLI offers a number of commands that begin with "!".
|
||||
|
||||
### Postprocessing images
|
||||
|
||||
To postprocess a file using face restoration or upscaling, use the `!fix`
|
||||
command.
|
||||
|
||||
#### `!fix`
|
||||
|
||||
This command runs a post-processor on a previously-generated image. It takes a
|
||||
PNG filename or path and applies your choice of the `-U`, `-G`, or `--embiggen`
|
||||
switches in order to fix faces or upscale. If you provide a filename, the script
|
||||
will look for it in the current output directory. Otherwise you can provide a
|
||||
full or partial path to the desired file.
|
||||
|
||||
Some examples:
|
||||
|
||||
!!! example "Upscale to 4X its original size and fix faces using codeformer"
|
||||
|
||||
```bash
|
||||
invoke> !fix 0000045.4829112.png -G1 -U4 -ft codeformer
|
||||
```
|
||||
|
||||
!!! example "Use the GFPGAN algorithm to fix faces, then upscale to 3X using --embiggen"
|
||||
|
||||
```bash
|
||||
invoke> !fix 0000045.4829112.png -G0.8 -ft gfpgan
|
||||
>> fixing outputs/img-samples/0000045.4829112.png
|
||||
>> retrieved seed 4829112 and prompt "boy enjoying a banana split"
|
||||
>> GFPGAN - Restoring Faces for image seed:4829112
|
||||
Outputs:
|
||||
[1] outputs/img-samples/000017.4829112.gfpgan-00.png: !fix "outputs/img-samples/0000045.4829112.png" -s 50 -S -W 512 -H 512 -C 7.5 -A k_lms -G 0.8
|
||||
```
|
||||
|
||||
#### `!mask`
|
||||
|
||||
This command takes an image, a text prompt, and uses the `clipseg` algorithm to
|
||||
automatically generate a mask of the area that matches the text prompt. It is
|
||||
useful for debugging the text masking process prior to inpainting with the
|
||||
`--text_mask` argument. See [Inpainting](INPAINTING.md) for details.
|
||||
|
||||
### Model selection and importation
|
||||
|
||||
The CLI allows you to add new models on the fly, as well as to switch
|
||||
among them rapidly without leaving the script. There are several
|
||||
different model formats, each described in the [Model Installation
|
||||
Guide](../installation/050_INSTALLING_MODELS.md).
|
||||
|
||||
#### `!models`
|
||||
|
||||
This prints out a list of the models defined in `configs/models.yaml`. The active
|
||||
model is bold-faced.
|
||||
|
||||
Example:
|
||||
|
||||
<pre>
|
||||
inpainting-1.5 not loaded Stable Diffusion inpainting model
|
||||
<b>stable-diffusion-1.5 active Stable Diffusion v1.5</b>
|
||||
waifu-diffusion not loaded Waifu Diffusion v1.4
|
||||
</pre>
|
||||
|
||||
#### `!switch <model>`
|
||||
|
||||
This quickly switches from one model to another without leaving the CLI script.
|
||||
`invoke.py` uses a memory caching system; once a model has been loaded,
|
||||
switching back and forth is quick. The following example shows this in action.
|
||||
Note how the second column of the `!models` table changes to `cached` after a
|
||||
model is first loaded, and that the long initialization step is not needed when
|
||||
loading a cached model.
|
||||
|
||||
#### `!import_model <hugging_face_repo_ID>`
|
||||
|
||||
This imports and installs a `diffusers`-style model that is stored on
|
||||
the [HuggingFace Web Site](https://huggingface.co). You can look up
|
||||
any [Stable Diffusion diffusers
|
||||
model](https://huggingface.co/models?library=diffusers) and install it
|
||||
with a command like the following:
|
||||
|
||||
```bash
|
||||
!import_model prompthero/openjourney
|
||||
```
|
||||
|
||||
#### `!import_model <path/to/diffusers/directory>`
|
||||
|
||||
If you have a copy of a `diffusers`-style model saved to disk, you can
|
||||
import it by passing the path to the model's top-level directory.
|
||||
|
||||
#### `!import_model <url>`
|
||||
|
||||
For a `.ckpt` or `.safetensors` file, if you have a direct download
|
||||
URL for the file, you can provide it to `!import_model` and the file
|
||||
will be downloaded and installed for you.
|
||||
|
||||
#### `!import_model <path/to/model/weights.ckpt>`
|
||||
|
||||
This command imports a new model weights file into InvokeAI, makes it available
|
||||
for image generation within the script, and writes out the configuration for the
|
||||
model into `configs/models.yaml` for use in subsequent sessions.
|
||||
|
||||
Provide `!import_model` with the path to a weights file ending in `.ckpt`. If
|
||||
you type a partial path and press tab, the CLI will autocomplete. Although it
|
||||
will also autocomplete to `.vae` files, these are not currently supported (but
|
||||
will be soon).
|
||||
|
||||
When you hit return, the CLI will prompt you to fill in additional information
|
||||
about the model, including the short name you wish to use for it with the
|
||||
`!switch` command, a brief description of the model, the default image width and
|
||||
height to use with this model, and the model's configuration file. The latter
|
||||
three fields are automatically filled with reasonable defaults. In the example
|
||||
below, the bold-faced text shows what the user typed in with the exception of
|
||||
the width, height and configuration file paths, which were filled in
|
||||
automatically.
|
||||
|
||||
#### `!import_model <path/to/directory_of_models>`
|
||||
|
||||
If you provide the path of a directory that contains one or more
|
||||
`.ckpt` or `.safetensors` files, the CLI will scan the directory and
|
||||
interactively offer to import the models it finds there. Also see the
|
||||
`--autoconvert` command-line option.
|
||||
|
||||
#### `!edit_model <name_of_model>`
|
||||
|
||||
The `!edit_model` command can be used to modify a model that is already defined
|
||||
in `configs/models.yaml`. Call it with the short name of the model you wish to
|
||||
modify, and it will allow you to modify the model's `description`, `weights` and
|
||||
other fields.
|
||||
|
||||
Example:
|
||||
|
||||
<pre>
|
||||
invoke> <b>!edit_model waifu-diffusion</b>
|
||||
>> Editing model waifu-diffusion from configuration file ./configs/models.yaml
|
||||
description: <b>Waifu diffusion v1.4beta</b>
|
||||
weights: models/ldm/stable-diffusion-v1/<b>model-epoch10-float16.ckpt</b>
|
||||
config: configs/stable-diffusion/v1-inference.yaml
|
||||
width: 512
|
||||
height: 512
|
||||
|
||||
>> New configuration:
|
||||
waifu-diffusion:
|
||||
config: configs/stable-diffusion/v1-inference.yaml
|
||||
description: Waifu diffusion v1.4beta
|
||||
weights: models/ldm/stable-diffusion-v1/model-epoch10-float16.ckpt
|
||||
height: 512
|
||||
width: 512
|
||||
|
||||
OK to import [n]? y
|
||||
>> Caching model stable-diffusion-1.4 in system RAM
|
||||
>> Loading waifu-diffusion from models/ldm/stable-diffusion-v1/model-epoch10-float16.ckpt
|
||||
...
|
||||
</pre>
|
||||
|
||||
### History processing
|
||||
|
||||
The CLI provides a series of convenient commands for reviewing previous actions,
|
||||
retrieving them, modifying them, and re-running them.
|
||||
|
||||
#### `!history`
|
||||
|
||||
The invoke script keeps track of all the commands you issue during a session,
|
||||
allowing you to re-run them. On Mac and Linux systems, it also writes the
|
||||
command-line history out to disk, giving you access to the most recent 1000
|
||||
commands issued.
|
||||
|
||||
The `!history` command will return a numbered list of all the commands issued
|
||||
during the session (Windows), or the most recent 1000 commands (Mac|Linux). You
|
||||
can then repeat a command by using the command `!NNN`, where "NNN" is the
|
||||
history line number. For example:
|
||||
|
||||
!!! example ""
|
||||
|
||||
```bash
|
||||
invoke> !history
|
||||
...
|
||||
[14] happy woman sitting under tree wearing broad hat and flowing garment
|
||||
[15] beautiful woman sitting under tree wearing broad hat and flowing garment
|
||||
[18] beautiful woman sitting under tree wearing broad hat and flowing garment -v0.2 -n6
|
||||
[20] watercolor of beautiful woman sitting under tree wearing broad hat and flowing garment -v0.2 -n6 -S2878767194
|
||||
[21] surrealist painting of beautiful woman sitting under tree wearing broad hat and flowing garment -v0.2 -n6 -S2878767194
|
||||
...
|
||||
invoke> !20
|
||||
invoke> watercolor of beautiful woman sitting under tree wearing broad hat and flowing garment -v0.2 -n6 -S2878767194
|
||||
```
|
||||
|
||||
#### `!fetch`
|
||||
|
||||
This command retrieves the generation parameters from a previously generated
|
||||
image and either loads them into the command line (Linux|Mac), or prints them
|
||||
out in a comment for copy-and-paste (Windows). You may provide either the name
|
||||
of a file in the current output directory, or a full file path. To process a
batch, give the path to a folder of PNG files with the wildcard `*.png`,
followed by an output filename such as `commands.txt`. The commands used to
generate those images are then saved to that file for further processing.
|
||||
|
||||
!!! example "load the generation command for a single png file"
|
||||
|
||||
```bash
|
||||
invoke> !fetch 0000015.8929913.png
|
||||
# the script returns the next line, ready for editing and running:
|
||||
invoke> a fantastic alien landscape -W 576 -H 512 -s 60 -A plms -C 7.5
|
||||
```
|
||||
|
||||
!!! example "fetch the generation commands from a batch of files and store them into `selected.txt`"
|
||||
|
||||
```bash
|
||||
invoke> !fetch outputs\selected-imgs\*.png selected.txt
|
||||
```
|
||||
|
||||
#### `!replay`
|
||||
|
||||
This command replays a text file generated by `!fetch` or created manually.
|
||||
|
||||
!!! example
|
||||
|
||||
```bash
|
||||
invoke> !replay outputs\selected-imgs\selected.txt
|
||||
```
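
A command file for `!replay` can also be produced by a short script. The sketch below is only an illustration: it assumes the file holds one `invoke>` command per line, and the helper name `write_command_file` and the output filename are hypothetical. The `-S<int>` seed switch is the one documented in the prompt arguments.

```python
import os
import tempfile


def write_command_file(path, prompt, seeds, extra_args=""):
    """Write one command per line, varying only the seed (assumed file layout)."""
    lines = [f"{prompt} -S{seed} {extra_args}".strip() for seed in seeds]
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return lines


# Hypothetical usage: three seeds for the same prompt, written to a temp folder.
target = os.path.join(tempfile.mkdtemp(), "selected.txt")
commands = write_command_file(target, "a fantastic alien landscape", [1, 2, 3], "-W 576 -H 512")
print("\n".join(commands))
```

The resulting file could then be passed to `!replay` in the same way as a file written by `!fetch`.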
|
||||
|
||||
!!! note
|
||||
|
||||
These commands may behave unexpectedly if given a PNG file that was
|
||||
not generated by InvokeAI.
|
||||
|
||||
#### `!search <search string>`
|
||||
|
||||
This is similar to `!history`, but it only returns lines that contain
|
||||
`search string`. For example:
|
||||
|
||||
```bash
|
||||
invoke> !search surreal
|
||||
[21] surrealist painting of beautiful woman sitting under tree wearing broad hat and flowing garment -v0.2 -n6 -S2878767194
|
||||
```
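
Conceptually, `!search` is a substring filter over the numbered history. The following is a minimal sketch of that idea, not the CLI's actual implementation. The function name `search_history` is hypothetical, and case-sensitive matching is an assumption.

```python
def search_history(history, term):
    """Return the (number, command) pairs whose command contains term."""
    return [(number, command) for number, command in history if term in command]


# Shortened versions of the history entries shown above.
history = [
    (20, "watercolor of beautiful woman sitting under tree -v0.2 -n6 -S2878767194"),
    (21, "surrealist painting of beautiful woman sitting under tree -v0.2 -n6 -S2878767194"),
]

for number, command in search_history(history, "surreal"):
    print(f"[{number}] {command}")
```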
|
||||
|
||||
#### `!clear`
|
||||
|
||||
This clears the command history from memory and disk. Be advised that this
|
||||
operation is irreversible and does not issue any warnings!
|
||||
|
||||
## Command-line editing and completion
|
||||
|
||||
The command-line offers convenient history tracking, editing, and command
|
||||
completion.
|
||||
|
||||
- To scroll through previous commands and potentially edit/reuse them, use the
|
||||
++up++ and ++down++ keys.
|
||||
- To edit the current command, use the ++left++ and ++right++ keys to position
|
||||
the cursor, and then ++backspace++, ++delete++ or insert characters.
|
||||
- To move to the very beginning of the command, type ++ctrl+a++ (or
|
||||
++command+a++ on the Mac)
|
||||
- To move to the end of the command, type ++ctrl+e++.
|
||||
- To cut a section of the command, position the cursor where you want to start
|
||||
cutting and type ++ctrl+k++
|
||||
- To paste a cut section back in, position the cursor where you want to paste,
|
||||
and type ++ctrl+y++
|
||||
|
||||
Windows users can get similar, but more limited, functionality if they launch
|
||||
`invoke.py` with the `winpty` program and have the `pyreadline3` library
|
||||
installed:
|
||||
|
||||
```batch
|
||||
> winpty python scripts\invoke.py
|
||||
```
|
||||
|
||||
On the Mac and Linux platforms, when you exit invoke.py, the last 1000 lines of
|
||||
your command-line history will be saved. When you restart `invoke.py`, you can
|
||||
access the saved history using the ++up++ key.
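
For readers curious how this kind of persistence is usually wired up, here is a minimal sketch using Python's standard `readline` module (available on Mac and Linux). It is an illustration under assumptions, not InvokeAI's own code: the function name `enable_history` and the history file location are hypothetical.

```python
import atexit
import os
import readline

HISTORY_LIMIT = 1000  # matches the 1000-line limit described above


def enable_history(history_path, limit=HISTORY_LIMIT):
    """Load saved history if present and arrange for it to be written on exit."""
    if os.path.exists(history_path):
        readline.read_history_file(history_path)
    # Keep at most `limit` lines when the history file is written.
    readline.set_history_length(limit)
    atexit.register(readline.write_history_file, history_path)


# Hypothetical usage:
# enable_history(os.path.expanduser("~/.my_cli_history"))
```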
|
||||
|
||||
In addition, limited command-line completion is installed. In various contexts,
|
||||
you can start typing your command and press ++tab++. A list of potential
|
||||
completions will be presented to you. You can then type a little more, hit
|
||||
++tab++ again, and eventually autocomplete what you want.
|
||||
|
||||
When specifying file paths using the one-letter shortcuts, the CLI will attempt
|
||||
to complete pathnames for you. This is most handy for the `-I` (init image) and
|
||||
`-M` (init mask) paths. To initiate completion, start the path with a slash
|
||||
(`/`) or `./`. For example:
|
||||
|
||||
```bash
|
||||
invoke> zebra with a mustache -I./test-pictures<TAB>
|
||||
-I./test-pictures/Lincoln-and-Parrot.png -I./test-pictures/zebra.jpg -I./test-pictures/madonna.png
|
||||
-I./test-pictures/bad-sketch.png -I./test-pictures/man_with_eagle/
|
||||
```
|
||||
|
||||
You can then type ++z++, hit ++tab++ again, and it will autofill to `zebra.jpg`.
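
The path completion described above amounts to listing filesystem entries that share the typed prefix. A minimal sketch of that lookup follows; the function name `complete_path` is hypothetical and the CLI's real completer may differ in details such as sorting.

```python
import glob
import os


def complete_path(prefix):
    """Return sorted entries starting with prefix; directories get a trailing slash."""
    matches = []
    # glob.escape keeps characters like '[' in the typed prefix from acting as wildcards.
    for path in glob.glob(glob.escape(prefix) + "*"):
        matches.append(path + "/" if os.path.isdir(path) else path)
    return sorted(matches)


# Hypothetical usage: complete_path("./test-pictures/z")
```

A function like this would typically be called from a `readline` completer, which supplies the text typed so far and receives the candidates one at a time.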
|
||||
|
||||
More text completion features (such as autocompleting seeds) are on their way.
|
589
docs/features/CLI.md
Normal file
@ -0,0 +1,589 @@
|
||||
---
|
||||
title: Command-Line Interface
|
||||
---
|
||||
|
||||
# :material-bash: CLI
|
||||
|
||||
## **Interactive Command Line Interface**
|
||||
|
||||
The InvokeAI command line interface (CLI) provides scriptable access
|
||||
to InvokeAI's features. Some advanced features are only available
|
||||
through the CLI, though they eventually find their way into the WebUI.
|
||||
|
||||
The CLI is accessible from the `invoke.sh`/`invoke.bat` launcher by
|
||||
selecting option (1). Alternatively, it can be launched directly from
|
||||
the command line by activating the InvokeAI environment and giving the
|
||||
command:
|
||||
|
||||
```bash
|
||||
invokeai
|
||||
```
|
||||
|
||||
After some startup messages, you will be presented with the `invoke> `
|
||||
prompt. Here you can type prompts to generate images and issue other
|
||||
commands to load and manipulate generative models. The CLI has a large
|
||||
number of command-line options that control its behavior. To get a
|
||||
concise summary of the options, call `invokeai` with the `--help` argument:
|
||||
|
||||
```bash
|
||||
invokeai --help
|
||||
```
|
||||
|
||||
The script uses the readline library to allow for in-line editing, command
|
||||
history (++up++ and ++down++), autocompletion, and more. To help keep track of
|
||||
which prompts generated which images, the script writes a log file of image
|
||||
names and prompts to the selected output directory.
|
||||
|
||||
Here is a typical session:
|
||||
|
||||
```bash
|
||||
PS1:C:\Users\fred> invokeai
|
||||
* Initializing, be patient...
|
||||
* Initializing, be patient...
|
||||
>> Initialization file /home/lstein/invokeai/invokeai.init found. Loading...
|
||||
>> Internet connectivity is True
|
||||
>> InvokeAI, version 2.3.0-rc5
|
||||
>> InvokeAI runtime directory is "/home/lstein/invokeai"
|
||||
>> GFPGAN Initialized
|
||||
>> CodeFormer Initialized
|
||||
>> ESRGAN Initialized
|
||||
>> Using device_type cuda
|
||||
>> xformers memory-efficient attention is available and enabled
|
||||
(...more initialization messages...)
|
||||
* Initialization done! Awaiting your command (-h for help, 'q' to quit)
|
||||
invoke> ashley judd riding a camel -n2 -s150
|
||||
Outputs:
|
||||
outputs/img-samples/00009.png: "ashley judd riding a camel" -n2 -s150 -S 416354203
|
||||
outputs/img-samples/00010.png: "ashley judd riding a camel" -n2 -s150 -S 1362479620
|
||||
|
||||
invoke> "there's a fly in my soup" -n6 -g
|
||||
outputs/img-samples/00011.png: "there's a fly in my soup" -n6 -g -S 2685670268
|
||||
seeds for individual rows: [2685670268, 1216708065, 2335773498, 822223658, 714542046, 3395302430]
|
||||
invoke> q
|
||||
```
|
||||
|
||||

|
||||
|
||||
## Arguments
|
||||
|
||||
The script recognizes a series of command-line switches that will
|
||||
change important global defaults, such as the directory for image
|
||||
outputs and the location of the model weight files.
|
||||
|
||||
### List of arguments recognized at the command line
|
||||
|
||||
These command-line arguments can be passed to `invoke.py` when you first run it
|
||||
from the Windows, Mac or Linux command line. Some set defaults that can be
|
||||
overridden on a per-prompt basis (see
|
||||
[List of prompt arguments](#list-of-prompt-arguments)).
|
||||
|
||||
| Argument <img width="240" align="right"/> | Shortcut <img width="100" align="right"/> | Default <img width="320" align="right"/> | Description |
|
||||
| ----------------------------------------- | ----------------------------------------- | ---------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
|
||||
| `--help` | `-h` | | Print a concise help message. |
|
||||
| `--outdir <path>` | `-o<path>` | `outputs/img_samples` | Location for generated images. |
|
||||
| `--prompt_as_dir` | `-p` | `False` | Name output directories using the prompt text. |
|
||||
| `--from_file <path>` | | `None` | Read list of prompts from a file. Use `-` to read from standard input |
|
||||
| `--model <modelname>` | | `stable-diffusion-1.5` | Loads the initial model specified in configs/models.yaml. |
|
||||
| `--ckpt_convert`                           |                                           | `False`                                        | If provided, both .ckpt and .safetensors files will be auto-converted into diffusers format in memory |
|
||||
| `--autoconvert <path>`                     |                                           | `None`                                         | On startup, scan the indicated directory for new .ckpt/.safetensors files and automatically convert and import them |
|
||||
| `--precision` | | `fp16` | Provide `fp32` for full precision mode, `fp16` for half-precision. `fp32` needed for Macintoshes and some NVidia cards. |
|
||||
| `--png_compression <0-9>` | `-z<0-9>` | `6` | Select level of compression for output files, from 0 (no compression) to 9 (max compression) |
|
||||
| `--safety-checker` | | `False` | Activate safety checker for NSFW and other potentially disturbing imagery |
|
||||
| `--patchmatch`, `--no-patchmatch` | | `--patchmatch` | Load/Don't load the PatchMatch inpainting extension |
|
||||
| `--xformers`, `--no-xformers` | | `--xformers` | Load/Don't load the Xformers memory-efficient attention module (CUDA only) |
|
||||
| `--web` | | `False` | Start in web server mode |
|
||||
| `--host <ip addr>` | | `localhost` | Which network interface web server should listen on. Set to 0.0.0.0 to listen on any. |
|
||||
| `--port <port>` | | `9090` | Which port web server should listen for requests on. |
|
||||
| `--config <path>` | | `configs/models.yaml` | Configuration file for models and their weights. |
|
||||
| `--iterations <int>` | `-n<int>` | `1` | How many images to generate per prompt. |
|
||||
| `--width <int>` | `-W<int>` | `512` | Width of generated image |
|
||||
| `--height <int>`                           | `-H<int>`                                 | `512`                                          | Height of generated image                                                                            |
| `--steps <int>`                            | `-s<int>`                                 | `50`                                           | How many steps of refinement to apply                                                                |
|
||||
| `--strength <float>`                       | `-f<float>`                               | `0.75`                                         | For img2img: how hard to try to match the prompt to the initial image. Ranges from 0.0-0.99, with higher values replacing the initial image completely. |
|
||||
| `--fit` | `-F` | `False` | For img2img: scale the init image to fit into the specified -H and -W dimensions |
|
||||
| `--grid` | `-g` | `False` | Save all image series as a grid rather than individually. |
|
||||
| `--sampler <sampler>` | `-A<sampler>` | `k_lms` | Sampler to use. Use `-h` to get list of available samplers. |
|
||||
| `--seamless` | | `False` | Create interesting effects by tiling elements of the image. |
|
||||
| `--embedding_path <path>` | | `None` | Path to pre-trained embedding manager checkpoints, for custom models |
|
||||
| `--gfpgan_model_path` | | `experiments/pretrained_models/GFPGANv1.4.pth` | Path to GFPGAN model file. |
|
||||
| `--free_gpu_mem` | | `False` | Free GPU memory after sampling, to allow image decoding and saving in low VRAM conditions |
|
||||
| `--precision` | | `auto` | Set model precision, default is selected by device. Options: auto, float32, float16, autocast |
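
To give a feel for the `--png_compression` range: PNG image data is compressed with DEFLATE, and zlib exposes the same 0 to 9 scale. The sketch below uses Python's `zlib` directly on sample bytes. It assumes the option corresponds to a zlib-style level, which this document does not state, and it does not write a PNG file.

```python
import zlib

# Repetitive sample data standing in for image bytes (illustrative only).
data = bytes(range(256)) * 64

sizes = {level: len(zlib.compress(data, level)) for level in (0, 6, 9)}
for level, size in sizes.items():
    print(f"level {level}: {size} bytes (input {len(data)} bytes)")

# Every level is lossless: decompressing returns the original bytes.
restored = zlib.decompress(zlib.compress(data, 9))
```

Higher levels trade CPU time for smaller output, while the decoded data is identical at every level.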
|
||||
|
||||
!!! warning "These arguments are deprecated but still work"
|
||||
|
||||
<div align="center" markdown>
|
||||
|
||||
| Argument | Shortcut | Default | Description |
|
||||
|--------------------|------------|---------------------|--------------|
|
||||
| `--full_precision` | | `False` | Same as `--precision=fp32`|
|
||||
| `--weights <path>` | | `None` | Path to weights file; use `--model stable-diffusion-1.4` instead |
|
||||
| `--laion400m` | `-l` | `False` | Use older LAION400m weights; use `--model=laion400m` instead |
|
||||
|
||||
</div>
|
||||
|
||||
!!! tip
|
||||
|
||||
On Windows systems, you may run into
|
||||
problems when passing the invoke script standard backslashed path
|
||||
names because the Python interpreter treats "\" as an escape.
|
||||
You can either double your slashes (ick): `C:\\path\\to\\my\\file`, or
|
||||
use Linux/Mac style forward slashes (better): `C:/path/to/my/file`.
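
The effect can be reproduced with Python's `shlex` module, which tokenizes text the way a POSIX shell does. This is a sketch that assumes the command line is split by a shell-style tokenizer of this kind; the file path is made up.

```python
import shlex

single = r"-I C:\path\to\my\file.png"       # each backslash is consumed as an escape
doubled = r"-I C:\\path\\to\\my\\file.png"  # doubled backslashes survive as single ones
forward = "-I C:/path/to/my/file.png"       # forward slashes need no escaping

for text in (single, doubled, forward):
    print(shlex.split(text))
```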
|
||||
|
||||
## The .invokeai initialization file
|
||||
|
||||
To start up invoke.py with your preferred settings, place your desired
|
||||
startup options in a file in your home directory named `.invokeai`. The
|
||||
file should contain the startup options as you would type them on the
|
||||
command line (`--steps=10 --grid`), one argument per line, or a
|
||||
mixture of both using any of the accepted command switch formats:
|
||||
|
||||
!!! example "my unmodified initialization file"
|
||||
|
||||
```bash title="~/.invokeai" linenums="1"
|
||||
# InvokeAI initialization file
|
||||
# This is the InvokeAI initialization file, which contains command-line default values.
|
||||
# Feel free to edit. If anything goes wrong, you can re-initialize this file by deleting
|
||||
# or renaming it and then running invokeai-configure again.
|
||||
|
||||
# The --root option below points to the folder in which InvokeAI stores its models, configs and outputs.
|
||||
--root="/Users/mauwii/invokeai"
|
||||
|
||||
# the --outdir option controls the default location of image files.
|
||||
--outdir="/Users/mauwii/invokeai/outputs"
|
||||
|
||||
# You may place other frequently-used startup commands here, one or more per line.
|
||||
# Examples:
|
||||
# --web --host=0.0.0.0
|
||||
# --steps=20
|
||||
# -Ak_euler_a -C10.0
|
||||
```
|
||||
|
||||
!!! note
|
||||
|
||||
The initialization file only accepts the command line arguments.
|
||||
There are additional arguments that you can provide on the `invoke>` command
|
||||
line (such as `-n` or `--iterations`) that cannot be entered into this file.
|
||||
Also be alert for empty blank lines at the end of the file, which will cause
|
||||
an arguments error at startup time.
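
To make the file format concrete, here is a minimal sketch of how such a file could be turned into an argument list. It is not InvokeAI's loader: the function name `read_init_args` is hypothetical, and unlike the behaviour described in the note, this sketch simply skips blank lines.

```python
import shlex


def read_init_args(text):
    """Collect switches from init-file text, ignoring comments and blank lines."""
    args = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # A line may hold one switch or several, as on the command line.
        args.extend(shlex.split(line))
    return args


sample = """# InvokeAI initialization file
--root="/Users/mauwii/invokeai"
--web --host=0.0.0.0
-Ak_euler_a -C10.0
"""
print(read_init_args(sample))
```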
|
||||
|
||||
## List of prompt arguments
|
||||
|
||||
After the invoke.py script initializes, it will present you with an `invoke>`
|
||||
prompt. Here you can enter information to generate images from text
|
||||
([txt2img](#txt2img)), to embellish an existing image or sketch
|
||||
([img2img](#img2img)), or to selectively alter chosen regions of the image
|
||||
([inpainting](#inpainting)).
|
||||
|
||||
### txt2img
|
||||
|
||||
!!! example ""
|
||||
|
||||
```bash
|
||||
invoke> waterfall and rainbow -W640 -H480
|
||||
```
|
||||
|
||||
This will create the requested image with the dimensions 640 (width)
|
||||
and 480 (height).
|
||||
|
||||
Here are the `invoke>` command arguments that apply to txt2img:
|
||||
|
||||
| Argument <img width="680" align="right"/> | Shortcut <img width="420" align="right"/> | Default <img width="480" align="right"/> | Description |
|
||||
| ----------------------------------------- | ----------------------------------------- | ---------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
||||
| "my prompt" | | | Text prompt to use. The quotation marks are optional. |
|
||||
| `--width <int>` | `-W<int>` | `512` | Width of generated image |
|
||||
| `--height <int>` | `-H<int>` | `512` | Height of generated image |
|
||||
| `--iterations <int>` | `-n<int>` | `1` | How many images to generate from this prompt |
|
||||
| `--steps <int>` | `-s<int>` | `50` | How many steps of refinement to apply |
|
||||
| `--cfg_scale <float>` | `-C<float>` | `7.5` | How hard to try to match the prompt to the generated image; any number greater than 1.0 works, but the useful range is roughly 5.0 to 20.0 |
|
||||
| `--seed <int>` | `-S<int>` | `None` | Set the random seed for the next series of images. This can be used to recreate an image generated previously. |
|
||||
| `--sampler <sampler>` | `-A<sampler>` | `k_lms` | Sampler to use. Use -h to get list of available samplers. |
|
||||
| `--karras_max <int>`                      |                                           | `29`                                     | When using k\_\* samplers, set the maximum number of steps before shifting from using the Karras noise schedule (good for low step counts) to the LatentDiffusion noise schedule (good for high step counts). This value is sticky.             |
|
||||
| `--hires_fix` | | | Larger images often have duplication artefacts. This option suppresses duplicates by generating the image at low res, and then using img2img to increase the resolution |
|
||||
| `--png_compression <0-9>` | `-z<0-9>` | `6` | Select level of compression for output files, from 0 (no compression) to 9 (max compression) |
|
||||
| `--grid` | `-g` | `False` | Turn on grid mode to return a single image combining all the images generated by this prompt |
|
||||
| `--individual` | `-i` | `True` | Turn off grid mode (deprecated; leave off --grid instead) |
|
||||
| `--outdir <path>` | `-o<path>` | `outputs/img_samples` | Temporarily change the location of these images |
|
||||
| `--seamless` | | `False` | Activate seamless tiling for interesting effects |
|
||||
| `--seamless_axes` | | `x,y` | Specify which axes to use circular convolution on. |
|
||||
| `--log_tokenization` | `-t` | `False` | Display a color-coded list of the parsed tokens derived from the prompt |
|
||||
| `--skip_normalization` | `-x` | `False` | Weighted subprompts will not be normalized. See [Weighted Prompts](./OTHER.md#weighted-prompts) |
|
||||
| `--upscale <int> <float>` | `-U <int> <float>` | `-U 1 0.75` | Upscale image by magnification factor (2, 4), and set strength of upscaling (0.0-1.0). If strength not set, will default to 0.75. |
|
||||
| `--facetool_strength <float>` | `-G <float> ` | `-G0` | Fix faces (defaults to using the GFPGAN algorithm); argument indicates how hard the algorithm should try (0.0-1.0) |
|
||||
| `--facetool <name>` | `-ft <name>` | `-ft gfpgan` | Select face restoration algorithm to use: gfpgan, codeformer |
|
||||
| `--codeformer_fidelity` | `-cf <float>` | `0.75` | Used along with CodeFormer. Takes values between 0 and 1. 0 produces high quality but low accuracy. 1 produces high accuracy but low quality |
|
||||
| `--save_original` | `-save_orig` | `False` | When upscaling or fixing faces, this will cause the original image to be saved rather than replaced. |
|
||||
| `--variation <float>`                     | `-v<float>`                               | `0.0`                                    | Add a bit of noise (0.0=none, 1.0=high) to the image in order to generate a series of variations. Usually used in combination with `-S<seed>` and `-n<int>` to generate a series of riffs on a starting image. See [Variations](./VARIATIONS.md). |
|
||||
| `--with_variations <pattern>` | | `None` | Combine two or more variations. See [Variations](./VARIATIONS.md) for now to use this. |
|
||||
| `--save_intermediates <n>` | | `None` | Save the image from every nth step into an "intermediates" folder inside the output directory |
|
||||
| `--h_symmetry_time_pct <float>` | | `None` | Create symmetry along the X axis at the desired percent complete of the generation process. (Must be between 0.0 and 1.0; set to a very small number like 0.0001 for just after the first step of generation.) |
|
||||
| `--v_symmetry_time_pct <float>` | | `None` | Create symmetry along the Y axis at the desired percent complete of the generation process. (Must be between 0.0 and 1.0; set to a very small number like 0.0001 for just after the first step of generation.) |
|
||||
|
||||
!!! note
|
||||
|
||||
The width and height of the image must be multiples of 64. You can
|
||||
provide different values, but they will be rounded down to the nearest multiple
|
||||
of 64.
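
The rounding rule is plain integer arithmetic. A minimal sketch, with a hypothetical helper name:

```python
def round_down_to_multiple(value, multiple=64):
    """Round value down to the nearest multiple (64 by default)."""
    return (value // multiple) * multiple


for requested in (512, 640, 650, 480):
    print(requested, "->", round_down_to_multiple(requested))
```

Note that 480 is not itself a multiple of 64 (7 x 64 = 448, 8 x 64 = 512), so under this rule it maps to 448.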
|
||||
|
||||
!!! example "This is an example of img2img"
|
||||
|
||||
```bash
|
||||
invoke> waterfall and rainbow -I./vacation-photo.png -W640 -H480 --fit
|
||||
```
|
||||
|
||||
This will modify the indicated vacation photograph by making it more like the
|
||||
prompt. Results will vary greatly depending on what is in the image. We also ask
|
||||
to --fit the image into a box no bigger than 640x480. Otherwise the image size
|
||||
will be identical to the provided photo and you may run out of memory if it is
|
||||
large.
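
As an illustration of what fitting into a box involves, the sketch below scales an image size so that it fits inside the requested width and height. It assumes the aspect ratio is preserved and that smaller images are scaled up as well as larger ones scaled down; neither detail is stated here, and the function name `fit_within` is hypothetical.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) to fit inside the box, keeping the aspect ratio."""
    # Integer cross-multiplication avoids floating-point rounding surprises.
    if width * max_height >= height * max_width:
        # Width is the limiting side.
        return max_width, max(1, height * max_width // width)
    # Height is the limiting side.
    return max(1, width * max_height // height), max_height


print(fit_within(4000, 3000, 640, 480))  # a 4:3 photo fills the box exactly
print(fit_within(3000, 3000, 640, 480))  # a square photo is limited by the height
```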
|
||||
|
||||
In addition to the command-line options recognized by txt2img, img2img accepts
|
||||
additional options:
|
||||
|
||||
| Argument <img width="160" align="right"/> | Shortcut | Default | Description |
|
||||
| ----------------------------------------- | ----------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
|
||||
| `--init_img <path>` | `-I<path>` | `None` | Path to the initialization image |
|
||||
| `--fit` | `-F` | `False` | Scale the image to fit into the specified -H and -W dimensions |
|
||||
| `--strength <float>`                      | `-f<float>` | `0.75`  | How hard to try to match the prompt to the initial image. Ranges from 0.0-0.99, with higher values replacing the initial image completely. |
|
||||
|
||||
### inpainting
|
||||
|
||||
!!! example ""
|
||||
|
||||
```bash
|
||||
invoke> waterfall and rainbow -I./vacation-photo.png -M./vacation-mask.png -W640 -H480 --fit
|
||||
```
|
||||
|
||||
This will do the same thing as img2img, but image alterations will
|
||||
only occur within transparent areas defined by the mask file specified
|
||||
by `-M`. You may also supply just a single initial image with the areas
|
||||
to overpaint made transparent, but you must be careful not to destroy
|
||||
the pixels underneath when you create the transparent areas. See
|
||||
[Inpainting](./INPAINTING.md) for details.
|
||||
|
||||
inpainting accepts all the arguments used for txt2img and img2img, as well as
|
||||
the `--init_mask` (`-M`), `--invert_mask`, and `--text_mask` (`-tm`) arguments:
|
||||
|
||||
| Argument <img width="100" align="right"/> | Shortcut | Default | Description |
|
||||
| ----------------------------------------- | ------------------------ | ------- | ------------------------------------------------------------------------------------------------ |
|
||||
| `--init_mask <path>` | `-M<path>` | `None` | Path to an image the same size as the initial_image, with areas for inpainting made transparent. |
|
||||
| `--invert_mask` | | `False` | If true, invert the mask so that transparent areas are opaque and vice versa. |
|
||||
| `--text_mask <prompt> [<float>]` | `-tm <prompt> [<float>]` | `None` | Create a mask from a text prompt describing part of the image |
|
||||
|
||||
The mask may either be an image with transparent areas, in which case the
|
||||
inpainting will occur in the transparent areas only, or a black and white image,
|
||||
in which case all black areas will be painted into.
|
||||
|
||||
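The two mask conventions described above can be summarised per pixel. This is only an illustrative sketch: `is_inpaint_pixel` is a hypothetical name, pixels are modelled as plain Python values rather than a real image object, and treating only fully transparent pixels (alpha of 0) as "transparent" is an assumption.

```python
def is_inpaint_pixel(pixel, invert=False):
    """Return True if this mask pixel marks an area to repaint.

    `pixel` is either an (R, G, B, A) tuple or a grayscale int (0 = black).
    Hypothetical helper; the alpha cutoff of 0 is an assumption.
    """
    if isinstance(pixel, tuple):
        selected = pixel[3] == 0  # transparent area of an RGBA mask
    else:
        selected = pixel == 0     # black area of a black-and-white mask
    # An inverted mask (as with --invert_mask) flips the selection.
    return selected != invert


print(is_inpaint_pixel((255, 255, 255, 0)))  # transparent pixel
print(is_inpaint_pixel(255))                 # white pixel
```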
`--text_mask` (short form `-tm`) is a way to generate a mask using a text
|
||||
description of the part of the image to replace. For example, if you have an
|
||||
image of a breakfast plate with a bagel, toast and scrambled eggs, you can
|
||||
selectively mask the bagel and replace it with a piece of cake this way:
|
||||
|
||||
```bash
|
||||
invoke> a piece of cake -I /path/to/breakfast.png -tm bagel
|
||||
```
|
||||
|
||||
The algorithm uses <a
|
||||
href="https://github.com/timojl/clipseg">clipseg</a> to classify different
|
||||
regions of the image. The classifier puts out a confidence score for each region
|
||||
it identifies. Generally regions that score above 0.5 are reliable, but if you
|
||||
are getting too much or too little masking you can adjust the threshold down (to
|
||||
get more mask), or up (to get less). In this example, by passing `-tm` a higher
|
||||
value, we are insisting on a more stringent classification.
|
||||
|
||||
```bash
|
||||
invoke> a piece of cake -I /path/to/breakfast.png -tm bagel 0.6
|
||||
```
|
||||
|
||||
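To see why lowering the threshold gives more mask and raising it gives less, consider a toy grid of confidence scores. The sketch below does not call clipseg; the scores are made up, and `mask_from_scores` is a hypothetical helper that only illustrates the thresholding step.

```python
def mask_from_scores(scores, threshold=0.5):
    """Mark every cell whose confidence score exceeds the threshold."""
    return [[score > threshold for score in row] for row in scores]


def masked_cells(mask):
    """Count how many cells ended up inside the mask."""
    return sum(cell for row in mask for cell in row)


# Made-up confidence scores for a 2x3 grid of image regions.
scores = [
    [0.90, 0.55, 0.20],
    [0.70, 0.45, 0.10],
]

for threshold in (0.4, 0.5, 0.6):
    print(threshold, masked_cells(mask_from_scores(scores, threshold)))
```

The count shrinks as the threshold rises, which mirrors the stricter classification requested by `-tm bagel 0.6`.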
### Custom Styles and Subjects
|
||||
|
||||
You can load and use hundreds of community-contributed Textual
|
||||
Inversion models just by typing the appropriate trigger phrase. Please
|
||||
see [Concepts Library](CONCEPTS.md) for more details.
|
||||
|
||||
## Other Commands
|
||||
|
||||
The CLI offers a number of commands that begin with "!".
|
||||
|
||||
### Postprocessing images
|
||||
|
||||
To postprocess a file using face restoration or upscaling, use the `!fix`
|
||||
command.
|
||||
|
||||
#### `!fix`
|
||||
|
||||
This command runs a post-processor on a previously-generated image. It takes a
|
||||
PNG filename or path and applies your choice of the `-U`, `-G`, or `--embiggen`
|
||||
switches in order to fix faces or upscale. If you provide a filename, the script
|
||||
will look for it in the current output directory. Otherwise you can provide a
|
||||
full or partial path to the desired file.
|
||||
|
||||
Some examples:
|
||||
|
||||
!!! example "Upscale to 4X its original size and fix faces using codeformer"
|
||||
|
||||
```bash
|
||||
invoke> !fix 0000045.4829112.png -G1 -U4 -ft codeformer
|
||||
```
|
||||
|
||||
!!! example "Use the GFPGAN algorithm to fix faces, then upscale to 3X using --embiggen"
|
||||
|
||||
```bash
|
||||
invoke> !fix 0000045.4829112.png -G0.8 -ft gfpgan
|
||||
>> fixing outputs/img-samples/0000045.4829112.png
|
||||
>> retrieved seed 4829112 and prompt "boy enjoying a banana split"
|
||||
>> GFPGAN - Restoring Faces for image seed:4829112
|
||||
Outputs:
|
||||
[1] outputs/img-samples/000017.4829112.gfpgan-00.png: !fix "outputs/img-samples/0000045.4829112.png" -s 50 -S -W 512 -H 512 -C 7.5 -A k_lms -G 0.8
|
||||
```
|
||||
|
||||
#### `!mask`
|
||||
|
||||
This command takes an image, a text prompt, and uses the `clipseg` algorithm to
|
||||
automatically generate a mask of the area that matches the text prompt. It is
|
||||
useful for debugging the text masking process prior to inpainting with the
|
||||
`--text_mask` argument. See [Inpainting](./INPAINTING.md) for details.
|
||||
|
||||
### Model selection and importation
|
||||
|
||||
The CLI allows you to add new models on the fly, as well as to switch
|
||||
among them rapidly without leaving the script. There are several
|
||||
different model formats, each described in the [Model Installation
|
||||
Guide](../installation/050_INSTALLING_MODELS.md).
|
||||
|
||||
#### `!models`
|
||||
|
||||
This prints out a list of the models defined in `config/models.yaml`. The active
|
||||
model is bold-faced.
|
||||
|
||||
Example:
|
||||
|
||||
<pre>
|
||||
inpainting-1.5 not loaded Stable Diffusion inpainting model
|
||||
<b>stable-diffusion-1.5 active Stable Diffusion v1.5</b>
|
||||
waifu-diffusion not loaded Waifu Diffusion v1.4
|
||||
</pre>
|
||||
|
||||
#### `!switch <model>`
|
||||
|
||||
This quickly switches from one model to another without leaving the CLI script.
|
||||
`invoke.py` uses a memory caching system; once a model has been loaded,
|
||||
switching back and forth is quick. The following example shows this in action.
|
||||
Note how the second column of the `!models` table changes to `cached` after a
|
||||
model is first loaded, and that the long initialization step is not needed when
|
||||
loading a cached model.
|
||||
|
||||
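The caching behaviour can be sketched with a small dictionary-backed cache. This is an assumption-laden illustration rather than InvokeAI's implementation: `ModelCache` and `slow_loader` are hypothetical names, and the status strings simply mirror the second column of the `!models` table.

```python
class ModelCache:
    """Toy in-memory model cache; `loader` stands in for the slow load step."""

    def __init__(self, known_models, loader):
        self._known = set(known_models)
        self._loader = loader
        self._loaded = {}   # model name -> loaded weights
        self.active = None

    def switch(self, name):
        if name not in self._known:
            raise KeyError(f"unknown model: {name}")
        if name not in self._loaded:
            # Slow path: only taken the first time a model is requested.
            self._loaded[name] = self._loader(name)
        self.active = name
        return self._loaded[name]

    def status(self, name):
        if name == self.active:
            return "active"
        return "cached" if name in self._loaded else "not loaded"


load_calls = []


def slow_loader(name):
    load_calls.append(name)  # record every expensive load
    return f"<weights for {name}>"


cache = ModelCache(["stable-diffusion-1.5", "waifu-diffusion"], slow_loader)
cache.switch("stable-diffusion-1.5")
cache.switch("waifu-diffusion")
cache.switch("stable-diffusion-1.5")  # served from the cache, no reload
print(cache.status("waifu-diffusion"), load_calls)
```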
#### `!import_model <hugging_face_repo_ID>`
|
||||
|
||||
This imports and installs a `diffusers`-style model that is stored on
|
||||
the [HuggingFace Web Site](https://huggingface.co). You can look up
|
||||
any [Stable Diffusion diffusers
|
||||
model](https://huggingface.co/models?library=diffusers) and install it
|
||||
with a command like the following:
|
||||
|
||||
```bash
|
||||
!import_model prompthero/openjourney
|
||||
```
|
||||
|
||||
#### `!import_model <path/to/diffusers/directory>`
|
||||
|
||||
If you have a copy of a `diffusers`-style model saved to disk, you can
|
||||
import it by passing the path to the model's top-level directory.
|
||||
|
||||
#### `!import_model <url>`
|
||||
|
||||
For a `.ckpt` or `.safetensors` file, if you have a direct download
|
||||
URL for the file, you can provide it to `!import_model` and the file
|
||||
will be downloaded and installed for you.
|
||||
|
||||
#### `!import_model <path/to/model/weights.ckpt>`
|
||||
|
||||
This command imports a new model weights file into InvokeAI, makes it available
|
||||
for image generation within the script, and writes out the configuration for the
|
||||
model into `config/models.yaml` for use in subsequent sessions.
|
||||
|
||||
Provide `!import_model` with the path to a weights file ending in `.ckpt`. If
|
||||
you type a partial path and press tab, the CLI will autocomplete. Although it
|
||||
will also autocomplete to `.vae` files, these are not currently supported (but
|
||||
will be soon).
|
||||
|
||||
When you hit return, the CLI will prompt you to fill in additional information
|
||||
about the model, including the short name you wish to use for it with the
|
||||
`!switch` command, a brief description of the model, the default image width and
|
||||
height to use with this model, and the model's configuration file. The latter
|
||||
three fields are automatically filled with reasonable defaults. In the example
|
||||
below, the bold-faced text shows what the user typed in with the exception of
|
||||
the width, height and configuration file paths, which were filled in
|
||||
automatically.
|
||||
|
||||
#### `!import_model <path/to/directory_of_models>`
|
||||
|
||||
If you provide the path of a directory that contains one or more
|
||||
`.ckpt` or `.safetensors` files, the CLI will scan the directory and
|
||||
interactively offer to import the models it finds there. Also see the
|
||||
`--autoconvert` command-line option.
|
||||
|
||||
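The variants of `!import_model` listed above differ only in the kind of argument they receive. One way to picture the dispatch is the sketch below. The function name and the heuristics are assumptions made for illustration: in particular, recognising a `diffusers` directory by its `model_index.json` file and a repo ID by an `owner/name` pattern is a guess at a reasonable rule, not a description of how the CLI actually decides.

```python
import re
from pathlib import Path

WEIGHT_SUFFIXES = {".ckpt", ".safetensors"}


def classify_import_source(arg: str) -> str:
    """Guess which kind of `!import_model` argument was given (hypothetical)."""
    if re.match(r"https?://", arg):
        return "url"
    path = Path(arg)
    if path.is_dir():
        # Assumption: a diffusers-style model has model_index.json at its top level.
        if (path / "model_index.json").is_file():
            return "diffusers_directory"
        return "directory_of_models"
    if path.suffix.lower() in WEIGHT_SUFFIXES:
        return "weights_file"
    # Assumption: repo IDs look like "owner/name" with no further slashes.
    if re.fullmatch(r"[\w.-]+/[\w.-]+", arg):
        return "hugging_face_repo_id"
    return "unknown"


for example in ("prompthero/openjourney", "/path/to/model/weights.ckpt"):
    print(example, "->", classify_import_source(example))
```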
#### `!edit_model <name_of_model>`
|
||||
|
||||
The `!edit_model` command can be used to modify a model that is already defined
|
||||
in `config/models.yaml`. Call it with the short name of the model you wish to
|
||||
modify, and it will allow you to modify the model's `description`, `weights` and
|
||||
other fields.
|
||||
|
||||
Example:
|
||||
|
||||
<pre>
|
||||
invoke> <b>!edit_model waifu-diffusion</b>
|
||||
>> Editing model waifu-diffusion from configuration file ./configs/models.yaml
|
||||
description: <b>Waifu diffusion v1.4beta</b>
|
||||
weights: models/ldm/stable-diffusion-v1/<b>model-epoch10-float16.ckpt</b>
|
||||
config: configs/stable-diffusion/v1-inference.yaml
|
||||
width: 512
|
||||
height: 512
|
||||
|
||||
>> New configuration:
|
||||
waifu-diffusion:
|
||||
config: configs/stable-diffusion/v1-inference.yaml
|
||||
description: Waifu diffusion v1.4beta
|
||||
weights: models/ldm/stable-diffusion-v1/model-epoch10-float16.ckpt
|
||||
height: 512
|
||||
width: 512
|
||||
|
||||
OK to import [n]? y
|
||||
>> Caching model stable-diffusion-1.4 in system RAM
|
||||
>> Loading waifu-diffusion from models/ldm/stable-diffusion-v1/model-epoch10-float16.ckpt
|
||||
...
|
||||
</pre>
|
||||
|
||||
### History processing
|
||||
|
||||
The CLI provides a series of convenient commands for reviewing previous actions,
|
||||
retrieving them, modifying them, and re-running them.
|
||||
|
||||
#### `!history`
|
||||
|
||||
The invoke script keeps track of all the commands you issue during a session,
|
||||
allowing you to re-run them. On Mac and Linux systems, it also writes the
|
||||
command-line history out to disk, giving you access to the most recent 1000
|
||||
commands issued.
|
||||
|
||||
The `!history` command will return a numbered list of all the commands issued
|
||||
during the session (Windows), or the most recent 1000 commands (Mac|Linux). You
|
||||
can then repeat a command by using the command `!NNN`, where "NNN" is the
|
||||
history line number. For example:
|
||||
|
||||
!!! example ""
|
||||
|
||||
```bash
|
||||
invoke> !history
|
||||
...
|
||||
[14] happy woman sitting under tree wearing broad hat and flowing garment
|
||||
[15] beautiful woman sitting under tree wearing broad hat and flowing garment
|
||||
[18] beautiful woman sitting under tree wearing broad hat and flowing garment -v0.2 -n6
|
||||
[20] watercolor of beautiful woman sitting under tree wearing broad hat and flowing garment -v0.2 -n6 -S2878767194
|
||||
[21] surrealist painting of beautiful woman sitting under tree wearing broad hat and flowing garment -v0.2 -n6 -S2878767194
|
||||
...
|
||||
invoke> !20
|
||||
invoke> watercolor of beautiful woman sitting under tree wearing broad hat and flowing garment -v0.2 -n6 -S2878767194
|
||||
```
|
||||
|
||||
#### `!fetch`
|
||||
|
||||
This command retrieves the generation parameters from a previously generated
|
||||
image and either loads them into the command line (Linux|Mac), or prints them
|
||||
out in a comment for copy-and-paste (Windows). You may provide either the name
|
||||
of a file in the current output directory, or a full file path. You can also
|
||||
specify a path to a folder of PNG files with the wildcard `*.png` to retrieve the
|
||||
generation commands used for those images and save them to a text file (for
|
||||
example `commands.txt`) for further processing.
|
||||
|
||||
!!! example "load the generation command for a single png file"
|
||||
|
||||
```bash
|
||||
invoke> !fetch 0000015.8929913.png
|
||||
# the script returns the next line, ready for editing and running:
|
||||
invoke> a fantastic alien landscape -W 576 -H 512 -s 60 -A plms -C 7.5
|
||||
```
|
||||
|
||||
!!! example "fetch the generation commands from a batch of files and store them into `selected.txt`"
|
||||
|
||||
```bash
|
||||
invoke> !fetch outputs\selected-imgs\*.png selected.txt
|
||||
```
|
||||
|
||||
#### `!replay`
|
||||
|
||||
This command replays a text file generated by `!fetch` or created manually.
|
||||
|
||||
!!! example
|
||||
|
||||
```bash
|
||||
invoke> !replay outputs\selected-imgs\selected.txt
|
||||
```
|
||||
|
||||
!!! note
|
||||
|
||||
These commands may behave unexpectedly if given a PNG file that was
|
||||
not generated by InvokeAI.
|
||||
|
||||
#### `!search <search string>`
|
||||
|
||||
This is similar to `!history`, but it only returns lines that contain
|
||||
`search string`. For example:
|
||||
|
||||
```bash
|
||||
invoke> !search surreal
|
||||
[21] surrealist painting of beautiful woman sitting under tree wearing broad hat and flowing garment -v0.2 -n6 -S2878767194
|
||||
```
|
||||
|
||||
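The numbered history behind `!history`, `!NNN` and `!search` can be modelled with a plain list. The class below is a hypothetical sketch for illustration, not the CLI's readline-based implementation; it numbers commands from 1 and does not model the 1000-line limit or the on-disk history file.

```python
class CommandHistory:
    """Toy numbered command history (hypothetical, for illustration)."""

    def __init__(self):
        self._commands = []

    def add(self, command):
        self._commands.append(command)

    def numbered(self):
        """Return (number, command) pairs, as `!history` would list them."""
        return list(enumerate(self._commands, start=1))

    def recall(self, token):
        """Resolve a token such as "!2" to the stored command."""
        if not (token.startswith("!") and token[1:].isdigit()):
            raise ValueError(f"not a history reference: {token!r}")
        number = int(token[1:])
        if not 1 <= number <= len(self._commands):
            raise IndexError(f"no history entry {number}")
        return self._commands[number - 1]

    def search(self, text):
        """Return only the numbered entries containing `text`, like `!search`."""
        return [(n, c) for n, c in self.numbered() if text in c]


history = CommandHistory()
history.add("happy woman sitting under tree")
history.add("watercolor of beautiful woman sitting under tree -v0.2 -n6")
history.add("surrealist painting of beautiful woman sitting under tree -v0.2 -n6")
print(history.recall("!2"))
print(history.search("surreal"))
```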
#### `!clear`
|
||||
|
||||
This clears the search history from memory and disk. Be advised that this
|
||||
operation is irreversible and does not issue any warnings!
|
||||
|
||||
## Command-line editing and completion
|
||||
|
||||
The command-line offers convenient history tracking, editing, and command
|
||||
completion.
|
||||
|
||||
- To scroll through previous commands and potentially edit/reuse them, use the
|
||||
++up++ and ++down++ keys.
|
||||
- To edit the current command, use the ++left++ and ++right++ keys to position
|
||||
the cursor, and then ++backspace++, ++delete++ or insert characters.
|
||||
- To move to the very beginning of the command, type ++ctrl+a++ (or
|
||||
++command+a++ on the Mac)
|
||||
- To move to the end of the command, type ++ctrl+e++.
|
||||
- To cut a section of the command, position the cursor where you want to start
|
||||
cutting and type ++ctrl+k++
|
||||
- To paste a cut section back in, position the cursor where you want to paste,
|
||||
and type ++ctrl+y++
|
||||
|
||||
Windows users can get similar, but more limited, functionality if they launch
|
||||
`invoke.py` with the `winpty` program and have the `pyreadline3` library
|
||||
installed:
|
||||
|
||||
```batch
|
||||
> winpty python scripts\invoke.py
|
||||
```
|
||||
|
||||
On the Mac and Linux platforms, when you exit invoke.py, the last 1000 lines of
|
||||
your command-line history will be saved. When you restart `invoke.py`, you can
|
||||
access the saved history using the ++up++ key.
|
||||
|
||||
In addition, limited command-line completion is installed. In various contexts,
|
||||
you can start typing your command and press ++tab++. A list of potential
|
||||
completions will be presented to you. You can then type a little more, hit
|
||||
++tab++ again, and eventually autocomplete what you want.
|
||||
|
||||
When specifying file paths using the one-letter shortcuts, the CLI will attempt
|
||||
to complete pathnames for you. This is most handy for the `-I` (init image) and
|
||||
`-M` (init mask) paths. To initiate completion, start the path with a slash
|
||||
(`/`) or `./`. For example:
|
||||
|
||||
```bash
|
||||
invoke> zebra with a mustache -I./test-pictures<TAB>
|
||||
-I./test-pictures/Lincoln-and-Parrot.png -I./test-pictures/zebra.jpg -I./test-pictures/madonna.png
|
||||
-I./test-pictures/bad-sketch.png -I./test-pictures/man_with_eagle/
|
||||
```
|
||||
|
||||
You can then type ++z++, hit ++tab++ again, and it will autofill to `zebra.jpg`.
|
||||
|
||||
More text completion features (such as autocompleting seeds) are on their way.
|
@ -1,11 +1,8 @@
|
||||
---
|
||||
title: Textual Inversion Embeddings and LoRAs
|
||||
title: Concepts Library
|
||||
---
|
||||
|
||||
# :material-library-shelves: Textual Inversions and LoRAs
|
||||
|
||||
With the advances in research, many new capabilities are available to customize the knowledge and understanding of novel concepts not originally contained in the base model.
|
||||
|
||||
# :material-library-shelves: The Hugging Face Concepts Library and Importing Textual Inversion files
|
||||
|
||||
## Using Textual Inversion Files
|
||||
|
||||
@ -15,16 +12,18 @@ and artistic styles. They are also known as "embeds" in the machine learning
|
||||
world.
|
||||
|
||||
Each TI file introduces one or more vocabulary terms to the SD model. These are
|
||||
known in InvokeAI as "triggers." Triggers are denoted using angle brackets
|
||||
as in "<trigger-phrase>". The two most common types of
|
||||
known in InvokeAI as "triggers." Triggers are often, but not always, denoted
|
||||
using angle brackets as in "<trigger-phrase>". The two most common types of
|
||||
TI files that you'll encounter are `.pt` and `.bin` files, which are produced by
|
||||
different TI training packages. InvokeAI supports both formats, but its
|
||||
[built-in TI training system](TRAINING.md) produces `.pt`.
|
||||
[built-in TI training system](TEXTUAL_INVERSION.md) produces `.pt`.
|
||||
|
||||
The [Hugging Face company](https://huggingface.co/sd-concepts-library) has
|
||||
amassed a large library of >800 community-contributed TI files covering a
|
||||
broad range of subjects and styles. You can also install your own or others' TI files
|
||||
by placing them in the designated directory for the compatible model type
|
||||
broad range of subjects and styles. InvokeAI has built-in support for this
|
||||
library which downloads and merges TI files automatically upon request. You can
|
||||
also install your own or others' TI files by placing them in a designated
|
||||
directory.
|
||||
|
||||
### An Example
|
||||
|
||||
@ -42,47 +41,91 @@ You can also combine styles and concepts:
|
||||
| :--------------------------------------------------------: |
|
||||
|  |
|
||||
</figure>
|
||||
## Using a Hugging Face Concept
|
||||
|
||||
!!! warning "Authenticating to HuggingFace"
|
||||
|
||||
Some concepts require valid authentication to HuggingFace. Without it, they will not be downloaded
|
||||
and will be silently ignored.
|
||||
|
||||
If you used an installer to install InvokeAI, you may have already set a HuggingFace token.
|
||||
If you skipped this step, you can:
|
||||
|
||||
- run the InvokeAI configuration script again (if you used a manual installer): `invokeai-configure`
|
||||
- set one of the `HUGGINGFACE_TOKEN` or `HUGGING_FACE_HUB_TOKEN` environment variables to contain your token
|
||||
|
||||
Finally, if you already used any HuggingFace library on your computer, you might already have a token
|
||||
in your local cache. Check for a hidden `.huggingface` directory in your home folder. If it
|
||||
contains a `token` file, then you are all set.
|
||||
|
||||
|
||||
Hugging Face TI concepts are downloaded and installed automatically as you
|
||||
require them. This requires your machine to be connected to the Internet. To
|
||||
find out what each concept is for, you can browse the
|
||||
[Hugging Face concepts library](https://huggingface.co/sd-concepts-library) and
|
||||
look at examples of what each concept produces.
|
||||
|
||||
When you have an idea of a concept you wish to try, go to the command-line
|
||||
client (CLI) and type a `<` character and the beginning of the Hugging Face
|
||||
concept name you wish to load. Press ++tab++, and the CLI will show you all
|
||||
matching concepts. You can also type `<` and hit ++tab++ to get a listing of all
|
||||
~800 concepts, but be prepared to scroll up to see them all! If there is more
|
||||
than one match you can continue to type and ++tab++ until the concept is
|
||||
completed.
|
||||
|
||||
!!! example
|
||||
|
||||
if you type in `<x` and hit ++tab++, you'll be prompted with the completions:
|
||||
|
||||
```py
|
||||
<xatu2> <xatu> <xbh> <xi> <xidiversity> <xioboma> <xuna> <xyz>
|
||||
```
|
||||
|
||||
Now type `id` and press ++tab++. It will be autocompleted to `<xidiversity>`
|
||||
because this is a unique match.
|
||||
|
||||
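The completion behaviour in this example amounts to prefix matching against the list of known concept names. A minimal sketch follows, assuming a small hard-coded list taken from the example above plus `<ghibli-face>`; `complete_concept` is a hypothetical helper, not the CLI's completer.

```python
def complete_concept(prefix, concepts):
    """Return all concept names that start with `prefix`, sorted."""
    return sorted(name for name in concepts if name.startswith(prefix))


# Names copied from the example above; the real library has roughly 800.
CONCEPTS = [
    "<xatu2>", "<xatu>", "<xbh>", "<xi>", "<xidiversity>",
    "<xioboma>", "<xuna>", "<xyz>", "<ghibli-face>",
]

candidates = complete_concept("<x", CONCEPTS)
print(len(candidates), "candidates for <x")

unique = complete_concept("<xid", CONCEPTS)
if len(unique) == 1:
    # A single match can be filled in automatically.
    print("completed to", unique[0])
```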
Finish your prompt and generate as usual. You may include multiple concept terms
|
||||
in the prompt.
|
||||
|
||||
If you have never used this concept before, you will see a message that the TI
|
||||
model is being downloaded and installed. After this, the concept will be saved
|
||||
locally (in the `models/sd-concepts-library` directory) for future use.
|
||||
|
||||
Several steps happen during downloading and installation, including a scan of
|
||||
the file for malicious code. Should any errors occur, you will be warned and the
|
||||
concept will fail to load. Generation will then continue treating the trigger
|
||||
term as a normal string of characters (e.g. as literal `<ghibli-face>`).
|
||||
|
||||
You can also use `<concept-names>` in the WebGUI's prompt textbox. There is no
|
||||
autocompletion at this time.
|
||||
|
||||
## Installing your Own TI Files
|
||||
|
||||
You may install any number of `.pt` and `.bin` files simply by copying them into
|
||||
the `embedding` directory of the corresponding InvokeAI models directory (usually `invokeai`
|
||||
in your home directory). For example, you can simply move a Stable Diffusion 1.5 embedding file to
|
||||
the `sd-1/embedding` folder. Be careful not to overwrite one file with another.
|
||||
the `embeddings` directory of the InvokeAI runtime directory (usually `invokeai`
|
||||
in your home directory). You may create subdirectories in order to organize the
|
||||
files in any way you wish. Be careful not to overwrite one file with another.
|
||||
For example, TI files generated by the Hugging Face toolkit share the name
|
||||
`learned_embedding.bin`. You can rename these, or use subdirectories to keep them distinct.
|
||||
`learned_embedding.bin`. You can use subdirectories to keep them distinct.
|
||||
|
||||
At startup time, InvokeAI will scan the various `embedding` directories and load any TI
|
||||
files it finds there for compatible models. At startup you will see a message similar to this one:
|
||||
At startup time, InvokeAI will scan the `embeddings` directory and load any TI
|
||||
files it finds there. At startup you will see a message similar to this one:
|
||||
|
||||
```bash
|
||||
>> Current embedding manager terms: <HOI4-Leader>, <princess-knight>
|
||||
>> Current embedding manager terms: *, <HOI4-Leader>, <princess-knight>
|
||||
```
|
||||
To use these when generating, simply type the `<` key in your prompt to open the Textual Inversion WebUI and
|
||||
select the embedding you'd like to use. This UI has type-ahead support, so you can easily find supported embeddings.
|
||||
|
||||
## Using LoRAs
|
||||
Note the `*` trigger term. This is a placeholder term that many early TI
|
||||
tutorials taught people to use rather than a more descriptive term.
|
||||
Unfortunately, if you have multiple TI files that all use this term, only the
|
||||
first one loaded will be triggered by use of the term.
|
||||
|
||||
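The "first one loaded wins" rule for a shared trigger such as `*` can be illustrated with a small registry. This is a sketch under stated assumptions: the file names are invented, and `register_triggers` is a hypothetical function that only models the collision rule described above.

```python
def register_triggers(files):
    """Map each trigger term to the first file that defines it.

    `files` is an iterable of (filename, [trigger terms]) in load order.
    Returns the registry and a list of (trigger, filename) pairs that were
    shadowed by an earlier file. Hypothetical helper for illustration.
    """
    registry = {}
    shadowed = []
    for filename, triggers in files:
        for trigger in triggers:
            if trigger in registry:
                shadowed.append((trigger, filename))  # later definition is ignored
            else:
                registry[trigger] = filename
    return registry, shadowed


# Invented file names; two of them reuse the placeholder trigger "*".
load_order = [
    ("style-a/learned_embedding.bin", ["*"]),
    ("style-b/learned_embedding.bin", ["*"]),
    ("princess-knight.pt", ["<princess-knight>"]),
]

registry, shadowed = register_triggers(load_order)
print(registry["*"])
print(shadowed)
```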
LoRA files are models that customize the output of Stable Diffusion
|
||||
image generation. Larger than embeddings, but much smaller than full
|
||||
models, they augment SD with improved understanding of subjects and
|
||||
artistic styles.
|
||||
To avoid this problem, you can use the `merge_embeddings.py` script to merge two
|
||||
or more TI files together. If it encounters a collision of terms, the script
|
||||
will prompt you to select new terms that do not collide. See
|
||||
[Textual Inversion](TEXTUAL_INVERSION.md) for details.
|
||||
|
||||
Unlike TI files, LoRAs do not introduce novel vocabulary into the
|
||||
model's known tokens. Instead, LoRAs augment the model's weights that
|
||||
are applied to generate imagery. LoRAs may be supplied with a
|
||||
"trigger" word that they have been explicitly trained on, or may
|
||||
simply apply their effect without being triggered.
|
||||
|
||||
LoRAs are typically stored in .safetensors files, which are the most
|
||||
secure way to store and transmit these types of weights. You may
|
||||
install any number of `.safetensors` LoRA files simply by copying them
|
||||
into the `autoimport/lora` directory of the corresponding InvokeAI models
|
||||
directory (usually `invokeai` in your home directory).
|
||||
|
||||
To use these when generating, open the LoRA menu item in the options
|
||||
panel, select the LoRAs you want to apply and ensure that they have
|
||||
the appropriate weight recommended by the model provider. Typically,
|
||||
most LoRAs perform best at a weight of .75-1.
|
||||
## Further Reading
|
||||
|
||||
Please see [the repository](https://github.com/rinongal/textual_inversion) and
|
||||
associated paper for details and limitations.
|
||||
|
@ -1,282 +0,0 @@
|
||||
---
|
||||
title: Configuration
|
||||
---
|
||||
|
||||
# :material-tune-variant: InvokeAI Configuration
|
||||
|
||||
## Intro
|
||||
|
||||
InvokeAI has numerous runtime settings which can be used to adjust
|
||||
many aspects of its operations, including the location of files and
|
||||
directories, memory usage, and performance. These settings can be
|
||||
viewed and customized in several ways:
|
||||
|
||||
1. By editing settings in the `invokeai.yaml` file.
|
||||
2. By setting environment variables.
|
||||
3. On the command-line, when InvokeAI is launched.
|
||||
|
||||
In addition, the most commonly changed settings are accessible
|
||||
graphically via the `invokeai-configure` script.
|
||||
|
||||
### How the Configuration System Works
|
||||
|
||||
When InvokeAI is launched, the very first thing it needs to do is to
|
||||
find its "root" directory, which contains its configuration files,
|
||||
installed models, its database of images, and the folder(s) of
|
||||
generated images themselves. In this document, the root directory will
|
||||
be referred to as ROOT.
|
||||
|
||||
#### Finding the Root Directory
|
||||
|
||||
To find its root directory, InvokeAI uses the following recipe:
|
||||
|
||||
1. It first looks for the argument `--root <path>` on the command line
|
||||
it was launched from, and uses the indicated path if present.
|
||||
|
||||
2. Next it looks for the environment variable INVOKEAI_ROOT, and uses
|
||||
the directory path found there if present.
|
||||
|
||||
3. If neither of these is present, then InvokeAI looks for the
|
||||
folder containing the `.venv` Python virtual environment directory for
|
||||
the currently active environment. This directory is checked for files
|
||||
expected inside the InvokeAI root before it is used.
|
||||
|
||||
4. Finally, InvokeAI looks for a directory in the current user's home
|
||||
directory named `invokeai`.
|
||||
|
||||
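The four-step recipe above can be sketched as a chain of fallbacks. The code below is illustrative only: `find_root` and its parameters are hypothetical, and using the presence of `invokeai.yaml` as the "expected file" check in step 3 is an assumption, since the text does not say which files are checked.

```python
import os
from pathlib import Path

MARKER = "invokeai.yaml"  # assumed marker file for step 3


def find_root(cli_root=None, env=None, venv_parent=None, home=None):
    """Resolve the root directory using the four documented steps (sketch)."""
    env = os.environ if env is None else env
    # 1. An explicit --root <path> argument wins.
    if cli_root:
        return Path(cli_root)
    # 2. Then the INVOKEAI_ROOT environment variable.
    if env.get("INVOKEAI_ROOT"):
        return Path(env["INVOKEAI_ROOT"])
    # 3. Then the folder containing .venv, if it looks like a root.
    if venv_parent is not None and (Path(venv_parent) / MARKER).exists():
        return Path(venv_parent)
    # 4. Finally, fall back to ~/invokeai.
    base = Path.home() if home is None else Path(home)
    return base / "invokeai"


print(find_root(cli_root="/data/invoke", env={"INVOKEAI_ROOT": "/ignored"}))
print(find_root(env={"INVOKEAI_ROOT": "/srv/invokeai"}))
```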
#### Reading the InvokeAI Configuration File
|
||||
|
||||
Once the root directory has been located, InvokeAI looks for a file
|
||||
named `ROOT/invokeai.yaml`, and if present reads configuration values
|
||||
from it. The top of this file looks like this:
|
||||
|
||||
```
|
||||
InvokeAI:
|
||||
Web Server:
|
||||
host: localhost
|
||||
port: 9090
|
||||
allow_origins: []
|
||||
allow_credentials: true
|
||||
allow_methods:
|
||||
- '*'
|
||||
allow_headers:
|
||||
- '*'
|
||||
Features:
|
||||
esrgan: true
|
||||
internet_available: true
|
||||
log_tokenization: false
|
||||
patchmatch: true
|
||||
restore: true
|
||||
...
|
||||
```
|
||||
|
||||
The lines in this file are used to establish default values for
|
||||
Invoke's settings. In the above fragment, the Web Server's listening
|
||||
port is set to 9090 by the `port` setting.
|
||||
|
||||
You can edit this file with a text editor such as "Notepad" (do not
|
||||
use Word or any other word processor). When editing, be careful to
|
||||
maintain the indentation, and do not add extraneous text, as syntax
|
||||
errors will prevent InvokeAI from launching. A basic guide to the
|
||||
format of YAML files can be found
|
||||
[here](https://circleci.com/blog/what-is-yaml-a-beginner-s-guide/).
|
||||
|
||||
You can fix a broken `invokeai.yaml` by deleting it and running the
|
||||
configuration script again -- option [7] in the launcher, "Re-run the
|
||||
configure script".
|
||||
|
||||
#### Reading Environment Variables
|
||||
|
||||
Next InvokeAI looks for defined environment variables in the format
|
||||
`INVOKEAI_<setting_name>`, for example `INVOKEAI_port`. Environment
|
||||
variable values take precedence over configuration file variables. On
|
||||
a Macintosh system, for example, you could change the port that the
|
||||
web server listens on by setting the environment variable this way:
|
||||
|
||||
```
|
||||
export INVOKEAI_port=8000
|
||||
invokeai-web
|
||||
```
|
||||
|
||||
Please check out these
|
||||
[Macintosh](https://phoenixnap.com/kb/set-environment-variable-mac)
|
||||
and
|
||||
[Windows](https://phoenixnap.com/kb/windows-set-environment-variable)
|
||||
guides for setting temporary and permanent environment variables.
|
||||
|
||||
#### Reading the Command Line
|
||||
|
||||
Lastly, InvokeAI takes settings from the command line, which override
|
||||
everything else. The command-line settings have the same name as the
|
||||
corresponding configuration file settings, preceded by a `--`, for
|
||||
example `--port 8000`.
|
||||
|
||||
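Taken together, the three sources form a precedence chain: command line over environment variables over `invokeai.yaml`. The sketch below shows one way to express that layering with `collections.ChainMap`. The helper names are hypothetical, the inputs are plain dictionaries rather than parsed files or real argument lists, and environment values are left as strings because this sketch does no type conversion.

```python
from collections import ChainMap

ENV_PREFIX = "INVOKEAI_"


def settings_from_env(env):
    """Pick out INVOKEAI_<setting_name> variables and strip the prefix."""
    return {
        key[len(ENV_PREFIX):]: value
        for key, value in env.items()
        if key.startswith(ENV_PREFIX)
    }


def resolve_settings(file_settings, env, cli_settings):
    """Layer the sources so that earlier maps override later ones."""
    return ChainMap(cli_settings, settings_from_env(env), file_settings)


file_settings = {"host": "localhost", "port": 9090}  # as read from invokeai.yaml
env = {"INVOKEAI_port": "8000", "PATH": "/usr/bin"}  # unrelated variables are ignored

print(resolve_settings(file_settings, env, {})["port"])              # environment beats the file
print(resolve_settings(file_settings, env, {"port": 8001})["port"])  # command line beats both
print(resolve_settings(file_settings, env, {})["host"])              # falls through to the file
```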
If you are using the launcher (`invoke.sh` or `invoke.bat`) to launch
|
||||
InvokeAI, then just pass the command-line arguments to the launcher:
|
||||
|
||||
```
|
||||
invoke.bat --port 8000 --host 0.0.0.0
|
||||
```
|
||||
|
||||
The arguments will be applied when you select the web server option
|
||||
(and the other options as well).
|
||||
|
||||
If, on the other hand, you prefer to launch InvokeAI directly from the
|
||||
command line, you would first activate the virtual environment (known
|
||||
as the "developer's console" in the launcher), and run `invokeai-web`:
|
||||
|
||||
```
|
||||
> C:\Users\Fred\invokeai\.venv\scripts\activate
|
||||
(.venv) > invokeai-web --port 8000 --host 0.0.0.0
|
||||
```
|
||||
|
||||
You can get a listing and brief instructions for each of the
|
||||
command-line options by giving the `--help` argument:
|
||||
|
||||
```
|
||||
(.venv) > invokeai-web --help
|
||||
usage: InvokeAI [-h] [--host HOST] [--port PORT] [--allow_origins [ALLOW_ORIGINS ...]] [--allow_credentials | --no-allow_credentials] [--allow_methods [ALLOW_METHODS ...]]
|
||||
[--allow_headers [ALLOW_HEADERS ...]] [--esrgan | --no-esrgan] [--internet_available | --no-internet_available] [--log_tokenization | --no-log_tokenization]
|
||||
[--patchmatch | --no-patchmatch] [--restore | --no-restore]
|
||||
[--always_use_cpu | --no-always_use_cpu] [--free_gpu_mem | --no-free_gpu_mem] [--max_loaded_models MAX_LOADED_MODELS] [--max_cache_size MAX_CACHE_SIZE]
|
||||
[--max_vram_cache_size MAX_VRAM_CACHE_SIZE] [--gpu_mem_reserved GPU_MEM_RESERVED] [--precision {auto,float16,float32,autocast}]
|
||||
[--sequential_guidance | --no-sequential_guidance] [--xformers_enabled | --no-xformers_enabled] [--tiled_decode | --no-tiled_decode] [--root ROOT]
|
||||
[--autoimport_dir AUTOIMPORT_DIR] [--lora_dir LORA_DIR] [--embedding_dir EMBEDDING_DIR] [--controlnet_dir CONTROLNET_DIR] [--conf_path CONF_PATH]
|
||||
[--models_dir MODELS_DIR] [--legacy_conf_dir LEGACY_CONF_DIR] [--db_dir DB_DIR] [--outdir OUTDIR] [--from_file FROM_FILE]
|
||||
[--use_memory_db | --no-use_memory_db] [--model MODEL] [--log_handlers [LOG_HANDLERS ...]] [--log_format {plain,color,syslog,legacy}]
|
||||
[--log_level {debug,info,warning,error,critical}] [--version | --no-version]
|
||||
```
|
||||
|
||||
## The Configuration Settings
|
||||
|
||||
The configuration settings are divided into several distinct
|
||||
groups in `invokeai.yaml`:
|
||||
|
||||
### Web Server
|
||||
|
||||
| Setting | Default Value | Description |
|
||||
|----------|----------------|--------------|
|
||||
| `host` | `localhost` | Name or IP address of the network interface that the web server will listen on |
|
||||
| `port` | `9090` | Network port number that the web server will listen on |
|
||||
| `allow_origins` | `[]` | A list of host names or IP addresses that are allowed to connect to the InvokeAI API in the format `['host1','host2',...]` |
|
||||
| `allow_credentials` | `true` | Require credentials for a foreign host to access the InvokeAI API (don't change this) |
|
||||
| `allow_methods` | `*` | List of HTTP methods ("GET", "POST") that the web server is allowed to use when accessing the API |
|
||||
| `allow_headers` | `*` | List of HTTP headers that the web server will accept when accessing the API |
|
||||
|
||||
The documentation for InvokeAI's API can be accessed by browsing to the following URL: <http://localhost:9090/docs>.
|
||||
|
||||
### Features
|
||||
|
||||
These configuration settings allow you to enable and disable various InvokeAI features:
|
||||
|
||||
| Setting | Default Value | Description |
|
||||
|----------|----------------|--------------|
|
||||
| `esrgan` | `true` | Activate the ESRGAN upscaling options|
|
||||
| `internet_available` | `true` | When a resource is not available locally, try to fetch it via the internet |
|
||||
| `log_tokenization` | `false` | Before each text2image generation, print a color-coded representation of the prompt to the console; this can help understand why a prompt is not working as expected |
|
||||
| `patchmatch` | `true` | Activate the "patchmatch" algorithm for improved inpainting |
|
||||
| `restore` | `true` | Activate the facial restoration features (DEPRECATED; restoration features will be removed in 3.0.0) |
|
||||
|
||||
### Memory/Performance
|
||||
|
||||
These options tune InvokeAI's memory and performance characteristics.
|
||||
|
||||
| Setting | Default Value | Description |
|
||||
|----------|----------------|--------------|
|
||||
| `always_use_cpu` | `false` | Use the CPU to generate images, even if a GPU is available |
|
||||
| `free_gpu_mem` | `false` | Aggressively free up GPU memory after each operation; this will allow you to run in low-VRAM environments with some performance penalties |
|
||||
| `max_cache_size` | `6` | Amount of CPU RAM (in GB) to reserve for caching models in memory; more cache allows you to keep models in memory and switch among them quickly |
|
||||
| `max_vram_cache_size` | `2.75` | Amount of GPU VRAM (in GB) to reserve for caching models in VRAM; more cache speeds up generation but reduces the size of the images that can be generated. This can be set to zero to maximize the amount of memory available for generation. |
|
||||
| `precision` | `auto` | Floating point precision. One of `auto`, `float16` or `float32`. `float16` will consume half the memory of `float32` but produce slightly lower-quality images. The `auto` setting will guess the proper precision based on your video card and operating system |
|
||||
| `sequential_guidance` | `false` | Calculate guidance in serial rather than in parallel, lowering memory requirements at the cost of some performance loss |
|
||||
| `xformers_enabled` | `true` | If the x-formers memory-efficient attention module is installed, activate it for better memory usage and generation speed|
|
||||
| `tiled_decode` | `false` | If true, then during the VAE decoding phase the image will be decoded a section at a time, reducing memory consumption at the cost of a performance hit |
|
||||
|
||||
### Paths
|
||||
|
||||
These options set the paths of various directories and files used by
|
||||
InvokeAI. Relative paths are interpreted relative to INVOKEAI_ROOT, so
|
||||
if INVOKEAI_ROOT is `/home/fred/invokeai` and the path is
|
||||
`autoimport/main`, then the corresponding directory will be located at
|
||||
`/home/fred/invokeai/autoimport/main`.
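
As a rough illustration of the rule above, the following sketch resolves a configured path against the root directory. The helper name `resolve_against_root` is hypothetical and is not InvokeAI's own code. It also assumes that absolute paths are left unchanged, which the text above does not state.

```python
from pathlib import PurePosixPath


def resolve_against_root(root: str, configured: str) -> PurePosixPath:
    """Hypothetical helper: interpret `configured` relative to `root`."""
    path = PurePosixPath(configured)
    # Assumption: an absolute path is used as-is; only relative paths are re-rooted.
    return path if path.is_absolute() else PurePosixPath(root) / path


resolved = resolve_against_root("/home/fred/invokeai", "autoimport/main")
print(resolved)
```

`PurePosixPath` is used so the sketch behaves the same on every platform; real code would more likely use `pathlib.Path`.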
|
||||
|
||||
| Setting | Default Value | Description |
|
||||
|----------|----------------|--------------|
|
||||
| `autoimport_dir` | `autoimport/main` | At startup time, read and import any main model files found in this directory |
|
||||
| `lora_dir` | `autoimport/lora` | At startup time, read and import any LoRA/LyCORIS models found in this directory |
|
||||
| `embedding_dir` | `autoimport/embedding` | At startup time, read and import any textual inversion (embedding) models found in this directory |
|
||||
| `controlnet_dir` | `autoimport/controlnet` | At startup time, read and import any ControlNet models found in this directory |
|
||||
| `conf_path` | `configs/models.yaml` | Location of the `models.yaml` model configuration file |
|
||||
| `models_dir` | `models` | Location of the directory containing models installed by InvokeAI's model manager |
|
||||
| `legacy_conf_dir` | `configs/stable-diffusion` | Location of the directory containing the .yaml configuration files for legacy checkpoint models |
|
||||
| `db_dir` | `databases` | Location of the directory containing InvokeAI's image, schema and session database |
|
||||
| `outdir` | `outputs` | Location of the directory in which the gallery of generated and uploaded images will be stored |
|
||||
| `use_memory_db` | `false` | Keep database information in memory rather than on disk; this will not preserve image gallery information across restarts |
|
||||
|
||||
Note that the autoimport directories will be searched recursively,
|
||||
allowing you to organize the models into folders and subfolders in any
|
||||
way you wish. In addition, while we have split up autoimport
|
||||
directories by the type of model they contain, this isn't
|
||||
necessary. You can combine different model types in the same folder
|
||||
and InvokeAI will figure out what they are. So you can easily use just
|
||||
one autoimport directory by commenting out the unneeded paths:
|
||||
|
||||
```
|
||||
Paths:
|
||||
autoimport_dir: autoimport
|
||||
# lora_dir: null
|
||||
# embedding_dir: null
|
||||
# controlnet_dir: null
|
||||
```
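
To make "searched recursively" concrete, here is a minimal sketch of a recursive scan of one autoimport directory. It is not InvokeAI's scanner: the function name and the set of file suffixes are assumptions chosen for illustration, and the real importer also recognizes diffusers-style model folders.

```python
from pathlib import Path

# Assumption: these suffixes stand in for "model files"; the real list may differ.
MODEL_SUFFIXES = {".safetensors", ".ckpt", ".pt"}


def find_model_files(autoimport_dir: str) -> list[Path]:
    """Hypothetical helper: collect candidate model files at any folder depth."""
    root = Path(autoimport_dir)
    return sorted(
        path
        for path in root.rglob("*")  # rglob walks every subfolder
        if path.is_file() and path.suffix in MODEL_SUFFIXES
    )
```

Because the walk ignores folder names, files of different model types can sit side by side, which is the behavior the paragraph above describes.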
|
||||
|
||||
### Logging
|
||||
|
||||
These settings control the information, warning, and debugging
|
||||
messages printed to the console log while InvokeAI is running:
|
||||
|
||||
| Setting | Default Value | Description |
|
||||
|----------|----------------|--------------|
|
||||
| `log_handlers` | `console` | This controls where log messages are sent, and can be a list of one or more destinations. Values include `console`, `file`, `syslog` and `http`. These are described in more detail below |
|
||||
| `log_format` | `color` | This controls the formatting of the log messages. Values are `plain`, `color`, `legacy` and `syslog` |
|
||||
| `log_level` | `debug` | This filters messages according to the level of severity and can be one of `debug`, `info`, `warning`, `error` and `critical`. For example, setting to `warning` will display all messages at the warning level or higher, but won't display "debug" or "info" messages |
|
||||
|
||||
Several different log handler destinations are available, and multiple destinations are supported by providing a list:
|
||||
|
||||
```
|
||||
log_handlers:
|
||||
- console
|
||||
- syslog=localhost
|
||||
- file=/var/log/invokeai.log
|
||||
```
|
||||
|
||||
* `console` is the default. It prints log messages to the command-line window from which InvokeAI was launched.
|
||||
|
||||
* `syslog` is only available on Linux and Macintosh systems. It uses
|
||||
the operating system's "syslog" facility to write log file entries
|
||||
locally or to a remote logging machine. `syslog` offers a variety
|
||||
of configuration options:
|
||||
|
||||
```
|
||||
syslog=/dev/log - log to the /dev/log device
|
||||
syslog=localhost - log to the network logger running on the local machine
|
||||
syslog=localhost:512 - same as above, but using a non-standard port
|
||||
syslog=fredserver,facility=LOG_USER,socktype=SOCK_DGRAM
|
||||
- Log to LAN-connected server "fredserver" using the facility LOG_USER and datagram packets.
|
||||
```
|
||||
|
||||
* `http` can be used to log to a remote web server. The server must be
|
||||
properly configured to receive and act on log messages. The option
|
||||
accepts the URL to the web server, and a `method` argument
|
||||
indicating whether the message should be submitted using the GET or
|
||||
POST method.
|
||||
|
||||
```
|
||||
http=http://my.server/path/to/logger,method=POST
|
||||
```
|
||||
|
||||
The `log_format` option provides several alternative formats:
|
||||
|
||||
* `color` - default format providing time, date and a message, using text colors to distinguish different log severities
|
||||
* `plain` - same as above, but monochrome text only
|
||||
* `syslog` - the log level and error message only, allowing the syslog system to attach the time and date
|
||||
* `legacy` - a format similar to the one used by the legacy 2.3 InvokeAI releases.
|
@ -1,136 +0,0 @@
|
||||
---
|
||||
title: ControlNet
|
||||
---
|
||||
|
||||
# :material-loupe: ControlNet
|
||||
|
||||
## ControlNet
|
||||
|
||||
|
||||
|
||||
ControlNet is a powerful set of features developed by the open-source
|
||||
community (notably, Stanford researcher
|
||||
[**@lllyasviel**](https://github.com/lllyasviel)) that allows you to
|
||||
apply a secondary neural network model to your image generation
|
||||
process in Invoke.
|
||||
|
||||
With ControlNet, you can get more control over the output of your
|
||||
image generation, providing you with a way to direct the network
|
||||
towards generating images that better fit your desired style or
|
||||
outcome.
|
||||
|
||||
|
||||
### How it works
|
||||
|
||||
ControlNet works by analyzing an input image, pre-processing that
|
||||
image to identify relevant information that can be interpreted by each
|
||||
specific ControlNet model, and then inserting that control information
|
||||
into the generation process. This can be used to adjust the style,
|
||||
composition, or other aspects of the image to better achieve a
|
||||
specific result.
|
||||
|
||||
|
||||
### Models
|
||||
|
||||
InvokeAI provides access to a series of ControlNet models that provide
|
||||
different effects or styles in your generated images. Currently
|
||||
InvokeAI only supports "diffuser" style ControlNet models. These are
|
||||
folders that contain the files `config.json` and/or
|
||||
`diffusion_pytorch_model.safetensors` and
|
||||
`diffusion_pytorch_model.fp16.safetensors`. The name of the folder is
|
||||
the name of the model.
|
||||
|
||||
***InvokeAI does not currently support checkpoint-format
|
||||
ControlNets. These come in the form of a single file with the
|
||||
extension `.safetensors`.***
|
||||
|
||||
Diffuser-style ControlNet models are available at HuggingFace
|
||||
(http://huggingface.co) and accessed via their repo IDs (identifiers
|
||||
in the format "author/modelname"). The easiest way to install them is
|
||||
to use the InvokeAI model installer application. Use the
|
||||
`invoke.sh`/`invoke.bat` launcher to select item [5] and then navigate
|
||||
to the CONTROLNETS section. Select the models you wish to install and
|
||||
press "APPLY CHANGES". You may also enter additional HuggingFace
|
||||
repo_ids in the "Additional models" textbox:
|
||||
|
||||
{:width="640px"}
|
||||
|
||||
Command-line users can launch the model installer using the command
|
||||
`invokeai-model-install`.
|
||||
|
||||
_Be aware that some ControlNet models require additional code
|
||||
functionality in order to work properly, so just installing a
|
||||
third-party ControlNet model may not have the desired effect._ Please
|
||||
read and follow the documentation for installing a third party model
|
||||
not currently included among InvokeAI's default list.
|
||||
|
||||
The models currently supported include:
|
||||
|
||||
**Canny**:
|
||||
|
||||
When the Canny model is used in ControlNet, Invoke will attempt to generate images that match the edges detected.
|
||||
|
||||
Canny edge detection works by detecting the edges in an image by looking for abrupt changes in intensity. It is known for its ability to detect edges accurately while reducing noise and false edges, and the preprocessor can identify more information by decreasing the thresholds.
|
||||
|
||||
**M-LSD**:
|
||||
|
||||
M-LSD is another edge detection algorithm used in ControlNet. It stands for Multi-Scale Line Segment Detector.
|
||||
|
||||
It detects straight line segments in an image by analyzing the local structure of the image at multiple scales. It can be useful for architectural imagery, or anything where straight-line structural information is needed for the resulting output.
|
||||
|
||||
**Lineart**:
|
||||
|
||||
The Lineart model in ControlNet generates line drawings from an input image. The resulting pre-processed image is a simplified version of the original, with only the outlines of objects visible. The Lineart model is known for its ability to accurately capture the contours of the objects in an input sketch.
|
||||
|
||||
**Lineart Anime**:
|
||||
|
||||
A variant of the Lineart model that generates line drawings with a distinct style inspired by anime and manga art styles.
|
||||
|
||||
**Depth**:
|
||||
A model that generates depth maps of images, allowing you to create more realistic 3D models or to simulate depth effects in post-processing.
|
||||
|
||||
**Normal Map (BAE):**
|
||||
A model that generates normal maps from input images, allowing for more realistic lighting effects in 3D rendering.
|
||||
|
||||
**Image Segmentation**:
|
||||
A model that divides input images into segments or regions, each of which corresponds to a different object or part of the image. (More details coming soon)
|
||||
|
||||
|
||||
**Openpose**:
|
||||
The OpenPose control model allows for the identification of the general pose of a character by pre-processing an existing image with a clear human structure. With advanced options, Openpose can also detect the face or hands in the image.
|
||||
|
||||
**Mediapipe Face**:
|
||||
|
||||
The MediaPipe Face identification processor is able to clearly identify facial features in order to capture vivid expressions of human faces.
|
||||
|
||||
**Tile (experimental)**:
|
||||
|
||||
The Tile model fills out details in the image to match the image, rather than the prompt. The Tile Model is a versatile tool that offers a range of functionalities. Its primary capabilities can be boiled down to two main behaviors:
|
||||
|
||||
- It can reinterpret specific details within an image and create fresh, new elements.
|
||||
- It has the ability to disregard global instructions if there's a discrepancy between them and the local context or specific parts of the image. In such cases, it uses the local context to guide the process.
|
||||
|
||||
The Tile Model can be a powerful tool in your arsenal for enhancing image quality and details. If there are undesirable elements in your images, such as blurriness caused by resizing, this model can effectively eliminate these issues, resulting in cleaner, crisper images. Moreover, it can generate and add refined details to your images, improving their overall quality and appeal.
|
||||
|
||||
**Pix2Pix (experimental)**
|
||||
|
||||
With Pix2Pix, you can input an image into the ControlNet, and then "instruct" the model to change it using your prompt. For example, you can say "Make it winter" to add more wintry elements to a scene.
|
||||
|
||||
**Inpaint**: Coming Soon - Currently this model is available but not functional on the Canvas. An upcoming release will provide additional capabilities for using this model when inpainting.
|
||||
|
||||
Each of these models can be adjusted and combined with other ControlNet models to achieve different results, giving you even more control over your image generation process.
|
||||
|
||||
|
||||
## Using ControlNet
|
||||
|
||||
To use ControlNet, you can simply select the desired model and adjust both the ControlNet and Pre-processor settings to achieve the desired result. You can also use multiple ControlNet models at the same time, allowing you to achieve even more complex effects or styles in your generated images.
|
||||
|
||||
|
||||
Each ControlNet has two settings that are applied to the ControlNet.
|
||||
|
||||
Weight - Strength of the ControlNet model applied to the generation during the section defined by Start/End.
|
||||
|
||||
Start/End - 0 represents the start of the generation, 1 represents the end. The Start/end setting controls what steps during the generation process have the ControlNet applied.
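
The Start/End setting can be pictured as selecting a window of sampler steps. The sketch below is an illustration only: `active_steps` is a hypothetical helper, and the exact rounding that InvokeAI applies at the window edges is an assumption, since this document does not specify it.

```python
import math


def active_steps(total_steps: int, start: float, end: float) -> list[int]:
    """Hypothetical helper: step indices during which a ControlNet is applied."""
    if not 0.0 <= start <= end <= 1.0:
        raise ValueError("expected 0 <= start <= end <= 1")
    # Assumption: round the window outward to whole steps.
    first = math.floor(start * total_steps)
    last = math.ceil(end * total_steps)
    return list(range(first, last))


print(active_steps(8, 0.25, 0.75))
```

With Start `0` and End `1` every step is included, which corresponds to applying the ControlNet for the whole generation.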
|
||||
|
||||
Additionally, each ControlNet section can be expanded in order to manipulate settings for the image pre-processor that adjusts your uploaded image before it is used when you Invoke.
|
@ -4,13 +4,86 @@ title: Image-to-Image
|
||||
|
||||
# :material-image-multiple: Image-to-Image
|
||||
|
||||
InvokeAI provides an "img2img" feature that lets you seed your
|
||||
creations with an initial drawing or photo. This is a really cool
|
||||
feature that tells stable diffusion to build the prompt on top of the
|
||||
image you provide, preserving the original's basic shape and layout.
|
||||
Both the Web and command-line interfaces provide an "img2img" feature
|
||||
that lets you seed your creations with an initial drawing or
|
||||
photo. This is a really cool feature that tells stable diffusion to
|
||||
build the prompt on top of the image you provide, preserving the
|
||||
original's basic shape and layout.
|
||||
|
||||
For a walkthrough of using Image-to-Image in the Web UI, see [InvokeAI
|
||||
Web Server](./WEB.md#image-to-image).
|
||||
See the [WebUI Guide](WEB.md) for a walkthrough of the img2img feature
|
||||
in the InvokeAI web server. This document describes how to use img2img
|
||||
in the command-line tool.
|
||||
|
||||
## Basic Usage
|
||||
|
||||
Launch the command-line client by launching `invoke.sh`/`invoke.bat`
|
||||
and choosing option (1). Alternatively, activate the InvokeAI
|
||||
environment and issue the command `invokeai`.
|
||||
|
||||
Once the `invoke> ` prompt appears, you can start an img2img render by
|
||||
pointing to a seed file with the `-I` option as shown here:
|
||||
|
||||
!!! example ""
|
||||
|
||||
```commandline
|
||||
tree on a hill with a river, nature photograph, national geographic -I./test-pictures/tree-and-river-sketch.png -f 0.85
|
||||
```
|
||||
|
||||
<figure markdown>
|
||||
|
||||
| original image | generated image |
|
||||
| :------------: | :-------------: |
|
||||
| { width=320 } | { width=320 } |
|
||||
|
||||
</figure>
|
||||
|
||||
The `--init_img` (`-I`) option gives the path to the seed picture. `--strength`
|
||||
(`-f`) controls how much the original will be modified, ranging from `0.0` (keep
|
||||
the original intact), to `1.0` (ignore the original completely). The default is
|
||||
`0.75`, and values in the range `0.25-0.90` give interesting results. Other relevant
|
||||
options include `-C` (classifier-free guidance scale), and `-s` (steps).
|
||||
Unlike `txt2img`, adding steps will continuously change the resulting image and
|
||||
it will not converge.
|
||||
|
||||
You may also pass a `-v<variation_amount>` option to generate `-n<iterations>`
|
||||
count variants on the original image. This is done by passing the first
|
||||
generated image back into img2img the requested number of times. It generates
|
||||
interesting variants.
|
||||
|
||||
Note that the prompt makes a big difference. For example, this slight variation
|
||||
on the prompt produces a very different image:
|
||||
|
||||
<figure markdown>
|
||||
{ width=320 }
|
||||
<caption markdown>photograph of a tree on a hill with a river</caption>
|
||||
</figure>
|
||||
|
||||
!!! tip
|
||||
|
||||
When designing prompts, think about how the images scraped from the internet were
|
||||
captioned. Very few photographs will be labeled "photograph" or "photorealistic."
|
||||
They will, however, be captioned with the publication, photographer, camera model,
|
||||
or film settings.
|
||||
|
||||
If the initial image contains transparent regions, then Stable Diffusion will
|
||||
only draw within the transparent regions, a process called
|
||||
[`inpainting`](./INPAINTING.md#creating-transparent-regions-for-inpainting).
|
||||
However, for this to work correctly, the color information underneath the
|
||||
transparent regions needs to be preserved, not erased.
|
||||
|
||||
!!! warning "**IMPORTANT ISSUE** "
|
||||
|
||||
`img2img` does not work properly on initial images smaller
|
||||
than 512x512. Please scale your image to at least 512x512 before using it.
|
||||
Larger images are not a problem, but may run out of VRAM on your GPU card. To
|
||||
fix this, use the --fit option, which downscales the initial image to fit within
|
||||
the box specified by width x height:
|
||||
|
||||
```
|
||||
tree on a hill with a river, national geographic -I./test-pictures/big-sketch.png -H512 -W512 --fit
|
||||
```
|
||||
|
||||
## How does it actually work, though?
|
||||
|
||||
The main difference between `img2img` and `prompt2img` is the starting point.
|
||||
While `prompt2img` always starts with pure gaussian noise and progressively
|
||||
@ -26,6 +99,10 @@ seed `1592514025` develops something like this:
|
||||
|
||||
!!! example ""
|
||||
|
||||
```bash
|
||||
invoke> "fire" -s10 -W384 -H384 -S1592514025
|
||||
```
|
||||
|
||||
<figure markdown>
|
||||
{ width=720 }
|
||||
</figure>
|
||||
@ -80,8 +157,17 @@ Diffusion has less chance to refine itself, so the result ends up inheriting all
|
||||
the problems of my bad drawing.
|
||||
|
||||
If you want to try this out yourself, all of these are using a seed of
|
||||
`1592514025` with a width/height of `384`, step count `10`, the
|
||||
`k_lms` sampler, and the single-word prompt `"fire"`.
|
||||
`1592514025` with a width/height of `384`, step count `10`, the default sampler
|
||||
(`k_lms`), and the single-word prompt `"fire"`:
|
||||
|
||||
```bash
|
||||
invoke> "fire" -s10 -W384 -H384 -S1592514025 -I /tmp/fire-drawing.png --strength 0.7
|
||||
```
|
||||
|
||||
The code for rendering intermediates is on my (damian0815's) branch
|
||||
[document-img2img](https://github.com/damian0815/InvokeAI/tree/document-img2img) -
|
||||
run `invoke.py` and check your `outputs/img-samples/intermediates` folder while
|
||||
generating an image.
|
||||
|
||||
### Compensating for the reduced step count
|
||||
|
||||
@ -94,6 +180,10 @@ give each generation 20 steps.
|
||||
Here's strength `0.4` (note step count `50`, which is `20 ÷ 0.4` to make sure SD
|
||||
does `20` steps from my image):
|
||||
|
||||
```bash
|
||||
invoke> "fire" -s50 -W384 -H384 -S1592514025 -I /tmp/fire-drawing.png -f 0.4
|
||||
```
|
||||
|
||||
<figure markdown>
|
||||

|
||||
</figure>
|
||||
@ -101,6 +191,10 @@ does `20` steps from my image):
|
||||
and here is strength `0.7` (note step count `30`, which is roughly `20 ÷ 0.7` to
|
||||
make sure SD does `20` steps from my image):
|
||||
|
||||
```commandline
|
||||
invoke> "fire" -s30 -W384 -H384 -S1592514025 -I /tmp/fire-drawing.png -f 0.7
|
||||
```
|
||||
|
||||
<figure markdown>
|
||||

|
||||
</figure>
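
The step counts used in the two examples above follow the rule of thumb `steps ≈ target ÷ strength`. A small sketch of that arithmetic is shown here; the function name is made up for illustration and is not part of the InvokeAI command-line tool.

```python
def requested_steps(target_steps: int, strength: float) -> int:
    """Hypothetical helper: the -s value needed so that roughly `target_steps`
    denoising steps are run from the initial image at the given strength."""
    if not 0.0 < strength <= 1.0:
        raise ValueError("strength must be greater than 0 and at most 1")
    return round(target_steps / strength)


for strength in (0.4, 0.7):
    print(strength, requested_steps(20, strength))
```

For strength `0.4` this gives `50`. For strength `0.7` it gives `29`, which the example above rounds to a convenient `30`.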
|
||||
|
@ -1,171 +0,0 @@
|
||||
---
|
||||
title: Controlling Logging
|
||||
---
|
||||
|
||||
# :material-image-off: Controlling Logging
|
||||
|
||||
## Controlling How InvokeAI Logs Status Messages
|
||||
|
||||
InvokeAI logs status messages using a configurable logging system. You
|
||||
can log to the terminal window, to a designated file on the local
|
||||
machine, to the syslog facility on a Linux or Mac, or to a properly
|
||||
configured web server. You can configure several logs at the same
|
||||
time, and control the level of message logged and the logging format
|
||||
(to a limited extent).
|
||||
|
||||
Three command-line options control logging:
|
||||
|
||||
### `--log_handlers <handler1> <handler2> ...`
|
||||
|
||||
This option activates one or more log handlers. Options are "console",
|
||||
"file", "syslog" and "http". To specify more than one, separate them
|
||||
by spaces:
|
||||
|
||||
```bash
|
||||
invokeai-web --log_handlers console syslog=/dev/log file=C:\Users\fred\invokeai.log
|
||||
```
|
||||
|
||||
The format of these options is described below.
|
||||
|
||||
### `--log_format {plain|color|legacy|syslog}`
|
||||
|
||||
This controls the format of log messages written to the console. Only
|
||||
the "console" log handler is currently affected by this setting.
|
||||
|
||||
* "plain" provides formatted messages like this:
|
||||
|
||||
```bash
|
||||
|
||||
[2023-05-24 23:18:50,352]::[InvokeAI]::DEBUG --> this is a debug message
|
||||
[2023-05-24 23:18:50,352]::[InvokeAI]::INFO --> this is an informational messages
|
||||
[2023-05-24 23:18:50,352]::[InvokeAI]::WARNING --> this is a warning
|
||||
[2023-05-24 23:18:50,352]::[InvokeAI]::ERROR --> this is an error
|
||||
[2023-05-24 23:18:50,352]::[InvokeAI]::CRITICAL --> this is a critical error
|
||||
```
|
||||
|
||||
* "color" produces similar output, but the text will be color coded to
|
||||
indicate the severity of the message.
|
||||
|
||||
* "legacy" produces output similar to InvokeAI versions 2.3 and earlier:
|
||||
|
||||
```bash
|
||||
### this is a critical error
|
||||
*** this is an error
|
||||
** this is a warning
|
||||
>> this is an informational messages
|
||||
| this is a debug message
|
||||
```
|
||||
|
||||
* "syslog" produces messages suitable for syslog entries:
|
||||
|
||||
```bash
|
||||
InvokeAI [2691178] <CRITICAL> this is a critical error
|
||||
InvokeAI [2691178] <ERROR> this is an error
|
||||
InvokeAI [2691178] <WARNING> this is a warning
|
||||
InvokeAI [2691178] <INFO> this is an informational messages
|
||||
InvokeAI [2691178] <DEBUG> this is a debug message
|
||||
```
|
||||
|
||||
(note that the date, time and hostname will be added by the syslog
|
||||
system)
|
||||
|
||||
### `--log_level {debug|info|warning|error|critical}`
|
||||
|
||||
Providing this command-line option will cause only messages at the
|
||||
specified level or above to be emitted.
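
The filtering behaves like level filtering in Python's standard `logging` module. The sketch below uses only the standard library and is not InvokeAI's logger; the logger name `level_demo` and the `ListHandler` class are invented for the example.

```python
import logging


class ListHandler(logging.Handler):
    """Collects emitted messages in a list so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


logger = logging.getLogger("level_demo")  # hypothetical logger name
logger.propagate = False                  # keep the demo off the root logger
handler = ListHandler()
logger.addHandler(handler)
logger.setLevel(logging.WARNING)          # comparable to --log_level warning

logger.debug("this is a debug message")
logger.info("this is an informational message")
logger.warning("this is a warning")
logger.error("this is an error")
logger.critical("this is a critical error")

print(handler.messages)
```

Only the warning, error and critical messages reach the handler; the debug and info messages are dropped before they are emitted.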
|
||||
|
||||
## Console logging
|
||||
|
||||
When "console" is provided to `--log_handlers`, messages will be
|
||||
written to the command line window in which InvokeAI was launched. By
|
||||
default, the color formatter will be used unless overridden by
|
||||
`--log_format`.
|
||||
|
||||
## File logging
|
||||
|
||||
When "file" is provided to `--log_handlers`, entries will be written
|
||||
to the file indicated in the path argument. By default, the "plain"
|
||||
format will be used:
|
||||
|
||||
```bash
|
||||
invokeai-web --log_handlers file=/var/log/invokeai.log
|
||||
```
|
||||
|
||||
## Syslog logging
|
||||
|
||||
When "syslog" is requested, entries will be sent to the syslog
|
||||
system. There are a variety of ways to control where the log message
|
||||
is sent:
|
||||
|
||||
* Send to the local machine using the `/dev/log` socket:
|
||||
|
||||
```
|
||||
invokeai-web --log_handlers syslog=/dev/log
|
||||
```
|
||||
|
||||
* Send to the local machine using a UDP message:
|
||||
|
||||
```
|
||||
invokeai-web --log_handlers syslog=localhost
|
||||
```
|
||||
|
||||
* Send to the local machine using a UDP message on a nonstandard
|
||||
port:
|
||||
|
||||
```
|
||||
invokeai-web --log_handlers syslog=localhost:512
|
||||
```
|
||||
|
||||
* Send to a remote machine named "loghost" on the local LAN using
|
||||
facility LOG_USER and UDP packets:
|
||||
|
||||
```
|
||||
invokeai-web --log_handlers syslog=loghost,facility=LOG_USER,socktype=SOCK_DGRAM
|
||||
```
|
||||
|
||||
This can be abbreviated `syslog=loghost`, as LOG_USER and SOCK_DGRAM
|
||||
are defaults.
|
||||
|
||||
* Send to a remote machine named "loghost" using the facility LOCAL0
|
||||
and using a TCP socket:
|
||||
|
||||
```
|
||||
invokeai-web --log_handlers syslog=loghost,facility=LOG_LOCAL0,socktype=SOCK_STREAM
|
||||
```
|
||||
|
||||
If no arguments are specified (just a bare "syslog"), then the logging
|
||||
system will look for a UNIX socket named `/dev/log`, and if not found
|
||||
try to send a UDP message to `localhost`. The Macintosh OS used to
|
||||
support logging to a socket named `/var/run/syslog`, but this feature
|
||||
has since been disabled.
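
To show how the `syslog=...` strings above break down, here is a sketch that parses one into the arguments accepted by the standard library's `logging.handlers.SysLogHandler`. The parser itself (`parse_syslog_spec`) is hypothetical and is not how InvokeAI is known to do it. It assumes port 514, the standard syslog UDP port, when none is given, and it does not implement the `/dev/log` fallback described for a bare `syslog`.

```python
import socket
from logging.handlers import SysLogHandler


def parse_syslog_spec(spec: str) -> dict:
    """Hypothetical parser for 'syslog=TARGET[,facility=...][,socktype=...]'."""
    head, *options = spec.split(",")
    _, _, target = head.partition("=")
    # LOG_USER and SOCK_DGRAM are the defaults named in the text above.
    settings = {"facility": "LOG_USER", "socktype": "SOCK_DGRAM"}
    for option in options:
        key, _, value = option.partition("=")
        settings[key] = value
    if target.startswith("/"):
        address = target  # a UNIX socket path such as /dev/log
    else:
        host, _, port = target.partition(":")
        # Assumption: fall back to localhost and the standard port 514.
        address = (host or "localhost", int(port) if port else 514)
    return {
        "address": address,
        "facility": getattr(SysLogHandler, settings["facility"]),
        "socktype": getattr(socket, settings["socktype"]),
    }


print(parse_syslog_spec("syslog=loghost,facility=LOG_LOCAL0,socktype=SOCK_STREAM"))
```

The resulting dictionary could be passed to the handler as `SysLogHandler(**parsed)`. The sketch stops short of that step because constructing the handler opens a socket.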
|
||||
|
||||
## Web logging
|
||||
|
||||
If you have access to a web server that is configured to log messages
|
||||
when a particular URL is requested, you can log using the "http"
|
||||
method:
|
||||
|
||||
```
|
||||
invokeai-web --log_handlers http=http://my.server/path/to/logger,method=POST
|
||||
```
|
||||
|
||||
The optional [,method=] part can be used to specify whether the URL
|
||||
accepts GET (default) or POST messages.
|
||||
|
||||
Currently password authentication and SSL are not supported.
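
For readers who want to see what such a handler looks like in code, the sketch below turns the `http=...` string into a standard library `logging.handlers.HTTPHandler`. The helper `http_handler_from_spec` is hypothetical; whether InvokeAI builds its handler this way is an assumption.

```python
from logging.handlers import HTTPHandler
from urllib.parse import urlparse


def http_handler_from_spec(spec: str) -> HTTPHandler:
    """Hypothetical helper: build a handler from 'http=URL[,method=GET|POST]'."""
    target, *options = spec.split(",")
    url = urlparse(target.split("=", 1)[1])
    method = "GET"  # the text above names GET as the default
    for option in options:
        key, _, value = option.partition("=")
        if key == "method":
            method = value.upper()
    # HTTPHandler takes the host and the path separately.
    return HTTPHandler(host=url.netloc, url=url.path, method=method)


handler = http_handler_from_spec("http=http://my.server/path/to/logger,method=POST")
print(handler.host, handler.url, handler.method)
```

Creating the handler does not contact the server; a request is only sent when a log record is emitted.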
|
||||
|
||||
## Using the configuration file
|
||||
|
||||
You can set and forget logging options by adding a "Logging" section
|
||||
to `invokeai.yaml`:
|
||||
|
||||
```
|
||||
InvokeAI:
|
||||
[... other settings...]
|
||||
Logging:
|
||||
log_handlers:
|
||||
- console
|
||||
- syslog=/dev/log
|
||||
log_level: info
|
||||
log_format: color
|
||||
```
|
@ -71,3 +71,6 @@ under the selected name and register it with InvokeAI.
|
||||
use InvokeAI conventions - only alphanumeric letters and the
|
||||
characters ".+-".
|
||||
|
||||
## Caveats
|
||||
|
||||
This is a new script and may contain bugs.
|
||||
|
@ -1,208 +0,0 @@
|
||||
# Nodes Editor (Experimental)
|
||||
|
||||
🚨
|
||||
*The node editor is experimental. We've made it accessible because we use it to develop the application, but we have not addressed the many known rough edges. It's very easy to shoot yourself in the foot, and we cannot offer support for it until it sees full release (ETA v3.1). Everything is subject to change without warning.*
|
||||
🚨
|
||||
|
||||
The nodes editor is a blank canvas allowing for the use of individual functions and image transformations to control the image generation workflow. The node processing flow usually runs from left (inputs) to right (outputs), though the flow becomes less obviously linear as the node graph grows more complex. Node inputs and outputs are connected by dragging connectors from node to node.
|
||||
|
||||
To better understand how nodes are used, think of how an electric power bar works. It takes in one input (electricity from a wall outlet) and passes it to multiple devices through multiple outputs. Similarly, a node could have multiple inputs and outputs functioning at the same (or different) time, but all node outputs pass information onward like a power bar passes electricity. Not all outputs are compatible with all inputs, however: each node has its own constraints on the information it accepts and emits. In general, node outputs are colour-coded to match compatible inputs of other nodes.
|
||||
|
||||
## Anatomy of a Node
|
||||
|
||||
Individual nodes are made up of the following:
|
||||
|
||||
- Inputs: Edge points on the left side of the node window where you connect outputs from other nodes.
|
||||
- Outputs: Edge points on the right side of the node window where you connect to inputs on other nodes.
|
||||
- Options: Various options which are either manually configured, or overridden by connecting an output from another node to the input.
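
The anatomy above can be modelled in a few lines. This is a toy data structure for illustration, not the editor's real node classes: the `Node` dataclass, the `can_connect` function and the type names such as `"latents"` are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy node: named inputs and outputs, each tagged with a data type."""
    name: str
    inputs: dict = field(default_factory=dict)   # input name -> accepted type
    outputs: dict = field(default_factory=dict)  # output name -> produced type


def can_connect(source: Node, output_name: str, target: Node, input_name: str) -> bool:
    """An output may feed an input only when their types match."""
    return source.outputs[output_name] == target.inputs[input_name]


noise = Node("Noise", outputs={"noise": "latents"})
decode = Node("Decode", inputs={"latents": "latents"}, outputs={"image": "image"})
blur = Node("ImageBlur", inputs={"image": "image"}, outputs={"image": "image"})

print(can_connect(noise, "noise", decode, "latents"))
print(can_connect(noise, "noise", blur, "image"))
```

The type check plays the role of the colour-coding in the editor: a connection is only offered when the two ends carry the same kind of data.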
|
||||
|
||||
## Diffusion Overview
|
||||
|
||||
Taking the time to understand the diffusion process will help you to understand how to set up your nodes in the nodes editor.
|
||||
|
||||
There are two main spaces Stable Diffusion works in: image space and latent space.
|
||||
|
||||
Image space represents images in pixel form that you look at. Latent space represents compressed inputs. It’s in latent space that Stable Diffusion processes images. A VAE (Variational Auto Encoder) is responsible for compressing and encoding inputs into latent space, as well as decoding outputs back into image space.
|
||||
|
||||
When you generate an image using text-to-image, multiple steps occur in latent space:
|
||||
1. Random noise is generated at the chosen height and width. The noise’s characteristics are dictated by the chosen (or not chosen) seed. This noise tensor is passed into latent space. We’ll call this noise A.
|
||||
1. Using a model’s U-Net, a noise predictor examines noise A, and the words tokenized by CLIP from your prompt (conditioning). It generates its own noise tensor to predict what the final image might look like in latent space. We’ll call this noise B.
|
||||
1. Noise B is subtracted from noise A in an attempt to create a final latent image indicative of the inputs. This step is repeated for the number of sampler steps chosen.
|
||||
1. The VAE decodes the final latent image from latent space into image space.
|
||||
|
||||
Image-to-image is a similar process, with only step 1 being different:
|
||||
1. The input image is encoded from image space into latent space by the VAE. Noise is then added to the input latent image. Denoising Strength dictates how much noise is added, 0 being none, and 1 being all-encompassing. We’ll call this noise A. The process is then the same as steps 2-4 in the text-to-image explanation above.
|
||||
|
||||
Furthermore, a model provides the CLIP prompt tokenizer, the VAE, and a U-Net (where noise prediction occurs given a prompt and initial noise tensor).
|
||||
|
||||
A noise scheduler (e.g. DPM++ 2M Karras) schedules the subtraction of noise from the latent image across the sampler steps chosen (step 3 above). Less noise is usually subtracted at higher sampler steps.
|
||||
|
||||
## Node Types (Base Nodes)
|
||||
|
||||
| Node <img width=160 align="right"> | Function |
|
||||
| ---------------------------------- | --------------------------------------------------------------------------------------|
|
||||
| Add | Adds two numbers |
|
||||
| CannyImageProcessor | Canny edge detection for ControlNet |
|
||||
| ClipSkip | Skip layers in clip text_encoder model |
|
||||
| Collect | Collects values into a collection |
|
||||
| Prompt (Compel) | Parse prompt using compel package to conditioning |
|
||||
| ContentShuffleImageProcessor | Applies content shuffle processing to image |
|
||||
| ControlNet | Collects ControlNet info to pass to other nodes |
|
||||
| CvInpaint | Simple inpaint using opencv |
|
||||
| Divide | Divides two numbers |
|
||||
| DynamicPrompt | Parses a prompt using adieyal/dynamic prompt's random or combinatorial generator |
|
||||
| FloatLinearRange | Creates a range |
|
||||
| HedImageProcessor | Applies HED edge detection to image |
|
||||
| ImageBlur | Blurs an image |
|
||||
| ImageChannel | Gets a channel from an image |
|
||||
| ImageCollection | Load a collection of images and provide it as output |
|
||||
| ImageConvert | Converts an image to a different mode |
|
||||
| ImageCrop | Crops an image to a specified box. The box can be outside of the image. |
|
||||
| ImageInverseLerp | Inverse linear interpolation of all pixels of an image |
|
||||
| ImageLerp | Linear interpolation of all pixels of an image |
|
||||
| ImageMultiply | Multiplies two images together using `PIL.ImageChops.multiply()` |
|
||||
| ImageNSFWBlurInvocation | Detects and blurs images that may contain sexually explicit content |
|
||||
| ImagePaste | Pastes an image into another image |
|
||||
| ImageProcessor | Base class for invocations that preprocess images for ControlNet |
|
||||
| ImageResize | Resizes an image to specific dimensions |
|
||||
| ImageScale | Scales an image by a factor |
|
||||
| ImageToLatents | Encodes an image into latents |
|
||||
| ImageWatermarkInvocation | Adds an invisible watermark to images |
|
||||
| InfillColor | Infills transparent areas of an image with a solid color |
|
||||
| InfillPatchMatch | Infills transparent areas of an image using the PatchMatch algorithm |
|
||||
| InfillTile | Infills transparent areas of an image with tiles of the image |
|
||||
| Inpaint | Generates an image using inpaint |
|
||||
| Iterate | Iterates over a list of items |
|
||||
| LatentsToImage | Generates an image from latents |
|
||||
| LatentsToLatents | Generates latents using latents as base image |
|
||||
| LeresImageProcessor | Applies leres processing to image |
|
||||
| LineartAnimeImageProcessor | Applies line art anime processing to image |
|
||||
| LineartImageProcessor | Applies line art processing to image |
|
||||
| LoadImage | Load an image and provide it as output |
|
||||
| Lora Loader | Apply selected lora to unet and text_encoder |
|
||||
| Model Loader | Loads a main model, outputting its submodels |
|
||||
| MaskFromAlpha | Extracts the alpha channel of an image as a mask |
|
||||
| MediapipeFaceProcessor | Applies mediapipe face processing to image |
|
||||
| MidasDepthImageProcessor | Applies Midas depth processing to image |
|
||||
| MlsdImageProcessor | Applies MLSD processing to image |
|
||||
| Multiply | Multiplies two numbers |
|
||||
| Noise | Generates latent noise |
|
||||
| NormalbaeImageProcessor | Applies NormalBAE processing to image |
|
||||
| OpenposeImageProcessor | Applies Openpose processing to image |
|
||||
| ParamFloat | A float parameter |
|
||||
| ParamInt | An integer parameter |
|
||||
| PidiImageProcessor | Applies PIDI processing to an image |
|
||||
| Progress Image | Displays the progress image in the Node Editor |
|
||||
| RandomInt | Outputs a single random integer |
|
||||
| RandomRange | Creates a collection of random numbers |
|
||||
| Range | Creates a range of numbers from start to stop with step |
|
||||
| RangeOfSize | Creates a range from start to start + size with step |
|
||||
| ResizeLatents | Resizes latents to explicit width/height (in pixels). Provided dimensions are floor-divided by 8. |
|
||||
| RestoreFace | Restores faces in the image |
|
||||
| ScaleLatents | Scales latents by a given factor |
|
||||
| SegmentAnythingProcessor | Applies segment anything processing to image |
|
||||
| ShowImage | Displays a provided image, and passes it forward in the pipeline |
|
||||
| StepParamEasing | Experimental per-step parameter for easing for denoising steps |
|
||||
| Subtract | Subtracts two numbers |
|
||||
| TextToLatents | Generates latents from conditionings |
|
||||
| TileResampleProcessor | Base class for invocations that preprocess images for ControlNet |
|
||||
| Upscale | Upscales an image |
|
||||
| VAE Loader | Loads a VAE model, outputting a VaeLoaderOutput |
|
||||
| ZoeDepthImageProcessor | Applies Zoe depth processing to image |
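
The ResizeLatents row notes that pixel dimensions are floor-divided by 8 to get latent dimensions. A minimal sketch of that arithmetic (the helper name `latent_size` is made up for illustration):

```python
LATENT_DOWNSCALE = 8  # pixels per latent cell along each axis, per the table


def latent_size(width_px, height_px):
    """Convert pixel dimensions to latent dimensions using floor division."""
    return width_px // LATENT_DOWNSCALE, height_px // LATENT_DOWNSCALE


print(latent_size(512, 768))  # (64, 96)
print(latent_size(515, 770))  # (64, 96): the remainders are dropped
```

Because of the floor division, widths and heights that are not multiples of 8 end up at the same latent size as the next lower multiple.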
|
||||
|
||||
## Node Grouping Concepts
|
||||
|
||||
There are several node grouping concepts that can be examined with a narrow focus. These (and other) groupings can be pieced together to make up functional graph setups, and are important to understanding how groups of nodes work together as part of a whole. Note that the screenshots below aren't examples of complete functioning node graphs (see Examples).
|
||||
|
||||
### Noise
|
||||
|
||||
As described, an initial noise tensor is necessary for the latent diffusion process. As a result, all non-image *ToLatents nodes require a noise node input.
|
||||
|
||||

|
||||
|
||||
### Conditioning
|
||||
|
||||
As described, conditioning is necessary for the latent diffusion process, whether empty or not. As a result, all non-image *ToLatents nodes require positive and negative conditioning inputs. Conditioning is reliant on a CLIP tokenizer provided by the Model Loader node.
|
||||
|
||||

|
||||
|
||||
### Image Space & VAE
|
||||
|
||||
The ImageToLatents node doesn't require a noise node input, but requires a VAE input to convert the image from image space into latent space. In reverse, the LatentsToImage node requires a VAE input to convert from latent space back into image space.
|
||||
|
||||

|
||||
|
||||
### Defined & Random Seeds
|
||||
|
||||
It is common to want to use both the same seed (for continuity) and random seeds (for variance). To define a seed, simply enter it into the 'Seed' field on a noise node. Conversely, the RandomInt node generates a random integer between 'Low' and 'High', and can be used as input to the 'Seed' edge point on a noise node to randomize your seed.
|
||||
|
||||

|
||||
|
||||
### Control
|
||||
|
||||
Control means to guide the diffusion process to adhere to a defined input or structure. Control can be provided as input to non-image *ToLatents nodes from ControlNet nodes. ControlNet nodes usually require an image processor which converts an input image for use with ControlNet.
|
||||
|
||||

|
||||
|
||||
### LoRA
|
||||
|
||||
The Lora Loader node lets you load a LoRA (say that ten times fast) and pass it as output to both the Prompt (Compel) and non-image *ToLatents nodes. A model's CLIP tokenizer is passed through the LoRA into Prompt (Compel), where it affects conditioning. A model's U-Net is also passed through the LoRA into a non-image *ToLatents node, where it affects noise prediction.
|
||||
|
||||

|
||||
|
||||
### Scaling
|
||||
|
||||
Use the ImageScale, ScaleLatents, and Upscale nodes to upscale images and/or latent images. The chosen method differs across contexts. However, be aware that latents are already noisy and compressed at their original resolution; scaling an image could produce more detailed results.
|
||||
|
||||

|
||||
|
||||
### Iteration + Multiple Images as Input
|
||||
|
||||
Iteration is a common concept in any processing, and means to repeat a process with given input. In nodes, you're able to use the Iterate node to iterate through collections usually gathered by the Collect node. The Iterate node has many potential uses, from processing a collection of images one after another, to varying seeds across multiple image generations and more. This screenshot demonstrates how to collect several images and pass them out one at a time.
|
||||
|
||||

|
||||
|
||||
### Multiple Image Generation + Random Seeds
|
||||
|
||||
Multiple image generation in the node editor is done using the RandomRange node. In this case, the 'Size' field represents the number of images to generate. As RandomRange produces a collection of integers, we need to add the Iterate node to iterate through the collection.
|
||||
|
||||
Controlling seeds across generations takes some care. The first row in the screenshot will generate multiple images with different seeds, but using the same RandomRange parameters across invocations will result in the same group of random seeds being used across the images, producing repeatable results. In the second row, adding the RandomInt node as input to RandomRange's 'Seed' edge point will ensure that seeds are varied across all images across invocations, producing varied results.
|
||||
|
||||

|
||||
|
||||
## Examples
|
||||
|
||||
With our knowledge of node grouping and the diffusion process, let’s break down some basic graphs in the nodes editor. Note that a node's options can be overridden by inputs from other nodes. These examples aren't strict rules to follow and only demonstrate some basic configurations.
|
||||
|
||||
### Basic text-to-image Node Graph
|
||||
|
||||

|
||||
|
||||
- Model Loader: A necessity for generating images (as we’ve read above). We choose our model from the dropdown. It outputs a U-Net, CLIP tokenizer, and VAE.
|
||||
- Prompt (Compel): Another necessity. Two prompt nodes are created. One will output positive conditioning (what you want, ‘dog’), one will output negative (what you don’t want, ‘cat’). They both input the CLIP tokenizer that the Model Loader node outputs.
|
||||
- Noise: Consider this noise A from step one of the text-to-image explanation above. Choose a seed number, width, and height.
|
||||
- TextToLatents: This node takes many inputs and turns text conditioning and noise into denoised latents, hence the name TextTo**Latents**. In this setup, it inputs positive and negative conditioning from the prompt nodes for processing (step 2 above). It inputs noise from the noise node for processing (steps 2 & 3 above). Lastly, it inputs a U-Net from the Model Loader node for processing (step 2 above). It outputs latents for use in the next LatentsToImage node. Choose the number of sampler steps, CFG scale, and scheduler.
|
||||
- LatentsToImage: This node takes in processed latents from the TextToLatents node, and the model’s VAE from the Model Loader node which is responsible for decoding latents back into the image space, hence the name LatentsTo**Image**. This node is the last stop, and once the image is decoded, it is saved to the gallery.
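
One way to picture the wiring above is as a dependency graph, where each node lists the nodes whose outputs it consumes. The sketch below uses made-up node ids and Python's standard `graphlib` module (Python 3.9+). It is not InvokeAI's graph schema or API. It only shows that the connections imply a valid execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical node ids, each mapped to the set of nodes it takes input from.
dependencies = {
    "model_loader": set(),
    "positive_prompt": {"model_loader"},  # needs the CLIP tokenizer
    "negative_prompt": {"model_loader"},  # needs the CLIP tokenizer
    "noise": set(),
    "text_to_latents": {
        "model_loader",  # U-Net
        "positive_prompt",
        "negative_prompt",
        "noise",
    },
    "latents_to_image": {"model_loader", "text_to_latents"},  # VAE + latents
}

# Any order that respects the edges is a valid way to run the graph.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)
```

Whatever order comes out, the Model Loader always runs before the nodes that use its submodels, and LatentsToImage always runs last.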
|
||||
|
||||
### Basic image-to-image Node Graph
|
||||
|
||||

|
||||
|
||||
- Model Loader: Choose a model from the dropdown.
|
||||
- Prompt (Compel): Two prompt nodes. One positive (dog), one negative (cat). Same CLIP inputs from the Model Loader node as before.
|
||||
- ImageToLatents: Upload a source image directly in the node window, via drag'n'drop from the gallery, or pass it in as input. The ImageToLatents node inputs the VAE from the Model Loader node to encode the chosen image from image space into latent space, hence the name ImageTo**Latents**. It outputs latents for use in the next LatentsToLatents node. It also outputs the source image's width and height for use in the next Noise node if the final image is to be the same dimensions as the source image.
|
||||
- Noise: A noise tensor is created with the width and height of the source image, and connected to the next LatentsToLatents node. Notice the width and height fields are overridden by the input from the ImageToLatents width and height outputs.
|
||||
- LatentsToLatents: The inputs and options are nearly identical to TextToLatents, except that LatentsToLatents also takes latents as an input. Considering our source image is already converted to latents in the last ImageToLatents node, and text + noise are no longer the only inputs to process, we use the LatentsToLatents node.
|
||||
- LatentsToImage: Like previously, the LatentsToImage node will use the VAE from the Model Loader as input to decode the latents from LatentsToLatents into image space, and save it to the gallery.
|
||||
|
||||
### Basic ControlNet Node Graph
|
||||
|
||||

|
||||
|
||||
- Model Loader
|
||||
- Prompt (Compel)
|
||||
- Noise: The width and height of the CannyImageProcessor ControlNet image are passed in to set the dimensions of the noise passed to TextToLatents.
|
||||
- CannyImageProcessor: The CannyImageProcessor node is used to process the source image being used as a ControlNet. Each ControlNet processor node applies control in different ways, and has some different options to configure. Width and height are passed to noise, as mentioned. The processed ControlNet image is output to the ControlNet node.
|
||||
- ControlNet: Select the type of control model. In this case, canny is chosen as the CannyImageProcessor was used to generate the ControlNet image. Configure the control node options, and pass the control output to TextToLatents.
|
||||
- TextToLatents: Similar to the basic text-to-image example, except ControlNet is passed to the control input edge point.
|
||||
- LatentsToImage
|
89
docs/features/NSFW.md
Normal file
@ -0,0 +1,89 @@
|
||||
---
|
||||
title: The NSFW Checker
|
||||
---
|
||||
|
||||
# :material-image-off: NSFW Checker
|
||||
|
||||
## The NSFW ("Safety") Checker
|
||||
|
||||
The Stable Diffusion image generation models will produce sexual
|
||||
imagery if deliberately prompted, and will occasionally produce such
|
||||
images when this is not intended. Such images are colloquially known
|
||||
as "Not Safe for Work" (NSFW). This behavior is due to the nature of
|
||||
the training set that Stable Diffusion was trained on, which culled
|
||||
millions of "aesthetic" images from the Internet.
|
||||
|
||||
You may not wish to be exposed to these images, and in some
|
||||
jurisdictions it may be illegal to publicly distribute such imagery,
|
||||
including mounting a publicly-available server that provides
|
||||
unfiltered images to the public. Furthermore, the [Stable Diffusion
|
||||
weights
|
||||
License](https://github.com/invoke-ai/InvokeAI/blob/main/LICENSE-ModelWeights.txt)
|
||||
forbids the model from being used to "exploit any of the
|
||||
vulnerabilities of a specific group of persons."
|
||||
|
||||
For these reasons Stable Diffusion offers a "safety checker," a
|
||||
machine learning model trained to recognize potentially disturbing
|
||||
imagery. When a potentially NSFW image is detected, the checker will
|
||||
blur the image and paste a warning icon on top. The checker can be
|
||||
turned on and off on the command line using `--nsfw_checker` and
|
||||
`--no-nsfw_checker`.
|
||||
|
||||
At installation time, InvokeAI will ask whether the checker should be
|
||||
activated by default (neither argument given on the command line). The
|
||||
response is stored in the InvokeAI initialization file (usually
|
||||
`.invokeai` in your home directory). You can change the default at any
|
||||
time by opening this file in a text editor and commenting or
|
||||
uncommenting the line `--nsfw_checker`.
|
||||
|
||||
## Caveats
|
||||
|
||||
There are a number of caveats that you need to be aware of.
|
||||
|
||||
### Accuracy
|
||||
|
||||
The checker is [not perfect](https://arxiv.org/abs/2210.04610). It will
|
||||
occasionally flag innocuous images (false positives), and will
|
||||
frequently miss violent and gory imagery (false negatives). It rarely
|
||||
fails to flag sexual imagery, but this has been known to happen. For
|
||||
these reasons, the InvokeAI team prefers to refer to the software as a
|
||||
"NSFW Checker" rather than "safety checker."
|
||||
|
||||
### Memory Usage and Performance
|
||||
|
||||
The NSFW checker consumes an additional 1.2G of GPU VRAM on top of the
|
||||
3.4G of VRAM used by Stable Diffusion v1.5 (this is with
|
||||
half-precision arithmetic). This means that the checker will not run
|
||||
successfully on GPU cards with less than 6GB VRAM, and will reduce the
|
||||
size of the images that you can produce.
|
||||
|
||||
The checker also introduces a slight performance penalty. Images will
|
||||
take ~1 second longer to generate when the checker is
|
||||
activated. Generally this is not noticeable.
|
||||
|
||||
### Intermediate Images in the Web UI
|
||||
|
||||
The checker only operates on the final image produced by the Stable
|
||||
Diffusion algorithm. If you are using the Web UI and have enabled the
|
||||
display of intermediate images, you will briefly be exposed to a
|
||||
low-resolution (mosaicized) version of the final image before it is
|
||||
flagged by the checker and replaced by a fully blurred version. You
|
||||
are encouraged to turn **off** intermediate image rendering when you
|
||||
are using the checker. Future versions of InvokeAI will apply
|
||||
additional blurring to intermediate images when the checker is active.
|
||||
|
||||
### Watermarking
|
||||
|
||||
InvokeAI does not apply any sort of watermark to images it
|
||||
generates. However, it does write metadata into the PNG data area,
|
||||
including the prompt used to generate the image and relevant parameter
|
||||
settings. These fields can be examined using the `sd-metadata.py`
|
||||
script that comes with the InvokeAI package.
|
||||
|
||||
Note that several other Stable Diffusion distributions offer
|
||||
wavelet-based "invisible" watermarking. We have experimented with the
|
||||
library used to generate these watermarks and have reached the
|
||||
conclusion that while the watermarking library may be adding
|
||||
watermarks to PNG images, the currently available version is unable to
|
||||
retrieve them successfully. If and when a functioning version of the
|
||||
library becomes available, we will offer this feature as well.
|
@ -16,24 +16,48 @@ Output Example:
|
||||
|
||||
---
|
||||
|
||||
## **Invisible Watermark**
|
||||
## **Seamless Tiling**
|
||||
|
||||
In keeping with the principles for responsible AI generation, and to
|
||||
help AI researchers avoid synthetic images contaminating their
|
||||
training sets, InvokeAI adds an invisible watermark to each of the
|
||||
final images it generates. The watermark consists of the text
|
||||
"InvokeAI" and can be viewed using the
|
||||
[invisible-watermarks](https://github.com/ShieldMnt/invisible-watermark)
|
||||
tool.
|
||||
The seamless tiling mode causes generated images to tile seamlessly with themselves. To use it, add the
|
||||
`--seamless` option when starting the script, which makes all generated images tile, or add it
|
||||
to an individual `invoke>` prompt as shown here:
|
||||
|
||||
Watermarking is controlled using the `invisible-watermark` setting in
|
||||
`invokeai.yaml`. To turn it off, add the following line under the `Features`
|
||||
category.
|
||||
|
||||
```
|
||||
invisible_watermark: false
|
||||
```bash
|
||||
invoke> "pond garden with lotus by claude monet" --seamless -s100 -n4
|
||||
```
|
||||
|
||||
By default this will tile on both the X and Y axes. However, you can also specify specific axes to tile on with `--seamless_axes`.
|
||||
Possible values are `x`, `y`, and `x,y`:
|
||||
```bash
|
||||
invoke> "pond garden with lotus by claude monet" --seamless --seamless_axes=x -s100 -n4
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## **Shortcuts: Reusing Seeds**
|
||||
|
||||
Since it is so common to reuse seeds while refining a prompt, there is now a shortcut as of version
|
||||
1.11. Provide a `-S` (or `--seed`) switch of `-1` to use the seed of the most recent image
|
||||
generated. If you produced multiple images with the `-n` switch, then you can go back further
|
||||
using `-2`, `-3`, etc. up to the first image generated by the previous command. Sorry, but you can't go
|
||||
back further than one command.
|
||||
|
||||
Here's an example of using this to do a quick refinement. It also illustrates using the new `-G`
|
||||
switch to turn on upscaling and face enhancement (see previous section):
|
||||
|
||||
```bash
|
||||
invoke> a cute child playing hopscotch -G0.5
|
||||
[...]
|
||||
outputs/img-samples/000039.3498014304.png: "a cute child playing hopscotch" -s50 -W512 -H512 -C7.5 -mk_lms -S3498014304
|
||||
|
||||
# I wonder what it will look like if I bump up the steps and set facial enhancement to full strength?
|
||||
invoke> a cute child playing hopscotch -G1.0 -s100 -S -1
|
||||
reusing previous seed 3498014304
|
||||
[...]
|
||||
outputs/img-samples/000040.3498014304.png: "a cute child playing hopscotch" -G1.0 -s100 -W512 -H512 -C7.5 -mk_lms -S3498014304
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## **Weighted Prompts**
|
||||
|
||||
@ -42,10 +66,73 @@ priority to them, by adding `:<percent>` to the end of the section you wish to u
|
||||
example consider this prompt:
|
||||
|
||||
```bash
|
||||
(tabby cat):0.25 (white duck):0.75 hybrid
|
||||
tabby cat:0.25 white duck:0.75 hybrid
|
||||
```
|
||||
|
||||
This will tell the sampler to invest 25% of its effort on the tabby cat aspect of the image and 75%
|
||||
on the white duck aspect (surprisingly, this example actually works). The prompt weights can use any
|
||||
combination of integers and floating point numbers, and they do not need to add up to 1.
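
The note that weights do not need to add up to 1 suggests they are treated as relative shares. A minimal sketch of that idea, assuming the weights are normalized by their sum (the function name is hypothetical and this is not taken from the InvokeAI source):

```python
def normalize_weights(weighted_parts):
    """Scale (text, weight) pairs so that the weights sum to 1."""
    total = sum(weight for _, weight in weighted_parts)
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    return [(text, weight / total) for text, weight in weighted_parts]


# Weights of 1 and 3 express the same 25% / 75% split as 0.25 and 0.75.
print(normalize_weights([("tabby cat", 1), ("white duck", 3)]))
```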
|
||||
|
||||
---
|
||||
|
||||
## **Filename Format**
|
||||
|
||||
The `--fnformat` argument lets you specify the filename of the
|
||||
image. Supported wildcards are all arguments that can be set, such as
|
||||
`perlin`, `seed`, `threshold`, `height`, `width`, `gfpgan_strength`,
|
||||
`sampler_name`, `steps`, `model`, `upscale`, `prompt`, `cfg_scale`,
|
||||
`prefix`.
|
||||
|
||||
The following prompt
|
||||
```bash
|
||||
dream> a red car --steps 25 -C 9.8 --perlin 0.1 --fnformat {prompt}_steps.{steps}_cfg.{cfg_scale}_perlin.{perlin}.png
|
||||
```
|
||||
|
||||
generates a file with the name: `outputs/img-samples/a red car_steps.25_cfg.9.8_perlin.0.1.png`
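
The wildcard expansion behaves much like Python's `str.format`. The sketch below reproduces the example filename that way. It mirrors the documented result and is not necessarily how InvokeAI implements it internally.

```python
# The template passed to --fnformat in the example above.
template = "{prompt}_steps.{steps}_cfg.{cfg_scale}_perlin.{perlin}.png"

# The settings used for that generation.
settings = {"prompt": "a red car", "steps": 25, "cfg_scale": 9.8, "perlin": 0.1}

filename = template.format(**settings)
print(filename)  # a red car_steps.25_cfg.9.8_perlin.0.1.png
```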
|
||||
|
||||
---
|
||||
|
||||
## **Thresholding and Perlin Noise Initialization Options**
|
||||
|
||||
Two new options are the thresholding (`--threshold`) and the perlin noise initialization (`--perlin`) options. Thresholding limits the range of the latent values during optimization, which helps combat oversaturation with higher CFG scale values. Perlin noise initialization starts with a percentage (a value ranging from 0 to 1) of perlin noise mixed into the initial noise. Both features allow for more variations and options in the course of generating images.
|
||||
|
||||
For better intuition into what these options do in practice:
|
||||
|
||||

|
||||
|
||||
In generating this graphic, perlin noise at initialization was programmatically varied going across on the diagram by values 0.0, 0.1, 0.2, 0.4, 0.5, 0.6, 0.8, 0.9, 1.0; and the threshold was varied going down from
|
||||
0, 1, 2, 3, 4, 5, 10, 20, 100. The other options are fixed, so the initial prompt is as follows (no thresholding or perlin noise):
|
||||
|
||||
```bash
|
||||
invoke> "a portrait of a beautiful young lady" -S 1950357039 -s 100 -C 20 -A k_euler_a --threshold 0 --perlin 0
|
||||
```
|
||||
|
||||
Here's an example of another prompt used when setting the threshold to 5 and perlin noise to 0.2:
|
||||
|
||||
```bash
|
||||
invoke> "a portrait of a beautiful young lady" -S 1950357039 -s 100 -C 20 -A k_euler_a --threshold 5 --perlin 0.2
|
||||
```
|
||||
|
||||
!!! note
|
||||
|
||||
Currently the thresholding feature is only implemented for the k-diffusion style samplers, and empirically appears to work best with `k_euler_a` and `k_dpm_2_a`. Using 0 disables thresholding. Using 0 for perlin noise disables using perlin noise for initialization. Finally, using 1 for perlin noise uses only perlin noise for initialization.
|
||||
|
||||
---
|
||||
|
||||
## **Simplified API**
|
||||
|
||||
For programmers who wish to incorporate stable-diffusion into other products, this repository
|
||||
includes a simplified API for text to image generation, which lets you create images from a prompt
|
||||
in just three lines of code:
|
||||
|
||||
```python
|
||||
from ldm.generate import Generate
|
||||
g = Generate()
|
||||
outputs = g.txt2img("a unicorn in manhattan")
|
||||
```
|
||||
|
||||
`outputs` is a list of lists in the format `[[filename1, seed1], [filename2, seed2], ...]`.
|
||||
|
||||
Please see the documentation in ldm/generate.py for more information.
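
Given the documented return shape, the results can be unpacked pair by pair. The sample data below is made up so the snippet runs without the model installed. In real use, `outputs` would come from `g.txt2img(...)`.

```python
# Made-up sample in the documented [[filename, seed], ...] shape.
outputs = [
    ["outputs/img-samples/000001.1234.png", 1234],
    ["outputs/img-samples/000002.5678.png", 5678],
]

# Each entry unpacks into a filename and the seed that produced it.
for filename, seed in outputs:
    print(f"{filename} was generated with seed {seed}")

# A lookup table makes it easy to find the seed for a given file later.
seed_by_file = {filename: seed for filename, seed in outputs}
```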
|
||||
|
||||
---
|
||||
|
@ -8,6 +8,12 @@ title: Postprocessing
|
||||
|
||||
This extension provides the ability to restore faces and upscale images.
|
||||
|
||||
Face restoration and upscaling can be applied at the time you generate the
|
||||
images, or at any later time against a previously-generated PNG file, using the
|
||||
[!fix](#fixing-previously-generated-images) command.
|
||||
[Outpainting and outcropping](OUTPAINTING.md) can only be applied after the
|
||||
fact.
|
||||
|
||||
## Face Fixing
|
||||
|
||||
The default face restoration module is GFPGAN. The default upscale is
|
||||
@ -17,7 +23,8 @@ Real-ESRGAN. For an alternative face restoration module, see
|
||||
As of version 1.14, environment.yaml will install the Real-ESRGAN package into
|
||||
the standard install location for python packages, and will put GFPGAN into a
|
||||
subdirectory of "src" in the InvokeAI directory. Upscaling with Real-ESRGAN
|
||||
should "just work" without further intervention. Simply indicate the desired scale on
|
||||
should "just work" without further intervention. Simply pass the `--upscale`
|
||||
(`-U`) option on the `invoke>` command line, or indicate the desired scale on
|
||||
the popup in the Web GUI.
|
||||
|
||||
**GFPGAN** requires a series of downloadable model files to work. These are
|
||||
@ -34,75 +41,48 @@ reconstruction.
|
||||
|
||||
### Upscaling
|
||||
|
||||
Open the upscaling dialog by clicking on the "expand" icon located
|
||||
above the image display area in the Web UI:
|
||||
`-U : <upscaling_factor> <upscaling_strength>`
|
||||
|
||||
<figure markdown>
|
||||

|
||||
</figure>
|
||||
The upscaling prompt argument takes two values. The first value is a scaling
|
||||
factor and should be set to either `2` or `4` only. This will either scale the
|
||||
image 2x or 4x respectively using different models.
|
||||
|
||||
There are three different upscaling parameters that you can
|
||||
adjust. The first is the scale itself, either 2x or 4x.
|
||||
You can set the scaling strength between `0` and `1.0` to control the intensity of
|
||||
the scaling. This is handy because AI upscalers generally tend to smooth
|
||||
out texture details. If you wish to retain some of those for natural looking
|
||||
results, we recommend using values between `0.5` and `0.8`.
|
||||
|
||||
The second is the "Denoising Strength." Higher values will smooth out
|
||||
the image and remove digital chatter, but may lose fine detail at
|
||||
higher values.
|
||||
|
||||
Third, "Upscale Strength" lets you set the
|
||||
scaling strength between `0` and `1.0` to control the intensity of the
|
||||
scaling. AI upscalers generally tend to smooth out texture details. If
|
||||
you wish to retain some of those for natural looking results, we
|
||||
recommend using values between `0.5` and `0.8`.
|
||||
|
||||
[This figure](../assets/features/upscaling-montage.png) illustrates
|
||||
the effects of denoising and strength. The original image was 512x512,
|
||||
4x scaled to 2048x2048. The "original" version on the upper left was
|
||||
scaled using simple pixel averaging. The remainder use the ESRGAN
|
||||
upscaling algorithm at different levels of denoising and strength.
|
||||
|
||||
<figure markdown>
|
||||
{ width=720 }
|
||||
</figure>
|
||||
|
||||
Both denoising and strength default to 0.75.
|
||||
If you do not explicitly specify an upscaling_strength, it will default to 0.75.
|
||||
|
||||
### Face Restoration
|
||||
|
||||
InvokeAI offers two alternative face restoration algorithms,
|
||||
[GFPGAN](https://github.com/TencentARC/GFPGAN) and
|
||||
[CodeFormer](https://huggingface.co/spaces/sczhou/CodeFormer). These
|
||||
algorithms improve the appearance of faces, particularly eyes and
|
||||
mouths. Issues with faces are less common with the latest set of
|
||||
Stable Diffusion models than with the original 1.4 release, but the
|
||||
restoration algorithms can still make a noticeable improvement in
|
||||
certain cases. You can also apply restoration to old photographs you
|
||||
upload.
|
||||
`-G : <facetool_strength>`
|
||||
|
||||
To access face restoration, click the "smiley face" icon in the
|
||||
toolbar above the InvokeAI image panel. You will be presented with a
|
||||
dialog that offers a choice between the two algorithms and sliders that
|
||||
allow you to adjust their parameters. Alternatively, you may open the
|
||||
left-hand accordion panel labeled "Face Restoration" and have the
|
||||
restoration algorithm of your choice applied to generated images
|
||||
automatically.
|
||||
This prompt argument controls the strength of the face restoration that is being
|
||||
applied. Similar to upscaling, values between `0.5` and `0.8` are recommended.
|
||||
|
||||
You can use either one or both without any conflicts. In cases where you use
|
||||
both, the image will be first upscaled and then the face restoration process
|
||||
will be executed to ensure you get the highest quality facial features.
|
||||
|
||||
Like upscaling, there are a number of parameters that adjust the face
|
||||
restoration output. GFPGAN has a single parameter, `strength`, which
|
||||
controls how much the algorithm is allowed to adjust the
|
||||
image. CodeFormer has two parameters, `strength`, and `fidelity`,
|
||||
which together control the quality of the output image as described in
|
||||
the [CodeFormer project
|
||||
page](https://shangchenzhou.com/projects/CodeFormer/). Default values
|
||||
are 0.75 for both parameters, which achieves a reasonable balance
|
||||
between changing the image too much and not enough.
|
||||
`--save_orig`
|
||||
|
||||
[This figure](../assets/features/restoration-montage.png) illustrates
|
||||
the effects of adjusting GFPGAN and CodeFormer parameters.
|
||||
When you use either `-U` or `-G`, the final result you get is upscaled or face
|
||||
modified. If you want to save the original Stable Diffusion generation, you can
|
||||
use the `--save_orig` prompt argument to save the original unaffected version
|
||||
too.
|
||||
|
||||
<figure markdown>
|
||||
{ width=720 }
|
||||
</figure>
|
||||
### Example Usage
|
||||
|
||||
```bash
|
||||
invoke> "superman dancing with a panda bear" -U 2 0.6 -G 0.4
|
||||
```
|
||||
|
||||
This also works with img2img:
|
||||
|
||||
```bash
|
||||
invoke> "a man wearing a pineapple hat" -I path/to/your/file.png -U 2 0.5 -G 0.6
|
||||
```
|
||||
|
||||
!!! note
|
||||
|
||||
@ -115,8 +95,69 @@ the effects of adjusting GFPGAN and CodeFormer parameters.
|
||||
process is complete. While the image generation is taking place, you will still be able to preview
|
||||
the base images.
|
||||
|
||||
If you wish to stop during the image generation but want to upscale or face
|
||||
restore a particular generated image, pass it again with the same prompt and
|
||||
generated seed along with the `-U` and `-G` prompt arguments to perform those
|
||||
actions.
|
||||
|
||||
## CodeFormer Support
|
||||
|
||||
This repo also allows you to perform face restoration using
|
||||
[CodeFormer](https://github.com/sczhou/CodeFormer).
|
||||
|
||||
To set up CodeFormer, you need to download the models, as with
|
||||
GFPGAN. You can do this either by running `invokeai-configure` or by manually
|
||||
downloading the
|
||||
[model file](https://github.com/sczhou/CodeFormer/releases/download/v0.1.0/codeformer.pth)
|
||||
and saving it to `ldm/invoke/restoration/codeformer/weights` folder.
|
||||
|
||||
You can use `-ft` prompt argument to swap between CodeFormer and the default
|
||||
GFPGAN. The above mentioned `-G` prompt argument will allow you to control the
|
||||
strength of the restoration effect.
|
||||
|
||||
### CodeFormer Usage
|
||||
|
||||
The following command will perform face restoration with CodeFormer instead of
|
||||
the default GFPGAN.
|
||||
|
||||
`<prompt> -G 0.8 -ft codeformer`
|
||||
|
||||
### Other Options
|
||||
|
||||
- `-cf` - cf or CodeFormer Fidelity takes values between `0` and `1`. 0 produces
|
||||
high quality results but low accuracy and 1 produces lower quality results but
|
||||
higher accuracy to your original face.
|
||||
|
||||
The following command will perform face restoration with CodeFormer. CodeFormer
|
||||
will output a result that closely matches the input face.
|
||||
|
||||
`<prompt> -G 1.0 -ft codeformer -cf 0.9`
|
||||
|
||||
The following command will perform face restoration with CodeFormer. CodeFormer
|
||||
will output a result that is the best restoration possible. This may deviate
|
||||
slightly from the original face. This is an excellent option to use in
|
||||
situations when there is very little facial data to work with.
|
||||
|
||||
`<prompt> -G 1.0 -ft codeformer -cf 0.1`
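To compare fidelity settings side by side, it can help to generate one command line per `-cf` value while holding the seed fixed. The following is a minimal sketch, not part of InvokeAI: the helper name `fidelity_sweep`, the sample prompt, and the seed value are hypothetical, and only the `-S`, `-G`, `-ft` and `-cf` switches come from this documentation.

```python
def fidelity_sweep(prompt, seed, fidelities=(0.1, 0.5, 0.9), strength=1.0):
    """Return one invoke> line per CodeFormer fidelity value.

    Hypothetical helper: it only formats text using switches described
    in this document (-S seed, -G strength, -ft type, -cf fidelity).
    """
    return [
        f'"{prompt}" -S {seed} -G {strength} -ft codeformer -cf {cf}'
        for cf in fidelities
    ]


if __name__ == "__main__":
    # Print the lines so they can be pasted at the invoke> prompt.
    for line in fidelity_sweep("portrait of an elderly fisherman", seed=12345):
        print(line)
```

Keeping the seed constant means the base image should stay the same across runs, so any differences come from the fidelity value alone.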
|
||||
|
||||
## Fixing Previously-Generated Images
|
||||
|
||||
It is easy to apply face restoration and/or upscaling to any
|
||||
previously-generated file. Just use the syntax
|
||||
`!fix path/to/file.png <options>`. For example, to apply GFPGAN at strength 0.8
|
||||
and upscale 2X for a file named `./outputs/img-samples/000044.2945021133.png`,
|
||||
just run:
|
||||
|
||||
```bash
|
||||
invoke> !fix ./outputs/img-samples/000044.2945021133.png -G 0.8 -U 2
|
||||
```
|
||||
|
||||
A new file named `000044.2945021133.fixed.png` will be created in the output
|
||||
directory. Note that the `!fix` command does not replace the original file,
|
||||
unlike the behavior at generate time.
|
||||
|
||||
## How to disable
|
||||
|
||||
If, for some reason, you do not wish to load the GFPGAN and/or ESRGAN libraries,
|
||||
you can disable them on the invoke.py command line with the `--no_restore` and
|
||||
`--no_esrgan` options, respectively.
|
||||
`--no_upscale` options, respectively.
|
||||
|
@ -4,12 +4,77 @@ title: Prompting-Features
|
||||
|
||||
# :octicons-command-palette-24: Prompting-Features
|
||||
|
||||
## **Reading Prompts from a File**
|
||||
|
||||
You can automate `invoke.py` by providing a text file with the prompts you want
|
||||
to run, one line per prompt. The text file must be composed with a text editor
|
||||
(e.g. Notepad) and not a word processor. Each line should look like what you
|
||||
would type at the invoke> prompt:
|
||||
|
||||
```bash
|
||||
"a beautiful sunny day in the park, children playing" -n4 -C10
|
||||
"stormy weather on a mountain top, goats grazing" -s100
|
||||
"innovative packaging for a squid's dinner" -S137038382
|
||||
```
|
||||
|
||||
Then pass this file's name to `invoke.py` when you invoke it:
|
||||
|
||||
```bash
|
||||
python scripts/invoke.py --from_file "/path/to/prompts.txt"
|
||||
```
|
||||
|
||||
You may also read a series of prompts from standard input by providing
|
||||
a filename of `-`. For example, here is a python script that creates a
|
||||
matrix of prompts, each one varying slightly:
|
||||
|
||||
```python
|
||||
#!/usr/bin/env python
|
||||
|
||||
adjectives = ['sunny','rainy','overcast']
|
||||
samplers = ['k_lms','k_euler_a','k_heun']
|
||||
cfg = [7.5, 9, 11]
|
||||
|
||||
for adj in adjectives:
|
||||
    for samp in samplers:
|
||||
        for cg in cfg:
|
||||
            print(f'a {adj} day -A{samp} -C{cg}')
|
||||
```
|
||||
|
||||
Its output looks like this (abbreviated):
|
||||
|
||||
```bash
|
||||
a sunny day -Ak_lms -C7.5
|
||||
a sunny day -Ak_lms -C9
|
||||
a sunny day -Ak_lms -C11
|
||||
a sunny day -Ak_euler_a -C7.5
|
||||
a sunny day -Ak_euler_a -C9
|
||||
...
|
||||
a overcast day -Ak_heun -C9
|
||||
a overcast day -Ak_heun -C11
|
||||
```
|
||||
|
||||
To feed it to invoke.py, pass the filename of "-"
|
||||
|
||||
```bash
|
||||
python matrix.py | python scripts/invoke.py --from_file -
|
||||
```
|
||||
|
||||
When the script is finished, each of the 27 combinations
|
||||
of adjective, sampler and CFG will be executed.
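The count of 27 follows from 3 adjectives x 3 samplers x 3 CFG values. As a minimal sketch, the same matrix can be built with `itertools.product`, which makes the arithmetic easy to check before sending anything to `invoke.py`. The function name `prompt_matrix` is an illustrative assumption; the lists and the `-A`/`-C` switches are taken from the script above.

```python
import itertools

adjectives = ['sunny', 'rainy', 'overcast']
samplers = ['k_lms', 'k_euler_a', 'k_heun']
cfg = [7.5, 9, 11]


def prompt_matrix():
    """Return every adjective/sampler/CFG combination as a prompt line."""
    return [
        f'a {adj} day -A{samp} -C{cg}'
        for adj, samp, cg in itertools.product(adjectives, samplers, cfg)
    ]


if __name__ == "__main__":
    lines = prompt_matrix()
    # 3 * 3 * 3 combinations are expected.
    print(len(lines))
    for line in lines:
        print(line)
```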
|
||||
|
||||
The command-line interface provides `!fetch` and `!replay` commands
|
||||
which allow you to read the prompts from a single previously-generated
|
||||
image or a whole directory of them, write the prompts to a file, and
|
||||
then replay them. Or you can create your own file of prompts and feed
|
||||
them to the command-line client from within an interactive session.
|
||||
See [Command-Line Interface](CLI.md) for details.
|
||||
|
||||
---
|
||||
|
||||
## **Negative and Unconditioned Prompts**
|
||||
|
||||
Any words between a pair of square brackets will instruct Stable
|
||||
Diffusion to attempt to ban the concept from the generated image. The
|
||||
same effect is achieved by placing words in the "Negative Prompts"
|
||||
textbox in the Web UI.
|
||||
Any words between a pair of square brackets will instruct Stable Diffusion to
|
||||
attempt to ban the concept from the generated image.
|
||||
|
||||
```text
|
||||
this is a test prompt [not really] to make you understand [cool] how this works.
|
||||
@ -22,9 +87,7 @@ Here's a prompt that depicts what it does.
|
||||
|
||||
original prompt:
|
||||
|
||||
`#!bash "A fantastical translucent pony made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve"`
|
||||
|
||||
`#!bash parameters: steps=20, dimensions=512x768, CFG=7.5, Scheduler=k_euler_a, seed=1654590180`
|
||||
`#!bash "A fantastical translucent pony made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180`
|
||||
|
||||
<figure markdown>
|
||||
|
||||
@ -36,8 +99,7 @@ That image has a woman, so if we want the horse without a rider, we can
|
||||
influence the image not to have a woman by putting [woman] in the prompt, like
|
||||
this:
|
||||
|
||||
`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman]"`
|
||||
(same parameters as above)
|
||||
`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180`
|
||||
|
||||
<figure markdown>
|
||||
|
||||
@ -48,8 +110,7 @@ this:
|
||||
That's nice - but say we also don't want the image to be quite so blue. We can
|
||||
add "blue" to the list of negative prompts, so it's now [woman blue]:
|
||||
|
||||
`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman blue]"`
|
||||
(same parameters as above)
|
||||
`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman blue]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180`
|
||||
|
||||
<figure markdown>
|
||||
|
||||
@ -60,8 +121,7 @@ add "blue" to the list of negative prompts, so it's now [woman blue]:
|
||||
Getting close - but there's no sense in having a saddle when our horse doesn't
|
||||
have a rider, so we'll add one more negative prompt: [woman blue saddle].
|
||||
|
||||
`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman blue saddle]"`
|
||||
(same parameters as above)
|
||||
`#!bash "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman blue saddle]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180`
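When a list of negative terms grows step by step as in this walkthrough, a small helper can keep the bracket syntax consistent. This is a sketch of the text formatting only; `with_negatives` is a hypothetical name and does not reflect how InvokeAI parses prompts internally.

```python
def with_negatives(prompt, negatives):
    """Append negative terms to a prompt using the [square bracket] syntax.

    Hypothetical helper: empty or whitespace-only terms are skipped, and
    the prompt is returned unchanged when no terms remain.
    """
    terms = " ".join(term.strip() for term in negatives if term.strip())
    return f"{prompt} [{terms}]" if terms else prompt


if __name__ == "__main__":
    base = "A fantastical translucent pony made of water and foam"
    print(with_negatives(base, ["woman", "blue", "saddle"]))
```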
|
||||
|
||||
<figure markdown>
|
||||
|
||||
@ -201,6 +261,19 @@ Prompt2prompt `.swap()` is not compatible with xformers, which will be temporari
|
||||
The `prompt2prompt` code is based off
|
||||
[bloc97's colab](https://github.com/bloc97/CrossAttentionControl).
|
||||
|
||||
Note that `prompt2prompt` is not currently working with the runwayML inpainting
|
||||
model, and may never work due to the way this model is set up. If you attempt to
|
||||
use `prompt2prompt` you will get the original image back. However, since this
|
||||
model is so good at inpainting, a good substitute is to use the `clipseg` text
|
||||
masking option:
|
||||
|
||||
```bash
|
||||
invoke> a fluffy cat eating a hotdog
|
||||
Outputs:
|
||||
[1010] outputs/000025.2182095108.png: a fluffy cat eating a hotdog
|
||||
invoke> a smiling dog eating a hotdog -I 000025.2182095108.png -tm cat
|
||||
```
|
||||
|
||||
### Escaping parentheses () and speech marks ""
|
||||
|
||||
If the model you are using has parentheses () or speech marks "" as part of its
|
||||
@ -301,48 +374,6 @@ summoning up the concept of some sort of scifi creature? Let's find out.
|
||||
Indeed, removing the word "hybrid" produces an image that is more like what we'd
|
||||
expect.
|
||||
|
||||
## Dynamic Prompts
|
||||
|
||||
Dynamic Prompts are a powerful feature designed to produce a variety of prompts based on user-defined options. Using a special syntax, you can construct a prompt with multiple possibilities, and the system will automatically generate a series of permutations based on your settings. This is extremely beneficial for ideation, exploring various scenarios, or testing different concepts swiftly and efficiently.
|
||||
|
||||
### Structure of a Dynamic Prompt
|
||||
|
||||
A Dynamic Prompt consists of regular text, supplemented with alternatives enclosed within curly braces {} and separated by a vertical bar |. For example: {option1|option2|option3}. The system will then select one of the options to include in the final prompt. This flexible system allows for options to be placed throughout the text as needed.
|
||||
|
||||
Furthermore, Dynamic Prompts can designate multiple selections from a single group of options. This feature is triggered by prefixing the options with a numerical value followed by $$. For example, in {2$$option1|option2|option3}, the system will select two distinct options from the set.
|
||||
### Creating Dynamic Prompts
|
||||
|
||||
To create a Dynamic Prompt, follow these steps:
|
||||
|
||||
Draft your sentence or phrase, identifying words or phrases with multiple possible options.
|
||||
Encapsulate the different options within curly braces {}.
|
||||
Within the braces, separate each option using a vertical bar |.
|
||||
If you want to include multiple options from a single group, prefix with the desired number and $$.
|
||||
|
||||
For instance: A {house|apartment|lodge|cottage} in {summer|winter|autumn|spring} designed in {2$$style1|style2|style3}.
|
||||
### How Dynamic Prompts Work
|
||||
|
||||
Once a Dynamic Prompt is configured, the system generates an array of combinations using the options provided. Each group of options in curly braces is treated independently, with the system selecting one option from each group. For a prefixed set (e.g., 2$$), the system will select two distinct options.
|
||||
|
||||
For example, the following prompts could be generated from the above Dynamic Prompt:
|
||||
|
||||
A house in summer designed in style1, style2
|
||||
A lodge in autumn designed in style3, style1
|
||||
A cottage in winter designed in style2, style3
|
||||
And many more!
|
||||
|
||||
When the `Combinatorial` setting is on, Invoke will disable the "Images" selection, and generate every combination up until the setting for Max Prompts is reached.
|
||||
When the `Combinatorial` setting is off, Invoke will randomly generate combinations up until the setting for Images has been reached.
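The combinatorial behaviour can be sketched in a few lines of Python. This is an illustration of the syntax described here, not InvokeAI's implementation: the names `choices_for` and `expand` are hypothetical, nested braces are not handled, and the sketch assumes that an `N$$` group yields unordered selections joined with a comma.

```python
import itertools
import re

# Matches one {...} group that contains no nested braces.
GROUP = re.compile(r"\{([^{}]*)\}")


def choices_for(group_body):
    """Return every text fragment a single group can expand to."""
    count = 1
    if "$$" in group_body:
        prefix, group_body = group_body.split("$$", 1)
        count = int(prefix)
    options = group_body.split("|")
    if count == 1:
        return options
    # Assumption: N$$ picks N distinct options, order ignored.
    return [", ".join(combo) for combo in itertools.combinations(options, count)]


def expand(template, max_prompts=None):
    """Generate prompts for every combination, stopping at max_prompts."""
    parts = GROUP.split(template)
    literal = parts[0::2]                       # text between groups
    groups = [choices_for(body) for body in parts[1::2]]
    prompts = []
    for picks in itertools.product(*groups):
        pieces = [literal[0]]
        for pick, text in zip(picks, literal[1:]):
            pieces.append(pick)
            pieces.append(text)
        prompts.append("".join(pieces))
        if max_prompts is not None and len(prompts) >= max_prompts:
            break
    return prompts


if __name__ == "__main__":
    template = ("A {house|apartment|lodge|cottage} in "
                "{summer|winter|autumn|spring} designed in "
                "{2$$style1|style2|style3}")
    for prompt in expand(template, max_prompts=5):
        print(prompt)
```

Under these assumptions the example template has 4 x 4 x 3 = 48 combinations, which is why a cap such as Max Prompts is useful.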
|
||||
|
||||
|
||||
|
||||
### Tips and Tricks for Using Dynamic Prompts
|
||||
|
||||
Below are some useful strategies for creating Dynamic Prompts:
|
||||
|
||||
Utilize Dynamic Prompts to generate a wide spectrum of prompts, perfect for brainstorming and exploring diverse ideas.
|
||||
Ensure that the options within a group are contextually relevant to the part of the sentence where they are used. For instance, group building types together, and seasons together.
|
||||
Apply the 2$$ prefix when you want to incorporate more than one option from a single group. This becomes quite handy when mixing and matching different elements.
|
||||
Experiment with different quantities for the prefix. For example, 3$$ will select three distinct options.
|
||||
Be aware of coherence in your prompts. Although the system can generate all possible combinations, not all may semantically make sense. Therefore, carefully choose the options for each group.
|
||||
Always review and fine-tune the generated prompts as needed. While Dynamic Prompts can help you generate a multitude of combinations, the final polishing and refining remain in your hands.
|
||||
In conclusion, prompt blending is great for exploring creative space, but can be
|
||||
difficult to direct. A forthcoming release of InvokeAI will feature more
|
||||
deterministic prompt weighting.
|
||||
|
287
docs/features/TEXTUAL_INVERSION.md
Normal file
@ -0,0 +1,287 @@
|
||||
---
|
||||
title: Textual-Inversion
|
||||
---
|
||||
|
||||
# :material-file-document: Textual Inversion
|
||||
|
||||
## **Personalizing Text-to-Image Generation**
|
||||
|
||||
You may personalize the generated images to provide your own styles or objects
|
||||
by training a new LDM checkpoint and introducing a new vocabulary to the fixed
|
||||
model as a (.pt) embeddings file. Alternatively, you may use or train
|
||||
HuggingFace Concepts embeddings files (.bin) from
|
||||
<https://huggingface.co/sd-concepts-library> and its associated
|
||||
notebooks.
|
||||
|
||||
## **Hardware and Software Requirements**
|
||||
|
||||
You will need a GPU to perform training in a reasonable length of
|
||||
time, and at least 12 GB of VRAM. We recommend using the [`xformers`
|
||||
library](../installation/070_INSTALL_XFORMERS.md) to accelerate the
|
||||
training process further. During training, about 8 GB is temporarily
|
||||
needed in order to store intermediate models, checkpoints and logs.
|
||||
|
||||
## **Preparing for Training**
|
||||
|
||||
To train, prepare a folder that contains 3-5 images that illustrate
|
||||
the object or concept. It is good to provide a variety of examples or
|
||||
poses to avoid overtraining the system. Format these images as PNG
|
||||
(preferred) or JPG. You do not need to resize or crop the images in
|
||||
advance, but for more control you may wish to do so.
|
||||
|
||||
Place the training images in a directory on the machine InvokeAI runs
|
||||
on. We recommend placing them in a subdirectory of the
|
||||
`text-inversion-training-data` folder located in the InvokeAI root
|
||||
directory, ordinarily `~/invokeai` (Linux/Mac), or
|
||||
`C:\Users\your_name\invokeai` (Windows). For example, to create an
|
||||
embedding for the "psychedelic" style, you'd place the training images
|
||||
into the directory
|
||||
`~/invokeai/text-inversion-training-data/psychedelic`.
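Before launching a training run, it may be worth confirming that the folder holds the expected number of PNG or JPG files. The following is a minimal sketch using only the standard library; the helper names and the messages are hypothetical, and the 3-5 range simply mirrors the recommendation above.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def training_images(folder):
    """Return the PNG/JPG files directly inside folder, sorted by name."""
    folder = Path(folder).expanduser()
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def check_training_folder(folder, minimum=3, maximum=5):
    """Describe how the image count compares with the suggested range."""
    count = len(training_images(folder))
    if count < minimum:
        return f"only {count} image(s) found; the guide suggests at least {minimum}"
    if count > maximum:
        return f"{count} images found; the guide suggests {minimum}-{maximum}"
    return f"{count} images found"


if __name__ == "__main__":
    # Path taken from the example above; adjust it to your own root directory.
    target = Path("~/invokeai/text-inversion-training-data/psychedelic").expanduser()
    if target.is_dir():
        print(check_training_folder(target))
    else:
        print(f"{target} does not exist yet")
```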
|
||||
|
||||
## **Launching Training Using the Console Front End**
|
||||
|
||||
InvokeAI 2.3 and higher comes with a text console-based training front
|
||||
end. From within the `invoke.sh`/`invoke.bat` Invoke launcher script,
|
||||
start the front end by selecting choice (3):
|
||||
|
||||
```sh
|
||||
Do you want to generate images using the
|
||||
1. command-line
|
||||
2. browser-based UI
|
||||
3. textual inversion training
|
||||
4. open the developer console
|
||||
Please enter 1, 2, 3, or 4: [1] 3
|
||||
```
|
||||
|
||||
From the command line, with the InvokeAI virtual environment active,
|
||||
you can launch the front end with the command `invokeai-ti --gui`.
|
||||
|
||||
This will launch a text-based front end that will look like this:
|
||||
|
||||
<figure markdown>
|
||||

|
||||
</figure>
|
||||
|
||||
The interface is keyboard-based. Move from field to field using
|
||||
control-N (^N) to move to the next field and control-P (^P) to the
|
||||
previous one. <Tab> and <shift-TAB> work as well. Once a field is
|
||||
active, use the cursor keys. In a checkbox group, use the up and down
|
||||
cursor keys to move from choice to choice, and <space> to select a
|
||||
choice. In a scrollbar, use the left and right cursor keys to increase
|
||||
and decrease the value of the scroll. In textfields, type the desired
|
||||
values.
|
||||
|
||||
The number of parameters may look intimidating, but in most cases the
|
||||
predefined defaults work fine. The red circled fields in the above
|
||||
illustration are the ones you will adjust most frequently.
|
||||
|
||||
### Model Name
|
||||
|
||||
This will list all the diffusers models that are currently
|
||||
installed. Select the one you wish to use as the basis for your
|
||||
embedding. Be aware that if you use a SD-1.X-based model for your
|
||||
training, you will only be able to use this embedding with other
|
||||
SD-1.X-based models. Similarly, if you train on SD-2.X, you will only
|
||||
be able to use the embeddings with models based on SD-2.X.
|
||||
|
||||
### Trigger Term
|
||||
|
||||
This is the prompt term you will use to trigger the embedding. Type a
|
||||
single word or phrase you wish to use as the trigger, for example
|
||||
"psychedelic" (without angle brackets). Within InvokeAI, you will then
|
||||
be able to activate the trigger using the syntax `<psychedelic>`.
|
||||
|
||||
### Initializer
|
||||
|
||||
This is a single character that is used internally during the training
|
||||
process as a placeholder for the trigger term. It defaults to "*" and
|
||||
can usually be left alone.
|
||||
|
||||
### Resume from last saved checkpoint
|
||||
|
||||
As training proceeds, textual inversion will write a series of
|
||||
intermediate files that can be used to resume training from where it
|
||||
was left off in the case of an interruption. This checkbox will be
|
||||
automatically selected if you provide a previously used trigger term
|
||||
and at least one checkpoint file is found on disk.
|
||||
|
||||
Note that as of 20 January 2023, resume does not seem to be working
|
||||
properly due to an issue with the upstream code.
|
||||
|
||||
### Data Training Directory
|
||||
|
||||
This is the location of the images to be used for training. When you
|
||||
select a trigger term like "my-trigger", the frontend will prepopulate
|
||||
this field with `~/invokeai/text-inversion-training-data/my-trigger`,
|
||||
but you can change the path to wherever you want.
|
||||
|
||||
### Output Destination Directory
|
||||
|
||||
This is the location of the logs, checkpoint files, and embedding
|
||||
files created during training. When you select a trigger term like
|
||||
"my-trigger", the frontend will prepopulate this field with
|
||||
`~/invokeai/text-inversion-output/my-trigger`, but you can change the
|
||||
path to wherever you want.
|
||||
|
||||
### Image resolution
|
||||
|
||||
The images in the training directory will be automatically scaled to
|
||||
the value you use here. For best results, you will want to use the
|
||||
same default resolution of the underlying model (512 pixels for
|
||||
SD-1.5, 768 for the larger version of SD-2.1).
|
||||
|
||||
### Center crop images
|
||||
|
||||
If this is selected, your images will be center cropped to make them
|
||||
square before resizing them to the desired resolution. Center cropping
|
||||
can indiscriminately cut off the top of subjects' heads for portrait
|
||||
aspect images, so if you have images like this, you may wish to use a
|
||||
photoeditor to manually crop them to a square aspect ratio.
|
||||
|
||||
### Mixed precision
|
||||
|
||||
Select the floating point precision for the embedding. "no" will
|
||||
result in a full 32-bit precision, "fp16" will provide 16-bit
|
||||
precision, and "bf16" will provide mixed precision (only available
|
||||
when XFormers is used).
|
||||
|
||||
### Max training steps
|
||||
|
||||
How many steps the training will take before the model converges. Most
|
||||
training sets will converge with 2000-3000 steps.
|
||||
|
||||
### Batch size
|
||||
|
||||
This adjusts how many training images are processed simultaneously in
|
||||
each step. Higher values will cause the training process to run more
|
||||
quickly, but use more memory. The default size will run with GPUs with
|
||||
as little as 12 GB.
|
||||
|
||||
### Learning rate
|
||||
|
||||
The rate at which the system adjusts its internal weights during
|
||||
training. Higher values risk overtraining (getting the same image each
|
||||
time), and lower values will take more steps to train a good
|
||||
model. The default of 0.0005 is conservative; you may wish to increase
|
||||
it to 0.005 to speed up training.
|
||||
|
||||
### Scale learning rate by number of GPUs, steps and batch size
|
||||
|
||||
If this is selected (the default) the system will adjust the provided
|
||||
learning rate to improve performance.
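As a rough illustration of what such scaling can look like, the HuggingFace textual inversion example script multiplies the base rate by the gradient accumulation steps, the batch size and the number of processes. Whether InvokeAI's trainer uses exactly this formula is an assumption here, and `scaled_learning_rate` is a hypothetical name.

```python
def scaled_learning_rate(learning_rate, train_batch_size,
                         gradient_accumulation_steps, num_processes=1):
    """Scale a base learning rate by the effective batch size.

    Assumption: mirrors the scaling used by the HuggingFace example
    script; InvokeAI's own behaviour may differ.
    """
    return (learning_rate
            * gradient_accumulation_steps
            * train_batch_size
            * num_processes)


if __name__ == "__main__":
    # Values borrowed from the command-line example in this document.
    print(scaled_learning_rate(0.0005, train_batch_size=8,
                               gradient_accumulation_steps=4))
```

With these values the effective batch size is 8 x 4 = 32 on a single GPU, so the scaled rate is 32 times the base rate.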
|
||||
|
||||
### Use xformers acceleration
|
||||
|
||||
This will activate XFormers memory-efficient attention. You need to
|
||||
have XFormers installed for this to have an effect.
|
||||
|
||||
### Learning rate scheduler
|
||||
|
||||
This adjusts how the learning rate changes over the course of
|
||||
training. The default "constant" means to use a constant learning rate
|
||||
for the entire training session. The other values scale the learning
|
||||
rate according to various formulas.
|
||||
|
||||
Only "constant" is supported by the XFormers library.
|
||||
|
||||
### Gradient accumulation steps
|
||||
|
||||
This is a parameter that allows you to use bigger batch sizes than
|
||||
your GPU's VRAM would ordinarily accommodate, at the cost of some
|
||||
performance.
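The idea behind gradient accumulation can be shown with a toy one-parameter model: averaging the gradients of several small batches gives the same result as one large batch, while only a small batch needs to be held in memory at a time. This is a pure-Python sketch for illustration, with hypothetical function names, and it is unrelated to the trainer's actual code.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, micro_batch, steps):
    """Average the gradients of `steps` consecutive micro-batches."""
    total = 0.0
    for i in range(steps):
        lo, hi = i * micro_batch, (i + 1) * micro_batch
        # Each micro-batch contributes 1/steps of the final gradient.
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / steps
    return total


if __name__ == "__main__":
    xs = [float(i) for i in range(1, 9)]
    ys = [2 * x + 1 for x in xs]
    full = grad_mse(0.5, xs, ys)                      # one batch of 8
    small = accumulated_grad(0.5, xs, ys, micro_batch=2, steps=4)
    print(full, small)
```

The extra cost mentioned above comes from running several forward and backward passes before each weight update.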
|
||||
|
||||
### Warmup steps
|
||||
|
||||
If "constant_with_warmup" is selected in the learning rate scheduler,
|
||||
then this provides the number of warmup steps. Warmup steps have a
|
||||
very low learning rate, and are one way of preventing early
|
||||
overtraining.
|
||||
|
||||
## The training run
|
||||
|
||||
Start the training run by advancing to the OK button (bottom right)
|
||||
and pressing <enter>. A series of progress messages will be displayed
|
||||
as the training process proceeds. This may take an hour or two,
|
||||
depending on settings and the speed of your system. Various log and
|
||||
checkpoint files will be written into the output directory (ordinarily
|
||||
`~/invokeai/text-inversion-output/my-model/`)
|
||||
|
||||
At the end of successful training, the system will copy the file
|
||||
`learned_embeds.bin` into the InvokeAI root directory's `embeddings`
|
||||
directory, using a subdirectory named after the trigger token. For
|
||||
example, if the trigger token was `psychedelic`, then look for the
|
||||
embeddings file in
|
||||
`~/invokeai/embeddings/psychedelic/learned_embeds.bin`
|
||||
|
||||
You may now launch InvokeAI and try out a prompt that uses the trigger
|
||||
term. For example `a plate of banana sushi in <psychedelic> style`.
|
||||
|
||||
## **Training with the Command-Line Script**
|
||||
|
||||
Training can also be done using a traditional command-line script. It
|
||||
can be launched from within the "developer's console", or from the
|
||||
command line after activating InvokeAI's virtual environment.
|
||||
|
||||
It accepts a large number of arguments, which can be summarized by
|
||||
passing the `--help` argument:
|
||||
|
||||
```sh
|
||||
invokeai-ti --help
|
||||
```
|
||||
|
||||
Typical usage is shown here:
|
||||
```sh
|
||||
invokeai-ti \
|
||||
--model=stable-diffusion-1.5 \
|
||||
--resolution=512 \
|
||||
--learnable_property=style \
|
||||
--initializer_token='*' \
|
||||
--placeholder_token='<psychedelic>' \
|
||||
--train_data_dir=/home/lstein/invokeai/training-data/psychedelic \
|
||||
--output_dir=/home/lstein/invokeai/text-inversion-training/psychedelic \
|
||||
--scale_lr \
|
||||
--train_batch_size=8 \
|
||||
--gradient_accumulation_steps=4 \
|
||||
--max_train_steps=3000 \
|
||||
--learning_rate=0.0005 \
|
||||
--resume_from_checkpoint=latest \
|
||||
--lr_scheduler=constant \
|
||||
--mixed_precision=fp16 \
|
||||
--only_save_embeds
|
||||
```
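If you launch training from your own tooling, the argument list can be assembled programmatically. The following is a minimal sketch that only reuses options shown in the example above; the function name `ti_command` and the example paths are hypothetical, and the sketch prints the command rather than running it.

```python
import shlex


def ti_command(trigger, train_dir, output_dir,
               model="stable-diffusion-1.5", resolution=512,
               max_train_steps=3000):
    """Build an invokeai-ti argument list from the options shown above."""
    return [
        "invokeai-ti",
        f"--model={model}",
        f"--resolution={resolution}",
        "--learnable_property=style",
        "--initializer_token=*",
        f"--placeholder_token=<{trigger}>",
        f"--train_data_dir={train_dir}",
        f"--output_dir={output_dir}",
        f"--max_train_steps={max_train_steps}",
        "--only_save_embeds",
    ]


if __name__ == "__main__":
    args = ti_command("psychedelic",
                      "/path/to/training-data/psychedelic",
                      "/path/to/output/psychedelic")
    # shlex.join quotes '*' and '<...>' so the line is safe to paste in a shell.
    print(shlex.join(args))
```

Passing the list to `subprocess.run(args)` avoids shell quoting problems with the `*` and angle-bracket arguments.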
|
||||
|
||||
## Using Embeddings
|
||||
|
||||
After training completes, the resultant embeddings will be saved into your `$INVOKEAI_ROOT/embeddings/<trigger word>/learned_embeds.bin`.
|
||||
|
||||
These will be automatically loaded when you start InvokeAI.
|
||||
|
||||
Add the trigger word, surrounded by angle brackets, to use that embedding. For example, if your trigger word was `terence`, use `<terence>` in prompts. This is the same syntax used by the HuggingFace concepts library.
|
||||
|
||||
**Note:** `.pt` embeddings do not require the angle brackets.
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### `Cannot load embedding for <trigger>. It was trained on a model with token dimension 1024, but the current model has token dimension 768`
|
||||
|
||||
Messages like this indicate you trained the embedding on a different base model than the currently selected one.
|
||||
|
||||
For example, in the error above, the training was done on SD2.1 (768x768) but it was used on SD1.5 (512x512).
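The two numbers in the message are text-encoder token dimensions: SD-1.x models use 768 and SD-2.x models use 1024. A small lookup can turn the raw numbers into a more readable hint. This is a sketch with hypothetical names, not InvokeAI's own check.

```python
# Token (embedding) dimension of the text encoder for each model family.
TOKEN_DIM_TO_FAMILY = {768: "SD-1.x", 1024: "SD-2.x"}


def explain_mismatch(embedding_dim, model_dim):
    """Describe whether an embedding fits the currently selected model."""
    if embedding_dim == model_dim:
        return "compatible"
    trained = TOKEN_DIM_TO_FAMILY.get(embedding_dim, f"unknown ({embedding_dim})")
    current = TOKEN_DIM_TO_FAMILY.get(model_dim, f"unknown ({model_dim})")
    return f"embedding trained on {trained}, current model is {current}"


if __name__ == "__main__":
    # Dimensions taken from the error message above.
    print(explain_mismatch(1024, 768))
```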
|
||||
|
||||
## Reading
|
||||
|
||||
For more information on textual inversion, please see the following
|
||||
resources:
|
||||
|
||||
* The [textual inversion repository](https://github.com/rinongal/textual_inversion) and
|
||||
associated paper for details and limitations.
|
||||
* [HuggingFace's textual inversion training
|
||||
page](https://huggingface.co/docs/diffusers/training/text_inversion)
|
||||
* [HuggingFace example script
|
||||
documentation](https://github.com/huggingface/diffusers/tree/main/examples/textual_inversion)
|
||||
(Note that this script is similar, but not identical, to
|
||||
`textual_inversion`, but produces embed files that are completely compatible.)
|
||||
|
||||
---
|
||||
|
||||
copyright (c) 2023, Lincoln Stein and the InvokeAI Development Team
|
@ -1,286 +0,0 @@
|
||||
---
|
||||
title: Training
|
||||
---
|
||||
|
||||
# :material-file-document: Training
|
||||
|
||||
# Textual Inversion Training
|
||||
## **Personalizing Text-to-Image Generation**
|
||||
|
||||
You may personalize the generated images to provide your own styles or objects
|
||||
by training a new LDM checkpoint and introducing a new vocabulary to the fixed
|
||||
model as a (.pt) embeddings file. Alternatively, you may use or train
|
||||
HuggingFace Concepts embeddings files (.bin) from
|
||||
<https://huggingface.co/sd-concepts-library> and its associated
|
||||
notebooks.
|
||||
|
||||
## **Hardware and Software Requirements**
|
||||
|
||||
You will need a GPU to perform training in a reasonable length of
|
||||
time, and at least 12 GB of VRAM. We recommend using the [`xformers`
|
||||
library](../installation/070_INSTALL_XFORMERS.md) to accelerate the
|
||||
training process further. During training, about 8 GB is temporarily
|
||||
needed in order to store intermediate models, checkpoints and logs.
|
||||
|
||||
## **Preparing for Training**
|
||||
|
||||
To train, prepare a folder that contains 3-5 images that illustrate
|
||||
the object or concept. It is good to provide a variety of examples or
|
||||
poses to avoid overtraining the system. Format these images as PNG
|
||||
(preferred) or JPG. You do not need to resize or crop the images in
|
||||
advance, but for more control you may wish to do so.
|
||||
|
||||
Place the training images in a directory on the machine InvokeAI runs
|
||||
on. We recommend placing them in a subdirectory of the
|
||||
`text-inversion-training-data` folder located in the InvokeAI root
|
||||
directory, ordinarily `~/invokeai` (Linux/Mac), or
|
||||
`C:\Users\your_name\invokeai` (Windows). For example, to create an
|
||||
embedding for the "psychedelic" style, you'd place the training images
|
||||
into the directory
|
||||
`~/invokeai/text-inversion-training-data/psychedelic`.
|
||||
|
||||
## **Launching Training Using the Console Front End**
|
||||
|
||||
InvokeAI 2.3 and higher comes with a text console-based training front
|
||||
end. From within the `invoke.sh`/`invoke.bat` Invoke launcher script,
|
||||
start the front end by selecting choice (3):
|
||||
|
||||
```sh
|
||||
Do you want to generate images using the
|
||||
1: Browser-based UI
|
||||
2: Command-line interface
|
||||
3: Run textual inversion training
|
||||
4: Merge models (diffusers type only)
|
||||
5: Download and install models
|
||||
6: Change InvokeAI startup options
|
||||
7: Re-run the configure script to fix a broken install
|
||||
8: Open the developer console
|
||||
9: Update InvokeAI
|
||||
10: Command-line help
|
||||
Q: Quit
|
||||
|
||||
Please enter 1-10, Q: [1]
|
||||
```
|
||||
|
||||
From the command line, with the InvokeAI virtual environment active,
|
||||
you can launch the front end with the command `invokeai-ti --gui`.
|
||||
|
||||
This will launch a text-based front end that will look like this:
|
||||
|
||||
<figure markdown>
|
||||

|
||||
</figure>
|
||||
|
||||
The interface is keyboard-based. Move from field to field using
|
||||
control-N (^N) to move to the next field and control-P (^P) to the
|
||||
previous one. <Tab> and <shift-TAB> work as well. Once a field is
|
||||
active, use the cursor keys. In a checkbox group, use the up and down
|
||||
cursor keys to move from choice to choice, and <space> to select a
|
||||
choice. In a scrollbar, use the left and right cursor keys to increase
|
||||
and decrease the value of the scroll. In textfields, type the desired
|
||||
values.
|
||||
|
||||
The number of parameters may look intimidating, but in most cases the
|
||||
predefined defaults work fine. The red circled fields in the above
|
||||
illustration are the ones you will adjust most frequently.
|
||||
|
||||
### Model Name
|
||||
|
||||
This will list all the diffusers models that are currently
|
||||
installed. Select the one you wish to use as the basis for your
|
||||
embedding. Be aware that if you use an SD-1.X-based model for your
|
||||
training, you will only be able to use this embedding with other
|
||||
SD-1.X-based models. Similarly, if you train on SD-2.X, you will only
|
||||
be able to use the embeddings with models based on SD-2.X.
|
||||
|
||||
### Trigger Term
|
||||
|
||||
This is the prompt term you will use to trigger the embedding. Type a
single word or phrase you wish to use as the trigger, for example
"psychedelic" (without angle brackets). Within InvokeAI, you will then
be able to activate the trigger using the syntax `<psychedelic>`.
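
The relationship between the bare trigger term and the bracketed prompt syntax can be sketched in a few lines of Python. The helper name `trigger_token` is hypothetical and is not part of InvokeAI; it only illustrates the convention described above.

```python
def trigger_token(term):
    """Wrap a bare trigger term in angle brackets: 'psychedelic' -> '<psychedelic>'."""
    term = term.strip()
    # The front end expects the term WITHOUT brackets, so reject them here.
    if not term or "<" in term or ">" in term:
        raise ValueError("enter the trigger term without angle brackets")
    return f"<{term}>"


# Build a prompt that activates the embedding.
prompt = f"a plate of banana sushi in {trigger_token('psychedelic')} style"
print(prompt)
```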
|
||||
|
||||
### Initializer
|
||||
|
||||
This is a single character that is used internally during the training
|
||||
process as a placeholder for the trigger term. It defaults to "*" and
|
||||
can usually be left alone.
|
||||
|
||||
### Resume from last saved checkpoint
|
||||
|
||||
As training proceeds, textual inversion will write a series of
|
||||
intermediate files that can be used to resume training from where it
|
||||
was left off in the case of an interruption. This checkbox will be
|
||||
automatically selected if you provide a previously used trigger term
|
||||
and at least one checkpoint file is found on disk.
|
||||
|
||||
Note that as of 20 January 2023, resume does not seem to be working
|
||||
properly due to an issue with the upstream code.
|
||||
|
||||
### Data Training Directory
|
||||
|
||||
This is the location of the images to be used for training. When you
|
||||
select a trigger term like "my-trigger", the frontend will prepopulate
|
||||
this field with `~/invokeai/text-inversion-training-data/my-trigger`,
|
||||
but you can change the path to wherever you want.
|
||||
|
||||
### Output Destination Directory
|
||||
|
||||
This is the location of the logs, checkpoint files, and embedding
|
||||
files created during training. When you select a trigger term like
|
||||
"my-trigger", the frontend will prepopulate this field with
|
||||
`~/invokeai/text-inversion-output/my-trigger`, but you can change the
|
||||
path to wherever you want.
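
As a rough illustration of how the two default directories follow from the trigger term, here is a minimal Python sketch. The function `default_ti_dirs` and its `root` parameter are assumptions made for this example; the front end's actual code may differ.

```python
from pathlib import Path


def default_ti_dirs(trigger, root=Path("~/invokeai")):
    """Return (training data dir, output dir) for a trigger term.

    Mirrors the defaults described above; `root` stands in for the
    InvokeAI root directory and is an assumption of this sketch.
    """
    root = Path(root).expanduser()
    data_dir = root / "text-inversion-training-data" / trigger
    output_dir = root / "text-inversion-output" / trigger
    return data_dir, output_dir


data_dir, output_dir = default_ti_dirs("my-trigger")
print(data_dir)
print(output_dir)
```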
|
||||
|
||||
### Image resolution
|
||||
|
||||
The images in the training directory will be automatically scaled to
|
||||
the value you use here. For best results, you will want to use the
same resolution as the default of the underlying model (512 pixels for
SD-1.5, 768 for the larger version of SD-2.1).
|
||||
|
||||
### Center crop images
|
||||
|
||||
If this is selected, your images will be center cropped to make them
|
||||
square before resizing them to the desired resolution. Center cropping
|
||||
can indiscriminately cut off the top of subjects' heads for portrait
|
||||
aspect images, so if you have images like this, you may wish to use a
|
||||
photo editor to manually crop them to a square aspect ratio.
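
To see why portrait images lose the top and bottom of the frame, the sketch below computes a centered square crop box. It is only an illustration of center cropping in general, not the training script's own code; the function name `center_crop_box` is hypothetical. The returned tuple uses the `(left, upper, right, lower)` order that Pillow's `Image.crop` accepts.

```python
def center_crop_box(width, height):
    """Return (left, upper, right, lower) for the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


# A 512x768 portrait image loses 128 rows at the top and 128 at the bottom.
print(center_crop_box(512, 768))
```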
|
||||
|
||||
### Mixed precision
|
||||
|
||||
Select the floating point precision for the embedding. "no" will
result in full 32-bit precision, "fp16" will provide 16-bit
precision, and "bf16" will provide 16-bit bfloat16 precision (only available
when XFormers is used).
|
||||
|
||||
### Max training steps
|
||||
|
||||
The maximum number of steps the training run will take. Most
training sets will converge within 2000-3000 steps.
|
||||
|
||||
### Batch size
|
||||
|
||||
This adjusts how many training images are processed simultaneously in
each step. Higher values will cause the training process to run more
quickly, but use more memory. The default size will run on GPUs with
as little as 12 GB of VRAM.
|
||||
|
||||
### Learning rate
|
||||
|
||||
The rate at which the system adjusts its internal weights during
|
||||
training. Higher values risk overtraining (getting the same image each
|
||||
time), and lower values will take more steps to train a good
|
||||
model. The default of 0.0005 is conservative; you may wish to increase
|
||||
it to 0.005 to speed up training.
|
||||
|
||||
### Scale learning rate by number of GPUs, steps and batch size
|
||||
|
||||
If this is selected (the default), the system will scale the provided
learning rate according to the number of GPUs, steps, and batch size
to improve performance.
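
As a hedged illustration of what such scaling can look like, the sketch below follows the rule used by the HuggingFace textual inversion example script, which multiplies the base learning rate by the gradient accumulation steps, the batch size, and the number of processes (GPUs). Whether InvokeAI applies exactly the same formula is an assumption here, and the function name `scaled_learning_rate` is hypothetical.

```python
def scaled_learning_rate(base_lr, batch_size, accumulation_steps, num_gpus=1):
    """Scale the learning rate the way the HuggingFace example script does.

    Assumption: the InvokeAI front end follows the same rule.
    """
    return base_lr * accumulation_steps * batch_size * num_gpus


# Values taken from the command-line example in this document:
# learning rate 0.0005, batch size 8, 4 accumulation steps, one GPU.
# The result is roughly 0.016.
print(scaled_learning_rate(0.0005, batch_size=8, accumulation_steps=4))
```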
|
||||
|
||||
### Use xformers acceleration
|
||||
|
||||
This will activate XFormers memory-efficient attention. You need to
|
||||
have XFormers installed for this to have an effect.
|
||||
|
||||
### Learning rate scheduler
|
||||
|
||||
This adjusts how the learning rate changes over the course of
|
||||
training. The default "constant" means to use a constant learning rate
|
||||
for the entire training session. The other values scale the learning
|
||||
rate according to various formulas.
|
||||
|
||||
Only "constant" is supported by the XFormers library.
|
||||
|
||||
### Gradient accumulation steps
|
||||
|
||||
This is a parameter that allows you to use bigger batch sizes than
|
||||
your GPU's VRAM would ordinarily accommodate, at the cost of some
|
||||
performance.
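
The idea behind gradient accumulation can be shown with a toy model. In this sketch, which uses a one-parameter least-squares model rather than anything from the training script, the gradient of a batch of four samples is reproduced by averaging the gradients of two micro-batches of two samples each. All names are illustrative.

```python
def grad_mse(w, xs, ys):
    """Gradient of the mean squared error for the model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 0.5

# Gradient computed on the whole batch at once (needs memory for 4 samples).
full_batch_grad = grad_mse(w, xs, ys)

# Same gradient accumulated over 2 micro-batches (memory for 2 samples each).
accumulation_steps = 2
micro = len(xs) // accumulation_steps
accumulated_grad = 0.0
for i in range(accumulation_steps):
    chunk = slice(i * micro, (i + 1) * micro)
    accumulated_grad += grad_mse(w, xs[chunk], ys[chunk]) / accumulation_steps

print(full_batch_grad, accumulated_grad)
```

The weights are updated only once per group of micro-batches, which is why the technique trades some speed for lower peak memory use.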
|
||||
|
||||
### Warmup steps
|
||||
|
||||
If "constant_with_warmup" is selected in the learning rate scheduler,
|
||||
then this provides the number of warmup steps. Warmup steps have a
|
||||
very low learning rate, and are one way of preventing early
|
||||
overtraining.
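
A minimal sketch of a constant-with-warmup schedule follows. It assumes the common linear ramp used by the HuggingFace schedulers, where the learning rate multiplier rises from 0 to 1 over the warmup steps and then stays at 1; the function name is hypothetical.

```python
def constant_with_warmup_factor(step, warmup_steps):
    """Multiplier applied to the base learning rate at a given step."""
    if step < warmup_steps:
        # Linear ramp from 0 up to 1 during warmup.
        return step / max(1, warmup_steps)
    return 1.0


base_lr = 0.0005
for step in (0, 50, 100, 2000):
    print(step, base_lr * constant_with_warmup_factor(step, warmup_steps=100))
```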
|
||||
|
||||
## The training run
|
||||
|
||||
Start the training run by advancing to the OK button (bottom right)
and pressing `<enter>`. A series of progress messages will be displayed
as the training process proceeds. This may take an hour or two,
depending on settings and the speed of your system. Various log and
checkpoint files will be written into the output directory (ordinarily
`~/invokeai/text-inversion-output/my-trigger/`).
|
||||
|
||||
At the end of successful training, the system will copy the file
|
||||
`learned_embeds.bin` into the InvokeAI root directory's `embeddings`
|
||||
directory, using a subdirectory named after the trigger token. For
|
||||
example, if the trigger token was `psychedelic`, then look for the
|
||||
embeddings file in
|
||||
`~/invokeai/embeddings/psychedelic/learned_embeds.bin`
|
||||
|
||||
You may now launch InvokeAI and try out a prompt that uses the trigger
|
||||
term. For example `a plate of banana sushi in <psychedelic> style`.
|
||||
|
||||
## **Training with the Command-Line Script**
|
||||
|
||||
Training can also be done using a traditional command-line script. It
|
||||
can be launched from within the "developer's console", or from the
|
||||
command line after activating InvokeAI's virtual environment.
|
||||
|
||||
It accepts a large number of arguments, which can be summarized by
|
||||
passing the `--help` argument:
|
||||
|
||||
```sh
invokeai-ti --help
```
|
||||
|
||||
Typical usage is shown here:
|
||||
```sh
invokeai-ti \
    --model=stable-diffusion-1.5 \
    --resolution=512 \
    --learnable_property=style \
    --initializer_token='*' \
    --placeholder_token='<psychedelic>' \
    --train_data_dir=/home/lstein/invokeai/training-data/psychedelic \
    --output_dir=/home/lstein/invokeai/text-inversion-training/psychedelic \
    --scale_lr \
    --train_batch_size=8 \
    --gradient_accumulation_steps=4 \
    --max_train_steps=3000 \
    --learning_rate=0.0005 \
    --resume_from_checkpoint=latest \
    --lr_scheduler=constant \
    --mixed_precision=fp16 \
    --only_save_embeds
```
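
If you drive training from another Python program, a command like the one above can be assembled from a dictionary of options. This is a sketch only: `build_ti_command` is a hypothetical helper, and it uses just the flags that appear in the example above. It builds and prints the command without running it.

```python
import shlex


def build_ti_command(options):
    """Turn an options dict into an `invokeai-ti` argument list.

    True becomes a bare flag (e.g. --scale_lr); False and None are
    skipped; everything else becomes --name=value.
    """
    cmd = ["invokeai-ti"]
    for name, value in options.items():
        if value is True:
            cmd.append(f"--{name}")
        elif value is not False and value is not None:
            cmd.append(f"--{name}={value}")
    return cmd


cmd = build_ti_command({
    "model": "stable-diffusion-1.5",
    "resolution": 512,
    "placeholder_token": "<psychedelic>",
    "scale_lr": True,
    "max_train_steps": 3000,
    "learning_rate": 0.0005,
    "only_save_embeds": True,
})
# shlex.join quotes arguments such as '<psychedelic>' for safe pasting into a shell.
print(shlex.join(cmd))
# To actually launch training, pass `cmd` to subprocess.run(cmd, check=True).
```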
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### `Cannot load embedding for <trigger>. It was trained on a model with token dimension 1024, but the current model has token dimension 768`
|
||||
|
||||
Messages like this indicate you trained the embedding on a different base model than the currently selected one.

For example, in the error above, the training was done on SD2.1, whose text encoder uses 1024-dimensional tokens, but the embedding was used with SD1.5, which uses 768-dimensional tokens.
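
The mismatch can be reasoned about with a small lookup of token dimensions. The sketch below is illustrative and not part of InvokeAI; the mapping covers only the two dimensions named in the error message above, and the function name is hypothetical.

```python
# Token dimension of the text encoder for each model family.
TOKEN_DIM_TO_FAMILY = {768: "SD-1.X", 1024: "SD-2.X"}


def explain_token_dims(embedding_dim, model_dim):
    """Return None if the dimensions match, otherwise a readable explanation."""
    if embedding_dim == model_dim:
        return None
    trained = TOKEN_DIM_TO_FAMILY.get(embedding_dim, "an unknown model family")
    current = TOKEN_DIM_TO_FAMILY.get(model_dim, "an unknown model family")
    return (f"embedding was trained on {trained} (dimension {embedding_dim}), "
            f"but the current model is {current} (dimension {model_dim})")


print(explain_token_dims(1024, 768))
```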
|
||||
|
||||
## Reading
|
||||
|
||||
For more information on textual inversion, please see the following
|
||||
resources:
|
||||
|
||||
* The [textual inversion repository](https://github.com/rinongal/textual_inversion) and
|
||||
associated paper for details and limitations.
|
||||
* [HuggingFace's textual inversion training
|
||||
page](https://huggingface.co/docs/diffusers/training/text_inversion)
|
||||
* [HuggingFace example script
|
||||
documentation](https://github.com/huggingface/diffusers/tree/main/examples/textual_inversion)
|
||||
(Note that this script is similar, but not identical, to
`textual_inversion`, and produces embed files that are completely compatible.)
|
||||
|
||||
---
|
||||
|
||||
copyright (c) 2023, Lincoln Stein and the InvokeAI Development Team
|