Compare commits
61 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 90d37eac03 | |
| | 230de023ff | |
| | febf86dedf | |
| | 76ae17abac | |
| | 339ff4b464 | |
| | 00c0e487dd | |
| | 5c8dfa38be | |
| | acf85c66a5 | |
| | 3619918954 | |
| | 65b14683a8 | |
| | f4fc02a3da | |
| | c334170a93 | |
| | deab6c64fc | |
| | e1c9503951 | |
| | 9a21812bf5 | |
| | 347b5ce452 | |
| | b39029521b | |
| | 97b26f3de2 | |
| | e19a7a990d | |
| | 3e424e1046 | |
| | db20b4af9c | |
| | 44ff8f8531 | |
| | a8b794d7e0 | |
| | f868362ca8 | |
| | 8858f7e97c | |
| | 2db4969e18 | |
| | 2ecc1abf21 | |
| | 703bc9494a | |
| | e5ab07091d | |
| | 891678b656 | |
| | 39ea2a257c | |
| | 2d68eae16b | |
| | d65948c423 | |
| | 9910a0b004 | |
| | ff96358cb3 | |
| | edf471f655 | |
| | 5b02c8ca4a | |
| | e7688c53b8 | |
| | 87cada42db | |
| | 6fe67ee426 | |
| | 5fbc81885a | |
| | 25ba5451f2 | |
| | 138c9cf7a8 | |
| | 87981306a3 | |
| | f7893b3ea9 | |
| | 87395fe6fe | |
| | 15f876c66c | |
| | 522c35ac5b | |
| | bb2d6d640f | |
| | 2412d8dec1 | |
| | 2ab5a43663 | |
| | 0ec3d6c10a | |
| | d208e1b0f5 | |
| | 8a6ba6a212 | |
| | b793d69ff3 | |
| | 54f55471df | |
| | cec7fb7dc6 | |
| | b0b82efffe | |
| | e599604294 | |
| | 57a3ea9d7b | |
| | a3a50bb886 | |
@ -1,3 +0,0 @@
|
||||
*
|
||||
!environment*.yml
|
||||
!docker-build
|
102
.github/ISSUE_TEMPLATE/BUG_REPORT.yml
vendored
@ -1,102 +0,0 @@
|
||||
name: 🐞 Bug Report
|
||||
|
||||
description: File a bug report
|
||||
|
||||
title: '[bug]: '
|
||||
|
||||
labels: ['bug']
|
||||
|
||||
# assignees:
|
||||
# - moderator_bot
|
||||
# - lstein
|
||||
|
||||
body:
|
||||
- type: markdown
|
||||
attributes:
|
||||
value: |
|
||||
Thanks for taking the time to fill out this Bug Report!
|
||||
|
||||
- type: checkboxes
|
||||
attributes:
|
||||
label: Is there an existing issue for this?
|
||||
description: |
|
||||
Please use the [search function](https://github.com/invoke-ai/InvokeAI/issues?q=is%3Aissue+is%3Aopen+label%3Abug)
|
||||
first to see if an issue already exists for the bug you encountered.
|
||||
options:
|
||||
- label: I have searched the existing issues
|
||||
required: true
|
||||
|
||||
- type: markdown
|
||||
attributes:
|
||||
value: __Describe your environment__
|
||||
|
||||
- type: dropdown
|
||||
id: os_dropdown
|
||||
attributes:
|
||||
label: OS
|
||||
description: Which operating system did you use when the bug occurred
|
||||
multiple: false
|
||||
options:
|
||||
- 'Linux'
|
||||
- 'Windows'
|
||||
- 'macOS'
|
||||
validations:
|
||||
required: true
|
||||
|
||||
- type: dropdown
|
||||
id: gpu_dropdown
|
||||
attributes:
|
||||
label: GPU
|
||||
description: Which kind of Graphic-Adapter is your System using
|
||||
multiple: false
|
||||
options:
|
||||
- 'cuda'
|
||||
- 'amd'
|
||||
- 'mps'
|
||||
- 'cpu'
|
||||
validations:
|
||||
required: true
|
||||
|
||||
- type: input
|
||||
id: vram
|
||||
attributes:
|
||||
label: VRAM
|
||||
description: Size of the VRAM if known
|
||||
placeholder: 8GB
|
||||
validations:
|
||||
required: false
|
||||
|
||||
- type: textarea
|
||||
id: what-happened
|
||||
attributes:
|
||||
label: What happened?
|
||||
description: |
|
||||
Briefly describe what happened, what you expected to happen and how to reproduce this bug.
|
||||
placeholder: When using the webinterface and right-clicking on button X instead of the popup-menu there error Y appears
|
||||
validations:
|
||||
required: true
|
||||
|
||||
- type: textarea
|
||||
attributes:
|
||||
label: Screenshots
|
||||
description: If applicable, add screenshots to help explain your problem
|
||||
placeholder: this is what the result looked like <screenshot>
|
||||
validations:
|
||||
required: false
|
||||
|
||||
- type: textarea
|
||||
attributes:
|
||||
label: Additional context
|
||||
description: Add any other context about the problem here
|
||||
placeholder: Only happens when there is full moon and Friday the 13th on Christmas Eve 🎅🏻
|
||||
validations:
|
||||
required: false
|
||||
|
||||
- type: input
|
||||
id: contact
|
||||
attributes:
|
||||
label: Contact Details
|
||||
description: __OPTIONAL__ How can we get in touch with you if we need more info (besides this issue)?
|
||||
placeholder: ex. email@example.com, discordname, twitter, ...
|
||||
validations:
|
||||
required: false
|
56
.github/ISSUE_TEMPLATE/FEATURE_REQUEST.yml
vendored
@ -1,56 +0,0 @@
|
||||
name: Feature Request
|
||||
description: Commit a idea or Request a new feature
|
||||
title: '[enhancement]: '
|
||||
labels: ['enhancement']
|
||||
# assignees:
|
||||
# - lstein
|
||||
# - tildebyte
|
||||
body:
|
||||
- type: markdown
|
||||
attributes:
|
||||
value: |
|
||||
Thanks for taking the time to fill out this Feature request!
|
||||
|
||||
- type: checkboxes
|
||||
attributes:
|
||||
label: Is there an existing issue for this?
|
||||
description: |
|
||||
Please make use of the [search function](https://github.com/invoke-ai/InvokeAI/labels/enhancement)
|
||||
to see if a similar issue already exists for the feature you want to request
|
||||
options:
|
||||
- label: I have searched the existing issues
|
||||
required: true
|
||||
|
||||
- type: input
|
||||
id: contact
|
||||
attributes:
|
||||
label: Contact Details
|
||||
description: __OPTIONAL__ How could we get in touch with you if we need more info (besides this issue)?
|
||||
placeholder: ex. email@example.com, discordname, twitter, ...
|
||||
validations:
|
||||
required: false
|
||||
|
||||
- type: textarea
|
||||
id: whatisexpected
|
||||
attributes:
|
||||
label: What should this feature add?
|
||||
description: Please try to explain the functionality this feature should add
|
||||
placeholder: |
|
||||
Instead of one huge textfield, it would be nice to have forms for bug-reports, feature-requests, ...
|
||||
Great benefits with automatic labeling, assigning and other functionalitys not available in that form
|
||||
via old-fashioned markdown-templates. I would also love to see the use of a moderator bot 🤖 like
|
||||
https://github.com/marketplace/actions/issue-moderator-with-commands to auto close old issues and other things
|
||||
validations:
|
||||
required: true
|
||||
|
||||
- type: textarea
|
||||
attributes:
|
||||
label: Alternatives
|
||||
description: Describe alternatives you've considered
|
||||
placeholder: A clear and concise description of any alternative solutions or features you've considered.
|
||||
|
||||
- type: textarea
|
||||
attributes:
|
||||
label: Aditional Content
|
||||
description: Add any other context or screenshots about the feature request here.
|
||||
placeholder: This is a Mockup of the design how I imagine it <screenshot>
|
36
.github/ISSUE_TEMPLATE/bug_report.md
vendored
Normal file
@ -0,0 +1,36 @@
|
||||
---
|
||||
name: Bug report
|
||||
about: Create a report to help us improve
|
||||
title: ''
|
||||
labels: ''
|
||||
assignees: ''
|
||||
|
||||
---
|
||||
|
||||
**Describe your environment**
|
||||
- GPU: [cuda/amd/mps/cpu]
|
||||
- VRAM: [if known]
|
||||
- CPU arch: [x86/arm]
|
||||
- OS: [Linux/Windows/macOS]
|
||||
- Python: [Anaconda/miniconda/miniforge/pyenv/other (explain)]
|
||||
- Branch: [if `git status` says anything other than "On branch main" paste it here]
|
||||
- Commit: [run `git show` and paste the line that starts with "Merge" here]
|
||||
|
||||
**Describe the bug**
|
||||
A clear and concise description of what the bug is.
|
||||
|
||||
**To Reproduce**
|
||||
Steps to reproduce the behavior:
|
||||
1. Go to '...'
|
||||
2. Click on '....'
|
||||
3. Scroll down to '....'
|
||||
4. See error
|
||||
|
||||
**Expected behavior**
|
||||
A clear and concise description of what you expected to happen.
|
||||
|
||||
**Screenshots**
|
||||
If applicable, add screenshots to help explain your problem.
|
||||
|
||||
**Additional context**
|
||||
Add any other context about the problem here.
|
14
.github/ISSUE_TEMPLATE/config.yml
vendored
@ -1,14 +0,0 @@
|
||||
blank_issues_enabled: false
|
||||
contact_links:
|
||||
- name: Project-Documentation
|
||||
url: https://invoke-ai.github.io/InvokeAI/
|
||||
about: Should be your first place to go when looking for manuals/FAQs regarding our InvokeAI Toolkit
|
||||
- name: Discord
|
||||
url: https://discord.gg/ZmtBAhwWhy
|
||||
about: Our Discord Community could maybe help you out via live-chat
|
||||
- name: GitHub Community Support
|
||||
url: https://github.com/orgs/community/discussions
|
||||
about: Please ask and answer questions regarding the GitHub Platform here.
|
||||
- name: GitHub Security Bug Bounty
|
||||
url: https://bounty.github.com/
|
||||
about: Please report security vulnerabilities of the GitHub Platform here.
|
20
.github/ISSUE_TEMPLATE/feature_request.md
vendored
Normal file
@ -0,0 +1,20 @@
|
||||
---
|
||||
name: Feature request
|
||||
about: Suggest an idea for this project
|
||||
title: ''
|
||||
labels: ''
|
||||
assignees: ''
|
||||
|
||||
---
|
||||
|
||||
**Is your feature request related to a problem? Please describe.**
|
||||
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
|
||||
|
||||
**Describe the solution you'd like**
|
||||
A clear and concise description of what you want to happen.
|
||||
|
||||
**Describe alternatives you've considered**
|
||||
A clear and concise description of any alternative solutions or features you've considered.
|
||||
|
||||
**Additional context**
|
||||
Add any other context or screenshots about the feature request here.
|
42
.github/workflows/build-container.yml
vendored
@ -1,42 +0,0 @@
|
||||
# Building the Image without pushing to confirm it is still buildable
|
||||
# confirming functionality would unfortunately need way more resources
|
||||
name: build container image
|
||||
on:
|
||||
push:
|
||||
branches:
|
||||
- 'main'
|
||||
- 'development'
|
||||
pull_request:
|
||||
branches:
|
||||
- 'main'
|
||||
- 'development'
|
||||
|
||||
jobs:
|
||||
docker:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: prepare docker-tag
|
||||
env:
|
||||
repository: ${{ github.repository }}
|
||||
run: echo "dockertag=${repository,,}" >> $GITHUB_ENV
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v3
|
||||
- name: Set up QEMU
|
||||
uses: docker/setup-qemu-action@v2
|
||||
- name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@v2
|
||||
- name: Cache Docker layers
|
||||
uses: actions/cache@v2
|
||||
with:
|
||||
path: /tmp/.buildx-cache
|
||||
key: buildx-${{ hashFiles('docker-build/Dockerfile') }}
|
||||
- name: Build container
|
||||
uses: docker/build-push-action@v3
|
||||
with:
|
||||
context: .
|
||||
file: docker-build/Dockerfile
|
||||
platforms: linux/amd64
|
||||
push: false
|
||||
tags: ${{ env.dockertag }}:latest
|
||||
cache-from: type=local,src=/tmp/.buildx-cache
|
||||
cache-to: type=local,dest=/tmp/.buildx-cache
|
25
.github/workflows/create-caches.yml
vendored
@ -54,10 +54,27 @@ jobs:
|
||||
[[ -d models/ldm/stable-diffusion-v1 ]] \
|
||||
|| mkdir -p models/ldm/stable-diffusion-v1
|
||||
[[ -r models/ldm/stable-diffusion-v1/model.ckpt ]] \
|
||||
|| curl \
|
||||
-H "Authorization: Bearer ${{ secrets.HUGGINGFACE_TOKEN }}" \
|
||||
-o models/ldm/stable-diffusion-v1/model.ckpt \
|
||||
-L https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4.ckpt
|
||||
|| curl -o models/ldm/stable-diffusion-v1/model.ckpt ${{ secrets.SD_V1_4_URL }}
|
||||
|
||||
- name: Use cached Conda Environment
|
||||
uses: actions/cache@v3
|
||||
env:
|
||||
cache-name: cache-conda-env-${{ env.CONDA_ENV_NAME }}
|
||||
conda-env-file: ${{ matrix.environment-file }}
|
||||
with:
|
||||
path: ${{ env.CONDA_ROOT }}/envs/${{ env.CONDA_ENV_NAME }}
|
||||
key: ${{ env.cache-name }}
|
||||
restore-keys: ${{ env.cache-name }}-${{ runner.os }}-${{ hashFiles(env.conda-env-file) }}
|
||||
|
||||
- name: Use cached Conda Packages
|
||||
uses: actions/cache@v3
|
||||
env:
|
||||
cache-name: cache-conda-env-${{ env.CONDA_ENV_NAME }}
|
||||
conda-env-file: ${{ matrix.environment-file }}
|
||||
with:
|
||||
path: ${{ env.CONDA_PKGS_DIR }}
|
||||
key: ${{ env.cache-name }}
|
||||
restore-keys: ${{ env.cache-name }}-${{ runner.os }}-${{ hashFiles(env.conda-env-file) }}
|
||||
|
||||
- name: Activate Conda Env
|
||||
uses: conda-incubator/setup-miniconda@v2
|
||||
|
115
.github/workflows/test-invoke-conda.yml
vendored
@ -1,4 +1,4 @@
|
||||
name: Test invoke.py
|
||||
name: Test Invoke with Conda
|
||||
on:
|
||||
push:
|
||||
branches:
|
||||
@ -11,57 +11,31 @@ on:
|
||||
- 'development'
|
||||
|
||||
jobs:
|
||||
matrix:
|
||||
os_matrix:
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
stable-diffusion-model:
|
||||
# - 'https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4.ckpt'
|
||||
- 'https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.ckpt'
|
||||
os:
|
||||
- ubuntu-latest
|
||||
- macOS-12
|
||||
os: [ubuntu-latest, macos-latest]
|
||||
include:
|
||||
- os: ubuntu-latest
|
||||
environment-file: environment.yml
|
||||
default-shell: bash -l {0}
|
||||
- os: macOS-12
|
||||
- os: macos-latest
|
||||
environment-file: environment-mac.yml
|
||||
default-shell: bash -l {0}
|
||||
# - stable-diffusion-model: https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4.ckpt
|
||||
# stable-diffusion-model-dl-path: models/ldm/stable-diffusion-v1/sd-v1-4.ckpt
|
||||
# stable-diffusion-model-switch: stable-diffusion-1.4
|
||||
- stable-diffusion-model: https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.ckpt
|
||||
stable-diffusion-model-dl-path: models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
|
||||
stable-diffusion-model-switch: stable-diffusion-1.5
|
||||
name: ${{ matrix.os }} with ${{ matrix.stable-diffusion-model-switch }}
|
||||
name: Test invoke.py on ${{ matrix.os }} with conda
|
||||
runs-on: ${{ matrix.os }}
|
||||
env:
|
||||
CONDA_ENV_NAME: invokeai
|
||||
defaults:
|
||||
run:
|
||||
shell: ${{ matrix.default-shell }}
|
||||
steps:
|
||||
- name: Checkout sources
|
||||
id: checkout-sources
|
||||
uses: actions/checkout@v3
|
||||
|
||||
- name: create models.yaml from example
|
||||
run: cp configs/models.yaml.example configs/models.yaml
|
||||
|
||||
- name: Use cached conda packages
|
||||
id: use-cached-conda-packages
|
||||
uses: actions/cache@v3
|
||||
with:
|
||||
path: ~/conda_pkgs_dir
|
||||
key: conda-pkgs-${{ runner.os }}-${{ runner.arch }}-${{ hashFiles(matrix.environment-file) }}
|
||||
|
||||
- name: Activate Conda Env
|
||||
id: activate-conda-env
|
||||
- name: setup miniconda
|
||||
uses: conda-incubator/setup-miniconda@v2
|
||||
with:
|
||||
activate-environment: ${{ env.CONDA_ENV_NAME }}
|
||||
environment-file: ${{ matrix.environment-file }}
|
||||
auto-activate-base: false
|
||||
auto-update-conda: false
|
||||
miniconda-version: latest
|
||||
|
||||
- name: set test prompt to main branch validation
|
||||
@ -74,40 +48,79 @@ jobs:
|
||||
|
||||
- name: set test prompt to Pull Request validation
|
||||
if: ${{ github.ref != 'refs/heads/main' && github.ref != 'refs/heads/development' }}
|
||||
run: echo "TEST_PROMPTS=tests/validate_pr_prompt.txt" >> $GITHUB_ENV
|
||||
run: echo "TEST_PROMPTS=tests/pr_prompt.txt" >> $GITHUB_ENV
|
||||
|
||||
- name: Download ${{ matrix.stable-diffusion-model-switch }}
|
||||
id: download-stable-diffusion-model
|
||||
- name: set conda environment name
|
||||
run: echo "CONDA_ENV_NAME=invokeai" >> $GITHUB_ENV
|
||||
|
||||
- name: Use Cached Stable Diffusion v1.4 Model
|
||||
id: cache-sd-v1-4
|
||||
uses: actions/cache@v3
|
||||
env:
|
||||
cache-name: cache-sd-v1-4
|
||||
with:
|
||||
path: models/ldm/stable-diffusion-v1/model.ckpt
|
||||
key: ${{ env.cache-name }}
|
||||
restore-keys: ${{ env.cache-name }}
|
||||
|
||||
- name: Download Stable Diffusion v1.4 Model
|
||||
if: ${{ steps.cache-sd-v1-4.outputs.cache-hit != 'true' }}
|
||||
run: |
|
||||
[[ -d models/ldm/stable-diffusion-v1 ]] \
|
||||
|| mkdir -p models/ldm/stable-diffusion-v1
|
||||
curl \
|
||||
-H "Authorization: Bearer ${{ secrets.HUGGINGFACE_TOKEN }}" \
|
||||
-o ${{ matrix.stable-diffusion-model-dl-path }} \
|
||||
-L ${{ matrix.stable-diffusion-model }}
|
||||
[[ -r models/ldm/stable-diffusion-v1/model.ckpt ]] \
|
||||
|| curl -o models/ldm/stable-diffusion-v1/model.ckpt ${{ secrets.SD_V1_4_URL }}
|
||||
|
||||
- name: Use cached Conda Environment
|
||||
uses: actions/cache@v3
|
||||
env:
|
||||
cache-name: cache-conda-env-${{ env.CONDA_ENV_NAME }}
|
||||
conda-env-file: ${{ matrix.environment-file }}
|
||||
with:
|
||||
path: ${{ env.CONDA }}/envs/${{ env.CONDA_ENV_NAME }}
|
||||
key: env-${{ env.cache-name }}-${{ runner.os }}-${{ hashFiles(env.conda-env-file) }}
|
||||
|
||||
- name: Use cached Conda Packages
|
||||
uses: actions/cache@v3
|
||||
env:
|
||||
cache-name: cache-conda-pkgs-${{ env.CONDA_ENV_NAME }}
|
||||
conda-env-file: ${{ matrix.environment-file }}
|
||||
with:
|
||||
path: ${{ env.CONDA_PKGS_DIR }}
|
||||
key: pkgs-${{ env.cache-name }}-${{ runner.os }}-${{ hashFiles(env.conda-env-file) }}
|
||||
|
||||
- name: Activate Conda Env
|
||||
uses: conda-incubator/setup-miniconda@v2
|
||||
with:
|
||||
activate-environment: ${{ env.CONDA_ENV_NAME }}
|
||||
environment-file: ${{ matrix.environment-file }}
|
||||
|
||||
- name: Use Cached Huggingface and Torch models
|
||||
id: cache-hugginface-torch
|
||||
uses: actions/cache@v3
|
||||
env:
|
||||
cache-name: cache-hugginface-torch
|
||||
with:
|
||||
path: ~/.cache
|
||||
key: ${{ env.cache-name }}
|
||||
restore-keys: |
|
||||
${{ env.cache-name }}-${{ hashFiles('scripts/preload_models.py') }}
|
||||
|
||||
- name: run preload_models.py
|
||||
id: run-preload-models
|
||||
run: |
|
||||
python scripts/preload_models.py \
|
||||
--no-interactive
|
||||
run: python scripts/preload_models.py
|
||||
|
||||
- name: Run the tests
|
||||
id: run-tests
|
||||
run: |
|
||||
time python scripts/invoke.py \
|
||||
--model ${{ matrix.stable-diffusion-model-switch }} \
|
||||
--from_file ${{ env.TEST_PROMPTS }}
|
||||
|
||||
- name: export conda env
|
||||
id: export-conda-env
|
||||
run: |
|
||||
mkdir -p outputs/img-samples
|
||||
conda env export --name ${{ env.CONDA_ENV_NAME }} > outputs/img-samples/environment-${{ runner.os }}-${{ runner.arch }}.yml
|
||||
conda env export --name ${{ env.CONDA_ENV_NAME }} > outputs/img-samples/environment-${{ runner.os }}.yml
|
||||
|
||||
- name: Archive results
|
||||
id: archive-results
|
||||
uses: actions/upload-artifact@v3
|
||||
with:
|
||||
name: results_${{ matrix.os }}_${{ matrix.stable-diffusion-model-switch }}
|
||||
name: results_${{ matrix.os }}
|
||||
path: outputs/img-samples
|
||||
|
13
.gitignore
vendored
@ -1,10 +1,7 @@
|
||||
# ignore default image save location and model symbolic link
|
||||
outputs/
|
||||
models/ldm/stable-diffusion-v1/model.ckpt
|
||||
ldm/invoke/restoration/codeformer/weights
|
||||
# ignore user models config
|
||||
configs/models.user.yaml
|
||||
config/models.user.yml
|
||||
**/restoration/codeformer/weights
|
||||
|
||||
# ignore the Anaconda/Miniconda installer used while building Docker image
|
||||
anaconda.sh
|
||||
@ -198,13 +195,7 @@ checkpoints
|
||||
.scratch/
|
||||
.vscode/
|
||||
gfpgan/
|
||||
models/ldm/stable-diffusion-v1/*.sha256
|
||||
models/ldm/stable-diffusion-v1/model.sha256
|
||||
|
||||
# GFPGAN model files
|
||||
gfpgan/
|
||||
|
||||
# config file (will be created by installer)
|
||||
configs/models.yaml
|
||||
|
||||
# weights (will be created by installer)
|
||||
models/ldm/stable-diffusion-v1/*.ckpt
|
17
README.md
@ -2,7 +2,7 @@
|
||||
|
||||
# InvokeAI: A Stable Diffusion Toolkit
|
||||
|
||||
_Formerly known as lstein/stable-diffusion_
|
||||
_Formally known as lstein/stable-diffusion_
|
||||
|
||||

|
||||
|
||||
@ -42,7 +42,7 @@ generation process. It runs on Windows, Mac and Linux machines, with
|
||||
GPU cards with as little as 4 GB of RAM. It provides both a polished
|
||||
Web interface (see below), and an easy-to-use command-line interface.
|
||||
|
||||
**Quick links**: [<a href="https://discord.gg/ZmtBAhwWhy">Discord Server</a>] [<a href="https://invoke-ai.github.io/InvokeAI/">Documentation and Tutorials</a>] [<a href="https://github.com/invoke-ai/InvokeAI/">Code and Downloads</a>] [<a href="https://github.com/invoke-ai/InvokeAI/issues">Bug Reports</a>] [<a href="https://github.com/invoke-ai/InvokeAI/discussions">Discussion, Ideas & Q&A</a>]
|
||||
**Quick links**: [<a href="https://discord.gg/NwVCmKwY">Discord Server</a>] [<a href="https://invoke-ai.github.io/InvokeAI/">Documentation and Tutorials</a>] [<a href="https://github.com/invoke-ai/InvokeAI/">Code and Downloads</a>] [<a href="https://github.com/invoke-ai/InvokeAI/issues">Bug Reports</a>] [<a href="https://github.com/invoke-ai/InvokeAI/discussions">Discussion, Ideas & Q&A</a>]
|
||||
|
||||
<div align="center"><img src="docs/assets/invoke-web-server-1.png" width=640></div>
|
||||
|
||||
@ -133,19 +133,6 @@ you can try starting `invoke.py` with the `--precision=float32` flag:
|
||||
|
||||
### Latest Changes
|
||||
|
||||
### v2.1.0 major changes <small>(2 November 2022)</small>
|
||||
|
||||
- [Inpainting](https://invoke-ai.github.io/InvokeAI/features/INPAINTING/) support in the WebGUI
|
||||
- Greatly improved navigation and user experience in the [WebGUI](https://invoke-ai.github.io/InvokeAI/features/WEB/)
|
||||
- The prompt syntax has been enhanced with [prompt weighting, cross-attention and prompt merging](https://invoke-ai.github.io/InvokeAI/features/PROMPTS/).
|
||||
- You can now load [multiple models and switch among them quickly](https://docs.google.com/presentation/d/1WywGA1rny7bpFh7CLSdTr4nNpVKdlUeT0Bj0jCsILyU/edit?usp=sharing) without leaving the CLI.
|
||||
- The installation process (via `scripts/preload_models.py`) now lets you select among several popular [Stable Diffusion models](https://invoke-ai.github.io/InvokeAI/installation/INSTALLING_MODELS/) and downloads and installs them on your behalf. Among other models, this script will install the current Stable Diffusion 1.5 model as well as a StabilityAI variable autoencoder (VAE) which improves face generation.
|
||||
- Tired of struggling with photoeditors to get the masked region of for inpainting just right? Let the AI make the mask for you using [text masking](https://docs.google.com/presentation/d/1pWoY510hCVjz0M6X9CBbTznZgW2W5BYNKrmZm7B45q8/edit#slide=id.p). This feature allows you to specify the part of the image to paint over using just English-language phrases.
|
||||
- Tired of seeing the head of your subjects cropped off? Uncrop them in the CLI with the [outcrop feature](https://invoke-ai.github.io/InvokeAI/features/OUTPAINTING/#outcrop).
|
||||
- Tired of seeing your subject's bodies duplicated or mangled when generating larger-dimension images? Check out the `--hires` option in the CLI, or select the corresponding toggle in the WebGUI.
|
||||
- We now support textual inversion and fine-tune .bin styles and subjects from the Hugging Face archive of [SD Concepts](https://huggingface.co/sd-concepts-library). Load the .bin file using the `--embedding_path` option. (The next version will support merging and loading of multiple simultaneous models).
|
||||
<a href="https://invoke-ai.github.io/InvokeAI/CHANGELOG/">Complete Changelog</a>
|
||||
|
||||
- v2.0.1 (13 October 2022)
|
||||
- fix noisy images at high step count when using k* samplers
|
||||
- dream.py script now calls invoke.py module directly rather than
|
||||
|
822
backend/server.py
Normal file
@ -0,0 +1,822 @@
|
||||
import mimetypes
|
||||
import transformers
|
||||
import json
|
||||
import os
|
||||
import traceback
|
||||
import eventlet
|
||||
import glob
|
||||
import shlex
|
||||
import math
|
||||
import shutil
|
||||
import sys
|
||||
|
||||
sys.path.append(".")
|
||||
|
||||
from argparse import ArgumentTypeError
|
||||
from modules.create_cmd_parser import create_cmd_parser
|
||||
|
||||
parser = create_cmd_parser()
|
||||
opt = parser.parse_args()
|
||||
|
||||
|
||||
from flask_socketio import SocketIO
|
||||
from flask import Flask, send_from_directory, url_for, jsonify
|
||||
from pathlib import Path
|
||||
from PIL import Image
|
||||
from pytorch_lightning import logging
|
||||
from threading import Event
|
||||
from uuid import uuid4
|
||||
from send2trash import send2trash
|
||||
|
||||
|
||||
from ldm.generate import Generate
|
||||
from ldm.invoke.restoration import Restoration
|
||||
from ldm.invoke.pngwriter import PngWriter, retrieve_metadata
|
||||
from ldm.invoke.args import APP_ID, APP_VERSION, calculate_init_img_hash
|
||||
from ldm.invoke.conditioning import split_weighted_subprompts
|
||||
|
||||
from modules.parameters import parameters_to_command
|
||||
|
||||
|
||||
"""
|
||||
USER CONFIG
|
||||
"""
|
||||
if opt.cors and "*" in opt.cors:
|
||||
raise ArgumentTypeError('"*" is not an allowed CORS origin')
|
||||
|
||||
|
||||
output_dir = "outputs/" # Base output directory for images
|
||||
host = opt.host # Web & socket.io host
|
||||
port = opt.port # Web & socket.io port
|
||||
verbose = opt.verbose # enables copious socket.io logging
|
||||
precision = opt.precision
|
||||
free_gpu_mem = opt.free_gpu_mem
|
||||
embedding_path = opt.embedding_path
|
||||
additional_allowed_origins = (
|
||||
opt.cors if opt.cors else []
|
||||
) # additional CORS allowed origins
|
||||
model = "stable-diffusion-1.4"
|
||||
|
||||
"""
|
||||
END USER CONFIG
|
||||
"""
|
||||
|
||||
|
||||
print("* Initializing, be patient...\n")
|
||||
|
||||
|
||||
"""
|
||||
SERVER SETUP
|
||||
"""
|
||||
|
||||
|
||||
# fix missing mimetypes on windows due to registry wonkiness
|
||||
mimetypes.add_type("application/javascript", ".js")
|
||||
mimetypes.add_type("text/css", ".css")
|
||||
|
||||
app = Flask(__name__, static_url_path="", static_folder="../frontend/dist/")
|
||||
|
||||
|
||||
app.config["OUTPUTS_FOLDER"] = "../outputs"
|
||||
|
||||
|
||||
@app.route("/outputs/<path:filename>")
|
||||
def outputs(filename):
|
||||
return send_from_directory(app.config["OUTPUTS_FOLDER"], filename)
|
||||
|
||||
|
||||
@app.route("/", defaults={"path": ""})
|
||||
def serve(path):
|
||||
return send_from_directory(app.static_folder, "index.html")
|
||||
|
||||
|
||||
logger = True if verbose else False
|
||||
engineio_logger = True if verbose else False
|
||||
|
||||
# default 1,000,000, needs to be higher for socketio to accept larger images
|
||||
max_http_buffer_size = 10000000
|
||||
|
||||
cors_allowed_origins = [f"http://{host}:{port}"] + additional_allowed_origins
|
||||
|
||||
socketio = SocketIO(
|
||||
app,
|
||||
logger=logger,
|
||||
engineio_logger=engineio_logger,
|
||||
max_http_buffer_size=max_http_buffer_size,
|
||||
cors_allowed_origins=cors_allowed_origins,
|
||||
ping_interval=(50, 50),
|
||||
ping_timeout=60,
|
||||
)
|
||||
|
||||
|
||||
"""
|
||||
END SERVER SETUP
|
||||
"""
|
||||
|
||||
|
||||
"""
|
||||
APP SETUP
|
||||
"""
|
||||
|
||||
|
||||
class CanceledException(Exception):
|
||||
pass
|
||||
|
||||
|
||||
try:
|
||||
gfpgan, codeformer, esrgan = None, None, None
|
||||
from ldm.invoke.restoration.base import Restoration
|
||||
|
||||
restoration = Restoration()
|
||||
gfpgan, codeformer = restoration.load_face_restore_models()
|
||||
esrgan = restoration.load_esrgan()
|
||||
|
||||
# coreformer.process(self, image, strength, device, seed=None, fidelity=0.75)
|
||||
|
||||
except (ModuleNotFoundError, ImportError):
|
||||
print(traceback.format_exc(), file=sys.stderr)
|
||||
print(">> You may need to install the ESRGAN and/or GFPGAN modules")
|
||||
|
||||
canceled = Event()
|
||||
|
||||
# reduce logging outputs to error
|
||||
transformers.logging.set_verbosity_error()
|
||||
logging.getLogger("pytorch_lightning").setLevel(logging.ERROR)
|
||||
|
||||
# Initialize and load model
|
||||
generate = Generate(
|
||||
model,
|
||||
precision=precision,
|
||||
embedding_path=embedding_path,
|
||||
)
|
||||
generate.free_gpu_mem = free_gpu_mem
|
||||
generate.load_model()
|
||||
|
||||
|
||||
# location for "finished" images
|
||||
result_path = os.path.join(output_dir, "img-samples/")
|
||||
|
||||
# temporary path for intermediates
|
||||
intermediate_path = os.path.join(result_path, "intermediates/")
|
||||
|
||||
# path for user-uploaded init images and masks
|
||||
init_image_path = os.path.join(result_path, "init-images/")
|
||||
mask_image_path = os.path.join(result_path, "mask-images/")
|
||||
|
||||
# txt log
|
||||
log_path = os.path.join(result_path, "invoke_log.txt")
|
||||
|
||||
# make all output paths
|
||||
[
|
||||
os.makedirs(path, exist_ok=True)
|
||||
for path in [result_path, intermediate_path, init_image_path, mask_image_path]
|
||||
]
|
||||
|
||||
|
||||
"""
|
||||
END APP SETUP
|
||||
"""
|
||||
|
||||
|
||||
"""
|
||||
SOCKET.IO LISTENERS
|
||||
"""
|
||||
|
||||
|
||||
@socketio.on("requestSystemConfig")
|
||||
def handle_request_capabilities():
|
||||
print(f">> System config requested")
|
||||
config = get_system_config()
|
||||
socketio.emit("systemConfig", config)
|
||||
|
||||
|
||||
@socketio.on("requestImages")
|
||||
def handle_request_images(page=1, offset=0, last_mtime=None):
|
||||
chunk_size = 50
|
||||
|
||||
if last_mtime:
|
||||
print(f">> Latest images requested")
|
||||
else:
|
||||
print(
|
||||
f">> Page {page} of images requested (page size {chunk_size} offset {offset})"
|
||||
)
|
||||
|
||||
paths = glob.glob(os.path.join(result_path, "*.png"))
|
||||
sorted_paths = sorted(paths, key=lambda x: os.path.getmtime(x), reverse=True)
|
||||
|
||||
if last_mtime:
|
||||
image_paths = filter(lambda x: os.path.getmtime(x) > last_mtime, sorted_paths)
|
||||
else:
|
||||
|
||||
image_paths = sorted_paths[
|
||||
slice(chunk_size * (page - 1) + offset, chunk_size * page + offset)
|
||||
]
|
||||
page = page + 1
|
||||
|
||||
image_array = []
|
||||
|
||||
for path in image_paths:
|
||||
metadata = retrieve_metadata(path)
|
||||
image_array.append(
|
||||
{
|
||||
"url": path,
|
||||
"mtime": os.path.getmtime(path),
|
||||
"metadata": metadata["sd-metadata"],
|
||||
}
|
||||
)
|
||||
|
||||
socketio.emit(
|
||||
"galleryImages",
|
||||
{
|
||||
"images": image_array,
|
||||
"nextPage": page,
|
||||
"offset": offset,
|
||||
"onlyNewImages": True if last_mtime else False,
|
||||
},
|
||||
)
|
||||
|
||||
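The page slice used in `handle_request_images` above can be checked in isolation. This is a minimal sketch and not code from this changeset: `page_window` is a hypothetical helper name, and it assumes only what the handler itself shows, namely 1-based `page` numbers, a fixed `chunk_size`, and an `offset` that shifts the whole window.

```python
def page_window(items, page=1, offset=0, chunk_size=50):
    """Return the items for a 1-based page, shifted by `offset`.

    Hypothetical helper that reproduces the slice bounds from the handler:
    chunk_size * (page - 1) + offset  up to  chunk_size * page + offset.
    """
    start = chunk_size * (page - 1) + offset
    end = chunk_size * page + offset
    # Slicing past the end of a list is safe in Python: it yields a
    # shorter (possibly empty) list instead of raising an error.
    return items[start:end]


if __name__ == "__main__":
    # Stand-in for the newest-first list of PNG paths built by the handler.
    names = [f"img_{i:03d}.png" for i in range(120)]
    print(len(page_window(names, page=1)))
    print(len(page_window(names, page=3)))
    print(page_window(names, page=2, offset=3)[0])
```

Because out-of-range slices simply come back shorter, a request for a page past the end yields an empty list, so the handler needs no explicit bounds check.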
|
||||
@socketio.on("generateImage")
|
||||
def handle_generate_image_event(
|
||||
generation_parameters, esrgan_parameters, gfpgan_parameters
|
||||
):
|
||||
print(
|
||||
f">> Image generation requested: {generation_parameters}\nESRGAN parameters: {esrgan_parameters}\nGFPGAN parameters: {gfpgan_parameters}"
|
||||
)
|
||||
generate_images(generation_parameters, esrgan_parameters, gfpgan_parameters)
|
||||
|
||||
|
||||
@socketio.on("runESRGAN")
|
||||
def handle_run_esrgan_event(original_image, esrgan_parameters):
|
||||
print(
|
||||
f'>> ESRGAN upscale requested for "{original_image["url"]}": {esrgan_parameters}'
|
||||
)
|
||||
progress = {
|
||||
"currentStep": 1,
|
||||
"totalSteps": 1,
|
||||
"currentIteration": 1,
|
||||
"totalIterations": 1,
|
||||
"currentStatus": "Preparing",
|
||||
"isProcessing": True,
|
||||
"currentStatusHasSteps": False,
|
||||
}
|
||||
|
||||
socketio.emit("progressUpdate", progress)
|
||||
eventlet.sleep(0)
|
||||
|
||||
image = Image.open(original_image["url"])
|
||||
|
||||
seed = (
|
||||
original_image["metadata"]["seed"]
|
||||
if "seed" in original_image["metadata"]
|
||||
else "unknown_seed"
|
||||
)
|
||||
|
||||
progress["currentStatus"] = "Upscaling"
|
||||
socketio.emit("progressUpdate", progress)
|
||||
eventlet.sleep(0)
|
||||
|
||||
image = esrgan.process(
|
||||
image=image,
|
||||
upsampler_scale=esrgan_parameters["upscale"][0],
|
||||
strength=esrgan_parameters["upscale"][1],
|
||||
seed=seed,
|
||||
)
|
||||
|
||||
progress["currentStatus"] = "Saving image"
|
||||
socketio.emit("progressUpdate", progress)
|
||||
eventlet.sleep(0)
|
||||
|
||||
esrgan_parameters["seed"] = seed
|
||||
metadata = parameters_to_post_processed_image_metadata(
|
||||
parameters=esrgan_parameters,
|
||||
original_image_path=original_image["url"],
|
||||
type="esrgan",
|
||||
)
|
||||
command = parameters_to_command(esrgan_parameters)
|
||||
|
||||
path = save_image(image, command, metadata, result_path, postprocessing="esrgan")
|
||||
|
||||
write_log_message(f'[Upscaled] "{original_image["url"]}" > "{path}": {command}')
|
||||
|
||||
progress["currentStatus"] = "Finished"
|
||||
progress["currentStep"] = 0
|
||||
progress["totalSteps"] = 0
|
||||
progress["currentIteration"] = 0
|
||||
progress["totalIterations"] = 0
|
||||
progress["isProcessing"] = False
|
||||
socketio.emit("progressUpdate", progress)
|
||||
eventlet.sleep(0)
|
||||
|
||||
socketio.emit(
|
||||
"esrganResult",
|
||||
{
|
||||
"url": os.path.relpath(path),
|
||||
"mtime": os.path.getmtime(path),
|
||||
"metadata": metadata,
|
||||
},
|
||||
)
|
||||
|
||||
|
||||
@socketio.on("runGFPGAN")
|
||||
def handle_run_gfpgan_event(original_image, gfpgan_parameters):
|
||||
print(
|
||||
f'>> GFPGAN face fix requested for "{original_image["url"]}": {gfpgan_parameters}'
|
||||
)
|
||||
progress = {
|
||||
"currentStep": 1,
|
||||
"totalSteps": 1,
|
||||
"currentIteration": 1,
|
||||
"totalIterations": 1,
|
||||
"currentStatus": "Preparing",
|
||||
"isProcessing": True,
|
||||
"currentStatusHasSteps": False,
|
||||
}
|
||||
|
||||
socketio.emit("progressUpdate", progress)
|
||||
eventlet.sleep(0)
|
||||
|
||||
image = Image.open(original_image["url"])
|
||||
|
||||
seed = (
|
||||
original_image["metadata"]["seed"]
|
||||
if "seed" in original_image["metadata"]
|
||||
else "unknown_seed"
|
||||
)
|
||||
|
||||
progress["currentStatus"] = "Fixing faces"
|
||||
socketio.emit("progressUpdate", progress)
|
||||
eventlet.sleep(0)
|
||||
|
||||
image = gfpgan.process(
|
||||
image=image, strength=gfpgan_parameters["facetool_strength"], seed=seed
|
||||
)
|
||||
|
||||
progress["currentStatus"] = "Saving image"
|
||||
socketio.emit("progressUpdate", progress)
|
||||
eventlet.sleep(0)
|
||||
|
||||
gfpgan_parameters["seed"] = seed
|
||||
metadata = parameters_to_post_processed_image_metadata(
|
||||
parameters=gfpgan_parameters,
|
||||
original_image_path=original_image["url"],
|
||||
type="gfpgan",
|
||||
)
|
||||
command = parameters_to_command(gfpgan_parameters)
|
||||
|
||||
path = save_image(image, command, metadata, result_path, postprocessing="gfpgan")
|
||||
|
||||
write_log_message(f'[Fixed faces] "{original_image["url"]}" > "{path}": {command}')
|
||||
|
||||
progress["currentStatus"] = "Finished"
|
||||
progress["currentStep"] = 0
|
||||
progress["totalSteps"] = 0
|
||||
progress["currentIteration"] = 0
|
||||
progress["totalIterations"] = 0
|
||||
progress["isProcessing"] = False
|
||||
socketio.emit("progressUpdate", progress)
|
||||
eventlet.sleep(0)
|
||||
|
||||
socketio.emit(
|
||||
"gfpganResult",
|
||||
{
|
||||
"url": os.path.relpath(path),
|
||||
"mtime": os.path.getmtime(path),
|
||||
"metadata": metadata,
|
||||
},
|
||||
)
|
||||
|
||||
|
||||
@socketio.on("cancel")
|
||||
def handle_cancel():
|
||||
print(">> Cancel processing requested")
|
||||
canceled.set()
|
||||
socketio.emit("processingCanceled")
|
||||
|
||||
|
||||
# TODO: I think this needs a safety mechanism.
|
||||
@socketio.on("deleteImage")
|
||||
def handle_delete_image(path, uuid):
|
||||
print(f'>> Delete requested "{path}"')
|
||||
send2trash(path)
|
||||
socketio.emit("imageDeleted", {"url": path, "uuid": uuid})
|
||||
|
||||
|
||||
# TODO: I think this needs a safety mechanism.
|
||||
@socketio.on("uploadInitialImage")
|
||||
def handle_upload_initial_image(bytes, name):
|
||||
print(f'>> Init image upload requested "{name}"')
|
||||
uuid = uuid4().hex
|
||||
split = os.path.splitext(name)
|
||||
name = f"{split[0]}.{uuid}{split[1]}"
|
||||
file_path = os.path.join(init_image_path, name)
|
||||
os.makedirs(os.path.dirname(file_path), exist_ok=True)
|
||||
# Use a context manager so the file is flushed and closed before the client is notified.
with open(file_path, "wb") as newFile:
    newFile.write(bytes)
|
||||
socketio.emit("initialImageUploaded", {"url": file_path, "uuid": ""})
|
||||
|
||||
|
||||
# TODO: I think this needs a safety mechanism.
|
||||
@socketio.on("uploadMaskImage")
|
||||
def handle_upload_mask_image(bytes, name):
|
||||
print(f'>> Mask image upload requested "{name}"')
|
||||
uuid = uuid4().hex
|
||||
split = os.path.splitext(name)
|
||||
name = f"{split[0]}.{uuid}{split[1]}"
|
||||
file_path = os.path.join(mask_image_path, name)
|
||||
os.makedirs(os.path.dirname(file_path), exist_ok=True)
|
||||
# Use a context manager so the file is flushed and closed before the client is notified.
with open(file_path, "wb") as newFile:
    newFile.write(bytes)
|
||||
socketio.emit("maskImageUploaded", {"url": file_path, "uuid": ""})
|
||||
|
||||
|
||||
"""
|
||||
END SOCKET.IO LISTENERS
|
||||
"""
|
||||
|
||||
|
||||
"""
|
||||
ADDITIONAL FUNCTIONS
|
||||
"""
|
||||
|
||||
|
||||
def get_system_config():
|
||||
return {
|
||||
"model": "stable diffusion",
|
||||
"model_id": model,
|
||||
"model_hash": generate.model_hash,
|
||||
"app_id": APP_ID,
|
||||
"app_version": APP_VERSION,
|
||||
}
|
||||
|
||||
|
||||
def parameters_to_post_processed_image_metadata(parameters, original_image_path, type):
|
||||
# top-level metadata minus `image` or `images`
|
||||
metadata = get_system_config()
|
||||
|
||||
orig_hash = calculate_init_img_hash(original_image_path)
|
||||
|
||||
image = {"orig_path": original_image_path, "orig_hash": orig_hash}
|
||||
|
||||
if type == "esrgan":
|
||||
image["type"] = "esrgan"
|
||||
image["scale"] = parameters["upscale"][0]
|
||||
image["strength"] = parameters["upscale"][1]
|
||||
elif type == "gfpgan":
|
||||
image["type"] = "gfpgan"
|
||||
image["strength"] = parameters["facetool_strength"]
|
||||
else:
|
||||
raise TypeError(f"Invalid type: {type}")
|
||||
|
||||
metadata["image"] = image
|
||||
return metadata
|
||||
|
||||
|
||||
def parameters_to_generated_image_metadata(parameters):
|
||||
# top-level metadata minus `image` or `images`
|
||||
|
||||
metadata = get_system_config()
|
||||
# remove any image keys not mentioned in RFC #266
|
||||
rfc266_img_fields = [
|
||||
"type",
|
||||
"postprocessing",
|
||||
"sampler",
|
||||
"prompt",
|
||||
"seed",
|
||||
"variations",
|
||||
"steps",
|
||||
"cfg_scale",
|
||||
"threshold",
|
||||
"perlin",
|
||||
"step_number",
|
||||
"width",
|
||||
"height",
|
||||
"extra",
|
||||
"seamless",
|
||||
"hires_fix",
|
||||
]
|
||||
|
||||
rfc_dict = {}
|
||||
|
||||
for item in parameters.items():
|
||||
key, value = item
|
||||
if key in rfc266_img_fields:
|
||||
rfc_dict[key] = value
|
||||
|
||||
postprocessing = []
|
||||
|
||||
# 'postprocessing' is either null or an array of postprocessing steps
|
||||
if "facetool_strength" in parameters:
|
||||
|
||||
postprocessing.append(
|
||||
{"type": "gfpgan", "strength": float(parameters["facetool_strength"])}
|
||||
)
|
||||
|
||||
if "upscale" in parameters:
|
||||
postprocessing.append(
|
||||
{
|
||||
"type": "esrgan",
|
||||
"scale": int(parameters["upscale"][0]),
|
||||
"strength": float(parameters["upscale"][1]),
|
||||
}
|
||||
)
|
||||
|
||||
rfc_dict["postprocessing"] = postprocessing if len(postprocessing) > 0 else None
|
||||
|
||||
# semantic drift
|
||||
rfc_dict["sampler"] = parameters["sampler_name"]
|
||||
|
||||
# display weighted subprompts (liable to change)
|
||||
subprompts = split_weighted_subprompts(parameters["prompt"])
|
||||
subprompts = [{"prompt": x[0], "weight": x[1]} for x in subprompts]
|
||||
rfc_dict["prompt"] = subprompts
|
||||
|
||||
# 'variations' should always exist and be an array, empty or consisting of {'seed': seed, 'weight': weight} pairs
|
||||
variations = []
|
||||
|
||||
if "with_variations" in parameters:
|
||||
variations = [
|
||||
{"seed": x[0], "weight": x[1]} for x in parameters["with_variations"]
|
||||
]
|
||||
|
||||
rfc_dict["variations"] = variations
|
||||
|
||||
if "init_img" in parameters:
|
||||
rfc_dict["type"] = "img2img"
|
||||
rfc_dict["strength"] = parameters["strength"]
|
||||
rfc_dict["fit"] = parameters["fit"] # TODO: Noncompliant
|
||||
rfc_dict["orig_hash"] = calculate_init_img_hash(parameters["init_img"])
|
||||
rfc_dict["init_image_path"] = parameters["init_img"] # TODO: Noncompliant
|
||||
rfc_dict["sampler"] = "ddim" # TODO: FIX ME WHEN IMG2IMG SUPPORTS ALL SAMPLERS
|
||||
if "init_mask" in parameters:
|
||||
rfc_dict["mask_hash"] = calculate_init_img_hash(
|
||||
parameters["init_mask"]
|
||||
) # TODO: Noncompliant
|
||||
rfc_dict["mask_image_path"] = parameters["init_mask"] # TODO: Noncompliant
|
||||
else:
|
||||
rfc_dict["type"] = "txt2img"
|
||||
|
||||
metadata["image"] = rfc_dict
|
||||
|
||||
return metadata
|
||||
|
||||
|
||||
def make_unique_init_image_filename(name):
|
||||
uuid = uuid4().hex
|
||||
split = os.path.splitext(name)
|
||||
name = f"{split[0]}.{uuid}{split[1]}"
|
||||
return name
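The renaming scheme used by `make_unique_init_image_filename` can be tried in isolation. The following is a minimal standalone sketch, not the project's code: `make_unique_filename` is a hypothetical name, and it only mirrors the stem/UUID/extension layout shown above.

```python
import os
from uuid import uuid4


def make_unique_filename(name: str) -> str:
    """Insert a random hex UUID between the file stem and its extension."""
    stem, ext = os.path.splitext(name)
    return f"{stem}.{uuid4().hex}{ext}"


unique = make_unique_filename("portrait.png")
print(unique)  # e.g. portrait.<32 hex characters>.png
```

Because `uuid4().hex` is always 32 hexadecimal characters, two uploads with the same original name are very unlikely to collide.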
|
||||
|
||||
|
||||
def write_log_message(message, log_path=log_path):
|
||||
"""Logs the filename and parameters used to generate or process that image to log file"""
|
||||
message = f"{message}\n"
|
||||
with open(log_path, "a", encoding="utf-8") as file:
|
||||
file.writelines(message)
|
||||
|
||||
|
||||
def save_image(
|
||||
image, command, metadata, output_dir, step_index=None, postprocessing=False
|
||||
):
|
||||
pngwriter = PngWriter(output_dir)
|
||||
prefix = pngwriter.unique_prefix()
|
||||
|
||||
seed = "unknown_seed"
|
||||
|
||||
if "image" in metadata:
|
||||
if "seed" in metadata["image"]:
|
||||
seed = metadata["image"]["seed"]
|
||||
|
||||
filename = f"{prefix}.{seed}"
|
||||
|
||||
if step_index:
|
||||
filename += f".{step_index}"
|
||||
if postprocessing:
|
||||
filename += ".postprocessed"
|
||||
|
||||
filename += ".png"
|
||||
|
||||
path = pngwriter.save_image_and_prompt_to_png(
|
||||
image=image, dream_prompt=command, metadata=metadata, name=filename
|
||||
)
|
||||
|
||||
return path
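The file name assembled in `save_image` follows the pattern `<prefix>.<seed>[.<step_index>][.postprocessed].png`. The sketch below reproduces only that naming logic so it can be checked without `PngWriter`; `build_output_filename` is a hypothetical helper introduced for illustration, and the prefix values are made up.

```python
from typing import Optional, Union


def build_output_filename(
    prefix: str,
    seed: Union[int, str] = "unknown_seed",
    step_index: Optional[int] = None,
    postprocessing: bool = False,
) -> str:
    """Compose '<prefix>.<seed>[.<step_index>][.postprocessed].png'."""
    filename = f"{prefix}.{seed}"
    if step_index:  # as in the original, a step_index of 0 or None adds nothing
        filename += f".{step_index}"
    if postprocessing:
        filename += ".postprocessed"
    return filename + ".png"


final_name = build_output_filename("000001", 42)
intermediate_name = build_output_filename("000001", 42, step_index=3)
processed_name = build_output_filename("000001", postprocessing=True)
```

Note that the handlers above pass strings such as `"esrgan"` for `postprocessing`; any truthy value adds the `.postprocessed` suffix.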
|
||||
|
||||
|
||||
def calculate_real_steps(steps, strength, has_init_image):
|
||||
return math.floor(strength * steps) if has_init_image else steps
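As a worked example of `calculate_real_steps`: with an init image, only `floor(strength * steps)` steps are counted, otherwise the full step count is used. The snippet repeats the one-line function so it runs on its own; the input values are illustrative only.

```python
import math


def calculate_real_steps(steps, strength, has_init_image):
    # With an init image, only a fraction of the requested steps is counted.
    return math.floor(strength * steps) if has_init_image else steps


img2img_steps = calculate_real_steps(steps=50, strength=0.75, has_init_image=True)
txt2img_steps = calculate_real_steps(steps=50, strength=None, has_init_image=False)
```

Here `img2img_steps` is 37 (the floor of 37.5) and `txt2img_steps` is 50. `strength` may be `None` in the second call because it is never read when there is no init image.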
|
||||
|
||||
|
||||
def generate_images(generation_parameters, esrgan_parameters, gfpgan_parameters):
|
||||
canceled.clear()
|
||||
|
||||
step_index = 1
|
||||
prior_variations = (
|
||||
generation_parameters["with_variations"]
|
||||
if "with_variations" in generation_parameters
|
||||
else []
|
||||
)
|
||||
"""
|
||||
If a result image is used as an init image, and then deleted, we will want to be
|
||||
able to use it as an init image in the future. Need to copy it.
|
||||
|
||||
If the init/mask image doesn't exist in the init_image_path/mask_image_path,
|
||||
make a unique filename for it and copy it there.
|
||||
"""
|
||||
if "init_img" in generation_parameters:
|
||||
filename = os.path.basename(generation_parameters["init_img"])
|
||||
if not os.path.exists(os.path.join(init_image_path, filename)):
|
||||
unique_filename = make_unique_init_image_filename(filename)
|
||||
new_path = os.path.join(init_image_path, unique_filename)
|
||||
shutil.copy(generation_parameters["init_img"], new_path)
|
||||
generation_parameters["init_img"] = new_path
|
||||
if "init_mask" in generation_parameters:
|
||||
filename = os.path.basename(generation_parameters["init_mask"])
|
||||
if not os.path.exists(os.path.join(mask_image_path, filename)):
|
||||
unique_filename = make_unique_init_image_filename(filename)
|
||||
new_path = os.path.join(mask_image_path, unique_filename)
|
||||
shutil.copy(generation_parameters["init_mask"], new_path)
|
||||
generation_parameters["init_mask"] = new_path
|
||||
|
||||
totalSteps = calculate_real_steps(
|
||||
steps=generation_parameters["steps"],
|
||||
strength=generation_parameters["strength"]
|
||||
if "strength" in generation_parameters
|
||||
else None,
|
||||
has_init_image="init_img" in generation_parameters,
|
||||
)
|
||||
|
||||
progress = {
|
||||
"currentStep": 1,
|
||||
"totalSteps": totalSteps,
|
||||
"currentIteration": 1,
|
||||
"totalIterations": generation_parameters["iterations"],
|
||||
"currentStatus": "Preparing",
|
||||
"isProcessing": True,
|
||||
"currentStatusHasSteps": False,
|
||||
}
|
||||
|
||||
socketio.emit("progressUpdate", progress)
|
||||
eventlet.sleep(0)
|
||||
|
||||
def image_progress(sample, step):
|
||||
if canceled.is_set():
|
||||
raise CanceledException
|
||||
|
||||
nonlocal step_index
|
||||
nonlocal generation_parameters
|
||||
nonlocal progress
|
||||
|
||||
progress["currentStep"] = step + 1
|
||||
progress["currentStatus"] = "Generating"
|
||||
progress["currentStatusHasSteps"] = True
|
||||
|
||||
if (
|
||||
generation_parameters["progress_images"]
|
||||
and step % 5 == 0
|
||||
and step < generation_parameters["steps"] - 1
|
||||
):
|
||||
image = generate.sample_to_image(sample)
|
||||
|
||||
metadata = parameters_to_generated_image_metadata(generation_parameters)
|
||||
command = parameters_to_command(generation_parameters)
|
||||
path = save_image(image, command, metadata, intermediate_path, step_index=step_index, postprocessing=False)
|
||||
|
||||
step_index += 1
|
||||
socketio.emit(
|
||||
"intermediateResult",
|
||||
{
|
||||
"url": os.path.relpath(path),
|
||||
"mtime": os.path.getmtime(path),
|
||||
"metadata": metadata,
|
||||
},
|
||||
)
|
||||
socketio.emit("progressUpdate", progress)
|
||||
eventlet.sleep(0)
|
||||
|
||||
def image_done(image, seed, first_seed):
|
||||
nonlocal generation_parameters
|
||||
nonlocal esrgan_parameters
|
||||
nonlocal gfpgan_parameters
|
||||
nonlocal progress
|
||||
|
||||
nonlocal step_index  # reset the shared intermediate-image counter for the next image
step_index = 1
|
||||
nonlocal prior_variations
|
||||
|
||||
progress["currentStatus"] = "Generation complete"
|
||||
socketio.emit("progressUpdate", progress)
|
||||
eventlet.sleep(0)
|
||||
|
||||
all_parameters = generation_parameters
|
||||
postprocessing = False
|
||||
|
||||
if (
|
||||
"variation_amount" in all_parameters
|
||||
and all_parameters["variation_amount"] > 0
|
||||
):
|
||||
first_seed = first_seed or seed
|
||||
this_variation = [[seed, all_parameters["variation_amount"]]]
|
||||
all_parameters["with_variations"] = prior_variations + this_variation
|
||||
all_parameters["seed"] = first_seed
|
||||
elif "with_variations" in all_parameters:
|
||||
all_parameters["seed"] = first_seed
|
||||
else:
|
||||
all_parameters["seed"] = seed
|
||||
|
||||
if esrgan_parameters:
|
||||
progress["currentStatus"] = "Upscaling"
|
||||
progress["currentStatusHasSteps"] = False
|
||||
socketio.emit("progressUpdate", progress)
|
||||
eventlet.sleep(0)
|
||||
|
||||
image = esrgan.process(
|
||||
image=image,
|
||||
upsampler_scale=esrgan_parameters["level"],
|
||||
strength=esrgan_parameters["strength"],
|
||||
seed=seed,
|
||||
)
|
||||
|
||||
postprocessing = True
|
||||
all_parameters["upscale"] = [
|
||||
esrgan_parameters["level"],
|
||||
esrgan_parameters["strength"],
|
||||
]
|
||||
|
||||
if gfpgan_parameters:
|
||||
progress["currentStatus"] = "Fixing faces"
|
||||
progress["currentStatusHasSteps"] = False
|
||||
socketio.emit("progressUpdate", progress)
|
||||
eventlet.sleep(0)
|
||||
|
||||
image = gfpgan.process(
|
||||
image=image, strength=gfpgan_parameters["strength"], seed=seed
|
||||
)
|
||||
postprocessing = True
|
||||
all_parameters["facetool_strength"] = gfpgan_parameters["strength"]
|
||||
|
||||
progress["currentStatus"] = "Saving image"
|
||||
socketio.emit("progressUpdate", progress)
|
||||
eventlet.sleep(0)
|
||||
|
||||
metadata = parameters_to_generated_image_metadata(all_parameters)
|
||||
command = parameters_to_command(all_parameters)
|
||||
|
||||
path = save_image(
|
||||
image, command, metadata, result_path, postprocessing=postprocessing
|
||||
)
|
||||
|
||||
print(f'>> Image generated: "{path}"')
|
||||
write_log_message(f'[Generated] "{path}": {command}')
|
||||
|
||||
if progress["totalIterations"] > progress["currentIteration"]:
|
||||
progress["currentStep"] = 1
|
||||
progress["currentIteration"] += 1
|
||||
progress["currentStatus"] = "Iteration finished"
|
||||
progress["currentStatusHasSteps"] = False
|
||||
else:
|
||||
progress["currentStep"] = 0
|
||||
progress["totalSteps"] = 0
|
||||
progress["currentIteration"] = 0
|
||||
progress["totalIterations"] = 0
|
||||
progress["currentStatus"] = "Finished"
|
||||
progress["isProcessing"] = False
|
||||
|
||||
socketio.emit("progressUpdate", progress)
|
||||
eventlet.sleep(0)
|
||||
|
||||
socketio.emit(
|
||||
"generationResult",
|
||||
{
|
||||
"url": os.path.relpath(path),
|
||||
"mtime": os.path.getmtime(path),
|
||||
"metadata": metadata,
|
||||
},
|
||||
)
|
||||
eventlet.sleep(0)
|
||||
|
||||
try:
|
||||
generate.prompt2image(
|
||||
**generation_parameters,
|
||||
step_callback=image_progress,
|
||||
image_callback=image_done,
|
||||
)
|
||||
|
||||
except KeyboardInterrupt:
|
||||
raise
|
||||
except CanceledException:
|
||||
pass
|
||||
except Exception as e:
|
||||
socketio.emit("error", {"message": str(e)})
|
||||
print("\n")
|
||||
traceback.print_exc()
|
||||
print("\n")
|
||||
|
||||
|
||||
"""
|
||||
END ADDITIONAL FUNCTIONS
|
||||
"""
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
print(f">> Starting server at http://{host}:{port}")
|
||||
socketio.run(app, host=host, port=port)
|
54
configs/autoencoder/autoencoder_kl_16x16x16.yaml
Normal file
@ -0,0 +1,54 @@
|
||||
model:
|
||||
base_learning_rate: 4.5e-6
|
||||
target: ldm.models.autoencoder.AutoencoderKL
|
||||
params:
|
||||
monitor: "val/rec_loss"
|
||||
embed_dim: 16
|
||||
lossconfig:
|
||||
target: ldm.modules.losses.LPIPSWithDiscriminator
|
||||
params:
|
||||
disc_start: 50001
|
||||
kl_weight: 0.000001
|
||||
disc_weight: 0.5
|
||||
|
||||
ddconfig:
|
||||
double_z: True
|
||||
z_channels: 16
|
||||
resolution: 256
|
||||
in_channels: 3
|
||||
out_ch: 3
|
||||
ch: 128
|
||||
ch_mult: [ 1,1,2,2,4] # num_down = len(ch_mult)-1
|
||||
num_res_blocks: 2
|
||||
attn_resolutions: [16]
|
||||
dropout: 0.0
|
||||
|
||||
|
||||
data:
|
||||
target: main.DataModuleFromConfig
|
||||
params:
|
||||
batch_size: 12
|
||||
wrap: True
|
||||
train:
|
||||
target: ldm.data.imagenet.ImageNetSRTrain
|
||||
params:
|
||||
size: 256
|
||||
degradation: pil_nearest
|
||||
validation:
|
||||
target: ldm.data.imagenet.ImageNetSRValidation
|
||||
params:
|
||||
size: 256
|
||||
degradation: pil_nearest
|
||||
|
||||
lightning:
|
||||
callbacks:
|
||||
image_logger:
|
||||
target: main.ImageLogger
|
||||
params:
|
||||
batch_frequency: 1000
|
||||
max_images: 8
|
||||
increase_log_steps: True
|
||||
|
||||
trainer:
|
||||
benchmark: True
|
||||
accumulate_grad_batches: 2
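The `ch_mult` comment in this config states that `num_down = len(ch_mult)-1`. Assuming each downsampling stage halves the spatial size (an assumption about the encoder, not something the config states), the latent resolution can be estimated as below. `latent_resolution` is a hypothetical helper, not part of `ldm`.

```python
from typing import Sequence


def latent_resolution(resolution: int, ch_mult: Sequence[int]) -> int:
    """Estimate the latent spatial size, assuming one 2x downsample per stage."""
    num_down = len(ch_mult) - 1  # as stated in the config comment
    return resolution // (2 ** num_down)


size_16 = latent_resolution(256, [1, 1, 2, 2, 4])     # autoencoder_kl_16x16x16
size_32 = latent_resolution(256, [1, 2, 4, 4])        # autoencoder_kl_32x32x4
size_64 = latent_resolution(256, [1, 2, 4])           # autoencoder_kl_64x64x3
size_8 = latent_resolution(256, [1, 1, 2, 2, 4, 4])   # autoencoder_kl_8x8x64
```

Under that assumption the results (16, 32, 64, 8) agree with the spatial sizes in the four autoencoder config file names in this diff.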
|
53
configs/autoencoder/autoencoder_kl_32x32x4.yaml
Normal file
@ -0,0 +1,53 @@
|
||||
model:
|
||||
base_learning_rate: 4.5e-6
|
||||
target: ldm.models.autoencoder.AutoencoderKL
|
||||
params:
|
||||
monitor: "val/rec_loss"
|
||||
embed_dim: 4
|
||||
lossconfig:
|
||||
target: ldm.modules.losses.LPIPSWithDiscriminator
|
||||
params:
|
||||
disc_start: 50001
|
||||
kl_weight: 0.000001
|
||||
disc_weight: 0.5
|
||||
|
||||
ddconfig:
|
||||
double_z: True
|
||||
z_channels: 4
|
||||
resolution: 256
|
||||
in_channels: 3
|
||||
out_ch: 3
|
||||
ch: 128
|
||||
ch_mult: [ 1,2,4,4 ] # num_down = len(ch_mult)-1
|
||||
num_res_blocks: 2
|
||||
attn_resolutions: [ ]
|
||||
dropout: 0.0
|
||||
|
||||
data:
|
||||
target: main.DataModuleFromConfig
|
||||
params:
|
||||
batch_size: 12
|
||||
wrap: True
|
||||
train:
|
||||
target: ldm.data.imagenet.ImageNetSRTrain
|
||||
params:
|
||||
size: 256
|
||||
degradation: pil_nearest
|
||||
validation:
|
||||
target: ldm.data.imagenet.ImageNetSRValidation
|
||||
params:
|
||||
size: 256
|
||||
degradation: pil_nearest
|
||||
|
||||
lightning:
|
||||
callbacks:
|
||||
image_logger:
|
||||
target: main.ImageLogger
|
||||
params:
|
||||
batch_frequency: 1000
|
||||
max_images: 8
|
||||
increase_log_steps: True
|
||||
|
||||
trainer:
|
||||
benchmark: True
|
||||
accumulate_grad_batches: 2
|
54
configs/autoencoder/autoencoder_kl_64x64x3.yaml
Normal file
@ -0,0 +1,54 @@
|
||||
model:
|
||||
base_learning_rate: 4.5e-6
|
||||
target: ldm.models.autoencoder.AutoencoderKL
|
||||
params:
|
||||
monitor: "val/rec_loss"
|
||||
embed_dim: 3
|
||||
lossconfig:
|
||||
target: ldm.modules.losses.LPIPSWithDiscriminator
|
||||
params:
|
||||
disc_start: 50001
|
||||
kl_weight: 0.000001
|
||||
disc_weight: 0.5
|
||||
|
||||
ddconfig:
|
||||
double_z: True
|
||||
z_channels: 3
|
||||
resolution: 256
|
||||
in_channels: 3
|
||||
out_ch: 3
|
||||
ch: 128
|
||||
ch_mult: [ 1,2,4 ] # num_down = len(ch_mult)-1
|
||||
num_res_blocks: 2
|
||||
attn_resolutions: [ ]
|
||||
dropout: 0.0
|
||||
|
||||
|
||||
data:
|
||||
target: main.DataModuleFromConfig
|
||||
params:
|
||||
batch_size: 12
|
||||
wrap: True
|
||||
train:
|
||||
target: ldm.data.imagenet.ImageNetSRTrain
|
||||
params:
|
||||
size: 256
|
||||
degradation: pil_nearest
|
||||
validation:
|
||||
target: ldm.data.imagenet.ImageNetSRValidation
|
||||
params:
|
||||
size: 256
|
||||
degradation: pil_nearest
|
||||
|
||||
lightning:
|
||||
callbacks:
|
||||
image_logger:
|
||||
target: main.ImageLogger
|
||||
params:
|
||||
batch_frequency: 1000
|
||||
max_images: 8
|
||||
increase_log_steps: True
|
||||
|
||||
trainer:
|
||||
benchmark: True
|
||||
accumulate_grad_batches: 2
|
53
configs/autoencoder/autoencoder_kl_8x8x64.yaml
Normal file
@ -0,0 +1,53 @@
|
||||
model:
|
||||
base_learning_rate: 4.5e-6
|
||||
target: ldm.models.autoencoder.AutoencoderKL
|
||||
params:
|
||||
monitor: "val/rec_loss"
|
||||
embed_dim: 64
|
||||
lossconfig:
|
||||
target: ldm.modules.losses.LPIPSWithDiscriminator
|
||||
params:
|
||||
disc_start: 50001
|
||||
kl_weight: 0.000001
|
||||
disc_weight: 0.5
|
||||
|
||||
ddconfig:
|
||||
double_z: True
|
||||
z_channels: 64
|
||||
resolution: 256
|
||||
in_channels: 3
|
||||
out_ch: 3
|
||||
ch: 128
|
||||
ch_mult: [ 1,1,2,2,4,4] # num_down = len(ch_mult)-1
|
||||
num_res_blocks: 2
|
||||
attn_resolutions: [16,8]
|
||||
dropout: 0.0
|
||||
|
||||
data:
|
||||
target: main.DataModuleFromConfig
|
||||
params:
|
||||
batch_size: 12
|
||||
wrap: True
|
||||
train:
|
||||
target: ldm.data.imagenet.ImageNetSRTrain
|
||||
params:
|
||||
size: 256
|
||||
degradation: pil_nearest
|
||||
validation:
|
||||
target: ldm.data.imagenet.ImageNetSRValidation
|
||||
params:
|
||||
size: 256
|
||||
degradation: pil_nearest
|
||||
|
||||
lightning:
|
||||
callbacks:
|
||||
image_logger:
|
||||
target: main.ImageLogger
|
||||
params:
|
||||
batch_frequency: 1000
|
||||
max_images: 8
|
||||
increase_log_steps: True
|
||||
|
||||
trainer:
|
||||
benchmark: True
|
||||
accumulate_grad_batches: 2
|
86
configs/latent-diffusion/celebahq-ldm-vq-4.yaml
Normal file
@ -0,0 +1,86 @@
|
||||
model:
|
||||
base_learning_rate: 2.0e-06
|
||||
target: ldm.models.diffusion.ddpm.LatentDiffusion
|
||||
params:
|
||||
linear_start: 0.0015
|
||||
linear_end: 0.0195
|
||||
num_timesteps_cond: 1
|
||||
log_every_t: 200
|
||||
timesteps: 1000
|
||||
first_stage_key: image
|
||||
image_size: 64
|
||||
channels: 3
|
||||
monitor: val/loss_simple_ema
|
||||
|
||||
unet_config:
|
||||
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
|
||||
params:
|
||||
image_size: 64
|
||||
in_channels: 3
|
||||
out_channels: 3
|
||||
model_channels: 224
|
||||
attention_resolutions:
|
||||
# note: this isn't actually the resolution but
|
||||
# the downsampling factor, i.e. this corresponds to
|
||||
# attention on spatial resolution 8,16,32, as the
|
||||
# spatial resolution of the latents is 64 for f4
|
||||
- 8
|
||||
- 4
|
||||
- 2
|
||||
num_res_blocks: 2
|
||||
channel_mult:
|
||||
- 1
|
||||
- 2
|
||||
- 3
|
||||
- 4
|
||||
num_head_channels: 32
|
||||
first_stage_config:
|
||||
target: ldm.models.autoencoder.VQModelInterface
|
||||
params:
|
||||
embed_dim: 3
|
||||
n_embed: 8192
|
||||
ckpt_path: models/first_stage_models/vq-f4/model.ckpt
|
||||
ddconfig:
|
||||
double_z: false
|
||||
z_channels: 3
|
||||
resolution: 256
|
||||
in_channels: 3
|
||||
out_ch: 3
|
||||
ch: 128
|
||||
ch_mult:
|
||||
- 1
|
||||
- 2
|
||||
- 4
|
||||
num_res_blocks: 2
|
||||
attn_resolutions: []
|
||||
dropout: 0.0
|
||||
lossconfig:
|
||||
target: torch.nn.Identity
|
||||
cond_stage_config: __is_unconditional__
|
||||
data:
|
||||
target: main.DataModuleFromConfig
|
||||
params:
|
||||
batch_size: 48
|
||||
num_workers: 5
|
||||
wrap: false
|
||||
train:
|
||||
target: taming.data.faceshq.CelebAHQTrain
|
||||
params:
|
||||
size: 256
|
||||
validation:
|
||||
target: taming.data.faceshq.CelebAHQValidation
|
||||
params:
|
||||
size: 256
|
||||
|
||||
|
||||
lightning:
|
||||
callbacks:
|
||||
image_logger:
|
||||
target: main.ImageLogger
|
||||
params:
|
||||
batch_frequency: 5000
|
||||
max_images: 8
|
||||
increase_log_steps: False
|
||||
|
||||
trainer:
|
||||
benchmark: True
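The comment under `attention_resolutions` in this config explains that the listed values are downsampling factors, not resolutions. A small sketch of that arithmetic, using the latent size of 64 that the comment gives for f4; `attention_spatial_sizes` is a hypothetical helper written only for illustration.

```python
from typing import List, Sequence


def attention_spatial_sizes(latent_size: int, factors: Sequence[int]) -> List[int]:
    """Map each downsampling factor to the spatial size where attention is applied."""
    return [latent_size // factor for factor in factors]


f4_sizes = attention_spatial_sizes(64, [8, 4, 2])
```

This gives `[8, 16, 32]`, matching the "spatial resolution 8,16,32" stated in the comment.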
|
98
configs/latent-diffusion/cin-ldm-vq-f8.yaml
Normal file
@ -0,0 +1,98 @@
|
||||
model:
|
||||
base_learning_rate: 1.0e-06
|
||||
target: ldm.models.diffusion.ddpm.LatentDiffusion
|
||||
params:
|
||||
linear_start: 0.0015
|
||||
linear_end: 0.0195
|
||||
num_timesteps_cond: 1
|
||||
log_every_t: 200
|
||||
timesteps: 1000
|
||||
first_stage_key: image
|
||||
cond_stage_key: class_label
|
||||
image_size: 32
|
||||
channels: 4
|
||||
cond_stage_trainable: true
|
||||
conditioning_key: crossattn
|
||||
monitor: val/loss_simple_ema
|
||||
unet_config:
|
||||
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
|
||||
params:
|
||||
image_size: 32
|
||||
in_channels: 4
|
||||
out_channels: 4
|
||||
model_channels: 256
|
||||
attention_resolutions:
|
||||
# note: this isn't actually the resolution but
|
||||
# the downsampling factor, i.e. this corresponds to
|
||||
# attention on spatial resolution 8,16,32, as the
|
||||
# spatial resolution of the latents is 32 for f8
|
||||
- 4
|
||||
- 2
|
||||
- 1
|
||||
num_res_blocks: 2
|
||||
channel_mult:
|
||||
- 1
|
||||
- 2
|
||||
- 4
|
||||
num_head_channels: 32
|
||||
use_spatial_transformer: true
|
||||
transformer_depth: 1
|
||||
context_dim: 512
|
||||
first_stage_config:
|
||||
target: ldm.models.autoencoder.VQModelInterface
|
||||
params:
|
||||
embed_dim: 4
|
||||
n_embed: 16384
|
||||
ckpt_path: configs/first_stage_models/vq-f8/model.yaml
|
||||
ddconfig:
|
||||
double_z: false
|
||||
z_channels: 4
|
||||
resolution: 256
|
||||
in_channels: 3
|
||||
out_ch: 3
|
||||
ch: 128
|
||||
ch_mult:
|
||||
- 1
|
||||
- 2
|
||||
- 2
|
||||
- 4
|
||||
num_res_blocks: 2
|
||||
attn_resolutions:
|
||||
- 32
|
||||
dropout: 0.0
|
||||
lossconfig:
|
||||
target: torch.nn.Identity
|
||||
cond_stage_config:
|
||||
target: ldm.modules.encoders.modules.ClassEmbedder
|
||||
params:
|
||||
embed_dim: 512
|
||||
key: class_label
|
||||
data:
|
||||
target: main.DataModuleFromConfig
|
||||
params:
|
||||
batch_size: 64
|
||||
num_workers: 12
|
||||
wrap: false
|
||||
train:
|
||||
target: ldm.data.imagenet.ImageNetTrain
|
||||
params:
|
||||
config:
|
||||
size: 256
|
||||
validation:
|
||||
target: ldm.data.imagenet.ImageNetValidation
|
||||
params:
|
||||
config:
|
||||
size: 256
|
||||
|
||||
|
||||
lightning:
|
||||
callbacks:
|
||||
image_logger:
|
||||
target: main.ImageLogger
|
||||
params:
|
||||
batch_frequency: 5000
|
||||
max_images: 8
|
||||
increase_log_steps: False
|
||||
|
||||
trainer:
|
||||
benchmark: True
|
68
configs/latent-diffusion/cin256-v2.yaml
Normal file
@ -0,0 +1,68 @@
|
||||
model:
|
||||
base_learning_rate: 0.0001
|
||||
target: ldm.models.diffusion.ddpm.LatentDiffusion
|
||||
params:
|
||||
linear_start: 0.0015
|
||||
linear_end: 0.0195
|
||||
num_timesteps_cond: 1
|
||||
log_every_t: 200
|
||||
timesteps: 1000
|
||||
first_stage_key: image
|
||||
cond_stage_key: class_label
|
||||
image_size: 64
|
||||
channels: 3
|
||||
cond_stage_trainable: true
|
||||
conditioning_key: crossattn
|
||||
monitor: val/loss
|
||||
use_ema: False
|
||||
|
||||
unet_config:
|
||||
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
|
||||
params:
|
||||
image_size: 64
|
||||
in_channels: 3
|
||||
out_channels: 3
|
||||
model_channels: 192
|
||||
attention_resolutions:
|
||||
- 8
|
||||
- 4
|
||||
- 2
|
||||
num_res_blocks: 2
|
||||
channel_mult:
|
||||
- 1
|
||||
- 2
|
||||
- 3
|
||||
- 5
|
||||
num_heads: 1
|
||||
use_spatial_transformer: true
|
||||
transformer_depth: 1
|
||||
context_dim: 512
|
||||
|
||||
first_stage_config:
|
||||
target: ldm.models.autoencoder.VQModelInterface
|
||||
params:
|
||||
embed_dim: 3
|
||||
n_embed: 8192
|
||||
ddconfig:
|
||||
double_z: false
|
||||
z_channels: 3
|
||||
resolution: 256
|
||||
in_channels: 3
|
||||
out_ch: 3
|
||||
ch: 128
|
||||
ch_mult:
|
||||
- 1
|
||||
- 2
|
||||
- 4
|
||||
num_res_blocks: 2
|
||||
attn_resolutions: []
|
||||
dropout: 0.0
|
||||
lossconfig:
|
||||
target: torch.nn.Identity
|
||||
|
||||
cond_stage_config:
|
||||
target: ldm.modules.encoders.modules.ClassEmbedder
|
||||
params:
|
||||
n_classes: 1001
|
||||
embed_dim: 512
|
||||
key: class_label
|
85
configs/latent-diffusion/ffhq-ldm-vq-4.yaml
Normal file
@ -0,0 +1,85 @@
|
||||
model:
|
||||
base_learning_rate: 2.0e-06
|
||||
target: ldm.models.diffusion.ddpm.LatentDiffusion
|
||||
params:
|
||||
linear_start: 0.0015
|
||||
linear_end: 0.0195
|
||||
num_timesteps_cond: 1
|
||||
log_every_t: 200
|
||||
timesteps: 1000
|
||||
first_stage_key: image
|
||||
image_size: 64
|
||||
channels: 3
|
||||
monitor: val/loss_simple_ema
|
||||
unet_config:
|
||||
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
|
||||
params:
|
||||
image_size: 64
|
||||
in_channels: 3
|
||||
out_channels: 3
|
||||
model_channels: 224
|
||||
attention_resolutions:
|
||||
# note: this isn't actually the resolution but
|
||||
# the downsampling factor, i.e. this corresponds to
|
||||
# attention on spatial resolution 8,16,32, as the
|
||||
# spatial resolution of the latents is 64 for f4
|
||||
- 8
|
||||
- 4
|
||||
- 2
|
||||
num_res_blocks: 2
|
||||
channel_mult:
|
||||
- 1
|
||||
- 2
|
||||
- 3
|
||||
- 4
|
||||
num_head_channels: 32
|
||||
first_stage_config:
|
||||
target: ldm.models.autoencoder.VQModelInterface
|
||||
params:
|
||||
embed_dim: 3
|
||||
n_embed: 8192
|
||||
ckpt_path: configs/first_stage_models/vq-f4/model.yaml
|
||||
ddconfig:
|
||||
double_z: false
|
||||
z_channels: 3
|
||||
resolution: 256
|
||||
in_channels: 3
|
||||
out_ch: 3
|
||||
ch: 128
|
||||
ch_mult:
|
||||
- 1
|
||||
- 2
|
||||
- 4
|
||||
num_res_blocks: 2
|
||||
attn_resolutions: []
|
||||
dropout: 0.0
|
||||
lossconfig:
|
||||
target: torch.nn.Identity
|
||||
cond_stage_config: __is_unconditional__
|
||||
data:
|
||||
target: main.DataModuleFromConfig
|
||||
params:
|
||||
batch_size: 42
|
||||
num_workers: 5
|
||||
wrap: false
|
||||
train:
|
||||
target: taming.data.faceshq.FFHQTrain
|
||||
params:
|
||||
size: 256
|
||||
validation:
|
||||
target: taming.data.faceshq.FFHQValidation
|
||||
params:
|
||||
size: 256
|
||||
|
||||
|
||||
lightning:
|
||||
callbacks:
|
||||
image_logger:
|
||||
target: main.ImageLogger
|
||||
params:
|
||||
batch_frequency: 5000
|
||||
max_images: 8
|
||||
increase_log_steps: False
|
||||
|
||||
trainer:
|
||||
benchmark: True
|
85
configs/latent-diffusion/lsun_bedrooms-ldm-vq-4.yaml
Normal file
@ -0,0 +1,85 @@
|
||||
model:
|
||||
base_learning_rate: 2.0e-06
|
||||
target: ldm.models.diffusion.ddpm.LatentDiffusion
|
||||
params:
|
||||
linear_start: 0.0015
|
||||
linear_end: 0.0195
|
||||
num_timesteps_cond: 1
|
||||
log_every_t: 200
|
||||
timesteps: 1000
|
||||
first_stage_key: image
|
||||
image_size: 64
|
||||
channels: 3
|
||||
monitor: val/loss_simple_ema
|
||||
unet_config:
|
||||
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
|
||||
params:
|
||||
image_size: 64
|
||||
in_channels: 3
|
||||
out_channels: 3
|
||||
model_channels: 224
|
||||
attention_resolutions:
|
||||
# note: this isn't actually the resolution but
|
||||
# the downsampling factor, i.e. this corresponds to
|
||||
# attention on spatial resolution 8,16,32, as the
|
||||
# spatial resolution of the latents is 64 for f4
|
||||
- 8
|
||||
- 4
|
||||
- 2
|
||||
num_res_blocks: 2
|
||||
channel_mult:
|
||||
- 1
|
||||
- 2
|
||||
- 3
|
||||
- 4
|
||||
num_head_channels: 32
|
||||
first_stage_config:
|
||||
target: ldm.models.autoencoder.VQModelInterface
|
||||
params:
|
||||
ckpt_path: configs/first_stage_models/vq-f4/model.yaml
|
||||
embed_dim: 3
|
||||
n_embed: 8192
|
||||
ddconfig:
|
||||
double_z: false
|
||||
z_channels: 3
|
||||
resolution: 256
|
||||
in_channels: 3
|
||||
out_ch: 3
|
||||
ch: 128
|
||||
ch_mult:
|
||||
- 1
|
||||
- 2
|
||||
- 4
|
||||
num_res_blocks: 2
|
||||
attn_resolutions: []
|
||||
dropout: 0.0
|
||||
lossconfig:
|
||||
target: torch.nn.Identity
|
||||
cond_stage_config: __is_unconditional__
|
||||
data:
|
||||
target: main.DataModuleFromConfig
|
||||
params:
|
||||
batch_size: 48
|
||||
num_workers: 5
|
||||
wrap: false
|
||||
train:
|
||||
target: ldm.data.lsun.LSUNBedroomsTrain
|
||||
params:
|
||||
size: 256
|
||||
validation:
|
||||
target: ldm.data.lsun.LSUNBedroomsValidation
|
||||
params:
|
||||
size: 256
|
||||
|
||||
|
||||
lightning:
|
||||
callbacks:
|
||||
image_logger:
|
||||
target: main.ImageLogger
|
||||
params:
|
||||
batch_frequency: 5000
|
||||
max_images: 8
|
||||
increase_log_steps: False
|
||||
|
||||
trainer:
|
||||
benchmark: True
|
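The comment on `attention_resolutions` in the config above says the listed values are downsampling factors, not spatial sizes. A minimal sketch of that arithmetic follows; the helper name `attention_spatial_sizes` is hypothetical and not part of the repository.

```python
def attention_spatial_sizes(latent_size, downsample_factors):
    """Map each downsampling factor to the feature-map side length it selects."""
    return [latent_size // factor for factor in downsample_factors]


# The f4 autoencoder turns 256-pixel inputs into 64x64 latents.
latent_size = 256 // 4
sizes = attention_spatial_sizes(latent_size, [8, 4, 2])
print(sizes)  # [8, 16, 32]
```

With the factors 8, 4 and 2 from the config, attention is applied at side lengths 8, 16 and 32, which matches the comment.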
91
configs/latent-diffusion/lsun_churches-ldm-kl-8.yaml
Normal file
@ -0,0 +1,91 @@
|
||||
model:
|
||||
base_learning_rate: 5.0e-5 # set to target_lr by starting main.py with '--scale_lr False'
|
||||
target: ldm.models.diffusion.ddpm.LatentDiffusion
|
||||
params:
|
||||
linear_start: 0.0015
|
||||
linear_end: 0.0155
|
||||
num_timesteps_cond: 1
|
||||
log_every_t: 200
|
||||
timesteps: 1000
|
||||
loss_type: l1
|
||||
first_stage_key: "image"
|
||||
cond_stage_key: "image"
|
||||
image_size: 32
|
||||
channels: 4
|
||||
cond_stage_trainable: False
|
||||
concat_mode: False
|
||||
scale_by_std: True
|
||||
monitor: 'val/loss_simple_ema'
|
||||
|
||||
scheduler_config: # 10000 warmup steps
|
||||
target: ldm.lr_scheduler.LambdaLinearScheduler
|
||||
params:
|
||||
warm_up_steps: [10000]
|
||||
cycle_lengths: [10000000000000]
|
||||
f_start: [1.e-6]
|
||||
f_max: [1.]
|
||||
f_min: [ 1.]
|
||||
|
||||
unet_config:
|
||||
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
|
||||
params:
|
||||
image_size: 32
|
||||
in_channels: 4
|
||||
out_channels: 4
|
||||
model_channels: 192
|
||||
attention_resolutions: [ 1, 2, 4, 8 ] # 32, 16, 8, 4
|
||||
num_res_blocks: 2
|
||||
channel_mult: [ 1,2,2,4,4 ] # 32, 16, 8, 4, 2
|
||||
num_heads: 8
|
||||
use_scale_shift_norm: True
|
||||
resblock_updown: True
|
||||
|
||||
first_stage_config:
|
||||
target: ldm.models.autoencoder.AutoencoderKL
|
||||
params:
|
||||
embed_dim: 4
|
||||
monitor: "val/rec_loss"
|
||||
ckpt_path: "models/first_stage_models/kl-f8/model.ckpt"
|
||||
ddconfig:
|
||||
double_z: True
|
||||
z_channels: 4
|
||||
resolution: 256
|
||||
in_channels: 3
|
||||
out_ch: 3
|
||||
ch: 128
|
||||
ch_mult: [ 1,2,4,4 ] # num_down = len(ch_mult)-1
|
||||
num_res_blocks: 2
|
||||
attn_resolutions: [ ]
|
||||
dropout: 0.0
|
||||
lossconfig:
|
||||
target: torch.nn.Identity
|
||||
|
||||
cond_stage_config: "__is_unconditional__"
|
||||
|
||||
data:
|
||||
target: main.DataModuleFromConfig
|
||||
params:
|
||||
batch_size: 96
|
||||
num_workers: 5
|
||||
wrap: False
|
||||
train:
|
||||
target: ldm.data.lsun.LSUNChurchesTrain
|
||||
params:
|
||||
size: 256
|
||||
validation:
|
||||
target: ldm.data.lsun.LSUNChurchesValidation
|
||||
params:
|
||||
size: 256
|
||||
|
||||
lightning:
|
||||
callbacks:
|
||||
image_logger:
|
||||
target: main.ImageLogger
|
||||
params:
|
||||
batch_frequency: 5000
|
||||
max_images: 8
|
||||
increase_log_steps: False
|
||||
|
||||
|
||||
trainer:
|
||||
benchmark: True
|
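The `scheduler_config` block above configures a warm-up of 10000 steps with `f_start: 1.e-6` and `f_max = f_min = 1`. The sketch below shows one plausible reading of those parameters: a linear ramp of the learning-rate multiplier during warm-up, followed by a linear move from `f_max` to `f_min` over the cycle. This is an assumption about how `LambdaLinearScheduler` behaves, not a copy of its code, and the function name `lr_multiplier` is hypothetical.

```python
def lr_multiplier(step, warm_up_steps=10000, f_start=1.0e-6, f_max=1.0,
                  f_min=1.0, cycle_length=10_000_000_000_000):
    """Return the factor applied to base_learning_rate at a given step.

    Assumed behaviour: linear warm-up from f_start to f_max, then a linear
    decay from f_max towards f_min over the remainder of the cycle.
    """
    if step < warm_up_steps:
        return f_start + (f_max - f_start) * step / warm_up_steps
    return f_min + (f_max - f_min) * (cycle_length - step) / cycle_length


for step in (0, 5000, 10000, 50000):
    print(step, lr_multiplier(step))
```

Because `f_max` and `f_min` are both 1 in this config, the multiplier stays constant once warm-up ends, so the very large `cycle_lengths` value only ensures the cycle never restarts.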
71
configs/latent-diffusion/txt2img-1p4B-eval.yaml
Normal file
@ -0,0 +1,71 @@
|
||||
model:
|
||||
base_learning_rate: 5.0e-05
|
||||
target: ldm.models.diffusion.ddpm.LatentDiffusion
|
||||
params:
|
||||
linear_start: 0.00085
|
||||
linear_end: 0.012
|
||||
num_timesteps_cond: 1
|
||||
log_every_t: 200
|
||||
timesteps: 1000
|
||||
first_stage_key: image
|
||||
cond_stage_key: caption
|
||||
image_size: 32
|
||||
channels: 4
|
||||
cond_stage_trainable: true
|
||||
conditioning_key: crossattn
|
||||
monitor: val/loss_simple_ema
|
||||
scale_factor: 0.18215
|
||||
use_ema: False
|
||||
|
||||
unet_config:
|
||||
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
|
||||
params:
|
||||
image_size: 32
|
||||
in_channels: 4
|
||||
out_channels: 4
|
||||
model_channels: 320
|
||||
attention_resolutions:
|
||||
- 4
|
||||
- 2
|
||||
- 1
|
||||
num_res_blocks: 2
|
||||
channel_mult:
|
||||
- 1
|
||||
- 2
|
||||
- 4
|
||||
- 4
|
||||
num_heads: 8
|
||||
use_spatial_transformer: true
|
||||
transformer_depth: 1
|
||||
context_dim: 1280
|
||||
use_checkpoint: true
|
||||
legacy: False
|
||||
|
||||
first_stage_config:
|
||||
target: ldm.models.autoencoder.AutoencoderKL
|
||||
params:
|
||||
embed_dim: 4
|
||||
monitor: val/rec_loss
|
||||
ddconfig:
|
||||
double_z: true
|
||||
z_channels: 4
|
||||
resolution: 256
|
||||
in_channels: 3
|
||||
out_ch: 3
|
||||
ch: 128
|
||||
ch_mult:
|
||||
- 1
|
||||
- 2
|
||||
- 4
|
||||
- 4
|
||||
num_res_blocks: 2
|
||||
attn_resolutions: []
|
||||
dropout: 0.0
|
||||
lossconfig:
|
||||
target: torch.nn.Identity
|
||||
|
||||
cond_stage_config:
|
||||
target: ldm.modules.encoders.modules.BERTEmbedder
|
||||
params:
|
||||
n_embed: 1280
|
||||
n_layer: 32
|
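The `# num_down = len(ch_mult)-1` comment in the lsun_churches config links the autoencoder's `ch_mult` list to the latent size the diffusion model sees. A small sketch of that relationship, assuming every downsampling stage halves the spatial size (the helper `latent_size` is hypothetical):

```python
def latent_size(resolution, ch_mult):
    """Side length of the latent for a given input resolution.

    Assumes len(ch_mult) - 1 downsampling stages, each halving the size.
    """
    num_down = len(ch_mult) - 1
    return resolution // (2 ** num_down)


# kl-f8 style autoencoder: ch_mult [1, 2, 4, 4] gives three downsamplings.
print(latent_size(256, [1, 2, 4, 4]))
# vq-f4 style autoencoder: ch_mult [1, 2, 4] gives two downsamplings.
print(latent_size(256, [1, 2, 4]))
```

Under that assumption the results agree with the `image_size` values in the configs: 32 for the KL-f8 models and 64 for the VQ-f4 models.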
20
configs/models.yaml
Normal file
@ -0,0 +1,20 @@
|
||||
# This file describes the alternative machine learning models
|
||||
# available to the dream script.
|
||||
#
|
||||
# To add a new model, follow the examples below. Each
|
||||
# model requires a model config file, a weights file,
|
||||
# and the width and height of the images it
|
||||
# was trained on.
|
||||
|
||||
laion400m:
|
||||
config: configs/latent-diffusion/txt2img-1p4B-eval.yaml
|
||||
weights: models/ldm/text2img-large/model.ckpt
|
||||
description: Latent Diffusion LAION400M model
|
||||
width: 256
|
||||
height: 256
|
||||
stable-diffusion-1.4:
|
||||
config: configs/stable-diffusion/v1-inference.yaml
|
||||
weights: models/ldm/stable-diffusion-v1/model.ckpt
|
||||
description: Stable Diffusion inference model version 1.4
|
||||
width: 512
|
||||
height: 512
|
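The header comment of `configs/models.yaml` says each model needs a config file, a weights file, and the training width and height. The sketch below checks an entry against that list. It uses plain dictionaries instead of parsing YAML so that it has no dependencies; the `my-model` entry and the helper `missing_keys` are hypothetical.

```python
REQUIRED_KEYS = {"config", "weights", "width", "height"}


def missing_keys(entry):
    """Return the required keys that a model entry does not define."""
    return sorted(REQUIRED_KEYS - entry.keys())


models = {
    "stable-diffusion-1.4": {
        "config": "configs/stable-diffusion/v1-inference.yaml",
        "weights": "models/ldm/stable-diffusion-v1/model.ckpt",
        "description": "Stable Diffusion inference model version 1.4",
        "width": 512,
        "height": 512,
    },
    # Hypothetical incomplete entry, used only to show a failing check.
    "my-model": {
        "config": "configs/stable-diffusion/v1-inference.yaml",
        "weights": "models/ldm/my-model/model.ckpt",
    },
}

for name, entry in models.items():
    print(name, missing_keys(entry))
```

A check like this could run before model loading so that an incomplete entry is reported by name instead of failing later.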
@ -1,27 +0,0 @@
|
||||
# This file describes the alternative machine learning models
|
||||
# available to InvokeAI script.
|
||||
#
|
||||
# To add a new model, follow the examples below. Each
|
||||
# model requires a model config file, a weights file,
|
||||
# and the width and height of the images it
|
||||
# was trained on.
|
||||
stable-diffusion-1.5:
|
||||
description: The newest Stable Diffusion version 1.5 weight file (4.27 GB)
|
||||
weights: ./models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
|
||||
config: ./configs/stable-diffusion/v1-inference.yaml
|
||||
width: 512
|
||||
height: 512
|
||||
vae: ./models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
|
||||
default: true
|
||||
stable-diffusion-1.4:
|
||||
description: Stable Diffusion inference model version 1.4
|
||||
config: configs/stable-diffusion/v1-inference.yaml
|
||||
weights: models/ldm/stable-diffusion-v1/sd-v1-4.ckpt
|
||||
vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
|
||||
width: 512
|
||||
height: 512
|
||||
inpainting-1.5:
|
||||
weights: models/ldm/stable-diffusion-v1/sd-v1-5-inpainting.ckpt
|
||||
config: configs/stable-diffusion/v1-inpainting-inference.yaml
|
||||
vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
|
||||
description: RunwayML SD 1.5 model optimized for inpainting
|
68
configs/retrieval-augmented-diffusion/768x768.yaml
Normal file
@ -0,0 +1,68 @@
|
||||
model:
|
||||
base_learning_rate: 0.0001
|
||||
target: ldm.models.diffusion.ddpm.LatentDiffusion
|
||||
params:
|
||||
linear_start: 0.0015
|
||||
linear_end: 0.015
|
||||
num_timesteps_cond: 1
|
||||
log_every_t: 200
|
||||
timesteps: 1000
|
||||
first_stage_key: jpg
|
||||
cond_stage_key: nix
|
||||
image_size: 48
|
||||
channels: 16
|
||||
cond_stage_trainable: false
|
||||
conditioning_key: crossattn
|
||||
monitor: val/loss_simple_ema
|
||||
scale_by_std: false
|
||||
scale_factor: 0.22765929
|
||||
unet_config:
|
||||
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
|
||||
params:
|
||||
image_size: 48
|
||||
in_channels: 16
|
||||
out_channels: 16
|
||||
model_channels: 448
|
||||
attention_resolutions:
|
||||
- 4
|
||||
- 2
|
||||
- 1
|
||||
num_res_blocks: 2
|
||||
channel_mult:
|
||||
- 1
|
||||
- 2
|
||||
- 3
|
||||
- 4
|
||||
use_scale_shift_norm: false
|
||||
resblock_updown: false
|
||||
num_head_channels: 32
|
||||
use_spatial_transformer: true
|
||||
transformer_depth: 1
|
||||
context_dim: 768
|
||||
use_checkpoint: true
|
||||
first_stage_config:
|
||||
target: ldm.models.autoencoder.AutoencoderKL
|
||||
params:
|
||||
monitor: val/rec_loss
|
||||
embed_dim: 16
|
||||
ddconfig:
|
||||
double_z: true
|
||||
z_channels: 16
|
||||
resolution: 256
|
||||
in_channels: 3
|
||||
out_ch: 3
|
||||
ch: 128
|
||||
ch_mult:
|
||||
- 1
|
||||
- 1
|
||||
- 2
|
||||
- 2
|
||||
- 4
|
||||
num_res_blocks: 2
|
||||
attn_resolutions:
|
||||
- 16
|
||||
dropout: 0.0
|
||||
lossconfig:
|
||||
target: torch.nn.Identity
|
||||
cond_stage_config:
|
||||
target: torch.nn.Identity
|
@ -76,4 +76,4 @@ model:
|
||||
target: torch.nn.Identity
|
||||
|
||||
cond_stage_config:
|
||||
target: ldm.modules.encoders.modules.WeightedFrozenCLIPEmbedder
|
||||
target: ldm.modules.encoders.modules.FrozenCLIPEmbedder
|
||||
|
@ -1,79 +0,0 @@
|
||||
model:
|
||||
base_learning_rate: 7.5e-05
|
||||
target: ldm.models.diffusion.ddpm.LatentInpaintDiffusion
|
||||
params:
|
||||
linear_start: 0.00085
|
||||
linear_end: 0.0120
|
||||
num_timesteps_cond: 1
|
||||
log_every_t: 200
|
||||
timesteps: 1000
|
||||
first_stage_key: "jpg"
|
||||
cond_stage_key: "txt"
|
||||
image_size: 64
|
||||
channels: 4
|
||||
cond_stage_trainable: false # Note: different from the one we trained before
|
||||
conditioning_key: hybrid # important
|
||||
monitor: val/loss_simple_ema
|
||||
scale_factor: 0.18215
|
||||
finetune_keys: null
|
||||
|
||||
scheduler_config: # 10000 warmup steps
|
||||
target: ldm.lr_scheduler.LambdaLinearScheduler
|
||||
params:
|
||||
warm_up_steps: [ 2500 ] # NOTE for resuming. use 10000 if starting from scratch
|
||||
cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
|
||||
f_start: [ 1.e-6 ]
|
||||
f_max: [ 1. ]
|
||||
f_min: [ 1. ]
|
||||
|
||||
personalization_config:
|
||||
target: ldm.modules.embedding_manager.EmbeddingManager
|
||||
params:
|
||||
placeholder_strings: ["*"]
|
||||
initializer_words: ['face', 'man', 'photo', 'africanmale']
|
||||
per_image_tokens: false
|
||||
num_vectors_per_token: 1
|
||||
progressive_words: False
|
||||
|
||||
unet_config:
|
||||
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
|
||||
params:
|
||||
image_size: 32 # unused
|
||||
in_channels: 9 # 4 data + 4 downscaled image + 1 mask
|
||||
out_channels: 4
|
||||
model_channels: 320
|
||||
attention_resolutions: [ 4, 2, 1 ]
|
||||
num_res_blocks: 2
|
||||
channel_mult: [ 1, 2, 4, 4 ]
|
||||
num_heads: 8
|
||||
use_spatial_transformer: True
|
||||
transformer_depth: 1
|
||||
context_dim: 768
|
||||
use_checkpoint: True
|
||||
legacy: False
|
||||
|
||||
first_stage_config:
|
||||
target: ldm.models.autoencoder.AutoencoderKL
|
||||
params:
|
||||
embed_dim: 4
|
||||
monitor: val/rec_loss
|
||||
ddconfig:
|
||||
double_z: true
|
||||
z_channels: 4
|
||||
resolution: 256
|
||||
in_channels: 3
|
||||
out_ch: 3
|
||||
ch: 128
|
||||
ch_mult:
|
||||
- 1
|
||||
- 2
|
||||
- 4
|
||||
- 4
|
||||
num_res_blocks: 2
|
||||
attn_resolutions: []
|
||||
dropout: 0.0
|
||||
lossconfig:
|
||||
target: torch.nn.Identity
|
||||
|
||||
cond_stage_config:
|
||||
target: ldm.modules.encoders.modules.WeightedFrozenCLIPEmbedder
|
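The inpainting config above sets `in_channels: 9` with the comment "4 data + 4 downscaled image + 1 mask". The sketch below reproduces that count by concatenating tensor shapes along the channel axis. It assumes the three inputs are stacked channel-wise, as the inline comment suggests; `concat_channels` is a hypothetical helper that works on shape tuples only.

```python
def concat_channels(shapes):
    """Shape obtained by concatenating (C, H, W) tensors along the channel axis."""
    spatial = {shape[1:] for shape in shapes}
    if len(spatial) != 1:
        raise ValueError("all inputs must share the same height and width")
    channels = sum(shape[0] for shape in shapes)
    return (channels, *shapes[0][1:])


noisy_latent = (4, 64, 64)   # latent being denoised ("4 data")
masked_image = (4, 64, 64)   # encoded image ("4 downscaled image")
mask = (1, 64, 64)           # single-channel mask ("1 mask")

print(concat_channels([noisy_latent, masked_image, mask]))
```

The UNet still predicts only the 4 latent channels, which is why `out_channels` stays at 4.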
@ -1,74 +1,57 @@
|
||||
FROM ubuntu AS get_miniconda
|
||||
FROM debian
|
||||
|
||||
SHELL ["/bin/bash", "-c"]
|
||||
ARG gsd
|
||||
ENV GITHUB_STABLE_DIFFUSION $gsd
|
||||
|
||||
# install wget
|
||||
RUN apt-get update \
|
||||
&& apt-get install -y \
|
||||
wget \
|
||||
&& apt-get clean \
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
ARG rsd
|
||||
ENV REQS $rsd
|
||||
|
||||
# download and install miniconda
|
||||
ARG conda_version=py39_4.12.0-Linux-x86_64
|
||||
ARG conda_prefix=/opt/conda
|
||||
RUN wget --progress=dot:giga -O /miniconda.sh \
|
||||
https://repo.anaconda.com/miniconda/Miniconda3-${conda_version}.sh \
|
||||
&& bash /miniconda.sh -b -p ${conda_prefix} \
|
||||
&& rm -f /miniconda.sh
|
||||
ARG cs
|
||||
ENV CONDA_SUBDIR $cs
|
||||
|
||||
FROM ubuntu AS invokeai
|
||||
ENV PIP_EXISTS_ACTION="w"
|
||||
|
||||
# use bash
|
||||
SHELL [ "/bin/bash", "-c" ]
|
||||
# TODO: Optimize image size
|
||||
|
||||
# clean bashrc
|
||||
RUN echo "" > ~/.bashrc
|
||||
SHELL ["/bin/bash", "-c"]
|
||||
|
||||
# Install necessary packages
|
||||
RUN apt-get update \
|
||||
&& apt-get install -y \
|
||||
--no-install-recommends \
|
||||
gcc \
|
||||
git \
|
||||
libgl1-mesa-glx \
|
||||
libglib2.0-0 \
|
||||
pip \
|
||||
python3 \
|
||||
python3-dev \
|
||||
&& apt-get clean \
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
WORKDIR /
|
||||
RUN apt update && apt upgrade -y \
|
||||
&& apt install -y \
|
||||
git \
|
||||
libgl1-mesa-glx \
|
||||
libglib2.0-0 \
|
||||
pip \
|
||||
python3 \
|
||||
&& git clone $GITHUB_STABLE_DIFFUSION
|
||||
|
||||
# clone repository and create symlinks
|
||||
ARG invokeai_git=https://github.com/invoke-ai/InvokeAI.git
|
||||
ARG project_name=invokeai
|
||||
RUN git clone ${invokeai_git} /${project_name} \
|
||||
&& mkdir /${project_name}/models/ldm/stable-diffusion-v1 \
|
||||
&& ln -s /data/models/sd-v1-4.ckpt /${project_name}/models/ldm/stable-diffusion-v1/model.ckpt \
|
||||
&& ln -s /data/outputs/ /${project_name}/outputs
|
||||
# Install Anaconda or Miniconda
|
||||
COPY anaconda.sh .
|
||||
RUN bash anaconda.sh -b -u -p /anaconda && /anaconda/bin/conda init bash
|
||||
|
||||
# set workdir
|
||||
WORKDIR /${project_name}
|
||||
# SD
|
||||
WORKDIR /stable-diffusion
|
||||
RUN source ~/.bashrc \
|
||||
&& conda create -y --name ldm && conda activate ldm \
|
||||
&& conda config --env --set subdir $CONDA_SUBDIR \
|
||||
&& pip3 install -r $REQS \
|
||||
&& pip3 install basicsr facexlib realesrgan \
|
||||
&& mkdir models/ldm/stable-diffusion-v1 \
|
||||
&& ln -s "/data/sd-v1-4.ckpt" models/ldm/stable-diffusion-v1/model.ckpt
|
||||
|
||||
# install conda env and preload models
|
||||
ARG conda_prefix=/opt/conda
|
||||
ARG conda_env_file=environment.yml
|
||||
COPY --from=get_miniconda ${conda_prefix} ${conda_prefix}
|
||||
RUN source ${conda_prefix}/etc/profile.d/conda.sh \
|
||||
&& conda init bash \
|
||||
&& source ~/.bashrc \
|
||||
&& conda env create \
|
||||
--name ${project_name} \
|
||||
--file ${conda_env_file} \
|
||||
&& rm -Rf ~/.cache \
|
||||
&& conda clean -afy \
|
||||
&& echo "conda activate ${project_name}" >> ~/.bashrc \
|
||||
&& ln -s /data/models/GFPGANv1.4.pth ./src/gfpgan/experiments/pretrained_models/GFPGANv1.4.pth \
|
||||
&& conda activate ${project_name} \
|
||||
&& python scripts/preload_models.py
|
||||
# Face restoration
|
||||
# by default expected in a sibling directory to stable-diffusion
|
||||
WORKDIR /
|
||||
RUN git clone https://github.com/TencentARC/GFPGAN.git
|
||||
|
||||
# Copy entrypoint and set env
|
||||
ENV CONDA_PREFIX=${conda_prefix}
|
||||
ENV PROJECT_NAME=${project_name}
|
||||
COPY docker-build/entrypoint.sh /
|
||||
ENTRYPOINT [ "/entrypoint.sh" ]
|
||||
WORKDIR /GFPGAN
|
||||
RUN pip3 install -r requirements.txt \
|
||||
&& python3 setup.py develop \
|
||||
&& ln -s "/data/GFPGANv1.4.pth" experiments/pretrained_models/GFPGANv1.4.pth
|
||||
|
||||
WORKDIR /stable-diffusion
|
||||
RUN python3 scripts/preload_models.py
|
||||
|
||||
WORKDIR /
|
||||
COPY entrypoint.sh .
|
||||
ENTRYPOINT ["/entrypoint.sh"]
|
@ -1,81 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
set -e
|
||||
# IMPORTANT: You need to have a token on huggingface.co to be able to download the checkpoint!!!
|
||||
# configure values by using env when executing build.sh
|
||||
# e.g. env ARCH=aarch64 INVOKEAI_GIT=https://github.com/yourname/yourfork.git ./build.sh
|
||||
|
||||
source ./docker-build/env.sh || { echo "please run from repository root"; exit 1; }
|
||||
|
||||
invokeai_conda_version=${INVOKEAI_CONDA_VERSION:-py39_4.12.0-${platform/\//-}}
|
||||
invokeai_conda_prefix=${INVOKEAI_CONDA_PREFIX:-\/opt\/conda}
|
||||
invokeai_conda_env_file=${INVOKEAI_CONDA_ENV_FILE:-environment.yml}
|
||||
invokeai_git=${INVOKEAI_GIT:-https://github.com/invoke-ai/InvokeAI.git}
|
||||
huggingface_token=${HUGGINGFACE_TOKEN?}
|
||||
|
||||
# print the settings
|
||||
echo "You are using these values:"
|
||||
echo -e "project_name:\t\t ${project_name}"
|
||||
echo -e "volumename:\t\t ${volumename}"
|
||||
echo -e "arch:\t\t\t ${arch}"
|
||||
echo -e "platform:\t\t ${platform}"
|
||||
echo -e "invokeai_conda_version:\t ${invokeai_conda_version}"
|
||||
echo -e "invokeai_conda_prefix:\t ${invokeai_conda_prefix}"
|
||||
echo -e "invokeai_conda_env_file: ${invokeai_conda_env_file}"
|
||||
echo -e "invokeai_git:\t\t ${invokeai_git}"
|
||||
echo -e "invokeai_tag:\t\t ${invokeai_tag}\n"
|
||||
|
||||
_runAlpine() {
|
||||
docker run \
|
||||
--rm \
|
||||
--interactive \
|
||||
--tty \
|
||||
--mount source="$volumename",target=/data \
|
||||
--workdir /data \
|
||||
alpine "$@"
|
||||
}
|
||||
|
||||
_copyCheckpoints() {
|
||||
echo "creating subfolders for models and outputs"
|
||||
_runAlpine mkdir models
|
||||
_runAlpine mkdir outputs
|
||||
echo -n "downloading sd-v1-4.ckpt"
|
||||
_runAlpine wget --header="Authorization: Bearer ${huggingface_token}" -O models/sd-v1-4.ckpt https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4.ckpt
|
||||
echo "done"
|
||||
echo "downloading GFPGANv1.4.pth"
|
||||
_runAlpine wget -O models/GFPGANv1.4.pth https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.4.pth
|
||||
}
|
||||
|
||||
_checkVolumeContent() {
|
||||
_runAlpine ls -lhA /data/models
|
||||
}
|
||||
|
||||
_getModelMd5s() {
|
||||
_runAlpine \
|
||||
sh -c "md5sum /data/models/*"
|
||||
}
|
||||
|
||||
if [[ -n "$(docker volume ls -f name="${volumename}" -q)" ]]; then
|
||||
echo "Volume already exists"
|
||||
if [[ -z "$(_checkVolumeContent)" ]]; then
|
||||
echo "looks empty, copying checkpoint"
|
||||
_copyCheckpoints
|
||||
fi
|
||||
echo "Models in ${volumename}:"
|
||||
_checkVolumeContent
|
||||
else
|
||||
echo -n "creating docker volume "
|
||||
docker volume create "${volumename}"
|
||||
_copyCheckpoints
|
||||
fi
|
||||
|
||||
# Build Container
|
||||
docker build \
|
||||
--platform="${platform}" \
|
||||
--tag "${invokeai_tag}" \
|
||||
--build-arg project_name="${project_name}" \
|
||||
--build-arg conda_version="${invokeai_conda_version}" \
|
||||
--build-arg conda_prefix="${invokeai_conda_prefix}" \
|
||||
--build-arg conda_env_file="${invokeai_conda_env_file}" \
|
||||
--build-arg invokeai_git="${invokeai_git}" \
|
||||
--file ./docker-build/Dockerfile \
|
||||
.
|
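The build script above derives the Miniconda installer name from environment variables with shell defaults (`${VAR:-default}`) and a `/` to `-` substitution on the platform string. The Python sketch below mirrors that derivation so the resulting download URL is easy to inspect. The function name `miniconda_url` is hypothetical; the variable names, defaults and URL pattern are taken from `env.sh`, `build.sh` and the Dockerfile.

```python
import os


def miniconda_url(env=None):
    """Rebuild the installer URL the way build.sh and the Dockerfile combine it."""
    env = os.environ if env is None else env
    # ${VAR:-default} falls back when the variable is unset or empty.
    arch = env.get("ARCH") or "x86_64"
    platform = env.get("PLATFORM") or f"Linux/{arch}"
    # ${platform/\//-} replaces only the first "/" with "-".
    default_version = "py39_4.12.0-" + platform.replace("/", "-", 1)
    version = env.get("INVOKEAI_CONDA_VERSION") or default_version
    return f"https://repo.anaconda.com/miniconda/Miniconda3-{version}.sh"


print(miniconda_url({}))
print(miniconda_url({"ARCH": "aarch64"}))
```

Passing an explicit dictionary instead of reading `os.environ` keeps the sketch easy to test with different architectures.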
@ -1,8 +1,10 @@
|
||||
#!/bin/bash
|
||||
set -e
|
||||
|
||||
source "${CONDA_PREFIX}/etc/profile.d/conda.sh"
|
||||
conda activate "${PROJECT_NAME}"
|
||||
cd /stable-diffusion
|
||||
|
||||
python scripts/invoke.py \
|
||||
${@:---web --host=0.0.0.0}
|
||||
if [ $# -eq 0 ]; then
|
||||
python3 scripts/dream.py --full_precision -o /data
|
||||
# bash
|
||||
else
|
||||
python3 scripts/dream.py --full_precision -o /data "$@"
|
||||
fi
|
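One version of the entrypoint above uses `${@:---web --host=0.0.0.0}`, which passes the container arguments through to `scripts/invoke.py` and falls back to the web server flags when none are given. A minimal Python sketch of the same fallback (the name `invoke_command` is hypothetical):

```python
DEFAULT_ARGS = ["--web", "--host=0.0.0.0"]


def invoke_command(args):
    """Build the command line, using the default flags when args is empty."""
    return ["python", "scripts/invoke.py", *(args or DEFAULT_ARGS)]


print(invoke_command([]))
print(invoke_command(["--web", "--host=127.0.0.1"]))
```

The other version of the script spells out the same idea with an explicit `if [ $# -eq 0 ]` branch.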
@ -1,13 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
|
||||
project_name=${PROJECT_NAME:-invokeai}
|
||||
volumename=${VOLUMENAME:-${project_name}_data}
|
||||
arch=${ARCH:-x86_64}
|
||||
platform=${PLATFORM:-Linux/${arch}}
|
||||
invokeai_tag=${INVOKEAI_TAG:-${project_name}-${arch}}
|
||||
|
||||
export project_name
|
||||
export volumename
|
||||
export arch
|
||||
export platform
|
||||
export invokeai_tag
|
@ -1,15 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
set -e
|
||||
|
||||
source ./docker-build/env.sh || { echo "please run from repository root"; exit 1; }
|
||||
|
||||
docker run \
|
||||
--interactive \
|
||||
--tty \
|
||||
--rm \
|
||||
--platform "$platform" \
|
||||
--name "$project_name" \
|
||||
--hostname "$project_name" \
|
||||
--mount source="$volumename",target=/data \
|
||||
--publish 9090:9090 \
|
||||
"$invokeai_tag" ${1:+$@}
|
@ -4,94 +4,6 @@ title: Changelog
|
||||
|
||||
# :octicons-log-16: **Changelog**
|
||||
|
||||
## v2.1.0 (2 November 2022)
|
||||
- update mac instructions to use invokeai for env name by @willwillems in https://github.com/invoke-ai/InvokeAI/pull/1030
|
||||
- Update .gitignore by @blessedcoolant in https://github.com/invoke-ai/InvokeAI/pull/1040
|
||||
- reintroduce fix for m1 from https://github.com/invoke-ai/InvokeAI/pull/579 missing after merge by @skurovec in https://github.com/invoke-ai/InvokeAI/pull/1056
|
||||
- Update Stable_Diffusion_AI_Notebook.ipynb (Take 2) by @ChloeL19 in https://github.com/invoke-ai/InvokeAI/pull/1060
|
||||
- Print out the device type which is used by @manzke in https://github.com/invoke-ai/InvokeAI/pull/1073
|
||||
- Hires Addition by @hipsterusername in https://github.com/invoke-ai/InvokeAI/pull/1063
|
||||
- fix for "1 leaked semaphore objects to clean up at shutdown" on M1 by @skurovec in https://github.com/invoke-ai/InvokeAI/pull/1081
|
||||
- Forward dream.py to invoke.py using the same interpreter, add deprecation warning by @db3000 in https://github.com/invoke-ai/InvokeAI/pull/1077
|
||||
- fix noisy images at high step counts by @lstein in https://github.com/invoke-ai/InvokeAI/pull/1086
|
||||
- Generalize facetool strength argument by @db3000 in https://github.com/invoke-ai/InvokeAI/pull/1078
|
||||
- Enable fast switching among models at the invoke> command line by @lstein in https://github.com/invoke-ai/InvokeAI/pull/1066
|
||||
- Fix Typo, committed changing ldm environment to invokeai by @jdries3 in https://github.com/invoke-ai/InvokeAI/pull/1095
|
||||
- Update generate.py by @unreleased in https://github.com/invoke-ai/InvokeAI/pull/1109
|
||||
- Update 'ldm' env to 'invokeai' in troubleshooting steps by @19wolf in https://github.com/invoke-ai/InvokeAI/pull/1125
|
||||
- Fixed documentation typos and resolved merge conflicts by @rupeshs in https://github.com/invoke-ai/InvokeAI/pull/1123
|
||||
- Fix broken doc links, fix malaprop in the project subtitle by @majick in https://github.com/invoke-ai/InvokeAI/pull/1131
|
||||
- Only output facetool parameters if enhancing faces by @db3000 in https://github.com/invoke-ai/InvokeAI/pull/1119
|
||||
- Update gitignore to ignore codeformer weights at new location by @spezialspezial in https://github.com/invoke-ai/InvokeAI/pull/1136
|
||||
- fix links to point to invoke-ai.github.io #1117 by @mauwii in https://github.com/invoke-ai/InvokeAI/pull/1143
|
||||
- Rework-mkdocs by @mauwii in https://github.com/invoke-ai/InvokeAI/pull/1144
|
||||
- add option to CLI and pngwriter that allows user to set PNG compression level by @lstein in https://github.com/invoke-ai/InvokeAI/pull/1127
|
||||
- Fix img2img DDIM index out of bound by @wfng92 in https://github.com/invoke-ai/InvokeAI/pull/1137
|
||||
- Fix gh actions by @mauwii in https://github.com/invoke-ai/InvokeAI/pull/1128
|
||||
- update mac instructions to use invokeai for env name by @willwillems in https://github.com/invoke-ai/InvokeAI/pull/1030
|
||||
- Update .gitignore by @blessedcoolant in https://github.com/invoke-ai/InvokeAI/pull/1040
|
||||
- reintroduce fix for m1 from https://github.com/invoke-ai/InvokeAI/pull/579 missing after merge by @skurovec in https://github.com/invoke-ai/InvokeAI/pull/1056
|
||||
- Update Stable_Diffusion_AI_Notebook.ipynb (Take 2) by @ChloeL19 in https://github.com/invoke-ai/InvokeAI/pull/1060
|
||||
- Print out the device type which is used by @manzke in https://github.com/invoke-ai/InvokeAI/pull/1073
|
||||
- Hires Addition by @hipsterusername in https://github.com/invoke-ai/InvokeAI/pull/1063
|
||||
- fix for "1 leaked semaphore objects to clean up at shutdown" on M1 by @skurovec in https://github.com/invoke-ai/InvokeAI/pull/1081
|
||||
- Forward dream.py to invoke.py using the same interpreter, add deprecation warning by @db3000 in https://github.com/invoke-ai/InvokeAI/pull/1077
|
||||
- fix noisy images at high step counts by @lstein in https://github.com/invoke-ai/InvokeAI/pull/1086
|
||||
- Generalize facetool strength argument by @db3000 in https://github.com/invoke-ai/InvokeAI/pull/1078
|
||||
- Enable fast switching among models at the invoke> command line by @lstein in https://github.com/invoke-ai/InvokeAI/pull/1066
|
||||
- Fix Typo, committed changing ldm environment to invokeai by @jdries3 in https://github.com/invoke-ai/InvokeAI/pull/1095
|
||||
- Fixed documentation typos and resolved merge conflicts by @rupeshs in https://github.com/invoke-ai/InvokeAI/pull/1123
|
||||
- Only output facetool parameters if enhancing faces by @db3000 in https://github.com/invoke-ai/InvokeAI/pull/1119
|
||||
- add option to CLI and pngwriter that allows user to set PNG compression level by @lstein in https://github.com/invoke-ai/InvokeAI/pull/1127
|
||||
- Fix img2img DDIM index out of bound by @wfng92 in https://github.com/invoke-ai/InvokeAI/pull/1137
|
||||
- Add text prompt to inpaint mask support by @lstein in https://github.com/invoke-ai/InvokeAI/pull/1133
|
||||
- Respect http[s] protocol when making socket.io middleware by @damian0815 in https://github.com/invoke-ai/InvokeAI/pull/976
|
||||
- WebUI: Adds Codeformer support by @psychedelicious in https://github.com/invoke-ai/InvokeAI/pull/1151
|
||||
- Skips normalizing prompts for web UI metadata by @psychedelicious in https://github.com/invoke-ai/InvokeAI/pull/1165
|
||||
- Add Asymmetric Tiling by @carson-katri in https://github.com/invoke-ai/InvokeAI/pull/1132
|
||||
- Web UI: Increases max CFG Scale to 200 by @psychedelicious in https://github.com/invoke-ai/InvokeAI/pull/1172
|
||||
- Corrects color channels in face restoration; Fixes #1167 by @psychedelicious in https://github.com/invoke-ai/InvokeAI/pull/1175
|
||||
- Flips channels using array slicing instead of using OpenCV by @psychedelicious in https://github.com/invoke-ai/InvokeAI/pull/1178
|
||||
- Fix typo in docs: s/Formally/Formerly by @noodlebox in https://github.com/invoke-ai/InvokeAI/pull/1176
|
||||
- fix clipseg loading problems by @lstein in https://github.com/invoke-ai/InvokeAI/pull/1177
|
||||
- Correct color channels in upscale using array slicing by @wfng92 in https://github.com/invoke-ai/InvokeAI/pull/1181
|
||||
- Web UI: Filters existing images when adding new images; Fixes #1085 by @psychedelicious in https://github.com/invoke-ai/InvokeAI/pull/1171
|
||||
- fix a number of bugs in textual inversion by @lstein in https://github.com/invoke-ai/InvokeAI/pull/1190
|
||||
- Improve !fetch, add !replay command by @ArDiouscuros in https://github.com/invoke-ai/InvokeAI/pull/882
|
||||
- Fix generation of image with s>1000 by @holstvoogd in https://github.com/invoke-ai/InvokeAI/pull/951
|
||||
- Web UI: Gallery improvements by @psychedelicious in https://github.com/invoke-ai/InvokeAI/pull/1198
|
||||
- Update CLI.md by @krummrey in https://github.com/invoke-ai/InvokeAI/pull/1211
|
||||
- outcropping improvements by @lstein in https://github.com/invoke-ai/InvokeAI/pull/1207
|
||||
- add support for loading VAE autoencoders by @lstein in https://github.com/invoke-ai/InvokeAI/pull/1216
|
||||
- remove duplicate fix_func for MPS by @wfng92 in https://github.com/invoke-ai/InvokeAI/pull/1210
|
||||
- Metadata storage and retrieval fixes by @lstein in https://github.com/invoke-ai/InvokeAI/pull/1204
|
||||
- nix: add shell.nix file by @Cloudef in https://github.com/invoke-ai/InvokeAI/pull/1170
|
||||
- Web UI: Changes vite dist asset paths to relative by @psychedelicious in https://github.com/invoke-ai/InvokeAI/pull/1185
|
||||
- Web UI: Removes isDisabled from PromptInput by @psychedelicious in https://github.com/invoke-ai/InvokeAI/pull/1187
|
||||
- Allow user to generate images with initial noise as on M1 / mps system by @ArDiouscuros in https://github.com/invoke-ai/InvokeAI/pull/981
|
||||
- feat: adding filename format template by @plucked in https://github.com/invoke-ai/InvokeAI/pull/968
|
||||
- Web UI: Fixes broken bundle by @psychedelicious in https://github.com/invoke-ai/InvokeAI/pull/1242
|
||||
- Support runwayML custom inpainting model by @lstein in https://github.com/invoke-ai/InvokeAI/pull/1243
|
||||
- Update IMG2IMG.md by @talitore in https://github.com/invoke-ai/InvokeAI/pull/1262
|
||||
- New dockerfile - including a build- and a run- script as well as a GH-Action by @mauwii in https://github.com/invoke-ai/InvokeAI/pull/1233
|
||||
- cut over from karras to model noise schedule for higher steps by @lstein in https://github.com/invoke-ai/InvokeAI/pull/1222
|
||||
- Prompt tweaks by @lstein in https://github.com/invoke-ai/InvokeAI/pull/1268
|
||||
- Outpainting implementation by @Kyle0654 in https://github.com/invoke-ai/InvokeAI/pull/1251
|
||||
- fixing aspect ratio on hires by @tjennings in https://github.com/invoke-ai/InvokeAI/pull/1249
|
||||
- Fix-build-container-action by @mauwii in https://github.com/invoke-ai/InvokeAI/pull/1274
|
||||
- handle all unicode characters by @damian0815 in https://github.com/invoke-ai/InvokeAI/pull/1276
|
||||
- adds models.user.yml to .gitignore by @JakeHL in https://github.com/invoke-ai/InvokeAI/pull/1281
|
||||
- remove debug branch, set fail-fast to false by @mauwii in https://github.com/invoke-ai/InvokeAI/pull/1284
|
||||
- Protect-secrets-on-pr by @mauwii in https://github.com/invoke-ai/InvokeAI/pull/1285
|
||||
- Web UI: Adds initial inpainting implementation by @psychedelicious in https://github.com/invoke-ai/InvokeAI/pull/1225
|
||||
- fix environment-mac.yml - tested on x64 and arm64 by @mauwii in https://github.com/invoke-ai/InvokeAI/pull/1289
|
||||
- Use proper authentication to download model by @mauwii in https://github.com/invoke-ai/InvokeAI/pull/1287
|
||||
- Prevent indexing error for mode RGB by @spezialspezial in https://github.com/invoke-ai/InvokeAI/pull/1294
|
||||
- Integrate sd-v1-5 model into test matrix (easily expandable), remove unecesarry caches by @mauwii in https://github.com/invoke-ai/InvokeAI/pull/1293
|
||||
- add --no-interactive to preload_models step by @mauwii in https://github.com/invoke-ai/InvokeAI/pull/1302
|
||||
- 1-click installer and updater. Uses micromamba to install git and conda into a contained environment (if necessary) before running the normal installation script by @cmdr2 in https://github.com/invoke-ai/InvokeAI/pull/1253
|
||||
- preload_models.py script downloads the weight files by @lstein in https://github.com/invoke-ai/InvokeAI/pull/1290
|
||||
|
||||
## v2.0.1 (13 October 2022)
|
||||
|
||||
- fix noisy images at high step count when using k* samplers
|
||||
|
@ -1,116 +0,0 @@
|
||||
## 000001.1863159593.png
|
||||

|
||||
|
||||
banana sushi -s 50 -S 1863159593 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
## 000002.1151955949.png
|
||||

|
||||
|
||||
banana sushi -s 50 -S 1151955949 -W 512 -H 512 -C 7.5 -A plms
|
||||
## 000003.2736230502.png
|
||||

|
||||
|
||||
banana sushi -s 50 -S 2736230502 -W 512 -H 512 -C 7.5 -A ddim
|
||||
## 000004.42.png
|
||||

|
||||
|
||||
banana sushi -s 50 -S 42 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
## 000005.42.png
|
||||

|
||||
|
||||
banana sushi -s 50 -S 42 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
## 000006.478163327.png
|
||||

|
||||
|
||||
banana sushi -s 50 -S 478163327 -W 640 -H 448 -C 7.5 -A k_lms
|
||||
## 000007.2407640369.png
|
||||

|
||||
|
||||
banana sushi -s 50 -S 42 -W 512 -H 512 -C 7.5 -A k_lms -V 2407640369:0.1
|
||||
## 000008.2772421987.png
|
||||

|
||||
|
||||
banana sushi -s 50 -S 42 -W 512 -H 512 -C 7.5 -A k_lms -V 2772421987:0.1
|
||||
## 000009.3532317557.png
|
||||

|
||||
|
||||
banana sushi -s 50 -S 42 -W 512 -H 512 -C 7.5 -A k_lms -V 3532317557:0.1
|
||||
## 000010.2028635318.png
|
||||

|
||||
|
||||
banana sushi -s 50 -S 2028635318 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
## 000011.1111168647.png
|
||||

|
||||
|
||||
pond with waterlillies -s 50 -S 1111168647 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
## 000012.1476370516.png
|
||||

|
||||
|
||||
pond with waterlillies -s 50 -S 1476370516 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
## 000013.4281108706.png
|
||||

|
||||
|
||||
banana sushi -s 50 -S 4281108706 -W 960 -H 960 -C 7.5 -A k_lms
|
||||
## 000014.2396987386.png
|
||||

|
||||
|
||||
old sea captain with crow on shoulder -s 50 -S 2396987386 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/Lincoln-and-Parrot-512.png -A k_lms -f 0.75
|
||||
## 000015.1252923272.png
|
||||

|
||||
|
||||
old sea captain with crow on shoulder -s 50 -S 1252923272 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/Lincoln-and-Parrot-512-transparent.png -A k_lms -f 0.75
|
||||
## 000016.2633891320.png
|
||||

|
||||
|
||||
old sea captain with crow on shoulder -s 50 -S 2633891320 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/Lincoln-and-Parrot-512.png -A plms -f 0.75
|
||||
## 000017.1134411920.png
|
||||

|
||||
|
||||
old sea captain with crow on shoulder -s 50 -S 1134411920 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/Lincoln-and-Parrot-512.png -A k_euler_a -f 0.75
|
||||
## 000018.47.png
|
||||

|
||||
|
||||
big red dog playing with cat -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
## 000019.47.png
|
||||

|
||||
|
||||
big red++++ dog playing with cat -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
## 000020.47.png
|
||||

|
||||
|
||||
big red dog playing with cat+++ -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
## 000021.47.png
|
||||

|
||||
|
||||
big (red dog).swap(tiger) playing with cat -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
## 000022.47.png
|
||||

|
||||
|
||||
dog:1,cat:2 -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
## 000023.47.png
|
||||

|
||||
|
||||
dog:2,cat:1 -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
## 000024.1029061431.png
|
||||

|
||||
|
||||
medusa with cobras -s 50 -S 1029061431 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/curly.png -A k_lms -f 0.75 -tm hair
|
||||
## 000025.1284519352.png
|
||||

|
||||
|
||||
bearded man -s 50 -S 1284519352 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/curly.png -A k_lms -f 0.75 -tm face
|
||||
## curly.942491079.gfpgan.png
|
||||

|
||||
|
||||
!fix ./docs/assets/preflight-checks/inputs/curly.png -s 50 -S 942491079 -W 512 -H 512 -C 7.5 -A k_lms -G 0.8 -ft gfpgan -U 2.0 0.75
|
||||
## curly.942491079.outcrop.png
|
||||

|
||||
|
||||
!fix ./docs/assets/preflight-checks/inputs/curly.png -s 50 -S 942491079 -W 512 -H 512 -C 7.5 -A k_lms -c top 64
|
||||
## curly.942491079.outpaint.png
|
||||

|
||||
|
||||
!fix ./docs/assets/preflight-checks/inputs/curly.png -s 50 -S 942491079 -W 512 -H 512 -C 7.5 -A k_lms -D top 64
|
||||
## curly.942491079.outcrop-01.png
|
||||

|
||||
|
||||
!fix ./docs/assets/preflight-checks/inputs/curly.png -s 50 -S 942491079 -W 512 -H 512 -C 7.5 -A k_lms -c top 64
|
@ -1,29 +0,0 @@
|
||||
outputs/preflight/000001.1863159593.png: banana sushi -s 50 -S 1863159593 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
outputs/preflight/000002.1151955949.png: banana sushi -s 50 -S 1151955949 -W 512 -H 512 -C 7.5 -A plms
|
||||
outputs/preflight/000003.2736230502.png: banana sushi -s 50 -S 2736230502 -W 512 -H 512 -C 7.5 -A ddim
|
||||
outputs/preflight/000004.42.png: banana sushi -s 50 -S 42 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
outputs/preflight/000005.42.png: banana sushi -s 50 -S 42 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
outputs/preflight/000006.478163327.png: banana sushi -s 50 -S 478163327 -W 640 -H 448 -C 7.5 -A k_lms
|
||||
outputs/preflight/000007.2407640369.png: banana sushi -s 50 -S 42 -W 512 -H 512 -C 7.5 -A k_lms -V 2407640369:0.1
|
||||
outputs/preflight/000008.2772421987.png: banana sushi -s 50 -S 42 -W 512 -H 512 -C 7.5 -A k_lms -V 2772421987:0.1
|
||||
outputs/preflight/000009.3532317557.png: banana sushi -s 50 -S 42 -W 512 -H 512 -C 7.5 -A k_lms -V 3532317557:0.1
|
||||
outputs/preflight/000010.2028635318.png: banana sushi -s 50 -S 2028635318 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
outputs/preflight/000011.1111168647.png: pond with waterlillies -s 50 -S 1111168647 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
outputs/preflight/000012.1476370516.png: pond with waterlillies -s 50 -S 1476370516 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
outputs/preflight/000013.4281108706.png: banana sushi -s 50 -S 4281108706 -W 960 -H 960 -C 7.5 -A k_lms
|
||||
outputs/preflight/000014.2396987386.png: old sea captain with crow on shoulder -s 50 -S 2396987386 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/Lincoln-and-Parrot-512.png -A k_lms -f 0.75
|
||||
outputs/preflight/000015.1252923272.png: old sea captain with crow on shoulder -s 50 -S 1252923272 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/Lincoln-and-Parrot-512-transparent.png -A k_lms -f 0.75
|
||||
outputs/preflight/000016.2633891320.png: old sea captain with crow on shoulder -s 50 -S 2633891320 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/Lincoln-and-Parrot-512.png -A plms -f 0.75
|
||||
outputs/preflight/000017.1134411920.png: old sea captain with crow on shoulder -s 50 -S 1134411920 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/Lincoln-and-Parrot-512.png -A k_euler_a -f 0.75
|
||||
outputs/preflight/000018.47.png: big red dog playing with cat -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
outputs/preflight/000019.47.png: big red++++ dog playing with cat -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
outputs/preflight/000020.47.png: big red dog playing with cat+++ -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
outputs/preflight/000021.47.png: big (red dog).swap(tiger) playing with cat -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
outputs/preflight/000022.47.png: dog:1,cat:2 -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
outputs/preflight/000023.47.png: dog:2,cat:1 -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
outputs/preflight/000024.1029061431.png: medusa with cobras -s 50 -S 1029061431 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/curly.png -A k_lms -f 0.75 -tm hair
|
||||
outputs/preflight/000025.1284519352.png: bearded man -s 50 -S 1284519352 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/curly.png -A k_lms -f 0.75 -tm face
|
||||
outputs/preflight/curly.942491079.gfpgan.png: !fix ./docs/assets/preflight-checks/inputs/curly.png -s 50 -S 942491079 -W 512 -H 512 -C 7.5 -A k_lms -G 0.8 -ft gfpgan -U 2.0 0.75
|
||||
outputs/preflight/curly.942491079.outcrop.png: !fix ./docs/assets/preflight-checks/inputs/curly.png -s 50 -S 942491079 -W 512 -H 512 -C 7.5 -A k_lms -c top 64
|
||||
outputs/preflight/curly.942491079.outpaint.png: !fix ./docs/assets/preflight-checks/inputs/curly.png -s 50 -S 942491079 -W 512 -H 512 -C 7.5 -A k_lms -D top 64
|
||||
outputs/preflight/curly.942491079.outcrop-01.png: !fix ./docs/assets/preflight-checks/inputs/curly.png -s 50 -S 942491079 -W 512 -H 512 -C 7.5 -A k_lms -c top 64
|
@ -1,61 +0,0 @@
|
||||
# outputs/preflight/000001.1863159593.png
|
||||
banana sushi -s 50 -S 1863159593 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
# outputs/preflight/000002.1151955949.png
|
||||
banana sushi -s 50 -S 1151955949 -W 512 -H 512 -C 7.5 -A plms
|
||||
# outputs/preflight/000003.2736230502.png
|
||||
banana sushi -s 50 -S 2736230502 -W 512 -H 512 -C 7.5 -A ddim
|
||||
# outputs/preflight/000004.42.png
|
||||
banana sushi -s 50 -S 42 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
# outputs/preflight/000005.42.png
|
||||
banana sushi -s 50 -S 42 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
# outputs/preflight/000006.478163327.png
|
||||
banana sushi -s 50 -S 478163327 -W 640 -H 448 -C 7.5 -A k_lms
|
||||
# outputs/preflight/000007.2407640369.png
|
||||
banana sushi -s 50 -S 42 -W 512 -H 512 -C 7.5 -A k_lms -V 2407640369:0.1
|
||||
# outputs/preflight/000007.2772421987.png
|
||||
banana sushi -s 50 -S 42 -W 512 -H 512 -C 7.5 -A k_lms -V 2772421987:0.1
|
||||
# outputs/preflight/000007.3532317557.png
|
||||
banana sushi -s 50 -S 42 -W 512 -H 512 -C 7.5 -A k_lms -V 3532317557:0.1
|
||||
# outputs/preflight/000008.2028635318.png
|
||||
banana sushi -s 50 -S 2028635318 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
# outputs/preflight/000009.1111168647.png
|
||||
pond with waterlillies -s 50 -S 1111168647 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
# outputs/preflight/000010.1476370516.png
|
||||
pond with waterlillies -s 50 -S 1476370516 -W 512 -H 512 -C 7.5 -A k_lms --seamless
|
||||
# outputs/preflight/000011.4281108706.png
|
||||
banana sushi -s 50 -S 4281108706 -W 960 -H 960 -C 7.5 -A k_lms
|
||||
# outputs/preflight/000012.2396987386.png
|
||||
old sea captain with crow on shoulder -s 50 -S 2396987386 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/Lincoln-and-Parrot-512.png -A k_lms -f 0.75
|
||||
# outputs/preflight/000013.1252923272.png
|
||||
old sea captain with crow on shoulder -s 50 -S 1252923272 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/Lincoln-and-Parrot-512-transparent.png -A k_lms -f 0.75
|
||||
# outputs/preflight/000014.2633891320.png
|
||||
old sea captain with crow on shoulder -s 50 -S 2633891320 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/Lincoln-and-Parrot-512.png -A plms -f 0.75
|
||||
# outputs/preflight/000015.1134411920.png
|
||||
old sea captain with crow on shoulder -s 50 -S 1134411920 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/Lincoln-and-Parrot-512.png -A k_euler_a -f 0.75
|
||||
# outputs/preflight/000016.42.png
|
||||
big red dog playing with cat -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
# outputs/preflight/000017.42.png
|
||||
big red++++ dog playing with cat -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
# outputs/preflight/000018.42.png
|
||||
big red dog playing with cat+++ -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
# outputs/preflight/000019.42.png
|
||||
big (red dog).swap(tiger) playing with cat -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
# outputs/preflight/000020.42.png
|
||||
dog:1,cat:2 -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
# outputs/preflight/000021.42.png
|
||||
dog:2,cat:1 -s 50 -S 47 -W 512 -H 512 -C 7.5 -A k_lms
|
||||
# outputs/preflight/000022.1029061431.png
|
||||
medusa with cobras -s 50 -S 1029061431 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/curly.png -A k_lms -f 0.75 -tm hair
|
||||
# outputs/preflight/000023.1284519352.png
|
||||
bearded man -s 50 -S 1284519352 -W 512 -H 512 -C 7.5 -I docs/assets/preflight-checks/inputs/curly.png -A k_lms -f 0.75 -tm face
|
||||
# outputs/preflight/000024.curly.hair.deselected.png
|
||||
!mask -I docs/assets/preflight-checks/inputs/curly.png -tm hair
|
||||
# outputs/preflight/curly.942491079.gfpgan.png
|
||||
!fix ./docs/assets/preflight-checks/inputs/curly.png -U2 -G0.8
|
||||
# outputs/preflight/curly.942491079.outcrop.png
|
||||
!fix ./docs/assets/preflight-checks/inputs/curly.png -c top 64
|
||||
# outputs/preflight/curly.942491079.outpaint.png
|
||||
!fix ./docs/assets/preflight-checks/inputs/curly.png -D top 64
|
||||
# outputs/preflight/curly.942491079.outcrop-01.png
|
||||
!switch inpainting-1.5
|
||||
!fix ./docs/assets/preflight-checks/inputs/curly.png -c top 64
|
@ -8,7 +8,7 @@ hide:
|
||||
|
||||
## **Interactive Command Line Interface**
|
||||
|
||||
The `invoke.py` script, located in `scripts/`, provides an interactive
|
||||
The `invoke.py` script, located in `scripts/dream.py`, provides an interactive
|
||||
interface to image generation similar to the "invoke mothership" bot that Stable
|
||||
AI provided on its Discord server.
|
||||
|
||||
@ -86,7 +86,6 @@ overridden on a per-prompt basis (see [List of prompt arguments](#list-of-prompt
|
||||
| `--model <modelname>` | | `stable-diffusion-1.4` | Loads model specified in configs/models.yaml. Currently one of "stable-diffusion-1.4" or "laion400m" |
|
||||
| `--full_precision` | `-F` | `False` | Run in slower full-precision mode. Needed for Macintosh M1/M2 hardware and some older video cards. |
|
||||
| `--png_compression <0-9>` | `-z<0-9>` | 6 | Select level of compression for output files, from 0 (no compression) to 9 (max compression) |
|
||||
| `--safety-checker` | | False | Activate safety checker for NSFW and other potentially disturbing imagery |
|
||||
| `--web` | | `False` | Start in web server mode |
|
||||
| `--host <ip addr>` | | `localhost` | Which network interface web server should listen on. Set to 0.0.0.0 to listen on any. |
|
||||
| `--port <port>` | | `9090` | Which port web server should listen for requests on. |
|
||||
@ -98,6 +97,7 @@ overridden on a per-prompt basis (see [List of prompt arguments](#list-of-prompt
|
||||
| `--embedding_path <path>` | | `None` | Path to pre-trained embedding manager checkpoints, for custom models |
|
||||
| `--gfpgan_dir` | | `src/gfpgan` | Path to where GFPGAN is installed. |
|
||||
| `--gfpgan_model_path` | | `experiments/pretrained_models/GFPGANv1.4.pth` | Path to GFPGAN model file, relative to `--gfpgan_dir`. |
|
||||
| `--device <device>` | `-d<device>` | `torch.cuda.current_device()` | Device to run SD on, e.g. "cuda:0" |
|
||||
| `--free_gpu_mem` | | `False` | Free GPU memory after sampling, to allow image decoding and saving in low VRAM conditions |
|
||||
| `--precision` | | `auto` | Set model precision, default is selected by device. Options: auto, float32, float16, autocast |
|
||||
|
||||
@ -151,14 +151,12 @@ Here are the invoke> command that apply to txt2img:
|
||||
| --cfg_scale <float>| -C<float> | 7.5 | How hard to try to match the prompt to the generated image; any number greater than 1.0 works, but the useful range is roughly 5.0 to 20.0 |
|
||||
| --seed <int> | -S<int> | None | Set the random seed for the next series of images. This can be used to recreate an image generated previously.|
|
||||
| --sampler <sampler>| -A<sampler>| k_lms | Sampler to use. Use -h to get list of available samplers. |
|
||||
| --karras_max <int> | | 29 | When using k_* samplers, set the maximum number of steps before shifting from using the Karras noise schedule (good for low step counts) to the LatentDiffusion noise schedule (good for high step counts) This value is sticky. [29] |
|
||||
| --hires_fix | | | Larger images often have duplication artefacts. This option suppresses duplicates by generating the image at low res, and then using img2img to increase the resolution |
|
||||
| --png_compression <0-9> | -z<0-9> | 6 | Select level of compression for output files, from 0 (no compression) to 9 (max compression) |
|
||||
| `--png_compression <0-9>` | `-z<0-9>` | 6 | Select level of compression for output files, from 0 (no compression) to 9 (max compression) |
|
||||
| --grid | -g | False | Turn on grid mode to return a single image combining all the images generated by this prompt |
|
||||
| --individual | -i | True | Turn off grid mode (deprecated; leave off --grid instead) |
|
||||
| --outdir <path> | -o<path> | outputs/img_samples | Temporarily change the location of these images |
|
||||
| --seamless | | False | Activate seamless tiling for interesting effects |
|
||||
| --seamless_axes | | x,y | Specify which axes to use circular convolution on. |
|
||||
| --log_tokenization | -t | False | Display a color-coded list of the parsed tokens derived from the prompt |
|
||||
| --skip_normalization| -x | False | Weighted subprompts will not be normalized. See [Weighted Prompts](./OTHER.md#weighted-prompts) |
|
||||
| --upscale <int> <float> | -U <int> <float> | -U 1 0.75| Upscale image by magnification factor (2, 4), and set strength of upscaling (0.0-1.0). If strength not set, will default to 0.75. |
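The `--skip_normalization` row above refers to normalizing weighted subprompts such as `dog:1,cat:2`. As a rough sketch of what normalization means, assuming each weight is simply divided by the sum of all weights (the helper name `normalize_weights` is hypothetical and not taken from the InvokeAI source):

```python
def normalize_weights(subprompts, skip_normalization=False):
    """Scale (text, weight) pairs so that the weights sum to 1.0.

    Assumption: normalization divides each weight by the total weight.
    """
    if skip_normalization:
        # Leave the weights exactly as the user typed them.
        return list(subprompts)
    total = sum(weight for _, weight in subprompts)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return [(text, weight / total) for text, weight in subprompts]


# "dog:1,cat:2" expressed as (text, weight) pairs.
normalized = normalize_weights([("dog", 1.0), ("cat", 2.0)])
```

Under this assumption `dog` ends up with one third of the total weight and `cat` with two thirds, while passing `skip_normalization=True` leaves the raw values 1 and 2 untouched.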
|
||||
@ -212,40 +210,11 @@ accepts additional options:
|
||||
[Inpainting](./INPAINTING.md) for details.
|
||||
|
||||
inpainting accepts all the arguments used for txt2img and img2img, as
|
||||
well as the --mask (-M) and --text_mask (-tm) arguments:
|
||||
well as the --mask (-M) argument:
|
||||
|
||||
| Argument <img width="100" align="right"/> | Shortcut | Default | Description |
|
||||
|--------------------|------------|---------------------|--------------|
|
||||
| `--init_mask <path>` | `-M<path>` | `None` |Path to an image the same size as the initial_image, with areas for inpainting made transparent.|
|
||||
| `--invert_mask ` | | False |If true, invert the mask so that transparent areas are opaque and vice versa.|
|
||||
| `--text_mask <prompt> [<float>]` | `-tm <prompt> [<float>]` | <none> | Create a mask from a text prompt describing part of the image|
|
||||
|
||||
The mask may either be an image with transparent areas, in which case
|
||||
the inpainting will occur in the transparent areas only, or a black
|
||||
and white image, in which case all black areas will be painted into.
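The two mask conventions described above can be illustrated per pixel. This is only a sketch under stated assumptions: the function name `is_inpaint_pixel` is hypothetical, a 4-tuple is treated as RGBA where only fully transparent pixels count, and a 3-tuple is treated as RGB where only pure black counts. The real implementation may use different thresholds.

```python
def is_inpaint_pixel(pixel):
    """Return True if this mask pixel marks an area to paint into.

    Assumptions: a 4-tuple is RGBA and only alpha == 0 counts as
    transparent; a 3-tuple is RGB and only pure black counts.
    """
    if len(pixel) == 4:
        # Transparent-mask convention: the hole is where alpha is zero.
        return pixel[3] == 0
    # Black-and-white convention: black areas are painted into.
    return tuple(pixel) == (0, 0, 0)


rgba_row = [(255, 255, 255, 255), (12, 40, 200, 0)]
bw_row = [(255, 255, 255), (0, 0, 0)]
rgba_flags = [is_inpaint_pixel(p) for p in rgba_row]
bw_flags = [is_inpaint_pixel(p) for p in bw_row]
```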
|
||||
|
||||
`--text_mask` (short form `-tm`) is a way to generate a mask using a
|
||||
text description of the part of the image to replace. For example, if
|
||||
you have an image of a breakfast plate with a bagel, toast and
|
||||
scrambled eggs, you can selectively mask the bagel and replace it with
|
||||
a piece of cake this way:
|
||||
|
||||
~~~
|
||||
invoke> a piece of cake -I /path/to/breakfast.png -tm bagel
|
||||
~~~
|
||||
|
||||
The algorithm uses <a
|
||||
href="https://github.com/timojl/clipseg">clipseg</a> to classify
|
||||
different regions of the image. The classifier puts out a confidence
|
||||
score for each region it identifies. Generally regions that score
|
||||
above 0.5 are reliable, but if you are getting too much or too little
|
||||
masking you can adjust the threshold down (to get more mask), or up
|
||||
(to get less). In this example, by passing `-tm` a higher value, we
|
||||
are insisting on a more stringent classification.
|
||||
|
||||
~~~
|
||||
invoke> a piece of cake -I /path/to/breakfast.png -tm bagel 0.6
|
||||
~~~
|
||||
|
||||
# Other Commands
|
||||
|
||||
@ -287,20 +256,12 @@ Some examples:
|
||||
Outputs:
|
||||
[1] outputs/img-samples/000017.4829112.gfpgan-00.png: !fix "outputs/img-samples/0000045.4829112.png" -s 50 -S -W 512 -H 512 -C 7.5 -A k_lms -G 0.8
|
||||
|
||||
### !mask
|
||||
|
||||
This command takes an image, a text prompt, and uses the `clipseg`
|
||||
algorithm to automatically generate a mask of the area that matches
|
||||
the text prompt. It is useful for debugging the text masking process
|
||||
prior to inpainting with the `--text_mask` argument. See
|
||||
[INPAINTING.md] for details.
|
||||
|
||||
## Model selection and importation
|
||||
# Model selection and importation
|
||||
|
||||
The CLI allows you to add new models on the fly, as well as to switch
|
||||
among them rapidly without leaving the script.
|
||||
|
||||
### !models
|
||||
## !models
|
||||
|
||||
This prints out a list of the models defined in `config/models.yaml`.
|
||||
The active model is bold-faced.
|
||||
@ -312,7 +273,7 @@ laion400m not loaded <no description>
|
||||
waifu-diffusion not loaded Waifu Diffusion v1.3
|
||||
</pre>
|
||||
|
||||
### !switch <model>
|
||||
## !switch <model>
|
||||
|
||||
This quickly switches from one model to another without leaving the
|
||||
CLI script. `invoke.py` uses a memory caching system; once a model
|
||||
@ -358,7 +319,7 @@ laion400m not loaded <no description>
|
||||
waifu-diffusion cached Waifu Diffusion v1.3
|
||||
</pre>
|
||||
|
||||
### !import_model <path/to/model/weights>
|
||||
## !import_model <path/to/model/weights>
|
||||
|
||||
This command imports a new model weights file into InvokeAI, makes it
|
||||
available for image generation within the script, and writes out the
|
||||
@ -383,7 +344,7 @@ automatically.
|
||||
Example:
|
||||
|
||||
<pre>
|
||||
invoke> <b>!import_model models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt</b>
|
||||
invoke> <b>!import_model models/ldm/stable-diffusion-v1/ model-epoch08-float16.ckpt</b>
|
||||
>> Model import in process. Please enter the values needed to configure this model:
|
||||
|
||||
Name for this model: <b>waifu-diffusion</b>
|
||||
@ -410,7 +371,7 @@ OK to import [n]? <b>y</b>
|
||||
invoke>
|
||||
</pre>
|
||||
|
||||
### !edit_model <name_of_model>
|
||||
## !edit_model <name_of_model>
|
||||
|
||||
The `!edit_model` command can be used to modify a model that is
|
||||
already defined in `config/models.yaml`. Call it with the short
|
||||
@ -446,12 +407,20 @@ OK to import [n]? y
|
||||
Outputs:
|
||||
[2] outputs/img-samples/000018.2273800735.embiggen-00.png: !fix "outputs/img-samples/000017.243781548.gfpgan-00.png" -s 50 -S 2273800735 -W 512 -H 512 -C 7.5 -A k_lms --embiggen 3.0 0.75 0.25
|
||||
```
|
||||
## History processing
|
||||
# History processing
|
||||
|
||||
The CLI provides a series of convenient commands for reviewing previous
|
||||
actions, retrieving them, modifying them, and re-running them.
|
||||
```bash
|
||||
invoke> !fetch 0000015.8929913.png
|
||||
# the script returns the next line, ready for editing and running:
|
||||
invoke> a fantastic alien landscape -W 576 -H 512 -s 60 -A plms -C 7.5
|
||||
```
|
||||
|
||||
### !history
|
||||
Note that this command may behave unexpectedly if given a PNG file that
|
||||
was not generated by InvokeAI.
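A command line returned by `!fetch` is a prompt followed by switches, so it can be post-processed outside the CLI. The following is a minimal sketch, not the parser InvokeAI uses: the function name `split_command` is hypothetical, and it assumes space-separated switches as in the example above (glued forms such as `-s10` and negative numeric values are not handled).

```python
import shlex


def split_command(line):
    """Split an invoke-style command into (prompt, switches).

    Assumption: every token before the first one starting with "-"
    belongs to the prompt; each switch collects the values after it.
    """
    tokens = shlex.split(line)
    first_switch = next(
        (i for i, tok in enumerate(tokens) if tok.startswith("-")),
        len(tokens),
    )
    prompt = " ".join(tokens[:first_switch])
    switches = {}
    key = None
    for tok in tokens[first_switch:]:
        if tok.startswith("-"):
            key = tok
            switches[key] = []
        elif key is not None:
            switches[key].append(tok)
    return prompt, switches


prompt, switches = split_command(
    "a fantastic alien landscape -W 576 -H 512 -s 60 -A plms -C 7.5"
)
```

Here `prompt` holds the text of the prompt and `switches` maps each switch (for example `-W` or `-A`) to the list of values that followed it.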
|
||||
|
||||
### `!history`
|
||||
|
||||
The invoke script keeps track of all the commands you issue during a
|
||||
session, allowing you to re-run them. On Mac and Linux systems, it
|
||||
@ -476,41 +445,20 @@ invoke> !20
|
||||
invoke> watercolor of beautiful woman sitting under tree wearing broad hat and flowing garment -v0.2 -n6 -S2878767194
|
||||
```
|
||||
|
||||
### !fetch
|
||||
## !fetch
|
||||
|
||||
This command retrieves the generation parameters from a previously
|
||||
generated image and either loads them into the command line
|
||||
(Linux|Mac), or prints them out in a comment for copy-and-paste
|
||||
(Windows). You may provide either the name of a file in the current
|
||||
output directory, or a full file path. Specify path to a folder with
|
||||
image png files, and wildcard *.png to retrieve the dream command used
|
||||
to generate the images, and save them to a file commands.txt for
|
||||
further processing.
|
||||
generated image and loads them into the command line. You may
|
||||
provide either the name of a file in the current output directory, or
|
||||
a full file path.
|
||||
|
||||
This example loads the generation command for a single png file:
|
||||
|
||||
```bash
|
||||
~~~
|
||||
invoke> !fetch 0000015.8929913.png
|
||||
# the script returns the next line, ready for editing and running:
|
||||
invoke> a fantastic alien landscape -W 576 -H 512 -s 60 -A plms -C 7.5
|
||||
```
|
||||
|
||||
This one fetches the generation commands from a batch of files and
|
||||
stores them into `selected.txt`:
|
||||
|
||||
```bash
|
||||
invoke> !fetch outputs\selected-imgs\*.png selected.txt
|
||||
```
|
||||
|
||||
### !replay
|
||||
|
||||
This command replays a text file generated by !fetch or created manually
|
||||
|
||||
~~~
|
||||
invoke> !replay outputs\selected-imgs\selected.txt
|
||||
~~~
|
||||
|
||||
Note that these commands may behave unexpectedly if given a PNG file that
|
||||
Note that this command may behave unexpectedly if given a PNG file that
|
||||
was not generated by InvokeAI.
|
||||
|
||||
### !search <search string>
|
||||
|
@ -120,6 +120,8 @@ Both of the outputs look kind of like what I was thinking of. With the strength
|
||||
|
||||
If you want to try this out yourself, all of these are using a seed of `1592514025` with a width/height of `384`, step count `10`, the default sampler (`k_lms`), and the single-word prompt `"fire"`:
|
||||
|
||||
If you want to try this out yourself, all of these are using a seed of `1592514025` with a width/height of `384`, step count `10`, the default sampler (`k_lms`), and the single-word prompt `fire`:
|
||||
|
||||
```commandline
|
||||
invoke> "fire" -s10 -W384 -H384 -S1592514025 -I /tmp/fire-drawing.png --strength 0.7
|
||||
```
|
||||
|
@ -9,7 +9,7 @@ title: Inpainting
|
||||
Inpainting is really cool. To do it, you start with an initial image
|
||||
and use a photoeditor to make one or more regions transparent
|
||||
(i.e. they have a "hole" in them). You then provide the path to this
|
||||
image at the invoke> command line using the `-I` switch. Stable
|
||||
image at the dream> command line using the `-I` switch. Stable
|
||||
Diffusion will only paint within the transparent region.
|
||||
|
||||
There's a catch. In the current implementation, you have to prepare
|
||||
@ -25,7 +25,7 @@ color information is preserved. There is often an option in the export
|
||||
dialog that lets you specify this.
|
||||
|
||||
If your photoeditor is erasing the underlying color information,
|
||||
`invoke.py` will give you a big fat warning. If you can't find a way to
|
||||
`dream.py` will give you a big fat warning. If you can't find a way to
|
||||
coax your photoeditor to retain color values under transparent areas,
|
||||
then you can combine the `-I` and `-M` switches to provide both the
|
||||
original unedited image and the masked (partially transparent) image:
|
||||
@ -34,188 +34,9 @@ original unedited image and the masked (partially transparent) image:
|
||||
invoke> "man with cat on shoulder" -I./images/man.png -M./images/man-transparent.png
|
||||
```
|
||||
|
||||
## **Masking using Text**
|
||||
We are hoping to get rid of the need for this workaround in an upcoming release.
|
||||
|
||||
You can also create a mask using a text prompt to select the part of
|
||||
the image you want to alter, using the <a
|
||||
href="https://github.com/timojl/clipseg">clipseg</a> algorithm. This
|
||||
works on any image, not just ones generated by InvokeAI.
|
||||
|
||||
The `--text_mask` (short form `-tm`) option takes two arguments. The
|
||||
first argument is a text description of the part of the image you wish
|
||||
to mask (paint over). If the text description contains a space, you must
|
||||
surround it with quotation marks. The optional second argument is the
|
||||
minimum threshold for the mask classifier's confidence score, described
|
||||
in more detail below.
|
||||
|
||||
To see how this works in practice, here's an image of a still life
|
||||
painting that I got off the web.
|
||||
|
||||
<img src="../assets/still-life-scaled.jpg">
|
||||
|
||||
You can selectively mask out the
|
||||
orange and replace it with a baseball in this way:
|
||||
|
||||
~~~
|
||||
invoke> a baseball -I /path/to/still_life.png -tm orange
|
||||
~~~
|
||||
|
||||
<img src="../assets/still-life-inpainted.png">
|
||||
|
||||
The clipseg classifier produces a confidence score for each region it
|
||||
identifies. Generally regions that score above 0.5 are reliable, but
|
||||
if you are getting too much or too little masking you can adjust the
|
||||
threshold down (to get more mask), or up (to get less). In this
|
||||
example, by passing `-tm` a higher value, we are insisting on a tighter
|
||||
mask. However, if you make it too high, the orange may not be picked
|
||||
up at all!
|
||||
|
||||
~~~
|
||||
invoke> a baseball -I /path/to/still_life.png -tm orange 0.6
|
||||
~~~
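The effect of the threshold can be sketched with a toy confidence map. This is an illustration only: the scores are made up, the function name `mask_from_scores` is hypothetical, and treating a score equal to the threshold as selected is an assumption rather than documented clipseg behavior.

```python
def mask_from_scores(scores, threshold=0.5):
    """Return a binary mask: True where the confidence meets the threshold."""
    return [[score >= threshold for score in row] for row in scores]


def selected_count(mask):
    """Count how many cells of the mask are selected."""
    return sum(cell for row in mask for cell in row)


# Made-up per-region confidence scores for the text prompt.
scores = [
    [0.10, 0.55, 0.90],
    [0.45, 0.62, 0.58],
]
loose = mask_from_scores(scores, 0.5)
tight = mask_from_scores(scores, 0.6)
```

Raising the threshold from 0.5 to 0.6 shrinks the mask in this toy example, which mirrors the "more stringent" behavior described above; raising it further would eventually select nothing.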
|
||||
|
||||
The `!mask` command may be useful for debugging problems with the
|
||||
text2mask feature. The syntax is `!mask /path/to/image.png -tm <text>
|
||||
<threshold>`
|
||||
|
||||
It will generate three files:
|
||||
|
||||
- The image with the selected area highlighted.
|
||||
- it will be named XXXXX.<imagename>.<prompt>.selected.png
|
||||
- The image with the un-selected area highlighted.
|
||||
- it will be named XXXXX.<imagename>.<prompt>.deselected.png
|
||||
- The image with the selected area converted into a black and white
|
||||
image according to the threshold level
|
||||
- it will be named XXXXX.<imagename>.<prompt>.masked.png
|
||||
|
||||
The `.masked.png` file can then be directly passed to the `invoke>`
|
||||
prompt in the CLI via the `-M` argument. Do not attempt this with
|
||||
the `selected.png` or `deselected.png` files, as they contain some
|
||||
transparency throughout the image and will not produce the desired
|
||||
results.
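The naming pattern for the three files can be sketched as follows. The helper `mask_output_names` is hypothetical and only reproduces the `XXXXX.<imagename>.<prompt>.<kind>.png` pattern listed above; how the CLI handles prompts containing spaces is not covered here.

```python
def mask_output_names(prefix, image_name, prompt):
    """Build the three file names that `!mask` is described as writing."""
    stem = f"{prefix}.{image_name}.{prompt}"
    kinds = ("selected", "deselected", "masked")
    return {kind: f"{stem}.{kind}.png" for kind in kinds}


names = mask_output_names("000019", "curly", "hair")
# Only the "masked" file is meant to be passed to -M.
mask_for_inpainting = names["masked"]
```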
|
||||
|
||||
Here is an example of how `!mask` works:
|
||||
|
||||
```
|
||||
invoke> !mask ./test-pictures/curly.png -tm hair 0.5
|
||||
>> generating masks from ./test-pictures/curly.png
|
||||
>> Initializing clipseg model for text to mask inference
|
||||
Outputs:
|
||||
[941.1] outputs/img-samples/000019.curly.hair.deselected.png: !mask ./test-pictures/curly.png -tm hair 0.5
|
||||
[941.2] outputs/img-samples/000019.curly.hair.selected.png: !mask ./test-pictures/curly.png -tm hair 0.5
|
||||
[941.3] outputs/img-samples/000019.curly.hair.masked.png: !mask ./test-pictures/curly.png -tm hair 0.5
|
||||
```
|
||||
|
||||
**Original image "curly.png"**
|
||||
<img src="../assets/outpainting/curly.png">
|
||||
|
||||
**000019.curly.hair.selected.png**
|
||||
<img src="../assets/inpainting/000019.curly.hair.selected.png">
|
||||
|
||||
**000019.curly.hair.deselected.png**
|
||||
<img src="../assets/inpainting/000019.curly.hair.deselected.png">
|
||||
|
||||
**000019.curly.hair.masked.png**
|
||||
<img src="../assets/inpainting/000019.curly.hair.masked.png">
|
||||
|
||||
It looks like we selected the hair pretty well at the 0.5 threshold
|
||||
(which is the default, so we didn't actually have to specify it), so
|
||||
let's have some fun:
|
||||
|
||||
```
|
||||
invoke> medusa with cobras -I ./test-pictures/curly.png -M 000019.curly.hair.masked.png -C20
|
||||
>> loaded input image of size 512x512 from ./test-pictures/curly.png
|
||||
...
|
||||
Outputs:
|
||||
[946] outputs/img-samples/000024.801380492.png: "medusa with cobras" -s 50 -S 801380492 -W 512 -H 512 -C 20.0 -I ./test-pictures/curly.png -A k_lms -f 0.75
|
||||
```
|
||||
|
||||
<img src="../assets/inpainting/000024.801380492.png">
|
||||
|
||||
You can also skip the `!mask` creation step and just select the masked
|
||||
region directly:
|
||||
```
|
||||
invoke> medusa with cobras -I ./test-pictures/curly.png -tm hair -C20
|
||||
```
|
||||
|
||||
## Using the RunwayML inpainting model
|
||||
|
||||
The [RunwayML Inpainting Model
|
||||
v1.5](https://huggingface.co/runwayml/stable-diffusion-inpainting) is
|
||||
a specialized version of [Stable Diffusion
|
||||
v1.5](https://huggingface.co/spaces/runwayml/stable-diffusion-v1-5)
|
||||
that contains extra channels specifically designed to enhance
|
||||
inpainting and outpainting. While it can do regular `txt2img` and
|
||||
`img2img`, it really shines when filling in missing regions. It has an
|
||||
almost uncanny ability to blend the new regions with existing ones in
|
||||
a semantically coherent way.
|
||||
|
||||
To install the inpainting model, follow the
|
||||
[instructions](INSTALLING-MODELS.md) for installing a new model. You
|
||||
may use either the CLI (`invoke.py` script) or directly edit the
|
||||
`configs/models.yaml` configuration file to do this. The main thing to
|
||||
watch out for is that the model `config` option must be set up to
|
||||
use `v1-inpainting-inference.yaml` rather than the `v1-inference.yaml`
|
||||
file that is used by Stable Diffusion 1.4 and 1.5.
|
||||
|
||||
After installation, your `models.yaml` should contain an entry that
|
||||
looks like this one:
|
||||
|
||||
inpainting-1.5:
|
||||
weights: models/ldm/stable-diffusion-v1/sd-v1-5-inpainting.ckpt
|
||||
description: SD inpainting v1.5
|
||||
config: configs/stable-diffusion/v1-inpainting-inference.yaml
|
||||
vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
|
||||
width: 512
|
||||
height: 512
|
||||
|
||||
As shown in the example, you may include a VAE fine-tuning weights
|
||||
file as well. This is strongly recommended.
|
||||
|
||||
To use the custom inpainting model, launch `invoke.py` with the
|
||||
argument `--model inpainting-1.5` or alternatively from within the
|
||||
script use the `!switch inpainting-1.5` command to load and switch to
|
||||
the inpainting model.
|
||||
|
||||
You can now do inpainting and outpainting exactly as described above,
|
||||
but there will (likely) be a noticeable improvement in
|
||||
coherence. Txt2img and Img2img will work as well.
|
||||
|
||||
There are a few caveats to be aware of:
|
||||
|
||||
1. The inpainting model is larger than the standard model, and will
|
||||
use nearly 4 GB of GPU VRAM. This makes it unlikely to run on
|
||||
a 4 GB graphics card.
|
||||
|
||||
2. When operating in Img2img mode, the inpainting model is much less
|
||||
steerable than the standard model. It is great for making small
|
||||
changes, such as changing the pattern of a fabric, or slightly
|
||||
changing a subject's expression or hair, but the model will
|
||||
resist making the dramatic alterations that the standard
|
||||
model lets you do.
|
||||
|
||||
3. While the `--hires` option works fine with the inpainting model,
|
||||
some special features, such as `--embiggen`, are disabled.
|
||||
|
||||
4. Prompt weighting (`banana++ sushi`) and merging work well with
|
||||
the inpainting model, but prompt swapping (`a ("fluffy cat").swap("smiling dog") eating a hotdog`)
|
||||
will not have any effect due to the way the model is set up.
|
||||
You may use text masking (with `-tm thing-to-mask`) as an
|
||||
effective replacement.
|
||||
|
||||
5. The model tends to oversharpen images if you use high step or CFG
|
||||
values. If you need a large number of steps, use the standard model.
|
||||
|
||||
6. The `--strength` (`-f`) option has no effect on the inpainting
|
||||
model due to its fundamental differences with the standard
|
||||
model. It will always take the full number of steps you specify.
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
Here are some troubleshooting tips for inpainting and outpainting.
|
||||
|
||||
## Inpainting is not changing the masked region enough!
|
||||
### Inpainting is not changing the masked region enough!
|
||||
|
||||
One of the things to understand about how inpainting works is that it
|
||||
is equivalent to running img2img on just the masked (transparent)
|
||||
|
@ -26,12 +26,6 @@ for each `invoke>` prompt as shown here:
|
||||
invoke> "pond garden with lotus by claude monet" --seamless -s100 -n4
|
||||
```
|
||||
|
||||
By default this will tile on both the X and Y axes. However, you can also choose which axes to tile on with `--seamless_axes`.
|
||||
Possible values are `x`, `y`, and `x,y`:
|
||||
```python
|
||||
invoke> "pond garden with lotus by claude monet" --seamless --seamless_axes=x -s100 -n4
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## **Shortcuts: Reusing Seeds**
|
||||
@ -75,23 +69,6 @@ combination of integers and floating point numbers, and they do not need to add
|
||||
|
||||
---
|
||||
|
||||
## **Filename Format**
|
||||
|
||||
The argument `--fnformat` allows you to specify the filename of the
|
||||
image. Supported wildcards are all arguments that can be set, such as
|
||||
`perlin`, `seed`, `threshold`, `height`, `width`, `gfpgan_strength`,
|
||||
`sampler_name`, `steps`, `model`, `upscale`, `prompt`, `cfg_scale`,
|
||||
`prefix`.
|
||||
|
||||
The following prompt
|
||||
```bash
|
||||
dream> a red car --steps 25 -C 9.8 --perlin 0.1 --fnformat {prompt}_steps.{steps}_cfg.{cfg_scale}_perlin.{perlin}.png
|
||||
```
|
||||
|
||||
generates a file with the name: `outputs/img-samples/a red car_steps.25_cfg.9.8_perlin.0.1.png`
|
||||
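The wildcard expansion can be pictured with Python's own `str.format`, which uses the same `{name}` placeholder syntax. This is only a sketch of the idea, not the code InvokeAI runs; the helper name `expand_fnformat` is hypothetical.

```python
def expand_fnformat(template, **settings):
    """Fill {name} placeholders in a filename template with setting values."""
    return template.format(**settings)


name = expand_fnformat(
    "{prompt}_steps.{steps}_cfg.{cfg_scale}_perlin.{perlin}.png",
    prompt="a red car",
    steps=25,
    cfg_scale=9.8,
    perlin=0.1,
)
print(name)  # a red car_steps.25_cfg.9.8_perlin.0.1.png
```

In this sketch, a placeholder with no matching setting raises `KeyError`, so a misspelled wildcard fails loudly rather than producing a misleading filename.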
|
||||
---
|
||||
|
||||
## **Thresholding and Perlin Noise Initialization Options**
|
||||
|
||||
Two new options are the thresholding (`--threshold`) and the perlin noise initialization (`--perlin`) options. Thresholding limits the range of the latent values during optimization, which helps combat oversaturation with higher CFG scale values. Perlin noise initialization starts with a percentage (a value ranging from 0 to 1) of perlin noise mixed into the initial noise. Both features allow for more variations and options in the course of generating images.
|
||||
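As a rough illustration of the two ideas, the sketch below blends a fraction of Perlin noise into the starting noise and clamps values to a symmetric range. Both functions are assumptions for illustration: a linear blend is one plausible reading of "mixed into", and a plain clamp is the simplest form of range limiting, which may differ from what `--threshold` actually does internally.

```python
def mix_noise(gaussian, perlin, fraction):
    """Blend `fraction` (0 to 1) of Perlin noise into Gaussian noise.

    Assumption: a simple linear interpolation between the two noise sources.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return [(1.0 - fraction) * g + fraction * p
            for g, p in zip(gaussian, perlin)]


def clamp(values, limit):
    """Limit each value to the range [-limit, limit]."""
    return [max(-limit, min(limit, v)) for v in values]


# Hypothetical sample values; real latents are multi-dimensional tensors.
mixed = mix_noise([1.0, -1.0], [0.0, 0.0], fraction=0.25)
limited = clamp([-3.0, 0.5, 4.0], limit=2.0)
print(mixed, limited)
```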
|
@ -15,52 +15,13 @@ InvokeAI supports two versions of outpainting, one called "outpaint"
|
||||
and the other "outcrop." They work slightly differently and each has
|
||||
its advantages and drawbacks.
|
||||
|
||||
### Outpainting
|
||||
|
||||
Outpainting is the same as inpainting, except that the painting occurs
|
||||
in the regions outside of the original image. To outpaint using the
|
||||
`invoke.py` command line script, prepare an image in which the borders
|
||||
to be extended are pure black. Add an alpha channel (if there isn't one
|
||||
already), and make the borders completely transparent and the interior
|
||||
completely opaque. If you wish to modify the interior as well, you may
|
||||
create transparent holes in the transparency layer, which `img2img` will
|
||||
paint into as usual.
|
||||
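The preparation step described above can be sketched in plain Python on a grid of RGBA tuples. This is an illustration only: in practice you would do this in an image editor or with an imaging library, and the helper name `pad_for_outpainting` is hypothetical.

```python
TRANSPARENT_BLACK = (0, 0, 0, 0)  # pure black, fully transparent


def pad_for_outpainting(pixels, left=0, top=0, right=0, bottom=0):
    """Surround an RGBA pixel grid with transparent black borders.

    `pixels` is a list of rows, each row a list of (r, g, b, a) tuples.
    Existing alpha values are kept, so transparent holes in the interior
    survive and can still be painted into.
    """
    new_width = left + len(pixels[0]) + right
    padded = [[TRANSPARENT_BLACK] * new_width for _ in range(top)]
    for row in pixels:
        padded.append(
            [TRANSPARENT_BLACK] * left + list(row) + [TRANSPARENT_BLACK] * right
        )
    padded.extend([TRANSPARENT_BLACK] * new_width for _ in range(bottom))
    return padded


# A 1x1 opaque red image extended by one pixel on the left and top.
image = [[(255, 0, 0, 255)]]
result = pad_for_outpainting(image, left=1, top=1)
print(result)
```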
|
||||
Pass the image as the argument to the `-I` switch as you would for
|
||||
regular inpainting:
|
||||
|
||||
invoke> a stream by a river -I /path/to/transparent_img.png
|
||||
|
||||
You'll likely be delighted by the results.
|
||||
|
||||
### Tips
|
||||
|
||||
1. Do not try to expand the image too much at once. Generally it is best
|
||||
to expand the margins in 64-pixel increments. 128 pixels often works,
|
||||
but your mileage may vary depending on the nature of the image you are
|
||||
trying to outpaint into.
|
||||
|
||||
2. There are a series of switches that can be used to adjust how the
|
||||
inpainting algorithm operates. In particular, you can use these to
|
||||
minimize the seam that sometimes appears between the original image
|
||||
and the extended part. These switches are:
|
||||
|
||||
--seam_size SEAM_SIZE Size of the mask around the seam between original and outpainted image (0)
|
||||
--seam_blur SEAM_BLUR The amount to blur the seam inwards (0)
|
||||
--seam_strength STRENGTH The img2img strength to use when filling the seam (0.7)
|
||||
--seam_steps SEAM_STEPS The number of steps to use to fill the seam. (10)
|
||||
--tile_size TILE_SIZE The tile size to use for filling outpaint areas (32)
|
||||
|
||||
### Outcrop
|
||||
|
||||
The `outcrop` extension gives you a convenient `!fix` postprocessing
|
||||
command that allows you to extend a previously-generated image in 64
|
||||
pixel increments in any direction. You can apply the module to any
|
||||
image previously-generated by InvokeAI. Note that it works with
|
||||
arbitrary PNG photographs, but not currently with JPG or other
|
||||
formats. Outcropping is particularly effective when combined with the
|
||||
[runwayML custom inpainting
|
||||
model](INPAINTING.md#using-the-runwayml-inpainting-model).
|
||||
The `outcrop` extension allows you to extend the image in 64 pixel
|
||||
increments in any dimension. You can apply the module to any image
|
||||
previously-generated by InvokeAI. Note that it will **not** work with
|
||||
arbitrary photographs or Stable Diffusion images created by other
|
||||
implementations.
|
||||
|
||||
Consider this image:
|
||||
|
||||
@ -72,24 +33,23 @@ Pretty nice, but it's annoying that the top of her head is cut
|
||||
off. She's also a bit off center. Let's fix that!
|
||||
|
||||
```bash
|
||||
invoke> !fix images/curly.png --outcrop top 128 right 64 bottom 64
|
||||
invoke> !fix images/curly.png --outcrop top 64 right 64
|
||||
```
|
||||
|
||||
This is saying to apply the `outcrop` extension by extending the top
|
||||
of the image by 128 pixels, and the right and bottom of the image by
|
||||
64 pixels. You can use any combination of top|left|right|bottom, and
|
||||
of the image by 64 pixels, and the right of the image by the same
|
||||
amount. You can use any combination of top|left|right|bottom, and
|
||||
specify any number of pixels to extend. You can also abbreviate
|
||||
`--outcrop` to `-c`.
|
||||
|
||||
The result looks like this:
|
||||
|
||||
<figure markdown>
|
||||

|
||||

|
||||
</figure>
|
||||
|
||||
The new image is larger than the original (576x704)
|
||||
because 128 pixels were added to the top and 64 pixels each to the right and bottom. You will
|
||||
need enough VRAM to process an image of this size.
|
||||
The new image is actually slightly larger than the original (576x576),
|
||||
because 64 pixels were added to the top and right sides.
|
||||
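The output size follows directly from the amounts passed to `--outcrop`. The sketch below reproduces that arithmetic, assuming the original image is 512x512 (an assumption consistent with the sizes quoted above); `outcropped_size` is a hypothetical helper, not part of InvokeAI.

```python
def outcropped_size(width, height, **extensions):
    """Return (width, height) after extending an image by the given pixels.

    Keyword arguments may be any combination of top, bottom, left, right.
    """
    unknown = set(extensions) - {"top", "bottom", "left", "right"}
    if unknown:
        raise ValueError(f"unknown direction(s): {sorted(unknown)}")
    new_width = width + extensions.get("left", 0) + extensions.get("right", 0)
    new_height = height + extensions.get("top", 0) + extensions.get("bottom", 0)
    return new_width, new_height


print(outcropped_size(512, 512, top=128, right=64, bottom=64))  # (576, 704)
print(outcropped_size(512, 512, top=64, right=64))              # (576, 576)
```

Because the whole enlarged image is processed, it is this final size, not the size of the added strips, that determines how much VRAM is needed.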
|
||||
A number of caveats:
|
||||
|
||||
@ -104,17 +64,6 @@ you'll get a slightly different result. You can run it repeatedly
|
||||
until you get an image you like. Unfortunately `!fix` does not
|
||||
currently respect the `-n` (`--iterations`) argument.
|
||||
|
||||
3. Your results will be _much_ better if you use the `inpaint-1.5`
|
||||
model released by runwayML and installed by default by
|
||||
`scripts/preload_models.py`. This model was trained specifically to
|
||||
harmoniously fill in image gaps. The standard model will work as well,
|
||||
but you may notice color discontinuities at the border.
|
||||
|
||||
4. When using the `inpaint-1.5` model, you may notice subtle changes
|
||||
to the area within the original image. This is because the model
|
||||
performs an encoding/decoding on the image as a whole. This does not
|
||||
occur with the standard model.
|
||||
|
||||
## Outpaint
|
||||
|
||||
The `outpaint` extension does the same thing, but with subtle
|
||||
|