Compare commits

...

361 Commits

Author SHA1 Message Date
e805021cd3 Handle the "medium" and "small" SDXL controlnet models 2023-08-17 15:33:00 -04:00
89b82b3dc4 (feat): Add Seam Painting to Canvas (1.x, 2.x & SDXL w/ Refiner) (#4292)
## What type of PR is this? (check all applicable)

- [x] Feature

## Have you discussed this change with the InvokeAI team?
- [x] Yes
      
## Description

PR to add Seam Painting back to the Canvas.

## TODO Later

While the graph works as intended, it has become extremely large and
complex. I don't know if there's a simpler way to do this. Maybe there
is but there's soo many connections and visualizing the graph in my head
is extremely difficult. We might need to create some kind of tooling for
this. Coz it's going going to get crazier.

But well works for now.
2023-08-17 21:24:39 +12:00
8923201fdf Merge branch 'main' into seam-painting 2023-08-17 21:21:44 +12:00
226409107b Fix for Image Deletion issue 2023-08-17 17:18:11 +10:00
ae986bf873 Report RAM usage and RAM cache statistics after each generation (#4287)
## What type of PR is this? (check all applicable)

- [X] Feature

## Have you discussed this change with the InvokeAI team?
- [X] Yes

     
## Have you updated all relevant documentation?
- [X] Yes


## Description

This PR enhances the logging of performance statistics to include RAM
and model cache information. After each generation, the following will
be logged. The new information follows TOTAL GRAPH EXECUTION TIME.

```
[2023-08-15 21:55:39,010]::[InvokeAI]::INFO --> Graph stats: 2408dbec-50d0-44a3-bbc4-427037e3f7d4
[2023-08-15 21:55:39,010]::[InvokeAI]::INFO --> Node                 Calls    Seconds VRAM Used
[2023-08-15 21:55:39,010]::[InvokeAI]::INFO --> main_model_loader        1     0.004s     0.000G
[2023-08-15 21:55:39,010]::[InvokeAI]::INFO --> clip_skip                1     0.002s     0.000G
[2023-08-15 21:55:39,010]::[InvokeAI]::INFO --> compel                   2     2.706s     0.246G
[2023-08-15 21:55:39,010]::[InvokeAI]::INFO --> rand_int                 1     0.002s     0.244G
[2023-08-15 21:55:39,011]::[InvokeAI]::INFO --> range_of_size            1     0.002s     0.244G
[2023-08-15 21:55:39,011]::[InvokeAI]::INFO --> iterate                  1     0.002s     0.244G
[2023-08-15 21:55:39,011]::[InvokeAI]::INFO --> metadata_accumulator     1     0.002s     0.244G
[2023-08-15 21:55:39,011]::[InvokeAI]::INFO --> noise                    1     0.003s     0.244G
[2023-08-15 21:55:39,011]::[InvokeAI]::INFO --> denoise_latents          1     2.429s     2.022G
[2023-08-15 21:55:39,011]::[InvokeAI]::INFO --> l2i                      1     1.020s     1.858G
[2023-08-15 21:55:39,011]::[InvokeAI]::INFO --> TOTAL GRAPH EXECUTION TIME:    6.171s
[2023-08-15 21:55:39,011]::[InvokeAI]::INFO --> RAM used by InvokeAI process: 4.50G (delta=0.10G)
[2023-08-15 21:55:39,011]::[InvokeAI]::INFO --> RAM used to load models: 1.99G
[2023-08-15 21:55:39,011]::[InvokeAI]::INFO --> VRAM in use: 0.303G
[2023-08-15 21:55:39,011]::[InvokeAI]::INFO --> RAM cache statistics:
[2023-08-15 21:55:39,011]::[InvokeAI]::INFO -->    Model cache hits: 2
[2023-08-15 21:55:39,011]::[InvokeAI]::INFO -->    Model cache misses: 5
[2023-08-15 21:55:39,011]::[InvokeAI]::INFO -->    Models cached: 5
[2023-08-15 21:55:39,011]::[InvokeAI]::INFO -->    Models cleared from cache: 0
[2023-08-15 21:55:39,011]::[InvokeAI]::INFO -->    Cache high water mark: 1.99/7.50G    
```

There may be a memory leak in InvokeAI. I'm seeing the process memory
usage increasing by about 100 MB with each generation as shown in the
example above.
2023-08-17 16:10:18 +12:00
daf75a1361 blackify 2023-08-16 21:47:29 -04:00
fe4b2d53ed Merge branch 'feat/collect-more-stats' of github.com:invoke-ai/InvokeAI into feat/collect-more-stats 2023-08-16 21:39:29 -04:00
c39f8b478b fix misplaced ram_used and ram_changed attributes 2023-08-16 21:39:18 -04:00
1f82d8013e Merge branch 'main' into feat/collect-more-stats 2023-08-16 18:51:17 -04:00
e373bfca54 fix several broken links in the installation index 2023-08-16 17:54:39 -04:00
2ca8611723 add +/- sign in front of RAM delta 2023-08-16 15:53:01 -04:00
b12cf315a8 Merge branch 'main' into feat/collect-more-stats 2023-08-16 09:19:33 -04:00
975586bb40 Merge branch 'main' into seam-painting 2023-08-17 01:05:42 +12:00
a7ba142ad9 feat(ui): set min zoom on nodes to 0.1 2023-08-16 23:04:36 +10:00
0d36bab6cc fix(ui): do not rerender top panel buttons 2023-08-16 23:04:36 +10:00
c2e7f62701 fix(ui): do not rerender edges 2023-08-16 23:04:36 +10:00
1f194e3688 chore(ui): lint 2023-08-16 23:04:36 +10:00
f9b8b5cff2 fix(ui): improve node rendering performance
Previously the editor was using prop-drilling node data and templates to get values deep into nodes. This ended up causing very noticeable performance degradation. For example, any text entry fields were super laggy.

Refactor the whole thing to use memoized selectors via hooks. The hooks are mostly very narrow, returning only the data needed.

Data objects are never passed down, only node id and field name - sometimes the field kind ('input' or 'output').

The end result is a *much* smoother node editor with very minimal rerenders.
2023-08-16 23:04:36 +10:00
f7c92e1eff fix(ui): disable awkward resize animation for <Flow /> 2023-08-16 23:04:36 +10:00
70b8c3dfea fix(ui): fix context menu on workflow editor
There is a tricky mouse event interaction between chakra's `useOutsideClick()` hook (used by chakra `<Menu />`) and reactflow. The hook doesn't work when you click the main reactflow area.

To get around this, I've used a dirty hack, copy-pasting the simple context menu component we use, and extending it slightly to respond to a global `contextMenusClosed` redux action.
2023-08-16 23:04:36 +10:00
43b30355e4 feat: make primitive node titles consistent 2023-08-16 23:04:36 +10:00
a93bd01353 fix bad merge 2023-08-16 08:53:07 -04:00
bb1b8ceaa8 Update invokeai/backend/model_management/model_cache.py
Co-authored-by: StAlKeR7779 <stalkek7779@yandex.ru>
2023-08-16 08:48:44 -04:00
be8edaf3fd Merge branch 'main' into feat/collect-more-stats 2023-08-16 08:48:14 -04:00
9cbaefaa81 feat: Add Seam Painting to SDXL 2023-08-16 19:46:48 +12:00
cc7c6e5d41 feat: Add Seam Painting with Scale Before 2023-08-16 19:35:03 +12:00
f2ee8a3da8 wip: Basic Seam Painting (only normal models) (no scale) 2023-08-16 17:26:23 +12:00
e98d7a52d4 feat: Add Seam Painting Options 2023-08-16 17:25:55 +12:00
21e1c0a5f0 tweaked formatting 2023-08-15 22:25:30 -04:00
611e241ca7 chore(ui): regen types 2023-08-16 12:07:34 +10:00
6df4af2c79 chore: lint 2023-08-16 12:07:34 +10:00
0f8606914e feat(ui): remove shouldShowDeleteButton
- remove this state entirely
- use `state.hotkeys.shift` directly to hide and show the icon on gallery
- also formatting
2023-08-16 12:07:34 +10:00
5b1099193d fix(ui): restore reset button in node image component 2023-08-16 12:07:34 +10:00
230131646f feat(ui): use imageDTOs instead of images in starring queries 2023-08-16 12:07:34 +10:00
8b1ec2685f chore: black 2023-08-16 12:07:34 +10:00
60c2c877d7 fix: add response model for star/unstar routes
- also implement pessimistic updates for starring, only changing the images that were successfully updated by backend
- some autoformat changes crept in
2023-08-16 12:07:34 +10:00
315a056686 feat(ui): show Star All if selection is a mix of starred and unstarred 2023-08-16 12:07:34 +10:00
80b0c5eab4 change from pin to star 2023-08-16 12:07:34 +10:00
08dc265e09 add listener to update selection list with change in star status 2023-08-16 12:07:34 +10:00
029a95550e rename pin to star, add multiselect and remove single image update api 2023-08-16 12:07:34 +10:00
ee6a26a97d update list images endpoint to sort by pinnedness and then created_at 2023-08-16 12:07:34 +10:00
a512fdc0f6 update IAIDndImage to use children for icons, add UI for shift+delete to delete images from gallery 2023-08-16 12:07:34 +10:00
767a612746 (ui) WIP trying to get all cache scenarios working smoothly, fix assets 2023-08-16 12:07:34 +10:00
0a71d6baa1 (ui) update cache to render pinned images alongside unpinned correctly, as well as changes in pinnedness 2023-08-16 12:07:34 +10:00
37be827e17 (ui) hook up toggle pin mutation with context menu for single image 2023-08-16 12:07:34 +10:00
04a9894e77 (api) add ability to pin and unpin images 2023-08-16 12:07:34 +10:00
f9958de6be added memory used to load models 2023-08-15 21:56:19 -04:00
ec10aca91e report RAM and RAM cache statistics 2023-08-15 21:00:30 -04:00
2b7dd3e236 feat: add missing primitive collections
- add missing primitive collections
- remove `Seed` and `LoRAField` (they don't exist)
2023-08-16 09:54:38 +10:00
fa884134d9 feat: rename ui_type_hint to ui_type
Just a bit more succinct while not losing any clarity.
2023-08-16 09:54:38 +10:00
18006cab9a chore: Regen frontend types 2023-08-16 09:54:38 +10:00
75ea716c13 feat(ui): hide node footer if there is nothing to display 2023-08-16 09:54:38 +10:00
d5f7027597 feat: Save Mask option for Canvas 2023-08-16 09:54:38 +10:00
b1ad777f5a fix: Outpainting being broken due to field name change 2023-08-16 09:54:38 +10:00
f65c8092cb fix(ui): fix issue with node editor state not restoring correctly on mount
If `reactflow` initializes before the node templates are parsed, edges may not be rendered and the viewport may get reset.

- Add `isReady` state to `NodesState`. This is false when we are loading or parsing node templates and true when that is finished.
- Conditionally render `reactflow` based on `isReady`.
- Add `viewport` to `NodesState` & handlers to keep it synced. This allows `reactflow` to mount and unmount freely and not lose viewport.
2023-08-16 09:54:38 +10:00
94bfef3543 feat(ui): add UI component for unknown node types 2023-08-16 09:54:38 +10:00
c48fd9c083 feat(nodes): refactor parameter/primitive nodes
Refine concept of "parameter" nodes to "primitives":
- integer
- float
- string
- boolean
- image
- latents
- conditioning
- color

Each primitive has:
- A field definition, if it is not already python primitive value. The field is how this primitive value is passed between nodes. Collections are lists of the field in node definitions. ex: `ImageField` & `list[ImageField]`
- A single output class. ex: `ImageOutput`
- A collection output class. ex: `ImageCollectionOutput`
- A node, which functions to load or pass on the primitive value. ex: `ImageInvocation` (in this case, `ImageInvocation` replaces `LoadImage`)

Plus a number of related changes:
- Reorganize these into `primitives.py`
- Update all nodes and logic to use primitives
- Consolidate "prompt" outputs into "string" & "mask" into "image" (there's no reason for these to be different, the function identically)
- Update default graphs & tests
- Regen frontend types & minor frontend tidy related to changes
2023-08-16 09:54:38 +10:00
f49fc7fb55 feat: node editor
squashed rebase on main after backendd refactor
2023-08-16 09:54:38 +10:00
a4b029d03c write RAM usage and change after each generation 2023-08-15 18:21:31 -04:00
d6c9bf5b38 added sdxl controlnet detection 2023-08-15 12:51:15 -04:00
e54355f0f3 Prevent merge from crashing with a WindowsPath serialization error (#4271)
## What type of PR is this? (check all applicable)

- [X] Bug Fix

## Have you discussed this change with the InvokeAI team?
- [X] Yes

## Have you updated all relevant documentation?
- [X] Yes

## Description

On Windows systems, model merging was crashing at the very last step
with an error related to not being able to serialize a WindowsPath
object. I have converted the path that is passed to `save_pretrained`
into a string, which I believe will solve the problem.

Note that I had to rebuild the web frontend and add it to the PR in
order to test on my Windows VM which does not have the full node stack
installed due to space limitations.

## Related Tickets & Documents


https://discord.com/channels/1020123559063990373/1042475531079262378/1140680788954861698
2023-08-15 15:11:01 +12:00
b2934be6ba use as_posix() instead of str() 2023-08-14 22:59:26 -04:00
eab67b6a01 fixed actual bug 2023-08-14 22:59:26 -04:00
02fa116690 rebuild frontend for windows testing 2023-08-14 22:59:26 -04:00
5190a4c282 further removal of Paths 2023-08-14 22:59:26 -04:00
141d438517 prevent windows from crashing with a WindowsPath serialization error on merge 2023-08-14 22:59:26 -04:00
549d2e0485 chore: remove old web server code and python deps 2023-08-15 10:54:57 +10:00
d3d8b71c67 feat: Change refinerStart default to 0.8
This is the recommended value according to the paper.
2023-08-15 10:13:02 +10:00
6eaaa75a5d Use double quotes in docker entrypoint to prevent word splitting (#4260)
## What type of PR is this? (check all applicable)

- [ ] Refactor
- [ ] Feature
- [x] Bug Fix
- [ ] Optimization
- [ ] Documentation Update
- [ ] Community Node Submission


## Have you discussed this change with the InvokeAI team?
- [ ] Yes
- [x] No, because: it's smol

      
## Have you updated all relevant documentation?
- [ ] Yes
- [x] No


## Description
docker_entrypoint.sh does not quote variable expansion to prevent word
splitting, causing paths with spaces to fail as in #3913

## Related Tickets & Documents
#3913

<!--
For pull requests that relate or close an issue, please include them
below. 

For example having the text: "closes #1234" would connect the current
pull
request to issue 1234.  And when we merge the pull request, Github will
automatically close the issue.
-->

- Related Issue #3913
- Closes #3913

## QA Instructions, Screenshots, Recordings

<!-- 
Please provide steps on how to test changes, any hardware or 
software specifications as well as any other pertinent information. 
-->

## Added/updated tests?

- [ ] Yes
- [x] No : _please replace this line with details on why tests
      have not been included_

## [optional] Are there any post deployment tasks we need to perform?
2023-08-15 02:15:22 +12:00
ba57ec5907 Merge branch 'main' into fix/docker_entrypoint 2023-08-14 09:26:32 -04:00
cd0e4bc1d7 Refactor generation backend (#4201)
## What type of PR is this? (check all applicable)

- [x] Refactor
- [x] Feature
- [x] Bug Fix
- [x] Optimization
- [ ] Documentation Update
- [ ] Community Node Submission


## Have you discussed this change with the InvokeAI team?
- [x] Yes
- [ ] No, because:

      
## Have you updated all relevant documentation?
- [ ] Yes
- [x] No


## Description
- Remove SDXL raw prompt nodes
- SDXL and SD1/2 generation merged to same nodes - t2l/l2l
- Fixed - if no xformers installed we trying to enable attention
slicing, ignoring torch-sdp availability
- Fixed - In SDXL negative prompt now creating zeroed tensor(according
to official code)
- Added mask field to l2l node
- Removed inpaint node and all legacy code related to this node
- Pass info about seed in latents, so we can use it to initialize
ancestral/sde schedulers
- t2l and l2l nodes moved from strength to denoising_start/end
- Removed code for noise threshold(@hipsterusername said that there no
plans to restore this feature)
- Fixed - first preview image now not gray
- Fixed - report correct total step count in progress, added scheduler
order in progress event
- Added MaskEdge and ColorCorrect nodes (@hipsterusername)

## Added/updated tests?

- [ ] Yes
- [x] No
2023-08-13 23:08:11 -04:00
9d3cd85bdd chore: black 2023-08-14 13:02:33 +10:00
46a8eed33e Merge branch 'main' into feat/refactor_generation_backend 2023-08-14 13:01:28 +10:00
9fee3f7b66 Revert "Add magic to debug"
This reverts commit 511da59793.
2023-08-14 12:58:08 +10:00
9217a217d4 fix(ui): refiner uses steps directly, no math 2023-08-14 12:56:37 +10:00
b2700ffde4 Update post processing docs 2023-08-13 22:25:49 -04:00
511da59793 Add magic to debug 2023-08-14 05:14:24 +03:00
409e5d01ba Fix cpu_only schedulers(unipc) 2023-08-14 05:14:05 +03:00
58d5c61c79 fix: SDXL Inpaint & Outpaint using regular Img2Img strength 2023-08-14 12:55:18 +12:00
3d8da67be3 Remove callback-generator wrapper 2023-08-14 03:35:15 +03:00
957ee6d370 fix: SDXL Canvas Inpaint & Outpaint not respecting SDXL Refiner start value 2023-08-14 12:13:29 +12:00
fecad2c014 fix: SDXL Denoising Strength not plugged in correctly 2023-08-14 11:59:11 +12:00
550e6ef27a re: Set the image denoise str back to 0
Bug has been fixed. No longer needed.
2023-08-14 10:27:07 +12:00
cc85c98bf3 feat: Upgrade Diffusers to 0.19.3
Needed for some schedulers
2023-08-14 09:26:28 +12:00
75fb3f429f re: Readd Refiner Step Math but cap max steps to 1000 2023-08-14 09:26:01 +12:00
d63bb39475 Make dpmpp_sde(_k) use not random seed 2023-08-14 00:24:38 +03:00
096333ba3f Fix error on zero timesteps 2023-08-14 00:20:01 +03:00
0b2925709c Use double quotes in docker entrypoint to prevent word splitting 2023-08-13 14:36:55 -05:00
7a8f14d595 Clean-up code a bit 2023-08-13 19:50:48 +03:00
59ba9fc0f6 Flip bits in seed for sde/ancestral schedulers to have different noise from initial 2023-08-13 19:50:16 +03:00
6e0beb1ed4 Fixes for second order scheduler timesteps 2023-08-13 19:31:47 +03:00
94636ddb03 Fix empty prompt handling 2023-08-13 19:31:14 +03:00
746e099f0d fix: Do not do step math for refinerSteps
This is probably better done on the backend or in a different way. This can cause steps to go above 1000 which is more than the set number for the model.
2023-08-14 04:04:15 +12:00
499e89d6f6 feat: Add SDXL Negative Aesthetic Score 2023-08-14 04:02:36 +12:00
250d530260 Fixed import issue in invokeai/frontend/install/model_install.py (#4259)
This fixes an import issue introduced in commit 1bfe983. The change made
'invokeai_configure' into a module but this line still tries to call it
as if it's a function. This will result in a `'module' not callable`
error.

## What type of PR is this? (check all applicable)

- [ ] Refactor
- [ ] Feature
- [x] Bug Fix
- [ ] Optimization
- [ ] Documentation Update
- [ ] Community Node Submission


## Have you discussed this change with the InvokeAI team?
- [x] Yes
- [ ] No, because:

      
## Have you updated all relevant documentation?
- [ ] Yes
- [x] No


## Description

imic from discord ask that I submit a PR to fix this bug.

## Related Tickets & Documents

<!--
For pull requests that relate or close an issue, please include them
below. 

For example having the text: "closes #1234" would connect the current
pull
request to issue 1234.  And when we merge the pull request, Github will
automatically close the issue.
-->

- Related Issue #
- Closes #

## QA Instructions, Screenshots, Recordings

<!-- 
Please provide steps on how to test changes, any hardware or 
software specifications as well as any other pertinent information. 
-->

## Added/updated tests?

- [ ] Yes
- [ ] No : _please replace this line with details on why tests
      have not been included_

## [optional] Are there any post deployment tasks we need to perform?
2023-08-14 02:40:08 +12:00
90fa3eebb3 feat: Make SDXL Style Prompt not take spaces 2023-08-14 02:25:39 +12:00
0aba105a8f Missed a spot in configure_invokeai.py 2023-08-13 05:32:35 -07:00
9e2e82a752 Fixed import issue in invokeai/frontend/install/model_install.py
This fixes an import issue introduced in commit 1bfe983.
The change made 'invokeai_configure' into a module but this line still tries to call it as if it's a function. This will result in a `'module' not callable` error.
2023-08-13 05:15:55 -07:00
561951ad98 chore: Black linting 2023-08-13 21:28:39 +12:00
3ff9961bda fix: Circular dependency in Mask Blur Method 2023-08-13 21:26:20 +12:00
33779b6339 chore: Remove shouldFitToWidthHeight from Inpaint Graphs
Was never used for inpainting but was fed to the node anyway.
2023-08-13 21:16:37 +12:00
b35cdc05a5 feat: Scaled Processing to Inpainting & Outpainting / 1.x & SDXL 2023-08-13 20:17:23 +12:00
9afb5d6ace Update version to 3.0.2post1 2023-08-12 19:49:33 -04:00
50177b8ed9 Update frontend JS files 2023-08-12 19:49:33 -04:00
c8864e475b fix: SDXL Lora's not working on Canvas Image To Image 2023-08-13 04:34:15 +12:00
fcf7f4ac77 feat: Add SDXL ControlNet To Linear UI 2023-08-13 04:27:38 +12:00
29f1c6dc82 fix: Image To Image FP32 Fix for Canvas SDXL 2023-08-13 04:23:52 +12:00
28208e6f49 fix: Fix VAE Precision not working for SDXL Canvas Modes 2023-08-13 04:09:51 +12:00
c33acf951e feat: Make Refiner work with Canvas 2023-08-13 03:53:40 +12:00
500cd552bc feat: Make SDXL work across the board + Custom VAE Support
Also a major cleanup pass to the SDXL graphs to ensure there's no ID overlap
2023-08-13 01:45:03 +12:00
55d27f71a3 feat: Give each graph its own unique id 2023-08-13 00:51:10 +12:00
746c7c59ff fix: remove extra node for canvas output catch 2023-08-12 22:39:30 +12:00
ad96c41156 feat: Add Canvas Output node to all Canvas Graphs 2023-08-12 22:04:43 +12:00
27bd127fb0 fix: Do not add anything but final output to staging area 2023-08-12 21:10:30 +12:00
f296e5c41e wip: Remove MaskBlur / Adjust color correction 2023-08-12 20:54:30 +12:00
a67d8376c7 fix missed spot for autoAddBoardId none 2023-08-12 18:07:01 +10:00
9f6221fe8c Merge branch 'main' into feat/refactor_generation_backend 2023-08-12 18:37:47 +12:00
7587b54787 chore: Cleanup, comment and organize Node Graphs
Before it gets too chaotic
2023-08-12 17:17:46 +12:00
7254ffc3e7 chore: Split Inpaint and Outpaint Graphs 2023-08-12 16:30:20 +12:00
6034fa12de feat: Add Mask Blur node 2023-08-12 16:20:58 +12:00
ce3675fc14 Apply denoising_start/end according on timestep value 2023-08-12 03:19:49 +03:00
8acd7eeca5 feat: Disable clip skip for SDXL Canvas 2023-08-12 08:18:30 +12:00
7293a6036a feat(wip): Add SDXL To Canvas 2023-08-12 08:16:05 +12:00
0b11f309ca instead of crashing when a corrupted model is detected, warn and move on 2023-08-11 15:05:14 -04:00
6a8eb392b2 Add support for loading SDXL LoRA weights in diffusers format. 2023-08-11 14:40:22 -04:00
f343ab0302 wip: Port Outpainting to new backend 2023-08-12 06:15:59 +12:00
824ca92760 fix maximum python version instructions 2023-08-11 13:49:39 -04:00
d7d6298ec0 feat: Add Infill Method support 2023-08-12 05:32:11 +12:00
58a48bf197 fix: LoRA list name sorting 2023-08-12 04:47:15 +12:00
5629d8fa37 fix; Key issue in Lora List 2023-08-12 04:43:40 +12:00
1affb7f647 feat: Add Paste / Mask Blur / Color Correction to Inpainting
Seam options are now removed. They are replaced by two options --Mask Blur and Mask Blur Method .. which control the softness of the mask that is being painted.
2023-08-12 03:28:19 +12:00
69a9dc7b36 wip: Add initial Inpaint Graph 2023-08-12 02:42:13 +12:00
f3ae52ff97 Fix error at high denoising_start, fix unipc(cpu_only) 2023-08-11 15:46:16 +03:00
7479f9cc02 feat: Update LinearUI to use new backend (except Inpaint) 2023-08-11 22:22:01 +12:00
87ce4ab27c fix: Update default_graph to use new DenoiseLatents 2023-08-11 22:21:13 +12:00
7c0023ad9e feat: Remove TextToLatents / Rename Latents To Latents -> DenoiseLatents 2023-08-11 22:20:37 +12:00
231e665675 Merge branch 'main' into feat/refactor_generation_backend 2023-08-11 20:53:38 +12:00
80fd4c2176 undo lint changes 2023-08-11 14:26:09 +10:00
3b6e425e17 fix error detail in toast 2023-08-11 14:26:09 +10:00
50415450d8 invalidate board total when images deleted, only run date range logic if board has less than 20 images 2023-08-11 14:26:09 +10:00
06296896a9 Update invokeai version 2023-08-10 22:23:41 -04:00
a7399aca0c Add new JS files for 3.0.2 build 2023-08-10 22:23:41 -04:00
d1ea8b1e98 Two changes to command-line scripts (#4235)
During install testing I discovered two small problems in the
command-line scripts. These are fixed.

## What type of PR is this? (check all applicable)

- [X Bug Fix

## Have you discussed this change with the InvokeAI team?
- [X] Yes
- 
      
## Have you updated all relevant documentation?
- [X] Yes


## Description

- installer - use correct entry point for invokeai-configure
- model merge script - prevent error when `--root` not provided
2023-08-10 21:11:45 -04:00
f851ad7ba0 Two changes to command-line scripts
- installer - use correct entry point for invokeai-configure
- model merge script - prevent error when `--root` not provided
2023-08-10 20:59:22 -04:00
591838a84b Add support for LyCORIS IA3 format (#4234)
## What type of PR is this? (check all applicable)

- [ ] Refactor
- [x] Feature
- [ ] Bug Fix
- [ ] Optimization
- [ ] Documentation Update
- [ ] Community Node Submission


## Have you discussed this change with the InvokeAI team?
- [ ] Yes
- [x] No

      
## Have you updated all relevant documentation?
- [ ] Yes
- [x] No


## Description
Add support for LyCORIS IA3 format

## Related Tickets & Documents
- Closes #4229 

## Added/updated tests?

- [ ] Yes
- [x] No
2023-08-11 03:30:35 +03:00
c0c2ab3dcf Format by black 2023-08-11 03:20:56 +03:00
56023bc725 Add support for LyCORIS IA3 format 2023-08-11 02:08:08 +03:00
2ef6a8995b Temporary force set vae to same precision as unet 2023-08-10 18:01:58 -04:00
d0fee93aac round slider values to nice numbers 2023-08-10 18:00:45 -04:00
1bfe9835cf clip cache settings to permissible values; remove redundant imports in install __init__ file 2023-08-10 18:00:45 -04:00
8e7eae6cc7 Probe LoRAs that do not have the text encoder (#4181)
## What type of PR is this? (check all applicable)

- [X] Bug Fix

## Have you discussed this change with the InvokeAI team?
- [X] No - minor fix

      
## Have you updated all relevant documentation?
- [X] Yes

## Description

It turns out that some LoRAs do not have the text encoder model, and
this was causing the code that distinguishes the model base type during
model import to reject them as having an unknown base model. This PR
enables detection of these cases.
2023-08-10 17:50:20 -04:00
f6522c8971 Merge branch 'main' into fix/detect-more-loras 2023-08-10 17:33:16 -04:00
a969707e45 prevent vae: '' from crashing model 2023-08-10 17:33:04 -04:00
6c8e898f09 Update scripts/verify_checkpoint_template.py
Co-authored-by: Eugene Brodsky <ebr@users.noreply.github.com>
2023-08-10 16:00:33 -04:00
7bad9bcf53 update dependencies and docs to cu118 2023-08-10 15:19:12 -04:00
d42b45116f fix(ui): fix lora sort (#4222)
## What type of PR is this? (check all applicable)

- [ ] Refactor
- [ ] Feature
- [s] Bug Fix
- [ ] Optimization
- [ ] Documentation Update
- [ ] Community Node Submission


## Have you discussed this change with the InvokeAI team?
- [x] Yes
- [ ] No, because:

      

## Description

was sorting with disabled at top of list instead of bottom

fixes #4217

## Related Tickets & Documents

<!--
For pull requests that relate or close an issue, please include them
below. 

For example having the text: "closes #1234" would connect the current
pull
request to issue 1234.  And when we merge the pull request, Github will
automatically close the issue.
-->

- Related Issue #
- Closes #4217

## QA Instructions, Screenshots, Recordings

<!-- 
Please provide steps on how to test changes, any hardware or 
software specifications as well as any other pertinent information. 
-->

![image](https://github.com/invoke-ai/InvokeAI/assets/4822129/dd895b86-05de-4303-8674-9b181037abaa)
2023-08-10 21:04:28 +12:00
d4812bbc8d Merge branch 'main' into fix/ui/fix-lora-sort 2023-08-10 19:00:26 +10:00
3cd05cf6bf fix(ui): fix lora sort
was sorting with disabled at top of list instead of bottom

fixes #4217
2023-08-10 15:31:29 +10:00
2564301aeb fix(ui): fix canvas model switching (#4221)
## What type of PR is this? (check all applicable)

- [ ] Refactor
- [ ] Feature
- [x] Bug Fix
- [ ] Optimization
- [ ] Documentation Update
- [ ] Community Node Submission


## Have you discussed this change with the InvokeAI team?
- [x] Yes
- [ ] No, because:

## Description

There was no check at all to see if the canvas had a valid model already
selected. The first model in the list was selected every time.

Now, we check if its valid. If not, we go through the logic to try and
pick the first valid model.

If there are no valid models, or there was a problem listing models, the
model selection is cleared.

## Related Tickets & Documents

<!--
For pull requests that relate or close an issue, please include them
below. 

For example having the text: "closes #1234" would connect the current
pull
request to issue 1234.  And when we merge the pull request, Github will
automatically close the issue.
-->


- Closes #4125

## QA Instructions, Screenshots, Recordings

<!-- 
Please provide steps on how to test changes, any hardware or 
software specifications as well as any other pertinent information. 
-->

- Go to Canvas tab
- Select a model other than the first one in the list
- Go to a different tab
- Go back to Canvas tab
- The model should be the same as you selected
2023-08-10 17:29:41 +12:00
da0efeaa7f fix(ui): fix canvas model switching
There was no check at all to see if the canvas had a valid model already selected. The first model in the list was selected every time.

Now, we check if its valid. If not, we go through the logic to try and pick the first valid model.

If there are no valid models, or there was a problem listing models, the model selection is cleared.
2023-08-10 15:20:37 +10:00
49cce1eec6 feat: add app_version to image metadata 2023-08-10 14:22:39 +10:00
e9ec5ab85c Apply requested changes
Co-Authored-By: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
2023-08-10 06:19:22 +03:00
17fed1c870 Fix merge conflict errors 2023-08-10 05:03:33 +03:00
ade78b9591 Merge branch 'main' into feat/refactor_generation_backend 2023-08-10 04:32:16 +03:00
c8fbaf54b6 Add self.min, not self.max 2023-08-10 09:59:22 +10:00
f86d388786 refactor(diffusers_pipeline): remove unused pipeline methods 🚮 (#4175) 2023-08-09 15:19:27 -07:00
cd2c688562 Merge branch 'main' into refactor/remove_unused_pipeline_methods 2023-08-09 17:26:09 -04:00
2d29ac6f0d Add techjedi's image import script (#4171)
## What type of PR is this? (check all applicable)

- [X ] Feature

## Have you discussed this change with the InvokeAI team?
- [X] Yes

## Have you updated all relevant documentation?
- [X] Yes

## Description

This PR adds the `invokeai-import-images` script, which imports a
directory of 2.*.* -generated images into the current InvokeAI root
directory, preserving and converting their metadata. The script also
handles 3.* images.

Many thanks to @techjedi for writing this. This version differs from the
original in two minor respects:

1. It is installed as an `invokeai-import-images` command.
2. The prompts for image and database paths use file completion provided
by the `prompt_toolkit` library.
## To Test

1. Activate the virtual environment for the destination root to import
INTO
2. Run `invokeai-import-images`
3. Follow the prompts

## Related Tickets & Documents

This is a frequently-requested feature on Discord, but I couldn't find
an Issue.

## QA Instructions, Screenshots, Recordings

<!-- 
Please provide steps on how to test changes, any hardware or 
software specifications as well as any other pertinent information. 
-->

## Added/updated tests?

- [ ] Yes
- [X] No : but should in the future
2023-08-09 13:17:08 -04:00
2c2b731386 fix typo 2023-08-09 13:08:59 -04:00
2f68a1a76c use Stalker's simplified LoRA vector-length detection code 2023-08-09 09:21:29 -04:00
930e7bc754 Merge branch 'main' into feat/image-import-script 2023-08-09 08:54:56 -04:00
7d4ace962a Merge branch 'main' into fix/detect-more-loras 2023-08-09 08:48:27 -04:00
06842f8e0a Update to 3.0.2rc1 2023-08-09 00:29:43 -04:00
c82da330db Pin safetensors to 0.3.1
Safetensors 0.3.2 does not ship an ARM64 wheel so install on macOS fails
2023-08-09 00:29:43 -04:00
628df4ec98 Add updated frontend html file 2023-08-09 00:29:43 -04:00
16b956616f Update version to 3.0.2 2023-08-09 00:29:43 -04:00
604cc17a3a Yarn build JS files 2023-08-09 00:29:43 -04:00
37c9b85549 Add slider for VRAM cache in configure script (#4133)
## What type of PR is this? (check all applicable)

- [X ] Feature

## Have you discussed this change with the InvokeAI team?
- [X] Yes
- [ ] No, because:

      
## Have you updated all relevant documentation?
- [ ] Yes
- [X] No - will be in release notes

## Description

On CUDA systems, this PR adds a new slider to the install-time configure
script for adjusting the VRAM cache and suggests a good starting value
based on the user's max VRAM (this is subject to verification).

On non-CUDA systems this slider is suppressed.

Please test on both CUDA and non-CUDA systems using:
```
invokeai-configure --root ~/invokeai-main/ --skip-sd --skip-support
```

To see and test the default values, move `invokeai.yaml` out of the way
before running.

**Note added 8 August 2023**

This PR also fixes the configure and model install scripts so that if
the window is too small to fit the user interface, the user will be
prompted to interactively resize the window and/or change font size
(with the option to give up). This will prevent `npyscreen` from
generating its horrible tracebacks.

## Related Tickets & Documents

<!--
For pull requests that relate or close an issue, please include them
below. 

For example having the text: "closes #1234" would connect the current
pull
request to issue 1234.  And when we merge the pull request, Github will
automatically close the issue.
-->

- Related Issue #
- Closes #

## QA Instructions, Screenshots, Recordings

<!-- 
Please provide steps on how to test changes, any hardware or 
software specifications as well as any other pertinent information. 
-->

## Added/updated tests?

- [ ] Yes
- [ ] No : _please replace this line with details on why tests
      have not been included_

## [optional] Are there any post deployment tasks we need to perform?
2023-08-09 12:27:54 +10:00
8b39b67ec7 Merge branch 'main' into feat/select-vram-in-config 2023-08-09 12:17:27 +10:00
a933977861 Pick correct config file for sdxl models (#4191)
## What type of PR is this? (check all applicable)

- [X] Bug Fix

## Have you discussed this change with the InvokeAI team?
- [X] Yes
- [ ] No, because:

      
## Have you updated all relevant documentation?
- [X Yes
- [ ] No


## Description

If `models.yaml` is cleared out for some reason, the model manager will
repopulate it by scanning `models`. However, this would fail with a
pydantic validation error if any SDXL checkpoint models were present
because the lack of logic to pick the correct configuration file. This
has now been added.
2023-08-09 11:16:48 +10:00
dfb41d8461 Merge branch 'main' into bugfix/autodetect-sdxl-ckpt-config 2023-08-09 03:57:44 +03:00
e98f7eda2e Fix total_steps in generation event, order field added 2023-08-09 03:34:25 +03:00
b4a74f6523 Add MaskEdge and ColorCorrect nodes
Co-Authored-By: Kent Keirsey <31807370+hipsterusername@users.noreply.github.com>
2023-08-08 23:57:02 +03:00
f7aec3b934 Move conditioning class to backend 2023-08-08 23:33:52 +03:00
4d5169e16d Merge branch 'main' into feat/select-vram-in-config 2023-08-08 13:50:02 -04:00
a7e44678fb Remove legacy/unused code 2023-08-08 20:49:01 +03:00
da0184a786 Invert mask, fix l2l on no mask conntected, remove zeroing latents on zero start 2023-08-08 20:01:49 +03:00
f56f19710d allow user to interactively resize screen before UI runs 2023-08-08 12:27:25 -04:00
96b7248051 Add mask to l2l 2023-08-08 18:50:36 +03:00
e77400ab62 remove deprecated options from config 2023-08-08 08:33:30 -07:00
13347f6aec blackified 2023-08-08 08:33:30 -07:00
a9bf387e5e turned on Pydantic validate_assignment 2023-08-08 08:33:30 -07:00
8258c87a9f refrain from writing deprecated legacy options to invokeai.yaml 2023-08-08 08:33:30 -07:00
1b1b399fd0 Fix crash when attempting to update a model (#4192)
## What type of PR is this? (check all applicable)

- [X] Bug Fix


## Have you discussed this change with the InvokeAI team?
- [X No, because small fix

      
## Have you updated all relevant documentation?
- [X] Yes

## Description

A logic bug was introduced in PR #4109 that caused Web-based model
updates to fail with a pydantic validation error. This corrects the
problem.

## Related Tickets & Documents

PR #4109
2023-08-08 10:54:27 -04:00
a8d3e078c0 Merge branch 'main' into fix/detect-more-loras 2023-08-08 10:42:45 -04:00
6ed7ba57dd Merge branch 'main' into bugfix/fix-model-updates 2023-08-08 09:05:25 -04:00
2b3b77a276 api(images): allow HEAD request on image/full (#4193) 2023-08-08 00:08:48 -07:00
8b8ec68b30 Merge branch 'main' into feat/image_http_head 2023-08-08 00:02:48 -07:00
e20af5aef0 feat(ui): add LoRA support to SDXL linear UI
new graph modifier `addSDXLLoRasToGraph()` handles adding LoRA to the SDXL t2i and i2i graphs.
2023-08-08 15:02:00 +10:00
57e8ec9488 chore(ui): lint/format 2023-08-08 12:53:47 +10:00
734a9e4271 invalidate board total when images deleted, only run date range logic if board has less than 20 images 2023-08-08 12:53:47 +10:00
fe924daee3 add option to disable multiselect 2023-08-08 12:53:47 +10:00
750f09fbed blackify 2023-08-07 21:01:59 -04:00
4df581811e add template verification script 2023-08-07 21:01:48 -04:00
eb70bc2ae4 add scripts to create model templates and check whether they match 2023-08-07 21:00:47 -04:00
5f29526a8e Add seed to latents field 2023-08-08 04:00:33 +03:00
492bfe002a Remove sdxl t2l/l2l nodes 2023-08-08 03:38:42 +03:00
809705c30d api(images): allow HEAD request on image/full 2023-08-07 15:11:47 -07:00
f0918edf98 improve error reporting on unrecognized lora models 2023-08-07 16:38:58 -04:00
a846d82fa1 Add techedi code to avoid rendering prompt/seed with null
- Added techjedi github and real names
2023-08-07 16:29:46 -04:00
22f7cf0638 add stalker's complicated but effective code for finding token vector length in LoRAs 2023-08-07 16:19:57 -04:00
25c669b1d6 Merge remote-tracking branch 'origin/main' into refactor/remove_unused_pipeline_methods 2023-08-07 13:03:10 -07:00
4367061b19 fix(ModelManager): fix overridden VAE with relative path (#4059) 2023-08-07 12:57:32 -07:00
0fd13d3604 Merge branch 'main' into feat/select-vram-in-config 2023-08-07 15:51:59 -04:00
72a3e776b2 fix logic error introduced in PR 4109 2023-08-07 15:38:22 -04:00
af044007d5 pick correct config file for sdxl models 2023-08-07 15:19:49 -04:00
1db2c93f75 Fix preview, inpaint 2023-08-07 21:27:32 +03:00
f272a44feb Merge branch 'main' into refactor/model_manager_instantiate 2023-08-07 10:59:28 -07:00
2539e26c18 Apply denoising_start/end, add torch-sdp to memory effictiend attention func 2023-08-07 19:57:11 +03:00
b0738b7f70 Fixes, zero tensor for empty negative prompt, remove raw prompt node 2023-08-07 18:37:06 +03:00
8469d3e95a chore: black 2023-08-07 10:05:52 +10:00
ae17d01e1d Fix hue adjustment (#4182)
* Fix hue adjustment

Hue adjustment wasn't working correctly because color channels got swapped. This has now been fixed and we're using PIL rather than cv2 to do the RGBA->HSV->RGBA conversion. The range of hue adjustment is also the more typical 0..360 degrees.
2023-08-06 23:23:51 +00:00
f3d3316558 probe LoRAs that do not have the text encoder 2023-08-06 16:00:53 -04:00
5a6cefb0ea add backslash to end of incomplete windows paths 2023-08-06 12:34:35 -04:00
1a6f5f0860 use backslash on Windows systems for autoadded delimiter 2023-08-06 12:29:31 -04:00
5bfd6cb66f Merge remote-tracking branch 'origin/main' into refactor/model_manager_instantiate
# Conflicts:
#	invokeai/backend/model_management/model_manager.py
2023-08-05 22:02:28 -07:00
59caff7ff0 refactor(diffusers_pipeline): remove unused img2img wrappers 🚮
invokeai.app no longer needs this as a single method, as it builds on latents2latents instead.
2023-08-05 21:50:52 -07:00
6487e7d906 refactor(diffusers_pipeline): remove unused ModelGroup 🚮
orphaned since #3550 removed the LazilyLoadedModelGroup code, probably unused since ModelCache took over responsibility for sequential offload somewhere around #3335.
2023-08-05 21:50:52 -07:00
77033eabd3 refactor(diffusers_pipeline): remove unused precision 🚮 2023-08-05 21:50:52 -07:00
b80abdd101 refactor(diffusers_pipeline): remove unused image_from_embeddings 🚮 2023-08-05 21:50:52 -07:00
006d782cc8 refactor(diffusers_pipeline): tidy imports 🚮 2023-08-05 21:50:52 -07:00
d09dfc3e9b fix(api): use db_location instead of db_path_string
This may just be the SQLite memory sentinel value.
2023-08-06 14:09:04 +10:00
66f524cae7 fix(mm): fix a lot of typing issues
Most fixes are just things being typed as `str` but having default values of `None`, but there are some minor logic changes.
2023-08-06 14:09:04 +10:00
9ba50130a1 fix(api): fix db location types
The services all want strings instead of `Path`s; create variable for the string representation of the path provided by the config services.
2023-08-06 14:09:04 +10:00
d4cf2d2666 fix(api): fix ApiDependencies.invoker types
ApiDependencies.invoker` provides typing for the API's services layer. Marking it `Optional` results in all the routes seeing it as optional, which is not good.

Instead of marking it optional to satisfy the initial assignment to `None`, we can just skip the initial assignment. This preserves the IDE hinting in API layer and is types-legal.
2023-08-06 14:09:04 +10:00
9aaf67c5b4 wip 2023-08-06 05:05:25 +03:00
b8b589c150 fix(nodes): fix hsl nodes rebase conflict 2023-08-06 09:57:49 +10:00
d93900a8de Added HSL Nodes 2023-08-06 09:57:49 +10:00
7f4c387080 test(model_management): factor out name strings 2023-08-05 15:46:46 -07:00
80876bbbd1 Merge remote-tracking branch 'origin/refactor/model_manager_instantiate' into refactor/model_manager_instantiate 2023-08-05 15:25:05 -07:00
7a4ff4c089 Merge branch 'main' into refactor/model_manager_instantiate 2023-08-05 15:23:38 -07:00
44bf308192 test(model_management): add a couple tests for _get_model_path 2023-08-05 15:22:23 -07:00
12e51c84ae blackified 2023-08-05 14:26:16 -07:00
b2eb83deff add docs 2023-08-05 14:26:16 -07:00
0ccc3b509e add techjedi's import script, with some filecompletion tweaks 2023-08-05 14:26:16 -07:00
4043a4c21c blackified 2023-08-05 12:44:58 -04:00
c8ceb96091 add docs 2023-08-05 12:26:52 -04:00
83f75750a9 add techjedi's import script, with some filecompletion tweaks 2023-08-05 12:19:24 -04:00
dc96a3e79d Fix random number generator
Passing in seed=0 is not equivalent to seed=None. The latter will get a new seed from entropy in the OS, and that's what we should be using.
2023-08-06 00:29:08 +10:00
c076f1397e rebuild frontend 2023-08-05 14:40:42 +10:00
2568aafc0b bump version number so that pip updates work 2023-08-05 14:40:42 +10:00
65ed224bfc Merge branch 'main' into refactor/model_manager_instantiate 2023-08-04 21:34:38 -07:00
b6e369c745 chore: black 2023-08-05 12:28:35 +10:00
ecabfc252b devices.py - Update MPS FP16 check to account for upcoming MacOS Sonoma
float16 doesn't seem to work on MacOS Sonoma due to further changes with Metal. This'll default back to float32 for Sonoma users.
2023-08-05 12:28:35 +10:00
da96a41103 Merge branch 'main' into feat/select-vram-in-config 2023-08-05 12:11:50 +10:00
d162b78767 fix broken civitai example link 2023-08-05 12:10:52 +10:00
eb6c317f04 chore: black 2023-08-05 12:05:24 +10:00
6d7223238f fix: fix typo in message 2023-08-05 12:05:24 +10:00
8607d124c5 improve message about the consequences of the --ignore_missing_core_models flag 2023-08-05 12:05:24 +10:00
23497bf759 add --ignore_missing_core_models CLI flag to bypass checking for missing core models 2023-08-05 12:05:24 +10:00
b10cf20eb1 Merge branch 'main' into refactor/model_manager_instantiate
# Conflicts:
#	invokeai/backend/model_management/model_manager.py
2023-08-04 18:28:18 -07:00
3d93851dba Installer should download fp16 models if user has specified 'auto' in config (#4129)
## What type of PR is this? (check all applicable)

- [ ] Refactor
- [ ] Feature
- [X] Bug Fix
- [ ] Optimization
- [ ] Documentation Update
- [ ] Community Node Submission


## Have you discussed this change with the InvokeAI team?
- [X] Yes
- [ ] No, because:

      
## Have you updated all relevant documentation?
- [X] Yes
- [ ] No


## Description

At install time, when the user's config specified "auto" precision, the
installer was downloading the fp32 models even when an fp16 model would
be appropriate for the OS.


## Related Tickets & Documents

<!--
For pull requests that relate or close an issue, please include them
below. 

For example having the text: "closes #1234" would connect the current
pull
request to issue 1234.  And when we merge the pull request, Github will
automatically close the issue.
-->

- Closes #4127
2023-08-05 01:56:25 +03:00
9bacd77a79 Merge branch 'main' into bugfix/fp16-models 2023-08-05 01:42:43 +03:00
1b158f62c4 resolve vae overrides correctly 2023-08-04 18:24:47 -04:00
6ad565d84c folded in changes from 4099 2023-08-04 18:24:47 -04:00
04229082d6 Provide ti name from model manager, not from ti itself 2023-08-04 18:24:47 -04:00
03c27412f7 [WIP] Add sdxl lora support (#4097)
## What type of PR is this? (check all applicable)

- [ ] Refactor
- [x] Feature
- [ ] Bug Fix
- [ ] Optimization
- [ ] Documentation Update
- [ ] Community Node Submission


## Have you discussed this change with the InvokeAI team?
- [x] Yes
- [ ] No, because:

      
## Have you updated all relevant documentation?
- [ ] Yes
- [x] No


## Description
Add lora loading for sdxl.
NOT TESTED - I run only 2 loras, please check more(including lycoris if
they already exists).

## QA Instructions, Screenshots, Recordings
https://civitai.com/models/118536/voxel-xl

![image](https://github.com/invoke-ai/InvokeAI/assets/7768370/76a6abff-cb0a-43b4-b779-a0b0e5b46e56)


## Added/updated tests?

- [ ] Yes
- [x] No
2023-08-04 16:12:22 -04:00
f0613bb0ef Fix merge conflict resolve - restore full/diff layer support 2023-08-04 19:53:27 +03:00
0e9f92b868 Merge branch 'main' into feat/sdxl_lora 2023-08-04 19:22:13 +03:00
7d0cc6ec3f chore: black 2023-08-05 02:04:22 +10:00
2f8b928486 Add support for diff/full lora layers 2023-08-05 02:04:22 +10:00
0d3c27f46c Fix typo
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
2023-08-04 11:44:56 -04:00
cff91f06d3 Add lora apply in sdxl l2l node 2023-08-04 11:44:56 -04:00
1d5d187ba1 model probe detects sdxl lora models 2023-08-04 11:44:56 -04:00
1ac14a1e43 add sdxl lora support 2023-08-04 11:44:56 -04:00
cfc3a20565 autoAddBoardId should always be defined 2023-08-04 22:19:11 +10:00
05ae4e283c Stop checking for unet/model.onnx when a model_index.json is detected (#4132)
## What type of PR is this? (check all applicable)

- [ ] Refactor
- [ ] Feature
- [x] Bug Fix
- [ ] Optimization
- [ ] Documentation Update
- [ ] Community Node Submission


## Have you discussed this change with the InvokeAI team?
- [x] Yes
- [ ] No, because:

      
## Have you updated all relevant documentation?
- [ ] Yes
- [ ] No


## Description


## Related Tickets & Documents

<!--
For pull requests that relate or close an issue, please include them
below. 

For example having the text: "closes #1234" would connect the current
pull
request to issue 1234.  And when we merge the pull request, Github will
automatically close the issue.
-->

- Related Issue #
- Closes #

## QA Instructions, Screenshots, Recordings

<!-- 
Please provide steps on how to test changes, any hardware or 
software specifications as well as any other pertinent information. 
-->

## Added/updated tests?

- [ ] Yes
- [ ] No : _please replace this line with details on why tests
      have not been included_

## [optional] Are there any post deployment tasks we need to perform?
2023-08-03 22:10:37 -04:00
f06fee4581 Merge branch 'main' into remove-onnx-model-check-from-pipeline-download 2023-08-03 22:02:05 -04:00
9091e19de8 Add execution stat reporting after each invocation (#4125)
## What type of PR is this? (check all applicable)

- [X] Feature


## Have you discussed this change with the InvokeAI team?
- [X] Yes
- [ ] No, because:

      
## Have you updated all relevant documentation?
- [X] Yes
- [ ] No

## Description

This PR adds execution time and VRAM usage reporting to each graph
invocation. The log output will look like this:

```
[2023-08-02 18:03:04,507]::[InvokeAI]::INFO --> Graph stats: c7764585-9c68-4d9d-a199-55e8186790f3                                                                                              
[2023-08-02 18:03:04,507]::[InvokeAI]::INFO --> Node                 Calls  Seconds  VRAM Used                                                                                                 
[2023-08-02 18:03:04,507]::[InvokeAI]::INFO --> main_model_loader        1   0.005s     0.01G                                                                                                  
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> clip_skip                1   0.004s     0.01G                                                                                                  
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> compel                   2   0.512s     0.26G                                                                                                  
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> rand_int                 1   0.001s     0.01G                                                                                                  
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> range_of_size            1   0.001s     0.01G                                                                                                  
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> iterate                  1   0.001s     0.01G                                                                                                  
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> metadata_accumulator     1   0.002s     0.01G                                                                                                  
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> noise                    1   0.002s     0.01G                                                                                                  
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> t2l                      1   3.541s     1.93G                                                                                                  
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> l2i                      1   0.679s     0.58G                                                                                                  
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> TOTAL GRAPH EXECUTION TIME:  4.749s                                                                                                            
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> Current VRAM utilization 0.01G                                                                                                                 
```
On systems without CUDA, the VRAM stats are not printed.

The current implementation keeps track of graph ids separately so will
not be confused when several graphs are executing in parallel. It
handles exceptions, and it is integrated into the app framework by
defining an abstract base class and storing an implementation instance
in `InvocationServices`.
2023-08-03 20:05:21 -04:00
0a0b7141af Merge branch 'main' into feat/execution-stats 2023-08-03 19:49:00 -04:00
1deca89fde Merge branch 'main' into feat/select-vram-in-config 2023-08-03 19:27:58 -04:00
446fb4a438 blackify 2023-08-03 19:24:23 -04:00
ab5d938a1d use variant instead of revision 2023-08-03 19:23:52 -04:00
9942af756a Merge branch 'main' into remove-onnx-model-check-from-pipeline-download 2023-08-03 10:10:51 -04:00
06742faca7 Merge branch 'feat/execution-stats' of github.com:invoke-ai/InvokeAI into feat/execution-stats 2023-08-03 08:48:05 -04:00
d2bddf7f91 tweak formatting to accommodate longer runtimes 2023-08-03 08:47:56 -04:00
91ebf9f76e Merge branch 'main' into refactor/model_manager_instantiate 2023-08-02 19:01:21 -07:00
bf94412d14 feat: add multi-select to gallery
multi-select actions include:
- drag to board to move all to that board
- right click to add all to board or delete all

backend changes:
- add routes for changing board for list of image names, deleting list of images
- change image-specific routes to `images/i/{image_name}` to not clobber other routes (like `images/upload`, `images/delete`)
- subclass pydantic `BaseModel` as `BaseModelExcludeNull`, which excludes null values when calling `dict()` on the model. this fixes inconsistent types related to JSON parsing null values into `null` instead of `undefined`
- remove `board_id` from `remove_image_from_board`

frontend changes:
- multi-selection stuff uses `ImageDTO[]` as payloads, for dnd and other mutations. this gives us access to image `board_id`s when hitting routes, and enables efficient cache updates.
- consolidate change board and delete image modals to handle single and multiples
- board totals are now re-fetched on mutation and not kept in sync manually - was way too tedious to do this
- fixed warning about nested `<p>` elements
- closes #4088 , need to handle case when `autoAddBoardId` is `"none"`
- add option to show gallery image delete button on every gallery image

frontend refactors/organisation:
- make typegen script js instead of ts
- enable `noUncheckedIndexedAccess` to help avoid bugs when indexing into arrays, many small changes needed to satisfy TS after this
- move all image-related endpoints into `endpoints/images.ts`, its a big file now, but this fixes a number of circular dependency issues that were otherwise felt impossible to resolve
2023-08-03 11:46:59 +10:00
e080fd1e08 blackify 2023-08-03 11:25:20 +10:00
eeef1e08f8 restore ability to convert merged inpaint .safetensors files 2023-08-03 11:25:20 +10:00
b3b94b5a8d use correct prop 2023-08-03 11:01:21 +10:00
5c9787c145 add project-id header to requests 2023-08-03 11:01:21 +10:00
cf72eba15c Merge branch 'main' into feat/execution-stats 2023-08-03 10:53:25 +10:00
a6f9396a30 fix(db): retrieve metadata even when no session_id
this was unnecessarily skipped if there was no `session_id`.
2023-08-03 10:43:44 +10:00
118d5b387b deploy: refactor github workflows
Currently we use some workflow trigger conditionals to run either a real test workflow (installing the app and running it) or a fake workflow, disguised as the real one, that just auto-passes.

This change refactors the workflow to use a single workflow that can be skipped, using another github action to determine which things to run depending on the paths changed.
2023-08-03 10:32:50 +10:00
02d2cc758d Merge branch 'main' into refactor/model_manager_instantiate 2023-08-02 17:11:23 -07:00
db545f8801 chore: move PR template to .github/ dir (#4060)
## What type of PR is this? (check all applicable)

- [x] Refactor

## Have you discussed this change with the InvokeAI team?
- [x] No, because it's pretty minor

      
## Have you updated all relevant documentation?
- [x] No


## Description

This PR just moves the PR template to within the `.github/` directory
leading to a overall minimal project structure.

## Added/updated tests?

- [x] No : because this change doesn't affect or need a separate test
2023-08-03 10:08:17 +10:00
b0d72b15b3 Merge branch 'main' into patch-1 2023-08-03 10:04:47 +10:00
4e0949fa55 fix .swap() by reverting improperly merged @classmethod change 2023-08-03 10:00:43 +10:00
f028342f5b Merge branch 'main' into patch-1 2023-08-03 10:00:10 +10:00
7021467048 (ci) do not install all dependencies when running static checks (#4036)
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
2023-08-02 23:46:02 +00:00
26ef5249b1 guard board switching in board context menu 2023-08-03 09:18:46 +10:00
87424be95d block auto add board change during generation. Switch condition to isProcessing 2023-08-03 09:18:46 +10:00
366952f810 fix localization 2023-08-03 09:18:46 +10:00
450e95de59 auto change board waiting for isReady 2023-08-03 09:18:46 +10:00
0ba8a0ea6c Board assignment changing on click 2023-08-03 09:18:46 +10:00
f4981f26d5 Merge branch 'main' into bugfix/fp16-models 2023-08-02 19:17:55 -04:00
6bc21984c6 Merge branch 'main' into feat/select-vram-in-config 2023-08-02 19:12:43 -04:00
43d6312587 Merge branch 'main' into feat/execution-stats 2023-08-02 19:12:08 -04:00
0d125bf3e4 chore: delete nonfunctional shell.nix
This was for v2.3 and is very broken. See `flake.nix`, thanks to @zopieux
2023-08-03 09:09:40 +10:00
921ccad04d added stats service to the cli_app startup 2023-08-02 18:41:43 -04:00
05c9207e7b Merge branch 'feat/execution-stats' of github.com:invoke-ai/InvokeAI into feat/execution-stats 2023-08-02 18:31:33 -04:00
3fc789a7ee fix unit tests 2023-08-02 18:31:10 -04:00
008362918e Merge branch 'main' into feat/execution-stats 2023-08-02 18:15:51 -04:00
8fc75a71ee integrate correctly into app API and add features
- Create abstract base class InvocationStatsServiceBase
- Store InvocationStatsService in the InvocationServices object
- Collect and report stats on simultaneous graph execution
  independently for each graph id
- Track VRAM usage for each node
- Handle cancellations and other exceptions gracefully
2023-08-02 18:10:52 -04:00
82d259f43b Merge branch 'main' into remove-onnx-model-check-from-pipeline-download 2023-08-02 16:35:46 -04:00
ec48779080 blackify 2023-08-02 14:28:19 -04:00
bc20fe4cb5 Merge branch 'main' into feat/select-vram-in-config 2023-08-02 14:27:17 -04:00
5de42be4a6 reduce VRAM cache default; take max RAM from system 2023-08-02 14:27:13 -04:00
818c55cd53 Refactor/cleanup root detection (#4102)
## What type of PR is this? (check all applicable)

- [ X] Refactor
- [ ] Feature
- [ ] Bug Fix
- [ ] Optimization
- [ ] Documentation Update
- [ ] Community Node Submission


## Have you discussed this change with the InvokeAI team?
- [ ] Yes
- [ X] No, because: invisible change

      
## Have you updated all relevant documentation?
- [ X] Yes
- [ ] No


## Description

There was a problem in 3.0.1 with root resolution. If INVOKEAI_ROOT were
set to "." (or any relative path), then the location of root would
change if the code did an os.chdir() after config initialization. I
fixed this in a quick and dirty way for 3.0.1.post3.

This PR cleans up the code with a little refactoring.

## Related Tickets & Documents

<!--
For pull requests that relate or close an issue, please include them
below. 

For example having the text: "closes #1234" would connect the current
pull
request to issue 1234.  And when we merge the pull request, Github will
automatically close the issue.
-->

- Related Issue #
- Closes #

## QA Instructions, Screenshots, Recordings

<!-- 
Please provide steps on how to test changes, any hardware or 
software specifications as well as any other pertinent information. 
-->

## Added/updated tests?

- [ ] Yes
- [ ] No : _please replace this line with details on why tests
      have not been included_

## [optional] Are there any post deployment tasks we need to perform?
2023-08-02 10:36:12 -04:00
0db1e97119 Merge branch 'main' into refactor/cleanup-root-detection 2023-08-02 09:46:46 -04:00
29ac252501 blackify 2023-08-02 09:44:06 -04:00
880727436c fix default vram cache size calculation 2023-08-02 09:43:52 -04:00
77c5c18542 add slider for VRAM cache 2023-08-02 09:11:24 -04:00
ed76250dba Stop checking for unet/model.onnx when a model_index.json is detected 2023-08-02 07:21:21 -04:00
4d22cafdad Installer should download fp16 models if user has specified 'auto' in config
- Closes #4127
2023-08-01 22:06:27 -04:00
1f9e984b0d Merge branch 'main' into refactor/model_manager_instantiate 2023-08-01 16:49:39 -07:00
8a4e5f73aa reset stats on exception 2023-08-01 19:39:42 -04:00
4599575e65 fix(ui): use const for wsProtocol, lint 2023-08-02 09:26:20 +10:00
242d860a47 fix https/wss behind reverse proxy 2023-08-02 09:26:20 +10:00
0c1a7e72d4 Fix manual installation documentation (#4107)
## What type of PR is this? (check all applicable)

- [ ] Refactor
- [ ] Feature
- [ ]X Bug Fix
- [ ] Optimization
- [ X] Documentation Update
- [ ] Community Node Submission


## Have you discussed this change with the InvokeAI team?
- [ ] Yes
- [ X] No, because: obvious problem

      
## Have you updated all relevant documentation?
- [ X] Yes
- [ ] No


## Description

The manual installation documentation in both README.md and
020_MANUAL_INSTALL give an incomplete `invokeai-configure` command which
leaves out the path to the root directory to create. As a result, the
invokeai root directory gets created in the user’s home directory, even
if they intended it to be placed somewhere else.

This is a fairly important issue.
2023-08-01 18:55:53 -04:00
11a44b944d fix installation documentation 2023-08-01 18:52:17 -04:00
fd7b842419 add execution stat reporting after each invocation 2023-08-01 17:44:09 -04:00
5998509888 Merge branch 'main' into refactor/model_manager_instantiate 2023-08-01 11:09:43 -07:00
403a6e88f2 fix: flake: add opencv with CUDA, new patchmatch dependency. 2023-08-01 23:56:41 +10:00
c9d452b9d4 fix: Model Manager Tab Issues (#4087)
## What type of PR is this? (check all applicable)

- [x] Refactor
- [x] Feature
- [x] Bug Fix
- [?] Optimization


## Have you discussed this change with the InvokeAI team?
- [x] No

     
## Description

- Fixed filter type select using `images` instead of `all` -- Probably
some merge conflict.
- Added loading state for model lists. Handy when the model list takes
longer than a second for any reason to fetch. Better to show this than
an empty screen.
- Refactored the render model list function so we modify the display
component in one area rather than have repeated code.

### Other Issues

- The list can get a bit laggy on initial load when you have a hundreds
of models / loras. This needs to be fixed. Will make another PR for
this.
2023-08-02 01:06:53 +12:00
dcc274a2b9 feat: Make ModelListWrapper instead of rendering conditionally 2023-08-01 22:50:10 +10:00
f404669831 fix: Rename loading vars for consistency 2023-08-01 22:42:05 +10:00
ce687b28ef fix: Model Manager Tab Issues 2023-08-01 22:41:32 +10:00
7292d89108 Merge branch 'main' into refactor/cleanup-root-detection 2023-08-01 22:14:56 +10:00
437f45a97f do not depend on existence of /tmp directory 2023-08-01 00:41:35 -04:00
13ef33ed64 Merge branch 'refactor/cleanup-root-detection' of github.com:invoke-ai/InvokeAI into refactor/cleanup-root-detection 2023-08-01 00:19:55 -04:00
86d8b46fca Merge branch 'main' into refactor/cleanup-root-detection 2023-08-01 00:14:26 -04:00
df53b62048 get rid of dangling debug statements 2023-07-31 22:39:11 -04:00
55d3f04476 additional refactoring 2023-07-31 22:36:11 -04:00
72ebe2ce68 refactor root directory detection to be cleaner 2023-07-31 22:30:06 -04:00
7cd8b2f207 Refactor root detection code 2023-07-31 21:15:44 -04:00
bacdf985f1 doc(model_manager): docstrings 2023-07-31 09:16:32 -07:00
e3519052ae Merge remote-tracking branch 'origin/main' into refactor/model_manager_instantiate 2023-07-31 08:46:09 -07:00
adfd1e52f4 refactor(model_manager): avoid copy/paste logic 2023-07-30 11:53:12 -07:00
0e48c98330 Merge remote-tracking branch 'origin/main' into refactor/model_manager_instantiate
# Conflicts:
#	invokeai/backend/model_management/model_manager.py
2023-07-30 11:33:13 -07:00
ff1c40747e lint: formatting 2023-07-29 20:02:31 -07:00
dbfd1bcb5e Merge branch 'main' into refactor/model_manager_instantiate 2023-07-29 19:53:21 -07:00
ccceb32a85 lint: formatting 2023-07-29 11:50:04 -07:00
21617e60e1 Merge remote-tracking branch 'origin/main' into refactor/model_manager_instantiate 2023-07-29 08:21:26 -07:00
35dd58e273 chore: move PR template to .github/ dir 2023-07-29 12:59:56 +05:30
86b8b69e88 internal(ModelManager): add instantiate method 2023-07-28 22:30:25 -07:00
bc9a5038fd refactor(ModelManager): factor out get_model_path 2023-07-28 22:29:36 -07:00
b163ae6a4d refactor(ModelManager): factor out get_model_config 2023-07-28 21:30:20 -07:00
dca685ac25 refactor(ModelManager): refactor rescan-on-miss to exists() method 2023-07-28 21:11:00 -07:00
e70bedba7d refactor(ModelManager): factor out _get_implementation method 2023-07-28 21:03:27 -07:00
454 changed files with 22711 additions and 15128 deletions

View File

@ -1,13 +1,14 @@
name: Black # TODO: add isort and flake8 later
name: style checks
# just formatting for now
# TODO: add isort and flake8 later
on:
pull_request: {}
pull_request:
push:
branches: master
tags: "*"
branches: main
jobs:
test:
black:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
@ -19,8 +20,7 @@ jobs:
- name: Install dependencies with pip
run: |
pip install --upgrade pip wheel
pip install .[test]
pip install black
# - run: isort --check-only .
- run: black --check .

View File

@ -1,50 +0,0 @@
name: Test invoke.py pip
# This is a dummy stand-in for the actual tests
# we don't need to run python tests on non-Python changes
# But PRs require passing tests to be mergeable
on:
pull_request:
paths:
- '**'
- '!pyproject.toml'
- '!invokeai/**'
- '!tests/**'
- 'invokeai/frontend/web/**'
merge_group:
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
matrix:
if: github.event.pull_request.draft == false
strategy:
matrix:
python-version:
- '3.10'
pytorch:
- linux-cuda-11_7
- linux-rocm-5_2
- linux-cpu
- macos-default
- windows-cpu
include:
- pytorch: linux-cuda-11_7
os: ubuntu-22.04
- pytorch: linux-rocm-5_2
os: ubuntu-22.04
- pytorch: linux-cpu
os: ubuntu-22.04
- pytorch: macos-default
os: macOS-12
- pytorch: windows-cpu
os: windows-2022
name: ${{ matrix.pytorch }} on ${{ matrix.python-version }}
runs-on: ${{ matrix.os }}
steps:
- name: skip
run: echo "no build required"

View File

@ -3,16 +3,7 @@ on:
push:
branches:
- 'main'
paths:
- 'pyproject.toml'
- 'invokeai/**'
- '!invokeai/frontend/web/**'
pull_request:
paths:
- 'pyproject.toml'
- 'invokeai/**'
- 'tests/**'
- '!invokeai/frontend/web/**'
types:
- 'ready_for_review'
- 'opened'
@ -65,10 +56,23 @@ jobs:
id: checkout-sources
uses: actions/checkout@v3
- name: Check for changed python files
id: changed-files
uses: tj-actions/changed-files@v37
with:
files_yaml: |
python:
- 'pyproject.toml'
- 'invokeai/**'
- '!invokeai/frontend/web/**'
- 'tests/**'
- name: set test prompt to main branch validation
if: steps.changed-files.outputs.python_any_changed == 'true'
run: echo "TEST_PROMPTS=tests/validate_pr_prompt.txt" >> ${{ matrix.github-env }}
- name: setup python
if: steps.changed-files.outputs.python_any_changed == 'true'
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
@ -76,6 +80,7 @@ jobs:
cache-dependency-path: pyproject.toml
- name: install invokeai
if: steps.changed-files.outputs.python_any_changed == 'true'
env:
PIP_EXTRA_INDEX_URL: ${{ matrix.extra-index-url }}
run: >
@ -83,6 +88,7 @@ jobs:
--editable=".[test]"
- name: run pytest
if: steps.changed-files.outputs.python_any_changed == 'true'
id: run-pytest
run: pytest

View File

@ -161,7 +161,7 @@ the command `npm install -g yarn` if needed)
_For Windows/Linux with an NVIDIA GPU:_
```terminal
pip install "InvokeAI[xformers]" --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu117
pip install "InvokeAI[xformers]" --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu118
```
_For Linux with an AMD GPU:_
@ -184,8 +184,9 @@ the command `npm install -g yarn` if needed)
6. Configure InvokeAI and install a starting set of image generation models (you only need to do this once):
```terminal
invokeai-configure
invokeai-configure --root .
```
Don't miss the dot at the end!
7. Launch the web server (do it every time you run InvokeAI):
@ -193,15 +194,9 @@ the command `npm install -g yarn` if needed)
invokeai-web
```
8. Build Node.js assets
8. Point your browser to http://localhost:9090 to bring up the web interface.
```terminal
cd invokeai/frontend/web/
yarn vite build
```
9. Point your browser to http://localhost:9090 to bring up the web interface.
10. Type `banana sushi` in the box on the top left and click `Invoke`.
9. Type `banana sushi` in the box on the top left and click `Invoke`.
Be sure to activate the virtual environment each time before re-launching InvokeAI,
using `source .venv/bin/activate` or `.venv\Scripts\activate`.
@ -311,13 +306,30 @@ InvokeAI. The second will prepare the 2.3 directory for use with 3.0.
You may now launch the WebUI in the usual way, by selecting option [1]
from the launcher script
#### Migration Caveats
#### Migrating Images
The migration script will migrate your invokeai settings and models,
including textual inversion models, LoRAs and merges that you may have
installed previously. However it does **not** migrate the generated
images stored in your 2.3-format outputs directory. You will need to
manually import selected images into the 3.0 gallery via drag-and-drop.
images stored in your 2.3-format outputs directory. To do this, you
need to run an additional step:
1. From a working InvokeAI 3.0 root directory, start the launcher and
enter menu option [8] to open the "developer's console".
2. At the developer's console command line, type the command:
```bash
invokeai-import-images
```
3. This will lead you through the process of confirming the desired
source and destination for the imported images. The images will
appear in the gallery board of your choice, and contain the
original prompt, model name, and other parameters used to generate
the image.
(Many kudos to **techjedi** for contributing this script.)
## Hardware Requirements

View File

@ -29,8 +29,8 @@ configure() {
echo "To reconfigure InvokeAI, delete the above file."
echo "======================================================================"
else
mkdir -p ${INVOKEAI_ROOT}
chown --recursive ${USER} ${INVOKEAI_ROOT}
mkdir -p "${INVOKEAI_ROOT}"
chown --recursive ${USER} "${INVOKEAI_ROOT}"
gosu ${USER} invokeai-configure --yes --default_only
fi
}
@ -50,16 +50,16 @@ fi
if [[ -v "PUBLIC_KEY" ]] && [[ ! -d "${HOME}/.ssh" ]]; then
apt-get update
apt-get install -y openssh-server
pushd $HOME
pushd "$HOME"
mkdir -p .ssh
echo ${PUBLIC_KEY} > .ssh/authorized_keys
echo "${PUBLIC_KEY}" > .ssh/authorized_keys
chmod -R 700 .ssh
popd
service ssh start
fi
cd ${INVOKEAI_ROOT}
cd "${INVOKEAI_ROOT}"
# Run the CMD as the Container User (not root).
exec gosu ${USER} "$@"

Binary file not shown.

Before

Width:  |  Height:  |  Size: 310 KiB

After

Width:  |  Height:  |  Size: 297 KiB

View File

@ -4,35 +4,13 @@ title: Postprocessing
# :material-image-edit: Postprocessing
## Intro
This extension provides the ability to restore faces and upscale images.
This sections details the ability to improve faces and upscale images.
## Face Fixing
The default face restoration module is GFPGAN. The default upscale is
Real-ESRGAN. For an alternative face restoration module, see
[CodeFormer Support](#codeformer-support) below.
As of InvokeAI 3.0, the easiest way to improve faces created during image generation is through the Inpainting functionality of the Unified Canvas. Simply add the image containing the faces that you would like to improve to the canvas, mask the face to be improved and run the invocation. For best results, make sure to use an inpainting specific model; these are usually identified by the "-inpainting" term in the model name.
As of version 1.14, environment.yaml will install the Real-ESRGAN package into
the standard install location for python packages, and will put GFPGAN into a
subdirectory of "src" in the InvokeAI directory. Upscaling with Real-ESRGAN
should "just work" without further intervention. Simply indicate the desired scale on
the popup in the Web GUI.
**GFPGAN** requires a series of downloadable model files to work. These are
loaded when you run `invokeai-configure`. If GFPAN is failing with an
error, please run the following from the InvokeAI directory:
```bash
invokeai-configure
```
If you do not run this script in advance, the GFPGAN module will attempt to
download the models files the first time you try to perform facial
reconstruction.
### Upscaling
## Upscaling
Open the upscaling dialog by clicking on the "expand" icon located
above the image display area in the Web UI:
@ -41,82 +19,23 @@ above the image display area in the Web UI:
![upscale1](../assets/features/upscale-dialog.png)
</figure>
There are three different upscaling parameters that you can
adjust. The first is the scale itself, either 2x or 4x.
The default upscaling option is Real-ESRGAN x2 Plus, which will scale your image by a factor of two. This means upscaling a 512x512 image will result in a new 1024x1024 image.
The second is the "Denoising Strength." Higher values will smooth out
the image and remove digital chatter, but may lose fine detail at
higher values.
Other options are the x4 upscalers, which will scale your image by a factor of 4.
Third, "Upscale Strength" allows you to adjust how the You can set the
scaling stength between `0` and `1.0` to control the intensity of the
scaling. AI upscalers generally tend to smooth out texture details. If
you wish to retain some of those for natural looking results, we
recommend using values between `0.5 to 0.8`.
[This figure](../assets/features/upscaling-montage.png) illustrates
the effects of denoising and strength. The original image was 512x512,
4x scaled to 2048x2048. The "original" version on the upper left was
scaled using simple pixel averaging. The remainder use the ESRGAN
upscaling algorithm at different levels of denoising and strength.
<figure markdown>
![upscaling](../assets/features/upscaling-montage.png){ width=720 }
</figure>
Both denoising and strength default to 0.75.
### Face Restoration
InvokeAI offers alternative two face restoration algorithms,
[GFPGAN](https://github.com/TencentARC/GFPGAN) and
[CodeFormer](https://huggingface.co/spaces/sczhou/CodeFormer). These
algorithms improve the appearance of faces, particularly eyes and
mouths. Issues with faces are less common with the latest set of
Stable Diffusion models than with the original 1.4 release, but the
restoration algorithms can still make a noticeable improvement in
certain cases. You can also apply restoration to old photographs you
upload.
To access face restoration, click the "smiley face" icon in the
toolbar above the InvokeAI image panel. You will be presented with a
dialog that offers a choice between the two algorithm and sliders that
allow you to adjust their parameters. Alternatively, you may open the
left-hand accordion panel labeled "Face Restoration" and have the
restoration algorithm of your choice applied to generated images
automatically.
Like upscaling, there are a number of parameters that adjust the face
restoration output. GFPGAN has a single parameter, `strength`, which
controls how much the algorithm is allowed to adjust the
image. CodeFormer has two parameters, `strength`, and `fidelity`,
which together control the quality of the output image as described in
the [CodeFormer project
page](https://shangchenzhou.com/projects/CodeFormer/). Default values
are 0.75 for both parameters, which achieves a reasonable balance
between changing the image too much and not enough.
[This figure](../assets/features/restoration-montage.png) illustrates
the effects of adjusting GFPGAN and CodeFormer parameters.
<figure markdown>
![upscaling](../assets/features/restoration-montage.png){ width=720 }
</figure>
!!! note
GFPGAN and Real-ESRGAN are both memory intensive. In order to avoid crashes and memory overloads
Real-ESRGAN is memory intensive. In order to avoid crashes and memory overloads
during the Stable Diffusion process, these effects are applied after Stable Diffusion has completed
its work.
In single image generations, you will see the output right away but when you are using multiple
iterations, the images will first be generated and then upscaled and face restored after that
iterations, the images will first be generated and then upscaled after that
process is complete. While the image generation is taking place, you will still be able to preview
the base images.
## How to disable
If, for some reason, you do not wish to load the GFPGAN and/or ESRGAN libraries,
you can disable them on the invoke.py command line with the `--no_restore` and
`--no_esrgan` options, respectively.
If, for some reason, you do not wish to load the ESRGAN libraries,
you can disable them on the invoke.py command line with the `--no_esrgan` options.

View File

@ -264,7 +264,7 @@ experimental versions later.
you can create several levels of subfolders and drop your models into
whichever ones you want.
- ***Autoimport FolderLICENSE***
- ***LICENSE***
At the bottom of the screen you will see a checkbox for accepting
the CreativeML Responsible AI Licenses. You need to accept the license
@ -471,7 +471,7 @@ Then type the following commands:
=== "NVIDIA System"
```bash
pip install torch torchvision --force-reinstall --extra-index-url https://download.pytorch.org/whl/cu117
pip install torch torchvision --force-reinstall --extra-index-url https://download.pytorch.org/whl/cu118
pip install xformers
```

View File

@ -148,7 +148,7 @@ manager, please follow these steps:
=== "CUDA (NVidia)"
```bash
pip install "InvokeAI[xformers]" --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu117
pip install "InvokeAI[xformers]" --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu118
```
=== "ROCm (AMD)"
@ -192,8 +192,10 @@ manager, please follow these steps:
your outputs.
```terminal
invokeai-configure
invokeai-configure --root .
```
Don't miss the dot at the end of the command!
The script `invokeai-configure` will interactively guide you through the
process of downloading and installing the weights files needed for InvokeAI.
@ -225,12 +227,6 @@ manager, please follow these steps:
!!! warning "Make sure that the virtual environment is activated, which should create `(.venv)` in front of your prompt!"
=== "CLI"
```bash
invokeai
```
=== "local Webserver"
```bash
@ -243,6 +239,12 @@ manager, please follow these steps:
invokeai --web --host 0.0.0.0
```
=== "CLI"
```bash
invokeai
```
If you choose the run the web interface, point your browser at
http://localhost:9090 in order to load the GUI.
@ -310,7 +312,7 @@ installation protocol (important!)
=== "CUDA (NVidia)"
```bash
pip install -e .[xformers] --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu117
pip install -e .[xformers] --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu118
```
=== "ROCm (AMD)"
@ -354,7 +356,7 @@ you can do so using this unsupported recipe:
mkdir ~/invokeai
conda create -n invokeai python=3.10
conda activate invokeai
pip install InvokeAI[xformers] --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu117
pip install InvokeAI[xformers] --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu118
invokeai-configure --root ~/invokeai
invokeai --root ~/invokeai --web
```

View File

@ -34,11 +34,11 @@ directly from NVIDIA. **Do not try to install Ubuntu's
nvidia-cuda-toolkit package. It is out of date and will cause
conflicts among the NVIDIA driver and binaries.**
Go to [CUDA Toolkit 11.7
Downloads](https://developer.nvidia.com/cuda-11-7-0-download-archive),
and use the target selection wizard to choose your operating system,
hardware platform, and preferred installation method (e.g. "local"
versus "network").
Go to [CUDA Toolkit
Downloads](https://developer.nvidia.com/cuda-downloads), and use the
target selection wizard to choose your operating system, hardware
platform, and preferred installation method (e.g. "local" versus
"network").
This will provide you with a downloadable install file or, depending
on your choices, a recipe for downloading and running a install shell
@ -61,7 +61,7 @@ Runtime Site](https://developer.nvidia.com/nvidia-container-runtime)
When installing torch and torchvision manually with `pip`, remember to provide
the argument `--extra-index-url
https://download.pytorch.org/whl/cu117` as described in the [Manual
https://download.pytorch.org/whl/cu118` as described in the [Manual
Installation Guide](020_INSTALL_MANUAL.md).
## :simple-amd: ROCm

View File

@ -124,7 +124,7 @@ installation. Examples:
invokeai-model-install --list controlnet
# (install the model at the indicated URL)
invokeai-model-install --add http://civitai.com/2860
invokeai-model-install --add https://civitai.com/api/download/models/128713
# (delete the named model)
invokeai-model-install --delete sd-1/main/analog-diffusion
@ -170,4 +170,4 @@ elsewhere on disk and they will be autoimported. You can also create
subfolders and organize them as you wish.
The location of the autoimport directories are controlled by settings
in `invokeai.yaml`. See [Configuration](../features/CONFIGURATION.md).
in `invokeai.yaml`. See [Configuration](../features/CONFIGURATION.md).

View File

@ -28,18 +28,21 @@ command line, then just be sure to activate it's virtual environment.
Then run the following three commands:
```sh
pip install xformers==0.0.16rc425
pip install triton
pip install xformers~=0.0.19
pip install triton # WON'T WORK ON WINDOWS
python -m xformers.info output
```
The first command installs `xformers`, the second installs the
`triton` training accelerator, and the third prints out the `xformers`
installation status. If all goes well, you'll see a report like the
installation status. On Windows, please omit the `triton` package,
which is not available on that platform.
If all goes well, you'll see a report like the
following:
```sh
xFormers 0.0.16rc425
xFormers 0.0.20
memory_efficient_attention.cutlassF: available
memory_efficient_attention.cutlassB: available
memory_efficient_attention.flshattF: available
@ -48,22 +51,28 @@ memory_efficient_attention.smallkF: available
memory_efficient_attention.smallkB: available
memory_efficient_attention.tritonflashattF: available
memory_efficient_attention.tritonflashattB: available
indexing.scaled_index_addF: available
indexing.scaled_index_addB: available
indexing.index_select: available
swiglu.dual_gemm_silu: available
swiglu.gemm_fused_operand_sum: available
swiglu.fused.p.cpp: available
is_triton_available: True
is_functorch_available: False
pytorch.version: 1.13.1+cu117
pytorch.version: 2.0.1+cu118
pytorch.cuda: available
gpu.compute_capability: 8.6
gpu.name: NVIDIA RTX A2000 12GB
gpu.compute_capability: 8.9
gpu.name: NVIDIA GeForce RTX 4070
build.info: available
build.cuda_version: 1107
build.python_version: 3.10.9
build.torch_version: 1.13.1+cu117
build.cuda_version: 1108
build.python_version: 3.10.11
build.torch_version: 2.0.1+cu118
build.env.TORCH_CUDA_ARCH_LIST: 5.0+PTX 6.0 6.1 7.0 7.5 8.0 8.6
build.env.XFORMERS_BUILD_TYPE: Release
build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS: None
build.env.NVCC_FLAGS: None
build.env.XFORMERS_PACKAGE_FROM: wheel-v0.0.16rc425
build.env.XFORMERS_PACKAGE_FROM: wheel-v0.0.20
build.nvcc_version: 11.8.89
source.privacy: open source
```
@ -83,14 +92,14 @@ installed from source. These instructions were written for a system
running Ubuntu 22.04, but other Linux distributions should be able to
adapt this recipe.
#### 1. Install CUDA Toolkit 11.7
#### 1. Install CUDA Toolkit 11.8
You will need the CUDA developer's toolkit in order to compile and
install xFormers. **Do not try to install Ubuntu's nvidia-cuda-toolkit
package.** It is out of date and will cause conflicts among the NVIDIA
driver and binaries. Instead install the CUDA Toolkit package provided
by NVIDIA itself. Go to [CUDA Toolkit 11.7
Downloads](https://developer.nvidia.com/cuda-11-7-0-download-archive)
by NVIDIA itself. Go to [CUDA Toolkit 11.8
Downloads](https://developer.nvidia.com/cuda-11-8-0-download-archive)
and use the target selection wizard to choose your platform and Linux
distribution. Select an installer type of "runfile (local)" at the
last step.
@ -101,17 +110,17 @@ example, the install script recipe for Ubuntu 22.04 running on a
x86_64 system is:
```
wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
sudo sh cuda_11.7.0_515.43.04_linux.run
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
sudo sh cuda_11.8.0_520.61.05_linux.run
```
Rather than cut-and-paste this example, We recommend that you walk
through the toolkit wizard in order to get the most up to date
installer for your system.
#### 2. Confirm/Install pyTorch 1.13 with CUDA 11.7 support
#### 2. Confirm/Install pyTorch 2.01 with CUDA 11.8 support
If you are using InvokeAI 2.3 or higher, these will already be
If you are using InvokeAI 3.0.2 or higher, these will already be
installed. If not, you can check whether you have the needed libraries
using a quick command. Activate the invokeai virtual environment,
either by entering the "developer's console", or manually with a
@ -124,7 +133,7 @@ Then run the command:
python -c 'exec("import torch\nprint(torch.__version__)")'
```
If it prints __1.13.1+cu117__ you're good. If not, you can install the
If it prints __1.13.1+cu118__ you're good. If not, you can install the
most up to date libraries with this command:
```sh

View File

@ -25,10 +25,10 @@ This method is recommended for experienced users and developers
#### [Docker Installation](040_INSTALL_DOCKER.md)
This method is recommended for those familiar with running Docker containers
### Other Installation Guides
- [PyPatchMatch](installation/060_INSTALL_PATCHMATCH.md)
- [XFormers](installation/070_INSTALL_XFORMERS.md)
- [CUDA and ROCm Drivers](installation/030_INSTALL_CUDA_AND_ROCM.md)
- [Installing New Models](installation/050_INSTALLING_MODELS.md)
- [PyPatchMatch](060_INSTALL_PATCHMATCH.md)
- [XFormers](070_INSTALL_XFORMERS.md)
- [CUDA and ROCm Drivers](030_INSTALL_CUDA_AND_ROCM.md)
- [Installing New Models](050_INSTALLING_MODELS.md)
## :fontawesome-solid-computer: Hardware Requirements

View File

@ -34,6 +34,10 @@
cudaPackages.cudnn
cudaPackages.cuda_nvrtc
cudatoolkit
pkgconfig
libconfig
cmake
blas
freeglut
glib
gperf
@ -42,6 +46,12 @@
libGLU
linuxPackages.nvidia_x11
python
(opencv4.override {
enableGtk3 = true;
enableFfmpeg = true;
enableCuda = true;
enableUnfree = true;
})
stdenv.cc
stdenv.cc.cc.lib
xorg.libX11

View File

@ -348,7 +348,7 @@ class InvokeAiInstance:
introduction()
from invokeai.frontend.install import invokeai_configure
from invokeai.frontend.install.invokeai_configure import invokeai_configure
# NOTE: currently the config script does its own arg parsing! this means the command-line switches
# from the installer will also automatically propagate down to the config script.
@ -463,10 +463,10 @@ def get_torch_source() -> (Union[str, None], str):
url = "https://download.pytorch.org/whl/cpu"
if device == "cuda":
url = "https://download.pytorch.org/whl/cu117"
url = "https://download.pytorch.org/whl/cu118"
optional_modules = "[xformers,onnx-cuda]"
if device == "cuda_and_dml":
url = "https://download.pytorch.org/whl/cu117"
url = "https://download.pytorch.org/whl/cu118"
optional_modules = "[xformers,onnx-directml]"
# in all other cases, Torch wheels should be coming from PyPi as of Torch 1.13

View File

@ -8,16 +8,13 @@ Preparations:
to work. Instructions are given here:
https://invoke-ai.github.io/InvokeAI/installation/INSTALL_AUTOMATED/
NOTE: At this time we do not recommend Python 3.11. We recommend
Version 3.10.9, which has been extensively tested with InvokeAI.
Before you start the installer, please open up your system's command
line window (Terminal or Command) and type the commands:
python --version
If all is well, it will print "Python 3.X.X", where the version number
is at least 3.9.1, and less than 3.11.
is at least 3.9.*, and not higher than 3.11.*.
If this works, check the version of the Python package manager, pip:

View File

@ -2,7 +2,6 @@
from typing import Optional
from logging import Logger
import os
from invokeai.app.services.board_image_record_storage import (
SqliteBoardImageRecordStorage,
)
@ -30,6 +29,7 @@ from ..services.invoker import Invoker
from ..services.processor import DefaultInvocationProcessor
from ..services.sqlite import SqliteItemStorage
from ..services.model_manager_service import ModelManagerService
from ..services.invocation_stats import InvocationStatsService
from .events import FastAPIEventService
@ -55,7 +55,7 @@ logger = InvokeAILogger.getLogger()
class ApiDependencies:
"""Contains and initializes all dependencies for the API"""
invoker: Optional[Invoker] = None
invoker: Invoker
@staticmethod
def initialize(config: InvokeAIAppConfig, event_handler_id: int, logger: Logger = logger):
@ -68,8 +68,9 @@ class ApiDependencies:
output_folder = config.output_path
# TODO: build a file/path manager?
db_location = config.db_path
db_location.parent.mkdir(parents=True, exist_ok=True)
db_path = config.db_path
db_path.parent.mkdir(parents=True, exist_ok=True)
db_location = str(db_path)
graph_execution_manager = SqliteItemStorage[GraphExecutionState](
filename=db_location, table_name="graph_executions"
@ -128,6 +129,7 @@ class ApiDependencies:
graph_execution_manager=graph_execution_manager,
processor=DefaultInvocationProcessor(),
configuration=config,
performance_statistics=InvocationStatsService(graph_execution_manager),
logger=logger,
)

View File

@ -1,24 +1,30 @@
from fastapi import Body, HTTPException, Path, Query
from fastapi import Body, HTTPException
from fastapi.routing import APIRouter
from invokeai.app.services.board_record_storage import BoardRecord, BoardChanges
from invokeai.app.services.image_record_storage import OffsetPaginatedResults
from invokeai.app.services.models.board_record import BoardDTO
from invokeai.app.services.models.image_record import ImageDTO
from pydantic import BaseModel, Field
from ..dependencies import ApiDependencies
board_images_router = APIRouter(prefix="/v1/board_images", tags=["boards"])
class AddImagesToBoardResult(BaseModel):
board_id: str = Field(description="The id of the board the images were added to")
added_image_names: list[str] = Field(description="The image names that were added to the board")
class RemoveImagesFromBoardResult(BaseModel):
removed_image_names: list[str] = Field(description="The image names that were removed from their board")
@board_images_router.post(
"/",
operation_id="create_board_image",
operation_id="add_image_to_board",
responses={
201: {"description": "The image was added to a board successfully"},
},
status_code=201,
)
async def create_board_image(
async def add_image_to_board(
board_id: str = Body(description="The id of the board to add to"),
image_name: str = Body(description="The name of the image to add"),
):
@ -29,26 +35,78 @@ async def create_board_image(
)
return result
except Exception as e:
raise HTTPException(status_code=500, detail="Failed to add to board")
raise HTTPException(status_code=500, detail="Failed to add image to board")
@board_images_router.delete(
"/",
operation_id="remove_board_image",
operation_id="remove_image_from_board",
responses={
201: {"description": "The image was removed from the board successfully"},
},
status_code=201,
)
async def remove_board_image(
board_id: str = Body(description="The id of the board"),
image_name: str = Body(description="The name of the image to remove"),
async def remove_image_from_board(
image_name: str = Body(description="The name of the image to remove", embed=True),
):
"""Deletes a board_image"""
"""Removes an image from its board, if it had one"""
try:
result = ApiDependencies.invoker.services.board_images.remove_image_from_board(
board_id=board_id, image_name=image_name
)
result = ApiDependencies.invoker.services.board_images.remove_image_from_board(image_name=image_name)
return result
except Exception as e:
raise HTTPException(status_code=500, detail="Failed to update board")
raise HTTPException(status_code=500, detail="Failed to remove image from board")
@board_images_router.post(
"/batch",
operation_id="add_images_to_board",
responses={
201: {"description": "Images were added to board successfully"},
},
status_code=201,
response_model=AddImagesToBoardResult,
)
async def add_images_to_board(
board_id: str = Body(description="The id of the board to add to"),
image_names: list[str] = Body(description="The names of the images to add", embed=True),
) -> AddImagesToBoardResult:
"""Adds a list of images to a board"""
try:
added_image_names: list[str] = []
for image_name in image_names:
try:
ApiDependencies.invoker.services.board_images.add_image_to_board(
board_id=board_id, image_name=image_name
)
added_image_names.append(image_name)
except:
pass
return AddImagesToBoardResult(board_id=board_id, added_image_names=added_image_names)
except Exception as e:
raise HTTPException(status_code=500, detail="Failed to add images to board")
@board_images_router.post(
"/batch/delete",
operation_id="remove_images_from_board",
responses={
201: {"description": "Images were removed from board successfully"},
},
status_code=201,
response_model=RemoveImagesFromBoardResult,
)
async def remove_images_from_board(
image_names: list[str] = Body(description="The names of the images to remove", embed=True),
) -> RemoveImagesFromBoardResult:
"""Removes a list of images from their board, if they had one"""
try:
removed_image_names: list[str] = []
for image_name in image_names:
try:
ApiDependencies.invoker.services.board_images.remove_image_from_board(image_name=image_name)
removed_image_names.append(image_name)
except:
pass
return RemoveImagesFromBoardResult(removed_image_names=removed_image_names)
except Exception as e:
raise HTTPException(status_code=500, detail="Failed to remove images from board")

View File

@ -1,31 +1,31 @@
import io
from typing import Optional
from PIL import Image
from fastapi import Body, HTTPException, Path, Query, Request, Response, UploadFile
from fastapi.responses import FileResponse
from fastapi.routing import APIRouter
from PIL import Image
from pydantic import BaseModel, Field
from invokeai.app.invocations.metadata import ImageMetadata
from invokeai.app.models.image import ImageCategory, ResourceOrigin
from invokeai.app.services.image_record_storage import OffsetPaginatedResults
from invokeai.app.services.item_storage import PaginatedResults
from invokeai.app.services.models.image_record import (
ImageDTO,
ImageRecordChanges,
ImageUrlsDTO,
)
from ..dependencies import ApiDependencies
images_router = APIRouter(prefix="/v1/images", tags=["images"])
# images are immutable; set a high max-age
IMAGE_MAX_AGE = 31536000
@images_router.post(
"/",
"/upload",
operation_id="upload_image",
responses={
201: {"description": "The image was uploaded successfully"},
@ -77,7 +77,7 @@ async def upload_image(
raise HTTPException(status_code=500, detail="Failed to create image")
@images_router.delete("/{image_name}", operation_id="delete_image")
@images_router.delete("/i/{image_name}", operation_id="delete_image")
async def delete_image(
image_name: str = Path(description="The name of the image to delete"),
) -> None:
@ -103,7 +103,7 @@ async def clear_intermediates() -> int:
@images_router.patch(
"/{image_name}",
"/i/{image_name}",
operation_id="update_image",
response_model=ImageDTO,
)
@ -120,7 +120,7 @@ async def update_image(
@images_router.get(
"/{image_name}",
"/i/{image_name}",
operation_id="get_image_dto",
response_model=ImageDTO,
)
@ -136,7 +136,7 @@ async def get_image_dto(
@images_router.get(
"/{image_name}/metadata",
"/i/{image_name}/metadata",
operation_id="get_image_metadata",
response_model=ImageMetadata,
)
@ -151,8 +151,9 @@ async def get_image_metadata(
raise HTTPException(status_code=404)
@images_router.get(
"/{image_name}/full",
@images_router.api_route(
"/i/{image_name}/full",
methods=["GET", "HEAD"],
operation_id="get_image_full",
response_class=Response,
responses={
@ -187,7 +188,7 @@ async def get_image_full(
@images_router.get(
"/{image_name}/thumbnail",
"/i/{image_name}/thumbnail",
operation_id="get_image_thumbnail",
response_class=Response,
responses={
@ -216,7 +217,7 @@ async def get_image_thumbnail(
@images_router.get(
"/{image_name}/urls",
"/i/{image_name}/urls",
operation_id="get_image_urls",
response_model=ImageUrlsDTO,
)
@ -265,3 +266,62 @@ async def list_image_dtos(
)
return image_dtos
class DeleteImagesFromListResult(BaseModel):
deleted_images: list[str]
@images_router.post("/delete", operation_id="delete_images_from_list", response_model=DeleteImagesFromListResult)
async def delete_images_from_list(
image_names: list[str] = Body(description="The list of names of images to delete", embed=True),
) -> DeleteImagesFromListResult:
try:
deleted_images: list[str] = []
for image_name in image_names:
try:
ApiDependencies.invoker.services.images.delete(image_name)
deleted_images.append(image_name)
except:
pass
return DeleteImagesFromListResult(deleted_images=deleted_images)
except Exception as e:
raise HTTPException(status_code=500, detail="Failed to delete images")
class ImagesUpdatedFromListResult(BaseModel):
updated_image_names: list[str] = Field(description="The image names that were updated")
@images_router.post("/star", operation_id="star_images_in_list", response_model=ImagesUpdatedFromListResult)
async def star_images_in_list(
image_names: list[str] = Body(description="The list of names of images to star", embed=True),
) -> ImagesUpdatedFromListResult:
try:
updated_image_names: list[str] = []
for image_name in image_names:
try:
ApiDependencies.invoker.services.images.update(image_name, changes=ImageRecordChanges(starred=True))
updated_image_names.append(image_name)
except:
pass
return ImagesUpdatedFromListResult(updated_image_names=updated_image_names)
except Exception as e:
raise HTTPException(status_code=500, detail="Failed to star images")
@images_router.post("/unstar", operation_id="unstar_images_in_list", response_model=ImagesUpdatedFromListResult)
async def unstar_images_in_list(
image_names: list[str] = Body(description="The list of names of images to unstar", embed=True),
) -> ImagesUpdatedFromListResult:
try:
updated_image_names: list[str] = []
for image_name in image_names:
try:
ApiDependencies.invoker.services.images.update(image_name, changes=ImageRecordChanges(starred=False))
updated_image_names.append(image_name)
except:
pass
return ImagesUpdatedFromListResult(updated_image_names=updated_image_names)
except Exception as e:
raise HTTPException(status_code=500, detail="Failed to unstar images")

View File

@ -104,8 +104,12 @@ async def update_model(
): # model manager moved model path during rename - don't overwrite it
info.path = new_info.get("path")
# replace empty string values with None/null to avoid phenomenon of vae: ''
info_dict = info.dict()
info_dict = {x: info_dict[x] if info_dict[x] else None for x in info_dict.keys()}
ApiDependencies.invoker.services.model_manager.update_model(
model_name=model_name, base_model=base_model, model_type=model_type, model_attributes=info.dict()
model_name=model_name, base_model=base_model, model_type=model_type, model_attributes=info_dict
)
model_raw = ApiDependencies.invoker.services.model_manager.list_model(

View File

@ -38,7 +38,7 @@ import mimetypes
from .api.dependencies import ApiDependencies
from .api.routers import sessions, models, images, boards, board_images, app_info
from .api.sockets import SocketIO
from .invocations.baseinvocation import BaseInvocation
from .invocations.baseinvocation import BaseInvocation, _InputField, _OutputField, UIConfigBase
import torch
@ -134,6 +134,11 @@ def custom_openapi():
# This could break in some cases, figure out a better way to do it
output_type_titles[schema_key] = output_schema["title"]
# Add Node Editor UI helper schemas
ui_config_schemas = schema([UIConfigBase, _InputField, _OutputField], ref_prefix="#/components/schemas/")
for schema_key, output_schema in ui_config_schemas["definitions"].items():
openapi_schema["components"]["schemas"][schema_key] = output_schema
# Add a reference to the output type to additionalProperties of the invoker schema
for invoker in all_invocations:
invoker_name = invoker.__name__

View File

@ -37,6 +37,7 @@ from invokeai.app.services.image_record_storage import SqliteImageRecordStorage
from invokeai.app.services.images import ImageService, ImageServiceDependencies
from invokeai.app.services.resource_name import SimpleNameService
from invokeai.app.services.urls import LocalUrlService
from invokeai.app.services.invocation_stats import InvocationStatsService
from .services.default_graphs import default_text_to_image_graph_id, create_system_graphs
from .services.latent_storage import DiskLatentsStorage, ForwardCacheLatentsStorage
@ -311,6 +312,7 @@ def invoke_cli():
graph_library=SqliteItemStorage[LibraryGraph](filename=db_location, table_name="graphs"),
graph_execution_manager=graph_execution_manager,
processor=DefaultInvocationProcessor(),
performance_statistics=InvocationStatsService(graph_execution_manager),
logger=logger,
configuration=config,
)

View File

@ -3,15 +3,366 @@
from __future__ import annotations
from abc import ABC, abstractmethod
from enum import Enum
from inspect import signature
from typing import TYPE_CHECKING, Dict, List, Literal, TypedDict, get_args, get_type_hints
from typing import (
TYPE_CHECKING,
AbstractSet,
Any,
Callable,
ClassVar,
Mapping,
Optional,
Type,
TypeVar,
Union,
get_args,
get_type_hints,
)
from pydantic import BaseConfig, BaseModel, Field
from pydantic import BaseModel, Field
from pydantic.fields import Undefined
from pydantic.typing import NoArgAnyCallable
if TYPE_CHECKING:
from ..services.invocation_services import InvocationServices
class FieldDescriptions:
denoising_start = "When to start denoising, expressed a percentage of total steps"
denoising_end = "When to stop denoising, expressed a percentage of total steps"
cfg_scale = "Classifier-Free Guidance scale"
scheduler = "Scheduler to use during inference"
positive_cond = "Positive conditioning tensor"
negative_cond = "Negative conditioning tensor"
noise = "Noise tensor"
clip = "CLIP (tokenizer, text encoder, LoRAs) and skipped layer count"
unet = "UNet (scheduler, LoRAs)"
vae = "VAE"
cond = "Conditioning tensor"
controlnet_model = "ControlNet model to load"
vae_model = "VAE model to load"
lora_model = "LoRA model to load"
main_model = "Main model (UNet, VAE, CLIP) to load"
sdxl_main_model = "SDXL Main model (UNet, VAE, CLIP1, CLIP2) to load"
sdxl_refiner_model = "SDXL Refiner Main Modde (UNet, VAE, CLIP2) to load"
onnx_main_model = "ONNX Main model (UNet, VAE, CLIP) to load"
lora_weight = "The weight at which the LoRA is applied to each model"
compel_prompt = "Prompt to be parsed by Compel to create a conditioning tensor"
raw_prompt = "Raw prompt text (no parsing)"
sdxl_aesthetic = "The aesthetic score to apply to the conditioning tensor"
skipped_layers = "Number of layers to skip in text encoder"
seed = "Seed for random number generation"
steps = "Number of steps to run"
width = "Width of output (px)"
height = "Height of output (px)"
control = "ControlNet(s) to apply"
denoised_latents = "Denoised latents tensor"
latents = "Latents tensor"
strength = "Strength of denoising (proportional to steps)"
core_metadata = "Optional core metadata to be written to image"
interp_mode = "Interpolation mode"
torch_antialias = "Whether or not to apply antialiasing (bilinear or bicubic only)"
fp32 = "Whether or not to use full float32 precision"
precision = "Precision to use"
tiled = "Processing using overlapping tiles (reduce memory consumption)"
detect_res = "Pixel resolution for detection"
image_res = "Pixel resolution for output image"
safe_mode = "Whether or not to use safe mode"
scribble_mode = "Whether or not to use scribble mode"
scale_factor = "The factor by which to scale"
num_1 = "The first number"
num_2 = "The second number"
mask = "The mask to use for the operation"
class Input(str, Enum):
"""
The type of input a field accepts.
- `Input.Direct`: The field must have its value provided directly, when the invocation and field \
are instantiated.
- `Input.Connection`: The field must have its value provided by a connection.
- `Input.Any`: The field may have its value provided either directly or by a connection.
"""
Connection = "connection"
Direct = "direct"
Any = "any"
class UIType(str, Enum):
"""
Type hints for the UI.
If a field should be provided a data type that does not exactly match the python type of the field, \
use this to provide the type that should be used instead. See the node development docs for detail \
on adding a new field type, which involves client-side changes.
"""
# region Primitives
Integer = "integer"
Float = "float"
Boolean = "boolean"
String = "string"
Array = "array"
Image = "ImageField"
Latents = "LatentsField"
Conditioning = "ConditioningField"
Control = "ControlField"
Color = "ColorField"
ImageCollection = "ImageCollection"
ConditioningCollection = "ConditioningCollection"
ColorCollection = "ColorCollection"
LatentsCollection = "LatentsCollection"
IntegerCollection = "IntegerCollection"
FloatCollection = "FloatCollection"
StringCollection = "StringCollection"
BooleanCollection = "BooleanCollection"
# endregion
# region Models
MainModel = "MainModelField"
SDXLMainModel = "SDXLMainModelField"
SDXLRefinerModel = "SDXLRefinerModelField"
ONNXModel = "ONNXModelField"
VaeModel = "VaeModelField"
LoRAModel = "LoRAModelField"
ControlNetModel = "ControlNetModelField"
UNet = "UNetField"
Vae = "VaeField"
CLIP = "ClipField"
# endregion
# region Iterate/Collect
Collection = "Collection"
CollectionItem = "CollectionItem"
# endregion
# region Misc
FilePath = "FilePath"
Enum = "enum"
# endregion
class UIComponent(str, Enum):
"""
The type of UI component to use for a field, used to override the default components, which are \
inferred from the field type.
"""
None_ = "none"
Textarea = "textarea"
Slider = "slider"
class _InputField(BaseModel):
"""
*DO NOT USE*
This helper class is used to tell the client about our custom field attributes via OpenAPI
schema generation, and Typescript type generation from that schema. It serves no functional
purpose in the backend.
"""
input: Input
ui_hidden: bool
ui_type: Optional[UIType]
ui_component: Optional[UIComponent]
class _OutputField(BaseModel):
"""
*DO NOT USE*
This helper class is used to tell the client about our custom field attributes via OpenAPI
schema generation, and Typescript type generation from that schema. It serves no functional
purpose in the backend.
"""
ui_hidden: bool
ui_type: Optional[UIType]
def InputField(
*args: Any,
default: Any = Undefined,
default_factory: Optional[NoArgAnyCallable] = None,
alias: Optional[str] = None,
title: Optional[str] = None,
description: Optional[str] = None,
exclude: Optional[Union[AbstractSet[Union[int, str]], Mapping[Union[int, str], Any], Any]] = None,
include: Optional[Union[AbstractSet[Union[int, str]], Mapping[Union[int, str], Any], Any]] = None,
const: Optional[bool] = None,
gt: Optional[float] = None,
ge: Optional[float] = None,
lt: Optional[float] = None,
le: Optional[float] = None,
multiple_of: Optional[float] = None,
allow_inf_nan: Optional[bool] = None,
max_digits: Optional[int] = None,
decimal_places: Optional[int] = None,
min_items: Optional[int] = None,
max_items: Optional[int] = None,
unique_items: Optional[bool] = None,
min_length: Optional[int] = None,
max_length: Optional[int] = None,
allow_mutation: bool = True,
regex: Optional[str] = None,
discriminator: Optional[str] = None,
repr: bool = True,
input: Input = Input.Any,
ui_type: Optional[UIType] = None,
ui_component: Optional[UIComponent] = None,
ui_hidden: bool = False,
**kwargs: Any,
) -> Any:
"""
Creates an input field for an invocation.
This is a wrapper for Pydantic's [Field](https://docs.pydantic.dev/1.10/usage/schema/#field-customization) \
that adds a few extra parameters to support graph execution and the node editor UI.
:param Input input: [Input.Any] The kind of input this field requires. \
`Input.Direct` means a value must be provided on instantiation. \
`Input.Connection` means the value must be provided by a connection. \
`Input.Any` means either will do.
:param UIType ui_type: [None] Optionally provides an extra type hint for the UI. \
In some situations, the field's type is not enough to infer the correct UI type. \
For example, model selection fields should render a dropdown UI component to select a model. \
Internally, there is no difference between SD-1, SD-2 and SDXL model fields, they all use \
`MainModelField`. So to ensure the base-model-specific UI is rendered, you can use \
`UIType.SDXLMainModelField` to indicate that the field is an SDXL main model field.
:param UIComponent ui_component: [None] Optionally specifies a specific component to use in the UI. \
The UI will always render a suitable component, but sometimes you want something different than the default. \
For example, a `string` field will default to a single-line input, but you may want a multi-line textarea instead. \
For this case, you could provide `UIComponent.Textarea`.
: param bool ui_hidden: [False] Specifies whether or not this field should be hidden in the UI.
"""
return Field(
*args,
default=default,
default_factory=default_factory,
alias=alias,
title=title,
description=description,
exclude=exclude,
include=include,
const=const,
gt=gt,
ge=ge,
lt=lt,
le=le,
multiple_of=multiple_of,
allow_inf_nan=allow_inf_nan,
max_digits=max_digits,
decimal_places=decimal_places,
min_items=min_items,
max_items=max_items,
unique_items=unique_items,
min_length=min_length,
max_length=max_length,
allow_mutation=allow_mutation,
regex=regex,
discriminator=discriminator,
repr=repr,
input=input,
ui_type=ui_type,
ui_component=ui_component,
ui_hidden=ui_hidden,
**kwargs,
)
def OutputField(
*args: Any,
default: Any = Undefined,
default_factory: Optional[NoArgAnyCallable] = None,
alias: Optional[str] = None,
title: Optional[str] = None,
description: Optional[str] = None,
exclude: Optional[Union[AbstractSet[Union[int, str]], Mapping[Union[int, str], Any], Any]] = None,
include: Optional[Union[AbstractSet[Union[int, str]], Mapping[Union[int, str], Any], Any]] = None,
const: Optional[bool] = None,
gt: Optional[float] = None,
ge: Optional[float] = None,
lt: Optional[float] = None,
le: Optional[float] = None,
multiple_of: Optional[float] = None,
allow_inf_nan: Optional[bool] = None,
max_digits: Optional[int] = None,
decimal_places: Optional[int] = None,
min_items: Optional[int] = None,
max_items: Optional[int] = None,
unique_items: Optional[bool] = None,
min_length: Optional[int] = None,
max_length: Optional[int] = None,
allow_mutation: bool = True,
regex: Optional[str] = None,
discriminator: Optional[str] = None,
repr: bool = True,
ui_type: Optional[UIType] = None,
ui_hidden: bool = False,
**kwargs: Any,
) -> Any:
"""
Creates an output field for an invocation output.
This is a wrapper for Pydantic's [Field](https://docs.pydantic.dev/1.10/usage/schema/#field-customization) \
that adds a few extra parameters to support graph execution and the node editor UI.
:param UIType ui_type: [None] Optionally provides an extra type hint for the UI. \
In some situations, the field's type is not enough to infer the correct UI type. \
For example, model selection fields should render a dropdown UI component to select a model. \
Internally, there is no difference between SD-1, SD-2 and SDXL model fields, they all use \
`MainModelField`. So to ensure the base-model-specific UI is rendered, you can use \
`UIType.SDXLMainModelField` to indicate that the field is an SDXL main model field.
: param bool ui_hidden: [False] Specifies whether or not this field should be hidden in the UI. \
"""
return Field(
*args,
default=default,
default_factory=default_factory,
alias=alias,
title=title,
description=description,
exclude=exclude,
include=include,
const=const,
gt=gt,
ge=ge,
lt=lt,
le=le,
multiple_of=multiple_of,
allow_inf_nan=allow_inf_nan,
max_digits=max_digits,
decimal_places=decimal_places,
min_items=min_items,
max_items=max_items,
unique_items=unique_items,
min_length=min_length,
max_length=max_length,
allow_mutation=allow_mutation,
regex=regex,
discriminator=discriminator,
repr=repr,
ui_type=ui_type,
ui_hidden=ui_hidden,
**kwargs,
)
class UIConfigBase(BaseModel):
"""
Provides additional node configuration to the UI.
This is used internally by the @tags and @title decorator logic. You probably want to use those
decorators, though you may add this class to a node definition to specify the title and tags.
"""
tags: Optional[list[str]] = Field(default_factory=None, description="The tags to display in the UI")
title: Optional[str] = Field(default=None, description="The display name of the node")
class InvocationContext:
services: InvocationServices
graph_execution_state_id: str
@ -39,6 +390,20 @@ class BaseInvocationOutput(BaseModel):
return tuple(subclasses)
class RequiredConnectionException(Exception):
"""Raised when an field which requires a connection did not receive a value."""
def __init__(self, node_id: str, field_name: str):
super().__init__(f"Node {node_id} missing connections for field {field_name}")
class MissingInputException(Exception):
"""Raised when an field which requires some input, but did not receive a value."""
def __init__(self, node_id: str, field_name: str):
super().__init__(f"Node {node_id} missing value or connection for field {field_name}")
class BaseInvocation(ABC, BaseModel):
"""A node to process inputs and produce outputs.
May use dependency injection in __init__ to receive providers.
@ -76,70 +441,81 @@ class BaseInvocation(ABC, BaseModel):
def get_output_type(cls):
return signature(cls.invoke).return_annotation
class Config:
@staticmethod
def schema_extra(schema: dict[str, Any], model_class: Type[BaseModel]) -> None:
uiconfig = getattr(model_class, "UIConfig", None)
if uiconfig and hasattr(uiconfig, "title"):
schema["title"] = uiconfig.title
if uiconfig and hasattr(uiconfig, "tags"):
schema["tags"] = uiconfig.tags
@abstractmethod
def invoke(self, context: InvocationContext) -> BaseInvocationOutput:
"""Invoke with provided context and return outputs."""
pass
# fmt: off
id: str = Field(description="The id of this node. Must be unique among all nodes.")
is_intermediate: bool = Field(default=False, description="Whether or not this node is an intermediate node.")
# fmt: on
def __init__(self, **data):
# nodes may have required fields, that can accept input from connections
# on instantiation of the model, we need to exclude these from validation
restore = dict()
try:
field_names = list(self.__fields__.keys())
for field_name in field_names:
# if the field is required and may get its value from a connection, exclude it from validation
field = self.__fields__[field_name]
_input = field.field_info.extra.get("input", None)
if _input in [Input.Connection, Input.Any] and field.required:
if field_name not in data:
restore[field_name] = self.__fields__.pop(field_name)
# instantiate the node, which will validate the data
super().__init__(**data)
finally:
# restore the removed fields
for field_name, field in restore.items():
self.__fields__[field_name] = field
def invoke_internal(self, context: InvocationContext) -> BaseInvocationOutput:
for field_name, field in self.__fields__.items():
_input = field.field_info.extra.get("input", None)
if field.required and not hasattr(self, field_name):
if _input == Input.Connection:
raise RequiredConnectionException(self.__fields__["type"].default, field_name)
elif _input == Input.Any:
raise MissingInputException(self.__fields__["type"].default, field_name)
return self.invoke(context)
id: str = InputField(description="The id of this node. Must be unique among all nodes.")
is_intermediate: bool = InputField(
default=False, description="Whether or not this node is an intermediate node.", input=Input.Direct
)
UIConfig: ClassVar[Type[UIConfigBase]]
# TODO: figure out a better way to provide these hints
# TODO: when we can upgrade to python 3.11, we can use the`NotRequired` type instead of `total=False`
class UIConfig(TypedDict, total=False):
type_hints: Dict[
str,
Literal[
"integer",
"float",
"boolean",
"string",
"enum",
"image",
"latents",
"model",
"control",
"image_collection",
"vae_model",
"lora_model",
],
]
tags: List[str]
title: str
T = TypeVar("T", bound=BaseInvocation)
class CustomisedSchemaExtra(TypedDict):
ui: UIConfig
def title(title: str) -> Callable[[Type[T]], Type[T]]:
"""Adds a title to the invocation. Use this to override the default title generation, which is based on the class name."""
def wrapper(cls: Type[T]) -> Type[T]:
uiconf_name = cls.__qualname__ + ".UIConfig"
if not hasattr(cls, "UIConfig") or cls.UIConfig.__qualname__ != uiconf_name:
cls.UIConfig = type(uiconf_name, (UIConfigBase,), dict())
cls.UIConfig.title = title
return cls
return wrapper
class InvocationConfig(BaseConfig):
"""Customizes pydantic's BaseModel.Config class for use by Invocations.
def tags(*tags: str) -> Callable[[Type[T]], Type[T]]:
"""Adds tags to the invocation. Use this to improve the streamline finding the invocation in the UI."""
Provide `schema_extra` a `ui` dict to add hints for generated UIs.
def wrapper(cls: Type[T]) -> Type[T]:
uiconf_name = cls.__qualname__ + ".UIConfig"
if not hasattr(cls, "UIConfig") or cls.UIConfig.__qualname__ != uiconf_name:
cls.UIConfig = type(uiconf_name, (UIConfigBase,), dict())
cls.UIConfig.tags = list(tags)
return cls
`tags`
- A list of strings, used to categorise invocations.
`type_hints`
- A dict of field types which override the types in the invocation definition.
- Each key should be the name of one of the invocation's fields.
- Each value should be one of the valid types:
- `integer`, `float`, `boolean`, `string`, `enum`, `image`, `latents`, `model`
```python
class Config(InvocationConfig):
schema_extra = {
"ui": {
"tags": ["stable-diffusion", "image"],
"type_hints": {
"initial_image": "image",
},
},
}
```
"""
schema_extra: CustomisedSchemaExtra
return wrapper

View File

@ -3,58 +3,25 @@
from typing import Literal
import numpy as np
from pydantic import Field, validator
from pydantic import validator
from invokeai.app.models.image import ImageField
from invokeai.app.invocations.primitives import ImageCollectionOutput, ImageField, IntegerCollectionOutput
from invokeai.app.util.misc import SEED_MAX, get_random_seed
from .baseinvocation import BaseInvocation, BaseInvocationOutput, InvocationConfig, InvocationContext, UIConfig
class IntCollectionOutput(BaseInvocationOutput):
"""A collection of integers"""
type: Literal["int_collection"] = "int_collection"
# Outputs
collection: list[int] = Field(default=[], description="The int collection")
class FloatCollectionOutput(BaseInvocationOutput):
"""A collection of floats"""
type: Literal["float_collection"] = "float_collection"
# Outputs
collection: list[float] = Field(default=[], description="The float collection")
class ImageCollectionOutput(BaseInvocationOutput):
"""A collection of images"""
type: Literal["image_collection"] = "image_collection"
# Outputs
collection: list[ImageField] = Field(default=[], description="The output images")
class Config:
schema_extra = {"required": ["type", "collection"]}
from .baseinvocation import BaseInvocation, InputField, InvocationContext, UIType, tags, title
@title("Integer Range")
@tags("collection", "integer", "range")
class RangeInvocation(BaseInvocation):
"""Creates a range of numbers from start to stop with step"""
type: Literal["range"] = "range"
# Inputs
start: int = Field(default=0, description="The start of the range")
stop: int = Field(default=10, description="The stop of the range")
step: int = Field(default=1, description="The step of the range")
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Range", "tags": ["range", "integer", "collection"]},
}
start: int = InputField(default=0, description="The start of the range")
stop: int = InputField(default=10, description="The stop of the range")
step: int = InputField(default=1, description="The step of the range")
@validator("stop")
def stop_gt_start(cls, v, values):
@ -62,76 +29,44 @@ class RangeInvocation(BaseInvocation):
raise ValueError("stop must be greater than start")
return v
def invoke(self, context: InvocationContext) -> IntCollectionOutput:
return IntCollectionOutput(collection=list(range(self.start, self.stop, self.step)))
def invoke(self, context: InvocationContext) -> IntegerCollectionOutput:
return IntegerCollectionOutput(collection=list(range(self.start, self.stop, self.step)))
@title("Integer Range of Size")
@tags("range", "integer", "size", "collection")
class RangeOfSizeInvocation(BaseInvocation):
"""Creates a range from start to start + size with step"""
type: Literal["range_of_size"] = "range_of_size"
# Inputs
start: int = Field(default=0, description="The start of the range")
size: int = Field(default=1, description="The number of values")
step: int = Field(default=1, description="The step of the range")
start: int = InputField(default=0, description="The start of the range")
size: int = InputField(default=1, description="The number of values")
step: int = InputField(default=1, description="The step of the range")
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Sized Range", "tags": ["range", "integer", "size", "collection"]},
}
def invoke(self, context: InvocationContext) -> IntCollectionOutput:
return IntCollectionOutput(collection=list(range(self.start, self.start + self.size, self.step)))
def invoke(self, context: InvocationContext) -> IntegerCollectionOutput:
return IntegerCollectionOutput(collection=list(range(self.start, self.start + self.size, self.step)))
@title("Random Range")
@tags("range", "integer", "random", "collection")
class RandomRangeInvocation(BaseInvocation):
"""Creates a collection of random numbers"""
type: Literal["random_range"] = "random_range"
# Inputs
low: int = Field(default=0, description="The inclusive low value")
high: int = Field(default=np.iinfo(np.int32).max, description="The exclusive high value")
size: int = Field(default=1, description="The number of values to generate")
seed: int = Field(
low: int = InputField(default=0, description="The inclusive low value")
high: int = InputField(default=np.iinfo(np.int32).max, description="The exclusive high value")
size: int = InputField(default=1, description="The number of values to generate")
seed: int = InputField(
ge=0,
le=SEED_MAX,
description="The seed for the RNG (omit for random)",
default_factory=get_random_seed,
)
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Random Range", "tags": ["range", "integer", "random", "collection"]},
}
def invoke(self, context: InvocationContext) -> IntCollectionOutput:
def invoke(self, context: InvocationContext) -> IntegerCollectionOutput:
rng = np.random.default_rng(self.seed)
return IntCollectionOutput(collection=list(rng.integers(low=self.low, high=self.high, size=self.size)))
class ImageCollectionInvocation(BaseInvocation):
"""Load a collection of images and provide it as output."""
# fmt: off
type: Literal["image_collection"] = "image_collection"
# Inputs
images: list[ImageField] = Field(
default=[], description="The image collection to load"
)
# fmt: on
def invoke(self, context: InvocationContext) -> ImageCollectionOutput:
return ImageCollectionOutput(collection=self.images)
class Config(InvocationConfig):
schema_extra = {
"ui": {
"type_hints": {
"title": "Image Collection",
"images": "image_collection",
}
},
}
return IntegerCollectionOutput(collection=list(rng.integers(low=self.low, high=self.high, size=self.size)))

View File

@ -1,56 +1,40 @@
from typing import Literal, Optional, Union, List, Annotated
from pydantic import BaseModel, Field
import re
from .baseinvocation import BaseInvocation, BaseInvocationOutput, InvocationContext, InvocationConfig
from .model import ClipField
from ...backend.util.devices import torch_dtype
from ...backend.stable_diffusion.diffusion import InvokeAIDiffuserComponent
from ...backend.model_management import BaseModelType, ModelType, SubModelType, ModelPatcher
from dataclasses import dataclass
from typing import List, Literal, Union
import torch
from compel import Compel, ReturnedEmbeddingsType
from compel.prompt_parser import Blend, Conjunction, CrossAttentionControlSubstitute, FlattenedPrompt, Fragment
from ...backend.util.devices import torch_dtype
from ...backend.model_management import ModelType
from ...backend.model_management.models import ModelNotFoundException
from invokeai.app.invocations.primitives import ConditioningField, ConditioningOutput
from invokeai.backend.stable_diffusion.diffusion.shared_invokeai_diffusion import (
BasicConditioningInfo,
SDXLConditioningInfo,
)
from ...backend.model_management import ModelPatcher, ModelType
from ...backend.model_management.lora import ModelPatcher
from ...backend.model_management.models import ModelNotFoundException
from ...backend.stable_diffusion.diffusion import InvokeAIDiffuserComponent
from .baseinvocation import BaseInvocation, BaseInvocationOutput, InvocationConfig, InvocationContext
from ...backend.util.devices import torch_dtype
from .baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
FieldDescriptions,
Input,
InputField,
InvocationContext,
OutputField,
UIComponent,
tags,
title,
)
from .model import ClipField
from dataclasses import dataclass
class ConditioningField(BaseModel):
conditioning_name: Optional[str] = Field(default=None, description="The name of conditioning data")
class Config:
schema_extra = {"required": ["conditioning_name"]}
@dataclass
class BasicConditioningInfo:
# type: Literal["basic_conditioning"] = "basic_conditioning"
embeds: torch.Tensor
extra_conditioning: Optional[InvokeAIDiffuserComponent.ExtraConditioningInfo]
# weight: float
# mode: ConditioningAlgo
@dataclass
class SDXLConditioningInfo(BasicConditioningInfo):
# type: Literal["sdxl_conditioning"] = "sdxl_conditioning"
pooled_embeds: torch.Tensor
add_time_ids: torch.Tensor
ConditioningInfoType = Annotated[Union[BasicConditioningInfo, SDXLConditioningInfo], Field(discriminator="type")]
@dataclass
class ConditioningFieldData:
conditionings: List[Union[BasicConditioningInfo, SDXLConditioningInfo]]
conditionings: List[BasicConditioningInfo]
# unconditioned: Optional[torch.Tensor]
@ -60,32 +44,26 @@ class ConditioningFieldData:
# PerpNeg = "perp_neg"
class CompelOutput(BaseInvocationOutput):
"""Compel parser output"""
# fmt: off
type: Literal["compel_output"] = "compel_output"
conditioning: ConditioningField = Field(default=None, description="Conditioning")
# fmt: on
@title("Compel Prompt")
@tags("prompt", "compel")
class CompelInvocation(BaseInvocation):
"""Parse prompt using compel package to conditioning."""
type: Literal["compel"] = "compel"
prompt: str = Field(default="", description="Prompt")
clip: ClipField = Field(None, description="Clip to use")
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Prompt (Compel)", "tags": ["prompt", "compel"], "type_hints": {"model": "model"}},
}
prompt: str = InputField(
default="",
description=FieldDescriptions.compel_prompt,
ui_component=UIComponent.Textarea,
)
clip: ClipField = InputField(
title="CLIP",
description=FieldDescriptions.clip,
input=Input.Connection,
)
@torch.no_grad()
def invoke(self, context: InvocationContext) -> CompelOutput:
def invoke(self, context: InvocationContext) -> ConditioningOutput:
tokenizer_info = context.services.model_manager.get_model(
**self.clip.tokenizer.dict(),
context=context,
@ -109,12 +87,15 @@ class CompelInvocation(BaseInvocation):
name = trigger[1:-1]
try:
ti_list.append(
context.services.model_manager.get_model(
model_name=name,
base_model=self.clip.text_encoder.base_model,
model_type=ModelType.TextualInversion,
context=context,
).context.model
(
name,
context.services.model_manager.get_model(
model_name=name,
base_model=self.clip.text_encoder.base_model,
model_type=ModelType.TextualInversion,
context=context,
).context.model,
)
)
except ModelNotFoundException:
# print(e)
@ -165,7 +146,7 @@ class CompelInvocation(BaseInvocation):
conditioning_name = f"{context.graph_execution_state_id}_{self.id}_conditioning"
context.services.latents.save(conditioning_name, conditioning_data)
return CompelOutput(
return ConditioningOutput(
conditioning=ConditioningField(
conditioning_name=conditioning_name,
),
@ -173,7 +154,15 @@ class CompelInvocation(BaseInvocation):
class SDXLPromptInvocationBase:
def run_clip_raw(self, context, clip_field, prompt, get_pooled):
def run_clip_compel(
self,
context: InvocationContext,
clip_field: ClipField,
prompt: str,
get_pooled: bool,
lora_prefix: str,
zero_on_empty: bool,
):
tokenizer_info = context.services.model_manager.get_model(
**clip_field.tokenizer.dict(),
context=context,
@ -183,79 +172,21 @@ class SDXLPromptInvocationBase:
context=context,
)
def _lora_loader():
for lora in clip_field.loras:
lora_info = context.services.model_manager.get_model(**lora.dict(exclude={"weight"}), context=context)
yield (lora_info.context.model, lora.weight)
del lora_info
return
# loras = [(context.services.model_manager.get_model(**lora.dict(exclude={"weight"})).context.model, lora.weight) for lora in self.clip.loras]
ti_list = []
for trigger in re.findall(r"<[a-zA-Z0-9., _-]+>", prompt):
name = trigger[1:-1]
try:
ti_list.append(
context.services.model_manager.get_model(
model_name=name,
base_model=clip_field.text_encoder.base_model,
model_type=ModelType.TextualInversion,
context=context,
).context.model
)
except ModelNotFoundException:
# print(e)
# import traceback
# print(traceback.format_exc())
print(f'Warn: trigger: "{trigger}" not found')
with ModelPatcher.apply_lora_text_encoder(
text_encoder_info.context.model, _lora_loader()
), ModelPatcher.apply_ti(tokenizer_info.context.model, text_encoder_info.context.model, ti_list) as (
tokenizer,
ti_manager,
), ModelPatcher.apply_clip_skip(
text_encoder_info.context.model, clip_field.skipped_layers
), text_encoder_info as text_encoder:
text_inputs = tokenizer(
prompt,
padding="max_length",
max_length=tokenizer.model_max_length,
truncation=True,
return_tensors="pt",
)
text_input_ids = text_inputs.input_ids
prompt_embeds = text_encoder(
text_input_ids.to(text_encoder.device),
output_hidden_states=True,
# return zero on empty
if prompt == "" and zero_on_empty:
cpu_text_encoder = text_encoder_info.context.model
c = torch.zeros(
(1, cpu_text_encoder.config.max_position_embeddings, cpu_text_encoder.config.hidden_size),
dtype=text_encoder_info.context.cache.precision,
)
if get_pooled:
c_pooled = prompt_embeds[0]
c_pooled = torch.zeros(
(1, cpu_text_encoder.config.hidden_size),
dtype=c.dtype,
)
else:
c_pooled = None
c = prompt_embeds.hidden_states[-2]
del tokenizer
del text_encoder
del tokenizer_info
del text_encoder_info
c = c.detach().to("cpu")
if c_pooled is not None:
c_pooled = c_pooled.detach().to("cpu")
return c, c_pooled, None
def run_clip_compel(self, context, clip_field, prompt, get_pooled):
tokenizer_info = context.services.model_manager.get_model(
**clip_field.tokenizer.dict(),
context=context,
)
text_encoder_info = context.services.model_manager.get_model(
**clip_field.text_encoder.dict(),
context=context,
)
return c, c_pooled, None
def _lora_loader():
for lora in clip_field.loras:
@ -271,12 +202,15 @@ class SDXLPromptInvocationBase:
name = trigger[1:-1]
try:
ti_list.append(
context.services.model_manager.get_model(
model_name=name,
base_model=clip_field.text_encoder.base_model,
model_type=ModelType.TextualInversion,
context=context,
).context.model
(
name,
context.services.model_manager.get_model(
model_name=name,
base_model=clip_field.text_encoder.base_model,
model_type=ModelType.TextualInversion,
context=context,
).context.model,
)
)
except ModelNotFoundException:
# print(e)
@ -284,8 +218,8 @@ class SDXLPromptInvocationBase:
# print(traceback.format_exc())
print(f'Warn: trigger: "{trigger}" not found')
with ModelPatcher.apply_lora_text_encoder(
text_encoder_info.context.model, _lora_loader()
with ModelPatcher.apply_lora(
text_encoder_info.context.model, _lora_loader(), lora_prefix
), ModelPatcher.apply_ti(tokenizer_info.context.model, text_encoder_info.context.model, ti_list) as (
tokenizer,
ti_manager,
@ -333,35 +267,37 @@ class SDXLPromptInvocationBase:
return c, c_pooled, ec
@title("SDXL Compel Prompt")
@tags("sdxl", "compel", "prompt")
class SDXLCompelPromptInvocation(BaseInvocation, SDXLPromptInvocationBase):
"""Parse prompt using compel package to conditioning."""
type: Literal["sdxl_compel_prompt"] = "sdxl_compel_prompt"
prompt: str = Field(default="", description="Prompt")
style: str = Field(default="", description="Style prompt")
original_width: int = Field(1024, description="")
original_height: int = Field(1024, description="")
crop_top: int = Field(0, description="")
crop_left: int = Field(0, description="")
target_width: int = Field(1024, description="")
target_height: int = Field(1024, description="")
clip: ClipField = Field(None, description="Clip to use")
clip2: ClipField = Field(None, description="Clip2 to use")
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "SDXL Prompt (Compel)", "tags": ["prompt", "compel"], "type_hints": {"model": "model"}},
}
prompt: str = InputField(default="", description=FieldDescriptions.compel_prompt, ui_component=UIComponent.Textarea)
style: str = InputField(default="", description=FieldDescriptions.compel_prompt, ui_component=UIComponent.Textarea)
original_width: int = InputField(default=1024, description="")
original_height: int = InputField(default=1024, description="")
crop_top: int = InputField(default=0, description="")
crop_left: int = InputField(default=0, description="")
target_width: int = InputField(default=1024, description="")
target_height: int = InputField(default=1024, description="")
clip: ClipField = InputField(description=FieldDescriptions.clip, input=Input.Connection)
clip2: ClipField = InputField(description=FieldDescriptions.clip, input=Input.Connection)
@torch.no_grad()
def invoke(self, context: InvocationContext) -> CompelOutput:
c1, c1_pooled, ec1 = self.run_clip_compel(context, self.clip, self.prompt, False)
def invoke(self, context: InvocationContext) -> ConditioningOutput:
c1, c1_pooled, ec1 = self.run_clip_compel(
context, self.clip, self.prompt, False, "lora_te1_", zero_on_empty=True
)
if self.style.strip() == "":
c2, c2_pooled, ec2 = self.run_clip_compel(context, self.clip2, self.prompt, True)
c2, c2_pooled, ec2 = self.run_clip_compel(
context, self.clip2, self.prompt, True, "lora_te2_", zero_on_empty=True
)
else:
c2, c2_pooled, ec2 = self.run_clip_compel(context, self.clip2, self.style, True)
c2, c2_pooled, ec2 = self.run_clip_compel(
context, self.clip2, self.style, True, "lora_te2_", zero_on_empty=True
)
original_size = (self.original_height, self.original_width)
crop_coords = (self.crop_top, self.crop_left)
@ -383,39 +319,34 @@ class SDXLCompelPromptInvocation(BaseInvocation, SDXLPromptInvocationBase):
conditioning_name = f"{context.graph_execution_state_id}_{self.id}_conditioning"
context.services.latents.save(conditioning_name, conditioning_data)
return CompelOutput(
return ConditioningOutput(
conditioning=ConditioningField(
conditioning_name=conditioning_name,
),
)
@title("SDXL Refiner Compel Prompt")
@tags("sdxl", "compel", "prompt")
class SDXLRefinerCompelPromptInvocation(BaseInvocation, SDXLPromptInvocationBase):
"""Parse prompt using compel package to conditioning."""
type: Literal["sdxl_refiner_compel_prompt"] = "sdxl_refiner_compel_prompt"
style: str = Field(default="", description="Style prompt") # TODO: ?
original_width: int = Field(1024, description="")
original_height: int = Field(1024, description="")
crop_top: int = Field(0, description="")
crop_left: int = Field(0, description="")
aesthetic_score: float = Field(6.0, description="")
clip2: ClipField = Field(None, description="Clip to use")
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "SDXL Refiner Prompt (Compel)",
"tags": ["prompt", "compel"],
"type_hints": {"model": "model"},
},
}
style: str = InputField(
default="", description=FieldDescriptions.compel_prompt, ui_component=UIComponent.Textarea
) # TODO: ?
original_width: int = InputField(default=1024, description="")
original_height: int = InputField(default=1024, description="")
crop_top: int = InputField(default=0, description="")
crop_left: int = InputField(default=0, description="")
aesthetic_score: float = InputField(default=6.0, description=FieldDescriptions.sdxl_aesthetic)
clip2: ClipField = InputField(description=FieldDescriptions.clip, input=Input.Connection)
@torch.no_grad()
def invoke(self, context: InvocationContext) -> CompelOutput:
c2, c2_pooled, ec2 = self.run_clip_compel(context, self.clip2, self.style, True)
def invoke(self, context: InvocationContext) -> ConditioningOutput:
# TODO: if there will appear lora for refiner - write proper prefix
c2, c2_pooled, ec2 = self.run_clip_compel(context, self.clip2, self.style, True, "<NONE>", zero_on_empty=False)
original_size = (self.original_height, self.original_width)
crop_coords = (self.crop_top, self.crop_left)
@ -436,117 +367,7 @@ class SDXLRefinerCompelPromptInvocation(BaseInvocation, SDXLPromptInvocationBase
conditioning_name = f"{context.graph_execution_state_id}_{self.id}_conditioning"
context.services.latents.save(conditioning_name, conditioning_data)
return CompelOutput(
conditioning=ConditioningField(
conditioning_name=conditioning_name,
),
)
class SDXLRawPromptInvocation(BaseInvocation, SDXLPromptInvocationBase):
"""Pass unmodified prompt to conditioning without compel processing."""
type: Literal["sdxl_raw_prompt"] = "sdxl_raw_prompt"
prompt: str = Field(default="", description="Prompt")
style: str = Field(default="", description="Style prompt")
original_width: int = Field(1024, description="")
original_height: int = Field(1024, description="")
crop_top: int = Field(0, description="")
crop_left: int = Field(0, description="")
target_width: int = Field(1024, description="")
target_height: int = Field(1024, description="")
clip: ClipField = Field(None, description="Clip to use")
clip2: ClipField = Field(None, description="Clip2 to use")
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "SDXL Prompt (Raw)", "tags": ["prompt", "compel"], "type_hints": {"model": "model"}},
}
@torch.no_grad()
def invoke(self, context: InvocationContext) -> CompelOutput:
c1, c1_pooled, ec1 = self.run_clip_raw(context, self.clip, self.prompt, False)
if self.style.strip() == "":
c2, c2_pooled, ec2 = self.run_clip_raw(context, self.clip2, self.prompt, True)
else:
c2, c2_pooled, ec2 = self.run_clip_raw(context, self.clip2, self.style, True)
original_size = (self.original_height, self.original_width)
crop_coords = (self.crop_top, self.crop_left)
target_size = (self.target_height, self.target_width)
add_time_ids = torch.tensor([original_size + crop_coords + target_size])
conditioning_data = ConditioningFieldData(
conditionings=[
SDXLConditioningInfo(
embeds=torch.cat([c1, c2], dim=-1),
pooled_embeds=c2_pooled,
add_time_ids=add_time_ids,
extra_conditioning=ec1,
)
]
)
conditioning_name = f"{context.graph_execution_state_id}_{self.id}_conditioning"
context.services.latents.save(conditioning_name, conditioning_data)
return CompelOutput(
conditioning=ConditioningField(
conditioning_name=conditioning_name,
),
)
class SDXLRefinerRawPromptInvocation(BaseInvocation, SDXLPromptInvocationBase):
"""Parse prompt using compel package to conditioning."""
type: Literal["sdxl_refiner_raw_prompt"] = "sdxl_refiner_raw_prompt"
style: str = Field(default="", description="Style prompt") # TODO: ?
original_width: int = Field(1024, description="")
original_height: int = Field(1024, description="")
crop_top: int = Field(0, description="")
crop_left: int = Field(0, description="")
aesthetic_score: float = Field(6.0, description="")
clip2: ClipField = Field(None, description="Clip to use")
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "SDXL Refiner Prompt (Raw)",
"tags": ["prompt", "compel"],
"type_hints": {"model": "model"},
},
}
@torch.no_grad()
def invoke(self, context: InvocationContext) -> CompelOutput:
c2, c2_pooled, ec2 = self.run_clip_raw(context, self.clip2, self.style, True)
original_size = (self.original_height, self.original_width)
crop_coords = (self.crop_top, self.crop_left)
add_time_ids = torch.tensor([original_size + crop_coords + (self.aesthetic_score,)])
conditioning_data = ConditioningFieldData(
conditionings=[
SDXLConditioningInfo(
embeds=c2,
pooled_embeds=c2_pooled,
add_time_ids=add_time_ids,
extra_conditioning=ec2, # or None
)
]
)
conditioning_name = f"{context.graph_execution_state_id}_{self.id}_conditioning"
context.services.latents.save(conditioning_name, conditioning_data)
return CompelOutput(
return ConditioningOutput(
conditioning=ConditioningField(
conditioning_name=conditioning_name,
),
@ -557,21 +378,18 @@ class ClipSkipInvocationOutput(BaseInvocationOutput):
"""Clip skip node output"""
type: Literal["clip_skip_output"] = "clip_skip_output"
clip: ClipField = Field(None, description="Clip with skipped layers")
clip: ClipField = OutputField(default=None, description=FieldDescriptions.clip, title="CLIP")
@title("CLIP Skip")
@tags("clipskip", "clip", "skip")
class ClipSkipInvocation(BaseInvocation):
"""Skip layers in clip text_encoder model."""
type: Literal["clip_skip"] = "clip_skip"
clip: ClipField = Field(None, description="Clip to use")
skipped_layers: int = Field(0, description="Number of layers to skip in text_encoder")
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "CLIP Skip", "tags": ["clip", "skip"]},
}
clip: ClipField = InputField(description=FieldDescriptions.clip, input=Input.Connection, title="CLIP")
skipped_layers: int = InputField(default=0, description=FieldDescriptions.skipped_layers)
def invoke(self, context: InvocationContext) -> ClipSkipInvocationOutput:
self.clip.skipped_layers += self.skipped_layers

View File

@ -26,79 +26,31 @@ from controlnet_aux.util import HWC3, ade_palette
from PIL import Image
from pydantic import BaseModel, Field, validator
from invokeai.app.invocations.primitives import ImageField, ImageOutput
from ...backend.model_management import BaseModelType, ModelType
from ..models.image import ImageCategory, ImageField, ResourceOrigin
from .baseinvocation import BaseInvocation, BaseInvocationOutput, InvocationConfig, InvocationContext
from ..models.image import ImageOutput, PILInvocationConfig
from ..models.image import ImageCategory, ResourceOrigin
from .baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
FieldDescriptions,
InputField,
Input,
InvocationContext,
OutputField,
UIType,
tags,
title,
)
CONTROLNET_DEFAULT_MODELS = [
###########################################
# lllyasviel sd v1.5, ControlNet v1.0 models
##############################################
"lllyasviel/sd-controlnet-canny",
"lllyasviel/sd-controlnet-depth",
"lllyasviel/sd-controlnet-hed",
"lllyasviel/sd-controlnet-seg",
"lllyasviel/sd-controlnet-openpose",
"lllyasviel/sd-controlnet-scribble",
"lllyasviel/sd-controlnet-normal",
"lllyasviel/sd-controlnet-mlsd",
#############################################
# lllyasviel sd v1.5, ControlNet v1.1 models
#############################################
"lllyasviel/control_v11p_sd15_canny",
"lllyasviel/control_v11p_sd15_openpose",
"lllyasviel/control_v11p_sd15_seg",
# "lllyasviel/control_v11p_sd15_depth", # broken
"lllyasviel/control_v11f1p_sd15_depth",
"lllyasviel/control_v11p_sd15_normalbae",
"lllyasviel/control_v11p_sd15_scribble",
"lllyasviel/control_v11p_sd15_mlsd",
"lllyasviel/control_v11p_sd15_softedge",
"lllyasviel/control_v11p_sd15s2_lineart_anime",
"lllyasviel/control_v11p_sd15_lineart",
"lllyasviel/control_v11p_sd15_inpaint",
# "lllyasviel/control_v11u_sd15_tile",
# problem (temporary?) with huffingface "lllyasviel/control_v11u_sd15_tile",
# so for now replace "lllyasviel/control_v11f1e_sd15_tile",
"lllyasviel/control_v11e_sd15_shuffle",
"lllyasviel/control_v11e_sd15_ip2p",
"lllyasviel/control_v11f1e_sd15_tile",
#################################################
# thibaud sd v2.1 models (ControlNet v1.0? or v1.1?
##################################################
"thibaud/controlnet-sd21-openpose-diffusers",
"thibaud/controlnet-sd21-canny-diffusers",
"thibaud/controlnet-sd21-depth-diffusers",
"thibaud/controlnet-sd21-scribble-diffusers",
"thibaud/controlnet-sd21-hed-diffusers",
"thibaud/controlnet-sd21-zoedepth-diffusers",
"thibaud/controlnet-sd21-color-diffusers",
"thibaud/controlnet-sd21-openposev2-diffusers",
"thibaud/controlnet-sd21-lineart-diffusers",
"thibaud/controlnet-sd21-normalbae-diffusers",
"thibaud/controlnet-sd21-ade20k-diffusers",
##############################################
# ControlNetMediaPipeface, ControlNet v1.1
##############################################
# ["CrucibleAI/ControlNetMediaPipeFace", "diffusion_sd15"], # SD 1.5
# diffusion_sd15 needs to be passed to from_pretrained() as subfolder arg
# hacked t2l to split to model & subfolder if format is "model,subfolder"
"CrucibleAI/ControlNetMediaPipeFace,diffusion_sd15", # SD 1.5
"CrucibleAI/ControlNetMediaPipeFace", # SD 2.1?
]
CONTROLNET_NAME_VALUES = Literal[tuple(CONTROLNET_DEFAULT_MODELS)]
CONTROLNET_MODE_VALUES = Literal[tuple(["balanced", "more_prompt", "more_control", "unbalanced"])]
CONTROLNET_MODE_VALUES = Literal["balanced", "more_prompt", "more_control", "unbalanced"]
CONTROLNET_RESIZE_VALUES = Literal[
tuple(
[
"just_resize",
"crop_resize",
"fill_resize",
"just_resize_simple",
]
)
"just_resize",
"crop_resize",
"fill_resize",
"just_resize_simple",
]
@ -110,9 +62,8 @@ class ControlNetModelField(BaseModel):
class ControlField(BaseModel):
image: ImageField = Field(default=None, description="The control image")
control_model: Optional[ControlNetModelField] = Field(default=None, description="The ControlNet model to use")
# control_weight: Optional[float] = Field(default=1, description="weight given to controlnet")
image: ImageField = Field(description="The control image")
control_model: ControlNetModelField = Field(description="The ControlNet model to use")
control_weight: Union[float, List[float]] = Field(default=1, description="The weight given to the ControlNet")
begin_step_percent: float = Field(
default=0, ge=0, le=1, description="When the ControlNet is first applied (% of total steps)"
@ -135,60 +86,39 @@ class ControlField(BaseModel):
raise ValueError("Control weights must be within -1 to 2 range")
return v
class Config:
schema_extra = {
"required": ["image", "control_model", "control_weight", "begin_step_percent", "end_step_percent"],
"ui": {
"type_hints": {
"control_weight": "float",
"control_model": "controlnet_model",
# "control_weight": "number",
}
},
}
class ControlOutput(BaseInvocationOutput):
"""node output for ControlNet info"""
# fmt: off
type: Literal["control_output"] = "control_output"
control: ControlField = Field(default=None, description="The control info")
# fmt: on
# Outputs
control: ControlField = OutputField(description=FieldDescriptions.control)
@title("ControlNet")
@tags("controlnet")
class ControlNetInvocation(BaseInvocation):
"""Collects ControlNet info to pass to other nodes"""
# fmt: off
type: Literal["controlnet"] = "controlnet"
# Inputs
image: ImageField = Field(default=None, description="The control image")
control_model: ControlNetModelField = Field(default="lllyasviel/sd-controlnet-canny",
description="control model used")
control_weight: Union[float, List[float]] = Field(default=1.0, description="The weight given to the ControlNet")
begin_step_percent: float = Field(default=0, ge=-1, le=2,
description="When the ControlNet is first applied (% of total steps)")
end_step_percent: float = Field(default=1, ge=0, le=1,
description="When the ControlNet is last applied (% of total steps)")
control_mode: CONTROLNET_MODE_VALUES = Field(default="balanced", description="The control mode used")
resize_mode: CONTROLNET_RESIZE_VALUES = Field(default="just_resize", description="The resize mode used")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "ControlNet",
"tags": ["controlnet", "latents"],
"type_hints": {
"model": "model",
"control": "control",
# "cfg_scale": "float",
"cfg_scale": "number",
"control_weight": "float",
},
},
}
# Inputs
image: ImageField = InputField(description="The control image")
control_model: ControlNetModelField = InputField(
default="lllyasviel/sd-controlnet-canny", description=FieldDescriptions.controlnet_model, input=Input.Direct
)
control_weight: Union[float, List[float]] = InputField(
default=1.0, description="The weight given to the ControlNet", ui_type=UIType.Float
)
begin_step_percent: float = InputField(
default=0, ge=-1, le=2, description="When the ControlNet is first applied (% of total steps)"
)
end_step_percent: float = InputField(
default=1, ge=0, le=1, description="When the ControlNet is last applied (% of total steps)"
)
control_mode: CONTROLNET_MODE_VALUES = InputField(default="balanced", description="The control mode used")
resize_mode: CONTROLNET_RESIZE_VALUES = InputField(default="just_resize", description="The resize mode used")
def invoke(self, context: InvocationContext) -> ControlOutput:
return ControlOutput(
@ -204,19 +134,13 @@ class ControlNetInvocation(BaseInvocation):
)
class ImageProcessorInvocation(BaseInvocation, PILInvocationConfig):
class ImageProcessorInvocation(BaseInvocation):
"""Base class for invocations that preprocess images for ControlNet"""
# fmt: off
type: Literal["image_processor"] = "image_processor"
# Inputs
image: ImageField = Field(default=None, description="The image to process")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Image Processor", "tags": ["image", "processor"]},
}
# Inputs
image: ImageField = InputField(description="The image to process")
def run_processor(self, image):
# superclass just passes through image without processing
@ -255,20 +179,20 @@ class ImageProcessorInvocation(BaseInvocation, PILInvocationConfig):
)
class CannyImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig):
@title("Canny Processor")
@tags("controlnet", "canny")
class CannyImageProcessorInvocation(ImageProcessorInvocation):
"""Canny edge detection for ControlNet"""
# fmt: off
type: Literal["canny_image_processor"] = "canny_image_processor"
# Input
low_threshold: int = Field(default=100, ge=0, le=255, description="The low threshold of the Canny pixel gradient (0-255)")
high_threshold: int = Field(default=200, ge=0, le=255, description="The high threshold of the Canny pixel gradient (0-255)")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Canny Processor", "tags": ["controlnet", "canny", "image", "processor"]},
}
# Input
low_threshold: int = InputField(
default=100, ge=0, le=255, description="The low threshold of the Canny pixel gradient (0-255)"
)
high_threshold: int = InputField(
default=200, ge=0, le=255, description="The high threshold of the Canny pixel gradient (0-255)"
)
def run_processor(self, image):
canny_processor = CannyDetector()
@ -276,23 +200,19 @@ class CannyImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfi
return processed_image
class HedImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig):
@title("HED (softedge) Processor")
@tags("controlnet", "hed", "softedge")
class HedImageProcessorInvocation(ImageProcessorInvocation):
"""Applies HED edge detection to image"""
# fmt: off
type: Literal["hed_image_processor"] = "hed_image_processor"
# Inputs
detect_resolution: int = Field(default=512, ge=0, description="The pixel resolution for detection")
image_resolution: int = Field(default=512, ge=0, description="The pixel resolution for the output image")
# safe not supported in controlnet_aux v0.0.3
# safe: bool = Field(default=False, description="whether to use safe mode")
scribble: bool = Field(default=False, description="Whether to use scribble mode")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Softedge(HED) Processor", "tags": ["controlnet", "softedge", "hed", "image", "processor"]},
}
# Inputs
detect_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.image_res)
# safe not supported in controlnet_aux v0.0.3
# safe: bool = InputField(default=False, description=FieldDescriptions.safe_mode)
scribble: bool = InputField(default=False, description=FieldDescriptions.scribble_mode)
def run_processor(self, image):
hed_processor = HEDdetector.from_pretrained("lllyasviel/Annotators")
@ -307,21 +227,17 @@ class HedImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig)
return processed_image
class LineartImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig):
@title("Lineart Processor")
@tags("controlnet", "lineart")
class LineartImageProcessorInvocation(ImageProcessorInvocation):
"""Applies line art processing to image"""
# fmt: off
type: Literal["lineart_image_processor"] = "lineart_image_processor"
# Inputs
detect_resolution: int = Field(default=512, ge=0, description="The pixel resolution for detection")
image_resolution: int = Field(default=512, ge=0, description="The pixel resolution for the output image")
coarse: bool = Field(default=False, description="Whether to use coarse mode")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Lineart Processor", "tags": ["controlnet", "lineart", "image", "processor"]},
}
# Inputs
detect_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.image_res)
coarse: bool = InputField(default=False, description="Whether to use coarse mode")
def run_processor(self, image):
lineart_processor = LineartDetector.from_pretrained("lllyasviel/Annotators")
@ -331,23 +247,16 @@ class LineartImageProcessorInvocation(ImageProcessorInvocation, PILInvocationCon
return processed_image
class LineartAnimeImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig):
@title("Lineart Anime Processor")
@tags("controlnet", "lineart", "anime")
class LineartAnimeImageProcessorInvocation(ImageProcessorInvocation):
"""Applies line art anime processing to image"""
# fmt: off
type: Literal["lineart_anime_image_processor"] = "lineart_anime_image_processor"
# Inputs
detect_resolution: int = Field(default=512, ge=0, description="The pixel resolution for detection")
image_resolution: int = Field(default=512, ge=0, description="The pixel resolution for the output image")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "Lineart Anime Processor",
"tags": ["controlnet", "lineart", "anime", "image", "processor"],
},
}
# Inputs
detect_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.image_res)
def run_processor(self, image):
processor = LineartAnimeDetector.from_pretrained("lllyasviel/Annotators")
@ -359,21 +268,17 @@ class LineartAnimeImageProcessorInvocation(ImageProcessorInvocation, PILInvocati
return processed_image
class OpenposeImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig):
@title("Openpose Processor")
@tags("controlnet", "openpose", "pose")
class OpenposeImageProcessorInvocation(ImageProcessorInvocation):
"""Applies Openpose processing to image"""
# fmt: off
type: Literal["openpose_image_processor"] = "openpose_image_processor"
# Inputs
hand_and_face: bool = Field(default=False, description="Whether to use hands and face mode")
detect_resolution: int = Field(default=512, ge=0, description="The pixel resolution for detection")
image_resolution: int = Field(default=512, ge=0, description="The pixel resolution for the output image")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Openpose Processor", "tags": ["controlnet", "openpose", "image", "processor"]},
}
# Inputs
hand_and_face: bool = InputField(default=False, description="Whether to use hands and face mode")
detect_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.image_res)
def run_processor(self, image):
openpose_processor = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
@ -386,22 +291,18 @@ class OpenposeImageProcessorInvocation(ImageProcessorInvocation, PILInvocationCo
return processed_image
class MidasDepthImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig):
@title("Midas (Depth) Processor")
@tags("controlnet", "midas", "depth")
class MidasDepthImageProcessorInvocation(ImageProcessorInvocation):
"""Applies Midas depth processing to image"""
# fmt: off
type: Literal["midas_depth_image_processor"] = "midas_depth_image_processor"
# Inputs
a_mult: float = Field(default=2.0, ge=0, description="Midas parameter `a_mult` (a = a_mult * PI)")
bg_th: float = Field(default=0.1, ge=0, description="Midas parameter `bg_th`")
# depth_and_normal not supported in controlnet_aux v0.0.3
# depth_and_normal: bool = Field(default=False, description="whether to use depth and normal mode")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Midas (Depth) Processor", "tags": ["controlnet", "midas", "depth", "image", "processor"]},
}
# Inputs
a_mult: float = InputField(default=2.0, ge=0, description="Midas parameter `a_mult` (a = a_mult * PI)")
bg_th: float = InputField(default=0.1, ge=0, description="Midas parameter `bg_th`")
# depth_and_normal not supported in controlnet_aux v0.0.3
# depth_and_normal: bool = InputField(default=False, description="whether to use depth and normal mode")
def run_processor(self, image):
midas_processor = MidasDetector.from_pretrained("lllyasviel/Annotators")
@ -415,20 +316,16 @@ class MidasDepthImageProcessorInvocation(ImageProcessorInvocation, PILInvocation
return processed_image
class NormalbaeImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig):
@title("Normal BAE Processor")
@tags("controlnet", "normal", "bae")
class NormalbaeImageProcessorInvocation(ImageProcessorInvocation):
"""Applies NormalBae processing to image"""
# fmt: off
type: Literal["normalbae_image_processor"] = "normalbae_image_processor"
# Inputs
detect_resolution: int = Field(default=512, ge=0, description="The pixel resolution for detection")
image_resolution: int = Field(default=512, ge=0, description="The pixel resolution for the output image")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Normal BAE Processor", "tags": ["controlnet", "normal", "bae", "image", "processor"]},
}
# Inputs
detect_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.image_res)
def run_processor(self, image):
normalbae_processor = NormalBaeDetector.from_pretrained("lllyasviel/Annotators")
@ -438,22 +335,18 @@ class NormalbaeImageProcessorInvocation(ImageProcessorInvocation, PILInvocationC
return processed_image
class MlsdImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig):
@title("MLSD Processor")
@tags("controlnet", "mlsd")
class MlsdImageProcessorInvocation(ImageProcessorInvocation):
"""Applies MLSD processing to image"""
# fmt: off
type: Literal["mlsd_image_processor"] = "mlsd_image_processor"
# Inputs
detect_resolution: int = Field(default=512, ge=0, description="The pixel resolution for detection")
image_resolution: int = Field(default=512, ge=0, description="The pixel resolution for the output image")
thr_v: float = Field(default=0.1, ge=0, description="MLSD parameter `thr_v`")
thr_d: float = Field(default=0.1, ge=0, description="MLSD parameter `thr_d`")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "MLSD Processor", "tags": ["controlnet", "mlsd", "image", "processor"]},
}
# Inputs
detect_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.image_res)
thr_v: float = InputField(default=0.1, ge=0, description="MLSD parameter `thr_v`")
thr_d: float = InputField(default=0.1, ge=0, description="MLSD parameter `thr_d`")
def run_processor(self, image):
mlsd_processor = MLSDdetector.from_pretrained("lllyasviel/Annotators")
@ -467,22 +360,18 @@ class MlsdImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig
return processed_image
class PidiImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig):
@title("PIDI Processor")
@tags("controlnet", "pidi")
class PidiImageProcessorInvocation(ImageProcessorInvocation):
"""Applies PIDI processing to image"""
# fmt: off
type: Literal["pidi_image_processor"] = "pidi_image_processor"
# Inputs
detect_resolution: int = Field(default=512, ge=0, description="The pixel resolution for detection")
image_resolution: int = Field(default=512, ge=0, description="The pixel resolution for the output image")
safe: bool = Field(default=False, description="Whether to use safe mode")
scribble: bool = Field(default=False, description="Whether to use scribble mode")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "PIDI Processor", "tags": ["controlnet", "pidi", "image", "processor"]},
}
# Inputs
detect_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.image_res)
safe: bool = InputField(default=False, description=FieldDescriptions.safe_mode)
scribble: bool = InputField(default=False, description=FieldDescriptions.scribble_mode)
def run_processor(self, image):
pidi_processor = PidiNetDetector.from_pretrained("lllyasviel/Annotators")
@ -496,26 +385,19 @@ class PidiImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig
return processed_image
class ContentShuffleImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig):
@title("Content Shuffle Processor")
@tags("controlnet", "contentshuffle")
class ContentShuffleImageProcessorInvocation(ImageProcessorInvocation):
"""Applies content shuffle processing to image"""
# fmt: off
type: Literal["content_shuffle_image_processor"] = "content_shuffle_image_processor"
# Inputs
detect_resolution: int = Field(default=512, ge=0, description="The pixel resolution for detection")
image_resolution: int = Field(default=512, ge=0, description="The pixel resolution for the output image")
h: Optional[int] = Field(default=512, ge=0, description="Content shuffle `h` parameter")
w: Optional[int] = Field(default=512, ge=0, description="Content shuffle `w` parameter")
f: Optional[int] = Field(default=256, ge=0, description="Content shuffle `f` parameter")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "Content Shuffle Processor",
"tags": ["controlnet", "contentshuffle", "image", "processor"],
},
}
# Inputs
detect_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.image_res)
h: Optional[int] = InputField(default=512, ge=0, description="Content shuffle `h` parameter")
w: Optional[int] = InputField(default=512, ge=0, description="Content shuffle `w` parameter")
f: Optional[int] = InputField(default=256, ge=0, description="Content shuffle `f` parameter")
def run_processor(self, image):
content_shuffle_processor = ContentShuffleDetector()
@ -531,17 +413,12 @@ class ContentShuffleImageProcessorInvocation(ImageProcessorInvocation, PILInvoca
# should work with controlnet_aux >= 0.0.4 and timm <= 0.6.13
class ZoeDepthImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig):
@title("Zoe (Depth) Processor")
@tags("controlnet", "zoe", "depth")
class ZoeDepthImageProcessorInvocation(ImageProcessorInvocation):
"""Applies Zoe depth processing to image"""
# fmt: off
type: Literal["zoe_depth_image_processor"] = "zoe_depth_image_processor"
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Zoe (Depth) Processor", "tags": ["controlnet", "zoe", "depth", "image", "processor"]},
}
def run_processor(self, image):
zoe_depth_processor = ZoeDetector.from_pretrained("lllyasviel/Annotators")
@ -549,20 +426,16 @@ class ZoeDepthImageProcessorInvocation(ImageProcessorInvocation, PILInvocationCo
return processed_image
class MediapipeFaceProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig):
@title("Mediapipe Face Processor")
@tags("controlnet", "mediapipe", "face")
class MediapipeFaceProcessorInvocation(ImageProcessorInvocation):
"""Applies mediapipe face processing to image"""
# fmt: off
type: Literal["mediapipe_face_processor"] = "mediapipe_face_processor"
# Inputs
max_faces: int = Field(default=1, ge=1, description="Maximum number of faces to detect")
min_confidence: float = Field(default=0.5, ge=0, le=1, description="Minimum confidence for face detection")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Mediapipe Processor", "tags": ["controlnet", "mediapipe", "image", "processor"]},
}
# Inputs
max_faces: int = InputField(default=1, ge=1, description="Maximum number of faces to detect")
min_confidence: float = InputField(default=0.5, ge=0, le=1, description="Minimum confidence for face detection")
def run_processor(self, image):
# MediaPipeFaceDetector throws an error if image has alpha channel
@ -574,23 +447,19 @@ class MediapipeFaceProcessorInvocation(ImageProcessorInvocation, PILInvocationCo
return processed_image
class LeresImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig):
@title("Leres (Depth) Processor")
@tags("controlnet", "leres", "depth")
class LeresImageProcessorInvocation(ImageProcessorInvocation):
"""Applies leres processing to image"""
# fmt: off
type: Literal["leres_image_processor"] = "leres_image_processor"
# Inputs
thr_a: float = Field(default=0, description="Leres parameter `thr_a`")
thr_b: float = Field(default=0, description="Leres parameter `thr_b`")
boost: bool = Field(default=False, description="Whether to use boost mode")
detect_resolution: int = Field(default=512, ge=0, description="The pixel resolution for detection")
image_resolution: int = Field(default=512, ge=0, description="The pixel resolution for the output image")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Leres (Depth) Processor", "tags": ["controlnet", "leres", "depth", "image", "processor"]},
}
# Inputs
thr_a: float = InputField(default=0, description="Leres parameter `thr_a`")
thr_b: float = InputField(default=0, description="Leres parameter `thr_b`")
boost: bool = InputField(default=False, description="Whether to use boost mode")
detect_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=0, description=FieldDescriptions.image_res)
def run_processor(self, image):
leres_processor = LeresDetector.from_pretrained("lllyasviel/Annotators")
@ -605,21 +474,16 @@ class LeresImageProcessorInvocation(ImageProcessorInvocation, PILInvocationConfi
return processed_image
class TileResamplerProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig):
# fmt: off
type: Literal["tile_image_processor"] = "tile_image_processor"
# Inputs
#res: int = Field(default=512, ge=0, le=1024, description="The pixel resolution for each tile")
down_sampling_rate: float = Field(default=1.0, ge=1.0, le=8.0, description="Down sampling rate")
# fmt: on
@title("Tile Resample Processor")
@tags("controlnet", "tile")
class TileResamplerProcessorInvocation(ImageProcessorInvocation):
"""Tile resampler processor"""
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "Tile Resample Processor",
"tags": ["controlnet", "tile", "resample", "image", "processor"],
},
}
type: Literal["tile_image_processor"] = "tile_image_processor"
# Inputs
# res: int = InputField(default=512, ge=0, le=1024, description="The pixel resolution for each tile")
down_sampling_rate: float = InputField(default=1.0, ge=1.0, le=8.0, description="Down sampling rate")
# tile_resample copied from sd-webui-controlnet/scripts/processor.py
def tile_resample(
@ -648,20 +512,12 @@ class TileResamplerProcessorInvocation(ImageProcessorInvocation, PILInvocationCo
return processed_image
class SegmentAnythingProcessorInvocation(ImageProcessorInvocation, PILInvocationConfig):
@title("Segment Anything Processor")
@tags("controlnet", "segmentanything")
class SegmentAnythingProcessorInvocation(ImageProcessorInvocation):
"""Applies segment anything processing to image"""
# fmt: off
type: Literal["segment_anything_processor"] = "segment_anything_processor"
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "Segment Anything Processor",
"tags": ["controlnet", "segment", "anything", "sam", "image", "processor"],
},
}
def run_processor(self, image):
# segment_anything_processor = SamDetector.from_pretrained("ybelkada/segment-anything", subfolder="checkpoints")

View File

@ -5,40 +5,22 @@ from typing import Literal
import cv2 as cv
import numpy
from PIL import Image, ImageOps
from pydantic import BaseModel, Field
from invokeai.app.invocations.primitives import ImageField, ImageOutput
from invokeai.app.models.image import ImageCategory, ImageField, ResourceOrigin
from .baseinvocation import BaseInvocation, InvocationContext, InvocationConfig
from .image import ImageOutput
from invokeai.app.models.image import ImageCategory, ResourceOrigin
from .baseinvocation import BaseInvocation, InputField, InvocationContext, tags, title
class CvInvocationConfig(BaseModel):
"""Helper class to provide all OpenCV invocations with additional config"""
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"tags": ["cv", "image"],
},
}
class CvInpaintInvocation(BaseInvocation, CvInvocationConfig):
@title("OpenCV Inpaint")
@tags("opencv", "inpaint")
class CvInpaintInvocation(BaseInvocation):
"""Simple inpaint using opencv."""
# fmt: off
type: Literal["cv_inpaint"] = "cv_inpaint"
# Inputs
image: ImageField = Field(default=None, description="The image to inpaint")
mask: ImageField = Field(default=None, description="The mask to use when inpainting")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "OpenCV Inpaint", "tags": ["opencv", "inpaint"]},
}
image: ImageField = InputField(description="The image to inpaint")
mask: ImageField = InputField(description="The mask to use when inpainting")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)

View File

@ -1,251 +0,0 @@
# Copyright (c) 2022 Kyle Schouviller (https://github.com/kyle0654)
from functools import partial
from typing import Literal, Optional, get_args
import torch
from pydantic import Field
from invokeai.app.models.image import ColorField, ImageCategory, ImageField, ResourceOrigin
from invokeai.app.util.misc import SEED_MAX, get_random_seed
from invokeai.backend.generator.inpaint import infill_methods
from ...backend.generator import Inpaint, InvokeAIGenerator
from ...backend.stable_diffusion import PipelineIntermediateState
from ..util.step_callback import stable_diffusion_step_callback
from .baseinvocation import BaseInvocation, InvocationConfig, InvocationContext
from .image import ImageOutput
from ...backend.model_management.lora import ModelPatcher
from ...backend.stable_diffusion.diffusers_pipeline import StableDiffusionGeneratorPipeline
from .model import UNetField, VaeField
from .compel import ConditioningField
from contextlib import contextmanager, ExitStack, ContextDecorator
SAMPLER_NAME_VALUES = Literal[tuple(InvokeAIGenerator.schedulers())]
INFILL_METHODS = Literal[tuple(infill_methods())]
DEFAULT_INFILL_METHOD = "patchmatch" if "patchmatch" in get_args(INFILL_METHODS) else "tile"
from .latent import get_scheduler
class OldModelContext(ContextDecorator):
model: StableDiffusionGeneratorPipeline
def __init__(self, model):
self.model = model
def __enter__(self):
return self.model
def __exit__(self, *exc):
return False
class OldModelInfo:
name: str
hash: str
context: OldModelContext
def __init__(self, name: str, hash: str, model: StableDiffusionGeneratorPipeline):
self.name = name
self.hash = hash
self.context = OldModelContext(
model=model,
)
class InpaintInvocation(BaseInvocation):
"""Generates an image using inpaint."""
type: Literal["inpaint"] = "inpaint"
positive_conditioning: Optional[ConditioningField] = Field(description="Positive conditioning for generation")
negative_conditioning: Optional[ConditioningField] = Field(description="Negative conditioning for generation")
seed: int = Field(
ge=0, le=SEED_MAX, description="The seed to use (omit for random)", default_factory=get_random_seed
)
steps: int = Field(default=30, gt=0, description="The number of steps to use to generate the image")
width: int = Field(
default=512,
multiple_of=8,
gt=0,
description="The width of the resulting image",
)
height: int = Field(
default=512,
multiple_of=8,
gt=0,
description="The height of the resulting image",
)
cfg_scale: float = Field(
default=7.5,
ge=1,
description="The Classifier-Free Guidance, higher values may result in a result closer to the prompt",
)
scheduler: SAMPLER_NAME_VALUES = Field(default="euler", description="The scheduler to use")
unet: UNetField = Field(default=None, description="UNet model")
vae: VaeField = Field(default=None, description="Vae model")
# Inputs
image: Optional[ImageField] = Field(description="The input image")
strength: float = Field(default=0.75, gt=0, le=1, description="The strength of the original image")
fit: bool = Field(
default=True,
description="Whether or not the result should be fit to the aspect ratio of the input image",
)
# Inputs
mask: Optional[ImageField] = Field(description="The mask")
seam_size: int = Field(default=96, ge=1, description="The seam inpaint size (px)")
seam_blur: int = Field(default=16, ge=0, description="The seam inpaint blur radius (px)")
seam_strength: float = Field(default=0.75, gt=0, le=1, description="The seam inpaint strength")
seam_steps: int = Field(default=30, ge=1, description="The number of steps to use for seam inpaint")
tile_size: int = Field(default=32, ge=1, description="The tile infill method size (px)")
infill_method: INFILL_METHODS = Field(
default=DEFAULT_INFILL_METHOD,
description="The method used to infill empty regions (px)",
)
inpaint_width: Optional[int] = Field(
default=None,
multiple_of=8,
gt=0,
description="The width of the inpaint region (px)",
)
inpaint_height: Optional[int] = Field(
default=None,
multiple_of=8,
gt=0,
description="The height of the inpaint region (px)",
)
inpaint_fill: Optional[ColorField] = Field(
default=ColorField(r=127, g=127, b=127, a=255),
description="The solid infill method color",
)
inpaint_replace: float = Field(
default=0.0,
ge=0.0,
le=1.0,
description="The amount by which to replace masked areas with latent noise",
)
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {"tags": ["stable-diffusion", "image"], "title": "Inpaint"},
}
def dispatch_progress(
self,
context: InvocationContext,
source_node_id: str,
intermediate_state: PipelineIntermediateState,
) -> None:
stable_diffusion_step_callback(
context=context,
intermediate_state=intermediate_state,
node=self.dict(),
source_node_id=source_node_id,
)
def get_conditioning(self, context, unet):
positive_cond_data = context.services.latents.get(self.positive_conditioning.conditioning_name)
c = positive_cond_data.conditionings[0].embeds.to(device=unet.device, dtype=unet.dtype)
extra_conditioning_info = positive_cond_data.conditionings[0].extra_conditioning
negative_cond_data = context.services.latents.get(self.negative_conditioning.conditioning_name)
uc = negative_cond_data.conditionings[0].embeds.to(device=unet.device, dtype=unet.dtype)
return (uc, c, extra_conditioning_info)
@contextmanager
def load_model_old_way(self, context, scheduler):
def _lora_loader():
for lora in self.unet.loras:
lora_info = context.services.model_manager.get_model(
**lora.dict(exclude={"weight"}),
context=context,
)
yield (lora_info.context.model, lora.weight)
del lora_info
return
unet_info = context.services.model_manager.get_model(
**self.unet.unet.dict(),
context=context,
)
vae_info = context.services.model_manager.get_model(
**self.vae.vae.dict(),
context=context,
)
with vae_info as vae, ModelPatcher.apply_lora_unet(unet_info.context.model, _lora_loader()), unet_info as unet:
device = context.services.model_manager.mgr.cache.execution_device
dtype = context.services.model_manager.mgr.cache.precision
pipeline = StableDiffusionGeneratorPipeline(
vae=vae,
text_encoder=None,
tokenizer=None,
unet=unet,
scheduler=scheduler,
safety_checker=None,
feature_extractor=None,
requires_safety_checker=False,
precision="float16" if dtype == torch.float16 else "float32",
execution_device=device,
)
yield OldModelInfo(
name=self.unet.unet.model_name,
hash="<NO-HASH>",
model=pipeline,
)
def invoke(self, context: InvocationContext) -> ImageOutput:
image = None if self.image is None else context.services.images.get_pil_image(self.image.image_name)
mask = None if self.mask is None else context.services.images.get_pil_image(self.mask.image_name)
# Get the source node id (we are invoking the prepared node)
graph_execution_state = context.services.graph_execution_manager.get(context.graph_execution_state_id)
source_node_id = graph_execution_state.prepared_source_mapping[self.id]
scheduler = get_scheduler(
context=context,
scheduler_info=self.unet.scheduler,
scheduler_name=self.scheduler,
)
with self.load_model_old_way(context, scheduler) as model:
conditioning = self.get_conditioning(context, model.context.model.unet)
outputs = Inpaint(model).generate(
conditioning=conditioning,
scheduler=scheduler,
init_image=image,
mask_image=mask,
step_callback=partial(self.dispatch_progress, context, source_node_id),
**self.dict(
exclude={"positive_conditioning", "negative_conditioning", "scheduler", "image", "mask"}
), # Shorthand for passing all of the parameters above manually
)
# Outputs is an infinite iterator that will return a new InvokeAIGeneratorOutput object
# each time it is called. We only need the first one.
generator_output = next(outputs)
image_dto = context.services.images.create(
image=generator_output.image,
image_origin=ResourceOrigin.INTERNAL,
image_category=ImageCategory.GENERAL,
session_id=context.graph_execution_state_id,
node_id=self.id,
is_intermediate=self.is_intermediate,
)
return ImageOutput(
image=ImageField(image_name=image_dto.image_name),
width=image_dto.width,
height=image_dto.height,
)

View File

@ -1,69 +1,31 @@
# Copyright (c) 2022 Kyle Schouviller (https://github.com/kyle0654)
from pathlib import Path
from typing import Literal, Optional
import cv2
import numpy
from PIL import Image, ImageFilter, ImageOps, ImageChops
from pydantic import Field
from pathlib import Path
from typing import Union
from PIL import Image, ImageChops, ImageFilter, ImageOps
from invokeai.app.invocations.metadata import CoreMetadata
from ..models.image import (
ImageCategory,
ImageField,
ResourceOrigin,
PILInvocationConfig,
ImageOutput,
MaskOutput,
)
from .baseinvocation import (
BaseInvocation,
InvocationContext,
InvocationConfig,
)
from invokeai.backend.image_util.safety_checker import SafetyChecker
from invokeai.app.invocations.primitives import ImageField, ImageOutput
from invokeai.backend.image_util.invisible_watermark import InvisibleWatermark
from invokeai.backend.image_util.safety_checker import SafetyChecker
from ..models.image import ImageCategory, ResourceOrigin
from .baseinvocation import BaseInvocation, FieldDescriptions, InputField, InvocationContext, tags, title
class LoadImageInvocation(BaseInvocation):
"""Load an image and provide it as output."""
# fmt: off
type: Literal["load_image"] = "load_image"
# Inputs
image: Optional[ImageField] = Field(
default=None, description="The image to load"
)
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Load Image", "tags": ["image", "load"]},
}
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)
return ImageOutput(
image=ImageField(image_name=self.image.image_name),
width=image.width,
height=image.height,
)
@title("Show Image")
@tags("image")
class ShowImageInvocation(BaseInvocation):
"""Displays a provided image, and passes it forward in the pipeline."""
# Metadata
type: Literal["show_image"] = "show_image"
# Inputs
image: Optional[ImageField] = Field(default=None, description="The image to show")
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Show Image", "tags": ["image", "show"]},
}
image: ImageField = InputField(description="The image to show")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)
@ -79,24 +41,20 @@ class ShowImageInvocation(BaseInvocation):
)
class ImageCropInvocation(BaseInvocation, PILInvocationConfig):
@title("Crop Image")
@tags("image", "crop")
class ImageCropInvocation(BaseInvocation):
"""Crops an image to a specified box. The box can be outside of the image."""
# fmt: off
# Metadata
type: Literal["img_crop"] = "img_crop"
# Inputs
image: Optional[ImageField] = Field(default=None, description="The image to crop")
x: int = Field(default=0, description="The left x coordinate of the crop rectangle")
y: int = Field(default=0, description="The top y coordinate of the crop rectangle")
width: int = Field(default=512, gt=0, description="The width of the crop rectangle")
height: int = Field(default=512, gt=0, description="The height of the crop rectangle")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Crop Image", "tags": ["image", "crop"]},
}
image: ImageField = InputField(description="The image to crop")
x: int = InputField(default=0, description="The left x coordinate of the crop rectangle")
y: int = InputField(default=0, description="The top y coordinate of the crop rectangle")
width: int = InputField(default=512, gt=0, description="The width of the crop rectangle")
height: int = InputField(default=512, gt=0, description="The height of the crop rectangle")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)
@ -120,31 +78,31 @@ class ImageCropInvocation(BaseInvocation, PILInvocationConfig):
)
class ImagePasteInvocation(BaseInvocation, PILInvocationConfig):
@title("Paste Image")
@tags("image", "paste")
class ImagePasteInvocation(BaseInvocation):
"""Pastes an image into another image."""
# fmt: off
# Metadata
type: Literal["img_paste"] = "img_paste"
# Inputs
base_image: Optional[ImageField] = Field(default=None, description="The base image")
image: Optional[ImageField] = Field(default=None, description="The image to paste")
mask: Optional[ImageField] = Field(default=None, description="The mask to use when pasting")
x: int = Field(default=0, description="The left x coordinate at which to paste the image")
y: int = Field(default=0, description="The top y coordinate at which to paste the image")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Paste Image", "tags": ["image", "paste"]},
}
base_image: ImageField = InputField(description="The base image")
image: ImageField = InputField(description="The image to paste")
mask: Optional[ImageField] = InputField(
default=None,
description="The mask to use when pasting",
)
x: int = InputField(default=0, description="The left x coordinate at which to paste the image")
y: int = InputField(default=0, description="The top y coordinate at which to paste the image")
def invoke(self, context: InvocationContext) -> ImageOutput:
base_image = context.services.images.get_pil_image(self.base_image.image_name)
image = context.services.images.get_pil_image(self.image.image_name)
mask = (
None if self.mask is None else ImageOps.invert(context.services.images.get_pil_image(self.mask.image_name))
)
mask = None
if self.mask is not None:
mask = context.services.images.get_pil_image(self.mask.image_name)
mask = ImageOps.invert(mask.convert("L"))
# TODO: probably shouldn't invert mask here... should user be required to do it?
min_x = min(0, self.x)
@ -172,23 +130,19 @@ class ImagePasteInvocation(BaseInvocation, PILInvocationConfig):
)
class MaskFromAlphaInvocation(BaseInvocation, PILInvocationConfig):
@title("Mask from Alpha")
@tags("image", "mask")
class MaskFromAlphaInvocation(BaseInvocation):
"""Extracts the alpha channel of an image as a mask."""
# fmt: off
# Metadata
type: Literal["tomask"] = "tomask"
# Inputs
image: Optional[ImageField] = Field(default=None, description="The image to create the mask from")
invert: bool = Field(default=False, description="Whether or not to invert the mask")
# fmt: on
image: ImageField = InputField(description="The image to create the mask from")
invert: bool = InputField(default=False, description="Whether or not to invert the mask")
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Mask From Alpha", "tags": ["image", "mask", "alpha"]},
}
def invoke(self, context: InvocationContext) -> MaskOutput:
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)
image_mask = image.split()[-1]
@ -204,28 +158,24 @@ class MaskFromAlphaInvocation(BaseInvocation, PILInvocationConfig):
is_intermediate=self.is_intermediate,
)
return MaskOutput(
mask=ImageField(image_name=image_dto.image_name),
return ImageOutput(
image=ImageField(image_name=image_dto.image_name),
width=image_dto.width,
height=image_dto.height,
)
class ImageMultiplyInvocation(BaseInvocation, PILInvocationConfig):
@title("Multiply Images")
@tags("image", "multiply")
class ImageMultiplyInvocation(BaseInvocation):
"""Multiplies two images together using `PIL.ImageChops.multiply()`."""
# fmt: off
# Metadata
type: Literal["img_mul"] = "img_mul"
# Inputs
image1: Optional[ImageField] = Field(default=None, description="The first image to multiply")
image2: Optional[ImageField] = Field(default=None, description="The second image to multiply")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Multiply Images", "tags": ["image", "multiply"]},
}
image1: ImageField = InputField(description="The first image to multiply")
image2: ImageField = InputField(description="The second image to multiply")
def invoke(self, context: InvocationContext) -> ImageOutput:
image1 = context.services.images.get_pil_image(self.image1.image_name)
@ -252,21 +202,17 @@ class ImageMultiplyInvocation(BaseInvocation, PILInvocationConfig):
IMAGE_CHANNELS = Literal["A", "R", "G", "B"]
class ImageChannelInvocation(BaseInvocation, PILInvocationConfig):
@title("Extract Image Channel")
@tags("image", "channel")
class ImageChannelInvocation(BaseInvocation):
"""Gets a channel from an image."""
# fmt: off
# Metadata
type: Literal["img_chan"] = "img_chan"
# Inputs
image: Optional[ImageField] = Field(default=None, description="The image to get the channel from")
channel: IMAGE_CHANNELS = Field(default="A", description="The channel to get")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Image Channel", "tags": ["image", "channel"]},
}
image: ImageField = InputField(description="The image to get the channel from")
channel: IMAGE_CHANNELS = InputField(default="A", description="The channel to get")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)
@ -292,21 +238,17 @@ class ImageChannelInvocation(BaseInvocation, PILInvocationConfig):
IMAGE_MODES = Literal["L", "RGB", "RGBA", "CMYK", "YCbCr", "LAB", "HSV", "I", "F"]
class ImageConvertInvocation(BaseInvocation, PILInvocationConfig):
@title("Convert Image Mode")
@tags("image", "convert")
class ImageConvertInvocation(BaseInvocation):
"""Converts an image to a different mode."""
# fmt: off
# Metadata
type: Literal["img_conv"] = "img_conv"
# Inputs
image: Optional[ImageField] = Field(default=None, description="The image to convert")
mode: IMAGE_MODES = Field(default="L", description="The mode to convert to")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Convert Image", "tags": ["image", "convert"]},
}
image: ImageField = InputField(description="The image to convert")
mode: IMAGE_MODES = InputField(default="L", description="The mode to convert to")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)
@ -329,22 +271,19 @@ class ImageConvertInvocation(BaseInvocation, PILInvocationConfig):
)
class ImageBlurInvocation(BaseInvocation, PILInvocationConfig):
@title("Blur Image")
@tags("image", "blur")
class ImageBlurInvocation(BaseInvocation):
"""Blurs an image"""
# fmt: off
# Metadata
type: Literal["img_blur"] = "img_blur"
# Inputs
image: Optional[ImageField] = Field(default=None, description="The image to blur")
radius: float = Field(default=8.0, ge=0, description="The blur radius")
blur_type: Literal["gaussian", "box"] = Field(default="gaussian", description="The type of blur")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Blur Image", "tags": ["image", "blur"]},
}
image: ImageField = InputField(description="The image to blur")
radius: float = InputField(default=8.0, ge=0, description="The blur radius")
# Metadata
blur_type: Literal["gaussian", "box"] = InputField(default="gaussian", description="The type of blur")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)
@ -390,23 +329,19 @@ PIL_RESAMPLING_MAP = {
}
class ImageResizeInvocation(BaseInvocation, PILInvocationConfig):
@title("Resize Image")
@tags("image", "resize")
class ImageResizeInvocation(BaseInvocation):
"""Resizes an image to specific dimensions"""
# fmt: off
# Metadata
type: Literal["img_resize"] = "img_resize"
# Inputs
image: Optional[ImageField] = Field(default=None, description="The image to resize")
width: Union[int, None] = Field(ge=64, multiple_of=8, description="The width to resize to (px)")
height: Union[int, None] = Field(ge=64, multiple_of=8, description="The height to resize to (px)")
resample_mode: PIL_RESAMPLING_MODES = Field(default="bicubic", description="The resampling mode")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Resize Image", "tags": ["image", "resize"]},
}
image: ImageField = InputField(description="The image to resize")
width: int = InputField(default=512, ge=64, multiple_of=8, description="The width to resize to (px)")
height: int = InputField(default=512, ge=64, multiple_of=8, description="The height to resize to (px)")
resample_mode: PIL_RESAMPLING_MODES = InputField(default="bicubic", description="The resampling mode")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)
@ -434,22 +369,22 @@ class ImageResizeInvocation(BaseInvocation, PILInvocationConfig):
)
class ImageScaleInvocation(BaseInvocation, PILInvocationConfig):
@title("Scale Image")
@tags("image", "scale")
class ImageScaleInvocation(BaseInvocation):
"""Scales an image by a factor"""
# fmt: off
# Metadata
type: Literal["img_scale"] = "img_scale"
# Inputs
image: Optional[ImageField] = Field(default=None, description="The image to scale")
scale_factor: Optional[float] = Field(default=2.0, gt=0, description="The factor by which to scale the image")
resample_mode: PIL_RESAMPLING_MODES = Field(default="bicubic", description="The resampling mode")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Scale Image", "tags": ["image", "scale"]},
}
image: ImageField = InputField(description="The image to scale")
scale_factor: float = InputField(
default=2.0,
gt=0,
description="The factor by which to scale the image",
)
resample_mode: PIL_RESAMPLING_MODES = InputField(default="bicubic", description="The resampling mode")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)
@ -479,28 +414,24 @@ class ImageScaleInvocation(BaseInvocation, PILInvocationConfig):
)
class ImageLerpInvocation(BaseInvocation, PILInvocationConfig):
@title("Lerp Image")
@tags("image", "lerp")
class ImageLerpInvocation(BaseInvocation):
"""Linear interpolation of all pixels of an image"""
# fmt: off
# Metadata
type: Literal["img_lerp"] = "img_lerp"
# Inputs
image: Optional[ImageField] = Field(default=None, description="The image to lerp")
min: int = Field(default=0, ge=0, le=255, description="The minimum output value")
max: int = Field(default=255, ge=0, le=255, description="The maximum output value")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Image Linear Interpolation", "tags": ["image", "linear", "interpolation", "lerp"]},
}
image: ImageField = InputField(description="The image to lerp")
min: int = InputField(default=0, ge=0, le=255, description="The minimum output value")
max: int = InputField(default=255, ge=0, le=255, description="The maximum output value")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)
image_arr = numpy.asarray(image, dtype=numpy.float32) / 255
image_arr = image_arr * (self.max - self.min) + self.max
image_arr = image_arr * (self.max - self.min) + self.min
lerp_image = Image.fromarray(numpy.uint8(image_arr))
@ -520,25 +451,18 @@ class ImageLerpInvocation(BaseInvocation, PILInvocationConfig):
)
class ImageInverseLerpInvocation(BaseInvocation, PILInvocationConfig):
@title("Inverse Lerp Image")
@tags("image", "ilerp")
class ImageInverseLerpInvocation(BaseInvocation):
"""Inverse linear interpolation of all pixels of an image"""
# fmt: off
# Metadata
type: Literal["img_ilerp"] = "img_ilerp"
# Inputs
image: Optional[ImageField] = Field(default=None, description="The image to lerp")
min: int = Field(default=0, ge=0, le=255, description="The minimum input value")
max: int = Field(default=255, ge=0, le=255, description="The maximum input value")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "Image Inverse Linear Interpolation",
"tags": ["image", "linear", "interpolation", "inverse"],
},
}
image: ImageField = InputField(description="The image to lerp")
min: int = InputField(default=0, ge=0, le=255, description="The minimum input value")
max: int = InputField(default=255, ge=0, le=255, description="The maximum input value")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)
@ -564,21 +488,19 @@ class ImageInverseLerpInvocation(BaseInvocation, PILInvocationConfig):
)
class ImageNSFWBlurInvocation(BaseInvocation, PILInvocationConfig):
@title("Blur NSFW Image")
@tags("image", "nsfw")
class ImageNSFWBlurInvocation(BaseInvocation):
"""Add blur to NSFW-flagged images"""
# fmt: off
# Metadata
type: Literal["img_nsfw"] = "img_nsfw"
# Inputs
image: Optional[ImageField] = Field(default=None, description="The image to check")
metadata: Optional[CoreMetadata] = Field(default=None, description="Optional core metadata to be written to the image")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Blur NSFW Images", "tags": ["image", "nsfw", "checker"]},
}
image: ImageField = InputField(description="The image to check")
metadata: Optional[CoreMetadata] = InputField(
default=None, description=FieldDescriptions.core_metadata, ui_hidden=True
)
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)
@ -615,22 +537,20 @@ class ImageNSFWBlurInvocation(BaseInvocation, PILInvocationConfig):
return caution.resize((caution.width // 2, caution.height // 2))
class ImageWatermarkInvocation(BaseInvocation, PILInvocationConfig):
@title("Add Invisible Watermark")
@tags("image", "watermark")
class ImageWatermarkInvocation(BaseInvocation):
"""Add an invisible watermark to an image"""
# fmt: off
# Metadata
type: Literal["img_watermark"] = "img_watermark"
# Inputs
image: Optional[ImageField] = Field(default=None, description="The image to check")
text: str = Field(default='InvokeAI', description="Watermark text")
metadata: Optional[CoreMetadata] = Field(default=None, description="Optional core metadata to be written to the image")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Add Invisible Watermark", "tags": ["image", "watermark", "invisible"]},
}
image: ImageField = InputField(description="The image to check")
text: str = InputField(default="InvokeAI", description="Watermark text")
metadata: Optional[CoreMetadata] = InputField(
default=None, description=FieldDescriptions.core_metadata, ui_hidden=True
)
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)
@ -650,3 +570,334 @@ class ImageWatermarkInvocation(BaseInvocation, PILInvocationConfig):
width=image_dto.width,
height=image_dto.height,
)
@title("Mask Edge")
@tags("image", "mask", "inpaint")
class MaskEdgeInvocation(BaseInvocation):
"""Applies an edge mask to an image"""
type: Literal["mask_edge"] = "mask_edge"
# Inputs
image: ImageField = InputField(description="The image to apply the mask to")
edge_size: int = InputField(description="The size of the edge")
edge_blur: int = InputField(description="The amount of blur on the edge")
low_threshold: int = InputField(description="First threshold for the hysteresis procedure in Canny edge detection")
high_threshold: int = InputField(
description="Second threshold for the hysteresis procedure in Canny edge detection"
)
def invoke(self, context: InvocationContext) -> ImageOutput:
mask = context.services.images.get_pil_image(self.image.image_name)
npimg = numpy.asarray(mask, dtype=numpy.uint8)
npgradient = numpy.uint8(255 * (1.0 - numpy.floor(numpy.abs(0.5 - numpy.float32(npimg) / 255.0) * 2.0)))
npedge = cv2.Canny(npimg, threshold1=self.low_threshold, threshold2=self.high_threshold)
npmask = npgradient + npedge
npmask = cv2.dilate(npmask, numpy.ones((3, 3), numpy.uint8), iterations=int(self.edge_size / 2))
new_mask = Image.fromarray(npmask)
if self.edge_blur > 0:
new_mask = new_mask.filter(ImageFilter.BoxBlur(self.edge_blur))
new_mask = ImageOps.invert(new_mask)
image_dto = context.services.images.create(
image=new_mask,
image_origin=ResourceOrigin.INTERNAL,
image_category=ImageCategory.MASK,
node_id=self.id,
session_id=context.graph_execution_state_id,
is_intermediate=self.is_intermediate,
)
return ImageOutput(
image=ImageField(image_name=image_dto.image_name),
width=image_dto.width,
height=image_dto.height,
)
@title("Combine Mask")
@tags("image", "mask", "multiply")
class MaskCombineInvocation(BaseInvocation):
"""Combine two masks together by multiplying them using `PIL.ImageChops.multiply()`."""
type: Literal["mask_combine"] = "mask_combine"
# Inputs
mask1: ImageField = InputField(description="The first mask to combine")
mask2: ImageField = InputField(description="The second image to combine")
def invoke(self, context: InvocationContext) -> ImageOutput:
mask1 = context.services.images.get_pil_image(self.mask1.image_name).convert("L")
mask2 = context.services.images.get_pil_image(self.mask2.image_name).convert("L")
combined_mask = ImageChops.multiply(mask1, mask2)
image_dto = context.services.images.create(
image=combined_mask,
image_origin=ResourceOrigin.INTERNAL,
image_category=ImageCategory.GENERAL,
node_id=self.id,
session_id=context.graph_execution_state_id,
is_intermediate=self.is_intermediate,
)
return ImageOutput(
image=ImageField(image_name=image_dto.image_name),
width=image_dto.width,
height=image_dto.height,
)
@title("Color Correct")
@tags("image", "color")
class ColorCorrectInvocation(BaseInvocation):
"""
Shifts the colors of a target image to match the reference image, optionally
using a mask to only color-correct certain regions of the target image.
"""
type: Literal["color_correct"] = "color_correct"
# Inputs
image: ImageField = InputField(description="The image to color-correct")
reference: ImageField = InputField(description="Reference image for color-correction")
mask: Optional[ImageField] = InputField(default=None, description="Mask to use when applying color-correction")
mask_blur_radius: float = InputField(default=8, description="Mask blur radius")
def invoke(self, context: InvocationContext) -> ImageOutput:
pil_init_mask = None
if self.mask is not None:
pil_init_mask = context.services.images.get_pil_image(self.mask.image_name).convert("L")
init_image = context.services.images.get_pil_image(self.reference.image_name)
result = context.services.images.get_pil_image(self.image.image_name).convert("RGBA")
# if init_image is None or init_mask is None:
# return result
# Get the original alpha channel of the mask if there is one.
# Otherwise it is some other black/white image format ('1', 'L' or 'RGB')
# pil_init_mask = (
# init_mask.getchannel("A")
# if init_mask.mode == "RGBA"
# else init_mask.convert("L")
# )
pil_init_image = init_image.convert("RGBA") # Add an alpha channel if one doesn't exist
# Build an image with only visible pixels from source to use as reference for color-matching.
init_rgb_pixels = numpy.asarray(init_image.convert("RGB"), dtype=numpy.uint8)
init_a_pixels = numpy.asarray(pil_init_image.getchannel("A"), dtype=numpy.uint8)
init_mask_pixels = numpy.asarray(pil_init_mask, dtype=numpy.uint8)
# Get numpy version of result
np_image = numpy.asarray(result.convert("RGB"), dtype=numpy.uint8)
# Mask and calculate mean and standard deviation
mask_pixels = init_a_pixels * init_mask_pixels > 0
np_init_rgb_pixels_masked = init_rgb_pixels[mask_pixels, :]
np_image_masked = np_image[mask_pixels, :]
if np_init_rgb_pixels_masked.size > 0:
init_means = np_init_rgb_pixels_masked.mean(axis=0)
init_std = np_init_rgb_pixels_masked.std(axis=0)
gen_means = np_image_masked.mean(axis=0)
gen_std = np_image_masked.std(axis=0)
# Color correct
np_matched_result = np_image.copy()
np_matched_result[:, :, :] = (
(
(
(np_matched_result[:, :, :].astype(numpy.float32) - gen_means[None, None, :])
/ gen_std[None, None, :]
)
* init_std[None, None, :]
+ init_means[None, None, :]
)
.clip(0, 255)
.astype(numpy.uint8)
)
matched_result = Image.fromarray(np_matched_result, mode="RGB")
else:
matched_result = Image.fromarray(np_image, mode="RGB")
# Blur the mask out (into init image) by specified amount
if self.mask_blur_radius > 0:
nm = numpy.asarray(pil_init_mask, dtype=numpy.uint8)
nmd = cv2.erode(
nm,
kernel=numpy.ones((3, 3), dtype=numpy.uint8),
iterations=int(self.mask_blur_radius / 2),
)
pmd = Image.fromarray(nmd, mode="L")
blurred_init_mask = pmd.filter(ImageFilter.BoxBlur(self.mask_blur_radius))
else:
blurred_init_mask = pil_init_mask
multiplied_blurred_init_mask = ImageChops.multiply(blurred_init_mask, result.split()[-1])
# Paste original on color-corrected generation (using blurred mask)
matched_result.paste(init_image, (0, 0), mask=multiplied_blurred_init_mask)
image_dto = context.services.images.create(
image=matched_result,
image_origin=ResourceOrigin.INTERNAL,
image_category=ImageCategory.GENERAL,
node_id=self.id,
session_id=context.graph_execution_state_id,
is_intermediate=self.is_intermediate,
)
return ImageOutput(
image=ImageField(image_name=image_dto.image_name),
width=image_dto.width,
height=image_dto.height,
)
@title("Image Hue Adjustment")
@tags("image", "hue", "hsl")
class ImageHueAdjustmentInvocation(BaseInvocation):
"""Adjusts the Hue of an image."""
type: Literal["img_hue_adjust"] = "img_hue_adjust"
# Inputs
image: ImageField = InputField(description="The image to adjust")
hue: int = InputField(default=0, description="The degrees by which to rotate the hue, 0-360")
def invoke(self, context: InvocationContext) -> ImageOutput:
pil_image = context.services.images.get_pil_image(self.image.image_name)
# Convert image to HSV color space
hsv_image = numpy.array(pil_image.convert("HSV"))
# Convert hue from 0..360 to 0..256
hue = int(256 * ((self.hue % 360) / 360))
# Increment each hue and wrap around at 255
hsv_image[:, :, 0] = (hsv_image[:, :, 0] + hue) % 256
# Convert back to PIL format and to original color mode
pil_image = Image.fromarray(hsv_image, mode="HSV").convert("RGBA")
image_dto = context.services.images.create(
image=pil_image,
image_origin=ResourceOrigin.INTERNAL,
image_category=ImageCategory.GENERAL,
node_id=self.id,
is_intermediate=self.is_intermediate,
session_id=context.graph_execution_state_id,
)
return ImageOutput(
image=ImageField(
image_name=image_dto.image_name,
),
width=image_dto.width,
height=image_dto.height,
)
@title("Image Luminosity Adjustment")
@tags("image", "luminosity", "hsl")
class ImageLuminosityAdjustmentInvocation(BaseInvocation):
"""Adjusts the Luminosity (Value) of an image."""
type: Literal["img_luminosity_adjust"] = "img_luminosity_adjust"
# Inputs
image: ImageField = InputField(description="The image to adjust")
luminosity: float = InputField(
default=1.0, ge=0, le=1, description="The factor by which to adjust the luminosity (value)"
)
def invoke(self, context: InvocationContext) -> ImageOutput:
pil_image = context.services.images.get_pil_image(self.image.image_name)
# Convert PIL image to OpenCV format (numpy array), note color channel
# ordering is changed from RGB to BGR
image = numpy.array(pil_image.convert("RGB"))[:, :, ::-1]
# Convert image to HSV color space
hsv_image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
# Adjust the luminosity (value)
hsv_image[:, :, 2] = numpy.clip(hsv_image[:, :, 2] * self.luminosity, 0, 255)
# Convert image back to BGR color space
image = cv2.cvtColor(hsv_image, cv2.COLOR_HSV2BGR)
# Convert back to PIL format and to original color mode
pil_image = Image.fromarray(image[:, :, ::-1], "RGB").convert("RGBA")
image_dto = context.services.images.create(
image=pil_image,
image_origin=ResourceOrigin.INTERNAL,
image_category=ImageCategory.GENERAL,
node_id=self.id,
is_intermediate=self.is_intermediate,
session_id=context.graph_execution_state_id,
)
return ImageOutput(
image=ImageField(
image_name=image_dto.image_name,
),
width=image_dto.width,
height=image_dto.height,
)
@title("Image Saturation Adjustment")
@tags("image", "saturation", "hsl")
class ImageSaturationAdjustmentInvocation(BaseInvocation):
"""Adjusts the Saturation of an image."""
type: Literal["img_saturation_adjust"] = "img_saturation_adjust"
# Inputs
image: ImageField = InputField(description="The image to adjust")
saturation: float = InputField(default=1.0, ge=0, le=1, description="The factor by which to adjust the saturation")
def invoke(self, context: InvocationContext) -> ImageOutput:
pil_image = context.services.images.get_pil_image(self.image.image_name)
# Convert PIL image to OpenCV format (numpy array), note color channel
# ordering is changed from RGB to BGR
image = numpy.array(pil_image.convert("RGB"))[:, :, ::-1]
# Convert image to HSV color space
hsv_image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
# Adjust the saturation
hsv_image[:, :, 1] = numpy.clip(hsv_image[:, :, 1] * self.saturation, 0, 255)
# Convert image back to BGR color space
image = cv2.cvtColor(hsv_image, cv2.COLOR_HSV2BGR)
# Convert back to PIL format and to original color mode
pil_image = Image.fromarray(image[:, :, ::-1], "RGB").convert("RGBA")
image_dto = context.services.images.create(
image=pil_image,
image_origin=ResourceOrigin.INTERNAL,
image_category=ImageCategory.GENERAL,
node_id=self.id,
is_intermediate=self.is_intermediate,
session_id=context.graph_execution_state_id,
)
return ImageOutput(
image=ImageField(
image_name=image_dto.image_name,
),
width=image_dto.width,
height=image_dto.height,
)

View File

@ -5,18 +5,13 @@ from typing import Literal, Optional, get_args
import numpy as np
import math
from PIL import Image, ImageOps
from pydantic import Field
from invokeai.app.invocations.primitives import ImageField, ImageOutput, ColorField
from invokeai.app.invocations.image import ImageOutput
from invokeai.app.util.misc import SEED_MAX, get_random_seed
from invokeai.backend.image_util.patchmatch import PatchMatch
from ..models.image import ColorField, ImageCategory, ImageField, ResourceOrigin
from .baseinvocation import (
BaseInvocation,
InvocationConfig,
InvocationContext,
)
from ..models.image import ImageCategory, ResourceOrigin
from .baseinvocation import BaseInvocation, InputField, InvocationContext, title, tags
def infill_methods() -> list[str]:
@ -114,21 +109,20 @@ def tile_fill_missing(im: Image.Image, tile_size: int = 16, seed: Optional[int]
return si
@title("Solid Color Infill")
@tags("image", "inpaint")
class InfillColorInvocation(BaseInvocation):
"""Infills transparent areas of an image with a solid color"""
type: Literal["infill_rgba"] = "infill_rgba"
image: Optional[ImageField] = Field(default=None, description="The image to infill")
color: ColorField = Field(
# Inputs
image: ImageField = InputField(description="The image to infill")
color: ColorField = InputField(
default=ColorField(r=127, g=127, b=127, a=255),
description="The color to use to infill",
)
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Color Infill", "tags": ["image", "inpaint", "color", "infill"]},
}
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)
@ -153,25 +147,23 @@ class InfillColorInvocation(BaseInvocation):
)
@title("Tile Infill")
@tags("image", "inpaint")
class InfillTileInvocation(BaseInvocation):
"""Infills transparent areas of an image with tiles of the image"""
type: Literal["infill_tile"] = "infill_tile"
image: Optional[ImageField] = Field(default=None, description="The image to infill")
tile_size: int = Field(default=32, ge=1, description="The tile size (px)")
seed: int = Field(
# Input
image: ImageField = InputField(description="The image to infill")
tile_size: int = InputField(default=32, ge=1, description="The tile size (px)")
seed: int = InputField(
ge=0,
le=SEED_MAX,
description="The seed to use for tile generation (omit for random)",
default_factory=get_random_seed,
)
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Tile Infill", "tags": ["image", "inpaint", "tile", "infill"]},
}
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)
@ -194,17 +186,15 @@ class InfillTileInvocation(BaseInvocation):
)
@title("PatchMatch Infill")
@tags("image", "inpaint")
class InfillPatchMatchInvocation(BaseInvocation):
"""Infills transparent areas of an image using the PatchMatch algorithm"""
type: Literal["infill_patchmatch"] = "infill_patchmatch"
image: Optional[ImageField] = Field(default=None, description="The image to infill")
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Patch Match Infill", "tags": ["image", "inpaint", "patchmatch", "infill"]},
}
# Inputs
image: ImageField = InputField(description="The image to infill")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)

View File

@ -5,15 +5,32 @@ from typing import List, Literal, Optional, Union
import einops
import torch
from diffusers import ControlNetModel
import torchvision.transforms as T
from diffusers.image_processor import VaeImageProcessor
from diffusers.models.attention_processor import (
AttnProcessor2_0,
LoRAAttnProcessor2_0,
LoRAXFormersAttnProcessor,
XFormersAttnProcessor,
)
from diffusers.schedulers import DPMSolverSDEScheduler
from diffusers.schedulers import SchedulerMixin as Scheduler
from pydantic import BaseModel, Field, validator
from torchvision.transforms.functional import resize as tv_resize
from invokeai.app.invocations.metadata import CoreMetadata
from invokeai.app.invocations.primitives import (
ImageField,
ImageOutput,
LatentsField,
LatentsOutput,
build_latents_output,
)
from invokeai.app.util.controlnet_utils import prepare_control_image
from invokeai.app.util.step_callback import stable_diffusion_step_callback
from invokeai.backend.model_management.models import ModelType, SilenceWarnings
from ...backend.model_management import BaseModelType, ModelPatcher
from ...backend.model_management.lora import ModelPatcher
from ...backend.stable_diffusion import PipelineIntermediateState
from ...backend.stable_diffusion.diffusers_pipeline import (
@ -24,57 +41,27 @@ from ...backend.stable_diffusion.diffusers_pipeline import (
)
from ...backend.stable_diffusion.diffusion.shared_invokeai_diffusion import PostprocessingSettings
from ...backend.stable_diffusion.schedulers import SCHEDULER_MAP
from ...backend.model_management import ModelPatcher
from ...backend.util.devices import choose_torch_device, torch_dtype, choose_precision
from ..models.image import ImageCategory, ImageField, ResourceOrigin
from .baseinvocation import BaseInvocation, BaseInvocationOutput, InvocationConfig, InvocationContext
from ...backend.util.devices import choose_precision, choose_torch_device
from ..models.image import ImageCategory, ResourceOrigin
from .baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
FieldDescriptions,
Input,
InputField,
InvocationContext,
OutputField,
UIType,
tags,
title,
)
from .compel import ConditioningField
from .controlnet_image_processors import ControlField
from .image import ImageOutput
from .model import ModelInfo, UNetField, VaeField
from invokeai.app.util.controlnet_utils import prepare_control_image
from diffusers.models.attention_processor import (
AttnProcessor2_0,
LoRAAttnProcessor2_0,
LoRAXFormersAttnProcessor,
XFormersAttnProcessor,
)
DEFAULT_PRECISION = choose_precision(choose_torch_device())
class LatentsField(BaseModel):
"""A latents field used for passing latents between invocations"""
latents_name: Optional[str] = Field(default=None, description="The name of the latents")
class Config:
schema_extra = {"required": ["latents_name"]}
class LatentsOutput(BaseInvocationOutput):
"""Base class for invocations that output latents"""
# fmt: off
type: Literal["latents_output"] = "latents_output"
# Inputs
latents: LatentsField = Field(default=None, description="The output latents")
width: int = Field(description="The width of the latents in pixels")
height: int = Field(description="The height of the latents in pixels")
# fmt: on
def build_latents_output(latents_name: str, latents: torch.Tensor):
return LatentsOutput(
latents=LatentsField(latents_name=latents_name),
width=latents.size()[3] * 8,
height=latents.size()[2] * 8,
)
SAMPLER_NAME_VALUES = Literal[tuple(list(SCHEDULER_MAP.keys()))]
@ -82,6 +69,7 @@ def get_scheduler(
context: InvocationContext,
scheduler_info: ModelInfo,
scheduler_name: str,
seed: int,
) -> Scheduler:
scheduler_class, scheduler_extra_config = SCHEDULER_MAP.get(scheduler_name, SCHEDULER_MAP["ddim"])
orig_scheduler_info = context.services.model_manager.get_model(
@ -98,6 +86,11 @@ def get_scheduler(
**scheduler_extra_config,
"_backup": scheduler_config,
}
# make dpmpp_sde reproducable(seed can be passed only in initializer)
if scheduler_class is DPMSolverSDEScheduler:
scheduler_config["noise_sampler_seed"] = seed
scheduler = scheduler_class.from_config(scheduler_config)
# hack copied over from generate.py
@ -106,25 +99,37 @@ def get_scheduler(
return scheduler
# Text to image
class TextToLatentsInvocation(BaseInvocation):
"""Generates latents from conditionings."""
@title("Denoise Latents")
@tags("latents", "denoise", "txt2img", "t2i", "t2l", "img2img", "i2i", "l2l")
class DenoiseLatentsInvocation(BaseInvocation):
"""Denoises noisy latents to decodable images"""
type: Literal["t2l"] = "t2l"
type: Literal["denoise_latents"] = "denoise_latents"
# Inputs
# fmt: off
positive_conditioning: Optional[ConditioningField] = Field(description="Positive conditioning for generation")
negative_conditioning: Optional[ConditioningField] = Field(description="Negative conditioning for generation")
noise: Optional[LatentsField] = Field(description="The noise to use")
steps: int = Field(default=10, gt=0, description="The number of steps to use to generate the image")
cfg_scale: Union[float, List[float]] = Field(default=7.5, ge=1, description="The Classifier-Free Guidance, higher values may result in a result closer to the prompt", )
scheduler: SAMPLER_NAME_VALUES = Field(default="euler", description="The scheduler to use" )
unet: UNetField = Field(default=None, description="UNet submodel")
control: Union[ControlField, list[ControlField]] = Field(default=None, description="The control to use")
# seamless: bool = Field(default=False, description="Whether or not to generate an image that can tile without seams", )
# seamless_axes: str = Field(default="", description="The axes to tile the image on, 'x' and/or 'y'")
# fmt: on
positive_conditioning: ConditioningField = InputField(
description=FieldDescriptions.positive_cond, input=Input.Connection
)
negative_conditioning: ConditioningField = InputField(
description=FieldDescriptions.negative_cond, input=Input.Connection
)
noise: Optional[LatentsField] = InputField(description=FieldDescriptions.noise, input=Input.Connection)
steps: int = InputField(default=10, gt=0, description=FieldDescriptions.steps)
cfg_scale: Union[float, List[float]] = InputField(
default=7.5, ge=1, description=FieldDescriptions.cfg_scale, ui_type=UIType.Float
)
denoising_start: float = InputField(default=0.0, ge=0, le=1, description=FieldDescriptions.denoising_start)
denoising_end: float = InputField(default=1.0, ge=0, le=1, description=FieldDescriptions.denoising_end)
scheduler: SAMPLER_NAME_VALUES = InputField(default="euler", description=FieldDescriptions.scheduler)
unet: UNetField = InputField(description=FieldDescriptions.unet, input=Input.Connection)
control: Union[ControlField, list[ControlField]] = InputField(
default=None, description=FieldDescriptions.control, input=Input.Connection
)
latents: Optional[LatentsField] = InputField(description=FieldDescriptions.latents, input=Input.Connection)
mask: Optional[ImageField] = InputField(
default=None,
description=FieldDescriptions.mask,
)
@validator("cfg_scale")
def ge_one(cls, v):
@ -138,33 +143,20 @@ class TextToLatentsInvocation(BaseInvocation):
raise ValueError("cfg_scale must be greater than 1")
return v
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "Text To Latents",
"tags": ["latents"],
"type_hints": {
"model": "model",
"control": "control",
# "cfg_scale": "float",
"cfg_scale": "number",
},
},
}
# TODO: pass this an emitter method or something? or a session for dispatching?
def dispatch_progress(
self,
context: InvocationContext,
source_node_id: str,
intermediate_state: PipelineIntermediateState,
base_model: BaseModelType,
) -> None:
stable_diffusion_step_callback(
context=context,
intermediate_state=intermediate_state,
node=self.dict(),
source_node_id=source_node_id,
base_model=base_model,
)
def get_conditioning_data(
@ -172,13 +164,14 @@ class TextToLatentsInvocation(BaseInvocation):
context: InvocationContext,
scheduler,
unet,
seed,
) -> ConditioningData:
positive_cond_data = context.services.latents.get(self.positive_conditioning.conditioning_name)
c = positive_cond_data.conditionings[0].embeds.to(device=unet.device, dtype=unet.dtype)
extra_conditioning_info = positive_cond_data.conditionings[0].extra_conditioning
c = positive_cond_data.conditionings[0].to(device=unet.device, dtype=unet.dtype)
extra_conditioning_info = c.extra_conditioning
negative_cond_data = context.services.latents.get(self.negative_conditioning.conditioning_name)
uc = negative_cond_data.conditionings[0].embeds.to(device=unet.device, dtype=unet.dtype)
uc = negative_cond_data.conditionings[0].to(device=unet.device, dtype=unet.dtype)
conditioning_data = ConditioningData(
unconditioned_embeddings=uc,
@ -198,7 +191,8 @@ class TextToLatentsInvocation(BaseInvocation):
# for ddim scheduler
eta=0.0, # ddim_eta
# for ancestral and sde schedulers
generator=torch.Generator(device=unet.device).manual_seed(0),
# flip all bits to have noise different from initial
generator=torch.Generator(device=unet.device).manual_seed(seed ^ 0xFFFFFFFF),
)
return conditioning_data
@ -231,7 +225,6 @@ class TextToLatentsInvocation(BaseInvocation):
safety_checker=None,
feature_extractor=None,
requires_safety_checker=False,
precision="float16" if unet.dtype == torch.float16 else "float32",
)
def prep_control_data(
@ -310,110 +303,83 @@ class TextToLatentsInvocation(BaseInvocation):
# MultiControlNetModel has been refactored out, just need list[ControlNetData]
return control_data
@torch.no_grad()
def invoke(self, context: InvocationContext) -> LatentsOutput:
with SilenceWarnings():
noise = context.services.latents.get(self.noise.latents_name)
# original idea by https://github.com/AmericanPresidentJimmyCarter
# TODO: research more for second order schedulers timesteps
def init_scheduler(self, scheduler, device, steps, denoising_start, denoising_end):
num_inference_steps = steps
if scheduler.config.get("cpu_only", False):
scheduler.set_timesteps(num_inference_steps, device="cpu")
timesteps = scheduler.timesteps.to(device=device)
else:
scheduler.set_timesteps(num_inference_steps, device=device)
timesteps = scheduler.timesteps
# Get the source node id (we are invoking the prepared node)
graph_execution_state = context.services.graph_execution_manager.get(context.graph_execution_state_id)
source_node_id = graph_execution_state.prepared_source_mapping[self.id]
# apply denoising_start
t_start_val = int(round(scheduler.config.num_train_timesteps * (1 - denoising_start)))
t_start_idx = len(list(filter(lambda ts: ts >= t_start_val, timesteps)))
timesteps = timesteps[t_start_idx:]
if scheduler.order == 2 and t_start_idx > 0:
timesteps = timesteps[1:]
def step_callback(state: PipelineIntermediateState):
self.dispatch_progress(context, source_node_id, state)
# save start timestep to apply noise
init_timestep = timesteps[:1]
def _lora_loader():
for lora in self.unet.loras:
lora_info = context.services.model_manager.get_model(
**lora.dict(exclude={"weight"}),
context=context,
)
yield (lora_info.context.model, lora.weight)
del lora_info
return
# apply denoising_end
t_end_val = int(round(scheduler.config.num_train_timesteps * (1 - denoising_end)))
t_end_idx = len(list(filter(lambda ts: ts >= t_end_val, timesteps)))
if scheduler.order == 2 and t_end_idx > 0:
t_end_idx += 1
timesteps = timesteps[:t_end_idx]
unet_info = context.services.model_manager.get_model(
**self.unet.unet.dict(),
context=context,
)
with ExitStack() as exit_stack, ModelPatcher.apply_lora_unet(
unet_info.context.model, _lora_loader()
), unet_info as unet:
noise = noise.to(device=unet.device, dtype=unet.dtype)
# calculate step count based on scheduler order
num_inference_steps = len(timesteps)
if scheduler.order == 2:
num_inference_steps += num_inference_steps % 2
num_inference_steps = num_inference_steps // 2
scheduler = get_scheduler(
context=context,
scheduler_info=self.unet.scheduler,
scheduler_name=self.scheduler,
)
return num_inference_steps, timesteps, init_timestep
pipeline = self.create_pipeline(unet, scheduler)
conditioning_data = self.get_conditioning_data(context, scheduler, unet)
def prep_mask_tensor(self, mask, context, lantents):
if mask is None:
return None
control_data = self.prep_control_data(
model=pipeline,
context=context,
control_input=self.control,
latents_shape=noise.shape,
# do_classifier_free_guidance=(self.cfg_scale >= 1.0))
do_classifier_free_guidance=True,
exit_stack=exit_stack,
)
# TODO: Verify the noise is the right size
result_latents, result_attention_map_saver = pipeline.latents_from_embeddings(
latents=torch.zeros_like(noise, dtype=torch_dtype(unet.device)),
noise=noise,
num_inference_steps=self.steps,
conditioning_data=conditioning_data,
control_data=control_data, # list[ControlNetData]
callback=step_callback,
)
# https://discuss.huggingface.co/t/memory-usage-by-later-pipeline-stages/23699
result_latents = result_latents.to("cpu")
torch.cuda.empty_cache()
name = f"{context.graph_execution_state_id}__{self.id}"
context.services.latents.save(name, result_latents)
return build_latents_output(latents_name=name, latents=result_latents)
class LatentsToLatentsInvocation(TextToLatentsInvocation):
"""Generates latents using latents as base image."""
type: Literal["l2l"] = "l2l"
# Inputs
latents: Optional[LatentsField] = Field(description="The latents to use as a base image")
strength: float = Field(default=0.7, ge=0, le=1, description="The strength of the latents to use")
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "Latent To Latents",
"tags": ["latents"],
"type_hints": {
"model": "model",
"control": "control",
"cfg_scale": "number",
},
},
}
mask_image = context.services.images.get_pil_image(mask.image_name)
if mask_image.mode != "L":
# FIXME: why do we get passed an RGB image here? We can only use single-channel.
mask_image = mask_image.convert("L")
mask_tensor = image_resized_to_grid_as_tensor(mask_image, normalize=False)
if mask_tensor.dim() == 3:
mask_tensor = mask_tensor.unsqueeze(0)
mask_tensor = tv_resize(mask_tensor, lantents.shape[-2:], T.InterpolationMode.BILINEAR)
return 1 - mask_tensor
@torch.no_grad()
def invoke(self, context: InvocationContext) -> LatentsOutput:
with SilenceWarnings(): # this quenches NSFW nag from diffusers
noise = context.services.latents.get(self.noise.latents_name)
latent = context.services.latents.get(self.latents.latents_name)
seed = None
noise = None
if self.noise is not None:
noise = context.services.latents.get(self.noise.latents_name)
seed = self.noise.seed
if self.latents is not None:
latents = context.services.latents.get(self.latents.latents_name)
if seed is None:
seed = self.latents.seed
else:
latents = torch.zeros_like(noise)
if seed is None:
seed = 0
mask = self.prep_mask_tensor(self.mask, context, latents)
# Get the source node id (we are invoking the prepared node)
graph_execution_state = context.services.graph_execution_manager.get(context.graph_execution_state_id)
source_node_id = graph_execution_state.prepared_source_mapping[self.id]
def step_callback(state: PipelineIntermediateState):
self.dispatch_progress(context, source_node_id, state)
self.dispatch_progress(context, source_node_id, state, self.unet.unet.base_model)
def _lora_loader():
for lora in self.unet.loras:
@ -432,44 +398,48 @@ class LatentsToLatentsInvocation(TextToLatentsInvocation):
with ExitStack() as exit_stack, ModelPatcher.apply_lora_unet(
unet_info.context.model, _lora_loader()
), unet_info as unet:
noise = noise.to(device=unet.device, dtype=unet.dtype)
latent = latent.to(device=unet.device, dtype=unet.dtype)
latents = latents.to(device=unet.device, dtype=unet.dtype)
if noise is not None:
noise = noise.to(device=unet.device, dtype=unet.dtype)
if mask is not None:
mask = mask.to(device=unet.device, dtype=unet.dtype)
scheduler = get_scheduler(
context=context,
scheduler_info=self.unet.scheduler,
scheduler_name=self.scheduler,
seed=seed,
)
pipeline = self.create_pipeline(unet, scheduler)
conditioning_data = self.get_conditioning_data(context, scheduler, unet)
conditioning_data = self.get_conditioning_data(context, scheduler, unet, seed)
control_data = self.prep_control_data(
model=pipeline,
context=context,
control_input=self.control,
latents_shape=noise.shape,
latents_shape=latents.shape,
# do_classifier_free_guidance=(self.cfg_scale >= 1.0))
do_classifier_free_guidance=True,
exit_stack=exit_stack,
)
# TODO: Verify the noise is the right size
initial_latents = (
latent if self.strength < 1.0 else torch.zeros_like(latent, device=unet.device, dtype=latent.dtype)
)
timesteps, _ = pipeline.get_img2img_timesteps(
self.steps,
self.strength,
num_inference_steps, timesteps, init_timestep = self.init_scheduler(
scheduler,
device=unet.device,
steps=self.steps,
denoising_start=self.denoising_start,
denoising_end=self.denoising_end,
)
result_latents, result_attention_map_saver = pipeline.latents_from_embeddings(
latents=initial_latents,
latents=latents,
timesteps=timesteps,
init_timestep=init_timestep,
noise=noise,
num_inference_steps=self.steps,
seed=seed,
mask=mask,
num_inference_steps=num_inference_steps,
conditioning_data=conditioning_data,
control_data=control_data, # list[ControlNetData]
callback=step_callback,
@ -481,32 +451,32 @@ class LatentsToLatentsInvocation(TextToLatentsInvocation):
name = f"{context.graph_execution_state_id}__{self.id}"
context.services.latents.save(name, result_latents)
return build_latents_output(latents_name=name, latents=result_latents)
return build_latents_output(latents_name=name, latents=result_latents, seed=seed)
# Latent to image
@title("Latents to Image")
@tags("latents", "image", "vae")
class LatentsToImageInvocation(BaseInvocation):
"""Generates an image from latents."""
type: Literal["l2i"] = "l2i"
# Inputs
latents: Optional[LatentsField] = Field(description="The latents to generate an image from")
vae: VaeField = Field(default=None, description="Vae submodel")
tiled: bool = Field(default=False, description="Decode latents by overlaping tiles (less memory consumption)")
fp32: bool = Field(DEFAULT_PRECISION == "float32", description="Decode in full precision")
metadata: Optional[CoreMetadata] = Field(
default=None, description="Optional core metadata to be written to the image"
latents: LatentsField = InputField(
description=FieldDescriptions.latents,
input=Input.Connection,
)
vae: VaeField = InputField(
description=FieldDescriptions.vae,
input=Input.Connection,
)
tiled: bool = InputField(default=False, description=FieldDescriptions.tiled)
fp32: bool = InputField(default=DEFAULT_PRECISION == "float32", description=FieldDescriptions.fp32)
metadata: CoreMetadata = InputField(
default=None,
description=FieldDescriptions.core_metadata,
ui_hidden=True,
)
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "Latents To Image",
"tags": ["latents", "image"],
},
}
@torch.no_grad()
def invoke(self, context: InvocationContext) -> ImageOutput:
@ -584,24 +554,30 @@ class LatentsToImageInvocation(BaseInvocation):
LATENTS_INTERPOLATION_MODE = Literal["nearest", "linear", "bilinear", "bicubic", "trilinear", "area", "nearest-exact"]
@title("Resize Latents")
@tags("latents", "resize")
class ResizeLatentsInvocation(BaseInvocation):
"""Resizes latents to explicit width/height (in pixels). Provided dimensions are floor-divided by 8."""
type: Literal["lresize"] = "lresize"
# Inputs
latents: Optional[LatentsField] = Field(description="The latents to resize")
width: Union[int, None] = Field(default=512, ge=64, multiple_of=8, description="The width to resize to (px)")
height: Union[int, None] = Field(default=512, ge=64, multiple_of=8, description="The height to resize to (px)")
mode: LATENTS_INTERPOLATION_MODE = Field(default="bilinear", description="The interpolation mode")
antialias: bool = Field(
default=False, description="Whether or not to antialias (applied in bilinear and bicubic modes only)"
latents: LatentsField = InputField(
description=FieldDescriptions.latents,
input=Input.Connection,
)
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Resize Latents", "tags": ["latents", "resize"]},
}
width: int = InputField(
ge=64,
multiple_of=8,
description=FieldDescriptions.width,
)
height: int = InputField(
ge=64,
multiple_of=8,
description=FieldDescriptions.width,
)
mode: LATENTS_INTERPOLATION_MODE = InputField(default="bilinear", description=FieldDescriptions.interp_mode)
antialias: bool = InputField(default=False, description=FieldDescriptions.torch_antialias)
def invoke(self, context: InvocationContext) -> LatentsOutput:
latents = context.services.latents.get(self.latents.latents_name)
@ -623,26 +599,24 @@ class ResizeLatentsInvocation(BaseInvocation):
name = f"{context.graph_execution_state_id}__{self.id}"
# context.services.latents.set(name, resized_latents)
context.services.latents.save(name, resized_latents)
return build_latents_output(latents_name=name, latents=resized_latents)
return build_latents_output(latents_name=name, latents=resized_latents, seed=self.latents.seed)
@title("Scale Latents")
@tags("latents", "resize")
class ScaleLatentsInvocation(BaseInvocation):
"""Scales latents by a given factor."""
type: Literal["lscale"] = "lscale"
# Inputs
latents: Optional[LatentsField] = Field(description="The latents to scale")
scale_factor: float = Field(gt=0, description="The factor by which to scale the latents")
mode: LATENTS_INTERPOLATION_MODE = Field(default="bilinear", description="The interpolation mode")
antialias: bool = Field(
default=False, description="Whether or not to antialias (applied in bilinear and bicubic modes only)"
latents: LatentsField = InputField(
description=FieldDescriptions.latents,
input=Input.Connection,
)
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Scale Latents", "tags": ["latents", "scale"]},
}
scale_factor: float = InputField(gt=0, description=FieldDescriptions.scale_factor)
mode: LATENTS_INTERPOLATION_MODE = InputField(default="bilinear", description=FieldDescriptions.interp_mode)
antialias: bool = InputField(default=False, description=FieldDescriptions.torch_antialias)
def invoke(self, context: InvocationContext) -> LatentsOutput:
latents = context.services.latents.get(self.latents.latents_name)
@ -665,25 +639,26 @@ class ScaleLatentsInvocation(BaseInvocation):
name = f"{context.graph_execution_state_id}__{self.id}"
# context.services.latents.set(name, resized_latents)
context.services.latents.save(name, resized_latents)
return build_latents_output(latents_name=name, latents=resized_latents)
return build_latents_output(latents_name=name, latents=resized_latents, seed=self.latents.seed)
@title("Image to Latents")
@tags("latents", "image", "vae")
class ImageToLatentsInvocation(BaseInvocation):
"""Encodes an image into latents."""
type: Literal["i2l"] = "i2l"
# Inputs
image: Optional[ImageField] = Field(description="The image to encode")
vae: VaeField = Field(default=None, description="Vae submodel")
tiled: bool = Field(default=False, description="Encode latents by overlaping tiles(less memory consumption)")
fp32: bool = Field(DEFAULT_PRECISION == "float32", description="Decode in full precision")
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Image To Latents", "tags": ["latents", "image"]},
}
image: ImageField = InputField(
description="The image to encode",
)
vae: VaeField = InputField(
description=FieldDescriptions.vae,
input=Input.Connection,
)
tiled: bool = InputField(default=False, description=FieldDescriptions.tiled)
fp32: bool = InputField(default=DEFAULT_PRECISION == "float32", description=FieldDescriptions.fp32)
@torch.no_grad()
def invoke(self, context: InvocationContext) -> LatentsOutput:
@ -746,4 +721,4 @@ class ImageToLatentsInvocation(BaseInvocation):
name = f"{context.graph_execution_state_id}__{self.id}"
latents = latents.to("cpu")
context.services.latents.save(name, latents)
return build_latents_output(latents_name=name, latents=latents)
return build_latents_output(latents_name=name, latents=latents, seed=None)

View File

@ -2,134 +2,83 @@
from typing import Literal
from pydantic import BaseModel, Field
import numpy as np
from .baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
InvocationContext,
InvocationConfig,
)
from invokeai.app.invocations.primitives import IntegerOutput
from .baseinvocation import BaseInvocation, FieldDescriptions, InputField, InvocationContext, tags, title
class MathInvocationConfig(BaseModel):
"""Helper class to provide all math invocations with additional config"""
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"tags": ["math"],
}
}
class IntOutput(BaseInvocationOutput):
"""An integer output"""
# fmt: off
type: Literal["int_output"] = "int_output"
a: int = Field(default=None, description="The output integer")
# fmt: on
class FloatOutput(BaseInvocationOutput):
"""A float output"""
# fmt: off
type: Literal["float_output"] = "float_output"
param: float = Field(default=None, description="The output float")
# fmt: on
class AddInvocation(BaseInvocation, MathInvocationConfig):
@title("Add Integers")
@tags("math")
class AddInvocation(BaseInvocation):
"""Adds two numbers"""
# fmt: off
type: Literal["add"] = "add"
a: int = Field(default=0, description="The first number")
b: int = Field(default=0, description="The second number")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Add", "tags": ["math", "add"]},
}
# Inputs
a: int = InputField(default=0, description=FieldDescriptions.num_1)
b: int = InputField(default=0, description=FieldDescriptions.num_2)
def invoke(self, context: InvocationContext) -> IntOutput:
return IntOutput(a=self.a + self.b)
def invoke(self, context: InvocationContext) -> IntegerOutput:
return IntegerOutput(a=self.a + self.b)
class SubtractInvocation(BaseInvocation, MathInvocationConfig):
@title("Subtract Integers")
@tags("math")
class SubtractInvocation(BaseInvocation):
"""Subtracts two numbers"""
# fmt: off
type: Literal["sub"] = "sub"
a: int = Field(default=0, description="The first number")
b: int = Field(default=0, description="The second number")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Subtract", "tags": ["math", "subtract"]},
}
# Inputs
a: int = InputField(default=0, description=FieldDescriptions.num_1)
b: int = InputField(default=0, description=FieldDescriptions.num_2)
def invoke(self, context: InvocationContext) -> IntOutput:
return IntOutput(a=self.a - self.b)
def invoke(self, context: InvocationContext) -> IntegerOutput:
return IntegerOutput(a=self.a - self.b)
class MultiplyInvocation(BaseInvocation, MathInvocationConfig):
@title("Multiply Integers")
@tags("math")
class MultiplyInvocation(BaseInvocation):
"""Multiplies two numbers"""
# fmt: off
type: Literal["mul"] = "mul"
a: int = Field(default=0, description="The first number")
b: int = Field(default=0, description="The second number")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Multiply", "tags": ["math", "multiply"]},
}
# Inputs
a: int = InputField(default=0, description=FieldDescriptions.num_1)
b: int = InputField(default=0, description=FieldDescriptions.num_2)
def invoke(self, context: InvocationContext) -> IntOutput:
return IntOutput(a=self.a * self.b)
def invoke(self, context: InvocationContext) -> IntegerOutput:
return IntegerOutput(a=self.a * self.b)
class DivideInvocation(BaseInvocation, MathInvocationConfig):
@title("Divide Integers")
@tags("math")
class DivideInvocation(BaseInvocation):
"""Divides two numbers"""
# fmt: off
type: Literal["div"] = "div"
a: int = Field(default=0, description="The first number")
b: int = Field(default=0, description="The second number")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Divide", "tags": ["math", "divide"]},
}
# Inputs
a: int = InputField(default=0, description=FieldDescriptions.num_1)
b: int = InputField(default=0, description=FieldDescriptions.num_2)
def invoke(self, context: InvocationContext) -> IntOutput:
return IntOutput(a=int(self.a / self.b))
def invoke(self, context: InvocationContext) -> IntegerOutput:
return IntegerOutput(a=int(self.a / self.b))
@title("Random Integer")
@tags("math")
class RandomIntInvocation(BaseInvocation):
"""Outputs a single random integer."""
# fmt: off
type: Literal["rand_int"] = "rand_int"
low: int = Field(default=0, description="The inclusive low value")
high: int = Field(
default=np.iinfo(np.int32).max, description="The exclusive high value"
)
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Random Integer", "tags": ["math", "random", "integer"]},
}
# Inputs
low: int = InputField(default=0, description="The inclusive low value")
high: int = InputField(default=np.iinfo(np.int32).max, description="The exclusive high value")
def invoke(self, context: InvocationContext) -> IntOutput:
return IntOutput(a=np.random.randint(self.low, self.high))
def invoke(self, context: InvocationContext) -> IntegerOutput:
return IntegerOutput(a=np.random.randint(self.low, self.high))

View File

@ -1,27 +1,34 @@
from typing import Literal, Optional, Union
from typing import Literal, Optional
from pydantic import BaseModel, Field
from pydantic import Field
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
InvocationConfig,
InputField,
InvocationContext,
OutputField,
tags,
title,
)
from invokeai.app.invocations.controlnet_image_processors import ControlField
from invokeai.app.invocations.model import LoRAModelField, MainModelField, VAEModelField
from invokeai.app.util.model_exclude_null import BaseModelExcludeNull
from ...version import __version__
class LoRAMetadataField(BaseModel):
class LoRAMetadataField(BaseModelExcludeNull):
"""LoRA metadata for an image generated in InvokeAI."""
lora: LoRAModelField = Field(description="The LoRA model")
weight: float = Field(description="The weight of the LoRA model")
class CoreMetadata(BaseModel):
class CoreMetadata(BaseModelExcludeNull):
"""Core generation metadata for an image generated in InvokeAI."""
app_version: str = Field(default=__version__, description="The version of InvokeAI used to generate this image")
generation_mode: str = Field(
description="The generation mode that output this image",
)
@ -40,37 +47,40 @@ class CoreMetadata(BaseModel):
model: MainModelField = Field(description="The main model used for inference")
controlnets: list[ControlField] = Field(description="The ControlNets used for inference")
loras: list[LoRAMetadataField] = Field(description="The LoRAs used for inference")
vae: Union[VAEModelField, None] = Field(
vae: Optional[VAEModelField] = Field(
default=None,
description="The VAE used for decoding, if the main model's default was not used",
)
# Latents-to-Latents
strength: Union[float, None] = Field(
strength: Optional[float] = Field(
default=None,
description="The strength used for latents-to-latents",
)
init_image: Union[str, None] = Field(default=None, description="The name of the initial image")
init_image: Optional[str] = Field(default=None, description="The name of the initial image")
# SDXL
positive_style_prompt: Union[str, None] = Field(default=None, description="The positive style prompt parameter")
negative_style_prompt: Union[str, None] = Field(default=None, description="The negative style prompt parameter")
positive_style_prompt: Optional[str] = Field(default=None, description="The positive style prompt parameter")
negative_style_prompt: Optional[str] = Field(default=None, description="The negative style prompt parameter")
# SDXL Refiner
refiner_model: Union[MainModelField, None] = Field(default=None, description="The SDXL Refiner model used")
refiner_cfg_scale: Union[float, None] = Field(
refiner_model: Optional[MainModelField] = Field(default=None, description="The SDXL Refiner model used")
refiner_cfg_scale: Optional[float] = Field(
default=None,
description="The classifier-free guidance scale parameter used for the refiner",
)
refiner_steps: Union[int, None] = Field(default=None, description="The number of steps used for the refiner")
refiner_scheduler: Union[str, None] = Field(default=None, description="The scheduler used for the refiner")
refiner_aesthetic_store: Union[float, None] = Field(
refiner_steps: Optional[int] = Field(default=None, description="The number of steps used for the refiner")
refiner_scheduler: Optional[str] = Field(default=None, description="The scheduler used for the refiner")
refiner_positive_aesthetic_store: Optional[float] = Field(
default=None, description="The aesthetic score used for the refiner"
)
refiner_start: Union[float, None] = Field(default=None, description="The start value used for refiner denoising")
refiner_negative_aesthetic_store: Optional[float] = Field(
default=None, description="The aesthetic score used for the refiner"
)
refiner_start: Optional[float] = Field(default=None, description="The start value used for refiner denoising")
class ImageMetadata(BaseModel):
class ImageMetadata(BaseModelExcludeNull):
"""An image's generation metadata"""
metadata: Optional[dict] = Field(
@ -85,66 +95,86 @@ class MetadataAccumulatorOutput(BaseInvocationOutput):
type: Literal["metadata_accumulator_output"] = "metadata_accumulator_output"
metadata: CoreMetadata = Field(description="The core metadata for the image")
metadata: CoreMetadata = OutputField(description="The core metadata for the image")
@title("Metadata Accumulator")
@tags("metadata")
class MetadataAccumulatorInvocation(BaseInvocation):
"""Outputs a Core Metadata Object"""
type: Literal["metadata_accumulator"] = "metadata_accumulator"
generation_mode: str = Field(
generation_mode: str = InputField(
description="The generation mode that output this image",
)
positive_prompt: str = Field(description="The positive prompt parameter")
negative_prompt: str = Field(description="The negative prompt parameter")
width: int = Field(description="The width parameter")
height: int = Field(description="The height parameter")
seed: int = Field(description="The seed used for noise generation")
rand_device: str = Field(description="The device used for random number generation")
cfg_scale: float = Field(description="The classifier-free guidance scale parameter")
steps: int = Field(description="The number of steps used for inference")
scheduler: str = Field(description="The scheduler used for inference")
clip_skip: int = Field(
positive_prompt: str = InputField(description="The positive prompt parameter")
negative_prompt: str = InputField(description="The negative prompt parameter")
width: int = InputField(description="The width parameter")
height: int = InputField(description="The height parameter")
seed: int = InputField(description="The seed used for noise generation")
rand_device: str = InputField(description="The device used for random number generation")
cfg_scale: float = InputField(description="The classifier-free guidance scale parameter")
steps: int = InputField(description="The number of steps used for inference")
scheduler: str = InputField(description="The scheduler used for inference")
clip_skip: int = InputField(
description="The number of skipped CLIP layers",
)
model: MainModelField = Field(description="The main model used for inference")
controlnets: list[ControlField] = Field(description="The ControlNets used for inference")
loras: list[LoRAMetadataField] = Field(description="The LoRAs used for inference")
strength: Union[float, None] = Field(
model: MainModelField = InputField(description="The main model used for inference")
controlnets: list[ControlField] = InputField(description="The ControlNets used for inference")
loras: list[LoRAMetadataField] = InputField(description="The LoRAs used for inference")
strength: Optional[float] = InputField(
default=None,
description="The strength used for latents-to-latents",
)
init_image: Union[str, None] = Field(default=None, description="The name of the initial image")
vae: Union[VAEModelField, None] = Field(
init_image: Optional[str] = InputField(
default=None,
description="The name of the initial image",
)
vae: Optional[VAEModelField] = InputField(
default=None,
description="The VAE used for decoding, if the main model's default was not used",
)
# SDXL
positive_style_prompt: Union[str, None] = Field(default=None, description="The positive style prompt parameter")
negative_style_prompt: Union[str, None] = Field(default=None, description="The negative style prompt parameter")
positive_style_prompt: Optional[str] = InputField(
default=None,
description="The positive style prompt parameter",
)
negative_style_prompt: Optional[str] = InputField(
default=None,
description="The negative style prompt parameter",
)
# SDXL Refiner
refiner_model: Union[MainModelField, None] = Field(default=None, description="The SDXL Refiner model used")
refiner_cfg_scale: Union[float, None] = Field(
refiner_model: Optional[MainModelField] = InputField(
default=None,
description="The SDXL Refiner model used",
)
refiner_cfg_scale: Optional[float] = InputField(
default=None,
description="The classifier-free guidance scale parameter used for the refiner",
)
refiner_steps: Union[int, None] = Field(default=None, description="The number of steps used for the refiner")
refiner_scheduler: Union[str, None] = Field(default=None, description="The scheduler used for the refiner")
refiner_aesthetic_store: Union[float, None] = Field(
default=None, description="The aesthetic score used for the refiner"
refiner_steps: Optional[int] = InputField(
default=None,
description="The number of steps used for the refiner",
)
refiner_scheduler: Optional[str] = InputField(
default=None,
description="The scheduler used for the refiner",
)
refiner_positive_aesthetic_store: Optional[float] = InputField(
default=None,
description="The aesthetic score used for the refiner",
)
refiner_negative_aesthetic_store: Optional[float] = InputField(
default=None,
description="The aesthetic score used for the refiner",
)
refiner_start: Optional[float] = InputField(
default=None,
description="The start value used for refiner denoising",
)
refiner_start: Union[float, None] = Field(default=None, description="The start value used for refiner denoising")
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "Metadata Accumulator",
"tags": ["image", "metadata", "generation"],
},
}
def invoke(self, context: InvocationContext) -> MetadataAccumulatorOutput:
"""Collects and outputs a CoreMetadata object"""

View File

@ -4,7 +4,18 @@ from typing import List, Literal, Optional, Union
from pydantic import BaseModel, Field
from ...backend.model_management import BaseModelType, ModelType, SubModelType
from .baseinvocation import BaseInvocation, BaseInvocationOutput, InvocationConfig, InvocationContext
from .baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
FieldDescriptions,
InputField,
Input,
InvocationContext,
OutputField,
UIType,
tags,
title,
)
class ModelInfo(BaseModel):
@ -39,13 +50,11 @@ class VaeField(BaseModel):
class ModelLoaderOutput(BaseInvocationOutput):
"""Model loader output"""
# fmt: off
type: Literal["model_loader_output"] = "model_loader_output"
unet: UNetField = Field(default=None, description="UNet submodel")
clip: ClipField = Field(default=None, description="Tokenizer and text_encoder submodels")
vae: VaeField = Field(default=None, description="Vae submodel")
# fmt: on
unet: UNetField = OutputField(description=FieldDescriptions.unet, title="UNet")
clip: ClipField = OutputField(description=FieldDescriptions.clip, title="CLIP")
vae: VaeField = OutputField(description=FieldDescriptions.vae, title="VAE")
class MainModelField(BaseModel):
@ -63,24 +72,17 @@ class LoRAModelField(BaseModel):
base_model: BaseModelType = Field(description="Base model")
@title("Main Model Loader")
@tags("model")
class MainModelLoaderInvocation(BaseInvocation):
"""Loads a main model, outputting its submodels."""
type: Literal["main_model_loader"] = "main_model_loader"
model: MainModelField = Field(description="The model to load")
# Inputs
model: MainModelField = InputField(description=FieldDescriptions.main_model, input=Input.Direct)
# TODO: precision?
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "Model Loader",
"tags": ["model", "loader"],
"type_hints": {"model": "model"},
},
}
def invoke(self, context: InvocationContext) -> ModelLoaderOutput:
base_model = self.model.base_model
model_name = self.model.model_name
@ -155,22 +157,6 @@ class MainModelLoaderInvocation(BaseInvocation):
loras=[],
skipped_layers=0,
),
clip2=ClipField(
tokenizer=ModelInfo(
model_name=model_name,
base_model=base_model,
model_type=model_type,
submodel=SubModelType.Tokenizer2,
),
text_encoder=ModelInfo(
model_name=model_name,
base_model=base_model,
model_type=model_type,
submodel=SubModelType.TextEncoder2,
),
loras=[],
skipped_layers=0,
),
vae=VaeField(
vae=ModelInfo(
model_name=model_name,
@ -188,30 +174,27 @@ class LoraLoaderOutput(BaseInvocationOutput):
# fmt: off
type: Literal["lora_loader_output"] = "lora_loader_output"
unet: Optional[UNetField] = Field(default=None, description="UNet submodel")
clip: Optional[ClipField] = Field(default=None, description="Tokenizer and text_encoder submodels")
unet: Optional[UNetField] = OutputField(default=None, description=FieldDescriptions.unet, title="UNet")
clip: Optional[ClipField] = OutputField(default=None, description=FieldDescriptions.clip, title="CLIP")
# fmt: on
@title("LoRA Loader")
@tags("lora", "model")
class LoraLoaderInvocation(BaseInvocation):
"""Apply selected lora to unet and text_encoder."""
type: Literal["lora_loader"] = "lora_loader"
lora: Union[LoRAModelField, None] = Field(default=None, description="Lora model name")
weight: float = Field(default=0.75, description="With what weight to apply lora")
unet: Optional[UNetField] = Field(description="UNet model for applying lora")
clip: Optional[ClipField] = Field(description="Clip model for applying lora")
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "Lora Loader",
"tags": ["lora", "loader"],
"type_hints": {"lora": "lora_model"},
},
}
# Inputs
lora: LoRAModelField = InputField(description=FieldDescriptions.lora_model, input=Input.Direct, title="LoRA")
weight: float = InputField(default=0.75, description=FieldDescriptions.lora_weight)
unet: Optional[UNetField] = InputField(
default=None, description=FieldDescriptions.unet, input=Input.Connection, title="UNet"
)
clip: Optional[ClipField] = InputField(
default=None, description=FieldDescriptions.clip, input=Input.Connection, title="CLIP"
)
def invoke(self, context: InvocationContext) -> LoraLoaderOutput:
if self.lora is None:
@ -262,6 +245,101 @@ class LoraLoaderInvocation(BaseInvocation):
return output
class SDXLLoraLoaderOutput(BaseInvocationOutput):
"""SDXL LoRA Loader Output"""
# fmt: off
type: Literal["sdxl_lora_loader_output"] = "sdxl_lora_loader_output"
unet: Optional[UNetField] = OutputField(default=None, description=FieldDescriptions.unet, title="UNet")
clip: Optional[ClipField] = OutputField(default=None, description=FieldDescriptions.clip, title="CLIP 1")
clip2: Optional[ClipField] = OutputField(default=None, description=FieldDescriptions.clip, title="CLIP 2")
# fmt: on
@title("SDXL LoRA Loader")
@tags("sdxl", "lora", "model")
class SDXLLoraLoaderInvocation(BaseInvocation):
"""Apply selected lora to unet and text_encoder."""
type: Literal["sdxl_lora_loader"] = "sdxl_lora_loader"
lora: LoRAModelField = InputField(description=FieldDescriptions.lora_model, input=Input.Direct, title="LoRA")
weight: float = Field(default=0.75, description=FieldDescriptions.lora_weight)
unet: Optional[UNetField] = Field(
default=None, description=FieldDescriptions.unet, input=Input.Connection, title="UNET"
)
clip: Optional[ClipField] = Field(
default=None, description=FieldDescriptions.clip, input=Input.Connection, title="CLIP 1"
)
clip2: Optional[ClipField] = Field(
default=None, description=FieldDescriptions.clip, input=Input.Connection, title="CLIP 2"
)
def invoke(self, context: InvocationContext) -> SDXLLoraLoaderOutput:
if self.lora is None:
raise Exception("No LoRA provided")
base_model = self.lora.base_model
lora_name = self.lora.model_name
if not context.services.model_manager.model_exists(
base_model=base_model,
model_name=lora_name,
model_type=ModelType.Lora,
):
raise Exception(f"Unknown lora name: {lora_name}!")
if self.unet is not None and any(lora.model_name == lora_name for lora in self.unet.loras):
raise Exception(f'Lora "{lora_name}" already applied to unet')
if self.clip is not None and any(lora.model_name == lora_name for lora in self.clip.loras):
raise Exception(f'Lora "{lora_name}" already applied to clip')
if self.clip2 is not None and any(lora.model_name == lora_name for lora in self.clip2.loras):
raise Exception(f'Lora "{lora_name}" already applied to clip2')
output = SDXLLoraLoaderOutput()
if self.unet is not None:
output.unet = copy.deepcopy(self.unet)
output.unet.loras.append(
LoraInfo(
base_model=base_model,
model_name=lora_name,
model_type=ModelType.Lora,
submodel=None,
weight=self.weight,
)
)
if self.clip is not None:
output.clip = copy.deepcopy(self.clip)
output.clip.loras.append(
LoraInfo(
base_model=base_model,
model_name=lora_name,
model_type=ModelType.Lora,
submodel=None,
weight=self.weight,
)
)
if self.clip2 is not None:
output.clip2 = copy.deepcopy(self.clip2)
output.clip2.loras.append(
LoraInfo(
base_model=base_model,
model_name=lora_name,
model_type=ModelType.Lora,
submodel=None,
weight=self.weight,
)
)
return output
class VAEModelField(BaseModel):
"""Vae model field"""
@ -272,29 +350,23 @@ class VAEModelField(BaseModel):
class VaeLoaderOutput(BaseInvocationOutput):
"""Model loader output"""
# fmt: off
type: Literal["vae_loader_output"] = "vae_loader_output"
vae: VaeField = Field(default=None, description="Vae model")
# fmt: on
# Outputs
vae: VaeField = OutputField(description=FieldDescriptions.vae, title="VAE")
@title("VAE Loader")
@tags("vae", "model")
class VaeLoaderInvocation(BaseInvocation):
"""Loads a VAE model, outputting a VaeLoaderOutput"""
type: Literal["vae_loader"] = "vae_loader"
vae_model: VAEModelField = Field(description="The VAE to load")
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "VAE Loader",
"tags": ["vae", "loader"],
"type_hints": {"vae_model": "vae_model"},
},
}
# Inputs
vae_model: VAEModelField = InputField(
description=FieldDescriptions.vae_model, input=Input.Direct, ui_type=UIType.VaeModel, title="VAE"
)
def invoke(self, context: InvocationContext) -> VaeLoaderOutput:
base_model = self.vae_model.base_model

View File

@ -1,19 +1,24 @@
# Copyright (c) 2023 Kyle Schouviller (https://github.com/kyle0654) & the InvokeAI Team
import math
from typing import Literal
from pydantic import Field, validator
import torch
from invokeai.app.invocations.latent import LatentsField
from pydantic import validator
from invokeai.app.invocations.latent import LatentsField
from invokeai.app.util.misc import SEED_MAX, get_random_seed
from ...backend.util.devices import choose_torch_device, torch_dtype
from .baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
InvocationConfig,
FieldDescriptions,
InputField,
InvocationContext,
OutputField,
UIType,
tags,
title,
)
"""
@ -61,62 +66,53 @@ Nodes
class NoiseOutput(BaseInvocationOutput):
"""Invocation noise output"""
# fmt: off
type: Literal["noise_output"] = "noise_output"
type: Literal["noise_output"] = "noise_output"
# Inputs
noise: LatentsField = Field(default=None, description="The output noise")
width: int = Field(description="The width of the noise in pixels")
height: int = Field(description="The height of the noise in pixels")
# fmt: on
noise: LatentsField = OutputField(default=None, description=FieldDescriptions.noise)
width: int = OutputField(description=FieldDescriptions.width)
height: int = OutputField(description=FieldDescriptions.height)
def build_noise_output(latents_name: str, latents: torch.Tensor):
def build_noise_output(latents_name: str, latents: torch.Tensor, seed: int):
return NoiseOutput(
noise=LatentsField(latents_name=latents_name),
noise=LatentsField(latents_name=latents_name, seed=seed),
width=latents.size()[3] * 8,
height=latents.size()[2] * 8,
)
@title("Noise")
@tags("latents", "noise")
class NoiseInvocation(BaseInvocation):
"""Generates latent noise."""
type: Literal["noise"] = "noise"
# Inputs
seed: int = Field(
seed: int = InputField(
ge=0,
le=SEED_MAX,
description="The seed to use",
description=FieldDescriptions.seed,
default_factory=get_random_seed,
)
width: int = Field(
width: int = InputField(
default=512,
multiple_of=8,
gt=0,
description="The width of the resulting noise",
description=FieldDescriptions.width,
)
height: int = Field(
height: int = InputField(
default=512,
multiple_of=8,
gt=0,
description="The height of the resulting noise",
description=FieldDescriptions.height,
)
use_cpu: bool = Field(
use_cpu: bool = InputField(
default=True,
description="Use CPU for noise generation (for reproducible results across platforms)",
)
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "Noise",
"tags": ["latents", "noise"],
},
}
@validator("seed", pre=True)
def modulo_seed(cls, v):
"""Returns the seed modulo (SEED_MAX + 1) to ensure it is within the valid range."""
@ -132,4 +128,4 @@ class NoiseInvocation(BaseInvocation):
)
name = f"{context.graph_execution_state_id}__{self.id}"
context.services.latents.save(name, noise)
return build_noise_output(latents_name=name, latents=noise)
return build_noise_output(latents_name=name, latents=noise, seed=self.seed)

View File

@ -1,37 +1,43 @@
# Copyright (c) 2023 Borisov Sergey (https://github.com/StAlKeR7779)
import inspect
import re
from contextlib import ExitStack
from typing import List, Literal, Optional, Union
import re
import inspect
from pydantic import BaseModel, Field, validator
import torch
import numpy as np
import torch
from diffusers import ControlNetModel, DPMSolverMultistepScheduler
from diffusers.image_processor import VaeImageProcessor
from diffusers.schedulers import SchedulerMixin as Scheduler
from ..models.image import ImageCategory, ImageField, ResourceOrigin
from ...backend.model_management import ONNXModelPatcher
from ...backend.util import choose_torch_device
from .baseinvocation import BaseInvocation, BaseInvocationOutput, InvocationConfig, InvocationContext
from .compel import ConditioningField
from .controlnet_image_processors import ControlField
from .image import ImageOutput
from .model import ModelInfo, UNetField, VaeField
from pydantic import BaseModel, Field, validator
from tqdm import tqdm
from invokeai.app.invocations.metadata import CoreMetadata
from invokeai.backend import BaseModelType, ModelType, SubModelType
from invokeai.app.invocations.primitives import ConditioningField, ConditioningOutput, ImageField, ImageOutput
from invokeai.app.util.step_callback import stable_diffusion_step_callback
from invokeai.backend import BaseModelType, ModelType, SubModelType
from ...backend.model_management import ONNXModelPatcher
from ...backend.stable_diffusion import PipelineIntermediateState
from tqdm import tqdm
from .model import ClipField
from .latent import LatentsField, LatentsOutput, build_latents_output, get_scheduler, SAMPLER_NAME_VALUES
from .compel import CompelOutput
from ...backend.util import choose_torch_device
from ..models.image import ImageCategory, ResourceOrigin
from .baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
FieldDescriptions,
InputField,
Input,
InvocationContext,
OutputField,
UIComponent,
UIType,
tags,
title,
)
from .controlnet_image_processors import ControlField
from .latent import SAMPLER_NAME_VALUES, LatentsField, LatentsOutput, build_latents_output, get_scheduler
from .model import ClipField, ModelInfo, UNetField, VaeField
ORT_TO_NP_TYPE = {
"tensor(bool)": np.bool_,
@ -51,13 +57,15 @@ ORT_TO_NP_TYPE = {
PRECISION_VALUES = Literal[tuple(list(ORT_TO_NP_TYPE.keys()))]
@title("ONNX Prompt (Raw)")
@tags("onnx", "prompt")
class ONNXPromptInvocation(BaseInvocation):
type: Literal["prompt_onnx"] = "prompt_onnx"
prompt: str = Field(default="", description="Prompt")
clip: ClipField = Field(None, description="Clip to use")
prompt: str = InputField(default="", description=FieldDescriptions.raw_prompt, ui_component=UIComponent.Textarea)
clip: ClipField = InputField(description=FieldDescriptions.clip, input=Input.Connection)
def invoke(self, context: InvocationContext) -> CompelOutput:
def invoke(self, context: InvocationContext) -> ConditioningOutput:
tokenizer_info = context.services.model_manager.get_model(
**self.clip.tokenizer.dict(),
)
@ -65,7 +73,6 @@ class ONNXPromptInvocation(BaseInvocation):
**self.clip.text_encoder.dict(),
)
with tokenizer_info as orig_tokenizer, text_encoder_info as text_encoder, ExitStack() as stack:
# loras = [(stack.enter_context(context.services.model_manager.get_model(**lora.dict(exclude={"weight"}))), lora.weight) for lora in self.clip.loras]
loras = [
(context.services.model_manager.get_model(**lora.dict(exclude={"weight"})).context.model, lora.weight)
for lora in self.clip.loras
@ -76,18 +83,14 @@ class ONNXPromptInvocation(BaseInvocation):
name = trigger[1:-1]
try:
ti_list.append(
# stack.enter_context(
# context.services.model_manager.get_model(
# model_name=name,
# base_model=self.clip.text_encoder.base_model,
# model_type=ModelType.TextualInversion,
# )
# )
context.services.model_manager.get_model(
model_name=name,
base_model=self.clip.text_encoder.base_model,
model_type=ModelType.TextualInversion,
).context.model
(
name,
context.services.model_manager.get_model(
model_name=name,
base_model=self.clip.text_encoder.base_model,
model_type=ModelType.TextualInversion,
).context.model,
)
)
except Exception:
# print(e)
@ -131,7 +134,7 @@ class ONNXPromptInvocation(BaseInvocation):
# TODO: hacky but works ;D maybe rename latents somehow?
context.services.latents.save(conditioning_name, (prompt_embeds, None))
return CompelOutput(
return ConditioningOutput(
conditioning=ConditioningField(
conditioning_name=conditioning_name,
),
@ -139,25 +142,48 @@ class ONNXPromptInvocation(BaseInvocation):
# Text to image
@title("ONNX Text to Latents")
@tags("latents", "inference", "txt2img", "onnx")
class ONNXTextToLatentsInvocation(BaseInvocation):
"""Generates latents from conditionings."""
type: Literal["t2l_onnx"] = "t2l_onnx"
# Inputs
# fmt: off
positive_conditioning: Optional[ConditioningField] = Field(description="Positive conditioning for generation")
negative_conditioning: Optional[ConditioningField] = Field(description="Negative conditioning for generation")
noise: Optional[LatentsField] = Field(description="The noise to use")
steps: int = Field(default=10, gt=0, description="The number of steps to use to generate the image")
cfg_scale: Union[float, List[float]] = Field(default=7.5, ge=1, description="The Classifier-Free Guidance, higher values may result in a result closer to the prompt", )
scheduler: SAMPLER_NAME_VALUES = Field(default="euler", description="The scheduler to use" )
precision: PRECISION_VALUES = Field(default = "tensor(float16)", description="The precision to use when generating latents")
unet: UNetField = Field(default=None, description="UNet submodel")
control: Union[ControlField, list[ControlField]] = Field(default=None, description="The control to use")
# seamless: bool = Field(default=False, description="Whether or not to generate an image that can tile without seams", )
# seamless_axes: str = Field(default="", description="The axes to tile the image on, 'x' and/or 'y'")
# fmt: on
positive_conditioning: ConditioningField = InputField(
description=FieldDescriptions.positive_cond,
input=Input.Connection,
)
negative_conditioning: ConditioningField = InputField(
description=FieldDescriptions.negative_cond,
input=Input.Connection,
)
noise: LatentsField = InputField(
description=FieldDescriptions.noise,
input=Input.Connection,
)
steps: int = InputField(default=10, gt=0, description=FieldDescriptions.steps)
cfg_scale: Union[float, List[float]] = InputField(
default=7.5,
ge=1,
description=FieldDescriptions.cfg_scale,
ui_type=UIType.Float,
)
scheduler: SAMPLER_NAME_VALUES = InputField(
default="euler", description=FieldDescriptions.scheduler, input=Input.Direct
)
precision: PRECISION_VALUES = InputField(default="tensor(float16)", description=FieldDescriptions.precision)
unet: UNetField = InputField(
description=FieldDescriptions.unet,
input=Input.Connection,
)
control: Optional[Union[ControlField, list[ControlField]]] = InputField(
default=None,
description=FieldDescriptions.control,
ui_type=UIType.Control,
)
# seamless: bool = InputField(default=False, description="Whether or not to generate an image that can tile without seams", )
# seamless_axes: str = InputField(default="", description="The axes to tile the image on, 'x' and/or 'y'")
@validator("cfg_scale")
def ge_one(cls, v):
@ -171,20 +197,6 @@ class ONNXTextToLatentsInvocation(BaseInvocation):
raise ValueError("cfg_scale must be greater than 1")
return v
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"tags": ["latents"],
"type_hints": {
"model": "model",
"control": "control",
# "cfg_scale": "float",
"cfg_scale": "number",
},
},
}
# based on
# https://github.com/huggingface/diffusers/blob/3ebbaf7c96801271f9e6c21400033b6aa5ffcf29/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion.py#L375
def invoke(self, context: InvocationContext) -> LatentsOutput:
@ -217,6 +229,7 @@ class ONNXTextToLatentsInvocation(BaseInvocation):
context=context,
scheduler_info=self.unet.scheduler,
scheduler_name=self.scheduler,
seed=0, # TODO: refactor this node
)
def torch2numpy(latent: torch.Tensor):
@ -304,26 +317,28 @@ class ONNXTextToLatentsInvocation(BaseInvocation):
# Latent to image
@title("ONNX Latents to Image")
@tags("latents", "image", "vae", "onnx")
class ONNXLatentsToImageInvocation(BaseInvocation):
"""Generates an image from latents."""
type: Literal["l2i_onnx"] = "l2i_onnx"
# Inputs
latents: Optional[LatentsField] = Field(description="The latents to generate an image from")
vae: VaeField = Field(default=None, description="Vae submodel")
metadata: Optional[CoreMetadata] = Field(
default=None, description="Optional core metadata to be written to the image"
latents: LatentsField = InputField(
description=FieldDescriptions.denoised_latents,
input=Input.Connection,
)
# tiled: bool = Field(default=False, description="Decode latents by overlaping tiles(less memory consumption)")
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"tags": ["latents", "image"],
},
}
vae: VaeField = InputField(
description=FieldDescriptions.vae,
input=Input.Connection,
)
metadata: Optional[CoreMetadata] = InputField(
default=None,
description=FieldDescriptions.core_metadata,
ui_hidden=True,
)
# tiled: bool = InputField(default=False, description="Decode latents by overlaping tiles(less memory consumption)")
def invoke(self, context: InvocationContext) -> ImageOutput:
latents = context.services.latents.get(self.latents.latents_name)
@ -377,89 +392,13 @@ class ONNXModelLoaderOutput(BaseInvocationOutput):
# fmt: off
type: Literal["model_loader_output_onnx"] = "model_loader_output_onnx"
unet: UNetField = Field(default=None, description="UNet submodel")
clip: ClipField = Field(default=None, description="Tokenizer and text_encoder submodels")
vae_decoder: VaeField = Field(default=None, description="Vae submodel")
vae_encoder: VaeField = Field(default=None, description="Vae submodel")
unet: UNetField = OutputField(default=None, description=FieldDescriptions.unet, title="UNet")
clip: ClipField = OutputField(default=None, description=FieldDescriptions.clip, title="CLIP")
vae_decoder: VaeField = OutputField(default=None, description=FieldDescriptions.vae, title="VAE Decoder")
vae_encoder: VaeField = OutputField(default=None, description=FieldDescriptions.vae, title="VAE Encoder")
# fmt: on
class ONNXSD1ModelLoaderInvocation(BaseInvocation):
"""Loading submodels of selected model."""
type: Literal["sd1_model_loader_onnx"] = "sd1_model_loader_onnx"
model_name: str = Field(default="", description="Model to load")
# TODO: precision?
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {"tags": ["model", "loader"], "type_hints": {"model_name": "model"}}, # TODO: rename to model_name?
}
def invoke(self, context: InvocationContext) -> ONNXModelLoaderOutput:
model_name = "stable-diffusion-v1-5"
base_model = BaseModelType.StableDiffusion1
# TODO: not found exceptions
if not context.services.model_manager.model_exists(
model_name=model_name,
base_model=BaseModelType.StableDiffusion1,
model_type=ModelType.ONNX,
):
raise Exception(f"Unkown model name: {model_name}!")
return ONNXModelLoaderOutput(
unet=UNetField(
unet=ModelInfo(
model_name=model_name,
base_model=base_model,
model_type=ModelType.ONNX,
submodel=SubModelType.UNet,
),
scheduler=ModelInfo(
model_name=model_name,
base_model=base_model,
model_type=ModelType.ONNX,
submodel=SubModelType.Scheduler,
),
loras=[],
),
clip=ClipField(
tokenizer=ModelInfo(
model_name=model_name,
base_model=base_model,
model_type=ModelType.ONNX,
submodel=SubModelType.Tokenizer,
),
text_encoder=ModelInfo(
model_name=model_name,
base_model=base_model,
model_type=ModelType.ONNX,
submodel=SubModelType.TextEncoder,
),
loras=[],
),
vae_decoder=VaeField(
vae=ModelInfo(
model_name=model_name,
base_model=base_model,
model_type=ModelType.ONNX,
submodel=SubModelType.VaeDecoder,
),
),
vae_encoder=VaeField(
vae=ModelInfo(
model_name=model_name,
base_model=base_model,
model_type=ModelType.ONNX,
submodel=SubModelType.VaeEncoder,
),
),
)
class OnnxModelField(BaseModel):
"""Onnx model field"""
@ -468,22 +407,17 @@ class OnnxModelField(BaseModel):
model_type: ModelType = Field(description="Model Type")
@title("ONNX Model Loader")
@tags("onnx", "model")
class OnnxModelLoaderInvocation(BaseInvocation):
"""Loads a main model, outputting its submodels."""
type: Literal["onnx_model_loader"] = "onnx_model_loader"
model: OnnxModelField = Field(description="The model to load")
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "Onnx Model Loader",
"tags": ["model", "loader"],
"type_hints": {"model": "model"},
},
}
# Inputs
model: OnnxModelField = InputField(
description=FieldDescriptions.onnx_main_model, input=Input.Direct, ui_type=UIType.ONNXModel
)
def invoke(self, context: InvocationContext) -> ONNXModelLoaderOutput:
base_model = self.model.base_model

View File

@ -1,73 +1,64 @@
import io
from typing import Literal, Optional, Any
from typing import Literal, Optional
# from PIL.Image import Image
import PIL.Image
from matplotlib.ticker import MaxNLocator
from matplotlib.figure import Figure
from pydantic import BaseModel, Field
import numpy as np
import matplotlib.pyplot as plt
import numpy as np
import PIL.Image
from easing_functions import (
LinearInOut,
QuadEaseInOut,
QuadEaseIn,
QuadEaseOut,
CubicEaseInOut,
CubicEaseIn,
CubicEaseOut,
QuarticEaseInOut,
QuarticEaseIn,
QuarticEaseOut,
QuinticEaseInOut,
QuinticEaseIn,
QuinticEaseOut,
SineEaseInOut,
SineEaseIn,
SineEaseOut,
CircularEaseIn,
CircularEaseInOut,
CircularEaseOut,
ExponentialEaseInOut,
ExponentialEaseIn,
ExponentialEaseOut,
ElasticEaseIn,
ElasticEaseInOut,
ElasticEaseOut,
BackEaseIn,
BackEaseInOut,
BackEaseOut,
BounceEaseIn,
BounceEaseInOut,
BounceEaseOut,
CircularEaseIn,
CircularEaseInOut,
CircularEaseOut,
CubicEaseIn,
CubicEaseInOut,
CubicEaseOut,
ElasticEaseIn,
ElasticEaseInOut,
ElasticEaseOut,
ExponentialEaseIn,
ExponentialEaseInOut,
ExponentialEaseOut,
LinearInOut,
QuadEaseIn,
QuadEaseInOut,
QuadEaseOut,
QuarticEaseIn,
QuarticEaseInOut,
QuarticEaseOut,
QuinticEaseIn,
QuinticEaseInOut,
QuinticEaseOut,
SineEaseIn,
SineEaseInOut,
SineEaseOut,
)
from matplotlib.figure import Figure
from matplotlib.ticker import MaxNLocator
from pydantic import BaseModel, Field
from invokeai.app.invocations.primitives import FloatCollectionOutput
from .baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
InvocationContext,
InvocationConfig,
)
from ...backend.util.logging import InvokeAILogger
from .collections import FloatCollectionOutput
from .baseinvocation import BaseInvocation, InputField, InvocationContext, tags, title
@title("Float Range")
@tags("math", "range")
class FloatLinearRangeInvocation(BaseInvocation):
"""Creates a range"""
type: Literal["float_range"] = "float_range"
# Inputs
start: float = Field(default=5, description="The first value of the range")
stop: float = Field(default=10, description="The last value of the range")
steps: int = Field(default=30, description="number of values to interpolate over (including start and stop)")
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Linear Range (Float)", "tags": ["math", "float", "linear", "range"]},
}
start: float = InputField(default=5, description="The first value of the range")
stop: float = InputField(default=10, description="The last value of the range")
steps: int = InputField(default=30, description="number of values to interpolate over (including start and stop)")
def invoke(self, context: InvocationContext) -> FloatCollectionOutput:
param_list = list(np.linspace(self.start, self.stop, self.steps))
@ -108,37 +99,32 @@ EASING_FUNCTIONS_MAP = {
"BounceInOut": BounceEaseInOut,
}
EASING_FUNCTION_KEYS: Any = Literal[tuple(list(EASING_FUNCTIONS_MAP.keys()))]
EASING_FUNCTION_KEYS = Literal[tuple(list(EASING_FUNCTIONS_MAP.keys()))]
# actually I think for now could just use CollectionOutput (which is list[Any]
@title("Step Param Easing")
@tags("step", "easing")
class StepParamEasingInvocation(BaseInvocation):
"""Experimental per-step parameter easing for denoising steps"""
type: Literal["step_param_easing"] = "step_param_easing"
# Inputs
# fmt: off
easing: EASING_FUNCTION_KEYS = Field(default="Linear", description="The easing function to use")
num_steps: int = Field(default=20, description="number of denoising steps")
start_value: float = Field(default=0.0, description="easing starting value")
end_value: float = Field(default=1.0, description="easing ending value")
start_step_percent: float = Field(default=0.0, description="fraction of steps at which to start easing")
end_step_percent: float = Field(default=1.0, description="fraction of steps after which to end easing")
easing: EASING_FUNCTION_KEYS = InputField(default="Linear", description="The easing function to use")
num_steps: int = InputField(default=20, description="number of denoising steps")
start_value: float = InputField(default=0.0, description="easing starting value")
end_value: float = InputField(default=1.0, description="easing ending value")
start_step_percent: float = InputField(default=0.0, description="fraction of steps at which to start easing")
end_step_percent: float = InputField(default=1.0, description="fraction of steps after which to end easing")
# if None, then start_value is used prior to easing start
pre_start_value: Optional[float] = Field(default=None, description="value before easing start")
pre_start_value: Optional[float] = InputField(default=None, description="value before easing start")
# if None, then end value is used prior to easing end
post_end_value: Optional[float] = Field(default=None, description="value after easing end")
mirror: bool = Field(default=False, description="include mirror of easing function")
post_end_value: Optional[float] = InputField(default=None, description="value after easing end")
mirror: bool = InputField(default=False, description="include mirror of easing function")
# FIXME: add alt_mirror option (alternative to default or mirror), or remove entirely
# alt_mirror: bool = Field(default=False, description="alternative mirroring by dual easing")
show_easing_plot: bool = Field(default=False, description="show easing plot")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Param Easing By Step", "tags": ["param", "step", "easing"]},
}
# alt_mirror: bool = InputField(default=False, description="alternative mirroring by dual easing")
show_easing_plot: bool = InputField(default=False, description="show easing plot")
def invoke(self, context: InvocationContext) -> FloatCollectionOutput:
log_diagnostics = False

View File

@ -1,83 +0,0 @@
# Copyright (c) 2023 Kyle Schouviller (https://github.com/kyle0654)
from typing import Literal
from pydantic import Field
from invokeai.app.invocations.prompt import PromptOutput
from .baseinvocation import BaseInvocation, BaseInvocationOutput, InvocationConfig, InvocationContext
from .math import FloatOutput, IntOutput
# Pass-through parameter nodes - used by subgraphs
class ParamIntInvocation(BaseInvocation):
"""An integer parameter"""
# fmt: off
type: Literal["param_int"] = "param_int"
a: int = Field(default=0, description="The integer value")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"tags": ["param", "integer"], "title": "Integer Parameter"},
}
def invoke(self, context: InvocationContext) -> IntOutput:
return IntOutput(a=self.a)
class ParamFloatInvocation(BaseInvocation):
"""A float parameter"""
# fmt: off
type: Literal["param_float"] = "param_float"
param: float = Field(default=0.0, description="The float value")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"tags": ["param", "float"], "title": "Float Parameter"},
}
def invoke(self, context: InvocationContext) -> FloatOutput:
return FloatOutput(param=self.param)
class StringOutput(BaseInvocationOutput):
"""A string output"""
type: Literal["string_output"] = "string_output"
text: str = Field(default=None, description="The output string")
class ParamStringInvocation(BaseInvocation):
"""A string parameter"""
type: Literal["param_string"] = "param_string"
text: str = Field(default="", description="The string value")
class Config(InvocationConfig):
schema_extra = {
"ui": {"tags": ["param", "string"], "title": "String Parameter"},
}
def invoke(self, context: InvocationContext) -> StringOutput:
return StringOutput(text=self.text)
class ParamPromptInvocation(BaseInvocation):
"""A prompt input parameter"""
type: Literal["param_prompt"] = "param_prompt"
prompt: str = Field(default="", description="The prompt value")
class Config(InvocationConfig):
schema_extra = {
"ui": {"tags": ["param", "prompt"], "title": "Prompt"},
}
def invoke(self, context: InvocationContext) -> PromptOutput:
return PromptOutput(prompt=self.prompt)

View File

@ -0,0 +1,494 @@
# Copyright (c) 2023 Kyle Schouviller (https://github.com/kyle0654)
from typing import Literal, Optional, Tuple, Union
from anyio import Condition
from pydantic import BaseModel, Field
import torch
from .baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
FieldDescriptions,
Input,
InputField,
InvocationContext,
OutputField,
UIComponent,
UIType,
tags,
title,
)
"""
Primitives: Boolean, Integer, Float, String, Image, Latents, Conditioning, Color
- primitive nodes
- primitive outputs
- primitive collection outputs
"""
# region Boolean
class BooleanOutput(BaseInvocationOutput):
"""Base class for nodes that output a single boolean"""
type: Literal["boolean_output"] = "boolean_output"
a: bool = OutputField(description="The output boolean")
class BooleanCollectionOutput(BaseInvocationOutput):
"""Base class for nodes that output a collection of booleans"""
type: Literal["boolean_collection_output"] = "boolean_collection_output"
# Outputs
collection: list[bool] = OutputField(
default_factory=list, description="The output boolean collection", ui_type=UIType.BooleanCollection
)
@title("Boolean Primitive")
@tags("primitives", "boolean")
class BooleanInvocation(BaseInvocation):
"""A boolean primitive value"""
type: Literal["boolean"] = "boolean"
# Inputs
a: bool = InputField(default=False, description="The boolean value")
def invoke(self, context: InvocationContext) -> BooleanOutput:
return BooleanOutput(a=self.a)
@title("Boolean Primitive Collection")
@tags("primitives", "boolean", "collection")
class BooleanCollectionInvocation(BaseInvocation):
"""A collection of boolean primitive values"""
type: Literal["boolean_collection"] = "boolean_collection"
# Inputs
collection: list[bool] = InputField(
default=False, description="The collection of boolean values", ui_type=UIType.BooleanCollection
)
def invoke(self, context: InvocationContext) -> BooleanCollectionOutput:
return BooleanCollectionOutput(collection=self.collection)
# endregion
# region Integer
class IntegerOutput(BaseInvocationOutput):
"""Base class for nodes that output a single integer"""
type: Literal["integer_output"] = "integer_output"
a: int = OutputField(description="The output integer")
class IntegerCollectionOutput(BaseInvocationOutput):
"""Base class for nodes that output a collection of integers"""
type: Literal["integer_collection_output"] = "integer_collection_output"
# Outputs
collection: list[int] = OutputField(
default_factory=list, description="The int collection", ui_type=UIType.IntegerCollection
)
@title("Integer Primitive")
@tags("primitives", "integer")
class IntegerInvocation(BaseInvocation):
"""An integer primitive value"""
type: Literal["integer"] = "integer"
# Inputs
a: int = InputField(default=0, description="The integer value")
def invoke(self, context: InvocationContext) -> IntegerOutput:
return IntegerOutput(a=self.a)
@title("Integer Primitive Collection")
@tags("primitives", "integer", "collection")
class IntegerCollectionInvocation(BaseInvocation):
"""A collection of integer primitive values"""
type: Literal["integer_collection"] = "integer_collection"
# Inputs
collection: list[int] = InputField(
default=0, description="The collection of integer values", ui_type=UIType.IntegerCollection
)
def invoke(self, context: InvocationContext) -> IntegerCollectionOutput:
return IntegerCollectionOutput(collection=self.collection)
# endregion
# region Float
class FloatOutput(BaseInvocationOutput):
"""Base class for nodes that output a single float"""
type: Literal["float_output"] = "float_output"
a: float = OutputField(description="The output float")
class FloatCollectionOutput(BaseInvocationOutput):
"""Base class for nodes that output a collection of floats"""
type: Literal["float_collection_output"] = "float_collection_output"
# Outputs
collection: list[float] = OutputField(
default_factory=list, description="The float collection", ui_type=UIType.FloatCollection
)
@title("Float Primitive")
@tags("primitives", "float")
class FloatInvocation(BaseInvocation):
"""A float primitive value"""
type: Literal["float"] = "float"
# Inputs
param: float = InputField(default=0.0, description="The float value")
def invoke(self, context: InvocationContext) -> FloatOutput:
return FloatOutput(a=self.param)
@title("Float Primitive Collection")
@tags("primitives", "float", "collection")
class FloatCollectionInvocation(BaseInvocation):
"""A collection of float primitive values"""
type: Literal["float_collection"] = "float_collection"
# Inputs
collection: list[float] = InputField(
default=0, description="The collection of float values", ui_type=UIType.FloatCollection
)
def invoke(self, context: InvocationContext) -> FloatCollectionOutput:
return FloatCollectionOutput(collection=self.collection)
# endregion
# region String
class StringOutput(BaseInvocationOutput):
"""Base class for nodes that output a single string"""
type: Literal["string_output"] = "string_output"
text: str = OutputField(description="The output string")
class StringCollectionOutput(BaseInvocationOutput):
"""Base class for nodes that output a collection of strings"""
type: Literal["string_collection_output"] = "string_collection_output"
# Outputs
collection: list[str] = OutputField(
default_factory=list, description="The output strings", ui_type=UIType.StringCollection
)
@title("String Primitive")
@tags("primitives", "string")
class StringInvocation(BaseInvocation):
"""A string primitive value"""
type: Literal["string"] = "string"
# Inputs
text: str = InputField(default="", description="The string value", ui_component=UIComponent.Textarea)
def invoke(self, context: InvocationContext) -> StringOutput:
return StringOutput(text=self.text)
@title("String Primitive Collection")
@tags("primitives", "string", "collection")
class StringCollectionInvocation(BaseInvocation):
"""A collection of string primitive values"""
type: Literal["string_collection"] = "string_collection"
# Inputs
collection: list[str] = InputField(
default=0, description="The collection of string values", ui_type=UIType.StringCollection
)
def invoke(self, context: InvocationContext) -> StringCollectionOutput:
return StringCollectionOutput(collection=self.collection)
# endregion
# region Image
class ImageField(BaseModel):
"""An image primitive field"""
image_name: str = Field(description="The name of the image")
class ImageOutput(BaseInvocationOutput):
"""Base class for nodes that output a single image"""
type: Literal["image_output"] = "image_output"
image: ImageField = OutputField(description="The output image")
width: int = OutputField(description="The width of the image in pixels")
height: int = OutputField(description="The height of the image in pixels")
class ImageCollectionOutput(BaseInvocationOutput):
"""Base class for nodes that output a collection of images"""
type: Literal["image_collection_output"] = "image_collection_output"
# Outputs
collection: list[ImageField] = OutputField(
default_factory=list, description="The output images", ui_type=UIType.ImageCollection
)
@title("Image Primitive")
@tags("primitives", "image")
class ImageInvocation(BaseInvocation):
"""An image primitive value"""
# Metadata
type: Literal["image"] = "image"
# Inputs
image: ImageField = InputField(description="The image to load")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)
return ImageOutput(
image=ImageField(image_name=self.image.image_name),
width=image.width,
height=image.height,
)
@title("Image Primitive Collection")
@tags("primitives", "image", "collection")
class ImageCollectionInvocation(BaseInvocation):
"""A collection of image primitive values"""
type: Literal["image_collection"] = "image_collection"
# Inputs
collection: list[ImageField] = InputField(
default=0, description="The collection of image values", ui_type=UIType.ImageCollection
)
def invoke(self, context: InvocationContext) -> ImageCollectionOutput:
return ImageCollectionOutput(collection=self.collection)
# endregion
# region Latents
class LatentsField(BaseModel):
"""A latents tensor primitive field"""
latents_name: str = Field(description="The name of the latents")
seed: Optional[int] = Field(default=None, description="Seed used to generate this latents")
class LatentsOutput(BaseInvocationOutput):
"""Base class for nodes that output a single latents tensor"""
type: Literal["latents_output"] = "latents_output"
latents: LatentsField = OutputField(
description=FieldDescriptions.latents,
)
width: int = OutputField(description=FieldDescriptions.width)
height: int = OutputField(description=FieldDescriptions.height)
class LatentsCollectionOutput(BaseInvocationOutput):
"""Base class for nodes that output a collection of latents tensors"""
type: Literal["latents_collection_output"] = "latents_collection_output"
collection: list[LatentsField] = OutputField(
default_factory=list,
description=FieldDescriptions.latents,
ui_type=UIType.LatentsCollection,
)
@title("Latents Primitive")
@tags("primitives", "latents")
class LatentsInvocation(BaseInvocation):
"""A latents tensor primitive value"""
type: Literal["latents"] = "latents"
# Inputs
latents: LatentsField = InputField(description="The latents tensor", input=Input.Connection)
def invoke(self, context: InvocationContext) -> LatentsOutput:
latents = context.services.latents.get(self.latents.latents_name)
return build_latents_output(self.latents.latents_name, latents)
@title("Latents Primitive Collection")
@tags("primitives", "latents", "collection")
class LatentsCollectionInvocation(BaseInvocation):
"""A collection of latents tensor primitive values"""
type: Literal["latents_collection"] = "latents_collection"
# Inputs
collection: list[LatentsField] = InputField(
default=0, description="The collection of latents tensors", ui_type=UIType.LatentsCollection
)
def invoke(self, context: InvocationContext) -> LatentsCollectionOutput:
return LatentsCollectionOutput(collection=self.collection)
def build_latents_output(latents_name: str, latents: torch.Tensor, seed: Optional[int] = None):
return LatentsOutput(
latents=LatentsField(latents_name=latents_name, seed=seed),
width=latents.size()[3] * 8,
height=latents.size()[2] * 8,
)
# endregion
# region Color
class ColorField(BaseModel):
"""A color primitive field"""
r: int = Field(ge=0, le=255, description="The red component")
g: int = Field(ge=0, le=255, description="The green component")
b: int = Field(ge=0, le=255, description="The blue component")
a: int = Field(ge=0, le=255, description="The alpha component")
def tuple(self) -> Tuple[int, int, int, int]:
return (self.r, self.g, self.b, self.a)
class ColorOutput(BaseInvocationOutput):
"""Base class for nodes that output a single color"""
type: Literal["color_output"] = "color_output"
color: ColorField = OutputField(description="The output color")
class ColorCollectionOutput(BaseInvocationOutput):
"""Base class for nodes that output a collection of colors"""
type: Literal["color_collection_output"] = "color_collection_output"
# Outputs
collection: list[ColorField] = OutputField(
default_factory=list, description="The output colors", ui_type=UIType.ColorCollection
)
@title("Color Primitive")
@tags("primitives", "color")
class ColorInvocation(BaseInvocation):
"""A color primitive value"""
type: Literal["color"] = "color"
# Inputs
color: ColorField = InputField(default=ColorField(r=0, g=0, b=0, a=255), description="The color value")
def invoke(self, context: InvocationContext) -> ColorOutput:
return ColorOutput(color=self.color)
# endregion
# region Conditioning
class ConditioningField(BaseModel):
"""A conditioning tensor primitive value"""
conditioning_name: str = Field(description="The name of conditioning tensor")
class ConditioningOutput(BaseInvocationOutput):
"""Base class for nodes that output a single conditioning tensor"""
type: Literal["conditioning_output"] = "conditioning_output"
conditioning: ConditioningField = OutputField(description=FieldDescriptions.cond)
class ConditioningCollectionOutput(BaseInvocationOutput):
"""Base class for nodes that output a collection of conditioning tensors"""
type: Literal["conditioning_collection_output"] = "conditioning_collection_output"
# Outputs
collection: list[ConditioningField] = OutputField(
default_factory=list,
description="The output conditioning tensors",
ui_type=UIType.ConditioningCollection,
)
@title("Conditioning Primitive")
@tags("primitives", "conditioning")
class ConditioningInvocation(BaseInvocation):
"""A conditioning tensor primitive value"""
type: Literal["conditioning"] = "conditioning"
conditioning: ConditioningField = InputField(description=FieldDescriptions.cond, input=Input.Connection)
def invoke(self, context: InvocationContext) -> ConditioningOutput:
return ConditioningOutput(conditioning=self.conditioning)
@title("Conditioning Primitive Collection")
@tags("primitives", "conditioning", "collection")
class ConditioningCollectionInvocation(BaseInvocation):
"""A collection of conditioning tensor primitive values"""
type: Literal["conditioning_collection"] = "conditioning_collection"
# Inputs
collection: list[ConditioningField] = InputField(
default=0, description="The collection of conditioning tensors", ui_type=UIType.ConditioningCollection
)
def invoke(self, context: InvocationContext) -> ConditioningCollectionOutput:
return ConditioningCollectionOutput(collection=self.collection)
# endregion

View File

@ -1,59 +1,28 @@
from os.path import exists
from typing import Literal, Optional
from typing import Literal, Optional, Union
import numpy as np
from pydantic import Field, validator
from dynamicprompts.generators import CombinatorialPromptGenerator, RandomPromptGenerator
from pydantic import validator
from .baseinvocation import BaseInvocation, BaseInvocationOutput, InvocationConfig, InvocationContext
from dynamicprompts.generators import RandomPromptGenerator, CombinatorialPromptGenerator
class PromptOutput(BaseInvocationOutput):
"""Base class for invocations that output a prompt"""
# fmt: off
type: Literal["prompt"] = "prompt"
prompt: str = Field(default=None, description="The output prompt")
# fmt: on
class Config:
schema_extra = {
"required": [
"type",
"prompt",
]
}
class PromptCollectionOutput(BaseInvocationOutput):
"""Base class for invocations that output a collection of prompts"""
# fmt: off
type: Literal["prompt_collection_output"] = "prompt_collection_output"
prompt_collection: list[str] = Field(description="The output prompt collection")
count: int = Field(description="The size of the prompt collection")
# fmt: on
class Config:
schema_extra = {"required": ["type", "prompt_collection", "count"]}
from invokeai.app.invocations.primitives import StringCollectionOutput
from .baseinvocation import BaseInvocation, InputField, InvocationContext, UIComponent, UIType, tags, title
@title("Dynamic Prompt")
@tags("prompt", "collection")
class DynamicPromptInvocation(BaseInvocation):
"""Parses a prompt using adieyal/dynamicprompts' random or combinatorial generator"""
type: Literal["dynamic_prompt"] = "dynamic_prompt"
prompt: str = Field(description="The prompt to parse with dynamicprompts")
max_prompts: int = Field(default=1, description="The number of prompts to generate")
combinatorial: bool = Field(default=False, description="Whether to use the combinatorial generator")
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Dynamic Prompt", "tags": ["prompt", "dynamic"]},
}
# Inputs
prompt: str = InputField(description="The prompt to parse with dynamicprompts", ui_component=UIComponent.Textarea)
max_prompts: int = InputField(default=1, description="The number of prompts to generate")
combinatorial: bool = InputField(default=False, description="Whether to use the combinatorial generator")
def invoke(self, context: InvocationContext) -> PromptCollectionOutput:
def invoke(self, context: InvocationContext) -> StringCollectionOutput:
if self.combinatorial:
generator = CombinatorialPromptGenerator()
prompts = generator.generate(self.prompt, max_prompts=self.max_prompts)
@ -61,27 +30,26 @@ class DynamicPromptInvocation(BaseInvocation):
generator = RandomPromptGenerator()
prompts = generator.generate(self.prompt, num_images=self.max_prompts)
return PromptCollectionOutput(prompt_collection=prompts, count=len(prompts))
return StringCollectionOutput(collection=prompts)
@title("Prompts from File")
@tags("prompt", "file")
class PromptsFromFileInvocation(BaseInvocation):
"""Loads prompts from a text file"""
# fmt: off
type: Literal['prompt_from_file'] = 'prompt_from_file'
type: Literal["prompt_from_file"] = "prompt_from_file"
# Inputs
file_path: str = Field(description="Path to prompt text file")
pre_prompt: Optional[str] = Field(description="String to prepend to each prompt")
post_prompt: Optional[str] = Field(description="String to append to each prompt")
start_line: int = Field(default=1, ge=1, description="Line in the file to start start from")
max_prompts: int = Field(default=1, ge=0, description="Max lines to read from file (0=all)")
# fmt: on
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Prompts From File", "tags": ["prompt", "file"]},
}
file_path: str = InputField(description="Path to prompt text file", ui_type=UIType.FilePath)
pre_prompt: Optional[str] = InputField(
default=None, description="String to prepend to each prompt", ui_component=UIComponent.Textarea
)
post_prompt: Optional[str] = InputField(
default=None, description="String to append to each prompt", ui_component=UIComponent.Textarea
)
start_line: int = InputField(default=1, ge=1, description="Line in the file to start start from")
max_prompts: int = InputField(default=1, ge=0, description="Max lines to read from file (0=all)")
@validator("file_path")
def file_path_exists(cls, v):
@ -89,7 +57,14 @@ class PromptsFromFileInvocation(BaseInvocation):
raise ValueError(FileNotFoundError)
return v
def promptsFromFile(self, file_path: str, pre_prompt: str, post_prompt: str, start_line: int, max_prompts: int):
def promptsFromFile(
self,
file_path: str,
pre_prompt: Union[str, None],
post_prompt: Union[str, None],
start_line: int,
max_prompts: int,
):
prompts = []
start_line -= 1
end_line = start_line + max_prompts
@ -103,8 +78,8 @@ class PromptsFromFileInvocation(BaseInvocation):
break
return prompts
def invoke(self, context: InvocationContext) -> PromptCollectionOutput:
def invoke(self, context: InvocationContext) -> StringCollectionOutput:
prompts = self.promptsFromFile(
self.file_path, self.pre_prompt, self.post_prompt, self.start_line, self.max_prompts
)
return PromptCollectionOutput(prompt_collection=prompts, count=len(prompts))
return StringCollectionOutput(collection=prompts)

View File

@ -1,62 +1,55 @@
import torch
import inspect
from tqdm import tqdm
from typing import List, Literal, Optional, Union
from pydantic import Field, validator
from typing import Literal
from ...backend.model_management import ModelType, SubModelType
from invokeai.app.util.step_callback import stable_diffusion_xl_step_callback
from .baseinvocation import BaseInvocation, BaseInvocationOutput, InvocationConfig, InvocationContext
from .model import UNetField, ClipField, VaeField, MainModelField, ModelInfo
from .compel import ConditioningField
from .latent import LatentsField, SAMPLER_NAME_VALUES, LatentsOutput, get_scheduler, build_latents_output
from .baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
FieldDescriptions,
Input,
InputField,
InvocationContext,
OutputField,
UIType,
tags,
title,
)
from .model import ClipField, MainModelField, ModelInfo, UNetField, VaeField
class SDXLModelLoaderOutput(BaseInvocationOutput):
"""SDXL base model loader output"""
# fmt: off
type: Literal["sdxl_model_loader_output"] = "sdxl_model_loader_output"
unet: UNetField = Field(default=None, description="UNet submodel")
clip: ClipField = Field(default=None, description="Tokenizer and text_encoder submodels")
clip2: ClipField = Field(default=None, description="Tokenizer and text_encoder submodels")
vae: VaeField = Field(default=None, description="Vae submodel")
# fmt: on
unet: UNetField = OutputField(description=FieldDescriptions.unet, title="UNet")
clip: ClipField = OutputField(description=FieldDescriptions.clip, title="CLIP 1")
clip2: ClipField = OutputField(description=FieldDescriptions.clip, title="CLIP 2")
vae: VaeField = OutputField(description=FieldDescriptions.vae, title="VAE")
class SDXLRefinerModelLoaderOutput(BaseInvocationOutput):
"""SDXL refiner model loader output"""
# fmt: off
type: Literal["sdxl_refiner_model_loader_output"] = "sdxl_refiner_model_loader_output"
unet: UNetField = Field(default=None, description="UNet submodel")
clip2: ClipField = Field(default=None, description="Tokenizer and text_encoder submodels")
vae: VaeField = Field(default=None, description="Vae submodel")
# fmt: on
# fmt: on
unet: UNetField = OutputField(description=FieldDescriptions.unet, title="UNet")
clip2: ClipField = OutputField(description=FieldDescriptions.clip, title="CLIP 2")
vae: VaeField = OutputField(description=FieldDescriptions.vae, title="VAE")
@title("SDXL Main Model Loader")
@tags("model", "sdxl")
class SDXLModelLoaderInvocation(BaseInvocation):
"""Loads an sdxl base model, outputting its submodels."""
type: Literal["sdxl_model_loader"] = "sdxl_model_loader"
model: MainModelField = Field(description="The model to load")
# Inputs
model: MainModelField = InputField(
description=FieldDescriptions.sdxl_main_model, input=Input.Direct, ui_type=UIType.SDXLMainModel
)
# TODO: precision?
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "SDXL Model Loader",
"tags": ["model", "loader", "sdxl"],
"type_hints": {"model": "model"},
},
}
def invoke(self, context: InvocationContext) -> SDXLModelLoaderOutput:
base_model = self.model.base_model
model_name = self.model.model_name
@ -129,24 +122,21 @@ class SDXLModelLoaderInvocation(BaseInvocation):
)
@title("SDXL Refiner Model Loader")
@tags("model", "sdxl", "refiner")
class SDXLRefinerModelLoaderInvocation(BaseInvocation):
"""Loads an sdxl refiner model, outputting its submodels."""
type: Literal["sdxl_refiner_model_loader"] = "sdxl_refiner_model_loader"
model: MainModelField = Field(description="The model to load")
# Inputs
model: MainModelField = InputField(
description=FieldDescriptions.sdxl_refiner_model,
input=Input.Direct,
ui_type=UIType.SDXLRefinerModel,
)
# TODO: precision?
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "SDXL Refiner Model Loader",
"tags": ["model", "loader", "sdxl_refiner"],
"type_hints": {"model": "refiner_model"},
},
}
def invoke(self, context: InvocationContext) -> SDXLRefinerModelLoaderOutput:
base_model = self.model.base_model
model_name = self.model.model_name
@ -201,506 +191,3 @@ class SDXLRefinerModelLoaderInvocation(BaseInvocation):
),
),
)
# Text to image
class SDXLTextToLatentsInvocation(BaseInvocation):
"""Generates latents from conditionings."""
type: Literal["t2l_sdxl"] = "t2l_sdxl"
# Inputs
# fmt: off
positive_conditioning: Optional[ConditioningField] = Field(description="Positive conditioning for generation")
negative_conditioning: Optional[ConditioningField] = Field(description="Negative conditioning for generation")
noise: Optional[LatentsField] = Field(description="The noise to use")
steps: int = Field(default=10, gt=0, description="The number of steps to use to generate the image")
cfg_scale: Union[float, List[float]] = Field(default=7.5, ge=1, description="The Classifier-Free Guidance, higher values may result in a result closer to the prompt", )
scheduler: SAMPLER_NAME_VALUES = Field(default="euler", description="The scheduler to use" )
unet: UNetField = Field(default=None, description="UNet submodel")
denoising_end: float = Field(default=1.0, gt=0, le=1, description="")
# control: Union[ControlField, list[ControlField]] = Field(default=None, description="The control to use")
# seamless: bool = Field(default=False, description="Whether or not to generate an image that can tile without seams", )
# seamless_axes: str = Field(default="", description="The axes to tile the image on, 'x' and/or 'y'")
# fmt: on
@validator("cfg_scale")
def ge_one(cls, v):
"""validate that all cfg_scale values are >= 1"""
if isinstance(v, list):
for i in v:
if i < 1:
raise ValueError("cfg_scale must be greater than 1")
else:
if v < 1:
raise ValueError("cfg_scale must be greater than 1")
return v
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "SDXL Text To Latents",
"tags": ["latents"],
"type_hints": {
"model": "model",
# "cfg_scale": "float",
"cfg_scale": "number",
},
},
}
def dispatch_progress(
self,
context: InvocationContext,
source_node_id: str,
sample,
step,
total_steps,
) -> None:
stable_diffusion_xl_step_callback(
context=context,
node=self.dict(),
source_node_id=source_node_id,
sample=sample,
step=step,
total_steps=total_steps,
)
# based on
# https://github.com/huggingface/diffusers/blob/3ebbaf7c96801271f9e6c21400033b6aa5ffcf29/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion.py#L375
@torch.no_grad()
def invoke(self, context: InvocationContext) -> LatentsOutput:
graph_execution_state = context.services.graph_execution_manager.get(context.graph_execution_state_id)
source_node_id = graph_execution_state.prepared_source_mapping[self.id]
latents = context.services.latents.get(self.noise.latents_name)
positive_cond_data = context.services.latents.get(self.positive_conditioning.conditioning_name)
prompt_embeds = positive_cond_data.conditionings[0].embeds
pooled_prompt_embeds = positive_cond_data.conditionings[0].pooled_embeds
add_time_ids = positive_cond_data.conditionings[0].add_time_ids
negative_cond_data = context.services.latents.get(self.negative_conditioning.conditioning_name)
negative_prompt_embeds = negative_cond_data.conditionings[0].embeds
negative_pooled_prompt_embeds = negative_cond_data.conditionings[0].pooled_embeds
add_neg_time_ids = negative_cond_data.conditionings[0].add_time_ids
scheduler = get_scheduler(
context=context,
scheduler_info=self.unet.scheduler,
scheduler_name=self.scheduler,
)
num_inference_steps = self.steps
unet_info = context.services.model_manager.get_model(**self.unet.unet.dict(), context=context)
do_classifier_free_guidance = True
cross_attention_kwargs = None
with unet_info as unet:
scheduler.set_timesteps(num_inference_steps, device=unet.device)
timesteps = scheduler.timesteps
latents = latents.to(device=unet.device, dtype=unet.dtype) * scheduler.init_noise_sigma
extra_step_kwargs = dict()
if "eta" in set(inspect.signature(scheduler.step).parameters.keys()):
extra_step_kwargs.update(
eta=0.0,
)
if "generator" in set(inspect.signature(scheduler.step).parameters.keys()):
extra_step_kwargs.update(
generator=torch.Generator(device=unet.device).manual_seed(0),
)
num_warmup_steps = len(timesteps) - self.steps * scheduler.order
# apply denoising_end
skipped_final_steps = int(round((1 - self.denoising_end) * self.steps))
num_inference_steps = num_inference_steps - skipped_final_steps
timesteps = timesteps[: num_warmup_steps + scheduler.order * num_inference_steps]
if not context.services.configuration.sequential_guidance:
prompt_embeds = torch.cat([negative_prompt_embeds, prompt_embeds], dim=0)
add_text_embeds = torch.cat([negative_pooled_prompt_embeds, pooled_prompt_embeds], dim=0)
add_time_ids = torch.cat([add_neg_time_ids, add_time_ids], dim=0)
prompt_embeds = prompt_embeds.to(device=unet.device, dtype=unet.dtype)
add_text_embeds = add_text_embeds.to(device=unet.device, dtype=unet.dtype)
add_time_ids = add_time_ids.to(device=unet.device, dtype=unet.dtype)
latents = latents.to(device=unet.device, dtype=unet.dtype)
with tqdm(total=num_inference_steps) as progress_bar:
for i, t in enumerate(timesteps):
# expand the latents if we are doing classifier free guidance
latent_model_input = torch.cat([latents] * 2) if do_classifier_free_guidance else latents
latent_model_input = scheduler.scale_model_input(latent_model_input, t)
# predict the noise residual
added_cond_kwargs = {"text_embeds": add_text_embeds, "time_ids": add_time_ids}
noise_pred = unet(
latent_model_input,
t,
encoder_hidden_states=prompt_embeds,
cross_attention_kwargs=cross_attention_kwargs,
added_cond_kwargs=added_cond_kwargs,
return_dict=False,
)[0]
# perform guidance
if do_classifier_free_guidance:
noise_pred_uncond, noise_pred_text = noise_pred.chunk(2)
noise_pred = noise_pred_uncond + self.cfg_scale * (noise_pred_text - noise_pred_uncond)
# del noise_pred_uncond
# del noise_pred_text
# if do_classifier_free_guidance and guidance_rescale > 0.0:
# # Based on 3.4. in https://arxiv.org/pdf/2305.08891.pdf
# noise_pred = rescale_noise_cfg(noise_pred, noise_pred_text, guidance_rescale=guidance_rescale)
# compute the previous noisy sample x_t -> x_t-1
latents = scheduler.step(noise_pred, t, latents, **extra_step_kwargs, return_dict=False)[0]
# call the callback, if provided
if i == len(timesteps) - 1 or ((i + 1) > num_warmup_steps and (i + 1) % scheduler.order == 0):
progress_bar.update()
self.dispatch_progress(context, source_node_id, latents, i, num_inference_steps)
# if callback is not None and i % callback_steps == 0:
# callback(i, t, latents)
else:
negative_pooled_prompt_embeds = negative_pooled_prompt_embeds.to(device=unet.device, dtype=unet.dtype)
negative_prompt_embeds = negative_prompt_embeds.to(device=unet.device, dtype=unet.dtype)
add_neg_time_ids = add_neg_time_ids.to(device=unet.device, dtype=unet.dtype)
pooled_prompt_embeds = pooled_prompt_embeds.to(device=unet.device, dtype=unet.dtype)
prompt_embeds = prompt_embeds.to(device=unet.device, dtype=unet.dtype)
add_time_ids = add_time_ids.to(device=unet.device, dtype=unet.dtype)
latents = latents.to(device=unet.device, dtype=unet.dtype)
with tqdm(total=num_inference_steps) as progress_bar:
for i, t in enumerate(timesteps):
# expand the latents if we are doing classifier free guidance
# latent_model_input = torch.cat([latents] * 2) if do_classifier_free_guidance else latents
latent_model_input = scheduler.scale_model_input(latents, t)
# import gc
# gc.collect()
# torch.cuda.empty_cache()
# predict the noise residual
added_cond_kwargs = {"text_embeds": negative_pooled_prompt_embeds, "time_ids": add_neg_time_ids}
noise_pred_uncond = unet(
latent_model_input,
t,
encoder_hidden_states=negative_prompt_embeds,
cross_attention_kwargs=cross_attention_kwargs,
added_cond_kwargs=added_cond_kwargs,
return_dict=False,
)[0]
added_cond_kwargs = {"text_embeds": pooled_prompt_embeds, "time_ids": add_time_ids}
noise_pred_text = unet(
latent_model_input,
t,
encoder_hidden_states=prompt_embeds,
cross_attention_kwargs=cross_attention_kwargs,
added_cond_kwargs=added_cond_kwargs,
return_dict=False,
)[0]
# perform guidance
noise_pred = noise_pred_uncond + self.cfg_scale * (noise_pred_text - noise_pred_uncond)
# del noise_pred_text
# del noise_pred_uncond
# import gc
# gc.collect()
# torch.cuda.empty_cache()
# if do_classifier_free_guidance and guidance_rescale > 0.0:
# # Based on 3.4. in https://arxiv.org/pdf/2305.08891.pdf
# noise_pred = rescale_noise_cfg(noise_pred, noise_pred_text, guidance_rescale=guidance_rescale)
# compute the previous noisy sample x_t -> x_t-1
latents = scheduler.step(noise_pred, t, latents, **extra_step_kwargs, return_dict=False)[0]
# del noise_pred
# import gc
# gc.collect()
# torch.cuda.empty_cache()
# call the callback, if provided
if i == len(timesteps) - 1 or ((i + 1) > num_warmup_steps and (i + 1) % scheduler.order == 0):
progress_bar.update()
self.dispatch_progress(context, source_node_id, latents, i, num_inference_steps)
# if callback is not None and i % callback_steps == 0:
# callback(i, t, latents)
#################
latents = latents.to("cpu")
torch.cuda.empty_cache()
name = f"{context.graph_execution_state_id}__{self.id}"
context.services.latents.save(name, latents)
return build_latents_output(latents_name=name, latents=latents)
class SDXLLatentsToLatentsInvocation(BaseInvocation):
"""Generates latents from conditionings."""
type: Literal["l2l_sdxl"] = "l2l_sdxl"
# Inputs
# fmt: off
positive_conditioning: Optional[ConditioningField] = Field(description="Positive conditioning for generation")
negative_conditioning: Optional[ConditioningField] = Field(description="Negative conditioning for generation")
noise: Optional[LatentsField] = Field(description="The noise to use")
steps: int = Field(default=10, gt=0, description="The number of steps to use to generate the image")
cfg_scale: Union[float, List[float]] = Field(default=7.5, ge=1, description="The Classifier-Free Guidance, higher values may result in a result closer to the prompt", )
scheduler: SAMPLER_NAME_VALUES = Field(default="euler", description="The scheduler to use" )
unet: UNetField = Field(default=None, description="UNet submodel")
latents: Optional[LatentsField] = Field(description="Initial latents")
denoising_start: float = Field(default=0.0, ge=0, le=1, description="")
denoising_end: float = Field(default=1.0, ge=0, le=1, description="")
# control: Union[ControlField, list[ControlField]] = Field(default=None, description="The control to use")
# seamless: bool = Field(default=False, description="Whether or not to generate an image that can tile without seams", )
# seamless_axes: str = Field(default="", description="The axes to tile the image on, 'x' and/or 'y'")
# fmt: on
@validator("cfg_scale")
def ge_one(cls, v):
"""validate that all cfg_scale values are >= 1"""
if isinstance(v, list):
for i in v:
if i < 1:
raise ValueError("cfg_scale must be greater than 1")
else:
if v < 1:
raise ValueError("cfg_scale must be greater than 1")
return v
# Schema customisation
class Config(InvocationConfig):
schema_extra = {
"ui": {
"title": "SDXL Latents to Latents",
"tags": ["latents"],
"type_hints": {
"model": "model",
# "cfg_scale": "float",
"cfg_scale": "number",
},
},
}
def dispatch_progress(
self,
context: InvocationContext,
source_node_id: str,
sample,
step,
total_steps,
) -> None:
stable_diffusion_xl_step_callback(
context=context,
node=self.dict(),
source_node_id=source_node_id,
sample=sample,
step=step,
total_steps=total_steps,
)
# based on
# https://github.com/huggingface/diffusers/blob/3ebbaf7c96801271f9e6c21400033b6aa5ffcf29/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion.py#L375
@torch.no_grad()
def invoke(self, context: InvocationContext) -> LatentsOutput:
graph_execution_state = context.services.graph_execution_manager.get(context.graph_execution_state_id)
source_node_id = graph_execution_state.prepared_source_mapping[self.id]
latents = context.services.latents.get(self.latents.latents_name)
positive_cond_data = context.services.latents.get(self.positive_conditioning.conditioning_name)
prompt_embeds = positive_cond_data.conditionings[0].embeds
pooled_prompt_embeds = positive_cond_data.conditionings[0].pooled_embeds
add_time_ids = positive_cond_data.conditionings[0].add_time_ids
negative_cond_data = context.services.latents.get(self.negative_conditioning.conditioning_name)
negative_prompt_embeds = negative_cond_data.conditionings[0].embeds
negative_pooled_prompt_embeds = negative_cond_data.conditionings[0].pooled_embeds
add_neg_time_ids = negative_cond_data.conditionings[0].add_time_ids
scheduler = get_scheduler(
context=context,
scheduler_info=self.unet.scheduler,
scheduler_name=self.scheduler,
)
unet_info = context.services.model_manager.get_model(
**self.unet.unet.dict(),
context=context,
)
do_classifier_free_guidance = True
cross_attention_kwargs = None
with unet_info as unet:
# apply denoising_start
num_inference_steps = self.steps
scheduler.set_timesteps(num_inference_steps, device=unet.device)
t_start = int(round(self.denoising_start * num_inference_steps))
timesteps = scheduler.timesteps[t_start * scheduler.order :]
num_inference_steps = num_inference_steps - t_start
# apply noise(if provided)
if self.noise is not None and timesteps.shape[0] > 0:
noise = context.services.latents.get(self.noise.latents_name)
latents = scheduler.add_noise(latents, noise, timesteps[:1])
del noise
# apply scheduler extra args
extra_step_kwargs = dict()
if "eta" in set(inspect.signature(scheduler.step).parameters.keys()):
extra_step_kwargs.update(
eta=0.0,
)
if "generator" in set(inspect.signature(scheduler.step).parameters.keys()):
extra_step_kwargs.update(
generator=torch.Generator(device=unet.device).manual_seed(0),
)
num_warmup_steps = max(len(timesteps) - num_inference_steps * scheduler.order, 0)
# apply denoising_end
skipped_final_steps = int(round((1 - self.denoising_end) * self.steps))
num_inference_steps = num_inference_steps - skipped_final_steps
timesteps = timesteps[: num_warmup_steps + scheduler.order * num_inference_steps]
if not context.services.configuration.sequential_guidance:
prompt_embeds = torch.cat([negative_prompt_embeds, prompt_embeds], dim=0)
add_text_embeds = torch.cat([negative_pooled_prompt_embeds, pooled_prompt_embeds], dim=0)
add_time_ids = torch.cat([add_neg_time_ids, add_time_ids], dim=0)
prompt_embeds = prompt_embeds.to(device=unet.device, dtype=unet.dtype)
add_text_embeds = add_text_embeds.to(device=unet.device, dtype=unet.dtype)
add_time_ids = add_time_ids.to(device=unet.device, dtype=unet.dtype)
latents = latents.to(device=unet.device, dtype=unet.dtype)
with tqdm(total=num_inference_steps) as progress_bar:
for i, t in enumerate(timesteps):
# expand the latents if we are doing classifier free guidance
latent_model_input = torch.cat([latents] * 2) if do_classifier_free_guidance else latents
latent_model_input = scheduler.scale_model_input(latent_model_input, t)
# predict the noise residual
added_cond_kwargs = {"text_embeds": add_text_embeds, "time_ids": add_time_ids}
noise_pred = unet(
latent_model_input,
t,
encoder_hidden_states=prompt_embeds,
cross_attention_kwargs=cross_attention_kwargs,
added_cond_kwargs=added_cond_kwargs,
return_dict=False,
)[0]
# perform guidance
if do_classifier_free_guidance:
noise_pred_uncond, noise_pred_text = noise_pred.chunk(2)
noise_pred = noise_pred_uncond + self.cfg_scale * (noise_pred_text - noise_pred_uncond)
# del noise_pred_uncond
# del noise_pred_text
# if do_classifier_free_guidance and guidance_rescale > 0.0:
# # Based on 3.4. in https://arxiv.org/pdf/2305.08891.pdf
# noise_pred = rescale_noise_cfg(noise_pred, noise_pred_text, guidance_rescale=guidance_rescale)
# compute the previous noisy sample x_t -> x_t-1
latents = scheduler.step(noise_pred, t, latents, **extra_step_kwargs, return_dict=False)[0]
# call the callback, if provided
if i == len(timesteps) - 1 or ((i + 1) > num_warmup_steps and (i + 1) % scheduler.order == 0):
progress_bar.update()
self.dispatch_progress(context, source_node_id, latents, i, num_inference_steps)
# if callback is not None and i % callback_steps == 0:
# callback(i, t, latents)
else:
negative_pooled_prompt_embeds = negative_pooled_prompt_embeds.to(device=unet.device, dtype=unet.dtype)
negative_prompt_embeds = negative_prompt_embeds.to(device=unet.device, dtype=unet.dtype)
add_neg_time_ids = add_neg_time_ids.to(device=unet.device, dtype=unet.dtype)
pooled_prompt_embeds = pooled_prompt_embeds.to(device=unet.device, dtype=unet.dtype)
prompt_embeds = prompt_embeds.to(device=unet.device, dtype=unet.dtype)
add_time_ids = add_time_ids.to(device=unet.device, dtype=unet.dtype)
latents = latents.to(device=unet.device, dtype=unet.dtype)
with tqdm(total=num_inference_steps) as progress_bar:
for i, t in enumerate(timesteps):
# expand the latents if we are doing classifier free guidance
# latent_model_input = torch.cat([latents] * 2) if do_classifier_free_guidance else latents
latent_model_input = scheduler.scale_model_input(latents, t)
# import gc
# gc.collect()
# torch.cuda.empty_cache()
# predict the noise residual
added_cond_kwargs = {"text_embeds": negative_pooled_prompt_embeds, "time_ids": add_time_ids}
noise_pred_uncond = unet(
latent_model_input,
t,
encoder_hidden_states=negative_prompt_embeds,
cross_attention_kwargs=cross_attention_kwargs,
added_cond_kwargs=added_cond_kwargs,
return_dict=False,
)[0]
added_cond_kwargs = {"text_embeds": pooled_prompt_embeds, "time_ids": add_time_ids}
noise_pred_text = unet(
latent_model_input,
t,
encoder_hidden_states=prompt_embeds,
cross_attention_kwargs=cross_attention_kwargs,
added_cond_kwargs=added_cond_kwargs,
return_dict=False,
)[0]
# perform guidance
noise_pred = noise_pred_uncond + self.cfg_scale * (noise_pred_text - noise_pred_uncond)
# del noise_pred_text
# del noise_pred_uncond
# import gc
# gc.collect()
# torch.cuda.empty_cache()
# if do_classifier_free_guidance and guidance_rescale > 0.0:
# # Based on 3.4. in https://arxiv.org/pdf/2305.08891.pdf
# noise_pred = rescale_noise_cfg(noise_pred, noise_pred_text, guidance_rescale=guidance_rescale)
# compute the previous noisy sample x_t -> x_t-1
latents = scheduler.step(noise_pred, t, latents, **extra_step_kwargs, return_dict=False)[0]
# del noise_pred
# import gc
# gc.collect()
# torch.cuda.empty_cache()
# call the callback, if provided
if i == len(timesteps) - 1 or ((i + 1) > num_warmup_steps and (i + 1) % scheduler.order == 0):
progress_bar.update()
self.dispatch_progress(context, source_node_id, latents, i, num_inference_steps)
# if callback is not None and i % callback_steps == 0:
# callback(i, t, latents)
#################
latents = latents.to("cpu")
torch.cuda.empty_cache()
name = f"{context.graph_execution_state_id}__{self.id}"
context.services.latents.save(name, latents)
return build_latents_output(latents_name=name, latents=latents)

View File

@ -6,13 +6,12 @@ import cv2 as cv
import numpy as np
from basicsr.archs.rrdbnet_arch import RRDBNet
from PIL import Image
from pydantic import Field
from realesrgan import RealESRGANer
from invokeai.app.invocations.primitives import ImageField, ImageOutput
from invokeai.app.models.image import ImageCategory, ImageField, ResourceOrigin
from invokeai.app.models.image import ImageCategory, ResourceOrigin
from .baseinvocation import BaseInvocation, InvocationConfig, InvocationContext
from .image import ImageOutput
from .baseinvocation import BaseInvocation, InputField, InvocationContext, title, tags
# TODO: Populate this from disk?
# TODO: Use model manager to load?
@ -24,17 +23,16 @@ ESRGAN_MODELS = Literal[
]
@title("Upscale (RealESRGAN)")
@tags("esrgan", "upscale")
class ESRGANInvocation(BaseInvocation):
"""Upscales an image using RealESRGAN."""
type: Literal["esrgan"] = "esrgan"
image: Union[ImageField, None] = Field(default=None, description="The input image")
model_name: ESRGAN_MODELS = Field(default="RealESRGAN_x4plus.pth", description="The Real-ESRGAN model to use")
class Config(InvocationConfig):
schema_extra = {
"ui": {"title": "Upscale (RealESRGAN)", "tags": ["image", "upscale", "realesrgan"]},
}
# Inputs
image: ImageField = InputField(description="The input image")
model_name: ESRGAN_MODELS = InputField(default="RealESRGAN_x4plus.pth", description="The Real-ESRGAN model to use")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.services.images.get_pil_image(self.image.image_name)

View File

@ -1,31 +1,8 @@
from enum import Enum
from typing import Optional, Tuple, Literal
from pydantic import BaseModel, Field
from invokeai.app.util.metaenum import MetaEnum
from ..invocations.baseinvocation import (
BaseInvocationOutput,
InvocationConfig,
)
class ImageField(BaseModel):
"""An image field used for passing image objects between invocations"""
image_name: Optional[str] = Field(default=None, description="The name of the image")
class Config:
schema_extra = {"required": ["image_name"]}
class ColorField(BaseModel):
r: int = Field(ge=0, le=255, description="The red component")
g: int = Field(ge=0, le=255, description="The green component")
b: int = Field(ge=0, le=255, description="The blue component")
a: int = Field(ge=0, le=255, description="The alpha component")
def tuple(self) -> Tuple[int, int, int, int]:
return (self.r, self.g, self.b, self.a)
class ProgressImage(BaseModel):
@ -36,50 +13,6 @@ class ProgressImage(BaseModel):
dataURL: str = Field(description="The image data as a b64 data URL")
class PILInvocationConfig(BaseModel):
"""Helper class to provide all PIL invocations with additional config"""
class Config(InvocationConfig):
schema_extra = {
"ui": {
"tags": ["PIL", "image"],
},
}
class ImageOutput(BaseInvocationOutput):
"""Base class for invocations that output an image"""
# fmt: off
type: Literal["image_output"] = "image_output"
image: ImageField = Field(default=None, description="The output image")
width: int = Field(description="The width of the image in pixels")
height: int = Field(description="The height of the image in pixels")
# fmt: on
class Config:
schema_extra = {"required": ["type", "image", "width", "height"]}
class MaskOutput(BaseInvocationOutput):
"""Base class for invocations that output a mask"""
# fmt: off
type: Literal["mask"] = "mask"
mask: ImageField = Field(default=None, description="The output mask")
width: int = Field(description="The width of the mask in pixels")
height: int = Field(description="The height of the mask in pixels")
# fmt: on
class Config:
schema_extra = {
"required": [
"type",
"mask",
]
}
class ResourceOrigin(str, Enum, metaclass=MetaEnum):
"""The origin of a resource (eg image).

View File

@ -25,7 +25,6 @@ class BoardImageRecordStorageBase(ABC):
@abstractmethod
def remove_image_from_board(
self,
board_id: str,
image_name: str,
) -> None:
"""Removes an image from a board."""
@ -154,7 +153,6 @@ class SqliteBoardImageRecordStorage(BoardImageRecordStorageBase):
def remove_image_from_board(
self,
board_id: str,
image_name: str,
) -> None:
try:
@ -162,9 +160,9 @@ class SqliteBoardImageRecordStorage(BoardImageRecordStorageBase):
self._cursor.execute(
"""--sql
DELETE FROM board_images
WHERE board_id = ? AND image_name = ?;
WHERE image_name = ?;
""",
(board_id, image_name),
(image_name,),
)
self._conn.commit()
except sqlite3.Error as e:

View File

@ -31,7 +31,6 @@ class BoardImagesServiceABC(ABC):
@abstractmethod
def remove_image_from_board(
self,
board_id: str,
image_name: str,
) -> None:
"""Removes an image from a board."""
@ -93,10 +92,9 @@ class BoardImagesService(BoardImagesServiceABC):
def remove_image_from_board(
self,
board_id: str,
image_name: str,
) -> None:
self._services.board_image_records.remove_image_from_board(board_id, image_name)
self._services.board_image_records.remove_image_from_board(image_name)
def get_all_board_image_names_for_board(
self,

View File

@ -24,11 +24,10 @@ InvokeAI:
sequential_guidance: false
precision: float16
max_cache_size: 6
max_vram_cache_size: 2.7
max_vram_cache_size: 0.5
always_use_cpu: false
free_gpu_mem: false
Features:
restore: true
esrgan: true
patchmatch: true
internet_available: true
@ -165,7 +164,7 @@ import pydoc
import os
import sys
from argparse import ArgumentParser
from omegaconf import OmegaConf, DictConfig
from omegaconf import OmegaConf, DictConfig, ListConfig
from pathlib import Path
from pydantic import BaseSettings, Field, parse_obj_as
from typing import ClassVar, Dict, List, Set, Literal, Union, get_origin, get_type_hints, get_args
@ -173,6 +172,7 @@ from typing import ClassVar, Dict, List, Set, Literal, Union, get_origin, get_ty
INIT_FILE = Path("invokeai.yaml")
DB_FILE = Path("invokeai.db")
LEGACY_INIT_FILE = Path("invokeai.init")
DEFAULT_MAX_VRAM = 0.5
class InvokeAISettings(BaseSettings):
@ -189,7 +189,12 @@ class InvokeAISettings(BaseSettings):
opt = parser.parse_args(argv)
for name in self.__fields__:
if name not in self._excluded():
setattr(self, name, getattr(opt, name))
value = getattr(opt, name)
if isinstance(value, ListConfig):
value = list(value)
elif isinstance(value, DictConfig):
value = dict(value)
setattr(self, name, value)
def to_yaml(self) -> str:
"""
@ -274,7 +279,7 @@ class InvokeAISettings(BaseSettings):
@classmethod
def _excluded(self) -> List[str]:
# internal fields that shouldn't be exposed as command line options
return ["type", "initconf", "cached_root"]
return ["type", "initconf"]
@classmethod
def _excluded_from_yaml(self) -> List[str]:
@ -282,15 +287,10 @@ class InvokeAISettings(BaseSettings):
return [
"type",
"initconf",
"gpu_mem_reserved",
"max_loaded_models",
"version",
"from_file",
"model",
"restore",
"root",
"nsfw_checker",
"cached_root",
]
class Config:
@ -356,7 +356,7 @@ class InvokeAISettings(BaseSettings):
def _find_root() -> Path:
venv = Path(os.environ.get("VIRTUAL_ENV") or ".")
if os.environ.get("INVOKEAI_ROOT"):
root = Path(os.environ.get("INVOKEAI_ROOT")).resolve()
root = Path(os.environ["INVOKEAI_ROOT"])
elif any([(venv.parent / x).exists() for x in [INIT_FILE, LEGACY_INIT_FILE]]):
root = (venv.parent).resolve()
else:
@ -389,21 +389,17 @@ class InvokeAIAppConfig(InvokeAISettings):
internet_available : bool = Field(default=True, description="If true, attempt to download models on the fly; otherwise only use local models", category='Features')
log_tokenization : bool = Field(default=False, description="Enable logging of parsed prompt tokens.", category='Features')
patchmatch : bool = Field(default=True, description="Enable/disable patchmatch inpaint code", category='Features')
restore : bool = Field(default=True, description="Enable/disable face restoration code (DEPRECATED)", category='DEPRECATED')
always_use_cpu : bool = Field(default=False, description="If true, use the CPU for rendering even if a GPU is available.", category='Memory/Performance')
free_gpu_mem : bool = Field(default=False, description="If true, purge model from GPU after each generation.", category='Memory/Performance')
max_loaded_models : int = Field(default=3, gt=0, description="(DEPRECATED: use max_cache_size) Maximum number of models to keep in memory for rapid switching", category='DEPRECATED')
max_cache_size : float = Field(default=6.0, gt=0, description="Maximum memory amount used by model cache for rapid switching", category='Memory/Performance')
max_vram_cache_size : float = Field(default=2.75, ge=0, description="Amount of VRAM reserved for model storage", category='Memory/Performance')
gpu_mem_reserved : float = Field(default=2.75, ge=0, description="DEPRECATED: use max_vram_cache_size. Amount of VRAM reserved for model storage", category='DEPRECATED')
nsfw_checker : bool = Field(default=True, description="DEPRECATED: use Web settings to enable/disable", category='DEPRECATED')
precision : Literal[tuple(['auto','float16','float32','autocast'])] = Field(default='auto',description='Floating point precision', category='Memory/Performance')
sequential_guidance : bool = Field(default=False, description="Whether to calculate guidance in serial instead of in parallel, lowering memory requirements", category='Memory/Performance')
xformers_enabled : bool = Field(default=True, description="Enable/disable memory-efficient attention", category='Memory/Performance')
tiled_decode : bool = Field(default=False, description="Whether to enable tiled VAE decode (reduces memory consumption with some performance penalty)", category='Memory/Performance')
root : Path = Field(default=_find_root(), description='InvokeAI runtime root directory', category='Paths')
root : Path = Field(default=None, description='InvokeAI runtime root directory', category='Paths')
autoimport_dir : Path = Field(default='autoimport', description='Path to a directory of models files to be imported on startup.', category='Paths')
lora_dir : Path = Field(default=None, description='Path to a directory of LoRA/LyCORIS models to be imported on startup.', category='Paths')
embedding_dir : Path = Field(default=None, description='Path to a directory of Textual Inversion embeddings to be imported on startup.', category='Paths')
@ -415,8 +411,7 @@ class InvokeAIAppConfig(InvokeAISettings):
outdir : Path = Field(default='outputs', description='Default folder for output images', category='Paths')
from_file : Path = Field(default=None, description='Take command input from the indicated file (command-line client only)', category='Paths')
use_memory_db : bool = Field(default=False, description='Use in-memory database for storing image metadata', category='Paths')
model : str = Field(default='stable-diffusion-1.5', description='Initial model name', category='Models')
ignore_missing_core_models : bool = Field(default=False, description='Ignore missing models in models/core/convert', category='Features')
log_handlers : List[str] = Field(default=["console"], description='Log handler. Valid options are "console", "file=<path>", "syslog=path|address:host:port", "http=<url>"', category="Logging")
# note - would be better to read the log_format values from logging.py, but this creates circular dependencies issues
@ -424,9 +419,11 @@ class InvokeAIAppConfig(InvokeAISettings):
log_level : Literal[tuple(["debug","info","warning","error","critical"])] = Field(default="info", description="Emit logging messages at this level or higher", category="Logging")
version : bool = Field(default=False, description="Show InvokeAI version and exit", category="Other")
cached_root : Path = Field(default=None, description="internal use only", category="DEPRECATED")
# fmt: on
class Config:
validate_assignment = True
def parse_args(self, argv: List[str] = None, conf: DictConfig = None, clobber=False):
"""
Update settings with contents of init file, environment, and
@ -472,15 +469,12 @@ class InvokeAIAppConfig(InvokeAISettings):
"""
Path to the runtime root directory
"""
# we cache value of root to protect against it being '.' and the cwd changing
if self.cached_root:
root = self.cached_root
elif self.root:
if self.root:
root = Path(self.root).expanduser().absolute()
else:
root = self.find_root()
self.cached_root = root
return self.cached_root
root = self.find_root().expanduser().absolute()
self.root = root # insulate ourselves from relative paths that may change
return root
@property
def root_dir(self) -> Path:

View File

@ -1,8 +1,8 @@
from ..invocations.latent import LatentsToImageInvocation, TextToLatentsInvocation
from ..invocations.latent import LatentsToImageInvocation, DenoiseLatentsInvocation
from ..invocations.image import ImageNSFWBlurInvocation
from ..invocations.noise import NoiseInvocation
from ..invocations.compel import CompelInvocation
from ..invocations.params import ParamIntInvocation
from ..invocations.primitives import IntegerInvocation
from .graph import Edge, EdgeConnection, ExposedNodeInput, ExposedNodeOutput, Graph, LibraryGraph
from .item_storage import ItemStorageABC
@ -17,13 +17,13 @@ def create_text_to_image() -> LibraryGraph:
description="Converts text to an image",
graph=Graph(
nodes={
"width": ParamIntInvocation(id="width", a=512),
"height": ParamIntInvocation(id="height", a=512),
"seed": ParamIntInvocation(id="seed", a=-1),
"width": IntegerInvocation(id="width", a=512),
"height": IntegerInvocation(id="height", a=512),
"seed": IntegerInvocation(id="seed", a=-1),
"3": NoiseInvocation(id="3"),
"4": CompelInvocation(id="4"),
"5": CompelInvocation(id="5"),
"6": TextToLatentsInvocation(id="6"),
"6": DenoiseLatentsInvocation(id="6"),
"7": LatentsToImageInvocation(id="7"),
"8": ImageNSFWBlurInvocation(id="8"),
},

View File

@ -35,6 +35,7 @@ class EventServiceBase:
source_node_id: str,
progress_image: Optional[ProgressImage],
step: int,
order: int,
total_steps: int,
) -> None:
"""Emitted when there is generation progress"""
@ -46,6 +47,7 @@ class EventServiceBase:
source_node_id=source_node_id,
progress_image=progress_image.dict() if progress_image is not None else None,
step=step,
order=order,
total_steps=total_steps,
),
)

View File

@ -3,16 +3,7 @@
import copy
import itertools
import uuid
from typing import (
Annotated,
Any,
Literal,
Optional,
Union,
get_args,
get_origin,
get_type_hints,
)
from typing import Annotated, Any, Literal, Optional, Union, get_args, get_origin, get_type_hints
import networkx as nx
from pydantic import BaseModel, root_validator, validator
@ -22,7 +13,11 @@ from ..invocations import *
from ..invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
Input,
InputField,
InvocationContext,
OutputField,
UIType,
)
# in 3.10 this would be "from types import NoneType"
@ -183,15 +178,9 @@ class IterateInvocationOutput(BaseInvocationOutput):
type: Literal["iterate_output"] = "iterate_output"
item: Any = Field(description="The item being iterated over")
class Config:
schema_extra = {
"required": [
"type",
"item",
]
}
item: Any = OutputField(
description="The item being iterated over", title="Collection Item", ui_type=UIType.CollectionItem
)
# TODO: Fill this out and move to invocations
@ -200,8 +189,10 @@ class IterateInvocation(BaseInvocation):
type: Literal["iterate"] = "iterate"
collection: list[Any] = Field(description="The list of items to iterate over", default_factory=list)
index: int = Field(description="The index, will be provided on executed iterators", default=0)
collection: list[Any] = InputField(
description="The list of items to iterate over", default_factory=list, ui_type=UIType.Collection
)
index: int = InputField(description="The index, will be provided on executed iterators", default=0, ui_hidden=True)
def invoke(self, context: InvocationContext) -> IterateInvocationOutput:
"""Produces the outputs as values"""
@ -211,15 +202,9 @@ class IterateInvocation(BaseInvocation):
class CollectInvocationOutput(BaseInvocationOutput):
type: Literal["collect_output"] = "collect_output"
collection: list[Any] = Field(description="The collection of input items")
class Config:
schema_extra = {
"required": [
"type",
"collection",
]
}
collection: list[Any] = OutputField(
description="The collection of input items", title="Collection", ui_type=UIType.Collection
)
class CollectInvocation(BaseInvocation):
@ -227,13 +212,14 @@ class CollectInvocation(BaseInvocation):
type: Literal["collect"] = "collect"
item: Any = Field(
item: Any = InputField(
description="The item to collect (all inputs must be of the same type)",
default=None,
ui_type=UIType.CollectionItem,
title="Collection Item",
input=Input.Connection,
)
collection: list[Any] = Field(
description="The collection, will be provided on execution",
default_factory=list,
collection: list[Any] = InputField(
description="The collection, will be provided on execution", default_factory=list, ui_hidden=True
)
def invoke(self, context: InvocationContext) -> CollectInvocationOutput:

View File

@ -67,6 +67,7 @@ IMAGE_DTO_COLS = ", ".join(
"created_at",
"updated_at",
"deleted_at",
"starred",
],
)
)
@ -139,6 +140,7 @@ class ImageRecordStorageBase(ABC):
node_id: Optional[str],
metadata: Optional[dict],
is_intermediate: bool = False,
starred: bool = False,
) -> datetime:
"""Saves an image record."""
pass
@ -200,6 +202,16 @@ class SqliteImageRecordStorage(ImageRecordStorageBase):
"""
)
self._cursor.execute("PRAGMA table_info(images)")
columns = [column[1] for column in self._cursor.fetchall()]
if "starred" not in columns:
self._cursor.execute(
"""--sql
ALTER TABLE images ADD COLUMN starred BOOLEAN DEFAULT FALSE;
"""
)
# Create the `images` table indices.
self._cursor.execute(
"""--sql
@ -222,6 +234,12 @@ class SqliteImageRecordStorage(ImageRecordStorageBase):
"""
)
self._cursor.execute(
"""--sql
CREATE INDEX IF NOT EXISTS idx_images_starred ON images(starred);
"""
)
# Add trigger for `updated_at`.
self._cursor.execute(
"""--sql
@ -321,6 +339,17 @@ class SqliteImageRecordStorage(ImageRecordStorageBase):
(changes.is_intermediate, image_name),
)
# Change the image's `starred`` state
if changes.starred is not None:
self._cursor.execute(
f"""--sql
UPDATE images
SET starred = ?
WHERE image_name = ?;
""",
(changes.starred, image_name),
)
self._conn.commit()
except sqlite3.Error as e:
self._conn.rollback()
@ -397,7 +426,7 @@ class SqliteImageRecordStorage(ImageRecordStorageBase):
query_params.append(board_id)
query_pagination = """--sql
ORDER BY images.created_at DESC LIMIT ? OFFSET ?
ORDER BY images.starred DESC, images.created_at DESC LIMIT ? OFFSET ?
"""
# Final images query with pagination
@ -500,6 +529,7 @@ class SqliteImageRecordStorage(ImageRecordStorageBase):
node_id: Optional[str],
metadata: Optional[dict],
is_intermediate: bool = False,
starred: bool = False,
) -> datetime:
try:
metadata_json = None if metadata is None else json.dumps(metadata)
@ -515,9 +545,10 @@ class SqliteImageRecordStorage(ImageRecordStorageBase):
node_id,
session_id,
metadata,
is_intermediate
is_intermediate,
starred
)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?);
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?);
""",
(
image_name,
@ -529,6 +560,7 @@ class SqliteImageRecordStorage(ImageRecordStorageBase):
session_id,
metadata_json,
is_intermediate,
starred,
),
)
self._conn.commit()

View File

@ -289,9 +289,10 @@ class ImageService(ImageServiceABC):
def get_metadata(self, image_name: str) -> Optional[ImageMetadata]:
try:
image_record = self._services.image_records.get(image_name)
metadata = self._services.image_records.get_metadata(image_name)
if not image_record.session_id:
return ImageMetadata()
return ImageMetadata(metadata=metadata)
session_raw = self._services.graph_execution_manager.get_raw(image_record.session_id)
graph = None
@ -303,7 +304,6 @@ class ImageService(ImageServiceABC):
self._services.logger.warn(f"Failed to parse session graph: {e}")
graph = None
metadata = self._services.image_records.get_metadata(image_name)
return ImageMetadata(graph=graph, metadata=metadata)
except ImageRecordNotFoundException:
self._services.logger.error("Image record not found")

View File

@ -32,6 +32,7 @@ class InvocationServices:
logger: "Logger"
model_manager: "ModelManagerServiceBase"
processor: "InvocationProcessorABC"
performance_statistics: "InvocationStatsServiceBase"
queue: "InvocationQueueABC"
def __init__(
@ -47,6 +48,7 @@ class InvocationServices:
logger: "Logger",
model_manager: "ModelManagerServiceBase",
processor: "InvocationProcessorABC",
performance_statistics: "InvocationStatsServiceBase",
queue: "InvocationQueueABC",
):
self.board_images = board_images
@ -61,4 +63,5 @@ class InvocationServices:
self.logger = logger
self.model_manager = model_manager
self.processor = processor
self.performance_statistics = performance_statistics
self.queue = queue

View File

@ -0,0 +1,305 @@
# Copyright 2023 Lincoln D. Stein <lincoln.stein@gmail.com>
"""Utility to collect execution time and GPU usage stats on invocations in flight"""
"""
Usage:
statistics = InvocationStatsService(graph_execution_manager)
with statistics.collect_stats(invocation, graph_execution_state.id):
... execute graphs...
statistics.log_stats()
Typical output:
[2023-08-02 18:03:04,507]::[InvokeAI]::INFO --> Graph stats: c7764585-9c68-4d9d-a199-55e8186790f3
[2023-08-02 18:03:04,507]::[InvokeAI]::INFO --> Node Calls Seconds VRAM Used
[2023-08-02 18:03:04,507]::[InvokeAI]::INFO --> main_model_loader 1 0.005s 0.01G
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> clip_skip 1 0.004s 0.01G
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> compel 2 0.512s 0.26G
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> rand_int 1 0.001s 0.01G
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> range_of_size 1 0.001s 0.01G
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> iterate 1 0.001s 0.01G
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> metadata_accumulator 1 0.002s 0.01G
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> noise 1 0.002s 0.01G
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> t2l 1 3.541s 1.93G
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> l2i 1 0.679s 0.58G
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> TOTAL GRAPH EXECUTION TIME: 4.749s
[2023-08-02 18:03:04,508]::[InvokeAI]::INFO --> Current VRAM utilization 0.01G
The abstract base class for this class is InvocationStatsServiceBase. An implementing class which
writes to the system log is stored in InvocationServices.performance_statistics.
"""
import psutil
import time
from abc import ABC, abstractmethod
from contextlib import AbstractContextManager
from dataclasses import dataclass, field
from typing import Dict
import torch
import invokeai.backend.util.logging as logger
from ..invocations.baseinvocation import BaseInvocation
from .graph import GraphExecutionState
from .item_storage import ItemStorageABC
from .model_manager_service import ModelManagerService
from invokeai.backend.model_management.model_cache import CacheStats
# size of GIG in bytes
GIG = 1073741824
class InvocationStatsServiceBase(ABC):
"Abstract base class for recording node memory/time performance statistics"
@abstractmethod
def __init__(self, graph_execution_manager: ItemStorageABC["GraphExecutionState"]):
"""
Initialize the InvocationStatsService and reset counters to zero
:param graph_execution_manager: Graph execution manager for this session
"""
pass
@abstractmethod
def collect_stats(
self,
invocation: BaseInvocation,
graph_execution_state_id: str,
) -> AbstractContextManager:
"""
Return a context object that will capture the statistics on the execution
of invocaation. Use with: to place around the part of the code that executes the invocation.
:param invocation: BaseInvocation object from the current graph.
:param graph_execution_state: GraphExecutionState object from the current session.
"""
pass
@abstractmethod
def reset_stats(self, graph_execution_state_id: str):
"""
Reset all statistics for the indicated graph
:param graph_execution_state_id
"""
pass
@abstractmethod
def reset_all_stats(self):
"""Zero all statistics"""
pass
@abstractmethod
def update_invocation_stats(
self,
graph_id: str,
invocation_type: str,
time_used: float,
vram_used: float,
ram_used: float,
ram_changed: float,
):
"""
Add timing information on execution of a node. Usually
used internally.
:param graph_id: ID of the graph that is currently executing
:param invocation_type: String literal type of the node
:param time_used: Time used by node's exection (sec)
:param vram_used: Maximum VRAM used during exection (GB)
:param ram_used: Current RAM available (GB)
:param ram_changed: Change in RAM usage over course of the run (GB)
"""
pass
@abstractmethod
def log_stats(self):
"""
Write out the accumulated statistics to the log or somewhere else.
"""
pass
@dataclass
class NodeStats:
"""Class for tracking execution stats of an invocation node"""
calls: int = 0
time_used: float = 0.0 # seconds
max_vram: float = 0.0 # GB
cache_hits: int = 0
cache_misses: int = 0
cache_high_watermark: int = 0
@dataclass
class NodeLog:
"""Class for tracking node usage"""
# {node_type => NodeStats}
nodes: Dict[str, NodeStats] = field(default_factory=dict)
class InvocationStatsService(InvocationStatsServiceBase):
"""Accumulate performance information about a running graph. Collects time spent in each node,
as well as the maximum and current VRAM utilisation for CUDA systems"""
def __init__(self, graph_execution_manager: ItemStorageABC["GraphExecutionState"]):
self.graph_execution_manager = graph_execution_manager
# {graph_id => NodeLog}
self._stats: Dict[str, NodeLog] = {}
self._cache_stats: Dict[str, CacheStats] = {}
self.ram_used: float = 0.0
self.ram_changed: float = 0.0
class StatsContext:
"""Context manager for collecting statistics."""
invocation: BaseInvocation = None
collector: "InvocationStatsServiceBase" = None
graph_id: str = None
start_time: int = 0
ram_used: int = 0
model_manager: ModelManagerService = None
def __init__(
self,
invocation: BaseInvocation,
graph_id: str,
model_manager: ModelManagerService,
collector: "InvocationStatsServiceBase",
):
"""Initialize statistics for this run."""
self.invocation = invocation
self.collector = collector
self.graph_id = graph_id
self.start_time = 0
self.ram_used = 0
self.model_manager = model_manager
def __enter__(self):
self.start_time = time.time()
if torch.cuda.is_available():
torch.cuda.reset_peak_memory_stats()
self.ram_used = psutil.Process().memory_info().rss
if self.model_manager:
self.model_manager.collect_cache_stats(self.collector._cache_stats[self.graph_id])
def __exit__(self, *args):
"""Called on exit from the context."""
ram_used = psutil.Process().memory_info().rss
self.collector.update_mem_stats(
ram_used=ram_used / GIG,
ram_changed=(ram_used - self.ram_used) / GIG,
)
self.collector.update_invocation_stats(
graph_id=self.graph_id,
invocation_type=self.invocation.type,
time_used=time.time() - self.start_time,
vram_used=torch.cuda.max_memory_allocated() / GIG if torch.cuda.is_available() else 0.0,
)
def collect_stats(
self,
invocation: BaseInvocation,
graph_execution_state_id: str,
model_manager: ModelManagerService,
) -> StatsContext:
"""
Return a context object that will capture the statistics.
:param invocation: BaseInvocation object from the current graph.
:param graph_execution_state: GraphExecutionState object from the current session.
"""
if not self._stats.get(graph_execution_state_id): # first time we're seeing this
self._stats[graph_execution_state_id] = NodeLog()
self._cache_stats[graph_execution_state_id] = CacheStats()
return self.StatsContext(invocation, graph_execution_state_id, model_manager, self)
def reset_all_stats(self):
"""Zero all statistics"""
self._stats = {}
def reset_stats(self, graph_execution_id: str):
"""Zero the statistics for the indicated graph."""
try:
self._stats.pop(graph_execution_id)
except KeyError:
logger.warning(f"Attempted to clear statistics for unknown graph {graph_execution_id}")
def update_mem_stats(
self,
ram_used: float,
ram_changed: float,
):
"""
Update the collector with RAM memory usage info.
:param ram_used: How much RAM is currently in use.
:param ram_changed: How much RAM changed since last generation.
"""
self.ram_used = ram_used
self.ram_changed = ram_changed
def update_invocation_stats(
self,
graph_id: str,
invocation_type: str,
time_used: float,
vram_used: float,
):
"""
Add timing information on execution of a node. Usually
used internally.
:param graph_id: ID of the graph that is currently executing
:param invocation_type: String literal type of the node
:param time_used: Time used by node's exection (sec)
:param vram_used: Maximum VRAM used during exection (GB)
:param ram_used: Current RAM available (GB)
:param ram_changed: Change in RAM usage over course of the run (GB)
"""
if not self._stats[graph_id].nodes.get(invocation_type):
self._stats[graph_id].nodes[invocation_type] = NodeStats()
stats = self._stats[graph_id].nodes[invocation_type]
stats.calls += 1
stats.time_used += time_used
stats.max_vram = max(stats.max_vram, vram_used)
def log_stats(self):
"""
Send the statistics to the system logger at the info level.
Stats will only be printed when the execution of the graph
is complete.
"""
completed = set()
for graph_id, node_log in self._stats.items():
current_graph_state = self.graph_execution_manager.get(graph_id)
if not current_graph_state.is_complete():
continue
total_time = 0
logger.info(f"Graph stats: {graph_id}")
logger.info(f"{'Node':>30} {'Calls':>7}{'Seconds':>9} {'VRAM Used':>10}")
for node_type, stats in self._stats[graph_id].nodes.items():
logger.info(f"{node_type:>30} {stats.calls:>4} {stats.time_used:7.3f}s {stats.max_vram:4.3f}G")
total_time += stats.time_used
cache_stats = self._cache_stats[graph_id]
hwm = cache_stats.high_watermark / GIG
tot = cache_stats.cache_size / GIG
loaded = sum([v for v in cache_stats.loaded_model_sizes.values()]) / GIG
logger.info(f"TOTAL GRAPH EXECUTION TIME: {total_time:7.3f}s")
logger.info("RAM used by InvokeAI process: " + "%4.2fG" % self.ram_used + f" ({self.ram_changed:+5.3f}G)")
logger.info(f"RAM used to load models: {loaded:4.2f}G")
if torch.cuda.is_available():
logger.info("VRAM in use: " + "%4.3fG" % (torch.cuda.memory_allocated() / GIG))
logger.info("RAM cache statistics:")
logger.info(f" Model cache hits: {cache_stats.hits}")
logger.info(f" Model cache misses: {cache_stats.misses}")
logger.info(f" Models cached: {cache_stats.in_cache}")
logger.info(f" Models cleared from cache: {cache_stats.cleared}")
logger.info(f" Cache high water mark: {hwm:4.2f}/{tot:4.2f}G")
completed.add(graph_id)
for graph_id in completed:
del self._stats[graph_id]
del self._cache_stats[graph_id]

View File

@ -3,9 +3,10 @@
from __future__ import annotations
from abc import ABC, abstractmethod
from logging import Logger
from pathlib import Path
from pydantic import Field
from typing import Optional, Union, Callable, List, Tuple, TYPE_CHECKING
from typing import Literal, Optional, Union, Callable, List, Tuple, TYPE_CHECKING
from types import ModuleType
from invokeai.backend.model_management import (
@ -21,6 +22,7 @@ from invokeai.backend.model_management import (
ModelNotFoundException,
)
from invokeai.backend.model_management.model_search import FindModels
from invokeai.backend.model_management.model_cache import CacheStats
import torch
from invokeai.app.models.exceptions import CanceledException
@ -193,7 +195,7 @@ class ModelManagerServiceBase(ABC):
self,
model_name: str,
base_model: BaseModelType,
model_type: Union[ModelType.Main, ModelType.Vae],
model_type: Literal[ModelType.Main, ModelType.Vae],
) -> AddModelResult:
"""
Convert a checkpoint file into a diffusers folder, deleting the cached
@ -275,6 +277,13 @@ class ModelManagerServiceBase(ABC):
"""
pass
@abstractmethod
def collect_cache_stats(self, cache_stats: CacheStats):
"""
Reset model cache statistics for graph with graph_id.
"""
pass
@abstractmethod
def commit(self, conf_file: Optional[Path] = None) -> None:
"""
@ -292,7 +301,7 @@ class ModelManagerService(ModelManagerServiceBase):
def __init__(
self,
config: InvokeAIAppConfig,
logger: ModuleType,
logger: Logger,
):
"""
Initialize with the path to the models.yaml config file.
@ -396,7 +405,7 @@ class ModelManagerService(ModelManagerServiceBase):
model_type,
)
def model_info(self, model_name: str, base_model: BaseModelType, model_type: ModelType) -> dict:
def model_info(self, model_name: str, base_model: BaseModelType, model_type: ModelType) -> Union[dict, None]:
"""
Given a model name returns a dict-like (OmegaConf) object describing it.
"""
@ -416,7 +425,7 @@ class ModelManagerService(ModelManagerServiceBase):
"""
return self.mgr.list_models(base_model, model_type)
def list_model(self, model_name: str, base_model: BaseModelType, model_type: ModelType) -> dict:
def list_model(self, model_name: str, base_model: BaseModelType, model_type: ModelType) -> Union[dict, None]:
"""
Return information about the model using the same format as list_models()
"""
@ -429,7 +438,7 @@ class ModelManagerService(ModelManagerServiceBase):
model_type: ModelType,
model_attributes: dict,
clobber: bool = False,
) -> None:
) -> AddModelResult:
"""
Update the named model with a dictionary of attributes. Will fail with an
assertion error if the name already exists. Pass clobber=True to overwrite.
@ -478,7 +487,7 @@ class ModelManagerService(ModelManagerServiceBase):
self,
model_name: str,
base_model: BaseModelType,
model_type: Union[ModelType.Main, ModelType.Vae],
model_type: Literal[ModelType.Main, ModelType.Vae],
convert_dest_directory: Optional[Path] = Field(
default=None, description="Optional directory location for merged model"
),
@ -499,6 +508,12 @@ class ModelManagerService(ModelManagerServiceBase):
self.logger.debug(f"convert model {model_name}")
return self.mgr.convert_model(model_name, base_model, model_type, convert_dest_directory)
def collect_cache_stats(self, cache_stats: CacheStats):
"""
Reset model cache statistics for graph with graph_id.
"""
self.mgr.cache.stats = cache_stats
def commit(self, conf_file: Optional[Path] = None):
"""
Write current configuration out to the indicated file.
@ -573,9 +588,9 @@ class ModelManagerService(ModelManagerServiceBase):
default=None, description="Base model shared by all models to be merged"
),
merged_model_name: str = Field(default=None, description="Name of destination model after merging"),
alpha: Optional[float] = 0.5,
alpha: float = 0.5,
interp: Optional[MergeInterpolationMethod] = None,
force: Optional[bool] = False,
force: bool = False,
merge_dest_directory: Optional[Path] = Field(
default=None, description="Optional directory location for merged model"
),
@ -633,8 +648,8 @@ class ModelManagerService(ModelManagerServiceBase):
model_name: str,
base_model: BaseModelType,
model_type: ModelType,
new_name: str = None,
new_base: BaseModelType = None,
new_name: Optional[str] = None,
new_base: Optional[BaseModelType] = None,
):
"""
Rename the indicated model. Can provide a new name and/or a new base.

View File

@ -0,0 +1,8 @@
from pydantic import Field
from invokeai.app.util.model_exclude_null import BaseModelExcludeNull
class BoardImage(BaseModelExcludeNull):
board_id: str = Field(description="The id of the board")
image_name: str = Field(description="The name of the image")

View File

@ -1,10 +1,11 @@
from typing import Optional, Union
from datetime import datetime
from pydantic import BaseModel, Extra, Field, StrictBool, StrictStr
from pydantic import Field
from invokeai.app.util.misc import get_iso_timestamp
from invokeai.app.util.model_exclude_null import BaseModelExcludeNull
class BoardRecord(BaseModel):
class BoardRecord(BaseModelExcludeNull):
"""Deserialized board record."""
board_id: str = Field(description="The unique ID of the board.")

View File

@ -1,13 +1,14 @@
import datetime
from typing import Optional, Union
from pydantic import BaseModel, Extra, Field, StrictBool, StrictStr
from pydantic import Extra, Field, StrictBool, StrictStr
from invokeai.app.models.image import ImageCategory, ResourceOrigin
from invokeai.app.util.misc import get_iso_timestamp
from invokeai.app.util.model_exclude_null import BaseModelExcludeNull
class ImageRecord(BaseModel):
class ImageRecord(BaseModelExcludeNull):
"""Deserialized image record without metadata."""
image_name: str = Field(description="The unique name of the image.")
@ -38,15 +39,18 @@ class ImageRecord(BaseModel):
description="The node ID that generated this image, if it is a generated image.",
)
"""The node ID that generated this image, if it is a generated image."""
starred: bool = Field(description="Whether this image is starred.")
"""Whether this image is starred."""
class ImageRecordChanges(BaseModel, extra=Extra.forbid):
class ImageRecordChanges(BaseModelExcludeNull, extra=Extra.forbid):
"""A set of changes to apply to an image record.
Only limited changes are valid:
- `image_category`: change the category of an image
- `session_id`: change the session associated with an image
- `is_intermediate`: change the image's `is_intermediate` flag
- `starred`: change whether the image is starred
"""
image_category: Optional[ImageCategory] = Field(description="The image's new category.")
@ -58,9 +62,11 @@ class ImageRecordChanges(BaseModel, extra=Extra.forbid):
"""The image's new session ID."""
is_intermediate: Optional[StrictBool] = Field(default=None, description="The image's new `is_intermediate` flag.")
"""The image's new `is_intermediate` flag."""
starred: Optional[StrictBool] = Field(default=None, description="The image's new `starred` state")
"""The image's new `starred` state."""
class ImageUrlsDTO(BaseModel):
class ImageUrlsDTO(BaseModelExcludeNull):
"""The URLs for an image and its thumbnail."""
image_name: str = Field(description="The unique name of the image.")
@ -76,11 +82,15 @@ class ImageDTO(ImageRecord, ImageUrlsDTO):
board_id: Optional[str] = Field(description="The id of the board the image belongs to, if one exists.")
"""The id of the board the image belongs to, if one exists."""
pass
def image_record_to_dto(
image_record: ImageRecord, image_url: str, thumbnail_url: str, board_id: Optional[str]
image_record: ImageRecord,
image_url: str,
thumbnail_url: str,
board_id: Optional[str],
) -> ImageDTO:
"""Converts an image record to an image DTO."""
return ImageDTO(
@ -108,6 +118,7 @@ def deserialize_image_record(image_dict: dict) -> ImageRecord:
updated_at = image_dict.get("updated_at", get_iso_timestamp())
deleted_at = image_dict.get("deleted_at", get_iso_timestamp())
is_intermediate = image_dict.get("is_intermediate", False)
starred = image_dict.get("starred", False)
return ImageRecord(
image_name=image_name,
@ -121,4 +132,5 @@ def deserialize_image_record(image_dict: dict) -> ImageRecord:
updated_at=updated_at,
deleted_at=deleted_at,
is_intermediate=is_intermediate,
starred=starred,
)

View File

@ -1,14 +1,15 @@
import time
import traceback
from threading import Event, Thread, BoundedSemaphore
from ..invocations.baseinvocation import InvocationContext
from .invocation_queue import InvocationQueueItem
from .invoker import InvocationProcessorABC, Invoker
from ..models.exceptions import CanceledException
from threading import BoundedSemaphore, Event, Thread
import invokeai.backend.util.logging as logger
from ..invocations.baseinvocation import InvocationContext
from ..models.exceptions import CanceledException
from .invocation_queue import InvocationQueueItem
from .invocation_stats import InvocationStatsServiceBase
from .invoker import InvocationProcessorABC, Invoker
class DefaultInvocationProcessor(InvocationProcessorABC):
__invoker_thread: Thread
@ -35,6 +36,8 @@ class DefaultInvocationProcessor(InvocationProcessorABC):
def __process(self, stop_event: Event):
try:
self.__threadLimit.acquire()
statistics: InvocationStatsServiceBase = self.__invoker.services.performance_statistics
while not stop_event.is_set():
try:
queue_item: InvocationQueueItem = self.__invoker.services.queue.get()
@ -83,35 +86,43 @@ class DefaultInvocationProcessor(InvocationProcessorABC):
# Invoke
try:
outputs = invocation.invoke(
InvocationContext(
services=self.__invoker.services,
graph_execution_state_id=graph_execution_state.id,
graph_id = graph_execution_state.id
model_manager = self.__invoker.services.model_manager
with statistics.collect_stats(invocation, graph_id, model_manager):
# use the internal invoke_internal(), which wraps the node's invoke() method in
# this accomodates nodes which require a value, but get it only from a
# connection
outputs = invocation.invoke_internal(
InvocationContext(
services=self.__invoker.services,
graph_execution_state_id=graph_execution_state.id,
)
)
)
# Check queue to see if this is canceled, and skip if so
if self.__invoker.services.queue.is_canceled(graph_execution_state.id):
continue
# Check queue to see if this is canceled, and skip if so
if self.__invoker.services.queue.is_canceled(graph_execution_state.id):
continue
# Save outputs and history
graph_execution_state.complete(invocation.id, outputs)
# Save outputs and history
graph_execution_state.complete(invocation.id, outputs)
# Save the state changes
self.__invoker.services.graph_execution_manager.set(graph_execution_state)
# Save the state changes
self.__invoker.services.graph_execution_manager.set(graph_execution_state)
# Send complete event
self.__invoker.services.events.emit_invocation_complete(
graph_execution_state_id=graph_execution_state.id,
node=invocation.dict(),
source_node_id=source_node_id,
result=outputs.dict(),
)
# Send complete event
self.__invoker.services.events.emit_invocation_complete(
graph_execution_state_id=graph_execution_state.id,
node=invocation.dict(),
source_node_id=source_node_id,
result=outputs.dict(),
)
statistics.log_stats()
except KeyboardInterrupt:
pass
except CanceledException:
statistics.reset_stats(graph_execution_state.id)
pass
except Exception as e:
@ -133,7 +144,7 @@ class DefaultInvocationProcessor(InvocationProcessorABC):
error_type=e.__class__.__name__,
error=error,
)
statistics.reset_stats(graph_execution_state.id)
pass
# Check queue to see if this is canceled, and skip if so

View File

@ -49,7 +49,8 @@ class SqliteItemStorage(ItemStorageABC, Generic[T]):
def _parse_item(self, item: str) -> T:
item_type = get_args(self.__orig_class__)[0]
return parse_raw_as(item_type, item)
parsed = parse_raw_as(item_type, item)
return parsed
def set(self, item: T):
try:

View File

@ -20,6 +20,6 @@ class LocalUrlService(UrlServiceBase):
# These paths are determined by the routes in invokeai/app/api/routers/images.py
if thumbnail:
return f"{self._base_url}/images/{image_basename}/thumbnail"
return f"{self._base_url}/images/i/{image_basename}/thumbnail"
return f"{self._base_url}/images/{image_basename}/full"
return f"{self._base_url}/images/i/{image_basename}/full"

View File

@ -18,5 +18,5 @@ SEED_MAX = np.iinfo(np.uint32).max
def get_random_seed():
rng = np.random.default_rng(seed=0)
rng = np.random.default_rng(seed=None)
return int(rng.integers(0, SEED_MAX))

View File

@ -0,0 +1,23 @@
from typing import Any
from pydantic import BaseModel
"""
We want to exclude null values from objects that make their way to the client.
Unfortunately there is no built-in way to do this in pydantic, so we need to override the default
dict method to do this.
From https://github.com/tiangolo/fastapi/discussions/8882#discussioncomment-5154541
"""
class BaseModelExcludeNull(BaseModel):
def dict(self, *args, **kwargs) -> dict[str, Any]:
"""
Override the default dict method to exclude None values in the response
"""
kwargs.pop("exclude_none", None)
return super().dict(*args, exclude_none=True, **kwargs)
pass

View File

@ -4,9 +4,9 @@ from invokeai.app.models.exceptions import CanceledException
from invokeai.app.models.image import ProgressImage
from ..invocations.baseinvocation import InvocationContext
from ...backend.util.util import image_to_dataURL
from ...backend.generator.base import Generator
from ...backend.stable_diffusion import PipelineIntermediateState
from invokeai.app.services.config import InvokeAIAppConfig
from ...backend.model_management.models import BaseModelType
def sample_to_lowres_estimated_image(samples, latent_rgb_factors, smooth_matrix=None):
@ -29,6 +29,7 @@ def stable_diffusion_step_callback(
intermediate_state: PipelineIntermediateState,
node: dict,
source_node_id: str,
base_model: BaseModelType,
):
if context.services.queue.is_canceled(context.graph_execution_state_id):
raise CanceledException
@ -56,23 +57,50 @@ def stable_diffusion_step_callback(
# TODO: only output a preview image when requested
# origingally adapted from code by @erucipe and @keturn here:
# https://discuss.huggingface.co/t/decoding-latents-to-rgb-without-upscaling/23204/7
if base_model in [BaseModelType.StableDiffusionXL, BaseModelType.StableDiffusionXLRefiner]:
# fast latents preview matrix for sdxl
# generated by @StAlKeR7779
sdxl_latent_rgb_factors = torch.tensor(
[
# R G B
[0.3816, 0.4930, 0.5320],
[-0.3753, 0.1631, 0.1739],
[0.1770, 0.3588, -0.2048],
[-0.4350, -0.2644, -0.4289],
],
dtype=sample.dtype,
device=sample.device,
)
# these updated numbers for v1.5 are from @torridgristle
v1_5_latent_rgb_factors = torch.tensor(
[
# R G B
[0.3444, 0.1385, 0.0670], # L1
[0.1247, 0.4027, 0.1494], # L2
[-0.3192, 0.2513, 0.2103], # L3
[-0.1307, -0.1874, -0.7445], # L4
],
dtype=sample.dtype,
device=sample.device,
)
sdxl_smooth_matrix = torch.tensor(
[
[0.0358, 0.0964, 0.0358],
[0.0964, 0.4711, 0.0964],
[0.0358, 0.0964, 0.0358],
],
dtype=sample.dtype,
device=sample.device,
)
image = sample_to_lowres_estimated_image(sample, v1_5_latent_rgb_factors)
image = sample_to_lowres_estimated_image(sample, sdxl_latent_rgb_factors, sdxl_smooth_matrix)
else:
# origingally adapted from code by @erucipe and @keturn here:
# https://discuss.huggingface.co/t/decoding-latents-to-rgb-without-upscaling/23204/7
# these updated numbers for v1.5 are from @torridgristle
v1_5_latent_rgb_factors = torch.tensor(
[
# R G B
[0.3444, 0.1385, 0.0670], # L1
[0.1247, 0.4027, 0.1494], # L2
[-0.3192, 0.2513, 0.2103], # L3
[-0.1307, -0.1874, -0.7445], # L4
],
dtype=sample.dtype,
device=sample.device,
)
image = sample_to_lowres_estimated_image(sample, v1_5_latent_rgb_factors)
(width, height) = image.size
width *= 8
@ -86,59 +114,6 @@ def stable_diffusion_step_callback(
source_node_id=source_node_id,
progress_image=ProgressImage(width=width, height=height, dataURL=dataURL),
step=intermediate_state.step,
total_steps=node["steps"],
)
def stable_diffusion_xl_step_callback(
context: InvocationContext,
node: dict,
source_node_id: str,
sample,
step,
total_steps,
):
if context.services.queue.is_canceled(context.graph_execution_state_id):
raise CanceledException
sdxl_latent_rgb_factors = torch.tensor(
[
# R G B
[0.3816, 0.4930, 0.5320],
[-0.3753, 0.1631, 0.1739],
[0.1770, 0.3588, -0.2048],
[-0.4350, -0.2644, -0.4289],
],
dtype=sample.dtype,
device=sample.device,
)
sdxl_smooth_matrix = torch.tensor(
[
# [ 0.0478, 0.1285, 0.0478],
# [ 0.1285, 0.2948, 0.1285],
# [ 0.0478, 0.1285, 0.0478],
[0.0358, 0.0964, 0.0358],
[0.0964, 0.4711, 0.0964],
[0.0358, 0.0964, 0.0358],
],
dtype=sample.dtype,
device=sample.device,
)
image = sample_to_lowres_estimated_image(sample, sdxl_latent_rgb_factors, sdxl_smooth_matrix)
(width, height) = image.size
width *= 8
height *= 8
dataURL = image_to_dataURL(image, image_format="JPEG")
context.services.events.emit_generator_progress(
graph_execution_state_id=context.graph_execution_state_id,
node=node,
source_node_id=source_node_id,
progress_image=ProgressImage(width=width, height=height, dataURL=dataURL),
step=step,
total_steps=total_steps,
order=intermediate_state.order,
total_steps=intermediate_state.total_steps,
)

View File

@ -1,6 +1,5 @@
"""
Initialization file for invokeai.backend
"""
from .generator import InvokeAIGeneratorBasicParams, InvokeAIGenerator, InvokeAIGeneratorOutput, Img2Img, Inpaint
from .model_management import ModelManager, ModelCache, BaseModelType, ModelType, SubModelType, ModelInfo
from .model_management.models import SilenceWarnings

View File

@ -1,12 +0,0 @@
"""
Initialization file for the invokeai.generator package
"""
from .base import (
InvokeAIGenerator,
InvokeAIGeneratorBasicParams,
InvokeAIGeneratorOutput,
Img2Img,
Inpaint,
Generator,
)
from .inpaint import infill_methods

View File

@ -1,559 +0,0 @@
"""
Base class for invokeai.backend.generator.*
including img2img, txt2img, and inpaint
"""
from __future__ import annotations
import itertools
import dataclasses
import diffusers
import os
import random
import traceback
from abc import ABCMeta
from argparse import Namespace
from contextlib import nullcontext
import cv2
import numpy as np
import torch
from PIL import Image, ImageChops, ImageFilter
from accelerate.utils import set_seed
from diffusers import DiffusionPipeline
from tqdm import trange
from typing import Callable, List, Iterator, Optional, Type, Union
from dataclasses import dataclass, field
from diffusers.schedulers import SchedulerMixin as Scheduler
import invokeai.backend.util.logging as logger
from ..image_util import configure_model_padding
from ..util.util import rand_perlin_2d
from ..stable_diffusion.diffusers_pipeline import StableDiffusionGeneratorPipeline
from ..stable_diffusion.schedulers import SCHEDULER_MAP
downsampling = 8
@dataclass
class InvokeAIGeneratorBasicParams:
seed: Optional[int] = None
width: int = 512
height: int = 512
cfg_scale: float = 7.5
steps: int = 20
ddim_eta: float = 0.0
scheduler: str = "ddim"
precision: str = "float16"
perlin: float = 0.0
threshold: float = 0.0
seamless: bool = False
seamless_axes: List[str] = field(default_factory=lambda: ["x", "y"])
h_symmetry_time_pct: Optional[float] = None
v_symmetry_time_pct: Optional[float] = None
variation_amount: float = 0.0
with_variations: list = field(default_factory=list)
@dataclass
class InvokeAIGeneratorOutput:
"""
InvokeAIGeneratorOutput is a dataclass that contains the outputs of a generation
operation, including the image, its seed, the model name used to generate the image
and the model hash, as well as all the generate() parameters that went into
generating the image (in .params, also available as attributes)
"""
image: Image.Image
seed: int
model_hash: str
attention_maps_images: List[Image.Image]
params: Namespace
# we are interposing a wrapper around the original Generator classes so that
# old code that calls Generate will continue to work.
class InvokeAIGenerator(metaclass=ABCMeta):
def __init__(
self,
model_info: dict,
params: InvokeAIGeneratorBasicParams = InvokeAIGeneratorBasicParams(),
**kwargs,
):
self.model_info = model_info
self.params = params
self.kwargs = kwargs
def generate(
self,
conditioning: tuple,
scheduler,
callback: Optional[Callable] = None,
step_callback: Optional[Callable] = None,
iterations: int = 1,
**keyword_args,
) -> Iterator[InvokeAIGeneratorOutput]:
"""
Return an iterator across the indicated number of generations.
Each time the iterator is called it will return an InvokeAIGeneratorOutput
object. Use like this:
outputs = txt2img.generate(prompt='banana sushi', iterations=5)
for result in outputs:
print(result.image, result.seed)
In the typical case of wanting to get just a single image, iterations
defaults to 1 and do:
output = next(txt2img.generate(prompt='banana sushi')
Pass None to get an infinite iterator.
outputs = txt2img.generate(prompt='banana sushi', iterations=None)
for o in outputs:
print(o.image, o.seed)
"""
generator_args = dataclasses.asdict(self.params)
generator_args.update(keyword_args)
model_info = self.model_info
model_name = model_info.name
model_hash = model_info.hash
with model_info.context as model:
gen_class = self._generator_class()
generator = gen_class(model, self.params.precision, **self.kwargs)
if self.params.variation_amount > 0:
generator.set_variation(
generator_args.get("seed"),
generator_args.get("variation_amount"),
generator_args.get("with_variations"),
)
if isinstance(model, DiffusionPipeline):
for component in [model.unet, model.vae]:
configure_model_padding(
component, generator_args.get("seamless", False), generator_args.get("seamless_axes")
)
else:
configure_model_padding(
model, generator_args.get("seamless", False), generator_args.get("seamless_axes")
)
iteration_count = range(iterations) if iterations else itertools.count(start=0, step=1)
for i in iteration_count:
results = generator.generate(
conditioning=conditioning,
step_callback=step_callback,
sampler=scheduler,
**generator_args,
)
output = InvokeAIGeneratorOutput(
image=results[0][0],
seed=results[0][1],
attention_maps_images=results[0][2],
model_hash=model_hash,
params=Namespace(model_name=model_name, **generator_args),
)
if callback:
callback(output)
yield output
@classmethod
def schedulers(self) -> List[str]:
"""
Return list of all the schedulers that we currently handle.
"""
return list(SCHEDULER_MAP.keys())
def load_generator(self, model: StableDiffusionGeneratorPipeline, generator_class: Type[Generator]):
return generator_class(model, self.params.precision)
@classmethod
def _generator_class(cls) -> Type[Generator]:
"""
In derived classes return the name of the generator to apply.
If you don't override will return the name of the derived
class, which nicely parallels the generator class names.
"""
return Generator
# ------------------------------------
class Img2Img(InvokeAIGenerator):
def generate(
self, init_image: Union[Image.Image, torch.FloatTensor], strength: float = 0.75, **keyword_args
) -> Iterator[InvokeAIGeneratorOutput]:
return super().generate(init_image=init_image, strength=strength, **keyword_args)
@classmethod
def _generator_class(cls):
from .img2img import Img2Img
return Img2Img
# ------------------------------------
# Takes all the arguments of Img2Img and adds the mask image and the seam/infill stuff
class Inpaint(Img2Img):
def generate(
self,
mask_image: Union[Image.Image, torch.FloatTensor],
# Seam settings - when 0, doesn't fill seam
seam_size: int = 96,
seam_blur: int = 16,
seam_strength: float = 0.7,
seam_steps: int = 30,
tile_size: int = 32,
inpaint_replace=False,
infill_method=None,
inpaint_width=None,
inpaint_height=None,
inpaint_fill: tuple(int) = (0x7F, 0x7F, 0x7F, 0xFF),
**keyword_args,
) -> Iterator[InvokeAIGeneratorOutput]:
return super().generate(
mask_image=mask_image,
seam_size=seam_size,
seam_blur=seam_blur,
seam_strength=seam_strength,
seam_steps=seam_steps,
tile_size=tile_size,
inpaint_replace=inpaint_replace,
infill_method=infill_method,
inpaint_width=inpaint_width,
inpaint_height=inpaint_height,
inpaint_fill=inpaint_fill,
**keyword_args,
)
@classmethod
def _generator_class(cls):
from .inpaint import Inpaint
return Inpaint
class Generator:
downsampling_factor: int
latent_channels: int
precision: str
model: DiffusionPipeline
def __init__(self, model: DiffusionPipeline, precision: str, **kwargs):
self.model = model
self.precision = precision
self.seed = None
self.latent_channels = model.unet.config.in_channels
self.downsampling_factor = downsampling # BUG: should come from model or config
self.perlin = 0.0
self.threshold = 0
self.variation_amount = 0
self.with_variations = []
self.use_mps_noise = False
self.free_gpu_mem = None
# this is going to be overridden in img2img.py, txt2img.py and inpaint.py
def get_make_image(self, **kwargs):
"""
Returns a function returning an image derived from the prompt and the initial image
Return value depends on the seed at the time you call it
"""
raise NotImplementedError("image_iterator() must be implemented in a descendent class")
def set_variation(self, seed, variation_amount, with_variations):
self.seed = seed
self.variation_amount = variation_amount
self.with_variations = with_variations
def generate(
self,
width,
height,
sampler,
init_image=None,
iterations=1,
seed=None,
image_callback=None,
step_callback=None,
threshold=0.0,
perlin=0.0,
h_symmetry_time_pct=None,
v_symmetry_time_pct=None,
free_gpu_mem: bool = False,
**kwargs,
):
scope = nullcontext
self.free_gpu_mem = free_gpu_mem
attention_maps_images = []
attention_maps_callback = lambda saver: attention_maps_images.append(saver.get_stacked_maps_image())
make_image = self.get_make_image(
sampler=sampler,
init_image=init_image,
width=width,
height=height,
step_callback=step_callback,
threshold=threshold,
perlin=perlin,
h_symmetry_time_pct=h_symmetry_time_pct,
v_symmetry_time_pct=v_symmetry_time_pct,
attention_maps_callback=attention_maps_callback,
**kwargs,
)
results = []
seed = seed if seed is not None and seed >= 0 else self.new_seed()
first_seed = seed
seed, initial_noise = self.generate_initial_noise(seed, width, height)
# There used to be an additional self.model.ema_scope() here, but it breaks
# the inpaint-1.5 model. Not sure what it did.... ?
with scope(self.model.device.type):
for n in trange(iterations, desc="Generating"):
x_T = None
if self.variation_amount > 0:
set_seed(seed)
target_noise = self.get_noise(width, height)
x_T = self.slerp(self.variation_amount, initial_noise, target_noise)
elif initial_noise is not None:
# i.e. we specified particular variations
x_T = initial_noise
else:
set_seed(seed)
try:
x_T = self.get_noise(width, height)
except:
logger.error("An error occurred while getting initial noise")
print(traceback.format_exc())
# Pass on the seed in case a layer beneath us needs to generate noise on its own.
image = make_image(x_T, seed)
results.append([image, seed, attention_maps_images])
if image_callback is not None:
attention_maps_image = None if len(attention_maps_images) == 0 else attention_maps_images[-1]
image_callback(
image,
seed,
first_seed=first_seed,
attention_maps_image=attention_maps_image,
)
seed = self.new_seed()
# Free up memory from the last generation.
clear_cuda_cache = kwargs["clear_cuda_cache"] if "clear_cuda_cache" in kwargs else None
if clear_cuda_cache is not None:
clear_cuda_cache()
return results
def sample_to_image(self, samples) -> Image.Image:
"""
Given samples returned from a sampler, converts
it into a PIL Image
"""
with torch.inference_mode():
image = self.model.decode_latents(samples)
return self.model.numpy_to_pil(image)[0]
def repaste_and_color_correct(
self,
result: Image.Image,
init_image: Image.Image,
init_mask: Image.Image,
mask_blur_radius: int = 8,
) -> Image.Image:
if init_image is None or init_mask is None:
return result
# Get the original alpha channel of the mask if there is one.
# Otherwise it is some other black/white image format ('1', 'L' or 'RGB')
pil_init_mask = init_mask.getchannel("A") if init_mask.mode == "RGBA" else init_mask.convert("L")
pil_init_image = init_image.convert("RGBA") # Add an alpha channel if one doesn't exist
# Build an image with only visible pixels from source to use as reference for color-matching.
init_rgb_pixels = np.asarray(init_image.convert("RGB"), dtype=np.uint8)
init_a_pixels = np.asarray(pil_init_image.getchannel("A"), dtype=np.uint8)
init_mask_pixels = np.asarray(pil_init_mask, dtype=np.uint8)
# Get numpy version of result
np_image = np.asarray(result, dtype=np.uint8)
# Mask and calculate mean and standard deviation
mask_pixels = init_a_pixels * init_mask_pixels > 0
np_init_rgb_pixels_masked = init_rgb_pixels[mask_pixels, :]
np_image_masked = np_image[mask_pixels, :]
if np_init_rgb_pixels_masked.size > 0:
init_means = np_init_rgb_pixels_masked.mean(axis=0)
init_std = np_init_rgb_pixels_masked.std(axis=0)
gen_means = np_image_masked.mean(axis=0)
gen_std = np_image_masked.std(axis=0)
# Color correct
np_matched_result = np_image.copy()
np_matched_result[:, :, :] = (
(
(
(np_matched_result[:, :, :].astype(np.float32) - gen_means[None, None, :])
/ gen_std[None, None, :]
)
* init_std[None, None, :]
+ init_means[None, None, :]
)
.clip(0, 255)
.astype(np.uint8)
)
matched_result = Image.fromarray(np_matched_result, mode="RGB")
else:
matched_result = Image.fromarray(np_image, mode="RGB")
# Blur the mask out (into init image) by specified amount
if mask_blur_radius > 0:
nm = np.asarray(pil_init_mask, dtype=np.uint8)
nmd = cv2.erode(
nm,
kernel=np.ones((3, 3), dtype=np.uint8),
iterations=int(mask_blur_radius / 2),
)
pmd = Image.fromarray(nmd, mode="L")
blurred_init_mask = pmd.filter(ImageFilter.BoxBlur(mask_blur_radius))
else:
blurred_init_mask = pil_init_mask
multiplied_blurred_init_mask = ImageChops.multiply(blurred_init_mask, self.pil_image.split()[-1])
# Paste original on color-corrected generation (using blurred mask)
matched_result.paste(init_image, (0, 0), mask=multiplied_blurred_init_mask)
return matched_result
@staticmethod
def sample_to_lowres_estimated_image(samples):
# origingally adapted from code by @erucipe and @keturn here:
# https://discuss.huggingface.co/t/decoding-latents-to-rgb-without-upscaling/23204/7
# these updated numbers for v1.5 are from @torridgristle
v1_5_latent_rgb_factors = torch.tensor(
[
# R G B
[0.3444, 0.1385, 0.0670], # L1
[0.1247, 0.4027, 0.1494], # L2
[-0.3192, 0.2513, 0.2103], # L3
[-0.1307, -0.1874, -0.7445], # L4
],
dtype=samples.dtype,
device=samples.device,
)
latent_image = samples[0].permute(1, 2, 0) @ v1_5_latent_rgb_factors
latents_ubyte = (
((latent_image + 1) / 2).clamp(0, 1).mul(0xFF).byte() # change scale from -1..1 to 0..1 # to 0..255
).cpu()
return Image.fromarray(latents_ubyte.numpy())
def generate_initial_noise(self, seed, width, height):
initial_noise = None
if self.variation_amount > 0 or len(self.with_variations) > 0:
# use fixed initial noise plus random noise per iteration
set_seed(seed)
initial_noise = self.get_noise(width, height)
for v_seed, v_weight in self.with_variations:
seed = v_seed
set_seed(seed)
next_noise = self.get_noise(width, height)
initial_noise = self.slerp(v_weight, initial_noise, next_noise)
if self.variation_amount > 0:
random.seed() # reset RNG to an actually random state, so we can get a random seed for variations
seed = random.randrange(0, np.iinfo(np.uint32).max)
return (seed, initial_noise)
def get_perlin_noise(self, width, height):
fixdevice = "cpu" if (self.model.device.type == "mps") else self.model.device
# limit noise to only the diffusion image channels, not the mask channels
input_channels = min(self.latent_channels, 4)
# round up to the nearest block of 8
temp_width = int((width + 7) / 8) * 8
temp_height = int((height + 7) / 8) * 8
noise = torch.stack(
[
rand_perlin_2d((temp_height, temp_width), (8, 8), device=self.model.device).to(fixdevice)
for _ in range(input_channels)
],
dim=0,
).to(self.model.device)
return noise[0:4, 0:height, 0:width]
def new_seed(self):
self.seed = random.randrange(0, np.iinfo(np.uint32).max)
return self.seed
def slerp(self, t, v0, v1, DOT_THRESHOLD=0.9995):
"""
Spherical linear interpolation
Args:
t (float/np.ndarray): Float value between 0.0 and 1.0
v0 (np.ndarray): Starting vector
v1 (np.ndarray): Final vector
DOT_THRESHOLD (float): Threshold for considering the two vectors as
colineal. Not recommended to alter this.
Returns:
v2 (np.ndarray): Interpolation vector between v0 and v1
"""
inputs_are_torch = False
if not isinstance(v0, np.ndarray):
inputs_are_torch = True
v0 = v0.detach().cpu().numpy()
if not isinstance(v1, np.ndarray):
inputs_are_torch = True
v1 = v1.detach().cpu().numpy()
dot = np.sum(v0 * v1 / (np.linalg.norm(v0) * np.linalg.norm(v1)))
if np.abs(dot) > DOT_THRESHOLD:
v2 = (1 - t) * v0 + t * v1
else:
theta_0 = np.arccos(dot)
sin_theta_0 = np.sin(theta_0)
theta_t = theta_0 * t
sin_theta_t = np.sin(theta_t)
s0 = np.sin(theta_0 - theta_t) / sin_theta_0
s1 = sin_theta_t / sin_theta_0
v2 = s0 * v0 + s1 * v1
if inputs_are_torch:
v2 = torch.from_numpy(v2).to(self.model.device)
return v2
# this is a handy routine for debugging use. Given a generated sample,
# convert it into a PNG image and store it at the indicated path
def save_sample(self, sample, filepath):
image = self.sample_to_image(sample)
dirname = os.path.dirname(filepath) or "."
if not os.path.exists(dirname):
logger.info(f"creating directory {dirname}")
os.makedirs(dirname, exist_ok=True)
image.save(filepath, "PNG")
def torch_dtype(self) -> torch.dtype:
return torch.float16 if self.precision == "float16" else torch.float32
# returns a tensor filled with random numbers from a normal distribution
def get_noise(self, width, height):
device = self.model.device
# limit noise to only the diffusion image channels, not the mask channels
input_channels = min(self.latent_channels, 4)
x = torch.randn(
[
1,
input_channels,
height // self.downsampling_factor,
width // self.downsampling_factor,
],
dtype=self.torch_dtype(),
device=device,
)
if self.perlin > 0.0:
perlin_noise = self.get_perlin_noise(width // self.downsampling_factor, height // self.downsampling_factor)
x = (1 - self.perlin) * x + self.perlin * perlin_noise
return x

View File

@ -1,92 +0,0 @@
"""
invokeai.backend.generator.img2img descends from .generator
"""
from typing import Optional
import torch
from accelerate.utils import set_seed
from diffusers import logging
from ..stable_diffusion import (
ConditioningData,
PostprocessingSettings,
StableDiffusionGeneratorPipeline,
)
from .base import Generator
class Img2Img(Generator):
def __init__(self, model, precision):
super().__init__(model, precision)
self.init_latent = None # by get_noise()
def get_make_image(
self,
sampler,
steps,
cfg_scale,
ddim_eta,
conditioning,
init_image,
strength,
step_callback=None,
threshold=0.0,
warmup=0.2,
perlin=0.0,
h_symmetry_time_pct=None,
v_symmetry_time_pct=None,
attention_maps_callback=None,
**kwargs,
):
"""
Returns a function returning an image derived from the prompt and the initial image
Return value depends on the seed at the time you call it.
"""
self.perlin = perlin
# noinspection PyTypeChecker
pipeline: StableDiffusionGeneratorPipeline = self.model
pipeline.scheduler = sampler
uc, c, extra_conditioning_info = conditioning
conditioning_data = ConditioningData(
uc,
c,
cfg_scale,
extra_conditioning_info,
postprocessing_settings=PostprocessingSettings(
threshold=threshold,
warmup=warmup,
h_symmetry_time_pct=h_symmetry_time_pct,
v_symmetry_time_pct=v_symmetry_time_pct,
),
).add_scheduler_args_if_applicable(pipeline.scheduler, eta=ddim_eta)
def make_image(x_T: torch.Tensor, seed: int):
# FIXME: use x_T for initial seeded noise
# We're not at the moment because the pipeline automatically resizes init_image if
# necessary, which the x_T input might not match.
# In the meantime, reset the seed prior to generating pipeline output so we at least get the same result.
logging.set_verbosity_error() # quench safety check warnings
pipeline_output = pipeline.img2img_from_embeddings(
init_image,
strength,
steps,
conditioning_data,
noise_func=self.get_noise_like,
callback=step_callback,
seed=seed,
)
if pipeline_output.attention_map_saver is not None and attention_maps_callback is not None:
attention_maps_callback(pipeline_output.attention_map_saver)
return pipeline.numpy_to_pil(pipeline_output.images)[0]
return make_image
def get_noise_like(self, like: torch.Tensor):
device = like.device
x = torch.randn_like(like, device=device)
if self.perlin > 0.0:
shape = like.shape
x = (1 - self.perlin) * x + self.perlin * self.get_perlin_noise(shape[3], shape[2])
return x

View File

@ -1,379 +0,0 @@
"""
invokeai.backend.generator.inpaint descends from .generator
"""
from __future__ import annotations
import math
from typing import Tuple, Union, Optional
import cv2
import numpy as np
import torch
from PIL import Image, ImageChops, ImageFilter, ImageOps
from ..image_util import PatchMatch, debug_image
from ..stable_diffusion.diffusers_pipeline import (
ConditioningData,
StableDiffusionGeneratorPipeline,
image_resized_to_grid_as_tensor,
)
from .img2img import Img2Img
def infill_methods() -> list[str]:
methods = [
"tile",
"solid",
]
if PatchMatch.patchmatch_available():
methods.insert(0, "patchmatch")
return methods
class Inpaint(Img2Img):
def __init__(self, model, precision):
self.inpaint_height = 0
self.inpaint_width = 0
self.enable_image_debugging = False
self.init_latent = None
self.pil_image = None
self.pil_mask = None
self.mask_blur_radius = 0
self.infill_method = None
super().__init__(model, precision)
# Outpaint support code
def get_tile_images(self, image: np.ndarray, width=8, height=8):
_nrows, _ncols, depth = image.shape
_strides = image.strides
nrows, _m = divmod(_nrows, height)
ncols, _n = divmod(_ncols, width)
if _m != 0 or _n != 0:
return None
return np.lib.stride_tricks.as_strided(
np.ravel(image),
shape=(nrows, ncols, height, width, depth),
strides=(height * _strides[0], width * _strides[1], *_strides),
writeable=False,
)
def infill_patchmatch(self, im: Image.Image) -> Image.Image:
if im.mode != "RGBA":
return im
# Skip patchmatch if patchmatch isn't available
if not PatchMatch.patchmatch_available():
return im
# Patchmatch (note, we may want to expose patch_size? Increasing it significantly impacts performance though)
im_patched_np = PatchMatch.inpaint(im.convert("RGB"), ImageOps.invert(im.split()[-1]), patch_size=3)
im_patched = Image.fromarray(im_patched_np, mode="RGB")
return im_patched
def tile_fill_missing(self, im: Image.Image, tile_size: int = 16, seed: Optional[int] = None) -> Image.Image:
# Only fill if there's an alpha layer
if im.mode != "RGBA":
return im
a = np.asarray(im, dtype=np.uint8)
tile_size_tuple = (tile_size, tile_size)
# Get the image as tiles of a specified size
tiles = self.get_tile_images(a, *tile_size_tuple).copy()
# Get the mask as tiles
tiles_mask = tiles[:, :, :, :, 3]
# Find any mask tiles with any fully transparent pixels (we will be replacing these later)
tmask_shape = tiles_mask.shape
tiles_mask = tiles_mask.reshape(math.prod(tiles_mask.shape))
n, ny = (math.prod(tmask_shape[0:2])), math.prod(tmask_shape[2:])
tiles_mask = tiles_mask > 0
tiles_mask = tiles_mask.reshape((n, ny)).all(axis=1)
# Get RGB tiles in single array and filter by the mask
tshape = tiles.shape
tiles_all = tiles.reshape((math.prod(tiles.shape[0:2]), *tiles.shape[2:]))
filtered_tiles = tiles_all[tiles_mask]
if len(filtered_tiles) == 0:
return im
# Find all invalid tiles and replace with a random valid tile
replace_count = (tiles_mask == False).sum()
rng = np.random.default_rng(seed=seed)
tiles_all[np.logical_not(tiles_mask)] = filtered_tiles[
rng.choice(filtered_tiles.shape[0], replace_count), :, :, :
]
# Convert back to an image
tiles_all = tiles_all.reshape(tshape)
tiles_all = tiles_all.swapaxes(1, 2)
st = tiles_all.reshape(
(
math.prod(tiles_all.shape[0:2]),
math.prod(tiles_all.shape[2:4]),
tiles_all.shape[4],
)
)
si = Image.fromarray(st, mode="RGBA")
return si
def mask_edge(self, mask: Image.Image, edge_size: int, edge_blur: int) -> Image.Image:
npimg = np.asarray(mask, dtype=np.uint8)
# Detect any partially transparent regions
npgradient = np.uint8(255 * (1.0 - np.floor(np.abs(0.5 - np.float32(npimg) / 255.0) * 2.0)))
# Detect hard edges
npedge = cv2.Canny(npimg, threshold1=100, threshold2=200)
# Combine
npmask = npgradient + npedge
# Expand
npmask = cv2.dilate(npmask, np.ones((3, 3), np.uint8), iterations=int(edge_size / 2))
new_mask = Image.fromarray(npmask)
if edge_blur > 0:
new_mask = new_mask.filter(ImageFilter.BoxBlur(edge_blur))
return ImageOps.invert(new_mask)
def seam_paint(
self,
im: Image.Image,
seam_size: int,
seam_blur: int,
seed,
steps,
cfg_scale,
ddim_eta,
conditioning,
strength,
noise,
infill_method,
step_callback,
) -> Image.Image:
hard_mask = self.pil_image.split()[-1].copy()
mask = self.mask_edge(hard_mask, seam_size, seam_blur)
make_image = self.get_make_image(
steps,
cfg_scale,
ddim_eta,
conditioning,
init_image=im.copy().convert("RGBA"),
mask_image=mask,
strength=strength,
mask_blur_radius=0,
seam_size=0,
step_callback=step_callback,
inpaint_width=im.width,
inpaint_height=im.height,
infill_method=infill_method,
)
seam_noise = self.get_noise(im.width, im.height)
result = make_image(seam_noise, seed=None)
return result
@torch.no_grad()
def get_make_image(
self,
steps,
cfg_scale,
ddim_eta,
conditioning,
init_image: Union[Image.Image, torch.FloatTensor],
mask_image: Union[Image.Image, torch.FloatTensor],
strength: float,
mask_blur_radius: int = 8,
# Seam settings - when 0, doesn't fill seam
seam_size: int = 96,
seam_blur: int = 16,
seam_strength: float = 0.7,
seam_steps: int = 30,
tile_size: int = 32,
step_callback=None,
inpaint_replace=False,
enable_image_debugging=False,
infill_method=None,
inpaint_width=None,
inpaint_height=None,
inpaint_fill: Tuple[int, int, int, int] = (0x7F, 0x7F, 0x7F, 0xFF),
attention_maps_callback=None,
**kwargs,
):
"""
Returns a function returning an image derived from the prompt and
the initial image + mask. Return value depends on the seed at
the time you call it. kwargs are 'init_latent' and 'strength'
"""
self.enable_image_debugging = enable_image_debugging
infill_method = infill_method or infill_methods()[0]
self.infill_method = infill_method
self.inpaint_width = inpaint_width
self.inpaint_height = inpaint_height
if isinstance(init_image, Image.Image):
self.pil_image = init_image.copy()
# Do infill
if infill_method == "patchmatch" and PatchMatch.patchmatch_available():
init_filled = self.infill_patchmatch(self.pil_image.copy())
elif infill_method == "tile":
init_filled = self.tile_fill_missing(self.pil_image.copy(), seed=self.seed, tile_size=tile_size)
elif infill_method == "solid":
solid_bg = Image.new("RGBA", init_image.size, inpaint_fill)
init_filled = Image.alpha_composite(solid_bg, init_image)
else:
raise ValueError(f"Non-supported infill type {infill_method}", infill_method)
init_filled.paste(init_image, (0, 0), init_image.split()[-1])
# Resize if requested for inpainting
if inpaint_width and inpaint_height:
init_filled = init_filled.resize((inpaint_width, inpaint_height))
debug_image(init_filled, "init_filled", debug_status=self.enable_image_debugging)
# Create init tensor
init_image = image_resized_to_grid_as_tensor(init_filled.convert("RGB"))
if isinstance(mask_image, Image.Image):
self.pil_mask = mask_image.copy()
debug_image(
mask_image,
"mask_image BEFORE multiply with pil_image",
debug_status=self.enable_image_debugging,
)
init_alpha = self.pil_image.getchannel("A")
if mask_image.mode != "L":
# FIXME: why do we get passed an RGB image here? We can only use single-channel.
mask_image = mask_image.convert("L")
mask_image = ImageChops.multiply(mask_image, init_alpha)
self.pil_mask = mask_image
# Resize if requested for inpainting
if inpaint_width and inpaint_height:
mask_image = mask_image.resize((inpaint_width, inpaint_height))
debug_image(
mask_image,
"mask_image AFTER multiply with pil_image",
debug_status=self.enable_image_debugging,
)
mask: torch.FloatTensor = image_resized_to_grid_as_tensor(mask_image, normalize=False)
else:
mask: torch.FloatTensor = mask_image
self.mask_blur_radius = mask_blur_radius
# noinspection PyTypeChecker
pipeline: StableDiffusionGeneratorPipeline = self.model
# todo: support cross-attention control
uc, c, _ = conditioning
conditioning_data = ConditioningData(uc, c, cfg_scale).add_scheduler_args_if_applicable(
pipeline.scheduler, eta=ddim_eta
)
def make_image(x_T: torch.Tensor, seed: int):
pipeline_output = pipeline.inpaint_from_embeddings(
init_image=init_image,
mask=1 - mask, # expects white means "paint here."
strength=strength,
num_inference_steps=steps,
conditioning_data=conditioning_data,
noise_func=self.get_noise_like,
callback=step_callback,
seed=seed,
)
if pipeline_output.attention_map_saver is not None and attention_maps_callback is not None:
attention_maps_callback(pipeline_output.attention_map_saver)
result = self.postprocess_size_and_mask(pipeline.numpy_to_pil(pipeline_output.images)[0])
# Seam paint if this is our first pass (seam_size set to 0 during seam painting)
if seam_size > 0:
old_image = self.pil_image or init_image
old_mask = self.pil_mask or mask_image
result = self.seam_paint(
result,
seam_size,
seam_blur,
seed,
seam_steps,
cfg_scale,
ddim_eta,
conditioning,
seam_strength,
x_T,
infill_method,
step_callback,
)
# Restore original settings
self.get_make_image(
steps,
cfg_scale,
ddim_eta,
conditioning,
old_image,
old_mask,
strength,
mask_blur_radius,
seam_size,
seam_blur,
seam_strength,
seam_steps,
tile_size,
step_callback,
inpaint_replace,
enable_image_debugging,
inpaint_width=inpaint_width,
inpaint_height=inpaint_height,
infill_method=infill_method,
**kwargs,
)
return result
return make_image
def sample_to_image(self, samples) -> Image.Image:
gen_result = super().sample_to_image(samples).convert("RGB")
return self.postprocess_size_and_mask(gen_result)
def postprocess_size_and_mask(self, gen_result: Image.Image) -> Image.Image:
debug_image(gen_result, "gen_result", debug_status=self.enable_image_debugging)
# Resize if necessary
if self.inpaint_width and self.inpaint_height:
gen_result = gen_result.resize(self.pil_image.size)
if self.pil_image is None or self.pil_mask is None:
return gen_result
corrected_result = self.repaste_and_color_correct(
gen_result, self.pil_image, self.pil_mask, self.mask_blur_radius
)
debug_image(
corrected_result,
"corrected_result",
debug_status=self.enable_image_debugging,
)
return corrected_result

View File

@ -12,16 +12,17 @@ def check_invokeai_root(config: InvokeAIAppConfig):
assert config.model_conf_path.exists(), f"{config.model_conf_path} not found"
assert config.db_path.parent.exists(), f"{config.db_path.parent} not found"
assert config.models_path.exists(), f"{config.models_path} not found"
for model in [
"CLIP-ViT-bigG-14-laion2B-39B-b160k",
"bert-base-uncased",
"clip-vit-large-patch14",
"sd-vae-ft-mse",
"stable-diffusion-2-clip",
"stable-diffusion-safety-checker",
]:
path = config.models_path / f"core/convert/{model}"
assert path.exists(), f"{path} is missing"
if not config.ignore_missing_core_models:
for model in [
"CLIP-ViT-bigG-14-laion2B-39B-b160k",
"bert-base-uncased",
"clip-vit-large-patch14",
"sd-vae-ft-mse",
"stable-diffusion-2-clip",
"stable-diffusion-safety-checker",
]:
path = config.models_path / f"core/convert/{model}"
assert path.exists(), f"{path} is missing"
except Exception as e:
print()
print(f"An exception has occurred: {str(e)}")
@ -32,5 +33,10 @@ def check_invokeai_root(config: InvokeAIAppConfig):
print(
'** From the command line, activate the virtual environment and run "invokeai-configure --yes --skip-sd-weights" **'
)
print(
'** (To skip this check completely, add "--ignore_missing_core_models" to your CLI args. Not installing '
"these core models will prevent the loading of some or all .safetensors and .ckpt files. However, you can "
"always come back and install these core models in the future.)"
)
input("Press any key to continue...")
sys.exit(0)

View File

@ -10,15 +10,17 @@ import sys
import argparse
import io
import os
import psutil
import shutil
import textwrap
import torch
import traceback
import yaml
import warnings
from argparse import Namespace
from enum import Enum
from pathlib import Path
from shutil import get_terminal_size
from typing import get_type_hints
from urllib import request
import npyscreen
@ -44,6 +46,8 @@ from invokeai.app.services.config import (
)
from invokeai.backend.util.logging import InvokeAILogger
from invokeai.frontend.install.model_install import addModelsForm, process_and_execute
# TO DO - Move all the frontend code into invokeai.frontend.install
from invokeai.frontend.install.widgets import (
SingleSelectColumns,
CenteredButtonPress,
@ -53,6 +57,7 @@ from invokeai.frontend.install.widgets import (
CyclingForm,
MIN_COLS,
MIN_LINES,
WindowTooSmallException,
)
from invokeai.backend.install.legacy_arg_parsing import legacy_parser
from invokeai.backend.install.model_install_backend import (
@ -61,6 +66,7 @@ from invokeai.backend.install.model_install_backend import (
ModelInstall,
)
from invokeai.backend.model_management.model_probe import ModelType, BaseModelType
from pydantic.error_wrappers import ValidationError
warnings.filterwarnings("ignore")
transformers.logging.set_verbosity_error()
@ -76,6 +82,13 @@ Default_config_file = config.model_conf_path
SD_Configs = config.legacy_conf_path
PRECISION_CHOICES = ["auto", "float16", "float32"]
GB = 1073741824 # GB in bytes
HAS_CUDA = torch.cuda.is_available()
_, MAX_VRAM = torch.cuda.mem_get_info() if HAS_CUDA else (0, 0)
MAX_VRAM /= GB
MAX_RAM = psutil.virtual_memory().total / GB
INIT_FILE_PREAMBLE = """# InvokeAI initialization file
# This is the InvokeAI initialization file, which contains command-line default values.
@ -86,6 +99,12 @@ INIT_FILE_PREAMBLE = """# InvokeAI initialization file
logger = InvokeAILogger.getLogger()
class DummyWidgetValue(Enum):
zero = 0
true = True
false = False
# --------------------------------------------
def postscript(errors: None):
if not any(errors):
@ -376,15 +395,47 @@ Use cursor arrows to make a checkbox selection, and space to toggle.
max_width=80,
scroll_exit=True,
)
self.max_cache_size = self.add_widget_intelligent(
IntTitleSlider,
name="Size of the RAM cache used for fast model switching (GB)",
value=old_opts.max_cache_size,
out_of=20,
lowest=3,
begin_entry_at=6,
self.nextrely += 1
self.add_widget_intelligent(
npyscreen.TitleFixedText,
name="RAM cache size (GB). Make this at least large enough to hold a single full model.",
begin_entry_at=0,
editable=False,
color="CONTROL",
scroll_exit=True,
)
self.nextrely -= 1
self.max_cache_size = self.add_widget_intelligent(
npyscreen.Slider,
value=clip(old_opts.max_cache_size, range=(3.0, MAX_RAM), step=0.5),
out_of=round(MAX_RAM),
lowest=0.0,
step=0.5,
relx=8,
scroll_exit=True,
)
if HAS_CUDA:
self.nextrely += 1
self.add_widget_intelligent(
npyscreen.TitleFixedText,
name="VRAM cache size (GB). Reserving a small amount of VRAM will modestly speed up the start of image generation.",
begin_entry_at=0,
editable=False,
color="CONTROL",
scroll_exit=True,
)
self.nextrely -= 1
self.max_vram_cache_size = self.add_widget_intelligent(
npyscreen.Slider,
value=clip(old_opts.max_vram_cache_size, range=(0, MAX_VRAM), step=0.25),
out_of=round(MAX_VRAM * 2) / 2,
lowest=0.0,
relx=8,
step=0.25,
scroll_exit=True,
)
else:
self.max_vram_cache_size = DummyWidgetValue.zero
self.nextrely += 1
self.outdir = self.add_widget_intelligent(
FileBox,
@ -401,7 +452,7 @@ Use cursor arrows to make a checkbox selection, and space to toggle.
self.autoimport_dirs = {}
self.autoimport_dirs["autoimport_dir"] = self.add_widget_intelligent(
FileBox,
name=f"Folder to recursively scan for new checkpoints, ControlNets, LoRAs and TI models",
name="Folder to recursively scan for new checkpoints, ControlNets, LoRAs and TI models",
value=str(config.root_path / config.autoimport_dir),
select_dir=True,
must_exist=False,
@ -476,6 +527,7 @@ https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENS
"outdir",
"free_gpu_mem",
"max_cache_size",
"max_vram_cache_size",
"xformers_enabled",
"always_use_cpu",
]:
@ -553,6 +605,16 @@ def default_user_selections(program_opts: Namespace) -> InstallSelections:
)
# -------------------------------------
def clip(value: float, range: tuple[float, float], step: float) -> float:
minimum, maximum = range
if value < minimum:
value = minimum
if value > maximum:
value = maximum
return round(value / step) * step
# -------------------------------------
def initialize_rootdir(root: Path, yes_to_all: bool = False):
logger.info("Initializing InvokeAI runtime directory")
@ -592,13 +654,13 @@ def maybe_create_models_yaml(root: Path):
# -------------------------------------
def run_console_ui(program_opts: Namespace, initfile: Path = None) -> (Namespace, Namespace):
# parse_args() will read from init file if present
invokeai_opts = default_startup_options(initfile)
invokeai_opts.root = program_opts.root
# The third argument is needed in the Windows 11 environment to
# launch a console window running this program.
set_min_terminal_size(MIN_COLS, MIN_LINES)
if not set_min_terminal_size(MIN_COLS, MIN_LINES):
raise WindowTooSmallException(
"Could not increase terminal size. Try running again with a larger window or smaller font size."
)
# the install-models application spawns a subprocess to install
# models, and will crash unless this is set before running.
@ -654,10 +716,13 @@ def migrate_init_file(legacy_format: Path):
old = legacy_parser.parse_args([f"@{str(legacy_format)}"])
new = InvokeAIAppConfig.get_config()
fields = list(get_type_hints(InvokeAIAppConfig).keys())
fields = [x for x, y in InvokeAIAppConfig.__fields__.items() if y.field_info.extra.get("category") != "DEPRECATED"]
for attr in fields:
if hasattr(old, attr):
setattr(new, attr, getattr(old, attr))
try:
setattr(new, attr, getattr(old, attr))
except ValidationError as e:
print(f"* Ignoring incompatible value for field {attr}:\n {str(e)}")
# a few places where the field names have changed and we have to
# manually add in the new names/values
@ -777,6 +842,7 @@ def main():
models_to_download = default_user_selections(opt)
new_init_file = config.root_path / "invokeai.yaml"
if opt.yes_to_all:
write_default_options(opt, new_init_file)
init_options = Namespace(precision="float32" if opt.full_precision else "float16")
@ -802,6 +868,8 @@ def main():
postscript(errors=errors)
if not opt.yes_to_all:
input("Press any key to continue...")
except WindowTooSmallException as e:
logger.error(str(e))
except KeyboardInterrupt:
print("\nGoodbye! Come back soon.")

View File

@ -591,7 +591,6 @@ script, which will perform a full upgrade in place.""",
# TODO: revisit - don't rely on invokeai.yaml to exist yet!
dest_is_setup = (dest_root / "models/core").exists() and (dest_root / "databases").exists()
if not dest_is_setup:
import invokeai.frontend.install.invokeai_configure
from invokeai.backend.install.invokeai_configure import initialize_rootdir
initialize_rootdir(dest_root, True)

View File

@ -13,6 +13,7 @@ import requests
from diffusers import DiffusionPipeline
from diffusers import logging as dlogging
import onnx
import torch
from huggingface_hub import hf_hub_url, HfFolder, HfApi
from omegaconf import OmegaConf
from tqdm import tqdm
@ -23,6 +24,7 @@ from invokeai.app.services.config import InvokeAIAppConfig
from invokeai.backend.model_management import ModelManager, ModelType, BaseModelType, ModelVariantType, AddModelResult
from invokeai.backend.model_management.model_probe import ModelProbe, SchedulerPredictionType, ModelProbeInfo
from invokeai.backend.util import download_with_resume
from invokeai.backend.util.devices import torch_dtype, choose_torch_device
from ..util.logging import InvokeAILogger
warnings.filterwarnings("ignore")
@ -99,9 +101,9 @@ class ModelInstall(object):
def __init__(
self,
config: InvokeAIAppConfig,
prediction_type_helper: Callable[[Path], SchedulerPredictionType] = None,
model_manager: ModelManager = None,
access_token: str = None,
prediction_type_helper: Optional[Callable[[Path], SchedulerPredictionType]] = None,
model_manager: Optional[ModelManager] = None,
access_token: Optional[str] = None,
):
self.config = config
self.mgr = model_manager or ModelManager(config.model_conf_path)
@ -303,7 +305,7 @@ class ModelInstall(object):
with TemporaryDirectory(dir=self.config.models_path) as staging:
staging = Path(staging)
if "model_index.json" in files and "unet/model.onnx" not in files:
if "model_index.json" in files:
location = self._download_hf_pipeline(repo_id, staging) # pipeline
elif "unet/model.onnx" in files:
location = self._download_hf_model(repo_id, files, staging)
@ -416,15 +418,25 @@ class ModelInstall(object):
does a save_pretrained() to the indicated staging area.
"""
_, name = repo_id.split("/")
revisions = ["fp16", "main"] if self.config.precision == "float16" else ["main"]
precision = torch_dtype(choose_torch_device())
variants = ["fp16", None] if precision == torch.float16 else [None, "fp16"]
model = None
for revision in revisions:
for variant in variants:
try:
model = DiffusionPipeline.from_pretrained(repo_id, revision=revision, safety_checker=None)
except: # most errors are due to fp16 not being present. Fix this to catch other errors
pass
model = DiffusionPipeline.from_pretrained(
repo_id,
variant=variant,
torch_dtype=precision,
safety_checker=None,
)
except Exception as e: # most errors are due to fp16 not being present. Fix this to catch other errors
if "fp16" not in str(e):
print(e)
if model:
break
if not model:
logger.error(f"Diffusers model {repo_id} could not be downloaded. Skipping.")
return None

View File

@ -13,3 +13,4 @@ from .models import (
DuplicateModelException,
)
from .model_merge import ModelMerger, MergeInterpolationMethod
from .lora import ModelPatcher

View File

@ -20,424 +20,6 @@ from diffusers.models import UNet2DConditionModel
from safetensors.torch import load_file
from transformers import CLIPTextModel, CLIPTokenizer
# TODO: rename and split this file
class LoRALayerBase:
# rank: Optional[int]
# alpha: Optional[float]
# bias: Optional[torch.Tensor]
# layer_key: str
# @property
# def scale(self):
# return self.alpha / self.rank if (self.alpha and self.rank) else 1.0
def __init__(
self,
layer_key: str,
values: dict,
):
if "alpha" in values:
self.alpha = values["alpha"].item()
else:
self.alpha = None
if "bias_indices" in values and "bias_values" in values and "bias_size" in values:
self.bias = torch.sparse_coo_tensor(
values["bias_indices"],
values["bias_values"],
tuple(values["bias_size"]),
)
else:
self.bias = None
self.rank = None # set in layer implementation
self.layer_key = layer_key
def forward(
self,
module: torch.nn.Module,
input_h: Any, # for real looks like Tuple[torch.nn.Tensor] but not sure
multiplier: float,
):
if type(module) == torch.nn.Conv2d:
op = torch.nn.functional.conv2d
extra_args = dict(
stride=module.stride,
padding=module.padding,
dilation=module.dilation,
groups=module.groups,
)
else:
op = torch.nn.functional.linear
extra_args = {}
weight = self.get_weight()
bias = self.bias if self.bias is not None else 0
scale = self.alpha / self.rank if (self.alpha and self.rank) else 1.0
return (
op(
*input_h,
(weight + bias).view(module.weight.shape),
None,
**extra_args,
)
* multiplier
* scale
)
def get_weight(self):
raise NotImplementedError()
def calc_size(self) -> int:
model_size = 0
for val in [self.bias]:
if val is not None:
model_size += val.nelement() * val.element_size()
return model_size
def to(
self,
device: Optional[torch.device] = None,
dtype: Optional[torch.dtype] = None,
):
if self.bias is not None:
self.bias = self.bias.to(device=device, dtype=dtype)
# TODO: find and debug lora/locon with bias
class LoRALayer(LoRALayerBase):
# up: torch.Tensor
# mid: Optional[torch.Tensor]
# down: torch.Tensor
def __init__(
self,
layer_key: str,
values: dict,
):
super().__init__(layer_key, values)
self.up = values["lora_up.weight"]
self.down = values["lora_down.weight"]
if "lora_mid.weight" in values:
self.mid = values["lora_mid.weight"]
else:
self.mid = None
self.rank = self.down.shape[0]
def get_weight(self):
if self.mid is not None:
up = self.up.reshape(self.up.shape[0], self.up.shape[1])
down = self.down.reshape(self.down.shape[0], self.down.shape[1])
weight = torch.einsum("m n w h, i m, n j -> i j w h", self.mid, up, down)
else:
weight = self.up.reshape(self.up.shape[0], -1) @ self.down.reshape(self.down.shape[0], -1)
return weight
def calc_size(self) -> int:
model_size = super().calc_size()
for val in [self.up, self.mid, self.down]:
if val is not None:
model_size += val.nelement() * val.element_size()
return model_size
def to(
self,
device: Optional[torch.device] = None,
dtype: Optional[torch.dtype] = None,
):
super().to(device=device, dtype=dtype)
self.up = self.up.to(device=device, dtype=dtype)
self.down = self.down.to(device=device, dtype=dtype)
if self.mid is not None:
self.mid = self.mid.to(device=device, dtype=dtype)
class LoHALayer(LoRALayerBase):
# w1_a: torch.Tensor
# w1_b: torch.Tensor
# w2_a: torch.Tensor
# w2_b: torch.Tensor
# t1: Optional[torch.Tensor] = None
# t2: Optional[torch.Tensor] = None
def __init__(
self,
layer_key: str,
values: dict,
):
super().__init__(layer_key, values)
self.w1_a = values["hada_w1_a"]
self.w1_b = values["hada_w1_b"]
self.w2_a = values["hada_w2_a"]
self.w2_b = values["hada_w2_b"]
if "hada_t1" in values:
self.t1 = values["hada_t1"]
else:
self.t1 = None
if "hada_t2" in values:
self.t2 = values["hada_t2"]
else:
self.t2 = None
self.rank = self.w1_b.shape[0]
def get_weight(self):
if self.t1 is None:
weight = (self.w1_a @ self.w1_b) * (self.w2_a @ self.w2_b)
else:
rebuild1 = torch.einsum("i j k l, j r, i p -> p r k l", self.t1, self.w1_b, self.w1_a)
rebuild2 = torch.einsum("i j k l, j r, i p -> p r k l", self.t2, self.w2_b, self.w2_a)
weight = rebuild1 * rebuild2
return weight
def calc_size(self) -> int:
model_size = super().calc_size()
for val in [self.w1_a, self.w1_b, self.w2_a, self.w2_b, self.t1, self.t2]:
if val is not None:
model_size += val.nelement() * val.element_size()
return model_size
def to(
self,
device: Optional[torch.device] = None,
dtype: Optional[torch.dtype] = None,
):
super().to(device=device, dtype=dtype)
self.w1_a = self.w1_a.to(device=device, dtype=dtype)
self.w1_b = self.w1_b.to(device=device, dtype=dtype)
if self.t1 is not None:
self.t1 = self.t1.to(device=device, dtype=dtype)
self.w2_a = self.w2_a.to(device=device, dtype=dtype)
self.w2_b = self.w2_b.to(device=device, dtype=dtype)
if self.t2 is not None:
self.t2 = self.t2.to(device=device, dtype=dtype)
class LoKRLayer(LoRALayerBase):
# w1: Optional[torch.Tensor] = None
# w1_a: Optional[torch.Tensor] = None
# w1_b: Optional[torch.Tensor] = None
# w2: Optional[torch.Tensor] = None
# w2_a: Optional[torch.Tensor] = None
# w2_b: Optional[torch.Tensor] = None
# t2: Optional[torch.Tensor] = None
def __init__(
self,
layer_key: str,
values: dict,
):
super().__init__(layer_key, values)
if "lokr_w1" in values:
self.w1 = values["lokr_w1"]
self.w1_a = None
self.w1_b = None
else:
self.w1 = None
self.w1_a = values["lokr_w1_a"]
self.w1_b = values["lokr_w1_b"]
if "lokr_w2" in values:
self.w2 = values["lokr_w2"]
self.w2_a = None
self.w2_b = None
else:
self.w2 = None
self.w2_a = values["lokr_w2_a"]
self.w2_b = values["lokr_w2_b"]
if "lokr_t2" in values:
self.t2 = values["lokr_t2"]
else:
self.t2 = None
if "lokr_w1_b" in values:
self.rank = values["lokr_w1_b"].shape[0]
elif "lokr_w2_b" in values:
self.rank = values["lokr_w2_b"].shape[0]
else:
self.rank = None # unscaled
def get_weight(self):
w1 = self.w1
if w1 is None:
w1 = self.w1_a @ self.w1_b
w2 = self.w2
if w2 is None:
if self.t2 is None:
w2 = self.w2_a @ self.w2_b
else:
w2 = torch.einsum("i j k l, i p, j r -> p r k l", self.t2, self.w2_a, self.w2_b)
if len(w2.shape) == 4:
w1 = w1.unsqueeze(2).unsqueeze(2)
w2 = w2.contiguous()
weight = torch.kron(w1, w2)
return weight
def calc_size(self) -> int:
model_size = super().calc_size()
for val in [self.w1, self.w1_a, self.w1_b, self.w2, self.w2_a, self.w2_b, self.t2]:
if val is not None:
model_size += val.nelement() * val.element_size()
return model_size
def to(
self,
device: Optional[torch.device] = None,
dtype: Optional[torch.dtype] = None,
):
super().to(device=device, dtype=dtype)
if self.w1 is not None:
self.w1 = self.w1.to(device=device, dtype=dtype)
else:
self.w1_a = self.w1_a.to(device=device, dtype=dtype)
self.w1_b = self.w1_b.to(device=device, dtype=dtype)
if self.w2 is not None:
self.w2 = self.w2.to(device=device, dtype=dtype)
else:
self.w2_a = self.w2_a.to(device=device, dtype=dtype)
self.w2_b = self.w2_b.to(device=device, dtype=dtype)
if self.t2 is not None:
self.t2 = self.t2.to(device=device, dtype=dtype)
class LoRAModel: # (torch.nn.Module):
_name: str
layers: Dict[str, LoRALayer]
_device: torch.device
_dtype: torch.dtype
def __init__(
self,
name: str,
layers: Dict[str, LoRALayer],
device: torch.device,
dtype: torch.dtype,
):
self._name = name
self._device = device or torch.cpu
self._dtype = dtype or torch.float32
self.layers = layers
@property
def name(self):
return self._name
@property
def device(self):
return self._device
@property
def dtype(self):
return self._dtype
def to(
self,
device: Optional[torch.device] = None,
dtype: Optional[torch.dtype] = None,
) -> LoRAModel:
# TODO: try revert if exception?
for key, layer in self.layers.items():
layer.to(device=device, dtype=dtype)
self._device = device
self._dtype = dtype
def calc_size(self) -> int:
model_size = 0
for _, layer in self.layers.items():
model_size += layer.calc_size()
return model_size
@classmethod
def from_checkpoint(
cls,
file_path: Union[str, Path],
device: Optional[torch.device] = None,
dtype: Optional[torch.dtype] = None,
):
device = device or torch.device("cpu")
dtype = dtype or torch.float32
if isinstance(file_path, str):
file_path = Path(file_path)
model = cls(
device=device,
dtype=dtype,
name=file_path.stem, # TODO:
layers=dict(),
)
if file_path.suffix == ".safetensors":
state_dict = load_file(file_path.absolute().as_posix(), device="cpu")
else:
state_dict = torch.load(file_path, map_location="cpu")
state_dict = cls._group_state(state_dict)
for layer_key, values in state_dict.items():
# lora and locon
if "lora_down.weight" in values:
layer = LoRALayer(layer_key, values)
# loha
elif "hada_w1_b" in values:
layer = LoHALayer(layer_key, values)
# lokr
elif "lokr_w1_b" in values or "lokr_w1" in values:
layer = LoKRLayer(layer_key, values)
else:
# TODO: diff/ia3/... format
print(f">> Encountered unknown lora layer module in {model.name}: {layer_key}")
return
# lower memory consumption by removing already parsed layer values
state_dict[layer_key].clear()
layer.to(device=device, dtype=dtype)
model.layers[layer_key] = layer
return model
@staticmethod
def _group_state(state_dict: dict):
state_dict_groupped = dict()
for key, value in state_dict.items():
stem, leaf = key.split(".", 1)
if stem not in state_dict_groupped:
state_dict_groupped[stem] = dict()
state_dict_groupped[stem][leaf] = value
return state_dict_groupped
"""
loras = [
(lora_model1, 0.7),
@ -516,6 +98,26 @@ class ModelPatcher:
with cls.apply_lora(text_encoder, loras, "lora_te_"):
yield
@classmethod
@contextmanager
def apply_sdxl_lora_text_encoder(
cls,
text_encoder: CLIPTextModel,
loras: List[Tuple[LoRAModel, float]],
):
with cls.apply_lora(text_encoder, loras, "lora_te1_"):
yield
@classmethod
@contextmanager
def apply_sdxl_lora_text_encoder2(
cls,
text_encoder: CLIPTextModel,
loras: List[Tuple[LoRAModel, float]],
):
with cls.apply_lora(text_encoder, loras, "lora_te2_"):
yield
@classmethod
@contextmanager
def apply_lora(
@ -541,7 +143,7 @@ class ModelPatcher:
# with torch.autocast(device_type="cpu"):
layer.to(dtype=torch.float32)
layer_scale = layer.alpha / layer.rank if (layer.alpha and layer.rank) else 1.0
layer_weight = layer.get_weight() * lora_weight * layer_scale
layer_weight = layer.get_weight(original_weights[module_key]) * lora_weight * layer_scale
if module.weight.shape != layer_weight.shape:
# TODO: debug on lycoris
@ -562,7 +164,7 @@ class ModelPatcher:
cls,
tokenizer: CLIPTokenizer,
text_encoder: CLIPTextModel,
ti_list: List[Any],
ti_list: List[Tuple[str, Any]],
) -> Tuple[CLIPTokenizer, TextualInversionManager]:
init_tokens_count = None
new_tokens_added = None
@ -572,27 +174,27 @@ class ModelPatcher:
ti_manager = TextualInversionManager(ti_tokenizer)
init_tokens_count = text_encoder.resize_token_embeddings(None).num_embeddings
def _get_trigger(ti, index):
trigger = ti.name
def _get_trigger(ti_name, index):
trigger = ti_name
if index > 0:
trigger += f"-!pad-{i}"
return f"<{trigger}>"
# modify tokenizer
new_tokens_added = 0
for ti in ti_list:
for ti_name, ti in ti_list:
for i in range(ti.embedding.shape[0]):
new_tokens_added += ti_tokenizer.add_tokens(_get_trigger(ti, i))
new_tokens_added += ti_tokenizer.add_tokens(_get_trigger(ti_name, i))
# modify text_encoder
text_encoder.resize_token_embeddings(init_tokens_count + new_tokens_added)
model_embeddings = text_encoder.get_input_embeddings()
for ti in ti_list:
for ti_name, ti in ti_list:
ti_tokens = []
for i in range(ti.embedding.shape[0]):
embedding = ti.embedding[i]
trigger = _get_trigger(ti, i)
trigger = _get_trigger(ti_name, i)
token_id = ti_tokenizer.convert_tokens_to_ids(trigger)
if token_id == ti_tokenizer.unk_token_id:
@ -637,7 +239,6 @@ class ModelPatcher:
class TextualInversionModel:
name: str
embedding: torch.Tensor # [n, 768]|[n, 1280]
@classmethod
@ -651,7 +252,6 @@ class TextualInversionModel:
file_path = Path(file_path)
result = cls() # TODO:
result.name = file_path.stem # TODO:
if file_path.suffix == ".safetensors":
state_dict = load_file(file_path.absolute().as_posix(), device="cpu")
@ -761,7 +361,8 @@ class ONNXModelPatcher:
layer.to(dtype=torch.float32)
layer_key = layer_key.replace(prefix, "")
layer_weight = layer.get_weight().detach().cpu().numpy() * lora_weight
# TODO: rewrite to pass original tensor weight(required by ia3)
layer_weight = layer.get_weight(None).detach().cpu().numpy() * lora_weight
if layer_key is blended_loras:
blended_loras[layer_key] += layer_weight
else:
@ -828,7 +429,7 @@ class ONNXModelPatcher:
cls,
tokenizer: CLIPTokenizer,
text_encoder: IAIOnnxRuntimeModel,
ti_list: List[Any],
ti_list: List[Tuple[str, Any]],
) -> Tuple[CLIPTokenizer, TextualInversionManager]:
from .models.base import IAIOnnxRuntimeModel
@ -841,17 +442,17 @@ class ONNXModelPatcher:
ti_tokenizer = copy.deepcopy(tokenizer)
ti_manager = TextualInversionManager(ti_tokenizer)
def _get_trigger(ti, index):
trigger = ti.name
def _get_trigger(ti_name, index):
trigger = ti_name
if index > 0:
trigger += f"-!pad-{i}"
return f"<{trigger}>"
# modify tokenizer
new_tokens_added = 0
for ti in ti_list:
for ti_name, ti in ti_list:
for i in range(ti.embedding.shape[0]):
new_tokens_added += ti_tokenizer.add_tokens(_get_trigger(ti, i))
new_tokens_added += ti_tokenizer.add_tokens(_get_trigger(ti_name, i))
# modify text_encoder
orig_embeddings = text_encoder.tensors["text_model.embeddings.token_embedding.weight"]
@ -861,11 +462,11 @@ class ONNXModelPatcher:
axis=0,
)
for ti in ti_list:
for ti_name, ti in ti_list:
ti_tokens = []
for i in range(ti.embedding.shape[0]):
embedding = ti.embedding[i].detach().numpy()
trigger = _get_trigger(ti, i)
trigger = _get_trigger(ti_name, i)
token_id = ti_tokenizer.convert_tokens_to_ids(trigger)
if token_id == ti_tokenizer.unk_token_id:

View File

@ -21,15 +21,13 @@ import os
import sys
import hashlib
from contextlib import suppress
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, Union, types, Optional, Type, Any
import torch
import logging
import invokeai.backend.util.logging as logger
from invokeai.app.services.config import get_invokeai_config
from .lora import LoRAModel, TextualInversionModel
from .models import BaseModelType, ModelType, SubModelType, ModelBase
# Maximum size of the cache, in gigs
@ -43,6 +41,18 @@ DEFAULT_MAX_VRAM_CACHE_SIZE = 2.75
GIG = 1073741824
@dataclass
class CacheStats(object):
hits: int = 0 # cache hits
misses: int = 0 # cache misses
high_watermark: int = 0 # amount of cache used
in_cache: int = 0 # number of models in cache
cleared: int = 0 # number of models cleared to make space
cache_size: int = 0 # total size of cache
# {submodel_key => size}
loaded_model_sizes: Dict[str, int] = field(default_factory=dict)
class ModelLocker(object):
"Forward declaration"
pass
@ -117,6 +127,9 @@ class ModelCache(object):
self.sha_chunksize = sha_chunksize
self.logger = logger
# used for stats collection
self.stats = None
self._cached_models = dict()
self._cache_stack = list()
@ -183,13 +196,14 @@ class ModelCache(object):
model_type=model_type,
submodel_type=submodel,
)
# TODO: lock for no copies on simultaneous calls?
cache_entry = self._cached_models.get(key, None)
if cache_entry is None:
self.logger.info(
f"Loading model {model_path}, type {base_model.value}:{model_type.value}:{submodel.value if submodel else ''}"
f"Loading model {model_path}, type {base_model.value}:{model_type.value}{':'+submodel.value if submodel else ''}"
)
if self.stats:
self.stats.misses += 1
# this will remove older cached models until
# there is sufficient room to load the requested model
@ -203,6 +217,17 @@ class ModelCache(object):
cache_entry = _CacheRecord(self, model, mem_used)
self._cached_models[key] = cache_entry
else:
if self.stats:
self.stats.hits += 1
if self.stats:
self.stats.cache_size = self.max_cache_size * GIG
self.stats.high_watermark = max(self.stats.high_watermark, self._cache_size())
self.stats.in_cache = len(self._cached_models)
self.stats.loaded_model_sizes[key] = max(
self.stats.loaded_model_sizes.get(key, 0), model_info.get_size(submodel)
)
with suppress(Exception):
self._cache_stack.remove(key)
@ -282,14 +307,14 @@ class ModelCache(object):
"""
Given the HF repo id or path to a model on disk, returns a unique
hash. Works for legacy checkpoint files, HF models on disk, and HF repo IDs
:param model_path: Path to model file/directory on disk.
"""
return self._local_model_hash(model_path)
def cache_size(self) -> float:
"Return the current size of the cache, in GB"
current_cache_size = sum([m.size for m in self._cached_models.values()])
return current_cache_size / GIG
"""Return the current size of the cache, in GB."""
return self._cache_size() / GIG
def _has_cuda(self) -> bool:
return self.execution_device.type == "cuda"
@ -312,12 +337,15 @@ class ModelCache(object):
f"Current VRAM/RAM usage: {vram}/{ram}; cached_models/loaded_models/locked_models/ = {cached_models}/{loaded_models}/{locked_models}"
)
def _cache_size(self) -> int:
return sum([m.size for m in self._cached_models.values()])
def _make_cache_room(self, model_size):
# calculate how much memory this model will require
# multiplier = 2 if self.precision==torch.float32 else 1
bytes_needed = model_size
maximum_size = self.max_cache_size * GIG # stored in GB, convert to bytes
current_size = sum([m.size for m in self._cached_models.values()])
current_size = self._cache_size()
if current_size + bytes_needed > maximum_size:
self.logger.debug(
@ -366,6 +394,8 @@ class ModelCache(object):
f"Unloading model {model_key} to free {(model_size/GIG):.2f} GB (-{(cache_entry.size/GIG):.2f} GB)"
)
current_size -= cache_entry.size
if self.stats:
self.stats.cleared += 1
del self._cache_stack[pos]
del self._cached_models[model_key]
del cache_entry

View File

@ -228,19 +228,19 @@ the root is the InvokeAI ROOTDIR.
"""
from __future__ import annotations
import os
import hashlib
import os
import textwrap
import yaml
import types
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, List, Tuple, Union, Dict, Set, Callable, types
from shutil import rmtree, move
from typing import Optional, List, Literal, Tuple, Union, Dict, Set, Callable
import torch
import yaml
from omegaconf import OmegaConf
from omegaconf.dictconfig import DictConfig
from pydantic import BaseModel, Field
import invokeai.backend.util.logging as logger
@ -259,6 +259,7 @@ from .models import (
ModelNotFoundException,
InvalidModelException,
DuplicateModelException,
ModelBase,
)
# We are only starting to number the config file with release 3.
@ -361,7 +362,7 @@ class ModelManager(object):
if model_key.startswith("_"):
continue
model_name, base_model, model_type = self.parse_key(model_key)
model_class = MODEL_CLASSES[base_model][model_type]
model_class = self._get_implementation(base_model, model_type)
# alias for config file
model_config["model_format"] = model_config.pop("format")
self.models[model_key] = model_class.create_config(**model_config)
@ -381,18 +382,24 @@ class ModelManager(object):
# causing otherwise unreferenced models to be removed from memory
self._read_models()
def model_exists(
self,
model_name: str,
base_model: BaseModelType,
model_type: ModelType,
) -> bool:
def model_exists(self, model_name: str, base_model: BaseModelType, model_type: ModelType, *, rescan=False) -> bool:
"""
Given a model name, returns True if it is a valid
identifier.
Given a model name, returns True if it is a valid identifier.
:param model_name: symbolic name of the model in models.yaml
:param model_type: ModelType enum indicating the type of model to return
:param base_model: BaseModelType enum indicating the base model used by this model
:param rescan: if True, scan_models_directory
"""
model_key = self.create_key(model_name, base_model, model_type)
return model_key in self.models
exists = model_key in self.models
# if model not found try to find it (maybe file just pasted)
if rescan and not exists:
self.scan_models_directory(base_model=base_model, model_type=model_type)
exists = self.model_exists(model_name, base_model, model_type, rescan=False)
return exists
@classmethod
def create_key(
@ -443,39 +450,32 @@ class ModelManager(object):
:param model_name: symbolic name of the model in models.yaml
:param model_type: ModelType enum indicating the type of model to return
:param base_model: BaseModelType enum indicating the base model used by this model
:param submode_typel: an ModelType enum indicating the portion of
:param submodel_type: an ModelType enum indicating the portion of
the model to retrieve (e.g. ModelType.Vae)
"""
model_class = MODEL_CLASSES[base_model][model_type]
model_key = self.create_key(model_name, base_model, model_type)
# if model not found try to find it (maybe file just pasted)
if model_key not in self.models:
self.scan_models_directory(base_model=base_model, model_type=model_type)
if model_key not in self.models:
raise ModelNotFoundException(f"Model not found - {model_key}")
if not self.model_exists(model_name, base_model, model_type, rescan=True):
raise ModelNotFoundException(f"Model not found - {model_key}")
model_config = self.models[model_key]
model_path = self.resolve_model_path(model_config.path)
model_config = self._get_model_config(base_model, model_name, model_type)
model_path, is_submodel_override = self._get_model_path(model_config, submodel_type)
if is_submodel_override:
model_type = submodel_type
submodel_type = None
model_class = self._get_implementation(base_model, model_type)
if not model_path.exists():
if model_class.save_to_config:
self.models[model_key].error = ModelError.NotFound
raise Exception(f'Files for model "{model_key}" not found')
raise Exception(f'Files for model "{model_key}" not found at {model_path}')
else:
self.models.pop(model_key, None)
raise ModelNotFoundException(f"Model not found - {model_key}")
# vae/movq override
# TODO:
if submodel_type is not None and hasattr(model_config, submodel_type):
override_path = getattr(model_config, submodel_type)
if override_path:
model_path = self.app_config.root_path / override_path
model_type = submodel_type
submodel_type = None
model_class = MODEL_CLASSES[base_model][model_type]
raise ModelNotFoundException(f'Files for model "{model_key}" not found at {model_path}')
# TODO: path
# TODO: is it accurate to use path as id
@ -513,12 +513,61 @@ class ModelManager(object):
_cache=self.cache,
)
def _get_model_path(
self, model_config: ModelConfigBase, submodel_type: Optional[SubModelType] = None
) -> (Path, bool):
"""Extract a model's filesystem path from its config.
:return: The fully qualified Path of the module (or submodule).
"""
model_path = model_config.path
is_submodel_override = False
# Does the config explicitly override the submodel?
if submodel_type is not None and hasattr(model_config, submodel_type):
submodel_path = getattr(model_config, submodel_type)
if submodel_path is not None and len(submodel_path) > 0:
model_path = getattr(model_config, submodel_type)
is_submodel_override = True
model_path = self.resolve_model_path(model_path)
return model_path, is_submodel_override
def _get_model_config(self, base_model: BaseModelType, model_name: str, model_type: ModelType) -> ModelConfigBase:
"""Get a model's config object."""
model_key = self.create_key(model_name, base_model, model_type)
try:
model_config = self.models[model_key]
except KeyError:
raise ModelNotFoundException(f"Model not found - {model_key}")
return model_config
def _get_implementation(self, base_model: BaseModelType, model_type: ModelType) -> type[ModelBase]:
"""Get the concrete implementation class for a specific model type."""
model_class = MODEL_CLASSES[base_model][model_type]
return model_class
def _instantiate(
self,
model_name: str,
base_model: BaseModelType,
model_type: ModelType,
submodel_type: Optional[SubModelType] = None,
) -> ModelBase:
"""Make a new instance of this model, without loading it."""
model_config = self._get_model_config(base_model, model_name, model_type)
model_path, is_submodel_override = self._get_model_path(model_config, submodel_type)
# FIXME: do non-overriden submodels get the right class?
constructor = self._get_implementation(base_model, model_type)
instance = constructor(model_path, base_model, model_type)
return instance
def model_info(
self,
model_name: str,
base_model: BaseModelType,
model_type: ModelType,
) -> dict:
) -> Union[dict, None]:
"""
Given a model name returns the OmegaConf (dict-like) object describing it.
"""
@ -540,13 +589,16 @@ class ModelManager(object):
model_name: str,
base_model: BaseModelType,
model_type: ModelType,
) -> dict:
) -> Union[dict, None]:
"""
Returns a dict describing one installed model, using
the combined format of the list_models() method.
"""
models = self.list_models(base_model, model_type, model_name)
return models[0] if models else None
if len(models) >= 1:
return models[0]
else:
return None
def list_models(
self,
@ -560,7 +612,7 @@ class ModelManager(object):
model_keys = (
[self.create_key(model_name, base_model, model_type)]
if model_name
if model_name and base_model and model_type
else sorted(self.models, key=str.casefold)
)
models = []
@ -596,7 +648,7 @@ class ModelManager(object):
Print a table of models and their descriptions. This needs to be redone
"""
# TODO: redo
for model_type, model_dict in self.list_models().items():
for model_dict in self.list_models():
for model_name, model_info in model_dict.items():
line = f'{model_info["name"]:25s} {model_info["type"]:10s} {model_info["description"]}'
print(line)
@ -658,7 +710,7 @@ class ModelManager(object):
if path := model_attributes.get("path"):
model_attributes["path"] = str(self.relative_model_path(Path(path)))
model_class = MODEL_CLASSES[base_model][model_type]
model_class = self._get_implementation(base_model, model_type)
model_config = model_class.create_config(**model_attributes)
model_key = self.create_key(model_name, base_model, model_type)
@ -670,7 +722,7 @@ class ModelManager(object):
# TODO: if path changed and old_model.path inside models folder should we delete this too?
# remove conversion cache as config changed
old_model_path = self.app_config.root_path / old_model.path
old_model_path = self.resolve_model_path(old_model.path)
old_model_cache = self._get_model_cache_path(old_model_path)
if old_model_cache.exists():
if old_model_cache.is_dir():
@ -699,8 +751,8 @@ class ModelManager(object):
model_name: str,
base_model: BaseModelType,
model_type: ModelType,
new_name: str = None,
new_base: BaseModelType = None,
new_name: Optional[str] = None,
new_base: Optional[BaseModelType] = None,
):
"""
Rename or rebase a model.
@ -753,7 +805,7 @@ class ModelManager(object):
self,
model_name: str,
base_model: BaseModelType,
model_type: Union[ModelType.Main, ModelType.Vae],
model_type: Literal[ModelType.Main, ModelType.Vae],
dest_directory: Optional[Path] = None,
) -> AddModelResult:
"""
@ -767,6 +819,10 @@ class ModelManager(object):
This will raise a ValueError unless the model is a checkpoint.
"""
info = self.model_info(model_name, base_model, model_type)
if info is None:
raise FileNotFoundError(f"model not found: {model_name}")
if info["model_format"] != "checkpoint":
raise ValueError(f"not a checkpoint format model: {model_name}")
@ -780,7 +836,7 @@ class ModelManager(object):
model_type,
**submodel,
)
checkpoint_path = self.app_config.root_path / info["path"]
checkpoint_path = self.resolve_model_path(info["path"])
old_diffusers_path = self.resolve_model_path(model.location)
new_diffusers_path = (
dest_directory or self.app_config.models_path / base_model.value / model_type.value
@ -836,7 +892,7 @@ class ModelManager(object):
return search_folder, found_models
def commit(self, conf_file: Path = None) -> None:
def commit(self, conf_file: Optional[Path] = None) -> None:
"""
Write current configuration out to the indicated file.
"""
@ -845,7 +901,7 @@ class ModelManager(object):
for model_key, model_config in self.models.items():
model_name, base_model, model_type = self.parse_key(model_key)
model_class = MODEL_CLASSES[base_model][model_type]
model_class = self._get_implementation(base_model, model_type)
if model_class.save_to_config:
# TODO: or exclude_unset better fits here?
data_to_save[model_key] = model_config.dict(exclude_defaults=True, exclude={"error"})
@ -903,7 +959,7 @@ class ModelManager(object):
model_path = self.resolve_model_path(model_config.path).absolute()
if not model_path.exists():
model_class = MODEL_CLASSES[cur_base_model][cur_model_type]
model_class = self._get_implementation(cur_base_model, cur_model_type)
if model_class.save_to_config:
model_config.error = ModelError.NotFound
self.models.pop(model_key, None)
@ -919,7 +975,7 @@ class ModelManager(object):
for cur_model_type in ModelType:
if model_type is not None and cur_model_type != model_type:
continue
model_class = MODEL_CLASSES[cur_base_model][cur_model_type]
model_class = self._get_implementation(cur_base_model, cur_model_type)
models_dir = self.resolve_model_path(Path(cur_base_model.value, cur_model_type.value))
if not models_dir.exists():
@ -935,7 +991,9 @@ class ModelManager(object):
raise DuplicateModelException(f"Model with key {model_key} added twice")
model_path = self.relative_model_path(model_path)
model_config: ModelConfigBase = model_class.probe_config(str(model_path))
model_config: ModelConfigBase = model_class.probe_config(
str(model_path), model_base=cur_base_model
)
self.models[model_key] = model_config
new_models_found = True
except DuplicateModelException as e:
@ -983,7 +1041,7 @@ class ModelManager(object):
# LS: hacky
# Patch in the SD VAE from core so that it is available for use by the UI
try:
self.heuristic_import({self.resolve_model_path("core/convert/sd-vae-ft-mse")})
self.heuristic_import({str(self.resolve_model_path("core/convert/sd-vae-ft-mse"))})
except:
pass
@ -992,7 +1050,7 @@ class ModelManager(object):
model_manager=self,
prediction_type_helper=ask_user_for_prediction_type,
)
known_paths = {config.root_path / x["path"] for x in self.list_models()}
known_paths = {self.resolve_model_path(x["path"]) for x in self.list_models()}
directories = {
config.root_path / x
for x in [
@ -1011,7 +1069,7 @@ class ModelManager(object):
def heuristic_import(
self,
items_to_import: Set[str],
prediction_type_helper: Callable[[Path], SchedulerPredictionType] = None,
prediction_type_helper: Optional[Callable[[Path], SchedulerPredictionType]] = None,
) -> Dict[str, AddModelResult]:
"""Import a list of paths, repo_ids or URLs. Returns the set of
successfully imported items.

View File

@ -33,7 +33,7 @@ class ModelMerger(object):
self,
model_paths: List[Path],
alpha: float = 0.5,
interp: MergeInterpolationMethod = None,
interp: Optional[MergeInterpolationMethod] = None,
force: bool = False,
**kwargs,
) -> DiffusionPipeline:
@ -73,7 +73,7 @@ class ModelMerger(object):
base_model: Union[BaseModelType, str],
merged_model_name: str,
alpha: float = 0.5,
interp: MergeInterpolationMethod = None,
interp: Optional[MergeInterpolationMethod] = None,
force: bool = False,
merge_dest_directory: Optional[Path] = None,
**kwargs,
@ -109,7 +109,7 @@ class ModelMerger(object):
# pick up the first model's vae
if mod == model_names[0]:
vae = info.get("vae")
model_paths.extend([config.root_path / info["path"]])
model_paths.extend([(config.root_path / info["path"]).as_posix()])
merge_method = None if interp == "weighted_sum" else MergeInterpolationMethod(interp)
logger.debug(f"interp = {interp}, merge_method={merge_method}")
@ -120,11 +120,11 @@ class ModelMerger(object):
else config.models_path / base_model.value / ModelType.Main.value
)
dump_path.mkdir(parents=True, exist_ok=True)
dump_path = dump_path / merged_model_name
dump_path = (dump_path / merged_model_name).as_posix()
merged_pipe.save_pretrained(dump_path, safe_serialization=1)
merged_pipe.save_pretrained(dump_path, safe_serialization=True)
attributes = dict(
path=str(dump_path),
path=dump_path,
description=f"Merge of models {', '.join(model_names)}",
model_format="diffusers",
variant=ModelVariantType.Normal.value,

View File

@ -17,6 +17,7 @@ from .models import (
SilenceWarnings,
InvalidModelException,
)
from .util import lora_token_vector_length
from .models.base import read_checkpoint_meta
@ -315,21 +316,16 @@ class LoRACheckpointProbe(CheckpointProbeBase):
def get_base_type(self) -> BaseModelType:
checkpoint = self.checkpoint
key1 = "lora_te_text_model_encoder_layers_0_mlp_fc1.lora_down.weight"
key2 = "lora_te_text_model_encoder_layers_0_self_attn_k_proj.hada_w1_a"
lora_token_vector_length = (
checkpoint[key1].shape[1]
if key1 in checkpoint
else checkpoint[key2].shape[0]
if key2 in checkpoint
else 768
)
if lora_token_vector_length == 768:
token_vector_length = lora_token_vector_length(checkpoint)
if token_vector_length == 768:
return BaseModelType.StableDiffusion1
elif lora_token_vector_length == 1024:
elif token_vector_length == 1024:
return BaseModelType.StableDiffusion2
elif token_vector_length == 2048:
return BaseModelType.StableDiffusionXL
else:
return None
raise InvalidModelException(f"Unknown LoRA type: {self.checkpoint_path}")
class TextualInversionCheckpointProbe(CheckpointProbeBase):
@ -485,9 +481,19 @@ class ControlNetFolderProbe(FolderProbeBase):
with open(config_file, "r") as file:
config = json.load(file)
# no obvious way to distinguish between sd2-base and sd2-768
return (
BaseModelType.StableDiffusion1 if config["cross_attention_dim"] == 768 else BaseModelType.StableDiffusion2
dimension = config["cross_attention_dim"]
base_model = (
BaseModelType.StableDiffusion1
if dimension == 768
else BaseModelType.StableDiffusion2
if dimension == 1024
else BaseModelType.StableDiffusionXL
if dimension == 2048
else None
)
if not base_model:
raise InvalidModelException(f"Unable to determine model base for {self.folder_path}")
return base_model
class LoRAFolderProbe(FolderProbeBase):

View File

@ -292,8 +292,9 @@ class DiffusersModel(ModelBase):
)
break
except Exception as e:
# print("====ERR LOAD====")
# print(f"{variant}: {e}")
if not str(e).startswith("Error no file"):
print("====ERR LOAD====")
print(f"{variant}: {e}")
pass
else:
raise Exception(f"Failed to load {self.base_model}:{self.model_type}:{child_type} model")

View File

@ -1,21 +1,23 @@
import bisect
import os
import torch
from enum import Enum
from typing import Optional, Union, Literal
from pathlib import Path
from typing import Dict, Optional, Union
import torch
from safetensors.torch import load_file
from .base import (
BaseModelType,
InvalidModelException,
ModelBase,
ModelConfigBase,
BaseModelType,
ModelNotFoundException,
ModelType,
SubModelType,
classproperty,
InvalidModelException,
ModelNotFoundException,
)
# TODO: naming
from ..lora import LoRAModel as LoRAModelRaw
class LoRAModelFormat(str, Enum):
LyCORIS = "lycoris"
@ -50,6 +52,7 @@ class LoRAModel(ModelBase):
model = LoRAModelRaw.from_checkpoint(
file_path=self.model_path,
dtype=torch_dtype,
base_model=self.base_model,
)
self.model_size = model.calc_size()
@ -87,3 +90,622 @@ class LoRAModel(ModelBase):
raise NotImplementedError("Diffusers lora not supported")
else:
return model_path
class LoRALayerBase:
# rank: Optional[int]
# alpha: Optional[float]
# bias: Optional[torch.Tensor]
# layer_key: str
# @property
# def scale(self):
# return self.alpha / self.rank if (self.alpha and self.rank) else 1.0
def __init__(
self,
layer_key: str,
values: dict,
):
if "alpha" in values:
self.alpha = values["alpha"].item()
else:
self.alpha = None
if "bias_indices" in values and "bias_values" in values and "bias_size" in values:
self.bias = torch.sparse_coo_tensor(
values["bias_indices"],
values["bias_values"],
tuple(values["bias_size"]),
)
else:
self.bias = None
self.rank = None # set in layer implementation
self.layer_key = layer_key
def get_weight(self, orig_weight: torch.Tensor):
raise NotImplementedError()
def calc_size(self) -> int:
model_size = 0
for val in [self.bias]:
if val is not None:
model_size += val.nelement() * val.element_size()
return model_size
def to(
self,
device: Optional[torch.device] = None,
dtype: Optional[torch.dtype] = None,
):
if self.bias is not None:
self.bias = self.bias.to(device=device, dtype=dtype)
# TODO: find and debug lora/locon with bias
class LoRALayer(LoRALayerBase):
# up: torch.Tensor
# mid: Optional[torch.Tensor]
# down: torch.Tensor
def __init__(
self,
layer_key: str,
values: dict,
):
super().__init__(layer_key, values)
self.up = values["lora_up.weight"]
self.down = values["lora_down.weight"]
if "lora_mid.weight" in values:
self.mid = values["lora_mid.weight"]
else:
self.mid = None
self.rank = self.down.shape[0]
def get_weight(self, orig_weight: torch.Tensor):
if self.mid is not None:
up = self.up.reshape(self.up.shape[0], self.up.shape[1])
down = self.down.reshape(self.down.shape[0], self.down.shape[1])
weight = torch.einsum("m n w h, i m, n j -> i j w h", self.mid, up, down)
else:
weight = self.up.reshape(self.up.shape[0], -1) @ self.down.reshape(self.down.shape[0], -1)
return weight
def calc_size(self) -> int:
model_size = super().calc_size()
for val in [self.up, self.mid, self.down]:
if val is not None:
model_size += val.nelement() * val.element_size()
return model_size
def to(
self,
device: Optional[torch.device] = None,
dtype: Optional[torch.dtype] = None,
):
super().to(device=device, dtype=dtype)
self.up = self.up.to(device=device, dtype=dtype)
self.down = self.down.to(device=device, dtype=dtype)
if self.mid is not None:
self.mid = self.mid.to(device=device, dtype=dtype)
class LoHALayer(LoRALayerBase):
# w1_a: torch.Tensor
# w1_b: torch.Tensor
# w2_a: torch.Tensor
# w2_b: torch.Tensor
# t1: Optional[torch.Tensor] = None
# t2: Optional[torch.Tensor] = None
def __init__(
self,
layer_key: str,
values: dict,
):
super().__init__(layer_key, values)
self.w1_a = values["hada_w1_a"]
self.w1_b = values["hada_w1_b"]
self.w2_a = values["hada_w2_a"]
self.w2_b = values["hada_w2_b"]
if "hada_t1" in values:
self.t1 = values["hada_t1"]
else:
self.t1 = None
if "hada_t2" in values:
self.t2 = values["hada_t2"]
else:
self.t2 = None
self.rank = self.w1_b.shape[0]
def get_weight(self, orig_weight: torch.Tensor):
if self.t1 is None:
weight = (self.w1_a @ self.w1_b) * (self.w2_a @ self.w2_b)
else:
rebuild1 = torch.einsum("i j k l, j r, i p -> p r k l", self.t1, self.w1_b, self.w1_a)
rebuild2 = torch.einsum("i j k l, j r, i p -> p r k l", self.t2, self.w2_b, self.w2_a)
weight = rebuild1 * rebuild2
return weight
def calc_size(self) -> int:
model_size = super().calc_size()
for val in [self.w1_a, self.w1_b, self.w2_a, self.w2_b, self.t1, self.t2]:
if val is not None:
model_size += val.nelement() * val.element_size()
return model_size
def to(
self,
device: Optional[torch.device] = None,
dtype: Optional[torch.dtype] = None,
):
super().to(device=device, dtype=dtype)
self.w1_a = self.w1_a.to(device=device, dtype=dtype)
self.w1_b = self.w1_b.to(device=device, dtype=dtype)
if self.t1 is not None:
self.t1 = self.t1.to(device=device, dtype=dtype)
self.w2_a = self.w2_a.to(device=device, dtype=dtype)
self.w2_b = self.w2_b.to(device=device, dtype=dtype)
if self.t2 is not None:
self.t2 = self.t2.to(device=device, dtype=dtype)
class LoKRLayer(LoRALayerBase):
# w1: Optional[torch.Tensor] = None
# w1_a: Optional[torch.Tensor] = None
# w1_b: Optional[torch.Tensor] = None
# w2: Optional[torch.Tensor] = None
# w2_a: Optional[torch.Tensor] = None
# w2_b: Optional[torch.Tensor] = None
# t2: Optional[torch.Tensor] = None
def __init__(
self,
layer_key: str,
values: dict,
):
super().__init__(layer_key, values)
if "lokr_w1" in values:
self.w1 = values["lokr_w1"]
self.w1_a = None
self.w1_b = None
else:
self.w1 = None
self.w1_a = values["lokr_w1_a"]
self.w1_b = values["lokr_w1_b"]
if "lokr_w2" in values:
self.w2 = values["lokr_w2"]
self.w2_a = None
self.w2_b = None
else:
self.w2 = None
self.w2_a = values["lokr_w2_a"]
self.w2_b = values["lokr_w2_b"]
if "lokr_t2" in values:
self.t2 = values["lokr_t2"]
else:
self.t2 = None
if "lokr_w1_b" in values:
self.rank = values["lokr_w1_b"].shape[0]
elif "lokr_w2_b" in values:
self.rank = values["lokr_w2_b"].shape[0]
else:
self.rank = None # unscaled
def get_weight(self, orig_weight: torch.Tensor):
w1 = self.w1
if w1 is None:
w1 = self.w1_a @ self.w1_b
w2 = self.w2
if w2 is None:
if self.t2 is None:
w2 = self.w2_a @ self.w2_b
else:
w2 = torch.einsum("i j k l, i p, j r -> p r k l", self.t2, self.w2_a, self.w2_b)
if len(w2.shape) == 4:
w1 = w1.unsqueeze(2).unsqueeze(2)
w2 = w2.contiguous()
weight = torch.kron(w1, w2)
return weight
def calc_size(self) -> int:
model_size = super().calc_size()
for val in [self.w1, self.w1_a, self.w1_b, self.w2, self.w2_a, self.w2_b, self.t2]:
if val is not None:
model_size += val.nelement() * val.element_size()
return model_size
def to(
self,
device: Optional[torch.device] = None,
dtype: Optional[torch.dtype] = None,
):
super().to(device=device, dtype=dtype)
if self.w1 is not None:
self.w1 = self.w1.to(device=device, dtype=dtype)
else:
self.w1_a = self.w1_a.to(device=device, dtype=dtype)
self.w1_b = self.w1_b.to(device=device, dtype=dtype)
if self.w2 is not None:
self.w2 = self.w2.to(device=device, dtype=dtype)
else:
self.w2_a = self.w2_a.to(device=device, dtype=dtype)
self.w2_b = self.w2_b.to(device=device, dtype=dtype)
if self.t2 is not None:
self.t2 = self.t2.to(device=device, dtype=dtype)
class FullLayer(LoRALayerBase):
# weight: torch.Tensor
def __init__(
self,
layer_key: str,
values: dict,
):
super().__init__(layer_key, values)
self.weight = values["diff"]
if len(values.keys()) > 1:
_keys = list(values.keys())
_keys.remove("diff")
raise NotImplementedError(f"Unexpected keys in lora diff layer: {_keys}")
self.rank = None # unscaled
def get_weight(self, orig_weight: torch.Tensor):
return self.weight
def calc_size(self) -> int:
model_size = super().calc_size()
model_size += self.weight.nelement() * self.weight.element_size()
return model_size
def to(
self,
device: Optional[torch.device] = None,
dtype: Optional[torch.dtype] = None,
):
super().to(device=device, dtype=dtype)
self.weight = self.weight.to(device=device, dtype=dtype)
class IA3Layer(LoRALayerBase):
# weight: torch.Tensor
# on_input: torch.Tensor
def __init__(
self,
layer_key: str,
values: dict,
):
super().__init__(layer_key, values)
self.weight = values["weight"]
self.on_input = values["on_input"]
self.rank = None # unscaled
def get_weight(self, orig_weight: torch.Tensor):
weight = self.weight
if not self.on_input:
weight = weight.reshape(-1, 1)
return orig_weight * weight
def calc_size(self) -> int:
model_size = super().calc_size()
model_size += self.weight.nelement() * self.weight.element_size()
model_size += self.on_input.nelement() * self.on_input.element_size()
return model_size
def to(
self,
device: Optional[torch.device] = None,
dtype: Optional[torch.dtype] = None,
):
super().to(device=device, dtype=dtype)
self.weight = self.weight.to(device=device, dtype=dtype)
self.on_input = self.on_input.to(device=device, dtype=dtype)
# TODO: rename all methods used in model logic with Info postfix and remove here Raw postfix
class LoRAModelRaw: # (torch.nn.Module):
_name: str
layers: Dict[str, LoRALayer]
_device: torch.device
_dtype: torch.dtype
def __init__(
self,
name: str,
layers: Dict[str, LoRALayer],
device: torch.device,
dtype: torch.dtype,
):
self._name = name
self._device = device or torch.cpu
self._dtype = dtype or torch.float32
self.layers = layers
@property
def name(self):
return self._name
@property
def device(self):
return self._device
@property
def dtype(self):
return self._dtype
def to(
self,
device: Optional[torch.device] = None,
dtype: Optional[torch.dtype] = None,
):
# TODO: try revert if exception?
for key, layer in self.layers.items():
layer.to(device=device, dtype=dtype)
self._device = device
self._dtype = dtype
def calc_size(self) -> int:
model_size = 0
for _, layer in self.layers.items():
model_size += layer.calc_size()
return model_size
@classmethod
def _convert_sdxl_keys_to_diffusers_format(cls, state_dict):
"""Convert the keys of an SDXL LoRA state_dict to diffusers format.
The input state_dict can be in either Stability AI format or diffusers format. If the state_dict is already in
diffusers format, then this function will have no effect.
This function is adapted from:
https://github.com/bmaltais/kohya_ss/blob/2accb1305979ba62f5077a23aabac23b4c37e935/networks/lora_diffusers.py#L385-L409
Args:
state_dict (Dict[str, Tensor]): The SDXL LoRA state_dict.
Raises:
ValueError: If state_dict contains an unrecognized key, or not all keys could be converted.
Returns:
Dict[str, Tensor]: The diffusers-format state_dict.
"""
converted_count = 0 # The number of Stability AI keys converted to diffusers format.
not_converted_count = 0 # The number of keys that were not converted.
# Get a sorted list of Stability AI UNet keys so that we can efficiently search for keys with matching prefixes.
# For example, we want to efficiently find `input_blocks_4_1` in the list when searching for
# `input_blocks_4_1_proj_in`.
stability_unet_keys = list(SDXL_UNET_STABILITY_TO_DIFFUSERS_MAP)
stability_unet_keys.sort()
new_state_dict = dict()
for full_key, value in state_dict.items():
if full_key.startswith("lora_unet_"):
search_key = full_key.replace("lora_unet_", "")
# Use bisect to find the key in stability_unet_keys that *may* match the search_key's prefix.
position = bisect.bisect_right(stability_unet_keys, search_key)
map_key = stability_unet_keys[position - 1]
# Now, check if the map_key *actually* matches the search_key.
if search_key.startswith(map_key):
new_key = full_key.replace(map_key, SDXL_UNET_STABILITY_TO_DIFFUSERS_MAP[map_key])
new_state_dict[new_key] = value
converted_count += 1
else:
new_state_dict[full_key] = value
not_converted_count += 1
elif full_key.startswith("lora_te1_") or full_key.startswith("lora_te2_"):
# The CLIP text encoders have the same keys in both Stability AI and diffusers formats.
new_state_dict[full_key] = value
continue
else:
raise ValueError(f"Unrecognized SDXL LoRA key prefix: '{full_key}'.")
if converted_count > 0 and not_converted_count > 0:
raise ValueError(
f"The SDXL LoRA could only be partially converted to diffusers format. converted={converted_count},"
f" not_converted={not_converted_count}"
)
return new_state_dict
@classmethod
def from_checkpoint(
cls,
file_path: Union[str, Path],
device: Optional[torch.device] = None,
dtype: Optional[torch.dtype] = None,
base_model: Optional[BaseModelType] = None,
):
device = device or torch.device("cpu")
dtype = dtype or torch.float32
if isinstance(file_path, str):
file_path = Path(file_path)
model = cls(
device=device,
dtype=dtype,
name=file_path.stem, # TODO:
layers=dict(),
)
if file_path.suffix == ".safetensors":
state_dict = load_file(file_path.absolute().as_posix(), device="cpu")
else:
state_dict = torch.load(file_path, map_location="cpu")
state_dict = cls._group_state(state_dict)
if base_model == BaseModelType.StableDiffusionXL:
state_dict = cls._convert_sdxl_keys_to_diffusers_format(state_dict)
for layer_key, values in state_dict.items():
# lora and locon
if "lora_down.weight" in values:
layer = LoRALayer(layer_key, values)
# loha
elif "hada_w1_b" in values:
layer = LoHALayer(layer_key, values)
# lokr
elif "lokr_w1_b" in values or "lokr_w1" in values:
layer = LoKRLayer(layer_key, values)
# diff
elif "diff" in values:
layer = FullLayer(layer_key, values)
# ia3
elif "weight" in values and "on_input" in values:
layer = IA3Layer(layer_key, values)
else:
print(f">> Encountered unknown lora layer module in {model.name}: {layer_key} - {list(values.keys())}")
raise Exception("Unknown lora format!")
# lower memory consumption by removing already parsed layer values
state_dict[layer_key].clear()
layer.to(device=device, dtype=dtype)
model.layers[layer_key] = layer
return model
@staticmethod
def _group_state(state_dict: dict):
state_dict_groupped = dict()
for key, value in state_dict.items():
stem, leaf = key.split(".", 1)
if stem not in state_dict_groupped:
state_dict_groupped[stem] = dict()
state_dict_groupped[stem][leaf] = value
return state_dict_groupped
# code from
# https://github.com/bmaltais/kohya_ss/blob/2accb1305979ba62f5077a23aabac23b4c37e935/networks/lora_diffusers.py#L15C1-L97C32
def make_sdxl_unet_conversion_map():
"""Create a dict mapping state_dict keys from Stability AI SDXL format to diffusers SDXL format."""
unet_conversion_map_layer = []
for i in range(3): # num_blocks is 3 in sdxl
# loop over downblocks/upblocks
for j in range(2):
# loop over resnets/attentions for downblocks
hf_down_res_prefix = f"down_blocks.{i}.resnets.{j}."
sd_down_res_prefix = f"input_blocks.{3*i + j + 1}.0."
unet_conversion_map_layer.append((sd_down_res_prefix, hf_down_res_prefix))
if i < 3:
# no attention layers in down_blocks.3
hf_down_atn_prefix = f"down_blocks.{i}.attentions.{j}."
sd_down_atn_prefix = f"input_blocks.{3*i + j + 1}.1."
unet_conversion_map_layer.append((sd_down_atn_prefix, hf_down_atn_prefix))
for j in range(3):
# loop over resnets/attentions for upblocks
hf_up_res_prefix = f"up_blocks.{i}.resnets.{j}."
sd_up_res_prefix = f"output_blocks.{3*i + j}.0."
unet_conversion_map_layer.append((sd_up_res_prefix, hf_up_res_prefix))
# if i > 0: commentout for sdxl
# no attention layers in up_blocks.0
hf_up_atn_prefix = f"up_blocks.{i}.attentions.{j}."
sd_up_atn_prefix = f"output_blocks.{3*i + j}.1."
unet_conversion_map_layer.append((sd_up_atn_prefix, hf_up_atn_prefix))
if i < 3:
# no downsample in down_blocks.3
hf_downsample_prefix = f"down_blocks.{i}.downsamplers.0.conv."
sd_downsample_prefix = f"input_blocks.{3*(i+1)}.0.op."
unet_conversion_map_layer.append((sd_downsample_prefix, hf_downsample_prefix))
# no upsample in up_blocks.3
hf_upsample_prefix = f"up_blocks.{i}.upsamplers.0."
sd_upsample_prefix = f"output_blocks.{3*i + 2}.{2}." # change for sdxl
unet_conversion_map_layer.append((sd_upsample_prefix, hf_upsample_prefix))
hf_mid_atn_prefix = "mid_block.attentions.0."
sd_mid_atn_prefix = "middle_block.1."
unet_conversion_map_layer.append((sd_mid_atn_prefix, hf_mid_atn_prefix))
for j in range(2):
hf_mid_res_prefix = f"mid_block.resnets.{j}."
sd_mid_res_prefix = f"middle_block.{2*j}."
unet_conversion_map_layer.append((sd_mid_res_prefix, hf_mid_res_prefix))
unet_conversion_map_resnet = [
# (stable-diffusion, HF Diffusers)
("in_layers.0.", "norm1."),
("in_layers.2.", "conv1."),
("out_layers.0.", "norm2."),
("out_layers.3.", "conv2."),
("emb_layers.1.", "time_emb_proj."),
("skip_connection.", "conv_shortcut."),
]
unet_conversion_map = []
for sd, hf in unet_conversion_map_layer:
if "resnets" in hf:
for sd_res, hf_res in unet_conversion_map_resnet:
unet_conversion_map.append((sd + sd_res, hf + hf_res))
else:
unet_conversion_map.append((sd, hf))
for j in range(2):
hf_time_embed_prefix = f"time_embedding.linear_{j+1}."
sd_time_embed_prefix = f"time_embed.{j*2}."
unet_conversion_map.append((sd_time_embed_prefix, hf_time_embed_prefix))
for j in range(2):
hf_label_embed_prefix = f"add_embedding.linear_{j+1}."
sd_label_embed_prefix = f"label_emb.0.{j*2}."
unet_conversion_map.append((sd_label_embed_prefix, hf_label_embed_prefix))
unet_conversion_map.append(("input_blocks.0.0.", "conv_in."))
unet_conversion_map.append(("out.0.", "conv_norm_out."))
unet_conversion_map.append(("out.2.", "conv_out."))
return unet_conversion_map
SDXL_UNET_STABILITY_TO_DIFFUSERS_MAP = {
sd.rstrip(".").replace(".", "_"): hf.rstrip(".").replace(".", "_") for sd, hf in make_sdxl_unet_conversion_map()
}

View File

@ -1,6 +1,5 @@
import os
import json
import invokeai.backend.util.logging as logger
from enum import Enum
from pydantic import Field
from typing import Literal, Optional
@ -12,6 +11,7 @@ from .base import (
DiffusersModel,
read_checkpoint_meta,
classproperty,
InvalidModelException,
)
from omegaconf import OmegaConf
@ -65,7 +65,7 @@ class StableDiffusionXLModel(DiffusersModel):
in_channels = unet_config["in_channels"]
else:
raise Exception("Not supported stable diffusion diffusers format(possibly onnx?)")
raise InvalidModelException(f"{path} is not a recognized Stable Diffusion diffusers model")
else:
raise NotImplementedError(f"Unknown stable diffusion 2.* format: {model_format}")
@ -80,8 +80,10 @@ class StableDiffusionXLModel(DiffusersModel):
raise Exception("Unkown stable diffusion 2.* model format")
if ckpt_config_path is None:
# TO DO: implement picking
pass
# avoid circular import
from .stable_diffusion import _select_ckpt_config
ckpt_config_path = _select_ckpt_config(kwargs.get("model_base", BaseModelType.StableDiffusionXL), variant)
return cls.create_config(
path=path,

View File

@ -4,6 +4,7 @@ from enum import Enum
from pydantic import Field
from pathlib import Path
from typing import Literal, Optional, Union
from diffusers import StableDiffusionInpaintPipeline, StableDiffusionPipeline
from .base import (
ModelConfigBase,
BaseModelType,
@ -263,6 +264,8 @@ def _convert_ckpt_and_cache(
weights = app_config.models_path / model_config.path
config_file = app_config.root_path / model_config.config
output_path = Path(output_path)
variant = model_config.variant
pipeline_class = StableDiffusionInpaintPipeline if variant == "inpaint" else StableDiffusionPipeline
# return cached version if it exists
if output_path.exists():
@ -289,6 +292,7 @@ def _convert_ckpt_and_cache(
original_config_file=config_file,
extract_ema=True,
scan_needed=True,
pipeline_class=pipeline_class,
from_safetensors=weights.suffix == ".safetensors",
precision=torch_dtype(choose_torch_device()),
**kwargs,

View File

@ -1,9 +1,14 @@
import os
import torch
import safetensors
from enum import Enum
from pathlib import Path
from typing import Optional, Union, Literal
from typing import Optional
import safetensors
import torch
from diffusers.utils import is_safetensors_available
from omegaconf import OmegaConf
from invokeai.app.services.config import InvokeAIAppConfig
from .base import (
ModelBase,
ModelConfigBase,
@ -18,9 +23,6 @@ from .base import (
InvalidModelException,
ModelNotFoundException,
)
from invokeai.app.services.config import InvokeAIAppConfig
from diffusers.utils import is_safetensors_available
from omegaconf import OmegaConf
class VaeModelFormat(str, Enum):
@ -80,7 +82,7 @@ class VaeModel(ModelBase):
@classmethod
def detect_format(cls, path: str):
if not os.path.exists(path):
raise ModelNotFoundException()
raise ModelNotFoundException(f"Does not exist as local file: {path}")
if os.path.isdir(path):
if os.path.exists(os.path.join(path, "config.json")):

View File

@ -0,0 +1,75 @@
# Copyright (c) 2023 The InvokeAI Development Team
"""Utilities used by the Model Manager"""
def lora_token_vector_length(checkpoint: dict) -> int:
"""
Given a checkpoint in memory, return the lora token vector length
:param checkpoint: The checkpoint
"""
def _get_shape_1(key, tensor, checkpoint):
lora_token_vector_length = None
if "." not in key:
return lora_token_vector_length # wrong key format
model_key, lora_key = key.split(".", 1)
# check lora/locon
if lora_key == "lora_down.weight":
lora_token_vector_length = tensor.shape[1]
# check loha (don't worry about hada_t1/hada_t2 as it used only in 4d shapes)
elif lora_key in ["hada_w1_b", "hada_w2_b"]:
lora_token_vector_length = tensor.shape[1]
# check lokr (don't worry about lokr_t2 as it used only in 4d shapes)
elif "lokr_" in lora_key:
if model_key + ".lokr_w1" in checkpoint:
_lokr_w1 = checkpoint[model_key + ".lokr_w1"]
elif model_key + "lokr_w1_b" in checkpoint:
_lokr_w1 = checkpoint[model_key + ".lokr_w1_b"]
else:
return lora_token_vector_length # unknown format
if model_key + ".lokr_w2" in checkpoint:
_lokr_w2 = checkpoint[model_key + ".lokr_w2"]
elif model_key + "lokr_w2_b" in checkpoint:
_lokr_w2 = checkpoint[model_key + ".lokr_w2_b"]
else:
return lora_token_vector_length # unknown format
lora_token_vector_length = _lokr_w1.shape[1] * _lokr_w2.shape[1]
elif lora_key == "diff":
lora_token_vector_length = tensor.shape[1]
# ia3 can be detected only by shape[0] in text encoder
elif lora_key == "weight" and "lora_unet_" not in model_key:
lora_token_vector_length = tensor.shape[0]
return lora_token_vector_length
lora_token_vector_length = None
lora_te1_length = None
lora_te2_length = None
for key, tensor in checkpoint.items():
if key.startswith("lora_unet_") and ("_attn2_to_k." in key or "_attn2_to_v." in key):
lora_token_vector_length = _get_shape_1(key, tensor, checkpoint)
elif key.startswith("lora_te") and "_self_attn_" in key:
tmp_length = _get_shape_1(key, tensor, checkpoint)
if key.startswith("lora_te_"):
lora_token_vector_length = tmp_length
elif key.startswith("lora_te1_"):
lora_te1_length = tmp_length
elif key.startswith("lora_te2_"):
lora_te2_length = tmp_length
if lora_te1_length is not None and lora_te2_length is not None:
lora_token_vector_length = lora_te1_length + lora_te2_length
if lora_token_vector_length is not None:
break
return lora_token_vector_length

View File

@ -8,4 +8,4 @@ from .diffusers_pipeline import (
)
from .diffusion import InvokeAIDiffuserComponent
from .diffusion.cross_attention_map_saving import AttentionMapSaver
from .diffusion.shared_invokeai_diffusion import PostprocessingSettings
from .diffusion.shared_invokeai_diffusion import PostprocessingSettings, BasicConditioningInfo, SDXLConditioningInfo

View File

@ -4,15 +4,11 @@ import dataclasses
import inspect
import math
import secrets
from collections.abc import Sequence
from dataclasses import dataclass, field
from typing import Any, Callable, Generic, List, Optional, Type, TypeVar, Union
from pydantic import Field
from typing import Any, Callable, Generic, List, Optional, Type, Union
import einops
import PIL.Image
import numpy as np
from accelerate.utils import set_seed
import einops
import psutil
import torch
import torchvision.transforms as T
@ -22,36 +18,31 @@ from diffusers.pipelines.stable_diffusion import StableDiffusionPipelineOutput
from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion import (
StableDiffusionPipeline,
)
from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion_img2img import (
StableDiffusionImg2ImgPipeline,
)
from diffusers.pipelines.stable_diffusion.safety_checker import (
StableDiffusionSafetyChecker,
)
from diffusers.schedulers import KarrasDiffusionSchedulers
from diffusers.schedulers.scheduling_utils import SchedulerMixin, SchedulerOutput
from diffusers.utils import PIL_INTERPOLATION
from diffusers.utils.import_utils import is_xformers_available
from diffusers.utils.outputs import BaseOutput
from torchvision.transforms.functional import resize as tv_resize
from pydantic import Field
from transformers import CLIPFeatureExtractor, CLIPTextModel, CLIPTokenizer
from typing_extensions import ParamSpec
from invokeai.app.services.config import InvokeAIAppConfig
from ..util import CPU_DEVICE, normalize_device
from .diffusion import (
AttentionMapSaver,
InvokeAIDiffuserComponent,
PostprocessingSettings,
BasicConditioningInfo,
)
from .offloading import FullyLoadedModelGroup, ModelGroup
from ..util import normalize_device
@dataclass
class PipelineIntermediateState:
run_id: str
step: int
order: int
total_steps: int
timestep: int
latents: torch.Tensor
predicted_original: Optional[torch.Tensor] = None
@ -102,7 +93,6 @@ class AddsMaskGuidance:
mask_latents: torch.FloatTensor
scheduler: SchedulerMixin
noise: torch.Tensor
_debug: Optional[Callable] = None
def __call__(self, step_output: Union[BaseOutput, SchedulerOutput], t: torch.Tensor, conditioning) -> BaseOutput:
output_class = step_output.__class__ # We'll create a new one with masked data.
@ -139,8 +129,6 @@ class AddsMaskGuidance:
# mask_latents = self.scheduler.scale_model_input(mask_latents, t)
mask_latents = einops.repeat(mask_latents, "b c h w -> (repeat b) c h w", repeat=batch_size)
masked_input = torch.lerp(mask_latents.to(dtype=latents.dtype), latents, mask.to(dtype=latents.dtype))
if self._debug:
self._debug(masked_input, f"t={t} lerped")
return masked_input
@ -172,33 +160,6 @@ def is_inpainting_model(unet: UNet2DConditionModel):
return unet.conv_in.in_channels == 9
CallbackType = TypeVar("CallbackType")
ReturnType = TypeVar("ReturnType")
ParamType = ParamSpec("ParamType")
@dataclass(frozen=True)
class GeneratorToCallbackinator(Generic[ParamType, ReturnType, CallbackType]):
"""Convert a generator to a function with a callback and a return value."""
generator_method: Callable[ParamType, ReturnType]
callback_arg_type: Type[CallbackType]
def __call__(
self,
*args: ParamType.args,
callback: Callable[[CallbackType], Any] = None,
**kwargs: ParamType.kwargs,
) -> ReturnType:
result = None
for result in self.generator_method(*args, **kwargs):
if callback is not None and isinstance(result, self.callback_arg_type):
callback(result)
if result is None:
raise AssertionError("why was that an empty generator?")
return result
@dataclass
class ControlNetData:
model: ControlNetModel = Field(default=None)
@ -212,8 +173,8 @@ class ControlNetData:
@dataclass
class ConditioningData:
unconditioned_embeddings: torch.Tensor
text_embeddings: torch.Tensor
unconditioned_embeddings: BasicConditioningInfo
text_embeddings: BasicConditioningInfo
guidance_scale: Union[float, List[float]]
"""
Guidance scale as defined in [Classifier-Free Diffusion Guidance](https://arxiv.org/abs/2207.12598).
@ -289,9 +250,6 @@ class StableDiffusionGeneratorPipeline(StableDiffusionPipeline):
feature_extractor ([`CLIPFeatureExtractor`]):
Model that extracts features from generated images to be used as inputs for the `safety_checker`.
"""
_model_group: ModelGroup
ID_LENGTH = 8
def __init__(
self,
@ -303,9 +261,7 @@ class StableDiffusionGeneratorPipeline(StableDiffusionPipeline):
safety_checker: Optional[StableDiffusionSafetyChecker],
feature_extractor: Optional[CLIPFeatureExtractor],
requires_safety_checker: bool = False,
precision: str = "float32",
control_model: ControlNetModel = None,
execution_device: Optional[torch.device] = None,
):
super().__init__(
vae,
@ -330,9 +286,6 @@ class StableDiffusionGeneratorPipeline(StableDiffusionPipeline):
# control_model=control_model,
)
self.invokeai_diffuser = InvokeAIDiffuserComponent(self.unet, self._unet_forward)
self._model_group = FullyLoadedModelGroup(execution_device or self.unet.device)
self._model_group.install(*self._submodels)
self.control_model = control_model
def _adjust_memory_efficient_attention(self, latents: torch.Tensor):
@ -340,99 +293,41 @@ class StableDiffusionGeneratorPipeline(StableDiffusionPipeline):
if xformers is available, use it, otherwise use sliced attention.
"""
config = InvokeAIAppConfig.get_config()
if torch.cuda.is_available() and is_xformers_available() and not config.disable_xformers:
self.enable_xformers_memory_efficient_attention()
if self.unet.device.type == "cuda":
if is_xformers_available() and not config.disable_xformers:
self.enable_xformers_memory_efficient_attention()
return
elif hasattr(torch.nn.functional, "scaled_dot_product_attention"):
# diffusers enable sdp automatically
return
if self.unet.device.type == "cpu" or self.unet.device.type == "mps":
mem_free = psutil.virtual_memory().free
elif self.unet.device.type == "cuda":
mem_free, _ = torch.cuda.mem_get_info(normalize_device(self.unet.device))
else:
if self.device.type == "cpu" or self.device.type == "mps":
mem_free = psutil.virtual_memory().free
elif self.device.type == "cuda":
mem_free, _ = torch.cuda.mem_get_info(normalize_device(self.device))
else:
raise ValueError(f"unrecognized device {self.device}")
# input tensor of [1, 4, h/8, w/8]
# output tensor of [16, (h/8 * w/8), (h/8 * w/8)]
bytes_per_element_needed_for_baddbmm_duplication = latents.element_size() + 4
max_size_required_for_baddbmm = (
16
* latents.size(dim=2)
* latents.size(dim=3)
* latents.size(dim=2)
* latents.size(dim=3)
* bytes_per_element_needed_for_baddbmm_duplication
)
if max_size_required_for_baddbmm > (mem_free * 3.0 / 4.0): # 3.3 / 4.0 is from old Invoke code
self.enable_attention_slicing(slice_size="max")
elif torch.backends.mps.is_available():
# diffusers recommends always enabling for mps
self.enable_attention_slicing(slice_size="max")
else:
self.disable_attention_slicing()
raise ValueError(f"unrecognized device {self.unet.device}")
# input tensor of [1, 4, h/8, w/8]
# output tensor of [16, (h/8 * w/8), (h/8 * w/8)]
bytes_per_element_needed_for_baddbmm_duplication = latents.element_size() + 4
max_size_required_for_baddbmm = (
16
* latents.size(dim=2)
* latents.size(dim=3)
* latents.size(dim=2)
* latents.size(dim=3)
* bytes_per_element_needed_for_baddbmm_duplication
)
if max_size_required_for_baddbmm > (mem_free * 3.0 / 4.0): # 3.3 / 4.0 is from old Invoke code
self.enable_attention_slicing(slice_size="max")
elif torch.backends.mps.is_available():
# diffusers recommends always enabling for mps
self.enable_attention_slicing(slice_size="max")
else:
self.disable_attention_slicing()
def to(self, torch_device: Optional[Union[str, torch.device]] = None, silence_dtype_warnings=False):
# overridden method; types match the superclass.
if torch_device is None:
return self
self._model_group.set_device(torch.device(torch_device))
self._model_group.ready()
@property
def device(self) -> torch.device:
return self._model_group.execution_device
@property
def _submodels(self) -> Sequence[torch.nn.Module]:
module_names, _, _ = self.extract_init_dict(dict(self.config))
submodels = []
for name in module_names.keys():
if hasattr(self, name):
value = getattr(self, name)
else:
value = getattr(self.config, name)
if isinstance(value, torch.nn.Module):
submodels.append(value)
return submodels
def image_from_embeddings(
self,
latents: torch.Tensor,
num_inference_steps: int,
conditioning_data: ConditioningData,
*,
noise: torch.Tensor,
callback: Callable[[PipelineIntermediateState], None] = None,
run_id=None,
) -> InvokeAIStableDiffusionPipelineOutput:
r"""
Function invoked when calling the pipeline for generation.
:param conditioning_data:
:param latents: Pre-generated un-noised latents, to be used as inputs for
image generation. Can be used to tweak the same generation with different prompts.
:param num_inference_steps: The number of denoising steps. More denoising steps usually lead to a higher quality
image at the expense of slower inference.
:param noise: Noise to add to the latents, sampled from a Gaussian distribution.
:param callback:
:param run_id:
"""
result_latents, result_attention_map_saver = self.latents_from_embeddings(
latents,
num_inference_steps,
conditioning_data,
noise=noise,
run_id=run_id,
callback=callback,
)
# https://discuss.huggingface.co/t/memory-usage-by-later-pipeline-stages/23699
torch.cuda.empty_cache()
with torch.inference_mode():
image = self.decode_latents(result_latents)
output = InvokeAIStableDiffusionPipelineOutput(
images=image,
nsfw_content_detected=[],
attention_map_saver=result_attention_map_saver,
)
return self.check_for_safety(output, dtype=conditioning_data.dtype)
raise Exception("Should not be called")
def latents_from_embeddings(
self,
@ -440,35 +335,72 @@ class StableDiffusionGeneratorPipeline(StableDiffusionPipeline):
num_inference_steps: int,
conditioning_data: ConditioningData,
*,
noise: torch.Tensor,
timesteps=None,
noise: Optional[torch.Tensor],
timesteps: torch.Tensor,
init_timestep: torch.Tensor,
additional_guidance: List[Callable] = None,
run_id=None,
callback: Callable[[PipelineIntermediateState], None] = None,
control_data: List[ControlNetData] = None,
mask: Optional[torch.Tensor] = None,
seed: Optional[int] = None,
) -> tuple[torch.Tensor, Optional[AttentionMapSaver]]:
if self.scheduler.config.get("cpu_only", False):
scheduler_device = torch.device("cpu")
else:
scheduler_device = self._model_group.device_for(self.unet)
if init_timestep.shape[0] == 0:
return latents, None
if timesteps is None:
self.scheduler.set_timesteps(num_inference_steps, device=scheduler_device)
timesteps = self.scheduler.timesteps
infer_latents_from_embeddings = GeneratorToCallbackinator(
self.generate_latents_from_embeddings, PipelineIntermediateState
)
result: PipelineIntermediateState = infer_latents_from_embeddings(
latents,
timesteps,
conditioning_data,
noise=noise,
run_id=run_id,
additional_guidance=additional_guidance,
control_data=control_data,
callback=callback,
)
return result.latents, result.attention_map_saver
if additional_guidance is None:
additional_guidance = []
orig_latents = latents.clone()
batch_size = latents.shape[0]
batched_t = init_timestep.expand(batch_size)
if noise is not None:
# latents = noise * self.scheduler.init_noise_sigma # it's like in t2l according to diffusers
latents = self.scheduler.add_noise(latents, noise, batched_t)
if mask is not None:
if is_inpainting_model(self.unet):
# You'd think the inpainting model wouldn't be paying attention to the area it is going to repaint
# (that's why there's a mask!) but it seems to really want that blanked out.
# masked_latents = latents * torch.where(mask < 0.5, 1, 0) TODO: inpaint/outpaint/infill
# TODO: we should probably pass this in so we don't have to try/finally around setting it.
self.invokeai_diffuser.model_forward_callback = AddsMaskLatents(self._unet_forward, mask, orig_latents)
else:
# if no noise provided, noisify unmasked area based on seed(or 0 as fallback)
if noise is None:
noise = torch.randn(
orig_latents.shape,
dtype=torch.float32,
device="cpu",
generator=torch.Generator(device="cpu").manual_seed(seed or 0),
).to(device=orig_latents.device, dtype=orig_latents.dtype)
latents = self.scheduler.add_noise(latents, noise, batched_t)
latents = torch.lerp(
orig_latents, latents.to(dtype=orig_latents.dtype), mask.to(dtype=orig_latents.dtype)
)
additional_guidance.append(AddsMaskGuidance(mask, orig_latents, self.scheduler, noise))
try:
latents, attention_map_saver = self.generate_latents_from_embeddings(
latents,
timesteps,
conditioning_data,
additional_guidance=additional_guidance,
control_data=control_data,
callback=callback,
)
finally:
self.invokeai_diffuser.model_forward_callback = self._unet_forward
# restore unmasked part
if mask is not None:
latents = torch.lerp(orig_latents, latents.to(dtype=orig_latents.dtype), mask.to(dtype=orig_latents.dtype))
return latents, attention_map_saver
def generate_latents_from_embeddings(
self,
@ -476,42 +408,40 @@ class StableDiffusionGeneratorPipeline(StableDiffusionPipeline):
timesteps,
conditioning_data: ConditioningData,
*,
noise: torch.Tensor,
run_id: str = None,
additional_guidance: List[Callable] = None,
control_data: List[ControlNetData] = None,
callback: Callable[[PipelineIntermediateState], None] = None,
):
self._adjust_memory_efficient_attention(latents)
if run_id is None:
run_id = secrets.token_urlsafe(self.ID_LENGTH)
if additional_guidance is None:
additional_guidance = []
batch_size = latents.shape[0]
attention_map_saver: Optional[AttentionMapSaver] = None
if timesteps.shape[0] == 0:
return latents, attention_map_saver
extra_conditioning_info = conditioning_data.extra
with self.invokeai_diffuser.custom_attention_context(
self.invokeai_diffuser.model,
extra_conditioning_info=extra_conditioning_info,
step_count=len(self.scheduler.timesteps),
):
yield PipelineIntermediateState(
run_id=run_id,
step=-1,
timestep=self.scheduler.config.num_train_timesteps,
latents=latents,
)
if callback is not None:
callback(
PipelineIntermediateState(
step=-1,
order=self.scheduler.order,
total_steps=len(timesteps),
timestep=self.scheduler.config.num_train_timesteps,
latents=latents,
)
)
batch_size = latents.shape[0]
batched_t = torch.full(
(batch_size,),
timesteps[0],
dtype=timesteps.dtype,
device=self._model_group.device_for(self.unet),
)
latents = self.scheduler.add_noise(latents, noise, batched_t)
attention_map_saver: Optional[AttentionMapSaver] = None
# print("timesteps:", timesteps)
for i, t in enumerate(self.progress_bar(timesteps)):
batched_t.fill_(t)
batched_t = t.expand(batch_size)
step_output = self.step(
batched_t,
latents,
@ -540,14 +470,18 @@ class StableDiffusionGeneratorPipeline(StableDiffusionPipeline):
# attention_map_saver = AttentionMapSaver(token_ids=attention_map_token_ids, latents_shape=latents.shape[-2:])
# self.invokeai_diffuser.setup_attention_map_saving(attention_map_saver)
yield PipelineIntermediateState(
run_id=run_id,
step=i,
timestep=int(t),
latents=latents,
predicted_original=predicted_original,
attention_map_saver=attention_map_saver,
)
if callback is not None:
callback(
PipelineIntermediateState(
step=i,
order=self.scheduler.order,
total_steps=len(timesteps),
timestep=int(t),
latents=latents,
predicted_original=predicted_original,
attention_map_saver=attention_map_saver,
)
)
return latents, attention_map_saver
@ -569,95 +503,39 @@ class StableDiffusionGeneratorPipeline(StableDiffusionPipeline):
# TODO: should this scaling happen here or inside self._unet_forward?
# i.e. before or after passing it to InvokeAIDiffuserComponent
unet_latent_input = self.scheduler.scale_model_input(latents, timestep)
latent_model_input = self.scheduler.scale_model_input(latents, timestep)
# default is no controlnet, so set controlnet processing output to None
down_block_res_samples, mid_block_res_sample = None, None
controlnet_down_block_samples, controlnet_mid_block_sample = None, None
if control_data is not None:
# control_data should be type List[ControlNetData]
# this loop covers both ControlNet (one ControlNetData in list)
# and MultiControlNet (multiple ControlNetData in list)
for i, control_datum in enumerate(control_data):
control_mode = control_datum.control_mode
# soft_injection and cfg_injection are the two ControlNet control_mode booleans
# that are combined at higher level to make control_mode enum
# soft_injection determines whether to do per-layer re-weighting adjustment (if True)
# or default weighting (if False)
soft_injection = control_mode == "more_prompt" or control_mode == "more_control"
# cfg_injection = determines whether to apply ControlNet to only the conditional (if True)
# or the default both conditional and unconditional (if False)
cfg_injection = control_mode == "more_control" or control_mode == "unbalanced"
controlnet_down_block_samples, controlnet_mid_block_sample = self.invokeai_diffuser.do_controlnet_step(
control_data=control_data,
sample=latent_model_input,
timestep=timestep,
step_index=step_index,
total_step_count=total_step_count,
conditioning_data=conditioning_data,
)
first_control_step = math.floor(control_datum.begin_step_percent * total_step_count)
last_control_step = math.ceil(control_datum.end_step_percent * total_step_count)
# only apply controlnet if current step is within the controlnet's begin/end step range
if step_index >= first_control_step and step_index <= last_control_step:
if cfg_injection:
control_latent_input = unet_latent_input
else:
# expand the latents input to control model if doing classifier free guidance
# (which I think for now is always true, there is conditional elsewhere that stops execution if
# classifier_free_guidance is <= 1.0 ?)
control_latent_input = torch.cat([unet_latent_input] * 2)
if cfg_injection: # only applying ControlNet to conditional instead of in unconditioned
encoder_hidden_states = conditioning_data.text_embeddings
encoder_attention_mask = None
else:
(
encoder_hidden_states,
encoder_attention_mask,
) = self.invokeai_diffuser._concat_conditionings_for_batch(
conditioning_data.unconditioned_embeddings,
conditioning_data.text_embeddings,
)
if isinstance(control_datum.weight, list):
# if controlnet has multiple weights, use the weight for the current step
controlnet_weight = control_datum.weight[step_index]
else:
# if controlnet has a single weight, use it for all steps
controlnet_weight = control_datum.weight
# controlnet(s) inference
down_samples, mid_sample = control_datum.model(
sample=control_latent_input,
timestep=timestep,
encoder_hidden_states=encoder_hidden_states,
controlnet_cond=control_datum.image_tensor,
conditioning_scale=controlnet_weight, # controlnet specific, NOT the guidance scale
encoder_attention_mask=encoder_attention_mask,
guess_mode=soft_injection, # this is still called guess_mode in diffusers ControlNetModel
return_dict=False,
)
if cfg_injection:
# Inferred ControlNet only for the conditional batch.
# To apply the output of ControlNet to both the unconditional and conditional batches,
# prepend zeros for unconditional batch
down_samples = [torch.cat([torch.zeros_like(d), d]) for d in down_samples]
mid_sample = torch.cat([torch.zeros_like(mid_sample), mid_sample])
if down_block_res_samples is None and mid_block_res_sample is None:
down_block_res_samples, mid_block_res_sample = down_samples, mid_sample
else:
# add controlnet outputs together if have multiple controlnets
down_block_res_samples = [
samples_prev + samples_curr
for samples_prev, samples_curr in zip(down_block_res_samples, down_samples)
]
mid_block_res_sample += mid_sample
# predict the noise residual
noise_pred = self.invokeai_diffuser.do_diffusion_step(
x=unet_latent_input,
sigma=t,
unconditioning=conditioning_data.unconditioned_embeddings,
conditioning=conditioning_data.text_embeddings,
unconditional_guidance_scale=conditioning_data.guidance_scale,
uc_noise_pred, c_noise_pred = self.invokeai_diffuser.do_unet_step(
sample=latent_model_input,
timestep=t, # TODO: debug how handled batched and non batched timesteps
step_index=step_index,
total_step_count=total_step_count,
down_block_additional_residuals=down_block_res_samples, # from controlnet(s)
mid_block_additional_residual=mid_block_res_sample, # from controlnet(s)
conditioning_data=conditioning_data,
# extra:
down_block_additional_residuals=controlnet_down_block_samples, # from controlnet(s)
mid_block_additional_residual=controlnet_mid_block_sample, # from controlnet(s)
)
guidance_scale = conditioning_data.guidance_scale
if isinstance(guidance_scale, list):
guidance_scale = guidance_scale[step_index]
noise_pred = self.invokeai_diffuser._combine(
uc_noise_pred,
c_noise_pred,
guidance_scale,
)
# compute the previous noisy sample x_t -> x_t-1
@ -699,224 +577,3 @@ class StableDiffusionGeneratorPipeline(StableDiffusionPipeline):
cross_attention_kwargs=cross_attention_kwargs,
**kwargs,
).sample
def img2img_from_embeddings(
self,
init_image: Union[torch.FloatTensor, PIL.Image.Image],
strength: float,
num_inference_steps: int,
conditioning_data: ConditioningData,
*,
callback: Callable[[PipelineIntermediateState], None] = None,
run_id=None,
noise_func=None,
seed=None,
) -> InvokeAIStableDiffusionPipelineOutput:
if isinstance(init_image, PIL.Image.Image):
init_image = image_resized_to_grid_as_tensor(init_image.convert("RGB"))
if init_image.dim() == 3:
init_image = einops.rearrange(init_image, "c h w -> 1 c h w")
# 6. Prepare latent variables
initial_latents = self.non_noised_latents_from_image(
init_image,
device=self._model_group.device_for(self.unet),
dtype=self.unet.dtype,
)
if seed is not None:
set_seed(seed)
noise = noise_func(initial_latents)
return self.img2img_from_latents_and_embeddings(
initial_latents,
num_inference_steps,
conditioning_data,
strength,
noise,
run_id,
callback,
)
def img2img_from_latents_and_embeddings(
self,
initial_latents,
num_inference_steps,
conditioning_data: ConditioningData,
strength,
noise: torch.Tensor,
run_id=None,
callback=None,
) -> InvokeAIStableDiffusionPipelineOutput:
timesteps, _ = self.get_img2img_timesteps(num_inference_steps, strength)
result_latents, result_attention_maps = self.latents_from_embeddings(
latents=initial_latents
if strength < 1.0
else torch.zeros_like(initial_latents, device=initial_latents.device, dtype=initial_latents.dtype),
num_inference_steps=num_inference_steps,
conditioning_data=conditioning_data,
timesteps=timesteps,
noise=noise,
run_id=run_id,
callback=callback,
)
# https://discuss.huggingface.co/t/memory-usage-by-later-pipeline-stages/23699
torch.cuda.empty_cache()
with torch.inference_mode():
image = self.decode_latents(result_latents)
output = InvokeAIStableDiffusionPipelineOutput(
images=image,
nsfw_content_detected=[],
attention_map_saver=result_attention_maps,
)
return self.check_for_safety(output, dtype=conditioning_data.dtype)
def get_img2img_timesteps(self, num_inference_steps: int, strength: float, device=None) -> (torch.Tensor, int):
img2img_pipeline = StableDiffusionImg2ImgPipeline(**self.components)
assert img2img_pipeline.scheduler is self.scheduler
if self.scheduler.config.get("cpu_only", False):
scheduler_device = torch.device("cpu")
else:
scheduler_device = self._model_group.device_for(self.unet)
img2img_pipeline.scheduler.set_timesteps(num_inference_steps, device=scheduler_device)
timesteps, adjusted_steps = img2img_pipeline.get_timesteps(
num_inference_steps, strength, device=scheduler_device
)
# Workaround for low strength resulting in zero timesteps.
# TODO: submit upstream fix for zero-step img2img
if timesteps.numel() == 0:
timesteps = self.scheduler.timesteps[-1:]
adjusted_steps = timesteps.numel()
return timesteps, adjusted_steps
def inpaint_from_embeddings(
self,
init_image: torch.FloatTensor,
mask: torch.FloatTensor,
strength: float,
num_inference_steps: int,
conditioning_data: ConditioningData,
*,
callback: Callable[[PipelineIntermediateState], None] = None,
run_id=None,
noise_func=None,
seed=None,
) -> InvokeAIStableDiffusionPipelineOutput:
device = self._model_group.device_for(self.unet)
latents_dtype = self.unet.dtype
if isinstance(init_image, PIL.Image.Image):
init_image = image_resized_to_grid_as_tensor(init_image.convert("RGB"))
init_image = init_image.to(device=device, dtype=latents_dtype)
mask = mask.to(device=device, dtype=latents_dtype)
if init_image.dim() == 3:
init_image = init_image.unsqueeze(0)
timesteps, _ = self.get_img2img_timesteps(num_inference_steps, strength)
# 6. Prepare latent variables
# can't quite use upstream StableDiffusionImg2ImgPipeline.prepare_latents
# because we have our own noise function
init_image_latents = self.non_noised_latents_from_image(init_image, device=device, dtype=latents_dtype)
if seed is not None:
set_seed(seed)
noise = noise_func(init_image_latents)
if mask.dim() == 3:
mask = mask.unsqueeze(0)
latent_mask = tv_resize(mask, init_image_latents.shape[-2:], T.InterpolationMode.BILINEAR).to(
device=device, dtype=latents_dtype
)
guidance: List[Callable] = []
if is_inpainting_model(self.unet):
# You'd think the inpainting model wouldn't be paying attention to the area it is going to repaint
# (that's why there's a mask!) but it seems to really want that blanked out.
masked_init_image = init_image * torch.where(mask < 0.5, 1, 0)
masked_latents = self.non_noised_latents_from_image(masked_init_image, device=device, dtype=latents_dtype)
# TODO: we should probably pass this in so we don't have to try/finally around setting it.
self.invokeai_diffuser.model_forward_callback = AddsMaskLatents(
self._unet_forward, latent_mask, masked_latents
)
else:
guidance.append(AddsMaskGuidance(latent_mask, init_image_latents, self.scheduler, noise))
try:
result_latents, result_attention_maps = self.latents_from_embeddings(
latents=init_image_latents
if strength < 1.0
else torch.zeros_like(
init_image_latents, device=init_image_latents.device, dtype=init_image_latents.dtype
),
num_inference_steps=num_inference_steps,
conditioning_data=conditioning_data,
noise=noise,
timesteps=timesteps,
additional_guidance=guidance,
run_id=run_id,
callback=callback,
)
finally:
self.invokeai_diffuser.model_forward_callback = self._unet_forward
# https://discuss.huggingface.co/t/memory-usage-by-later-pipeline-stages/23699
torch.cuda.empty_cache()
with torch.inference_mode():
image = self.decode_latents(result_latents)
output = InvokeAIStableDiffusionPipelineOutput(
images=image,
nsfw_content_detected=[],
attention_map_saver=result_attention_maps,
)
return self.check_for_safety(output, dtype=conditioning_data.dtype)
def non_noised_latents_from_image(self, init_image, *, device: torch.device, dtype):
init_image = init_image.to(device=device, dtype=dtype)
with torch.inference_mode():
self._model_group.load(self.vae)
init_latent_dist = self.vae.encode(init_image).latent_dist
init_latents = init_latent_dist.sample().to(dtype=dtype) # FIXME: uses torch.randn. make reproducible!
init_latents = 0.18215 * init_latents
return init_latents
def check_for_safety(self, output, dtype):
with torch.inference_mode():
screened_images, has_nsfw_concept = self.run_safety_checker(output.images, dtype=dtype)
screened_attention_map_saver = None
if has_nsfw_concept is None or not has_nsfw_concept:
screened_attention_map_saver = output.attention_map_saver
return InvokeAIStableDiffusionPipelineOutput(
screened_images,
has_nsfw_concept,
# block the attention maps if NSFW content is detected
attention_map_saver=screened_attention_map_saver,
)
def run_safety_checker(self, image, device=None, dtype=None):
# overriding to use the model group for device info instead of requiring the caller to know.
if self.safety_checker is not None:
device = self._model_group.device_for(self.safety_checker)
return super().run_safety_checker(image, device, dtype)
def decode_latents(self, latents):
# Explicit call to get the vae loaded, since `decode` isn't the forward method.
self._model_group.load(self.vae)
return super().decode_latents(latents)
def debug_latents(self, latents, msg):
from invokeai.backend.image_util import debug_image
with torch.inference_mode():
decoded = self.numpy_to_pil(self.decode_latents(latents))
for i, img in enumerate(decoded):
debug_image(img, f"latents {msg} {i+1}/{len(decoded)}", debug_status=True)

View File

@ -3,4 +3,9 @@ Initialization file for invokeai.models.diffusion
"""
from .cross_attention_control import InvokeAICrossAttentionMixin
from .cross_attention_map_saving import AttentionMapSaver
from .shared_invokeai_diffusion import InvokeAIDiffuserComponent, PostprocessingSettings
from .shared_invokeai_diffusion import (
InvokeAIDiffuserComponent,
PostprocessingSettings,
BasicConditioningInfo,
SDXLConditioningInfo,
)

View File

@ -1,6 +1,8 @@
from __future__ import annotations
from contextlib import contextmanager
from dataclasses import dataclass
from math import ceil
import math
from typing import Any, Callable, Dict, Optional, Union, List
import numpy as np
@ -32,6 +34,29 @@ ModelForwardCallback: TypeAlias = Union[
]
@dataclass
class BasicConditioningInfo:
embeds: torch.Tensor
extra_conditioning: Optional[InvokeAIDiffuserComponent.ExtraConditioningInfo]
# weight: float
# mode: ConditioningAlgo
def to(self, device, dtype=None):
self.embeds = self.embeds.to(device=device, dtype=dtype)
return self
@dataclass
class SDXLConditioningInfo(BasicConditioningInfo):
pooled_embeds: torch.Tensor
add_time_ids: torch.Tensor
def to(self, device, dtype=None):
self.pooled_embeds = self.pooled_embeds.to(device=device, dtype=dtype)
self.add_time_ids = self.add_time_ids.to(device=device, dtype=dtype)
return super().to(device=device, dtype=dtype)
@dataclass(frozen=True)
class PostprocessingSettings:
threshold: float
@ -78,10 +103,9 @@ class InvokeAIDiffuserComponent:
self.cross_attention_control_context = None
self.sequential_guidance = config.sequential_guidance
@classmethod
@contextmanager
def custom_attention_context(
cls,
self,
unet: UNet2DConditionModel, # note: also may futz with the text encoder depending on requested LoRAs
extra_conditioning_info: Optional[ExtraConditioningInfo],
step_count: int,
@ -91,18 +115,19 @@ class InvokeAIDiffuserComponent:
old_attn_processors = unet.attn_processors
# Load lora conditions into the model
if extra_conditioning_info.wants_cross_attention_control:
cross_attention_control_context = Context(
self.cross_attention_control_context = Context(
arguments=extra_conditioning_info.cross_attention_control_args,
step_count=step_count,
)
setup_cross_attention_control_attention_processors(
unet,
cross_attention_control_context,
self.cross_attention_control_context,
)
try:
yield None
finally:
self.cross_attention_control_context = None
if old_attn_processors is not None:
unet.set_attn_processor(old_attn_processors)
# TODO resuscitate attention map saving
@ -127,33 +152,125 @@ class InvokeAIDiffuserComponent:
for _, module in tokens_cross_attention_modules:
module.set_attention_slice_calculated_callback(None)
def do_diffusion_step(
def do_controlnet_step(
self,
x: torch.Tensor,
sigma: torch.Tensor,
unconditioning: Union[torch.Tensor, dict],
conditioning: Union[torch.Tensor, dict],
# unconditional_guidance_scale: float,
unconditional_guidance_scale: Union[float, List[float]],
step_index: Optional[int] = None,
total_step_count: Optional[int] = None,
control_data,
sample: torch.Tensor,
timestep: torch.Tensor,
step_index: int,
total_step_count: int,
conditioning_data,
):
down_block_res_samples, mid_block_res_sample = None, None
# control_data should be type List[ControlNetData]
# this loop covers both ControlNet (one ControlNetData in list)
# and MultiControlNet (multiple ControlNetData in list)
for i, control_datum in enumerate(control_data):
control_mode = control_datum.control_mode
# soft_injection and cfg_injection are the two ControlNet control_mode booleans
# that are combined at higher level to make control_mode enum
# soft_injection determines whether to do per-layer re-weighting adjustment (if True)
# or default weighting (if False)
soft_injection = control_mode == "more_prompt" or control_mode == "more_control"
# cfg_injection = determines whether to apply ControlNet to only the conditional (if True)
# or the default both conditional and unconditional (if False)
cfg_injection = control_mode == "more_control" or control_mode == "unbalanced"
first_control_step = math.floor(control_datum.begin_step_percent * total_step_count)
last_control_step = math.ceil(control_datum.end_step_percent * total_step_count)
# only apply controlnet if current step is within the controlnet's begin/end step range
if step_index >= first_control_step and step_index <= last_control_step:
if cfg_injection:
sample_model_input = sample
else:
# expand the latents input to control model if doing classifier free guidance
# (which I think for now is always true, there is conditional elsewhere that stops execution if
# classifier_free_guidance is <= 1.0 ?)
sample_model_input = torch.cat([sample] * 2)
added_cond_kwargs = None
if cfg_injection: # only applying ControlNet to conditional instead of in unconditioned
if type(conditioning_data.text_embeddings) is SDXLConditioningInfo:
added_cond_kwargs = {
"text_embeds": conditioning_data.text_embeddings.pooled_embeds,
"time_ids": conditioning_data.text_embeddings.add_time_ids,
}
encoder_hidden_states = conditioning_data.text_embeddings.embeds
encoder_attention_mask = None
else:
if type(conditioning_data.text_embeddings) is SDXLConditioningInfo:
added_cond_kwargs = {
"text_embeds": torch.cat(
[
# TODO: how to pad? just by zeros? or even truncate?
conditioning_data.unconditioned_embeddings.pooled_embeds,
conditioning_data.text_embeddings.pooled_embeds,
],
dim=0,
),
"time_ids": torch.cat(
[
conditioning_data.unconditioned_embeddings.add_time_ids,
conditioning_data.text_embeddings.add_time_ids,
],
dim=0,
),
}
(
encoder_hidden_states,
encoder_attention_mask,
) = self._concat_conditionings_for_batch(
conditioning_data.unconditioned_embeddings.embeds,
conditioning_data.text_embeddings.embeds,
)
if isinstance(control_datum.weight, list):
# if controlnet has multiple weights, use the weight for the current step
controlnet_weight = control_datum.weight[step_index]
else:
# if controlnet has a single weight, use it for all steps
controlnet_weight = control_datum.weight
# controlnet(s) inference
down_samples, mid_sample = control_datum.model(
sample=sample_model_input,
timestep=timestep,
encoder_hidden_states=encoder_hidden_states,
controlnet_cond=control_datum.image_tensor,
conditioning_scale=controlnet_weight, # controlnet specific, NOT the guidance scale
encoder_attention_mask=encoder_attention_mask,
guess_mode=soft_injection, # this is still called guess_mode in diffusers ControlNetModel
return_dict=False,
)
if cfg_injection:
# Inferred ControlNet only for the conditional batch.
# To apply the output of ControlNet to both the unconditional and conditional batches,
# prepend zeros for unconditional batch
down_samples = [torch.cat([torch.zeros_like(d), d]) for d in down_samples]
mid_sample = torch.cat([torch.zeros_like(mid_sample), mid_sample])
if down_block_res_samples is None and mid_block_res_sample is None:
down_block_res_samples, mid_block_res_sample = down_samples, mid_sample
else:
# add controlnet outputs together if have multiple controlnets
down_block_res_samples = [
samples_prev + samples_curr
for samples_prev, samples_curr in zip(down_block_res_samples, down_samples)
]
mid_block_res_sample += mid_sample
return down_block_res_samples, mid_block_res_sample
def do_unet_step(
self,
sample: torch.Tensor,
timestep: torch.Tensor,
conditioning_data, # TODO: type
step_index: int,
total_step_count: int,
**kwargs,
):
"""
:param x: current latents
:param sigma: aka t, passed to the internal model to control how much denoising will occur
:param unconditioning: embeddings for unconditioned output. for hybrid conditioning this is a dict of tensors [B x 77 x 768], otherwise a single tensor [B x 77 x 768]
:param conditioning: embeddings for conditioned output. for hybrid conditioning this is a dict of tensors [B x 77 x 768], otherwise a single tensor [B x 77 x 768]
:param unconditional_guidance_scale: aka CFG scale, controls how much effect the conditioning tensor has
:param step_index: counts upwards from 0 to (step_count-1) (as passed to setup_cross_attention_control, if using). May be called multiple times for a single step, therefore do not assume that its value will monotically increase. If None, will be estimated by comparing sigma against self.model.sigmas .
:return: the new latents after applying the model to x using unscaled unconditioning and CFG-scaled conditioning.
"""
if isinstance(unconditional_guidance_scale, list):
guidance_scale = unconditional_guidance_scale[step_index]
else:
guidance_scale = unconditional_guidance_scale
cross_attention_control_types_to_do = []
context: Context = self.cross_attention_control_context
if self.cross_attention_control_context is not None:
@ -163,25 +280,15 @@ class InvokeAIDiffuserComponent:
)
wants_cross_attention_control = len(cross_attention_control_types_to_do) > 0
wants_hybrid_conditioning = isinstance(conditioning, dict)
if wants_hybrid_conditioning:
unconditioned_next_x, conditioned_next_x = self._apply_hybrid_conditioning(
x,
sigma,
unconditioning,
conditioning,
**kwargs,
)
elif wants_cross_attention_control:
if wants_cross_attention_control:
(
unconditioned_next_x,
conditioned_next_x,
) = self._apply_cross_attention_controlled_conditioning(
x,
sigma,
unconditioning,
conditioning,
sample,
timestep,
conditioning_data,
cross_attention_control_types_to_do,
**kwargs,
)
@ -190,10 +297,9 @@ class InvokeAIDiffuserComponent:
unconditioned_next_x,
conditioned_next_x,
) = self._apply_standard_conditioning_sequentially(
x,
sigma,
unconditioning,
conditioning,
sample,
timestep,
conditioning_data,
**kwargs,
)
@ -202,21 +308,13 @@ class InvokeAIDiffuserComponent:
unconditioned_next_x,
conditioned_next_x,
) = self._apply_standard_conditioning(
x,
sigma,
unconditioning,
conditioning,
sample,
timestep,
conditioning_data,
**kwargs,
)
combined_next_x = self._combine(
# unconditioned_next_x, conditioned_next_x, unconditional_guidance_scale
unconditioned_next_x,
conditioned_next_x,
guidance_scale,
)
return combined_next_x
return unconditioned_next_x, conditioned_next_x
def do_latent_postprocessing(
self,
@ -228,7 +326,6 @@ class InvokeAIDiffuserComponent:
) -> torch.Tensor:
if postprocessing_settings is not None:
percent_through = step_index / total_step_count
latents = self.apply_threshold(postprocessing_settings, latents, percent_through)
latents = self.apply_symmetry(postprocessing_settings, latents, percent_through)
return latents
@ -281,17 +378,40 @@ class InvokeAIDiffuserComponent:
# methods below are called from do_diffusion_step and should be considered private to this class.
def _apply_standard_conditioning(self, x, sigma, unconditioning, conditioning, **kwargs):
def _apply_standard_conditioning(self, x, sigma, conditioning_data, **kwargs):
# fast batched path
x_twice = torch.cat([x] * 2)
sigma_twice = torch.cat([sigma] * 2)
both_conditionings, encoder_attention_mask = self._concat_conditionings_for_batch(unconditioning, conditioning)
added_cond_kwargs = None
if type(conditioning_data.text_embeddings) is SDXLConditioningInfo:
added_cond_kwargs = {
"text_embeds": torch.cat(
[
# TODO: how to pad? just by zeros? or even truncate?
conditioning_data.unconditioned_embeddings.pooled_embeds,
conditioning_data.text_embeddings.pooled_embeds,
],
dim=0,
),
"time_ids": torch.cat(
[
conditioning_data.unconditioned_embeddings.add_time_ids,
conditioning_data.text_embeddings.add_time_ids,
],
dim=0,
),
}
both_conditionings, encoder_attention_mask = self._concat_conditionings_for_batch(
conditioning_data.unconditioned_embeddings.embeds, conditioning_data.text_embeddings.embeds
)
both_results = self.model_forward_callback(
x_twice,
sigma_twice,
both_conditionings,
encoder_attention_mask=encoder_attention_mask,
added_cond_kwargs=added_cond_kwargs,
**kwargs,
)
unconditioned_next_x, conditioned_next_x = both_results.chunk(2)
@ -301,8 +421,7 @@ class InvokeAIDiffuserComponent:
self,
x: torch.Tensor,
sigma,
unconditioning: torch.Tensor,
conditioning: torch.Tensor,
conditioning_data,
**kwargs,
):
# low-memory sequential path
@ -320,52 +439,46 @@ class InvokeAIDiffuserComponent:
if mid_block_additional_residual is not None:
uncond_mid_block, cond_mid_block = mid_block_additional_residual.chunk(2)
added_cond_kwargs = None
is_sdxl = type(conditioning_data.text_embeddings) is SDXLConditioningInfo
if is_sdxl:
added_cond_kwargs = {
"text_embeds": conditioning_data.unconditioned_embeddings.pooled_embeds,
"time_ids": conditioning_data.unconditioned_embeddings.add_time_ids,
}
unconditioned_next_x = self.model_forward_callback(
x,
sigma,
unconditioning,
conditioning_data.unconditioned_embeddings.embeds,
down_block_additional_residuals=uncond_down_block,
mid_block_additional_residual=uncond_mid_block,
added_cond_kwargs=added_cond_kwargs,
**kwargs,
)
if is_sdxl:
added_cond_kwargs = {
"text_embeds": conditioning_data.text_embeddings.pooled_embeds,
"time_ids": conditioning_data.text_embeddings.add_time_ids,
}
conditioned_next_x = self.model_forward_callback(
x,
sigma,
conditioning,
conditioning_data.text_embeddings.embeds,
down_block_additional_residuals=cond_down_block,
mid_block_additional_residual=cond_mid_block,
added_cond_kwargs=added_cond_kwargs,
**kwargs,
)
return unconditioned_next_x, conditioned_next_x
# TODO: looks unused
def _apply_hybrid_conditioning(self, x, sigma, unconditioning, conditioning, **kwargs):
assert isinstance(conditioning, dict)
assert isinstance(unconditioning, dict)
x_twice = torch.cat([x] * 2)
sigma_twice = torch.cat([sigma] * 2)
both_conditionings = dict()
for k in conditioning:
if isinstance(conditioning[k], list):
both_conditionings[k] = [
torch.cat([unconditioning[k][i], conditioning[k][i]]) for i in range(len(conditioning[k]))
]
else:
both_conditionings[k] = torch.cat([unconditioning[k], conditioning[k]])
unconditioned_next_x, conditioned_next_x = self.model_forward_callback(
x_twice,
sigma_twice,
both_conditionings,
**kwargs,
).chunk(2)
return unconditioned_next_x, conditioned_next_x
def _apply_cross_attention_controlled_conditioning(
self,
x: torch.Tensor,
sigma,
unconditioning,
conditioning,
conditioning_data,
cross_attention_control_types_to_do,
**kwargs,
):
@ -391,26 +504,43 @@ class InvokeAIDiffuserComponent:
mask=context.cross_attention_mask,
cross_attention_types_to_do=[],
)
added_cond_kwargs = None
is_sdxl = type(conditioning_data.text_embeddings) is SDXLConditioningInfo
if is_sdxl:
added_cond_kwargs = {
"text_embeds": conditioning_data.unconditioned_embeddings.pooled_embeds,
"time_ids": conditioning_data.unconditioned_embeddings.add_time_ids,
}
# no cross attention for unconditioning (negative prompt)
unconditioned_next_x = self.model_forward_callback(
x,
sigma,
unconditioning,
conditioning_data.unconditioned_embeddings.embeds,
{"swap_cross_attn_context": cross_attn_processor_context},
down_block_additional_residuals=uncond_down_block,
mid_block_additional_residual=uncond_mid_block,
added_cond_kwargs=added_cond_kwargs,
**kwargs,
)
if is_sdxl:
added_cond_kwargs = {
"text_embeds": conditioning_data.text_embeddings.pooled_embeds,
"time_ids": conditioning_data.text_embeddings.add_time_ids,
}
# do requested cross attention types for conditioning (positive prompt)
cross_attn_processor_context.cross_attention_types_to_do = cross_attention_control_types_to_do
conditioned_next_x = self.model_forward_callback(
x,
sigma,
conditioning,
conditioning_data.text_embeddings.embeds,
{"swap_cross_attn_context": cross_attn_processor_context},
down_block_additional_residuals=cond_down_block,
mid_block_additional_residual=cond_mid_block,
added_cond_kwargs=added_cond_kwargs,
**kwargs,
)
return unconditioned_next_x, conditioned_next_x
@ -421,63 +551,6 @@ class InvokeAIDiffuserComponent:
combined_next_x = unconditioned_next_x + scaled_delta
return combined_next_x
def apply_threshold(
self,
postprocessing_settings: PostprocessingSettings,
latents: torch.Tensor,
percent_through: float,
) -> torch.Tensor:
if postprocessing_settings.threshold is None or postprocessing_settings.threshold == 0.0:
return latents
threshold = postprocessing_settings.threshold
warmup = postprocessing_settings.warmup
if percent_through < warmup:
current_threshold = threshold + threshold * 5 * (1 - (percent_through / warmup))
else:
current_threshold = threshold
if current_threshold <= 0:
return latents
maxval = latents.max().item()
minval = latents.min().item()
scale = 0.7 # default value from #395
if self.debug_thresholding:
std, mean = [i.item() for i in torch.std_mean(latents)]
outside = torch.count_nonzero((latents < -current_threshold) | (latents > current_threshold))
logger.info(f"Threshold: %={percent_through} threshold={current_threshold:.3f} (of {threshold:.3f})")
logger.debug(f"min, mean, max = {minval:.3f}, {mean:.3f}, {maxval:.3f}\tstd={std}")
logger.debug(f"{outside / latents.numel() * 100:.2f}% values outside threshold")
if maxval < current_threshold and minval > -current_threshold:
return latents
num_altered = 0
# MPS torch.rand_like is fine because torch.rand_like is wrapped in generate.py!
if maxval > current_threshold:
latents = torch.clone(latents)
maxval = np.clip(maxval * scale, 1, current_threshold)
num_altered += torch.count_nonzero(latents > maxval)
latents[latents > maxval] = torch.rand_like(latents[latents > maxval]) * maxval
if minval < -current_threshold:
latents = torch.clone(latents)
minval = np.clip(minval * scale, -current_threshold, -1)
num_altered += torch.count_nonzero(latents < minval)
latents[latents < minval] = torch.rand_like(latents[latents < minval]) * minval
if self.debug_thresholding:
logger.debug(f"min, , max = {minval:.3f}, , {maxval:.3f}\t(scaled by {scale})")
logger.debug(f"{num_altered / latents.numel() * 100:.2f}% values altered")
return latents
def apply_symmetry(
self,
postprocessing_settings: PostprocessingSettings,
@ -539,18 +612,6 @@ class InvokeAIDiffuserComponent:
self.last_percent_through = percent_through
return latents.to(device=dev)
def estimate_percent_through(self, step_index, sigma):
if step_index is not None and self.cross_attention_control_context is not None:
# percent_through will never reach 1.0 (but this is intended)
return float(step_index) / float(self.cross_attention_control_context.step_count)
# find the best possible index of the current sigma in the sigma sequence
smaller_sigmas = torch.nonzero(self.model.sigmas <= sigma)
sigma_index = smaller_sigmas[-1].item() if smaller_sigmas.shape[0] > 0 else 0
# flip because sigmas[0] is for the fully denoised image
# percent_through must be <1
return 1.0 - float(sigma_index + 1) / float(self.model.sigmas.shape[0])
# print('estimated percent_through', percent_through, 'from sigma', sigma.item())
# todo: make this work
@classmethod
def apply_conjunction(cls, x, t, forward_func, uc, c_or_weighted_c_list, global_guidance_scale):
@ -564,7 +625,7 @@ class InvokeAIDiffuserComponent:
# below is fugly omg
conditionings = [uc] + [c for c, weight in weighted_cond_list]
weights = [1] + [weight for c, weight in weighted_cond_list]
chunk_count = ceil(len(conditionings) / 2)
chunk_count = math.ceil(len(conditionings) / 2)
deltas = None
for chunk_index in range(chunk_count):
offset = chunk_index * 2

View File

@ -1,253 +0,0 @@
from __future__ import annotations
import warnings
import weakref
from abc import ABCMeta, abstractmethod
from collections.abc import MutableMapping
from typing import Callable, Union
import torch
from accelerate.utils import send_to_device
from torch.utils.hooks import RemovableHandle
OFFLOAD_DEVICE = torch.device("cpu")
class _NoModel:
"""Symbol that indicates no model is loaded.
(We can't weakref.ref(None), so this was my best idea at the time to come up with something
type-checkable.)
"""
def __bool__(self):
return False
def to(self, device: torch.device):
pass
def __repr__(self):
return "<NO MODEL>"
NO_MODEL = _NoModel()
class ModelGroup(metaclass=ABCMeta):
"""
A group of models.
The use case I had in mind when writing this is the sub-models used by a DiffusionPipeline,
e.g. its text encoder, U-net, VAE, etc.
Those models are :py:class:`diffusers.ModelMixin`, but "model" is interchangeable with
:py:class:`torch.nn.Module` here.
"""
def __init__(self, execution_device: torch.device):
self.execution_device = execution_device
@abstractmethod
def install(self, *models: torch.nn.Module):
"""Add models to this group."""
pass
@abstractmethod
def uninstall(self, models: torch.nn.Module):
"""Remove models from this group."""
pass
@abstractmethod
def uninstall_all(self):
"""Remove all models from this group."""
@abstractmethod
def load(self, model: torch.nn.Module):
"""Load this model to the execution device."""
pass
@abstractmethod
def offload_current(self):
"""Offload the current model(s) from the execution device."""
pass
@abstractmethod
def ready(self):
"""Ready this group for use."""
pass
@abstractmethod
def set_device(self, device: torch.device):
"""Change which device models from this group will execute on."""
pass
@abstractmethod
def device_for(self, model) -> torch.device:
"""Get the device the given model will execute on.
The model should already be a member of this group.
"""
pass
@abstractmethod
def __contains__(self, model):
"""Check if the model is a member of this group."""
pass
def __repr__(self) -> str:
return f"<{self.__class__.__name__} object at {id(self):x}: " f"device={self.execution_device} >"
class LazilyLoadedModelGroup(ModelGroup):
"""
Only one model from this group is loaded on the GPU at a time.
Running the forward method of a model will displace the previously-loaded model,
offloading it to CPU.
If you call other methods on the model, e.g. ``model.encode(x)`` instead of ``model(x)``,
you will need to explicitly load it with :py:method:`.load(model)`.
This implementation relies on pytorch forward-pre-hooks, and it will copy forward arguments
to the appropriate execution device, as long as they are positional arguments and not keyword
arguments. (I didn't make the rules; that's the way the pytorch 1.13 API works for hooks.)
"""
_hooks: MutableMapping[torch.nn.Module, RemovableHandle]
_current_model_ref: Callable[[], Union[torch.nn.Module, _NoModel]]
def __init__(self, execution_device: torch.device):
super().__init__(execution_device)
self._hooks = weakref.WeakKeyDictionary()
self._current_model_ref = weakref.ref(NO_MODEL)
def install(self, *models: torch.nn.Module):
for model in models:
self._hooks[model] = model.register_forward_pre_hook(self._pre_hook)
def uninstall(self, *models: torch.nn.Module):
for model in models:
hook = self._hooks.pop(model)
hook.remove()
if self.is_current_model(model):
# no longer hooked by this object, so don't claim to manage it
self.clear_current_model()
def uninstall_all(self):
self.uninstall(*self._hooks.keys())
def _pre_hook(self, module: torch.nn.Module, forward_input):
self.load(module)
if len(forward_input) == 0:
warnings.warn(
f"Hook for {module.__class__.__name__} got no input. " f"Inputs must be positional, not keywords.",
stacklevel=3,
)
return send_to_device(forward_input, self.execution_device)
def load(self, module):
if not self.is_current_model(module):
self.offload_current()
self._load(module)
def offload_current(self):
module = self._current_model_ref()
if module is not NO_MODEL:
module.to(OFFLOAD_DEVICE)
self.clear_current_model()
def _load(self, module: torch.nn.Module) -> torch.nn.Module:
assert self.is_empty(), f"A model is already loaded: {self._current_model_ref()}"
module = module.to(self.execution_device)
self.set_current_model(module)
return module
def is_current_model(self, model: torch.nn.Module) -> bool:
"""Is the given model the one currently loaded on the execution device?"""
return self._current_model_ref() is model
def is_empty(self):
"""Are none of this group's models loaded on the execution device?"""
return self._current_model_ref() is NO_MODEL
def set_current_model(self, value):
self._current_model_ref = weakref.ref(value)
def clear_current_model(self):
self._current_model_ref = weakref.ref(NO_MODEL)
def set_device(self, device: torch.device):
if device == self.execution_device:
return
self.execution_device = device
current = self._current_model_ref()
if current is not NO_MODEL:
current.to(device)
def device_for(self, model):
if model not in self:
raise KeyError(f"This does not manage this model {type(model).__name__}", model)
return self.execution_device # this implementation only dispatches to one device
def ready(self):
pass # always ready to load on-demand
def __contains__(self, model):
return model in self._hooks
def __repr__(self) -> str:
return (
f"<{self.__class__.__name__} object at {id(self):x}: "
f"current_model={type(self._current_model_ref()).__name__} >"
)
class FullyLoadedModelGroup(ModelGroup):
"""
A group of models without any implicit loading or unloading.
:py:meth:`.ready` loads _all_ the models to the execution device at once.
"""
_models: weakref.WeakSet
def __init__(self, execution_device: torch.device):
super().__init__(execution_device)
self._models = weakref.WeakSet()
def install(self, *models: torch.nn.Module):
for model in models:
self._models.add(model)
model.to(self.execution_device)
def uninstall(self, *models: torch.nn.Module):
for model in models:
self._models.remove(model)
def uninstall_all(self):
self.uninstall(*self._models)
def load(self, model):
model.to(self.execution_device)
def offload_current(self):
for model in self._models:
model.to(OFFLOAD_DEVICE)
def ready(self):
for model in self._models:
self.load(model)
def set_device(self, device: torch.device):
self.execution_device = device
for model in self._models:
if model.device != OFFLOAD_DEVICE:
model.to(device)
def device_for(self, model):
if model not in self:
raise KeyError("This does not manage this model f{type(model).__name__}", model)
return self.execution_device # this implementation only dispatches to one device
def __contains__(self, model):
return model in self._models

View File

@ -1,6 +1,8 @@
from __future__ import annotations
from contextlib import nullcontext
from packaging import version
import platform
import torch
from torch import autocast
@ -30,7 +32,7 @@ def choose_precision(device: torch.device) -> str:
device_name = torch.cuda.get_device_name(device)
if not ("GeForce GTX 1660" in device_name or "GeForce GTX 1650" in device_name):
return "float16"
elif device.type == "mps":
elif device.type == "mps" and version.parse(platform.mac_ver()[0]) < version.parse("14.0.0"):
return "float16"
return "float32"

View File

@ -98,6 +98,7 @@ class ControlNetModel(ModelMixin, ConfigMixin):
norm_num_groups: Optional[int] = 32,
norm_eps: float = 1e-5,
cross_attention_dim: int = 1280,
transformer_layers_per_block: Union[int, Tuple[int]] = 1,
attention_head_dim: Union[int, Tuple[int]] = 8,
num_attention_heads: Optional[Union[int, Tuple[int]]] = None,
use_linear_projection: bool = False,
@ -135,6 +136,8 @@ class ControlNetModel(ModelMixin, ConfigMixin):
raise ValueError(
f"Must provide the same number of `num_attention_heads` as `down_block_types`. `num_attention_heads`: {num_attention_heads}. `down_block_types`: {down_block_types}."
)
if isinstance(transformer_layers_per_block, int):
transformer_layers_per_block = [transformer_layers_per_block] * len(down_block_types)
# input
conv_in_kernel = 3
@ -212,6 +215,7 @@ class ControlNetModel(ModelMixin, ConfigMixin):
down_block = get_down_block(
down_block_type,
num_layers=layers_per_block,
transformer_layers_per_block=transformer_layers_per_block[i],
in_channels=input_channel,
out_channels=output_channel,
temb_channels=time_embed_dim,
@ -248,6 +252,7 @@ class ControlNetModel(ModelMixin, ConfigMixin):
self.controlnet_mid_block = controlnet_block
self.mid_block = UNetMidBlock2DCrossAttn(
transformer_layers_per_block=transformer_layers_per_block[-1],
in_channels=mid_block_channel,
temb_channels=time_embed_dim,
resnet_eps=norm_eps,

View File

@ -1,4 +0,0 @@
"""
Initialization file for the web backend.
"""
from .invoke_ai_web_server import InvokeAIWebServer

File diff suppressed because it is too large Load Diff

View File

@ -1,56 +0,0 @@
import argparse
import os
from ...args import PRECISION_CHOICES
def create_cmd_parser():
parser = argparse.ArgumentParser(description="InvokeAI web UI")
parser.add_argument(
"--host",
type=str,
help="The host to serve on",
default="localhost",
)
parser.add_argument("--port", type=int, help="The port to serve on", default=9090)
parser.add_argument(
"--cors",
nargs="*",
type=str,
help="Additional allowed origins, comma-separated",
)
parser.add_argument(
"--embedding_path",
type=str,
help="Path to a pre-trained embedding manager checkpoint - can only be set on command line",
)
# TODO: Can't get flask to serve images from any dir (saving to the dir does work when specified)
# parser.add_argument(
# "--output_dir",
# default="outputs/",
# type=str,
# help="Directory for output images",
# )
parser.add_argument(
"-v",
"--verbose",
action="store_true",
help="Enables verbose logging",
)
parser.add_argument(
"--precision",
dest="precision",
type=str,
choices=PRECISION_CHOICES,
metavar="PRECISION",
help=f'Set model precision. Defaults to auto selected based on device. Options: {", ".join(PRECISION_CHOICES)}',
default="auto",
)
parser.add_argument(
"--free_gpu_mem",
dest="free_gpu_mem",
action="store_true",
help="Force free gpu memory before final decoding",
)
return parser

View File

@ -1,113 +0,0 @@
from typing import Literal, Union
from PIL import Image, ImageChops
from PIL.Image import Image as ImageType
# https://stackoverflow.com/questions/43864101/python-pil-check-if-image-is-transparent
def check_for_any_transparency(img: Union[ImageType, str]) -> bool:
if type(img) is str:
img = Image.open(str)
if img.info.get("transparency", None) is not None:
return True
if img.mode == "P":
transparent = img.info.get("transparency", -1)
for _, index in img.getcolors():
if index == transparent:
return True
elif img.mode == "RGBA":
extrema = img.getextrema()
if extrema[3][0] < 255:
return True
return False
def get_canvas_generation_mode(
init_img: Union[ImageType, str], init_mask: Union[ImageType, str]
) -> Literal["txt2img", "outpainting", "inpainting", "img2img",]:
if type(init_img) is str:
init_img = Image.open(init_img)
if type(init_mask) is str:
init_mask = Image.open(init_mask)
init_img = init_img.convert("RGBA")
# Get alpha from init_img
init_img_alpha = init_img.split()[-1]
init_img_alpha_mask = init_img_alpha.convert("L")
init_img_has_transparency = check_for_any_transparency(init_img)
if init_img_has_transparency:
init_img_is_fully_transparent = True if init_img_alpha_mask.getbbox() is None else False
"""
Mask images are white in areas where no change should be made, black where changes
should be made.
"""
# Fit the mask to init_img's size and convert it to greyscale
init_mask = init_mask.resize(init_img.size).convert("L")
"""
PIL.Image.getbbox() returns the bounding box of non-zero areas of the image, so we first
invert the mask image so that masked areas are white and other areas black == zero.
getbbox() now tells us if the are any masked areas.
"""
init_mask_bbox = ImageChops.invert(init_mask).getbbox()
init_mask_exists = False if init_mask_bbox is None else True
if init_img_has_transparency:
if init_img_is_fully_transparent:
return "txt2img"
else:
return "outpainting"
else:
if init_mask_exists:
return "inpainting"
else:
return "img2img"
def main():
# Testing
init_img_opaque = "test_images/init-img_opaque.png"
init_img_partial_transparency = "test_images/init-img_partial_transparency.png"
init_img_full_transparency = "test_images/init-img_full_transparency.png"
init_mask_no_mask = "test_images/init-mask_no_mask.png"
init_mask_has_mask = "test_images/init-mask_has_mask.png"
print(
"OPAQUE IMAGE, NO MASK, expect img2img, got ",
get_canvas_generation_mode(init_img_opaque, init_mask_no_mask),
)
print(
"IMAGE WITH TRANSPARENCY, NO MASK, expect outpainting, got ",
get_canvas_generation_mode(init_img_partial_transparency, init_mask_no_mask),
)
print(
"FULLY TRANSPARENT IMAGE NO MASK, expect txt2img, got ",
get_canvas_generation_mode(init_img_full_transparency, init_mask_no_mask),
)
print(
"OPAQUE IMAGE, WITH MASK, expect inpainting, got ",
get_canvas_generation_mode(init_img_opaque, init_mask_has_mask),
)
print(
"IMAGE WITH TRANSPARENCY, WITH MASK, expect outpainting, got ",
get_canvas_generation_mode(init_img_partial_transparency, init_mask_has_mask),
)
print(
"FULLY TRANSPARENT IMAGE WITH MASK, expect txt2img, got ",
get_canvas_generation_mode(init_img_full_transparency, init_mask_has_mask),
)
if __name__ == "__main__":
main()

View File

@ -1,82 +0,0 @@
import argparse
from .parse_seed_weights import parse_seed_weights
SAMPLER_CHOICES = [
"ddim",
"ddpm",
"deis",
"lms",
"lms_k",
"pndm",
"heun",
"heun_k",
"euler",
"euler_k",
"euler_a",
"kdpm_2",
"kdpm_2_a",
"dpmpp_2s",
"dpmpp_2s_k",
"dpmpp_2m",
"dpmpp_2m_k",
"dpmpp_2m_sde",
"dpmpp_2m_sde_k",
"dpmpp_sde",
"dpmpp_sde_k",
"unipc",
]
def parameters_to_command(params):
"""
Converts dict of parameters into a `invoke.py` REPL command.
"""
switches = list()
if "prompt" in params:
switches.append(f'"{params["prompt"]}"')
if "steps" in params:
switches.append(f'-s {params["steps"]}')
if "seed" in params:
switches.append(f'-S {params["seed"]}')
if "width" in params:
switches.append(f'-W {params["width"]}')
if "height" in params:
switches.append(f'-H {params["height"]}')
if "cfg_scale" in params:
switches.append(f'-C {params["cfg_scale"]}')
if "sampler_name" in params:
switches.append(f'-A {params["sampler_name"]}')
if "seamless" in params and params["seamless"] == True:
switches.append(f"--seamless")
if "hires_fix" in params and params["hires_fix"] == True:
switches.append(f"--hires")
if "init_img" in params and len(params["init_img"]) > 0:
switches.append(f'-I {params["init_img"]}')
if "init_mask" in params and len(params["init_mask"]) > 0:
switches.append(f'-M {params["init_mask"]}')
if "init_color" in params and len(params["init_color"]) > 0:
switches.append(f'--init_color {params["init_color"]}')
if "strength" in params and "init_img" in params:
switches.append(f'-f {params["strength"]}')
if "fit" in params and params["fit"] == True:
switches.append(f"--fit")
if "facetool" in params:
switches.append(f'-ft {params["facetool"]}')
if "facetool_strength" in params and params["facetool_strength"]:
switches.append(f'-G {params["facetool_strength"]}')
elif "gfpgan_strength" in params and params["gfpgan_strength"]:
switches.append(f'-G {params["gfpgan_strength"]}')
if "codeformer_fidelity" in params:
switches.append(f'-cf {params["codeformer_fidelity"]}')
if "upscale" in params and params["upscale"]:
switches.append(f'-U {params["upscale"][0]} {params["upscale"][1]}')
if "variation_amount" in params and params["variation_amount"] > 0:
switches.append(f'-v {params["variation_amount"]}')
if "with_variations" in params:
seed_weight_pairs = ",".join(f"{seed}:{weight}" for seed, weight in params["with_variations"])
switches.append(f"-V {seed_weight_pairs}")
return " ".join(switches)

View File

@ -1,47 +0,0 @@
def parse_seed_weights(seed_weights):
"""
Accepts seed weights as string in "12345:0.1,23456:0.2,3456:0.3" format
Validates them
If valid: returns as [[12345, 0.1], [23456, 0.2], [3456, 0.3]]
If invalid: returns False
"""
# Must be a string
if not isinstance(seed_weights, str):
return False
# String must not be empty
if len(seed_weights) == 0:
return False
pairs = []
for pair in seed_weights.split(","):
split_values = pair.split(":")
# Seed and weight are required
if len(split_values) != 2:
return False
if len(split_values[0]) == 0 or len(split_values[1]) == 1:
return False
# Try casting the seed to int and weight to float
try:
seed = int(split_values[0])
weight = float(split_values[1])
except ValueError:
return False
# Seed must be 0 or above
if not seed >= 0:
return False
# Weight must be between 0 and 1
if not (weight >= 0 and weight <= 1):
return False
# This pair is valid
pairs.append([seed, weight])
# All pairs are valid
return pairs

Binary file not shown.

Before

Width:  |  Height:  |  Size: 2.7 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 292 KiB

Some files were not shown because too many files have changed in this diff Show More