mirror of https://github.com/invoke-ai/InvokeAI
synced 2024-08-30 20:32:17 +00:00

Merge branch 'main' into patch-1

This change is contained in commit 63178c6a8c

README.md (453 lines changed)
@@ -1,17 +1,39 @@

# Stable Diffusion Dream Script

This is a fork of CompVis/stable-diffusion, the wonderful open source
text-to-image generator. This fork supports:

1. An interactive command-line interface that accepts the same prompt
   and switches as the Discord bot.

2. Support for img2img, in which you provide a seed image to build on
   top of.

3. A basic Web interface that allows you to run a local web server for
   generating images in your browser.

4. Upscaling and face fixing using the optional ESRGAN and GFPGAN
   packages.

5. Weighted subprompts for prompt tuning.

6. Textual inversion for customization of the prompt language and images.

7. ...and more!

This fork is rapidly evolving, so use the Issues panel to report bugs
and make feature requests, and check back periodically for
improvements and bug fixes.

## Interactive command-line interface similar to the Discord bot

The _dream.py_ script, located in scripts/dream.py,
provides an interactive interface to image generation similar to
the "dream mothership" bot that Stability AI provided on its Discord
server. Unlike the txt2img.py and img2img.py scripts provided in the
original CompVis/stable-diffusion source code repository, the
time-consuming initialization of the AI model only happens once.
After that, image generation
from the command-line interface is very fast.

The script uses the readline library to allow for in-line editing,
@@ -25,7 +47,7 @@ The script is confirmed to work on Linux and Windows systems. It should
work on MacOSX as well, but this is not confirmed. Note that this script
runs from the command-line (CMD or Terminal window), and does not have a GUI.

```
(ldm) ~/stable-diffusion$ python3 ./scripts/dream.py
* Initializing, be patient...
Loading model from models/ldm/text2img-large/model.ckpt
@@ -47,13 +69,13 @@ dream> q
00009.png: "ashley judd riding a camel" -s150 -S 416354203
00010.png: "ashley judd riding a camel" -s150 -S 1362479620
00011.png: "there's a fly in my soup" -n6 -g -S 2685670268
```

The dream> prompt's arguments are pretty much identical to those used
in the Discord bot, except you don't need to type "!dream" (it doesn't
hurt if you do). A significant change is that creation of individual
images is now the default unless --grid (-g) is given. For backward
compatibility, the -i switch is recognized. For command-line help
type -h (or --help) at the dream> prompt.

The script itself also recognizes a series of command-line switches
@@ -65,12 +87,12 @@ image outputs and the location of the model weight files.
This script also provides an img2img feature that lets you seed your
creations with a drawing or photo. This is a really cool feature that tells
stable diffusion to build the prompt on top of the image you provide, preserving
the original's basic shape and layout. To use it, provide the --init_img
option as shown here:

```
dream> "waterfall and rainbow" --init_img=./init-images/crude_drawing.png --strength=0.5 -s100 -n4
```

The --init_img (-I) option gives the path to the seed picture. --strength (-f) controls how much
the original will be modified, ranging from 0.0 (keep the original intact), to 1.0 (ignore the original
@@ -80,92 +102,131 @@ You may also pass a -v<count> option to generate count variants on the original image by
passing the first generated image back into img2img the requested number of times. It generates interesting
variants.

## GFPGAN and Real-ESRGAN Support

The script also provides the ability to do face restoration and
upscaling with the help of GFPGAN and Real-ESRGAN respectively.

To use the ability, clone the **[GFPGAN
repository](https://github.com/TencentARC/GFPGAN)** and follow their
installation instructions. By default, we expect GFPGAN to be
installed in a 'GFPGAN' sibling directory. Be sure that the `"ldm"`
conda environment is active as you install GFPGAN.

You can use the `--gfpgan_dir` argument with `dream.py` to set a
custom path to your GFPGAN directory. _There are other GFPGAN related
boot arguments if you wish to customize further._

You can install **Real-ESRGAN** by typing the following command.

```
pip install realesrgan
```

**Preloading Models**

Users whose GPU machines are isolated from the Internet (e.g. on a
University cluster) should be aware that the first time you run
dream.py with GFPGAN and Real-ESRGAN turned on, it will try to
download model files from the Internet. To rectify this, you may run
`python3 scripts/preload_models.py` after you have installed GFPGAN
and all its dependencies.
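
For example, with the `ldm` environment active, the preload step can be run from the top of the repository (this is the same command used during installation):

```
(ldm) ~/stable-diffusion$ python3 scripts/preload_models.py
```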

**Usage**

You will now have access to two new prompt arguments.

**Upscaling**

`-U : <upscaling_factor> <upscaling_strength>`

The upscaling prompt argument takes two values. The first value is a
scaling factor and should be set to either `2` or `4` only. This will
either scale the image 2x or 4x respectively using different models.

You can set the scaling strength between `0` and `1.0` to control the
intensity of the scaling. This is handy because AI upscalers
generally tend to smooth out texture details. If you wish to retain
some of those for natural looking results, we recommend using values
between `0.5 to 0.8`.

If you do not explicitly specify an upscaling_strength, it will
default to 0.75.
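
For instance, the following command (the prompt text here is purely illustrative) upscales the result 2x at the default strength:

```
dream> "a sunlit mountain lake" -U 2 0.75
```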

**Face Restoration**

`-G : <gfpgan_strength>`

This prompt argument controls the strength of the face restoration
that is being applied. Similar to upscaling, values between `0.5 to
0.8` are recommended.
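
As an illustrative example (the prompt is hypothetical), this applies a moderate amount of face restoration:

```
dream> "portrait of a smiling old fisherman" -G 0.6
```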

You can use either one or both without any conflicts. In cases where
you use both, the image will be first upscaled and then the face
restoration process will be executed to ensure you get the highest
quality facial features.

`--save_orig`

When you use either `-U` or `-G`, the final result you get is upscaled
or face modified. If you want to save the original Stable Diffusion
generation, you can use the `--save_orig` prompt argument to save the
original unaffected version too.
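
For example (again with an illustrative prompt), this upscales and face-restores the result while also keeping the untouched original:

```
dream> "portrait of a smiling old fisherman" -U 2 0.75 -G 0.6 --save_orig
```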

**Example Usage**

```
dream > superman dancing with a panda bear -U 2 0.6 -G 0.4
```

This also works with img2img:

```
dream> a man wearing a pineapple hat -I path/to/your/file.png -U 2 0.5 -G 0.6
```

**Note**

GFPGAN and Real-ESRGAN are both memory intensive. In order to avoid
crashes and memory overloads during the Stable Diffusion process,
these effects are applied after Stable Diffusion has completed its
work.

In single image generations, you will see the output right away but
when you are using multiple iterations, the images will first be
generated and then upscaled and face restored after that process is
complete. While the image generation is taking place, you will still
be able to preview the base images.

If you wish to stop during the image generation but want to upscale or
face restore a particular generated image, pass it again with the same
prompt and generated seed along with the `-U` and `-G` prompt
arguments to perform those actions.
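
For example, assuming the image you want to redo came from seed 3498014304 (the prompt and seed here are purely illustrative), the follow-up command could look like this:

```
dream> "a cute child playing hopscotch" -S3498014304 -U 2 0.75 -G 0.6
```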

## Google Colab

Stable Diffusion AI Notebook: <a href="https://colab.research.google.com/github/lstein/stable-diffusion/blob/main/Stable_Diffusion_AI_Notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a> <br>
Open and follow instructions to use an isolated environment running Dream.<br>

Output example:

## Barebones Web Server

As of version 1.10, this distribution comes with a bare bones web
server (see screenshot). To use it, run the _dream.py_ script by
adding the **--web** option.

```
(ldm) ~/stable-diffusion$ python3 scripts/dream.py --web
```

You can then connect to the server by pointing your web browser at
http://localhost:9090, or to the network name or IP address of the server.

Kudos to [Tesseract Cat](https://github.com/TesseractCat) for
contributing this code, and to [dagf2101](https://github.com/dagf2101)
for refining it.
@@ -176,17 +237,23 @@ you want to run, one line per prompt. The text file must be composed
with a text editor (e.g. Notepad) and not a word processor. Each line
should look like what you would type at the dream> prompt:

```
a beautiful sunny day in the park, children playing -n4 -C10
stormy weather on a mountain top, goats grazing -s100
innovative packaging for a squid's dinner -S137038382
```

Then pass this file's name to dream.py when you invoke it:

```
(ldm) ~/stable-diffusion$ python3 scripts/dream.py --from_file "path/to/prompts.txt"
```

You may read a series of prompts from standard input by providing a filename of "-":

```
(ldm) ~/stable-diffusion$ echo "a beautiful day" | python3 scripts/dream.py --from_file -
```

## Shortcut for reusing seeds from the previous command

@@ -202,7 +269,7 @@ Here's an example of using this to do a quick refinement. It also
illustrates using the new **-G** switch to turn on upscaling and
face enhancement (see previous section):

```
dream> a cute child playing hopscotch -G0.5
[...]
outputs/img-samples/000039.3498014304.png: "a cute child playing hopscotch" -s50 -b1 -W512 -H512 -C7.5 -mk_lms -S3498014304
@@ -212,9 +279,7 @@ dream> a cute child playing hopscotch -G1.0 -s100 -S -1
reusing previous seed 3498014304
[...]
outputs/img-samples/000040.3498014304.png: "a cute child playing hopscotch" -G1.0 -s100 -b1 -W512 -H512 -C7.5 -mk_lms -S3498014304
```

## Weighted Prompts

@@ -222,9 +287,9 @@ You may weight different sections of the prompt to tell the sampler to attach different levels of
priority to them, by adding :(number) to the end of the section you wish to up- or downweight.
For example consider this prompt:

```
tabby cat:0.25 white duck:0.75 hybrid
```

This will tell the sampler to invest 25% of its effort on the tabby
cat aspect of the image and 75% on the white duck aspect
@@ -239,7 +304,7 @@ and introducing a new vocabulary to the fixed model.

To train, prepare a folder that contains images sized at 512x512 and execute the following:

```
# As the default backend is not available on Windows, if you're using that platform, execute SET PL_TORCH_DISTRIBUTED_BACKEND=gloo
(ldm) ~/stable-diffusion$ python3 ./main.py --base ./configs/stable-diffusion/v1-finetune.yaml \
                          -t \
@@ -248,14 +313,14 @@ To train, prepare a folder that contains images sized at 512x512 and execute the following:
                          --gpus 0, \
                          --data_root D:/textual-inversion/my_cat \
                          --init_word 'cat'
```

During the training process, files will be created in /logs/[project][time][project]/
where you can see the process.

conditioning\* contains the training prompts
inputs, reconstruction the input images for the training epoch
samples, samples scaled for a sample of the prompt and one with the init word provided

On a RTX3090, the process for SD will take ~1h @1.6 iterations/sec.

@@ -269,28 +334,29 @@ heat death of the universe, when you find a low loss epoch or around

Once the model is trained, specify the trained .pt file when starting
dream using

```
(ldm) ~/stable-diffusion$ python3 ./scripts/dream.py --embedding_path /path/to/embedding.pt --full_precision
```

Then, to utilize your subject at the dream prompt

```
dream> "a photo of *"
```

This also works with image2image:

```
dream> "waterfall and rainbow in the style of *" --init_img=./init-images/crude_drawing.png --strength=0.5 -s100 -n4
```

It's also possible to train multiple tokens (modify the placeholder string in configs/stable-diffusion/v1-finetune.yaml) and combine LDM checkpoints using:

```
(ldm) ~/stable-diffusion$ python3 ./scripts/merge_embeddings.py \
                          --manager_ckpts /path/to/first/embedding.pt /path/to/second/embedding.pt [...] \
                          --output_path /path/to/output/embedding.pt
```

Credit goes to @rinongal and the repository located at
https://github.com/rinongal/textual_inversion Please see the
@@ -298,80 +364,104 @@ repository and associated paper for details and limitations.

## Changes

- v1.13 (in process)

  - Supports a Google Colab notebook for a standalone server running on Google hardware [Arturo Mendivil](https://github.com/artmen1516)
  - WebUI supports GFPGAN/ESRGAN facial reconstruction and upscaling [Kevin Gibbons](https://github.com/bakkot)
  - WebUI supports incremental display of in-progress images during generation [Kevin Gibbons](https://github.com/bakkot)
  - Output directory can be specified on the dream> command line.
  - The grid was displaying duplicated images when not enough images to fill the final row [Muhammad Usama](https://github.com/SMUsamaShah)
  - Can specify --grid on dream.py command line as the default.
  - Miscellaneous internal bug and stability fixes.

- v1.12 (28 August 2022)

  - Improved file handling, including ability to read prompts from standard input.
    (kudos to [Yunsaki](https://github.com/yunsaki))
  - The web server is now integrated with the dream.py script. Invoke by adding --web to
    the dream.py command arguments.
  - Face restoration and upscaling via GFPGAN and Real-ESRGAN are now automatically
    enabled if the GFPGAN directory is located as a sibling to Stable Diffusion.
    VRAM requirements are modestly reduced. Thanks to both [Blessedcoolant](https://github.com/blessedcoolant) and
    [Oceanswave](https://github.com/oceanswave) for their work on this.
  - You can now swap samplers on the dream> command line. [Blessedcoolant](https://github.com/blessedcoolant)

- v1.11 (26 August 2022)

  - NEW FEATURE: Support upscaling and face enhancement using the GFPGAN module. (kudos to [Oceanswave](https://github.com/Oceanswave))
  - You now can specify a seed of -1 to use the previous image's seed, -2 to use the seed for the image generated before that, etc.
    Seed memory only extends back to the previous command, but will work on all images generated with the -n# switch.
  - Variant generation support temporarily disabled pending more general solution.
  - Created a feature branch named **yunsaki-morphing-dream** which adds experimental support for
    iteratively modifying the prompt and its parameters. Please see [Pull Request #86](https://github.com/lstein/stable-diffusion/pull/86)
    for a synopsis of how this works. Note that when this feature is eventually added to the main branch, it may be modified
    significantly.

- v1.10 (25 August 2022)

  - A barebones but fully functional interactive web server for online generation of txt2img and img2img.

- v1.09 (24 August 2022)

  - A new -v option allows you to generate multiple variants of an initial image
    in img2img mode. (kudos to [Oceanswave](https://github.com/Oceanswave).
    [See this discussion in the PR for examples and details on use](https://github.com/lstein/stable-diffusion/pull/71#issuecomment-1226700810))
  - Added ability to personalize text to image generation (kudos to [Oceanswave](https://github.com/Oceanswave) and [nicolai256](https://github.com/nicolai256))
  - Enabled all of the samplers from k_diffusion

- v1.08 (24 August 2022)

  - Escape single quotes on the dream> command before trying to parse. This avoids
    parse errors.
  - Removed instruction to get Python3.8 as first step in Windows install.
    Anaconda3 does it for you.
  - Added bounds checks for numeric arguments that could cause crashes.
  - Cleaned up the copyright and license agreement files.

- v1.07 (23 August 2022)

  - Image filenames will now never fill gaps in the sequence, but will be assigned the
    next higher name in the chosen directory. This ensures that the alphabetic and chronological
    sort orders are the same.

- v1.06 (23 August 2022)

  - Added weighted prompt support contributed by [xraxra](https://github.com/xraxra)
  - Example of using weighted prompts to tweak a demonic figure contributed by [bmaltais](https://github.com/bmaltais)

- v1.05 (22 August 2022 - after the drop)

  - Filenames now use the following formats:
    000010.95183149.png -- Two files produced by the same command (e.g. -n2),
    000010.26742632.png -- distinguished by a different seed.

    000011.455191342.01.png -- Two files produced by the same command using
    000011.455191342.02.png -- a batch size>1 (e.g. -b2). They have the same seed.

    000011.4160627868.grid#1-4.png -- a grid of four images (-g); the whole grid can
    be regenerated with the indicated key

  - It should no longer be possible for one image to overwrite another
  - You can use the "cd" and "pwd" commands at the dream> prompt to set and retrieve
    the path of the output directory.

- v1.04 (22 August 2022 - after the drop)

  - Updated README to reflect installation of the released weights.
  - Suppressed very noisy and inconsequential warning when loading the frozen CLIP
    tokenizer.

- v1.03 (22 August 2022)

  - The original txt2img and img2img scripts from the CompVis repository have been moved into
    a subfolder named "orig_scripts", to reduce confusion.

- v1.02 (21 August 2022)

  - A copy of the prompt and all of its switches and options is now stored in the corresponding
    image in a tEXt metadata field named "Dream". You can read the prompt using scripts/images2prompt.py,
    or an image editor that allows you to explore the full metadata.
    **Please run "conda env update -f environment.yaml" to load the k_lms dependencies!!**

- v1.01 (21 August 2022)

  - added k_lms sampling.
    **Please run "conda env update -f environment.yaml" to load the k_lms dependencies!!**
  - use half precision arithmetic by default, resulting in faster execution and lower memory requirements.
    Pass argument --full_precision to dream.py to get slower but more accurate image generation

## Installation

There are separate installation walkthroughs for [Linux/Mac](#linuxmac) and [Windows](#windows).
@@ -379,38 +469,48 @@ There are separate installation walkthroughs for [Linux/Mac](#linuxmac) and [Windows](#windows).

### Linux/Mac

1. You will need to install the following prerequisites if they are not already available. Use your
   operating system's preferred installer

   - Python (version 3.8.5 recommended; higher may work)
   - git

2. Install the Python Anaconda environment manager using pip3.

```
~$ pip3 install anaconda
```

   After installing anaconda, you should log out of your system and log back in. If the installation
   worked, your command prompt will be prefixed by the name of the current anaconda environment, "(base)".

3. Copy the stable-diffusion source code from GitHub:

```
(base) ~$ git clone https://github.com/lstein/stable-diffusion.git
```

   This will create a stable-diffusion folder where you will follow the rest of the steps.

4. Enter the newly-created stable-diffusion folder. From this step forward make sure that you are working in the stable-diffusion directory!

```
(base) ~$ cd stable-diffusion
(base) ~/stable-diffusion$
```

5. Use anaconda to copy necessary python packages, create a new python environment named "ldm",
   and activate the environment.

```
(base) ~/stable-diffusion$ conda env create -f environment.yaml
(base) ~/stable-diffusion$ conda activate ldm
(ldm) ~/stable-diffusion$
```

   After these steps, your command prompt will be prefixed by "(ldm)" as shown above.

6. Load a couple of small machine-learning models required by stable diffusion:

```
(ldm) ~/stable-diffusion$ python3 scripts/preload_models.py
```
@@ -430,13 +530,14 @@ to a page that prompts you to click the "download" link. Save the file somewhere

   Now run the following commands from within the stable-diffusion directory. This will create a symbolic
   link from the stable-diffusion model.ckpt file, to the true location of the sd-v1-4.ckpt file.

```
(ldm) ~/stable-diffusion$ mkdir -p models/ldm/stable-diffusion-v1
(ldm) ~/stable-diffusion$ ln -sf /path/to/sd-v1-4.ckpt models/ldm/stable-diffusion-v1/model.ckpt
```

8. Start generating images!

```
# for the pre-release weights use the -l or --laion400m switch
(ldm) ~/stable-diffusion$ python3 scripts/dream.py -l

@@ -447,15 +548,18 @@ link from the stable-diffusion model.ckpt file, to the true location of the sd-v1-4.ckpt file.

# for additional configuration switches and arguments, use -h or --help
(ldm) ~/stable-diffusion$ python3 scripts/dream.py -h
```

9. Subsequently, to relaunch the script, be sure to run "conda activate ldm" (step 5, second command), enter the "stable-diffusion"
   directory, and then launch the dream script (step 8). If you forget to activate the ldm environment, the script will fail with multiple ModuleNotFound errors.
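
   Putting those steps together, a typical relaunch looks something like this:

```
(base) ~$ cd stable-diffusion
(base) ~/stable-diffusion$ conda activate ldm
(ldm) ~/stable-diffusion$ python3 scripts/dream.py
```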

#### Updating to newer versions of the script

This distribution is changing rapidly. If you used the "git clone" method (step 5) to download the stable-diffusion directory, then to update to the latest and greatest version, launch the Anaconda window, enter "stable-diffusion", and type:

```
(ldm) ~/stable-diffusion$ git pull
```

This will bring your local copy into sync with the remote one.

### Windows
@@ -467,24 +571,30 @@ This will bring your local copy into sync with the remote one.

3. Launch Anaconda from the Windows Start menu. This will bring up a command window. Type all the remaining commands in this window.

4. Run the command:

```
git clone https://github.com/lstein/stable-diffusion.git
```

   This will create a stable-diffusion folder where you will follow the rest of the steps.

5. Enter the newly-created stable-diffusion folder. From this step forward make sure that you are working in the stable-diffusion directory!

```
cd stable-diffusion
```

6. Run the following two commands:

```
conda env create -f environment.yaml    (step 6a)
conda activate ldm                      (step 6b)
```

   This will install all python requirements and activate the "ldm" environment which sets PATH and other environment variables properly.

7. Run the command:

```
python scripts\preload_models.py
```

@@ -497,30 +607,32 @@ downloaded just-in-time)

8. Now you need to install the weights for the big stable diffusion model.

   For running with the released weights, you will first need to set up
   an account with Hugging Face (https://huggingface.co). Use your
   credentials to log in, and then point your browser at
   https://huggingface.co/CompVis/stable-diffusion-v-1-4-original. You
   may be asked to sign a license agreement at this point.

   Click on "Files and versions" near the top of the page, and then click
   on the file named "sd-v1-4.ckpt". You'll be taken to a page that
   prompts you to click the "download" link. Now save the file somewhere
   safe on your local machine. The weight file is >4 GB in size, so
   downloading may take a while.

   Now run the following commands from **within the stable-diffusion
   directory** to copy the weights file to the right place:

```
mkdir -p models\ldm\stable-diffusion-v1
copy C:\path\to\sd-v1-4.ckpt models\ldm\stable-diffusion-v1\model.ckpt
```

   Please replace "C:\path\to\sd-v1-4.ckpt" with the correct path to wherever
   you stashed this file. If you prefer not to copy or move the .ckpt file,
   you may instead create a shortcut to it from within
   "models\ldm\stable-diffusion-v1\".

9. Start generating images!

```
# for the pre-release weights
python scripts\dream.py -l

@@ -528,14 +640,17 @@ python scripts\dream.py -l

# for the post-release weights
python scripts\dream.py
```

10. Subsequently, to relaunch the script, first activate the Anaconda command window (step 3), enter the stable-diffusion directory (step 5, "cd \path\to\stable-diffusion"), run "conda activate ldm" (step 6b), and then launch the dream script (step 9).
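
    Putting those steps together, a typical relaunch from the Anaconda window looks something like this (replace the path with wherever you cloned the repository):

```
cd \path\to\stable-diffusion
conda activate ldm
python scripts\dream.py
```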

#### Updating to newer versions of the script

This distribution is changing rapidly. If you used the "git clone" method (step 4) to download the stable-diffusion directory, then to update to the latest and greatest version, launch the Anaconda window, enter "stable-diffusion", and type:

```
git pull
```

This will bring your local copy into sync with the remote one.

## Simplified API for text to image generation

@@ -544,22 +659,21 @@ For programmers who wish to incorporate stable-diffusion into other
products, this repository includes a simplified API for text to image generation, which
lets you create images from a prompt in just three lines of code:

```
from ldm.simplet2i import T2I
model = T2I()
outputs = model.txt2img("a unicorn in manhattan")
```

Outputs is a list of lists in the format [[filename1,seed1],[filename2,seed2]...]
Please see ldm/simplet2i.py for more information.

## Workaround for machines with limited internet connectivity

My development machine is a GPU node in a high-performance compute
cluster which has no connection to the internet. During model
initialization, stable-diffusion tries to download the Bert tokenizer
and a file needed by the kornia library. This obviously didn't work
for me.

To work around this, I have modified ldm/modules/encoders/modules.py
@@ -570,7 +684,7 @@ prior to running the code on an isolated one. This assumes that both
machines share a common network-mounted filesystem with a common
.cache directory.

```
(ldm) ~/stable-diffusion$ python3 ./scripts/preload_models.py
preloading bert tokenizer...
Downloading: 100%|██████████████████████████████████| 28.0/28.0 [00:00<00:00, 49.3kB/s]
@@ -582,7 +696,7 @@ preloading kornia requirements...
Downloading: "https://github.com/DagnyT/hardnet/raw/master/pretrained/train_liberty_with_aug/checkpoint_liberty_with_aug.pth" to /u/lstein/.cache/torch/hub/checkpoints/checkpoint_liberty_with_aug.pth
100%|███████████████████████████████████████████████| 5.10M/5.10M [00:00<00:00, 101MB/s]
...success
```

If you don't need this change and want to download the files just in
time, copy over the file ldm/modules/encoders/modules.py from the
@@ -595,16 +709,15 @@ For support,
please use this repository's GitHub Issues tracking service. Feel free
to send me an email if you use and like the script.

_Original Author:_ Lincoln D. Stein <lincoln.stein@gmail.com>

_Contributions by:_
[Peter Kowalczyk](https://github.com/slix), [Henry Harrison](https://github.com/hwharrison),
[xraxra](https://github.com/xraxra), [bmaltais](https://github.com/bmaltais), [Sean McLellan](https://github.com/Oceanswave),
[nicolai256](https://github.com/nicolai256), [Benjamin Warner](https://github.com/warner-benjamin),
[tildebyte](https://github.com/tildebyte), [yunsaki](https://github.com/yunsaki)
and [Tesseract Cat](https://github.com/TesseractCat)

Original portions of the software are Copyright (c) 2020 Lincoln D. Stein (https://github.com/lstein)

## Further Reading
Stable_Diffusion_AI_Notebook.ipynb (new file, 256 lines)

@@ -0,0 +1,256 @@
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "Stable_Diffusion_AI_Notebook.ipynb",
      "provenance": [],
      "collapsed_sections": [],
      "private_outputs": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    },
    "accelerator": "GPU",
    "gpuClass": "standard"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "source": [
        "# Stable Diffusion AI Notebook\n",
        "\n",
        "<img src=\"https://user-images.githubusercontent.com/60411196/186547976-d9de378a-9de8-4201-9c25-c057a9c59bad.jpeg\" alt=\"stable-diffusion-ai\" width=\"170px\"/> <br>\n",
        "#### Instructions:\n",
        "1. Execute each cell in order to mount a Dream bot and create images from text. <br>\n",
        "2. Once cells 1-8 have run correctly you'll be executing a terminal in cell #9, where you'll need to enter the `pipenv run scripts/dream.py` command to run Dream bot.<br> \n",
        "3. After launching dream bot, you'll see: <br> `Dream > ` in terminal. <br> Insert a command, eg. `Dream > Astronaut floating in a distant galaxy`, or type `-h` for help.\n",
        "4. After completion you'll see your generated images in path `stable-diffusion/outputs/img-samples/`, you can also display images in cell #10.\n",
        "5. To quit Dream bot use `q` command. <br> \n",
        "---\n",
        "<font color=\"red\">Note:</font> It takes some time to load, but after installing all dependencies you can use the bot any time you want while the colab instance is up. <br>\n",
        "<font color=\"red\">Requirements:</font> For this notebook to work you need to have [Stable-Diffusion-v-1-4](https://huggingface.co/CompVis/stable-diffusion-v-1-4-original) stored in your Google Drive, it will be needed in cell #6\n",
        "##### For more details visit Github repository: [lstein/stable-diffusion](https://github.com/lstein/stable-diffusion)\n",
        "---\n"
      ],
      "metadata": {
        "id": "ycYWcsEKc6w7"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "#@title 1. Check current GPU assigned\n",
        "!nvidia-smi -L\n",
        "!nvidia-smi"
      ],
      "metadata": {
        "cellView": "form",
        "id": "a2Z5Qu_o8VtQ"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "form",
        "id": "vbI9ZsQHzjqF"
      },
      "outputs": [],
      "source": [
        "#@title 2. Download stable-diffusion Repository\n",
        "from os.path import exists\n",
        "\n",
        "if exists(\"/content/stable-diffusion/\")==True:\n",
        "  print(\"Already downloaded repo\")\n",
        "else:\n",
        "  !git clone --quiet https://github.com/lstein/stable-diffusion.git # Original repo\n",
        "  %cd stable-diffusion/\n",
        "  !git checkout --quiet tags/release-1.09\n",
        "  "
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "#@title 3. Install Python 3.8 \n",
        "%%capture --no-stderr\n",
        "import gc\n",
        "!apt-get -qq install python3.8\n",
        "gc.collect()"
      ],
      "metadata": {
        "id": "daHlozvwKesj",
        "cellView": "form"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#@title 4. Install dependencies from file in a VirtualEnv\n",
        "#@markdown Be patient, it takes ~ 5 - 7min <br>\n",
        "%%capture --no-stderr\n",
        "#Virtual environment\n",
        "!pip install pipenv -q\n",
        "!pip install colab-xterm\n",
        "%load_ext colabxterm\n",
        "!pipenv --python 3.8\n",
        "!pipenv install -r requirements.txt --skip-lock\n",
        "gc.collect()\n"
      ],
      "metadata": {
        "cellView": "form",
        "id": "QbXcGXYEFSNB"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#@title 5. Mount google Drive\n",
        "from google.colab import drive\n",
        "drive.mount('/content/drive')"
      ],
      "metadata": {
        "cellView": "form",
        "id": "YEWPV-sF1RDM"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#@title 6. Drive Path to model\n",
        "#@markdown Path should start with /content/drive/path-to-your-file <br>\n",
        "#@markdown <font color=\"red\">Note:</font> Model should be downloaded from https://huggingface.co <br>\n",
        "#@markdown Latest release: [Stable-Diffusion-v-1-4](https://huggingface.co/CompVis/stable-diffusion-v-1-4-original)\n",
        "from os.path import exists\n",
        "\n",
        "model_path = \"\" #@param {type:\"string\"}\n",
        "if exists(model_path)==True:\n",
        "  print(\"✅ Valid directory\")\n",
        "else: \n",
        "  print(\"❌ File doesn't exist\")"
      ],
      "metadata": {
        "cellView": "form",
        "id": "zRTJeZ461WGu"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#@title 7. Symlink to model\n",
        "\n",
        "from os.path import exists\n",
        "import os \n",
        "\n",
        "# Folder creation if it doesn't exist\n",
        "if exists(\"/content/stable-diffusion/models/ldm/stable-diffusion-v1\")==True:\n",
        "  print(\"❗ Dir stable-diffusion-v1 already exists\")\n",
        "else:\n",
        "  %mkdir /content/stable-diffusion/models/ldm/stable-diffusion-v1\n",
        "  print(\"✅ Dir stable-diffusion-v1 created\")\n",
        "\n",
        "# Symbolic link if it doesn't exist\n",
        "if exists(\"/content/stable-diffusion/models/ldm/stable-diffusion-v1/model.ckpt\")==True:\n",
        "  print(\"❗ Symlink already created\")\n",
        "else: \n",
        "  src = model_path\n",
        "  dst = '/content/stable-diffusion/models/ldm/stable-diffusion-v1/model.ckpt'\n",
        "  os.symlink(src, dst) \n",
        "  print(\"✅ Symbolic link created successfully\")"
      ],
      "metadata": {
        "id": "UY-NNz4I8_aG",
        "cellView": "form"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#@title 8. Load small ML models required\n",
        "%%capture --no-stderr\n",
        "!pipenv run scripts/preload_models.py\n",
        "gc.collect()"
      ],
      "metadata": {
        "cellView": "form",
        "id": "ChIDWxLVHGGJ"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#@title 9. Run Terminal and Execute Dream bot\n",
        "#@markdown <font color=\"blue\">Steps:</font> <br>\n",
        "#@markdown 1. Execute command `pipenv run scripts/dream.py` to run dream bot.<br>\n",
        "#@markdown 2. After initialized you'll see `Dream>` line.<br>\n",
        "#@markdown 3. Example text: `Astronaut floating in a distant galaxy` <br>\n",
        "#@markdown 4. To quit Dream bot use: `q` command.<br>\n",
        "\n",
        "#Run from virtual env\n",
        "\n",
        "%xterm\n",
        "gc.collect()"
      ],
      "metadata": {
        "id": "ir4hCrMIuUpl",
        "cellView": "form"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#@title 10. Show generated images\n",
        "\n",
        "import glob\n",
        "import matplotlib.pyplot as plt\n",
        "import matplotlib.image as mpimg\n",
        "%matplotlib inline\n",
        "\n",
        "images = []\n",
        "for img_path in glob.glob('/content/stable-diffusion/outputs/img-samples/*.png'):\n",
        "  images.append(mpimg.imread(img_path))\n",
        "\n",
        "# Remove ticks and labels on x-axis and y-axis both\n",
        "\n",
        "plt.figure(figsize=(20,10))\n",
        "\n",
        "columns = 5\n",
        "for i, image in enumerate(images):\n",
        "  ax = plt.subplot(len(images) / columns + 1, columns, i + 1)\n",
        "  ax.axes.xaxis.set_visible(False)\n",
        "  ax.axes.yaxis.set_visible(False)\n",
        "  ax.axis('off')\n",
        "  plt.imshow(image)\n",
        "  gc.collect()\n",
        "\n"
      ],
      "metadata": {
        "cellView": "form",
        "id": "qnLohSHmKoGk"
      },
      "execution_count": null,
      "outputs": []
    }
  ]
}
@ -14,6 +14,8 @@ from math import sqrt, floor, ceil
|
||||
from PIL import Image, PngImagePlugin
|
||||
|
||||
# -------------------image generation utils-----
|
||||
|
||||
|
||||
class PngWriter:
|
||||
def __init__(self, outdir, prompt=None, batch_size=1):
|
||||
self.outdir = outdir
|
||||
@ -23,18 +25,19 @@ class PngWriter:
|
||||
self.files_written = []
|
||||
os.makedirs(outdir, exist_ok=True)
|
||||
|
||||
def write_image(self, image, seed):
|
||||
def write_image(self, image, seed, upscaled=False):
|
||||
self.filepath = self.unique_filename(
|
||||
seed, self.filepath
|
||||
seed, upscaled, self.filepath
|
||||
) # will increment name in some sensible way
|
||||
try:
|
||||
prompt = f'{self.prompt} -S{seed}'
|
||||
self.save_image_and_prompt_to_png(image, prompt, self.filepath)
|
||||
except IOError as e:
|
||||
print(e)
|
||||
self.files_written.append([self.filepath, seed])
|
||||
if not upscaled:
|
||||
self.files_written.append([self.filepath, seed])
|
||||
|
||||
def unique_filename(self, seed, previouspath=None):
|
||||
def unique_filename(self, seed, upscaled=False, previouspath=None):
|
||||
revision = 1
|
||||
|
||||
if previouspath is None:
|
||||
@ -57,7 +60,7 @@ class PngWriter:
|
||||
basename = os.path.basename(previouspath)
|
||||
x = re.match('^(\d+)\..*\.png', basename)
|
||||
if not x:
|
||||
return self.unique_filename(seed, previouspath)
|
||||
return self.unique_filename(seed, upscaled, previouspath)
|
||||
|
||||
basecount = int(x.groups()[0])
|
||||
series = 0
|
||||
@ -65,13 +68,12 @@ class PngWriter:
|
||||
while not finished:
|
||||
series += 1
|
||||
filename = f'{basecount:06}.{seed}.png'
|
||||
if self.batch_size > 1 or os.path.exists(
|
||||
os.path.join(self.outdir, filename)
|
||||
):
|
||||
path = os.path.join(self.outdir, filename)
|
||||
if self.batch_size > 1 or os.path.exists(path):
|
||||
if upscaled:
|
||||
break
|
||||
filename = f'{basecount:06}.{seed}.{series:02}.png'
|
||||
finished = not os.path.exists(
|
||||
os.path.join(self.outdir, filename)
|
||||
)
|
||||
finished = not os.path.exists(path)
|
||||
return os.path.join(self.outdir, filename)
|
||||
|
||||
def save_image_and_prompt_to_png(self, image, prompt, path):
|
||||
@ -84,14 +86,17 @@ class PngWriter:
|
||||
if None in (rows, cols):
|
||||
rows = floor(sqrt(image_cnt)) # try to make it square
|
||||
cols = ceil(image_cnt / rows)
|
||||
width = image_list[0].width
|
||||
width = image_list[0].width
|
||||
height = image_list[0].height
|
||||
|
||||
grid_img = Image.new('RGB', (width * cols, height * rows))
|
||||
i = 0
|
||||
for r in range(0, rows):
|
||||
for c in range(0, cols):
|
||||
i = r * rows + c
|
||||
if i>=len(image_list):
|
||||
break
|
||||
grid_img.paste(image_list[i], (c * width, r * height))
|
||||
i = i + 1
|
||||
|
||||
return grid_img
|
||||
|
||||
@ -101,6 +106,8 @@ class PromptFormatter:
|
||||
self.t2i = t2i
|
||||
self.opt = opt
|
||||
|
||||
# note: the t2i object should provide all these values.
|
||||
# there should be no need to or against opt values
|
||||
def normalize_prompt(self):
|
||||
"""Normalize the prompt and switches"""
|
||||
t2i = self.t2i
|
||||
@ -113,13 +120,15 @@ class PromptFormatter:
|
||||
switches.append(f'-W{opt.width or t2i.width}')
|
||||
switches.append(f'-H{opt.height or t2i.height}')
|
||||
switches.append(f'-C{opt.cfg_scale or t2i.cfg_scale}')
|
||||
switches.append(f'-m{t2i.sampler_name}')
|
||||
switches.append(f'-A{opt.sampler_name or t2i.sampler_name}')
|
||||
if opt.init_img:
|
||||
switches.append(f'-I{opt.init_img}')
|
||||
if opt.strength and opt.init_img is not None:
|
||||
switches.append(f'-f{opt.strength or t2i.strength}')
|
||||
if opt.gfpgan_strength:
|
||||
switches.append(f'-G{opt.gfpgan_strength}')
|
||||
if opt.upscale:
|
||||
switches.append(f'-U {" ".join([str(u) for u in opt.upscale])}')
|
||||
if t2i.full_precision:
|
||||
switches.append('-F')
|
||||
return ' '.join(switches)
|
||||
|
149
ldm/dream/server.py
Normal file
149
ldm/dream/server.py
Normal file
@ -0,0 +1,149 @@
|
||||
import json
|
||||
import base64
|
||||
import mimetypes
|
||||
import os
|
||||
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
|
||||
from ldm.dream.pngwriter import PngWriter
|
||||
|
||||
class DreamServer(BaseHTTPRequestHandler):
|
||||
model = None
|
||||
|
||||
def do_GET(self):
|
||||
if self.path == "/":
|
||||
self.send_response(200)
|
||||
self.send_header("Content-type", "text/html")
|
||||
self.end_headers()
|
||||
with open("./static/dream_web/index.html", "rb") as content:
|
||||
self.wfile.write(content.read())
|
||||
else:
|
||||
path = "." + self.path
|
||||
cwd = os.getcwd()
|
||||
is_in_cwd = os.path.commonprefix((os.path.realpath(path), cwd)) == cwd
|
||||
if not (is_in_cwd and os.path.exists(path)):
|
||||
self.send_response(404)
|
||||
return
|
||||
mime_type = mimetypes.guess_type(path)[0]
|
||||
if mime_type is not None:
|
||||
self.send_response(200)
|
||||
self.send_header("Content-type", mime_type)
|
||||
self.end_headers()
|
||||
with open("." + self.path, "rb") as content:
|
||||
self.wfile.write(content.read())
|
||||
else:
|
||||
self.send_response(404)
|
||||
|
||||
def do_POST(self):
|
||||
self.send_response(200)
|
||||
self.send_header("Content-type", "application/json")
|
||||
self.end_headers()
|
||||
|
||||
content_length = int(self.headers['Content-Length'])
|
||||
post_data = json.loads(self.rfile.read(content_length))
|
||||
prompt = post_data['prompt']
|
||||
initimg = post_data['initimg']
|
||||
iterations = int(post_data['iterations'])
|
||||
steps = int(post_data['steps'])
|
||||
width = int(post_data['width'])
|
||||
height = int(post_data['height'])
|
||||
cfgscale = float(post_data['cfgscale'])
|
||||
gfpgan_strength = float(post_data['gfpgan_strength'])
|
||||
upscale_level = post_data['upscale_level']
|
||||
upscale_strength = post_data['upscale_strength']
|
||||
upscale = [int(upscale_level),float(upscale_strength)] if upscale_level != '' else None
|
||||
seed = None if int(post_data['seed']) == -1 else int(post_data['seed'])
|
||||
|
||||
print(f"Request to generate with prompt: {prompt}")
|
||||
# In order to handle upscaled images, the PngWriter needs to maintain state
|
||||
# across images generated by each call to prompt2img(), so we define it in
|
||||
# the outer scope of image_done()
|
||||
config = post_data.copy() # Shallow copy
|
||||
config['initimg'] = ''
|
||||
|
||||
images_generated = 0 # helps keep track of when upscaling is started
|
||||
images_upscaled = 0 # helps keep track of when upscaling is completed
|
||||
pngwriter = PngWriter(
|
||||
"./outputs/img-samples/", config['prompt'], 1
|
||||
)
|
||||
|
||||
# if upscaling is requested, then this will be called twice, once when
|
||||
# the images are first generated, and then again when after upscaling
|
||||
# is complete. The upscaling replaces the original file, so the second
|
||||
# entry should not be inserted into the image list.
|
||||
def image_done(image, seed, upscaled=False):
|
||||
pngwriter.write_image(image, seed, upscaled)
|
||||
|
||||
# Append post_data to log, but only once!
|
||||
if not upscaled:
|
||||
current_image = pngwriter.files_written[-1]
|
||||
with open("./outputs/img-samples/dream_web_log.txt", "a") as log:
|
||||
log.write(f"{current_image[0]}: {json.dumps(config)}\n")
|
||||
self.wfile.write(bytes(json.dumps(
|
||||
{'event':'result', 'files':current_image, 'config':config}
|
||||
) + '\n',"utf-8"))
|
||||
|
||||
# control state of the "postprocessing..." message
|
||||
upscaling_requested = upscale or gfpgan_strength>0
|
||||
nonlocal images_generated # NB: Is this bad python style? It is typical usage in a perl closure.
|
||||
nonlocal images_upscaled # NB: Is this bad python style? It is typical usage in a perl closure.
|
||||
if upscaled:
|
||||
images_upscaled += 1
|
||||
else:
|
||||
images_generated +=1
|
||||
if upscaling_requested:
|
||||
action = None
|
||||
if images_generated >= iterations:
|
||||
if images_upscaled < iterations:
|
||||
action = 'upscaling-started'
|
||||
else:
|
||||
action = 'upscaling-done'
|
||||
if action:
|
||||
x = images_upscaled+1
|
||||
self.wfile.write(bytes(json.dumps(
|
||||
{'event':action,'processed_file_cnt':f'{x}/{iterations}'}
|
||||
) + '\n',"utf-8"))
|
||||
|
||||
def image_progress(image, step):
|
||||
self.wfile.write(bytes(json.dumps(
|
||||
{'event':'step', 'step':step}
|
||||
) + '\n',"utf-8"))
|
||||
|
||||
if initimg is None:
|
||||
# Run txt2img
|
||||
self.model.prompt2image(prompt,
|
||||
iterations=iterations,
|
||||
cfg_scale = cfgscale,
|
||||
width = width,
|
||||
height = height,
|
||||
seed = seed,
|
||||
steps = steps,
|
||||
gfpgan_strength = gfpgan_strength,
|
||||
upscale = upscale,
|
||||
step_callback=image_progress,
|
||||
image_callback=image_done)
|
||||
else:
|
||||
# Decode initimg as base64 to temp file
|
||||
with open("./img2img-tmp.png", "wb") as f:
|
||||
initimg = initimg.split(",")[1] # Ignore mime type
|
||||
f.write(base64.b64decode(initimg))
|
||||
|
||||
# Run img2img
|
||||
self.model.prompt2image(prompt,
|
||||
init_img = "./img2img-tmp.png",
|
||||
iterations = iterations,
|
||||
cfg_scale = cfgscale,
|
||||
seed = seed,
|
||||
steps = steps,
|
||||
gfpgan_strength=gfpgan_strength,
|
||||
upscale = upscale,
|
||||
step_callback=image_progress,
|
||||
image_callback=image_done)
|
||||
|
||||
# Remove the temp file
|
||||
os.remove("./img2img-tmp.png")
|
||||
|
||||
print(f"Prompt generated!")
|
||||
|
||||
|
||||
class ThreadingDreamServer(ThreadingHTTPServer):
|
||||
def __init__(self, server_address):
|
||||
super(ThreadingDreamServer, self).__init__(server_address, DreamServer)
|
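For reference, a minimal client for the streaming endpoint implemented by `do_POST` above could look like the sketch below. The port (9090) and the JSON field names come from the handler and the server startup code in this commit; the use of the `requests` package and the concrete field values are assumptions made for illustration.

```
import json
import requests  # assumed to be installed; any HTTP client would do

payload = {
    'prompt': 'a castle on a hill',
    'initimg': None,           # or a base64 data URL to trigger img2img
    'iterations': 1,
    'steps': 50,
    'width': 512,
    'height': 512,
    'cfgscale': 7.5,
    'gfpgan_strength': 0,      # > 0 enables face restoration
    'upscale_level': '',       # '' disables upscaling, '2' or '4' enables it
    'upscale_strength': 0.75,
    'seed': -1,                # -1 asks the server to pick a random seed
}

# The handler streams one JSON object per line: 'step' events during sampling,
# then a 'result' event per finished image (plus optional upscaling events).
with requests.post('http://localhost:9090/', json=payload, stream=True) as resp:
    for line in resp.iter_lines():
        event = json.loads(line)
        if event['event'] == 'step':
            print('step', event['step'])
        elif event['event'] == 'result':
            print('wrote', event['files'])
```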
ldm/gfpgan/gfpgan_tools.py (new file, 165 lines)
@ -0,0 +1,165 @@
|
||||
import torch
|
||||
import warnings
|
||||
import os
|
||||
import sys
|
||||
import numpy as np
|
||||
|
||||
from PIL import Image
|
||||
from scripts.dream import create_argv_parser
|
||||
|
||||
arg_parser = create_argv_parser()
|
||||
opt = arg_parser.parse_args()
|
||||
|
||||
|
||||
def _run_gfpgan(image, strength, prompt, seed, upsampler_scale=4):
|
||||
print(f'\n* GFPGAN - Restoring Faces: {prompt} : seed:{seed}')
|
||||
with warnings.catch_warnings():
|
||||
warnings.filterwarnings('ignore', category=DeprecationWarning)
|
||||
warnings.filterwarnings('ignore', category=UserWarning)
|
||||
|
||||
try:
|
||||
model_path = os.path.join(opt.gfpgan_dir, opt.gfpgan_model_path)
|
||||
if not os.path.isfile(model_path):
|
||||
raise Exception('GFPGAN model not found at path ' + model_path)
|
||||
|
||||
sys.path.append(os.path.abspath(opt.gfpgan_dir))
|
||||
from gfpgan import GFPGANer
|
||||
|
||||
bg_upsampler = _load_gfpgan_bg_upsampler(
|
||||
opt.gfpgan_bg_upsampler, upsampler_scale, opt.gfpgan_bg_tile
|
||||
)
|
||||
|
||||
gfpgan = GFPGANer(
|
||||
model_path=model_path,
|
||||
upscale=upsampler_scale,
|
||||
arch='clean',
|
||||
channel_multiplier=2,
|
||||
bg_upsampler=bg_upsampler,
|
||||
)
|
||||
except Exception:
|
||||
import traceback
|
||||
|
||||
print('Error loading GFPGAN:', file=sys.stderr)
|
||||
print(traceback.format_exc(), file=sys.stderr)
|
||||
|
||||
if gfpgan is None:
|
||||
print(
|
||||
f'GFPGAN not initialized, it must be loaded via the --gfpgan argument'
|
||||
)
|
||||
return image
|
||||
|
||||
image = image.convert('RGB')
|
||||
|
||||
cropped_faces, restored_faces, restored_img = gfpgan.enhance(
|
||||
np.array(image, dtype=np.uint8),
|
||||
has_aligned=False,
|
||||
only_center_face=False,
|
||||
paste_back=True,
|
||||
)
|
||||
res = Image.fromarray(restored_img)
|
||||
|
||||
if strength < 1.0:
|
||||
# Resize the original image to match the restored output if the sizes differ
if res.size != image.size:
|
||||
image = image.resize(res.size)
|
||||
res = Image.blend(image, res, strength)
|
||||
|
||||
if torch.cuda.is_available():
|
||||
torch.cuda.empty_cache()
|
||||
gfpgan = None
|
||||
|
||||
return res
|
||||
|
||||
|
||||
def _load_gfpgan_bg_upsampler(bg_upsampler, upsampler_scale, bg_tile=400):
|
||||
if bg_upsampler == 'realesrgan':
|
||||
if not torch.cuda.is_available(): # CPU
|
||||
warnings.warn(
|
||||
'The unoptimized RealESRGAN is slow on CPU. We do not use it. '
|
||||
'If you really want to use it, please modify the corresponding codes.'
|
||||
)
|
||||
bg_upsampler = None
|
||||
else:
|
||||
model_path = {
|
||||
2: 'https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth',
|
||||
4: 'https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth',
|
||||
}
|
||||
|
||||
if upsampler_scale not in model_path:
|
||||
return None
|
||||
|
||||
from basicsr.archs.rrdbnet_arch import RRDBNet
|
||||
from realesrgan import RealESRGANer
|
||||
|
||||
if upsampler_scale == 4:
|
||||
model = RRDBNet(
|
||||
num_in_ch=3,
|
||||
num_out_ch=3,
|
||||
num_feat=64,
|
||||
num_block=23,
|
||||
num_grow_ch=32,
|
||||
scale=4,
|
||||
)
|
||||
if upsampler_scale == 2:
|
||||
model = RRDBNet(
|
||||
num_in_ch=3,
|
||||
num_out_ch=3,
|
||||
num_feat=64,
|
||||
num_block=23,
|
||||
num_grow_ch=32,
|
||||
scale=2,
|
||||
)
|
||||
|
||||
bg_upsampler = RealESRGANer(
|
||||
scale=upsampler_scale,
|
||||
model_path=model_path[upsampler_scale],
|
||||
model=model,
|
||||
tile=bg_tile,
|
||||
tile_pad=10,
|
||||
pre_pad=0,
|
||||
half=True,
|
||||
) # need to set False in CPU mode
|
||||
else:
|
||||
bg_upsampler = None
|
||||
|
||||
return bg_upsampler
|
||||
|
||||
|
||||
def real_esrgan_upscale(image, strength, upsampler_scale, prompt, seed):
|
||||
print(
|
||||
f'\n* Real-ESRGAN Upscaling: {prompt} : seed:{seed} : scale:{upsampler_scale}x'
|
||||
)
|
||||
|
||||
with warnings.catch_warnings():
|
||||
warnings.filterwarnings('ignore', category=DeprecationWarning)
|
||||
warnings.filterwarnings('ignore', category=UserWarning)
|
||||
|
||||
try:
|
||||
upsampler = _load_gfpgan_bg_upsampler(
|
||||
opt.gfpgan_bg_upsampler, upsampler_scale, opt.gfpgan_bg_tile
|
||||
)
|
||||
except Exception:
|
||||
import traceback
|
||||
|
||||
print('Error loading Real-ESRGAN:', file=sys.stderr)
|
||||
print(traceback.format_exc(), file=sys.stderr)
|
||||
|
||||
output, img_mode = upsampler.enhance(
|
||||
np.array(image, dtype=np.uint8),
|
||||
outscale=upsampler_scale,
|
||||
alpha_upsampler=opt.gfpgan_bg_upsampler,
|
||||
)
|
||||
|
||||
res = Image.fromarray(output)
|
||||
|
||||
if strength < 1.0:
|
||||
# Resize the original image to match the upscaled output if the sizes differ
if res.size != image.size:
|
||||
image = image.resize(res.size)
|
||||
res = Image.blend(image, res, strength)
|
||||
|
||||
if torch.cuda.is_available():
|
||||
torch.cuda.empty_cache()
|
||||
upsampler = None
|
||||
|
||||
return res
|
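As a rough sketch of how the two helpers in this file fit together (the input path, prompt, and seed are placeholders; both calls use the signatures defined above, and `upsampler_scale=1` mirrors how simplet2i.py invokes face restoration):

```
from PIL import Image

# Illustrative only: restore faces, then upscale, on a previously generated image.
img = Image.open('outputs/img-samples/000001.123456789.png')  # placeholder path

# Blend 80% of the GFPGAN face restoration into the original, without resizing.
img = _run_gfpgan(img, strength=0.8, prompt='portrait of a knight',
                  seed=123456789, upsampler_scale=1)

# Upscale 2x with Real-ESRGAN, blending 75% of the upscaled result.
img = real_esrgan_upscale(img, strength=0.75, upsampler_scale=2,
                          prompt='portrait of a knight', seed=123456789)
img.save('restored-and-upscaled.png')
```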
@ -61,6 +61,9 @@ class KSampler(object):
|
||||
# this has to come in the same format as the conditioning, # e.g. as encoded tokens, ...
|
||||
**kwargs,
|
||||
):
|
||||
def route_callback(k_callback_values):
|
||||
if img_callback is not None:
|
||||
img_callback(k_callback_values['x'], k_callback_values['i'])
|
||||
|
||||
sigmas = self.model.get_sigmas(S)
|
||||
if x_T:
|
||||
@ -78,7 +81,8 @@ class KSampler(object):
|
||||
}
|
||||
return (
|
||||
K.sampling.__dict__[f'sample_{self.schedule}'](
|
||||
model_wrap_cfg, x, sigmas, extra_args=extra_args
|
||||
model_wrap_cfg, x, sigmas, extra_args=extra_args,
|
||||
callback=route_callback
|
||||
),
|
||||
None,
|
||||
)
|
||||
|
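The k-diffusion samplers call back with a dict of values; `route_callback` above unpacks it into the `(image, step)` shape used by the rest of the code. A callback passed in as `img_callback` might therefore look like this sketch (the function name is illustrative):

```
# Hypothetical img_callback handed to KSampler.sample(); route_callback above
# forwards the current latent tensor ('x') and the step index ('i') to it.
def img_callback(latents, step):
    print(f'step {step}: latent tensor of shape {tuple(latents.shape)}')
```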
@ -215,11 +215,13 @@ class EmbeddingManager(nn.Module):
|
||||
ckpt_path,
|
||||
)
|
||||
|
||||
def load(self, ckpt_path):
|
||||
def load(self, ckpt_path, full=True):
|
||||
ckpt = torch.load(ckpt_path, map_location='cpu')
|
||||
|
||||
self.string_to_token_dict = ckpt['string_to_token']
|
||||
self.string_to_param_dict = ckpt['string_to_param']
|
||||
self.string_to_token_dict = ckpt["string_to_token"]
|
||||
self.string_to_param_dict = ckpt["string_to_param"]
|
||||
if not full:
|
||||
for key, value in self.string_to_param_dict.items():
|
||||
self.string_to_param_dict[key] = torch.nn.Parameter(value.half())
|
||||
|
||||
def get_embedding_norms_squared(self):
|
||||
all_params = torch.cat(
|
||||
|
ldm/simplet2i.py
@ -7,7 +7,6 @@
|
||||
import torch
|
||||
import numpy as np
|
||||
import random
|
||||
import sys
|
||||
import os
|
||||
from omegaconf import OmegaConf
|
||||
from PIL import Image
|
||||
@ -21,7 +20,7 @@ from contextlib import contextmanager, nullcontext
|
||||
import transformers
|
||||
import time
|
||||
import re
|
||||
import traceback
|
||||
import sys
|
||||
|
||||
from ldm.util import instantiate_from_config
|
||||
from ldm.models.diffusion.ddim import DDIMSampler
|
||||
@ -123,6 +122,7 @@ class T2I:
|
||||
cfg_scale=7.5,
|
||||
weights='models/ldm/stable-diffusion-v1/model.ckpt',
|
||||
config='configs/stable-diffusion/v1-inference.yaml',
|
||||
grid=False,
|
||||
width=512,
|
||||
height=512,
|
||||
sampler_name='klms',
|
||||
@ -133,9 +133,9 @@ class T2I:
|
||||
full_precision=False,
|
||||
strength=0.75, # default in scripts/img2img.py
|
||||
embedding_path=None,
|
||||
latent_diffusion_weights=False, # just to keep track of this parameter when regenerating prompt
|
||||
# just to keep track of this parameter when regenerating prompt
|
||||
latent_diffusion_weights=False,
|
||||
device='cuda',
|
||||
gfpgan=None,
|
||||
):
|
||||
self.batch_size = batch_size
|
||||
self.iterations = iterations
|
||||
@ -148,6 +148,7 @@ class T2I:
|
||||
self.sampler_name = sampler_name
|
||||
self.latent_channels = latent_channels
|
||||
self.downsampling_factor = downsampling_factor
|
||||
self.grid = grid
|
||||
self.ddim_eta = ddim_eta
|
||||
self.precision = precision
|
||||
self.full_precision = full_precision
|
||||
@ -157,7 +158,8 @@ class T2I:
|
||||
self.sampler = None
|
||||
self.latent_diffusion_weights = latent_diffusion_weights
|
||||
self.device = device
|
||||
self.gfpgan = gfpgan
|
||||
|
||||
self.session_peakmem = torch.cuda.max_memory_allocated()
|
||||
if seed is None:
|
||||
self.seed = self._new_seed()
|
||||
else:
|
||||
@ -175,16 +177,15 @@ class T2I:
|
||||
outdir, prompt, kwargs.get('batch_size', self.batch_size)
|
||||
)
|
||||
for r in results:
|
||||
metadata_str = f'prompt2png("{prompt}" {kwargs} seed={r[1]}' # gets written into the PNG
|
||||
pngwriter.write_image(r[0], r[1])
|
||||
return pngwriter.files_written
|
||||
|
||||
def txt2img(self, prompt, **kwargs):
|
||||
outdir = kwargs.get('outdir', 'outputs/img-samples')
|
||||
outdir = kwargs.pop('outdir', 'outputs/img-samples')
|
||||
return self.prompt2png(prompt, outdir, **kwargs)
|
||||
|
||||
def img2img(self, prompt, **kwargs):
|
||||
outdir = kwargs.get('outdir', 'outputs/img-samples')
|
||||
outdir = kwargs.pop('outdir', 'outputs/img-samples')
|
||||
assert (
|
||||
'init_img' in kwargs
|
||||
), 'call to img2img() must include the init_img argument'
|
||||
@ -202,14 +203,18 @@ class T2I:
|
||||
ddim_eta=None,
|
||||
skip_normalize=False,
|
||||
image_callback=None,
|
||||
step_callback=None,
|
||||
# these are specific to txt2img
|
||||
width=None,
|
||||
height=None,
|
||||
# these are specific to img2img
|
||||
init_img=None,
|
||||
strength=None,
|
||||
gfpgan_strength=None,
|
||||
gfpgan_strength=0,
|
||||
save_original=False,
|
||||
upscale=None,
|
||||
variants=None,
|
||||
sampler_name=None,
|
||||
**args,
|
||||
): # eat up additional cruft
|
||||
"""
|
||||
@ -228,9 +233,14 @@ class T2I:
|
||||
gfpgan_strength // strength for GFPGAN. 0.0 preserves image exactly, 1.0 replaces it completely
|
||||
ddim_eta // image randomness (eta=0.0 means the same seed always produces the same image)
|
||||
variants // if >0, the 1st generated image will be passed back to img2img to generate the requested number of variants
|
||||
step_callback // a function or method that will be called each step
|
||||
image_callback // a function or method that will be called each time an image is generated
|
||||
|
||||
To use the callback, define a function of method that receives two arguments, an Image object
|
||||
To use the step callback, define a function that receives two arguments:
|
||||
- Image GPU data
|
||||
- The step number
|
||||
|
||||
To use the image callback, define a function or method that receives two arguments, an Image object
|
||||
and the seed. You can then do whatever you like with the image, including converting it to
|
||||
different formats and manipulating it. For example:
|
||||
|
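The docstring's worked example is elided by this diff hunk; a minimal sketch that follows the two-argument contract just described might be (the `t2i` instance and the file naming are assumptions, not the original docstring's example):

```
def process_image(image, seed):
    # receives a PIL Image and the seed that produced it
    image.save(f'my-output-{seed}.png')

results = t2i.prompt2image(prompt='an astronaut riding a horse',
                           image_callback=process_image)
```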
||||
@ -262,14 +272,19 @@ class T2I:
|
||||
h = int(height / 64) * 64
|
||||
if h != height or w != width:
|
||||
print(
|
||||
f'Height and width must be multiples of 64. Resizing to {h}x{w}'
|
||||
f'Height and width must be multiples of 64. Resizing to {h}x{w}.'
|
||||
)
|
||||
height = h
|
||||
width = w
|
||||
|
||||
scope = autocast if self.precision == 'autocast' else nullcontext
|
||||
|
||||
if sampler_name and (sampler_name != self.sampler_name):
|
||||
self.sampler_name = sampler_name
|
||||
self._set_sampler()
|
||||
|
||||
tic = time.time()
|
||||
torch.cuda.reset_peak_memory_stats()
|
||||
results = list()
|
||||
|
||||
try:
|
||||
@ -285,6 +300,7 @@ class T2I:
|
||||
skip_normalize=skip_normalize,
|
||||
init_img=init_img,
|
||||
strength=strength,
|
||||
callback=step_callback,
|
||||
)
|
||||
else:
|
||||
images_iterator = self._txt2img(
|
||||
@ -297,32 +313,54 @@ class T2I:
|
||||
skip_normalize=skip_normalize,
|
||||
width=width,
|
||||
height=height,
|
||||
callback=step_callback,
|
||||
)
|
||||
|
||||
with scope(self.device.type), self.model.ema_scope():
|
||||
for n in trange(iterations, desc='Sampling'):
|
||||
for n in trange(iterations, desc='Generating'):
|
||||
seed_everything(seed)
|
||||
iter_images = next(images_iterator)
|
||||
for image in iter_images:
|
||||
try:
|
||||
# If gfpgan_strength is None or <= 0.0, don't even attempt to use GFPGAN.
# If the user specified a -G value that satisfies the condition but --gfpgan
# was not given at startup, the net result is that a message gets printed
# and nothing else happens.
|
||||
if gfpgan_strength is not None and gfpgan_strength > 0.0:
|
||||
image = self._run_gfpgan(
|
||||
image, gfpgan_strength
|
||||
)
|
||||
except Exception as e:
|
||||
print(
|
||||
f'Error running GFPGAN - Your image was not enhanced.\n{e}'
|
||||
)
|
||||
results.append([image, seed])
|
||||
if image_callback is not None:
|
||||
image_callback(image, seed)
|
||||
seed = self._new_seed()
|
||||
|
||||
if upscale is not None or gfpgan_strength > 0:
|
||||
for result in results:
|
||||
image, seed = result
|
||||
try:
|
||||
if upscale is not None:
|
||||
from ldm.gfpgan.gfpgan_tools import (
|
||||
real_esrgan_upscale,
|
||||
)
|
||||
if len(upscale) < 2:
|
||||
upscale.append(0.75)
|
||||
image = real_esrgan_upscale(
|
||||
image,
|
||||
upscale[1],
|
||||
int(upscale[0]),
|
||||
prompt,
|
||||
seed,
|
||||
)
|
||||
if gfpgan_strength > 0:
|
||||
from ldm.gfpgan.gfpgan_tools import _run_gfpgan
|
||||
|
||||
image = _run_gfpgan(
|
||||
image, gfpgan_strength, prompt, seed, 1
|
||||
)
|
||||
except Exception as e:
|
||||
print(
|
||||
f'Error running RealESRGAN - Your image was not upscaled.\n{e}'
|
||||
)
|
||||
if image_callback is not None:
|
||||
if save_original:
|
||||
image_callback(image, seed)
|
||||
else:
|
||||
image_callback(image, seed, upscaled=True)
|
||||
else: # no callback passed, so we simply replace old image with rescaled one
|
||||
result[0] = image
|
||||
|
||||
except KeyboardInterrupt:
|
||||
print('*interrupted*')
|
||||
print(
|
||||
@ -333,7 +371,21 @@ class T2I:
|
||||
print('Are you sure your system has an adequate NVIDIA GPU?')
|
||||
|
||||
toc = time.time()
|
||||
print(f'{len(results)} images generated in', '%4.2fs' % (toc - tic))
|
||||
self.session_peakmem = max(
|
||||
self.session_peakmem, torch.cuda.max_memory_allocated()
|
||||
)
|
||||
print('Usage stats:')
|
||||
print(
|
||||
f' {len(results)} image(s) generated in', '%4.2fs' % (toc - tic)
|
||||
)
|
||||
print(
|
||||
f' Max VRAM used for this generation:',
|
||||
'%4.2fG' % (torch.cuda.max_memory_allocated() / 1e9),
|
||||
)
|
||||
print(
|
||||
f' Max VRAM used since script start: ',
|
||||
'%4.2fG' % (self.session_peakmem / 1e9),
|
||||
)
|
||||
return results
|
||||
|
||||
@torch.no_grad()
|
||||
@ -348,6 +400,7 @@ class T2I:
|
||||
skip_normalize,
|
||||
width,
|
||||
height,
|
||||
callback,
|
||||
):
|
||||
"""
|
||||
An infinite iterator of images from the prompt.
|
||||
@ -371,6 +424,7 @@ class T2I:
|
||||
unconditional_guidance_scale=cfg_scale,
|
||||
unconditional_conditioning=uc,
|
||||
eta=ddim_eta,
|
||||
img_callback=callback
|
||||
)
|
||||
yield self._samples_to_images(samples)
|
||||
|
||||
@ -386,6 +440,7 @@ class T2I:
|
||||
skip_normalize,
|
||||
init_img,
|
||||
strength,
|
||||
callback, # Currently not implemented for img2img
|
||||
):
|
||||
"""
|
||||
An infinite iterator of images from the prompt and the initial image
|
||||
@ -492,46 +547,50 @@ class T2I:
|
||||
self.device = self._get_device()
|
||||
model = self._load_model_from_config(config, self.weights)
|
||||
if self.embedding_path is not None:
|
||||
model.embedding_manager.load(self.embedding_path)
|
||||
model.embedding_manager.load(
|
||||
self.embedding_path, self.full_precision
|
||||
)
|
||||
self.model = model.to(self.device)
|
||||
# model.to doesn't change the cond_stage_model.device used to move the tokenizer output, so set it here
|
||||
self.model.cond_stage_model.device = self.device
|
||||
except AttributeError:
|
||||
import traceback
|
||||
print('Error loading model. Only the CUDA backend is supported',file=sys.stderr)
|
||||
print(traceback.format_exc(),file=sys.stderr)
|
||||
raise SystemExit
|
||||
|
||||
msg = f'setting sampler to {self.sampler_name}'
|
||||
if self.sampler_name == 'plms':
|
||||
self.sampler = PLMSSampler(self.model, device=self.device)
|
||||
elif self.sampler_name == 'ddim':
|
||||
self.sampler = DDIMSampler(self.model, device=self.device)
|
||||
elif self.sampler_name == 'k_dpm_2_a':
|
||||
self.sampler = KSampler(
|
||||
self.model, 'dpm_2_ancestral', device=self.device
|
||||
)
|
||||
elif self.sampler_name == 'k_dpm_2':
|
||||
self.sampler = KSampler(
|
||||
self.model, 'dpm_2', device=self.device
|
||||
)
|
||||
elif self.sampler_name == 'k_euler_a':
|
||||
self.sampler = KSampler(
|
||||
self.model, 'euler_ancestral', device=self.device
|
||||
)
|
||||
elif self.sampler_name == 'k_euler':
|
||||
self.sampler = KSampler(
|
||||
self.model, 'euler', device=self.device
|
||||
)
|
||||
elif self.sampler_name == 'k_heun':
|
||||
self.sampler = KSampler(self.model, 'heun', device=self.device)
|
||||
elif self.sampler_name == 'k_lms':
|
||||
self.sampler = KSampler(self.model, 'lms', device=self.device)
|
||||
else:
|
||||
msg = f'unsupported sampler {self.sampler_name}, defaulting to plms'
|
||||
self.sampler = PLMSSampler(self.model, device=self.device)
|
||||
|
||||
print(msg)
|
||||
self._set_sampler()
|
||||
|
||||
return self.model
|
||||
|
||||
def _set_sampler(self):
|
||||
msg = f'>> Setting Sampler to {self.sampler_name}'
|
||||
if self.sampler_name == 'plms':
|
||||
self.sampler = PLMSSampler(self.model, device=self.device)
|
||||
elif self.sampler_name == 'ddim':
|
||||
self.sampler = DDIMSampler(self.model, device=self.device)
|
||||
elif self.sampler_name == 'k_dpm_2_a':
|
||||
self.sampler = KSampler(
|
||||
self.model, 'dpm_2_ancestral', device=self.device
|
||||
)
|
||||
elif self.sampler_name == 'k_dpm_2':
|
||||
self.sampler = KSampler(self.model, 'dpm_2', device=self.device)
|
||||
elif self.sampler_name == 'k_euler_a':
|
||||
self.sampler = KSampler(
|
||||
self.model, 'euler_ancestral', device=self.device
|
||||
)
|
||||
elif self.sampler_name == 'k_euler':
|
||||
self.sampler = KSampler(self.model, 'euler', device=self.device)
|
||||
elif self.sampler_name == 'k_heun':
|
||||
self.sampler = KSampler(self.model, 'heun', device=self.device)
|
||||
elif self.sampler_name == 'k_lms':
|
||||
self.sampler = KSampler(self.model, 'lms', device=self.device)
|
||||
else:
|
||||
msg = f'>> Unsupported Sampler: {self.sampler_name}, Defaulting to plms'
|
||||
self.sampler = PLMSSampler(self.model, device=self.device)
|
||||
|
||||
print(msg)
|
||||
|
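Because `prompt2image()` calls `_set_sampler()` whenever it is passed a `sampler_name` that differs from the current one, the sampler can be switched per generation without reloading the model. A hedged sketch, with constructor arguments trimmed to the ones shown in this commit:

```
t2i = T2I(sampler_name='k_lms')
t2i.load_model()
t2i.prompt2image('a lighthouse at dusk', steps=50)                       # uses k_lms
t2i.prompt2image('a lighthouse at dusk', steps=50, sampler_name='ddim')  # switches to ddim
```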
||||
def _load_model_from_config(self, config, ckpt):
|
||||
print(f'Loading model from {ckpt}')
|
||||
pl_sd = torch.load(ckpt, map_location='cpu')
|
||||
@ -548,13 +607,16 @@ class T2I:
|
||||
)
|
||||
else:
|
||||
print(
|
||||
'Using half precision math. Call with --full_precision to use slower but more accurate full precision.'
|
||||
'Using half precision math. Call with --full_precision to use more accurate but VRAM-intensive full precision.'
|
||||
)
|
||||
model.half()
|
||||
return model
|
||||
|
||||
def _load_img(self, path):
|
||||
image = Image.open(path).convert('RGB')
|
||||
print(f'image path = {path}, cwd = {os.getcwd()}')
|
||||
with Image.open(path) as img:
|
||||
image = img.convert('RGB')
|
||||
|
||||
w, h = image.size
|
||||
print(f'loaded input image of size ({w}, {h}) from {path}')
|
||||
w, h = map(
|
||||
@ -612,28 +674,3 @@ class T2I:
|
||||
weights.append(1.0)
|
||||
remaining = 0
|
||||
return prompts, weights
|
||||
|
||||
def _run_gfpgan(self, image, strength):
|
||||
if self.gfpgan is None:
|
||||
print(
|
||||
f'GFPGAN not initialized, it must be loaded via the --gfpgan argument'
|
||||
)
|
||||
return image
|
||||
|
||||
image = image.convert('RGB')
|
||||
|
||||
cropped_faces, restored_faces, restored_img = self.gfpgan.enhance(
|
||||
np.array(image, dtype=np.uint8),
|
||||
has_aligned=False,
|
||||
only_center_face=False,
|
||||
paste_back=True,
|
||||
)
|
||||
res = Image.fromarray(restored_img)
|
||||
|
||||
if strength < 1.0:
|
||||
# Resize the image to the new image if the sizes have changed
|
||||
if restored_img.size != image.size:
|
||||
image = image.resize(res.size)
|
||||
res = Image.blend(image, res, strength)
|
||||
|
||||
return res
|
||||
|
scripts/dream.py
@ -9,8 +9,7 @@ import copy
|
||||
import warnings
|
||||
import ldm.dream.readline
|
||||
from ldm.dream.pngwriter import PngWriter, PromptFormatter
|
||||
|
||||
debugging = False
|
||||
from ldm.dream.server import DreamServer, ThreadingDreamServer
|
||||
|
||||
|
||||
def main():
|
||||
@ -52,7 +51,9 @@ def main():
|
||||
weights=weights,
|
||||
full_precision=opt.full_precision,
|
||||
config=config,
|
||||
latent_diffusion_weights=opt.laion400m, # this is solely for recreating the prompt
|
||||
grid = opt.grid,
|
||||
# this is solely for recreating the prompt
|
||||
latent_diffusion_weights=opt.laion400m,
|
||||
embedding_path=opt.embedding_path,
|
||||
device=opt.device,
|
||||
)
|
||||
@ -64,80 +65,50 @@ def main():
|
||||
# gets rid of annoying messages about random seed
|
||||
logging.getLogger('pytorch_lightning').setLevel(logging.ERROR)
|
||||
|
||||
# load the infile as a list of lines
|
||||
infile = None
|
||||
try:
|
||||
if opt.infile is not None:
|
||||
infile = open(opt.infile, 'r')
|
||||
except FileNotFoundError as e:
|
||||
print(e)
|
||||
exit(-1)
|
||||
if opt.infile:
|
||||
try:
|
||||
if os.path.isfile(opt.infile):
|
||||
infile = open(opt.infile, 'r', encoding='utf-8')
|
||||
elif opt.infile == '-': # stdin
|
||||
infile = sys.stdin
|
||||
else:
|
||||
raise FileNotFoundError(f'{opt.infile} not found.')
|
||||
except (FileNotFoundError, IOError) as e:
|
||||
print(f'{e}. Aborting.')
|
||||
sys.exit(-1)
|
||||
|
||||
# preload the model
|
||||
t2i.load_model()
|
||||
|
||||
# load GFPGAN if requested
|
||||
if opt.use_gfpgan:
|
||||
print('\n* --gfpgan was specified, loading gfpgan...')
|
||||
with warnings.catch_warnings():
|
||||
warnings.filterwarnings('ignore', category=DeprecationWarning)
|
||||
if not infile:
|
||||
print(
|
||||
"\n* Initialization done! Awaiting your command (-h for help, 'q' to quit)"
|
||||
)
|
||||
|
||||
try:
|
||||
model_path = os.path.join(
|
||||
opt.gfpgan_dir, opt.gfpgan_model_path
|
||||
)
|
||||
if not os.path.isfile(model_path):
|
||||
raise Exception(
|
||||
'GFPGAN model not found at path ' + model_path
|
||||
)
|
||||
|
||||
sys.path.append(os.path.abspath(opt.gfpgan_dir))
|
||||
from gfpgan import GFPGANer
|
||||
|
||||
bg_upsampler = load_gfpgan_bg_upsampler(
|
||||
opt.gfpgan_bg_upsampler, opt.gfpgan_bg_tile
|
||||
)
|
||||
|
||||
t2i.gfpgan = GFPGANer(
|
||||
model_path=model_path,
|
||||
upscale=opt.gfpgan_upscale,
|
||||
arch='clean',
|
||||
channel_multiplier=2,
|
||||
bg_upsampler=bg_upsampler,
|
||||
)
|
||||
except Exception:
|
||||
import traceback
|
||||
|
||||
print('Error loading GFPGAN:', file=sys.stderr)
|
||||
print(traceback.format_exc(), file=sys.stderr)
|
||||
|
||||
print(
|
||||
"\n* Initialization done! Awaiting your command (-h for help, 'q' to quit, 'cd' to change output dir, 'pwd' to print output dir)..."
|
||||
)
|
||||
|
||||
log_path = os.path.join(opt.outdir, 'dream_log.txt')
|
||||
with open(log_path, 'a') as log:
|
||||
cmd_parser = create_cmd_parser()
|
||||
main_loop(t2i, opt.outdir, cmd_parser, log, infile)
|
||||
log.close()
|
||||
if infile:
|
||||
infile.close()
|
||||
cmd_parser = create_cmd_parser()
|
||||
if opt.web:
|
||||
dream_server_loop(t2i)
|
||||
else:
|
||||
main_loop(t2i, opt.outdir, cmd_parser, infile)
|
||||
|
||||
|
||||
def main_loop(t2i, outdir, parser, log, infile):
|
||||
def main_loop(t2i, outdir, parser, infile):
|
||||
"""prompt/read/execute loop"""
|
||||
done = False
|
||||
last_seeds = []
|
||||
|
||||
while not done:
|
||||
try:
|
||||
command = infile.readline() if infile else input('dream> ')
|
||||
command = get_next_command(infile)
|
||||
except EOFError:
|
||||
done = True
|
||||
break
|
||||
|
||||
if infile and len(command) == 0:
|
||||
done = True
|
||||
break
|
||||
# skip empty lines
|
||||
if not command.strip():
|
||||
continue
|
||||
|
||||
if command.startswith(('#', '//')):
|
||||
continue
|
||||
@ -152,25 +123,10 @@ def main_loop(t2i, outdir, parser, log, infile):
|
||||
print(str(e))
|
||||
continue
|
||||
|
||||
if len(elements) == 0:
|
||||
continue
|
||||
|
||||
if elements[0] == 'q':
|
||||
done = True
|
||||
break
|
||||
|
||||
if elements[0] == 'cd' and len(elements) > 1:
|
||||
if os.path.exists(elements[1]):
|
||||
print(f'setting image output directory to {elements[1]}')
|
||||
outdir = elements[1]
|
||||
else:
|
||||
print(f'directory {elements[1]} does not exist')
|
||||
continue
|
||||
|
||||
if elements[0] == 'pwd':
|
||||
print(f'current output directory is {outdir}')
|
||||
continue
|
||||
|
||||
if elements[0].startswith(
|
||||
'!dream'
|
||||
): # in case a stored prompt still contains the !dream command
|
||||
@ -207,18 +163,26 @@ def main_loop(t2i, outdir, parser, log, infile):
|
||||
opt.seed = None
|
||||
|
||||
normalized_prompt = PromptFormatter(t2i, opt).normalize_prompt()
|
||||
individual_images = not opt.grid
|
||||
do_grid = opt.grid or t2i.grid
|
||||
individual_images = not do_grid
|
||||
|
||||
if opt.outdir:
|
||||
if not os.path.exists(opt.outdir):
|
||||
os.makedirs(opt.outdir)
|
||||
current_outdir = opt.outdir
|
||||
else:
|
||||
current_outdir = outdir
|
||||
|
||||
# Here is where the images are actually generated!
|
||||
try:
|
||||
file_writer = PngWriter(outdir, normalized_prompt, opt.batch_size)
|
||||
callback = file_writer.write_image if individual_images else None
|
||||
|
||||
image_list = t2i.prompt2image(image_callback=callback, **vars(opt))
|
||||
file_writer = PngWriter(current_outdir, normalized_prompt, opt.batch_size)
|
||||
callback = file_writer.write_image if individual_images else None
|
||||
image_list = t2i.prompt2image(image_callback=callback, **vars(opt))
|
||||
results = (
|
||||
file_writer.files_written if individual_images else image_list
|
||||
)
|
||||
|
||||
if opt.grid and len(results) > 0:
|
||||
if do_grid and len(results) > 0:
|
||||
grid_img = file_writer.make_grid([r[0] for r in results])
|
||||
filename = file_writer.unique_filename(results[0][1])
|
||||
seeds = [a[1] for a in results]
|
||||
@ -239,94 +203,74 @@ def main_loop(t2i, outdir, parser, log, infile):
|
||||
continue
|
||||
|
||||
print('Outputs:')
|
||||
write_log_message(t2i, normalized_prompt, results, log)
|
||||
log_path = os.path.join(current_outdir, 'dream_log.txt')
|
||||
write_log_message(normalized_prompt, results, log_path)
|
||||
|
||||
print('goodbye!')
|
||||
|
||||
|
||||
def load_gfpgan_bg_upsampler(bg_upsampler, bg_tile=400):
|
||||
import torch
|
||||
|
||||
if bg_upsampler == 'realesrgan':
|
||||
if not torch.cuda.is_available(): # CPU
|
||||
import warnings
|
||||
|
||||
warnings.warn(
|
||||
'The unoptimized RealESRGAN is slow on CPU. We do not use it. '
|
||||
'If you really want to use it, please modify the corresponding codes.'
|
||||
)
|
||||
bg_upsampler = None
|
||||
else:
|
||||
from basicsr.archs.rrdbnet_arch import RRDBNet
|
||||
from realesrgan import RealESRGANer
|
||||
|
||||
model = RRDBNet(
|
||||
num_in_ch=3,
|
||||
num_out_ch=3,
|
||||
num_feat=64,
|
||||
num_block=23,
|
||||
num_grow_ch=32,
|
||||
scale=2,
|
||||
)
|
||||
bg_upsampler = RealESRGANer(
|
||||
scale=2,
|
||||
model_path='https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth',
|
||||
model=model,
|
||||
tile=bg_tile,
|
||||
tile_pad=10,
|
||||
pre_pad=0,
|
||||
half=True,
|
||||
) # need to set False in CPU mode
|
||||
def get_next_command(infile=None) -> str:  # command string
|
||||
if infile is None:
|
||||
command = input('dream> ')
|
||||
else:
|
||||
bg_upsampler = None
|
||||
command = infile.readline()
|
||||
if not command:
|
||||
raise EOFError
|
||||
else:
|
||||
command = command.strip()
|
||||
print(f'#{command}')
|
||||
return command
|
||||
|
||||
return bg_upsampler
|
||||
def dream_server_loop(t2i):
|
||||
print('\n* --web was specified, starting web server...')
|
||||
# Change working directory to the stable-diffusion directory
|
||||
os.chdir(
|
||||
os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
|
||||
)
|
||||
|
||||
# Start server
|
||||
DreamServer.model = t2i
|
||||
dream_server = ThreadingDreamServer(("0.0.0.0", 9090))
|
||||
print("\nStarted Stable Diffusion dream server!")
|
||||
print("Point your browser at http://localhost:9090 or use the host's DNS name or IP address.")
|
||||
|
||||
try:
|
||||
dream_server.serve_forever()
|
||||
except KeyboardInterrupt:
|
||||
pass
|
||||
|
||||
dream_server.server_close()
|
||||
|
||||
|
||||
# variant generation is going to be superseded by a generalized
|
||||
# "prompt-morph" functionality
|
||||
# def generate_variants(t2i,outdir,opt,previous_gens):
|
||||
# variants = []
|
||||
# print(f"Generating {opt.variants} variant(s)...")
|
||||
# newopt = copy.deepcopy(opt)
|
||||
# newopt.iterations = 1
|
||||
# newopt.variants = None
|
||||
# for r in previous_gens:
|
||||
# newopt.init_img = r[0]
|
||||
# prompt = PromptFormatter(t2i,newopt).normalize_prompt()
|
||||
# print(f"] generating variant for {newopt.init_img}")
|
||||
# for j in range(0,opt.variants):
|
||||
# try:
|
||||
# file_writer = PngWriter(outdir,prompt,newopt.batch_size)
|
||||
# callback = file_writer.write_image
|
||||
# t2i.prompt2image(image_callback=callback,**vars(newopt))
|
||||
# results = file_writer.files_written
|
||||
# variants.append([prompt,results])
|
||||
# except AssertionError as e:
|
||||
# print(e)
|
||||
# continue
|
||||
# print(f'{opt.variants} variants generated')
|
||||
# return variants
|
||||
def write_log_message(prompt, results, log_path):
|
||||
"""logs the name of the output image, prompt, and prompt args to the terminal and log file"""
|
||||
log_lines = [f'{r[0]}: {prompt} -S{r[1]}\n' for r in results]
|
||||
print(*log_lines, sep='')
|
||||
|
||||
with open(log_path, 'a', encoding='utf-8') as file:
|
||||
file.writelines(log_lines)
|
||||
|
||||
|
||||
def write_log_message(t2i, prompt, results, logfile):
|
||||
"""logs the name of the output image, its prompt and seed to the terminal, log file, and a Dream text chunk in the PNG metadata"""
|
||||
last_seed = None
|
||||
img_num = 1
|
||||
seenit = {}
|
||||
|
||||
for r in results:
|
||||
seed = r[1]
|
||||
log_message = f'{r[0]}: {prompt} -S{seed}'
|
||||
|
||||
print(log_message)
|
||||
logfile.write(log_message + '\n')
|
||||
logfile.flush()
|
||||
|
||||
SAMPLER_CHOICES=[
|
||||
'ddim',
|
||||
'k_dpm_2_a',
|
||||
'k_dpm_2',
|
||||
'k_euler_a',
|
||||
'k_euler',
|
||||
'k_heun',
|
||||
'k_lms',
|
||||
'plms',
|
||||
]
|
||||
|
||||
def create_argv_parser():
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Parse script's command line args"
|
||||
description="""Generate images using Stable Diffusion.
|
||||
Use --web to launch the web interface.
|
||||
Use --from_file to load prompts from a file path or standard input ("-").
|
||||
Otherwise you will be dropped into an interactive command prompt (type -h for help).
Other command-line arguments are defaults that can usually be overridden
at the command prompt.
|
||||
"""
|
||||
)
|
||||
parser.add_argument(
|
||||
'--laion400m',
|
||||
@ -334,51 +278,50 @@ def create_argv_parser():
|
||||
'-l',
|
||||
dest='laion400m',
|
||||
action='store_true',
|
||||
help='fallback to the latent diffusion (laion400m) weights and config',
|
||||
help='Fallback to the latent diffusion (laion400m) weights and config',
|
||||
)
|
||||
parser.add_argument(
|
||||
'--from_file',
|
||||
dest='infile',
|
||||
type=str,
|
||||
help='if specified, load prompts from this file',
|
||||
help='If specified, load prompts from this file',
|
||||
)
|
||||
parser.add_argument(
|
||||
'-n',
|
||||
'--iterations',
|
||||
type=int,
|
||||
default=1,
|
||||
help='number of images to generate',
|
||||
help='Number of images to generate',
|
||||
)
|
||||
parser.add_argument(
|
||||
'-F',
|
||||
'--full_precision',
|
||||
dest='full_precision',
|
||||
action='store_true',
|
||||
help='use slower full precision math for calculations',
|
||||
help='Use slower full precision math for calculations',
|
||||
)
|
||||
parser.add_argument(
|
||||
'--sampler',
|
||||
'-g',
|
||||
'--grid',
|
||||
action='store_true',
|
||||
help='Generate a grid instead of individual images',
|
||||
)
|
||||
parser.add_argument(
|
||||
'-A',
|
||||
'-m',
|
||||
'--sampler',
|
||||
dest='sampler_name',
|
||||
choices=[
|
||||
'ddim',
|
||||
'k_dpm_2_a',
|
||||
'k_dpm_2',
|
||||
'k_euler_a',
|
||||
'k_euler',
|
||||
'k_heun',
|
||||
'k_lms',
|
||||
'plms',
|
||||
],
|
||||
choices=SAMPLER_CHOICES,
|
||||
metavar='SAMPLER_NAME',
|
||||
default='k_lms',
|
||||
help='which sampler to use (k_lms) - can only be set on command line',
|
||||
help=f'Set the initial sampler. Default: k_lms. Supported samplers: {", ".join(SAMPLER_CHOICES)}',
|
||||
)
|
||||
parser.add_argument(
|
||||
'--outdir',
|
||||
'-o',
|
||||
type=str,
|
||||
default='outputs/img-samples',
|
||||
help='directory in which to place generated images and a log of prompts and seeds (outputs/img-samples',
|
||||
help='Directory to save generated images and a log of prompts and seeds. Default: outputs/img-samples',
|
||||
)
|
||||
parser.add_argument(
|
||||
'--embedding_path',
|
||||
@ -390,44 +333,39 @@ def create_argv_parser():
|
||||
'-d',
|
||||
type=str,
|
||||
default='cuda',
|
||||
help='device to run stable diffusion on. defaults to cuda `torch.cuda.current_device()` if avalible',
|
||||
help='Device to run Stable Diffusion on. Defaults to cuda `torch.cuda.current_device()` if available',
|
||||
)
|
||||
# GFPGAN related args
|
||||
parser.add_argument(
|
||||
'--gfpgan',
|
||||
dest='use_gfpgan',
|
||||
action='store_true',
|
||||
help='load gfpgan for use in the dreambot. Note: Enabling GFPGAN will require more GPU memory',
|
||||
)
|
||||
parser.add_argument(
|
||||
'--gfpgan_upscale',
|
||||
type=int,
|
||||
default=2,
|
||||
help='The final upsampling scale of the image. Default: 2. Only used if --gfpgan is specified',
|
||||
)
|
||||
parser.add_argument(
|
||||
'--gfpgan_bg_upsampler',
|
||||
type=str,
|
||||
default='realesrgan',
|
||||
help='Background upsampler. Default: realesrgan. Options: realesrgan, none. Only used if --gfpgan is specified',
|
||||
|
||||
)
|
||||
parser.add_argument(
|
||||
'--gfpgan_bg_tile',
|
||||
type=int,
|
||||
default=400,
|
||||
help='Tile size for background sampler, 0 for no tile during testing. Default: 400. Only used if --gfpgan is specified',
|
||||
help='Tile size for background sampler, 0 for no tile during testing. Default: 400.',
|
||||
)
|
||||
parser.add_argument(
|
||||
'--gfpgan_model_path',
|
||||
type=str,
|
||||
default='experiments/pretrained_models/GFPGANv1.3.pth',
|
||||
help='indicates the path to the GFPGAN model, relative to --gfpgan_dir. Only used if --gfpgan is specified',
|
||||
help='Indicates the path to the GFPGAN model, relative to --gfpgan_dir.',
|
||||
)
|
||||
parser.add_argument(
|
||||
'--gfpgan_dir',
|
||||
type=str,
|
||||
default='../GFPGAN',
|
||||
help='indicates the directory containing the GFPGAN code. Only used if --gfpgan is specified',
|
||||
help='Indicates the directory containing the GFPGAN code.',
|
||||
)
|
||||
parser.add_argument(
|
||||
'--web',
|
||||
dest='web',
|
||||
action='store_true',
|
||||
help='Start in web server mode.',
|
||||
)
|
||||
return parser
|
||||
|
||||
@ -437,76 +375,108 @@ def create_cmd_parser():
|
||||
description='Example: dream> a fantastic alien landscape -W1024 -H960 -s100 -n12'
|
||||
)
|
||||
parser.add_argument('prompt')
|
||||
parser.add_argument('-s', '--steps', type=int, help='number of steps')
|
||||
parser.add_argument('-s', '--steps', type=int, help='Number of steps')
|
||||
parser.add_argument(
|
||||
'-S',
|
||||
'--seed',
|
||||
type=int,
|
||||
help='image seed; a +ve integer, or use -1 for the previous seed, -2 for the one before that, etc',
|
||||
help='Image seed; a +ve integer, or use -1 for the previous seed, -2 for the one before that, etc',
|
||||
)
|
||||
parser.add_argument(
|
||||
'-n',
|
||||
'--iterations',
|
||||
type=int,
|
||||
default=1,
|
||||
help='number of samplings to perform (slower, but will provide seeds for individual images)',
|
||||
help='Number of samplings to perform (slower, but will provide seeds for individual images)',
|
||||
)
|
||||
parser.add_argument(
|
||||
'-b',
|
||||
'--batch_size',
|
||||
type=int,
|
||||
default=1,
|
||||
help='number of images to produce per sampling (will not provide seeds for individual images!)',
|
||||
help='Number of images to produce per sampling (will not provide seeds for individual images!)',
|
||||
)
|
||||
parser.add_argument(
|
||||
'-W', '--width', type=int, help='image width, multiple of 64'
|
||||
'-W', '--width', type=int, help='Image width, multiple of 64'
|
||||
)
|
||||
parser.add_argument(
|
||||
'-H', '--height', type=int, help='image height, multiple of 64'
|
||||
'-H', '--height', type=int, help='Image height, multiple of 64'
|
||||
)
|
||||
parser.add_argument(
|
||||
'-C',
|
||||
'--cfg_scale',
|
||||
default=7.5,
|
||||
type=float,
|
||||
help='prompt configuration scale',
|
||||
help='Prompt configuration scale',
|
||||
)
|
||||
parser.add_argument(
|
||||
'-g', '--grid', action='store_true', help='generate a grid'
|
||||
)
|
||||
parser.add_argument(
|
||||
'--outdir',
|
||||
'-o',
|
||||
type=str,
|
||||
default=None,
|
||||
help='Directory to save generated images and a log of prompts and seeds',
|
||||
)
|
||||
parser.add_argument(
|
||||
'-i',
|
||||
'--individual',
|
||||
action='store_true',
|
||||
help='generate individual files (default)',
|
||||
help='Generate individual files (default)',
|
||||
)
|
||||
parser.add_argument(
|
||||
'-I',
|
||||
'--init_img',
|
||||
type=str,
|
||||
help='path to input image for img2img mode (supersedes width and height)',
|
||||
help='Path to input image for img2img mode (supersedes width and height)',
|
||||
)
|
||||
parser.add_argument(
|
||||
'-f',
|
||||
'--strength',
|
||||
default=0.75,
|
||||
type=float,
|
||||
help='strength for noising/unnoising. 0.0 preserves image exactly, 1.0 replaces it completely',
|
||||
help='Strength for noising/unnoising. 0.0 preserves image exactly, 1.0 replaces it completely',
|
||||
)
|
||||
parser.add_argument(
|
||||
'-G',
|
||||
'--gfpgan_strength',
|
||||
default=None,
|
||||
default=0,
|
||||
type=float,
|
||||
help='The strength at which to apply the GFPGAN model to the result, in order to improve faces.',
|
||||
)
|
||||
parser.add_argument(
|
||||
'-U',
|
||||
'--upscale',
|
||||
nargs='+',
|
||||
default=None,
|
||||
type=float,
|
||||
help='Scale factor (2, 4) for upscaling followed by upscaling strength (0-1.0). If strength not specified, defaults to 0.75'
|
||||
)
|
||||
parser.add_argument(
|
||||
'-save_orig',
|
||||
'--save_original',
|
||||
action='store_true',
|
||||
help='Save original. Use it when upscaling to save both versions.',
|
||||
)
|
||||
# variants is going to be superseded by a generalized "prompt-morph" function
|
||||
# parser.add_argument('-v','--variants',type=int,help="in img2img mode, the first generated image will get passed back to img2img to generate the requested number of variants")
|
||||
parser.add_argument(
|
||||
'-x',
|
||||
'--skip_normalize',
|
||||
action='store_true',
|
||||
help='skip subprompt weight normalization',
|
||||
help='Skip subprompt weight normalization',
|
||||
)
|
||||
parser.add_argument(
|
||||
'-A',
|
||||
'-m',
|
||||
'--sampler',
|
||||
dest='sampler_name',
|
||||
default=None,
|
||||
type=str,
|
||||
choices=SAMPLER_CHOICES,
|
||||
metavar='SAMPLER_NAME',
|
||||
help=f'Switch to a different sampler. Supported samplers: {", ".join(SAMPLER_CHOICES)}',
|
||||
)
|
||||
return parser
|
||||
|
||||
|
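The same parser used behind the dream> prompt can be exercised directly; splitting the raw line with `shlex` and the example command string are assumptions of this sketch:

```
import shlex

parser = create_cmd_parser()
opt = parser.parse_args(
    shlex.split('"a fantastic alien landscape" -W1024 -H960 -s100 -n12')
)
print(opt.prompt, opt.width, opt.height, opt.steps, opt.iterations)
```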
@ -1,114 +0,0 @@
|
||||
import json
|
||||
import base64
|
||||
import mimetypes
|
||||
import os
|
||||
from pytorch_lightning import logging
|
||||
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
|
||||
|
||||
print("Loading model...")
|
||||
from ldm.simplet2i import T2I
|
||||
model = T2I(sampler_name='k_lms')
|
||||
|
||||
# to get rid of annoying warning messages from pytorch
|
||||
import transformers
|
||||
transformers.logging.set_verbosity_error()
|
||||
logging.getLogger("pytorch_lightning").setLevel(logging.ERROR)
|
||||
|
||||
print("Initializing model, be patient...")
|
||||
model.load_model()
|
||||
|
||||
class DreamServer(BaseHTTPRequestHandler):
|
||||
def do_GET(self):
|
||||
if self.path == "/":
|
||||
self.send_response(200)
|
||||
self.send_header("Content-type", "text/html")
|
||||
self.end_headers()
|
||||
with open("./static/dream_web/index.html", "rb") as content:
|
||||
self.wfile.write(content.read())
|
||||
elif os.path.exists("." + self.path):
|
||||
mime_type = mimetypes.guess_type(self.path)[0]
|
||||
if mime_type is not None:
|
||||
self.send_response(200)
|
||||
self.send_header("Content-type", mime_type)
|
||||
self.end_headers()
|
||||
with open("." + self.path, "rb") as content:
|
||||
self.wfile.write(content.read())
|
||||
else:
|
||||
self.send_response(404)
|
||||
else:
|
||||
self.send_response(404)
|
||||
|
||||
def do_POST(self):
|
||||
self.send_response(200)
|
||||
self.send_header("Content-type", "application/json")
|
||||
self.end_headers()
|
||||
|
||||
content_length = int(self.headers['Content-Length'])
|
||||
post_data = json.loads(self.rfile.read(content_length))
|
||||
prompt = post_data['prompt']
|
||||
initimg = post_data['initimg']
|
||||
iterations = int(post_data['iterations'])
|
||||
steps = int(post_data['steps'])
|
||||
width = int(post_data['width'])
|
||||
height = int(post_data['height'])
|
||||
cfgscale = float(post_data['cfgscale'])
|
||||
seed = None if int(post_data['seed']) == -1 else int(post_data['seed'])
|
||||
|
||||
print(f"Request to generate with prompt: {prompt}")
|
||||
|
||||
outputs = []
|
||||
if initimg is None:
|
||||
# Run txt2img
|
||||
outputs = model.txt2img(prompt,
|
||||
iterations=iterations,
|
||||
cfg_scale = cfgscale,
|
||||
width = width,
|
||||
height = height,
|
||||
seed = seed,
|
||||
steps = steps)
|
||||
else:
|
||||
# Decode initimg as base64 to temp file
|
||||
with open("./img2img-tmp.png", "wb") as f:
|
||||
initimg = initimg.split(",")[1] # Ignore mime type
|
||||
f.write(base64.b64decode(initimg))
|
||||
|
||||
# Run img2img
|
||||
outputs = model.img2img(prompt,
|
||||
init_img = "./img2img-tmp.png",
|
||||
iterations = iterations,
|
||||
cfg_scale = cfgscale,
|
||||
seed = seed,
|
||||
steps = steps)
|
||||
# Remove the temp file
|
||||
os.remove("./img2img-tmp.png")
|
||||
|
||||
print(f"Prompt generated with output: {outputs}")
|
||||
|
||||
post_data['initimg'] = '' # Don't send init image back
|
||||
|
||||
# Append post_data to log
|
||||
with open("./outputs/img-samples/dream_web_log.txt", "a") as log:
|
||||
for output in outputs:
|
||||
log.write(f"{output[0]}: {json.dumps(post_data)}\n")
|
||||
|
||||
outputs = [x + [post_data] for x in outputs] # Append config to each output
|
||||
result = {'outputs': outputs}
|
||||
self.wfile.write(bytes(json.dumps(result), "utf-8"))
|
||||
|
||||
if __name__ == "__main__":
|
||||
# Change working directory to the stable-diffusion directory
|
||||
os.chdir(
|
||||
os.path.abspath(os.path.join(os.path.dirname( __file__ ), '..'))
|
||||
)
|
||||
|
||||
# Start server
|
||||
dream_server = ThreadingHTTPServer(("0.0.0.0", 9090), DreamServer)
|
||||
print("\n\n* Started Stable Diffusion dream server! Point your browser at http://localhost:9090 or use the host's DNS name or IP address. *")
|
||||
|
||||
try:
|
||||
dream_server.serve_forever()
|
||||
except KeyboardInterrupt:
|
||||
pass
|
||||
|
||||
dream_server.server_close()
|
||||
|
@ -3,6 +3,9 @@
|
||||
# Before running stable-diffusion on an internet-isolated machine,
|
||||
# run this script from one with internet connectivity. The
|
||||
# two machines must share a common .cache directory.
|
||||
from transformers import CLIPTokenizer, CLIPTextModel
|
||||
import clip
|
||||
from transformers import BertTokenizerFast
|
||||
import sys
|
||||
import transformers
|
||||
import os
|
||||
@ -12,7 +15,6 @@ transformers.logging.set_verbosity_error()
|
||||
|
||||
# this will preload the Bert tokenizer files
|
||||
print('preloading bert tokenizer...')
|
||||
from transformers import BertTokenizerFast
|
||||
|
||||
tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')
|
||||
print('...success')
|
||||
@ -28,8 +30,6 @@ version = 'openai/clip-vit-large-patch14'
|
||||
|
||||
print('preloading CLIP model (Ignore the deprecation warnings)...')
|
||||
sys.stdout.flush()
|
||||
import clip
|
||||
from transformers import CLIPTokenizer, CLIPTextModel
|
||||
|
||||
tokenizer = CLIPTokenizer.from_pretrained(version)
|
||||
transformer = CLIPTextModel.from_pretrained(version)
|
||||
@ -63,6 +63,20 @@ if gfpgan:
|
||||
scale=2,
|
||||
),
|
||||
)
|
||||
|
||||
RealESRGANer(
|
||||
scale=4,
|
||||
model_path='https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth',
|
||||
model=RRDBNet(
|
||||
num_in_ch=3,
|
||||
num_out_ch=3,
|
||||
num_feat=64,
|
||||
num_block=23,
|
||||
num_grow_ch=32,
|
||||
scale=4,
|
||||
),
|
||||
)
|
||||
|
||||
FaceRestoreHelper(1, det_model='retinaface_resnet50')
|
||||
print('...success')
|
||||
except Exception:
|
||||
|
BIN
static/colab_notebook.png
Normal file
BIN
static/colab_notebook.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 799 KiB |
@ -1,61 +1,69 @@
|
||||
* {
|
||||
font-family: 'Arial';
|
||||
}
|
||||
#header {
|
||||
text-decoration: dotted underline;
|
||||
}
|
||||
#search {
|
||||
margin-top: 20vh;
|
||||
margin-left: auto;
|
||||
margin-right: auto;
|
||||
max-width: 800px;
|
||||
|
||||
text-align: center;
|
||||
}
|
||||
fieldset {
|
||||
border: none;
|
||||
}
|
||||
#fieldset-search {
|
||||
display: flex;
|
||||
}
|
||||
#prompt {
|
||||
flex-grow: 1;
|
||||
|
||||
border-radius: 20px 0px 0px 20px;
|
||||
padding: 5px 10px 5px 10px;
|
||||
border: 1px solid black;
|
||||
border-right: none;
|
||||
outline: none;
|
||||
}
|
||||
#submit {
|
||||
border-radius: 0px 20px 20px 0px;
|
||||
padding: 5px 10px 5px 10px;
|
||||
border: 1px solid black;
|
||||
}
|
||||
#results {
|
||||
text-align: center;
|
||||
max-width: 1000px;
|
||||
margin: auto;
|
||||
padding-top: 10px;
|
||||
}
|
||||
img {
|
||||
cursor: pointer;
|
||||
height: 30vh;
|
||||
border-radius: 5px;
|
||||
margin: 10px;
|
||||
}
|
||||
#fieldset-config {
|
||||
line-height:2em;
|
||||
}
|
||||
input[type="number"] {
|
||||
width: 60px;
|
||||
}
|
||||
#seed {
|
||||
width: 150px;
|
||||
}
|
||||
hr {
|
||||
width: 200px;
|
||||
}
|
||||
label {
|
||||
white-space: nowrap;
|
||||
}
|
||||
* {
|
||||
font-family: 'Arial';
|
||||
}
|
||||
#header {
|
||||
text-decoration: dotted underline;
|
||||
}
|
||||
#search {
|
||||
margin-top: 20vh;
|
||||
margin-left: auto;
|
||||
margin-right: auto;
|
||||
max-width: 800px;
|
||||
|
||||
text-align: center;
|
||||
}
|
||||
fieldset {
|
||||
border: none;
|
||||
}
|
||||
#fieldset-search {
|
||||
display: flex;
|
||||
}
|
||||
#scaling-inprocess-message{
|
||||
font-weight: bold;
|
||||
font-style: italic;
|
||||
display: none;
|
||||
}
|
||||
#prompt {
|
||||
flex-grow: 1;
|
||||
|
||||
border-radius: 20px 0px 0px 20px;
|
||||
padding: 5px 10px 5px 10px;
|
||||
border: 1px solid black;
|
||||
border-right: none;
|
||||
outline: none;
|
||||
}
|
||||
#submit {
|
||||
border-radius: 0px 20px 20px 0px;
|
||||
padding: 5px 10px 5px 10px;
|
||||
border: 1px solid black;
|
||||
}
|
||||
#reset-all {
|
||||
background-color: pink;
|
||||
}
|
||||
#results {
|
||||
text-align: center;
|
||||
max-width: 1000px;
|
||||
margin: auto;
|
||||
padding-top: 10px;
|
||||
}
|
||||
img {
|
||||
cursor: pointer;
|
||||
height: 30vh;
|
||||
border-radius: 5px;
|
||||
margin: 10px;
|
||||
}
|
||||
#fieldset-config {
|
||||
line-height:2em;
|
||||
}
|
||||
input[type="number"] {
|
||||
width: 60px;
|
||||
}
|
||||
#seed {
|
||||
width: 150px;
|
||||
}
|
||||
hr {
|
||||
width: 200px;
|
||||
}
|
||||
label {
|
||||
white-space: nowrap;
|
||||
}
|
||||
|
@ -1,66 +1,94 @@
|
||||
<html>
|
||||
<head>
|
||||
<title>Stable Diffusion Dream Server</title>
|
||||
<link rel="icon" href="data:,">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
|
||||
<link rel="stylesheet" href="static/dream_web/index.css">
|
||||
<script src="static/dream_web/index.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="search">
|
||||
<h2 id="header">Stable Diffusion Dream Server</h2>
|
||||
|
||||
<form id="generate-form" method="post" action="#">
|
||||
<fieldset id="fieldset-search">
|
||||
<input type="text" id="prompt" name="prompt">
|
||||
<input type="submit" id="submit" value="Generate">
|
||||
</fieldset>
|
||||
<fieldset id="fieldset-config">
|
||||
<label for="iterations">Images to generate:</label>
|
||||
<input value="1" type="number" id="iterations" name="iterations">
|
||||
<label for="steps">Steps:</label>
|
||||
<input value="50" type="number" id="steps" name="steps">
|
||||
<label for="cfgscale">Cfg Scale:</label>
|
||||
<input value="7.5" type="number" id="cfgscale" name="cfgscale" step="any">
|
||||
<span>•</span>
|
||||
<label title="Set to multiple of 64" for="width">Width:</label>
|
||||
<select id="width" name="width" value="512">
|
||||
<option value="64">64</option> <option value="128">128</option>
|
||||
<option value="192">192</option> <option value="256">256</option>
|
||||
<option value="320">320</option> <option value="384">384</option>
|
||||
<option value="448">448</option> <option value="512">512</option>
|
||||
<option value="576">576</option> <option value="640">640</option>
|
||||
<option value="704">704</option> <option value="768">768</option>
|
||||
<option value="832">832</option> <option value="896">896</option>
|
||||
<option value="960">960</option> <option value="1024">1024</option>
|
||||
</select>
|
||||
<label title="Set to multiple of 64" for="height">Height:</label>
|
||||
<select id="height" name="height" value="512">
|
||||
<option value="64">64</option> <option value="128">128</option>
|
||||
<option value="192">192</option> <option value="256">256</option>
|
||||
<option value="320">320</option> <option value="384">384</option>
|
||||
<option value="448">448</option> <option value="512">512</option>
|
||||
<option value="576">576</option> <option value="640">640</option>
|
||||
<option value="704">704</option> <option value="768">768</option>
|
||||
<option value="832">832</option> <option value="896">896</option>
|
||||
<option value="960">960</option> <option value="1024">1024</option>
|
||||
</select>
|
||||
<br>
|
||||
<label title="Upload an image to use img2img" for="initimg">Img2Img Init:</label>
|
||||
<input type="file" id="initimg" name="initimg" accept=".jpg, .jpeg, .png">
|
||||
<label title="Set to -1 for random seed" for="seed">Seed:</label>
|
||||
<input value="-1" type="number" id="seed" name="seed">
|
||||
<button type="button" id="reset">↺</button>
|
||||
</fieldset>
|
||||
</form>
|
||||
<div id="about">For news and support for this web service, visit our <a href="http://github.com/lstein/stable-diffusion">GitHub site</a></div>
|
||||
</div>
|
||||
<hr>
|
||||
<div id="results">
|
||||
<div id="no-results-message">
|
||||
<i><p>No results...</p></i>
|
||||
</div>
|
||||
</div>
|
||||
</body>
|
||||
</html>
|
||||
<html>
|
||||
<head>
|
||||
<title>Stable Diffusion Dream Server</title>
|
||||
<link rel="icon" href="data:,">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
|
||||
<link rel="stylesheet" href="static/dream_web/index.css">
|
||||
<script src="static/dream_web/index.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="search">
|
||||
<h2 id="header">Stable Diffusion Dream Server</h2>
|
||||
|
||||
<form id="generate-form" method="post" action="#">
|
||||
<fieldset id="fieldset-search">
|
||||
<input type="text" id="prompt" name="prompt">
|
||||
<input type="submit" id="submit" value="Generate">
|
||||
</fieldset>
|
||||
<fieldset id="fieldset-config">
|
||||
<label for="iterations">Images to generate:</label>
|
||||
<input value="1" type="number" id="iterations" name="iterations" size="4">
|
||||
<label for="steps">Steps:</label>
|
||||
<input value="50" type="number" id="steps" name="steps">
|
||||
<label for="cfgscale">Cfg Scale:</label>
|
||||
<input value="7.5" type="number" id="cfgscale" name="cfgscale" step="any">
|
||||
<label for="sampler">Sampler:</label>
|
||||
<select id="sampler" name="sampler" value="k_lms">
|
||||
<option value="ddim">DDIM</option>
|
||||
<option value="plms">PLMS</option>
|
||||
<option value="k_lms" selected>KLMS</option>
|
||||
<option value="k_dpm_2">KDPM_2</option>
|
||||
<option value="k_dpm_2_a">KDPM_2A</option>
|
||||
<option value="k_euler">KEULER</option>
|
||||
<option value="k_heun">KHEUN</option>
|
||||
</select>
|
||||
<br>
|
||||
<label title="Set to multiple of 64" for="width">Width:</label>
|
||||
<select id="width" name="width" value="512">
|
||||
<option value="64">64</option> <option value="128">128</option>
|
||||
<option value="192">192</option> <option value="256">256</option>
|
||||
<option value="320">320</option> <option value="384">384</option>
|
||||
<option value="448">448</option> <option value="512" selected>512</option>
|
||||
<option value="576">576</option> <option value="640">640</option>
|
||||
<option value="704">704</option> <option value="768">768</option>
|
||||
<option value="832">832</option> <option value="896">896</option>
|
||||
<option value="960">960</option> <option value="1024">1024</option>
|
||||
</select>
|
||||
<label title="Set to multiple of 64" for="height">Height:</label>
|
||||
<select id="height" name="height" value="512">
|
||||
<option value="64">64</option> <option value="128">128</option>
|
||||
<option value="192">192</option> <option value="256">256</option>
|
||||
<option value="320">320</option> <option value="384">384</option>
|
||||
<option value="448">448</option> <option value="512" selected>512</option>
|
||||
<option value="576">576</option> <option value="640">640</option>
|
||||
<option value="704">704</option> <option value="768">768</option>
|
||||
<option value="832">832</option> <option value="896">896</option>
|
||||
<option value="960">960</option> <option value="1024">1024</option>
|
||||
</select>
|
||||
<br>
|
||||
<label title="Upload an image to use img2img" for="initimg">Img2Img Init:</label>
|
||||
<input type="file" id="initimg" name="initimg" accept=".jpg, .jpeg, .png">
|
||||
<label title="Set to -1 for random seed" for="seed">Seed:</label>
|
||||
<input value="-1" type="number" id="seed" name="seed">
|
||||
<button type="button" id="reset-seed">↺</button>
|
||||
<span>•</span>
|
||||
<button type="button" id="reset-all">Reset to Defaults</button>
|
||||
<br>
|
||||
<p><em>The options below require the GFPGAN and ESRGAN packages to be installed</em></p>
|
||||
<label title="Strength of the gfpgan (face fixing) algorithm." for="gfpgan_strength">GPFGAN Strength:</label>
|
||||
<input value="0.8" min="0" max="1" type="number" id="gfpgan_strength" name="gfpgan_strength" step="0.05">
|
||||
<label title="Upscaling to perform using ESRGAN." for="upscale_level">Upscaling Level</label>
|
||||
<select id="upscale_level" name="upscale_level" value="">
|
||||
<option value="" selected></option>
|
||||
<option value="2">2x</option>
|
||||
<option value="4">4x</option>
|
||||
</select>
|
||||
<label title="Strength of the esrgan (upscaling) algorithm." for="upscale_strength">Upscale Strength:</label>
|
||||
<input value="0.75" min="0" max="1" type="number" id="upscale_strength" name="upscale_strength" step="0.05">
|
||||
</fieldset>
|
||||
</form>
|
||||
<div id="about">For news and support for this web service, visit our <a href="http://github.com/lstein/stable-diffusion">GitHub site</a></div>
|
||||
<br>
|
||||
<progress id="progress" value="0" max="1"></progress>
|
||||
<div id="scaling-inprocess-message">
|
||||
<i><span>Postprocessing...</span><span id="processing_cnt">1/3</span></i>
|
||||
</div>
|
||||
</div>
|
||||
<div id="results">
|
||||
<div id="no-results-message">
|
||||
<i><p>No results...</p></i>
|
||||
</div>
|
||||
</div>
|
||||
</body>
|
||||
</html>
|
||||
|
@ -1,101 +1,129 @@
function toBase64(file) {
    return new Promise((resolve, reject) => {
        const r = new FileReader();
        r.readAsDataURL(file);
        r.onload = () => resolve(r.result);
        r.onerror = (error) => reject(error);
    });
}

function appendOutput(output) {
    let outputNode = document.createElement("img");
    outputNode.src = output[0];

    let outputConfig = output[2];
    let altText = output[1].toString() + " | " + outputConfig.prompt;
    outputNode.alt = altText;
    outputNode.title = altText;

    // Reload image config
    outputNode.addEventListener('click', () => {
        let form = document.querySelector("#generate-form");
        for (const [k, v] of new FormData(form)) {
            form.querySelector(`*[name=${k}]`).value = outputConfig[k];
        }
        document.querySelector("#seed").value = output[1];

        saveFields(document.querySelector("#generate-form"));
    });

    document.querySelector("#results").prepend(outputNode);
}

function appendOutputs(outputs) {
    for (const output of outputs) {
        appendOutput(output);
    }
}

function saveFields(form) {
    for (const [k, v] of new FormData(form)) {
        if (typeof v !== 'object') { // Don't save 'file' type
            localStorage.setItem(k, v);
        }
    }
}

function loadFields(form) {
    for (const [k, v] of new FormData(form)) {
        const item = localStorage.getItem(k);
        if (item != null) {
            form.querySelector(`*[name=${k}]`).value = item;
        }
    }
}

async function generateSubmit(form) {
    const prompt = document.querySelector("#prompt").value;

    // Convert file data to base64
    let formData = Object.fromEntries(new FormData(form));
    formData.initimg = formData.initimg.name !== '' ? await toBase64(formData.initimg) : null;

    // Post as JSON
    fetch(form.action, {
        method: form.method,
        body: JSON.stringify(formData),
    }).then(async (result) => {
        let data = await result.json();

        // Re-enable form, remove no-results-message
        form.querySelector('fieldset').removeAttribute('disabled');
        document.querySelector("#prompt").value = prompt;

        if (data.outputs.length != 0) {
            document.querySelector("#no-results-message")?.remove();
            appendOutputs(data.outputs);
        } else {
            alert("Error occurred while generating.");
        }
    });

    // Disable form while generating
    form.querySelector('fieldset').setAttribute('disabled','');
    document.querySelector("#prompt").value = `Generating: "${prompt}"`;
}

window.onload = () => {
    document.querySelector("#generate-form").addEventListener('submit', (e) => {
        e.preventDefault();
        const form = e.target;

        generateSubmit(form);
    });
    document.querySelector("#generate-form").addEventListener('change', (e) => {
        saveFields(e.target.form);
    });
    document.querySelector("#reset").addEventListener('click', (e) => {
        document.querySelector("#seed").value = -1;
        saveFields(e.target.form);
    });
    loadFields(document.querySelector("#generate-form"));
};

function toBase64(file) {
    return new Promise((resolve, reject) => {
        const r = new FileReader();
        r.readAsDataURL(file);
        r.onload = () => resolve(r.result);
        r.onerror = (error) => reject(error);
    });
}

function appendOutput(src, seed, config) {
    let outputNode = document.createElement("img");
    outputNode.src = src;

    let altText = seed.toString() + " | " + config.prompt;
    outputNode.alt = altText;
    outputNode.title = altText;

    // Reload image config
    outputNode.addEventListener('click', () => {
        let form = document.querySelector("#generate-form");
        for (const [k, v] of new FormData(form)) {
            form.querySelector(`*[name=${k}]`).value = config[k];
        }
        document.querySelector("#seed").value = seed;

        saveFields(document.querySelector("#generate-form"));
    });

    document.querySelector("#results").prepend(outputNode);
}

function saveFields(form) {
    for (const [k, v] of new FormData(form)) {
        if (typeof v !== 'object') { // Don't save 'file' type
            localStorage.setItem(k, v);
        }
    }
}

function loadFields(form) {
    for (const [k, v] of new FormData(form)) {
        const item = localStorage.getItem(k);
        if (item != null) {
            form.querySelector(`*[name=${k}]`).value = item;
        }
    }
}

function clearFields(form) {
    localStorage.clear();
    let prompt = form.prompt.value;
    form.reset();
    form.prompt.value = prompt;
}

async function generateSubmit(form) {
    const prompt = document.querySelector("#prompt").value;

    // Convert file data to base64
    let formData = Object.fromEntries(new FormData(form));
    formData.initimg = formData.initimg.name !== '' ? await toBase64(formData.initimg) : null;

    document.querySelector('progress').setAttribute('max', formData.steps);

    // Post as JSON, using Fetch streaming to get results
    fetch(form.action, {
        method: form.method,
        body: JSON.stringify(formData),
    }).then(async (response) => {
        const reader = response.body.getReader();

        let noOutputs = true;
        while (true) {
            let {value, done} = await reader.read();
            value = new TextDecoder().decode(value);
            if (done) break;

            for (let event of value.split('\n').filter(e => e !== '')) {
                const data = JSON.parse(event);

                if (data.event == 'result') {
                    noOutputs = false;
                    document.querySelector("#no-results-message")?.remove();
                    appendOutput(data.files[0], data.files[1], data.config);
                } else if (data.event == 'upscaling-started') {
                    document.getElementById("processing_cnt").textContent = data.processed_file_cnt;
                    document.getElementById("scaling-inprocess-message").style.display = "block";
                } else if (data.event == 'upscaling-done') {
                    document.getElementById("scaling-inprocess-message").style.display = "none";
                } else if (data.event == 'step') {
                    document.querySelector('progress').setAttribute('value', data.step.toString());
                }
            }
        }

        // Re-enable form, remove no-results-message
        form.querySelector('fieldset').removeAttribute('disabled');
        document.querySelector("#prompt").value = prompt;
        document.querySelector('progress').setAttribute('value', '0');

        if (noOutputs) {
            alert("Error occurred while generating.");
        }
    });

    // Disable form while generating
    form.querySelector('fieldset').setAttribute('disabled','');
    document.querySelector("#prompt").value = `Generating: "${prompt}"`;
}

window.onload = () => {
    document.querySelector("#generate-form").addEventListener('submit', (e) => {
        e.preventDefault();
        const form = e.target;

        generateSubmit(form);
    });
    document.querySelector("#generate-form").addEventListener('change', (e) => {
        saveFields(e.target.form);
    });
    document.querySelector("#reset-seed").addEventListener('click', (e) => {
        document.querySelector("#seed").value = -1;
        saveFields(e.target.form);
    });
    document.querySelector("#reset-all").addEventListener('click', (e) => {
        clearFields(e.target.form);
    });
    loadFields(document.querySelector("#generate-form"));
};
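The streaming handler in the updated `generateSubmit()` expects the endpoint named by `form.action` to emit newline-delimited JSON events. The server side is not part of this diff, so the snippet below is only a sketch of what those events might look like, inferred from the client-side parsing above: the field names (`event`, `step`, `files`, `config`, `processed_file_cnt`) come from the code, while the concrete values are illustrative.

```
// Illustrative only: example events in the newline-delimited JSON format the
// client above parses. Field names come from the handlers in generateSubmit();
// the values shown here are made up, not taken from the server code.
const exampleStream = [
    '{"event": "step", "step": 25}',
    '{"event": "result", "files": ["outputs/000001.png", 2685670268], "config": {"prompt": "a cat", "steps": 50}}',
    '{"event": "upscaling-started", "processed_file_cnt": "1/3"}',
    '{"event": "upscaling-done"}',
].join('\n');

// Parse one decoded chunk the same way the reader loop does:
for (const line of exampleStream.split('\n').filter(e => e !== '')) {
    console.log(JSON.parse(line).event);
}
```

Because each event is one JSON object per line, the reader loop can split a decoded chunk on `'\n'` and parse each piece independently, updating the progress bar on `step` events and prepending images on `result` events.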