added walkthru, small code fixes

2024-08-30 20:32:17 +00:00 · 2022-09-02 17:54:55 -04:00 · 2022-09-02 17:54:55 -04:00 · dd2af3f93c
commit dd2af3f93c
parent 2d65b03f05
8 changed files with 121 additions and 3 deletions
--- a/README.md
+++ b/README.md
@ -23,6 +23,7 @@ text-to-image generator. This fork supports:
 3. A basic Web interface that allows you to run a local web server for
   generating images in your browser.
 4. A notebook for running the code on Google Colab.
 5. Upscaling and face fixing using the optional ESRGAN and GFPGAN
@ -30,7 +31,11 @@ text-to-image generator. This fork supports:
 6. Weighted subprompts for prompt tuning.
-7. Textual inversion for customization of the prompt language and images.
+7. [Image variations](VARIATIONS.md), which allow you to systematically
 generate variations of an image you like and to blend two or more
 images together to combine their best features.
 8. Textual inversion for customization of the prompt language and images.
 9. ...and more!
--- a/VARIATIONS.md
+++ b/VARIATIONS.md
@ -0,0 +1,113 @@
 # Cheat Sheet for Generating Variations
 Release 1.13 of SD-Dream adds support for image variations. There are two things that you can do:
 1. Generate a series of systematic variations of an image, given a
 prompt. The amount of variation from one image to the next can be
 controlled.
 2. Given two or more variations that you like, you can combine them in
 a weighted fashion.
 This cheat sheet provides a quick guide for how this works in
 practice, using variations to create the desired image of Xena,
 Warrior Princess.
 ## Step 1 -- find a base image that you like
 The prompt we will use throughout is "lucy lawless as xena, warrior
 princess, character portrait, high resolution." This will be indicated
 as "prompt" in the examples below.
 First we let SD create a series of images in the usual way, in this case
 requesting six iterations:
 ~~~
 dream> lucy lawless as xena, warrior princess, character portrait, high resolution -n6
 ...
 Outputs:
 ./outputs/Xena/000001.1579445059.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -S1579445059
 ./outputs/Xena/000001.1880768722.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -S1880768722
 ./outputs/Xena/000001.332057179.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -S332057179
 ./outputs/Xena/000001.2224800325.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -S2224800325
 ./outputs/Xena/000001.465250761.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -S465250761
 ./outputs/Xena/000001.3357757885.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -S3357757885
 ~~~
 The one with seed 3357757885 looks nice:
 <img src="static/variation_walkthru/000001.3357757885.png"/>
 Let's try to generate some variations. Using the same seed, we pass
 the argument -v0.2 (or --variant_amount), which generates a series of
 variations, each differing from the base image by a variation amount
 of 0.2. This number ranges from 0 to 1.0, with higher numbers
 producing larger amounts of variation.
 ~~~
 dream> "prompt" -n6 -S3357757885 -v0.2
 ...
 Outputs:
 ./outputs/Xena/000002.784039624.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 784039624,0.2 -S3357757885
 ./outputs/Xena/000002.3647897225.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 3647897225,0.2 -S3357757885
 ./outputs/Xena/000002.917731034.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 917731034,0.2 -S3357757885
 ./outputs/Xena/000002.4116285959.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 4116285959,0.2 -S3357757885
 ./outputs/Xena/000002.1614299449.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 1614299449,0.2 -S3357757885
 ./outputs/Xena/000002.1335553075.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 1335553075,0.2 -S3357757885
 ~~~
 Note that the output for each image has a -V option giving the
 "variant subseed" for that image, consisting of a seed followed by the
 variation amount used to generate it.
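
The -V value can be read as a list of (subseed, weight) pairs: a comma separates each subseed from its weight, and the combining commands in this walkthrough separate pairs with semicolons. The sketch below shows one way to parse such a string in Python. The function name `parse_with_variations` is hypothetical and is not taken from the project's code; the range check simply mirrors the rule that variation weights must lie in [0.0, 1.0].

```python
from typing import List, Tuple


def parse_with_variations(value: str) -> List[Tuple[int, float]]:
    """Parse a -V value such as '3647897225,0.1;1614299449,0.1'.

    Hypothetical helper for illustration only; not the project's parser.
    """
    pairs: List[Tuple[int, float]] = []
    for part in value.split(';'):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        seed_text, weight_text = part.split(',')
        seed, weight = int(seed_text), float(weight_text)
        # Weights outside [0.0, 1.0] are rejected.
        if not 0.0 <= weight <= 1.0:
            raise ValueError(
                f'variation weights must be in [0.0, 1.0]: got {weight}'
            )
        pairs.append((seed, weight))
    return pairs


print(parse_with_variations('3647897225,0.1;1614299449,0.1'))
# prints [(3647897225, 0.1), (1614299449, 0.1)]
```

A single pair such as `784039624,0.2` parses to a one-element list, which matches the way each -v result is recorded in the output lines above.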
 This gives us a series of closely-related variations, including the
 two shown here.
 <img src="static/variation_walkthru/000002.3647897225.png">
 <img src="static/variation_walkthru/000002.1614299449.png">
 I like the expression on Xena's face in the first one (subseed
 3647897225), and the armor on her shoulder in the second one (subseed
 1614299449). Can we combine them to get the best of both worlds?
 We combine the two variations using -V (--with_variations). Again, we
 must provide the seed for the originally-chosen image in order for
 this to work.
 ~~~
 dream> "prompt"  -S3357757885 -V3647897225,0.1;1614299449,0.1
 Outputs:
 ./outputs/Xena/000003.1614299449.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 3647897225,0.1;1614299449,0.1 -S3357757885
 ~~~
 Here we are providing equal weights (0.1 and 0.1) for both the
 subseeds. The resulting image is close, but not exactly what I
 wanted:
 <img src="static/variation_walkthru/000003.1614299449.png">
 We could either try combining the images with different weights or
 generate more variations around the almost-but-not-quite image. We
 do the latter, using both the -V (combining) and -v (variation
 strength) options. Note that we use -n6 to generate 6 variations:
 ~~~
 dream> "prompt" -S3357757885 -V3647897225,0.1;1614299449,0.1 -v0.05 -n6
 Outputs:
 ./outputs/Xena/000004.3279757577.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 3647897225,0.1;1614299449,0.1;3279757577,0.05 -S3357757885
 ./outputs/Xena/000004.2853129515.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 3647897225,0.1;1614299449,0.1;2853129515,0.05 -S3357757885
 ./outputs/Xena/000004.3747154981.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 3647897225,0.1;1614299449,0.1;3747154981,0.05 -S3357757885
 ./outputs/Xena/000004.2664260391.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 3647897225,0.1;1614299449,0.1;2664260391,0.05 -S3357757885
 ./outputs/Xena/000004.1642517170.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 3647897225,0.1;1614299449,0.1;1642517170,0.05 -S3357757885
 ./outputs/Xena/000004.2183375608.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 3647897225,0.1;1614299449,0.1;2183375608,0.05 -S3357757885
 ~~~
 This produces six images, all slight variations on the combination of
 the chosen two images. Here's the one I like best:
 <img src="static/variation_walkthru/000004.3747154981.png">
 As you can see, this is a very powerful tool that, when combined with
 subprompt weighting, gives you great control over the content and
 quality of your generated images.
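
This walkthrough does not describe how the base seed (-S) and the weighted subseeds (-V) are turned into a single image. One plausible way to picture it, offered purely as an assumption and not as this fork's code, is that each seed yields a reproducible noise vector and each subseed is blended in turn with its weight. The sketch below uses spherical interpolation on small pure-Python vectors; the names `noise_for_seed`, `slerp`, and `blended_noise` are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def noise_for_seed(seed: int, size: int = 8) -> List[float]:
    """Reproducible Gaussian noise for a seed (a stand-in for latent noise)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def slerp(t: float, a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Spherical interpolation from a (t=0) to b (t=1)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < 1e-8:
        # Nearly parallel vectors: fall back to linear interpolation.
        return [(1.0 - t) * x + t * y for x, y in zip(a, b)]
    weight_a = math.sin((1.0 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


def blended_noise(
    seed: int, with_variations: Sequence[Tuple[int, float]], size: int = 8
) -> List[float]:
    """Assumed scheme: start from the base seed, blend in each subseed in order."""
    noise = noise_for_seed(seed, size)
    for subseed, weight in with_variations:
        noise = slerp(weight, noise, noise_for_seed(subseed, size))
    return noise


# The combination used in the walkthrough: two subseeds, weight 0.1 each.
combined = blended_noise(3357757885, [(3647897225, 0.1), (1614299449, 0.1)])
print(len(combined))  # prints 8
```

Under this reading, a weight of 0 leaves the base image untouched and a weight of 1.0 replaces it with the subseed's image. That is consistent with the stated 0 to 1.0 range, and it also suggests why the original -S seed must be supplied for a combination to be reproducible.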
--- a/ldm/simplet2i.py
+++ b/ldm/simplet2i.py
@ -266,7 +266,6 @@ class T2I:
        """
        # TODO: convert this into a getattr() loop
        steps                 = steps      or self.steps
        seed                  = seed       or self.seed
        width                 = width      or self.width
        height                = height     or self.height
        cfg_scale             = cfg_scale  or self.cfg_scale
@ -296,6 +295,7 @@ class T2I:
            assert all(0 <= weight <= 1 for _, weight in with_variations),\
                f'variation weights must be in [0.0, 1.0]: got {[weight for _, weight in with_variations]}'
        seed                  = seed       or self.seed
        width, height, _ = self._resolution_check(width, height, log=True)
        # TODO: - Check if this is still necessary to run on M1 devices.
@ -319,7 +319,7 @@ class T2I:
            if init_img:
                assert os.path.exists(init_img), f'{init_img}: File not found'
                init_image = self._load_img(init_img, width, height, fit).to(self.device)
-                with scope(device.type):
+                with scope(self.device.type):
                    init_latent = self.model.get_first_stage_encoding(
                        self.model.encode_first_stage(init_image)
                    ) # move to latent space
--- a/static/variation_walkthru/000001.3357757885.png
+++ b/static/variation_walkthru/000001.3357757885.png
--- a/static/variation_walkthru/000002.1614299449.png
+++ b/static/variation_walkthru/000002.1614299449.png
--- a/static/variation_walkthru/000002.3647897225.png
+++ b/static/variation_walkthru/000002.3647897225.png
--- a/static/variation_walkthru/000003.1614299449.png
+++ b/static/variation_walkthru/000003.1614299449.png
--- a/static/variation_walkthru/000004.3747154981.png
+++ b/static/variation_walkthru/000004.3747154981.png