added walkthru, small code fixes

2024-08-30 20:32:17 +00:00 · 2022-09-02 17:54:55 -04:00
parent 2d65b03f05
commit dd2af3f93c
8 changed files with 121 additions and 3 deletions
--- a/README.md
+++ b/README.md
@@ -23,6 +23,7 @@ text-to-image generator. This fork supports:

 3. A basic Web interface that allows you to run a local web server for
   generating images in your browser.
+
 4. A notebook for running the code on Google Colab.

 5. Upscaling and face fixing using the optional ESRGAN and GFPGAN
@@ -30,7 +31,11 @@ text-to-image generator. This fork supports:

 6. Weighted subprompts for prompt tuning.

-7. Textual inversion for customization of the prompt language and images.
+7. [Image variations](VARIATIONS.md), which let you systematically
+generate variations of an image you like and combine two or more
+images to merge their best features.
+
+8. Textual inversion for customization of the prompt language and images.

 8. ...and more!

--- a/VARIATIONS.md
+++ b/VARIATIONS.md
@@ -0,0 +1,113 @@
+# Cheat Sheet for Generating Variations
+
+Release 1.13 of SD-Dream adds support for image variations. There are two things that you can do:
+
+1. Generate a series of systematic variations of an image, given a
+prompt. The amount of variation from one image to the next can be
+controlled.
+
+2. Given two or more variations that you like, you can combine them in
+a weighted fashion.
+
+This cheat sheet provides a quick guide for how this works in
+practice, using variations to create the desired image of Xena,
+Warrior Princess.
+
+## Step 1 -- find a base image that you like
+
+The prompt we will use throughout is "lucy lawless as xena, warrior
+princess, character portrait, high resolution." This will be indicated
+as "prompt" in the examples below.
+
+First we let SD create a series of images in the usual way, in this case
+requesting six iterations:
+
+~~~
+dream> lucy lawless as xena, warrior princess, character portrait, high resolution -n6
+...
+Outputs:
+./outputs/Xena/000001.1579445059.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -S1579445059
+./outputs/Xena/000001.1880768722.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -S1880768722
+./outputs/Xena/000001.332057179.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -S332057179
+./outputs/Xena/000001.2224800325.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -S2224800325
+./outputs/Xena/000001.465250761.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -S465250761
+./outputs/Xena/000001.3357757885.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -S3357757885
+~~~
+
+The one with seed 3357757885 looks nice:
+
+<img src="static/variation_walkthru/000001.3357757885.png"/>
+
+Let's try to generate some variations. Using the same seed, we pass
+the argument -v0.2 (or --variant_amount), which generates a series of
+variations, each differing by a variation amount of 0.2. This number
+ranges from 0 to 1.0, with higher numbers producing larger amounts of
+variation.
+
+~~~
+dream> "prompt" -n6 -S3357757885 -v0.2
+...
+Outputs:
+./outputs/Xena/000002.784039624.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 784039624,0.2 -S3357757885
+./outputs/Xena/000002.3647897225.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 3647897225,0.2 -S3357757885
+./outputs/Xena/000002.917731034.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 917731034,0.2 -S3357757885
+./outputs/Xena/000002.4116285959.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 4116285959,0.2 -S3357757885
+./outputs/Xena/000002.1614299449.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 1614299449,0.2 -S3357757885
+./outputs/Xena/000002.1335553075.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 1335553075,0.2 -S3357757885
+~~~
+
+Note that the output for each image has a -V option giving the
+"variant subseed" for that image, consisting of a seed followed by the
+variation amount used to generate it.
+
+This gives us a series of closely-related variations, including the
+two shown here.
+
+<img src="static/variation_walkthru/000002.3647897225.png">
+<img src="static/variation_walkthru/000002.1614299449.png">
+
+
+I like the expression on Xena's face in the first one (subseed
+3647897225), and the armor on her shoulder in the second one (subseed
+1614299449). Can we combine them to get the best of both worlds?
+
+We combine the two variations using -V (--with_variations). Again, we
+must provide the seed for the originally-chosen image in order for
+this to work.
+
+~~~
+dream> "prompt" -S3357757885 -V3647897225,0.1;1614299449,0.1
+Outputs:
+./outputs/Xena/000003.1614299449.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 3647897225,0.1;1614299449,0.1 -S3357757885
+~~~
+
+Here we are providing equal weights (0.1 and 0.1) for both the
+subseeds. The resulting image is close, but not exactly what I
+wanted:
+
+<img src="static/variation_walkthru/000003.1614299449.png">
+
+We can either try combining the images with different weights or
+generate more variations around the almost-but-not-quite image. We
+do the latter, using both the -V (combining) and -v (variation
+strength) options. Note that we use -n6 to generate 6 variations:
+
+~~~
+dream> "prompt" -S3357757885 -V3647897225,0.1;1614299449,0.1 -v0.05 -n6
+Outputs:
+./outputs/Xena/000004.3279757577.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 3647897225,0.1;1614299449,0.1;3279757577,0.05 -S3357757885
+./outputs/Xena/000004.2853129515.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 3647897225,0.1;1614299449,0.1;2853129515,0.05 -S3357757885
+./outputs/Xena/000004.3747154981.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 3647897225,0.1;1614299449,0.1;3747154981,0.05 -S3357757885
+./outputs/Xena/000004.2664260391.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 3647897225,0.1;1614299449,0.1;2664260391,0.05 -S3357757885
+./outputs/Xena/000004.1642517170.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 3647897225,0.1;1614299449,0.1;1642517170,0.05 -S3357757885
+./outputs/Xena/000004.2183375608.png: "prompt" -s50 -W512 -H512 -C7.5 -Ak_lms -V 3647897225,0.1;1614299449,0.1;2183375608,0.05 -S3357757885
+~~~
+
+This produces six images, all slight variations on the combination of
+the chosen two images. Here's the one I like best:
+
+<img src="static/variation_walkthru/000004.3747154981.png">
+
+As you can see, this is a very powerful tool which, when combined with
+subprompt weighting, gives you great control over the content and
+quality of your generated images.
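
The `-V` value used throughout the walkthrough is a `;`-separated list of `seed,weight` pairs, for example `3647897225,0.1;1614299449,0.1`. As a minimal sketch of how such a string could be parsed and range-checked, consider the following. `parse_with_variations` is a hypothetical helper written for illustration, not a function from this repository; only the range rule is taken from the assertion visible in `ldm/simplet2i.py`.

```python
from typing import List, Tuple


def parse_with_variations(spec: str) -> List[Tuple[int, float]]:
    """Parse 'seed,weight;seed,weight' into a list of (seed, weight) pairs.

    Hypothetical helper: the input format is inferred from the walkthrough,
    not from the project's actual argument parser.
    """
    pairs: List[Tuple[int, float]] = []
    for part in spec.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing ';'
        seed_text, weight_text = part.split(",")
        pairs.append((int(seed_text), float(weight_text)))

    # Same range rule as the assertion in ldm/simplet2i.py.
    bad = [weight for _, weight in pairs if not 0 <= weight <= 1]
    if bad:
        raise ValueError(f"variation weights must be in [0.0, 1.0]: got {bad}")
    return pairs


if __name__ == "__main__":
    print(parse_with_variations("3647897225,0.1;1614299449,0.1"))
```

Raising `ValueError` instead of using `assert` is a choice made for this sketch so the check still runs when Python is started with `-O`.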
--- a/ldm/simplet2i.py
+++ b/ldm/simplet2i.py
@@ -266,7 +266,6 @@ class T2I:
        """
        # TODO: convert this into a getattr() loop
        steps                 = steps      or self.steps
-        seed                  = seed       or self.seed
        width                 = width      or self.width
        height                = height     or self.height
        cfg_scale             = cfg_scale  or self.cfg_scale
@@ -296,6 +295,7 @@ class T2I:
            assert all(0 <= weight <= 1 for _, weight in with_variations),\
                f'variation weights must be in [0.0, 1.0]: got {[weight for _, weight in with_variations]}'

+        seed                  = seed       or self.seed
        width, height, _ = self._resolution_check(width, height, log=True)

        # TODO: - Check if this is still necessary to run on M1 devices.
@@ -319,7 +319,7 @@ class T2I:
            if init_img:
                assert os.path.exists(init_img), f'{init_img}: File not found'
                init_image = self._load_img(init_img, width, height, fit).to(self.device)
-                with scope(device.type):
+                with scope(self.device.type):
                    init_latent = self.model.get_first_stage_encoding(
                        self.model.encode_first_stage(init_image)
                    ) # move to latent space
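
The first two `simplet2i.py` hunks move `seed = seed or self.seed` below the variation checks. The hunks show only the weight assertion, so the reason is not stated. One plausible motivation, assuming a check between the old and new positions tests whether the caller supplied a seed, is that applying the default first would make such a check impossible to fail. The sketch below illustrates that ordering effect with hypothetical names (`DEFAULT_SEED`, `resolve_seed_default_first`, `resolve_seed_validate_first`); it is not code from this repository.

```python
from typing import Optional

DEFAULT_SEED = 42  # hypothetical stand-in for self.seed


def resolve_seed_default_first(seed: Optional[int], wants_variations: bool) -> int:
    """Apply the default before validating: the check below can never fail."""
    seed = seed or DEFAULT_SEED
    if wants_variations and seed is None:
        raise ValueError("seed must be specified when using variations")
    return seed


def resolve_seed_validate_first(seed: Optional[int], wants_variations: bool) -> int:
    """Validate the caller's value first, then fall back to the default."""
    if wants_variations and seed is None:
        raise ValueError("seed must be specified when using variations")
    return seed or DEFAULT_SEED


if __name__ == "__main__":
    # Default-first silently accepts a missing seed.
    print(resolve_seed_default_first(None, wants_variations=True))
    # Validate-first rejects it.
    try:
        resolve_seed_validate_first(None, wants_variations=True)
    except ValueError as err:
        print(f"rejected: {err}")
```

Note that `seed or DEFAULT_SEED` also replaces a seed of `0`, because `0` is falsy in Python; the diff keeps that idiom, so the sketch does too.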
--- a/static/variation_walkthru/000001.3357757885.png
+++ b/static/variation_walkthru/000001.3357757885.png
--- a/static/variation_walkthru/000002.1614299449.png
+++ b/static/variation_walkthru/000002.1614299449.png
--- a/static/variation_walkthru/000002.3647897225.png
+++ b/static/variation_walkthru/000002.3647897225.png
--- a/static/variation_walkthru/000003.1614299449.png
+++ b/static/variation_walkthru/000003.1614299449.png
--- a/static/variation_walkthru/000004.3747154981.png
+++ b/static/variation_walkthru/000004.3747154981.png