Add Latent Diffusion & Image Lab (#17)

* Add Latent Diffusion & Image Lab
* Update versions
2022-09-02 09:55:36 +02:00
commit 089fc524d8
parent 0d8b7d4ac8
5 changed files with 77 additions and 32 deletions
--- a/.github/ISSUE_TEMPLATE/bug.md
+++ b/.github/ISSUE_TEMPLATE/bug.md
@@ -7,7 +7,7 @@ assignees: ''

 ---

-**Has this issue been opened before? Check issues [here](https://github.com/AbdBarho/stable-diffusion-webui-docker/issues?q=is%3Aissue) and in [this one as well](https://github.com/hlky/stable-diffusion-webui)**
+**Has this issue been opened before? Check the [FAQ](https://github.com/AbdBarho/stable-diffusion-webui-docker/wiki/Main), the [issues](https://github.com/AbdBarho/stable-diffusion-webui-docker/issues?q=is%3Aissue) and in [the issues in the WebUI repo](https://github.com/hlky/stable-diffusion-webui)**



--- a/README.md
+++ b/README.md
@@ -2,19 +2,23 @@

 Run Stable Diffusion on your machine with a nice UI without any hassle!

-This repository provides the [WebUI](https://github.com/hlky/stable-diffusion-webui) as docker for easy setup and deployment. Please note that this repo delivers all cutting-edge unstable changes from the WebUI, so expect some bugs.
+This repository provides the [WebUI](https://github.com/hlky/stable-diffusion-webui) as a docker image for easy setup and deployment. Please note that the WebUI is experimental and evolving quickly, so expect some bugs.

-### Features
+## Features

 - Interactive UI with many features, and more on the way!
 - Support for 6GB GPU cards.
 - GFPGAN for face reconstruction, RealESRGAN for super-sampling.
-- [Textual Inversion](https://github.com/hlky/sd-enable-textual-inversion)
+- Experimental:
+  - [Textual Inversion](https://github.com/hlky/sd-enable-textual-inversion)
+  - Latent Diffusion Super Resolution
+  - GoBig
+  - GoLatent
 - many more!

 ## Setup

-make sure you have docker installed and up to date. Download this repo and run:
+Make sure you have an **up to date** version of docker installed. Download this repo and run:

 ```
 docker compose build
@@ -25,15 +29,20 @@ you can let it build in the background while you download the different models
 - [Stable Diffusion v1.4 (4GB)](https://www.googleapis.com/storage/v1/b/aai-blog-files/o/sd-v1-4.ckpt?alt=media), rename to `model.ckpt`
 - (Optional) [GFPGANv1.3.pth (333MB)](https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth).
 - (Optional) [RealESRGAN_x4plus.pth (64MB)](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth) and [RealESRGAN_x4plus_anime_6B.pth (18MB)](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth).
+- (Optional) [LDSR](https://heibox.uni-heidelberg.de/f/578df07c8fc04ffbadf3/?dl=1) and [its configuration file](https://heibox.uni-heidelberg.de/f/31a76b13ea27482981b4/?dl=1), rename to `LDSR.ckpt` and `LDSR.yaml` respectively.
+<!-- - (Optional) [RealESRGAN_x2plus.pth (64MB)](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth)
+- TODO: (I still need to find the RealESRGAN_x2plus_6b.pth) -->

 Put all of the downloaded files in the `models` folder, it should look something like this:

 ```
 models/
+├── model.ckpt
 ├── GFPGANv1.3.pth
 ├── RealESRGAN_x4plus.pth
 ├── RealESRGAN_x4plus_anime_6B.pth
-└── model.ckpt
+├── LDSR.ckpt
+└── LDSR.yaml
 ```

 ## Run
@@ -52,12 +61,9 @@ Note: the first start will take sometime as some other models will be downloaded

 in the `docker-compose.yml` you can change the `CLI_ARGS` variable, which contains the arguments that will be passed to the WebUI. By default: `--extra-models-cpu --optimized-turbo` are given, which allow you to use this model on a 6GB GPU. However, some features might not be available in the mode.

-[You can find the full list of arguments here](https://github.com/hlky/stable-diffusion/blob/c5b2c86f1479dec75b0e92dd37f9357a68594bda/scripts/webui.py)
+[You can find the full list of arguments here.](https://github.com/hlky/stable-diffusion/blob/d667ff52a36b4e79526f01555bfbf85428f334ce/scripts/webui.py)

-## FAQ
-
-- To enable [Textual Inversion](https://github.com/hlky/sd-enable-textual-inversion) remove `--optimize` and `--optimize-turbo` flags and add `--no-half`, [more info here](https://github.com/AbdBarho/stable-diffusion-webui-docker/issues/6).
-- If [output is a always green imagee](https://github.com/AbdBarho/stable-diffusion-webui-docker/issues/9), use `--precision full --no-half`.
+You can find fixes to common issues [in the wiki page.](https://github.com/AbdBarho/stable-diffusion-webui-docker/wiki/Main)

 # Disclaimer

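Review note on the README change: the updated `models/` layout lists one checkpoint plus several optional files. A minimal sketch of a pre-flight check for that layout follows. `check_models` is a hypothetical helper, not part of this repository, and treating `model.ckpt` as required is an assumption based on the `--ckpt /models/model.ckpt` argument in the Dockerfile `CMD`. The demo runs against a throwaway directory rather than the real `models` folder.

```shell
#!/bin/sh
# Sketch: report which of the model files named in the README are present.
# check_models is a hypothetical helper, not part of the repository.
check_models() {
  dir=$1
  missing_required=0
  # Assumption: model.ckpt is the only required file (the README marks the rest optional).
  if [ -f "$dir/model.ckpt" ]; then
    echo "found    model.ckpt"
  else
    echo "MISSING  model.ckpt (required)"
    missing_required=1
  fi
  for name in GFPGANv1.3.pth RealESRGAN_x4plus.pth RealESRGAN_x4plus_anime_6B.pth LDSR.ckpt LDSR.yaml; do
    if [ -f "$dir/$name" ]; then
      echo "found    $name"
    else
      echo "absent   $name (optional)"
    fi
  done
  # Non-zero exit status only when the required checkpoint is missing.
  return "$missing_required"
}

# Demo against a temporary directory with two of the files present.
demo_dir=$(mktemp -d)
touch "$demo_dir/model.ckpt" "$demo_dir/LDSR.ckpt"
check_models "$demo_dir"
```

Pointing `check_models` at the real folder (for example `check_models ./models`) before `docker compose up` would surface a misnamed file early. The feature-to-file mapping itself is handled by `build/mount.sh` in this commit.
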
--- a/build/Dockerfile
+++ b/build/Dockerfile
@@ -2,18 +2,23 @@

 FROM continuumio/miniconda3:4.12.0

-
 RUN conda install python=3.8.5 && conda clean -a -y
 RUN conda install pytorch==1.11.0 torchvision==0.12.0 cudatoolkit=11.3 -c pytorch && conda clean -a -y
 RUN git clone https://github.com/hlky/stable-diffusion.git && cd stable-diffusion && git reset --hard ff8c2d0b709f1e4180fb19fa5c27ec28c414cedd
 RUN conda env update --file stable-diffusion/environment.yaml --name base && conda clean -a -y

+# Fix: Module PIL has not attribute "Resampling"
+RUN conda install -c anaconda pillow==9.2.0 && conda clean -a -y
+
+
+SHELL ["/bin/bash", "-ceuxo", "pipefail"]
+
 # fonts for generating the grid
 RUN apt-get update && apt install fonts-dejavu-core rsync -y && apt-get clean

 # Note: don't update the sha of previous versions because the install will take forever
 # instead, update the repo state in a later step
-RUN cd stable-diffusion && git pull && git reset --hard c5b2c86f1479dec75b0e92dd37f9357a68594bda && \
+RUN cd stable-diffusion && git pull && git reset --hard d667ff52a36b4e79526f01555bfbf85428f334ce && \
  conda env update --file environment.yaml --name base && conda clean -a -y

 # download dev UI version, update the sha below in case you want some other version
@@ -28,31 +33,32 @@ RUN cd stable-diffusion && git pull && git reset --hard c5b2c86f1479dec75b0e92dd
 # cd / && rm -rf stable-diffusion-webui
 # EOF

-# Textual-inversion:
+# Textual inversion
 RUN <<EOF
 git clone https://github.com/hlky/sd-enable-textual-inversion.git &&
 cd /sd-enable-textual-inversion && git reset --hard 08f9b5046552d17cf7327b30a98410222741b070 &&
-rsync -a /sd-enable-textual-inversion/ /stable-diffusion/
+rsync -a /sd-enable-textual-inversion/ /stable-diffusion/ &&
+rm -rf /sd-enable-textual-inversion
 EOF

+# Latent diffusion
+RUN <<EOF
+git clone https://github.com/devilismyfriend/latent-diffusion &&
+cd /latent-diffusion &&
+git reset --hard 4119cf038fb953360fb004e48adb9913eed3594a &&
+# hacks all the way down
+mv ldm ldm_latent &&
+sed -i -- 's/from ldm/from ldm_latent/g' *.py
+# dont forget to update the yaml!!
+EOF
+
+
 # add info
-COPY info.py /info.py
-RUN  python /info.py /stable-diffusion/frontend/frontend.py
+COPY . /docker/
+RUN python /docker/info.py /stable-diffusion/frontend/frontend.py

 WORKDIR /stable-diffusion
-ENV TRANSFORMERS_CACHE=/cache/transformers TORCH_HOME=/cache/torch CLI_ARGS="" \
-  GFPGAN_PATH=/stable-diffusion/src/gfpgan/experiments/pretrained_models/GFPGANv1.3.pth \
-  RealESRGAN_PATH=/stable-diffusion/src/realesrgan/experiments/pretrained_models/RealESRGAN_x4plus.pth \
-  RealESRGAN_ANIME_PATH=/stable-diffusion/src/realesrgan/experiments/pretrained_models/RealESRGAN_x4plus_anime_6B.pth
+ENV TRANSFORMERS_CACHE=/cache/transformers TORCH_HOME=/cache/torch CLI_ARGS=""
 EXPOSE 7860
-CMD \
-  for path in "${GFPGAN_PATH}" "${RealESRGAN_PATH}" "${RealESRGAN_ANIME_PATH}"; do \
-  name=$(basename "${path}"); \
-  base=$(dirname "${path}"); \
-  test -f "/models/${name}" && mkdir -p "${base}" && ln -sf "/models/${name}" "${path}" && echo "Mounted ${name}";\
-  done;\
-  # force facexlib cache
-  mkdir -p /cache/weights/ && rm -rf /stable-diffusion/src/facexlib/facexlib/weights && \
-  ln -sf  /cache/weights/ /stable-diffusion/src/facexlib/facexlib/ && \
-  # run, -u to not buffer stdout / stderr
-  python3 -u scripts/webui.py --outdir /output --ckpt /models/model.ckpt --save-metadata ${CLI_ARGS}
+# run, -u to not buffer stdout / stderr
+CMD /docker/mount.sh && python3 -u scripts/webui.py --outdir /output --ckpt /models/model.ckpt --ldsr-dir /latent-diffusion --save-metadata ${CLI_ARGS}
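Review note on the "Latent diffusion" step: the Dockerfile renames the `ldm` package to `ldm_latent` and rewrites `from ldm` imports, and `mount.sh` applies the matching rewrite to the YAML config. The reason is presumably to avoid a clash with the `ldm` package in `stable-diffusion`, but that is an assumption, since the commit does not say. The sketch below reproduces both substitutions on throwaway files. The file contents are illustrative, and the import rewrite uses a temporary file instead of `sed -i` for portability.

```shell
#!/bin/sh
# Sketch of the ldm -> ldm_latent rename on throwaway files (contents are illustrative).
work=$(mktemp -d)
mkdir "$work/ldm"
printf 'from ldm.util import instantiate_from_config\nimport os\n' > "$work/demo.py"
printf 'model:\n  target: ldm.models.diffusion.ddpm.LatentDiffusion\n' > "$work/LDSR.yaml"

# 1. Rename the package directory, as the Dockerfile does.
mv "$work/ldm" "$work/ldm_latent"

# 2. Rewrite imports in the top-level *.py files (same expression as the Dockerfile).
for f in "$work"/*.py; do
  sed 's/from ldm/from ldm_latent/g' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

# 3. Rewrite dotted module paths in the config (same expression as mount.sh).
sed 's/ldm\./ldm_latent\./g' "$work/LDSR.yaml" > "$work/project.yaml"

cat "$work/demo.py" "$work/project.yaml"
```

Two properties of these expressions are worth keeping in mind. The import rewrite is not idempotent: a second pass would turn `from ldm_latent` into `from ldm_latent_latent`, which is harmless in a single `RUN` layer but matters if the step is ever repeated. It also only touches `from ldm` statements in files matched by `*.py` in the current directory, so `import ldm` forms and files in subdirectories are left unchanged.
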
--- a/build/mount.sh
+++ b/build/mount.sh
@@ -0,0 +1,31 @@
+#!/bin/bash
+
+set -e
+
+declare -A MODELS
+MODELS["/stable-diffusion/src/gfpgan/experiments/pretrained_models/GFPGANv1.3.pth"]=GFPGANv1.3.pth
+MODELS["/stable-diffusion/src/realesrgan/experiments/pretrained_models/RealESRGAN_x4plus.pth"]=RealESRGAN_x4plus.pth
+MODELS["/stable-diffusion/src/realesrgan/experiments/pretrained_models/RealESRGAN_x4plus_anime_6B.pth"]=RealESRGAN_x4plus_anime_6B.pth
+MODELS["/latent-diffusion/experiments/pretrained_models/model.ckpt"]=LDSR.ckpt
+# MODELS["/latent-diffusion/experiments/pretrained_models/project.yaml"]=LDSR.yaml
+
+for path in "${!MODELS[@]}"; do
+  name=${MODELS[$path]}
+  base=$(dirname "${path}")
+  from_path="/models/${name}"
+  if test -f "${from_path}"; then
+    mkdir -p "${base}" && ln -sf "${from_path}" "${path}" && echo "Mounted ${name}"
+  else
+    echo "Skipping ${name}"
+  fi
+done
+
+# hack for latent-diffusion
+if test -f /models/LDSR.yaml; then
+  sed 's/ldm\./ldm_latent\./g' /models/LDSR.yaml >/latent-diffusion/experiments/pretrained_models/project.yaml
+fi
+
+# force facexlib cache
+mkdir -p /cache/weights/
+rm -rf /stable-diffusion/src/facexlib/facexlib/weights
+ln -sf /cache/weights/ /stable-diffusion/src/facexlib/facexlib/
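Review note on `mount.sh`: the loop links each file found in `/models` to the path its consumer expects and reports the ones it skips. A minimal sketch of that link-or-skip behaviour follows. It runs inside a temporary directory rather than the container paths and uses a function in place of the bash associative array so it also runs under plain `sh`. `link_model` is a hypothetical helper name.

```shell
#!/bin/sh
# link_model TARGET_PATH SOURCE_DIR NAME
# Hypothetical helper mirroring the loop body in mount.sh.
link_model() {
  path=$1
  from_path="$2/$3"
  if test -f "$from_path"; then
    # Create the parent directory, then symlink the model into place.
    mkdir -p "$(dirname "$path")" && ln -sf "$from_path" "$path" && echo "Mounted $3"
  else
    echo "Skipping $3"
  fi
}

# Throwaway stand-in for the container filesystem: only LDSR.ckpt is "downloaded".
root=$(mktemp -d)
mkdir "$root/models"
touch "$root/models/LDSR.ckpt"

link_model "$root/latent-diffusion/experiments/pretrained_models/model.ckpt" "$root/models" LDSR.ckpt
link_model "$root/stable-diffusion/src/gfpgan/experiments/pretrained_models/GFPGANv1.3.pth" "$root/models" GFPGANv1.3.pth
```

With only `LDSR.ckpt` present, the first call creates the symlink and the second leaves the target tree untouched. Because the check is an `if` rather than a bare `test -f ... &&` chain, a missing optional model does not trip `set -e` in the real script.
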
--- a/models/.gitignore
+++ b/models/.gitignore
@@ -2,3 +2,5 @@
 /GFPGANv1.3.pth
 /RealESRGAN_x4plus.pth
 /RealESRGAN_x4plus_anime_6B.pth
+/LDSR.ckpt
+/LDSR.yaml