mirror of
https://github.com/invoke-ai/InvokeAI
synced 2025-07-26 05:17:55 +00:00
docs: clean up contributing docs
This commit is contained in:
committed by
Kent Keirsey
parent
b6190651ad
commit
3fcd3d490e
@ -2,40 +2,41 @@
|
||||
|
||||
Invoke AI originated as a project built by the community, and that vision carries forward today as we aim to build the best pro-grade tools available. We work together to incorporate the latest in AI/ML research, making these tools available in over 20 languages to artists and creatives around the world as part of our fully permissive OSS project designed for individual users to self-host and use.
|
||||
|
||||
|
||||
# Methods of Contributing to Invoke AI
|
||||
Anyone who wishes to contribute to InvokeAI, whether features, bug fixes, code cleanup, testing, code reviews, documentation or translation is very much encouraged to do so.
|
||||
We welcome contributions, whether features, bug fixes, code cleanup, testing, code reviews, documentation or translation. Please check in with us before diving in to code to ensure your work aligns with our vision.
|
||||
|
||||
## Development
|
||||
If you’d like to help with development, please see our [development guide](contribution_guides/development.md).
|
||||
|
||||
If you’d like to help with development, please see our [development guide](contribution_guides/development.md).
|
||||
|
||||
**New Contributors:** If you’re unfamiliar with contributing to open source projects, take a look at our [new contributor guide](contribution_guides/newContributorChecklist.md).
|
||||
|
||||
## Nodes
|
||||
|
||||
If you’d like to add a Node, please see our [nodes contribution guide](../nodes/contributingNodes.md).
|
||||
|
||||
## Support and Triaging
|
||||
Helping support other users in [Discord](https://discord.gg/ZmtBAhwWhy) and on Github are valuable forms of contribution that we greatly appreciate.
|
||||
|
||||
We receive many issues and requests for help from users. We're limited in bandwidth relative to our the user base, so providing answers to questions or helping identify causes of issues is very helpful. By doing this, you enable us to spend time on the highest priority work.
|
||||
Helping support other users in [Discord](https://discord.gg/ZmtBAhwWhy) and on Github are valuable forms of contribution that we greatly appreciate.
|
||||
|
||||
We receive many issues and requests for help from users. We're limited in bandwidth relative to our the user base, so providing answers to questions or helping identify causes of issues is very helpful. By doing this, you enable us to spend time on the highest priority work.
|
||||
|
||||
## Documentation
|
||||
|
||||
If you’d like to help with documentation, please see our [documentation guide](contribution_guides/documentation.md).
|
||||
|
||||
## Translation
|
||||
|
||||
If you'd like to help with translation, please see our [translation guide](contribution_guides/translation.md).
|
||||
|
||||
## Tutorials
|
||||
Please reach out to @imic or @hipsterusername on [Discord](https://discord.gg/ZmtBAhwWhy) to help create tutorials for InvokeAI.
|
||||
## Tutorials
|
||||
|
||||
We hope you enjoy using our software as much as we enjoy creating it, and we hope that some of those of you who are reading this will elect to become part of our contributor community.
|
||||
Please reach out to @hipsterusername on [Discord](https://discord.gg/ZmtBAhwWhy) to help create tutorials for InvokeAI.
|
||||
|
||||
|
||||
# Contributors
|
||||
## Contributors
|
||||
|
||||
This project is a combined effort of dedicated people from across the world. [Check out the list of all these amazing people](https://invoke-ai.github.io/InvokeAI/other/CONTRIBUTORS/). We thank them for their time, hard work and effort.
|
||||
|
||||
# Code of Conduct
|
||||
## Code of Conduct
|
||||
|
||||
The InvokeAI community is a welcoming place, and we want your help in maintaining that. Please review our [Code of Conduct](https://github.com/invoke-ai/InvokeAI/blob/main/CODE_OF_CONDUCT.md) to learn more - it's essential to maintaining a respectful and inclusive environment.
|
||||
|
||||
@ -49,12 +50,3 @@ By making a contribution to this project, you certify that:
|
||||
This disclaimer is not a license and does not grant any rights or permissions. You must obtain necessary permissions and licenses, including from third parties, before contributing to this project.
|
||||
|
||||
This disclaimer is provided "as is" without warranty of any kind, whether expressed or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, or non-infringement. In no event shall the authors or copyright holders be liable for any claim, damages, or other liability, whether in an action of contract, tort, or otherwise, arising from, out of, or in connection with the contribution or the use or other dealings in the contribution.
|
||||
# Support
|
||||
|
||||
For support, please use this repository's [GitHub Issues](https://github.com/invoke-ai/InvokeAI/issues), or join the [Discord](https://discord.gg/ZmtBAhwWhy).
|
||||
|
||||
Original portions of the software are Copyright (c) 2023 by respective contributors.
|
||||
|
||||
---
|
||||
|
||||
Remember, your contributions help make this project great. We're excited to see what you'll bring to our community!
|
||||
|
@ -1,13 +1,13 @@
|
||||
# Documentation
|
||||
|
||||
Documentation is an important part of any open source project. It provides a clear and concise way to communicate how the software works, how to use it, and how to troubleshoot issues. Without proper documentation, it can be difficult for users to understand the purpose and functionality of the project.
|
||||
Documentation is an important part of any open source project. It provides a clear and concise way to communicate how the software works, how to use it, and how to troubleshoot issues. Without proper documentation, it can be difficult for users to understand the purpose and functionality of the project.
|
||||
|
||||
## Contributing
|
||||
|
||||
All documentation is maintained in the InvokeAI GitHub repository. If you come across documentation that is out of date or incorrect, please submit a pull request with the necessary changes.
|
||||
All documentation is maintained in our [GitHub repository](https://github.com/invoke-ai/InvokeAI). If you come across documentation that is out of date or incorrect, please submit a pull request with the necessary changes.
|
||||
|
||||
When updating or creating documentation, please keep in mind InvokeAI is a tool for everyone, not just those who have familiarity with generative art.
|
||||
When updating or creating documentation, please keep in mind Invoke is a tool for everyone, not just those who have familiarity with generative art.
|
||||
|
||||
## Help & Questions
|
||||
|
||||
Please ping @imic or @hipsterusername in the [Discord](https://discord.com/channels/1020123559063990373/1049495067846524939) if you have any questions.
|
||||
Please ping @hipsterusername on [Discord](https://discord.gg/ZmtBAhwWhy) if you have any questions.
|
||||
|
@ -16,4 +16,4 @@ Please check Weblate's [documentation](https://docs.weblate.org/en/latest/index
|
||||
|
||||
## Thanks
|
||||
|
||||
Thanks to the InvokeAI community for their efforts to translate the project!
|
||||
Thanks to the InvokeAI community for their efforts to translate the project!
|
||||
|
54
docs/contributors.md
Normal file
54
docs/contributors.md
Normal file
@ -0,0 +1,54 @@
|
||||
---
|
||||
title: Contributors
|
||||
---
|
||||
|
||||
We thank [all contributors](https://github.com/invoke-ai/InvokeAI/graphs/contributors) for their time and hard work!
|
||||
|
||||
## **Original Author**
|
||||
|
||||
- [Lincoln D. Stein](mailto:lincoln.stein@gmail.com)
|
||||
|
||||
## **Current Core Team**
|
||||
|
||||
- @lstein (Lincoln Stein) - Co-maintainer
|
||||
- @blessedcoolant - Co-maintainer
|
||||
- @hipsterusername (Kent Keirsey) - Co-maintainer, CEO, Positive Vibes
|
||||
- @psychedelicious (Spencer Mabrito) - Web Team Leader
|
||||
- @joshistoast (Josh Corbett) - Web Development
|
||||
- @cheerio (Mary Rogers) - Lead Engineer & Web App Development
|
||||
- @ebr (Eugene Brodsky) - Cloud/DevOps/Sofware engineer; your friendly neighbourhood cluster-autoscaler
|
||||
- @sunija - Standalone version
|
||||
- @brandon (Brandon Rising) - Platform, Infrastructure, Backend Systems
|
||||
- @ryanjdick (Ryan Dick) - Machine Learning & Training
|
||||
- @JPPhoto - Core image generation nodes
|
||||
- @dunkeroni - Image generation backend
|
||||
- @SkunkWorxDark - Image generation backend
|
||||
- @glimmerleaf (Devon Hopkins) - Community Wizard
|
||||
- @gogurt enjoyer - Discord moderator and end user support
|
||||
- @whosawhatsis - Discord moderator and end user support
|
||||
- @dwringer - Discord moderator and end user support
|
||||
- @526christian - Discord moderator and end user support
|
||||
- @harvester62 - Discord moderator and end user support
|
||||
|
||||
## **Honored Team Alumni**
|
||||
|
||||
- @StAlKeR7779 (Sergey Borisov) - Torch stack, ONNX, model management, optimization
|
||||
- @damian0815 - Attention Systems and Compel Maintainer
|
||||
- @netsvetaev (Artur) - Localization support
|
||||
- @Kyle0654 (Kyle Schouviller) - Node Architect and General Backend Wizard
|
||||
- @tildebyte - Installation and configuration
|
||||
- @mauwii (Matthias Wilde) - Installation, release, continuous integration
|
||||
- @chainchompa (Jennifer Player) - Web Development & Chain-Chomping
|
||||
- @millu (Millun Atluri) - Community Wizard, Documentation, Node-wrangler,
|
||||
- @genomancer (Gregg Helt) - Controlnet support
|
||||
- @keturn (Kevin Turner) - Diffusers
|
||||
|
||||
## **Original CompVis (Stable Diffusion) Authors**
|
||||
|
||||
- [Robin Rombach](https://github.com/rromb)
|
||||
- [Patrick von Platen](https://github.com/patrickvonplaten)
|
||||
- [ablattmann](https://github.com/ablattmann)
|
||||
- [Patrick Esser](https://github.com/pesser)
|
||||
- [owenvincent](https://github.com/owenvincent)
|
||||
- [apolinario](https://github.com/apolinario)
|
||||
- [Charles Packer](https://github.com/cpacker)
|
@ -1,380 +0,0 @@
|
||||
---
|
||||
title: Contributors
|
||||
---
|
||||
|
||||
# :octicons-person-24: Contributors
|
||||
|
||||
The list of all the amazing people who have contributed to the various features that you get to
|
||||
experience in this fork.
|
||||
|
||||
We thank them for all of their time and hard work.
|
||||
|
||||
## **Original Author**
|
||||
|
||||
- [Lincoln D. Stein](mailto:lincoln.stein@gmail.com)
|
||||
|
||||
## **Current Core Team**
|
||||
|
||||
* @lstein (Lincoln Stein) - Co-maintainer
|
||||
* @blessedcoolant - Co-maintainer
|
||||
* @hipsterusername (Kent Keirsey) - Co-maintainer, CEO, Positive Vibes
|
||||
* @psychedelicious (Spencer Mabrito) - Web Team Leader
|
||||
* @chainchompa (Jennifer Player) - Web Development & Chain-Chomping
|
||||
* @josh is toast (Josh Corbett) - Web Development
|
||||
* @cheerio (Mary Rogers) - Lead Engineer & Web App Development
|
||||
* @ebr (Eugene Brodsky) - Cloud/DevOps/Sofware engineer; your friendly neighbourhood cluster-autoscaler
|
||||
* @sunija - Standalone version
|
||||
* @genomancer (Gregg Helt) - Controlnet support
|
||||
* @brandon (Brandon Rising) - Platform, Infrastructure, Backend Systems
|
||||
* @ryanjdick (Ryan Dick) - Machine Learning & Training
|
||||
* @JPPhoto - Core image generation nodes
|
||||
* @dunkeroni - Image generation backend
|
||||
* @SkunkWorxDark - Image generation backend
|
||||
* @keturn (Kevin Turner) - Diffusers
|
||||
* @millu (Millun Atluri) - Community Wizard, Documentation, Node-wrangler,
|
||||
* @glimmerleaf (Devon Hopkins) - Community Wizard
|
||||
* @gogurt enjoyer - Discord moderator and end user support
|
||||
* @whosawhatsis - Discord moderator and end user support
|
||||
* @dwinrger - Discord moderator and end user support
|
||||
* @526christian - Discord moderator and end user support
|
||||
* @harvester62 - Discord moderator and end user support
|
||||
|
||||
|
||||
## **Honored Team Alumni**
|
||||
|
||||
* @StAlKeR7779 (Sergey Borisov) - Torch stack, ONNX, model management, optimization
|
||||
* @damian0815 - Attention Systems and Compel Maintainer
|
||||
* @netsvetaev (Artur) - Localization support
|
||||
* @Kyle0654 (Kyle Schouviller) - Node Architect and General Backend Wizard
|
||||
* @tildebyte - Installation and configuration
|
||||
* @mauwii (Matthias Wilde) - Installation, release, continuous integration
|
||||
|
||||
|
||||
## **Full List of Contributors by Commit Name**
|
||||
|
||||
- 이승석
|
||||
- AbdBarho
|
||||
- ablattmann
|
||||
- AdamOStark
|
||||
- Adam Rice
|
||||
- Airton Silva
|
||||
- Aldo Hoeben
|
||||
- Alexander Eichhorn
|
||||
- Alexandre D. Roberge
|
||||
- Alexandre Macabies
|
||||
- Alfie John
|
||||
- Andreas Rozek
|
||||
- Andre LaBranche
|
||||
- Andy Bearman
|
||||
- Andy Luhrs
|
||||
- Andy Pilate
|
||||
- Anonymous
|
||||
- Anthony Monthe
|
||||
- Any-Winter-4079
|
||||
- apolinario
|
||||
- Ar7ific1al
|
||||
- ArDiouscuros
|
||||
- Armando C. Santisbon
|
||||
- Arnold Cordewiner
|
||||
- Arthur Holstvoogd
|
||||
- artmen1516
|
||||
- Artur
|
||||
- Arturo Mendivil
|
||||
- Ben Alkov
|
||||
- Benjamin Warner
|
||||
- Bernard Maltais
|
||||
- blessedcoolant
|
||||
- blhook
|
||||
- BlueAmulet
|
||||
- Bouncyknighter
|
||||
- Brandon
|
||||
- Brandon Rising
|
||||
- Brent Ozar
|
||||
- Brian Racer
|
||||
- bsilvereagle
|
||||
- c67e708d
|
||||
- camenduru
|
||||
- CapableWeb
|
||||
- Carson Katri
|
||||
- chainchompa
|
||||
- Chloe
|
||||
- Chris Dawson
|
||||
- Chris Hayes
|
||||
- Chris Jones
|
||||
- chromaticist
|
||||
- Claus F. Strasburger
|
||||
- cmdr2
|
||||
- cody
|
||||
- Conor Reid
|
||||
- Cora Johnson-Roberson
|
||||
- coreco
|
||||
- cosmii02
|
||||
- cpacker
|
||||
- Cragin Godley
|
||||
- creachec
|
||||
- CrypticWit
|
||||
- d8ahazard
|
||||
- damian
|
||||
- damian0815
|
||||
- Damian at mba
|
||||
- Damian Stewart
|
||||
- Daniel Manzke
|
||||
- Danny Beer
|
||||
- Dan Sully
|
||||
- Darren Ringer
|
||||
- David Burnett
|
||||
- David Ford
|
||||
- David Regla
|
||||
- David Sisco
|
||||
- David Wager
|
||||
- Daya Adianto
|
||||
- db3000
|
||||
- DekitaRPG
|
||||
- Denis Olshin
|
||||
- Dennis
|
||||
- dependabot[bot]
|
||||
- Dmitry Parnas
|
||||
- Dobrynia100
|
||||
- Dominic Letz
|
||||
- DrGunnarMallon
|
||||
- Drun555
|
||||
- dunkeroni
|
||||
- Edward Johan
|
||||
- elliotsayes
|
||||
- Elrik
|
||||
- ElrikUnderlake
|
||||
- Eric Khun
|
||||
- Eric Wolf
|
||||
- Eugene
|
||||
- Eugene Brodsky
|
||||
- ExperimentalCyborg
|
||||
- Fabian Bahl
|
||||
- Fabio 'MrWHO' Torchetti
|
||||
- Fattire
|
||||
- fattire
|
||||
- Felipe Nogueira
|
||||
- Félix Sanz
|
||||
- figgefigge
|
||||
- Gabriel Mackievicz Telles
|
||||
- gabrielrotbart
|
||||
- gallegonovato
|
||||
- Gérald LONLAS
|
||||
- Gille
|
||||
- GitHub Actions Bot
|
||||
- glibesyck
|
||||
- gogurtenjoyer
|
||||
- Gohsuke Shimada
|
||||
- greatwolf
|
||||
- greentext2
|
||||
- Gregg Helt
|
||||
- H4rk
|
||||
- Håvard Gulldahl
|
||||
- henry
|
||||
- Henry van Megen
|
||||
- hipsterusername
|
||||
- hj
|
||||
- Hosted Weblate
|
||||
- Iman Karim
|
||||
- ismail ihsan bülbül
|
||||
- ItzAttila
|
||||
- Ivan Efimov
|
||||
- jakehl
|
||||
- Jakub Kolčář
|
||||
- JamDon2
|
||||
- James Reynolds
|
||||
- Jan Skurovec
|
||||
- Jari Vetoniemi
|
||||
- Jason Toffaletti
|
||||
- Jaulustus
|
||||
- Jeff Mahoney
|
||||
- Jennifer Player
|
||||
- jeremy
|
||||
- Jeremy Clark
|
||||
- JigenD
|
||||
- Jim Hays
|
||||
- Johan Roxendal
|
||||
- Johnathon Selstad
|
||||
- Jonathan
|
||||
- Jordan Hewitt
|
||||
- Joseph Dries III
|
||||
- Josh Corbett
|
||||
- JPPhoto
|
||||
- jspraul
|
||||
- junzi
|
||||
- Justin Wong
|
||||
- Juuso V
|
||||
- Kaspar Emanuel
|
||||
- Katsuyuki-Karasawa
|
||||
- Keerigan45
|
||||
- Kent Keirsey
|
||||
- Kevin Brack
|
||||
- Kevin Coakley
|
||||
- Kevin Gibbons
|
||||
- Kevin Schaul
|
||||
- Kevin Turner
|
||||
- Kieran Klaassen
|
||||
- krummrey
|
||||
- Kyle
|
||||
- Kyle Lacy
|
||||
- Kyle Schouviller
|
||||
- Lawrence Norton
|
||||
- LemonDouble
|
||||
- Leo Pasanen
|
||||
- Lincoln Stein
|
||||
- LoganPederson
|
||||
- Lynne Whitehorn
|
||||
- majick
|
||||
- Marco Labarile
|
||||
- Marta Nahorniuk
|
||||
- Martin Kristiansen
|
||||
- Mary Hipp
|
||||
- maryhipp
|
||||
- Mary Hipp Rogers
|
||||
- mastercaster
|
||||
- mastercaster9000
|
||||
- Matthias Wild
|
||||
- mauwii
|
||||
- michaelk71
|
||||
- mickr777
|
||||
- Mihai
|
||||
- Mihail Dumitrescu
|
||||
- Mikhail Tishin
|
||||
- Millun Atluri
|
||||
- Minjune Song
|
||||
- Mitchell Allain
|
||||
- mitien
|
||||
- mofuzz
|
||||
- Muhammad Usama
|
||||
- Name
|
||||
- _nderscore
|
||||
- Neil Wang
|
||||
- nekowaiz
|
||||
- nemuruibai
|
||||
- Netzer R
|
||||
- Nicholas Koh
|
||||
- Nicholas Körfer
|
||||
- nicolai256
|
||||
- Niek van der Maas
|
||||
- noodlebox
|
||||
- Nuno Coração
|
||||
- ofirkris
|
||||
- Olivier Louvignes
|
||||
- owenvincent
|
||||
- pand4z31
|
||||
- Patrick Esser
|
||||
- Patrick Tien
|
||||
- Patrick von Platen
|
||||
- Paul Curry
|
||||
- Paul Sajna
|
||||
- pejotr
|
||||
- Peter Baylies
|
||||
- Peter Lin
|
||||
- plucked
|
||||
- prixt
|
||||
- psychedelicious
|
||||
- psychedelicious@windows
|
||||
- Rainer Bernhardt
|
||||
- Riccardo Giovanetti
|
||||
- Rich Jones
|
||||
- rmagur1203
|
||||
- Rob Baines
|
||||
- Robert Bolender
|
||||
- Robin Rombach
|
||||
- Rohan Barar
|
||||
- Rohinish
|
||||
- rpagliuca
|
||||
- rromb
|
||||
- Rupesh Sreeraman
|
||||
- Ryan
|
||||
- Ryan Cao
|
||||
- Ryan Dick
|
||||
- Saifeddine
|
||||
- Saifeddine ALOUI
|
||||
- Sam
|
||||
- SammCheese
|
||||
- Sam McLeod
|
||||
- Sammy
|
||||
- sammyf
|
||||
- Samuel Husso
|
||||
- Saurav Maheshkar
|
||||
- Scott Lahteine
|
||||
- Sean McLellan
|
||||
- Sebastian Aigner
|
||||
- Sergey Borisov
|
||||
- Sergey Krashevich
|
||||
- Shapor Naghibzadeh
|
||||
- Shawn Zhong
|
||||
- Simona Liliac
|
||||
- Simon Vans-Colina
|
||||
- skunkworxdark
|
||||
- slashtechno
|
||||
- SoheilRezaei
|
||||
- Song, Pengcheng
|
||||
- spezialspezial
|
||||
- ssantos
|
||||
- StAlKeR7779
|
||||
- Stefan Tobler
|
||||
- Stephan Koglin-Fischer
|
||||
- SteveCaruso
|
||||
- Steve Martinelli
|
||||
- Steven Frank
|
||||
- Surisen
|
||||
- System X - Files
|
||||
- Taylor Kems
|
||||
- techicode
|
||||
- techybrain-dev
|
||||
- tesseractcat
|
||||
- thealanle
|
||||
- Thomas
|
||||
- tildebyte
|
||||
- Tim Cabbage
|
||||
- Tom
|
||||
- Tom Elovi Spruce
|
||||
- Tom Gouville
|
||||
- tomosuto
|
||||
- Travco
|
||||
- Travis Palmer
|
||||
- tyler
|
||||
- unknown
|
||||
- user1
|
||||
- vedant-3010
|
||||
- Vedant Madane
|
||||
- veprogames
|
||||
- wa.code
|
||||
- wfng92
|
||||
- whjms
|
||||
- whosawhatsis
|
||||
- Will
|
||||
- William Becher
|
||||
- William Chong
|
||||
- Wilson E. Alvarez
|
||||
- woweenie
|
||||
- Wubbbi
|
||||
- xra
|
||||
- Yeung Yiu Hung
|
||||
- ymgenesis
|
||||
- Yorzaren
|
||||
- Yosuke Shinya
|
||||
- yun saki
|
||||
- ZachNagengast
|
||||
- Zadagu
|
||||
- zeptofine
|
||||
- Zerdoumi
|
||||
- Васянатор
|
||||
- 冯不游
|
||||
- 唐澤 克幸
|
||||
|
||||
## **Original CompVis (Stable Diffusion) Authors**
|
||||
|
||||
- [Robin Rombach](https://github.com/rromb)
|
||||
- [Patrick von Platen](https://github.com/patrickvonplaten)
|
||||
- [ablattmann](https://github.com/ablattmann)
|
||||
- [Patrick Esser](https://github.com/pesser)
|
||||
- [owenvincent](https://github.com/owenvincent)
|
||||
- [apolinario](https://github.com/apolinario)
|
||||
- [Charles Packer](https://github.com/cpacker)
|
||||
|
||||
---
|
||||
|
||||
_If you have contributed and don't see your name on the list of contributors, please let one of the
|
||||
collaborators know about the omission, or feel free to make a pull request._
|
@ -1,255 +0,0 @@
|
||||
---
|
||||
title: CompViz-Readme
|
||||
---
|
||||
|
||||
# _README from [CompViz/stable-diffusion](https://github.com/CompVis/stable-diffusion)_
|
||||
|
||||
_Stable Diffusion was made possible thanks to a collaboration with
|
||||
[Stability AI](https://stability.ai/) and [Runway](https://runwayml.com/) and
|
||||
builds upon our previous work:_
|
||||
|
||||
[**High-Resolution Image Synthesis with Latent Diffusion Models**](https://ommer-lab.com/research/latent-diffusion-models/)<br/>
|
||||
[Robin Rombach](https://github.com/rromb)\*,
|
||||
[Andreas Blattmann](https://github.com/ablattmann)\*,
|
||||
[Dominik Lorenz](https://github.com/qp-qp)\,
|
||||
[Patrick Esser](https://github.com/pesser),
|
||||
[Björn Ommer](https://hci.iwr.uni-heidelberg.de/Staff/bommer)<br/>
|
||||
|
||||
## **CVPR '22 Oral**
|
||||
|
||||
which is available on [GitHub](https://github.com/CompVis/latent-diffusion). PDF
|
||||
at [arXiv](https://arxiv.org/abs/2112.10752). Please also visit our
|
||||
[Project page](https://ommer-lab.com/research/latent-diffusion-models/).
|
||||
|
||||

|
||||
[Stable Diffusion](#stable-diffusion-v1) is a latent text-to-image diffusion
|
||||
model. Thanks to a generous compute donation from
|
||||
[Stability AI](https://stability.ai/) and support from
|
||||
[LAION](https://laion.ai/), we were able to train a Latent Diffusion Model on
|
||||
512x512 images from a subset of the [LAION-5B](https://laion.ai/blog/laion-5b/)
|
||||
database. Similar to Google's [Imagen](https://arxiv.org/abs/2205.11487), this
|
||||
model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text
|
||||
prompts. With its 860M UNet and 123M text encoder, the model is relatively
|
||||
lightweight and runs on a GPU with at least 10GB VRAM. See
|
||||
[this section](#stable-diffusion-v1) below and the
|
||||
[model card](https://huggingface.co/CompVis/stable-diffusion).
|
||||
|
||||
## Requirements
|
||||
|
||||
A suitable [conda](https://conda.io/) environment named `ldm` can be created and
|
||||
activated with:
|
||||
|
||||
```
|
||||
conda env create
|
||||
conda activate ldm
|
||||
```
|
||||
|
||||
Note that the first line may be abbreviated `conda env create`, since conda will
|
||||
look for `environment.yml` by default.
|
||||
|
||||
You can also update an existing
|
||||
[latent diffusion](https://github.com/CompVis/latent-diffusion) environment by
|
||||
running
|
||||
|
||||
```bash
|
||||
conda install pytorch torchvision -c pytorch
|
||||
pip install transformers==4.19.2
|
||||
pip install -e .
|
||||
```
|
||||
|
||||
## Stable Diffusion v1
|
||||
|
||||
Stable Diffusion v1 refers to a specific configuration of the model architecture
|
||||
that uses a downsampling-factor 8 autoencoder with an 860M UNet and CLIP
|
||||
ViT-L/14 text encoder for the diffusion model. The model was pretrained on
|
||||
256x256 images and then finetuned on 512x512 images.
|
||||
|
||||
\*Note: Stable Diffusion v1 is a general text-to-image diffusion model and
|
||||
therefore mirrors biases and (mis-)conceptions that are present in its training
|
||||
data. Details on the training procedure and data, as well as the intended use of
|
||||
the model can be found in the corresponding
|
||||
[model card](https://huggingface.co/CompVis/stable-diffusion). Research into the
|
||||
safe deployment of general text-to-image models is an ongoing effort. To prevent
|
||||
misuse and harm, we currently provide access to the checkpoints only for
|
||||
[academic research purposes upon request](https://stability.ai/academia-access-form).
|
||||
**This is an experiment in safe and community-driven publication of a capable
|
||||
and general text-to-image model. We are working on a public release with a more
|
||||
permissive license that also incorporates ethical considerations.\***
|
||||
|
||||
[Request access to Stable Diffusion v1 checkpoints for academic research](https://stability.ai/academia-access-form)
|
||||
|
||||
### Weights
|
||||
|
||||
We currently provide three checkpoints, `sd-v1-1.ckpt`, `sd-v1-2.ckpt` and
|
||||
`sd-v1-3.ckpt`, which were trained as follows,
|
||||
|
||||
- `sd-v1-1.ckpt`: 237k steps at resolution `256x256` on
|
||||
[laion2B-en](https://huggingface.co/datasets/laion/laion2B-en). 194k steps at
|
||||
resolution `512x512` on
|
||||
[laion-high-resolution](https://huggingface.co/datasets/laion/laion-high-resolution)
|
||||
(170M examples from LAION-5B with resolution `>= 1024x1024`).
|
||||
- `sd-v1-2.ckpt`: Resumed from `sd-v1-1.ckpt`. 515k steps at resolution
|
||||
`512x512` on "laion-improved-aesthetics" (a subset of laion2B-en, filtered to
|
||||
images with an original size `>= 512x512`, estimated aesthetics score `> 5.0`,
|
||||
and an estimated watermark probability `< 0.5`. The watermark estimate is from
|
||||
the LAION-5B metadata, the aesthetics score is estimated using an
|
||||
[improved aesthetics estimator](https://github.com/christophschuhmann/improved-aesthetic-predictor)).
|
||||
- `sd-v1-3.ckpt`: Resumed from `sd-v1-2.ckpt`. 195k steps at resolution
|
||||
`512x512` on "laion-improved-aesthetics" and 10\% dropping of the
|
||||
text-conditioning to improve
|
||||
[classifier-free guidance sampling](https://arxiv.org/abs/2207.12598).
|
||||
|
||||
Evaluations with different classifier-free guidance scales (1.5, 2.0, 3.0, 4.0,
|
||||
5.0, 6.0, 7.0, 8.0) and 50 PLMS sampling steps show the relative improvements of
|
||||
the checkpoints: 
|
||||
|
||||
### Text-to-Image with Stable Diffusion
|
||||
|
||||

|
||||

|
||||
|
||||
Stable Diffusion is a latent diffusion model conditioned on the (non-pooled)
|
||||
text embeddings of a CLIP ViT-L/14 text encoder.
|
||||
|
||||
#### Sampling Script
|
||||
|
||||
After [obtaining the weights](#weights), link them
|
||||
|
||||
```
|
||||
mkdir -p models/ldm/stable-diffusion-v1/
|
||||
ln -s <path/to/model.ckpt> models/ldm/stable-diffusion-v1/model.ckpt
|
||||
```
|
||||
|
||||
and sample with
|
||||
|
||||
```
|
||||
python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms
|
||||
```
|
||||
|
||||
By default, this uses a guidance scale of `--scale 7.5`,
|
||||
[Katherine Crowson's implementation](https://github.com/CompVis/latent-diffusion/pull/51)
|
||||
of the [PLMS](https://arxiv.org/abs/2202.09778) sampler, and renders images of
|
||||
size 512x512 (which it was trained on) in 50 steps. All supported arguments are
|
||||
listed below (type `python scripts/txt2img.py --help`).
|
||||
|
||||
```commandline
|
||||
usage: txt2img.py [-h] [--prompt [PROMPT]] [--outdir [OUTDIR]] [--skip_grid] [--skip_save] [--ddim_steps DDIM_STEPS] [--plms] [--laion400m] [--fixed_code] [--ddim_eta DDIM_ETA] [--n_iter N_ITER] [--H H] [--W W] [--C C] [--f F] [--n_samples N_SAMPLES] [--n_rows N_ROWS]
|
||||
[--scale SCALE] [--from-file FROM_FILE] [--config CONFIG] [--ckpt CKPT] [--seed SEED] [--precision {full,autocast}]
|
||||
|
||||
optional arguments:
|
||||
-h, --help show this help message and exit
|
||||
--prompt [PROMPT] the prompt to render
|
||||
--outdir [OUTDIR] dir to write results to
|
||||
--skip_grid do not save a grid, only individual samples. Helpful when evaluating lots of samples
|
||||
--skip_save do not save individual samples. For speed measurements.
|
||||
--ddim_steps DDIM_STEPS
|
||||
number of ddim sampling steps
|
||||
--plms use plms sampling
|
||||
--laion400m uses the LAION400M model
|
||||
--fixed_code if enabled, uses the same starting code across samples
|
||||
--ddim_eta DDIM_ETA ddim eta (eta=0.0 corresponds to deterministic sampling
|
||||
--n_iter N_ITER sample this often
|
||||
--H H image height, in pixel space
|
||||
--W W image width, in pixel space
|
||||
--C C latent channels
|
||||
--f F downsampling factor
|
||||
--n_samples N_SAMPLES
|
||||
how many samples to produce for each given prompt. A.k.a. batch size
|
||||
(note that the seeds for each image in the batch will be unavailable)
|
||||
--n_rows N_ROWS rows in the grid (default: n_samples)
|
||||
--scale SCALE unconditional guidance scale: eps = eps(x, empty) + scale * (eps(x, cond) - eps(x, empty))
|
||||
--from-file FROM_FILE
|
||||
if specified, load prompts from this file
|
||||
--config CONFIG path to config which constructs model
|
||||
--ckpt CKPT path to checkpoint of model
|
||||
--seed SEED the seed (for reproducible sampling)
|
||||
--precision {full,autocast}
|
||||
evaluate at this precision
|
||||
|
||||
```
|
||||
|
||||
Note: The inference config for all v1 versions is designed to be used with
|
||||
EMA-only checkpoints. For this reason `use_ema=False` is set in the
|
||||
configuration, otherwise the code will try to switch from non-EMA to EMA
|
||||
weights. If you want to examine the effect of EMA vs no EMA, we provide "full"
|
||||
checkpoints which contain both types of weights. For these, `use_ema=False` will
|
||||
load and use the non-EMA weights.
|
||||
|
||||
#### Diffusers Integration
|
||||
|
||||
Another way to download and sample Stable Diffusion is by using the
|
||||
[diffusers library](https://github.com/huggingface/diffusers/tree/main#new--stable-diffusion-is-now-fully-compatible-with-diffusers)
|
||||
|
||||
```py
|
||||
# make sure you're logged in with `huggingface-cli login`
|
||||
from torch import autocast
|
||||
from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler
|
||||
|
||||
pipe = StableDiffusionPipeline.from_pretrained(
|
||||
"CompVis/stable-diffusion-v1-3-diffusers",
|
||||
use_auth_token=True
|
||||
)
|
||||
|
||||
prompt = "a photo of an astronaut riding a horse on mars"
|
||||
with autocast("cuda"):
|
||||
image = pipe(prompt)["sample"][0]
|
||||
|
||||
image.save("astronaut_rides_horse.png")
|
||||
```
|
||||
|
||||
### Image Modification with Stable Diffusion
|
||||
|
||||
By using a diffusion-denoising mechanism as first proposed by
|
||||
[SDEdit](https://arxiv.org/abs/2108.01073), the model can be used for different
|
||||
tasks such as text-guided image-to-image translation and upscaling. Similar to
|
||||
the txt2img sampling script, we provide a script to perform image modification
|
||||
with Stable Diffusion.
|
||||
|
||||
The following describes an example where a rough sketch made in
|
||||
[Pinta](https://www.pinta-project.com/) is converted into a detailed artwork.
|
||||
|
||||
```
|
||||
python scripts/img2img.py --prompt "A fantasy landscape, trending on artstation" --init-img <path-to-img.jpg> --strength 0.8
|
||||
```
|
||||
|
||||
Here, strength is a value between 0.0 and 1.0, that controls the amount of noise
|
||||
that is added to the input image. Values that approach 1.0 allow for lots of
|
||||
variations but will also produce images that are not semantically consistent
|
||||
with the input. See the following example.
|
||||
|
||||
**Input**
|
||||
|
||||

|
||||
|
||||
**Outputs**
|
||||
|
||||

|
||||

|
||||
|
||||
This procedure can, for example, also be used to upscale samples from the base
|
||||
model.
|
||||
|
||||
## Comments
|
||||
|
||||
- Our codebase for the diffusion models builds heavily on
|
||||
[OpenAI's ADM codebase](https://github.com/openai/guided-diffusion) and
|
||||
[https://github.com/lucidrains/denoising-diffusion-pytorch](https://github.com/lucidrains/denoising-diffusion-pytorch).
|
||||
Thanks for open-sourcing!
|
||||
|
||||
- The implementation of the transformer encoder is from
|
||||
[x-transformers](https://github.com/lucidrains/x-transformers) by
|
||||
[lucidrains](https://github.com/lucidrains?tab=repositories).
|
||||
|
||||
## BibTeX
|
||||
|
||||
```
|
||||
@misc{rombach2021highresolution,
|
||||
title={High-Resolution Image Synthesis with Latent Diffusion Models},
|
||||
author={Robin Rombach and Andreas Blattmann and Dominik Lorenz and Patrick Esser and Björn Ommer},
|
||||
year={2021},
|
||||
eprint={2112.10752},
|
||||
archivePrefix={arXiv},
|
||||
primaryClass={cs.CV}
|
||||
}
|
||||
|
||||
```
|
Reference in New Issue
Block a user