diff --git a/docs/index.md b/docs/index.md
index 3c5bd3904b..c38f840d32 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -93,9 +93,15 @@ getting InvokeAI up and running on your system. For alternative installation and upgrade instructions, please see: [InvokeAI Installation Overview](installation/)
-Linux users who wish to make use of the PyPatchMatch inpainting functions will
-need to perform a bit of extra work to enable this module. Instructions can be
-found at [Installing PyPatchMatch](installation/060_INSTALL_PATCHMATCH.md).
+Users who wish to make use of the **PyPatchMatch** inpainting functions
+will need to perform a bit of extra work to enable this
+module. Instructions can be found at [Installing
+PyPatchMatch](installation/060_INSTALL_PATCHMATCH.md).
+
+If you have an NVIDIA card, you can benefit from the significant
+memory savings and performance gains provided by Facebook Research's
+**xFormers** module. Instructions for Linux and Windows users can be found
+at [Installing xFormers](installation/070_INSTALL_XFORMERS.md).
 
 ## :fontawesome-solid-computer: Hardware Requirements
 
diff --git a/docs/installation/070_INSTALL_XFORMERS.md b/docs/installation/070_INSTALL_XFORMERS.md
new file mode 100644
index 0000000000..be54a3ee86
--- /dev/null
+++ b/docs/installation/070_INSTALL_XFORMERS.md
@@ -0,0 +1,206 @@
+---
+title: Installing xFormers
+---
+
+# :material-image-size-select-large: Installing xFormers
+
+xFormers is a toolbox that integrates with the PyTorch and CUDA
+libraries to provide accelerated performance and reduced memory
+consumption for applications using the transformers machine learning
+architecture. After installing xFormers, InvokeAI users who have
+CUDA GPUs will see a noticeable decrease in GPU memory consumption and
+an increase in speed.
+
+xFormers can be installed into a working InvokeAI installation without
+any code changes or other updates. This document explains how to
+install xFormers.
+
+## Pip Install
+
+For both Windows and Linux, you can install `xformers` in just a
+couple of steps from the command line.
+
+If you are used to launching `invoke.sh` or `invoke.bat` to start
+InvokeAI, then run the launcher and select the "developer's console"
+to get to the command line. If you run `invoke.py` directly from the
+command line, then just be sure to activate its virtual environment.
+
+Then run the following three commands:
+
+```sh
+pip install xformers==0.0.16rc425
+pip install triton
+python -m xformers.info
+```
+
+The first command installs `xformers`, the second installs the
+`triton` training accelerator, and the third prints out the `xformers`
+installation status.
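+
+If the last command fails with an import or CUDA error, the usual
+culprit is a CPU-only build of PyTorch in the active environment. One
+quick way to check is to print the installed torch version and whether
+a CUDA device is visible:
+
+```sh
+python -c 'import torch; print(torch.__version__, torch.cuda.is_available())'
+```
+
+On a working setup this prints something like `1.13.1+cu117 True`.
+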
+If all goes well, you'll see a report like the
+following:
+
+```sh
+xFormers 0.0.16rc425
+memory_efficient_attention.cutlassF: available
+memory_efficient_attention.cutlassB: available
+memory_efficient_attention.flshattF: available
+memory_efficient_attention.flshattB: available
+memory_efficient_attention.smallkF: available
+memory_efficient_attention.smallkB: available
+memory_efficient_attention.tritonflashattF: available
+memory_efficient_attention.tritonflashattB: available
+swiglu.fused.p.cpp: available
+is_triton_available: True
+is_functorch_available: False
+pytorch.version: 1.13.1+cu117
+pytorch.cuda: available
+gpu.compute_capability: 8.6
+gpu.name: NVIDIA RTX A2000 12GB
+build.info: available
+build.cuda_version: 1107
+build.python_version: 3.10.9
+build.torch_version: 1.13.1+cu117
+build.env.TORCH_CUDA_ARCH_LIST: 5.0+PTX 6.0 6.1 7.0 7.5 8.0 8.6
+build.env.XFORMERS_BUILD_TYPE: Release
+build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS: None
+build.env.NVCC_FLAGS: None
+build.env.XFORMERS_PACKAGE_FROM: wheel-v0.0.16rc425
+source.privacy: open source
+```
+
+## Source Builds
+
+`xformers` is currently under active development and at some point you
+may wish to build it from source to get the latest features and
+bugfixes.
+
+### Source Build on Linux
+
+Note that xFormers only works with true NVIDIA GPUs and will not work
+properly with the ROCm driver for AMD acceleration.
+
+A source build is needed if you want a newer version of xFormers than
+the prebuilt wheel installed above. These instructions were written for
+a system running Ubuntu 22.04, but other Linux distributions should be
+able to adapt this recipe.
+
+#### 1. Install CUDA Toolkit 11.7
+
+You will need the CUDA developer's toolkit in order to compile and
+install xFormers. **Do not try to install Ubuntu's nvidia-cuda-toolkit
+package.** It is out of date and will cause conflicts with the NVIDIA
+driver and binaries. Instead, install the CUDA Toolkit package provided
+by NVIDIA itself. Go to [CUDA Toolkit 11.7
+Downloads](https://developer.nvidia.com/cuda-11-7-0-download-archive)
+and use the target selection wizard to choose your platform and Linux
+distribution. Select an installer type of "runfile (local)" at the
+last step.
+
+This will provide you with a recipe for downloading and running an
+install shell script that will install the toolkit and drivers. For
+example, the install script recipe for Ubuntu 22.04 running on an
+x86_64 system is:
+
+```
+wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
+sudo sh cuda_11.7.0_515.43.04_linux.run
+```
+
+Rather than cut-and-paste this example, we recommend that you walk
+through the toolkit wizard in order to get the most up-to-date
+installer for your system.
+
+#### 2. Confirm/Install PyTorch 1.13 with CUDA 11.7 support
+
+If you are using InvokeAI 2.3 or higher, these libraries will already
+be installed. If not, you can check whether you have the needed
+libraries using a quick command. Activate the invokeai virtual
+environment, either by entering the "developer's console", or manually
+with a command similar to `source ~/invokeai/.venv/bin/activate`
+(depending on where your `invokeai` directory is).
+
+Then run the command:
+
+```sh
+python -c 'exec("import torch\nprint(torch.__version__)")'
+```
+
+If it prints __1.13.1+cu117__, you're good. If not, you can install the
+most up-to-date libraries with this command:
+
+```sh
+pip install --upgrade --force-reinstall torch torchvision
+```
+
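+Note that the plain `pip install` above pulls whatever build of torch the
+default package index serves for your platform. If that turns out to be a
+CPU-only wheel, one option is to point pip explicitly at the CUDA 11.7
+builds published by the PyTorch project (adjust the `cu117` tag if you
+installed a different toolkit version):
+
+```sh
+pip install --upgrade --force-reinstall torch torchvision \
+  --extra-index-url https://download.pytorch.org/whl/cu117
+```
+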
+#### 3. Install the triton module
+
+This module isn't necessary for xFormers image inference optimization,
+but installing it avoids a startup warning.
+
+```sh
+pip install triton
+```
+
+#### 4. Install source code build prerequisites
+
+To build xFormers from source, you will need the `build-essential`
+package. If you don't have it installed already, run:
+
+```sh
+sudo apt install build-essential
+```
+
+#### 5. Build xFormers
+
+There is no pip wheel package for the development version of xFormers
+at this time (January 2023), so it must be compiled locally. Although
+there is a conda package, InvokeAI no longer officially supports conda
+installations and you're on your own if you wish to try this route.
+
+Following the recipe provided at the [xFormers GitHub
+page](https://github.com/facebookresearch/xformers), and with the
+InvokeAI virtual environment active (see step 2), run the following
+commands:
+
+```sh
+pip install ninja
+export TORCH_CUDA_ARCH_LIST="6.0;6.1;6.2;7.0;7.2;7.5;8.0;8.6"
+pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers
+```
+
+TORCH_CUDA_ARCH_LIST is the list of GPU architectures to compile
+xFormers support for. You can speed up compilation by selecting only
+the architecture specific to your system. You'll find the list of
+GPUs and their architectures in NVIDIA's [GPU Compute
+Capability](https://developer.nvidia.com/cuda-gpus) table.
+
+If the compile and install complete successfully, you can check that
+xFormers is installed with this command:
+
+```sh
+python -m xformers.info
+```
+
+If successful, the top of the listing should indicate "available" for
+each of the `memory_efficient_attention` modules, as shown here:
+
+```sh
+memory_efficient_attention.cutlassF: available
+memory_efficient_attention.cutlassB: available
+memory_efficient_attention.flshattF: available
+memory_efficient_attention.flshattB: available
+memory_efficient_attention.smallkF: available
+memory_efficient_attention.smallkB: available
+memory_efficient_attention.tritonflashattF: available
+memory_efficient_attention.tritonflashattB: available
+[...]
+```
+
+You can now launch InvokeAI and enjoy the benefits of xFormers.
+
+### Windows
+
+To come
+
+
+---
+(c) Copyright 2023 Lincoln Stein and the InvokeAI Development Team
diff --git a/docs/installation/index.md b/docs/installation/index.md
index ef50cbab5f..51753f2c9b 100644
--- a/docs/installation/index.md
+++ b/docs/installation/index.md
@@ -18,7 +18,9 @@ experience and preferences. InvokeAI and its dependencies. We offer two recipes: one suited to those who prefer the `conda` tool, and one suited to those who prefer `pip` and Python virtual environments. In our hands the pip install
- is faster and more reliable, but your mileage may vary.
+ is faster and more reliable, but your mileage may vary.
+ Note that the conda installation method is currently deprecated and
+ will no longer be supported at some point in the future.
This method is recommended for users who have previously used `conda` or `pip` in the past, developers, and anyone who wishes to remain on diff --git a/ldm/invoke/CLI.py b/ldm/invoke/CLI.py index 0de25bf458..abf83bc112 100644 --- a/ldm/invoke/CLI.py +++ b/ldm/invoke/CLI.py @@ -45,6 +45,7 @@ def main(): Globals.try_patchmatch = args.patchmatch Globals.always_use_cpu = args.always_use_cpu Globals.internet_available = args.internet_available and check_internet() + Globals.disable_xformers = not args.xformers print(f'>> Internet connectivity is {Globals.internet_available}') if not args.conf: @@ -902,7 +903,7 @@ def prepare_image_metadata( try: filename = opt.fnformat.format(**wildcards) except KeyError as e: - print(f'** The filename format contains an unknown key \'{e.args[0]}\'. Will use \'{{prefix}}.{{seed}}.png\' instead') + print(f'** The filename format contains an unknown key \'{e.args[0]}\'. Will use {{prefix}}.{{seed}}.png\' instead') filename = f'{prefix}.{seed}.png' except IndexError: print(f'** The filename format is broken or complete. Will use \'{{prefix}}.{{seed}}.png\' instead') diff --git a/ldm/invoke/args.py b/ldm/invoke/args.py index 400d1f720d..c918e4fba7 100644 --- a/ldm/invoke/args.py +++ b/ldm/invoke/args.py @@ -482,6 +482,12 @@ class Args(object): action='store_true', help='Force free gpu memory before final decoding', ) + model_group.add_argument( + '--xformers', + action=argparse.BooleanOptionalAction, + default=True, + help='Enable/disable xformers support (default enabled if installed)', + ) model_group.add_argument( "--always_use_cpu", dest="always_use_cpu", diff --git a/ldm/invoke/ckpt_to_diffuser.py b/ldm/invoke/ckpt_to_diffuser.py index 86281623a6..9b1735f831 100644 --- a/ldm/invoke/ckpt_to_diffuser.py +++ b/ldm/invoke/ckpt_to_diffuser.py @@ -21,7 +21,7 @@ import os import re import torch from pathlib import Path -from ldm.invoke.globals import Globals +from ldm.invoke.globals import Globals, global_cache_dir from safetensors.torch import load_file try: @@ -637,7 +637,7 @@ def convert_ldm_bert_checkpoint(checkpoint, config): def convert_ldm_clip_checkpoint(checkpoint): - text_model = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14") + text_model = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14",cache_dir=global_cache_dir('hub')) keys = list(checkpoint.keys()) @@ -677,7 +677,8 @@ textenc_pattern = re.compile("|".join(protected.keys())) def convert_paint_by_example_checkpoint(checkpoint): - config = CLIPVisionConfig.from_pretrained("openai/clip-vit-large-patch14") + cache_dir = global_cache_dir('hub') + config = CLIPVisionConfig.from_pretrained("openai/clip-vit-large-patch14",cache_dir=cache_dir) model = PaintByExampleImageEncoder(config) keys = list(checkpoint.keys()) @@ -744,7 +745,8 @@ def convert_paint_by_example_checkpoint(checkpoint): def convert_open_clip_checkpoint(checkpoint): - text_model = CLIPTextModel.from_pretrained("stabilityai/stable-diffusion-2", subfolder="text_encoder") + cache_dir=global_cache_dir('hub') + text_model = CLIPTextModel.from_pretrained("stabilityai/stable-diffusion-2", subfolder="text_encoder", cache_dir=cache_dir) keys = list(checkpoint.keys()) @@ -795,6 +797,7 @@ def convert_ckpt_to_diffuser(checkpoint_path:str, ): checkpoint = load_file(checkpoint_path) if Path(checkpoint_path).suffix == '.safetensors' else torch.load(checkpoint_path) + cache_dir = global_cache_dir('hub') # Sometimes models don't have the global_step item if "global_step" in checkpoint: @@ -904,7 +907,7 @@ def 
convert_ckpt_to_diffuser(checkpoint_path:str, if model_type == "FrozenOpenCLIPEmbedder": text_model = convert_open_clip_checkpoint(checkpoint) - tokenizer = CLIPTokenizer.from_pretrained("stabilityai/stable-diffusion-2", subfolder="tokenizer") + tokenizer = CLIPTokenizer.from_pretrained("stabilityai/stable-diffusion-2", subfolder="tokenizer",cache_dir=global_cache_dir('diffusers')) pipe = StableDiffusionPipeline( vae=vae, text_encoder=text_model, @@ -917,8 +920,8 @@ def convert_ckpt_to_diffuser(checkpoint_path:str, ) elif model_type == "PaintByExample": vision_model = convert_paint_by_example_checkpoint(checkpoint) - tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14") - feature_extractor = AutoFeatureExtractor.from_pretrained("CompVis/stable-diffusion-safety-checker") + tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14",cache_dir=cache_dir) + feature_extractor = AutoFeatureExtractor.from_pretrained("CompVis/stable-diffusion-safety-checker",cache_dir=cache_dir) pipe = PaintByExamplePipeline( vae=vae, image_encoder=vision_model, @@ -929,9 +932,9 @@ def convert_ckpt_to_diffuser(checkpoint_path:str, ) elif model_type in ['FrozenCLIPEmbedder','WeightedFrozenCLIPEmbedder']: text_model = convert_ldm_clip_checkpoint(checkpoint) - tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14") - safety_checker = StableDiffusionSafetyChecker.from_pretrained("CompVis/stable-diffusion-safety-checker") - feature_extractor = AutoFeatureExtractor.from_pretrained("CompVis/stable-diffusion-safety-checker") + tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14",cache_dir=cache_dir) + safety_checker = StableDiffusionSafetyChecker.from_pretrained("CompVis/stable-diffusion-safety-checker",cache_dir=cache_dir) + feature_extractor = AutoFeatureExtractor.from_pretrained("CompVis/stable-diffusion-safety-checker",cache_dir=cache_dir) pipe = StableDiffusionPipeline( vae=vae, text_encoder=text_model, @@ -944,7 +947,7 @@ def convert_ckpt_to_diffuser(checkpoint_path:str, else: text_config = create_ldm_bert_config(original_config) text_model = convert_ldm_bert_checkpoint(checkpoint, text_config) - tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased") + tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased",cache_dir=cache_dir) pipe = LDMTextToImagePipeline(vqvae=vae, bert=text_model, tokenizer=tokenizer, unet=unet, scheduler=scheduler) pipe.save_pretrained( diff --git a/ldm/invoke/generator/diffusers_pipeline.py b/ldm/invoke/generator/diffusers_pipeline.py index 5e62abf9df..54e9d555af 100644 --- a/ldm/invoke/generator/diffusers_pipeline.py +++ b/ldm/invoke/generator/diffusers_pipeline.py @@ -39,6 +39,7 @@ from diffusers.utils.outputs import BaseOutput from torchvision.transforms.functional import resize as tv_resize from transformers import CLIPFeatureExtractor, CLIPTextModel, CLIPTokenizer +from ldm.invoke.globals import Globals from ldm.models.diffusion.shared_invokeai_diffusion import InvokeAIDiffuserComponent, ThresholdSettings from ldm.modules.textual_inversion_manager import TextualInversionManager @@ -306,7 +307,7 @@ class StableDiffusionGeneratorPipeline(StableDiffusionPipeline): textual_inversion_manager=self.textual_inversion_manager ) - if is_xformers_available(): + if is_xformers_available() and not Globals.disable_xformers: self.enable_xformers_memory_efficient_attention() def image_from_embeddings(self, latents: torch.Tensor, num_inference_steps: int, diff --git a/ldm/invoke/generator/txt2img2img.py 
b/ldm/invoke/generator/txt2img2img.py index 1dba0cfafb..47692a6bbb 100644 --- a/ldm/invoke/generator/txt2img2img.py +++ b/ldm/invoke/generator/txt2img2img.py @@ -3,6 +3,7 @@ ldm.invoke.generator.txt2img inherits from ldm.invoke.generator ''' import math +from diffusers.utils.logging import get_verbosity, set_verbosity, set_verbosity_error from typing import Callable, Optional import torch @@ -66,6 +67,8 @@ class Txt2Img2Img(Generator): second_pass_noise = self.get_noise_like(resized_latents) + verbosity = get_verbosity() + set_verbosity_error() pipeline_output = pipeline.img2img_from_latents_and_embeddings( resized_latents, num_inference_steps=steps, @@ -73,6 +76,7 @@ class Txt2Img2Img(Generator): strength=strength, noise=second_pass_noise, callback=step_callback) + set_verbosity(verbosity) return pipeline.numpy_to_pil(pipeline_output.images)[0] diff --git a/ldm/invoke/globals.py b/ldm/invoke/globals.py index 137171aa33..5bd5597b78 100644 --- a/ldm/invoke/globals.py +++ b/ldm/invoke/globals.py @@ -43,6 +43,9 @@ Globals.always_use_cpu = False # The CLI will test connectivity at startup time. Globals.internet_available = True +# Whether to disable xformers +Globals.disable_xformers = False + # whether we are forcing full precision Globals.full_precision = False diff --git a/ldm/invoke/model_manager.py b/ldm/invoke/model_manager.py index 44628aac75..295710efb7 100644 --- a/ldm/invoke/model_manager.py +++ b/ldm/invoke/model_manager.py @@ -27,6 +27,7 @@ import torch import safetensors import transformers from diffusers import AutoencoderKL, logging as dlogging +from diffusers.utils.logging import get_verbosity, set_verbosity, set_verbosity_error from omegaconf import OmegaConf from omegaconf.dictconfig import DictConfig from picklescan.scanner import scan_file_path @@ -871,11 +872,11 @@ class ModelManager(object): return model # diffusers really really doesn't like us moving a float16 model onto CPU - import logging - logging.getLogger('diffusers.pipeline_utils').setLevel(logging.CRITICAL) + verbosity = get_verbosity() + set_verbosity_error() model.cond_stage_model.device = 'cpu' model.to('cpu') - logging.getLogger('pipeline_utils').setLevel(logging.INFO) + set_verbosity(verbosity) for submodel in ('first_stage_model','cond_stage_model','model'): try: diff --git a/ldm/modules/encoders/modules.py b/ldm/modules/encoders/modules.py index aafb1299ad..32ac0de7a1 100644 --- a/ldm/modules/encoders/modules.py +++ b/ldm/modules/encoders/modules.py @@ -1,18 +1,16 @@ import math -import os.path +from functools import partial from typing import Optional +import clip +import kornia import torch import torch.nn as nn -from functools import partial -import clip -from einops import rearrange, repeat +from einops import repeat from transformers import CLIPTokenizer, CLIPTextModel -import kornia -from ldm.invoke.devices import choose_torch_device -from ldm.invoke.globals import Globals, global_cache_dir -#from ldm.modules.textual_inversion_manager import TextualInversionManager +from ldm.invoke.devices import choose_torch_device +from ldm.invoke.globals import global_cache_dir from ldm.modules.x_transformer import ( Encoder, TransformerWrapper, @@ -654,21 +652,22 @@ class WeightedFrozenCLIPEmbedder(FrozenCLIPEmbedder): per_token_weights += [weight] * len(this_fragment_token_ids) # leave room for bos/eos - if len(all_token_ids) > self.max_length - 2: - excess_token_count = len(all_token_ids) - self.max_length - 2 + max_token_count_without_bos_eos_markers = self.max_length - 2 + if len(all_token_ids) > 
max_token_count_without_bos_eos_markers: + excess_token_count = len(all_token_ids) - max_token_count_without_bos_eos_markers # TODO build nice description string of how the truncation was applied # this should be done by calling self.tokenizer.convert_ids_to_tokens() then passing the result to # self.tokenizer.convert_tokens_to_string() for the token_ids on each side of the truncation limit. print(f">> Prompt is {excess_token_count} token(s) too long and has been truncated") - all_token_ids = all_token_ids[0:self.max_length] - per_token_weights = per_token_weights[0:self.max_length] + all_token_ids = all_token_ids[0:max_token_count_without_bos_eos_markers] + per_token_weights = per_token_weights[0:max_token_count_without_bos_eos_markers] - # pad out to a 77-entry array: [eos_token, , eos_token, ..., eos_token] + # pad out to a 77-entry array: [bos_token, , eos_token, pad_token…] # (77 = self.max_length) all_token_ids = [self.tokenizer.bos_token_id] + all_token_ids + [self.tokenizer.eos_token_id] per_token_weights = [1.0] + per_token_weights + [1.0] pad_length = self.max_length - len(all_token_ids) - all_token_ids += [self.tokenizer.eos_token_id] * pad_length + all_token_ids += [self.tokenizer.pad_token_id] * pad_length per_token_weights += [1.0] * pad_length all_token_ids_tensor = torch.tensor(all_token_ids, dtype=torch.long).to(self.device) diff --git a/ldm/modules/prompt_to_embeddings_converter.py b/ldm/modules/prompt_to_embeddings_converter.py index ab989e4892..dea15d61b4 100644 --- a/ldm/modules/prompt_to_embeddings_converter.py +++ b/ldm/modules/prompt_to_embeddings_converter.py @@ -3,8 +3,9 @@ import math import torch from transformers import CLIPTokenizer, CLIPTextModel -from ldm.modules.textual_inversion_manager import TextualInversionManager from ldm.invoke.devices import torch_dtype +from ldm.modules.textual_inversion_manager import TextualInversionManager + class WeightedPromptFragmentsToEmbeddingsConverter(): @@ -22,8 +23,8 @@ class WeightedPromptFragmentsToEmbeddingsConverter(): return self.tokenizer.model_max_length def get_embeddings_for_weighted_prompt_fragments(self, - text: list[str], - fragment_weights: list[float], + text: list[list[str]], + fragment_weights: list[list[float]], should_return_tokens: bool = False, device='cpu' ) -> torch.Tensor: @@ -198,12 +199,12 @@ class WeightedPromptFragmentsToEmbeddingsConverter(): all_token_ids = all_token_ids[0:max_token_count_without_bos_eos_markers] per_token_weights = per_token_weights[0:max_token_count_without_bos_eos_markers] - # pad out to a self.max_length-entry array: [eos_token, , eos_token, ..., eos_token] + # pad out to a self.max_length-entry array: [bos_token, , eos_token, pad_token…] # (typically self.max_length == 77) all_token_ids = [self.tokenizer.bos_token_id] + all_token_ids + [self.tokenizer.eos_token_id] per_token_weights = [1.0] + per_token_weights + [1.0] pad_length = self.max_length - len(all_token_ids) - all_token_ids += [self.tokenizer.eos_token_id] * pad_length + all_token_ids += [self.tokenizer.pad_token_id] * pad_length per_token_weights += [1.0] * pad_length all_token_ids_tensor = torch.tensor(all_token_ids, dtype=torch.long, device=device) diff --git a/scripts/configure_invokeai.py b/scripts/configure_invokeai.py index 9d17a73317..fec1cc6135 100755 --- a/scripts/configure_invokeai.py +++ b/scripts/configure_invokeai.py @@ -291,7 +291,7 @@ for more information. Visit https://huggingface.co/settings/tokens to generate a token. (Sign up for an account if needed). 
-Paste the token below using Ctrl-V on macOS/Linux, or Ctrl-Shift-V or right-click on Windows. +Paste the token below using Ctrl-V on macOS/Linux, or Ctrl-Shift-V or right-click on Windows. Alternatively press 'Enter' to skip this step and continue. You may re-run the configuration script again in the future if you do not wish to set the token right now. ''') @@ -676,7 +676,8 @@ def download_weights(opt:dict) -> Union[str, None]: return access_token = authenticate() - HfFolder.save_token(access_token) + if access_token is not None: + HfFolder.save_token(access_token) print('\n** DOWNLOADING WEIGHTS **') successfully_downloaded = download_weight_datasets(models, access_token, precision=precision) diff --git a/scripts/textual_inversion_fe.py b/scripts/textual_inversion_fe.py index 941afcf613..82446e98a7 100755 --- a/scripts/textual_inversion_fe.py +++ b/scripts/textual_inversion_fe.py @@ -115,6 +115,14 @@ class textualInversionForm(npyscreen.FormMultiPageAction): value=self.precisions.index(saved_args.get('mixed_precision','fp16')), max_height=4, ) + self.num_train_epochs = self.add_widget_intelligent( + npyscreen.TitleSlider, + name='Number of training epochs:', + out_of=1000, + step=50, + lowest=1, + value=saved_args.get('num_train_epochs',100) + ) self.max_train_steps = self.add_widget_intelligent( npyscreen.TitleSlider, name='Max Training Steps:', @@ -131,6 +139,22 @@ class textualInversionForm(npyscreen.FormMultiPageAction): lowest=1, value=saved_args.get('train_batch_size',8), ) + self.gradient_accumulation_steps = self.add_widget_intelligent( + npyscreen.TitleSlider, + name='Gradient Accumulation Steps (may need to decrease this to resume from a checkpoint):', + out_of=10, + step=1, + lowest=1, + value=saved_args.get('gradient_accumulation_steps',4) + ) + self.lr_warmup_steps = self.add_widget_intelligent( + npyscreen.TitleSlider, + name='Warmup Steps:', + out_of=100, + step=1, + lowest=0, + value=saved_args.get('lr_warmup_steps',0), + ) self.learning_rate = self.add_widget_intelligent( npyscreen.TitleText, name="Learning Rate:", @@ -154,22 +178,6 @@ class textualInversionForm(npyscreen.FormMultiPageAction): scroll_exit = True, value=self.lr_schedulers.index(saved_args.get('lr_scheduler','constant')), ) - self.gradient_accumulation_steps = self.add_widget_intelligent( - npyscreen.TitleSlider, - name='Gradient Accumulation Steps:', - out_of=10, - step=1, - lowest=1, - value=saved_args.get('gradient_accumulation_steps',4) - ) - self.lr_warmup_steps = self.add_widget_intelligent( - npyscreen.TitleSlider, - name='Warmup Steps:', - out_of=100, - step=1, - lowest=0, - value=saved_args.get('lr_warmup_steps',0), - ) def initializer_changed(self): placeholder = self.placeholder_token.value @@ -236,7 +244,7 @@ class textualInversionForm(npyscreen.FormMultiPageAction): # all the integers for attr in ('train_batch_size','gradient_accumulation_steps', - 'max_train_steps','lr_warmup_steps'): + 'num_train_epochs','max_train_steps','lr_warmup_steps'): args[attr] = int(getattr(self,attr).value) # the floats (just one) @@ -324,6 +332,7 @@ if __name__ == '__main__': save_args(args) try: + print(f'DEBUG: args = {args}') do_textual_inversion_training(**args) copy_to_embeddings_folder(args) except Exception as e:
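
Usage note on the new `--xformers` switch: because `ldm/invoke/args.py` declares it with `argparse.BooleanOptionalAction`, argparse also generates a matching `--no-xformers` form, which `CLI.py` turns into `Globals.disable_xformers`. A brief sketch of the expected command-line usage (assuming the usual `invoke.py` entry point; the xFormers library must of course be installed for the default to have any effect):

```sh
# default: memory-efficient attention is enabled whenever the xformers
# library is importable
python scripts/invoke.py

# disable the xformers attention path for this session without
# uninstalling the library
python scripts/invoke.py --no-xformers
```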