fix second conflict in CLI.py
commit 61403fe306
@ -93,6 +93,7 @@ voxel_art-1.0:
   format: ckpt
   vae:
      repo_id: stabilityai/sd-vae-ft-mse
      file: vae-ft-mse-840000-ema-pruned.ckpt
   recommended: False
   width: 512
   height: 512
@ -102,7 +103,7 @@ ft-mse-improved-autoencoder-840000:
   format: ckpt
   config: VAE/default
   file: vae-ft-mse-840000-ema-pruned.ckpt
   recommended: False
   recommended: True
   width: 512
   height: 512
trinart_vae:
BIN docs/assets/textual-inversion/ti-frontend.png (new binary file, 124 KiB; not shown)
@ -10,83 +10,263 @@ You may personalize the generated images to provide your own styles or objects
by training a new LDM checkpoint and introducing a new vocabulary to the fixed
model as a (.pt) embeddings file. Alternatively, you may use or train
HuggingFace Concepts embeddings files (.bin) from
<https://huggingface.co/sd-concepts-library> and its associated
notebooks.

## **Training**
## **Hardware and Software Requirements**

To train, prepare a folder that contains images sized at 512x512 and execute the
following:
You will need a GPU to perform training in a reasonable length of
time, and at least 12 GB of VRAM. We recommend using the [`xformers`
library](../installation/070_INSTALL_XFORMERS) to accelerate the
training process further. During training, about ~8 GB is temporarily
needed in order to store intermediate models, checkpoints and logs.
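
A quick way to confirm that your GPU meets the 12 GB recommendation is to query
it with PyTorch, which is already installed as part of InvokeAI. This is just a
small sketch of that check, not part of the training front end:

```python
# Sketch: report the CUDA device and its VRAM, and warn if it is below ~12 GB.
import torch

if not torch.cuda.is_available():
    print("No CUDA GPU detected; training will be impractically slow.")
else:
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / (1024 ** 3)
    print(f"{props.name}: {vram_gb:.1f} GB VRAM")
    if vram_gb < 12:
        print("Warning: less than the recommended 12 GB of VRAM.")
```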

### WINDOWS
## **Preparing for Training**

As the default backend is not available on Windows, if you're using that
platform, set the environment variable `PL_TORCH_DISTRIBUTED_BACKEND` to `gloo`.
To train, prepare a folder that contains 3-5 images that illustrate
the object or concept. It is good to provide a variety of examples or
poses to avoid overtraining the system. Format these images as PNG
(preferred) or JPG. You do not need to resize or crop the images in
advance, but for more control you may wish to do so.

```bash
python3 ./main.py -t \
    --base ./configs/stable-diffusion/v1-finetune.yaml \
    --actual_resume ./models/ldm/stable-diffusion-v1/model.ckpt \
    -n my_cat \
    --gpus 0 \
    --data_root D:/textual-inversion/my_cat \
    --init_word 'cat'
```

Place the training images in a directory on the machine InvokeAI runs
on. We recommend placing them in a subdirectory of the
`text-inversion-training-data` folder located in the InvokeAI root
directory, ordinarily `~/invokeai` (Linux/Mac), or
`C:\Users\your_name\invokeai` (Windows). For example, to create an
embedding for the "psychedelic" style, you'd place the training images
into the directory
`~/invokeai/text-inversion-training-data/psychedelic`.
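
If you prefer to script this step, the following sketch creates the recommended
folder and copies a handful of images into it. The source directory
`~/Pictures/psychedelic-refs` is only an example path:

```python
# Sketch: copy 3-5 training images into the recommended InvokeAI location.
from pathlib import Path
import shutil

src = Path.home() / "Pictures" / "psychedelic-refs"   # example source folder
dest = Path.home() / "invokeai" / "text-inversion-training-data" / "psychedelic"
dest.mkdir(parents=True, exist_ok=True)

for image in sorted(src.glob("*.png")) + sorted(src.glob("*.jpg")):
    shutil.copy2(image, dest / image.name)
    print(f"copied {image.name}")
```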

## **Launching Training Using the Console Front End**

InvokeAI 2.3 and higher comes with a text console-based training front
end. From within the `invoke.sh`/`invoke.bat` Invoke launcher script,
start the front end by selecting choice (3):

```sh
Do you want to generate images using the
1. command-line
2. browser-based UI
3. textual inversion training
4. open the developer console
Please enter 1, 2, 3, or 4: [1] 3
```

During the training process, files will be created in
`/logs/[project][time][project]/` where you can see the process.
From the command line, with the InvokeAI virtual environment active,
you can launch the front end with the command
`textual_inversion_fe`.

Conditioning contains the training prompt inputs; reconstruction, the
input images for the training epoch; and samples, scaled renderings of
a sample of the prompt plus one generated with the init word provided.
This will launch a text-based front end that will look like this:

On an RTX 3090, the process for SD will take ~1h @ 1.6 iterations/sec.
<figure markdown>
![ti-frontend](../assets/textual-inversion/ti-frontend.png)
</figure>

!!! note

    The interface is keyboard-based. Move from field to field using
    control-N (^N) to move to the next field and control-P (^P) to the
    previous one. <Tab> and <shift-TAB> work as well. Once a field is
    active, use the cursor keys. In a checkbox group, use the up and down
    cursor keys to move from choice to choice, and <space> to select a
    choice. In a scrollbar, use the left and right cursor keys to increase
    and decrease the value of the scroll. In textfields, type the desired
    values.

According to the associated paper, the optimal number of
images is 3-5. Your model may not converge if you use more images than
that.
The number of parameters may look intimidating, but in most cases the
predefined defaults work fine. The red circled fields in the above
illustration are the ones you will adjust most frequently.

Training will run indefinitely, but you may wish to stop it (with ctrl-c) before
the heat death of the universe, when you find a low loss epoch or around ~5000
iterations. Note that you can set a fixed limit on the number of training steps
by decreasing the "max_steps" option in
`configs/stable-diffusion/v1-finetune.yaml` (currently set to 4000000).
### Model Name

## **Run the Model**
This will list all the diffusers models that are currently
installed. Select the one you wish to use as the basis for your
embedding. Be aware that if you use an SD-1.X-based model for your
training, you will only be able to use this embedding with other
SD-1.X-based models. Similarly, if you train on SD-2.X, you will only
be able to use the embeddings with models based on SD-2.X.

Once the model is trained, specify the trained .pt or .bin file when starting
invoke using
### Trigger Term

```bash
python3 ./scripts/invoke.py \
    --embedding_path /path/to/embedding.pt
```

This is the prompt term you will use to trigger the embedding. Type a
single word or phrase you wish to use as the trigger, for example
"psychedelic" (without angle brackets). Within InvokeAI, you will then
be able to activate the trigger using the syntax `<psychedelic>`.

### Initializer

This is a single character that is used internally during the training
process as a placeholder for the trigger term. It defaults to "*" and
can usually be left alone.

### Resume from last saved checkpoint

As training proceeds, textual inversion will write a series of
intermediate files that can be used to resume training from where it
was left off in the case of an interruption. This checkbox will be
automatically selected if you provide a previously used trigger term
and at least one checkpoint file is found on disk.

Note that as of 20 January 2023, resume does not seem to be working
properly due to an issue with the upstream code.

### Data Training Directory

This is the location of the images to be used for training. When you
select a trigger term like "my-trigger", the frontend will prepopulate
this field with `~/invokeai/text-inversion-training-data/my-trigger`,
but you can change the path to wherever you want.

### Output Destination Directory

This is the location of the logs, checkpoint files, and embedding
files created during training. When you select a trigger term like
"my-trigger", the frontend will prepopulate this field with
`~/invokeai/text-inversion-output/my-trigger`, but you can change the
path to wherever you want.

### Image resolution

The images in the training directory will be automatically scaled to
the value you use here. For best results, you will want to use the
same default resolution as the underlying model (512 pixels for
SD-1.5, 768 for the larger version of SD-2.1).

### Center crop images

If this is selected, your images will be center cropped to make them
square before resizing them to the desired resolution. Center cropping
can indiscriminately cut off the top of subjects' heads for portrait
aspect images, so if you have images like this, you may wish to use a
photo editor to manually crop them to a square aspect ratio.
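
If you would rather crop programmatically than in a photo editor, a center crop
is a few lines with Pillow, which is installed as one of InvokeAI's
dependencies. The file names below are placeholders:

```python
# Sketch: center crop an image to a square, then resize it to the training resolution.
from PIL import Image

def center_crop_square(path_in: str, path_out: str, resolution: int = 512) -> None:
    im = Image.open(path_in)
    width, height = im.size
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    im = im.crop((left, top, left + side, top + side))
    im.resize((resolution, resolution), Image.LANCZOS).save(path_out)

center_crop_square("portrait.png", "portrait-square.png")
```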

### Mixed precision

Select the floating point precision for the embedding. "no" will
result in full 32-bit precision, "fp16" will provide 16-bit
precision, and "bf16" will provide mixed precision (only available
when XFormers is used).

### Max training steps

How many steps the training will take before the model converges. Most
training sets will converge with 2000-3000 steps.

### Batch size

This adjusts how many training images are processed simultaneously in
each step. Higher values will cause the training process to run more
quickly, but use more memory. The default size will run on GPUs with
as little as 12 GB.

### Learning rate

The rate at which the system adjusts its internal weights during
training. Higher values risk overtraining (getting the same image each
time), and lower values will take more steps to train a good
model. The default of 0.0005 is conservative; you may wish to increase
it to 0.005 to speed up training.

### Scale learning rate by number of GPUs, steps and batch size

If this is selected (the default), the system will adjust the provided
learning rate to improve performance.
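
The exact scaling rule is not spelled out here. In the HuggingFace `diffusers`
textual inversion script that this trainer is adapted from, `--scale_lr`
multiplies the base rate by the gradient accumulation steps, the batch size,
and the number of accelerator processes. A sketch of that arithmetic, with
illustrative values:

```python
# Sketch: how --scale_lr arrives at the effective learning rate (example values).
base_learning_rate = 0.0005
gradient_accumulation_steps = 4
train_batch_size = 8
num_processes = 1  # number of GPUs / accelerator processes

effective_lr = (base_learning_rate
                * gradient_accumulation_steps
                * train_batch_size
                * num_processes)
print(f"effective learning rate: {effective_lr}")  # 0.016
```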

### Use xformers acceleration

This will activate XFormers memory-efficient attention. You need to
have XFormers installed for this to have an effect.

### Learning rate scheduler

This adjusts how the learning rate changes over the course of
training. The default "constant" means to use a constant learning rate
for the entire training session. The other values scale the learning
rate according to various formulas.

Only "constant" is supported by the XFormers library.

### Gradient accumulation steps

This is a parameter that allows you to use bigger batch sizes than
your GPU's VRAM would ordinarily accommodate, at the cost of some
performance.
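
Put another way, gradients from several small batches are added up before each
weight update, so the optimizer behaves as if it had seen one larger batch. A
tiny sketch of the bookkeeping, with illustrative values:

```python
# Sketch: effective batch size when gradients are accumulated across steps.
train_batch_size = 2             # images per forward/backward pass (fits in VRAM)
gradient_accumulation_steps = 4  # passes accumulated before each optimizer step

print("effective batch size:", train_batch_size * gradient_accumulation_steps)  # 8
```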

### Warmup steps

If "constant_with_warmup" is selected in the learning rate scheduler,
then this provides the number of warmup steps. Warmup steps have a
very low learning rate, and are one way of preventing early
overtraining.

## The training run

Start the training run by advancing to the OK button (bottom right)
and pressing <enter>. A series of progress messages will be displayed
as the training process proceeds. This may take an hour or two,
depending on settings and the speed of your system. Various log and
checkpoint files will be written into the output directory (ordinarily
`~/invokeai/text-inversion-output/my-model/`).

At the end of successful training, the system will copy the file
`learned_embeds.bin` into the InvokeAI root directory's `embeddings`
directory, using a subdirectory named after the trigger token. For
example, if the trigger token was `psychedelic`, then look for the
embeddings file in
`~/invokeai/embeddings/psychedelic/learned_embeds.bin`.

You may now launch InvokeAI and try out a prompt that uses the trigger
term. For example `a plate of banana sushi in <psychedelic> style`.
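
If you want to check what was saved before using it, the embedding file written
by this diffusers-style trainer is ordinarily a small `torch.save` dictionary
mapping the trigger token to its learned vector. A small inspection sketch (the
exact layout may vary between versions), reusing the example path above:

```python
# Sketch: peek inside a learned_embeds.bin produced by textual inversion training.
from pathlib import Path
import torch

path = Path.home() / "invokeai" / "embeddings" / "psychedelic" / "learned_embeds.bin"
embeds = torch.load(path, map_location="cpu")

for trigger, tensor in embeds.items():
    print(f"trigger {trigger!r}: embedding shape {tuple(tensor.shape)}")
```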

## **Training with the Command-Line Script**

InvokeAI also comes with a traditional command-line script for
launching textual inversion training. It is named
`textual_inversion`, and can be launched from within the
"developer's console", or from the command line after activating
InvokeAI's virtual environment.

It accepts a large number of arguments, which can be summarized by
passing the `--help` argument:

```sh
textual_inversion --help
```

Then, to utilize your subject at the invoke prompt:

```bash
invoke> "a photo of *"
```

Typical usage is shown here:

```sh
python textual_inversion.py \
       --model=stable-diffusion-1.5 \
       --resolution=512 \
       --learnable_property=style \
       --initializer_token='*' \
       --placeholder_token='<psychedelic>' \
       --train_data_dir=/home/lstein/invokeai/training-data/psychedelic \
       --output_dir=/home/lstein/invokeai/text-inversion-training/psychedelic \
       --scale_lr \
       --train_batch_size=8 \
       --gradient_accumulation_steps=4 \
       --max_train_steps=3000 \
       --learning_rate=0.0005 \
       --resume_from_checkpoint=latest \
       --lr_scheduler=constant \
       --mixed_precision=fp16 \
       --only_save_embeds
```

This also works with image2image.
## Reading

```bash
invoke> "waterfall and rainbow in the style of *" --init_img=./init-images/crude_drawing.png --strength=0.5 -s100 -n4
```

For more information on textual inversion, please see the following
resources:

For .pt files it's also possible to train multiple tokens (modify the
placeholder string in `configs/stable-diffusion/v1-finetune.yaml`) and combine
LDM checkpoints using:

* The [textual inversion repository](https://github.com/rinongal/textual_inversion) and
  associated paper for details and limitations.
* [HuggingFace's textual inversion training
  page](https://huggingface.co/docs/diffusers/training/text_inversion)
* [HuggingFace example script
  documentation](https://github.com/huggingface/diffusers/tree/main/examples/textual_inversion)
  (Note that this script is similar to, but not identical to, `textual_inversion`,
  but produces embedding files that are completely compatible.)

```bash
python3 ./scripts/merge_embeddings.py \
        --manager_ckpts /path/to/first/embedding.pt \
        [</path/to/second/embedding.pt>,[...]] \
        --output_path /path/to/output/embedding.pt
```

---

Credit goes to rinongal and the repository

Please see [the repository](https://github.com/rinongal/textual_inversion) and
associated paper for details and limitations.
copyright (c) 2023, Lincoln Stein and the InvokeAI Development Team
@ -852,6 +852,7 @@ class Generate:
            model_data = cache.get_model(model_name)
        except Exception as e:
            print(f'** model {model_name} could not be loaded: {str(e)}')
            print(traceback.format_exc(), file=sys.stderr)
            if previous_model_name is None:
                raise e
            print(f'** trying to reload previous model')

@ -578,7 +578,7 @@ def import_model(model_path:str, gen, opt, completer):
    elif re.match('^[\w.+-]+/[\w.+-]+$',model_path):
        model_name = import_diffuser_model(model_path, gen, opt, completer)
    elif os.path.isdir(model_path):
        model_name = import_diffuser_model(model_path, gen, opt, completer)
        model_name = import_diffuser_model(Path(model_path), gen, opt, completer)
    else:
        print(f'** {model_path} is neither the path to a .ckpt file nor a diffusers repository id. Can\'t import.')

@ -589,8 +589,7 @@ def import_model(model_path:str, gen, opt, completer):
        print('** model failed to load. Discarding configuration entry')
        gen.model_manager.del_model(model_name)
        return

    if input('Make this the default model? [n] ') in ('y','Y'):
    if input('Make this the default model? [n] ').strip() in ('y','Y'):
        gen.model_manager.set_default_model(model_name)

    gen.model_manager.commit(opt.conf)
@ -607,10 +606,14 @@ def import_diffuser_model(model_path:str, gen, opt, completer)->str:
        model_name=default_name,
        model_description=default_description
    )
    vae = None
    if input('Replace this model\'s VAE with "stabilityai/sd-vae-ft-mse"? [n] ').strip() in ('y','Y'):
        vae = dict(repo_id='stabilityai/sd-vae-ft-mse')

    if not manager.import_diffuser_model(
            path_or_repo,
            model_name = model_name,
            vae = vae,
            description = model_description):
        print('** model failed to import')
        return None

@ -628,17 +631,28 @@ def import_ckpt_model(path_or_url:str, gen, opt, completer)->str:
    )
    config_file = None
    default = Path(Globals.root,'configs/stable-diffusion/v1-inference.yaml')

    completer.complete_extensions(('.yaml','.yml'))
    completer.set_line(str(default))
    done = False
    while not done:
        config_file = input('Configuration file for this model: ').strip()
        done = os.path.exists(config_file)

    completer.complete_extensions(('.ckpt','.safetensors'))
    vae = None
    default = Path(Globals.root,'models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt')
    completer.set_line(str(default))
    done = False
    while not done:
        vae = input('VAE file for this model (leave blank for none): ').strip() or None
        done = (not vae) or os.path.exists(vae)
    completer.complete_extensions(None)

    if not manager.import_ckpt_model(
            path_or_url,
            config = config_file,
            vae = vae,
            model_name = model_name,
            model_description = model_description,
            commit_to_conf = opt.conf,
@ -710,7 +724,7 @@ def optimize_model(model_name_or_path:str, gen, opt, completer):
        return

    completer.update_models(gen.model_manager.list_models())
    if input(f'Load optimized model {model_name}? [y] ') not in ('n','N'):
    if input(f'Load optimized model {model_name}? [y] ').strip() not in ('n','N'):
        gen.set_model(model_name)

    response = input(f'Delete the original .ckpt file at ({ckpt_path} ? [n] ')

@ -726,7 +740,12 @@ def del_config(model_name:str, gen, opt, completer):
    if model_name not in gen.model_manager.config:
        print(f"** Unknown model {model_name}")
        return
    gen.model_manager.del_model(model_name)

    if input(f'Remove {model_name} from the list of models known to InvokeAI? [y] ').strip().startswith(('n','N')):
        return

    delete_completely = input('Completely remove the model file or directory from disk? [n] ').startswith(('y','Y'))
    gen.model_manager.del_model(model_name,delete_files=delete_completely)
    gen.model_manager.commit(opt.conf)
    print(f'** {model_name} deleted')
    completer.update_models(gen.model_manager.list_models())

@ -747,7 +747,7 @@ def initialize_rootdir(root:str,yes_to_all:bool=False):

    safety_checker = '--nsfw_checker' if enable_safety_checker else '--no-nsfw_checker'

    for name in ('models','configs','embeddings'):
    for name in ('models','configs','embeddings','text-inversion-data','text-inversion-training-data'):
        os.makedirs(os.path.join(root,name), exist_ok=True)
    for src in (['configs']):
        dest = os.path.join(root,src)
@ -18,7 +18,9 @@ import traceback
import warnings
import safetensors.torch
from pathlib import Path
from shutil import move, rmtree
from typing import Union, Any
from huggingface_hub import scan_cache_dir
from ldm.util import download_with_progress_bar

import torch

@ -35,6 +37,9 @@ from ldm.invoke.globals import Globals, global_models_dir, global_autoscan_dir,
from ldm.util import instantiate_from_config, ask_user

DEFAULT_MAX_MODELS=2
VAE_TO_REPO_ID = { # hack, see note in convert_and_import()
    'vae-ft-mse-840000-ema-pruned': 'stabilityai/sd-vae-ft-mse',
}

class ModelManager(object):
    def __init__(self,

@ -230,7 +235,7 @@ class ModelManager(object):
                line = f'\033[1m{line}\033[0m'
            print(line)

    def del_model(self, model_name:str) -> None:
    def del_model(self, model_name:str, delete_files:bool=False) -> None:
        '''
        Delete the named model.
        '''
@ -238,9 +243,25 @@ class ModelManager(object):
        if model_name not in omega:
            print(f'** Unknown model {model_name}')
            return
        # save these for use in deletion later
        conf = omega[model_name]
        repo_id = conf.get('repo_id',None)
        path = self._abs_path(conf.get('path',None))
        weights = self._abs_path(conf.get('weights',None))

        del omega[model_name]
        if model_name in self.stack:
            self.stack.remove(model_name)
        if delete_files:
            if weights:
                print(f'** deleting file {weights}')
                Path(weights).unlink(missing_ok=True)
            elif path:
                print(f'** deleting directory {path}')
                rmtree(path,ignore_errors=True)
            elif repo_id:
                print(f'** deleting the cached model directory for {repo_id}')
                self._delete_model_from_cache(repo_id)

    def add_model(self, model_name:str, model_attributes:dict, clobber:bool=False) -> None:
        '''

@ -417,7 +438,7 @@ class ModelManager(object):
            safety_checker=None,
            local_files_only=not Globals.internet_available
        )
        if 'vae' in mconfig:
        if 'vae' in mconfig and mconfig['vae'] is not None:
            vae = self._load_vae(mconfig['vae'])
            pipeline_args.update(vae=vae)
        if not isinstance(name_or_path,Path):
@ -523,11 +544,12 @@ class ModelManager(object):
        print('>> Model scanned ok!')

    def import_diffuser_model(self,
                              repo_or_path:Union[str,Path],
                              model_name:str=None,
                              description:str=None,
                              commit_to_conf:Path=None,
                              )->bool:
                              repo_or_path:Union[str,Path],
                              model_name:str=None,
                              description:str=None,
                              vae:dict=None,
                              commit_to_conf:Path=None,
                              )->bool:
        '''
        Attempts to install the indicated diffuser model and returns True if successful.

@ -543,6 +565,7 @@ class ModelManager(object):
        description = description or f'imported diffusers model {model_name}'
        new_config = dict(
            description=description,
            vae=vae,
            format='diffusers',
        )
        if isinstance(repo_or_path,Path) and repo_or_path.exists():

@ -556,18 +579,22 @@ class ModelManager(object):
        return True

    def import_ckpt_model(self,
                          weights:Union[str,Path],
                          config:Union[str,Path]='configs/stable-diffusion/v1-inference.yaml',
                          model_name:str=None,
                          model_description:str=None,
                          commit_to_conf:Path=None,
                          )->bool:
                          weights:Union[str,Path],
                          config:Union[str,Path]='configs/stable-diffusion/v1-inference.yaml',
                          vae:Union[str,Path]=None,
                          model_name:str=None,
                          model_description:str=None,
                          commit_to_conf:Path=None,
                          )->bool:
        '''
        Attempts to install the indicated ckpt file and returns True if successful.

        "weights" can be either a path-like object corresponding to a local .ckpt file
        or a http/https URL pointing to a remote model.

        "vae" is a Path or str object pointing to a ckpt or safetensors file to be used
        as the VAE for this model.

        "config" is the model config file to use with this ckpt file. It defaults to
        v1-inference.yaml. If a URL is provided, the config will be downloaded.

@ -594,6 +621,8 @@ class ModelManager(object):
            width=512,
            height=512
        )
        if vae:
            new_config['vae'] = vae
        self.add_model(model_name, new_config, True)
        if commit_to_conf:
            self.commit(commit_to_conf)
@ -633,7 +662,7 @@ class ModelManager(object):

    def convert_and_import(self,
                           ckpt_path:Path,
                           diffuser_path:Path,
                           diffusers_path:Path,
                           model_name=None,
                           model_description=None,
                           commit_to_conf:Path=None,

@ -645,46 +674,56 @@ class ModelManager(object):
        new_config = None
        from ldm.invoke.ckpt_to_diffuser import convert_ckpt_to_diffuser
        import transformers
        if diffuser_path.exists():
            print(f'ERROR: The path {str(diffuser_path)} already exists. Please move or remove it and try again.')
        if diffusers_path.exists():
            print(f'ERROR: The path {str(diffusers_path)} already exists. Please move or remove it and try again.')
            return

        model_name = model_name or diffuser_path.name
        model_name = model_name or diffusers_path.name
        model_description = model_description or f'Optimized version of {model_name}'
        print(f'>> {model_name}: optimizing (30-60s).')
        print(f'>> Optimizing {model_name} (30-60s)')
        try:
            verbosity = transformers.logging.get_verbosity()
            transformers.logging.set_verbosity_error()
            convert_ckpt_to_diffuser(ckpt_path, diffuser_path,extract_ema=True)
            convert_ckpt_to_diffuser(ckpt_path, diffusers_path,extract_ema=True)
            transformers.logging.set_verbosity(verbosity)
            print(f'>> Success. Optimized model is now located at {str(diffuser_path)}')
            print(f'>> Writing new config file entry for {model_name}...',end='')
            print(f'>> Success. Optimized model is now located at {str(diffusers_path)}')
            print(f'>> Writing new config file entry for {model_name}')
            new_config = dict(
                path=str(diffuser_path),
                path=str(diffusers_path),
                description=model_description,
                format='diffusers',
            )

            # HACK (LS): in the event that the original entry is using a custom ckpt VAE, we try to
            # map that VAE onto a diffuser VAE using a hard-coded dictionary.
            # I would prefer to do this differently: We load the ckpt model into memory, swap the
            # VAE in memory, and then pass that to convert_ckpt_to_diffuser() so that the swapped
            # VAE is built into the model. However, when I tried this I got obscure key errors.
            if model_name in self.config and (vae_ckpt_path := self.model_info(model_name)['vae']):
                vae_basename = Path(vae_ckpt_path).stem
                diffusers_vae = None
                if (diffusers_vae := VAE_TO_REPO_ID.get(vae_basename,None)):
                    print(f'>> {vae_basename} VAE corresponds to known {diffusers_vae} diffusers version')
                    new_config.update(
                        vae = {'repo_id': diffusers_vae}
                    )
                else:
                    print(f'** Custom VAE "{vae_basename}" found, but corresponding diffusers model unknown')
                    print(f'** Using "stabilityai/sd-vae-ft-mse"; If this isn\'t right, please edit the model config')
                    new_config.update(
                        vae = {'repo_id': 'stabilityai/sd-vae-ft-mse'}
                    )

            self.del_model(model_name)
            self.add_model(model_name, new_config, True)
            if commit_to_conf:
                self.commit(commit_to_conf)
            print('>> Conversion succeeded')
        except Exception as e:
            print(f'** Conversion failed: {str(e)}')
            traceback.print_exc()

        print('done.')
        return new_config

    def del_config(self, model_name:str, gen, opt, completer):
        current_model = gen.model_name
        if model_name == current_model:
            print("** Can't delete active model. !switch to another model first. **")
            return
        gen.model_manager.del_model(model_name)
        gen.model_manager.commit(opt.conf)
        print(f'** {model_name} deleted')
        completer.del_model(model_name)

    def search_models(self, search_folder):
        print(f'>> Finding Models In: {search_folder}')
        models_folder_ckpt = Path(search_folder).glob('**/*.ckpt')
@ -766,7 +805,6 @@ class ModelManager(object):

        print('** Legacy version <= 2.2.5 model directory layout detected. Reorganizing.')
        print('** This is a quick one-time operation.')
        from shutil import move, rmtree

        # transformer files get moved into the hub directory
        if cls._is_huggingface_hub_directory_present():

@ -982,6 +1020,27 @@ class ModelManager(object):

        return vae

    @staticmethod
    def _delete_model_from_cache(repo_id):
        cache_info = scan_cache_dir(global_cache_dir('diffusers'))

        # I'm sure there is a way to do this with comprehensions
        # but the code quickly became incomprehensible!
        hashes_to_delete = set()
        for repo in cache_info.repos:
            if repo.repo_id==repo_id:
                for revision in repo.revisions:
                    hashes_to_delete.add(revision.commit_hash)
        strategy = cache_info.delete_revisions(*hashes_to_delete)
        print(f'** deletion of this model is expected to free {strategy.expected_freed_size_str}')
        strategy.execute()

    @staticmethod
    def _abs_path(path:Union[str,Path])->Path:
        if path is None or Path(path).is_absolute():
            return path
        return Path(Globals.root,path).resolve()

    @staticmethod
    def _is_huggingface_hub_directory_present() -> bool:
        return os.getenv('HF_HOME') is not None or os.getenv('XDG_CACHE_HOME') is not None
@ -4,7 +4,6 @@
# and modified slightly by Lincoln Stein (@lstein) to work with InvokeAI

import argparse
from argparse import Namespace
import logging
import math
import os

@ -207,6 +206,12 @@ def parse_args():
    parser.add_argument("--adam_epsilon", type=float, default=1e-08, help="Epsilon value for the Adam optimizer")
    parser.add_argument("--push_to_hub", action="store_true", help="Whether or not to push the model to the Hub.")
    parser.add_argument("--hub_token", type=str, default=None, help="The token to use to push to the Model Hub.")
    parser.add_argument(
        "--hub_model_id",
        type=str,
        default=None,
        help="The name of the repository to keep in sync with the local `output_dir`.",
    )
    parser.add_argument(
        "--logging_dir",
        type=Path,
@ -455,7 +460,8 @@ def do_textual_inversion_training(
        checkpointing_steps:int=500,
        resume_from_checkpoint:Path=None,
        enable_xformers_memory_efficient_attention:bool=False,
        root_dir:Path=None
        root_dir:Path=None,
        hub_model_id:str=None,
):
    env_local_rank = int(os.environ.get("LOCAL_RANK", -1))
    if env_local_rank != -1 and env_local_rank != local_rank:

@ -518,10 +524,10 @@ def do_textual_inversion_training(
    pretrained_model_name_or_path = model_conf.get('repo_id',None) or Path(model_conf.get('path'))
    assert pretrained_model_name_or_path, f"models.yaml error: neither 'repo_id' nor 'path' is defined for {model}"
    pipeline_args = dict(cache_dir=global_cache_dir('diffusers'))


    # Load tokenizer
    if tokenizer_name:
        tokenizer = CLIPTokenizer.from_pretrained(tokenizer_name,cache_dir=global_cache_dir('transformers'))
        tokenizer = CLIPTokenizer.from_pretrained(tokenizer_name,**pipeline_args)
    else:
        tokenizer = CLIPTokenizer.from_pretrained(pretrained_model_name_or_path, subfolder="tokenizer", **pipeline_args)

@ -631,7 +637,7 @@ def do_textual_inversion_training(
        text_encoder, optimizer, train_dataloader, lr_scheduler
    )

    # For mixed precision training we cast the text_encoder and vae weights to half-precision
    # For mixed precision training we cast the unet and vae weights to half-precision
    # as these models are only used for inference, keeping weights in full precision is not required.
    weight_dtype = torch.float32
    if accelerator.mixed_precision == "fp16":
@ -670,6 +676,7 @@ def do_textual_inversion_training(
    logger.info(f"  Total optimization steps = {max_train_steps}")
    global_step = 0
    first_epoch = 0
    resume_step = None

    # Potentially load in the weights and states from a previous save
    if resume_from_checkpoint:

@ -680,15 +687,22 @@ def do_textual_inversion_training(
            dirs = os.listdir(output_dir)
            dirs = [d for d in dirs if d.startswith("checkpoint")]
            dirs = sorted(dirs, key=lambda x: int(x.split("-")[1]))
            path = dirs[-1]
        accelerator.print(f"Resuming from checkpoint {path}")
        accelerator.load_state(os.path.join(output_dir, path))
        global_step = int(path.split("-")[1])

        resume_global_step = global_step * gradient_accumulation_steps
        first_epoch = resume_global_step // num_update_steps_per_epoch
        resume_step = resume_global_step % num_update_steps_per_epoch
            path = dirs[-1] if len(dirs) > 0 else None

        if path is None:
            accelerator.print(
                f"Checkpoint '{resume_from_checkpoint}' does not exist. Starting a new training run."
            )
            resume_from_checkpoint = None
        else:
            accelerator.print(f"Resuming from checkpoint {path}")
            accelerator.load_state(os.path.join(output_dir, path))
            global_step = int(path.split("-")[1])

            resume_global_step = global_step * gradient_accumulation_steps
            first_epoch = global_step // num_update_steps_per_epoch
            resume_step = resume_global_step % (num_update_steps_per_epoch * gradient_accumulation_steps)

    # Only show the progress bar once on each machine.
    progress_bar = tqdm(range(global_step, max_train_steps), disable=not accelerator.is_local_main_process)
    progress_bar.set_description("Steps")

@ -700,7 +714,7 @@ def do_textual_inversion_training(
        text_encoder.train()
        for step, batch in enumerate(train_dataloader):
            # Skip steps until we reach the resumed step
            if resume_from_checkpoint and epoch == first_epoch and step < resume_step:
            if resume_step and resume_from_checkpoint and epoch == first_epoch and step < resume_step:
                if step % gradient_accumulation_steps == 0:
                    progress_bar.update(1)
                continue
@ -72,8 +72,9 @@ class TextualInversionManager():
                self._add_textual_inversion(embedding_info['name'],
                                            embedding_info['embedding'],
                                            defer_injecting_tokens=defer_injecting_tokens)
            except ValueError:
                print(f' | ignoring incompatible embedding {embedding_info["name"]}')
            except ValueError as e:
                print(f' | Ignoring incompatible embedding {embedding_info["name"]}')
                print(f' | The error was {str(e)}')
        else:
            print(f'>> Failed to load embedding located at {ckpt_path}. Unsupported file.')

@ -157,7 +158,8 @@ class TextualInversionManager():
            try:
                self._inject_tokens_and_assign_embeddings(ti)
            except ValueError as e:
                print(f' | ignoring incompatible embedding trigger {ti.trigger_string}')
                print(f' | Ignoring incompatible embedding trigger {ti.trigger_string}')
                print(f' | The error was {str(e)}')
                continue
            injected_token_ids.append(ti.trigger_token_id)
            injected_token_ids.extend(ti.pad_token_ids)
@ -1,11 +1,11 @@
#!/usr/bin/env python

# Copyright 2023, Lincoln Stein @lstein
from ldm.invoke.globals import Globals, set_root
from ldm.invoke.globals import Globals, global_set_root
from ldm.invoke.textual_inversion_training import parse_args, do_textual_inversion_training

if __name__ == "__main__":
    args = parse_args()
    set_root(args.root_dir or Globals.root)
    global_set_root(args.root_dir or Globals.root)
    kwargs = vars(args)
    do_textual_inversion_training(**kwargs)
@ -6,14 +6,15 @@ import sys
import re
import shutil
import traceback
import curses
from ldm.invoke.globals import Globals, global_set_root
from omegaconf import OmegaConf
from pathlib import Path
from typing import List
import argparse

TRAINING_DATA = 'training-data'
TRAINING_DIR = 'text-inversion-training'
TRAINING_DATA = 'text-inversion-training-data'
TRAINING_DIR = 'text-inversion-output'
CONF_FILE = 'preferences.conf'

class textualInversionForm(npyscreen.FormMultiPageAction):

@ -43,6 +44,11 @@ class textualInversionForm(npyscreen.FormMultiPageAction):
        except:
            pass

        self.add_widget_intelligent(
            npyscreen.FixedText,
            value='Use ctrl-N and ctrl-P to move to the <N>ext and <P>revious fields, cursor arrows to make a selection, and space to toggle checkboxes.'
        )

        self.model = self.add_widget_intelligent(
            npyscreen.TitleSelectOne,
            name='Model Name:',
@ -82,18 +88,18 @@ class textualInversionForm(npyscreen.FormMultiPageAction):
            max_height=4,
        )
        self.train_data_dir = self.add_widget_intelligent(
            npyscreen.TitleFilenameCombo,
            npyscreen.TitleFilename,
            name='Data Training Directory:',
            select_dir=True,
            must_exist=True,
            value=saved_args.get('train_data_dir',Path(Globals.root) / TRAINING_DATA / default_placeholder_token)
            must_exist=False,
            value=str(saved_args.get('train_data_dir',Path(Globals.root) / TRAINING_DATA / default_placeholder_token))
        )
        self.output_dir = self.add_widget_intelligent(
            npyscreen.TitleFilenameCombo,
            npyscreen.TitleFilename,
            name='Output Destination Directory:',
            select_dir=True,
            must_exist=False,
            value=saved_args.get('output_dir',Path(Globals.root) / TRAINING_DIR / default_placeholder_token)
            value=str(saved_args.get('output_dir',Path(Globals.root) / TRAINING_DIR / default_placeholder_token))
        )
        self.resolution = self.add_widget_intelligent(
            npyscreen.TitleSelectOne,

@ -182,8 +188,8 @@ class textualInversionForm(npyscreen.FormMultiPageAction):
    def initializer_changed(self):
        placeholder = self.placeholder_token.value
        self.prompt_token.value = f'(Trigger by using <{placeholder}> in your prompts)'
        self.train_data_dir.value = Path(Globals.root) / TRAINING_DATA / placeholder
        self.output_dir.value = Path(Globals.root) / TRAINING_DIR / placeholder
        self.train_data_dir.value = str(Path(Globals.root) / TRAINING_DATA / placeholder)
        self.output_dir.value = str(Path(Globals.root) / TRAINING_DIR / placeholder)
        self.resume_from_checkpoint.value = Path(self.output_dir.value).exists()

    def on_ok(self):

@ -221,7 +227,7 @@ class textualInversionForm(npyscreen.FormMultiPageAction):

    def get_model_names(self)->(List[str],int):
        conf = OmegaConf.load(os.path.join(Globals.root,'configs/models.yaml'))
        model_names = list(conf.keys())
        model_names = [idx for idx in sorted(list(conf.keys())) if conf[idx].get('format',None)=='diffusers']
        defaults = [idx for idx in range(len(model_names)) if 'default' in conf[model_names[idx]]]
        return (model_names,defaults[0])
@ -288,7 +294,9 @@ def save_args(args:dict):
    '''
    Save the current argument values to an omegaconf file
    '''
    conf_file = Path(Globals.root) / TRAINING_DIR / CONF_FILE
    dest_dir = Path(Globals.root) / TRAINING_DIR
    os.makedirs(dest_dir, exist_ok=True)
    conf_file = dest_dir / CONF_FILE
    conf = OmegaConf.create(args)
    OmegaConf.save(config=conf, f=conf_file)