# Aider chat conversation:
USER: `invokeai/backend/model_manager/convert_ckpt_to_diffusers.py` I'm having issues with CPU memory, not GPU memory
ASSISTANT: To address CPU memory issues, we can make the following changes:
1. Use `del` to explicitly delete large objects when they are no longer needed.
2. Use `gc.collect()` to force garbage collection after deleting large objects.
Here are the *SEARCH/REPLACE* blocks:
invokeai/backend/model_manager/convert_ckpt_to_diffusers.py
```python
<<<<<<< SEARCH
vae = AutoencoderKL(**vae_config)
with torch.no_grad():
    vae.load_state_dict(converted_vae_checkpoint)
    vae.to(precision)
    torch.cuda.empty_cache()
=======
vae = AutoencoderKL(**vae_config)
with torch.no_grad():
    vae.load_state_dict(converted_vae_checkpoint)
    del converted_vae_checkpoint  # Free memory
    import gc
    gc.collect()
    vae.to(precision)
>>>>>>> REPLACE
```
invokeai/backend/model_manager/convert_ckpt_to_diffusers.py
```python
<<<<<<< SEARCH
pipe = download_from_original_stable_diffusion_ckpt(Path(checkpoint_path).as_posix(), **kwargs)
with torch.no_grad():
    pipe = pipe.to(precision)
    torch.cuda.empty_cache()
=======
pipe = download_from_original_stable_diffusion_ckpt(Path(checkpoint_path).as_posix(), **kwargs)
with torch.no_grad():
    del kwargs  # Free memory
    import gc
    gc.collect()
    pipe = pipe.to(precision)
>>>>>>> REPLACE
```
invokeai/backend/model_manager/convert_ckpt_to_diffusers.py
```python
<<<<<<< SEARCH
pipe = download_controlnet_from_original_ckpt(checkpoint_path.as_posix(), **kwargs)
with torch.no_grad():
    pipe = pipe.to(precision)
    torch.cuda.empty_cache()
=======
pipe = download_controlnet_from_original_ckpt(checkpoint_path.as_posix(), **kwargs)
with torch.no_grad():
    del kwargs  # Free memory
    import gc
    gc.collect()
    pipe = pipe.to(precision)
>>>>>>> REPLACE
```
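For context, a minimal standalone sketch of the pattern the blocks above apply (toy model and checkpoint, not the real conversion code):

```python
import gc

import torch

# Toy stand-ins for a large checkpoint dict and the model it populates.
checkpoint = {"weight": torch.zeros(1, 8), "bias": torch.zeros(1)}
model = torch.nn.Linear(8, 1)

with torch.no_grad():
    model.load_state_dict(checkpoint)

del checkpoint  # drop the last reference to the large dict
gc.collect()    # reclaim the CPU memory now rather than at a later collection
```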
# Aider chat conversation:
USER: `invokeai/backend/model_manager/convert_ckpt_to_diffusers.py` Could you improve the memory usage efficiency of this code?
ASSISTANT: To improve the memory usage efficiency of the code, we can make the following changes:
1. Use `with torch.no_grad()` to avoid storing intermediate values for backpropagation.
2. Move the model to the desired device only when necessary.
3. Clear the cache after moving the model to the desired device.
Here are the *SEARCH/REPLACE* blocks:
invokeai/backend/model_manager/convert_ckpt_to_diffusers.py
```python
<<<<<<< SEARCH
vae.load_state_dict(converted_vae_checkpoint)
vae.to(precision)
=======
with torch.no_grad():
    vae.load_state_dict(converted_vae_checkpoint)
    vae.to(precision)
    torch.cuda.empty_cache()
>>>>>>> REPLACE
```
invokeai/backend/model_manager/convert_ckpt_to_diffusers.py
```python
<<<<<<< SEARCH
pipe = pipe.to(precision)
=======
with torch.no_grad():
    pipe = pipe.to(precision)
    torch.cuda.empty_cache()
>>>>>>> REPLACE
```
invokeai/backend/model_manager/convert_ckpt_to_diffusers.py
```python
<<<<<<< SEARCH
pipe = pipe.to(precision)
=======
with torch.no_grad():
    pipe = pipe.to(precision)
    torch.cuda.empty_cache()
>>>>>>> REPLACE
```
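For context, a minimal standalone sketch of the pattern applied above (toy model, not the real conversion code):

```python
import torch

model = torch.nn.Linear(8, 8)

# Cast without tracking gradients, then release cached allocator blocks.
with torch.no_grad():
    model = model.to(torch.float16)

if torch.cuda.is_available():
    torch.cuda.empty_cache()
```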
## Summary
- Updated the documentation for `TextualInversionManager`
- Updated the `self.tokenizer.model_max_length` access to work with the
latest transformers version. Thanks to @skunkworxdark for looking into
this here:
https://github.com/invoke-ai/InvokeAI/issues/6445#issuecomment-2133098342
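For illustration, a hedged sketch of a version-tolerant way to read `model_max_length` (not the actual patch; the fallback value and threshold here are assumptions):

```python
from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

# Hypothetical illustration: read the limit once and fall back to a sane
# default if the tokenizer reports the "very large integer" sentinel that
# transformers uses when no limit is configured.
max_length = tokenizer.model_max_length
if max_length is None or max_length > 10**6:
    max_length = 77  # CLIP's usual context length
```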
## Related Issues / Discussions
Closes #6445
## QA Instructions
I tested with `transformers==4.41.1` and compared the results against a
recent InvokeAI version from before the transformers update - no change,
as expected.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
This is required to get these event fields to deserialize correctly. If omitted, pydantic uses `BaseInvocation`/`BaseInvocationOutput`, which is not correct.
This is similar to the workaround in the `Graph` and `GraphExecutionState` classes, where we need to finagle pydantic with manual validation handling.
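For illustration, a minimal sketch of this kind of manual validation handling (class names and fields are hypothetical, not InvokeAI's actual models):

```python
from typing import Any

from pydantic import BaseModel, field_validator

class BaseInvocation(BaseModel):
    type: str

class AddInvocation(BaseInvocation):
    type: str = "add"
    a: int = 0
    b: int = 0

# Maps the discriminator value to the concrete class to validate against.
INVOCATION_TYPES: dict[str, type[BaseInvocation]] = {"add": AddInvocation}

class InvocationStartedEvent(BaseModel):
    # Annotated with the base class: without the validator below, pydantic
    # validates incoming dicts against BaseInvocation and drops subclass fields.
    invocation: BaseInvocation

    @field_validator("invocation", mode="before")
    @classmethod
    def _validate_invocation(cls, v: Any) -> Any:
        if isinstance(v, dict) and v.get("type") in INVOCATION_TYPES:
            return INVOCATION_TYPES[v["type"]].model_validate(v)
        return v

event = InvocationStartedEvent.model_validate(
    {"invocation": {"type": "add", "a": 1, "b": 2}}
)
assert type(event.invocation) is AddInvocation
```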
Note about the huge diff: I had a different version of pydantic installed at some point, which slightly altered a _ton_ of schema components. This typegen was done on the correct version of pydantic and undoes those alterations, in addition to the intentional changes to the event models.
There's no longer any need for session-scoped events now that we have the session queue. Session started/completed/canceled events map 1-to-1 to queue item status events, and queue item status events additionally cover the failed state.
We can simplify queue and processor handling substantially by removing session events and instead using queue item events.
- Remove the session-scoped events entirely.
- Remove all event handling from the session queue. The processor still needs to respond to some events from the queue: `QueueClearedEvent`, `BatchEnqueuedEvent` and `QueueItemStatusChangedEvent`.
- Pass an `is_canceled` callback to the invocation context instead of the cancel event (see the sketch after this list).
- Update processor logic to ensure the local instance of the current queue item is synced with the instance in the database. This prevents race conditions and ensures lifecycle callbacks do not receive stale queue items.
- Update docstrings and comments.
- Add `complete_queue_item` method to session queue service as an explicit way to mark a queue item as successfully completed. Previously, the queue listened for session complete events to do this.
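A minimal sketch of the `is_canceled` callback pattern (names hypothetical, not the actual service wiring):

```python
from typing import Callable

class InvocationContext:
    """Hypothetical simplification: the context holds a callback, so every
    check reads the queue's current cancellation state instead of a
    possibly-stale threading.Event."""

    def __init__(self, is_canceled: Callable[[], bool]) -> None:
        self._is_canceled = is_canceled

    @property
    def is_canceled(self) -> bool:
        return self._is_canceled()

# The processor would supply a closure over the current queue item's id.
canceled_item_ids: set[int] = set()
ctx = InvocationContext(is_canceled=lambda: 42 in canceled_item_ids)

assert not ctx.is_canceled
canceled_item_ids.add(42)
assert ctx.is_canceled  # the callback sees the fresh state immediately
```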
Closes#6442
- Restore calculation of step percentage, but in the backend instead of the client (see the sketch after this list)
- Simplify signatures for denoise progress event callbacks
- Clean up `step_callback.py` (types, do not recreate constant matrix on every step, formatting)
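A minimal sketch of what a backend-side step-percentage calculation can look like (hypothetical; the real formula may differ, e.g. to account for the scheduler's order):

```python
def calc_step_percentage(step: int, total_steps: int) -> float:
    # Report denoise progress as a fraction in [0, 1] for the progress event.
    if total_steps <= 0:
        return 0.0
    return min(1.0, (step + 1) / total_steps)
```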
We don't need to use the payload schema registry. All our events are dispatched as pydantic models, which are already validated on instantiation.
We do still want to add all events to the OpenAPI schema, and we previously relied on the payload schema registry for this. To get all events, add a simple helper to `EventBase`. This is functionally identical to using the schema registry.
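A minimal sketch of such a helper (hypothetical, illustrating the subclass walk):

```python
from pydantic import BaseModel

class EventBase(BaseModel):
    """Hypothetical sketch of the helper described above."""

    @classmethod
    def get_events(cls) -> set[type["EventBase"]]:
        # Walk the subclass tree so every concrete event model can be
        # registered in the OpenAPI schema, just as the registry used to.
        events: set[type[EventBase]] = set()
        for subclass in cls.__subclasses__():
            events.add(subclass)
            events |= subclass.get_events()
        return events
```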