# Aider chat conversation:
USER: `invokeai/backend/model_manager/convert_ckpt_to_diffusers.py` Im having issues with cpu memory, not gpu memory
ASSISTANT: To address CPU memory issues, we can make the following changes:
1. Use `del` to explicitly delete large objects when they are no longer needed.
2. Use `gc.collect()` to force garbage collection after deleting large objects.
Here are the *SEARCH/REPLACE* blocks:
invokeai/backend/model_manager/convert_ckpt_to_diffusers.py
```python
<<<<<<< SEARCH
vae = AutoencoderKL(**vae_config)
with torch.no_grad():
vae.load_state_dict(converted_vae_checkpoint)
vae.to(precision)
torch.cuda.empty_cache()
=======
vae = AutoencoderKL(**vae_config)
with torch.no_grad():
vae.load_state_dict(converted_vae_checkpoint)
del converted_vae_checkpoint # Free memory
import gc
gc.collect()
vae.to(precision)
>>>>>>> REPLACE
```
invokeai/backend/model_manager/convert_ckpt_to_diffusers.py
```python
<<<<<<< SEARCH
pipe = download_from_original_stable_diffusion_ckpt(Path(checkpoint_path).as_posix(), **kwargs)
with torch.no_grad():
pipe = pipe.to(precision)
torch.cuda.empty_cache()
=======
pipe = download_from_original_stable_diffusion_ckpt(Path(checkpoint_path).as_posix(), **kwargs)
with torch.no_grad():
del kwargs # Free memory
import gc
gc.collect()
pipe = pipe.to(precision)
>>>>>>> REPLACE
```
invokeai/backend/model_manager/convert_ckpt_to_diffusers.py
```python
<<<<<<< SEARCH
pipe = download_controlnet_from_original_ckpt(checkpoint_path.as_posix(), **kwargs)
with torch.no_grad():
pipe = pipe.to(precision)
torch.cuda.empty_cache()
=======
pipe = download_controlnet_from_original_ckpt(checkpoint_path.as_posix(), **kwargs)
with torch.no_grad():
del kwargs # Free memory
import gc
gc.collect()
pipe = pipe.to(precision)
>>>>>>> REPLACE
```
# Aider chat conversation:
USER: `invokeai/backend/model_manager/convert_ckpt_to_diffusers.py` Could you improve the memory usage efficiency of this code?
ASSISTANT: To improve the memory usage efficiency of the code, we can make the following changes:
1. Use `with torch.no_grad()` to avoid storing intermediate values for backpropagation.
2. Move the model to the desired device only when necessary.
3. Clear the cache after moving the model to the desired device.
Here are the *SEARCH/REPLACE* blocks:
invokeai/backend/model_manager/convert_ckpt_to_diffusers.py
```python
<<<<<<< SEARCH
vae.load_state_dict(converted_vae_checkpoint)
vae.to(precision)
=======
with torch.no_grad():
vae.load_state_dict(converted_vae_checkpoint)
vae.to(precision)
torch.cuda.empty_cache()
>>>>>>> REPLACE
```
invokeai/backend/model_manager/convert_ckpt_to_diffusers.py
```python
<<<<<<< SEARCH
pipe = pipe.to(precision)
=======
with torch.no_grad():
pipe = pipe.to(precision)
torch.cuda.empty_cache()
>>>>>>> REPLACE
```
invokeai/backend/model_manager/convert_ckpt_to_diffusers.py
```python
<<<<<<< SEARCH
pipe = pipe.to(precision)
=======
with torch.no_grad():
pipe = pipe.to(precision)
torch.cuda.empty_cache()
>>>>>>> REPLACE
```
This is required to get these event fields to deserialize correctly. If omitted, pydantic uses `BaseInvocation`/`BaseInvocationOutput`, which is not correct.
This is similar to the workaround in the `Graph` and `GraphExecutionState` classes where we need to fanagle pydantic with manual validation handling.
Note about the huge diff: I had a different version of pydantic installed at some point, which slightly altered a _ton_ of schema components. This typegen was done on the correct version of pydantic and un-does those alterations, in addition to the intentional changes to event models.
There's no longer any need for session-scoped events now that we have the session queue. Session started/completed/canceled map 1-to-1 to queue item status events, but queue item status events also have an event for failed state.
We can simplify queue and processor handling substantially by removing session events and instead using queue item events.
- Remove the session-scoped events entirely.
- Remove all event handling from session queue. The processor still needs to respond to some events from the queue: `QueueClearedEvent`, `BatchEnqueuedEvent` and `QueueItemStatusChangedEvent`.
- Pass an `is_canceled` callback to the invocation context instead of the cancel event
- Update processor logic to ensure the local instance of the current queue item is synced with the instance in the database. This prevents race conditions and ensures lifecycle callback do not get stale callbacks.
- Update docstrings and comments
- Add `complete_queue_item` method to session queue service as an explicit way to mark a queue item as successfully completed. Previously, the queue listened for session complete events to do this.
Closes#6442
- Restore calculation of step percentage but in the backend instead of client
- Simplify signatures for denoise progress event callbacks
- Clean up `step_callback.py` (types, do not recreate constant matrix on every step, formatting)
We don't need to use the payload schema registry. All our events are dispatched as pydantic models, which are already validated on instantiation.
We do want to add all events to the OpenAPI schema, and we referred to the payload schema registry for this. To get all events, add a simple helper to EventBase. This is functionally identical to using the schema registry.
The model loader emits events. During testing, it doesn't have access to a fully-mocked events service, so the test fails when attempting to call a nonexistent method. There was a check for this previously, but I accidentally removed it. Restored.
- Remove ABCs, they do not work well with pydantic
- Remove the event type classvar - unused
- Remove clever logic to require an event name - we already get validation for this during schema registration.
- Rename event bases to all end in "Base"