InvokeAI

mirror of https://github.com/invoke-ai/InvokeAI synced 2024-08-30 20:32:17 +00:00

Author	SHA1	Message	Date
psychedelicious	e365d35c93	docs(processor): update docstrings, comments	2024-05-24 20:02:24 +10:00
psychedelicious	2dd3a85ade	feat(processor): update enriched errors & `fail_queue_item()`	2024-05-24 20:02:24 +10:00
psychedelicious	a8492bd7e4	feat(events): add enriched errors to events	2024-05-24 20:02:24 +10:00
psychedelicious	25954ea750	feat(queue): session queue error handling - Add handling for new error columns `error_type`, `error_message`, `error_traceback`. - Update queue item model to include the new data. The `error_traceback` field has an alias of `error` for backwards compatibility. - Add `fail_queue_item` method. This was previously handled by `cancel_queue_item`. Splitting this functionality makes failing a queue item a bit more explicit. We also don't need to handle multiple optional error args. -	2024-05-24 20:02:24 +10:00
psychedelicious	887b73aece	feat(db): add `error_type`, `error_message`, rename `error` -> `error_traceback` to `session_queue` table	2024-05-24 20:02:24 +10:00
psychedelicious	3c41c67d13	fix(processor): restore missing update of session	2024-05-24 20:02:24 +10:00
psychedelicious	6c79be7dc3	chore: ruff	2024-05-24 20:02:24 +10:00
psychedelicious	097619ef51	feat(processor): get user/project from queue item w/ fallback	2024-05-24 20:02:24 +10:00
psychedelicious	a1f7a9cd6f	fix(app): fix logging of error classes instead of class names	2024-05-24 20:02:24 +10:00
psychedelicious	25b9c19eed	feat(app): handle preparation errors as node errors We were not handling node preparation errors as node errors before. Here's the explanation, copied from a comment that is no longer required: --- TODO(psyche): Sessions only support errors on nodes, not on the session itself. When an error occurs outside node execution, it bubbles up to the processor where it is treated as a queue item error. Nodes are pydantic models. When we prepare a node in `session.next()`, we set its inputs. This can cause a pydantic validation error. For example, consider a resize image node which has a constraint on its `width` input field - it must be greater than zero. During preparation, if the width is set to zero, pydantic will raise a validation error. When this happens, it breaks the flow before `invocation` is set. We can't set an error on the invocation because we didn't get far enough to get it - we don't know its id. Hence, we just set it as a queue item error. --- This change wraps the node preparation step with exception handling. A new `NodeInputError` exception is raised when there is a validation error. This error has the node (in the state it was in just prior to the error) and an identifier of the input that failed. This allows us to mark the node that failed preparation as errored, correctly making such errors _node_ errors and not _processor_ errors. It's much easier to diagnose these situations. The error messages look like this: > Node b5ac87c6-0678-4b8c-96b9-d215aee12175 has invalid incoming input for height Some of the exception handling logic is cleaned up.	2024-05-24 20:02:24 +10:00
psychedelicious	cc2d877699	docs(app): explain why errors are handled poorly	2024-05-24 20:02:24 +10:00
psychedelicious	be82404759	tidy(app): "outputs" -> "output"	2024-05-24 20:02:24 +10:00
psychedelicious	33f9fe2c86	tidy(app): rearrange processor	2024-05-24 20:02:24 +10:00
psychedelicious	1d973f92ff	feat(app): support multiple processor lifecycle callbacks	2024-05-24 20:02:24 +10:00
psychedelicious	7f70cde038	feat(app): make things in session runner private	2024-05-24 20:02:24 +10:00
psychedelicious	47722528a3	feat(app): iterate on processor split 2 - Use protocol to define callbacks, this allows them to have kwargs - Shuffle the profiler around a bit - Move `thread_limit` and `polling_interval` to `__init__`; `start` is called programmatically and will never get these args in practice	2024-05-24 20:02:24 +10:00
psychedelicious	be41c84305	feat(app): iterate on processor split - Add `OnNodeError` and `OnNonFatalProcessorError` callbacks - Move all session/node callbacks to `SessionRunner` - this ensures we dump perf stats before resetting them and generally makes sense to me - Remove `complete` event from `SessionRunner`, it's essentially the same as `OnAfterRunSession` - Remove extraneous `next_invocation` block, which would treat a processor error as a node error - Simplify loops - Add some callbacks for testing, to be removed before merge	2024-05-24 20:02:24 +10:00
brandonrising	82b4298b03	Fix next node calling logic	2024-05-24 20:02:24 +10:00
brandonrising	fa6c7badd6	Run ruff	2024-05-24 20:02:24 +10:00
brandonrising	45d2504c1e	Break apart session processor and the running of each session into separate classes	2024-05-24 20:02:24 +10:00
psychedelicious	93e4c3dbc2	feat(app): update queue item's session on session completion The session is never updated in the queue after it is first enqueued. As a result, the queue detail view in the frontend never updates and the session itself doesn't show outputs, execution graph, etc. We need a new method on the queue service to update a queue item's session, then call it before updating the queue item's status. Queue item status may be updated via a session-type event _or_ queue-type event. Adding the updated session to all these events is hairy - simpler to just update the session before we do anything that could trigger a queue item status change event: - Before calling `emit_session_complete` in the processor (handles session error, completed and cancel events and the corresponding queue events) - Before calling `cancel_queue_item` in the processor (handles another way queue items can be canceled, outside the session execution loop) When serializing the session, both in the new service method and the `get_queue_item` endpoint, we need to use `exclude_none=True` to prevent unexpected validation errors.	2024-05-24 08:59:49 +10:00
psychedelicious	17e1fc5254	chore(app): ruff	2024-05-18 09:21:45 +10:00
maryhipp	84e031edc2	add nullable project also	2024-05-18 09:21:45 +10:00
maryhipp	b6b7e737e0	ruff	2024-05-18 09:21:45 +10:00
maryhipp	5f3e7afd45	add nullable user to invocation error events	2024-05-18 09:21:45 +10:00
psychedelicious	b0cfca9d24	fix(app): pass image metadata as stringified json	2024-05-18 09:04:37 +10:00
psychedelicious	985ef89825	fix(app): type annotations in images service	2024-05-18 09:04:37 +10:00
psychedelicious	5928ade5fd	feat(app): simplified create image API Graph, metadata and workflow all take stringified JSON only. This makes the API consistent and means we don't need to do a round-trip of pydantic parsing when handling this data. It also prevents a failure mode where an uploaded image's metadata, workflow or graph are old and don't match the current schema. As before, the frontend does strict validation and parsing when loading these values.	2024-05-18 09:04:37 +10:00
psychedelicious	93ebc175c6	fix(app): retain graph in metadata when uploading images	2024-05-18 09:04:37 +10:00
psychedelicious	922716d2ab	feat(ui): store graph in image metadata The previous super-minimal implementation had a major issue - the saved workflow didn't take into account batched field values. When generating with multiple iterations or dynamic prompts, the same workflow with the first prompt, seed, etc was stored in each image. As a result, when the batch results in multiple queue items, only one of the images has the correct workflow - the others are mismatched. To work around this, we can store the _graph_ in the image metadata (alongside the workflow, if generated via workflow editor). When loading a workflow from an image, we can choose to load the workflow or the graph, preferring the workflow. Internally, we need to update images router image-saving services. The changes are minimal. To avoid pydantic errors deserializing the graph, when we extract it from the image, we will leave it as stringified JSON and let the frontend's more sophisticated and flexible parsing handle it. The workflow is also changed to just return stringified JSON, so the API is consistent.	2024-05-18 09:04:37 +10:00
psychedelicious	d861bc690e	feat(mm): handle PC_PATH_MAX on external drives on macOS `PC_PATH_MAX` doesn't exist for (some?) external drives on macOS. We need error handling when retrieving this value. Also added error handling for `PC_NAME_MAX` just in case. This does work for me for external drives on macOS, though. Closes #6277	2024-04-30 07:57:03 -04:00
psychedelicious	2cee436ecf	tidy(app): remove unused class	2024-04-23 17:12:14 +10:00
psychedelicious	e6386d969f	fix(app): only clear tempdirs if ephemeral and before creating tempdir Also, this needs to happen in init, else it deletes the temp dir created in init	2024-04-23 17:12:14 +10:00
Lincoln Stein	53808149fb	moved cleanup routine into object_serializer_disk.py	2024-04-23 17:12:14 +10:00
Lincoln Stein	2b9f06dc4c	Re-enable app shutdown actions (#6244 ) * closes #6242 * only override sigINT during slow model scanning * fix ruff formatting --------- Co-authored-by: Lincoln Stein <lstein@gmail.com>	2024-04-19 06:45:42 -04:00
Lincoln Stein	fce6b3e44c	maybe solve race issue	2024-04-16 13:09:26 +10:00
Lincoln Stein	e93f4d632d	[util] Add generic torch device class (#6174 ) * introduce new abstraction layer for GPU devices * add unit test for device abstraction * fix ruff * convert TorchDeviceSelect into a stateless class * move logic to select context-specific execution device into context API * add mock hardware environments to pytest * remove dangling mocker fixture * fix unit test for running on non-CUDA systems * remove unimplemented get_execution_device() call * remove autocast precision * Multiple changes: 1. Remove TorchDeviceSelect.get_execution_device(), as well as calls to context.models.get_execution_device(). 2. Rename TorchDeviceSelect to TorchDevice 3. Added back the legacy public API defined in `invocation_api`, including choose_precision(). 4. Added a config file migration script to accommodate removal of precision=autocast. * add deprecation warnings to choose_torch_device() and choose_precision() * fix test crash * remove app_config argument from choose_torch_device() and choose_torch_dtype() --------- Co-authored-by: Lincoln Stein <lstein@gmail.com>	2024-04-15 13:12:49 +00:00
psychedelicious	b18442ded4	fix(queue): poll queue on finished queue item When a queue item is finished (completed, canceled, failed), immediately poll the queue for the next queue item. Closes #6189	2024-04-12 07:31:47 +10:00
Lincoln Stein	dedf0c6ffa	fix ruff issues	2024-04-12 07:19:16 +10:00
Lincoln Stein	579082ac10	[mm] clear the cache entry for a model that got an OOM during loading	2024-04-12 07:19:16 +10:00
fieldOfView	dca30d5462	(feat) add a method to get the path of an image from the invocation context Fixes #6175	2024-04-08 18:42:55 +10:00
Lincoln Stein	812f10730f	adjust free vram calculation for models that will be removed by lazy offloading (#6150 ) Co-authored-by: Lincoln Stein <lstein@gmail.com>	2024-04-04 22:51:12 -04:00
psychedelicious	8c15d14099	fix: use locale encoding We have had a few bugs with v4 related to file encodings, especially on Windows. Windows uses its own character encodings instead of `utf-8`, often `cp1252`. Some characters cannot be decoded using `utf-8`, causing `UnicodeDecodeError`. There are a couple places where this can cause problems: - In the installer bootstrap, we install or upgrade `pip` and decode the result, using `subprocess`. The input to this includes the user's home dir. In #6105, the user had one of the problematic characters in their username. `subprocess` attempts and fails to decode the username, which crashes the installer. To fix this, we need to use `locale.getpreferredencoding()` when executing the command. - Similarly, in the model install service and config class, we attempt to load a yaml config file. If a problematic character is in the path to the file (which often includes the user's home dir), we can get the same error. One example is #6129 in which the models.yaml migration fails. To fix this, we need to open the file with `locale.getpreferredencoding()`.	2024-04-04 15:30:47 +11:00
psychedelicious	9c51abb46e	fix(config): get root from venv This logic was a bit wonky. It only selected the `venv` parent if there was already an `invokeai.yaml` file in it. Removed this constraint.	2024-04-04 10:54:23 +11:00
psychedelicious	7ff2371c07	fix(mm): do not rename model file if model record is renamed Renaming the model file to the model name introduces unnecessary constraints on model names. For example, a model name can technically be any length, but a model _filename_ cannot be too long. There are also constraints on valid characters for filenames which shouldn't be applied to model record names. I believe the old behaviour is a holdover from the old system.	2024-04-04 07:17:38 +11:00
psychedelicious	e655399324	fix(config): handle windows paths in invokeai.yaml migration for legacy_conf_dir The logic incorrectly set the `legacy_conf_dir` on windows, where the slashes go the other direction. Handle this case and update tests to catch it.	2024-04-02 08:06:59 -04:00
psychedelicious	f75de8a35c	feat(db): add migration 9 - empty session queue Empties the session queue. This is done to prevent any lingering session queue items from causing pydantic errors due to changed schemas.	2024-04-02 13:25:14 +11:00
psychedelicious	4049217728	feat(db): back up database before running migrations Just in case.	2024-04-02 09:10:53 +11:00
psychedelicious	f83edcf990	feat(nodes): simplify processor loop with an early continue Prefer an early return/continue to reduce the indentation of the processor loop. Easier to read. There are other ways to improve its structure but at first glance, they seem to involve changing the logic in scarier ways.	2024-04-01 08:39:25 +11:00
psychedelicious	a6dd50aeaf	fix(nodes): 100% cpu usage when processor paused Should be waiting on the resume event instead of checking it in a loop	2024-04-01 08:39:25 +11:00
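
Commit a6dd50aeaf describes replacing a loop that repeatedly checked the resume event with a blocking wait on it. The sketch below illustrates that general pattern with Python's standard `threading.Event`. It is not the InvokeAI processor code: the class `PausableWorker`, its methods, and the per-iteration sleep are all hypothetical names and choices made for illustration.

```python
import threading
import time


class PausableWorker:
    """Hypothetical worker showing a blocking pause; not InvokeAI's actual processor."""

    def __init__(self) -> None:
        self._resume_event = threading.Event()
        self._resume_event.set()  # start in the running (not paused) state
        self._stop_event = threading.Event()
        self.iterations = 0
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def pause(self) -> None:
        self._resume_event.clear()

    def resume(self) -> None:
        self._resume_event.set()

    def stop(self) -> None:
        self._stop_event.set()
        self._resume_event.set()  # wake the thread if it is paused so it can exit
        self._thread.join(timeout=2)

    def is_alive(self) -> bool:
        return self._thread.is_alive()

    def _run(self) -> None:
        while not self._stop_event.is_set():
            # Block here while paused. A loop that polls is_set() without
            # sleeping would instead spin and keep a CPU core busy.
            self._resume_event.wait()
            if self._stop_event.is_set():
                break
            self.iterations += 1  # stand-in for processing one queue item
            time.sleep(0.001)
```

While paused, the worker thread sleeps inside `wait()` until `resume()` or `stop()` sets the event, so pausing costs no CPU time. Setting the resume event in `stop()` is a design choice in this sketch so that a paused worker can still shut down.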

923 Commits