InvokeAI

mirror of https://github.com/invoke-ai/InvokeAI synced 2025-07-25 12:55:55 +00:00

Author	SHA1	Message	Date
Kevin Turner	52a8ad1c18	chore: rename model.size to model.file_size to disambiguate from RAM size or pixel size	2025-04-10 09:53:03 +10:00
Kevin Turner	98260a8efc	test: add size field to test model configs	2025-04-10 09:53:03 +10:00
Billy	182580ff69	Imports	2025-03-26 12:55:10 +11:00
Billy	8e9d5c1187	Ruff formatting	2025-03-26 12:30:31 +11:00
Billy	99aac5870e	Remove star imports	2025-03-26 12:27:00 +11:00
Ryan Dick	5357d6e08e	Rename ConcatenatedLoRALayer to MergedLayerPatch. And other minor cleanup.	2025-01-28 14:51:35 +00:00
Ryan Dick	28514ba59a	Update ConcatenatedLoRALayer to work with all sub-layer types.	2025-01-28 14:51:35 +00:00
Ryan Dick	e2f05d0800	Add unit tests for LoKR patch layers. The new tests trigger a bug when LoKR layers are applied to BnB-quantized layers (also impacts several other LoRA variant types).	2025-01-22 09:20:40 +11:00
Ryan Dick	36a3869af0	Add keep_ram_copy_of_weights config option.	2025-01-16 15:35:25 +00:00
Ryan Dick	c76d08d1fd	Add keep_ram_copy option to CachedModelOnlyFullLoad.	2025-01-16 15:08:23 +00:00
Ryan Dick	04087c38ce	Add keep_ram_copy option to CachedModelWithPartialLoad.	2025-01-16 14:51:44 +00:00
Ryan Dick	974b4671b1	Deprecate the `ram` and `vram` configs to make the migration to dynamic memory limits smoother for users who had previously overridden these values.	2025-01-07 16:45:29 +00:00
Ryan Dick	d7ab464176	Offload the current model when locking if it is already partially loaded and we have insufficient VRAM.	2025-01-07 02:53:44 +00:00
Ryan Dick	5eafe1ec7a	Fix ModelCache execution device selection in unit tests.	2025-01-07 01:20:15 +00:00
Ryan Dick	a167632f09	Calculate model cache size limits dynamically based on the available RAM / VRAM.	2025-01-07 01:14:20 +00:00
Ryan Dick	402dd840a1	Add seed to flaky unit test.	2025-01-07 00:31:00 +00:00
Ryan Dick	d0bfa019be	Add 'enable_partial_loading' config flag.	2025-01-07 00:31:00 +00:00
Ryan Dick	535e45cedf	First pass at adding partial loading support to the ModelCache.	2025-01-07 00:30:58 +00:00
Ryan Dick	9a0a226ce1	Fix bitsandbytes imports in unit tests on MacOS.	2024-12-30 10:41:48 -05:00
Ryan Dick	52fc5a64d4	Add a unit test for a LoRA patch applied to a quantized linear layer with weights streamed from CPU to GPU.	2024-12-29 17:14:55 +00:00
Ryan Dick	a8bef59699	First pass at making custom layer patches work with weights streamed from the CPU to the GPU.	2024-12-29 17:01:37 +00:00
Ryan Dick	6d49ee839c	Switch the LayerPatcher to use 'custom modules' to manage layer patching.	2024-12-29 01:18:30 +00:00
Ryan Dick	918f541af8	Add unit test for a SetParameterLayer patch applied to a CustomFluxRMSNorm layer.	2024-12-28 20:44:48 +00:00
Ryan Dick	93e76b61d6	Add CustomFluxRMSNorm layer.	2024-12-28 20:33:38 +00:00
Ryan Dick	f2981979f9	Get custom layer patches working with all quantized linear layer types.	2024-12-27 22:00:22 +00:00
Ryan Dick	ef970a1cdc	Add support for FluxControlLoRALayer in CustomLinear layers and add a unit test for it.	2024-12-27 21:00:47 +00:00
Ryan Dick	5ee7405f97	Add more unit tests for custom module LoRA patching: multiple LoRAs and ConcatenatedLoRALayers.	2024-12-27 19:47:21 +00:00
Ryan Dick	e24e386a27	Add support for patches to CustomModuleMixin and add a single unit test (more to come).	2024-12-27 18:57:13 +00:00
Ryan Dick	b06d61e3c0	Improve custom layer wrap/unwrap logic.	2024-12-27 16:29:48 +00:00
Ryan Dick	7d6ab0ceb2	Add a CustomModuleMixin class with a flag for enabling/disabling autocasting (since it incurs some runtime speed overhead.)	2024-12-26 20:08:30 +00:00
Ryan Dick	9692a36dd6	Use a fixture to parameterize tests in test_all_custom_modules.py so that a fresh instance of the layer under test is initialized for each test.	2024-12-26 19:41:25 +00:00
Ryan Dick	b0b699a01f	Add unit test to test that isinstance(...) behaves as expected with custom module types.	2024-12-26 18:45:56 +00:00
Ryan Dick	a8b2c4c3d2	Add inference tests for all custom module types (i.e. to test autocasting from cpu to device).	2024-12-26 18:33:46 +00:00
Ryan Dick	03944191db	Split test_autocast_modules.py into separate test files to mirror the source file structure.	2024-12-24 22:29:11 +00:00
Ryan Dick	987c9ae076	Move custom autocast modules to separate files in a custom_modules/ directory.	2024-12-24 22:21:31 +00:00
Ryan Dick	0fc538734b	Skip flaky test when running on GitHub Actions, and further reduce peak unit test memory.	2024-12-24 14:32:11 +00:00
Ryan Dick	7214d4969b	Workaround a weird quirk of QuantState.to() and add a unit test to exercise it.	2024-12-24 14:32:11 +00:00
Ryan Dick	a83a999b79	Reduce peak memory used for unit tests.	2024-12-24 14:32:11 +00:00
Ryan Dick	f8a6accf8a	Fix bitsandbytes imports to avoid ImportErrors on MacOS.	2024-12-24 14:32:11 +00:00
Ryan Dick	f8ab414f99	Add CachedModelOnlyFullLoad to mirror the CachedModelWithPartialLoad for models that cannot or should not be partially loaded.	2024-12-24 14:32:11 +00:00
Ryan Dick	c6795a1b47	Make CachedModelWithPartialLoad work with models that have non-persistent buffers.	2024-12-24 14:32:11 +00:00
Ryan Dick	0a8fc74ae9	Add CachedModelWithPartialLoad to manage partially-loaded models using the new autocast modules.	2024-12-24 14:32:11 +00:00
Ryan Dick	dc54e8763b	Add CustomInvokeLinearNF4 to enable CPU -> GPU streaming for InvokeLinearNF4 layers.	2024-12-24 14:32:11 +00:00
Ryan Dick	1b56020876	Add CustomInvokeLinear8bitLt layer for device streaming with InvokeLinear8bitLt layers.	2024-12-24 14:32:11 +00:00
Ryan Dick	97d56f7dc9	Add torch module autocast unit test for GGUF-quantized models.	2024-12-24 14:32:11 +00:00
Ryan Dick	fe0ef2c27c	Add torch module autocast utilities.	2024-12-24 14:32:11 +00:00
Ryan Dick	d30a9ced38	Rename model_cache_default.py -> model_cache.py.	2024-12-24 14:23:18 +00:00
Ryan Dick	e0bfa6157b	Remove ModelCacheBase.	2024-12-24 14:23:18 +00:00
Ryan Dick	fef26a5f2f	Consolidate all LoRA patching logic in the LoRAPatcher.	2024-09-15 04:39:56 +03:00
Ryan Dick	92b8477299	Fixup FLUX LoRA unit tests.	2024-09-15 04:39:56 +03:00

81 Commits