InvokeAI

mirror of https://github.com/invoke-ai/InvokeAI synced 2024-08-30 20:32:17 +00:00

Author	SHA1	Message	Date
Ryan Dick	83f82c5ddf	Switch the CLIP-L start model to use our hosted version - which is much smaller.	2024-08-26 20:17:50 -04:00
Brandon Rising	101de8c25d	Update t5 encoder formats to accurately reflect the quantization strategy and data type	2024-08-26 20:17:50 -04:00
Ryan Dick	75d8ac378c	Update the T5 8-bit quantized starter model to use the BnB LLM.int8() variant.	2024-08-26 20:17:50 -04:00
Brandon Rising	1047584b3e	Only import bnb quantize file if bitsandbytes is installed	2024-08-26 20:17:50 -04:00
Ryan Dick	a0bf20bcee	Run FLUX VAE decoding in the user's preferred dtype rather than float32. Tested, and seems to work well at float16.	2024-08-26 20:17:50 -04:00
Ryan Dick	1c1f2c6664	Add comment about incorrect T5 Tokenizer size calculation.	2024-08-26 20:17:50 -04:00
Brandon Rising	c27d59baf7	Run ruff	2024-08-26 20:17:50 -04:00
Brandon Rising	72398350b4	More flux loader cleanup	2024-08-26 20:17:50 -04:00
Brandon Rising	df9445c351	Various styling and exception type updates	2024-08-26 20:17:50 -04:00
Brandon Rising	87b7a2e39b	Switch inheritance class of flux model loaders	2024-08-26 20:17:50 -04:00
Brandon Rising	57168d719b	Fix styling/lint	2024-08-26 20:17:50 -04:00
Brandon Rising	dee6d2c98e	Fix support for 8b quantized t5 encoders, update exception messages in flux loaders	2024-08-26 20:17:50 -04:00
Ryan Dick	0c5e11f521	Fix FLUX output image clamping. And a few other minor fixes to make inference work with the full bfloat16 FLUX transformer model.	2024-08-26 20:17:50 -04:00
Brandon Rising	a63f842a13	Select dev/schnell based on state dict, use correct max seq len based on dev/schnell, and shift in inference, separate vae flux params into separate config	2024-08-26 20:17:50 -04:00
Brandon Rising	4bd7fda694	Install sub directories with folders correctly, ensure consistent dtype of tensors in flux pipeline and vae	2024-08-26 20:17:50 -04:00
Brandon Rising	81f0886d6f	Working inference node with quantized bnb nf4 checkpoint	2024-08-26 20:17:50 -04:00
Brandon Rising	1bd90e0fd4	Run ruff, setup initial text to image node	2024-08-26 20:17:50 -04:00
Brandon Rising	436f18ff55	Add backend functions and classes for Flux implementation, Update the way flux encoders/tokenizers are loaded for prompt encoding, Update way flux vae is loaded	2024-08-26 20:17:50 -04:00
Brandon Rising	9ed53af520	Run Ruff	2024-08-26 20:17:50 -04:00
Brandon Rising	56fda669fd	Manage quantization of models within the loader	2024-08-26 20:17:50 -04:00
blessedcoolant	4f8a4b0f22	Merge branch 'main' into depth_anything_v2	2024-08-03 00:38:57 +05:30
Ryan Dick	b9dc3460ba	Rename SegmentAnythingModel -> SegmentAnythingPipeline.	2024-08-01 09:57:47 -04:00
Ryan Dick	fca119773b	Split invokeai/backend/image_util/segment_anything/ dir into grounding_dino/ and segment_anything/	2024-07-31 12:28:47 -04:00
Ryan Dick	9f448fecb7	Move invokeai/backend/grounded_sam -> invokeai/backend/image_util/grounded_sam	2024-07-31 10:00:30 -04:00
blessedcoolant	18f89ed5ed	fix: Make DepthAnything work with Invoke's Model Management	2024-07-31 03:57:54 +05:30
Ryan Dick	ff6398f7d8	Add a GroundedSamInvocation for image segmentation from a text prompt (Grounding DINO + Segment Anything Model).	2024-07-30 11:12:26 -04:00
psychedelicious	74cef38bcf	fix(backend): add refiner to single-file `load_classes` Fixes single-file refiner loading.	2024-07-26 05:08:01 +10:00
Lincoln Stein	97a7f51721	don't use cpu state_dict for model unpatching when executing on cpu (#6631) Co-authored-by: Lincoln Stein <lstein@gmail.com>	2024-07-18 15:34:01 -04:00
Ryan Dick	81991e072b	Merge branch 'main' into ryan/spandrel-upscale	2024-07-16 15:14:08 -04:00
psychedelicious	38343917f8	fix(backend): revert non-blocking device transfer In #6490 we enabled non-blocking torch device transfers throughout the model manager's memory management code. When using this torch feature, torch attempts to wait until the tensor transfer has completed before allowing any access to the tensor. Theoretically, that should make this a safe feature to use. This provides a small performance improvement but causes race conditions in some situations. Specific platforms/systems are affected, and complicated data dependencies can make this unsafe. - Intermittent black images on MPS devices - reported on discord and #6545, fixed with special handling in #6549. - Intermittent OOMs and black images on a P4000 GPU on Windows - reported in #6613, fixed in this commit. On my system, I haven't experienced any issues with generation, but targeted testing of non-blocking ops did expose a race condition when moving tensors from CUDA to CPU. One workaround is to use torch streams with manual sync points. Our application logic is complicated enough that this would be a lot of work and feels ripe for edge cases and missed spots. Much safer is to fully revert non-blocking - which is what this change does.	2024-07-16 08:59:42 +10:00
Ryan Dick	7b5d4935b4	Merge branch 'main' into ryan/spandrel-upscale	2024-07-09 13:47:11 -04:00
Ryan Dick	af63c538ed	Demote error log to warning to models treated as having size 0.	2024-07-09 08:35:43 -04:00
Ryan Dick	1d449097cc	Apply ruff rule to disallow all relative imports.	2024-07-04 09:35:37 -04:00
Ryan Dick	9da5925287	Add ruff rule to disallow relative parent imports.	2024-07-04 09:35:37 -04:00
Ryan Dick	414750a45d	Update calc_model_size_by_data(...) to handle all expected model types, and to log an error if an unexpected model type is received.	2024-07-04 09:08:25 -04:00
Ryan Dick	a405f14ea2	Fix SpandrelImageToImageModel size calculation for the model cache.	2024-07-03 16:38:16 -04:00
Ryan Dick	2a1514272f	Set the dtype correctly for SpandrelImageToImageModels when they are loaded.	2024-07-03 16:28:21 -04:00
Ryan Dick	59ce9cf41c	WIP - Begin to integrate SpandrelImageToImageModel type into the model manager.	2024-07-03 16:28:21 -04:00
Ryan Dick	e4813f800a	Update calc_model_size_by_data(...) to handle all expected model types, and to log an error if an unexpected model type is received.	2024-07-02 21:51:45 -04:00
Lincoln Stein	3e0fb45dd7	Load single-file checkpoints directly without conversion (#6510) * use model_class.load_singlefile() instead of converting; works, but performance is poor * adjust the convert api - not right just yet * working, needs sql migrator update * rename migration_11 before conflict merge with main * Update invokeai/backend/model_manager/load/model_loaders/stable_diffusion.py Co-authored-by: Ryan Dick <ryanjdick3@gmail.com> * Update invokeai/backend/model_manager/load/model_loaders/stable_diffusion.py Co-authored-by: Ryan Dick <ryanjdick3@gmail.com> * implement lightweight version-by-version config migration * simplified config schema migration code * associate sdxl config with sdxl VAEs * remove use of original_config_file in load_single_file() --------- Co-authored-by: Lincoln Stein <lstein@gmail.com> Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>	2024-06-27 17:31:28 -04:00
Ryan Dick	14775cc9c4	ruff format	2024-06-27 09:45:13 -04:00
psychedelicious	c7562dd6c0	fix(backend): mps should not use `non_blocking` We can get black outputs when moving tensors from CPU to MPS. It appears MPS to CPU is fine. See: - https://github.com/pytorch/pytorch/issues/107455 - https://discuss.pytorch.org/t/should-we-set-non-blocking-to-true/38234/28 Changes: - Add properties for each device on `TorchDevice` as a convenience. - Add `get_non_blocking` static method on `TorchDevice`. This utility takes a torch device and returns the flag to be used for non_blocking when moving a tensor to the device provided. - Update model patching and caching APIs to use this new utility. Fixes: #6545	2024-06-27 19:15:23 +10:00
Lincoln Stein	b03073d888	[MM] Add support for probing and loading SDXL VAE checkpoint files (#6524) * add support for probing and loading SDXL VAE checkpoint files * broaden regexp probe for SDXL VAEs --------- Co-authored-by: Lincoln Stein <lstein@gmail.com>	2024-06-20 02:57:27 +00:00
Lincoln Stein	a3cb5da130	Improve RAM<->VRAM memory copy performance in LoRA patching and elsewhere (#6490 ) * allow model patcher to optimize away the unpatching step when feasible * remove lazy_offloading functionality * allow model patcher to optimize away the unpatching step when feasible * remove lazy_offloading functionality * do not save original weights if there is a CPU copy of state dict * Update invokeai/backend/model_manager/load/load_base.py Co-authored-by: Ryan Dick <ryanjdick3@gmail.com> * documentation fixes requested during penultimate review * add non-blocking=True parameters to several torch.nn.Module.to() calls, for slight performance increases * fix ruff errors * prevent crash on non-cuda-enabled systems --------- Co-authored-by: Lincoln Stein <lstein@gmail.com> Co-authored-by: Kent Keirsey <31807370+hipsterusername@users.noreply.github.com> Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>	2024-06-13 17:10:03 +00:00
psychedelicious	fde58ce0a3	Merge remote-tracking branch 'origin/main' into lstein/feat/simple-mm2-api	2024-06-07 14:23:41 +10:00
Lincoln Stein	f81b8bc9f6	add support for generic loading of diffusers directories	2024-06-07 13:54:30 +10:00
Lincoln Stein	2871676f79	LoRA patching optimization (#6439 ) * allow model patcher to optimize away the unpatching step when feasible * remove lazy_offloading functionality * allow model patcher to optimize away the unpatching step when feasible * remove lazy_offloading functionality * do not save original weights if there is a CPU copy of state dict * Update invokeai/backend/model_manager/load/load_base.py Co-authored-by: Ryan Dick <ryanjdick3@gmail.com> * documentation fixes added during penultimate review --------- Co-authored-by: Lincoln Stein <lstein@gmail.com> Co-authored-by: Kent Keirsey <31807370+hipsterusername@users.noreply.github.com> Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>	2024-06-06 13:53:35 +00:00
psychedelicious	e7513f6088	docs(mm): add comment in `move_model_to_device`	2024-06-03 10:56:04 +10:00
Lincoln Stein	2276f327e5	Merge branch 'main' into lstein/feat/simple-mm2-api	2024-06-02 09:45:31 -04:00
Lincoln Stein	21a60af881	when unlocking models, offload_unlocked_models should prune to vram limit only (#6450) Co-authored-by: Lincoln Stein <lstein@gmail.com>	2024-05-29 03:01:21 +00:00

120 Commits