InvokeAI

mirror of https://github.com/invoke-ai/InvokeAI synced 2024-08-30 20:32:17 +00:00

Author	SHA1	Message	Date
Ryan Dick	29fe1533f2	Fix bug in InvokeLinear8bitLt that was causing old state information to persist after loading from a state dict. This manifested as state tensors being left on the GPU even when a model had been offloaded to the CPU cache.	2024-08-29 19:08:18 +00:00
Brandon Rising	65bb46bcca	Rename params for flux and flux vae, add comments explaining use of the config_path in model config	2024-08-26 20:17:50 -04:00
Ryan Dick	635d2f480d	ruff	2024-08-26 20:17:50 -04:00
Brandon Rising	56b9906e2e	Setup scaffolding for in progress images and add ability to cancel the flux node	2024-08-26 20:17:50 -04:00
Ryan Dick	dff4a88baa	Move quantization scripts to a scripts/ subdir.	2024-08-26 20:17:50 -04:00
Ryan Dick	a21f6c4964	Update docs for T5 quantization script.	2024-08-26 20:17:50 -04:00
Ryan Dick	97562504b7	Remove all references to optimum-quanto and downgrade diffusers.	2024-08-26 20:17:50 -04:00
Ryan Dick	b9dd354e2b	Fixes to the T5XXL quantization script.	2024-08-26 20:17:50 -04:00
Ryan Dick	33c2fbd201	Add script for quantizing a T5 model.	2024-08-26 20:17:50 -04:00
Ryan Dick	b66f19d4d1	Add docs to the quantization scripts.	2024-08-26 20:17:50 -04:00
Ryan Dick	4105a78b83	Update load_flux_model_bnb_llm_int8.py to work with a single-file FLUX transformer checkpoint.	2024-08-26 20:17:50 -04:00
Ryan Dick	19a68afb3a	Fix bug in InvokeInt8Params that was causing it to use double the necessary VRAM.	2024-08-26 20:17:50 -04:00
Ryan Dick	cfac7c8189	Move requantize.py to the quantization/ dir.	2024-08-26 20:17:50 -04:00
Ryan Dick	ac96f187bd	Remove duplicate log_time(...) function.	2024-08-26 20:17:50 -04:00
Brandon Rising	57168d719b	Fix styling/lint	2024-08-26 20:17:50 -04:00
Brandon Rising	4bd7fda694	Install sub directories with folders correctly, ensure consistent dtype of tensors in flux pipeline and vae	2024-08-26 20:17:50 -04:00
Brandon Rising	2d9042fb93	Run Ruff	2024-08-26 20:17:50 -04:00
Brandon Rising	9ed53af520	Run Ruff	2024-08-26 20:17:50 -04:00
Brandon Rising	56fda669fd	Manage quantization of models within the loader	2024-08-26 20:17:50 -04:00
Ryan Dick	1fa6bddc89	WIP on moving from diffusers to FLUX	2024-08-26 20:17:50 -04:00
Ryan Dick	d3a5ca5247	More improvements for LLM.int8() - not fully tested.	2024-08-26 20:17:50 -04:00
Ryan Dick	f01f56a98e	LLM.int8() quantization is working, but still some rough edges to solve.	2024-08-26 20:17:50 -04:00
Ryan Dick	99b0f79784	Clean up NF4 implementation.	2024-08-26 20:17:50 -04:00
Ryan Dick	eeabb7ebe5	Make quantized loading fast for both T5XXL and FLUX transformer.	2024-08-26 20:17:50 -04:00

24 Commits