Commit Graph

1802 Commits

Author SHA1 Message Date
Ryan Dick 70def37280 Move quantization scripts to a scripts/ subdir. 2024-08-23 18:08:37 +00:00
Ryan Dick 8af3c72de7 Update docs for T5 quantization script. 2024-08-23 18:07:14 +00:00
Ryan Dick 6405214940 Remove all references to optimum-quanto and downgrade diffusers. 2024-08-23 18:04:17 +00:00
Ryan Dick 544ab296e7 Update the T5 8-bit quantized starter model to use the BnB LLM.int8() variant. 2024-08-23 18:04:15 +00:00
Ryan Dick 86e49c423c Fixes to the T5XXL quantization script. 2024-08-23 18:03:23 +00:00
Ryan Dick 6d838fa997 Add script for quantizing a T5 model. 2024-08-23 18:03:23 +00:00
Brandon Rising 9ae190fc3e Only import bnb quantize file if bitsandbytes is installed 2024-08-23 13:36:14 -04:00
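The guarded import described in commit 9ae190fc3e can be sketched as follows. This is an illustrative pattern, not InvokeAI's actual module layout; the flag name and the commented-out import are hypothetical.

```python
import importlib.util

# Probe for bitsandbytes without importing it; only pull in the
# quantization helpers when the optional dependency is installed.
HAS_BITSANDBYTES = importlib.util.find_spec("bitsandbytes") is not None

if HAS_BITSANDBYTES:
    # e.g. from .bnb_quantize import quantize_model_llm_int8  (hypothetical name)
    pass
```

`find_spec` locates the package without executing it, so the check itself cannot fail on a missing CUDA runtime the way a real `import bitsandbytes` might.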
Ryan Dick 708a4f68da Run FLUX VAE decoding in the user's preferred dtype rather than float32. Tested, and seems to work well at float16. 2024-08-22 18:16:43 +00:00
Ryan Dick 08633c3f04 Move prepare_latent_image_patches(...) to sampling.py with all of the related FLUX inference code. 2024-08-22 17:18:43 +00:00
Ryan Dick a27250a95e Add comment about incorrect T5 Tokenizer size calculation. 2024-08-22 16:09:46 +00:00
Ryan Dick afd4913a1b Make FLUX get_noise(...) consistent across devices/dtypes. 2024-08-22 15:56:30 +00:00
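A common way to achieve the device/dtype consistency that commit afd4913a1b describes is to always sample on CPU in float32 with a dedicated seeded generator, then cast to the target. This is a minimal sketch of the technique, not the actual FLUX `get_noise(...)` implementation.

```python
import torch

def get_noise(shape: tuple, seed: int, device: str, dtype: torch.dtype) -> torch.Tensor:
    # Sample on CPU in float32 with a dedicated seeded generator so the
    # same seed yields identical noise regardless of the target
    # device/dtype, then cast the result.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    noise = torch.randn(shape, generator=generator, device="cpu", dtype=torch.float32)
    return noise.to(device=device, dtype=dtype)
```

Sampling directly on the target device (or in the target dtype) would make results depend on the hardware and precision, breaking seed reproducibility across machines.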
Brandon Rising 001d1b6e35 Attribute black-forest-labs/flux for much of the flux code 2024-08-21 15:54:07 -04:00
maryhipp 377fcaf49c added FLUX dev to starter models 2024-08-21 15:47:19 -04:00
Brandon Rising 3d251b4b93 Run ruff 2024-08-21 15:37:27 -04:00
Ryan Dick 42bbab74b3 Add docs to the quantization scripts. 2024-08-21 19:08:28 +00:00
Ryan Dick 203542c7a8 Update load_flux_model_bnb_llm_int8.py to work with a single-file FLUX transformer checkpoint. 2024-08-21 19:08:16 +00:00
Ryan Dick 7f62033f1f Fix bug in InvokeInt8Params that was causing it to use double the necessary VRAM. 2024-08-21 19:08:00 +00:00
maryhipp 09d1f75fe9 add FLUX schnell starter models and submodels as dependencies or ad-hoc download options 2024-08-21 14:27:35 -04:00
maryhipp c095af65fb add case for clip embed models in probe 2024-08-21 14:27:35 -04:00
Ryan Dick e41025ddc7 Move requantize.py to the quantization/ dir. 2024-08-21 18:21:44 +00:00
Ryan Dick 38c2e7801f Add docs to the requantize(...) function explaining why it was copied from optimum-quanto. 2024-08-21 18:19:47 +00:00
Ryan Dick d11dc6ddd0 Remove duplicate log_time(...) function. 2024-08-21 18:10:24 +00:00
Brandon Rising 8b0b496c2d More flux loader cleanup 2024-08-21 12:37:25 -04:00
Brandon Rising ada483f65e Various styling and exception type updates 2024-08-21 11:59:04 -04:00
Brandon Rising 0913d062d8 Switch inheritance class of flux model loaders 2024-08-21 11:30:16 -04:00
Brandon Rising dd24f83d43 Fix styling/lint 2024-08-21 09:10:22 -04:00
Brandon Rising da766f5a7e Fix support for 8b quantized t5 encoders, update exception messages in flux loaders 2024-08-21 09:10:22 -04:00
Ryan Dick 120e1cf1e9 Add tqdm progress bar to FLUX denoising. 2024-08-21 09:10:22 -04:00
Ryan Dick 5e2351f3bf Fix FLUX output image clamping. And a few other minor fixes to make inference work with the full bfloat16 FLUX transformer model. 2024-08-21 09:10:22 -04:00
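Output clamping of the kind commit 5e2351f3bf fixes typically means clamping decoded VAE output to its expected range before rescaling to 8-bit. A minimal sketch, assuming a [-1, 1] value range (the function name and range are illustrative, not the exact InvokeAI code):

```python
import torch

def decoded_to_uint8(x: torch.Tensor) -> torch.Tensor:
    # Clamp decoded VAE output to the expected [-1, 1] range first; values
    # outside that range (common at float16/bfloat16 precision) would
    # otherwise wrap around when cast to uint8.
    x = x.clamp(-1.0, 1.0)
    return ((x + 1.0) * 127.5).to(torch.uint8)
```

Without the clamp, a decoded value of 1.1 maps to 267.75, which overflows uint8 and shows up as dark speckles in otherwise bright regions.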
Brandon Rising d705c3cf0e Select dev/schnell based on state dict, use correct max seq len based on dev/schnell, and shift in inference, separate vae flux params into separate config 2024-08-21 09:10:20 -04:00
Brandon Rising 115f350f6f Install sub directories with folders correctly, ensure consistent dtype of tensors in flux pipeline and vae 2024-08-21 09:09:39 -04:00
Brandon Rising be6cb2c07c Working inference node with quantized bnb nf4 checkpoint 2024-08-21 09:09:39 -04:00
Brandon Rising b43ee0b837 Add nf4 bnb quantized format 2024-08-21 09:09:39 -04:00
Brandon Rising 3312fe8fc4 Run ruff, setup initial text to image node 2024-08-21 09:09:39 -04:00
Brandon Rising 01a2449dae Add backend functions and classes for Flux implementation, Update the way flux encoders/tokenizers are loaded for prompt encoding, Update way flux vae is loaded 2024-08-21 09:09:37 -04:00
Brandon Rising 46b6314482 Run Ruff 2024-08-21 09:06:38 -04:00
Brandon Rising 46d5107ff1 Run Ruff 2024-08-21 09:06:38 -04:00
Brandon Rising 6ea1278d22 Manage quantization of models within the loader 2024-08-21 09:06:34 -04:00
Brandon Rising f425d3aa3c Setup flux model loading in the UI 2024-08-21 09:04:37 -04:00
Ryan Dick d7a39a4d67 WIP on moving from diffusers to FLUX 2024-08-21 08:59:19 -04:00
Ryan Dick 3e8a550fab More improvements for LLM.int8() - not fully tested. 2024-08-21 08:59:19 -04:00
Ryan Dick 0e96794c6e LLM.int8() quantization is working, but still some rough edges to solve. 2024-08-21 08:59:19 -04:00
Ryan Dick 23a7328a66 Clean up NF4 implementation. 2024-08-21 08:59:19 -04:00
Ryan Dick c3cf8c3b6b NF4 inference working 2024-08-21 08:59:19 -04:00
Ryan Dick 110d58d107 NF4 loading working... I think. 2024-08-21 08:59:19 -04:00
Ryan Dick 3480e06688 wip 2024-08-21 08:59:19 -04:00
Ryan Dick 3ba60e1656 Split a FluxTextEncoderInvocation out from the FluxTextToImageInvocation. This has the advantage that we benefit from automatic caching when the prompt isn't changed. 2024-08-21 08:59:19 -04:00
Ryan Dick cdd47b657b Make quantized loading fast for both T5XXL and FLUX transformer. 2024-08-21 08:59:19 -04:00
Ryan Dick 68c712d254 Make quantized loading fast. 2024-08-21 08:59:19 -04:00
Ryan Dick 44d7a74b88 WIP - experimentation 2024-08-21 08:59:19 -04:00