Ryan Dick
|
70def37280
|
Move quantization scripts to a scripts/ subdir.
|
2024-08-23 18:08:37 +00:00 |
|
Ryan Dick
|
8af3c72de7
|
Update docs for T5 quantization script.
|
2024-08-23 18:07:14 +00:00 |
|
Ryan Dick
|
6405214940
|
Remove all references to optimum-quanto and downgrade diffusers.
|
2024-08-23 18:04:17 +00:00 |
|
Ryan Dick
|
544ab296e7
|
Update the T5 8-bit quantized starter model to use the BnB LLM.int8() variant.
|
2024-08-23 18:04:15 +00:00 |
|
Ryan Dick
|
86e49c423c
|
Fixes to the T5XXL quantization script.
|
2024-08-23 18:03:23 +00:00 |
|
Ryan Dick
|
6d838fa997
|
Add script for quantizing a T5 model.
|
2024-08-23 18:03:23 +00:00 |
|
Brandon Rising
|
9ae190fc3e
|
Only import bnb quantize file if bitsandbytes is installed
|
2024-08-23 13:36:14 -04:00 |
|
Ryan Dick
|
708a4f68da
|
Run FLUX VAE decoding in the user's preferred dtype rather than float32. Tested, and seems to work well at float16.
|
2024-08-22 18:16:43 +00:00 |
|
Ryan Dick
|
08633c3f04
|
Move prepare_latent_image_patches(...) to sampling.py with all of the related FLUX inference code.
|
2024-08-22 17:18:43 +00:00 |
|
Ryan Dick
|
a27250a95e
|
Add comment about incorrect T5 Tokenizer size calculation.
|
2024-08-22 16:09:46 +00:00 |
|
Ryan Dick
|
afd4913a1b
|
Make FLUX get_noise(...) consistent across devices/dtypes.
|
2024-08-22 15:56:30 +00:00 |
|
Brandon Rising
|
001d1b6e35
|
Attribute black-forest-labs/flux for much of the flux code
|
2024-08-21 15:54:07 -04:00 |
|
maryhipp
|
377fcaf49c
|
added FLUX dev to starter models
|
2024-08-21 15:47:19 -04:00 |
|
Brandon Rising
|
3d251b4b93
|
Run ruff
|
2024-08-21 15:37:27 -04:00 |
|
Ryan Dick
|
42bbab74b3
|
Add docs to the quantization scripts.
|
2024-08-21 19:08:28 +00:00 |
|
Ryan Dick
|
203542c7a8
|
Update load_flux_model_bnb_llm_int8.py to work with a single-file FLUX transformer checkpoint.
|
2024-08-21 19:08:16 +00:00 |
|
Ryan Dick
|
7f62033f1f
|
Fix bug in InvokeInt8Params that was causing it to use double the necessary VRAM.
|
2024-08-21 19:08:00 +00:00 |
|
maryhipp
|
09d1f75fe9
|
add FLUX schnell starter models and submodels as dependencies or ad hoc download options
|
2024-08-21 14:27:35 -04:00 |
|
maryhipp
|
c095af65fb
|
add case for clip embed models in probe
|
2024-08-21 14:27:35 -04:00 |
|
Ryan Dick
|
e41025ddc7
|
Move requantize.py to the quantization/ dir.
|
2024-08-21 18:21:44 +00:00 |
|
Ryan Dick
|
38c2e7801f
|
Add docs to the requantize(...) function explaining why it was copied from optimum-quanto.
|
2024-08-21 18:19:47 +00:00 |
|
Ryan Dick
|
d11dc6ddd0
|
Remove duplicate log_time(...) function.
|
2024-08-21 18:10:24 +00:00 |
|
Brandon Rising
|
8b0b496c2d
|
More flux loader cleanup
|
2024-08-21 12:37:25 -04:00 |
|
Brandon Rising
|
ada483f65e
|
Various styling and exception type updates
|
2024-08-21 11:59:04 -04:00 |
|
Brandon Rising
|
0913d062d8
|
Switch inheritance class of flux model loaders
|
2024-08-21 11:30:16 -04:00 |
|
Brandon Rising
|
dd24f83d43
|
Fix styling/lint
|
2024-08-21 09:10:22 -04:00 |
|
Brandon Rising
|
da766f5a7e
|
Fix support for 8b quantized t5 encoders, update exception messages in flux loaders
|
2024-08-21 09:10:22 -04:00 |
|
Ryan Dick
|
120e1cf1e9
|
Add tqdm progress bar to FLUX denoising.
|
2024-08-21 09:10:22 -04:00 |
|
Ryan Dick
|
5e2351f3bf
|
Fix FLUX output image clamping, plus a few other minor fixes to make inference work with the full bfloat16 FLUX transformer model.
|
2024-08-21 09:10:22 -04:00 |
|
Brandon Rising
|
d705c3cf0e
|
Select dev/schnell based on state dict; use the correct max seq len and shift in inference based on dev/schnell; separate the FLUX VAE params into a separate config
|
2024-08-21 09:10:20 -04:00 |
|
Brandon Rising
|
115f350f6f
|
Install sub directories with folders correctly, ensure consistent dtype of tensors in flux pipeline and vae
|
2024-08-21 09:09:39 -04:00 |
|
Brandon Rising
|
be6cb2c07c
|
Working inference node with quantized bnb nf4 checkpoint
|
2024-08-21 09:09:39 -04:00 |
|
Brandon Rising
|
b43ee0b837
|
Add nf4 bnb quantized format
|
2024-08-21 09:09:39 -04:00 |
|
Brandon Rising
|
3312fe8fc4
|
Run ruff, setup initial text to image node
|
2024-08-21 09:09:39 -04:00 |
|
Brandon Rising
|
01a2449dae
|
Add backend functions and classes for Flux implementation, update the way flux encoders/tokenizers are loaded for prompt encoding, update the way the flux vae is loaded
|
2024-08-21 09:09:37 -04:00 |
|
Brandon Rising
|
46b6314482
|
Run Ruff
|
2024-08-21 09:06:38 -04:00 |
|
Brandon Rising
|
46d5107ff1
|
Run Ruff
|
2024-08-21 09:06:38 -04:00 |
|
Brandon Rising
|
6ea1278d22
|
Manage quantization of models within the loader
|
2024-08-21 09:06:34 -04:00 |
|
Brandon Rising
|
f425d3aa3c
|
Setup flux model loading in the UI
|
2024-08-21 09:04:37 -04:00 |
|
Ryan Dick
|
d7a39a4d67
|
WIP on moving from diffusers to FLUX
|
2024-08-21 08:59:19 -04:00 |
|
Ryan Dick
|
3e8a550fab
|
More improvements for LLM.int8() - not fully tested.
|
2024-08-21 08:59:19 -04:00 |
|
Ryan Dick
|
0e96794c6e
|
LLM.int8() quantization is working, but still some rough edges to solve.
|
2024-08-21 08:59:19 -04:00 |
|
Ryan Dick
|
23a7328a66
|
Clean up NF4 implementation.
|
2024-08-21 08:59:19 -04:00 |
|
Ryan Dick
|
c3cf8c3b6b
|
NF4 inference working
|
2024-08-21 08:59:19 -04:00 |
|
Ryan Dick
|
110d58d107
|
NF4 loading working... I think.
|
2024-08-21 08:59:19 -04:00 |
|
Ryan Dick
|
3480e06688
|
wip
|
2024-08-21 08:59:19 -04:00 |
|
Ryan Dick
|
3ba60e1656
|
Split a FluxTextEncoderInvocation out from the FluxTextToImageInvocation. This has the advantage that we benefit from automatic caching when the prompt isn't changed.
|
2024-08-21 08:59:19 -04:00 |
|
Ryan Dick
|
cdd47b657b
|
Make quantized loading fast for both T5XXL and FLUX transformer.
|
2024-08-21 08:59:19 -04:00 |
|
Ryan Dick
|
68c712d254
|
Make quantized loading fast.
|
2024-08-21 08:59:19 -04:00 |
|
Ryan Dick
|
44d7a74b88
|
WIP - experimentation
|
2024-08-21 08:59:19 -04:00 |
|