Ryan Dick
|
29fe1533f2
|
Fix bug in InvokeLinear8bitLt that was causing old state information to persist after loading from a state dict. This manifested as state tensors being left on the GPU even when a model had been offloaded to the CPU cache.
|
2024-08-29 19:08:18 +00:00 |
|
Brandon Rising
|
65bb46bcca
|
Rename params for flux and flux vae, add comments explaining use of the config_path in model config
|
2024-08-26 20:17:50 -04:00 |
|
Ryan Dick
|
635d2f480d
|
ruff
|
2024-08-26 20:17:50 -04:00 |
|
Brandon Rising
|
56b9906e2e
|
Setup scaffolding for in progress images and add ability to cancel the flux node
|
2024-08-26 20:17:50 -04:00 |
|
Ryan Dick
|
dff4a88baa
|
Move quantization scripts to a scripts/ subdir.
|
2024-08-26 20:17:50 -04:00 |
|
Ryan Dick
|
a21f6c4964
|
Update docs for T5 quantization script.
|
2024-08-26 20:17:50 -04:00 |
|
Ryan Dick
|
97562504b7
|
Remove all references to optimum-quanto and downgrade diffusers.
|
2024-08-26 20:17:50 -04:00 |
|
Ryan Dick
|
b9dd354e2b
|
Fixes to the T5XXL quantization script.
|
2024-08-26 20:17:50 -04:00 |
|
Ryan Dick
|
33c2fbd201
|
Add script for quantizing a T5 model.
|
2024-08-26 20:17:50 -04:00 |
|
Ryan Dick
|
b66f19d4d1
|
Add docs to the quantization scripts.
|
2024-08-26 20:17:50 -04:00 |
|
Ryan Dick
|
4105a78b83
|
Update load_flux_model_bnb_llm_int8.py to work with a single-file FLUX transformer checkpoint.
|
2024-08-26 20:17:50 -04:00 |
|
Ryan Dick
|
19a68afb3a
|
Fix bug in InvokeInt8Params that was causing it to use double the necessary VRAM.
|
2024-08-26 20:17:50 -04:00 |
|
Ryan Dick
|
cfac7c8189
|
Move requantize.py to the quantization/ dir.
|
2024-08-26 20:17:50 -04:00 |
|
Ryan Dick
|
ac96f187bd
|
Remove duplicate log_time(...) function.
|
2024-08-26 20:17:50 -04:00 |
|
Brandon Rising
|
57168d719b
|
Fix styling/lint
|
2024-08-26 20:17:50 -04:00 |
|
Brandon Rising
|
4bd7fda694
|
Install sub directories with folders correctly, ensure consistent dtype of tensors in flux pipeline and vae
|
2024-08-26 20:17:50 -04:00 |
|
Brandon Rising
|
2d9042fb93
|
Run Ruff
|
2024-08-26 20:17:50 -04:00 |
|
Brandon Rising
|
9ed53af520
|
Run Ruff
|
2024-08-26 20:17:50 -04:00 |
|
Brandon Rising
|
56fda669fd
|
Manage quantization of models within the loader
|
2024-08-26 20:17:50 -04:00 |
|
Ryan Dick
|
1fa6bddc89
|
WIP on moving from diffusers to FLUX
|
2024-08-26 20:17:50 -04:00 |
|
Ryan Dick
|
d3a5ca5247
|
More improvements for LLM.int8() - not fully tested.
|
2024-08-26 20:17:50 -04:00 |
|
Ryan Dick
|
f01f56a98e
|
LLM.int8() quantization is working, but still some rough edges to solve.
|
2024-08-26 20:17:50 -04:00 |
|
Ryan Dick
|
99b0f79784
|
Clean up NF4 implementation.
|
2024-08-26 20:17:50 -04:00 |
|
Ryan Dick
|
eeabb7ebe5
|
Make quantized loading fast for both T5XXL and FLUX transformer.
|
2024-08-26 20:17:50 -04:00 |
|