Commit Graph

12 Commits

Author | SHA1 | Message | Date
Ryan Dick | e41025ddc7 | Move requantize.py to the quatnization/ dir. | 2024-08-21 18:21:44 +00:00
Ryan Dick | d11dc6ddd0 | Remove duplicate log_time(...) function. | 2024-08-21 18:10:24 +00:00
Brandon Rising | dd24f83d43 | Fix styling/lint | 2024-08-21 09:10:22 -04:00
Brandon Rising | 115f350f6f | Install sub directories with folders correctly, ensure consistent dtype of tensors in flux pipeline and vae | 2024-08-21 09:09:39 -04:00
Brandon Rising | 46b6314482 | Run Ruff | 2024-08-21 09:06:38 -04:00
Brandon Rising | 46d5107ff1 | Run Ruff | 2024-08-21 09:06:38 -04:00
Brandon Rising | 6ea1278d22 | Manage quantization of models within the loader | 2024-08-21 09:06:34 -04:00
Ryan Dick | d7a39a4d67 | WIP on moving from diffusers to FLUX | 2024-08-21 08:59:19 -04:00
Ryan Dick | 3e8a550fab | More improvements for LLM.int8() - not fully tested. | 2024-08-21 08:59:19 -04:00
Ryan Dick | 0e96794c6e | LLM.int8() quantization is working, but still some rough edges to solve. | 2024-08-21 08:59:19 -04:00
Ryan Dick | 23a7328a66 | Clean up NF4 implementation. | 2024-08-21 08:59:19 -04:00
Ryan Dick | cdd47b657b | Make quantized loading fast for both T5XXL and FLUX transformer. | 2024-08-21 08:59:19 -04:00