Commit Graph

15 Commits

Author SHA1 Message Date
Brandon Rising
6ea1278d22 Manage quantization of models within the loader 2024-08-21 09:06:34 -04:00
Brandon Rising
f425d3aa3c Setup flux model loading in the UI 2024-08-21 09:04:37 -04:00
Ryan Dick
d7a39a4d67 WIP on moving from diffusers to FLUX 2024-08-21 08:59:19 -04:00
Ryan Dick
0e96794c6e LLM.int8() quantization is working, but still some rough edges to solve. 2024-08-21 08:59:19 -04:00
Ryan Dick
23a7328a66 Clean up NF4 implementation. 2024-08-21 08:59:19 -04:00
Ryan Dick
c3cf8c3b6b NF4 inference working 2024-08-21 08:59:19 -04:00
Ryan Dick
3ba60e1656 Split a FluxTextEncoderInvocation out from the FluxTextToImageInvocation. This has the advantage that we benefit from automatic caching when the prompt isn't changed. 2024-08-21 08:59:19 -04:00
Ryan Dick
cdd47b657b Make quantized loading fast for both T5XXL and FLUX transformer. 2024-08-21 08:59:19 -04:00
Ryan Dick
e8fb8f4d12 Make float16 inference work with FLUX on 24GB GPU. 2024-08-21 08:59:19 -04:00
Ryan Dick
9381211508 Add support for 8-bit quantization of the FLUX T5XXL text encoder. 2024-08-21 08:59:19 -04:00
Ryan Dick
8cce4a40d4 Make 8-bit quantization save/reload work for the FLUX transformer. Reload is still very slow with the current optimum.quanto implementation. 2024-08-21 08:59:19 -04:00
Ryan Dick
4833746698 Minor improvements to FLUX workflow. 2024-08-21 08:59:19 -04:00
Ryan Dick
8b9bf55bba Got FLUX schnell working with 8-bit quantization. Still lots of rough edges to clean up. 2024-08-21 08:59:19 -04:00
Ryan Dick
7b199fed4f Use the FluxPipeline.encode_prompt() api rather than trying to run the two text encoders separately. 2024-08-21 08:59:18 -04:00
Ryan Dick
13513465c8 First draft of FluxTextToImageInvocation. 2024-08-21 08:59:18 -04:00