Change pad_to_multiple_of to be 8 for all cases. Add comment about its temporary status

Wubbbi 2023-11-08 17:04:02 +01:00 committed by Kent Keirsey
parent b9f607be56
commit 6001d3d71d


@@ -166,13 +166,13 @@ class ModelPatcher:
         init_tokens_count = None
         new_tokens_added = None
-        # This is required since Transformers 4.32
-        # see https://github.com/huggingface/transformers/pull/25088
-        # More information: https://docs.nvidia.com/deeplearning/performance/dl-performance-
-        # matrix-multiplication/index.html#requirements-tc
-        if "A100" in torch.cuda.get_device_name():
-            pad_to_multiple_of = 64
-        else:
-            pad_to_multiple_of = 8
+        # TODO: This is required since Transformers 4.32, see
+        # https://github.com/huggingface/transformers/pull/25088
+        # More information by NVIDIA:
+        # https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
+        # This value might need to be changed in the future to take the GPU model into account, as there seem
+        # to be ideal values for different GPUs. This value is temporary!
+        # For references to the current discussion please see https://github.com/invoke-ai/InvokeAI/pull/4817
+        pad_to_multiple_of = 8
         try:
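
For context, pad_to_multiple_of is the argument that Transformers accepts in PreTrainedModel.resize_token_embeddings since version 4.32 (the pull request linked in the comment above); rounding the embedding table up to a multiple of 8 (or 64) keeps its dimensions tensor-core friendly per the linked NVIDIA guide. The sketch below is a minimal, hypothetical illustration of how such a value feeds into a resize after adding textual-inversion tokens; the model name and token string are placeholder assumptions, not InvokeAI's actual call site.

# Minimal sketch, not InvokeAI's code: assumes transformers >= 4.32 and torch are installed.
from transformers import CLIPTextModel, CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

pad_to_multiple_of = 8  # blanket value chosen by this commit; marked temporary above

# Register a hypothetical textual-inversion trigger token, then resize the
# embedding matrix. pad_to_multiple_of rounds the new vocabulary size up to
# the next multiple of 8 so the embedding dimensions satisfy the tensor-core
# alignment requirements described in the NVIDIA guide.
new_tokens_added = tokenizer.add_tokens(["<my-ti-token>"])
resized = text_encoder.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=pad_to_multiple_of)
print(f"added {new_tokens_added} token(s); embedding rows now {resized.num_embeddings}")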