flash-attn - nikkie-memos

flash-attn

https://pypi.org/project/flash-attn/

https://huggingface.co/docs/transformers/perf_infer_gpu_one?install=NVIDIA#flashattention-2

https://github.com/Dao-AILab/flash-attention

Flash Attention 1 (optimum)

Flash Attention 2
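
A minimal sketch of how Flash Attention 2 might be selected when loading a model with transformers. The linked Hugging Face page describes passing `attn_implementation="flash_attention_2"` to `from_pretrained`. The helper name `pick_attn_implementation` and the fallback to `"sdpa"` are assumptions for illustration, not something this memo's sources prescribe.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def pick_attn_implementation() -> str:
    """Pick a value for the attn_implementation argument (hypothetical helper).

    Assumption: fall back to "sdpa" whenever torch or flash_attn is missing.
    """
    if is_installed("torch") and is_installed("flash_attn"):
        return "flash_attention_2"
    return "sdpa"


if __name__ == "__main__":
    # The result would be passed as
    # AutoModelForCausalLM.from_pretrained(..., attn_implementation=<result>).
    print("attn_implementation:", pick_attn_implementation())
```

Because the helper only looks modules up and never imports them, it can run in an environment where flash-attn failed to build.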

Its setup.py does a lot of unruly things, which makes it hard to set up an environment (the nvcr.io/nvidia/pytorch container image is one option).

A plain pip-compile simply did not work (ImportError for torch).
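
The ImportError is consistent with setup.py importing torch at build time while the build runs in an isolated environment that has no torch installed. That cause is an assumption, not something confirmed here. Under that assumption, a small pre-check could look like the sketch below. The helper names are hypothetical. The `--no-build-isolation` command in the hint is the one given in the flash-attention README.

```python
import importlib.util


def torch_is_importable() -> bool:
    """Return True if torch can be located in the current environment."""
    return importlib.util.find_spec("torch") is not None


def build_hint() -> str:
    """Suggest a next step before building flash-attn (hypothetical helper)."""
    if torch_is_importable():
        # Assumption: with torch present, skipping build isolation lets
        # setup.py see the installed torch.
        return "torch found: try `pip install flash-attn --no-build-isolation`"
    return "torch not found: install torch first, then build flash-attn"


if __name__ == "__main__":
    print(build_hint())
```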