FlashAttention-2
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
https://crfm.stanford.edu/2023/07/17/flash2.html