FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
Paper: https://arxiv.org/abs/2205.14135
To be written later.