Speeding Up LLMs - work4ai

Speeding Up LLMs

https://zenn.dev/kaeru39/articles/1ea73bfa40c7df Five Techniques for Speeding Up Local LLM Inference, with a Comparative Evaluation

FlashAttention-2

https://gyazo.com/ecb63de41bcc9bf1f1be5c811a834101

https://huggingface.co/collections/VoladorLuYu/efficient-llm-65af19e1e0ea35ad619b3fdd efficient-llm