DeepSeek Releases Open Source Project FlashMLA
On February 24th, FlashMLA, the first project of DeepSeek Open Source Week, was officially released. According to the official introduction, FlashMLA is inspired by FlashAttention 2 and 3 and the CUTLASS project. Specifically, FlashMLA is an efficient MLA (Multi-Head Latent Attention) decoding kernel optimized for Hopper GPUs. It supports variable-length sequences and is already in production use. FlashM...