Google's Two New Architectures: Stronger Than Mamba at Equal Scale

Google DeepMind has introduced two new architectures, Hawk and Griffin, that challenge traditional Transformer models and demonstrate the renewed potential of RNNs in AI. At equal scale, Hawk and Griffin outperform Mamba, showing their competitiveness in processing efficiency and downstream task performance. Both models achieve training efficiency comparable to Transformers while delivering higher throughput and lower latency at inference time, with the advantage most pronounced on long sequences.
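The inference advantage comes from the recurrent blocks at the core of Hawk and Griffin: unlike a Transformer's KV cache, which grows with sequence length, a recurrence carries a fixed-size state from token to token. The following is a minimal sketch of a gated linear recurrence in that spirit; it is not the official implementation, and the exact gating (RG-LRU) in the paper differs, so all names and formulas here are illustrative assumptions.

```python
# Illustrative sketch of a gated linear recurrence (not the official RG-LRU).
import numpy as np

def gated_linear_recurrence(x, w_a, w_x):
    """Scan over a sequence with a per-channel gated decay.

    x: (seq_len, dim) input activations
    w_a, w_x: (dim,) parameters producing the decay and input gates
    Returns hidden states of shape (seq_len, dim).
    """
    seq_len, dim = x.shape
    h = np.zeros(dim)                                 # fixed-size state, independent of seq_len
    out = np.empty_like(x)
    for t in range(seq_len):
        a = 1.0 / (1.0 + np.exp(-(w_a * x[t])))       # decay gate in (0, 1)
        i = 1.0 / (1.0 + np.exp(-(w_x * x[t])))       # input gate
        h = a * h + (1.0 - a) * (i * x[t])            # linear recurrence update
        out[t] = h
    return out

# Only `h` (dim floats) is carried between tokens, so per-token memory and
# compute stay constant with sequence length, unlike an attention KV cache.
rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8))
w_a = rng.standard_normal(8)
w_x = rng.standard_normal(8)
print(gated_linear_recurrence(x, w_a, w_x).shape)  # (16, 8)
```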