Tencent Launches Hunyuan-Large Large Model: 389B Total Parameters, Industry's Largest Transformer-Based MoE Model Open-Sourced

Tencent has officially announced the launch of the Hunyuan-Large model, which it describes as the largest open-source Transformer-based MoE model in the industry to date. The model has 389 billion total parameters (389B) and 52 billion active parameters (52B).

Tencent has open-sourced Hunyuan-A52B-Pretrain, Hunyuan-A52B-Instruct, and Hunyuan-A52B-Instruct-FP8 on Hugging Face. It has also released a technical report and a training and inference operation manual that describe the model's capabilities and how to run training and inference.

The model's technical advantages include the following:

  • High-quality synthetic data: By augmenting training with synthetic data, Hunyuan-Large learns richer representations, handles long-context inputs, and generalizes better to unseen data.
  • KV cache compression: Grouped Query Attention (GQA) and Cross-Layer Attention (CLA) significantly reduce the memory footprint and computational overhead of the KV cache and improve inference throughput.
  • Expert-specific learning rate scaling: Setting different learning rates for different experts ensures that each sub-model learns effectively from the data and contributes to overall performance.
  • Long-context processing: The pre-trained model supports text sequences of up to 256K tokens and the Instruct model supports up to 128K tokens, which significantly improves performance on long-context tasks.
  • Extensive benchmarking: Extensive experiments across multiple languages and tasks have demonstrated the effectiveness and safety of Hunyuan-Large in real-world applications.
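
To make the KV cache point more concrete, the following is a rough back-of-the-envelope sketch of how GQA and CLA each shrink the cache. The function name `kv_cache_bytes`, the parameter `cla_share_factor`, and every configuration number below are illustrative assumptions and are not taken from Tencent's technical report. The sketch assumes GQA reduces the number of key/value heads that are stored, and CLA lets a group of adjacent layers share one set of keys and values.

```python
import math

def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2, cla_share_factor=1):
    """Estimate KV cache size in bytes (hypothetical helper, not a library API)."""
    # Assumption: with CLA, every `cla_share_factor` adjacent layers share one K/V cache.
    kv_layers = math.ceil(num_layers / cla_share_factor)
    # The factor 2 accounts for storing both keys and values.
    return (2 * kv_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)

# Illustrative configuration only; these are NOT the published Hunyuan-Large settings.
LAYERS, HEADS, KV_HEADS, HEAD_DIM, SEQ_LEN = 64, 64, 8, 128, 32768

mha = kv_cache_bytes(LAYERS, HEADS, HEAD_DIM, SEQ_LEN)       # one K/V head per query head
gqa = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN)    # 8 query heads share one K/V head
gqa_cla = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN,
                         cla_share_factor=2)                 # plus 2 layers share one cache

for name, size in [("MHA", mha), ("GQA", gqa), ("GQA + CLA", gqa_cla)]:
    print(f"{name}: {size / 2**30:.1f} GiB ({mha / size:.0f}x smaller than MHA)")
```

Under these assumed numbers the two techniques compound: GQA cuts the cache by the ratio of query heads to key/value heads, and CLA cuts it again by the number of layers that share a cache.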

The relevant links are as follows:
