February 12 news. Inspur Information today announced the launch of the Metabrain R1 Inference Server, which, through system-level innovation and hardware-software co-optimization, is ready to deploy the DeepSeek R1 671B model on a single machine.
Note: DeepSeek has open-sourced multiple versions of its models. Among them, DeepSeek R1 671B is the full-parameter foundation model: it offers stronger generalization, higher accuracy, and better contextual understanding than the distilled versions, but it also places higher demands on a system's GPU memory capacity, memory bandwidth, interconnect bandwidth, and latency:
At least 800GB of GPU memory is required at FP8 precision, and 1.4TB or more at FP16/BF16 precision.
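These floors follow directly from the parameter count. As a quick sanity check, here is a minimal sketch (Python, assuming only the published 671B figure; runtime overheads such as activations and KV cache come on top of the raw weights):

```python
# Back-of-the-envelope check of the stated memory floors. The only input
# is the parameter count; overheads (activations, KV cache, framework
# buffers) are assumed on top of the raw weight footprint.

PARAMS = 671e9  # DeepSeek R1 total parameter count

def weight_gb(params: float, bytes_per_param: float) -> float:
    """Raw memory needed just to hold the weights, in GB."""
    return params * bytes_per_param / 1e9

print(f"FP8  weights: {weight_gb(PARAMS, 1):.0f} GB")  # ~671 GB -> ~800 GB with overhead
print(f"FP16 weights: {weight_gb(PARAMS, 2):.0f} GB")  # ~1342 GB -> 1.4 TB and up
```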
In addition, DeepSeek R1 is a typical long chain-of-thought model with short-input, long-output workloads; its decode phase in particular depends on high memory bandwidth and very low communication latency.
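Why decode is bandwidth-bound: each generated token must stream the active weights (plus KV cache) from memory once, so memory bandwidth, not compute, caps tokens per second. A hedged sketch of that ceiling, using the 4.8TB/s-class HBM bandwidth quoted below and the commonly reported figure that DeepSeek R1's mixture-of-experts design activates roughly 37B of its 671B parameters per token:

```python
# Rough ceiling on decode throughput (a sketch, not a benchmark). Treats
# the GPU memory as one pool; sharding weights across GPUs scales the
# ceiling accordingly.

def decode_tokens_per_sec(mem_bandwidth_gb_s: float, gb_read_per_token: float) -> float:
    """Bandwidth-bound upper limit on tokens generated per second."""
    return mem_bandwidth_gb_s / gb_read_per_token

ACTIVE_PARAMS_B = 37   # reported activated parameters per token (MoE)
FP8_GB_PER_TOKEN = ACTIVE_PARAMS_B  # 1 byte/param at FP8 -> ~37 GB streamed per token

print(f"~{decode_tokens_per_sec(4800, FP8_GB_PER_TOKEN):.0f} tokens/s ceiling")  # ~130
```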
The Metabrain R1 Inference Server NF5688G7 natively supports an FP8 compute engine and provides 1128GB of HBM3e memory, meeting the 671B model's requirement of no less than 800GB of GPU memory at FP8 precision while leaving sufficient KV-cache space for full-model inference on a single machine. Its memory bandwidth reaches up to 4.8TB/s.
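The KV-cache headroom follows from the two numbers above (a simple subtraction; framework overhead, which would shrink it somewhat, is ignored):

```python
# Headroom left for KV cache on the NF5688G7 after loading FP8 weights.

TOTAL_HBM_GB   = 1128  # HBM3e capacity stated for the NF5688G7
FP8_WEIGHTS_GB = 671   # 671B parameters at 1 byte each

kv_headroom_gb = TOTAL_HBM_GB - FP8_WEIGHTS_GB
print(f"KV-cache headroom: ~{kv_headroom_gb} GB")  # ~457 GB
```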
On the communication side, GPU P2P bandwidth reaches 900GB/s, and with the latest inference frameworks a single machine can serve 20 to 30 concurrent users. Each NF5688G7 is also equipped with a 3200Gbps lossless scale-out network, enabling agile expansion as business demand grows and supporting a turnkey R1 server cluster solution.
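The concurrency figure is consistent with the KV-cache headroom estimated above. A sketch of that reasoning follows; note that the per-token KV footprint and context length below are illustrative assumptions, not published specifications, and both vary with attention layout and KV precision:

```python
# Translating KV-cache headroom into concurrent sessions (illustrative).

KV_HEADROOM_GB      = 457     # from the headroom estimate above
KV_GB_PER_1K_TOKENS = 0.5     # assumed KV footprint per 1K tokens per user
CONTEXT_TOKENS      = 32_000  # assumed average context per session

kv_per_user_gb = KV_GB_PER_1K_TOKENS * CONTEXT_TOKENS / 1000
max_concurrent = KV_HEADROOM_GB / kv_per_user_gb
print(f"~{max_concurrent:.0f} concurrent sessions under these assumptions")  # ~29
```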
The Metabrain R1 Inference Server NF5868G8 is a high-throughput inference server designed for large reasoning models. It is the industry's first single machine to host 16 standard double-width PCIe cards, providing up to 1536GB of memory and supporting single-machine deployment of the DeepSeek 671B model at FP16/BF16 precision.
The machine adopts a fully interconnected 16-card topology based on a PCIe fabric, with P2P communication bandwidth of up to 128GB/s between any two cards, cutting communication latency by more than 60%. Through hardware-software co-optimization, the NF5868G8 improves DeepSeek 671B inference performance by nearly 40% compared with a traditional two-machine, eight-card PCIe configuration, and it currently supports a wide range of AI accelerator card options.