Social media giant Meta plans to deploy a customized, second-generation AI chip, code-named "Artemis," in its data centers this year.
According to Reuters, the new chips will be used for "inference," the process of running AI models, in Meta's data centers. The goal of the initiative is to reduce reliance on Nvidia chips and control the cost of AI workloads. Additionally, Meta offers generative AI applications in its services and is training an open-source model called Llama 3 that aims to reach GPT-4 levels.
Meta CEO Mark Zuckerberg recently announced that he plans to end the year with 340,000 Nvidia H100 GPUs, for a total of about 600,000 GPUs used to run and train AI systems. This makes Meta Nvidia's largest publicly known customer other than Microsoft. However, as models grow more powerful and larger in scale, AI workloads and costs continue to increase. In addition to Meta, companies like OpenAI and Microsoft are trying to break this cost spiral with proprietary AI chips and more efficient models.
In May 2023, Meta first unveiled a new family of chips called the Meta Training and Inference Accelerator (MTIA), designed to accelerate and reduce the cost of running neural networks. According to the official announcement, the first chips were expected to enter service in 2025 and were already being tested in Meta's data centers at that time. According to Reuters, Artemis is a more advanced version of MTIA.
Meta's move signals its desire to reduce reliance on Nvidia chips and control the cost of AI workloads by deploying its own silicon. The company plans to bring the Artemis chip into production this year, stating, "We believe that our self-developed accelerators, along with commercially available GPUs, provide the optimal combination of performance and efficiency on Meta-specific workloads." This initiative will give Meta greater flexibility and autonomy, while also promising to reduce the cost of AI workloads.