NVIDIA and Hugging Face launch efficient inference service, increasing AI model token processing efficiency fivefold

Recently, the open-source platform Hugging Face and NVIDIA announced a new offering, Inference-as-a-Service, powered by NVIDIA's NIM technology. The new service lets developers prototype faster with open-source AI models hosted on the Hugging Face Hub and deploy them efficiently.


The announcement was made at the ongoing SIGGRAPH 2024 conference, which brings together a large number of experts in computer graphics and interactive technologies. The unveiling of NVIDIA's partnership with Hugging Face comes at an opportune moment, opening up new possibilities for developers. Through this service, developers can easily deploy powerful Large Language Models (LLMs), such as Llama 3 and Mistral AI models, optimized by NVIDIA's NIM microservices.

Specifically, when accessed as a NIM, the 70-billion-parameter Llama 3 model achieves up to five times higher token throughput than an off-the-shelf deployment on NVIDIA H100 Tensor Core GPU systems, which is certainly a substantial boost. In addition, the new service supports Train on DGX Cloud, a service already available on Hugging Face.

NVIDIA's NIM is a suite of AI microservices optimized for inference, covering both NVIDIA's AI foundation models and open-source community models. It significantly improves token processing efficiency through standard APIs and runs on NVIDIA DGX Cloud infrastructure, accelerating the responsiveness and stability of AI applications.
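NIM's "standard APIs" follow the widely used OpenAI-compatible chat-completions format. As a minimal sketch of what a request might look like (the base URL and model identifier below are assumptions for illustration, not taken from the announcement), a payload could be assembled like this:

```python
import json

# Hypothetical NIM-style endpoint and model id -- illustrative assumptions,
# not details from the announcement.
NIM_BASE_URL = "https://integrate.api.nvidia.com/v1"  # assumed OpenAI-compatible base URL
MODEL_ID = "meta/llama3-70b-instruct"                 # assumed model identifier

def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    """Assemble an OpenAI-compatible chat-completions payload for a NIM-style endpoint."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }

payload = build_chat_request("Summarize NVIDIA NIM in one sentence.")
print(json.dumps(payload, indent=2))
```

Sending the payload would amount to an HTTP POST to `{NIM_BASE_URL}/chat/completions` with an API key in the `Authorization` header; the network call is omitted here since it requires credentials.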

The NVIDIA DGX Cloud platform is tailored specifically for generative AI, providing reliable, accelerated compute infrastructure that helps developers move from prototype to production without a long-term commitment. The partnership between Hugging Face and NVIDIA will further strengthen the developer community. Hugging Face also recently announced that its team has become profitable, has grown to 220 people, and has launched the SmolLM family of small language models.
