NVIDIA has officially released AI Enterprise 5.0, a software platform designed to help enterprises accelerate the development of generative artificial intelligence (AI). AI Enterprise 5.0 includes NVIDIA microservices and downloadable software containers for deploying generative AI applications and accelerated computing. The product has already been adopted by well-known customers such as Uber.
As developers turn to microservices as an efficient way to build modern enterprise applications, NVIDIA AI Enterprise 5.0 provides a wide range of them. These include NVIDIA NIM and NVIDIA cuOpt, which are optimized for deploying AI models in production and support GPU acceleration, giving users more efficient inference. NIM is backed by NVIDIA inference software, including Triton Inference Server, TensorRT, and TensorRT-LLM, and reduces deployment time from weeks to minutes. These microservices provide industry-standard security and manageability and are compatible with enterprise-level management tools, which simplifies deployment for enterprises.
In addition, NVIDIA cuOpt, a GPU-accelerated AI microservice, has set a world record for route optimization and enables dynamic decision-making that reduces cost, time, and carbon footprint. As one of the CUDA-X microservices, cuOpt plays an important role in helping industries put artificial intelligence into production.
More features are planned for AI Enterprise 5.0. For example, the NVIDIA RAG LLM operator (currently in early access) will help move copilots and other generative AI applications that use retrieval-augmented generation (RAG) from pilot to production without rewriting any code. This feature is expected to further advance enterprise adoption of AI applications.
However users access it, AI Enterprise 5.0 offers them several benefits. They get secure, production-ready, performance-optimized software that can be deployed flexibly in data centers, in the cloud, on workstations, or at the network edge to meet the needs of different scenarios.