Google yesterday announced in a press release the launch of Gemma 2, a family of large language models open to researchers and developers worldwide, in two sizes: 9 billion parameters (9B) and 27 billion parameters (27B).
Compared with the first generation, Gemma 2 delivers higher inference performance, greater efficiency, and significant advances in safety.
In the press release, Google said the Gemma 2-27B model rivals the performance of mainstream models twice its size, and needs only a single NVIDIA H100 Tensor Core GPU or TPU host to achieve this, significantly reducing deployment costs.
The Gemma 2-9B model outperforms Llama 3 8B and other open models of similar size. Google also plans to release a 2.6-billion-parameter Gemma 2 model in the coming months, which is better suited to on-device smartphone AI scenarios.
Google says it redesigned the overall architecture for Gemma 2 to achieve superior performance and inference efficiency.
The main features of Gemma 2 are as follows:
Excellent performance:
The 27B version offers the best performance in its size class and is competitive even with models twice its size. The 9B version also leads its class, outperforming Llama 3 8B and other open models of comparable size.
Efficiency and cost:
The 27B Gemma 2 model can run inference efficiently at full precision on a single Google Cloud TPU host, NVIDIA A100 80GB Tensor Core GPU, or NVIDIA H100 Tensor Core GPU, dramatically reducing costs while maintaining high performance. This makes AI deployments easier to achieve and more budget-friendly.
Fast inference across hardware:
Gemma 2 is optimized to run at blazing speeds on a wide range of hardware, from powerful gaming laptops and high-end desktops to cloud-based setups.
You can try full-precision Gemma 2 in Google AI Studio, unlock local performance with the quantized version via Gemma.cpp on a CPU, or run it on a home computer equipped with an NVIDIA RTX or GeForce RTX GPU using Hugging Face Transformers.
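As a minimal sketch of the Hugging Face Transformers route mentioned above, loading the 9B checkpoint might look like the following. The model id "google/gemma-2-9b" is the one published on the Hugging Face Hub; downloading the weights requires accepting the license there first, and the prompt is just an illustrative placeholder.

```python
# Sketch: text generation with Gemma 2 via Hugging Face Transformers.
# Assumes `pip install transformers accelerate` and a GPU with enough memory
# (e.g. a single H100 for the 27B model, per Google's claims above).

def generate(prompt: str, model_id: str = "google/gemma-2-9b") -> str:
    # Imports kept inside the function so the sketch can be read/loaded
    # without the (large) dependencies installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # device_map="auto" places the model on the available GPU(s).
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("Explain what a Tensor Core is in one sentence."))
```

Swapping the model id for "google/gemma-2-27b" would target the larger variant under the same interface.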