Google releases Gemma 2 open-source AI model with 9/27 billion parameters: it outperforms peers of its size and can run on a single A100/H100 GPU
Google issued a press release yesterday announcing the Gemma 2 large language model for researchers and developers around the world. It comes in two sizes: 9 billion parameters (9B) and 27 billion parameters (27B).

Compared with the first generation, Gemma 2 delivers higher inference performance and efficiency, and makes significant progress in safety.

Google said in the press release that the performance of the Gemma 2 27B model is comparable to mainstream models twice its size, and that this performance can be achieved with just one NVIDIA H100 Tensor Core GPU or one TPU host, greatly reducing deployment costs.

The Gemma 2 9B model outperforms Llama 3 8B and other open models of similar size. Google also plans to release a 2.6-billion-parameter Gemma 2 model in the coming months, better suited to AI applications on smartphones.

Google said it has redesigned the overall architecture of Gemma 2 to achieve excellent performance and inference efficiency. The main features of Gemma 2 (via IT Home) are as follows:

Excellent performance:

The 27B version is the best performer in its size class, even more competitive than models twice its size. The 9B version is also the best performer in its class, outperforming the Llama 3 8B and other open models of the same size.
Efficiency and cost:

The 27B Gemma 2 model can efficiently run inference at full precision on a single Google Cloud TPU host, NVIDIA A100 80GB Tensor Core GPU, or NVIDIA H100 Tensor Core GPU, maintaining high performance while significantly reducing costs. This makes AI deployment more accessible and budget-friendly.

Fast inference across hardware

Gemma 2 is optimised to run at blazing speeds on a wide range of hardware, from powerful gaming laptops and high-end desktops to cloud-based setups.

Try Gemma 2 at full precision in Google AI Studio, unlock native performance on your CPU using the quantised version with Gemma.cpp, or try it on your home PC with an NVIDIA RTX or GeForce RTX GPU via Hugging Face Transformers.
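For readers who want to try the Hugging Face Transformers route mentioned above, the sketch below shows one plausible way to load and prompt the instruction-tuned 9B checkpoint. It is a minimal illustration, not official sample code: it assumes the `transformers` and `torch` packages are installed, that you have accepted the Gemma licence on the Hugging Face Hub, and that the model id `google/gemma-2-9b-it` is available to your account.

```python
# Minimal sketch: text generation with Gemma 2 9B via Hugging Face Transformers.
# Assumptions: `transformers` and `torch` installed, Gemma licence accepted on
# the Hugging Face Hub, and enough GPU memory (bfloat16 fits on a single
# A100/H100-class card, as the article notes).

MODEL_ID = "google/gemma-2-9b-it"  # instruction-tuned 9B checkpoint (assumed id)


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the model (heavy download on first call) and complete a prompt."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # reduced precision to fit a single GPU
        device_map="auto",           # place weights on the available device(s)
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Explain what a language model is in one sentence."))
```

For CPU-only machines, the article's Gemma.cpp path with a quantised checkpoint is the lighter-weight option; the script above targets a CUDA-capable GPU.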