Google releases open source Gemma 2 AI models with 9 billion and 27 billion parameters: they outperform their peers and can run on a single A100/H100 GPU

Google issued a press release yesterday announcing the release of the Gemma 2 large language model to researchers and developers around the world. It comes in two sizes: 9 billion parameters (9B) and 27 billion parameters (27B).

Compared with the first generation, the Gemma 2 large language model offers higher inference performance and efficiency, and makes significant progress in safety.

Google said in the press release that the performance of the Gemma 2-27B model is comparable to that of mainstream models twice its size, and that this performance can be achieved with just one NVIDIA H100 Tensor Core GPU or TPU host, greatly reducing deployment costs.

The Gemma 2-9B model outperforms Llama 3 8B and other open source models of similar size. Google also plans to release a Gemma 2 model with 2.6 billion parameters in the coming months, which is more suitable for AI application scenarios on smartphones.

Google said it has redesigned the overall architecture for Gemma 2 to achieve excellent performance and inference efficiency. IT Home summarizes the main features of Gemma 2 as follows:

Excellent performance:

The 27B version is the best performer in its size class and is competitive even with models twice its size. The 9B version is also the best performer in its class, outperforming Llama 3 8B and other open models of the same size.

Efficiency and cost:

The 27B Gemma 2 model can efficiently run inference at full precision on a single Google Cloud TPU host, NVIDIA A100 80GB Tensor Core GPU, or NVIDIA H100 Tensor Core GPU, maintaining high performance while significantly reducing costs. This makes AI deployment more accessible and budget-friendly.
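The single-GPU claim can be sanity-checked with simple arithmetic. The sketch below is a rough estimate, not Google's own calculation: it assumes "full precision" means 16-bit (bfloat16) weights, uses the nominal parameter counts from the article, and counts only the weights, ignoring the KV cache and activations that real inference also needs. The function name and the byte sizes table are illustrative.

```python
# Rough weight-memory estimate for the two Gemma 2 sizes.
# Assumptions (not from the article): nominal parameter counts,
# weights only, and an 80 GB accelerator as the reference budget.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}
GPU_MEMORY_GB = 80.0  # e.g. the 80 GB A100 named in the article


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate weight storage in GB (10^9 bytes) for a given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for name, params in [("Gemma 2 9B", 9e9), ("Gemma 2 27B", 27e9)]:
    for dtype in BYTES_PER_PARAM:
        gb = weight_memory_gb(params, dtype)
        verdict = "fits" if gb < GPU_MEMORY_GB else "does not fit"
        print(f"{name} in {dtype}: about {gb:.1f} GB of weights ({verdict} in 80 GB)")
```

Under these assumptions the 27B weights take roughly 54 GB in bfloat16, which leaves headroom on an 80 GB card, whereas 32-bit weights would need about 108 GB and would not fit on one.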

Fast inference across hardware:

Gemma 2 is optimised to run at blazing speeds on a wide range of hardware, from powerful gaming laptops and high-end desktops to cloud-based setups.

Try Gemma 2 at full precision in Google AI Studio, run the quantised version on your CPU with Gemma.cpp, or try it on a home PC with an NVIDIA RTX or GeForce RTX GPU via Hugging Face Transformers.
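For the Hugging Face Transformers route, a minimal sketch might look like the following. The article names no specific checkpoint, so the model id here is an assumption for illustration (the instruction-tuned 9B checkpoint on the Hugging Face Hub, which requires accepting the Gemma license before download). The sketch also assumes that the torch, transformers and accelerate packages are installed and that the GPU has enough memory for 16-bit weights.

```python
# Assumed checkpoint for illustration; the article does not name one.
MODEL_ID = "google/gemma-2-9b-it"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the assumed Gemma 2 checkpoint and generate text for one prompt."""
    # Imported lazily so the heavy dependencies load only when generation runs.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # 16-bit weights to halve memory vs float32
        device_map="auto",           # needs the accelerate package
    )
    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Explain what a Tensor Core is in one sentence.")` downloads the weights on first use and returns the prompt followed by the model's continuation.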
