Google releases Gemma 2 – Tops Gemma in Performance, Efficiency, and Speed

AI Ecosystem Updates | Issue# 1 [July 03, 2024]

What?

  • Update: Google released Gemma 2, an advanced version of the Gemma family of open AI models, to researchers and developers across the globe. More here.
  • Context: Earlier this year, Google introduced Gemma – a family of lightweight, open models built upon the same research and technology used to create the Gemini family of models.
  • Summary: Available in 9 billion (9B) and 27 billion (27B) parameter sizes, Gemma 2 outperforms its predecessor in both performance and efficiency.

Performance and Efficiency

Gemma 2 boasts a redesigned architecture and delivers significant performance and efficiency gains over Gemma.

  • Performance: The 27B model offers performance competitive with much larger proprietary models. The 9B model delivers state-of-the-art performance, outperforming Llama 3 8B and other open models of comparable size. More here.
  • Cost-efficiency: The 27B model can run inference efficiently at full precision on a single Google Cloud TPU host or NVIDIA GPU, reducing deployment costs without compromising performance.
  • Speed: Gemma 2 is optimized for fast inference across various hardware setups, from gaming laptops to cloud-based systems.
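As a rough back-of-the-envelope check of the single-accelerator claim above (the 80 GB figure is an assumption corresponding to an A100/H100-class GPU, and real memory use also includes activations and the KV cache):

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed just to hold the model weights.

    bytes_per_param=2 corresponds to 16-bit (bfloat16/float16) weights;
    use 4 for float32. Activations and KV cache add overhead on top.
    """
    return num_params * bytes_per_param / 1e9

# Gemma 2 27B in 16-bit precision: ~54 GB of weights,
# which fits within a single 80 GB accelerator.
mem_27b = weight_memory_gb(27e9)

# The 9B model is lighter still: ~18 GB in 16-bit precision.
mem_9b = weight_memory_gb(9e9)
```

This illustrates why the 27B model is notable: it sits just under the memory ceiling of a single high-end accelerator, where comparable models typically require multi-GPU setups.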

Compatibility and Accessibility

  • Accessibility: Gemma 2 is available under the Gemma license that allows developers and researchers to use the models and share or commercialize their work.
  • Integrations: Gemma 2 integrates seamlessly with major AI tools and frameworks, including Hugging Face Transformers, PyTorch and TensorFlow via Keras 3.0, JAX, Gemma.cpp, Llama.cpp, and Ollama, making it easy to slot into existing workflows. The models are also optimized to run on NVIDIA-accelerated infrastructure and can be fine-tuned with Keras and Hugging Face.
  • Deployment: Vertex AI will support efficient deployment and management of the Gemma 2 models starting July 2024.
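Whichever framework you use, Gemma's instruction-tuned models expect a turn-based prompt format built from `<start_of_turn>` and `<end_of_turn>` markers. A minimal sketch of that formatting for a single user message (in practice a library's chat template, e.g. Hugging Face's `tokenizer.apply_chat_template`, handles this for you):

```python
def format_gemma_prompt(user_message: str) -> str:
    """Wrap a single user message in Gemma's turn-based chat format.

    Shown only to illustrate the underlying prompt format; frameworks
    such as Hugging Face Transformers apply this template automatically.
    """
    return (
        f"<start_of_turn>user\n{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = format_gemma_prompt("Summarize Gemma 2 in one sentence.")
```

The trailing `<start_of_turn>model` marker cues the model to generate its reply as the next turn.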

Responsible AI

  • Safety: Gemma 2 underwent rigorous safety testing and data filtering to mitigate biases and potential harms, with results published on public benchmarks that evaluate safety and representational harms.
  • Responsible AI: Google emphasizes responsible AI development, offering resources such as the Responsible Generative AI Toolkit for developers and researchers to build and deploy AI responsibly. Google has also open-sourced a Python companion library for the LLM Comparator, so you can run comparative evaluations of your language models and visualize the results in the app.

Future Directions

  • Empowering Developers: The Gemma models have seen over 10 million downloads and have spawned numerous innovative projects, such as Navarasa, which focuses on India’s linguistic diversity. Gemma 2 should further push the frontiers of AI applications for developers.
  • Future Work: Google plans to keep developing new architectures for future versions of Gemma to address a wider range of AI use cases. The company will soon release a 2.6 billion (2.6B) parameter Gemma 2 model aimed at balancing lightweight accessibility and performance across diverse AI tasks. More here.

Resources

  • Testing: The 27B Gemma 2 model is available in Google AI Studio for testing.
  • Weights: The weights for the 27B model are downloadable from Kaggle and Hugging Face.
  • Free Access: Gemma 2 is accessible for free through Kaggle and also with the free tier of Colab notebooks.
  • Research: Academic researchers can apply for Google Cloud credits through the Gemma 2 Academic Research Program until August 9, 2024. Additionally, users new to Google Cloud receive $300 in credits that they can use to try out the models.
  • Tutorials: The Gemma Cookbook offers practical examples and recipes to help build and fine-tune applications using Gemma 2.

Thoughts

The AI landscape changes every day as the AI community brings in innovations that further alter that very landscape and the way we interact with the world.

  • New Architecture and Innovation: We hope Google keeps innovating and building newer models that are more powerful, faster, lighter, and cost-efficient.
  • Inference on Edge Devices: We also hope to learn more about whether and how the Gemma 2 models could be leveraged to run inference on edge devices.

Thank you for reading! I genuinely hope you found the content useful. Feel free to reach out to us at [email protected] and share your feedback and thoughts to help us make it better for you next time.


Acronym used in the blog that has not been defined earlier: Artificial Intelligence (AI).