The simplest way to self-host Gemma 2 9B. Launch a dedicated cloud GPU server running Lab Station OS to download and serve the model using any compatible app or framework.
Download model weights for local inference. Must be used with a compatible app, notebook, or codebase. May run slowly, or not work at all, depending on your system resources, particularly GPU(s) and available VRAM.
Gemma 2 9B is a 9 billion parameter, decoder-only language model from Google, trained on roughly 8 trillion tokens. It can be served with inference up to 6x faster when compiled with torch.compile, and it performs well on benchmarks such as MMLU and HellaSwag relative to its size.
Gemma 2 9B is a lightweight, state-of-the-art, decoder-only large language model developed by Google, built from the same research and technology used to create the larger Gemini models. It represents a significant step toward making capable language models usable in resource-constrained environments.
The model was trained on an extensive dataset of approximately 8 trillion tokens, encompassing web documents, code, and mathematical text. The training data underwent rigorous preprocessing, including CSAM filtering and removal of sensitive data. The weights are released in bfloat16 precision; they can also be loaded in float32, though this adds no precision because the underlying weights remain bfloat16.
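As a minimal sketch of loading the released bfloat16 weights with Hugging Face Transformers (the google/gemma-2-9b checkpoint name, device placement, and prompt below are illustrative assumptions, not requirements):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-9b"  # assumed base checkpoint on Hugging Face

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights ship in bfloat16; omitting this loads float32 without gaining precision
    device_map="auto",           # spread layers across available GPU(s)
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```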
A key technical feature is the model's support for optimization: inference speed can be improved by up to 6x using torch.compile. For conversational use, the model ships a chat template that can be applied through the tokenizer's apply_chat_template method.
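The snippet below is a hedged sketch of both features: it applies the tokenizer's chat template and opts into torch.compile. It assumes the instruction-tuned google/gemma-2-9b-it checkpoint; the actual speedup depends on hardware, cache configuration, and PyTorch version.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-9b-it"  # assumed instruction-tuned variant for chat use
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Optional: compile the forward pass with a static KV cache. The first generation
# is slow while kernels compile; subsequent calls benefit from the speedup.
model.generation_config.cache_implementation = "static"
model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)

# Build a chat-formatted prompt with apply_chat_template.
messages = [{"role": "user", "content": "Write a haiku about GPUs."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```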
Gemma 2 9B's performance has been evaluated across multiple benchmarks, including MMLU and HellaSwag. While specific numerical results aren't reproduced here, the model performs strongly across these benchmarks, though slightly below its larger 27B parameter sibling. That performance-to-size tradeoff makes it particularly valuable where computational resources are limited but high-quality language processing is still required.
The model is designed for a variety of text generation tasks, such as question answering, summarization, and reasoning.
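For a quick end-to-end example of one such task, the sketch below uses the high-level Transformers pipeline API; the checkpoint name, prompt, and generation settings are assumptions chosen for illustration.

```python
import torch
from transformers import pipeline

# Illustrative setup: question answering phrased as plain text generation.
generator = pipeline(
    "text-generation",
    model="google/gemma-2-9b-it",  # assumed instruction-tuned checkpoint
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Question: Why does bfloat16 halve memory use compared to float32?\nAnswer:"
result = generator(prompt, max_new_tokens=80, return_full_text=False)
print(result[0]["generated_text"])
```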
Ethical considerations have been addressed through structured evaluations and internal red-teaming, focusing on areas such as content safety, representational harms, and memorization of training data.
The model has been verified to meet internal safety policies, though users should be aware of limitations common to large language models, including biases inherited from the training data and the possibility of factually incorrect output.