Command R (08-2024)

Family

Command R

Type

Foundation Model

License

CC-BY-NC 4.0 License

Released

2024-08-30

How To Use

Note: Command R (08-2024) weights are released under a CC-BY-NC 4.0 License, and cannot be utilized for commercial purposes. Please read the license to verify if your use case is permitted.

Laboratory OS

Launch a dedicated cloud GPU server running Laboratory OS to download and run Command R (08-2024) using any compatible app or framework.

Direct Download

Must be used with a compatible app, notebook, or codebase. May run slowly, or not work at all, depending on local system resources, particularly GPU(s) and available VRAM.

Browse Compatible Apps

open-webui /

Open WebUI

Open WebUI is an open-source, self-hosted web interface with a polished, ChatGPT-like user experience for interacting with LLMs. Integrates seamlessly with local Ollama installation.

oobabooga /

Text Generation Web UI

The most full-featured web interface for experimenting with open source Large Language Models. Featuring a wide range of configurable settings, inference engines, and plugins.

Model Report

Cohere / Command R (08-2024)

Command R (08-2024) is a 32-billion parameter generative language model developed by Cohere, featuring a 128,000-token context window and support for 23 languages. The model incorporates Grouped Query Attention for enhanced inference efficiency and specializes in retrieval-augmented generation with citation capabilities, tool use, and multilingual comprehension. It demonstrates improved throughput and reduced latency compared to previous versions while offering configurable safety modes for enterprise applications.

Explore the Future of AI

Your server, your data, under your control

Command R 08-2024 is a 32-billion parameter generative large language model developed by Cohere and Cohere Labs and released as a research product in August 2024. As part of the Command R series—including both Command R and Command R+—this model is designed for robust performance in areas such as retrieval-augmented generation (RAG), multilingual comprehension, reasoning, and advanced tool interactions. With a strong focus on enterprise and research applications, Command R 08-2024 introduces upgraded capabilities in efficiency, accuracy, and safety.

Technical Foundations and Architecture

Command R 08-2024 operates as an auto-regressive transformer, employing advancements for higher throughput and reduced latency relative to its predecessors. The model utilizes Grouped Query Attention (GQA), significantly enhancing inference efficiency. It accepts sequences with up to a 128,000-token context window, catering to long-document applications and complex workflows. After pretraining, the model undergoes supervised fine-tuning and preference alignment, ensuring outputs that reflect human preferences for utility and safety.

Multilingual Training and Data Coverage

To achieve robust multilingual capability, Command R 08-2024 is pretrained on a diverse corpus spanning 23 languages. These include English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, Simplified Chinese, Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian. Evaluation is conducted across 10 core languages, optimizing performance for global use cases. The training methodology leverages a mixture of supervised and preference-based fine-tuning, especially for tasks involving citation generation and advanced tool use.

Retrieval Augmented Generation and Tool Use

A defining feature of Command R 08-2024 is its advanced retrieval-augmented generation, which allows the model to generate grounded responses with explicit citations. This system is trained to incorporate document snippets and annotate responses with "grounding spans," marking the provenance of generated content. Two citation modes are available: “accurate,” which optimizes for correctness by sequentially selecting relevant documents before final generation, and “fast,” which expedites generation by merging citation and response steps.

The model also integrates advanced tool use functionality. Its single-step tool use, often referenced as function calling, enables the model to interact with external APIs, databases, or search interfaces by selecting appropriate tools and constructing parameterized calls. Multi-step tool use—implemented as "agents"—enables intricate workflows where the model iteratively plans actions, observes outcomes, and refines its strategy accordingly.

For code-related tasks, Command R 08-2024 is optimized to answer queries, provide explanations, or produce code snippets. Notably, lower temperature settings are recommended for code generation to ensure deterministic and precise outputs.

Performance and Benchmarking

Command R 08-2024 demonstrates notable advancements in performance metrics compared to its previous version. In direct evaluations across general, coding, and STEM-specific prompts, the updated model consistently receives higher human preference ratings. The horizontal bar chart below summarizes these results, with 'cmd-r 08-2024' exhibiting substantial increases in all categories.

Further, the updated architecture yields approximately 50% higher throughput and 20% lower latency relative to prior models, while reducing the operational footprint.

Throughput and latency improvement chart

For the larger Command R+ variant, similar efficiency gains are observed:

Command R+ performance improvement chart

Such results underscore improved decision-making for tool use, tighter control over response formatting and length, enhanced handling of structured data, and expanded robustness to variations in prompt structure.

Safety Modes and Responsible Use

Introduced in the 08-2024 update, Command R and Command R+ now support Safety Modes, which allow for context-appropriate moderation of sensitive topics. The “STRICT” mode restricts responses to avoid sensitive subject matter, making it suitable for enterprise and general usage. The default “CONTEXTUAL” mode offers broader conversational flexibility, while core protections—particularly regarding child safety—are always enforced and non-adjustable. Users may opt out of safety modes for specific research purposes, but all uses must comply with the CC-BY-NC license and the Acceptable Use Policy.

Applications and Model Family

Command R 08-2024 is designed for integration into enterprise-grade AI systems, supporting use cases such as retrieval-augmented summarization, knowledge-augmented question answering, code generation, and structured data analysis. Its advanced tool use and multilingual capabilities enable automation of diverse workflows in sectors ranging from finance to consulting. The model’s coding proficiency is best utilized with prompt templates specific to grounded generation, tool use, and agent-based tasks, as described in the official documentation.

In the broader Command R series, Command R+ 08-2024 retains a larger parameter count and delivers slightly higher performance metrics, especially in high-throughput environments and for large-scale inference tasks.

Limitations and Licensing

Despite its capabilities, Command R 08-2024 may not yield optimal performance for traditional code completion absent fine-tuned prompts. Deviations from prescribed templates in grounded generation or tool use generally result in lower accuracy and utility. Access to model files and weights requires agreement to sharing contact information, acceptance of licensing terms, and compliance with privacy policies.

The model is distributed under a Creative Commons BY-NC license, restricting use to non-commercial contexts unless otherwise authorized.