Note: Command R v01 weights are released under a CC-BY-NC 4.0 license and cannot be used for commercial purposes. Please read the license to verify whether your use case is permitted.
Model Report
Cohere / Command R v01
Command R v01 is a 35-billion-parameter transformer-based language model developed by Cohere, featuring retrieval-augmented generation with explicit citations, tool use capabilities, and multilingual support across ten languages. The model supports a 128,000-token context window and demonstrates competitive performance in enterprise applications, multi-step reasoning tasks, and long-context evaluations, though commercial deployment requires a separate license from Cohere.
Command R v01 is a 35-billion-parameter generative language model developed by Cohere and Cohere Labs, designed to offer high efficiency and strong accuracy for production-scale applications. Released in March 2024, Command R v01 integrates capabilities in retrieval-augmented generation (RAG), tool use, extended context length, and multilingual functionality, making it suitable for enterprise-level deployments and complex reasoning tasks, as detailed in Cohere's blog post announcing its release.
Command R v01 is based on an optimized transformer architecture. The model's scale, at 35 billion parameters, supports language understanding and reasoning across a range of tasks, according to Cohere documentation. Following pretraining, the model undergoes supervised fine-tuning and preference training to align its outputs with human preferences for helpfulness, safety, and factuality.
To support diverse use cases, Command R v01 incorporates chat templates for general conversations, as well as specialized prompt templates for grounded generation and tool use scenarios, documented in the prompt guide. The training process leverages a wide spectrum of supervised and preference data, further improved with techniques for retrieval-augmented generation and tool use tasks.
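The turn-based chat format can be sketched as follows. This is a simplified illustration assuming the special tokens published with the model's tokenizer (`<|START_OF_TURN_TOKEN|>`, `<|USER_TOKEN|>`, and so on); in practice, `tokenizer.apply_chat_template` from the `transformers` library renders this format for you, and the grounded-generation and tool-use templates add further structure on top of it.

```python
# Sketch of Command R v01's raw chat format. The role tokens below are
# assumed from the model's published tokenizer configuration; verify them
# against the actual tokenizer before relying on this rendering.

ROLE_TOKENS = {
    "system": "<|SYSTEM_TOKEN|>",
    "user": "<|USER_TOKEN|>",
    "assistant": "<|CHATBOT_TOKEN|>",
}

def build_prompt(messages):
    """Render a list of {role, content} dicts into a single prompt string."""
    parts = ["<BOS_TOKEN>"]
    for m in messages:
        parts.append(
            f"<|START_OF_TURN_TOKEN|>{ROLE_TOKENS[m['role']]}"
            f"{m['content']}<|END_OF_TURN_TOKEN|>"
        )
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>")
    return "".join(parts)

prompt = build_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this document."},
])
```

Using the library-provided template rather than hand-built strings is advisable, since it stays in sync with the tokenizer's special tokens.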
Technical Capabilities
Retrieval-Augmented Generation
A central feature of Command R v01 is its retrieval-augmented generation (RAG) capability. The model can accept document snippets as input and generate responses grounded in those texts, providing explicit citations (grounding spans) that trace information back to its source. This facilitates grounded summarization and answer generation, which helps minimize hallucinations and supports factual integrity in enterprise applications, as described in the grounded generation documentation.
Command R v01 supports various citation modes, balancing speed and accuracy in grounded answer generation. It predicts relevant documents, generates citations, and constructs answers with referenced sources, optimizing for either precision (accurate mode) or efficiency (fast mode).
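Downstream code typically needs to separate the answer text from its grounding spans. The parser below is a hypothetical sketch: it assumes cited spans are wrapped in `<co: N>...</co: N>` markup (with comma-separated document indices), which follows the general shape of the documented grounded-answer output but should be checked against the actual format your deployment produces.

```python
import re

# Hypothetical parser for grounded-generation output. The <co: N>...</co: N>
# span markup is an assumption about the grounded answer format; adjust the
# pattern if the raw output you observe differs.
CITE_RE = re.compile(r"<co:\s*([\d,]+)>(.*?)</co:\s*[\d,]+>", re.DOTALL)

def parse_citations(grounded_answer):
    """Return (plain_text, citations): the answer with markup stripped, plus
    one record per cited span listing the supporting document indices."""
    citations = []
    def strip(match):
        doc_ids = [int(d) for d in match.group(1).split(",")]
        citations.append({"text": match.group(2), "documents": doc_ids})
        return match.group(2)
    return CITE_RE.sub(strip, grounded_answer), citations

text, cites = parse_citations(
    "The model has <co: 0>35 billion parameters</co: 0> and a "
    "<co: 0,1>128k context window</co: 0,1>."
)
```

Keeping the span-to-document mapping explicit like this is what allows a UI to render clickable citations next to each claim.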
Tool Use and Function Calling
The model includes tool use capabilities, also known as "function calling." In single-step tool use, Command R v01 can select external tools (such as APIs or databases), specify parameters, and integrate returned information into its final output. It is also capable of multi-step tool use, enabling workflows that involve planning and execution across several inference cycles, as noted in the tool use documentation.
Command R v01 aligns with Hugging Face's tool use API standards and includes mechanisms to either abstain from tool use or chain multiple actions in complex problem-solving sequences, as detailed in the Hugging Face advanced tool use documentation.
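The multi-step flow described above can be sketched as a plan-execute loop. Everything here is a mock: `call_model` stands in for an actual Command R inference call, and the step format (either a list of tool actions or a final answer) mirrors the general shape of function calling rather than the model's exact schema.

```python
# Minimal multi-step tool-use loop with mock components. The tools and the
# action/answer schema are illustrative assumptions, not Cohere's API.

TOOLS = {
    "web_search": lambda query: f"results for {query!r}",
    "calculator": lambda expression: str(eval(expression)),  # demo only
}

def run_agent(call_model, user_message, max_steps=5):
    history = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        step = call_model(history)
        if "answer" in step:            # model chose to respond directly
            return step["answer"]
        for act in step["actions"]:     # otherwise execute each tool call
            result = TOOLS[act["tool_name"]](**act["parameters"])
            history.append({"role": "tool", "content": result})
    raise RuntimeError("agent did not produce an answer within max_steps")

# Scripted stand-in for the model: plan one tool call, then answer.
def scripted_model(history):
    if history[-1]["role"] == "user":
        return {"actions": [{"tool_name": "calculator",
                             "parameters": {"expression": "2 + 3"}}]}
    return {"answer": f"The result is {history[-1]['content']}."}

answer = run_agent(scripted_model, "What is 2 + 3?")
```

The ability to return an empty plan or a direct answer is what lets the model abstain from tool use when a tool is unnecessary.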
Multilingual Support and Context Window
Command R v01 supports multilingual generation across at least ten widely used languages—English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic—with pretraining inclusions for languages such as Russian, Turkish, Dutch, Vietnamese, Czech, Hindi, Greek, Hebrew, and Persian, according to the Cohere blog. Its context window extends to 128,000 tokens, enabling handling of long documents and complex, context-rich queries.
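Even with a 128,000-token window, applications must budget context between input documents and the generation itself. The sketch below is a toy greedy packer under assumed numbers: it approximates token counts as whitespace-delimited words, whereas a real deployment would count with the model's tokenizer, and the output headroom is illustrative.

```python
# Greedy packer that trims a document list to a token budget. Token counts
# are approximated as word counts (an assumption); use the model's actual
# tokenizer for real deployments.

CONTEXT_WINDOW = 128_000
RESERVED_FOR_OUTPUT = 4_000  # illustrative headroom for the generated reply

def estimate_tokens(text):
    return len(text.split())

def pack_documents(documents, budget=CONTEXT_WINDOW - RESERVED_FOR_OUTPUT):
    """Keep documents in order until the estimated token budget is spent."""
    packed, used = [], 0
    for doc in documents:
        cost = estimate_tokens(doc)
        if used + cost > budget:
            break
        packed.append(doc)
        used += cost
    return packed
```

Packing in retrieval-score order (best documents first) rather than corpus order is the usual refinement of this idea.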
Code Interaction
While Command R v01 is optimized for conversational code tasks—such as producing code snippets, explanations, or rewrites—it is not primarily designed for pure code completion scenarios, as noted in the model documentation.
Performance and Evaluation
Command R v01 has been evaluated across multiple benchmarks and real-world applications, demonstrating competitive performance in RAG, tool use, multilingual understanding, and long-context tasks.
Comparative evaluation: Left, human preference for Command R versus Mixtral in enterprise RAG scenarios; Right, KILT end-to-end RAG accuracy of major large language models.
On enterprise RAG applications, human preference assessments showed Command R performing favorably in comparison to Mixtral across a variety of use cases, including document assistance, customer support, and enterprise FAQ search. For knowledge-intensive retrieval, Command R demonstrated competitive accuracy on KILT Wikipedia index tasks relative to Llama2-70B (chat), Mixtral, and other models, as noted in the KILT benchmark paper.
Accuracy on multi-step reasoning tasks (HotpotQA, Bamboogle) illustrates Command R's performance in tool use compared to other large models.
For multi-step tool use and reasoning tasks, Command R achieved higher accuracy than its peers on benchmarks including HotpotQA and Bamboogle, demonstrating its proficiency in coordinating search, synthesis, and reasoning across several inference cycles, as detailed in research related to HotpotQA.
Multilingual performance on mMMLU and FLoRES benchmarks across several major models.
In multilingual evaluations, Command R v01 demonstrated strong accuracy and translation quality on the mMMLU and FLoRES benchmarks, often outperforming comparable models in multilingual understanding and translation tasks, as described in the FLoRES and MMLU research.
Heatmap shows Command R's performance in 'Needles in a Haystack' long-context evaluation, with high scores across varying token lengths and depths.
Command R v01 also demonstrated proficiency in long-context evaluations, recovering inserted facts from prompts of up to 128,000 tokens, as illustrated by results on the "Needle in a Haystack" benchmark methodology.
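The "Needles in a Haystack" methodology itself is simple to sketch: insert a known fact at varying depths into prompts of varying lengths, then check whether the model can recover it. Below, `model_answer` is a stand-in for an actual long-context model call; the stub simply scans the prompt, so the harness can be exercised without a GPU.

```python
import random

# Toy "Needle in a Haystack" harness. Lengths here are word counts for
# illustration; the real evaluation measures prompt length in tokens.

def make_haystack(filler_words, needle, length, depth):
    """Build a long filler prompt with `needle` inserted at a relative
    depth in [0, 1]."""
    words = [random.choice(filler_words) for _ in range(length)]
    words.insert(int(depth * length), needle)
    return " ".join(words)

def evaluate(model_answer, needle, lengths, depths):
    """Score retrieval success over a (length, depth) grid, as in the
    heatmap-style presentation of this benchmark."""
    filler = ["lorem", "ipsum", "dolor", "sit", "amet"]
    scores = {}
    for n in lengths:
        for d in depths:
            prompt = make_haystack(filler, needle, n, d)
            scores[(n, d)] = needle in model_answer(prompt)
    return scores

NEEDLE = "The secret code is 4217."
stub = lambda prompt: NEEDLE if NEEDLE in prompt else "not found"
scores = evaluate(stub, NEEDLE, lengths=[100, 1000], depths=[0.0, 0.5, 1.0])
```

Plotting `scores` over the length/depth grid reproduces the heatmap format used to report these results.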
Applications and Use Cases
Command R v01 is designed for enterprise applications that require reliable, context-rich generation and access to proprietary or dynamic external knowledge bases. Key use cases include:
Retrieval-augmented generation for summarization, analytics, and information packaging, incorporating transparent citations.
Tool use for workflow automation, involving external APIs, databases, code interpreters, and search engines.
Reasoning, question answering, and planning tasks in business and technical domains.
Multilingual customer support, documentation, and translation tasks.
Its architecture enables integration with associated models such as Embed and Rerank for RAG workflows.
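That embed-then-rerank-then-generate pipeline can be outlined with toy stand-ins for the Embed and Rerank stages (bag-of-words overlap instead of real embeddings; the pipeline structure, not the scoring, is the point of this sketch).

```python
# Schematic embed -> rerank -> generate RAG pipeline. Both scoring
# functions are toy word-overlap stand-ins for real embedding and
# reranking models.

def embed_retrieve(query, documents, top_k=3):
    """First stage: cheap recall over the whole corpus."""
    q = set(query.lower().split())
    return sorted(documents,
                  key=lambda d: len(q & set(d.lower().split())),
                  reverse=True)[:top_k]

def rerank(query, candidates, top_k=2):
    """Second stage: re-order the short list with a stronger scorer
    (here, the same toy score for brevity)."""
    q = set(query.lower().split())
    return sorted(candidates,
                  key=lambda d: len(q & set(d.lower().split())),
                  reverse=True)[:top_k]

def rag_answer(generate, query, documents):
    """Final stage: pass the reranked snippets to the generator, which in
    a real pipeline would be a grounded Command R call."""
    snippets = rerank(query, embed_retrieve(query, documents))
    return generate(query, snippets)

docs = [
    "command r supports rag",
    "bananas are yellow",
    "tool use and rag workflows",
]
generate = lambda q, snippets: f"Grounded answer using {len(snippets)} snippets."
answer = rag_answer(generate, "rag workflows", docs)
```

The two-stage retrieval keeps the expensive reranker on a short candidate list while the cheap first stage covers the full corpus.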
Limitations and Licensing
While Command R v01 is effective for a range of language and reasoning tasks, it is not specifically optimized for pure code completion, and it may require the precise prompt templates outlined in the prompting guide to realize optimal performance in grounded generation and tool use scenarios. As noted above, the model weights are released under a CC-BY-NC 4.0 license, which restricts commercial use.