Model Report
Mistral AI / Devstral
Devstral is a specialized 23.6 billion parameter language model developed by Mistral AI and All Hands AI, finetuned from Mistral-Small-3.1 for software engineering tasks. The text-only model features a 128,000-token context window and achieves 46.8% on the SWE-Bench Verified benchmark. Released under the Apache 2.0 License, it functions as an agentic coding assistant for codebase exploration, multi-file editing, and automated software engineering workflows.
Devstral is a specialized large language model (LLM) collaboratively developed by Mistral AI and All Hands AI, purpose-built for software engineering tasks. Finetuned from Mistral-Small-3.1, Devstral is engineered to act as an agentic coding assistant, with capabilities for codebase exploration, multi-file editing, and integration within software engineering agents. Released under the Apache 2.0 License on May 21, 2025, Devstral is available for both commercial and non-commercial uses and is positioned as a research preview to foster feedback and further development.
Agentic Performance: Chart comparing Devstral with peer models on SWE-Bench Verified, showing its competitive performance relative to model size.
Devstral is finetuned from Mistral-Small-3.1 and has 23.6 billion parameters. The model features a context window of up to 128,000 tokens, allowing for extended reasoning across large codebases. Devstral employs the Tekken tokenizer with a vocabulary of 131,000 tokens, optimizing input and output handling for coding-oriented tasks. To focus exclusively on textual inputs relevant to software engineering, the vision encoder from its base model has been removed, making Devstral a text-only LLM.
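A practical consequence of the 128,000-token context window is that an agent harness must budget how much repository content to include in a prompt. The sketch below illustrates one such check; the ~4 characters-per-token ratio is a crude assumed heuristic, not the Tekken tokenizer's actual behavior, and the reserved-output figure is illustrative.

```python
# Rough sketch: estimate whether a set of source files fits Devstral's
# 128,000-token context window. CHARS_PER_TOKEN is an assumed average;
# real counts require running the actual tokenizer.
CONTEXT_WINDOW = 128_000
CHARS_PER_TOKEN = 4  # assumption, not the Tekken tokenizer's true ratio

def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(files: dict[str, str], reserved_for_output: int = 8_000) -> bool:
    """Check whether combined file contents leave room for a model response."""
    total = sum(estimate_tokens(body) for body in files.values())
    return total + reserved_for_output <= CONTEXT_WINDOW

repo = {"main.py": "x = 1\n" * 1000, "util.py": "def f():\n    return 2\n"}
print(fits_in_context(repo))  # True: a small repo easily fits
```

In a real scaffold, files that do not fit would be summarized, chunked, or retrieved on demand rather than dropped.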
The model’s architecture is tailored for agentic workflows: it enables contextual analysis across multiple files, identifies relationships between software components, and assists in diagnosing code issues, all core functions for software engineering automation.
Training Methodology and Datasets
Devstral’s training regimen emphasizes real-world applicability to modern software engineering challenges. The model is trained to address GitHub issues using open-source agent scaffolds such as OpenHands and SWE-Agent. These scaffolds define interaction protocols between the model, codebases, and automated test cases, providing a framework for the model to learn how to effect multi-step code changes and verify correctness. Through this approach, Devstral is equipped to reason about, edit, and validate software repositories.
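The scaffold interaction described above can be sketched as a simple loop: the model proposes an edit, the harness applies it and runs the test suite, and the outcome is fed back until the tests pass or a step budget is exhausted. All names below are illustrative stand-ins, not the actual OpenHands or SWE-Agent APIs.

```python
# Minimal sketch of an agent-scaffold loop: propose an edit, run tests,
# feed back the result. Function names are hypothetical.
from typing import Callable

def agent_loop(propose_edit: Callable[[str, str], str],
               run_tests: Callable[[str], bool],
               code: str, issue: str, max_steps: int = 5) -> tuple[str, bool]:
    """Iterate model edits until tests pass or the step budget runs out."""
    feedback = issue
    for _ in range(max_steps):
        code = propose_edit(code, feedback)
        if run_tests(code):
            return code, True
        feedback = "tests still failing"
    return code, False

# Toy stand-in for the model: fix a known off-by-one bug.
fixed, ok = agent_loop(
    propose_edit=lambda code, fb: code.replace("n - 1", "n"),
    run_tests=lambda code: "n - 1" not in code,
    code="def count(items):\n    n = len(items)\n    return n - 1\n",
    issue="count() returns one less than the number of items",
)
print(ok)  # True
```

The key point is that correctness is verified by executing tests, not by inspecting model output alone, which is what lets the model learn multi-step repair behavior.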
Benchmark Performance
Devstral demonstrates strong results on software engineering benchmarks, particularly the SWE-Bench Verified evaluation. Tested with the OpenHands scaffold, Devstral achieves a score of 46.8%, outperforming previous open-source models by over 6 percentage points.
SWE-Bench Benchmark Results: Devstral's performance compared to other models by size and verified solution rate.
Compared with models of similar or larger scale tested on the same scaffold, Devstral exceeds the performance of Claude 3.5 Haiku (40.6%), SWE-smith-LM 32B (40.2%), GPT-4.1-mini (23.6%), DeepSeek-V3-0324 (38.8%, a 671B-parameter model), and Qwen3 235B-A22B (34.4%), as documented in benchmark summaries.
Usage Scenarios and Applications
Devstral is built for deployment within agentic software engineering systems. Its compact size facilitates local or on-device operation, which can be relevant for privacy-sensitive environments and integration into continuous integration/deployment workflows. Its primary applications include automated and interactive codebase analysis, bug detection, multi-file code editing, and test suite augmentation.
A common use case is repository test coverage analysis. Devstral can assess, aggregate, and visualize code test metrics, enabling identification of poorly covered modules and supporting targeted improvement.
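The coverage-analysis task can be sketched as follows; the per-file data and the 60% threshold are illustrative assumptions, standing in for metrics an agent would gather from a real coverage report.

```python
# Sketch: aggregate per-file test coverage and flag poorly covered
# modules. Data and threshold are illustrative, not from a real repo.
coverage = {  # file -> fraction of lines covered
    "app/models.py": 0.92,
    "app/views.py": 0.55,
    "app/utils.py": 0.30,
    "app/auth.py": 0.88,
}

def poorly_covered(cov: dict[str, float], threshold: float = 0.6) -> list[str]:
    """Return files below the coverage threshold, worst first."""
    return sorted((f for f, c in cov.items() if c < threshold), key=cov.get)

print(poorly_covered(coverage))  # ['app/utils.py', 'app/views.py']
```

An agent built on this kind of aggregation can then direct test-generation effort at the worst-covered modules first.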
Test Coverage Distribution: Devstral-generated pie chart of test coverage across files, produced in response to an analysis prompt.
In addition, Devstral can be embedded within agentic coding platforms and IDE plugins, functioning as a backend for interactive code completion, autonomous pull request generation, and resolution of multi-step tasks.
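When used as such a backend, Devstral is typically reached through an OpenAI-compatible chat endpoint exposed by a local inference server. The payload shape below follows that common convention; the model name, system prompt, and parameter values are assumptions, not an official configuration.

```python
# Hedged sketch: a chat-completion request body for a locally served
# Devstral endpoint. Field values are illustrative assumptions.
import json

payload = {
    "model": "devstral",
    "messages": [
        {"role": "system", "content": "You are a coding assistant."},
        {"role": "user", "content": "Add a null check to parse_config()."},
    ],
    "temperature": 0.2,  # low temperature for more deterministic edits
}
print(json.dumps(payload, indent=2))
```

An IDE plugin would POST this body to the server's chat-completions route and stream the reply into the editor.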
To-Do List App Output: Minimalist application interface generated by Devstral as part of the OpenHands agent workflow tutorial.
Devstral is the first release in a planned family of agentic coding models. Finetuned from Mistral-Small-3.1, it is designed for compatibility with frameworks that define agent actions in software repositories, such as OpenHands. Mistral AI has indicated plans for future commercial and larger-scale agentic coding models featuring longer context windows and deeper domain adaptation, as outlined in the official announcement.
Limitations and Research Status
Devstral is released as a research preview, and continued improvement is anticipated through community feedback and operational experience. As with any LLM, its outputs are subject to the limitations of its training data and agent scaffold integration, and users are encouraged to validate and review changes proposed by the model in automated coding workflows.
Licensing
Devstral is distributed under the Apache 2.0 License, permitting use, modification, and distribution for both private and commercial applications.