Model Report
Mistral AI / Mistral Small 3.1 (2503)
Mistral Small 3.1 (2503) is a 24-billion-parameter transformer-based model developed by Mistral AI and released under the Apache 2.0 license. This multimodal, multilingual model processes both text and visual inputs within a 128,000-token context window using the Tekken tokenizer. It posts competitive results on academic benchmarks such as MMLU and GPQA while supporting function calling and structured output generation for automation workflows.
Mistral Small 3.1 (2503) is a large-scale, open-source generative AI model developed by Mistral AI and released under the Apache 2.0 license. As a multimodal and multilingual model, Mistral Small 3.1 handles both text and visual inputs. Building on its predecessor, Mistral Small 3, this iteration improves text generation accuracy, visual understanding, and contextual reasoning. The model is available in both base and instruction-tuned versions, supporting a broad range of applications that require high-quality language understanding and image analysis.
Performance scatter plot comparing Mistral Small 3.1 with Gemma 3-it, GPT-4o Mini, and Claude-3.5 Haiku. The plot highlights Mistral Small 3.1's superior GPQA-Diamond scores combined with lower latency.
Mistral Small 3.1 is a transformer-based model comprising 24 billion parameters, offered in both a pretrained base variant and an instruction-finetuned version. The model utilizes the Tekken tokenizer, featuring a vocabulary size of 131,000 tokens, and supports input sequences up to 128,000 tokens, enabling comprehension of long documents and complex conversational contexts.
As a multimodal model, Mistral Small 3.1 is proficient in processing both textual and visual data. Its vision system is capable of detailed analysis, including image-based document classification, content extraction, and scene description. The model's multilingual proficiency spans dozens of languages, including major European, Asian, and Middle Eastern languages, making it suitable for global-scale deployments. Further, it features advanced function-calling and agent-centric capabilities, facilitating structured outputs such as JSON for downstream automation and workflow integration.
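To make the function-calling behavior concrete, the snippet below sketches a tool-call request against a locally hosted, OpenAI-compatible endpoint (for example, one exposed by an inference server such as vLLM). The base URL, model identifier, and the get_order_status tool are illustrative assumptions, not values prescribed by the model itself.

```python
# Hypothetical sketch: function calling against a locally served Mistral Small 3.1
# through an OpenAI-compatible endpoint. base_url, model name, and the
# get_order_status tool are assumptions for illustration.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

tools = [{
    "type": "function",
    "function": {
        "name": "get_order_status",  # hypothetical downstream function
        "description": "Look up the status of a customer order by its ID.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

response = client.chat.completions.create(
    model="mistralai/Mistral-Small-3.1-24B-Instruct-2503",
    messages=[{"role": "user", "content": "Where is order 8942?"}],
    tools=tools,
    tool_choice="auto",
)

# Rather than free text, the model can return a structured tool call whose JSON
# arguments a downstream workflow can parse and execute.
print(response.choices[0].message.tool_calls)
```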
Output from Mistral Small 3.1 analyzing a political map of Europe, demonstrating country identification, color parsing, and city recognition from visual data.
Mistral Small 3.1 exhibits competitive performance on a broad array of benchmarks when compared with both open and proprietary models in a similar parameter range. On academic evaluation suites such as MMLU (Massive Multitask Language Understanding), GPQA (Graduate-Level Google-Proof Q&A), and multilingual tests, it performs at or above the level of comparable models such as Gemma 3-it (27B), GPT-4o Mini, and Claude 3.5 Haiku.
The model demonstrates strong results on both general and specialized tasks. Its instruction-tuned version achieves 80.6% on standard MMLU, 44.4% on GPQA Main (5-shot CoT), and 64.0% on MMMU, a multimodal understanding benchmark. In multilingual settings, it averages 71.2% accuracy across diverse language groupings. Its long-context capabilities are reflected in high scores on the LongBench v2 and RULER benchmarks, where it maintains accuracy over extended sequences better than comparable models. The model also delivers low inference latency, supporting high-throughput, responsive applications even in resource-constrained environments, as reflected in public benchmark analyses.
Training Data and Methodology
While specific details of the training data composition remain undisclosed, Mistral Small 3.1 reportedly builds on the methodology established for Mistral Small 3, drawing on large-scale web, scientific, and technical datasets to support broad reasoning and multilingual ability. The instruction-tuned variant is further refined to follow complex system prompts and user instructions accurately, using supervised fine-tuning and reinforcement learning-based strategies to improve alignment and safety. The vision capabilities come from image-text pretraining, enabling robust interpretation of a wide range of documents and natural images.
Applications and Use Cases
The versatility of Mistral Small 3.1 allows deployment across a spectrum of practical applications. Its enhanced instruction-following skills make it suitable as a conversational assistant, supporting dialogue in multiple languages with context persistence over long exchanges. The model's image understanding enables automated document verification, technical diagnostics, quality inspection, and visual customer service scenarios. It is well-suited for agentic deployments requiring on-the-fly decision-making, such as executing structured function calls or integrating with data platforms via JSON outputs.
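As a minimal illustration of the visual side of these workflows, the sketch below sends an image alongside a text instruction through the same style of OpenAI-compatible chat endpoint. The URL, port, model name, and image link are placeholders rather than required values.

```python
# Hypothetical sketch: combining an image with a text instruction through an
# OpenAI-compatible chat endpoint. Endpoint, model name, and image URL are
# placeholders for illustration only.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="mistralai/Mistral-Small-3.1-24B-Instruct-2503",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Classify this document and extract the invoice number as JSON."},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/invoice.png"}},
        ],
    }],
)

print(response.choices[0].message.content)
```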
Furthermore, the model supports domain-specific fine-tuning, enabling adaptation for specialized subject matter such as legal and medical advisory services and technical troubleshooting, while maintaining strong foundational reasoning skills. Its efficiency and relatively small footprint facilitate deployment in local and edge environments, which is advantageous for privacy-sensitive or latency-critical use cases.
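To put that footprint in rough perspective, the back-of-the-envelope calculation below estimates the weight memory of a 24-billion-parameter model at common precisions. This is illustrative arithmetic only; it ignores activations, KV cache, and framework overhead, so real VRAM requirements are higher.

```python
# Rough weight-footprint estimate for a 24B-parameter model at common precisions.
# Illustrative only: excludes activations, KV cache, and runtime overhead.
PARAMS = 24e9

for precision, bytes_per_param in [("bf16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    gib = PARAMS * bytes_per_param / 1024**3
    print(f"{precision}: ~{gib:.0f} GiB of weights")
```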
Limitations and Licensing
Despite its broad capabilities, Mistral Small 3.1 has certain constraints. The model cannot generate images, access the internet, or transcribe audio and video inputs. Additionally, while it is distributed in a Transformers-compatible format, Mistral recommends running it from the original weight format for best results, since complete behavioral parity with Transformers-based implementations has not been verified.
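For readers who want to follow that recommendation, the sketch below loads the consolidated (original-format) weights through vLLM's offline Python API. The keyword arguments and model identifier reflect common vLLM usage for Mistral models, but they are assumptions here and may vary across vLLM versions.

```python
# Hedged sketch: loading the original (consolidated) weight format with vLLM's
# offline Python API. Argument support may differ between vLLM versions.
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mistral-Small-3.1-24B-Instruct-2503",
    tokenizer_mode="mistral",   # use the bundled Tekken tokenizer
    config_format="mistral",
    load_format="mistral",
)

outputs = llm.chat(
    messages=[{"role": "user",
               "content": "Summarize the Apache 2.0 license in one sentence."}],
    sampling_params=SamplingParams(max_tokens=128, temperature=0.15),
)
print(outputs[0].outputs[0].text)
```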
Mistral Small 3.1 is made available under the Apache 2.0 license, permitting commercial and non-commercial use, modification, and redistribution. This supports a wide range of research and enterprise use cases, promoting transparency and flexibility for adopters.
Related Models in the Mistral Family
Mistral Small 3.1 extends the Mistral family of models, following the release of Mistral Small 3. The Mistral suite serves as the foundation for several derivative models developed by the community, such as DeepHermes 24B by Nous Research, which targets advanced reasoning and instruction following. The continued development of the Mistral series reflects a commitment to accessible, high-performance open-source AI for research and production environments.