Note: Mistral Small (2409) weights are released under a Mistral AI Research License and cannot be used for commercial purposes. Please read the license to verify whether your use case is permitted.
The simplest way to self-host Mistral Small (2409). Launch a dedicated cloud GPU server running Lab Station OS to download and serve the model using any compatible app or framework.
Download model weights for local inference. Must be used with a compatible app, notebook, or codebase. May run slowly, or not work at all, depending on your system resources, particularly GPU(s) and available VRAM.
Mistral Small v24.09 is a 22B-parameter language model with a 32k-token context window and multilingual support across 10 languages. It bridges Mistral's 12B and Large models, featuring improved reasoning and code capabilities. Weights ship in the Safetensors format, with tensor parallelism recommended for multi-GPU inference.
Mistral Small v24.09 represents a significant advancement in enterprise-grade language models, featuring 22 billion parameters and delivering improved performance across multiple domains. This instruction-tuned large language model (LLM) sits strategically between Mistral NeMo 12B and Mistral Large 2, offering an optimal balance of capability and efficiency.
The model has a vocabulary size of 32,768 and can process sequences up to 32,000 tokens long. It supports function calling and is available in the Safetensors format. The architecture requires substantial computational resources: running inference on a single GPU demands at least 44 GB of GPU RAM. For optimal performance, tensor parallelism is recommended to distribute the processing load across multiple devices.
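The 44 GB figure follows directly from the parameter count: 22 billion parameters stored in 16-bit precision occupy roughly 44 GB before activations and the KV cache are counted. A back-of-envelope sketch (an illustration, not an exact profile of any particular runtime):

```python
# Rough weight-memory estimate for a 22B-parameter model.
# Assumes 16-bit (bf16/fp16) weights; activations and KV cache add overhead on top.
params = 22e9          # 22 billion parameters
bytes_per_param = 2    # bfloat16 / float16
weight_gb = params * bytes_per_param / 1e9
print(f"Weights alone: ~{weight_gb:.0f} GB")  # ~44 GB

# Tensor parallelism shards the weights across devices:
for n_gpus in (1, 2, 4):
    print(f"{n_gpus} GPU(s): ~{weight_gb / n_gpus:.0f} GB of weights per device")
```

This is also why tensor parallelism is recommended: splitting the weights across several devices reduces the per-device footprint proportionally, bringing the model within reach of common GPU configurations.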
The model demonstrates impressive multilingual capabilities, supporting ten languages: English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Russian, and Korean. This makes it particularly valuable for international applications and cross-language tasks.
Mistral Small v24.09 shows marked improvements over its predecessor (v24.02) in several key areas, including human alignment, reasoning capabilities, and code handling. These enhancements are documented in the release announcement, which includes comprehensive benchmark comparisons.
The model excels across a broad range of tasks, building on these gains in reasoning, code handling, and multilingual understanding.
For implementation, the recommended approach is the vLLM library (version 0.6.1.post1 or later), which is designed for production-ready inference pipelines. The model can also be deployed with the mistral-inference library or the Hugging Face transformers library, offering flexibility in implementation approaches.
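As a minimal sketch of the vLLM path (assuming vLLM 0.6.1.post1 or later and the public Hugging Face checkpoint ID mistralai/Mistral-Small-Instruct-2409; verify both against the model card for your setup):

```python
# Offline chat inference with vLLM; the flags follow Mistral's published usage,
# but treat the model ID and settings here as assumptions to double-check.
from vllm import LLM
from vllm.sampling_params import SamplingParams

llm = LLM(
    model="mistralai/Mistral-Small-Instruct-2409",
    tokenizer_mode="mistral",   # use Mistral's native tokenizer
    config_format="mistral",
    load_format="mistral",
    tensor_parallel_size=2,     # shard the 22B weights across two GPUs
)

messages = [
    {"role": "user", "content": "Explain function calling in one paragraph."},
]
outputs = llm.chat(messages, sampling_params=SamplingParams(max_tokens=256))
print(outputs[0].outputs[0].text)
```

On a single GPU with enough VRAM, tensor_parallel_size can be omitted; the same checkpoint also loads through mistral-inference or Hugging Face transformers if vLLM is not an option.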
The model is distributed under the Mistral AI Research License (MRL), which restricts usage to non-commercial research purposes. Commercial applications require a separate license directly from Mistral AI. Developers and researchers can view the full license terms for detailed information about usage rights and restrictions.