Phi-3-Mini is a 3.8B-parameter language model with a 128K-token context window. Trained on 4.9T tokens, it performs well on reasoning tasks despite its small size. It uses a structured chat format and shows strong capabilities in mathematics, coding, and logical analysis.
Phi-3-Mini-128K-Instruct is a 3.8 billion-parameter large language model (LLM) developed by Microsoft as part of the Phi-3 family. First released in April 2024 and updated in June 2024, it is a dense decoder-only Transformer and a significant advance among small language models (SLMs). The model was trained on 4.9 trillion tokens using 512 H100-80G GPUs over a 10-day period, combining synthetic data and filtered web content with an emphasis on high-quality, reasoning-dense material.
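The training figures above lend themselves to a few back-of-the-envelope numbers. The Python sketch below derives total GPU-hours and tokens per parameter from the stated values, and estimates how much storage the weights need at several precisions. The bytes-per-parameter values are standard assumptions rather than figures from the model documentation, and the estimate covers weights only (no activations or KV cache).

```python
# Figures quoted in the text above.
PARAMS = 3.8e9          # model parameters
TRAIN_TOKENS = 4.9e12   # training tokens
GPUS = 512              # H100-80G GPUs
TRAIN_DAYS = 10         # wall-clock training time in days

# Assumed storage cost per parameter (standard sizes, not from the model docs).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}


def gpu_hours(gpus: int, days: float) -> float:
    """Total GPU-hours for a run that keeps every GPU busy for `days` days."""
    return gpus * days * 24


def weight_gigabytes(params: float, bytes_per_param: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    print(f"GPU-hours: {gpu_hours(GPUS, TRAIN_DAYS):,.0f}")
    print(f"Tokens per parameter: {TRAIN_TOKENS / PARAMS:,.0f}")
    for precision, size in BYTES_PER_PARAM.items():
        print(f"{precision:>9}: {weight_gigabytes(PARAMS, size):.1f} GB")
```

At 2 bytes per parameter the weights alone come to roughly 7.6 GB, which gives a first indication of whether a given GPU can hold the model before any quantization.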
The model comes in two context length variants: 4K and 128K tokens, with the 128K variant being particularly notable as the first model of its size to support such an extended context window without significant quality degradation. Both variants underwent supervised fine-tuning (SFT) and direct preference optimization (DPO) to enhance instruction-following capabilities and safety measures.
Phi-3-Mini demonstrates state-of-the-art performance among models with fewer than 13 billion parameters across various benchmarks. It shows particularly strong capabilities in mathematics, coding, and logical reasoning.
When compared to larger models like Mistral-7b-v0.1, Mixtral-8x7b, Gemma 7B, Llama-3-8B-Instruct, and GPT-3.5, Phi-3-Mini-128K-Instruct maintains competitive performance, especially in reasoning tasks. However, its smaller size does result in some limitations regarding factual knowledge compared to larger models. A significant update in June 2024 brought improvements to instruction following, structured output capabilities, reasoning, and long-context understanding, as detailed in the technical report.
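The model is designed to work with chat-formatted prompts that use the tags <|system|>, <|user|>, and <|assistant|> to structure the conversation. It is particularly well suited to applications that need strong reasoning, especially in mathematics and logic, under tight memory, compute, or latency constraints.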
The model is supported in the transformers library (development version 4.41.3) and in ONNX Runtime, enabling cross-platform deployment across CPU, GPU, and mobile devices. It is released under the MIT license, which permits both research and commercial use.
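As a rough illustration of the chat tags and the transformers route described above, the Python sketch below assembles a prompt by hand and passes it to the model. Several details are assumptions not stated in this text: the <|end|> terminator after each turn, the Hugging Face repository name microsoft/Phi-3-mini-128k-instruct, and the helper names build_phi3_prompt and generate_reply, which are hypothetical. The generate_reply function also assumes that torch, transformers, and accelerate are installed and that the machine has enough memory for the weights.

```python
from typing import Dict, List

ALLOWED_ROLES = ("system", "user", "assistant")
END_OF_TURN = "<|end|>"  # assumed turn terminator, not stated in the text above


def build_phi3_prompt(messages: List[Dict[str, str]]) -> str:
    """Render role/content messages into a tagged prompt string.

    Each turn becomes "<|role|>\\ncontent<|end|>\\n", and the prompt ends with
    an open "<|assistant|>" tag so the model writes the next reply.
    """
    parts = []
    for message in messages:
        role = message["role"]
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {role!r}")
        parts.append(f"<|{role}|>\n{message['content']}{END_OF_TURN}\n")
    parts.append("<|assistant|>\n")
    return "".join(parts)


def generate_reply(
    messages: List[Dict[str, str]],
    model_id: str = "microsoft/Phi-3-mini-128k-instruct",  # assumed repo name
    max_new_tokens: int = 128,
) -> str:
    """Download the model (several GB) and generate one assistant reply."""
    # Imported lazily so the prompt helper works without these packages.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",  # use the dtype stored in the checkpoint
        device_map="auto",   # needs the accelerate package
    )
    inputs = tokenizer(build_phi3_prompt(messages), return_tensors="pt")
    inputs = inputs.to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the tokens produced after the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    chat = [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "What is 17 * 6?"},
    ]
    print(build_phi3_prompt(chat))
```

In practice the tokenizer's apply_chat_template method is usually the safer choice, since it reads the template shipped with the checkpoint; the hand-built string here is only meant to make the tag layout visible.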
Phi-3-Mini is part of the broader Phi-3 family, which also includes the larger Phi-3-Small (7B parameters) and Phi-3-Medium (14B parameters) models.
These models are being developed according to Microsoft's Responsible AI Standard, with rigorous safety testing and evaluation protocols. The family represents a strategic approach to creating efficient, capable language models that balance performance with resource requirements.