The simplest way to self-host Phi-4. Launch a dedicated cloud GPU server running Lab Station OS to download and serve the model using any compatible app or framework.
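If the server exposes an OpenAI-compatible API, as servers such as vLLM do, it can be queried with the standard OpenAI client. A minimal sketch, assuming the endpoint URL and the `microsoft/phi-4` model name below; adjust both to match your deployment:

```python
# Query a self-hosted Phi-4 endpoint. Assumes the server exposes an
# OpenAI-compatible API (e.g. started with `vllm serve microsoft/phi-4`);
# the base_url and model name are assumptions about your deployment.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="microsoft/phi-4",
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    max_tokens=512,
)
print(response.choices[0].message.content)
```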
Download model weights for local inference. Must be used with a compatible app, notebook, or codebase. May run slowly, or not work at all, depending on your system resources, particularly GPU(s) and available VRAM.
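For local inference with Hugging Face transformers, a minimal sketch follows. The `microsoft/phi-4` repository id and bfloat16 precision are assumptions to adapt to your setup; a 14B-parameter model in bf16 needs roughly 28 GB of VRAM, so smaller GPUs will require quantization or offloading.

```python
# Minimal local inference sketch with Hugging Face transformers.
# Assumes the weights are published as "microsoft/phi-4"; a 14B model in
# bfloat16 needs ~28 GB of VRAM, so smaller GPUs need quantization.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What is the derivative of x^3?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```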
Microsoft's Phi-4 is a 14B parameter language model excelling in STEM tasks, particularly mathematical reasoning. Built with a decoder-only architecture, it features a 16K context window and was trained on 9.8T tokens, with 40% being synthetic data. Notable for outperforming larger models on math and science benchmarks.
Microsoft's Phi-4 represents a significant advancement in small language models (SLMs), combining sophisticated training techniques with focused optimization for complex reasoning tasks. Released on December 12, 2024, this 14-billion parameter dense decoder-only Transformer model demonstrates remarkable capabilities, particularly in mathematical and scientific reasoning domains.
Phi-4 builds upon the architecture of its predecessor, Phi-3-medium, incorporating minimal structural changes but achieving substantial performance improvements through enhanced training methodologies. The model employs the tiktoken tokenizer for improved multilingual support and features a 16K-token context length, upgraded from 4K during mid-training.
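Because the context window is fixed at 16K tokens, long prompts are worth checking before generation. A small sketch, assuming the tokenizer is available from the `microsoft/phi-4` repository:

```python
# Check whether a prompt fits in Phi-4's 16K (16,384-token) context window
# before sending it for generation. The repository id is an assumption.
from transformers import AutoTokenizer

CONTEXT_LENGTH = 16_384  # 16K tokens, per the technical report

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-4")

def fits_in_context(prompt: str, max_new_tokens: int = 512) -> bool:
    # Reserve room for the tokens the model is expected to generate.
    n_tokens = len(tokenizer.encode(prompt))
    return n_tokens + max_new_tokens <= CONTEXT_LENGTH

print(fits_in_context("Summarize the Phi-4 technical report."))
```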
The training process, detailed in the technical report, involved approximately 9.8 trillion tokens over 21 days on 1,920 H100-80G GPUs. The training data mixture was carefully curated, combining synthetic data (roughly 40% of tokens) with filtered web content, web rewrites, code, and acquired high-quality sources.
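To illustrate what a weighted mixture means in practice, the sketch below samples training sources by weight. Only the roughly 40% synthetic share comes from the report; the remaining splits are invented for illustration:

```python
# Illustrative sketch of sampling training examples from a weighted data
# mixture. Only the ~40% synthetic share is from the Phi-4 report; the
# other weights are assumptions made up for this example.
import random
from collections import Counter

MIXTURE = {
    "synthetic": 0.40,      # from the report
    "filtered_web": 0.25,   # assumed split of the remaining 60%
    "code": 0.20,           # assumed
    "acquired": 0.15,       # assumed
}

def sample_source() -> str:
    sources, weights = zip(*MIXTURE.items())
    return random.choices(sources, weights=weights, k=1)[0]

# Empirical counts should track the mixture weights.
print(Counter(sample_source() for _ in range(10_000)))
```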
Unlike previous Phi models that primarily distilled capabilities from GPT-4, Phi-4's training incorporated extensive synthetic data generation through multi-agent prompting, self-revision workflows, and instruction reversal. Approximately 400 billion unweighted tokens were generated across 50 types of synthetic datasets.
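The self-revision workflow can be pictured as a generate-critique-rewrite loop. The sketch below is a schematic reconstruction, not the report's actual pipeline, and assumes an OpenAI-compatible endpoint serving Phi-4:

```python
# Schematic sketch of a self-revision workflow: draft, critique, rewrite.
# The loop structure is a reconstruction for illustration, not the exact
# pipeline from the technical report. Assumes an OpenAI-compatible
# endpoint serving Phi-4 at the URL below.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

def generate(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="microsoft/phi-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def self_revise(task: str, rounds: int = 2) -> str:
    draft = generate(f"Solve the following task:\n{task}")
    for _ in range(rounds):
        critique = generate(
            f"Task:\n{task}\n\nDraft answer:\n{draft}\n\n"
            "List any errors or weaknesses in the draft."
        )
        draft = generate(
            f"Task:\n{task}\n\nDraft:\n{draft}\n\nCritique:\n{critique}\n\n"
            "Rewrite the draft to fix the issues above."
        )
    return draft
```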
Phi-4 performs strongly across a wide range of benchmarks, with its clearest edge in STEM tasks: it outperforms substantially larger models on several mathematical-reasoning benchmarks.
Compared with similar-sized models such as Qwen-2.5-14B-Instruct, Phi-4 comes out ahead on 9 of 12 benchmarks. Notably, it surpasses GPT-4o on the GPQA and MATH benchmarks and achieves leading coding scores among open-weight models.
The model emphasizes safety through a comprehensive post-training approach combining supervised fine-tuning (SFT) and direct preference optimization (DPO) on curated preference data, complemented by red-team testing and automated safety evaluations.
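DPO trains directly on preference pairs, widening the likelihood margin of preferred over rejected responses relative to a frozen reference model. A minimal sketch of the standard DPO loss (Rafailov et al., 2023) in PyTorch; this is the generic formulation, not necessarily Phi-4's exact recipe:

```python
# Minimal sketch of the standard DPO loss (Rafailov et al., 2023),
# shown for illustration; not necessarily the exact recipe used for Phi-4.
import torch
import torch.nn.functional as F

def dpo_loss(
    policy_chosen_logps: torch.Tensor,    # log p_theta(y_chosen | x)
    policy_rejected_logps: torch.Tensor,  # log p_theta(y_rejected | x)
    ref_chosen_logps: torch.Tensor,       # same quantities under the
    ref_rejected_logps: torch.Tensor,     # frozen reference (SFT) model
    beta: float = 0.1,
) -> torch.Tensor:
    # How much more the policy prefers each answer than the reference does.
    chosen_rewards = policy_chosen_logps - ref_chosen_logps
    rejected_rewards = policy_rejected_logps - ref_rejected_logps
    # Logistic loss on the reward margin pushes chosen above rejected.
    return -F.logsigmoid(beta * (chosen_rewards - rejected_rewards)).mean()
```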
Despite these measures, certain limitations persist: like all models of its scale, Phi-4 can produce factual hallucinations, and the technical report notes comparatively weaker performance on strict instruction-following tasks.
The model is released under the MIT license, making it accessible for research and development while encouraging responsible AI practices.