The MPT (Mosaic Pretrained Transformer) family consists of open-weight, decoder-only transformer models released by MosaicML in 2023. The architecture uses FlashAttention for computational efficiency and Attention with Linear Biases (ALiBi) in place of positional embeddings, which allows the models to extrapolate to contexts longer than those seen in training. The family includes the base MPT-7B model with approximately 6.7 billion parameters, along with variants specialized for instruction-following (MPT-7B-Instruct), conversation (MPT-7B-Chat), and long-form text generation with contexts exceeding 65,000 tokens (MPT-7B-StoryWriter-65k+).
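
As a rough illustration, the sketch below shows how an MPT checkpoint might be loaded through Hugging Face transformers. The `mosaicml/mpt-7b` repository id, the `trust_remote_code` requirement, and the GPT-NeoX-20B tokenizer follow the pattern published on the MosaicML model cards; the specific `max_seq_len` value is an illustrative assumption, not a tuned setting.

```python
# Minimal sketch: loading an MPT checkpoint with Hugging Face transformers.
# Assumes the mosaicml/mpt-7b repository id and the custom-code loading
# pattern from the MosaicML model cards; adjust for your environment.
import transformers

name = "mosaicml/mpt-7b"

# MPT ships custom modeling code, so trust_remote_code=True is required.
config = transformers.AutoConfig.from_pretrained(name, trust_remote_code=True)
# ALiBi lets MPT run on contexts longer than it was trained on; raising
# max_seq_len on the config is how longer windows are exposed.
config.max_seq_len = 8192  # illustrative value

model = transformers.AutoModelForCausalLM.from_pretrained(
    name,
    config=config,
    torch_dtype="auto",
    trust_remote_code=True,
)
# The MPT models reuse the GPT-NeoX-20B tokenizer.
tokenizer = transformers.AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

prompt = "MosaicML released the MPT family to"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The instruction and chat variants are loaded the same way by swapping the repository id; the long-context StoryWriter variant additionally relies on a much larger `max_seq_len` than the base model's training window.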