The simplest way to self-host Qwen 1.5 72B. Launch a dedicated cloud GPU server running Laboratory OS to download and serve the model using any compatible app or framework.
Download model weights for local inference. Must be used with a compatible app, notebook, or codebase. May run slowly, or not work at all, depending on your system resources, particularly GPU(s) and available VRAM.
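For the local-weights route, a hedged sketch of the download step using the huggingface_hub library (the repository name Qwen/Qwen1.5-72B-Chat is taken from the public Hugging Face Hub listing; the target directory is illustrative):

```python
# Sketch: fetch the Qwen 1.5 72B Chat weights from the Hugging Face Hub.
# Requires `pip install huggingface_hub`; the full-precision checkpoint is
# roughly 150 GB of safetensors shards, so check free disk space first.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="Qwen/Qwen1.5-72B-Chat",  # base (non-chat) variant: "Qwen/Qwen1.5-72B"
    local_dir="./qwen1.5-72b-chat",   # target directory for the downloaded files
)
print(f"Model files downloaded to {local_dir}")
```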
Qwen 1.5 72B is a large language model with a 32,768 token context length, trained using DPO and PPO techniques. It excels in mathematical reasoning, programming, and multilingual tasks across 12 languages. Notable for strong performance on MMLU, C-Eval, and GSM8K benchmarks, comparing favorably with GPT-3.5.
The Qwen 1.5 72B model, released in February 2024, represents a significant advancement in the Qwen series of large language models. As part of a comprehensive family of models ranging from 0.5B to 110B parameters, the 72B variant stands out as one of the larger and more capable implementations in the series. All models in the family support an impressive context length of 32,768 tokens, enabling them to process and understand lengthy documents and conversations.
The model's development incorporates advanced alignment techniques, including Direct Preference Optimization (DPO) and Proximal Policy Optimization (PPO), which help align the model's outputs with human preferences. While the specific architecture details and training data composition are not publicly disclosed, evaluation results demonstrate significant capabilities across multiple domains.
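For context, the standard DPO objective (the general formulation from the original DPO paper, not a disclosed detail of Qwen's training recipe) fine-tunes the policy directly on preference pairs: y_w and y_l are the preferred and rejected responses to a prompt x, pi_ref is a frozen reference model, and beta controls how far the policy may drift from that reference.

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) =
  -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}} \left[
    \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right)
  \right]
```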
In benchmark evaluations, Qwen 1.5 72B demonstrates strong performance across a wide range of tasks. The model consistently outperforms Llama2-70B across all tested benchmarks despite a similar parameter count. Notable evaluation areas include:
- General language understanding (MMLU) and Chinese-language knowledge (C-Eval)
- Mathematical reasoning (GSM8K)
- Code generation (HumanEval)
- Complex reasoning (BBH)
The model's multilingual capabilities are particularly noteworthy, with evaluations across 12 different languages showing competitive performance compared to GPT-3.5 and GPT-4. It excels in areas such as:
- Language-specific exams and knowledge
- Reading comprehension
- Translation
- Mathematics in non-English languages
Long-context understanding represents another strength of the model, with strong performance on the L-Eval benchmark. The model also shows promising results in retrieval-augmented generation (RAG) tasks, as measured by the RGB benchmark, and demonstrates effective capabilities as an AI agent according to the T-Eval benchmark.
The Qwen 1.5 series, including the 72B model, is designed for broad accessibility and ease of use. Integration with Hugging Face Transformers (version 4.37.0 and later) provides a straightforward path to implementation; a loading sketch is shown after the next paragraph. The model supports various quantization options, including:
- GPTQ quantization (Int4 and Int8)
- AWQ quantization
- GGUF quantized checkpoints for llama.cpp-style runtimes
These quantization options allow for deployment in resource-constrained environments while maintaining reasonable performance characteristics.
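As a minimal sketch (assuming the published Hub repository names such as Qwen/Qwen1.5-72B-Chat and its AWQ/GPTQ counterparts, and a machine with enough GPU memory), loading the model with Transformers 4.37.0 or later might look like this:

```python
# Sketch: load Qwen 1.5 72B Chat with Hugging Face Transformers (>= 4.37.0).
# The full-precision checkpoint needs multiple high-memory GPUs; the quantized
# repositories (e.g. "Qwen/Qwen1.5-72B-Chat-AWQ" or
# "Qwen/Qwen1.5-72B-Chat-GPTQ-Int4") reduce the VRAM footprint considerably.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-72B-Chat"  # swap in a quantized repo for smaller GPUs

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # quantized repos set their own dtype
    device_map="auto",           # shard layers across available GPUs (needs accelerate)
)
```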
The Qwen 1.5 series includes multiple model sizes:
- 0.5B
- 1.8B
- 4B
- 7B
- 14B
- 32B
- 72B
- 110B
Each size comes in both base and chat variants, allowing users to select the most appropriate model for their specific use case. The 72B model sits just below the largest 110B variant, offering strong performance while remaining more manageable in terms of computational requirements.
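Continuing the loading sketch above, a single chat turn with the 72B chat variant might look like the following (the prompt and generation settings are illustrative, not recommended defaults):

```python
# Sketch: one-turn chat generation with the chat variant loaded above.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain the 32,768-token context window in one sentence."},
]

# apply_chat_template renders the messages into the model's chat format.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(
    output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True
)
print(response)
```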