Model Report
Alibaba Cloud / Qwen 1.5 32B
Qwen1.5-32B is a 32-billion parameter generative language model developed by Alibaba Cloud's Qwen Team and released in February 2024. The model supports up to 32,768 tokens of context length and demonstrates multilingual capabilities across European, East Asian, and Southeast Asian languages. It achieves competitive performance on language understanding and reasoning benchmarks, with an MMLU score of 73.4, and includes features for retrieval-augmented generation and external system integration.
Qwen1.5-32B is a large-scale generative language model in the Qwen1.5 series, developed by Alibaba Cloud's Qwen Team and released in February 2024. It incorporates design choices aimed at improved alignment with human preferences, multilingual capability, and long-context processing, targeting stronger performance across a range of natural language processing tasks.
Timeline graphic highlighting releases in the Qwen series, with a focus on the launch of Qwen1.5.
The Qwen1.5-32B model belongs to a modular series covering a spectrum of model sizes, from 0.5B to 110B parameters, alongside a mixture-of-experts (MoE) variant. All models in the series support an extended context window of up to 32,768 tokens, which improves their ability to process and retain information across lengthy documents and complex, multi-turn conversations. The models integrate natively with the Hugging Face Transformers library (version 4.37.0 and above), requiring no custom code or special execution environments, as described in the official Qwen1.5 release documentation.
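As a sketch of that integration path, the following loads the chat variant through the standard Transformers API. The repository id `Qwen/Qwen1.5-32B-Chat` follows the official Hugging Face release; the dtype and device settings are illustrative, and the 32B weights require substantial GPU memory.

```python
MODEL_ID = "Qwen/Qwen1.5-32B-Chat"  # official Hugging Face repository id

def chat_once(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate one chat completion from Qwen1.5-32B-Chat via Transformers."""
    # Imported lazily so this module stays importable without the heavy
    # transformers/torch dependencies installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # pick bf16/fp16 where the hardware supports it
        device_map="auto",    # shard the 32B weights across available GPUs
    )
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt},
    ]
    # apply_chat_template renders the turns in Qwen's ChatML-style format
    # and appends the assistant prefix so generation starts a fresh reply.
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens and decode only the newly generated reply.
    return tokenizer.decode(
        outputs[0][inputs.shape[-1]:], skip_special_tokens=True
    )
```

In practice only the repository id and the hardware-mapping arguments need to change between the Qwen1.5 model sizes, since the whole series shares the same Transformers-native architecture.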
Alignment to human preferences is achieved using preference-optimization techniques such as Direct Preference Optimization (DPO) and reinforcement learning via Proximal Policy Optimization (PPO), incorporated during the fine-tuning stages of development. These methods refine the model's tendency to follow human instructions and to produce responses that are helpful, accurate, and contextually appropriate, as detailed in the Qwen team's technical blog post. While the specific pretraining datasets remain undisclosed, the team states that diverse data and multilingual resources were prioritized for broad generalization.
The Qwen1.5-32B model is published in multiple quantized formats, including Int4/Int8 GPTQ, AWQ, and GGUF, enabling efficient inference and adaptation to a range of deployment scenarios.
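As an illustration, the GPTQ and AWQ checkpoints load through the same Transformers API as the full-precision weights, with only the repository id changing, while the GGUF files target llama.cpp-style runtimes instead. The repository names below follow the official Qwen1.5 releases on Hugging Face; the helper function is a hypothetical convenience, not an official API.

```python
# Quantized variants of Qwen1.5-32B-Chat published by the Qwen team.
QUANTIZED_REPOS = {
    "gptq-int4": "Qwen/Qwen1.5-32B-Chat-GPTQ-Int4",
    "awq": "Qwen/Qwen1.5-32B-Chat-AWQ",
    "gguf": "Qwen/Qwen1.5-32B-Chat-GGUF",  # for llama.cpp-style runtimes
}

def load_quantized(kind: str):
    """Load a GPTQ or AWQ checkpoint with the standard Transformers API."""
    if kind == "gguf":
        # GGUF files are consumed by llama.cpp-compatible runtimes,
        # not by AutoModelForCausalLM.
        raise ValueError("GGUF is not loaded via Transformers")
    from transformers import AutoModelForCausalLM  # lazy: heavy dependency

    return AutoModelForCausalLM.from_pretrained(
        QUANTIZED_REPOS[kind], device_map="auto"
    )
```

The Int4 GPTQ and AWQ variants reduce the memory footprint substantially, which is what makes local inference of a 32B model practical on consumer multi-GPU setups.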
Performance and Benchmark Evaluation
Qwen1.5-32B demonstrates competitive results across language understanding, reasoning, and coding benchmarks. Reported base-model scores include MMLU 73.4, C-Eval 83.5, and CMMLU 82.3 for knowledge and language understanding; GSM8K 77.4 and MATH 36.1 for mathematical reasoning; HumanEval 37.2 and MBPP 49.4 for programming; and BBH 66.8 for general reasoning. These results, published in the Qwen1.5 performance overview, place the model competitively among peer language models of similar scale.
Although multilingual and tool-use results for Qwen1.5-32B are less extensively documented than those of larger siblings such as Qwen1.5-72B, the series as a whole exhibits capabilities in language understanding, translation, and code-related tasks. The model family has been evaluated with MT-Bench and AlpacaEval v2 for instruction following, L-Eval for long-context handling, RGB for retrieval-augmented generation, and additional assessments of code interpretation and tool use.
Scatter plot comparing alignment with human preferences across Qwen1.5 and other prominent language models on MT-Bench and AlpacaEval 2.0.
Qwen1.5-32B includes support for multiple languages spanning Europe, East Asia, and Southeast Asia, with demonstrable performance across exams, translation, comprehension, and mathematical reasoning tasks. The design emphasizes uniform multilingual evaluation and large context capacity, allowing applications in translation, cross-lingual information retrieval, and multilingual dialogue.
Published results for the broader Qwen1.5 family indicate score parity with, and in some cases an advantage over, other leading models on multilingual benchmarks, as highlighted in bar charts comparing Qwen1.5-72B-Chat with models such as GPT-3.5 across languages including Arabic, French, Vietnamese, Korean, and Japanese in a performance comparison.
Bar chart comparing multilingual evaluation scores between Qwen1.5-72B-Chat and GPT-3.5 across various languages.
Consistent support for context lengths of up to 32,768 tokens increases usability in scenarios requiring rich document understanding and complex multi-step instructions, without premature truncation or loss of coherence.
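A minimal sketch of budgeting against that window: the 32,768-token limit comes from the Qwen1.5 release, while the headroom value and helper name are illustrative assumptions. Because generated tokens share the same window as the prompt, an application should reserve space for the expected reply before filling the context.

```python
MAX_CONTEXT = 32768  # context window supported across the Qwen1.5 series

def fits_in_context(prompt_tokens: int, reserve_for_output: int = 512) -> bool:
    """Check that a prompt plus generation headroom fits the window."""
    # The reply is generated inside the same 32,768-token window as the
    # prompt, so reserve room for it up front.
    return prompt_tokens + reserve_for_output <= MAX_CONTEXT

fits_in_context(30000)  # True: 30,512 tokens total still fit
fits_in_context(32500)  # False: 33,012 tokens would exceed the window
```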
Integration, Alignment, and System Connectivity
Beyond core language modeling, Qwen1.5-32B is designed to integrate with external systems. Features include support for retrieval-augmented generation (RAG), enabling access to up-to-date or private knowledge through connected external databases or APIs. The model series also supports API-based interaction, function/tool calling, and use within agent-oriented frameworks.
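A hypothetical sketch of the RAG pattern described above, in which retrieved passages are folded into the system turn ahead of the user question. The function name and prompt wording are illustrative, not part of any official Qwen API; the resulting message list is in the standard chat format the model's tokenizer accepts.

```python
def build_rag_messages(question: str, passages: list[str]) -> list[dict]:
    """Assemble chat messages with retrieved passages as numbered context."""
    # Number the passages so the model can cite them in its answer.
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    system = (
        "Answer using only the numbered context passages below. "
        "Cite passage numbers where relevant.\n\n" + context
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

msgs = build_rag_messages(
    "When was Qwen1.5 released?",
    ["Qwen1.5 was released in February 2024."],
)
# msgs can be passed directly to tokenizer.apply_chat_template(...)
```

Keeping retrieval outside the model and injecting results as plain context is what lets the connected knowledge base be updated without retraining.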
Alignment with human intent and safety guidelines is addressed via the aforementioned DPO and PPO training regimes. These steps contribute to instruction-following reliability and human-aligned outputs, validated through crowd-sourced evaluation and results on alignment-oriented benchmarks, as described in the Qwen1.5 technical overview.
Comparative Position and Limitations
Within the Qwen1.5 series, the 32B model occupies a middle-size tier, offering a balance between scale and resource efficiency. Larger siblings, such as Qwen1.5-72B, are reported to outperform models like Llama-2-70B across diverse metrics and demonstrate high scores in alignment and multilingual evaluations. However, on code interpretation and visual reasoning tasks, the series is noted to trail frontier models such as GPT-4, particularly in tool use and math-centric reasoning.
While the models achieve parity in many multi-turn, chat, and reasoning benchmarks, ongoing work is planned by the Qwen Team to bolster code reasoning and interpreter accuracy, as noted in the model’s official blog announcement.
Applications and Use Cases
Qwen1.5-32B finds application across a wide range of domains, including language comprehension and generation, coding assistance, complex reasoning, multilingual dialogue, retrieval-augmented workflows, and integration with agent frameworks capable of tool invocation. Fine-tuning support and configurability allow adaptation to specialized domains and research. Local inference in quantized formats and integration with open-source tools further extend practical accessibility.