Note: Stable Diffusion 3.5 Turbo weights are released under a Stability AI Non-Commercial Research Community License and cannot be used for commercial purposes. Please read the license to verify whether your use case is permitted.
The simplest way to self-host Stable Diffusion 3.5 Turbo. Launch a dedicated cloud GPU server running Lab Station OS to download and serve the model using any compatible app or framework.
Download the model weights for local inference. The weights must be used with a compatible app, notebook, or codebase, and may run slowly or not at all depending on your system resources, particularly GPU(s) and available VRAM.
Stable Diffusion 3.5 Turbo is a text-to-image model that uses a Multimodal Diffusion Transformer architecture and Adversarial Diffusion Distillation, enabling image generation in just 4 steps. It features three text encoders (OpenCLIP, CLIP, T5) and shows particular strength in typography and complex prompt interpretation.
Stable Diffusion 3.5 Turbo is a text-to-image model launched by Stability AI on October 22, 2024. As part of the broader Stable Diffusion 3.5 family, it builds on previous iterations while introducing architectural improvements and optimizations for performance and accessibility.
At its core, Stable Diffusion 3.5 Turbo is a Multimodal Diffusion Transformer (MMDiT) distilled with Adversarial Diffusion Distillation (ADD). This combination enables high image quality, improved typography, and more sophisticated prompt understanding while keeping inference efficient. The model employs three fixed, pretrained text encoders: OpenCLIP, CLIP, and T5.
A notable architectural enhancement is the integration of Query-Key (QK) normalization into the transformer blocks, which improves training stability and simplifies fine-tuning. More detailed technical specifications can be found in the MMDiT research paper and ADD technical report.
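To illustrate why QK normalization helps stability, the following is a minimal sketch in plain Python, not the model's actual implementation. It assumes the normalization is an RMS normalization applied to each query and key vector before the scaled dot product, with no learnable gain; the function names `rms_norm` and `attention_logit` are hypothetical. The point it shows is that normalized logits stay bounded by the square root of the head dimension, regardless of how large the raw activations grow.

```python
import math

def rms_norm(vec, eps=1e-6):
    """Rescale a vector so its root-mean-square is approximately 1.

    Assumption: no learnable gain, which a real transformer block may include.
    """
    rms = math.sqrt(sum(x * x for x in vec) / len(vec) + eps)
    return [x / rms for x in vec]

def attention_logit(q, k, qk_norm=True):
    """Scaled dot-product logit for a single query/key pair."""
    if qk_norm:
        q, k = rms_norm(q), rms_norm(k)
    d = len(q)
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(d)

# Deliberately large activations, as might appear late in training.
q = [100.0, -250.0, 75.0, 300.0]
k = [90.0, -260.0, 80.0, 310.0]

raw_logit = attention_logit(q, k, qk_norm=False)    # grows with activation scale
normed_logit = attention_logit(q, k, qk_norm=True)  # bounded by sqrt(len(q))

# Scaling the query by 10x leaves the normalized logit essentially unchanged.
scaled_logit = attention_logit([10.0 * x for x in q], k, qk_norm=True)

print(raw_logit, normed_logit, scaled_logit)
```

Because the softmax input can no longer blow up when activations drift in scale, attention is less likely to saturate, which is one plausible reading of the stability claim above.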
Stable Diffusion 3.5 Turbo is a distilled version of the larger SD 3.5 Large model, optimized specifically for consumer hardware while maintaining exceptional image quality. One of its most notable features is the ability to generate high-quality images in just 4 inference steps, making it significantly more efficient than previous versions.
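A 4-step generation can be sketched with the Hugging Face `diffusers` library as follows. Several details here are assumptions not stated in this document: the repository ID `stabilityai/stable-diffusion-3.5-large-turbo`, the use of `bfloat16` weights on a CUDA device, and a guidance scale of 0.0 (distilled models are commonly run without classifier-free guidance). The helper names `generation_kwargs` and `generate` are hypothetical. The heavy imports live inside `generate` so the settings can be inspected without a GPU.

```python
# Assumed Hugging Face repository ID; check the model page for the exact name.
MODEL_ID = "stabilityai/stable-diffusion-3.5-large-turbo"

def generation_kwargs(prompt, steps=4, guidance_scale=0.0):
    """Build the call arguments for a few-step generation.

    steps=4 follows the 4-step figure quoted above; guidance_scale=0.0
    is an assumption about how the distilled model is meant to be sampled.
    """
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
    }

def generate(prompt, model_id=MODEL_ID):
    """Load the pipeline and return one PIL image (needs a GPU and the weights)."""
    # Imported lazily so this module loads even where torch/diffusers are absent.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        model_id, torch_dtype=torch.bfloat16
    )
    pipe = pipe.to("cuda")
    return pipe(**generation_kwargs(prompt)).images[0]
```

With the weights downloaded and the license accepted, `generate("a neon sign that reads OPEN").save("sign.png")` would write a single image to disk. On GPUs with limited VRAM, the load may fail or run slowly, as noted earlier.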
The model demonstrates remarkable versatility in generating diverse outputs, from photorealistic images to various artistic styles.
In terms of performance benchmarks, Stable Diffusion 3.5 Large (the base model from which Turbo is distilled) leads the market in prompt adherence and rivals much larger models in image quality.
Within the SD 3.5 family, there are several variants optimized for different use cases.
The model excels in generating diverse representations of people and features without requiring extensive prompting.
The model is released under the Stability Community License, which allows free use for:
Stability AI has implemented safety mitigations to reduce harmful content generation risks, though developers are encouraged to implement additional safeguards based on their specific use cases. The model is explicitly not intended for generating factually accurate representations of people or events.