Note: SDXL Turbo weights are released under the Stability AI Non-Commercial Research Community License and cannot be used for commercial purposes. Please read the license to check whether your use case is permitted.
The simplest way to self-host SDXL Turbo. Launch a dedicated cloud GPU server running Lab Station OS to download and serve the model using any compatible app or framework.
Download model weights for local inference. Must be used with a compatible app, notebook, or codebase. May run slowly, or not work at all, depending on your system resources, particularly GPU(s) and available VRAM.
SDXL Turbo uses Adversarial Diffusion Distillation to generate images in a single step. Available in 860M and 3.1B parameter versions, it produces 512x512 images in ~200ms. The model combines adversarial training with score distillation, using a pre-trained diffusion model as guidance.
SDXL Turbo represents a significant advancement in real-time text-to-image generation, introducing a novel distillation technique called Adversarial Diffusion Distillation (ADD) that enables high-quality image synthesis in a single step. As detailed in the research paper, this breakthrough allows for dramatically faster generation compared to its predecessor SDXL 1.0, which required 50 steps for comparable results.
The model's architecture leverages ADD, which combines adversarial training with score distillation from a pre-trained teacher model. This approach maintains high image fidelity while enabling rapid generation. The ADD framework consists of three key components: an ADD-student network initialized from a pretrained diffusion UNet, a trainable discriminator, and a diffusion-model teacher whose weights stay frozen during training.
The discriminator is conditioned on both text and image embeddings to enhance performance, utilizing Vision Transformers (ViTs) trained with the DINOv2 objective. This architecture allows for iterative refinement, with image quality improving through additional sampling steps when desired.
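To make the interplay of the two training signals concrete, here is a minimal, toy-scale sketch of an ADD-style training step. The tiny linear networks, the add_noise helper, the softplus adversarial loss, the MSE distance, and the loss weighting are illustrative placeholders only; the actual SDXL Turbo training operates on the SDXL UNet and a DINOv2-based ViT discriminator.

import torch
import torch.nn.functional as F
from torch import nn

# Toy stand-ins for the three ADD components (illustrative placeholders only).
student = nn.Linear(16, 16)          # ADD-student: predicts a clean sample in one step
teacher = nn.Linear(16, 16).eval()   # pre-trained diffusion teacher, kept frozen
disc = nn.Linear(16, 1)              # discriminator head (ViT/DINOv2 features in the paper)
for p in teacher.parameters():
    p.requires_grad_(False)

def add_noise(x, t):
    # Simplified forward-diffusion step: blend the signal with Gaussian noise.
    return (1 - t) * x + t * torch.randn_like(x)

x_real = torch.randn(8, 16)                               # batch of training samples
x_student = student(add_noise(x_real, torch.rand(8, 1)))  # one-step student prediction

# Adversarial term: push student outputs to be scored as real
# (the discriminator's own real-vs-fake update is omitted here).
adv_loss = F.softplus(-disc(x_student)).mean()

# Score-distillation term: re-noise the student output, let the frozen teacher
# denoise it, and pull the student prediction toward the teacher's estimate.
with torch.no_grad():
    x_teacher = teacher(add_noise(x_student, torch.rand(8, 1)))
distill_loss = F.mse_loss(x_student, x_teacher)

loss = adv_loss + 2.5 * distill_loss                      # illustrative weighting of the two terms
loss.backward()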
Two variants were developed during research: ADD-M (860M parameters) and ADD-XL (3.1B parameters); SDXL Turbo corresponds to the larger ADD-XL variant.
SDXL Turbo demonstrates strong performance in both single-step and multi-step configurations. On an A100 GPU, it generates 512x512 images in approximately 207ms, with a single UNet forward evaluation taking just 67ms. Blind preference tests against state-of-the-art models showed that SDXL Turbo with a single step outperforms LCM-XL running 4 steps, and that with 4 steps it outperforms SDXL 1.0 at 50 steps.
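To get a rough sense of latency on your own hardware, you can time a single-step generation using the Diffusers setup described below; results vary widely with GPU, precision, and warm-up, so treat this as a sketch rather than a benchmark, and the prompt is only an example.

import time
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

prompt = "a photo of a lighthouse at sunset"              # example prompt
pipe(prompt, num_inference_steps=1, guidance_scale=0.0)   # warm-up run

torch.cuda.synchronize()
start = time.perf_counter()
pipe(prompt, num_inference_steps=1, guidance_scale=0.0)
torch.cuda.synchronize()
print(f"single-step generation: {(time.perf_counter() - start) * 1000:.0f} ms")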
The model excels at producing photorealistic images but has some limitations: output resolution is fixed at 512x512, it cannot render legible text, faces and people may not be generated correctly, and the autoencoding stage is lossy.
SDXL Turbo can be run using the Diffusers library. The basic setup requires installing the dependencies:
pip install diffusers transformers accelerate --upgrade
For text-to-image generation:
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")
Optimal parameters are num_inference_steps=1 and guidance_scale=0.0; SDXL Turbo was trained without classifier-free guidance, so the guidance scale must be disabled.
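Putting it together, a single generation call with these settings might look like this, reusing the pipe object created above (the prompt and output filename are only examples):

prompt = "a cinematic photo of a red fox in fresh snow"
image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0).images[0]
image.save("sdxl_turbo_output.png")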
For image-to-image generation, use AutoPipelineForImage2Image and make sure that num_inference_steps * strength is at least 1, since the pipeline runs int(num_inference_steps * strength) steps.
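A minimal image-to-image sketch, assuming the text-to-image pipe from above is already loaded and using "input.png" as a placeholder path for your own source image:

from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image

# Reuse the already-loaded components instead of downloading the weights again.
pipe_i2i = AutoPipelineForImage2Image.from_pipe(pipe).to("cuda")

init_image = load_image("input.png").resize((512, 512))  # placeholder source image
image = pipe_i2i(
    prompt="a watercolor painting of the same scene",
    image=init_image,
    num_inference_steps=2,   # 2 * 0.5 = 1 effective step
    strength=0.5,
    guidance_scale=0.0,
).images[0]
image.save("sdxl_turbo_img2img.png")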
The model is licensed under the sai-nc-community license for non-commercial and research purposes. Primary applications include generation of artworks and use in design and other creative processes, applications in educational or creative tools, and research on generative models and their limitations.