The simplest way to self-host Playground v2.5 Aesthetic. Launch a dedicated cloud GPU server running Lab Station OS to download and serve the model using any compatible app or framework.
Download model weights for local inference. Must be used with a compatible app, notebook, or codebase. May run slowly, or not work at all, depending on your system resources, particularly GPU(s) and available VRAM.
Playground v2.5 Aesthetic is a text-to-image model built on the SDXL architecture. It adopts the EDM framework (from "Elucidating the Design Space of Diffusion-Based Generative Models") for improved color and contrast, and it features balanced aspect ratio handling (9:16 to 16:9) and human-guided training for enhanced facial details and textures. Benchmarks show an FID score of 4.48 versus 9.55 for SDXL.
Playground v2.5 Aesthetic, released in February 2024, represents a significant advancement in text-to-image generation, building upon the Stable Diffusion XL (SDXL) architecture while introducing novel methods to enhance aesthetic quality. The model utilizes two fixed, pre-trained text encoders (OpenCLIP-ViT/G and CLIP-ViT/L) and is available under the Playground v2.5 Community License.
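The developers focused on three primary areas of enhancement: color and contrast, generation across multiple aspect ratios, and human-centric fine details such as faces and textures.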
User studies and benchmarks reported by the developers indicate that Playground v2.5 performs better than both open-source and closed-source competitors. The model outperforms SDXL by a factor of 4.8 and PixArt-α by a factor of 2.4 in aesthetic quality, while also surpassing DALL-E 3 and Midjourney 5.2.
On the MJHQ-30K benchmark, Playground v2.5 achieves a significantly lower FID score (4.48; lower is better) than Playground v2 (7.07) and SDXL (9.55), with particularly strong performance in the "people" and "fashion" categories.
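For context, FID (Fréchet Inception Distance) compares the statistics of Inception-network features extracted from reference images and from generated images. The standard definition is given below. It is general background, not something specific to this model, and the symbol names are chosen here for illustration: μ_r, Σ_r are the feature mean and covariance for the reference set, and μ_g, Σ_g are those for the generated set.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

A score of 0 would mean the two feature distributions have identical means and covariances, which is why the lower value reported for Playground v2.5 indicates a closer match to the reference images.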
The model supports generation at 1024x1024 resolution and various aspect ratios. It employs two different schedulers:
EDMDPMSolverMultistepScheduler (default), with a recommended guidance_scale of 3.0 for sharper details
EDMEulerScheduler, with a recommended guidance_scale of 5.0 as an alternative option
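The scheduler choices above can be sketched with the Hugging Face diffusers library. This is a minimal illustration under stated assumptions, not the official usage: the repository id `playgroundai/playground-v2.5-1024px-aesthetic`, the fp16 variant, the 50-step default, and the helper name `generate` are not given on this page, and the code assumes a CUDA GPU with enough VRAM.

```python
# Recommended guidance_scale per scheduler, as stated in the text above.
RECOMMENDED_GUIDANCE = {
    "EDMDPMSolverMultistepScheduler": 3.0,  # default scheduler
    "EDMEulerScheduler": 5.0,               # alternative scheduler
}

# Assumption: Hugging Face repository id for the weights (not named on this page).
MODEL_ID = "playgroundai/playground-v2.5-1024px-aesthetic"


def generate(prompt, scheduler_name="EDMDPMSolverMultistepScheduler",
             width=1024, height=1024, steps=50):
    """Hypothetical helper: generate one image with the chosen scheduler."""
    if scheduler_name not in RECOMMENDED_GUIDANCE:
        raise ValueError(f"unsupported scheduler: {scheduler_name}")

    # Heavy imports are deferred so the settings table is usable without a GPU stack.
    import torch
    import diffusers

    pipe = diffusers.DiffusionPipeline.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # assumption: half precision to reduce VRAM use
        variant="fp16",
    ).to("cuda")

    # Select the scheduler class by name, reusing the pipeline's scheduler config.
    scheduler_cls = getattr(diffusers, scheduler_name)
    pipe.scheduler = scheduler_cls.from_config(pipe.scheduler.config)

    result = pipe(
        prompt=prompt,
        width=width,
        height=height,
        num_inference_steps=steps,  # assumption: 50 steps, not stated on this page
        guidance_scale=RECOMMENDED_GUIDANCE[scheduler_name],
    )
    return result.images[0]  # a PIL image
```

For example, `generate("a portrait photo of an astronaut", scheduler_name="EDMEulerScheduler").save("out.png")` would use the alternative scheduler with its recommended guidance_scale of 5.0. Non-square output can be requested through `width` and `height`, within the aspect ratio range the model supports.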