AudioLDM | Open Laboratory

AudioLDM

Model Family Report

Compatible Apps

Audio WebUI

gitmylo / Audio WebUI

Experiment with various cutting-edge audio generation models, such as Bark (Text-to-Speech), RVC (Voice Cloning), and MusicGen (Text-to-Music).

AudioLDM is a family of open-weights generative AI models that synthesize and manipulate audio through text-based prompts using latent diffusion techniques. The architecture employs CLAP embeddings to connect natural language with audio content, enabling text-to-audio generation, audio-to-audio transformation, super-resolution, and inpainting capabilities. Available in small and large variants, the models are trained on diverse datasets and support sound effects, music, and speech synthesis applications.
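
As a rough illustration of the text-to-audio capability described above, the sketch below drives an AudioLDM checkpoint through the Hugging Face diffusers library. This is an assumption: the page itself does not name a library or a checkpoint, so the use of diffusers, its AudioLDMPipeline class, and the "cvssp/audioldm-s-full-v2" model ID are choices made for this example. The helper names build_generation_kwargs and generate_wav are hypothetical. The heavy imports are deferred into generate_wav so the argument helper can be used without diffusers, torch, or scipy installed.

```python
"""Hedged sketch: text-to-audio with AudioLDM via Hugging Face diffusers.

Assumptions (not stated on the source page): diffusers is the runtime,
and "cvssp/audioldm-s-full-v2" is the checkpoint being loaded.
"""


def build_generation_kwargs(prompt, duration_s=5.0, steps=10, guidance_scale=2.5):
    """Hypothetical helper: validate inputs and collect pipeline call arguments."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if duration_s <= 0 or steps <= 0:
        raise ValueError("duration_s and steps must be positive")
    return {
        "prompt": prompt,
        "audio_length_in_s": float(duration_s),
        "num_inference_steps": int(steps),
        "guidance_scale": float(guidance_scale),
    }


def generate_wav(prompt, out_path="output.wav",
                 model_id="cvssp/audioldm-s-full-v2", **options):
    """Hypothetical helper: generate one clip from a text prompt and save it."""
    # Deferred imports: these pull in torch and download model weights on first use.
    import scipy.io.wavfile
    from diffusers import AudioLDMPipeline

    pipe = AudioLDMPipeline.from_pretrained(model_id)
    result = pipe(**build_generation_kwargs(prompt, **options))
    audio = result.audios[0]  # first generated waveform as a NumPy array

    # The AudioLDM pipeline in diffusers produces 16 kHz audio.
    scipy.io.wavfile.write(out_path, rate=16000, data=audio)
    return out_path
```

Calling generate_wav("rain falling on a tin roof", duration_s=4, steps=20) would, under these assumptions, download the checkpoint on first use and write a short clip to output.wav. More inference steps generally trade speed for quality.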