AudioLDM is a family of open-weight generative models that synthesize and manipulate audio from text prompts using latent diffusion. The architecture uses CLAP (Contrastive Language-Audio Pretraining) embeddings, which place text and audio in a shared embedding space, to connect natural language with audio content. This enables text-to-audio generation, audio-to-audio transformation, super-resolution, and inpainting. The models are available in small and large variants, are trained on diverse datasets, and support sound-effect, music, and speech synthesis.
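As a rough illustration of the text-to-audio use case described above, here is a minimal sketch. It is not taken from this page: it assumes the Hugging Face `diffusers` library's `AudioLDMPipeline`, the `cvssp/audioldm-s-full-v2` checkpoint, and 16 kHz output. The helper names `build_generation_kwargs` and `generate` are hypothetical and exist only for this example.

```python
"""Sketch: text-to-audio with an AudioLDM checkpoint (assumes diffusers, torch, scipy)."""

# Assumption: the diffusers AudioLDM pipeline returns 16 kHz waveforms.
SAMPLE_RATE_HZ = 16_000


def build_generation_kwargs(prompt, seconds=5.0, steps=50,
                            guidance_scale=2.5, negative_prompt=None):
    """Validate inputs and return keyword arguments for the pipeline call.

    Hypothetical helper: it only assembles a dict and needs no ML libraries.
    """
    if not prompt or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if seconds <= 0 or steps <= 0:
        raise ValueError("seconds and steps must be positive")
    kwargs = {
        "prompt": prompt.strip(),
        "audio_length_in_s": float(seconds),
        "num_inference_steps": int(steps),
        "guidance_scale": float(guidance_scale),
    }
    if negative_prompt:
        kwargs["negative_prompt"] = negative_prompt
    return kwargs


def generate(prompt, out_path="output.wav",
             model_id="cvssp/audioldm-s-full-v2", **options):
    """Generate audio from a text prompt and write it to a WAV file."""
    # Heavy imports live here so the helper above works without them installed.
    import torch
    from diffusers import AudioLDMPipeline
    from scipy.io import wavfile

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    # Downloads the checkpoint on first use (assumed model id, see lead-in).
    pipe = AudioLDMPipeline.from_pretrained(model_id, torch_dtype=dtype).to(device)

    # The pipeline output exposes generated waveforms as NumPy arrays in `.audios`.
    audio = pipe(**build_generation_kwargs(prompt, **options)).audios[0]
    wavfile.write(out_path, rate=SAMPLE_RATE_HZ, data=audio)
    return out_path
```

Under these assumptions, calling `generate("Rain falling on a tin roof", seconds=5.0, steps=25)` would download the checkpoint and write `output.wav`. Fewer inference steps run faster at some cost in quality, and a `negative_prompt` can be passed to steer generation away from unwanted characteristics.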