Compatible Apps
(1)
Audio WebUI
Generative Audio AI
Text-To-Speech • Speech-To-Text • Text-To-Music • Audio Translation
openai /
Whisper
Whisper is OpenAI's family of speech recognition models built on a transformer encoder-decoder architecture. The models range from 39M to 1.55B parameters and handle multilingual transcription, translation, and language identification robustly across diverse acoustic conditions.
Release: 2023-11-08
1 Model
Meta /
AudioCraft
AudioCraft is Meta's suite of AI models for audio and music generation, featuring MusicGen and MAGNeT. Built on EnCodec audio compression and T5 text encoding, these models generate high-quality 32 kHz audio using transformer and hybrid architectures.
Release: 2023-11-06
2 Models
suno /
Bark
Bark is a family of transformer-based text-to-audio models that use a three-stage architecture to generate high-quality speech, music, and sound effects from text, with multilingual support and natural nonverbal elements.
Release: 2023-04-28
1 Model