Deepgram Aura
Deepgram Aura text-to-speech with natural voices
Headers
Authorization
string · required
The Authorization header is used to authenticate with the API using your API key. The value has the format Bearer YOUR_KEY_HERE.
Request Body
text
string · required
Text to convert to speech
voice
string
Voice ID or name
language
string
Language code (e.g., en-US)
speed
number
Speech speed (0.5 to 2.0)
Responses
Audio response
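The header and body fields above can be assembled as in the following minimal sketch. The function name `build_tts_request` is hypothetical, and sending the body as JSON with a `Content-Type: application/json` header is an assumption, since the reference does not state the encoding. The 0.5 to 2.0 check mirrors the documented range for speed.

```python
import json


def build_tts_request(api_key, text, voice=None, language=None, speed=None):
    """Return (headers, body_bytes) for a text-to-speech request.

    Hypothetical helper: field names come from the reference above,
    but the JSON encoding is an assumption.
    """
    if not text:
        raise ValueError("text is required")
    if speed is not None and not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")

    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",  # assumption: JSON request body
    }

    # Only include optional fields the caller actually set.
    body = {"text": text}
    for name, value in (("voice", voice), ("language", language), ("speed", speed)):
        if value is not None:
            body[name] = value

    return headers, json.dumps(body).encode("utf-8")
```

Omitting unset optional fields leaves the choice of voice, language, and speed to the service, whose defaults for this model are not documented here.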
Cartesia Sonic
Cartesia Sonic ultra-realistic text-to-speech with 90 ms latency, voice cloning, and support for 15 languages
Headers
Authorization
string · required
The Authorization header is used to authenticate with the API using your API key. The value has the format Bearer YOUR_KEY_HERE.
Request Body
text
string · required
Text to convert to speech
voice
string
Voice ID or name
language
string
Language code (e.g., en-US)
speed
number
Speech speed (0.5 to 2.0)
Responses
Audio response
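As a rough sketch of the full round trip, the snippet below prepares a request with Python's standard `urllib.request` and saves the returned audio to disk. The reference gives no endpoint URL or HTTP method, so `TTS_URL` is a placeholder and `POST` with a JSON body is an assumption. `make_request` and `synthesize` are hypothetical names.

```python
import json
import urllib.request

# Placeholder only: the reference does not state the endpoint URL.
TTS_URL = "https://api.example.com/v1/tts/cartesia-sonic"


def make_request(api_key, text, voice=None, language=None, speed=None, url=TTS_URL):
    """Build (but do not send) a request carrying the documented fields."""
    fields = {"text": text, "voice": voice, "language": language, "speed": speed}
    body = {name: value for name, value in fields.items() if value is not None}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumption: JSON body
        },
        method="POST",  # assumption: the method is not documented
    )


def synthesize(request, out_path):
    """Send the request and write the audio response bytes to out_path."""
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())
```

Calling `synthesize(make_request("YOUR_KEY_HERE", "Hello"), "hello.audio")` would perform the network call; the file extension depends on the audio format the service returns, which is not specified for this model.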
ElevenLabs Multi-v2
Generate audio from text using ElevenLabs Multi-v2 voice model. Multilingual model supporting 29+ languages with high-quality natural voices.
Headers
Authorization
string · required
The Authorization header is used to authenticate with the API using your API key. The value has the format Bearer YOUR_KEY_HERE.
Request Body
text
string · required
The text to convert to speech
voice
string
Voice ID or name to use for synthesis
Default: Rachel
voice_id
string
Alternative parameter name for voice ID
language
string
Language code (en, es, fr, de, it, pt, pl, hi, ar, zh, ja, ko, etc.)
Default: en
language_code
string
Alternative parameter name for language code
stream
boolean
Whether to stream the audio response
Default: false
format
string
Audio format to return (mp3_44100_128, mp3_44100_192, pcm_16000, pcm_22050, pcm_24000, pcm_44100, ulaw_8000)
Default: mp3_44100_128
stability
number · min: 0 · max: 1
Voice stability (0.0-1.0)
Default: 0.5
similarity_boost
number · min: 0 · max: 1
Voice similarity boost (0.0-1.0)
Default: 0.75
style
number · min: 0 · max: 1
Style control (0.0-1.0)
Default: 0
speaking_rate
number
Speaking rate multiplier
Default: 1
text_normalization
string · enum
Text normalization mode
Enum values: auto, on, off
Default: auto
seed
integer
Random seed for reproducible audio generation
Responses
Audio response
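The documented defaults, ranges, and enum values for this model can be checked on the client before a request is sent. The sketch below is one way to do that. `build_multi_v2_body` is a hypothetical helper, the alias parameters voice_id and language_code are left out for brevity, and filling in defaults on the client is a design choice rather than something the API requires.

```python
# Defaults, formats, and enum values copied from the parameter list above.
MULTI_V2_DEFAULTS = {
    "voice": "Rachel",
    "language": "en",
    "stream": False,
    "format": "mp3_44100_128",
    "stability": 0.5,
    "similarity_boost": 0.75,
    "style": 0,
    "speaking_rate": 1,
    "text_normalization": "auto",
}
FORMATS = {
    "mp3_44100_128", "mp3_44100_192", "pcm_16000", "pcm_22050",
    "pcm_24000", "pcm_44100", "ulaw_8000",
}
NORMALIZATION_MODES = {"auto", "on", "off"}
UNIT_RANGE_FIELDS = ("stability", "similarity_boost", "style")


def build_multi_v2_body(text, **overrides):
    """Merge overrides into the documented defaults and validate the result."""
    if not text:
        raise ValueError("text is required")
    unknown = set(overrides) - set(MULTI_V2_DEFAULTS) - {"seed"}
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    body = {"text": text, **MULTI_V2_DEFAULTS, **overrides}

    for name in UNIT_RANGE_FIELDS:
        if not 0 <= body[name] <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    if body["format"] not in FORMATS:
        raise ValueError(f"unsupported format: {body['format']}")
    if body["text_normalization"] not in NORMALIZATION_MODES:
        raise ValueError("text_normalization must be auto, on, or off")
    return body
```

For example, `build_multi_v2_body("Hola", language="es", seed=42)` yields a body with the Spanish language code, a fixed seed, and every other field at its documented default.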
ElevenLabs Turbo v2.5
Generate audio from text using ElevenLabs Turbo v2.5 voice model. Ultra-fast TTS model with low latency, ideal for real-time applications.
Headers
Authorization
string · required
The Authorization header is used to authenticate with the API using your API key. The value has the format Bearer YOUR_KEY_HERE.
Request Body
text
string · required
The text to convert to speech
voice
string
Voice ID or name to use for synthesis
Default: Rachel
voice_id
string
Alternative parameter name for voice ID
language
string
Language code (en, es, fr, de, it, pt)
Default: en
stream
boolean
Whether to stream the audio response
Default: false
format
string
Audio format to return
Default: mp3_44100_128
stability
number · min: 0 · max: 1
Voice stability (0.0-1.0)
Default: 0.5
similarity_boost
number · min: 0 · max: 1
Voice similarity boost (0.0-1.0)
Default: 0.75
speaking_rate
number
Speaking rate multiplier
Default: 1
text_normalization
string · enum
Text normalization mode
Enum values: auto, on, off
Default: auto
Responses
Audio response
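For real-time use with stream set to true, a client would typically consume the audio incrementally instead of waiting for the whole body. The reference does not describe how streamed audio is delivered, so the sketch below assumes the response is a file-like object with a `read(size)` method, as `urllib.request.urlopen` returns. Both function names are hypothetical.

```python
def read_in_chunks(response, chunk_size=4096):
    """Yield successive chunks from a file-like response until it is exhausted."""
    while True:
        chunk = response.read(chunk_size)
        if not chunk:
            break
        yield chunk


def save_audio_stream(chunks, sink):
    """Write each chunk to sink as it arrives and return the total byte count."""
    total = 0
    for chunk in chunks:
        sink.write(chunk)
        total += len(chunk)
    return total
```

Here `sink` can be an open binary file or an audio player's input buffer; writing chunk by chunk lets playback begin before synthesis has finished.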
ElevenLabs TTV v3
Generate audio from text using ElevenLabs TTV v3 voice model. Optimized for synchronized text-to-video applications.
Headers
Authorization
string · required
The Authorization header is used to authenticate with the API using your API key. The value has the format Bearer YOUR_KEY_HERE.
Request Body
text
string · required
The text to convert to speech
voice
string
Voice ID or name to use for synthesis
Default: Rachel
voice_id
string
Alternative parameter name for voice ID
language
string
Language code (en, es, fr, de, it, pt)
Default: en
stream
boolean
Whether to stream the audio response
Default: false
format
string
Audio format to return
Default: mp3_44100_128
stability
number · min: 0 · max: 1
Voice stability (0.0-1.0)
Default: 0.5
similarity_boost
number · min: 0 · max: 1
Voice similarity boost (0.0-1.0)
Default: 0.75
speaking_rate
number
Speaking rate multiplier
Default: 1
text_normalization
string · enum
Text normalization mode
Enum values: auto, on, off
Default: auto
seed
integer
Random seed for reproducible audio generation
Responses
Audio response
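When audio has to be aligned with video, the client usually needs to know the sample rate of what it receives. The default format value mp3_44100_128 appears to encode a codec, a sample rate in Hz, and an optional bitrate in kbps. That reading is an assumption based on the identifier's shape, and `parse_format` below is a hypothetical helper built on it.

```python
from typing import NamedTuple, Optional


class AudioFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]  # None when the identifier has no bitrate part


def parse_format(value):
    """Split an identifier such as 'mp3_44100_128' into its assumed parts."""
    parts = value.split("_")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected format identifier: {value!r}")
    codec = parts[0]
    sample_rate_hz = int(parts[1])  # raises ValueError if not numeric
    bitrate_kbps = int(parts[2]) if len(parts) == 3 else None
    return AudioFormat(codec, sample_rate_hz, bitrate_kbps)
```

With the parsed sample rate, a caller can compute an expected clip duration for raw PCM output (bytes divided by bytes per sample and by the sample rate) before muxing it with video.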
ElevenLabs Flash v2.5
Generate audio from text using ElevenLabs Flash v2.5 voice model. Fast TTS model with good balance between speed and quality.
Headers
Authorization
string · required
The Authorization header is used to authenticate with the API using your API key. The value has the format Bearer YOUR_KEY_HERE.
Request Body
text
string · required
The text to convert to speech
voice
string
Voice ID or name to use for synthesis
Default: Rachel
voice_id
string
Alternative parameter name for voice ID
language
string
Language code (en, es, fr, de, it, pt, pl)
Default: en
stream
boolean
Whether to stream the audio response
Default: false
format
string
Audio format to return
Default: mp3_44100_128
stability
number · min: 0 · max: 1
Voice stability (0.0-1.0)
Default: 0.5
similarity_boost
number · min: 0 · max: 1
Voice similarity boost (0.0-1.0)
Default: 0.75
speaking_rate
number
Speaking rate multiplier
Default: 1
text_normalization
string · enum
Text normalization mode
Enum values: auto, on, off
Default: auto
Responses
Audio response
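Because the language lists differ between models, it can help to normalize and check the language code on the client before calling this one. The sketch below treats the seven codes listed for Flash v2.5 as the complete set and reduces region-tagged codes such as pt-BR to their base language. Both behaviours are assumptions, and `normalize_language` is a hypothetical helper.

```python
# Codes listed for Flash v2.5 above; treating the list as exhaustive is an assumption.
FLASH_V2_5_LANGUAGES = ("en", "es", "fr", "de", "it", "pt", "pl")
DEFAULT_LANGUAGE = "en"  # documented default


def normalize_language(code=None):
    """Return a lower-case base language code accepted by this sketch."""
    if code is None:
        return DEFAULT_LANGUAGE
    # "pt-BR" and "pt_BR" both reduce to "pt".
    base = code.strip().lower().replace("_", "-").split("-")[0]
    if base not in FLASH_V2_5_LANGUAGES:
        raise ValueError(f"language {code!r} is not listed for Flash v2.5")
    return base
```

Failing early on an unlisted code avoids spending a request on input the model may reject or handle poorly.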