SLNG.AI API

Documentation

Endpoint: https://api.slng.ai

Text to Speech [TTS]

POST https://api.slng.ai/v1/tts

Generate audio from text using the selected voice model.

Text to Speech [TTS] › Headers

Authorization string · required
The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.

Text to Speech [TTS] › Request Body

text string · required

model string

voice string

Text to Speech [TTS] › Responses

string · binary
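
A minimal sketch of preparing a call to this endpoint with Python's standard library. The reference does not state how the body is encoded, so the JSON body and the Content-Type header are assumptions; build_tts_request is a hypothetical helper name, and the request is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.slng.ai"


def build_tts_request(api_key, text, model=None, voice=None):
    """Build (but do not send) a POST request for /v1/tts."""
    payload = {"text": text}  # the only required body field
    if model is not None:
        payload["model"] = model
    if voice is not None:
        payload["voice"] = voice
    return urllib.request.Request(
        BASE_URL + "/v1/tts",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the API accepts a JSON-encoded body.
            "Content-Type": "application/json",
        },
        method="POST",
    )


# To send the request and keep the binary audio response:
#   with urllib.request.urlopen(build_tts_request(key, "Hello")) as resp:
#       audio = resp.read()
```

Optional fields are left out of the body unless set, so the server-side defaults apply.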

VUI [TTS] (Default/USA)

POST https://api.slng.ai/v1/tts/vui

Generate audio from text using the VUI voice model (default region: USA).

VUI [TTS] (Default/USA) › Headers

Authorization string · required
The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.

VUI [TTS] (Default/USA) › Request Body

text string · required
The text to convert to speech

voice string
Voice to use for synthesis (optional)
Default: default

stream boolean
Whether to stream the audio response
Default: false

async boolean
Whether to use async prediction (returns prediction_id)
Default: false

VUI [TTS] (Default/USA) › Responses

string · binary
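
When stream is true, the audio presumably arrives incrementally rather than as one complete body. The sketch below copies a response body to a sink in fixed-size chunks; it works with any file-like object, such as the response returned by urllib.request.urlopen. The helper name copy_stream and the 4096-byte chunk size are illustrative choices, and chunked delivery itself is an assumption, since the reference does not describe the streaming wire format.

```python
import io


def copy_stream(source, sink, chunk_size=4096):
    """Copy bytes from source to sink in chunks; return the total copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read signals the end of the stream
            break
        sink.write(chunk)
        total += len(chunk)
    return total


# Stand-in for a streamed HTTP response body.
fake_response = io.BytesIO(b"\x00" * 10000)
out = io.BytesIO()
copied = copy_stream(fake_response, out)
```

Writing each chunk as it arrives lets playback or further processing begin before synthesis has finished.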

Orpheus [TTS]

POST https://api.slng.ai/v1/tts/orpheus

Generate audio from text using the Orpheus voice model via the Baseten API. Optimized with TRT-LLM on H100 MIG 40GB hardware; generates ~83 tokens/second for real-time streaming. Audio format: 24 kHz, 16-bit, mono WAV.

Orpheus [TTS] › Headers

Authorization string · required
The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.

Orpheus [TTS] › Request Body

prompt string · required
The text to convert to speech

voice string
Voice to use. English: 'tara', 'leah', 'jess', 'leo', 'dan', 'mia', 'zac', 'zoe'; French: 'pierre', 'amelie', 'marie'; German: 'jana', 'thomas', 'max'; etc.
Example: tara
Default: tara

max_tokens number
Maximum tokens to generate
Default: 2000

stream boolean
Whether to stream the response
Default: false

async boolean
Whether to run asynchronously
Default: false

output_language string
Language code. English: 'en' (high quality); French: 'fr' (high quality); German: 'de' (high quality); Korean: 'ko' (high quality); Mandarin: 'zh' (high quality); Spanish: 'es' (medium); Italian: 'it' (medium); Hindi: 'hi' (medium)
Example: en

output_style string
Style of speech (e.g., 'cheerful', 'serious', 'excited')

Orpheus [TTS] › Responses

string · binary
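
Two details of this endpoint are easy to miss: the input field is named prompt rather than text, and the output is documented as 24 kHz, 16-bit, mono WAV. The sketch below builds a request body and checks a complete (non-streamed) WAV payload against that format using Python's wave module. The helper names are hypothetical, and a streamed response may carry a header that does not describe the full audio, so the check is only meant for complete files.

```python
import io
import wave


def build_orpheus_payload(prompt, voice="tara", max_tokens=2000):
    """Request body for /v1/tts/orpheus; the text goes in 'prompt'."""
    return {"prompt": prompt, "voice": voice, "max_tokens": max_tokens}


def describe_wav(audio_bytes):
    """Return sample rate, bit depth, channels and duration of a WAV payload."""
    with wave.open(io.BytesIO(audio_bytes), "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "bits": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
            "seconds": wav.getnframes() / rate,
        }


def matches_documented_format(info):
    """True if the audio is 24 kHz, 16-bit, mono, as documented above."""
    return (info["sample_rate"], info["bits"], info["channels"]) == (24000, 16, 1)
```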

Koroko [TTS]

POST https://api.slng.ai/v1/tts/koroko

Generate audio from text using Koroko, a frontier TTS model with just 82 million parameters that offers efficient, high-quality speech synthesis. Audio format: 16-bit WAV.

Koroko [TTS] › Headers

Authorization string · required
The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.

Koroko [TTS] › Request Body

text string · required
The text to convert to speech

voice string
Voice to use (if supported by the model)

stream boolean
Whether to stream the response
Default: false

async boolean
Whether to run asynchronously
Default: false

Koroko [TTS] › Responses

string · binary

XTTS-V2 [TTS]

POST https://api.slng.ai/v1/tts/xtts-v2

Generate audio from text using the XTTS-V2 voice model, which supports voice cloning in multiple languages. XTTS-V2 is a state-of-the-art text-to-speech model by Coqui. Audio format: WAV (16-bit, 24 kHz).

XTTS-V2 [TTS] › Headers

Authorization string · required
The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.

XTTS-V2 [TTS] › Request Body

text string · required
The text to convert to speech

speaker_voice string · required
Base64 encoded audio file for voice cloning (6+ seconds recommended)

language string
Target language code. English: 'en', Spanish: 'es', French: 'fr', German: 'de', Italian: 'it', Portuguese: 'pt', Polish: 'pl', Turkish: 'tr', Russian: 'ru', Dutch: 'nl', Czech: 'cs', Arabic: 'ar', Chinese: 'zh', Japanese: 'ja', Korean: 'ko', Hungarian: 'hu', Hindi: 'hi'
Example: en
Default: en

stream boolean
Whether to stream the response
Default: false

async boolean
Whether to run asynchronously
Default: false

XTTS-V2 [TTS] › Responses

string · binary
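
Because speaker_voice must be Base64 text rather than raw bytes, the reference clip has to be encoded before it goes into the body. A minimal sketch, assuming standard (not URL-safe) Base64 with no data-URI prefix; build_xtts_payload is a hypothetical helper, and the language set is copied from the list above.

```python
import base64

# Language codes listed for this endpoint.
XTTS_LANGUAGES = {
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh", "ja", "ko", "hu", "hi",
}


def build_xtts_payload(text, reference_audio, language="en"):
    """Build the request body, Base64-encoding the raw reference audio bytes."""
    if language not in XTTS_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    return {
        "text": text,
        # Assumption: plain standard Base64, decoded to an ASCII string.
        "speaker_voice": base64.b64encode(reference_audio).decode("ascii"),
        "language": language,
    }


# Typical use, reading the reference clip from disk:
#   with open("reference.wav", "rb") as f:
#       payload = build_xtts_payload("Hello", f.read())
```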

MARS6 [TTS]

POST https://api.slng.ai/v1/tts/mars6

Generate audio from text using the MARS6 voice model, which supports voice and prosody cloning in 10 languages. MARS6 is a frontier text-to-speech model by CAMB.AI. Audio format: AAC (ADTS stream) or FLAC, depending on the stream_format parameter.

MARS6 [TTS] › Headers

Authorization string · required
The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.

MARS6 [TTS] › Request Body

text string · required
The text to convert to speech

audio_ref string · required
Base64 encoded audio file for voice cloning (6-90 seconds recommended)

language string · required
Target language code. English: 'en-us', French: 'fr-fr', German: 'de-de', Spanish: 'es-es', Italian: 'it-it', Portuguese: 'pt-pt', Chinese: 'zh-cn', Japanese: 'ja-jp', Korean: 'ko-kr', Dutch: 'nl-nl'
Example: en-us

ref_text string
Text transcript of the reference audio (optional but recommended)

stream boolean
Whether to stream the response
Default: true

stream_format string · enum
Format for streaming: 'adts' for AAC or 'flac' for FLAC
Enum values: adts, flac
Default: adts

temperature number
Temperature for generation
Default: 0.7

top_p number
Top-p for generation
Default: 0.7

chunk_length number
Text chunk length for splitting long input
Default: 200

max_new_tokens number
Limit on max tokens (0 = unlimited)
Default: 0

repetition_penalty number
Repetition penalty for generation
Default: 1.5

async boolean
Whether to run asynchronously
Default: false

MARS6 [TTS] › Responses

string · binary
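
Since the container of the returned audio depends on stream_format, a client has to keep the request and the saved file in agreement. The sketch below validates the enum and derives a file extension from it. The helper names are hypothetical, and the .aac extension for ADTS streams is a common convention rather than something this reference specifies.

```python
# Maps each documented stream_format value to a conventional file extension.
STREAM_FORMAT_EXTENSIONS = {"adts": ".aac", "flac": ".flac"}


def build_mars6_payload(text, audio_ref_b64, language,
                        stream_format="adts", ref_text=None):
    """Build the request body from the three required fields plus options."""
    if stream_format not in STREAM_FORMAT_EXTENSIONS:
        raise ValueError(f"stream_format must be 'adts' or 'flac', got {stream_format!r}")
    payload = {
        "text": text,
        "audio_ref": audio_ref_b64,  # already Base64-encoded reference audio
        "language": language,        # e.g. "en-us"
        "stream_format": stream_format,
    }
    if ref_text is not None:
        payload["ref_text"] = ref_text  # optional but recommended transcript
    return payload


def output_filename(stem, stream_format="adts"):
    """Pick a filename whose extension matches the requested format."""
    return stem + STREAM_FORMAT_EXTENSIONS[stream_format]
```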

ElevenLabs Multi-v2 [TTS]

POST https://api.slng.ai/v1/tts/elevenlabs/multi-v2

Generate audio from text using the ElevenLabs Multi-v2 voice model, a multilingual model supporting 29+ languages with high-quality natural voices.

ElevenLabs Multi-v2 [TTS] › Headers

Authorization string · required
The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.

ElevenLabs Multi-v2 [TTS] › Request Body

text string · required
The text to convert to speech

voice string
Voice ID or name to use for synthesis
Default: Rachel

voice_id string
Alternative parameter name for voice ID

language string
Language code (en, es, fr, de, it, pt, pl, hi, ar, zh, ja, ko, etc.)
Default: en

language_code string
Alternative parameter name for language code

stream boolean
Whether to stream the audio response
Default: false

format string
Audio format to return (mp3_44100_128, mp3_44100_192, pcm_16000, pcm_22050, pcm_24000, pcm_44100, ulaw_8000)
Default: mp3_44100_128

stability number · max: 1
Voice stability (0.0-1.0)
Default: 0.5

similarity_boost number · max: 1
Voice similarity boost (0.0-1.0)
Default: 0.75

style number · max: 1
Style control (0.0-1.0)
Default: 0

speaking_rate number
Speaking rate multiplier
Default: 1

text_normalization string · enum
Text normalization mode
Enum values: auto, on, off
Default: auto

seed integer
Random seed for reproducible audio generation

ElevenLabs Multi-v2 [TTS] › Responses

string · binary
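
Several of these fields are bounded (0.0 to 1.0) or restricted to a fixed set of values, so checking them client-side avoids a wasted round trip. A minimal sketch, assuming the documented ranges are enforced inclusively; build_multi_v2_payload is a hypothetical helper, and how the server treats out-of-range values is not stated in this reference.

```python
# Output formats listed for this endpoint.
FORMATS = {
    "mp3_44100_128", "mp3_44100_192", "pcm_16000",
    "pcm_22050", "pcm_24000", "pcm_44100", "ulaw_8000",
}
# Fields documented with a 0.0-1.0 range.
UNIT_RANGE_FIELDS = ("stability", "similarity_boost", "style")


def build_multi_v2_payload(text, voice="Rachel", fmt="mp3_44100_128", **settings):
    """Build the request body, rejecting values outside the documented limits."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    for name in UNIT_RANGE_FIELDS:
        value = settings.get(name)
        if value is not None and not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    # Remaining settings (seed, speaking_rate, ...) pass through unchanged.
    return {"text": text, "voice": voice, "format": fmt, **settings}
```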

ElevenLabs Turbo v2.5 [TTS]

POST https://api.slng.ai/v1/tts/elevenlabs/turbo-v2-5

Generate audio from text using the ElevenLabs Turbo v2.5 voice model, an ultra-fast, low-latency TTS model suited to real-time applications.

ElevenLabs Turbo v2.5 [TTS] › Headers

Authorization string · required
The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.

ElevenLabs Turbo v2.5 [TTS] › Request Body

text string · required
The text to convert to speech

voice string
Voice ID or name to use for synthesis
Default: Rachel

voice_id string
Alternative parameter name for voice ID

language string
Language code (en, es, fr, de, it, pt)
Default: en

stream boolean
Whether to stream the audio response
Default: false

format string
Audio format to return
Default: mp3_44100_128

stability number · max: 1
Voice stability (0.0-1.0)
Default: 0.5

similarity_boost number · max: 1
Voice similarity boost (0.0-1.0)
Default: 0.75

speaking_rate number
Speaking rate multiplier
Default: 1

text_normalization string · enum
Text normalization mode
Enum values: auto, on, off
Default: auto

ElevenLabs Turbo v2.5 [TTS] › Responses

string · binary

ElevenLabs v3 [TTS]

POST https://api.slng.ai/v1/tts/elevenlabs/v3

Generate audio from text using the ElevenLabs v3 voice model, the latest-generation model with high-quality natural voices.

ElevenLabs v3 [TTS] › Headers

Authorization string · required
The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.

ElevenLabs v3 [TTS] › Request Body

text string · required
The text to convert to speech

voice string
Voice ID or name to use for synthesis
Default: Rachel

voice_id string
Alternative parameter name for voice ID

language string
Language code (en, es, fr, de, it, pt, pl, hi, etc.)
Default: en

stream boolean
Whether to stream the audio response
Default: false

format string
Audio format to return
Default: mp3_44100_128

stability number · max: 1
Voice stability (0.0-1.0)
Default: 0.5

similarity_boost number · max: 1
Voice similarity boost (0.0-1.0)
Default: 0.75

speaking_rate number
Speaking rate multiplier
Default: 1

text_normalization string · enum
Text normalization mode
Enum values: auto, on, off
Default: auto

seed integer
Random seed for reproducible audio generation

ElevenLabs v3 [TTS] › Responses

string · binary

ElevenLabs TTV v3 [TTS]

POST https://api.slng.ai/v1/tts/elevenlabs/ttv-v3

Generate audio from text using the ElevenLabs TTV v3 voice model, optimized for synchronized text-to-video applications.

ElevenLabs TTV v3 [TTS] › Headers

Authorization string · required
The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.

ElevenLabs TTV v3 [TTS] › Request Body

text string · required
The text to convert to speech

voice string
Voice ID or name to use for synthesis
Default: Rachel

voice_id string
Alternative parameter name for voice ID

language string
Language code (en, es, fr, de, it, pt)
Default: en

stream boolean
Whether to stream the audio response
Default: false

format string
Audio format to return
Default: mp3_44100_128

stability number · max: 1
Voice stability (0.0-1.0)
Default: 0.5

similarity_boost number · max: 1
Voice similarity boost (0.0-1.0)
Default: 0.75

speaking_rate number
Speaking rate multiplier
Default: 1

text_normalization string · enum
Text normalization mode
Enum values: auto, on, off
Default: auto

seed integer
Random seed for reproducible audio generation

ElevenLabs TTV v3 [TTS] › Responses

string · binary

ElevenLabs Flash v2.5 [TTS]

POST https://api.slng.ai/v1/tts/elevenlabs/flash-v2-5

Generate audio from text using the ElevenLabs Flash v2.5 voice model, a fast TTS model that balances speed and quality.

ElevenLabs Flash v2.5 [TTS] › Headers

Authorization string · required
The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.

ElevenLabs Flash v2.5 [TTS] › Request Body

text string · required
The text to convert to speech

voice string
Voice ID or name to use for synthesis
Default: Rachel

voice_id string
Alternative parameter name for voice ID

language string
Language code (en, es, fr, de, it, pt, pl)
Default: en

stream boolean
Whether to stream the audio response
Default: false

format string
Audio format to return
Default: mp3_44100_128

stability number · max: 1
Voice stability (0.0-1.0)
Default: 0.5

similarity_boost number · max: 1
Voice similarity boost (0.0-1.0)
Default: 0.75

speaking_rate number
Speaking rate multiplier
Default: 1

text_normalization string · enum
Text normalization mode
Enum values: auto, on, off
Default: auto

ElevenLabs Flash v2.5 [TTS] › Responses

string · binary

VUI [TTS] (India)

POST https://api.slng.ai/v1/in/tts/vui

Generate audio from text using the VUI voice model (region: India).

VUI [TTS] (India) › Headers

Authorization string · required
The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.

VUI [TTS] (India) › Responses

Rate Limiting Response

type string · required
A URI reference that identifies the problem.

title string · required
A short, human-readable summary of the problem.

status number · required
The HTTP status code.

instance string
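
The fields of the rate limiting response resemble an HTTP problem details object. A minimal sketch of parsing one, assuming the body is JSON; the Problem class, the parse_problem helper, and the sample values (including the 429 status) are illustrative and not taken from this reference.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Problem:
    type: str      # URI reference identifying the problem
    title: str     # short human-readable summary
    status: int    # HTTP status code
    instance: Optional[str] = None  # optional field


def parse_problem(body):
    """Parse a JSON error body into a Problem; raises KeyError if a required field is missing."""
    data = json.loads(body)
    return Problem(
        type=data["type"],
        title=data["title"],
        status=int(data["status"]),
        instance=data.get("instance"),
    )


# Hypothetical example body; the values the API actually returns may differ.
sample = b'{"type": "about:blank", "title": "Too Many Requests", "status": 429}'
problem = parse_problem(sample)
```

A client would typically inspect status and back off before retrying when it indicates rate limiting.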

VUI [TTS] (EU)

POST https://api.slng.ai/v1/eu/tts/vui

Generate audio from text using the VUI voice model (region: EU).

VUI [TTS] (EU) › Headers

Authorization string · required
The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.

Twi SpeechT5 [TTS]

POST https://api.slng.ai/v1/tts/twi-speecht5

Synthesize Twi speech from text using a specified speaker embedding via the Modal API.

Twi SpeechT5 [TTS] › Headers

Authorization string · required
The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.

Twi SpeechT5 [TTS] › Request Body

text string · required
The text to synthesize into Twi speech.

speaker_embedding number[] · minItems: 512 · maxItems: 512 · required
A 512-dimensional speaker embedding vector representing the target voice.

Twi SpeechT5 [TTS] › Responses

Synthesized audio waveform (array of floats)

audio number[]
Raw waveform samples (float array, 16 kHz)
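
Unlike the other endpoints, this one returns raw float samples rather than an encoded audio file, so the client has to do the encoding. The sketch below checks the embedding length before sending and converts a returned waveform to a 16-bit mono WAV at 16 kHz. It assumes the samples lie in [-1.0, 1.0], which the reference does not state; the helper names are hypothetical.

```python
import io
import struct
import wave

EMBEDDING_SIZE = 512   # documented minItems/maxItems
SAMPLE_RATE = 16000    # documented sample rate of the returned waveform


def build_twi_payload(text, speaker_embedding):
    """Build the request body, enforcing the 512-element embedding length."""
    if len(speaker_embedding) != EMBEDDING_SIZE:
        raise ValueError(f"speaker_embedding needs {EMBEDDING_SIZE} values, "
                         f"got {len(speaker_embedding)}")
    return {"text": text, "speaker_embedding": [float(x) for x in speaker_embedding]}


def waveform_to_wav(samples):
    """Encode float samples as a 16-bit mono WAV at 16 kHz.

    Assumption: samples are in [-1.0, 1.0]; values outside are clipped.
    """
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    pcm = struct.pack(f"<{len(clipped)}h",
                      *(int(round(s * 32767)) for s in clipped))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buf.getvalue()
```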

MCP Server

POST https://api.slng.ai/mcp