POST /v1/tts/elevenlabs/eleven-multilingual:2
curl --request POST \
  --url https://api.slng.ai/v1/tts/elevenlabs/eleven-multilingual:2 \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "voice": "pNInz6obpgDQGcFmaJgB",
  "text": "Hello, this is a test of text to speech synthesis."
}
'
{
  "audio": "<string>",
  "request_id": "<string>",
  "voice_id": "<string>",
  "model_id": "<string>",
  "word_count": 123,
  "character_count": 123,
  "duration": 123,
  "language": "<string>"
}
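
The same request can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the function names are hypothetical, the URL is assumed to be the documented path on `api.slng.ai` with the default HTTPS port, and the response is assumed to be a JSON object as in the example above.

```python
import json
import urllib.request

# Assumed endpoint: the documented path on api.slng.ai, default HTTPS port.
API_URL = "https://api.slng.ai/v1/tts/elevenlabs/eleven-multilingual:2"


def build_tts_request(token: str, text: str, voice: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    if not text:
        # The body schema requires text with a minimum length of 1.
        raise ValueError("text must contain at least one character")
    body = json.dumps({"voice": voice, "text": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def synthesize(token: str, text: str, voice: str) -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_tts_request(token, text, voice)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Separating `build_tts_request` from `synthesize` lets the request be inspected or logged without making a network call.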

Authorizations

Authorization
string
header
required

API key issued by SLNG. Pass as Authorization: Bearer <token>.

Headers

X-Region-Override
enum<string>

Target region override. Auto-selected if not provided.

Available options:
eu-west-1
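
As a sketch of how the optional header might be attached, the helper below adds `X-Region-Override` only when a region is given and rejects values outside the options listed above. The name `build_headers` and the client-side check are illustrative assumptions; the service may validate the header differently.

```python
from typing import Dict, Optional

# Regions listed under "Available options" for X-Region-Override.
ALLOWED_REGIONS = {"eu-west-1"}


def build_headers(token: str, region: Optional[str] = None) -> Dict[str, str]:
    """Return request headers, adding X-Region-Override only when a region is given."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    if region is not None:
        if region not in ALLOWED_REGIONS:
            raise ValueError(f"unsupported region: {region!r}")
        headers["X-Region-Override"] = region
    return headers
```

Omitting the `region` argument leaves the header out, so the region is auto-selected as described above.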

Body

application/json
text
string
required

Text to synthesize.

Minimum string length: 1
voice
string

Voice identifier.

language
string

ISO 639-1 language code.

prompt
string

Alternative to text for prompt-based synthesis.

voice_id
enum<string>
default:EXAVITQu4vr4xnSDxMaL

Voice ID (alias for voice). Premade or custom voice ID.

Available options:
bIHbv24MWmeRgasZH58o,
cgSgspJ2msm6clMCkdW9,
cjVigY5qzO86Huf0OWal,
CwhRBWXzGAHq8TQ4Fs17,
EXAVITQu4vr4xnSDxMaL,
FGY2WhTYpPnrIDTdsKH5,
hpp4J3VqNfWAUOO0d1Us,
IKne3meq5aSn9XLyUdCD,
iP95p4xoKVk53GoZ742B,
JBFqnCBsd6RMkjVDRZzb,
N2lVS1w4EtoT3dr4eOWO,
nPczCjzI2devNBz1zQrb,
onwK4e9ZLuTAKqWW03F9,
pFZP5JQG7iQjIQuC4Bku,
pNInz6obpgDQGcFmaJgB,
pqHfZKP75CvOlQylNhV4,
SAz9YHcvj6GT2YYXdXww,
SOYHLrjzK2X1ezoPC6cr,
TX3LPaxmHKxFdv7VOQHJ,
Xb7hH8MSUJpSbSDYk0k2,
XrExE9yKIg1WjnnlVkGX
model_id
string
default:eleven_multilingual_v2

Model identifier. Use eleven_multilingual_v2, eleven_flash_v2, etc.

stream
boolean
default:false

Enable streaming response.

output_format
enum<string>
default:mp3_44100_128

Output format as codec_samplerate_bitrate. Some formats, such as pcm_16000, have no bitrate part. Higher-quality formats may require a premium tier.

Available options:
alaw_8000,
ulaw_8000,
mp3_22050_32,
mp3_24000_48,
mp3_44100_32,
mp3_44100_64,
mp3_44100_96,
mp3_44100_128,
mp3_44100_192,
opus_48000_32,
opus_48000_64,
opus_48000_96,
opus_48000_128,
opus_48000_192,
pcm_8000,
pcm_16000,
pcm_22050,
pcm_24000,
pcm_32000,
pcm_44100,
pcm_48000,
wav_8000,
wav_16000,
wav_22050,
wav_24000,
wav_32000,
wav_44100,
wav_48000
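
To illustrate the naming scheme, the sketch below splits a format identifier into its parts. `parse_output_format` and `OutputFormat` are hypothetical names. The units are not stated on this page, so the sample rate is assumed to be in Hz and the bitrate is kept as a bare number.

```python
from typing import NamedTuple, Optional


class OutputFormat(NamedTuple):
    codec: str
    sample_rate: int        # assumed to be in Hz
    bitrate: Optional[int]  # None when the identifier has no bitrate part


def parse_output_format(value: str) -> OutputFormat:
    """Split an identifier such as 'mp3_44100_128' or 'pcm_16000' into its parts."""
    parts = value.split("_")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected output format: {value!r}")
    codec = parts[0]
    sample_rate = int(parts[1])
    bitrate = int(parts[2]) if len(parts) == 3 else None
    return OutputFormat(codec, sample_rate, bitrate)
```

A client could use the parsed sample rate to configure audio playback, for example when handling headerless `pcm_*` output.
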
format
enum<string>
default:mp3_44100_128

Deprecated alias for output_format. Use output_format instead.

Available options:
alaw_8000,
ulaw_8000,
mp3_22050_32,
mp3_24000_48,
mp3_44100_32,
mp3_44100_64,
mp3_44100_96,
mp3_44100_128,
mp3_44100_192,
opus_48000_32,
opus_48000_64,
opus_48000_96,
opus_48000_128,
opus_48000_192,
pcm_8000,
pcm_16000,
pcm_22050,
pcm_24000,
pcm_32000,
pcm_44100,
pcm_48000,
wav_8000,
wav_16000,
wav_22050,
wav_24000,
wav_32000,
wav_44100,
wav_48000
encoding
enum<string>
default:mp3

SLNG normalized encoding. Maps to output_format.

Available options:
mp3,
pcm,
linear16,
ulaw
language_code
enum<string>
default:en

ISO 639-1 language code. Enforces language for model and text normalization.

Available options:
ar,
cs,
de,
en,
es,
fil,
fr,
hi,
it,
ja,
nl,
pl,
pt,
ro,
sk,
sv,
tr,
zh
voice_settings
object

Voice settings to override stored defaults.

stability
number
default:0.5

Voice stability (0.0-1.0). Shorthand for voice_settings.stability.

Required range: 0 <= x <= 1
similarity_boost
number
default:0.75

Voice similarity (0.0-1.0). Shorthand for voice_settings.similarity_boost.

Required range: 0 <= x <= 1
style
number
default:0

Voice style (0.0-1.0). Shorthand for voice_settings.style.

Required range: 0 <= x <= 1
use_speaker_boost
boolean
default:true

Enable speaker boost. Shorthand for voice_settings.use_speaker_boost.

speed
number
default:1

Speaking rate (0.25-4.0). Shorthand for voice_settings.speed.

Required range: 0.25 <= x <= 4
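
The shorthand fields and their ranges can be checked on the client before a request is sent. The sketch below is one way to do that; `with_voice_settings` is a hypothetical helper, and it sets only the top-level shorthand fields rather than the nested voice_settings object.

```python
from typing import Any, Dict

# Documented inclusive ranges for the numeric shorthand fields.
RANGES = {
    "stability": (0.0, 1.0),
    "similarity_boost": (0.0, 1.0),
    "style": (0.0, 1.0),
    "speed": (0.25, 4.0),
}


def with_voice_settings(payload: Dict[str, Any], **settings: Any) -> Dict[str, Any]:
    """Return a copy of payload with validated top-level shorthand settings."""
    result = dict(payload)
    for name, value in settings.items():
        if name == "use_speaker_boost":
            if not isinstance(value, bool):
                raise TypeError("use_speaker_boost must be a boolean")
        elif name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}, got {value}")
        else:
            raise KeyError(f"unknown voice setting: {name}")
        result[name] = value
    return result
```

For example, `with_voice_settings({"text": "Hi"}, stability=0.3, speed=1.2)` returns a new payload and leaves the original dictionary unchanged.
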
seed
integer

Random seed for deterministic generation (0-4294967295).

Required range: 0 <= x <= 4294967295
optimize_streaming_latency
integer

Latency optimization level:
0 - default (no optimization)
1 - normal (~50% improvement)
2 - strong (~75% improvement)
3 - max optimization
4 - max + text normalizer off (best latency, may mispronounce)

Required range: 0 <= x <= 4
previous_text
string

Text preceding current request for improved continuity.

next_text
string

Text following current request for improved continuity.

previous_request_ids
string[]

Request IDs of previous generations for continuity. Max 3.

Maximum array length: 3
next_request_ids
string[]

Request IDs of following generations for continuity. Max 3.

Maximum array length: 3
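
When a long text is split into chunks, the continuity fields can be filled in per chunk. The sketch below builds the body for one chunk; `chunk_payload` is a hypothetical helper. It assumes that request IDs are passed oldest first and that text context and request IDs may be sent together, neither of which is confirmed on this page.

```python
from typing import Any, Dict, Sequence

MAX_STITCH_IDS = 3  # documented maximum array length


def chunk_payload(
    chunks: Sequence[str], index: int, earlier_request_ids: Sequence[str]
) -> Dict[str, Any]:
    """Build the request body for chunks[index] with continuity hints."""
    payload: Dict[str, Any] = {"text": chunks[index]}
    if index > 0:
        payload["previous_text"] = chunks[index - 1]
    if index + 1 < len(chunks):
        payload["next_text"] = chunks[index + 1]
    # Keep only the most recent IDs to respect the documented limit of 3.
    recent = list(earlier_request_ids)[-MAX_STITCH_IDS:]
    if recent:
        payload["previous_request_ids"] = recent
    return payload
```

After each call, the caller would append the `request_id` from the response to `earlier_request_ids` before building the next chunk.
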
apply_text_normalization
enum<string>
default:auto

Text normalization: auto (system decides), on (always), off (never).

Available options:
auto,
on,
off
apply_language_text_normalization
boolean
default:false

Enables language-specific text normalization. Adds significant latency. Currently supported for Japanese only.

enable_logging
boolean
default:true

Enable request logging. Set to false for zero retention mode (enterprise only).

Response

Synthesis successful.

audio
file
required

Audio output.

request_id
string

Unique request identifier. Use for request stitching with previous_request_ids and next_request_ids.

voice_id
string

Voice ID used.

model_id
string

Model ID used.

word_count
integer

Word count in generated audio.

character_count
integer

Character count processed.

duration
number

Audio duration in seconds.

language
string

Language code used.
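
A client might collect the response metadata into a typed structure. The sketch below does this for a decoded JSON body; `SynthesisMetadata` and `parse_metadata` are hypothetical names. The audio field is only checked for presence, because its encoding is not specified on this page.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class SynthesisMetadata:
    request_id: Optional[str]
    voice_id: Optional[str]
    model_id: Optional[str]
    word_count: Optional[int]
    character_count: Optional[int]
    duration: Optional[float]  # seconds, per the field description
    language: Optional[str]


def parse_metadata(body: Mapping[str, Any]) -> SynthesisMetadata:
    """Extract the documented metadata fields from a decoded response body."""
    if "audio" not in body:
        # audio is the only field marked as required in the response schema.
        raise KeyError("response is missing the required 'audio' field")
    return SynthesisMetadata(
        request_id=body.get("request_id"),
        voice_id=body.get("voice_id"),
        model_id=body.get("model_id"),
        word_count=body.get("word_count"),
        character_count=body.get("character_count"),
        duration=body.get("duration"),
        language=body.get("language"),
    )
```

Fields other than audio are not marked as required, so the sketch maps any missing field to `None` rather than raising.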