POST /v1/tts/slng/inworld/max:1.5
curl --request POST \
  --url https://api.slng.ai/v1/tts/slng/inworld/max:1.5 \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "text": "Hello, this is a test of Inworld text to speech.",
  "voice": "Ashley",
  "modelId": "inworld-tts-1.5-max"
}
'
{
  "audioContent": "SUQzBAAAAAAAI1RTU0UAAAAPAAADTGF2ZjYwLjE2LjEwMAAA",
  "usage": {
    "processedCharactersCount": 48,
    "modelId": "inworld-tts-1.5-max"
  }
}


Authorizations

Authorization
string
header
required

API key issued by SLNG. Pass as Authorization: Bearer <token>.

Headers

X-Region-Override
enum<string>

Target region override. Auto-selected if not provided.

Available options:
us-east-1
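The curl call at the top of this page, together with the two headers described above, can be sketched in Python using only the standard library. The helper names `build_request` and `synthesize`, the `KNOWN_REGIONS` set, and the `timeout` default are assumptions made for illustration; they are not part of an SLNG SDK.

```python
import json
import urllib.request

API_URL = "https://api.slng.ai/v1/tts/slng/inworld/max:1.5"
KNOWN_REGIONS = {"us-east-1"}  # the only option listed on this page


def build_request(token, payload, region=None):
    """Build (but do not send) the POST request. Hypothetical helper."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    # Only send X-Region-Override when a region is explicitly requested;
    # omitting it lets the gateway auto-select.
    if region is not None:
        if region not in KNOWN_REGIONS:
            raise ValueError(f"unknown region: {region!r}")
        headers["X-Region-Override"] = region
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def synthesize(token, payload, region=None, timeout=30):
    """Send the request and return the parsed JSON body. Timeout is an assumed default."""
    request = build_request(token, payload, region)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes the headers and body easy to inspect without network access.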

Body

application/json

Inworld Max 1.5 synthesis request, SLNG-hosted. Audio output is configured with flat fields (encoding, sample_rate, bit_rate, speaking_rate) that the gateway maps to Inworld's audioConfig.

text
string
required

Text to synthesize. Max 2,000 characters.

Required string length: 1 - 2000
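Input longer than the 2,000-character limit has to be split across several requests. A minimal sketch of one way to do that follows, breaking on spaces where possible. The function name `split_text` is hypothetical, and the sketch assumes the limit is counted in Unicode code points, as Python's `len` does; this page does not say how the service counts characters.

```python
MAX_CHARS = 2000  # limit stated for the `text` field


def split_text(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters.

    Prefers to break at the last space that fits; falls back to a hard
    cut when a single run of non-space characters exceeds the limit.
    """
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no space available: hard split
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each chunk can then be sent as the `text` of its own request, with the resulting audio concatenated by the caller.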
voice
enum<string>
default:Ashley

Inworld voice ID. Voices span 16 languages, including en-US, hi-IN, es-ES, fr-FR, de-DE, it-IT, pt-BR, ru-RU, ja-JP, ko-KR, zh-CN, nl-NL, pl-PL, ar-SA, and he-IL.

Available options:
Aanya,
Aarav,
Abby,
Alain,
Alex,
Amina,
Anjali,
Arjun,
Ashley,
Asuka,
Avery,
Beatriz,
Bianca,
Blake,
Brandon,
Brian,
Callum,
Camila,
Carter,
Cedric,
Celeste,
Chloe,
Claire,
Clive,
Conrad,
Craig,
Damon,
Darlene,
Deborah,
Dennis,
Derek,
Diego,
Dmitry,
Dominus,
Duncan,
Edward,
Eleanor,
Elena,
Elizabeth,
Elliot,
Erik,
Ethan,
Étienne,
Evan,
Evelyn,
Felix,
Gareth,
Gianni,
Graham,
Grant,
Hades,
Hamish,
Hana,
Hank,
Haruto,
Heitor,
Hélène,
Hina,
Hyunwoo,
Jake,
James,
Jason,
Jessica,
Jing,
Johanna,
Jonah,
Josef,
Julia,
Katrien,
Kayla,
Kelsey,
Lauren,
Lennart,
Levi,
Liam,
Lore,
Loretta,
Lucian,
Luna,
Lupita,
Maitê,
Malcolm,
Manoj,
Marcus,
Mariana,
Mark,
Marlene,
Mateo,
Mathieu,
Mauricio,
Mei,
Mia,
Miguel,
Ming,
Minji,
Miranda,
Mortimer,
Murilo,
Nadia,
Naomi,
Nate,
Nikolai,
Nour,
Oliver,
Olivia,
Omar,
Oren,
Orietta,
Pippa,
Pixie,
Priya,
Rafael,
Reed,
Riley,
Riya,
Ronald,
Rupert,
Saanvi,
Sarah,
Satoshi,
Sebastian,
Selene,
Seojun,
Serena,
Shaun,
Simon,
Snik,
Sofia,
Sophie,
Svetlana,
Szymon,
Tessa,
Theodore,
Timothy,
Trevor,
Tristan,
Tyler,
Veronica,
Victor,
Victoria,
Vinny,
Wendy,
Wojciech,
Xiaoyin,
Xinyi,
Yael,
Yichen,
Yoona
modelId
enum<string>
default:inworld-tts-1.5-max

ID of the Inworld TTS model.

Available options:
inworld-tts-1.5-max
language
string

Optional BCP-47 language tag (e.g. en-US, fr-FR). Auto-detected from text when omitted.

deliveryMode
enum<string>

Only applies to inworld-tts-2. Controls output variation; ignored on Max 1.5.

Available options:
DELIVERY_MODE_UNSPECIFIED,
STABLE,
BALANCED,
CREATIVE
temperature
number
default:1

Higher values produce more expressive output; lower values produce more deterministic output. Range: (0, 2].

Required range: 0 < x <= 2
timestampType
enum<string>
default:TIMESTAMP_TYPE_UNSPECIFIED

Controls timestamp metadata returned with the audio. Requesting WORD or CHARACTER timestamps adds latency.

Available options:
TIMESTAMP_TYPE_UNSPECIFIED,
WORD,
CHARACTER
applyTextNormalization
enum<string>
default:APPLY_TEXT_NORMALIZATION_UNSPECIFIED

Expands numbers, dates, and abbreviations before synthesis. Disabling may reduce latency.

Available options:
APPLY_TEXT_NORMALIZATION_UNSPECIFIED,
ON,
OFF
encoding
enum<string>
default:MP3

Output audio format. Maps to Inworld audioConfig.audioEncoding.

Available options:
LINEAR16,
MP3,
OGG_OPUS,
ALAW,
MULAW,
FLAC,
PCM,
WAV
sample_rate
enum<integer>
default:48000

Output sample rate in Hz. Maps to Inworld audioConfig.sampleRateHertz.

Available options:
8000,
16000,
22050,
24000,
32000,
44100,
48000
bit_rate
integer

Bits per second. Only applies to compressed formats (MP3, OGG_OPUS). Maps to Inworld audioConfig.bitRate.

speaking_rate
number
default:1

Playback speed. Values below 0.8 are not recommended because audio quality may degrade. Maps to Inworld audioConfig.speakingRate.

Required range: 0.5 <= x <= 1.5
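The constraints listed for the body fields can be checked on the client before a request is sent. The sketch below is one possible way to do that; `build_payload` is a hypothetical helper, and silently dropping `bit_rate` for uncompressed encodings is a design choice made here, not documented gateway behavior. The voice name is not validated against the full list.

```python
ENCODINGS = {"LINEAR16", "MP3", "OGG_OPUS", "ALAW", "MULAW", "FLAC", "PCM", "WAV"}
SAMPLE_RATES = {8000, 16000, 22050, 24000, 32000, 44100, 48000}
COMPRESSED = {"MP3", "OGG_OPUS"}  # formats for which bit_rate applies


def build_payload(text, voice="Ashley", encoding="MP3", sample_rate=48000,
                  temperature=1.0, speaking_rate=1.0, bit_rate=None):
    """Validate fields against the documented ranges and return a request body."""
    if not 1 <= len(text) <= 2000:
        raise ValueError("text must be 1 to 2000 characters")
    if encoding not in ENCODINGS:
        raise ValueError(f"unsupported encoding: {encoding!r}")
    if sample_rate not in SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate!r}")
    if not 0 < temperature <= 2:
        raise ValueError("temperature must satisfy 0 < x <= 2")
    if not 0.5 <= speaking_rate <= 1.5:
        raise ValueError("speaking_rate must satisfy 0.5 <= x <= 1.5")

    payload = {
        "text": text,
        "voice": voice,
        "modelId": "inworld-tts-1.5-max",
        "encoding": encoding,
        "sample_rate": sample_rate,
        "temperature": temperature,
        "speaking_rate": speaking_rate,
    }
    # bit_rate only applies to compressed formats, so omit it otherwise.
    if bit_rate is not None and encoding in COMPRESSED:
        payload["bit_rate"] = bit_rate
    return payload
```

For example, `build_payload("Hello", encoding="OGG_OPUS", sample_rate=24000)` returns a dictionary ready to be serialized as the JSON body.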

Response

Synthesis successful. Unlike other SLNG TTS models, which return raw audio bytes, Inworld Max responds with JSON. The request is SLNG-normalized, but the response is Inworld's native body passed through unchanged. It carries base64-encoded audio plus usage and optional timestamp metadata.

Inworld Max 1.5 synthesis result. Inworld's native response, passed through by the gateway.

audioContent
string<byte>

Base64-encoded audio in the requested encoding. Max 16 MB.

usage
object

Synthesis usage details.

timestampInfo
object

Timestamp metadata. Present only when timestampType is WORD or CHARACTER.
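Because the audio arrives base64-encoded inside JSON, it must be decoded before it can be saved or played. The following is a minimal sketch that uses the sample response from the top of this page; `extract_audio` is a hypothetical helper name, and the structure of `timestampInfo` is not described here, so it is left untouched.

```python
import base64


def extract_audio(response):
    """Return (audio_bytes, processed_characters) from a parsed response body."""
    audio = base64.b64decode(response["audioContent"], validate=True)
    chars = response.get("usage", {}).get("processedCharactersCount")
    return audio, chars


# Sample response body copied from the example on this page.
sample = {
    "audioContent": "SUQzBAAAAAAAI1RTU0UAAAAPAAADTGF2ZjYwLjE2LjEwMAAA",
    "usage": {
        "processedCharactersCount": 48,
        "modelId": "inworld-tts-1.5-max",
    },
}

audio, chars = extract_audio(sample)
```

The decoded bytes can be written to disk with, for example, `pathlib.Path("out.mp3").write_bytes(audio)`; the file extension should match the requested `encoding`.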