Inworld Max 1.5
Synthesize speech using SLNG-hosted Inworld Max 1.5.
Authorizations
API key issued by SLNG. Pass as Authorization: Bearer <token>.
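As a minimal sketch of the authorization scheme above, the helper below builds request headers carrying the key as a Bearer token. The function name `auth_headers` is hypothetical, and the `Content-Type` value is an assumption (the page does not state the request content type).

```python
def auth_headers(api_key: str) -> dict[str, str]:
    """Build HTTP headers that pass an SLNG API key as a Bearer token."""
    key = api_key.strip() if api_key else ""
    if not key:
        raise ValueError("API key must be a non-empty string")
    return {
        "Authorization": f"Bearer {key}",
        # Assumption: the request body is sent as JSON.
        "Content-Type": "application/json",
    }
```

Any HTTP client that accepts a header mapping can take the result directly.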
Headers
Target region override. Auto-selected if not provided.
Available options: us-east-1
Body
Inworld Max 1.5 synthesis request, SLNG-hosted. Audio output is configured with flat fields (encoding, sample_rate, bit_rate, speaking_rate) that the gateway maps to Inworld's audioConfig.
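The correspondence between the flat fields and Inworld's audioConfig can be illustrated with a short sketch. The gateway performs this mapping itself, so the code below is not its implementation; it only mirrors the documented field pairs and the allowed values listed for each field on this page. The function name `to_audio_config` is hypothetical, and dropping `bit_rate` for uncompressed encodings is an assumption about how "only applies to compressed formats" might be handled.

```python
ENCODINGS = {"LINEAR16", "MP3", "OGG_OPUS", "ALAW", "MULAW", "FLAC", "PCM", "WAV"}
SAMPLE_RATES = {8000, 16000, 22050, 24000, 32000, 44100, 48000}
COMPRESSED = {"MP3", "OGG_OPUS"}  # formats for which bit_rate applies


def to_audio_config(flat: dict) -> dict:
    """Map flat SLNG audio fields to an Inworld-style audioConfig dict."""
    config = {}

    encoding = flat.get("encoding")
    if encoding is not None:
        if encoding not in ENCODINGS:
            raise ValueError(f"unsupported encoding: {encoding!r}")
        config["audioEncoding"] = encoding

    sample_rate = flat.get("sample_rate")
    if sample_rate is not None:
        if sample_rate not in SAMPLE_RATES:
            raise ValueError(f"unsupported sample rate: {sample_rate!r}")
        config["sampleRateHertz"] = sample_rate

    bit_rate = flat.get("bit_rate")
    # Assumption: bit_rate is silently dropped unless the encoding is compressed.
    if bit_rate is not None and encoding in COMPRESSED:
        config["bitRate"] = bit_rate

    speaking_rate = flat.get("speaking_rate")
    if speaking_rate is not None:
        if not 0.5 <= speaking_rate <= 1.5:
            raise ValueError("speaking_rate must be within [0.5, 1.5]")
        config["speakingRate"] = speaking_rate

    return config
```

For example, `to_audio_config({"encoding": "MP3", "sample_rate": 24000})` yields a dict with `audioEncoding` and `sampleRateHertz` keys only.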
Text to synthesize. Max 2,000 characters.
Length: 1 - 2000 characters
Inworld voice ID. Voices are multilingual: each has a native language but can speak any supported language via the language parameter (best results when language matches the voice's native language). See the full catalog at https://docs.slng.ai/voices/inworld.
ID of the Inworld TTS model.
Allowed value: inworld-tts-1.5-max
BCP-47 language tag specifying the language the voice should speak the text in. Optional: when omitted, Inworld uses the voice's original prompt and auto-detects the language from the text. Results are best when the tag matches the voice's native language. The 15 core languages supported by inworld-tts-1.5-max are listed below (inworld-tts-2 additionally supports ~85 experimental languages). An invalid code returns an error.
Allowed values: ar-SA, de-DE, en-US, es-ES, fr-FR, he-IL, hi-IN, it-IT, ja-JP, ko-KR, nl-NL, pl-PL, pt-BR, ru-RU, zh-CN
Only applies to inworld-tts-2. Controls output variation; ignored on Max 1.5.
Allowed values: DELIVERY_MODE_UNSPECIFIED, STABLE, BALANCED, CREATIVE
Higher values produce more expressive output; lower values are more deterministic. Range (0, 2].
Range: 0 < x <= 2
Controls timestamp metadata returned with the audio. Adds latency.
Allowed values: TIMESTAMP_TYPE_UNSPECIFIED, WORD, CHARACTER
Expands numbers, dates, and abbreviations before synthesis. Disabling may reduce latency.
Allowed values: APPLY_TEXT_NORMALIZATION_UNSPECIFIED, ON, OFF
Output audio format. Maps to Inworld audioConfig.audioEncoding.
Allowed values: LINEAR16, MP3, OGG_OPUS, ALAW, MULAW, FLAC, PCM, WAV
Output sample rate in Hz. Maps to Inworld audioConfig.sampleRateHertz.
Allowed values: 8000, 16000, 22050, 24000, 32000, 44100, 48000
Output bit rate in bits per second. Only applies to compressed formats (MP3, OGG_OPUS). Maps to Inworld audioConfig.bitRate.
Playback speed. Values below 0.8 are not recommended because audio quality may degrade. Maps to Inworld audioConfig.speakingRate.
Range: 0.5 <= x <= 1.5
Response
Synthesis successful. Unlike other SLNG TTS models (which return raw audio bytes), Inworld Max responds with JSON: the request is SLNG-normalized but the response is Inworld's native body passed through unchanged, carrying base64-encoded audio plus usage and optional timestamp metadata.
Inworld Max 1.5 synthesis result. Inworld's native response, passed through by the gateway.
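Because the response is JSON with base64-encoded audio rather than raw bytes, a client has to decode the audio before playing or saving it. The sketch below shows one way to do that. The function name `extract_audio` is hypothetical, and the default key `audioContent` is an assumption about Inworld's native response body (this page does not name the field), so it is exposed as a parameter.

```python
import base64
import json


def extract_audio(response_body: str, audio_key: str = "audioContent") -> bytes:
    """Decode base64 audio from a JSON synthesis response.

    audio_key is an assumed field name; adjust it to match the actual response.
    """
    payload = json.loads(response_body)
    if audio_key not in payload:
        raise KeyError(f"response has no {audio_key!r} field")
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(payload[audio_key], validate=True)
```

The returned bytes can be written to a file opened in binary mode, with an extension matching the requested output format.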