HTTP-streaming multilingual TTS for Indian languages with 30+ speaker voices. Returns raw audio bytes (chunked) in the codec selected via output_audio_codec. Unlike sarvam/bulbul:v3, no X-Duration header is sent and no JSON envelope is used.
API key issued by SLNG. Pass as Authorization: Bearer <token>.
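As a minimal sketch, the header can be built as follows in Python. The helper name `auth_headers` and the `SLNG_API_KEY` environment variable are assumptions made for illustration; only the `Authorization: Bearer <token>` format comes from this page.

```python
import os


def auth_headers(token: str) -> dict[str, str]:
    """Build the Authorization header in the Bearer format described above."""
    if not token:
        raise ValueError("an SLNG API key is required")
    return {"Authorization": f"Bearer {token}"}


# Assumption: the key is stored in an environment variable named SLNG_API_KEY.
# The fallback value is a placeholder, not a working key.
headers = auth_headers(os.environ.get("SLNG_API_KEY", "example-token"))
```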
Target world part override. Auto-selected if not provided.
Available options: ap
Sarvam AI Bulbul streaming TTS request.
Text to synthesize. Supports code-mixed text (English and Indic languages).
Required string length: 1 - 3500
Language code in BCP-47 format for text normalization.
Available options: bn-IN, en-IN, gu-IN, hi-IN, kn-IN, ml-IN, mr-IN, od-IN, pa-IN, ta-IN, te-IN
Speaker voice for the output audio.
Available options: shubh, aditya, ritu, priya, neha, rahul, pooja, rohan, simran, kavya, amit, dev, ishita, shreya, ratan, varun, manan, sumit, roopa, kabir, aayan, ashutosh, advait, amelia, sophia, anand, tanya, tarun, sunny, mani, gokul, vijay, shruti, suhani, mohit, kavitha, rehan, soham, rupali
Sarvam TTS model identifier.
Available options: bulbul:v3
Output audio codec. Determines the response Content-Type.
Available options: mp3, wav, aac, opus, flac, linear16, mulaw, alaw
Output audio bitrate.
Available options: 32k, 64k, 128k, 192k, 256k
Speech speed (0.5 to 2.0).
Required range: 0.5 <= x <= 2
Output sample rate in Hz.
Controls expressiveness (0.01 to 1.0).
Required range: 0.01 <= x <= 1
Normalize English words and numbers before synthesis.
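The limits listed above can be checked on the client before a request is sent. The sketch below is one possible way to do that in Python. Only `output_audio_codec` is named on this page; every other field name (`text`, `target_language_code`, `speaker`, `model`, `output_audio_bitrate`, `pace`, `temperature`) is an assumption, as is the helper name `build_payload`, so check them against the actual request schema before relying on them.

```python
from typing import Any, Optional

# Allowed values copied from the parameter descriptions above.
LANGUAGES = {"bn-IN", "en-IN", "gu-IN", "hi-IN", "kn-IN", "ml-IN",
             "mr-IN", "od-IN", "pa-IN", "ta-IN", "te-IN"}
CODECS = {"mp3", "wav", "aac", "opus", "flac", "linear16", "mulaw", "alaw"}
BITRATES = {"32k", "64k", "128k", "192k", "256k"}


def build_payload(
    text: str,
    language: str,
    speaker: str,
    codec: str,
    bitrate: Optional[str] = None,
    pace: Optional[float] = None,
    temperature: Optional[float] = None,
) -> dict[str, Any]:
    """Validate inputs against the documented limits and return a request body."""
    if not 1 <= len(text) <= 3500:
        raise ValueError("text must be 1 to 3500 characters long")
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language code: {language}")
    if codec not in CODECS:
        raise ValueError(f"unsupported codec: {codec}")
    if bitrate is not None and bitrate not in BITRATES:
        raise ValueError(f"unsupported bitrate: {bitrate}")
    if pace is not None and not 0.5 <= pace <= 2.0:
        raise ValueError("speech speed must be between 0.5 and 2.0")
    if temperature is not None and not 0.01 <= temperature <= 1.0:
        raise ValueError("expressiveness must be between 0.01 and 1.0")

    # Assumed field names, except output_audio_codec, which this page names.
    payload: dict[str, Any] = {
        "text": text,
        "target_language_code": language,
        "speaker": speaker,
        "model": "bulbul:v3",
        "output_audio_codec": codec,
    }
    optional = {
        "output_audio_bitrate": bitrate,
        "pace": pace,
        "temperature": temperature,
    }
    # Send only the optional fields the caller actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

For example, `build_payload("नमस्ते, welcome!", "hi-IN", "shubh", "mp3", pace=1.2)` returns a body containing the code-mixed text and a speech speed of 1.2, while an empty string or an out-of-range speed raises `ValueError` locally rather than costing a round trip.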
Synthesis successful. Returns binary audio in the codec specified by output_audio_codec (chunked stream).
Binary audio data.
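Because the response is a chunked stream of raw audio bytes with no JSON envelope, a client can write the bytes to disk as they arrive. The following is a rough sketch using only the Python standard library. It assumes the request is a POST with a JSON body; the endpoint URL is not given on this page and must be supplied by the caller, and the names `stream_tts` and `save_stream` are hypothetical helpers.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def stream_tts(url: str, token: str, payload: dict,
               chunk_size: int = 4096) -> Iterator[bytes]:
    """POST the payload and yield raw audio bytes as they arrive.

    Assumption: the endpoint accepts a JSON request body.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        # The standard library decodes chunked transfer encoding for us,
        # so each read returns plain audio bytes.
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            yield chunk


def save_stream(chunks: Iterable[bytes], path: str) -> int:
    """Write audio chunks to a file in order; return the total bytes written."""
    total = 0
    with open(path, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
            total += len(chunk)
    return total
```

A call would look like `save_stream(stream_tts(url, token, payload), "speech.mp3")`, where `url` is the endpoint address from the SLNG documentation and the file extension matches the codec requested in `output_audio_codec`. Keeping the network call and the file writing separate means the same generator can instead feed an audio player for low-latency playback.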