HTTP streaming for Sarvam Bulbul v3
Sarvam AI Bulbul Stream v3 is now available as a chunked HTTP endpoint. The response is raw audio bytes in the codec you set with output_audio_codec, with no JSON envelope and no X-Duration header, so you can start playback as soon as the first chunk arrives. Use it to reach the same 30+ Indian-language voices as bulbul:v3 with lower time-to-first-byte.
Streaming transcription for Sarvam Saaras v3
Sarvam AI Saaras v3 is now available over WebSocket for real-time transcription across 23 Indian languages. Configure each session with query parameters on the upgrade URL (language-code, mode, sample_rate, input_audio_codec, high_vad_sensitivity, and vad_signals) to tune voice activity detection and audio handling per stream.
HTTP transcription for Nova 3 Indic languages
Deepgram Nova 3 now exposes HTTP endpoints for Kannada, Marathi, Tamil, and Telugu. You can transcribe these languages with a single request (binary upload or url field) instead of opening a WebSocket.
Soniox TTS language coverage
Soniox TTS now ships voices for 60+ languages, including Afrikaans, Bengali, Bulgarian, Catalan, Czech, Dutch, Greek, Gujarati, Hebrew, Indonesian, Kannada, Malay, Marathi, Norwegian, Tamil, Telugu, and more. Set the language field to any supported ISO 639-1 code on the Soniox TTS HTTP endpoint to pick a voice catalog.
Runtime variables for voice agents
Voice agents can now declare runtime_variables so the model can capture values during a call and reuse them later in webhook URLs or system tool arguments. The built-in set_runtime_variables tool writes the values, which persist for the lifetime of the call or web session. See Voice agents for the setup pattern.
New agent regions and world parts
The region override catalog now includes the eu-non-eu and me world parts, plus the asia-south1, asia-southeast2, and australia-southeast1 regions. Pin requests to Sydney, Jakarta, or a non-EU European region with X-Region-Override, or stay inside the Middle East with X-World-Part-Override: me.
Voice agents no longer need a separate API key
Voice agent create and duplicate requests no longer accept slng_api_key. Agents now use the API key you authenticate the request with, so drop the field from your payloads. Agent duplication also stops copying the inbound connection and call history; reconnect inbound routing on the copy if you need it.
Deepgram Aura 2 region availability
SLNG-hosted Deepgram Aura 2 English is no longer available in us-east-1; route English TTS to eu-north-1 instead. Aura 2 Spanish drops the na world part and now serves only eu.
Pronunciation dictionaries for TTS
You can now create reusable pronunciation dictionaries and attach them to any SLNG TTS request. Define rewrite rules once for brand names, acronyms, or domain terms, then reference the dictionary from HTTP, WebSocket, or Unified TTS calls. Manage dictionaries through the new pronunciation dictionary endpoints.
Full voice catalogs on provider pages
Provider voice pages now list every voice in the catalog instead of capping the table at ten per language. The Cartesia Sonic 3 page shows all 745 voices, Sarvam Bulbul shows 405, Deepgram Aura shows 91, Soniox shows 120, Murf shows 111, and Kugel shows 100. You can search and copy any voice ID directly from the provider page.
Batch API reference now matches the gateway
The Batch API reference and Batch API guide are realigned with the live api.batch.slng.ai gateway. Request and response schemas, supported audio formats, and the three submission flows (direct upload, URL input, and presigned S3 upload) reflect what the service actually accepts.
URL-based audio for HTTP transcription
The Whisper Large v3, Cognigy STT bridge, Jambonz STT bridge, and Unmute STT bridge HTTP endpoints now accept a url field pointing to a publicly accessible audio file. Send a JSON body with url and language instead of a multipart upload when your audio is already hosted somewhere.
Deepgram Nova 3 English region change
SLNG-hosted Deepgram Nova 3 English is no longer available in eu-north-1. Route English transcription to australia-southeast1 or us-east-1 instead.
voiceai CLI
The new voiceai CLI runs text-to-speech and speech-to-text from a terminal. Install it with curl, Homebrew, or npm, then pipe audio between SLNG models and other tools without writing an HTTP client.
JavaScript and Python SDKs
The typed JavaScript SDK (voiceai-sdk on npm) supports Node, Bun, and Deno. The Python SDK (voiceai-sdk on PyPI) ships sync and async clients for Python 3.9+. Both wrap the full STT and TTS surface so you can drop the raw fetch and WebSocket plumbing.
Agent skills for coding agents
The slng-ai/skills pack teaches Claude Code and similar coding agents to call the SLNG API directly. Point your agent at the skills repo and it can pick models, build init messages, and stream audio for you.
LiveKit Agents plugin
The livekit-plugins-slng Python package connects LiveKit Agents to any STT or TTS model on the SLNG gateway with a single configuration switch. You can swap providers or regions without changing your agent code.
Embed a voice agent on the web
A new browser embed guide walks through adding a SLNG voice session to any web page. It uses LiveKit, a React frontend, and a backend proxy that keeps your API key off the client.
Rime Arcana v3 Spanish
Rime Arcana v3 (Spanish) is available as a new TTS endpoint, with both streaming WebSocket and one-shot HTTP synthesis. Choose from ten Spanish voices (aurelio, celestino, lark, luz, mar, nova, pola, seraphina, sirius, and ursa) and pass model: rime/arcana:3-es on init.
Rime Arcana v3 adds eu-north-1
Arcana v3 English, Hindi, and Spanish are now available in eu-north-1. You can route Arcana v3 synthesis to North Europe for lower latency in that region.
Deepgram Nova 3 adds asia-south1
Deepgram Nova 3 Tamil, Telugu, Marathi, and Kannada are now available in asia-south1, in addition to ap-south-1. You can route South Asian language transcription to the Mumbai GCP region.
Deepgram Nova 3 Spanish region change
Deepgram Nova 3 Spanish is no longer available in ap-southeast-2. Use australia-southeast1 or us-east-1 instead.
Soniox TTS region override removed
The x-region header is no longer accepted on Soniox TTS v1 requests. Soniox TTS runs only in na, so requests are routed there automatically.
Expanded Murf Falcon voice catalog
The Murf Falcon voice catalog now lists 133 voices across more than 20 locales, including new entries for Bengali, Tamil, Telugu, Gujarati, Kannada, Punjabi, Japanese, Korean, Portuguese, Dutch, Polish, Greek, Croatian, and Scottish English. You can browse voices by full locale code and copy the voice_id for use in your init message.
Simplified Unmute TTS bridge requests
The Unmute TTS bridge no longer requires a model field on HTTP or WebSocket init messages; the model is now inferred from the {model_variant} path. Send only voice and text for HTTP requests, or voice plus optional config on the WebSocket init.
Rime Coda Indonesian TTS
Rime Coda is available as a new TTS model in the asia-southeast2 (Jakarta) region. It synthesizes Bahasa Indonesia with low latency across four voices (pujianti_plesmita, siswoko_sigit, taryadi_dani, and usmany_tatianna) and supports streaming WebSocket and one-shot HTTP synthesis.
Region and world-part query parameters on bridges
The Cognigy, Jambonz, and Unmute HTTP bridges now accept ?region= and ?world-part= query parameters, mirroring the X-Region-Override and X-World-Part-Override headers. Use the query form when your platform cannot set custom headers; if both are present, the header wins. See Integrations overview for details.
Cartesia Sonic 3, Murf Falcon, Kugel, Soniox, Reson8, and Sarvam in the Unified API
The Unified API now routes to Cartesia Sonic 3, Murf Falcon, KugelAudio Kugel 1/1-Turbo/2, Soniox TTS v1, Soniox Speech AI v4, Reson8 STT v1, and Sarvam Saaras v3. You can swap between these providers using a single request shape: pass identifiers like cartesia/sonic:3, murf/murftts:falcon, kugelaudio/kugel:2, soniox/tts-rt:v1, soniox/speech-ai:rt-v4, or reson8/reson8stt:v1.
Webhook tools support custom HTTP methods and raw payloads
Voice agent webhook tools now accept http_method (POST, PUT, PATCH, or DELETE) and webhook_format (envelope or raw). Set webhook_format to raw to send only the tool arguments when the receiving service cannot parse the SLNG envelope.
Tool execution tracking for voice agents
You can record webhook, template, human-transfer, and built-in tool activity against a call by posting to the tool executions endpoint. Each record carries the outcome, duration, and HTTP status, and submitted executions surface on the call detail response for debugging or analytics.
ElevenLabs Flash v2.5 adds Asia Pacific region
ElevenLabs Flash v2.5 is now available in ap, in addition to eu. You can route synthesis to Asia Pacific endpoints for lower latency in that region.
Nova 3 multi-language adds eu-north-1
Deepgram Nova 3 multi-language is now available in eu-north-1 as a specific region (in addition to the broader eu world part), giving you direct routing to North Europe.
Deepgram Aura 2 English region change
Deepgram Aura 2 English is no longer available in ap-southeast-2. Use eu-north-1 or us-east-1 instead.
Nova 3 Hindi region change
Deepgram Nova 3 Hindi is no longer available in ap-southeast-2. Use asia-south1 instead.
Soniox TTS v1 general availability
Soniox TTS graduates from preview to v1. Update your client to call tts-rt:v1 and set model to tts-rt-v1; the v1-preview path and tts-rt-v1-preview model identifier are retired.
Deepgram Aura 2 voice selection now required
The model field is now required on every Deepgram Aura 2 English and Spanish request. The previous defaults (aura-2-thalia-en and aura-2-celeste-es) no longer apply, so you must pick a voice explicitly.
Kugel 2 TTS
Kugel 2 is available as a new TTS model in eu. It offers 87 voices with expressiveness control across 26 languages, including Arabic, Chinese, Hindi, Japanese, Korean, and Vietnamese.
Soniox TTS v1-preview
Soniox TTS v1-preview is available as a new TTS model in na, with both streaming WebSocket and one-shot HTTP synthesis. Browse the voice catalog on the Soniox TTS voices page.
Voice catalog pages for Cartesia Sonic 3 and Murf Falcon
You can now browse Cartesia Sonic 3 and Murf Falcon voices with audio samples directly in the docs. Each entry shows the voice_id to pass in your init message.
Nova 3 multi-language adds EU region
Deepgram Nova 3 multi-language is now available in eu, in addition to ap-southeast-2 and us-east-1. You can route multilingual transcription to European endpoints for lower latency.
Nova 3 Hindi region change
Deepgram Nova 3 Hindi is no longer available in ap-south-1. Use ap-southeast-2 or asia-south1 instead.
Soniox Speech AI Real-time v4
Soniox Speech AI moves to v4. Use the Speech AI Real-time v4 endpoint for streaming transcription with speaker diarization, automatic language identification, and configurable endpoint detection across 60+ languages. The v3 endpoint has been retired, so point clients at the new path to continue receiving native Soniox token frames.
LiveKit plugin compatibility refresh
The LiveKit Agents plugin now targets livekit-agents>=1.5.1 and Python 3.10+. You can pass model-specific options as keyword arguments, for example whisper_params for Whisper, target_language_code for Sarvam STT, or modelId and speakingStyle for Rime Arcana. New slng_base_url and http_session arguments let you point at a self-hosted gateway and reuse an aiohttp.ClientSession.
Sarvam Saaras v3 STT not supported in LiveKit plugin
Saaras is HTTP-only on SLNG and has no WebSocket endpoint, so it cannot run through the LiveKit plugin’s realtime path. For Hindi voice agents, use slng/deepgram/nova:3-hi or slng/deepgram/nova:3-multi. See the LiveKit plugin provider notes for details.
Expanded regions for Murf Falcon TTS
Murf Falcon is now available in ap, eu-non-eu, me, and na, in addition to the existing eu world part. You can now route synthesis closer to users across the Americas, Asia Pacific, and the Middle East.
Asia Pacific region for Soniox Speech AI Real-time v3
Soniox Speech AI Real-time v3 adds the ap world part alongside eu and na. Route transcription to Asia Pacific endpoints for lower latency in that region.
URL and presigned S3 inputs for Batch STT
You can now submit audio to the Batch STT API without uploading a file on every request. Pass a publicly accessible input_url, or request a presigned S3 URL, upload directly, then create the job with the returned s3_key. Both methods accept an optional metadata object for attaching arbitrary key-value pairs to a job.
Batch API usage guide
A new Batch API guide walks through the three input methods (file upload, URL input, and presigned S3 upload) with request flows and sample payloads.
Deepgram Aura 2 English in eu-north-1
Deepgram Aura 2 English TTS is now available in eu-north-1, in addition to ap-southeast-2 and us-east-1.
Whisper Large v3 Compressed removed
The Whisper Large v3 Compressed STT model has been retired from the catalog. Use Whisper Large v3 for multilingual transcription going forward.
Runtime variables for voice agents
Voice agents can now capture values during a call and reuse them in webhook URLs and system tool arguments. Define a runtime_variables array on your agent, and the model sets values through the built-in set_runtime_variables tool. See the agent configuration examples for setup details.
Webhook HTTP method and payload format
Webhook tools on voice agents now accept http_method (POST, PUT, PATCH, or DELETE) and webhook_format (envelope or raw). Use raw to send only the tool arguments as the request body, skipping the SLNG metadata envelope. Both fields are documented in the Voice Agents API reference.
Expanded regions for Rime Arcana v2 and Cartesia Sonic 3
Rime Arcana v2 TTS is now available in eu-north-1 and us-east-1, in addition to ap-southeast-2. Cartesia Sonic 3 TTS is now available in all three world parts: ap, eu, and na.
New regions for Deepgram Nova 3 English and Hindi
Nova 3 English is now available in ap-south-1 and us-east-1. Nova 3 Hindi adds asia-south1 alongside existing regions.
Utterance end events on Unmute STT Bridge
The Unmute STT Bridge now emits utterance_end events when the upstream model signals the end of a spoken utterance. This gives you an explicit boundary marker for segmenting transcription output.
Native token stream for Soniox Speech AI
Soniox Speech AI Real-time v3 now returns native Soniox token frames instead of normalized transcripts. You receive interim and final tokens directly, including <end> and <fin> endpoint markers when endpoint detection is enabled.
Tool personalization for voice agents
You can now use {{variable}} placeholders in runtime tool fields: webhook URLs, system webhook argument values, human transfer phone numbers, and built-in timezones. Values resolve when the tool executes, not at session start, so a missing tool variable does not block the call. Supported surfaces, validation rules, and examples are documented on the Voice Agents page.
Tool execution tracking on agent calls
A new tool executions endpoint lets you record webhook, template, human transfer, and built-in tool activity against a call. Execution records, including outcome, duration, and HTTP status, also appear in the call detail response.
Cartesia Sonic 3 TTS
Cartesia Sonic 3 is available as a new TTS provider. It supports low-latency streaming synthesis over WebSocket with context-aware generation controls.
Reson8 STT
Reson8 STT v1 is available as a new STT provider. It supports real-time transcription over WebSocket with word-level timestamps, confidence scores, and partial results in nine languages including Dutch, French, German, and Spanish.
Deepgram Nova 3 Indic language endpoints
Four new SLNG-hosted Deepgram Nova 3 language variants are available in ap-south-1 (Mumbai): Kannada, Marathi, Tamil, and Telugu. Each has a dedicated WebSocket endpoint for that language.
Soniox Speech AI version correction
The Soniox STT endpoint is now correctly labeled Speech AI Real-time v3. URLs and navigation have been updated accordingly.
Batch speech-to-text API
You can now transcribe audio files asynchronously with the new Batch STT API. Upload a file, poll the job status, and download the transcript when ready. Supported formats include wav, mp3, flac, aac, ogg, m4a, mp4, amr, and mpeg. Powered by Speechmatics.
Murf Falcon TTS
Murf Falcon is available as a new TTS provider. It supports multilingual speech synthesis over WebSocket with multiple encodings and sample rates.
Unified API documentation
The new Unified API section explains how to use one endpoint pattern for every STT and TTS model. Swap providers by changing only the URL path; your auth, request format, and code stay the same. The section includes guides on parameter coverage and supported models.
Integrations hub
A new Integrations page lists third-party platforms you can connect to SLNG. LiveKit, Cognigy, and Jambonz each have dedicated setup paths.
Whisper Large v3 endpoint consolidated
The separate Whisper Large v3 Compressed endpoint has been removed. Use the standard Whisper Large v3 endpoint, which now handles compressed audio directly.
Language selection for Nova 3 STT
SLNG-hosted Deepgram Nova 3 STT endpoints accept a language parameter in the WebSocket init config. Supported locales by variant:
- English: en, en-au, en-us, en-nz, en-gb, en-in
- Spanish: es, es-us, es-419, es-ar, es-mx, es-es
- Hindi: hi, en
- Multi-language: multi
The Hindi variant also accepts en, so you can transcribe English audio without switching endpoints. See the Speech-to-Text models page for the full parameter list.
More sample rates for Rime Arcana TTS
Rime Arcana now supports 8, 16, 22.05, 24 (default), 32, 44.1, and 48 kHz. You can match your audio pipeline directly without resampling.
Simplified endpointing parameter
The endpointing parameter on Deepgram STT endpoints now accepts only an integer (milliseconds of silence before finalizing speech). Set it to 0 to disable. Default remains 10.
Graceful WebSocket session close
Send { "type": "close" } on any WebSocket connection to shut down cleanly. The server finishes processing remaining audio, then closes. This replaces the previous cancel behavior and works across TTS, STT, and bridges.
Keepalive for STT streams
Send { "type": "keepalive" } on STT WebSocket connections to prevent idle timeouts during pauses. This is useful for voice agent sessions where the user goes silent but the connection should stay open.
Endpointing controls for Deepgram Nova STT
Two new parameters on Deepgram Nova STT models for tuning speech segmentation:
- endpointing: milliseconds of silence before finalizing speech. Set to false to disable. Default: 10.
- utterance_end_ms: milliseconds of silence between words before an UtteranceEnd event. Range: 200–5000 ms, default: 1000 ms.
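As a rough sketch, the two parameters in this entry could be range-checked on the client before a stream is opened. The helper name build_segmentation_config and the flat dictionary it returns are hypothetical, chosen only for illustration; the actual init message schema is defined in the API reference, not here.

```python
from typing import Union


def build_segmentation_config(
    endpointing: Union[int, bool] = 10,
    utterance_end_ms: int = 1000,
) -> dict:
    """Hypothetical helper: validate the two settings described in this entry.

    The returned dict shape is an assumption, not the documented init schema.
    """
    # In this entry, false disables endpointing; True has no defined meaning.
    if endpointing is True:
        raise ValueError("endpointing must be a millisecond count or False")
    if endpointing is not False and endpointing < 0:
        raise ValueError("endpointing must not be negative")
    # The entry gives a 200-5000 ms range for utterance_end_ms.
    if not 200 <= utterance_end_ms <= 5000:
        raise ValueError("utterance_end_ms must be between 200 and 5000")
    return {"endpointing": endpointing, "utterance_end_ms": utterance_end_ms}
```

Calling build_segmentation_config() with no arguments returns the defaults stated in this entry. Checking the range locally lets a misconfigured value fail before any audio is sent.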
India region for Nova 3 Hindi
Deepgram Nova 3 Hindi is now available in ap-south-1 (Mumbai), alongside ap-southeast-2 (Sydney). Use the X-Region-Override header to route to the closest region. See models by region.
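A minimal sketch of building the override header for the two regions named in this entry. The helper name region_headers and the lookup table are hypothetical; the endpoint URL and authentication headers are deliberately left out because this entry does not specify them.

```python
# Regions named in this entry for Nova 3 Hindi; the mapping is illustrative only.
NOVA3_HINDI_REGIONS = {
    "ap-south-1": "Mumbai",
    "ap-southeast-2": "Sydney",
}


def region_headers(region: str) -> dict:
    """Hypothetical helper: return the header that pins a request to a region."""
    if region not in NOVA3_HINDI_REGIONS:
        raise ValueError(f"unknown region for this entry: {region}")
    return {"X-Region-Override": region}
```

The returned dictionary can be merged into whatever headers your HTTP or WebSocket client already sends.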