# livekit-plugins-slng

`livekit-plugins-slng` adds STT and TTS adapters for LiveKit Agents, letting you use any model on the SLNG gateway from within a LiveKit agent.
## Prerequisites
### Installation

```bash
pip install livekit-plugins-slng
```
### Credentials

You need an SLNG API key. The plugin reads it from the `SLNG_API_KEY` environment variable automatically:

```bash
export SLNG_API_KEY="your-slng-api-key"
```
You can also pass it explicitly via `api_key`:

```python
stt = slng.STT(api_key="your-slng-api-key", model="deepgram/nova:3")
```
## Quickstart

Create an STT and a TTS instance, then pass them to your LiveKit agent session:

```python
from livekit.plugins import slng

stt = slng.STT(
    api_key="your-slng-api-key",
    model="deepgram/nova:3",
    language="en",
)

tts = slng.TTS(
    api_key="your-slng-api-key",
    model="deepgram/aura:2",
    voice="aura-2-thalia-en",
    language="en",
)
```
## Full voice agent example

This example wires STT, TTS, and VAD into a complete LiveKit agent that greets the user on join:

```python
from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli
from livekit.plugins import silero, slng

SLNG_API_KEY = "your-slng-api-key"


class MyAgent(Agent):
    async def on_enter(self):
        await self.session.say("Hello! How can I help?")


async def entrypoint(ctx: JobContext):
    await ctx.connect()

    stt = slng.STT(
        api_key=SLNG_API_KEY,
        model="deepgram/nova:3",
        language="en",
        sample_rate=16000,
        enable_partial_transcripts=True,
    )

    tts = slng.TTS(
        api_key=SLNG_API_KEY,
        model="deepgram/aura:2",
        voice="aura-2-thalia-en",
        language="en",
        sample_rate=24000,
    )

    session = AgentSession(
        stt=stt,
        tts=tts,
        vad=silero.VAD.load(),
        turn_detection="vad",
        allow_interruptions=True,
    )
    await session.start(agent=MyAgent(), room=ctx.room)


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```
## Model identifiers

Models follow the format `provider/model:variant`. Prefix with `slng/` to target an SLNG-hosted instance:

```text
provider/model:variant       # third-party passthrough
slng/provider/model:variant  # SLNG-hosted
```
Examples:

```python
model="deepgram/nova:3"              # Deepgram Nova 3 (passthrough)
model="slng/deepgram/nova:3-en"      # SLNG-hosted Deepgram Nova 3, English
model="elevenlabs/eleven-flash:2.5"  # ElevenLabs Flash v2.5 (passthrough)
```
See the Models page for the full list of available models.
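The identifier grammar above can be sketched as a small parser. `parse_model_id` is a hypothetical helper for illustration; the plugin itself simply passes the string through to the gateway:

```python
def parse_model_id(model: str) -> dict:
    """Split a model identifier into hosting, provider, model name, and variant."""
    hosted = model.startswith("slng/")
    rest = model[len("slng/"):] if hosted else model
    provider, _, name = rest.partition("/")
    model_name, _, variant = name.partition(":")
    return {
        "slng_hosted": hosted,
        "provider": provider,
        "model": model_name,
        "variant": variant or None,
    }
```
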
## STT reference

`slng.STT` streams speech-to-text over WebSocket and supports multi-endpoint failover.

### Constructor
```python
stt = slng.STT(
    api_key="your-slng-api-key",      # Required. SLNG API key.
    model="deepgram/nova:3",          # Model identifier. Default: "deepgram/nova:3"
    language="en",                    # Language code. Default: "en"
    sample_rate=16000,                # Audio sample rate in Hz. Default: 16000
    encoding="pcm_s16le",             # "pcm_s16le" or "pcm_mulaw". Default: "pcm_s16le"
    buffer_size_seconds=0.064,        # Audio buffer size in seconds. Default: 0.064
    enable_partial_transcripts=True,  # Enable interim results. Default: True
    enable_diarization=False,         # Enable speaker identification. Default: False
    min_speakers=None,                # Minimum speakers for diarization. Default: None
    max_speakers=None,                # Maximum speakers for diarization. Default: None
    vad_threshold=0.5,                # Voice activity detection threshold. Default: 0.5
    vad_min_silence_duration_ms=300,  # Minimum silence for VAD (ms). Default: 300
    vad_speech_pad_ms=30,             # Speech padding for VAD (ms). Default: 30
    model_endpoint=None,              # Optional explicit WebSocket endpoint URL
    model_endpoints=None,             # Optional list of failover endpoints
)
```
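To relate `buffer_size_seconds` to bytes on the wire: at the default 16000 Hz with 16-bit (2-byte) PCM, a 0.064 s buffer holds 1024 samples, i.e. 2048 bytes. A quick sanity check (plain arithmetic, not plugin code; `pcm_mulaw` would be 1 byte per sample):

```python
def buffer_bytes(sample_rate: int, buffer_seconds: float, bytes_per_sample: int = 2) -> int:
    """Bytes per audio buffer: samples per buffer times sample width."""
    return int(sample_rate * buffer_seconds) * bytes_per_sample


# Defaults: 16 kHz, 0.064 s, pcm_s16le -> 1024 samples, 2048 bytes per buffer
```
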
### Endpoint failover

Pass a list of endpoints to `model_endpoints`. If the first one fails, the plugin tries the next in order:

```python
stt = slng.STT(
    api_key=SLNG_API_KEY,
    model_endpoints=[
        "wss://api.slng.ai/v1/stt/deepgram/nova:3",
        "wss://api.slng.ai/v1/stt/slng/deepgram/nova:3-en",
    ],
    language="en",
)
```
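The failover behavior amounts to trying each endpoint in order until one connects. A minimal sketch of that loop, with a hypothetical `connect` callable standing in for the plugin's WebSocket dial:

```python
from typing import Callable, Iterable


def connect_with_failover(endpoints: Iterable[str], connect: Callable[[str], object]):
    """Try each endpoint in order; return the first successful connection.

    `connect` stands in for the real WebSocket dial and should raise on failure.
    """
    last_error: Exception | None = None
    for url in endpoints:
        try:
            return connect(url)
        except Exception as exc:  # sketch only; real code would narrow this
            last_error = exc
    raise ConnectionError(f"all endpoints failed: {last_error}")
```
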
### Default endpoint

If `model_endpoint` is omitted, the plugin connects to:

```text
wss://api.slng.ai/v1/stt/{model}
```
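The same URL scheme applies to TTS (with `tts` in place of `stt`, as shown in the TTS reference). A sketch of how such an endpoint could be built from a model identifier; `default_endpoint` is a hypothetical helper mirroring the documented pattern, with the model string embedded verbatim, `slng/` prefix included:

```python
BASE_URL = "wss://api.slng.ai/v1"


def default_endpoint(kind: str, model: str) -> str:
    """Build the default gateway endpoint for an STT or TTS model."""
    if kind not in ("stt", "tts"):
        raise ValueError("kind must be 'stt' or 'tts'")
    return f"{BASE_URL}/{kind}/{model}"
```
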
## TTS reference

`slng.TTS` streams text-to-speech over WebSocket with connection pooling.

### Constructor
```python
tts = slng.TTS(
    api_key="your-slng-api-key",  # Required. SLNG API key.
    model="deepgram/aura:2",      # Required. Model identifier.
    voice="aura-2-thalia-en",     # Voice identifier. Default: "default"
    language="en",                # Language code. Default: "en"
    sample_rate=24000,            # Audio sample rate in Hz. Default: 24000
    speed=1.0,                    # Speech speed multiplier. Default: 1.0
    model_endpoint=None,          # Optional explicit WebSocket endpoint URL
)
```
### Streaming vs batch

`tts.stream()` sends text word-by-word and returns audio chunks in real time. Use it for voice agents.

`tts.synthesize(text)` does one-shot synthesis. It works fine for previews, but `stream()` is the better fit for interactive agents.
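The word-by-word behavior of `stream()` can be pictured as splitting an incremental text feed into word-sized sends. A sketch assuming simple whitespace tokenization (the plugin's actual chunking may differ):

```python
from typing import Iterable, Iterator


def word_chunks(text_parts: Iterable[str]) -> Iterator[str]:
    """Yield complete words from an incremental text stream.

    Buffers partial words until whitespace confirms a boundary, so each
    send to the synthesizer is a whole word.
    """
    pending = ""
    for part in text_parts:
        pending += part
        while True:
            word, sep, rest = pending.partition(" ")
            if not sep:
                break
            if word:
                yield word
            pending = rest
    if pending:
        yield pending
```
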
### Default endpoint

If `model_endpoint` is omitted, the plugin connects to:

```text
wss://api.slng.ai/v1/tts/{model}
```
### Voice selection

Pick a voice that matches your chosen model. See the Voices pages for what’s available per provider.
### Notes

The plugin outputs linear16 PCM audio internally and registers itself with LiveKit on import. Both STT and TTS authenticate with `api_key`. When new models are added to the SLNG gateway, you can use them right away without updating the plugin.
## Next steps
- Browse available Models for STT and TTS
- Check the Voices pages for voice options per provider
- See Voice Agents for the SLNG-managed agents API