pipecat-slng adds STT and TTS services for Pipecat. It routes your pipeline through the SLNG gateway, so you can use any STT or TTS model on SLNG (Deepgram, ElevenLabs, Rime, Sarvam, and more) behind one API key. To switch provider, swap the model string (and, for TTS, the voice); no other code changes are needed.
Tested with Pipecat v1.3.0.

Prerequisites

  • Python 3.11+
  • pipecat-ai>=1.3.0
  • A Pipecat project
  • An SLNG API key (get one at app.slng.ai)

Installation

uv add pipecat-slng
# or
pip install pipecat-slng

Credentials

You need an SLNG API key. Export it as the SLNG_API_KEY environment variable:
export SLNG_API_KEY="your-slng-api-key"
Then pass it to each service via api_key:
import os

from pipecat_slng import SlngSTTService

stt = SlngSTTService(
    api_key=os.getenv("SLNG_API_KEY"),
    model="slng/deepgram/nova:3-en",
)
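Because os.getenv returns None when the variable is unset, a missing key may only surface later, when the service tries to connect. A minimal sketch of a fail-fast lookup follows; the helper name require_slng_api_key is hypothetical and not part of pipecat-slng:

```python
import os


def require_slng_api_key(env_var: str = "SLNG_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; it is not part of pipecat-slng.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before starting the bot")
    return key
```

The returned string can then be passed as api_key= to each service, so a misconfigured environment fails at startup rather than mid-call.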

Quickstart

Create an STT and TTS service, then add them to your Pipecat pipeline:
import os

from pipecat_slng import SlngSTTService, SlngTTSService

stt = SlngSTTService(
    api_key=os.getenv("SLNG_API_KEY"),
    model="slng/deepgram/nova:3-en",
)

tts = SlngTTSService(
    api_key=os.getenv("SLNG_API_KEY"),
    model="slng/deepgram/aura:2-en",
    voice="aura-2-thalia-en",
)
SlngSTTService and SlngTTSService stream over WebSocket: low latency, with mid-utterance interruption support. Common runtime knobs are top-level keyword arguments (language, speed, enable_vad, enable_partials). For richer overrides, pass a SlngSTTSettings(...) or SlngTTSSettings(...) to settings=.

Region routing

Both services support gateway region routing. Pin requests to a specific datacenter with region_override, or constrain them to a broad geographic zone with world_part_override. When both are set, region_override wins.
stt = SlngSTTService(
    api_key=os.getenv("SLNG_API_KEY"),
    model="slng/deepgram/nova:3-en",
    region_override="eu-north-1",      # ap-southeast-2 | eu-north-1 | us-east-1
    world_part_override="eu",          # ap | eu | na
)
The WebSocket services send these as the X-Region-Override and X-World-Part-Override headers; the HTTP service (below) sends them as the region and world-part query parameters. See the full region list and override behavior at docs.slng.ai/region-override.
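To make the two transports concrete, here is a rough sketch of how the overrides could map onto headers and query parameters. The function names are hypothetical and this is not the plugin's actual code; it also assumes that unset overrides are simply omitted and that both values are sent when both are set, leaving precedence to the gateway:

```python
from urllib.parse import urlencode


def region_headers(region_override=None, world_part_override=None):
    """Routing headers as used by the WebSocket services; unset values are omitted."""
    headers = {}
    if region_override:
        headers["X-Region-Override"] = region_override
    if world_part_override:
        headers["X-World-Part-Override"] = world_part_override
    return headers


def region_query(region_override=None, world_part_override=None):
    """Routing query string as used by the HTTP service; unset values are omitted."""
    params = {}
    if region_override:
        params["region"] = region_override
    if world_part_override:
        params["world-part"] = world_part_override
    return urlencode(params)
```

For example, region_headers("eu-north-1", "eu") yields both headers, while region_query(world_part_override="eu") yields the string world-part=eu.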

Full voice agent example

A complete cascade pipeline — Speech-to-Text → LLM → Text-to-Speech — using SLNG for STT and TTS and OpenAI for the LLM. The bot introduces itself when a client connects:
import os

from dotenv import load_dotenv

from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
    LLMUserAggregatorParams,
)
from pipecat.runner.types import RunnerArguments, SmallWebRTCRunnerArguments
from pipecat.services.openai.responses.llm import OpenAIResponsesLLMService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams

from pipecat_slng import SlngSTTService, SlngTTSService

load_dotenv(override=True)


async def run_bot(transport: BaseTransport):
    slng_api_key = os.environ["SLNG_API_KEY"]

    stt = SlngSTTService(
        api_key=slng_api_key,
        model="slng/deepgram/nova:3-en",
        language=Language.EN,
        enable_vad=True,
        enable_partials=True,
        # region_override="eu-north-1",  # uncomment to pin to a datacenter
    )

    # Text-to-Speech (streaming WebSocket — low latency, supports interruption).
    # Deepgram Aura 2 supports `speed`; Rime / Sarvam don't (parameter-coverage
    # table on docs.slng.ai). Swap model= and voice= to change provider.
    tts = SlngTTSService(
        api_key=slng_api_key,
        model="slng/deepgram/aura:2-en",
        voice="aura-2-arcas-en",
        language=Language.EN,
        speed=1,
        # region_override="eu-north-1",
    )

    llm = OpenAIResponsesLLMService(
        api_key=os.getenv("OPENAI_API_KEY"),
        settings=OpenAIResponsesLLMService.Settings(
            model=os.getenv("OPENAI_MODEL", "gpt-4.1"),
            system_instruction=(
                "You are a helpful assistant in a voice conversation. "
                "Your responses will be spoken aloud, so avoid emojis, bullet points, "
                "or other formatting that can't be spoken."
            ),
        ),
    )

    context = LLMContext()
    user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
        context,
        user_params=LLMUserAggregatorParams(vad_analyzer=SileroVADAnalyzer()),
    )

    pipeline = Pipeline(
        [
            transport.input(),
            stt,
            user_aggregator,
            llm,
            tts,
            transport.output(),
            assistant_aggregator,
        ]
    )

    task = PipelineTask(
        pipeline,
        params=PipelineParams(enable_metrics=True, enable_usage_metrics=True),
    )

    @task.rtvi.event_handler("on_client_ready")
    async def on_client_ready(rtvi):
        context.add_message({"role": "user", "content": "Please introduce yourself."})
        await task.queue_frames([LLMRunFrame()])

    runner = PipelineRunner(handle_sigint=False)
    await runner.run(task)


async def bot(runner_args: RunnerArguments):
    match runner_args:
        case SmallWebRTCRunnerArguments():
            from pipecat.transports.smallwebrtc.transport import SmallWebRTCTransport

            transport = SmallWebRTCTransport(
                webrtc_connection=runner_args.webrtc_connection,
                params=TransportParams(audio_in_enabled=True, audio_out_enabled=True),
            )
            await run_bot(transport)
        # The Daily transport branch is omitted here; see examples/bot.py.


if __name__ == "__main__":
    from pipecat.runner.run import main

    main()
The full example, including the Daily transport branch, lives in examples/bot.py. Run it with:
cp .env.example .env   # set SLNG_API_KEY and OPENAI_API_KEY
uv run --extra example examples/bot.py
Then open http://localhost:7860/client and start talking. It uses the SmallWebRTC transport by default; pass -t daily to use Daily instead (requires pipecat-ai[daily]).

Model identifiers

Models follow the format provider/model:variant. Prefix with slng/ to target an SLNG-hosted instance, and suffix the language where the model exposes per-language variants:
model="slng/deepgram/nova:3-en"      # SLNG-hosted Deepgram Nova 3, English (STT)
model="slng/deepgram/aura:2-en"      # SLNG-hosted Deepgram Aura 2, English (TTS)
The plugin routes through the SLNG Unmute bridge, so the full list of models you can pass to model= is the bridge’s supported-models list — see Supported models. Not every model accepts every option (for example speed on TTS); check the parameter coverage table before tuning.
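If you build model strings from configuration, it can help to validate their shape before connecting. The following is a minimal sketch of a parser for the format described above; parse_model_id and ModelId are hypothetical names, and the sketch assumes every identifier carries a variant after the colon:

```python
from typing import NamedTuple


class ModelId(NamedTuple):
    hosted: bool    # True when the identifier starts with "slng/"
    provider: str   # e.g. "deepgram"
    model: str      # e.g. "nova"
    variant: str    # e.g. "3-en"


def parse_model_id(identifier: str) -> ModelId:
    """Split "[slng/]provider/model:variant" into its parts (illustrative only)."""
    hosted = identifier.startswith("slng/")
    rest = identifier[len("slng/"):] if hosted else identifier
    provider, slash, tail = rest.partition("/")
    model, colon, variant = tail.partition(":")
    if not (slash and colon and provider and model and variant):
        raise ValueError(f"expected [slng/]provider/model:variant, got {identifier!r}")
    return ModelId(hosted, provider, model, variant)
```

This only checks the shape of the string; whether a given model exists is still decided by the bridge's supported-models list.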

STT reference

SlngSTTService streams speech-to-text over WebSocket, connecting to wss://api.slng.ai/v1/bridges/unmute/stt/{model}.

Constructor

stt = SlngSTTService(
    api_key="your-slng-api-key",          # Required. SLNG API key.
    model="slng/deepgram/nova:3-en",      # Model identifier. Default: "slng/deepgram/nova:3-en"
    base_url="api.slng.ai",               # Gateway host (self-hosted or staging). Default: "api.slng.ai"
    encoding="linear16",                   # "linear16", "mp3", or "opus". Default: "linear16"
    sample_rate=None,                      # Audio sample rate in Hz. Default: the pipeline sample rate
    language=Language.EN,                  # Recognition language. Default: English
    enable_vad=True,                       # Enable server-side VAD. Default: True
    enable_partials=True,                  # Stream interim (partial) transcripts. Default: True
    region_override=None,                  # Pin to a datacenter, sent as X-Region-Override
    world_part_override=None,              # Constrain to a zone, sent as X-World-Part-Override
    settings=None,                         # Optional SlngSTTSettings for runtime updates
)
Language is imported from pipecat.transcriptions.language.

Confidence filter

When the provider surfaces a confidence score, transcripts below 0.5 are dropped before reaching your pipeline.
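The rule can be sketched as a small predicate. This is an illustration of the behavior described above, not the plugin's code; the name keep_transcript is hypothetical, and the sketch reads "below 0.5" literally, so a score of exactly 0.5 is kept:

```python
from typing import Optional

CONFIDENCE_THRESHOLD = 0.5  # threshold stated in the text above


def keep_transcript(confidence: Optional[float]) -> bool:
    """Return True if a transcript should be forwarded to the pipeline."""
    if confidence is None:
        # The provider surfaced no score, so the filter does not apply.
        return True
    return confidence >= CONFIDENCE_THRESHOLD
```

One practical consequence: with providers that do report confidence, very quiet or noisy speech may produce no transcript at all rather than a low-quality one.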

Default endpoint

The plugin connects to:
wss://api.slng.ai/v1/bridges/unmute/stt/{model}

TTS reference (streaming)

SlngTTSService streams text-to-speech over WebSocket, connecting to wss://api.slng.ai/v1/bridges/unmute/tts/{model}. This is the recommended path for interactive voice agents.

Constructor

tts = SlngTTSService(
    api_key="your-slng-api-key",          # Required. SLNG API key.
    model="slng/deepgram/aura:2-en",      # Model identifier. Default: "slng/deepgram/aura:2-en"
    voice="aura-2-thalia-en",             # Voice identifier. Default: None (server default)
    base_url="api.slng.ai",               # Gateway host. Default: "api.slng.ai"
    encoding="linear16",                   # "linear16", "mp3", "opus", "mulaw", or "alaw". Default: "linear16"
    sample_rate=None,                      # Audio sample rate in Hz. Default: the pipeline sample rate
    language=Language.EN,                  # Synthesis language. Default: English
    speed=None,                            # Speech speed multiplier. Default: None (server default)
    region_override=None,                  # Pin to a datacenter, sent as X-Region-Override
    world_part_override=None,              # Constrain to a zone, sent as X-World-Part-Override
    settings=None,                         # Optional SlngTTSSettings for runtime updates
)

Runtime settings updates

Changing voice, speed, or language mid-session (via Pipecat settings updates) reconnects the WebSocket to re-run the init handshake. Expect a brief reconnect, not a silent no-op.
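When deciding whether an update is worth the reconnect, it can be useful to check which of the named fields would actually change. A minimal sketch, assuming settings are represented as plain dicts (the plugin itself uses SlngTTSSettings, whose fields are not shown here) and using a hypothetical helper name:

```python
RECONNECT_FIELDS = ("voice", "speed", "language")  # fields named in the text above


def changed_reconnect_fields(current: dict, update: dict) -> list:
    """List the fields in `update` whose new value differs from `current`."""
    return [
        name
        for name in RECONNECT_FIELDS
        if name in update and update[name] != current.get(name)
    ]
```

An empty result means the update repeats the current values, so the caller can skip sending it.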

Default endpoint

The plugin connects to:
wss://api.slng.ai/v1/bridges/unmute/tts/{model}
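For debugging connectivity or firewall allow-lists, the STT and TTS endpoint templates can be expanded by hand. The sketch below uses a hypothetical helper name and assumes the model identifier is interpolated into the path as-is, without URL-encoding; the plugin may do this differently:

```python
def bridge_url(kind: str, model: str, base_url: str = "api.slng.ai") -> str:
    """Expand the WebSocket endpoint template for "stt" or "tts" (illustrative only)."""
    if kind not in ("stt", "tts"):
        raise ValueError(f"kind must be 'stt' or 'tts', got {kind!r}")
    return f"wss://{base_url}/v1/bridges/unmute/{kind}/{model}"
```

Passing a different base_url mirrors the constructor argument of the same name for self-hosted or staging gateways.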

Voice selection

Pick a voice that matches your chosen model. See the Voices pages for what’s available per provider.

HTTP TTS (non-streaming fallback)

For simple request/response synthesis where streaming is not required, use SlngHttpTTSService. It issues one HTTP POST per utterance and returns the full audio body in a single frame.
import os

from pipecat_slng import SlngHttpTTSService

tts = SlngHttpTTSService(
    api_key=os.getenv("SLNG_API_KEY"),
    model="slng/deepgram/aura:2-en",
    voice="aura-2-thalia-en",
)
The HTTP bridge body accepts only {text, voice}; there is no config object. As a result, encoding, sample_rate, language, and speed cannot be configured over HTTP, and the server returns its default audio format. The language and speed arguments are kept for API parity with the WebSocket service but are not sent over the wire.
The service auto-detects WAV (decoded to raw PCM at the file’s sample rate) and plain PCM (passed through at the pipeline’s sample rate). Compressed responses (MP3/Ogg) yield an ErrorFrame — use the streaming SlngTTSService if you need codec control. Pass aiohttp_session= to reuse a shared aiohttp.ClientSession; otherwise one is created internally. Region routing on the HTTP service uses the region and world-part query parameters instead of headers.
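The detection logic described above can be approximated with the standard library. This is a sketch under stated assumptions, not the plugin's implementation: classify_audio is a hypothetical name, a ValueError stands in for the ErrorFrame, and compressed audio is recognized only by its "OggS" or "ID3" signature (headerless MP3 streams would need a frame-sync check, which can collide with raw PCM bytes):

```python
import io
import wave


def classify_audio(body: bytes, pipeline_sample_rate: int):
    """Return (pcm_bytes, sample_rate), or raise ValueError for compressed audio."""
    # WAV: RIFF container with a WAVE form type; decode to raw PCM frames.
    if body[:4] == b"RIFF" and body[8:12] == b"WAVE":
        with wave.open(io.BytesIO(body), "rb") as wav:
            return wav.readframes(wav.getnframes()), wav.getframerate()
    # Compressed: Ogg pages start with "OggS"; tagged MP3 files start with "ID3".
    if body[:4] == b"OggS" or body[:3] == b"ID3":
        raise ValueError("compressed audio (MP3/Ogg) is not supported over HTTP")
    # Anything else is treated as plain PCM at the pipeline's sample rate.
    return body, pipeline_sample_rate
```

Note how the sample rate comes from the file header for WAV but from the pipeline for plain PCM, which is why a mismatch between the server's default format and the pipeline rate would only affect the PCM path.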

Good to know

Both WebSocket services default to linear16 PCM encoding and authenticate with api_key. The package exports SlngSTTService, SlngTTSService, and SlngHttpTTSService, plus the SlngSTTSettings and SlngTTSSettings settings classes.
Prefer the streaming SlngTTSService for conversational agents — it supports mid-utterance interruption. Reserve SlngHttpTTSService for batch or non-interactive synthesis.

Next steps