This guide covers what you need to build a production-ready WebSocket integration. For message types and parameters, see the WebSocket protocol reference. Prerequisites:

Best Practices

Items 1 and 2 are required for any production integration. Items 3-6 improve quality and resilience.

1. Connection Management

Implement Reconnection Logic:
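class ResilientWebSocket {
  constructor(url) {
    this.url = url;
    this.backoff = 1000; // Initial reconnect delay in ms
    this.maxBackoff = 30000; // Cap the delay at 30 seconds
    this.shouldReconnect = true; // Set to false by close() to stop retrying
    this.connect();
  }

  connect() {
    this.ws = new WebSocket(this.url);

    this.ws.onopen = () => {
      console.log("Connected");
      this.backoff = 1000; // Reset backoff on successful connection
    };

    this.ws.onclose = () => {
      if (!this.shouldReconnect) return; // Closed on purpose: do not retry
      console.log(`Reconnecting in ${this.backoff}ms`);
      setTimeout(() => this.connect(), this.backoff);
      // Double the delay for the next attempt, up to the cap
      this.backoff = Math.min(this.backoff * 2, this.maxBackoff);
    };
  }

  // Close the socket without triggering a reconnect
  close() {
    this.shouldReconnect = false;
    this.ws.close();
  }
}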
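When many clients lose their connection at the same moment, pure exponential backoff makes them all retry in lockstep. A common refinement is to add random jitter to each delay. The sketch below is one possible way to compute such a delay; `backoffWithJitter` and its parameters are hypothetical names used for illustration, not part of the SLNG API.

```javascript
// Hypothetical helper (an assumption for illustration, not an SLNG API).
// Returns a delay in ms using "full jitter": a random value between 0 and
// the capped exponential backoff for the given 0-based attempt number.
// `random` is injectable so the function can be tested deterministically.
function backoffWithJitter(attempt, baseMs = 1000, maxMs = 30000, random = Math.random) {
  const capped = Math.min(baseMs * 2 ** attempt, maxMs);
  return Math.floor(random() * capped);
}
```

In a reconnecting client, the result could replace the fixed delay passed to setTimeout, with the attempt counter reset to 0 once a connection opens.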

2. Error Handling

Always handle errors gracefully:
ws.onerror = (error) => {
  console.error("WebSocket error:", error);
  // Notify user, attempt recovery
};

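ws.onmessage = (event) => {
  if (typeof event.data !== "string") return; // Only text frames carry JSON

  let message;
  try {
    message = JSON.parse(event.data);
  } catch (err) {
    // A malformed frame should not crash the message handler
    console.error("Received malformed JSON:", err);
    return;
  }

  if (message.type === "error") {
    // handleServerError is your application's own error handler
    handleServerError(message.code, message.message);
  }
};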

3. Buffer Management (TTS)

Use flush strategically to control latency vs. quality:
// For low latency: flush after each sentence
ws.send(JSON.stringify({ type: "text", text: sentence }));
ws.send(JSON.stringify({ type: "flush" }));

// For better quality: batch multiple sentences
ws.send(JSON.stringify({ type: "text", text: paragraph }));
// ... send more text ...
ws.send(JSON.stringify({ type: "flush" })); // Flush at the end
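The two strategies above can be wrapped in a single helper. This is a minimal sketch, assuming a naive sentence splitter is good enough for your text; `splitSentences` and `sendWithFlush` are hypothetical names, and only the text and flush message shapes come from this guide.

```javascript
// Hypothetical helper: splits on ., !, or ? followed by whitespace.
// This is a naive splitter and will mis-split abbreviations such as "Dr. Smith".
function splitSentences(text) {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Sends text one sentence at a time.
// perSentence = true  -> flush after every sentence (lower latency)
// perSentence = false -> single flush at the end (batched)
function sendWithFlush(ws, text, { perSentence = true } = {}) {
  const sentences = splitSentences(text);
  for (const sentence of sentences) {
    ws.send(JSON.stringify({ type: "text", text: sentence }));
    if (perSentence) ws.send(JSON.stringify({ type: "flush" }));
  }
  if (!perSentence && sentences.length > 0) {
    ws.send(JSON.stringify({ type: "flush" }));
  }
}
```

Any object with a send(string) method works as `ws` here, which makes the helper easy to exercise with a stub before wiring it to a live socket.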

4. Audio Format Consistency

Ensure the audio you send matches the format declared in your configuration:
// Configuration
{
  encoding: 'linear16',    // 16-bit PCM
  sample_rate: 16000       // 16kHz
}

// Audio capture must match:
// - 16-bit samples
// - 16000 Hz sample rate
// - Single channel (mono)
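Browser audio capture through the WebAudio API produces 32-bit float samples in the range [-1, 1], so they need converting before they match a linear16 configuration. The following is a minimal sketch of that conversion; `floatTo16BitPCM` is a hypothetical helper name, and the sketch assumes the input is already mono and already at the configured sample rate (it does not resample or downmix).

```javascript
// Hypothetical helper: converts Float32 samples in [-1, 1] to signed 16-bit PCM.
// Int16Array uses the platform byte order, which is little-endian on
// virtually all current hardware.
function floatTo16BitPCM(float32Samples) {
  const pcm = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    // Clamp first so out-of-range input cannot overflow the 16-bit range
    const s = Math.max(-1, Math.min(1, float32Samples[i]));
    // Negative values scale to -32768, positive values to 32767
    pcm[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return pcm;
}
```

The resulting `pcm.buffer` holds two bytes per sample and can be passed to ws.send() as a binary frame.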

5. Heartbeat/Keep-Alive

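Send keep-alive messages to prevent idle disconnection. For STT sessions, use the built-in keepalive message type, as in the example below. For TTS, use a WebSocket ping frame. Note that the browser WebSocket API cannot send ping frames, so this requires a non-browser client such as the Node.js ws library, which exposes ws.ping().
// STT: use the protocol-level keepalive message
const keepaliveInterval = setInterval(() => {
  if (ws.readyState === WebSocket.OPEN) {
    ws.send(JSON.stringify({ type: "keepalive" }));
  }
}, 10000); // Every 10 seconds

// Use addEventListener so this does not overwrite an existing onclose handler
ws.addEventListener("close", () => {
  clearInterval(keepaliveInterval);
});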

6. Interruption Handling (TTS)

SLNG Voice Agents handle interruptions automatically via enable_interruptions, so this section only applies if you manage TTS WebSocket connections yourself.
When building your own voice agent, stop TTS output as soon as the user starts speaking:
  • Send { "type": "close" } to stop server-side generation and end the session
  • Send { "type": "clear" } to discard queued audio (keeps the session open)
  • Clear your local audio buffer and stop playback
For complete patterns (immediate interrupt, clear-and-restart, fade-out, voice agent loop), see TTS WebSocket examples.
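As a rough illustration of the clear-and-keep-session path, the sketch below builds a handler to call when your voice activity detection reports that the user has started speaking. The `player` object, with a `queue` array and a `stop()` method, is a hypothetical stand-in for your own playback code. Only the clear message shape comes from this guide.

```javascript
// Hypothetical sketch: `player` is your own playback object, assumed to expose
// a `queue` array of pending audio chunks and a `stop()` method.
function createInterruptHandler(ws, player) {
  const OPEN = 1; // Numeric value of WebSocket.OPEN
  return function onUserSpeechStart() {
    // Ask the server to discard queued audio while keeping the session open
    if (ws.readyState === OPEN) {
      ws.send(JSON.stringify({ type: "clear" }));
    }
    player.queue.length = 0; // Drop audio buffered locally but not yet played
    player.stop(); // Stop whatever is currently playing
  };
}
```

Local playback is stopped even when the socket is not open, so the user is never talked over while a reconnect is in progress.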

Troubleshooting

Problem: WebSocket disconnects unexpectedly
  • Implement reconnection logic with exponential backoff (see Connection Management above)
  • Send periodic keep-alive messages to prevent idle timeouts (see Keep-Alive)
  • If behind a corporate proxy, confirm it supports WebSocket upgrades (it must pass the Upgrade: websocket and Connection: Upgrade headers through)
  • Run a WebSocket echo test (wscat -c wss://echo.websocket.org) to rule out local network issues
Problem: Choppy or distorted audio playback
  • Buffer at least 200ms of audio before starting playback to absorb network jitter
  • Use the WebAudio API (AudioContext) instead of <audio> elements for gapless chunk playback
  • Confirm the sample rate in your audio player matches the sample_rate from your init config
  • 24 kHz mono linear16 audio requires ~384 kbps (24,000 samples/s × 16 bits); verify your connection can sustain this
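The pre-buffering advice can be sketched as a small gate that holds incoming chunks until roughly 200 ms of audio has arrived, then releases everything to the player. `bytesForDuration` and `PrebufferGate` are hypothetical names, and the sketch assumes mono linear16 audio delivered as ArrayBuffer or typed-array chunks.

```javascript
// Bytes needed to hold `durationMs` of PCM audio (defaults: 16-bit, mono)
function bytesForDuration(sampleRate, durationMs, bytesPerSample = 2, channels = 1) {
  return Math.round((sampleRate * bytesPerSample * channels * durationMs) / 1000);
}

// Hypothetical helper: withholds chunks until the pre-buffer threshold is met
class PrebufferGate {
  constructor(sampleRate, prebufferMs = 200) {
    this.threshold = bytesForDuration(sampleRate, prebufferMs);
    this.chunks = [];
    this.bufferedBytes = 0;
    this.started = false;
  }

  // Returns the chunks to hand to the player (empty while still pre-buffering)
  push(chunk) {
    if (this.started) return [chunk];
    this.chunks.push(chunk);
    this.bufferedBytes += chunk.byteLength;
    if (this.bufferedBytes < this.threshold) return [];
    this.started = true;
    const ready = this.chunks;
    this.chunks = [];
    return ready;
  }
}
```

For 24 kHz linear16 audio the threshold works out to 9,600 bytes. A fuller implementation would also re-arm the gate after an underrun, which this sketch leaves out.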
Problem: Transcription results lag behind audio
  • Send audio in 20-100ms chunks (640-3200 bytes, i.e. 320-1600 samples, at 16kHz mono linear16) rather than large buffers
  • Measure round-trip time with Date.now() around send/receive to isolate network vs. server latency
  • Confirm encoding and sample_rate in your init config match your actual audio format
  • For real-time use, prefer Deepgram Nova, which is optimized for streaming latency
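Chunk sizes follow directly from the audio format: bytes per chunk = sample rate × bytes per sample × duration. The sketch below computes that size and slices a larger PCM buffer into chunks of it. `chunkSizeBytes` and `splitIntoChunks` are hypothetical helper names, and mono linear16 input is assumed.

```javascript
// Bytes in one chunk of mono PCM audio (default: 2 bytes per sample for linear16)
function chunkSizeBytes(sampleRate, chunkMs, bytesPerSample = 2) {
  return Math.round((sampleRate * bytesPerSample * chunkMs) / 1000);
}

// Yields consecutive views of `pcmBytes` (a Uint8Array), each at most `size`
// bytes long. subarray() creates views without copying the underlying data.
function* splitIntoChunks(pcmBytes, size) {
  for (let offset = 0; offset < pcmBytes.byteLength; offset += size) {
    yield pcmBytes.subarray(offset, offset + size);
  }
}
```

At 16 kHz, a 20 ms chunk is 640 bytes and a 100 ms chunk is 3,200 bytes. Each yielded chunk can be passed to ws.send() as a binary frame.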
Problem: Connection rejected with 401
  • Pass the API key as a header during the WebSocket handshake: Authorization: Bearer YOUR_KEY
  • Verify your key is active in the SLNG dashboard
  • Check for trailing whitespace or newlines in the key string
  • Some WebSocket clients, including the browser WebSocket API, don’t support custom headers; pass the key as a query parameter (?token=YOUR_KEY) if needed
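For clients that cannot set headers, the query-parameter approach might look like the sketch below. It also trims the key, which covers the trailing whitespace case. `buildAuthenticatedUrl` is a hypothetical helper and the endpoint in the usage note is a placeholder; take the real connection URL from the WebSocket protocol reference.

```javascript
// Hypothetical helper: appends the API key as the `token` query parameter.
// Trimming guards against stray whitespace or newlines, e.g. from an env file.
function buildAuthenticatedUrl(baseUrl, apiKey) {
  const key = apiKey.trim();
  if (key.length === 0) throw new Error("API key is empty");
  const url = new URL(baseUrl);
  url.searchParams.set("token", key); // Also URL-encodes the key if needed
  return url.toString();
}
```

For example, buildAuthenticatedUrl("wss://example.invalid/stt", process.env.API_KEY) produces a URL you can pass straight to new WebSocket(). Keep in mind that query strings tend to end up in proxy and server logs, so prefer the Authorization header where your client supports it.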

Next Steps

  • WebSocket protocol: message types, parameters, and connection URLs
  • Protocol comparison: HTTP vs. WebSocket, and when to use each
  • TTS examples: JavaScript and Python code for real-time TTS
  • STT examples: JavaScript and Python code for real-time STT