HTTP vs. WebSocket

When building Voice AI applications, you will see references to two protocols: HTTP and WebSocket. The SLNG API supports both for integration, and each has its own strengths and trade-offs.

In this guide, we'll break down the differences between the two protocols, their use cases, and when to choose each for your integration.

Quick Comparison

Feature	HTTP	WebSocket
Direction	Request → Response	Bidirectional
Latency	Medium (200-500ms)	Lowest (sub-100ms)
Complexity	Simple	Higher
Connection	Per request	Persistent
Binary Data	Yes	Yes
Real-time	No	Yes
Best For	Batch processing	Voice agents

HTTP Protocol

Overview

HTTP follows the traditional request/response pattern: the client sends a request and waits for the complete response.

When to Use

Batch transcription
Pre-recorded TTS generation
Simple integrations
No real-time requirements
Stateless operations

Characteristics

Pros:

Simplest to implement
No connection management
Works everywhere
Easy debugging
Cacheable responses

Cons:

Higher latency
No streaming
Less efficient for multiple requests
Waits for complete response

Example: TTS

POST https://api.slng.ai/v1/tts/deepgram/aura:2

{
  "text": "Convert this to speech",
  "encoding": "linear16",
  "sample_rate": 24000
}

# Response: Complete audio file
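
From JavaScript, the request above could be issued roughly as follows. This is a minimal sketch, not the official client: `buildTtsRequest` and `synthesize` are hypothetical helper names, and the `Authorization: Bearer` scheme is an assumption, since this page only says that the API key goes in a header.

```javascript
// Hypothetical helper: builds the fetch() arguments for the TTS example above.
// ASSUMPTION: the API key is sent as a Bearer token. This page does not name
// the header, so verify it against the API reference.
function buildTtsRequest(text, apiKey) {
  return {
    url: "https://api.slng.ai/v1/tts/deepgram/aura:2",
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        text,
        encoding: "linear16",
        sample_rate: 24000,
      }),
    },
  };
}

// Sends the request and resolves with the complete audio as an ArrayBuffer.
async function synthesize(text, apiKey) {
  const { url, options } = buildTtsRequest(text, apiKey);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`TTS request failed with status ${response.status}`);
  }
  return response.arrayBuffer();
}
```

Keeping the request construction separate from the network call makes the payload easy to inspect or log before anything is sent.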

Example: STT

POST https://api.slng.ai/v1/stt/slng/openai/whisper:large-v3
Content-Type: multipart/form-data

file=@audio.mp3

# Response: Complete transcription
{
  "text": "Transcribed content",
  "segments": [...]
}
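
A file upload like the one above might look like this in JavaScript. This is a sketch under assumptions: `buildSttRequest` and `transcribe` are hypothetical names, the Bearer-token header is assumed rather than documented here, and the code relies on the global `FormData`, `Blob`, and `fetch` found in browsers and recent Node.js versions.

```javascript
// Hypothetical helper: builds the fetch() arguments for a multipart upload.
// ASSUMPTION: Bearer-token auth; verify the header name in the API reference.
function buildSttRequest(audioBytes, filename, apiKey) {
  const form = new FormData();
  // "file" is the form field name used in the example above.
  form.append("file", new Blob([audioBytes]), filename);
  return {
    url: "https://api.slng.ai/v1/stt/slng/openai/whisper:large-v3",
    options: {
      method: "POST",
      // No Content-Type header here: fetch sets multipart/form-data with the
      // correct boundary itself when the body is a FormData instance.
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form,
    },
  };
}

// Uploads the audio and resolves with the "text" field of the response.
async function transcribe(audioBytes, filename, apiKey) {
  const { url, options } = buildSttRequest(audioBytes, filename, apiKey);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`STT request failed with status ${response.status}`);
  }
  const result = await response.json();
  return result.text;
}
```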

WebSocket Protocol

Overview

WebSocket provides bidirectional, full-duplex communication over a persistent connection. It offers the lowest latency of the two protocols, at the cost of more involved connection handling.

When to Use

Real-time voice agents
Interactive applications
Lowest latency required
Bidirectional streaming
Binary data streaming
Continuous communication

Characteristics

Pros:

Bidirectional streaming
Lowest latency (sub-100ms)
Binary data support
Most efficient for high-frequency updates
Real-time feedback

Cons:

More complex to implement
Connection management critical
Reconnection logic needed
Firewall/proxy issues possible
Stateful connection

Example: TTS WebSocket


const ws = new WebSocket("wss://api.slng.ai/v1/tts/deepgram/aura:2");
// Receive binary frames as ArrayBuffer (browsers default to Blob)
ws.binaryType = "arraybuffer";

ws.onopen = () => {
  // Initialize session
  ws.send(
    JSON.stringify({
      type: "init",
      config: {
        encoding: "linear16",
        sample_rate: 24000,
      },
    }),
  );

  // Send text to convert
  ws.send(
    JSON.stringify({
      type: "speak",
      text: "Hello from WebSocket!",
    }),
  );
};

ws.onmessage = (event) => {
  if (event.data instanceof ArrayBuffer) {
    // Binary audio data
    playAudio(event.data);
  } else {
    // JSON control messages
    const message = JSON.parse(event.data);
    console.log("Server message:", message);
  }
};

ws.onerror = (error) => {
  console.error("WebSocket error:", error);
};

ws.onclose = () => {
  console.log("Connection closed");
  // Implement reconnection logic here
};
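
If the audio needs to be kept rather than played chunk by chunk, the message handler can collect frames instead. The sketch below is one possible approach, not part of the SLNG API: `createTtsReceiver` is a hypothetical helper, and it assumes the socket delivers binary frames as `ArrayBuffer` (that is, `ws.binaryType = "arraybuffer"`).

```javascript
// Hypothetical helper: separates binary audio frames from JSON control
// messages and lets the caller join the audio into one buffer.
function createTtsReceiver() {
  const chunks = [];
  const controlMessages = [];

  return {
    controlMessages,

    // Use as: ws.onmessage = (event) => receiver.handleMessage(event);
    // Requires ws.binaryType = "arraybuffer".
    handleMessage(event) {
      if (event.data instanceof ArrayBuffer) {
        chunks.push(new Uint8Array(event.data));
      } else {
        controlMessages.push(JSON.parse(event.data));
      }
    },

    // Joins every chunk received so far into one contiguous Uint8Array.
    audio() {
      const total = chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0);
      const out = new Uint8Array(total);
      let offset = 0;
      for (const chunk of chunks) {
        out.set(chunk, offset);
        offset += chunk.byteLength;
      }
      return out;
    },
  };
}
```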

Example: STT WebSocket


const ws = new WebSocket("wss://api.slng.ai/v1/stt/deepgram/nova:2");

ws.onopen = () => {
  // Initialize session
  ws.send(
    JSON.stringify({
      type: "init",
      config: {
        language: "en",
        sample_rate: 16000,
      },
    }),
  );

  // Stream audio data (binary)
  microphone.ondata = (audioChunk) => {
    ws.send(audioChunk); // Send raw audio bytes
  };
};

ws.onmessage = (event) => {
  const message = JSON.parse(event.data);

  if (message.type === "transcript") {
    console.log("Transcript:", message.text);
    console.log("Is final:", message.is_final);
    console.log("Confidence:", message.confidence);
  }
};
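
The `is_final` flag suggests that the server sends interim results before a final one. A minimal sketch of assembling a running transcript follows. It assumes that each interim message replaces the previous interim text and that a final message closes the current segment. This page does not spell out those semantics, and `createTranscriptAssembler` is a hypothetical helper.

```javascript
// Hypothetical helper: builds a running transcript from "transcript" messages.
// ASSUMPTION: interim results replace each other until a final result arrives.
function createTranscriptAssembler() {
  const finals = []; // Text of segments marked is_final
  let interim = ""; // Latest non-final text for the current segment

  return {
    // Use as: ws.onmessage = (event) => assembler.handleMessage(event);
    handleMessage(event) {
      const message = JSON.parse(event.data);
      if (message.type !== "transcript") return;
      if (message.is_final) {
        finals.push(message.text);
        interim = "";
      } else {
        interim = message.text;
      }
    },

    // Finalized segments followed by the current interim text, if any.
    text() {
      return [...finals, interim].filter(Boolean).join(" ");
    },
  };
}
```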

Protocol Selection Guide

Use HTTP When:

Processing pre-recorded audio
Batch transcription of files
Generating audio for download
Simple, one-off requests
Caching is beneficial

Use WebSocket When:

Building voice agents
Need real-time interaction
Require lowest latency
Need bidirectional streaming
Streaming binary audio data
Continuous back-and-forth communication
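
The two lists above reduce to a simple rule, sketched here as code. The function name and option names are illustrative only and do not correspond to any SLNG API parameter.

```javascript
// Illustrative only: encodes the selection guide above as a decision rule.
// Any real-time, bidirectional, or live-audio requirement points to WebSocket;
// everything else (files, batch jobs, one-off requests) fits HTTP.
function chooseProtocol({
  realTime = false,
  bidirectional = false,
  liveAudio = false,
} = {}) {
  return realTime || bidirectional || liveAudio ? "websocket" : "http";
}
```

For example, `chooseProtocol({ realTime: true })` returns `"websocket"`, while `chooseProtocol()` for a batch transcription job returns `"http"`.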

Implementation Best Practices

For All Protocols

Authentication: Always include API key in headers
Error Handling: Implement robust error handling
Rate Limiting: Respect rate limits and implement backoff
Timeouts: Set appropriate timeouts
Logging: Log requests for debugging
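
As an illustration of the rate-limiting and error-handling points, the sketch below decides whether a failed HTTP request should be retried and how long to wait. It assumes that rate limiting is signaled with status 429 and, optionally, a standard `Retry-After` header given in seconds. This page does not confirm either detail, and `retryDelayMs` is a hypothetical helper.

```javascript
// Hypothetical helper: returns the wait time in ms before retrying, or null
// if the status should not be retried.
// ASSUMPTION: the API uses 429 for rate limiting and may send Retry-After
// as a number of seconds (the HTTP-date form is not handled here).
function retryDelayMs(status, attempt, retryAfter, baseMs = 500, capMs = 30000) {
  const retryable = status === 429 || (status >= 500 && status <= 599);
  if (!retryable) return null;

  // Prefer the server's hint when it is a valid number of seconds.
  if (retryAfter != null && retryAfter !== "") {
    const seconds = Number(retryAfter);
    if (Number.isFinite(seconds) && seconds >= 0) {
      return Math.min(seconds * 1000, capMs);
    }
  }

  // Otherwise back off exponentially: baseMs, 2x, 4x, ... up to capMs.
  return Math.min(baseMs * 2 ** attempt, capMs);
}
```

A caller would pass `response.status`, a zero-based attempt counter, and `response.headers.get("Retry-After")`, then sleep for the returned delay before retrying.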

For WebSocket

Reconnection Logic: Implement exponential backoff
Heartbeat/Ping: Keep connection alive
Message Queuing: Queue messages during reconnection
State Management: Track connection state
Binary Handling: Properly handle binary vs text frames
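
Message queuing and state tracking can be combined in a small wrapper. The sketch below is one possible design, not an SLNG utility: `QueuedSender` is a hypothetical class that buffers outgoing messages while the socket is not open and flushes them when a new socket is attached. The backlog is capped because stale audio is rarely worth sending late.

```javascript
const OPEN = 1; // Numeric value of WebSocket.OPEN

// Hypothetical wrapper: queues messages while disconnected.
class QueuedSender {
  constructor(maxQueued = 100) {
    this.socket = null;
    this.queue = [];
    this.maxQueued = maxQueued;
  }

  // Call from the onopen handler of each new socket; flushes the backlog.
  attach(socket) {
    this.socket = socket;
    while (this.queue.length > 0 && socket.readyState === OPEN) {
      socket.send(this.queue.shift());
    }
  }

  // Sends immediately when open, otherwise queues the message.
  send(data) {
    if (this.socket && this.socket.readyState === OPEN) {
      this.socket.send(data);
      return;
    }
    this.queue.push(data);
    // Drop the oldest message once the backlog exceeds the cap.
    if (this.queue.length > this.maxQueued) this.queue.shift();
  }
}
```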

Common Patterns

Pattern: HTTP with Polling (Async Job)


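// Helper: resolves after the given number of milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Submit job
const job = await fetch("/tts/...", { method: "POST", body: data });
const { id: jobId } = await job.json();

// Poll for completion
while (true) {
  const res = await fetch(`/jobs/${jobId}`);
  const status = await res.json();
  if (status.complete) break;
  await sleep(1000);
}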
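
An unbounded polling loop never returns if the job stalls. A bounded variant might look like the sketch below. `pollUntilComplete` is a hypothetical helper: the status check and the sleep function are passed in so that the loop can be reused and tested, and only the `complete` field is taken from the pattern above.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: calls fetchStatus() until it reports completion or
// the attempt limit is reached.
// fetchStatus must resolve to an object with a boolean "complete" field.
async function pollUntilComplete(
  fetchStatus,
  { intervalMs = 1000, maxAttempts = 60, sleep = defaultSleep } = {},
) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await fetchStatus();
    if (status.complete) return status;
    // Wait between attempts, but not after the last one.
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`Job did not complete after ${maxAttempts} attempts`);
}

// Usage with the job endpoint from the pattern above:
//   const status = await pollUntilComplete(async () => {
//     const res = await fetch(`/jobs/${jobId}`);
//     return res.json();
//   });
```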

Pattern: WebSocket with Reconnection


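class ResilientWebSocket {
  constructor(url) {
    this.url = url;
    this.backoff = 1000; // Initial reconnect delay in ms
  }

  connect() {
    this.ws = new WebSocket(this.url);
    this.ws.onclose = () => {
      // Retry after the current delay, then double it (capped at 30s)
      setTimeout(() => this.connect(), this.backoff);
      this.backoff = Math.min(this.backoff * 2, 30000);
    };
    this.ws.onopen = () => {
      this.backoff = 1000; // Reset backoff
    };
  }
}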
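
To make the doubling-with-cap behavior concrete, the sketch below lists the delays such a class would use across consecutive failed attempts. Both function names are hypothetical. The optional jitter step is a common addition that this page does not mention: it spreads out reconnects so that many clients do not retry at the same instant.

```javascript
// Delays (in ms) for consecutive failed reconnects: start at initialMs,
// double each time, never exceed capMs.
function backoffSchedule(attempts, initialMs = 1000, capMs = 30000) {
  const delays = [];
  let delay = initialMs;
  for (let i = 0; i < attempts; i++) {
    delays.push(delay);
    delay = Math.min(delay * 2, capMs);
  }
  return delays;
}

// Optional "full jitter": picks a random delay in [0, delayMs).
// The random source is injectable so the function can be tested.
function withJitter(delayMs, random = Math.random) {
  return Math.floor(random() * delayMs);
}

// backoffSchedule(7) → [1000, 2000, 4000, 8000, 16000, 30000, 30000]
```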

Performance Characteristics

Latency Comparison (Typical)

HTTP: 200-500ms (includes full round trip)
WebSocket: 50-100ms (full-duplex, minimal overhead)

Throughput Comparison

HTTP: ~10-50 requests/second (depends on connection pool)
WebSocket: ~100+ messages/second (per connection)
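
For audio streaming, the message rate follows directly from the chunk duration. The sketch below works through that arithmetic for uncompressed PCM. The 16-bit sample size, mono channel count, and 20 ms chunk duration are assumptions chosen for illustration, and `chunkStats` is a hypothetical helper.

```javascript
// Computes bandwidth and message rate for streaming raw PCM audio.
// ASSUMPTIONS: 16-bit samples (2 bytes), mono, 20 ms chunks by default.
function chunkStats({ sampleRate, bytesPerSample = 2, channels = 1, chunkMs = 20 }) {
  const bytesPerSecond = sampleRate * bytesPerSample * channels;
  return {
    bytesPerSecond,
    bytesPerChunk: (bytesPerSecond * chunkMs) / 1000,
    messagesPerSecond: 1000 / chunkMs,
  };
}

// chunkStats({ sampleRate: 16000 })
//   → { bytesPerSecond: 32000, bytesPerChunk: 640, messagesPerSecond: 50 }
```

Under these assumptions, one 16 kHz audio stream sends 50 messages per second, which is within the per-connection figure quoted above.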

Resource Usage

HTTP: Low (stateless, no persistent connections)
WebSocket: Medium-High (persistent, bidirectional)

Last modified on February 12, 2026

