SLNG Speech-to-Text API

Whisper Large v3

Endpoint: https://api.slng.ai

Speech-to-text using the Whisper Large v3 model hosted by SLNG. Supports 99+ languages with automatic language detection. Best suited to general-purpose transcription where high accuracy matters.

openai/whisper:large-v3 - websocket

GET

https://api.slng.ai

/v1/stt/slng/openai/whisper:large-v3

Speech-to-Text API for converting audio to text using the SLNG-hosted openai/whisper model.

WebSocket Endpoint

Establishes a WebSocket connection for real-time speech-to-text.

Connection URL: wss://api.slng.ai/v1/stt/slng/openai/whisper:large-v3

openai/whisper:large-v3 - websocket › Headers

Authorization

string · required

The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.

Upgrade

string · enum · required

Enum values:

websocket

Connection

string · enum · required

Enum values:

Upgrade

X-Region-Override

string · enum

Optional. Specify a target region for this model. If not provided, the system will automatically select an appropriate region.

Enum values:

eu-west
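As a minimal sketch of the headers listed above, the helper below assembles them into a dictionary. The function name `build_ws_headers` and the constants are hypothetical, not part of the SLNG API. The header values come from the enum values in this section. Many WebSocket client libraries set `Upgrade` and `Connection` themselves, in which case only `Authorization` and the optional `X-Region-Override` would need to be passed.

```python
from typing import Optional

# Connection URL as given in this reference.
WS_URL = "wss://api.slng.ai/v1/stt/slng/openai/whisper:large-v3"

# The only region enum value documented in this section.
ALLOWED_REGIONS = {"eu-west"}


def build_ws_headers(api_key: str, region: Optional[str] = None) -> dict:
    """Return the handshake headers described in the Headers section.

    Hypothetical helper for illustration only.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Upgrade": "websocket",
        "Connection": "Upgrade",
    }
    if region is not None:
        if region not in ALLOWED_REGIONS:
            raise ValueError(f"unsupported region: {region!r}")
        headers["X-Region-Override"] = region
    return headers
```

Omitting `region` leaves region selection to the service, as described for `X-Region-Override`.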

openai/whisper:large-v3 - websocket › Request Body

oneOf

Exactly one variant must match.

Decision Table

Variant	Matching Criteria
	type = object · requires: type
	type = object · requires: type, data
	type = object · requires: type

Properties for Init Message:

Initialize a session with recognition configuration.

type

const · required

Const value: init

config

object

Whisper recognition configuration
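The Init message can be sketched as a small JSON builder. The `config` keys used here (`language`, `sample_rate`, `encoding`, `enable_vad`, `enable_partials`) are taken from the example request body in this reference. The helper name `build_init_message` is hypothetical. Because the decision table marks only `type` as required, the sketch assumes `config` may be omitted.

```python
import json
from typing import Any, Optional


def build_init_message(config: Optional[dict] = None) -> str:
    """Serialize an Init message; only `type` is marked as required.

    Hypothetical helper for illustration only.
    """
    message: dict = {"type": "init"}
    if config:
        message["config"] = config
    return json.dumps(message)


# Config values copied from the example request body in this reference.
init_frame = build_init_message({
    "language": "en",
    "sample_rate": 16000,
    "encoding": "linear16",
    "enable_vad": True,
    "enable_partials": True,
})
```

The resulting `init_frame` string would be sent as the first text frame after the connection opens.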

openai/whisper:large-v3 - websocket › Responses

Switching Protocols

oneOf

Exactly one variant must match.

Decision Table

Variant	Matching Criteria
	type = object · requires: type, session_id
	type = object · requires: type, transcript
	type = object · requires: type, transcript
	type = object · requires: type, code, message

Properties for Ready Message:

Indicates the session is ready to receive audio.

type

const · required

Const value: ready

session_id

string · required

Unique session identifier
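A rough sketch of handling incoming messages, based on the required fields in the decision table above. Only the `ready` type constant is documented here, so the other variants are recognized by their required fields rather than by a type string. The labels `"transcript"`, `"error"`, and `"unknown"` and the function name are assumptions made for illustration.

```python
import json


def classify_server_message(raw: str) -> tuple:
    """Return (kind, payload) for one JSON text frame from the server.

    Hypothetical helper; variant labels other than "ready" are assumed.
    """
    payload = json.loads(raw)
    if not isinstance(payload, dict) or "type" not in payload:
        raise ValueError("every documented variant requires a 'type' field")
    # Ready Message: type = ready, plus a session_id.
    if payload["type"] == "ready" and "session_id" in payload:
        return "ready", payload
    # Two variants require a transcript; their type constants are not shown.
    if "transcript" in payload:
        return "transcript", payload
    # One variant requires code and message, which suggests an error report.
    if "code" in payload and "message" in payload:
        return "error", payload
    return "unknown", payload
```

A client would typically wait for the `ready` result before streaming audio, since that message indicates the session can receive it.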

GET /v1/stt/slng/openai/whisper:large-v3

curl --request GET \
  --url https://api.slng.ai/v1/stt/slng/openai/whisper:large-v3 \
  --header 'Authorization: Bearer YOUR_KEY_HERE' \
  --header 'Connection: Upgrade' \
  --header 'Content-Type: application/json' \
  --header 'Upgrade: websocket' \
  --data '
{
  "type": "init",
  "config": {
    "language": "en",
    "sample_rate": 16000,
    "encoding": "linear16",
    "enable_vad": true,
    "enable_partials": true
  }
}
'

Example Request Body

{
  "type": "init",
  "config": {
    "language": "en",
    "sample_rate": 16000,
    "encoding": "linear16",
    "enable_vad": true,
    "enable_partials": true
  }
}

Example Responses

{
  "type": "ready",
  "session_id": "session_abc123"
}

openai/whisper:large-v3 - http

POST

https://api.slng.ai

/v1/stt/slng/openai/whisper:large-v3

openai/whisper:large-v3 - http › Headers

Authorization

string · required

The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.

X-Region-Override

string · enum

Optional. Specify a target region for this model. If not provided, the system will automatically select an appropriate region.

Enum values:

eu-west

openai/whisper:large-v3 - http › Request Body

audio

string · binary

Audio file to transcribe (mp3, mp4, wav, webm, etc.)

language

string

ISO 639-1 language code. Optional; the language is auto-detected if not provided.

Default: en

response_format

string · enum

Response format

Enum values:

json

text

verbose_json

temperature

number · min: 0 · max: 1

Sampling temperature

url

string

URL to audio file
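The request body fields above can be sketched with the standard library as follows. This builds a JSON request that points at a remote audio file via `url`, and checks the documented `response_format` enum and `temperature` range before anything is sent. The function and constant names are hypothetical. Uploading binary `audio` is not shown, because this reference does not state how that field is encoded.

```python
import json
import urllib.request
from typing import Optional

HTTP_URL = "https://api.slng.ai/v1/stt/slng/openai/whisper:large-v3"
RESPONSE_FORMATS = {"json", "text", "verbose_json"}  # documented enum values


def build_transcription_request(
    api_key: str,
    audio_url: str,
    language: Optional[str] = None,
    response_format: Optional[str] = None,
    temperature: Optional[float] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request. Hypothetical helper."""
    body: dict = {"url": audio_url}
    if language is not None:
        body["language"] = language
    if response_format is not None:
        if response_format not in RESPONSE_FORMATS:
            raise ValueError(f"unsupported response_format: {response_format!r}")
        body["response_format"] = response_format
    if temperature is not None:
        if not 0 <= temperature <= 1:
            raise ValueError("temperature must be between 0 and 1")
        body["temperature"] = temperature
    return urllib.request.Request(
        HTTP_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` to perform the call.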

openai/whisper:large-v3 - http › Responses

Successful transcription

text

string · required

Transcribed text.

language

string · required

Detected or specified language code.

duration

number

Audio duration in seconds.

confidence

number

Confidence score (0.0-1.0).

object[]

Word-level transcription with timing and confidence.

request_id

string · uuid

Unique request identifier.

model

string

Model used for transcription.
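A minimal sketch of reading a JSON response into a typed record, using the fields documented above. `text` and `language` are treated as required and the rest as optional. The class and function names are hypothetical. The word-level array is left out because its field name is not given in this reference.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class TranscriptionResult:
    """Subset of the documented response fields (illustrative only)."""
    text: str
    language: str
    duration: Optional[float] = None
    confidence: Optional[float] = None
    request_id: Optional[str] = None
    model: Optional[str] = None


def parse_transcription(raw: str) -> TranscriptionResult:
    """Parse a JSON response body, rejecting one without required fields."""
    payload = json.loads(raw)
    missing = [key for key in ("text", "language") if key not in payload]
    if missing:
        raise ValueError(f"response is missing required fields: {missing}")
    # Keep only the fields modeled above; ignore anything else.
    known = TranscriptionResult.__dataclass_fields__
    return TranscriptionResult(**{k: v for k, v in payload.items() if k in known})
```

Fields the service returns beyond those modeled here are ignored rather than treated as errors.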

POST /v1/stt/slng/openai/whisper:large-v3

curl --request POST \
  --url https://api.slng.ai/v1/stt/slng/openai/whisper:large-v3 \
  --header 'Authorization: Bearer YOUR_KEY_HERE' \
  --header 'Content-Type: application/json' \
  --data '
{
  "url": "https://docs.slng.ai/audio/hello.wav",
  "language": "en"
}
'

Example Request Body

{
  "url": "https://docs.slng.ai/audio/hello.wav",
  "language": "en"
}

Example Responses

{
  "text": "Hello, this is a sample transcription of the audio file.",
  "language": "en"
}
