Real-time speech-to-text transcription with ultra-low latency using Deepgram's Nova model. Optimized for streaming audio with intelligent Voice Activity Detection (VAD) and speaker diarization.
deepgram/nova:3-multi - websocket
Speech-to-Text API for converting audio files to text using SLNG deepgram/nova.
WebSocket Endpoint
Establishes a WebSocket connection for real-time speech-to-text.
Connection URL: wss://api.slng.ai/v1/stt/slng/deepgram/nova:3-multi
Headers
Authorization: Authenticates with the API using your API key. The value has the format Bearer YOUR_KEY_HERE.
Upgrade
Connection
X-Region-Override: Optional. Specifies a target region for this model. If not provided, the system automatically selects an appropriate region.
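A minimal sketch of assembling the connection URL and headers listed above. The helper name `build_ws_handshake` is hypothetical. Only Authorization and the optional X-Region-Override are set here, on the assumption that the WebSocket client library adds Upgrade and Connection itself during the handshake.

```python
from typing import Dict, Optional, Tuple

# Connection URL as given in this section.
WS_URL = "wss://api.slng.ai/v1/stt/slng/deepgram/nova:3-multi"


def build_ws_handshake(
    api_key: str, region: Optional[str] = None
) -> Tuple[str, Dict[str, str]]:
    """Return the connection URL and the headers to send with the handshake.

    `build_ws_handshake` is an illustrative helper, not part of any SDK.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    headers = {"Authorization": f"Bearer {api_key}"}
    # X-Region-Override is optional; omit it to let the service pick a region.
    if region is not None:
        headers["X-Region-Override"] = region
    return WS_URL, headers
```

The returned pair can be handed to whichever WebSocket client is in use; most clients accept extra request headers alongside the URL.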
deepgram/nova:3-multi - websocket › Request Body
Decision Table
| Variant | Matching Criteria |
|---|---|
| type = object · requires: type | |
| type = object · requires: type, data | |
| type = object · requires: type | |
type
model: Model to use for transcription
Recognition configuration options
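The decision table above lists which fields each client message must carry. A rough sketch of checking those fields before sending follows. It assumes messages are sent as JSON text; the helper name `encode_message` and the `type` values `"init"` and `"audio"` are placeholders, since this page does not list the allowed `type` values or how `data` is encoded.

```python
import json
from typing import Any, Dict, Iterable

# Required keys per variant, taken from the decision table.
CONTROL_REQUIRED = ("type",)
DATA_REQUIRED = ("type", "data")


def encode_message(message: Dict[str, Any], required: Iterable[str]) -> str:
    """Serialize a client message after checking its required fields.

    Hypothetical helper; the JSON wire format is an assumption.
    """
    missing = [key for key in required if key not in message]
    if missing:
        raise ValueError(
            "message is missing required field(s): " + ", ".join(missing)
        )
    return json.dumps(message)


# Placeholder type and model values, for illustration only.
init_frame = encode_message({"type": "init", "model": "nova-3"}, CONTROL_REQUIRED)
```

Validating on the client side surfaces a missing `data` field before the frame is sent, rather than as a server error message.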
deepgram/nova:3-multi - websocket › Responses
Switching Protocols
Decision Table
| Variant | Matching Criteria |
|---|---|
| type = object · requires: type, session_id | |
| type = object · requires: type, transcript | |
| type = object · requires: type, transcript | |
| type = object · requires: type, code, message | |
type
session_id: Unique session identifier
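One way to route incoming server messages is to look at which of the required fields from the decision table are present. The sketch below assumes messages arrive as JSON text. The labels `"session"`, `"transcript"`, and `"error"` are names chosen for this example, not values defined by the API, and the two transcript variants cannot be told apart by required fields alone.

```python
import json


def classify_message(raw: str) -> str:
    """Label a server message by the required fields it carries.

    Hypothetical helper; returns "unknown" for anything unrecognized.
    """
    msg = json.loads(raw)
    if not isinstance(msg, dict) or "type" not in msg:
        return "unknown"
    # Variant requiring type, code, message.
    if "code" in msg and "message" in msg:
        return "error"
    # Variants requiring type, transcript.
    if "transcript" in msg:
        return "transcript"
    # Variant requiring type, session_id.
    if "session_id" in msg:
        return "session"
    return "unknown"
```

In practice the `type` field is probably the more reliable discriminator once its values are known; field presence is used here only because those values are not listed on this page.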
deepgram/nova:3-multi - http
Speech-to-Text API for converting audio files to text using SLNG deepgram/nova.
Headers
Authorization: Authenticates with the API using your API key. The value has the format Bearer YOUR_KEY_HERE.
X-Region-Override: Optional. Specifies a target region for this model. If not provided, the system automatically selects an appropriate region.
deepgram/nova:3-multi - http › Request Body
audio: Audio file to transcribe
url: URL to audio file
language: Allowed language codes for deepgram/nova:3-multi.
model: AI model used to process submitted audio (nova-3, nova-2, etc.)
punctuate: Add punctuation and capitalization to the transcript
diarize: Recognize speaker changes. Each word is assigned a speaker number starting at 0
smart_format: Apply formatting to improve transcript readability
utterances: Segment speech into meaningful semantic units
utt_split: Seconds to wait before detecting a pause between words
paragraphs: Split audio into paragraphs to improve transcript readability
numerals: Convert numbers from written format to numerical format
profanity_filter: Convert profanity to nearest non-profane word or remove it
redact: Remove sensitive information (pci, pii, numbers) from transcripts
search: Search for terms or phrases in submitted audio
replace: Search and replace terms in transcript (format "term:replacement")
keywords: Boost or suppress terminology (Nova-2 and earlier, with intensifier like "word:5")
keyterm: Keyterm prompting for specialized terminology (Nova-3 only)
multichannel: Transcribe each audio channel independently
alternatives: Number of alternative transcripts to return
filler_words: Include filler words like "uh" and "um" in transcript
dictation: Dictation mode for controlling formatting with dictated speech
measurements: Convert spoken measurements to abbreviations
encoding: Expected encoding of submitted audio
sample_rate: Sample rate of submitted audio in Hz
channels: Number of audio channels
detect_entities: Identifies and extracts key entities from content in submitted audio
Identifies the dominant language spoken in submitted audio
sentiment: Recognizes the sentiment throughout a transcript
Summarize content. Supports string version option (v2) or boolean.
topics: Detect topics throughout a transcript
custom_topic: Custom topics you want the model to detect within your input audio
custom_topic_mode: Sets how the model will interpret the custom_topic parameter
intents: Recognizes speaker intent throughout a transcript
custom_intent: Custom intents you want the model to detect within your input audio
custom_intent_mode: Sets how the model will interpret the custom_intent parameter
callback: URL to which we'll make the callback request
callback_method: HTTP method by which the callback request will be made
Label your requests for the purpose of identification during usage reporting
Arbitrary key-value pairs attached to the API response for downstream processing
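A sketch of building (not sending) an HTTP transcription request from the fields above. Several things here are assumptions: the HTTP endpoint URL is not stated on this page and simply mirrors the WebSocket path, the body is assumed to be JSON carrying `url` rather than an uploaded `audio` file, and `build_transcription_request` is a hypothetical helper. Because the descriptions tie `keywords` to Nova-2 and earlier and `keyterm` to Nova-3, the sketch refuses to send both.

```python
import json
import urllib.request
from typing import Any, Optional

# ASSUMPTION: endpoint not given on this page; mirrors the WebSocket path.
HTTP_URL = "https://api.slng.ai/v1/stt/slng/deepgram/nova:3-multi"


def build_transcription_request(
    api_key: str,
    audio_url: str,
    region: Optional[str] = None,
    **options: Any,
) -> urllib.request.Request:
    """Build a POST request; options set to None are left out of the body."""
    body = {"url": audio_url}
    body.update({k: v for k, v in options.items() if v is not None})
    if "keywords" in body and "keyterm" in body:
        raise ValueError("use keywords (Nova-2 and earlier) or keyterm (Nova-3), not both")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",  # assumed body format
    }
    if region is not None:
        headers["X-Region-Override"] = region
    return urllib.request.Request(
        HTTP_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Example with a placeholder audio URL; replace uses the "term:replacement" format.
request = build_transcription_request(
    "YOUR_KEY_HERE",
    "https://example.com/audio.wav",
    punctuate=True,
    diarize=True,
    replace="colour:color",
)
```

Passing the result to `urllib.request.urlopen` would send it; that step is left out so the sketch can be inspected without network access.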
deepgram/nova:3-multi - http › Responses
Successful transcription
text: Transcribed text.
language: Detected or specified language code.
duration: Audio duration in seconds.
confidence: Confidence score (0.0-1.0).
Word-level transcription with timing and confidence.
request_id: Unique request identifier.
model: Model used for transcription.
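A short sketch of reading a successful response into a typed record. It assumes the response is a flat JSON object with the named fields above as top-level keys and that all of them are present; the page does not say which are optional. The word-level field is left out because its name is not shown here. `Transcription` and `parse_transcription` are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass
class Transcription:
    text: str
    language: str
    duration: float  # seconds
    confidence: float  # 0.0 to 1.0
    request_id: str
    model: str


def parse_transcription(raw: str) -> Transcription:
    """Parse a response body, checking the documented confidence range."""
    payload = json.loads(raw)
    confidence = float(payload["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return Transcription(
        text=payload["text"],
        language=payload["language"],
        duration=float(payload["duration"]),
        confidence=confidence,
        request_id=payload["request_id"],
        model=payload["model"],
    )
```

A missing key raises `KeyError` in this version; switching to `payload.get(...)` with defaults would be the natural change if some fields turn out to be optional.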