POST /v1/stt/deepgram/nova:2
Nova 2
curl --request POST \
  --url https://api.slng.ai/v1/stt/deepgram/nova:2 \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form audio='@example-file' \
  --form language=en
{
  "results": {
    "channels": [
      {
        "alternatives": [
          {
            "transcript": "hello from sunny barcelona",
            "confidence": 0.98
          }
        ]
      }
    ]
  },
  "metadata": {
    "request_id": "req-123",
    "model": "nova-3"
  }
}
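
The transcript sits under results.channels[].alternatives[]. A minimal Python sketch for reading it, assuming the response has the shape of the example above; the helper name first_transcript is hypothetical and not part of any SLNG client:

```python
import json


def first_transcript(response: dict) -> tuple:
    """Return (transcript, confidence) for the first alternative of the first channel."""
    alternative = response["results"]["channels"][0]["alternatives"][0]
    return alternative["transcript"], alternative["confidence"]


# Sample body copied from the example response above.
sample = json.loads("""
{
  "results": {
    "channels": [
      {"alternatives": [{"transcript": "hello from sunny barcelona", "confidence": 0.98}]}
    ]
  },
  "metadata": {"request_id": "req-123", "model": "nova-3"}
}
""")

text, confidence = first_transcript(sample)
print(f"{text} (confidence {confidence:.2f})")
```

Multichannel audio would presumably yield more than one entry in channels, so real code may want to iterate rather than index position 0.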

Authorizations

Authorization
string
header
required

API key issued by SLNG. Pass as Authorization: Bearer <token>.

Headers

X-World-Part-Override
enum<string>

Target world part override. Auto-selected if not provided.

Available options:
na,
eu
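
The two headers above can be assembled as in the following Python sketch. The helper name build_headers is hypothetical; only the header names and the na/eu options come from this page:

```python
from typing import Optional

# Values listed under "Available options" for X-World-Part-Override.
WORLD_PARTS = frozenset({"na", "eu"})


def build_headers(token: str, world_part: Optional[str] = None) -> dict:
    """Build request headers; the override header is omitted so the API auto-selects."""
    headers = {"Authorization": f"Bearer {token}"}
    if world_part is not None:
        if world_part not in WORLD_PARTS:
            raise ValueError(f"world_part must be one of {sorted(WORLD_PARTS)}")
        headers["X-World-Part-Override"] = world_part
    return headers


print(build_headers("<token>", world_part="eu"))
```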

Body

Deepgram STT processing options.

audio
file

Audio file (multipart) or base64-encoded audio (JSON).
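
For the JSON variant, the audio bytes need to be base64-encoded before they go into the body. A minimal Python sketch, assuming the JSON field is also named audio and that other options sit beside it as top-level keys; the helper name build_json_body is hypothetical:

```python
import base64
import json


def build_json_body(audio_bytes: bytes, **options) -> str:
    """Serialize a JSON request body with base64-encoded audio plus any extra options."""
    body = dict(options)
    # base64 output is pure ASCII, so it is safe to embed in a JSON string.
    body["audio"] = base64.b64encode(audio_bytes).decode("ascii")
    return json.dumps(body)


# Placeholder bytes for illustration; real code would read them from an audio file.
payload = build_json_body(b"RIFF", language="en")
print(payload)
```

A request carrying this body would presumably be sent with Content-Type: application/json rather than multipart/form-data.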

url
string

Publicly accessible audio URL.

Example:

"https://docs.slng.ai/audio/hello.wav"

model
string
default:nova-3

Model name override.

enable_partials
boolean
default:true

Enable partial/interim transcription results for streaming.

diarize
boolean
default:false

Enable speaker diarization. Words are assigned speaker numbers starting at 0.
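
With diarization on, per-word speaker numbers can be folded into speaker turns. The Python sketch below assumes each word object carries word and speaker keys, as in Deepgram's response format; that shape is not shown on this page, and the helper name speaker_turns is hypothetical:

```python
from itertools import groupby


def speaker_turns(words: list) -> list:
    """Group consecutive words by speaker into (speaker, text) tuples."""
    turns = []
    # groupby only merges adjacent items, so a returning speaker starts a new turn.
    for speaker, group in groupby(words, key=lambda w: w["speaker"]):
        turns.append((speaker, " ".join(w["word"] for w in group)))
    return turns


# Hand-written sample data for illustration, not real API output.
words = [
    {"word": "hello", "speaker": 0},
    {"word": "there", "speaker": 0},
    {"word": "hi", "speaker": 1},
    {"word": "bye", "speaker": 0},
]

for speaker, text in speaker_turns(words):
    print(f"speaker {speaker}: {text}")
```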

punctuate
boolean
default:false

Add punctuation and capitalization.

smart_format
boolean
default:false

Apply formatting to improve readability (dates, times, numbers, etc.).

utterances
boolean
default:false

Segment transcript into utterances.

paragraphs
boolean
default:false

Add paragraph formatting.

numerals
boolean
default:false

Convert spoken numbers to digits.

profanity_filter
boolean
default:false

Filter profanity from transcript.

redact
enum<string>[]

Redact sensitive information (pci, ssn, numbers).

Available options:
pci,
ssn,
numbers

detect_language
boolean
default:false

Auto-detect spoken language.

filler_words
boolean
default:false

Include filler words (uh, um) in transcript.

multichannel
boolean
default:false

Enable multi-channel audio processing.

keywords
string[]

Keywords to boost recognition accuracy.

encoding
enum<string>
default:linear16

Input audio encoding.

Available options:
linear16,
flac,
mulaw,
amr-nb,
amr-wb,
opus,
speex,
mp3,
mp4,
webm,
aac,
ogg

sample_rate
integer
default:16000

Input audio sample rate in Hz.

channels
integer

Number of audio channels.

Required range: x >= 1

tag
string

Tag for request tracking.
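
Several of the fields above accept only listed values (redact, encoding) or a bounded range (channels). A client can check these before sending a request, as in this Python sketch; the helper name validate_options is hypothetical, and the allowed values are copied from this page:

```python
REDACT_OPTIONS = frozenset({"pci", "ssn", "numbers"})
ENCODINGS = frozenset(
    "linear16 flac mulaw amr-nb amr-wb opus speex mp3 mp4 webm aac ogg".split()
)


def validate_options(options: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []

    unsupported = set(options.get("redact", [])) - REDACT_OPTIONS
    if unsupported:
        problems.append(f"redact: unsupported values {sorted(unsupported)}")

    encoding = options.get("encoding")
    if encoding is not None and encoding not in ENCODINGS:
        problems.append(f"encoding: unsupported value {encoding!r}")

    channels = options.get("channels")
    if channels is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        is_int = isinstance(channels, int) and not isinstance(channels, bool)
        if not is_int or channels < 1:
            problems.append("channels: must be an integer >= 1")

    return problems


print(validate_options({"redact": ["pci"], "encoding": "flac", "channels": 2}))
print(validate_options({"redact": ["email"], "encoding": "wav", "channels": 0}))
```

The check covers only the constraints documented here; the API may enforce others.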

language
enum<string>

Supported language code.

Available options:
multi,
bg,
ca,
zh,
zh-CN,
zh-Hans,
zh-TW,
zh-Hant,
zh-HK,
cs,
da,
da-DK,
nl,
en,
en-US,
en-AU,
en-GB,
en-NZ,
en-IN,
et,
fi,
nl-BE,
fr,
fr-CA,
de,
de-CH,
el,
hi,
hu,
id,
it,
ja,
ko,
ko-KR,
lv,
lt,
ms,
no,
pl,
pt,
pt-BR,
pt-PT,
ro,
ru,
sk,
es,
es-419,
sv,
sv-SE,
th,
th-TH,
tr,
uk,
vi

Example:

"multi"

Response

Transcription successful.

Third-party Deepgram API response format.

results
object
required
metadata
object