Kyutai Streaming
High-performance streaming speech recognition optimized for French and English. Provides low-latency, real-time transcription of streamed audio for live scenarios. Hosted on compute infrastructure in Mumbai, India, for optimal regional performance.
Headers
Authorization
string · required
The Authorization header is used to authenticate with the API using your API key. Value is of the format Bearer YOUR_KEY_HERE.
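As a minimal sketch, the header described above could be built as follows. The helper name build_auth_headers is hypothetical and not part of the API; only the Bearer format comes from this page.

```python
def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header in the documented Bearer format."""
    # Reject empty keys and keys with stray whitespace (e.g. a pasted newline).
    if not api_key or api_key != api_key.strip():
        raise ValueError("API key must be non-empty, with no surrounding whitespace")
    return {"Authorization": f"Bearer {api_key}"}


headers = build_auth_headers("YOUR_KEY_HERE")
print(headers["Authorization"])  # prints "Bearer YOUR_KEY_HERE"
```

The resulting dictionary can be passed as the headers argument of whichever HTTP client you use.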
Request Body
audio
string · binary · required
Audio file to transcribe (WAV, MP3, FLAC, OGG, M4A)
language
string · enum
Language code: 'en' for English, 'fr' for French
Enum values: en, fr
Default: en
stream
boolean
Enable streaming mode for real-time transcription
Default: false
timestamps
boolean
Include word-level timestamps in response
Default: false
punctuation
boolean
Add automatic punctuation to transcript
Default: true
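The parameters above can be validated on the client before a request is sent. The following is a sketch under stated assumptions: this page does not specify the body encoding, so the code assumes a multipart form in which booleans are sent as lowercase strings, and build_form_fields is a hypothetical helper name. The endpoint URL is not given in this section, so no network call is shown.

```python
from pathlib import Path

# Formats and language codes listed in the Request Body section.
SUPPORTED_FORMATS = {".wav", ".mp3", ".flac", ".ogg", ".m4a"}
SUPPORTED_LANGUAGES = {"en", "fr"}


def build_form_fields(audio_path, language="en", stream=False,
                      timestamps=False, punctuation=True):
    """Validate the documented parameters and return the non-file form fields."""
    # Only the file extension is checked here; the file is not opened.
    suffix = Path(audio_path).suffix.lower()
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {suffix or '(none)'}")
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"language must be 'en' or 'fr', got {language!r}")
    # Assumption: booleans are serialized as "true"/"false" strings.
    return {
        "language": language,
        "stream": str(bool(stream)).lower(),
        "timestamps": str(bool(timestamps)).lower(),
        "punctuation": str(bool(punctuation)).lower(),
    }


fields = build_form_fields("meeting.wav", language="fr", timestamps=True)
print(fields)
```

The defaults in the function signature mirror the documented defaults (en, false, false, true). The audio file itself would be attached separately under the audio field by the HTTP client.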
Responses
Transcription result
text
string · required
The transcribed text
language
string
Detected or specified language
confidence
number · min: 0 · max: 1
Overall confidence score (0-1)
duration
number
Audio duration in seconds
segments
object[]
Word-level segments with timestamps (if requested)
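The response fields above map naturally onto a small typed structure. This sketch assumes the response body is JSON, which this page does not state. TranscriptionResult and parse_transcription are hypothetical names, and the sample payload is illustrative rather than real API output. The shape of each segment object is not documented here, so segments are kept as plain dictionaries.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class TranscriptionResult:
    text: str                             # required by the schema
    language: Optional[str] = None        # detected or specified language
    confidence: Optional[float] = None    # between 0 and 1 when present
    duration: Optional[float] = None      # audio duration in seconds
    segments: list[dict[str, Any]] = field(default_factory=list)


def parse_transcription(payload: str) -> TranscriptionResult:
    """Parse a JSON response body and check the documented constraints."""
    data = json.loads(payload)
    if "text" not in data:
        raise ValueError("response is missing the required 'text' field")
    confidence = data.get("confidence")
    if confidence is not None and not 0 <= confidence <= 1:
        raise ValueError(f"confidence out of range: {confidence}")
    return TranscriptionResult(
        text=data["text"],
        language=data.get("language"),
        confidence=confidence,
        duration=data.get("duration"),
        segments=data.get("segments") or [],  # absent unless timestamps were requested
    )


# Illustrative payload only, not captured from the API.
sample = '{"text": "bonjour", "language": "fr", "confidence": 0.9, "duration": 1.5}'
result = parse_transcription(sample)
print(result.text, result.language, result.confidence)
```

Treating every field except text as optional follows the schema, where only text is marked required.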