SLNG provides access to best-in-class speech models through a single API. All models support consistent protocols and provide production-ready performance.
Real-time speech-to-text transcription with ultra-low latency using Deepgram's Nova model. Optimized for streaming audio with intelligent Voice Activity Detection (VAD) and speaker diarization.
Deepgram Nova 3 English, Deepgram Nova 3 Spanish, Deepgram Nova 3 Hindi
Regions
ap, eu, me
Use Cases
Real-time transcription, Voice agents, Live streaming
Speechmatics
SLNG, STT
Real-time speech-to-text transcription using a Speechmatics model hosted by SLNG and accessed via WebSocket. Supports streaming audio input with intelligent Voice Activity Detection (VAD) and partial transcripts for immediate feedback. Well suited to live transcription, voice commands, and real-time captioning. Supports English and Spanish with high accuracy.
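Partial transcripts are provisional: each new partial supersedes the previous one until a final result arrives. The sketch below shows one way a client might merge the two kinds of message for display. The message shape (a dict with `type` set to `"partial"` or `"final"` plus a `text` field) and the `TranscriptBuffer` name are assumptions made for illustration, not the SLNG or Speechmatics wire format, so adapt the field names to the actual API reference.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TranscriptBuffer:
    """Keeps finalized segments and the latest partial hypothesis separately."""

    final_segments: List[str] = field(default_factory=list)
    partial: str = ""

    def handle(self, message: dict) -> None:
        # Hypothetical message shape: {"type": "partial" | "final", "text": str}
        kind = message.get("type")
        text = message.get("text", "").strip()
        if kind == "partial":
            # A partial replaces the previous partial; it is never appended.
            self.partial = text
        elif kind == "final":
            # A final result commits the segment and clears the partial.
            if text:
                self.final_segments.append(text)
            self.partial = ""
        # Any other message type is ignored in this sketch.

    def display(self) -> str:
        """Return committed text followed by the current partial, if any."""
        parts = list(self.final_segments)
        if self.partial:
            parts.append(self.partial)
        return " ".join(parts)
```

Keeping the partial separate from the committed segments means a live caption can be redrawn on every message without duplicating text when the final version of a segment arrives.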
With the variety of models available, choosing the right one depends on your specific use case and requirements.
Here are some recommendations to help you decide based on our internal benchmarks.
Models optimized for accuracy in offline processing:
Whisper
SLNG, STT
Speech-to-Text using Whisper Large v3 model hosted by SLNG. Supports 99+ languages with automatic language detection. Best for general-purpose transcription with high accuracy.
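As a rough illustration of the selection logic, the sketch below filters the speech-to-text models described on this page by whether real-time streaming is needed and by language. The `SttModel` and `candidates` names, the two-letter language codes, and the choice to treat Whisper's broad coverage as "any language" are assumptions made for this example; the names are display labels, not API identifiers.

```python
from typing import FrozenSet, List, NamedTuple, Optional


class SttModel(NamedTuple):
    name: str
    realtime: bool
    # None stands for broad multilingual coverage (assumed to include the
    # requested language); otherwise an explicit set of language codes.
    languages: Optional[FrozenSet[str]]


# Capabilities as described on this page; language codes are illustrative.
CATALOG = [
    SttModel("Deepgram Nova 3", True, frozenset({"en", "es", "hi"})),
    SttModel("Speechmatics", True, frozenset({"en", "es"})),
    SttModel("Whisper Large v3", False, None),
]


def candidates(realtime: bool, language: str) -> List[str]:
    """Return the names of catalog models matching the mode and language."""
    return [
        m.name
        for m in CATALOG
        if m.realtime == realtime
        and (m.languages is None or language in m.languages)
    ]
```

An empty result, for example a real-time request in a language none of the streaming models list, is a signal to check the current catalog rather than a definitive statement of what is supported.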
TTS Models: Typically charged per 1,000 characters
STT Models: Typically charged per minute of audio
For exact pricing, contact [email protected] or check your dashboard.
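To make the two billing units concrete, here is a minimal cost estimator. The function names and the rates passed in are placeholders, not SLNG prices, and the sketch assumes simple pro-rata billing with no rounding or minimum charge, which may not match the actual billing rules.

```python
def estimate_tts_cost(text: str, rate_per_1k_chars: float) -> float:
    """Estimate text-to-speech cost for a per-1,000-character rate."""
    return len(text) / 1000 * rate_per_1k_chars


def estimate_stt_cost(audio_seconds: float, rate_per_minute: float) -> float:
    """Estimate speech-to-text cost for a per-minute-of-audio rate."""
    return audio_seconds / 60 * rate_per_minute
```

With a placeholder rate of 0.02 per 1,000 characters, a 2,500-character script comes to 2.5 × 0.02 = 0.05; with a placeholder rate of 0.01 per minute, 90 seconds of audio comes to 1.5 × 0.01 = 0.015.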
Adding New Models
We continuously add new models and providers. Check back regularly for updates, or subscribe to our newsletter for announcements.
And let us know if there is a specific model or provider you'd like to see supported!