STT Routing

STT Routing is in PRIVATE BETA. The behavior described here is being rolled out gradually. Contact us for access.

STT Routing is the first stage of the execution layer. When audio arrives, it is routed to the STT model best suited to that specific interaction. A Hindi caller in Mumbai gets a different model than an English caller in New York, and audio from a noisy environment may route to a model with better noise handling. For each turn, the layer balances accuracy, latency, and cost.

How it routes

Routing weighs several inputs together, not in isolation:

Language and accent of the audio
Noise profile of the environment
Regional availability of models
Cost and latency constraints

For voice agent calls, routing happens per turn. Each turn can route to the model best suited to that specific audio segment.
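To make the idea of weighing inputs together concrete, here is a minimal sketch of a per-turn router. It is an illustration under stated assumptions, not the actual routing logic: the model names, profile fields, weights, and thresholds are all hypothetical. Language support, regional availability, and the latency and cost ceilings are treated as hard filters, and the remaining candidates are ranked by a combined score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical description of an STT model (all fields are assumptions)."""
    name: str
    languages: frozenset
    regions: frozenset
    accuracy: float          # 0..1, quality on clean audio
    noise_robustness: float  # 0..1, quality on noisy audio
    latency_ms: float
    cost_per_minute: float


@dataclass(frozen=True)
class Turn:
    """Signals extracted from one audio segment (hypothetical)."""
    language: str
    region: str
    noise_level: float  # 0 = clean, 1 = very noisy


def score(model, turn, max_latency_ms=500.0, max_cost=0.02):
    """Return a combined score, or None if the model is ineligible."""
    # Hard constraints: language support and regional availability.
    if turn.language not in model.languages or turn.region not in model.regions:
        return None
    # Hard constraints: latency and cost ceilings.
    if model.latency_ms > max_latency_ms or model.cost_per_minute > max_cost:
        return None
    # Blend clean-audio accuracy and noise robustness by the turn's noise level.
    quality = (model.accuracy * (1 - turn.noise_level)
               + model.noise_robustness * turn.noise_level)
    # Penalize latency and cost relative to their ceilings (weights are arbitrary).
    penalty = (0.2 * model.latency_ms / max_latency_ms
               + 0.2 * model.cost_per_minute / max_cost)
    return quality - penalty


def route(models, turn):
    """Pick the highest-scoring eligible model for a single turn."""
    eligible = [(s, m) for m in models if (s := score(m, turn)) is not None]
    if not eligible:
        raise LookupError("no eligible STT model for this turn")
    return max(eligible, key=lambda pair: pair[0])[1]


MODELS = [
    ModelProfile("general-en", frozenset({"en"}), frozenset({"us", "in"}),
                 0.95, 0.5, 200, 0.010),
    ModelProfile("noisy-en", frozenset({"en"}), frozenset({"us"}),
                 0.90, 0.9, 300, 0.015),
    ModelProfile("hindi", frozenset({"hi", "en"}), frozenset({"in"}),
                 0.92, 0.6, 250, 0.012),
]

# Each turn of a call is routed independently.
call = [Turn("en", "us", 0.1), Turn("en", "us", 0.9)]
chosen = [route(MODELS, turn).name for turn in call]
```

In this sketch the same English caller is routed to a different model once the second turn becomes noisy, which mirrors the per-turn behavior described above. A production router would presumably derive its signals and weights from data rather than fixed constants.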

Today

Until STT Routing is generally available, you select the STT model explicitly on each request. See the Speech-to-Text Overview for the current model list and how to choose.
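Until routing is available, one way to keep explicit selection manageable is a small client-side lookup that maps a language tag to a model ID before each request. The sketch below is illustrative only: the model IDs and the request field names (`audio`, `language`, `model`) are placeholders, not the real API. Take the actual identifiers and request shape from the Speech-to-Text Overview.

```python
# Placeholder model IDs; substitute real ones from the current model list.
MODEL_BY_LANGUAGE = {
    "hi": "example-hindi-model",
    "en": "example-english-model",
}
DEFAULT_MODEL = "example-multilingual-model"


def pick_model(language_tag):
    """Map a tag such as 'hi-IN' to a model ID, falling back to a default."""
    primary = language_tag.split("-")[0].lower()
    return MODEL_BY_LANGUAGE.get(primary, DEFAULT_MODEL)


def build_request(audio_path, language_tag):
    """Assemble a request payload with the model chosen explicitly.

    The field names here are hypothetical, not the documented request schema.
    """
    return {
        "audio": audio_path,
        "language": language_tag,
        "model": pick_model(language_tag),
    }


request = build_request("call-turn-001.wav", "hi-IN")
```

Keeping the mapping in one place means it can be deleted in a single step once routing takes over model selection.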

Related

Speech-to-Text Overview

The models available today and how to pick one.

How It Works

Where STT routing sits in the pipeline.
