Pronunciation dictionaries let you control how text is spoken before it reaches the selected TTS model. Create a dictionary once, then reference it from HTTP, WebSocket, or Unified TTS requests. Use them when a voice needs to pronounce acronyms, product names, customer names, jargon, or multilingual terms consistently across TTS models.

Placeholders

The snippets below use these placeholders. Replace them before running the code.
| Placeholder | Replace with |
| --- | --- |
| SLNG_API_KEY | An SLNG API key from app.slng.ai/api-keys |
| support-pronunciations | The dictionary name you create with POST /v1/pronunciation/dictionaries |
| pd_01abc... | The dictionary ID returned at creation time, passed as dictionary_id in requests |
Pronunciation dictionaries currently support rewrite mode. SLNG rewrites matching words or phrases before synthesis, and the rewritten text is what the selected model receives.

How it works

Each dictionary belongs to the organization resolved from your API key. Requests from another organization cannot read or use it. The basic flow is:
  1. Create a dictionary with rewrite rules.
  2. Reference that dictionary by name or dictionary_id.
  3. Send a TTS request with a pronunciation object.
Only one active pronunciation dictionary can apply to a request or WebSocket turn.

Create a dictionary

Create a dictionary by sending a POST request to /v1/pronunciation/dictionaries (see the pronunciation dictionary API reference):
curl -X POST https://api.slng.ai/v1/pronunciation/dictionaries \
  -H "Authorization: Bearer SLNG_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "support-pronunciations",
    "metadata": {
      "language": "en-US",
      "use_case": "support voice agent"
    },
    "modes": {
      "rewrite": {
        "rules": [
          { "match": "SLNG", "replace": "slang" },
          { "match": "QubePay", "replace": "cube pay" },
          { "match": "ACH transfer", "replace": "ay see aitch transfer" },
          { "match": "ACH", "replace": "ay see aitch" }
        ]
      }
    }
  }'
A successful response includes the dictionary ID, organization ID, name, normalized name, metadata, modes, content hash, and creation timestamp:
{
  "id": "pd_01abc...",
  "org_id": "org_123",
  "name": "support-pronunciations",
  "normalized_name": "support-pronunciations",
  "metadata": {
    "language": "en-US",
    "use_case": "support voice agent"
  },
  "modes": {
    "rewrite": {
      "rules": [
        { "match": "SLNG", "replace": "slang" },
        { "match": "QubePay", "replace": "cube pay" },
        { "match": "ACH transfer", "replace": "ay see aitch transfer" },
        { "match": "ACH", "replace": "ay see aitch" }
      ]
    }
  },
  "content_hash": "sha256:...",
  "created_at": "2026-05-15T12:00:00.000Z"
}
Dictionary names must be unique within your organization. Names can contain letters, numbers, ., _, and -, and can be up to 128 characters.
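A client-side check can catch an invalid name before the create call is sent. The sketch below is one way to express the naming rule as a regular expression. `is_valid_dictionary_name` is a hypothetical helper, not part of any SLNG SDK, and it assumes that "letters" means ASCII letters and that a name needs at least one character. The API remains the source of truth, and uniqueness within the organization can only be confirmed by the server.

```python
import re

# Assumption: "letters" means ASCII letters; the API may accept more.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9._-]{1,128}")


def is_valid_dictionary_name(name: str) -> bool:
    """Return True if name uses only the allowed characters and is 1 to 128 characters long."""
    return _NAME_PATTERN.fullmatch(name) is not None
```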

Hear the difference

To hear the difference, synthesize the same text with and without the dictionary:
Thanks for calling SLNG support. I found your QubePay ACH transfer, and the next invoice will arrive on Friday.

Manage dictionaries

For request and response schemas, see the generated API reference pages for listing dictionaries, reading one dictionary, and deleting a dictionary.
List dictionaries:
curl -s https://api.slng.ai/v1/pronunciation/dictionaries \
  -H "Authorization: Bearer SLNG_API_KEY"
Get one dictionary by name:
curl -s https://api.slng.ai/v1/pronunciation/dictionaries/support-pronunciations \
  -H "Authorization: Bearer SLNG_API_KEY"
Delete a dictionary:
curl -X DELETE https://api.slng.ai/v1/pronunciation/dictionaries/support-pronunciations \
  -H "Authorization: Bearer SLNG_API_KEY"

Use a dictionary with HTTP TTS

Add a pronunciation object to the TTS request body:
curl -X POST https://api.slng.ai/v1/tts/slng/deepgram/aura:2-en \
  -H "Authorization: Bearer SLNG_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "aura-2-thalia-en",
    "text": "Thanks for calling SLNG support. I found your QubePay ACH transfer, and the next invoice will arrive on Friday.",
    "pronunciation": {
      "mode": "rewrite",
      "name": "support-pronunciations"
    }
  }'
You can also reference the dictionary by its immutable ID:
{
  "pronunciation": {
    "mode": "rewrite",
    "dictionary_id": "pd_01abc..."
  }
}
Rules for the request object:
  • mode must be "rewrite"
  • provide exactly one of name or dictionary_id
  • use only one active dictionary per request
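The rules above can be checked locally before a request is sent. This is a minimal sketch, assuming the pronunciation object has already been parsed into a Python dict. `pronunciation_problems` is a hypothetical helper name, and the server-side validation may be stricter than what is shown here.

```python
def pronunciation_problems(obj: dict) -> list:
    """Return a list of rule violations; an empty list means the object looks valid."""
    problems = []
    if obj.get("mode") != "rewrite":
        problems.append('mode must be "rewrite"')
    has_name = "name" in obj
    has_id = "dictionary_id" in obj
    # Exactly one reference is required: both present or both missing is an error.
    if has_name == has_id:
        problems.append("provide exactly one of name or dictionary_id")
    return problems
```

Because a request carries a single pronunciation object with a single reference, passing this check also keeps the request to one active dictionary.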
With the example dictionary above, the selected model receives this rewritten text:
Thanks for calling slang support. I found your cube pay ay see aitch transfer, and the next invoice will arrive on Friday.

Use a dictionary with WebSocket TTS

Set a default dictionary when you initialize the session:
{
  "type": "init",
  "config": {
    "pronunciation": {
      "mode": "rewrite",
      "name": "support-pronunciations"
    }
  }
}
Then send text normally:
{
  "type": "text",
  "text": "Thanks for calling SLNG support. I found your QubePay ACH transfer, and the next invoice will arrive on Friday."
}
To change dictionaries for a later turn, include pronunciation on the text message:
{
  "type": "text",
  "text": "Use another dictionary for this turn",
  "pronunciation": {
    "mode": "rewrite",
    "name": "finance-pronunciations"
  }
}
init.config.pronunciation sets the session default. text.pronunciation replaces the active dictionary for that turn, and later text turns reuse the most recent active dictionary.
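To reason about which dictionary applies to a given turn, it can help to model the session on the client. The sketch below is one reading of the behavior described above, in which a text.pronunciation override stays active for later turns until another one replaces it. `PronunciationState` is a hypothetical class for illustration and does not describe the gateway's implementation.

```python
from typing import Optional


class PronunciationState:
    """Tracks which dictionary reference is active for one WebSocket session."""

    def __init__(self) -> None:
        self.active: Optional[dict] = None

    def on_message(self, message: dict) -> Optional[dict]:
        """Update state from an outgoing message and return the active reference."""
        if message.get("type") == "init":
            # The init message sets the session default (None if omitted).
            self.active = message.get("config", {}).get("pronunciation")
        elif message.get("type") == "text" and "pronunciation" in message:
            # A text message with pronunciation replaces the active dictionary.
            self.active = message["pronunciation"]
        return self.active
```

Feeding each outgoing message through `on_message` before sending it gives a local record of the dictionary the turn is expected to use, which is useful when debugging unexpected pronunciations.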

Use a dictionary with Unified TTS

Use the same pronunciation shape with Unified TTS bridge requests:
{
  "type": "init",
  "model": "slng/deepgram/aura:2-en",
  "config": {
    "language": "en-US",
    "sample_rate": 24000,
    "encoding": "linear16",
    "pronunciation": {
      "mode": "rewrite",
      "name": "support-pronunciations"
    }
  }
}

Rewrite matching

Rewrite mode is deterministic:
  • matching is case-insensitive
  • only whole words or whole phrases are matched
  • longer phrases win before shorter matches
  • rewriting is single-pass and non-recursive
For example, this dictionary prefers ACH transfer over the shorter ACH match:
{
  "modes": {
    "rewrite": {
      "rules": [
        { "match": "ACH", "replace": "ay see aitch" },
        { "match": "ACH transfer", "replace": "ay see aitch transfer" }
      ]
    }
  }
}
Input:
I found your ACH transfer
Rewritten result:
I found your ay see aitch transfer
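The four matching properties can be reproduced locally to preview how a dictionary will rewrite text. The sketch below is an approximation under stated assumptions, not SLNG's implementation: `build_rewriter` is a hypothetical helper, and it assumes that a "whole word" is a match not adjacent to a letter, digit, or underscore.

```python
import re


def build_rewriter(rules):
    """Return a function that applies rewrite rules in one case-insensitive pass."""
    replacements = {rule["match"].lower(): rule["replace"] for rule in rules}
    if not replacements:
        return lambda text: text
    # Longest match first, so "ACH transfer" is tried before "ACH".
    ordered = sorted(replacements, key=len, reverse=True)
    alternation = "|".join(re.escape(match) for match in ordered)
    # Assumption: whole-word means no word character directly before or after.
    pattern = re.compile(rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE)

    def rewrite(text):
        # re.sub scans the input once; replacement text is never rescanned.
        return pattern.sub(
            lambda m: replacements.get(m.group(0).lower(), m.group(0)), text
        )

    return rewrite


rewrite = build_rewriter([
    {"match": "ACH", "replace": "ay see aitch"},
    {"match": "ACH transfer", "replace": "ay see aitch transfer"},
])
print(rewrite("I found your ACH transfer"))  # I found your ay see aitch transfer
```

Sorting by length rather than relying on rule order is what makes the result independent of how the rules are listed in the dictionary.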

Limits and errors

Current limits:
| Limit | Value |
| --- | --- |
| Dictionary name | 128 characters |
| Rewrite rules per dictionary | 256 |
| IPA rules per dictionary | 256 |
| match length | 128 characters |
| replace length | 256 characters |
| ipa length | 256 characters |
| Runtime text input for rewrite | 20,000 characters |
| Runtime rewritten output | 100,000 characters |
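A pre-flight check against the rewrite limits above can surface problems before a create call or TTS request. This is a minimal sketch: `check_rewrite_limits` and `check_runtime_text` are hypothetical helpers, and measuring length with Python's `len` (Unicode code points) is an assumption about how the API counts characters.

```python
MAX_RULES = 256
MAX_MATCH_LEN = 128
MAX_REPLACE_LEN = 256
MAX_INPUT_LEN = 20_000


def check_rewrite_limits(rules: list) -> list:
    """Return human-readable problems; an empty list means the rules fit the limits."""
    problems = []
    if len(rules) > MAX_RULES:
        problems.append(f"{len(rules)} rules exceeds the limit of {MAX_RULES}")
    for index, rule in enumerate(rules):
        if len(rule["match"]) > MAX_MATCH_LEN:
            problems.append(f"rule {index}: match is longer than {MAX_MATCH_LEN}")
        if len(rule["replace"]) > MAX_REPLACE_LEN:
            problems.append(f"rule {index}: replace is longer than {MAX_REPLACE_LEN}")
    return problems


def check_runtime_text(text: str) -> bool:
    """Return True if the input text is within the runtime rewrite input limit."""
    return len(text) <= MAX_INPUT_LEN
```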
Pronunciation resolution fails closed. If the dictionary cannot be found or resolved, the request or WebSocket turn is not sent to the selected TTS model.
Common HTTP errors:
| Status and code | Meaning |
| --- | --- |
| 400 invalid_pronunciation | Malformed object, invalid name, unsupported mode, or dictionary not found |
| 401 pronunciation_unauthenticated | Missing or unresolved organization context |
| 409 pronunciation_conflict | Dictionary name already exists in the organization |
| 503 pronunciation_unavailable | Gateway storage or dependency failure during dictionary resolution |
Common WebSocket failures return an error frame:
{
  "type": "error",
  "code": "pronunciation_not_found",
  "message": "Pronunciation dictionary not found: support-pronunciations",
  "slng_request_id": "..."
}
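A client can turn error frames into exceptions so that a failed turn is not mistaken for a successful one. The sketch below assumes frames arrive as JSON text with the fields shown above. `ErrorFrame` and `parse_frame` are hypothetical names, and the sketch does not assume any particular WebSocket library.

```python
import json


class ErrorFrame(Exception):
    """Raised when the server sends a frame with type "error"."""

    def __init__(self, code, message, request_id):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.request_id = request_id


def parse_frame(raw: str) -> dict:
    """Parse a JSON text frame and raise ErrorFrame if it reports an error."""
    frame = json.loads(raw)
    if frame.get("type") == "error":
        raise ErrorFrame(
            frame.get("code"), frame.get("message"), frame.get("slng_request_id")
        )
    return frame
```

Keeping `slng_request_id` on the exception makes it easy to include in logs or support requests.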

Current limitations

  • Only mode: "rewrite" is executable today.
  • modes.ipa can be stored but is not executed.
  • There is no automatic fallback from IPA to rewrite.
  • Provider-native pronunciation dictionary uploads are not supported through SLNG.
For most applications, create a stable dictionary such as support-pronunciations and reuse it by name. Use dictionary_id only when your application needs an immutable machine reference. Keep dictionaries scoped to a domain, product line, or voice style. For WebSocket sessions, set the default dictionary in init, then override individual turns only when needed.