Pronunciation dictionaries let you control how text is spoken before it reaches the selected TTS model. Create a dictionary once, then reference it from HTTP, WebSocket, or Unified TTS requests. Use them when a voice needs to pronounce acronyms, product names, customer names, jargon, or multilingual terms consistently across TTS models.
Pronunciation dictionaries currently support rewrite mode. The SLNG API rewrites matching words or phrases before synthesis, and the rewritten text is what the selected model receives.

How it works

Each dictionary belongs to the organization resolved from your API key. Requests from another organization cannot read or use it.
The basic flow is:
  1. Create a dictionary with rewrite rules.
  2. Reference that dictionary by name or dictionary_id.
  3. Send a TTS request with a pronunciation object.
Only one active pronunciation dictionary can apply to a request or WebSocket turn.

Create a dictionary

Create a dictionary with a POST request to the dictionaries endpoint (the pronunciation dictionary API reference has the full schema):
curl -X POST https://api.slng.ai/v1/pronunciation/dictionaries \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "brand-pronunciations",
    "metadata": {
      "language": "hi-IN",
      "providers": ["sarvam", "cartesia"]
    },
    "modes": {
      "rewrite": {
        "rules": [
          { "match": "NAIC", "replace": "en ay eye see" },
          { "match": "B2B", "replace": "bee to bee" }
        ]
      }
    }
  }'
A successful response includes the dictionary id, normalized name, metadata, modes, content hash, and creation timestamp:
{
  "id": "pd_01abc...",
  "org_id": "org_123",
  "name": "brand-pronunciations",
  "normalized_name": "brand-pronunciations",
  "metadata": {
    "language": "hi-IN",
    "providers": ["sarvam", "cartesia"]
  },
  "modes": {
    "rewrite": {
      "rules": [
        { "match": "NAIC", "replace": "en ay eye see" },
        { "match": "B2B", "replace": "bee to bee" }
      ]
    }
  },
  "content_hash": "sha256:...",
  "created_at": "2026-05-15T12:00:00.000Z"
}
Dictionary names must be unique within your organization. Names can contain letters, numbers, ., _, and -, and can be up to 128 characters.
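The naming rule above can be checked client-side before calling the API. The following is a minimal sketch, not the server's validation logic: the function name `is_valid_dictionary_name` is hypothetical, and restricting "letters" and "numbers" to ASCII is an assumption.

```python
import re

# Assumption: "letters" and "numbers" mean ASCII letters and digits.
# The documented rule only lists letters, numbers, ".", "_", "-", and a
# 128-character maximum; the server may accept a wider character set.
_NAME_RE = re.compile(r"[A-Za-z0-9._-]{1,128}")


def is_valid_dictionary_name(name):
    """Return True if name fits the documented character set and length."""
    return _NAME_RE.fullmatch(name) is not None
```

A check like this only catches malformed names early. Uniqueness within the organization can still only be confirmed by the API.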

Manage dictionaries

For request and response schemas, see the generated API reference pages for listing dictionaries, reading one dictionary, and deleting a dictionary.
List dictionaries:
curl -s https://api.slng.ai/v1/pronunciation/dictionaries \
  -H "Authorization: Bearer YOUR_API_KEY"
Get one dictionary by name:
curl -s https://api.slng.ai/v1/pronunciation/dictionaries/brand-pronunciations \
  -H "Authorization: Bearer YOUR_API_KEY"
Delete a dictionary:
curl -X DELETE https://api.slng.ai/v1/pronunciation/dictionaries/brand-pronunciations \
  -H "Authorization: Bearer YOUR_API_KEY"
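The same three calls could be prepared from Python with the standard library. This is a sketch under assumptions: the helper name `build_request` is hypothetical, and it only constructs the request objects without sending them. The URL and the Bearer header come from the curl examples above.

```python
import urllib.request
from typing import Optional
from urllib.parse import quote

# Endpoint taken from the curl examples above.
BASE_URL = "https://api.slng.ai/v1/pronunciation/dictionaries"


def build_request(method, api_key, name: Optional[str] = None):
    """Build (but do not send) a list, get, or delete request.

    Hypothetical helper: with no name it targets the collection (list);
    with a name it targets a single dictionary (get or delete).
    """
    url = BASE_URL if name is None else f"{BASE_URL}/{quote(name, safe='')}"
    headers = {"Authorization": f"Bearer {api_key}"}
    return urllib.request.Request(url, method=method, headers=headers)
```

Passing the returned object to `urllib.request.urlopen` would send it, for example `urllib.request.urlopen(build_request("GET", "YOUR_API_KEY"))` to list dictionaries.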

Use a dictionary with HTTP TTS

Add a pronunciation object to the TTS request body:
curl -X POST https://api.slng.ai/v1/tts/sarvam/bulbul:v3 \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "NAIC policy check karein aur B2B portal pe login karein",
    "speaker": "shubh",
    "target_language_code": "hi-IN",
    "pronunciation": {
      "mode": "rewrite",
      "name": "brand-pronunciations"
    }
  }'
You can also reference the dictionary by immutable ID:
{
  "pronunciation": {
    "mode": "rewrite",
    "dictionary_id": "pd_01abc..."
  }
}
Rules for the request object:
  • mode must be "rewrite"
  • provide exactly one of name or dictionary_id
  • use only one active dictionary per request
With the example dictionary above, the selected model receives this rewritten text:
en ay eye see policy check karein aur bee to bee portal pe login karein
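The rules for the pronunciation request object can also be checked before a request is sent. The sketch below is a hypothetical client-side helper (`validate_pronunciation` is not part of the SLNG API) that mirrors only the rules listed above.

```python
def validate_pronunciation(obj):
    """Return a list of problems with a pronunciation object (empty if none)."""
    problems = []
    # Only rewrite mode is documented as executable.
    if obj.get("mode") != "rewrite":
        problems.append('mode must be "rewrite"')
    # Exactly one of name or dictionary_id must be present.
    has_name = "name" in obj
    has_id = "dictionary_id" in obj
    if has_name == has_id:
        problems.append("provide exactly one of name or dictionary_id")
    return problems
```

For example, `validate_pronunciation({"mode": "rewrite", "name": "brand-pronunciations"})` returns an empty list, while an object carrying both `name` and `dictionary_id` is reported as a problem.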

Use a dictionary with WebSocket TTS

Set a default dictionary when you initialize the session:
{
  "type": "init",
  "config": {
    "pronunciation": {
      "mode": "rewrite",
      "name": "brand-pronunciations"
    }
  }
}
Then send text normally:
{
  "type": "text",
  "text": "NAIC policy check karein aur B2B portal pe login karein"
}
To change dictionaries for a later turn, include pronunciation on the text message:
{
  "type": "text",
  "text": "Use another dictionary for this turn",
  "pronunciation": {
    "mode": "rewrite",
    "name": "finance-pronunciations"
  }
}
The pronunciation object in init.config sets the session default. A pronunciation object on a text message replaces the active dictionary starting with that turn, and later text turns that omit pronunciation keep using the most recently set dictionary.
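The session behavior described above can be modeled as a small piece of state. This is an illustrative sketch of the documented semantics, not the gateway's implementation; the class name `PronunciationState` is hypothetical.

```python
class PronunciationState:
    """Track which pronunciation object applies to each WebSocket turn."""

    def __init__(self):
        self.active = None  # no dictionary until init or a text turn sets one

    def on_message(self, message):
        """Update state for an outgoing message and return the active object."""
        if message.get("type") == "init":
            # init.config.pronunciation sets the session default.
            self.active = message.get("config", {}).get("pronunciation")
        elif message.get("type") == "text" and "pronunciation" in message:
            # A text-level object replaces the active dictionary from here on.
            self.active = message["pronunciation"]
        return self.active
```

Under this model, a text turn that switches to `finance-pronunciations` also affects every later turn until another message sets a different dictionary.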

Use a dictionary with Unified TTS

Use the same pronunciation shape with Unified TTS bridge requests:
{
  "type": "init",
  "model": "sarvam/bulbul:v3",
  "voice": "shubh",
  "config": {
    "language": "hi-IN",
    "sample_rate": 24000,
    "encoding": "linear16",
    "pronunciation": {
      "mode": "rewrite",
      "name": "brand-pronunciations"
    }
  }
}

Rewrite matching

Rewrite mode is deterministic:
  • matching is case-insensitive
  • only whole words or whole phrases are matched
  • longer phrases win before shorter matches
  • rewriting is single-pass and non-recursive
For example, this dictionary prefers B2B portal over the shorter B2B match:
{
  "modes": {
    "rewrite": {
      "rules": [
        { "match": "B2B", "replace": "bee to bee" },
        { "match": "B2B portal", "replace": "bee to bee portal" }
      ]
    }
  }
}
Input:
Use the B2B portal today
Rewritten result:
Use the bee to bee portal today
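To make the four matching properties concrete, here is a minimal sketch of one way they could be implemented with regular expressions. It is not the SLNG implementation: the function name `rewrite` is hypothetical, and defining "whole word" by word-character boundaries is an assumption.

```python
import re


def rewrite(text, rules):
    """Apply (match, replace) pairs: case-insensitive, whole-word,
    longest-first, single pass."""
    if not rules:
        return text
    # Longest match strings first, so "B2B portal" is tried before "B2B".
    ordered = sorted(rules, key=lambda rule: len(rule[0]), reverse=True)
    # One capture group per rule lets us look up which rule matched.
    alternatives = "|".join(f"({re.escape(match)})" for match, _ in ordered)
    # Assumption: a whole word is one not adjacent to another word character.
    pattern = re.compile(rf"(?<!\w)(?:{alternatives})(?!\w)", re.IGNORECASE)
    # re.sub scans the input once and never rescans replacement text,
    # which gives single-pass, non-recursive behavior.
    return pattern.sub(lambda m: ordered[m.lastindex - 1][1], text)
```

With the rules from the example above, `rewrite("Use the B2B portal today", [("B2B", "bee to bee"), ("B2B portal", "bee to bee portal")])` returns `"Use the bee to bee portal today"`. Because replacements are not rescanned, a rule whose output contains another rule's match string does not trigger a second rewrite.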

Limits and errors

Current limits:
| Limit | Value |
| --- | --- |
| Dictionary name | 128 characters |
| Rewrite rules per dictionary | 256 |
| IPA rules per dictionary | 256 |
| match length | 128 characters |
| replace length | 256 characters |
| ipa length | 256 characters |
| Runtime text input for rewrite | 20,000 characters |
| Runtime rewritten output | 100,000 characters |
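A client can check the rewrite-related limits locally before creating a dictionary or sending text. The sketch below uses the values from the limits listed above. The helper names are hypothetical, and counting characters with `len()` (Unicode code points) is an assumption about how the API counts them.

```python
# Values taken from the documented limits.
MAX_RULES = 256
MAX_MATCH_LEN = 128
MAX_REPLACE_LEN = 256
MAX_INPUT_LEN = 20_000


def check_rewrite_rules(rules):
    """Return limit violations for a list of {"match", "replace"} dicts."""
    problems = []
    if len(rules) > MAX_RULES:
        problems.append(f"too many rules: {len(rules)} > {MAX_RULES}")
    for index, rule in enumerate(rules):
        # Assumption: len() approximates the API's character count.
        if len(rule["match"]) > MAX_MATCH_LEN:
            problems.append(f"rule {index}: match exceeds {MAX_MATCH_LEN} characters")
        if len(rule["replace"]) > MAX_REPLACE_LEN:
            problems.append(f"rule {index}: replace exceeds {MAX_REPLACE_LEN} characters")
    return problems


def input_within_limit(text):
    """Return True if runtime text fits the documented rewrite input limit."""
    return len(text) <= MAX_INPUT_LEN
```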
Pronunciation resolution fails closed. If the dictionary cannot be found or resolved, the request or WebSocket turn is not sent to the selected TTS model.
Common HTTP errors:
| Status and code | Meaning |
| --- | --- |
| 400 invalid_pronunciation | Malformed object, invalid name, unsupported mode, or dictionary not found |
| 401 pronunciation_unauthenticated | Missing or unresolved organization context |
| 409 pronunciation_conflict | Dictionary name already exists in the organization |
| 503 pronunciation_unavailable | Gateway storage or dependency failure during dictionary resolution |
Common WebSocket failures return an error frame:
{
  "type": "error",
  "code": "pronunciation_not_found",
  "message": "Pronunciation dictionary not found: brand-pronunciations",
  "slng_request_id": "..."
}
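A WebSocket client might separate pronunciation failures from other frames with a check like the one below. This is a heuristic sketch: the function name is hypothetical, and treating any error code that contains "pronunciation" as a pronunciation failure is an assumption based only on the codes shown on this page.

```python
import json


def parse_pronunciation_error(raw_frame):
    """Return (code, message) for a pronunciation error frame, else None."""
    frame = json.loads(raw_frame)
    code = str(frame.get("code", ""))
    # Assumption: pronunciation-related codes contain "pronunciation",
    # as in pronunciation_not_found and invalid_pronunciation.
    if frame.get("type") == "error" and "pronunciation" in code:
        return code, frame.get("message", "")
    return None
```

Because resolution fails closed, a frame like this means the turn was not synthesized, so the client decides whether to correct the dictionary reference and resend the text.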

Current limitations

  • Only mode: "rewrite" is executable today.
  • modes.ipa can be stored but is not executed.
  • There is no automatic fallback from IPA to rewrite.
  • Provider-native pronunciation dictionary uploads are not supported through the SLNG API.
For most applications, create a stable dictionary such as brand-pronunciations and reuse it by name. Use dictionary_id only when your application needs an immutable machine reference. Keep dictionaries scoped to a domain, product line, or voice style. For WebSocket sessions, set the default dictionary in init, then override individual turns only when needed.