Text-to-Speech API for generating speech from text using KugelAudio Kugel 1. High-quality text-to-speech with expressiveness control. Establishes a WebSocket connection for real-time text-to-speech using the unified SLNG TTS protocol.

{
  "type": "request",
  "text": "Hello from sunny Barcelona",
  "model_id": "kugel-1-turbo",
  "voice_id": 268,
  "cfg_scale": 2,
  "sample_rate": 24000
}

{
  "type": "config",
  "voice_id": 268,
  "model_id": "kugel-1-turbo",
  "cfg_scale": 2,
  "sample_rate": 24000,
  "flush_timeout_ms": 500
}

{
  "type": "text",
  "text": "Hello from sunny "
}

{
  "type": "flush",
  "flush": true
}

{
  "type": "close",
  "close": true
}

{
  "type": "audio_chunk",
  "audio": "UklGRiQAAABXQVZFZm10IBAAAAABAAEA...",
  "enc": "pcm_s16le",
  "idx": 0,
  "sr": 24000,
  "samples": 4800,
  "chunk_id": 0
}

{
  "type": "word_timestamps",
  "word_timestamps": [
    {
      "word": "Hello",
      "start_ms": 0,
      "end_ms": 320,
      "char_start": 0,
      "char_end": 5,
      "score": 0.98
    },
    {
      "word": "world",
      "start_ms": 340,
      "end_ms": 620,
      "char_start": 6,
      "char_end": 11,
      "score": 0.95
    }
  ],
  "chunk_id": 0
}

{
  "type": "final",
  "final": true,
  "chunks": 10,
  "total_samples": 48000,
  "dur_ms": 2000,
  "gen_ms": 150,
  "rtf": 0.075
}

{
  "type": "generation_started",
  "generation_started": true,
  "chunk_id": 0,
  "text": "Hello from sunny Barcelona"
}

{
  "type": "chunk_complete",
  "chunk_complete": true,
  "chunk_id": 0,
  "audio_seconds": 1.2,
  "gen_ms": 150,
  "is_final": false
}

{
  "type": "session_closed",
  "session_closed": true,
  "total_audio_seconds": 5.4,
  "total_text_chunks": 3,
  "total_audio_chunks": 15
}

{
  "type": "error",
  "code": "provider_error",
  "message": "Provider returned an unexpected error"
}
API key issued by SLNG. Pass as Authorization: Bearer <token> in the WebSocket upgrade request headers.
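As a minimal sketch of the authentication step, the headers for the WebSocket upgrade request could be assembled as shown here. The helper name `build_upgrade_headers` is hypothetical, and the endpoint URL is not given on this page, so it is deliberately left out.

```python
def build_upgrade_headers(api_key: str) -> dict[str, str]:
    """Return the headers to attach to the WebSocket upgrade request.

    Hypothetical helper: only the header name and the Bearer scheme
    come from the documentation above.
    """
    if not api_key or not api_key.strip():
        raise ValueError("an SLNG API key is required")
    # The key is sent as a Bearer token in the Authorization header.
    return {"Authorization": f"Bearer {api_key.strip()}"}


if __name__ == "__main__":
    # "example-key" is a placeholder, not a real credential.
    print(build_upgrade_headers("example-key"))
```

The resulting dictionary can be handed to whichever WebSocket client is in use; how extra headers are passed differs between client libraries, so check the one you choose.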
Method: GET (WebSocket upgrade request)
Target world part override. Auto-selected if not provided. Available world parts: eu.
request: Send text with synthesis parameters to KugelAudio TTS.
config: Configure a KugelAudio streaming TTS session before sending text chunks.
text: Send an incremental text chunk for streaming TTS generation.
flush: Force immediate generation of all buffered text.
close: End the streaming session and flush remaining buffered text.
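The client-side messages described above can be serialized in order for one streaming session, roughly as follows. This is a sketch under assumptions: the function name `streaming_frames` is hypothetical, the default parameter values are simply copied from the example payloads on this page, and the ordering (config, then text chunks, then an optional flush, then close) is inferred from the message descriptions rather than stated as a requirement.

```python
import json
from typing import Iterable, Iterator


def streaming_frames(
    chunks: Iterable[str],
    voice_id: int = 268,              # value taken from the page's example
    model_id: str = "kugel-1-turbo",  # value taken from the page's example
    cfg_scale: float = 2,
    sample_rate: int = 24000,
    flush_timeout_ms: int = 500,
    force_flush: bool = False,
) -> Iterator[str]:
    """Yield JSON text frames for one streaming session (hypothetical helper)."""
    # 1. Configure the session before any text is sent.
    yield json.dumps({
        "type": "config",
        "voice_id": voice_id,
        "model_id": model_id,
        "cfg_scale": cfg_scale,
        "sample_rate": sample_rate,
        "flush_timeout_ms": flush_timeout_ms,
    })
    # 2. Send incremental text chunks, skipping empty strings.
    for chunk in chunks:
        if chunk:
            yield json.dumps({"type": "text", "text": chunk})
    # 3. Optionally force generation of whatever is still buffered.
    if force_flush:
        yield json.dumps({"type": "flush", "flush": True})
    # 4. End the session; per the description, close also flushes.
    yield json.dumps({"type": "close", "close": True})


if __name__ == "__main__":
    for frame in streaming_frames(["Hello from sunny ", "Barcelona"]):
        print(frame)
```

Because close is described as flushing any remaining buffered text, the explicit flush is left off by default and is only useful when audio is wanted before the session ends.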
audio_chunk: Chunk of base64-encoded PCM audio from KugelAudio with encoding metadata.
word_timestamps: Word-level timestamp alignment data from KugelAudio.
final: Signals generation is complete with KugelAudio performance statistics.
generation_started: Indicates buffered text has reached the generation threshold.
chunk_complete: Signals that audio generation for a text chunk is complete.
session_closed: Signals the streaming session has ended with aggregate statistics.
error: Indicates an error occurred during synthesis.
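On the receiving side, a client has to dispatch each server message by its type and decode the audio payloads. The sketch below illustrates one way to do that; the names `decode_audio_chunk` and `handle_frame` are hypothetical. It assumes the audio field holds headerless signed 16-bit little-endian PCM, as the enc value pcm_s16le suggests. Note that the sample payload on this page begins with bytes that decode to a RIFF/WAVE header, so it is worth checking real frames before relying on that assumption.

```python
import base64
import json
import struct


def decode_audio_chunk(message: dict) -> list[int]:
    """Decode a base64 'audio' payload into signed 16-bit samples.

    Assumption: the payload is raw PCM with no container header.
    """
    if message.get("enc") != "pcm_s16le":
        raise ValueError(f"unsupported encoding: {message.get('enc')!r}")
    raw = base64.b64decode(message["audio"])
    count = len(raw) // 2  # two bytes per sample; ignore a trailing odd byte
    return list(struct.unpack(f"<{count}h", raw[: count * 2]))


def handle_frame(frame: str, pcm: list[int]) -> str:
    """Process one server frame, appending decoded audio to `pcm`.

    Returns the message type so the caller can react to 'final',
    'session_closed', and the other informational messages.
    """
    message = json.loads(frame)
    kind = message.get("type")
    if kind == "audio_chunk":
        pcm.extend(decode_audio_chunk(message))
    elif kind == "error":
        # Surface the documented 'code' and 'message' fields to the caller.
        raise RuntimeError(f"{message.get('code')}: {message.get('message')}")
    return kind
```

A receive loop would call `handle_frame` for every incoming text frame and stop once it returns final (one-shot request) or session_closed (streaming session).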