{
  "type": "init",
  "model": "kugelaudio/kugel:1-turbo",
  "voice": "268",
  "config": {
    "cfg_scale": 2,
    "sample_rate": 24000
  }
}

{
  "type": "text",
  "text": "Hello, this is a test of text-to-speech synthesis."
}

{
  "type": "flush"
}

{
  "type": "clear"
}

{
  "type": "cancel"
}

{
  "type": "ready",
  "session_id": "sess_tts_abc123"
}

{
  "type": "audio_chunk",
  "data": "UklGRiQAAABXQVZFZm10IBAAAAABAAEA...",
  "sequence": 1
}

{
  "type": "segment_start",
  "segment_id": "seg_001"
}

{
  "type": "segment_end",
  "segment_id": "seg_001"
}

{
  "type": "flushed"
}

{
  "type": "cleared"
}

{
  "type": "audio_end",
  "duration": 3.5
}

{
  "type": "error",
  "code": "provider_error",
  "message": "Provider returned an unexpected error"
}

Text-to-Speech API for generating speech from text using KugelAudio Kugel 1. High-quality text-to-speech with expressiveness control. Establishes a WebSocket connection for real-time text-to-speech using the unified SLNG TTS protocol.
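The client-side messages in the examples above (init, text, and the flush/clear/cancel controls) are plain JSON objects, so they can be built with a few small helpers. The following is a minimal sketch, not an official client: the function names `init_message`, `text_message`, and `control_message` are hypothetical, and the defaults simply mirror the example values shown on this page.

```python
import json


def init_message(model: str, voice: str, cfg_scale: float = 2, sample_rate: int = 24000) -> str:
    """Serialize an init message using the fields from the example on this page."""
    return json.dumps({
        "type": "init",
        "model": model,
        "voice": voice,
        "config": {"cfg_scale": cfg_scale, "sample_rate": sample_rate},
    })


def text_message(text: str) -> str:
    """Serialize a text message carrying the text to synthesize."""
    return json.dumps({"type": "text", "text": text})


def control_message(kind: str) -> str:
    """Serialize one of the payload-free control messages."""
    if kind not in {"flush", "clear", "cancel"}:
        raise ValueError(f"unknown control message: {kind!r}")
    return json.dumps({"type": kind})
```

A session would presumably send `init_message("kugelaudio/kugel:1-turbo", "268")` first, then one or more text messages, then `control_message("flush")`. How the strings are sent depends on the WebSocket client in use.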
API key issued by SLNG. Pass as Authorization: Bearer <token> in the WebSocket upgrade request headers.
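As a rough illustration of the header described above, the sketch below builds the header mapping for the upgrade request. The helper name `build_auth_headers` is hypothetical. How the mapping is handed to a WebSocket client depends on the library, and the endpoint URL is not given on this page, so both are left out.

```python
def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header for the WebSocket upgrade request."""
    if not api_key:
        raise ValueError("an SLNG API key is required")
    # Bearer scheme, as stated in the authentication note above.
    return {"Authorization": f"Bearer {api_key}"}
```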
GET
WebSocket connection headers for KugelAudio TTS channels.
init: Initialize a KugelAudio TTS session with model, voice, and synthesis configuration.
text: Send text to synthesize into audio output.
flush: Force any buffered text/audio to be finalized and delivered.
clear: Clear any queued text/audio from the current session.
cancel: Cancel the current generation and stop any further audio.
ready: Indicates the session is ready to receive messages.
audio_chunk: Chunk of base64-encoded audio data.
segment_start: Signals the start of a synthesized segment.
segment_end: Signals the end of a synthesized segment.
flushed: Acknowledges that buffered output was flushed.
cleared: Acknowledges that queued output was cleared.
audio_end: Signals the end of audio generation.
error: Indicates an error occurred during synthesis.
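The server messages described above can be handled with a simple dispatch on the `type` field. The sketch below is one possible approach under stated assumptions: `collect_audio` and `TTSError` are hypothetical names, the input is any iterable of raw JSON strings (for example, frames read from a WebSocket), and chunks are joined in arrival order. Whether decoded chunks can be concatenated directly depends on the audio encoding, which this page does not specify.

```python
import base64
import json


class TTSError(Exception):
    """Raised when the server sends an error message."""


def collect_audio(raw_messages):
    """Consume server messages until audio_end; return (audio_bytes, duration)."""
    chunks = []
    for raw in raw_messages:
        msg = json.loads(raw)
        kind = msg.get("type")
        if kind == "audio_chunk":
            # The data field is base64-encoded audio.
            chunks.append(base64.b64decode(msg["data"]))
        elif kind == "audio_end":
            return b"".join(chunks), msg.get("duration")
        elif kind == "error":
            raise TTSError(f"{msg.get('code')}: {msg.get('message')}")
        # ready, segment_start, segment_end, flushed, and cleared carry no
        # audio payload, so this sketch ignores them.
    # The stream ended without audio_end; return what was received.
    return b"".join(chunks), None
```

Returning the `duration` from `audio_end` alongside the bytes lets a caller compare it with the amount of audio actually received.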