TTS Streaming API - Tabbly Docs

Overview

Tabbly TTS provides a streaming Text-to-Speech API that generates high-quality voice audio in real time for voice AI applications. The API streams audio as it is generated, which keeps latency low for real-time use cases.

Base URL

https://api.tabbly.io

Endpoint

POST /tts/stream

Authentication

All requests require an API key passed via the X-API-Key header.

X-API-Key (string, required): Your Tabbly TTS API key

Request

text (string, required): The text to convert to speech
voice_id (string, default: "Ashley"): Voice ID to use for synthesis
model_id (string, default: "tabbly-tts"): Model ID to use

Response

Content-Type: audio/wav or application/octet-stream
Format: LINEAR16 PCM, 48 kHz, mono
Streaming: Yes. Audio is streamed as it is generated.
Protocol: HTTP streaming with WAV-encoded audio chunks embedded in the stream

Example Request

curl -X POST 'https://api.tabbly.io/tts/stream' \
-H 'Content-Type: application/json' \
-H 'X-API-Key: your-api-key-here' \
-d '{
    "text": "Hello, this is a test of the Tabbly TTS streaming API",
    "voice_id": "Ashley",
    "model_id": "tabbly-tts"
}'
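The same request can be sketched in Python using only the standard library. The helper names `build_tts_request` and `save_stream`, the 30-second timeout, and the 4096-byte read size are illustrative assumptions, not part of the Tabbly API. Note that `urlopen` raises `urllib.error.HTTPError` for non-2xx responses.

```python
import json
import urllib.request

API_URL = "https://api.tabbly.io/tts/stream"


def build_tts_request(text: str, api_key: str,
                      voice_id: str = "Ashley",
                      model_id: str = "tabbly-tts") -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    payload = {"text": text, "voice_id": voice_id, "model_id": model_id}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def save_stream(request: urllib.request.Request, path: str,
                read_size: int = 4096) -> int:
    """Send the request and write the raw response body to a file.

    Returns the number of bytes written. The body is written unparsed,
    so it may contain one or more embedded WAV files.
    """
    written = 0
    with urllib.request.urlopen(request, timeout=30) as response, \
            open(path, "wb") as out:
        while True:
            chunk = response.read(read_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return written
```

A call such as `save_stream(build_tts_request("Hello", "your-api-key-here"), "out.bin")` would store the stream for offline inspection; real-time playback should process chunks as they arrive instead.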

Audio Format

Sample Rate (integer): 48000 Hz (fixed)
Channels (integer): 1 (mono)
Bit Depth (integer): 16-bit
Format (string): LINEAR16 PCM
MIME Type (string): audio/wav (stream may contain embedded WAV files)
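Because these values are fixed, the relationship between PCM byte counts and playback time is simple arithmetic. The sketch below works it out; the constant and function names are illustrative and not part of the API.

```python
SAMPLE_RATE_HZ = 48_000  # fixed sample rate
BYTES_PER_SAMPLE = 2     # 16-bit LINEAR16
CHANNELS = 1             # mono

# 48000 samples/s * 2 bytes * 1 channel = 96000 bytes per second
BYTES_PER_SECOND = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS


def pcm_duration_seconds(num_pcm_bytes: int) -> float:
    """Playback duration of raw PCM (WAV headers excluded), in seconds."""
    return num_pcm_bytes / BYTES_PER_SECOND


def pcm_bytes_for_ms(milliseconds: int) -> int:
    """Number of PCM bytes that hold the given playback time."""
    return BYTES_PER_SECOND * milliseconds // 1000
```

For example, 10 ms of audio is 480 samples, which is 960 bytes.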

WAV Header Processing

The response may include WAV files embedded in the stream. When processing the stream:

Detect WAV Headers: Look for RIFF and WAVE markers
Extract PCM Data: Find the data chunk and extract raw PCM audio
Handle Multiple WAV Files: The stream may contain multiple WAV files
Process Audio Chunks: Handle audio data as it arrives for real-time playback
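The first two steps can be sketched for a single, complete WAV file held in memory. This is a minimal illustration that assumes the standard RIFF chunk layout; `extract_pcm` is a hypothetical helper name, not a Tabbly function.

```python
import struct


def extract_pcm(wav_bytes: bytes) -> bytes:
    """Return the PCM payload of one complete WAV file held in memory."""
    if (len(wav_bytes) < 12 or wav_bytes[0:4] != b"RIFF"
            or wav_bytes[8:12] != b"WAVE"):
        raise ValueError("not a RIFF/WAVE header")

    pos = 12  # first chunk starts right after "RIFF", file size, "WAVE"
    while pos + 8 <= len(wav_bytes):
        chunk_id = wav_bytes[pos:pos + 4]
        (chunk_size,) = struct.unpack_from("<I", wav_bytes, pos + 4)
        body = pos + 8
        if chunk_id == b"data":
            # Slicing tolerates a truncated file: it returns what is present.
            return wav_bytes[body:body + chunk_size]
        # Skip any other chunk; RIFF chunks are padded to an even length.
        pos = body + chunk_size + (chunk_size & 1)
    raise ValueError("no data chunk found")
```

Walking the chunk list, rather than searching for the bytes `data` anywhere in the buffer, avoids false matches inside other chunks.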

WAV File Structure

RIFF (4 bytes) - "RIFF"
File Size (4 bytes)
WAVE (4 bytes) - "WAVE"
... (format chunk, etc.)
data (4 bytes) - "data"
Data Size (4 bytes)
[PCM Audio Data]
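The format chunk elided in the layout above carries the sample rate, channel count, and bit depth, so reading it is a cheap way to confirm that a stream matches the documented audio format. The sketch below assumes the standard 16-byte PCM `fmt ` chunk; `WavFormat` and `read_wav_format` are illustrative names.

```python
import struct
from typing import NamedTuple


class WavFormat(NamedTuple):
    audio_format: int     # 1 means uncompressed PCM
    channels: int
    sample_rate: int
    bits_per_sample: int


def read_wav_format(header: bytes) -> WavFormat:
    """Parse the 'fmt ' chunk of a WAV header."""
    if len(header) < 12 or header[0:4] != b"RIFF" or header[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE header")
    pos = 12
    while pos + 8 <= len(header):
        chunk_id = header[pos:pos + 4]
        (size,) = struct.unpack_from("<I", header, pos + 4)
        if chunk_id == b"fmt ":
            if pos + 8 + 16 > len(header):
                raise ValueError("truncated fmt chunk")
            # Little-endian: format, channels, sample rate, byte rate,
            # block align, bits per sample (16 bytes in total).
            fmt, channels, rate, _byte_rate, _block_align, bits = \
                struct.unpack_from("<HHIIHH", header, pos + 8)
            return WavFormat(fmt, channels, rate, bits)
        pos += 8 + size + (size & 1)  # chunks are padded to even length
    raise ValueError("no fmt chunk found")
```

For audio matching the Audio Format section, the expected result is `WavFormat(1, 1, 48000, 16)`.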

Processing Tips

Skip Headers: The first 44 bytes typically contain the WAV header, but the header can be longer when extra chunks are present
Extract Data Chunk: Look for the data marker to find where the audio starts
Handle Multiple Files: Stream may contain multiple WAV files sequentially
Frame Alignment: Ensure chunks are aligned to 16-bit sample boundaries (even bytes)
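These tips can be combined into an incremental parser that accepts network chunks of any size and returns PCM as soon as it is available. The sketch below rests on two assumptions that the documentation does not confirm: WAV files arrive back to back with nothing between them, and each data chunk's size field is accurate (some streaming encoders write a placeholder size, which this code does not handle). `WavStreamExtractor` is a hypothetical name.

```python
import struct


class WavStreamExtractor:
    """Incrementally pull PCM out of a stream of back-to-back WAV files."""

    def __init__(self) -> None:
        self._buf = bytearray()
        self._data_left = 0  # bytes still expected in the current data chunk

    def feed(self, chunk: bytes) -> bytes:
        """Add network bytes; return whatever PCM is available so far."""
        self._buf.extend(chunk)
        pcm = bytearray()
        while True:
            if self._data_left:
                take = min(self._data_left, len(self._buf))
                pcm += self._buf[:take]
                del self._buf[:take]
                self._data_left -= take
                if self._data_left:
                    break  # current data chunk continues in later bytes
            elif not self._try_consume_header():
                break  # header incomplete: keep bytes buffered
        return bytes(pcm)

    def _try_consume_header(self) -> bool:
        """Consume one full header (through the data size); False if incomplete."""
        buf = self._buf
        if len(buf) < 12:
            return False
        if buf[0:4] != b"RIFF" or buf[8:12] != b"WAVE":
            raise ValueError("expected a RIFF/WAVE header")
        pos = 12
        while pos + 8 <= len(buf):
            chunk_id = bytes(buf[pos:pos + 4])
            (size,) = struct.unpack_from("<I", buf, pos + 4)
            if chunk_id == b"data":
                del buf[:pos + 8]
                self._data_left = size
                return True
            pos += 8 + size + (size & 1)  # skip non-data chunk plus padding
        return False
```

Call `feed()` once per received chunk and pass the returned bytes to the playback path; incomplete headers stay in the internal buffer until more bytes arrive.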

Error Responses

400 (object): Bad Request - Invalid parameters or missing required fields
401 (object): Unauthorized - Invalid or missing API key
402 (object): Payment Required - Insufficient wallet balance
500 (object): Server error
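One way to act on these codes is to translate them into a single exception type that records whether a retry is worthwhile. This is a sketch: `TabblyTTSError` and `check_status` are hypothetical names, and treating 5xx responses as retryable is an assumption rather than documented behavior. The error body's schema is not documented here, so the sketch does not parse it.

```python
class TabblyTTSError(Exception):
    """Raised for a non-2xx response (hypothetical helper type)."""

    def __init__(self, status: int, message: str, retryable: bool) -> None:
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.retryable = retryable


# Messages come from the table above; the retryable flags are assumptions.
_KNOWN_ERRORS = {
    400: ("Bad Request: invalid parameters or missing required fields", False),
    401: ("Unauthorized: invalid or missing API key", False),
    402: ("Payment Required: insufficient wallet balance", False),
    500: ("Server error", True),
}


def check_status(status: int) -> None:
    """Return quietly for 2xx; raise TabblyTTSError for anything else."""
    if 200 <= status < 300:
        return
    message, retryable = _KNOWN_ERRORS.get(
        status, ("Unexpected status", status >= 500)
    )
    raise TabblyTTSError(status, message, retryable)
```

Callers can catch `TabblyTTSError` and inspect `retryable` before deciding whether to repeat the request.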

Rate Limits

API rate limits apply to prevent abuse. Contact support if you need higher limits.

Best Practices

Stream Processing

Process audio chunks as they arrive rather than waiting for the complete response. This reduces latency and enables real-time playback.

WAV Processing

The API may send WAV files embedded in the stream. Always extract PCM data from WAV chunks for proper playback. See WAV Header Processing above.

Error Handling

Always handle HTTP errors and network timeouts gracefully. Implement retry logic for transient failures.
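Retry logic is often written as exponential backoff with jitter. The sketch below is generic and makes no network calls itself: `TransientError` and `with_retries` are hypothetical names, and the attempt count and delays are arbitrary defaults. Retrying is simplest before any audio has been played, since a repeated request presumably starts the audio from the beginning.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Raised by the caller's code for failures worth retrying."""


def with_retries(operation: Callable[[], T],
                 max_attempts: int = 3,
                 base_delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation(), retrying on TransientError with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** (attempt - 1)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

The `sleep` parameter exists so tests can substitute a no-op; wrap timeouts and 5xx responses in `TransientError` inside `operation` to opt them into retries.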

Voice Selection

Choose appropriate voice_id based on your use case. Different voices may have different characteristics and languages.

Text Length

For very long texts, consider splitting into smaller chunks for better streaming performance and lower latency.
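A simple way to split long text is at sentence boundaries, packing sentences into chunks up to a size limit. In the sketch below, `split_text` is a hypothetical helper and the 200-character default is an arbitrary choice; the documentation does not state a length limit.

```python
import re
from typing import List


def split_text(text: str, max_chars: int = 200) -> List[str]:
    """Split text at sentence ends into chunks of at most max_chars.

    A single sentence longer than max_chars is kept whole rather than
    being cut mid-sentence.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as the `text` field of its own request, in order.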

Connection Reuse

Reuse HTTP client connections when making multiple requests to improve performance and reduce connection overhead.

Buffering

For real-time playback, buffer 10-20 ms of audio before starting output to smooth out network jitter and prevent audio artifacts.
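A small pre-buffer illustrates the idea: hold audio back until the threshold is reached, then pass everything through. `PrebufferQueue` is a hypothetical helper, and it expects PCM with WAV headers already removed.

```python
# 48000 samples/s * 2 bytes per sample / 1000 = 96 bytes per millisecond
BYTES_PER_MS = 48_000 * 2 // 1000


class PrebufferQueue:
    """Hold back PCM until prebuffer_ms is available, then pass it through."""

    def __init__(self, prebuffer_ms: int = 20) -> None:
        self._threshold = prebuffer_ms * BYTES_PER_MS
        self._pending = bytearray()
        self._started = False

    def push(self, pcm: bytes) -> bytes:
        """Add PCM; return the bytes that are ready to play (possibly empty)."""
        self._pending.extend(pcm)
        if not self._started and len(self._pending) < self._threshold:
            return b""  # still filling the initial buffer
        self._started = True
        ready = bytes(self._pending)
        self._pending.clear()
        return ready

    def flush(self) -> bytes:
        """Return whatever is left at end of stream, even below the threshold."""
        ready = bytes(self._pending)
        self._pending.clear()
        return ready
```

A larger `prebuffer_ms` tolerates more jitter at the cost of added start-up delay.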

Frame Alignment

Ensure audio chunks are aligned to 16-bit sample boundaries (even number of bytes) to prevent audio clicks or pops.
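Network chunks can end in the middle of a sample, so alignment usually means carrying the leftover byte over to the next chunk. A minimal sketch, using the hypothetical helper name `FrameAligner`:

```python
class FrameAligner:
    """Re-chunk PCM so every emitted block holds whole 16-bit samples."""

    def __init__(self, frame_bytes: int = 2) -> None:
        self._frame_bytes = frame_bytes  # 2 bytes per 16-bit mono sample
        self._carry = b""

    def align(self, chunk: bytes) -> bytes:
        """Return the largest aligned prefix; keep the remainder for later."""
        data = self._carry + chunk
        usable = len(data) - (len(data) % self._frame_bytes)
        self._carry = data[usable:]
        return data[:usable]
```

For example, feeding 3 bytes returns the first 2, and the third byte is prepended to the next chunk.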

Troubleshooting

No Audio Output

Verify API key is correct and has sufficient wallet balance
Check network connectivity to https://api.tabbly.io
Review response status code (should be 200)
Verify WAV headers are being detected and processed correctly
Check logs for HTTP connection errors

Audio Quality Issues

Ensure sample rate matches (48000 Hz)
Verify WAV header parsing is working correctly
Check audio data format (should be LINEAR16 PCM)
Ensure frame alignment (even number of bytes per chunk)
Check for proper PCM extraction from WAV files

Performance Issues

Reuse HTTP client instances for better performance
Monitor API response times
Consider caching for repeated text
Implement proper buffering for real-time playback
Check network latency to API endpoint
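Caching repeated text can be sketched as a small least-recently-used cache keyed on the request parameters. `AudioCache` is a hypothetical helper, the `synthesize` callable stands in for whatever function performs the actual request, and the 128-entry limit is arbitrary.

```python
from collections import OrderedDict
from typing import Callable, Tuple

Key = Tuple[str, str, str]


class AudioCache:
    """LRU cache of synthesized audio keyed on (text, voice_id, model_id)."""

    def __init__(self, synthesize: Callable[[str, str, str], bytes],
                 max_entries: int = 128) -> None:
        self._synthesize = synthesize  # caller-supplied request function
        self._max_entries = max_entries
        self._entries: "OrderedDict[Key, bytes]" = OrderedDict()

    def get(self, text: str, voice_id: str = "Ashley",
            model_id: str = "tabbly-tts") -> bytes:
        key = (text, voice_id, model_id)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        audio = self._synthesize(text, voice_id, model_id)
        self._entries[key] = audio
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return audio
```

This trades memory for latency, so it suits short, frequently repeated prompts more than long one-off texts.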

WAV Processing Issues

Verify WAV headers are being detected (RIFF and WAVE markers)
Check if data chunk is being found correctly
Ensure multiple WAV files in stream are handled properly
Verify PCM data extraction is working
Check for incomplete WAV files (keep in buffer until complete)

Integration Examples

Real-Time Playback

For real-time playback, implement buffering to smooth out network jitter:

import asyncio
import httpx

# 10 ms of audio: 480 samples at 48 kHz x 2 bytes per 16-bit mono sample
CHUNK_SIZE = 960

async def stream_with_buffering(text: str, api_key: str, output_queue: asyncio.Queue):
    """Stream TTS and push fixed-size chunks to output_queue for smooth playback."""
    url = "https://api.tabbly.io/tts/stream"
    headers = {"Content-Type": "application/json", "X-API-Key": api_key}
    data = {"text": text, "voice_id": "Ashley", "model_id": "tabbly-tts"}

    buffer = bytearray()

    async with httpx.AsyncClient() as client:
        async with client.stream("POST", url, json=data, headers=headers) as response:
            response.raise_for_status()

            async for chunk in response.aiter_bytes():
                # Placeholder: strip WAV headers and keep only PCM here.
                # As written, the raw response bytes are forwarded unchanged.
                buffer.extend(chunk)
                # Emit every complete CHUNK_SIZE block; keep the remainder buffered
                while len(buffer) >= CHUNK_SIZE:
                    await output_queue.put(bytes(buffer[:CHUNK_SIZE]))
                    del buffer[:CHUNK_SIZE]

    # Flush the final partial chunk so the tail of the audio is not dropped
    if buffer:
        await output_queue.put(bytes(buffer))

Next Steps

Learn how to integrate with LiveKit: LiveKit Integration
Review best practices: Best Practices
Get your API key from the Tabbly dashboard
Review example implementations in the code samples above
