Live - REST

Speak

Text-to-speech that starts in milliseconds.

Low-latency streaming TTS with 36 stock voices, free instant voice cloning, and prompt-to-voice design. First audio bytes arrive in tens of milliseconds and play progressively.

Price: $0.06/min
Endpoint: POST /v1/audio/speech
Scope: voice:synthesize
Model id: pyai-voice

Async synthesis: $0.04/min

Voice cloning (enroll): Free

Voice design: Free

What you get

Snappy by default

Audio streams from the first byte, so playback starts almost instantly instead of after the whole clip renders.

Your voice, free to clone

Enroll a voice once and synthesize with it - cloning enrollment and prompt-to-voice design are both free.

Drop-in OpenAI shape

Same /v1/audio/speech contract your OpenAI client already speaks.

Streaming TTFB ~32-98 ms - 36 stock voices - Voice cloning (free) - Designed voices (free) - mp3 + wav

Time to first byte is roughly 32-98 ms on the streaming path, and audio plays progressively from there.
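To make progressive playback concrete, here is a minimal Python sketch of consuming a streamed body chunk by chunk rather than waiting for the whole clip. The helper name consume_stream, the on_chunk callback, and the 4096-byte chunk size are illustrative assumptions, not part of the Speak API. The timing it reports starts when reading begins, so it is only a rough stand-in for the TTFB figure above.

```python
import io
import time
from typing import BinaryIO, Callable, Optional


def consume_stream(
    source: BinaryIO,
    on_chunk: Callable[[bytes], None],
    chunk_size: int = 4096,
) -> dict:
    """Read a streaming body in chunks and hand each one to on_chunk as it arrives.

    `source` is any object with read(n), such as the response returned by
    urllib.request.urlopen. Hypothetical helper, for illustration only.
    """
    started = time.monotonic()
    first_chunk_seconds: Optional[float] = None
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty bytes means the stream has ended
            break
        if first_chunk_seconds is None:
            # Measured from the start of reading, not from when the request was sent.
            first_chunk_seconds = time.monotonic() - started
        total += len(chunk)
        on_chunk(chunk)  # e.g. feed an audio player or append to a file
    return {"bytes": total, "first_chunk_seconds": first_chunk_seconds}
```

In practice on_chunk could be the write method of an open file or the feed method of an audio player, so playback can begin while later chunks are still arriving.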

Start in minutes

cURL

curl https://api.pyai.com/v1/audio/speech \
  -H "Authorization: Bearer $PYAI_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"pyai-voice","input":"Hello from PyAI.","voice":"stock_ava_en_us"}' \
  --output hello.mp3
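Python

The same request can be sketched with only the Python standard library. The function names build_speech_request and synthesize_to_file are hypothetical, and the JSON Content-Type header is an assumption based on the OpenAI-style contract. The URL, model id, voice id, and bearer-token header are taken from the cURL example.

```python
import json
import urllib.request

API_URL = "https://api.pyai.com/v1/audio/speech"


def build_speech_request(text: str, voice: str, api_key: str) -> urllib.request.Request:
    """Build the POST request from the cURL example (hypothetical helper)."""
    body = json.dumps({"model": "pyai-voice", "input": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the endpoint expects a JSON body, as OpenAI-style APIs do.
            "Content-Type": "application/json",
        },
    )


def synthesize_to_file(text: str, voice: str, api_key: str, path: str) -> int:
    """Send the request and write the audio to `path` as it streams in."""
    written = 0
    req = build_speech_request(text, voice, api_key)
    with urllib.request.urlopen(req) as resp, open(path, "wb") as out:
        while True:
            chunk = resp.read(4096)
            if not chunk:
                break
            written += out.write(chunk)
    return written
```

Calling synthesize_to_file("Hello from PyAI.", "stock_ava_en_us", os.environ["PYAI_KEY"], "hello.mp3") after importing os would mirror the cURL command.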

FAQ

How many voices are there?

36 stock voices today, plus your own cloned voices and prompt-designed voices - browse the live catalog at GET /v1/voices.

What does cloning cost?

Enrollment is free; you pay only for the audio you synthesize, billed per minute.
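As a rough guide to the per-minute arithmetic, the sketch below estimates spend from audio duration using the rates listed on this page ($0.06/min for streaming, $0.04/min for async). The function name estimate_cost is hypothetical, and it assumes usage is prorated linearly by the second. Actual billing may round differently, so treat the result as an estimate.

```python
from decimal import Decimal

# Rates as listed on this page, in US dollars per minute of synthesized audio.
RATES = {"streaming": Decimal("0.06"), "async": Decimal("0.04")}


def estimate_cost(audio_seconds: float, mode: str = "streaming") -> Decimal:
    """Estimate synthesis cost in dollars.

    Assumption: linear proration by the second. The page only says
    "billed per minute", so real invoices may round differently.
    """
    minutes = Decimal(str(audio_seconds)) / Decimal(60)
    return (minutes * RATES[mode]).quantize(Decimal("0.0001"))
```

Under these assumptions, 90 seconds of streamed audio comes to about $0.09, and 10 minutes of async synthesis to about $0.40. Cloning enrollment and voice design add nothing, since both are free.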

Build with Speak today.

Start free with $50.00 in credit - no card. Your test key works instantly.

No credit card - OpenAI-compatible - cancel anytime