Text-to-speech

Switch from OpenAI TTS to PyAI Speak

A genuine drop-in: change three lines, keep your code.

Speak implements the same /v1/audio/speech contract your OpenAI client already uses. Point base_url at https://api.pyai.com/v1, swap in a PyAI key, set model to pyai-voice, and pick a PyAI voice id. On top of that you get free voice cloning, prompt-to-voice design, and raw 8 kHz G.711 telephony output.

Why teams move from OpenAI TTS

Keep your OpenAI SDK

No new client. The request and response shapes match, so only base_url, the key, the model, and the voice id change.

Voices you can't get there

36 stock voices plus free cloned and prompt-designed voices, versus a fixed stock set on OpenAI.

Telephony output built in

Setting response_format to g711_ulaw or g711_alaw returns raw 8 kHz G.711 for carriers, with no extra conversion step.
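Raw G.711 carries one 8-bit sample per sample period, so at 8 kHz each second of audio is 8000 bytes and a 20 ms frame is 160 bytes. The sketch below splits a headerless G.711 payload into fixed-duration frames, the shape telephony media stacks commonly expect. The function names are illustrative and not part of any PyAI SDK, and the 20 ms default is an assumption about your carrier's packetization.

```python
SAMPLE_RATE_HZ = 8000   # G.711 is sampled at 8 kHz
BYTES_PER_SAMPLE = 1    # one companded 8-bit sample per byte


def frame_g711(raw: bytes, frame_ms: int = 20) -> list[bytes]:
    """Split headerless G.711 audio into frames of frame_ms milliseconds.

    The final frame may be shorter if the payload is not a whole
    multiple of the frame size.
    """
    frame_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * frame_ms // 1000
    if frame_bytes <= 0:
        raise ValueError("frame_ms is too small to hold a single sample")
    return [raw[i:i + frame_bytes] for i in range(0, len(raw), frame_bytes)]


def duration_seconds(raw: bytes) -> float:
    """Playback duration of a headerless G.711 payload."""
    return len(raw) / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
```

One second of audio (8000 bytes) yields 50 frames of 160 bytes each at the default frame size.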

Before and after

Before - OpenAI

from openai import OpenAI

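# Reads OPENAI_API_KEY from the environment; base_url defaults to https://api.openai.com/v1.
client = OpenAI()
# with_streaming_response writes the audio to disk as it arrives.
with client.audio.speech.with_streaming_response.create(
    model="gpt-4o-mini-tts",
    voice="alloy",
    input="Hello there.",
) as response:
    response.stream_to_file("hello.mp3")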

After - PyAI Speak

from openai import OpenAI

# Only base_url, key, model, and voice change.
client = OpenAI(api_key="pyai_live_...", base_url="https://api.pyai.com/v1")
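# with_streaming_response writes the audio to disk as it arrives.
with client.audio.speech.with_streaming_response.create(
    model="pyai-voice",
    voice="stock_ava_en_us",
    input="Hello there.",
) as response:
    response.stream_to_file("hello.mp3")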

Migration checklist

Swap the connection

Change the base URL or WebSocket URL, pass a PyAI key, and keep the old client where the API shape is compatible.

Map models, voices, and formats

Use the table below to replace model ids, voice ids, response formats, sample rates, and auth headers without rewriting the product flow.

Replay customer traffic

Run real prompts, recordings, and phone-call samples through both systems. Compare latency, quality, completion rate, and all-in cost.

Launch with guardrails

Start on free credits or test keys, add usage alerts, then enable Trace, Recap, or managed Agents when calls need production review.

If the system you are leaving already uses OpenAI-compatible transcription, speech, or realtime APIs, start with the smallest compatible swap: base URL, key, model, and voice. The real test is whether PyAI improves cost, latency, and caller outcomes on your own traffic.
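To compare both systems on the same traffic, a small harness like the one below can time each call and count failures. It is a minimal sketch: `synthesize` stands for any function you supply that takes a prompt and returns audio bytes (for example, a thin wrapper around each provider's client), and it measures time to the complete response, not time to first byte.

```python
import statistics
import time
from typing import Callable, Iterable, Optional


def measure_latency(
    synthesize: Callable[[str], bytes],
    prompts: Iterable[str],
) -> dict[str, Optional[float]]:
    """Run every prompt through synthesize and summarize the timings.

    Failed calls are counted rather than raised, so one bad prompt
    does not abort the whole replay.
    """
    timings: list[float] = []
    failures = 0
    for prompt in prompts:
        start = time.perf_counter()
        try:
            synthesize(prompt)
        except Exception:
            failures += 1
            continue
        timings.append(time.perf_counter() - start)
    return {
        "calls": len(timings) + failures,
        "failures": failures,
        "median_s": statistics.median(timings) if timings else None,
        "max_s": max(timings) if timings else None,
    }
```

Run it once per provider with the same prompt list and compare the two summaries side by side; quality and cost still need their own review.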

What maps to what

OpenAI TTS	PyAI
`base_url: https://api.openai.com/v1`	`base_url: https://api.pyai.com/v1`
`OPENAI_API_KEY`	`pyai_live_...` (sent as `Authorization: Bearer`)
`model: gpt-4o-mini-tts`	`model: pyai-voice`
`voice: alloy`	`voice: stock_ava_en_us` (or list voices with `GET /v1/voices`)
`response_format: mp3/opus/aac/flac/wav/pcm`	same, plus `g711_ulaw` / `g711_alaw`
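One way to keep the swap reversible is to hold the table's values in configuration and build the client and request arguments from it. This is a sketch under assumptions: the environment variable name `PYAI_API_KEY` and the helper names are made up for illustration, and only the values listed in the table come from this page.

```python
import os
from typing import Mapping

PROVIDERS = {
    "openai": {
        "base_url": "https://api.openai.com/v1",
        "key_env": "OPENAI_API_KEY",
        "model": "gpt-4o-mini-tts",
        "voice": "alloy",
    },
    "pyai": {
        "base_url": "https://api.pyai.com/v1",
        "key_env": "PYAI_API_KEY",  # assumed name; use whatever your deployment sets
        "model": "pyai-voice",
        "voice": "stock_ava_en_us",
    },
}


def client_kwargs(provider: str, env: Mapping[str, str] = os.environ) -> dict:
    """Keyword arguments for OpenAI(...): the key and the base URL."""
    cfg = PROVIDERS[provider]
    return {"api_key": env[cfg["key_env"]], "base_url": cfg["base_url"]}


def speech_kwargs(provider: str, text: str, response_format: str = "mp3") -> dict:
    """Keyword arguments for client.audio.speech.create(...)."""
    cfg = PROVIDERS[provider]
    return {
        "model": cfg["model"],
        "voice": cfg["voice"],
        "input": text,
        "response_format": response_format,
    }
```

With these in place, the call becomes `OpenAI(**client_kwargs("pyai")).audio.speech.create(**speech_kwargs("pyai", "Hello there."))`, and switching back is a one-word change.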

Good to know

x-api-key is an accepted alias for the Authorization header if your client sets it that way.
For 8 kHz telephony, set response_format to g711_ulaw or g711_alaw (raw, headerless). pcm returns raw 16-bit little-endian mono audio at the requested sample_rate.
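Because pcm output is headerless, most players will not open it directly. A minimal sketch using Python's standard `wave` module wraps the raw samples in a WAV container; pass the same sample rate you requested (the 24000 in the usage note is only an example value, not a documented default).

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int) -> bytes:
    """Wrap raw 16-bit little-endian mono PCM in a WAV container."""
    if len(pcm) % 2 != 0:
        raise ValueError("16-bit PCM must contain an even number of bytes")
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 16-bit samples
        wav.setframerate(sample_rate)  # must match the rate of the raw audio
        wav.writeframes(pcm)
    return buf.getvalue()
```

For example, `open("hello.wav", "wb").write(pcm_to_wav(raw_bytes, 24000))` produces a file that standard audio tools can play.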

FAQ

Is it really a drop-in?

Yes, for /v1/audio/speech and /v1/audio/transcriptions. The request and response shapes match OpenAI's, so you reuse the same SDK.

What changes besides base_url?

Your key (pyai_live_... or pyai_test_...), the model (pyai-voice), and the voice id. Everything else stays.
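If you call the endpoint without the SDK, the same four values appear in a plain HTTP request. The sketch below only assembles the request (it does not send it) and shows both auth styles mentioned on this page. The helper name is hypothetical, and the key-prefix check assumes every key starts with one of the two prefixes listed above.

```python
import json

KEY_PREFIXES = ("pyai_live_", "pyai_test_")  # prefixes named on this page


def build_speech_request(
    api_key: str,
    text: str,
    *,
    use_x_api_key: bool = False,
    response_format: str = "mp3",
) -> dict:
    """Assemble method, URL, headers, and JSON body for /v1/audio/speech."""
    if not api_key.startswith(KEY_PREFIXES):
        raise ValueError("expected a key starting with pyai_live_ or pyai_test_")
    headers = {"Content-Type": "application/json"}
    if use_x_api_key:
        headers["x-api-key"] = api_key                 # alias header
    else:
        headers["Authorization"] = f"Bearer {api_key}"  # default style
    body = {
        "model": "pyai-voice",
        "voice": "stock_ava_en_us",
        "input": text,
        "response_format": response_format,
    }
    return {
        "method": "POST",
        "url": "https://api.pyai.com/v1/audio/speech",
        "headers": headers,
        "body": json.dumps(body),
    }
```

Hand the resulting dict to whichever HTTP client you already use; rejecting unexpected key prefixes early helps catch an OpenAI key pasted into the wrong variable.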

Built on

Speak

Text-to-speech that starts in milliseconds.

Drop in PyAI in one afternoon.

Start with $50 in free credits: keep your OpenAI client and change base_url, key, model, and voice.

No credit card - OpenAI-compatible - cancel anytime