
Switch from Retell to PyAI Omni

Keep the agent, drop the per-vendor plumbing.

Omni runs your whole voice agent on one WebSocket - listening, reasoning, retrieving from your knowledge base, and speaking - with ~431 ms median turn-taking, barge-in, and warm transfer. Move your agent's persona and knowledge to PyAI and connect a single socket at $0.05/min all-in.

Why teams move from Retell

End-to-end on one socket

No stitched-together transcriber, LLM, and voice pipeline: Omni handles the full loop and streams agent audio back over the same connection.

Grounded + tool-using

Bind your knowledge endpoint and webhook tools so the agent answers from your content and takes real actions, with barge-in and warm transfer.

Flat, predictable price

$0.05/min all-in, billed per second, with no separate platform fee and model/voice passthrough charges to reconcile.
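To make per-second billing concrete, here is a small sketch of the arithmetic. It assumes the $0.05/min rate is prorated linearly per second with no minimum charge or rounding step, which this page does not specify; `omniCallCost` is a hypothetical helper name, not part of any PyAI SDK.

```javascript
// Assumption: $0.05/min is prorated linearly per second, with no minimum charge.
const RATE_PER_MINUTE_USD = 0.05;

// Hypothetical helper: estimated cost in USD of one call of the given length.
function omniCallCost(durationSeconds) {
  if (!Number.isFinite(durationSeconds) || durationSeconds < 0) {
    throw new RangeError("durationSeconds must be a non-negative number");
  }
  return (durationSeconds * RATE_PER_MINUTE_USD) / 60;
}

// Estimate a 90-second call, then 1,000 calls of that length.
const oneCall = omniCallCost(90);
const thousandCalls = oneCall * 1000;
console.log(oneCall.toFixed(4), thousandCalls.toFixed(2));
```

Under those assumptions a 90-second call comes to about $0.075, so 1,000 such calls are roughly $75. Check your invoice for the actual rounding behavior before relying on the estimate.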

Before and after

Before - Retell

import { RetellWebClient } from "retell-client-js-sdk";

// Your server creates a web call with the Retell API key,
// selecting an LLM + voice in the agent config, then returns accessToken.
const client = new RetellWebClient();
await client.startCall({ accessToken });

After - PyAI Omni

// One socket: STT + reasoning + retrieval + TTS, grounded in your KB.
const ws = new WebSocket(
  "wss://api.pyai.com/v1/omni?agent_id=support&format=pcm16&rate=24000",
  ["pyai-key." + apiKey],
);
ws.onmessage = (e) => playAudioFrame(e.data); // agent audio down
// stream mic PCM16 @ 24kHz up (rate=16000 for telephony)
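
Browser microphone capture through the Web Audio API yields 32-bit float samples in the range [-1, 1], while the connect URL above requests pcm16. A minimal sketch of that conversion follows. `floatToPcm16` and `sendMicChunk` are hypothetical helpers, and sending the raw PCM16 buffer as a binary WebSocket frame is an assumption about the upstream wire format, which this page does not spell out.

```javascript
// Hypothetical helper: convert Web Audio float samples ([-1, 1]) to 16-bit signed PCM.
function floatToPcm16(floatSamples) {
  const pcm = new Int16Array(floatSamples.length);
  for (let i = 0; i < floatSamples.length; i++) {
    // Clamp first so out-of-range samples do not wrap around.
    const s = Math.max(-1, Math.min(1, floatSamples[i]));
    // The negative range is one step wider than the positive range.
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

// Assumption: PCM16 chunks are sent upstream as binary frames on the open socket.
function sendMicChunk(ws, floatSamples) {
  ws.send(floatToPcm16(floatSamples).buffer);
}
```

The capture sample rate should match the rate query parameter. One way to get 24 kHz in a browser is to create the context with `new AudioContext({ sampleRate: 24000 })`, though support for a given rate depends on the browser and device.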

Migration checklist

1. Swap the connection
   Change the base URL or WebSocket URL, pass a PyAI key, and keep the old client where the API shape is compatible.

2. Map models, voices, and formats
   Use the table below to replace model ids, voice ids, response formats, sample rates, and auth headers without rewriting the product flow.

3. Replay customer traffic
   Run real prompts, recordings, and phone-call samples through both systems. Compare latency, quality, completion rate, and all-in cost.

4. Launch with guardrails
   Start on free credits or test keys, add usage alerts, then enable Trace, Recap, or managed Agents when calls need production review.
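
The comparison in step 3 can be scripted once each replayed call produces a record. The sketch below assumes a record shape of `{ latencyMs, completed, costUsd }` that your own replay harness collects; neither vendor is known to emit this shape, and `median` and `summarizeRuns` are hypothetical names.

```javascript
// Hypothetical record shape, collected by your own replay harness:
// { latencyMs: number, completed: boolean, costUsd: number }

// Median of a list of numbers; NaN for an empty list.
function median(values) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Summarize one system's replayed calls into the metrics named in step 3.
function summarizeRuns(runs) {
  const completed = runs.filter((r) => r.completed).length;
  const totalCost = runs.reduce((sum, r) => sum + r.costUsd, 0);
  return {
    medianLatencyMs: median(runs.map((r) => r.latencyMs)),
    completionRate: runs.length ? completed / runs.length : NaN,
    costPerCompletedCallUsd: completed ? totalCost / completed : NaN,
  };
}
```

Running `summarizeRuns` once per system on the same set of recordings gives side-by-side numbers. Quality still needs human or rubric-based review, which this sketch does not attempt.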

If the system you are leaving already uses OpenAI-compatible transcription, speech, or realtime APIs, start with the smallest compatible swap: base URL, key, model, and voice. The real test is whether PyAI improves cost, latency, and caller outcomes on your own traffic.

What maps to what

| Retell | PyAI |
| --- | --- |
| Agent + LLM + voice configuration | agent_id (opaque) + your kb_endpoint + tools |
| Web/phone call SDK + access token | wss://api.pyai.com/v1/omni (subprotocol pyai-key.<key>) |
| Knowledge base upload | Your own kb_endpoint, grounded inline |
| Platform + passthrough pricing | $0.05/min all-in |
| Built-in transfer | Barge-in + warm transfer |

Good to know

  • format and rate on the connect URL matter: set them to match the audio you stream, and use rate=16000 for telephony.
  • agent_id is opaque and stateless - any id in your org's namespace is accepted and echoed to your knowledge endpoint.
  • An OpenAI-realtime-compatible alias exists: wss://api.pyai.com/v1/realtime?model=pyai-omni-realtime&agent_id=<id>.
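
Since the connect parameters live in the query string, it can help to build the URL in one place. The sketch below uses only the endpoints and parameter names shown on this page. The helper names are hypothetical, the pcm16 and 24000 defaults are taken from the connect example, and whether the realtime alias also accepts format and rate or uses the same subprotocol auth is not stated here, so the alias builder leaves them out.

```javascript
const OMNI_BASE = "wss://api.pyai.com/v1/omni";
const REALTIME_BASE = "wss://api.pyai.com/v1/realtime";

// Hypothetical helper: native Omni connect URL with explicit format and rate.
function buildOmniUrl({ agentId, format = "pcm16", rate = 24000 }) {
  const params = new URLSearchParams({
    agent_id: agentId,
    format,
    rate: String(rate),
  });
  return `${OMNI_BASE}?${params}`;
}

// Hypothetical helper: the OpenAI-realtime-compatible alias URL.
function buildRealtimeAliasUrl(agentId) {
  const params = new URLSearchParams({
    model: "pyai-omni-realtime",
    agent_id: agentId,
  });
  return `${REALTIME_BASE}?${params}`;
}

// The key travels in the WebSocket subprotocol, as in the connect example.
function omniSubprotocols(apiKey) {
  return ["pyai-key." + apiKey];
}
```

A telephony connection would then look like `new WebSocket(buildOmniUrl({ agentId: "support", rate: 16000 }), omniSubprotocols(apiKey))`.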

FAQ

Where does my knowledge base live?

Omni grounds against your own knowledge endpoint, retrieved inline during the call - you keep ownership of the content.

Does it support barge-in and transfer?

Yes - natural turn-taking with barge-in is built in, plus warm transfer to a human when it matters.

Can I monitor compliance?

Add Trace to score 100% of calls against TCPA/HIPAA/PII/brand-voice rule packs with citations and a tamper-evident audit hash.

Run your agent on one socket.

Start with $50 in free credits: grounded, tool-using voice agents at $0.05/min all-in.

No credit card - OpenAI-compatible - cancel anytime