Switch from Retell to PyAI Omni
Keep the agent, drop the per-vendor plumbing.
Omni runs your whole voice agent on one WebSocket - listening, reasoning, retrieving from your knowledge base, and speaking - with ~431 ms median turn-taking latency, barge-in, and warm transfer. Move your agent's persona and knowledge to PyAI and connect a single socket at $0.05/min all-in.
Why teams move from Retell
End-to-end on one socket
No stitched-together transcriber, LLM, and voice vendors - Omni handles the full loop and streams agent audio back over the same connection.
Grounded + tool-using
Bind your knowledge endpoint and webhook tools so the agent answers from your content and takes real actions, with barge-in and warm transfer.
Flat, predictable price
$0.05/min all-in, billed per second - no platform fee plus model/voice passthrough to reconcile.
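To make the per-second billing concrete, here is a minimal sketch of the arithmetic. The $0.05/min rate comes from this page; the function name `callCostUsd` and the choice to round partial seconds up are assumptions for illustration, not documented PyAI behavior.

```javascript
// Flat per-minute rate quoted on this page.
const RATE_PER_MIN = 0.05;

// Hypothetical helper: estimate the cost of one call billed per second.
function callCostUsd(durationSeconds, ratePerMin = RATE_PER_MIN) {
  if (!Number.isFinite(durationSeconds) || durationSeconds < 0) {
    throw new RangeError("durationSeconds must be a non-negative number");
  }
  // Assumption: a partial second is billed as a whole second.
  const billedSeconds = Math.ceil(durationSeconds);
  return (billedSeconds * ratePerMin) / 60;
}

// A 90-second call costs 90 * 0.05 / 60 dollars.
console.log(callCostUsd(90).toFixed(4));
```

Because the rate is flat, the estimate needs only call duration; there is no separate model or voice line item to add.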
Before and after
import { RetellWebClient } from "retell-client-js-sdk";
// Your server creates a web call with the Retell API key,
// selecting an LLM + voice in the agent config, then returns accessToken.
const client = new RetellWebClient();
await client.startCall({ accessToken });

// One socket: STT + reasoning + retrieval + TTS, grounded in your KB.
const ws = new WebSocket(
"wss://api.pyai.com/v1/omni?agent_id=support&format=pcm16&rate=24000",
["pyai-key." + apiKey],
);
ws.onmessage = (e) => playAudioFrame(e.data); // agent audio down
// stream mic PCM16 @ 24kHz up (rate=16000 for telephony)

Migration checklist
Swap the connection
Change the base URL or WebSocket URL, pass a PyAI key, and keep the old client where the API shape is compatible.
Map models, voices, and formats
Use the table below to replace model IDs, voice IDs, response formats, sample rates, and auth headers without rewriting the product flow.
Replay customer traffic
Run real prompts, recordings, and phone-call samples through both systems. Compare latency, quality, completion rate, and all-in cost.
Launch with guardrails
Start on free credits or test keys, add usage alerts, then enable Trace, Recap, or managed Agents when calls need production review.
If the system you are leaving already uses OpenAI-compatible transcription, speech, or realtime APIs, start with the smallest compatible swap: base URL, key, model, and voice. The real test is whether PyAI improves cost, latency, and caller outcomes on your own traffic.
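The replay step in the checklist above can be scored with a small helper like the one below. This is a sketch under stated assumptions: the record shape (`latencyMs`, `completed`, `seconds`) and the function names are hypothetical, and the sample numbers are made up purely to show the call.

```javascript
// Median of a list of numbers; returns NaN for an empty list.
function median(values) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical record shape: { latencyMs, completed, seconds }.
// ratePerMin is the all-in per-minute price of the system being measured.
function summarize(calls, ratePerMin) {
  const totalSeconds = calls.reduce((sum, c) => sum + c.seconds, 0);
  const completedCount = calls.filter((c) => c.completed).length;
  return {
    medianLatencyMs: median(calls.map((c) => c.latencyMs)),
    completionRate: calls.length ? completedCount / calls.length : NaN,
    costUsd: (totalSeconds * ratePerMin) / 60,
  };
}

// Made-up sample data, for illustration only.
const sampleCalls = [
  { latencyMs: 400, completed: true, seconds: 60 },
  { latencyMs: 500, completed: false, seconds: 60 },
];
console.log(summarize(sampleCalls, 0.05));
```

Run the same recordings through both systems, call `summarize` once per system with that system's own rate, and compare the two summaries side by side. Quality still needs a human or rubric-based review; this only covers the numeric columns.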
What maps to what
| Retell | PyAI |
|---|---|
| Agent + LLM + voice configuration | agent_id (opaque) + your kb_endpoint + tools |
| Web/phone call SDK + access token | wss://api.pyai.com/v1/omni (subprotocol pyai-key.<key>) |
| Knowledge base upload | Your own kb_endpoint, grounded inline |
| Platform + passthrough pricing | $0.05/min all-in |
| Built-in transfer | Barge-in + warm transfer |
Good to know
- format and rate are load-bearing on the connect URL; use rate=16000 for telephony.
- agent_id is opaque and stateless - any id in your org's namespace is accepted and echoed to your knowledge endpoint.
- An OpenAI-realtime-compatible alias exists: wss://api.pyai.com/v1/realtime?model=pyai-omni-realtime&agent_id=<id>.
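Since the connect URL carries the format and rate, it can help to build it in one place and to convert microphone samples to PCM16 before sending. The sketch below uses only the URLs and query parameters shown on this page; the helper names are hypothetical, and sending raw Int16 frames as binary WebSocket messages is an assumption, since the exact upstream framing is not specified here.

```javascript
const OMNI_URL = "wss://api.pyai.com/v1/omni";
const REALTIME_URL = "wss://api.pyai.com/v1/realtime";

// Hypothetical helper: build the Omni connect URL with explicit format and rate.
function buildOmniUrl({ agentId, format = "pcm16", rate = 24000 }) {
  const url = new URL(OMNI_URL);
  url.searchParams.set("agent_id", agentId);
  url.searchParams.set("format", format);
  url.searchParams.set("rate", String(rate));
  return url.toString();
}

// Hypothetical helper: build the OpenAI-realtime-compatible alias URL.
function buildRealtimeUrl(agentId) {
  const url = new URL(REALTIME_URL);
  url.searchParams.set("model", "pyai-omni-realtime");
  url.searchParams.set("agent_id", agentId);
  return url.toString();
}

// Convert Float32 samples in [-1, 1] (e.g. from Web Audio) to 16-bit PCM.
function floatToPcm16(samples) {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp out-of-range input
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}

console.log(buildOmniUrl({ agentId: "support" }));
console.log(buildOmniUrl({ agentId: "support", rate: 16000 })); // telephony

// Usage in a browser (assumption: binary frames carry raw PCM16):
//   const ws = new WebSocket(buildOmniUrl({ agentId: "support" }), ["pyai-key." + apiKey]);
//   ws.send(floatToPcm16(micSamples).buffer);
```

Typed arrays use the platform's byte order, which is little-endian on practically all current hardware; if the service expects a specific byte order, confirm it against the PyAI documentation.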
FAQ
Where does my knowledge base live?
Omni grounds against your own knowledge endpoint, retrieved inline during the call - you keep ownership of the content.
Does it support barge-in and transfer?
Yes - natural turn-taking with barge-in is built in, plus warm transfer to a human when it matters.
Can I monitor compliance?
Add Trace to score 100% of calls against TCPA/HIPAA/PII/brand-voice rule packs with citations and a tamper-evident audit hash.
Run your agent on one socket.
Start free with $50 in free credits - grounded, tool-using voice agents at $0.05/min all-in.
No credit card - OpenAI-compatible - cancel anytime