Switch from Vapi to PyAI Omni
One engine instead of four vendors stitched together.
Omni collapses the transcriber, the reasoning model, retrieval, and the voice into a single WebSocket - audio in, audio out - with natural turn-taking (~431 ms median) and barge-in. Instead of wiring providers and paying each, you pay one flat $0.05/min, everything included.
Why teams move from Vapi
One hop, one bill
STT, reasoning, retrieval, and TTS run end to end in PyAI at $0.05/min all-in - no per-vendor passthrough math.
Stateless agent ids
agent_id is an opaque label authorized by your org; PyAI keeps no per-agent registry and echoes the id to your own knowledge endpoint.
Compliance on every call
Add Trace to score 100% of calls (TCPA/HIPAA/PII/brand-voice) with citations, redaction, and an audit hash.
Before and after
import Vapi from "@vapi-ai/web";

const vapi = new Vapi(process.env.VAPI_PUBLIC_KEY);
vapi.start({
  transcriber: { provider: "deepgram", model: "nova-3" },
  model: { provider: "openai", model: "gpt-4o" },
  voice: { provider: "11labs", voiceId: "burt" },
});

// One socket: PyAI does STT + reasoning + retrieval + TTS.
const ws = new WebSocket(
  "wss://api.pyai.com/v1/omni?agent_id=front_desk&format=pcm16&rate=24000",
  ["pyai-key." + apiKey],
);
ws.onmessage = (e) => playAudioFrame(e.data); // agent audio down
// stream mic PCM16 @ 24kHz up

Migration checklist
Swap the connection
Change the base URL or WebSocket URL, pass a PyAI key, and keep the old client where the API shape is compatible.
Map models, voices, and formats
Use the table below to replace model ids, voice ids, response formats, sample rates, and auth headers without rewriting the product flow.
Replay customer traffic
Run real prompts, recordings, and phone-call samples through both systems. Compare latency, quality, completion rate, and all-in cost.
Launch with guardrails
Start on free credits or test keys, add usage alerts, then enable Trace, Recap, or managed Agents when calls need production review.
If the system you are leaving already uses OpenAI-compatible transcription, speech, or realtime APIs, start with the smallest compatible swap: base URL, key, model, and voice. The real test is whether PyAI improves cost, latency, and caller outcomes on your own traffic.
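For the replay step, a small summary function is usually enough to put both systems side by side. The sketch below assumes you have already logged one record per replayed call; the record shape (`latencyMs`, `completed`) and the function names are hypothetical and not part of either vendor's API.

```javascript
// Sketch only: the sample shape { latencyMs, completed } is an assumption
// about how you log replayed calls, not a Vapi or PyAI format.

// Median of a list of numbers, without mutating the input.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Reduce one system's replay log to the numbers worth comparing.
function summarize(samples) {
  if (samples.length === 0) throw new Error("no samples to summarize");
  const completed = samples.filter((s) => s.completed).length;
  return {
    medianLatencyMs: median(samples.map((s) => s.latencyMs)),
    completionRate: completed / samples.length,
  };
}
```

Run `summarize` once per system on the same set of recordings, then compare the two results alongside the per-minute cost of each.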
What maps to what
| Vapi | PyAI |
|---|---|
| transcriber + model + voice providers | One Omni engine (STT+LLM+TTS) |
| Assistant / config object | agent_id (opaque) + your kb_endpoint + tools |
| Bundled platform + passthrough pricing | $0.05/min all-in |
| Provider SDK | wss://api.pyai.com/v1/omni (or /v1/realtime alias) |
| Per-vendor API keys | One PyAI key (subprotocol pyai-key.<key>) |
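The connection and auth rows of this mapping can be folded into one small helper. This is a minimal sketch: `omniConnection` and its option names are hypothetical, while the URL, the query parameters, and the `pyai-key.` subprotocol prefix are the ones shown elsewhere in this guide.

```javascript
// Sketch only: `omniConnection` is a hypothetical helper, not a PyAI SDK call.
// It builds the Omni WebSocket URL and the auth subprotocol from plain options.
function omniConnection({ agentId, apiKey, format = "pcm16", rate = 24000 }) {
  if (!agentId) throw new Error("agentId is required");
  if (!apiKey) throw new Error("apiKey is required");

  const url = new URL("wss://api.pyai.com/v1/omni");
  url.searchParams.set("agent_id", agentId);
  url.searchParams.set("format", format);
  url.searchParams.set("rate", String(rate));

  // The key travels as a WebSocket subprotocol: "pyai-key.<key>".
  return { url: url.toString(), protocols: ["pyai-key." + apiKey] };
}

// Usage in a browser or any runtime with a global WebSocket:
// const { url, protocols } = omniConnection({ agentId: "front_desk", apiKey });
// const ws = new WebSocket(url, protocols);
```

Keeping URL construction in one place means a later change of rate or agent id touches a single call site.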
Good to know
- Use rate=16000 on the connect URL for telephony; the format and rate parameters are load-bearing, so they need to match the audio you actually stream.
- There is also an OpenAI-realtime-compatible alias: wss://api.pyai.com/v1/realtime?model=pyai-omni-realtime&agent_id=<id>.
- Ground the agent by pointing PyAI at your own knowledge endpoint; bind webhook tools for real actions.
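Because format and rate matter, it can help to convert and size mic audio explicitly before sending it. The sketch below assumes your capture pipeline yields Float32 samples in the range -1 to 1 (as the Web Audio API does) and that you send fixed 20 ms frames; the frame length and both function names are assumptions, and this guide does not specify a frame size or byte order.

```javascript
// Sketch only: helper names and the 20 ms default frame are assumptions.

// Convert Float32 samples in [-1, 1] to signed 16-bit PCM, clamping overshoot.
function floatToPcm16(samples) {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative and positive ranges differ by one step in 16-bit PCM.
    out[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return out;
}

// Bytes in one mono PCM16 frame at the given sample rate.
function frameBytes(rate, frameMs = 20) {
  const samplesPerFrame = Math.round((rate * frameMs) / 1000);
  return samplesPerFrame * 2; // 2 bytes per 16-bit sample
}
```

Under these assumptions a 20 ms frame is 640 bytes at rate=16000 and 960 bytes at rate=24000, so the rate on the connect URL directly changes how much data each frame carries.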
FAQ
Do I have to manage agents in a dashboard?
No. agent_id is an opaque label authorized by your key's org - any id in your namespace is accepted and echoed to your own knowledge endpoint, so there's no registry to maintain.
Is there an OpenAI-realtime-compatible URL?
Yes: wss://api.pyai.com/v1/realtime?model=pyai-omni-realtime&agent_id=<id>.
How is it billed?
$0.05/min all-in, billed per second - STT, reasoning, retrieval, and TTS included.
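As a rough illustration of per-second billing at this rate, the sketch below estimates the cost of a single call. The function name is hypothetical, and rounding partial seconds up is an assumption; the guide states only the rate and that billing is per second.

```javascript
// Sketch only: `callCostUsd` is a hypothetical estimator, not a billing API.
const RATE_USD_PER_MIN = 0.05; // all-in rate quoted in this guide

function callCostUsd(durationSeconds) {
  if (durationSeconds < 0) throw new RangeError("duration must be non-negative");
  // Assumption: a partial second is billed as a full second.
  const billedSeconds = Math.ceil(durationSeconds);
  return (billedSeconds * RATE_USD_PER_MIN) / 60;
}

// Format for display; floating-point results should be rounded before showing.
function formatUsd(amount) {
  return "$" + amount.toFixed(4);
}
```

With these assumptions a 90-second call comes to roughly $0.075, compared with $0.10 if the same call were rounded up to two full minutes.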
One model. One hop. $0.05/min.
Start with $50 in free credits and ship a grounded voice agent on one socket.
No credit card - OpenAI-compatible - cancel anytime