Plivo Programmable Agents SDK - Build AI agents that work over voice calls, SMS, and WhatsApp programmatically.
The SDK supports every Voice AI Agent configuration. Behavior is determined by which configs you provide when creating an agent -- there is no explicit mode field:
| Config provided | Pipeline | You handle |
|---|---|---|
| `stt` + `llm` + `tts` | Full AI - Plivo runs the entire voice agent pipeline | Tool calls and flow control |
| `stt` + `tts` | BYOLLM - Plivo handles speech, you bring your own LLM | LLM inference, stream tokens back via `send_text()` |
| `s2s` | Speech-to-speech - single provider handles STT+LLM+TTS natively | Event handling (OpenAI Realtime, Gemini Live) |
| (none) | Audio stream - Plivo is a telephony bridge | You bring and orchestrate your own STT, LLM, TTS, VAD etc. |
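
Because there is no explicit mode field, the pipeline follows from which config names are present. The sketch below restates the table as a lookup; it is an illustration only, not SDK code. The function name `infer_pipeline` and the returned labels are hypothetical, and raising an error for combinations the table does not list is an assumption about how one might treat them.

```python
from typing import Iterable

# Direct transcription of the table: set of provided configs -> pipeline label.
# The labels are made up for this sketch; the SDK may name them differently.
PIPELINES = {
    frozenset({"stt", "llm", "tts"}): "full-ai",
    frozenset({"stt", "tts"}): "byollm",
    frozenset({"s2s"}): "speech-to-speech",
    frozenset(): "audio-stream",
}


def infer_pipeline(provided: Iterable[str]) -> str:
    """Return the pipeline label for the given config names."""
    keys = frozenset(provided)
    try:
        return PIPELINES[keys]
    except KeyError:
        # Assumption: combinations missing from the table are rejected here.
        raise ValueError(f"combination not listed in the table: {sorted(keys)}") from None


if __name__ == "__main__":
    for combo in (["stt", "llm", "tts"], ["stt", "tts"], ["s2s"], []):
        print(combo, "->", infer_pipeline(combo))
```
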

    Inbound/Outbound Call
             |
        Plivo Platform
             |
        WebSocket ──────► Your VoiceApp server
             |                    |
        Audio stream         @app.on("tool_call")
        VAD / Turn           @app.on("prompt")          ← BYOLLM
        STT → LLM → TTS      @app.on("turn.completed")
             |                    |
        Caller hears         session.send_tool_result()
        agent speech         session.speak() / session.transfer()
                             session.send_text()        ← stream LLM tokens
                             session.send_media()       ← raw audio mode

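
The right-hand side of the diagram is a decorator-based event router: handlers are registered per event name and invoked as events arrive over the WebSocket. The toy registry below mimics that pattern so it can run without the SDK. `ToyApp`, the `(session, payload)` handler signature, and the payload fields are all hypothetical stand-ins; the plain dict plays the role of per-call state.

```python
import asyncio
import inspect
from collections import defaultdict


class ToyApp:
    """Minimal stand-in for decorator-based event routing (not the SDK)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_name):
        """Register the decorated function as a handler for event_name."""
        def register(func):
            self._handlers[event_name].append(func)
            return func
        return register

    async def dispatch(self, event_name, session, payload):
        """Call every handler for event_name and collect the return values."""
        results = []
        for handler in self._handlers[event_name]:
            if inspect.iscoroutinefunction(handler):
                results.append(await handler(session, payload))
            else:
                # Sync handlers run in a worker thread so they do not block the loop.
                results.append(await asyncio.to_thread(handler, session, payload))
        return results


app = ToyApp()


@app.on("tool_call")
async def handle_tool_call(session, payload):
    session["tool_calls"] = session.get("tool_calls", 0) + 1
    return f"ran {payload['name']}"


@app.on("turn.completed")
def log_turn(session, payload):
    return f"turn {payload['turn']} done after {session.get('tool_calls', 0)} tool call(s)"


async def main():
    session = {}  # stands in for per-call state shared across events
    out = []
    out += await app.dispatch("tool_call", session, {"name": "lookup_order"})
    out += await app.dispatch("turn.completed", session, {"turn": 1})
    return out


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In the real SDK the equivalent of `dispatch` is driven by incoming WebSocket messages, and handlers reply through session methods such as `session.send_tool_result()` rather than return values.
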
- Tool calling - LLM invokes tools, you handle them and return results
- Mid-call model switching - swap LLM model/prompt/tools via `session.update()` for agent handoff
- Multi-party conferences - add participants with `calls.dial()`, warm transfer patterns
- Voicemail detection - async AMD with beep detection for outbound calls
- Background audio - ambient sounds (office, typing, call-center) mixed with agent speech
- DTMF handling - detect keypress events for IVR flows
- Interruption (barge-in) - caller can interrupt the agent mid-speech
- User idle detection - configurable reminders and auto-hangup on silence
- Per-turn metrics - latency breakdown (STT, LLM TTFT, TTS) for monitoring
- Audio streaming - raw audio relay with `send_media()`, checkpoints, and `clear_audio()`
- BYOK (Bring Your Own Keys) - pass API keys for Deepgram, OpenAI, ElevenLabs, Cartesia, etc.
- Async-first - `httpx.AsyncClient` with `async`/`await` everywhere
- FastAPI native - `await voice.handle_fastapi(websocket)` drops into any FastAPI/Starlette app
- Standalone mode - `app.run(port=9000)` starts a WebSocket server with graceful shutdown
- Sync + async handlers - sync handlers run in a thread pool automatically
- Automatic retries - exponential backoff on 429 (respects `Retry-After`) and 5xx
- Typed events - 25 dataclasses for all WebSocket events (`ToolCall`, `TurnMetrics`, `StreamMedia`, ...)
- Per-session state - `session.data` dict persists across events within a call
- Messaging - SMS, MMS, WhatsApp with template and interactive message builders
- Numbers - search, buy, manage, and carrier lookup
- Clean errors - `PlivoError` hierarchy with `status_code`, `retry_after`, and structured bodies
- Webhook verification - `validate_signature_v3()` for securing callbacks
- Python 3.10+ - type hints, `hatchling` build, `ruff` linting
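
To make the retry behavior in the list above concrete, here is a minimal sketch of how a delay could be chosen under exponential backoff that honors `Retry-After` on 429 responses. It is not the SDK's implementation: the function name `retry_delay`, the 0.5 s base, the 30 s cap, and the assumption that `Retry-After` carries a number of seconds (it may also be an HTTP date) are all illustrative choices.

```python
from typing import Optional


def retry_delay(
    attempt: int,
    status_code: int,
    retry_after: Optional[str] = None,
    base: float = 0.5,
    cap: float = 30.0,
) -> Optional[float]:
    """Return seconds to wait before the next attempt, or None if not retryable.

    attempt is zero-based: 0 for the first retry, 1 for the second, and so on.
    """
    retryable = status_code == 429 or 500 <= status_code < 600
    if not retryable:
        return None
    if status_code == 429 and retry_after is not None:
        # Assumption: the header value is a delay in seconds.
        return float(retry_after)
    # Exponential backoff: base, 2*base, 4*base, ... capped at `cap`.
    return min(cap, base * (2 ** attempt))


if __name__ == "__main__":
    for attempt in range(4):
        print("503, attempt", attempt, "->", retry_delay(attempt, 503), "s")
    print("429 with Retry-After: 7 ->", retry_delay(0, 429, retry_after="7"), "s")
    print("404 ->", retry_delay(0, 404))
```

Production clients usually add random jitter to the computed delay so that many callers do not retry in lockstep; it is left out here to keep the arithmetic easy to follow.
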

Install with pip:

    pip install plivo_agent

Requires Python 3.10+.

Sign up at cx.plivo.com/signup to get your `PLIVO_AUTH_ID` and `PLIVO_AUTH_TOKEN`, set them as environment variables, then see the `examples/` directory for runnable scripts:
- Full AI pipeline - tool calls, model switching, voicemail detection, transfers
- BYOLLM - bring your own LLM with OpenAI streaming, per-session conversation history
- BYOLLM echo - minimal echo agent for testing, no external dependencies
- Multi-party conference - MPC with mid-call dial, warm transfer to human agents
- Speech-to-speech - OpenAI Realtime / Gemini Live integration
- Raw audio streaming - bidirectional audio relay with checkpoints and pacing
- Background audio - ambient office/typing sounds mixed with agent speech
- Pipeline modes - all five config combinations in one file
- Metrics & observability - per-turn latency breakdown, VAD and turn events
- SMS & MMS - text messages and MMS with media attachments
- WhatsApp - text, media, templates, buttons, lists, CTA, location
- Buy a number - search and purchase a phone number
- Callback server - FastAPI HTTP webhook receiver for call events
- FastAPI integration - embed VoiceApp inside an existing FastAPI app
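
Before running any of the examples, it can help to confirm that both credentials are actually visible to the Python process. The check below uses only the standard library; the helper name `missing_credentials` is hypothetical, and how the SDK itself reads these variables is not covered here.

```python
import os
from typing import Mapping, Optional

REQUIRED_VARS = ("PLIVO_AUTH_ID", "PLIVO_AUTH_TOKEN")


def missing_credentials(env: Optional[Mapping[str, str]] = None) -> list:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Set these environment variables first:", ", ".join(missing))
    else:
        print("Plivo credentials found.")
```
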

To set up a development checkout, run the tests, and lint:

    git clone https://github.com/plivo/plivo-agent-python.git
    cd plivo-agent-python
    python -m venv .venv && source .venv/bin/activate
    pip install -e ".[dev]"
    pytest tests/ -v        # 70 tests
    ruff check src/ tests/  # lint