Talk. Listen. Think. Respond — in real time.
Sagent is a real-time AI voice call agent that can:
- 📞 Make outbound calls
- 📲 Handle inbound calls
- 🧠 Understand speech using LLMs
- 🗣️ Respond with natural voice
- 📡 Stream live transcripts to a dashboard
- 🔁 Streaming pipeline: STT → LLM → TTS
- 📡 Live transcript via WebSocket
- 🧑💼 Multi-tenant architecture
- ⚙️ Configurable AI agent (prompt-driven)
- 📞 Twilio call integration (inbound + outbound)
flowchart RL
Tenant --> Frontend[Dashboard]
Frontend --> Backend
Backend --> |WebSocket| Frontend
Backend --> Twilio
Twilio --> Backend
subgraph Server-side
Backend[FastAPI] --> AI[AI: STT + LLM + TTS]
AI --> Backend
direction TB
Backend --> DB[(PostgreSQL)]
end
Twilio <--> Lead
- 🔁 Real-time voice interaction (STT → LLM → TTS)
- 📡 Live transcript streaming (WebSocket)
- 🧑💼 Multi-tenant architecture
- ⚙️ Configurable AI agent (prompt-based behavior)
- 📞 Outbound & inbound call support
- 🗂️ Call history with transcripts & recordings
- 📱 Phone-like UI dashboard
- Real-time AI system (not batch or async)
- Full-stack architecture (FastAPI + React)
- Voice + LLM + Telephony integration
- Production-ready design (multi-tenant, scalable)
Live call + real-time transcript streaming UI
- FastAPI (Python)
- PostgreSQL (Render)
- WebSocket (real-time streaming)
- React (TypeScript)
- Tailwind CSS
- STT: ElevenLabs Scribe (Realtime)
- LLM: OpenAI API
- TTS: ElevenLabs Flash
- Twilio (calls + recordings)
- Render
Sagent/
├── backend/ # FastAPI backend
├── frontend/ # React dashboard
├── docs/ # system design documents
├── infra/ # deployment configs
└── README.mdsequenceDiagram
participant Twilio
participant Backend
participant AI
participant UI
UI->>Backend: Start Call
Backend->>Twilio: Initiate Call
loop Conversation
Twilio->>Backend: Audio
Backend->>AI: Process
AI->>Backend: Response
Backend->>Twilio: Voice
Backend->>UI: Transcript
end
- AI sales agent (cold calls)
- customer support automation
- appointment booking
- AI receptionist
- voice-based SaaS demos
- Real-time first (low-latency streaming)
- Modular architecture (clean separation)
- Scalable by design (multi-tenant ready)
- AI-centric (prompt-driven behavior)
git clone https://github.com/oceanstar88/sagent.git
cd sagentcd backend
pip install -r requirements.txt
uvicorn app.main:app --reloadcd frontend
npm install
npm run devCreate .env file:
DATABASE_URL=
JWT_SECRET=
TWILIO_ACCOUNT_SID=
TWILIO_AUTH_TOKEN=
TWILIO_PHONE_NUMBER=
ELEVENLABS_API_KEY=
OPENAI_API_KEY=- Start a call from dashboard
- Receive inbound call
- Watch live transcript
- Review call history
Detailed system design available in docs
Includes:
- system architecture
- AI engine design
- backend & frontend design
- API spec
- sequence diagrams
- call analytics dashboard
- CRM integration
- multi-agent orchestration
- voice cloning
- multilingual support
Built as a high-performance AI voice agent system demo for showcasing real-time AI + telephony integration.
Sagent demonstrates:
- real-time AI systems
- voice + LLM integration
- full-stack engineering capability
- production-level architecture
This is not just a demo — it's a foundation for real AI voice products.
⭐ If you find this interesting, consider starring the repo!
