Sahayak ("The Assistant") is a real-time autonomous voice AI agent built using the LiveKit Agents framework.
It acts as an official fee collection assistant for Hosaksham Academy, helping students and parents:
- Verify pending balances
- Understand payment dues
- Receive secure payment reminders
The project demonstrates production-style voice orchestration using:
- Real-time speech processing
- LLM-driven conversational workflows
- Deterministic function calling
- Secure transactional actions
Key features:
- Real-time voice conversations
- Natural fee collection workflow
- Student verification via backend tools
- Payment reminder triggering
- Deterministic function calling (hallucination-safe)
- Async event-driven architecture
- Modular plugin-based pipeline
The voice pipeline uses the following components:
| Component | Purpose |
|---|---|
| Silero | Voice Activity Detection (VAD) |
| Deepgram | Speech-to-Text (STT) |
| Gemini 2.0 Flash | Conversational LLM + Function Calling |
| Cartesia | Text-to-Speech (TTS) |
| LiveKit Agents | Real-time orchestration |
Project structure:
collection-agent/
│
├── db.py
├── main.py
├── .env.example
├── requirements.txt
└── README.md
db.py acts as the mock database layer.
- Stores dummy student records
- Performs deterministic student lookup
Example record:
{
  "student_id": "S101",
  "name": "Rahul",
  "pending_amount": 15000,
  "due_date": "Jan 20",
  "phone": "9876543210"
}
get_student(student_id) returns:
- Student dictionary if found
- None if the ID is invalid
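A minimal sketch of how db.py might implement this lookup, assuming the records live in an in-memory dictionary keyed by student ID. The STUDENTS name and the ID normalization (trimming whitespace and upper-casing) are assumptions for illustration, not taken from the project's source.

```python
from typing import Optional

# Hypothetical in-memory table; the record mirrors the example record above.
STUDENTS = {
    "S101": {
        "student_id": "S101",
        "name": "Rahul",
        "pending_amount": 15000,
        "due_date": "Jan 20",
        "phone": "9876543210",
    },
}


def get_student(student_id: str) -> Optional[dict]:
    """Return the matching student record, or None if the ID is unknown."""
    # Assumption: normalize the ID, since transcribed speech may vary in
    # case or carry stray whitespace.
    key = student_id.strip().upper()
    return STUDENTS.get(key)
```

A plain dictionary lookup keeps the result deterministic: the same ID always yields the same record or None.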
main.py is the core runtime engine of the voice AI agent.
- Starts LiveKit worker
- Configures audio pipeline
- Registers function tools
- Defines agent behavior and persona
- Manages real-time conversations
lookup_student fetches student details deterministically from db.py.
User: "My ID is S101"
Agent → lookup_student("S101")
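The body of this tool can be sketched as a plain async function that returns a short factual string for the LLM to read back. This is a sketch under assumptions: the inline STUDENTS table stands in for db.py, the return wording is illustrative, and the step that registers the function as a tool with LiveKit Agents is omitted because it depends on the framework version in use.

```python
import asyncio
from typing import Optional

# Stand-in for db.py so the sketch runs on its own (assumption).
STUDENTS = {
    "S101": {
        "student_id": "S101",
        "name": "Rahul",
        "pending_amount": 15000,
        "due_date": "Jan 20",
        "phone": "9876543210",
    },
}


def get_student(student_id: str) -> Optional[dict]:
    return STUDENTS.get(student_id)


async def lookup_student(student_id: str) -> str:
    """Look up a student and describe the pending balance for the LLM."""
    print(f"--- [DEBUG] lookup_student called with {student_id} ---")
    student = get_student(student_id)
    if student is None:
        # Return an explicit miss so the LLM does not invent a balance.
        return f"No student found with ID {student_id}."
    return (
        f"Student {student['name']} has a pending amount of "
        f"{student['pending_amount']} rupees due on {student['due_date']}."
    )
```

Running asyncio.run(lookup_student("S101")) prints the debug trace and returns the sentence built from Rahul's record. An unknown ID returns the "No student found" message instead.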
send_payment_reminder triggers a mock transactional payment reminder flow.
User: "Please send the payment link"
Agent → send_payment_reminder(student_id)
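A mock version of this flow might look like the following. It is a sketch, not the project's implementation: the STUDENTS table, the mask_phone helper, and the failure message are hypothetical, and no message is actually sent.

```python
import asyncio

# Stand-in record store so the sketch is self-contained (assumption).
STUDENTS = {
    "S101": {"name": "Rahul", "phone": "9876543210"},
}


def mask_phone(phone: str) -> str:
    """Keep only the last four digits so the full number is never spoken."""
    return phone[-4:]


async def send_payment_reminder(student_id: str) -> str:
    """Pretend to send a payment link and report the outcome as text."""
    print("--- [DEBUG] send_payment_reminder triggered ---")
    student = STUDENTS.get(student_id)
    if student is None:
        return "Failed. No student found for that ID."
    # A real integration would call an SMS or payment provider here.
    return (
        "Success. A secure payment link has been sent to "
        f"{student['name']}'s phone ending in {mask_phone(student['phone'])}."
    )
```

Returning only the last four digits means the agent can confirm the destination without reading the full phone number aloud.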
git clone https://github.com/your-username/collection-agent.git
cd collection-agent
cp .env.example .env
Update .env with your credentials:
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret
GOOGLE_API_KEY=your_gemini_api_key
DEEPGRAM_API_KEY=your_deepgram_api_key
CARTESIA_API_KEY=your_cartesia_api_key
Create and activate a virtual environment on Windows:
python -m venv venv
.\venv\Scripts\activate
On macOS or Linux:
python -m venv venv
source venv/bin/activate
Install the dependencies:
pip install livekit-agents \
  livekit-plugins-google \
  livekit-plugins-deepgram \
  livekit-plugins-cartesia \
  livekit-plugins-silero \
  python-dotenv
Start the agent in development mode:
python main.py dev
Expected output:
Worker connected successfully.
Waiting for participant...
To test the agent, use one of:
- LiveKit Cloud Dashboard
- LiveKit Sandbox
- LiveKit Agent Playground
Connect using the same credentials configured in .env.
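It can help to confirm that every credential is set before starting the worker. The sketch below checks the six variable names listed in .env. The missing_env_vars helper is hypothetical and uses only the standard library; it does not read the .env file itself, which python-dotenv's load_dotenv() would do first.

```python
import os
from typing import Mapping

# Variable names taken from the .env template in this README.
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "GOOGLE_API_KEY",
    "DEEPGRAM_API_KEY",
    "CARTESIA_API_KEY",
)


def missing_env_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling missing_env_vars() with no arguments inspects the current process environment. An empty list means all six values are present.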
Enable:
- Microphone
- Speakers
The agent automatically initiates the conversation.
Agent:
"Namaste, this is Sahayak from Hosaksham Academy.
May I know your Student ID, please?"
User:
"My ID is S101."
Agent:
"Thank you Rahul.
You have an outstanding balance of ₹15,000 due on Jan 20th.
Would you like me to send you the payment link?"
User:
"Yes, please send it."
Agent:
"Success.
A secure payment link has been sent to Rahul's phone ending in 3210."
Terminal logs display function execution traces:
--- [DEBUG] lookup_student called with S101 ---
--- [DEBUG] send_payment_reminder triggered ---
These logs confirm:
- Deterministic function calling
- Correct tool execution
- Student data in the replies comes from tool results, not from LLM-generated text
Future enhancements:
- Razorpay integration
- WhatsApp payment reminders
- Persistent database layer
- Parent authentication
- CRM integration
- Multilingual support
- Analytics dashboard
- Call summaries
- Retry and escalation workflows
Tech stack:
| Layer | Technology |
|---|---|
| Realtime Infrastructure | LiveKit |
| LLM | Gemini 2.0 Flash |
| STT | Deepgram |
| TTS | Cartesia |
| VAD | Silero |
| Runtime | Python AsyncIO |
MIT License
Built using:
- LiveKit Agents Framework
- Google Gemini APIs
- Deepgram
- Cartesia
- Hosaksham Platform