abhijit0p/collection-agent

Sahayak: Real-Time Voice AI Fee Collection Agent

Sahayak ("The Assistant") is a real-time autonomous voice AI agent built using the LiveKit Agents framework.

It acts as an official fee collection assistant for Hosaksham Academy, helping students and parents:

  • Verify pending balances
  • Understand payment dues
  • Receive secure payment reminders

The project demonstrates production-style voice orchestration using:

  • Real-time speech processing
  • LLM-driven conversational workflows
  • Deterministic function calling
  • Secure transactional actions

Features

  • Real-time voice conversations
  • Natural fee collection workflow
  • Student verification via backend tools
  • Payment reminder triggering
  • Function calling against a deterministic backend, so balances and due dates come from tool results rather than model recall
  • Async event-driven architecture
  • Modular plugin-based pipeline

Architecture

The voice pipeline uses the following components:

Component          Purpose
Silero             Voice Activity Detection (VAD)
Deepgram           Speech-to-Text (STT)
Gemini 2.0 Flash   Conversational LLM + Function Calling
Cartesia           Text-to-Speech (TTS)
LiveKit Agents     Real-time orchestration
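
As a rough mental model, one conversational turn passes through these components in order. The sketch below uses hypothetical stand-in functions (`detect_speech`, `transcribe`, `generate_reply`, `synthesize`) that operate on plain strings. They are not LiveKit, Deepgram, Gemini, or Cartesia APIs, and a real pipeline streams audio frames rather than whole utterances.

```python
import asyncio
from typing import Optional

# Hypothetical stand-ins for the pipeline stages; real plugins stream audio.

async def detect_speech(chunk: str) -> bool:
    """VAD stand-in: treat any non-blank chunk as speech."""
    return bool(chunk.strip())

async def transcribe(chunk: str) -> str:
    """STT stand-in: the 'audio' is already text here."""
    return chunk.strip()

async def generate_reply(transcript: str) -> str:
    """LLM stand-in: wrap the transcript in a fixed reply."""
    return f"You said: {transcript}"

async def synthesize(text: str) -> bytes:
    """TTS stand-in: encode the reply text as bytes."""
    return text.encode("utf-8")

async def handle_turn(chunk: str) -> Optional[bytes]:
    """Run one user turn through VAD -> STT -> LLM -> TTS."""
    if not await detect_speech(chunk):
        return None  # silence: nothing to transcribe or answer
    transcript = await transcribe(chunk)
    reply = await generate_reply(transcript)
    return await synthesize(reply)
```

Calling `asyncio.run(handle_turn("My ID is S101"))` walks a single turn through all four stages, while a blank chunk is dropped at the VAD step.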

Project Structure

collection-agent/
│
├── db.py
├── main.py
├── .env.example
├── requirements.txt
└── README.md

File Overview

db.py

Acts as the mock database layer.

Responsibilities

  • Stores dummy student records
  • Performs deterministic student lookup

Example Student Record

{
    "student_id": "S101",
    "name": "Rahul",
    "pending_amount": 15000,
    "due_date": "Jan 20",
    "phone": "9876543210"
}

Helper Function

get_student(student_id)

Returns:

  • Student dictionary if found
  • None if invalid ID
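
A minimal sketch of what such a mock layer could look like, assuming records are kept in an in-memory dictionary keyed by student ID. The `STUDENTS` name and the ID normalization step are assumptions for illustration; only the record fields and the `get_student(student_id)` contract come from the description above.

```python
from typing import Optional

# Hypothetical in-memory store; the record mirrors the example above.
STUDENTS = {
    "S101": {
        "student_id": "S101",
        "name": "Rahul",
        "pending_amount": 15000,
        "due_date": "Jan 20",
        "phone": "9876543210",
    },
}

def get_student(student_id: str) -> Optional[dict]:
    """Return the student record, or None if the ID is unknown."""
    # Normalizing case and whitespace is an assumption, useful because
    # speech-to-text output may arrive as "s101" or " S101 ".
    key = student_id.strip().upper()
    return STUDENTS.get(key)
```

Keeping the lookup a plain dictionary access means the same ID always yields the same record, which is what makes the tool results reproducible.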

main.py

Core runtime engine of the voice AI agent.

Responsibilities

  • Starts LiveKit worker
  • Configures audio pipeline
  • Registers function tools
  • Defines agent behavior and persona
  • Manages real-time conversations

Registered Function Tools

lookup_student

Fetches student details deterministically from db.py.

Example

User: "My ID is S101"

Agent → lookup_student("S101")
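
One way such a tool could be written is as a plain async function that returns only what the database holds, so the model phrases the answer but does not supply the figures. This sketch omits the LiveKit registration step entirely. The `_STUDENTS` store and the returned message format are assumptions; the debug line follows the trace format shown in the Debugging section.

```python
import asyncio

# Hypothetical stand-in for the records held in db.py.
_STUDENTS = {
    "S101": {"name": "Rahul", "pending_amount": 15000, "due_date": "Jan 20"},
}

async def lookup_student(student_id: str) -> str:
    """Return a factual summary for the LLM to phrase, or a not-found message."""
    print(f"--- [DEBUG] lookup_student called with {student_id} ---")
    student = _STUDENTS.get(student_id.strip().upper())
    if student is None:
        return f"No student found with ID {student_id}."
    return (
        f"Name: {student['name']}; "
        f"pending amount: ₹{student['pending_amount']:,}; "
        f"due date: {student['due_date']}."
    )
```

Returning an explicit not-found message, rather than raising, gives the model something concrete to relay when the caller misstates an ID.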

send_payment_reminder

Triggers a mock transactional payment reminder flow.

Example

User: "Please send the payment link"

Agent → send_payment_reminder(student_id)
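
A mock version of this tool might look like the sketch below. Nothing is actually sent: the function only reports an outcome and exposes the last four digits of the phone number, as in the example conversation. The `_PHONES` lookup and the exact return strings are assumptions, and LiveKit tool registration is left out.

```python
import asyncio

# Hypothetical phone lookup; a real agent would reuse the db.py records.
_PHONES = {"S101": ("Rahul", "9876543210")}

async def send_payment_reminder(student_id: str) -> str:
    """Pretend to send a payment link and report the outcome."""
    print("--- [DEBUG] send_payment_reminder triggered ---")
    entry = _PHONES.get(student_id.strip().upper())
    if entry is None:
        return "Failed: unknown student ID, no reminder sent."
    name, phone = entry
    # Expose only the last four digits so the full number is never spoken.
    return f"Success: payment link sent to {name}'s phone ending in {phone[-4:]}."
```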

Installation

1. Clone Repository

git clone https://github.com/your-username/collection-agent.git

cd collection-agent

Environment Setup

2. Create .env

cp .env.example .env

Update .env with your credentials:

LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret

GOOGLE_API_KEY=your_gemini_api_key
DEEPGRAM_API_KEY=your_deepgram_api_key
CARTESIA_API_KEY=your_cartesia_api_key
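
A missing key may otherwise only surface later as a connection or authentication error, so it can help to check the environment at startup. The following is a small sketch using only the standard library. The `missing_env_vars` helper is hypothetical, and it assumes the variables have already been loaded into the process environment (for example with python-dotenv's `load_dotenv()`).

```python
import os
from typing import List, Mapping

# Variable names taken from the .env template above.
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "GOOGLE_API_KEY",
    "DEEPGRAM_API_KEY",
    "CARTESIA_API_KEY",
)

def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

At the top of a startup script, a non-empty result from `missing_env_vars()` could be turned into an early exit with a readable message.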

Create Virtual Environment

Windows

python -m venv venv

.\venv\Scripts\activate

macOS/Linux

python -m venv venv

source venv/bin/activate

Install Dependencies

pip install livekit-agents livekit-plugins-google livekit-plugins-deepgram livekit-plugins-cartesia livekit-plugins-silero python-dotenv

Running the Agent

Start Local Worker

python main.py dev

Expected output:

Worker connected successfully.
Waiting for participant...

Testing the Voice Agent

Step 1: Open LiveKit Playground

Use any one of:

  • LiveKit Cloud Dashboard
  • LiveKit Sandbox
  • LiveKit Agent Playground

Connect using the same credentials configured in .env.


Step 2: Join Room

Enable:

  • Microphone
  • Speakers

The agent automatically initiates the conversation.


Example Conversation

Greeting

Agent:
"Namaste, this is Sahayak from Hosaksham Academy.
May I know your Student ID, please?"

Student Verification

User:
"My ID is S101."

Agent:
"Thank you Rahul.
You have an outstanding balance of ₹15,000 due on Jan 20th.
Would you like me to send you the payment link?"

Payment Reminder

User:
"Yes, please send it."

Agent:
"Success.
A secure payment link has been sent to Rahul's phone ending in 3210."

Debugging

Terminal logs display function execution traces:

--- [DEBUG] lookup_student called with S101 ---

--- [DEBUG] send_payment_reminder triggered ---

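These logs confirm that:

  • The expected tool was invoked for each request
  • The tool received the expected arguments
  • Figures in the agent's replies can be traced back to tool output rather than model recall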
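
Trace lines of this kind could be produced by a small decorator wrapped around each tool. The sketch below is an assumption about how that might be done, not the project's actual logging code. `traced` and `CALLS` are hypothetical names, and the line format is simplified to a single pattern for every tool.

```python
import asyncio
import functools

CALLS = []  # hypothetical in-memory record of (tool_name, args)

def traced(func):
    """Print a debug line and record the call before running the tool."""
    @functools.wraps(func)
    async def wrapper(*args):
        shown = ", ".join(str(a) for a in args)
        print(f"--- [DEBUG] {func.__name__} called with {shown} ---")
        CALLS.append((func.__name__, args))
        return await func(*args)
    return wrapper

@traced
async def lookup_student(student_id: str) -> str:
    """Stand-in tool body used only to demonstrate the tracing."""
    return f"looked up {student_id}"
```

Recording calls in a list as well as printing them makes it possible to assert on tool usage in a test instead of reading terminal output by eye.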

Future Enhancements

  • Razorpay integration
  • WhatsApp payment reminders
  • Persistent database layer
  • Parent authentication
  • CRM integration
  • Multilingual support
  • Analytics dashboard
  • Call summaries
  • Retry and escalation workflows

Tech Stack

Layer                     Technology
Realtime Infrastructure   LiveKit
LLM                       Gemini 2.0 Flash
STT                       Deepgram
TTS                       Cartesia
VAD                       Silero
Runtime                   Python AsyncIO

License

MIT License


Acknowledgements

Built using:

  • LiveKit Agents Framework
  • Google Gemini APIs
  • Deepgram
  • Cartesia
  • Hosaksham Platform

About

Production-style realtime voice AI workflow agent using LiveKit Agents, Gemini function calling, Deepgram STT, and Cartesia TTS.
