| title | Qualora - Enterprise Quality Auditor |
|---|---|
| emoji | ποΈ |
| colorFrom | blue |
| colorTo | indigo |
| sdk | docker |
| pinned | false |
Turn every conversation into structured intelligence.
An enterprise-grade AI auditing platform that transcribes, analyzes, and scores support interactions across voice, text, and file, with no single point of failure.
- Project Overview & Purpose
- Key Capabilities
- Architecture & System Workflow
- The LLM-as-a-Judge Audit Matrix
- Technology Stack
- Getting Started & Installation
- Environment Configuration
- Hugging Face Space Node Setup
- API Reference & Endpoints
- Enterprise Standards & Security
- Dynamic Content Maintenance
- Troubleshooting & Roadmap
Organizations struggle to capture and evaluate the human element of customer support. Traditional metrics (CSAT, NPS) miss critical nuances in agent performance, while compliance auditing remains manual, expensive, and inconsistent.
Qualora turns unstructured human interaction into structured, actionable intelligence. Its distributed hybrid architecture avoids a single point of failure and returns comprehensive, deterministic analysis for:
- Customer Success Teams: SaaS performance tracking and churn signal detection.
- Contact Centers: BPO quality assurance at massive scale.
- Compliance & Legal Officers: Automated PII auditing and regulatory deviation flags.
- QA Managers: Data-driven behavioral coaching and agent trend analysis.
- Multi-Speaker Recognition & Acoustic Profiling: Uses Pyannote speaker diarization to separate voices. Combined with SpeechBrain and Parselmouth, it detects acoustic stress and physiological emotion directly from the audio envelope.
- Tiered Voice Capture Pipeline: Voice calls are processed through HF Space first, then fail over to an API chain (ElevenLabs Scribe → Deepgram Nova-2 → Groq Whisper) when needed.
- LLM-as-a-Judge Auditing Framework: Uses a hardened fallback cascade (OpenRouter → Groq → HuggingFace Inference) to output structured JSON with F1 scores, emotional timelines, and compliance flags.
- Retrieval-Augmented Generation (RAG) KB: Audits interactions against your specific company policy documents uploaded to a hybrid MongoDB/ChromaDB vector store.
- Human-in-the-Loop (HITL) Review: Supervisors and admins can approve, flag, or reject machine-generated audits with persistent override trails.
Qualora employs a Master-Worker pattern designed for maximum resilience in serverless and containerized environments.
When audio is submitted, a resilient two-stage capture sequence runs:
- Stage T1 (HF Space Node): Streams the file to a private node for deep acoustic analysis, pitch detection, and diarization.
- Stage T2 (Cloud API Cascade): If T1 is unavailable or returns empty speech, the pipeline falls back through ElevenLabs Scribe → Deepgram Nova-2 → Groq Whisper.
flowchart TB
A[Input: Audio, Chat, File] --> B{Input Type}
B -- Text or File --> C[Transcript Ready]
B -- Voice --> V0
subgraph V[Voice Capture Failover]
direction TB
V0[HF Space T1] --> V1[Transcript Ready]
V0 -. fail .-> V2[ElevenLabs T2a]
V2 -. fail .-> V3[Deepgram T2b]
V3 -. fail .-> V4[Groq Whisper T2c]
V4 --> V1
end
V1 --> C
subgraph R[RAG Retrieval Cascade]
direction TB
R1[Voyage plus MongoDB Atlas]
R2[ChromaDB local fallback]
R3[FAISS memory fallback]
RX[Policy Context or Null Sentinel]
R1 --> RX
R1 -. fail or empty .-> R2
R2 -. fail or empty .-> R3
R3 --> RX
end
C --> R1
subgraph L[LLM Judge Cascade]
direction TB
L1[OpenRouter T1]
L2[Groq T2]
L3[HF Inference T3]
L1 -. fail or rate-limit .-> L2
L2 -. fail or rate-limit .-> L3
end
RX --> L1
subgraph O[Audit Output and Controls]
direction TB
O1[JSON validation and coercion]
O2[Persist audit and metadata]
O3[Alert evaluation]
O4[HITL override]
O5[Dashboard, trends, agent scoring]
L1 --> O1
L2 --> O1
L3 --> O1
O1 --> O2 --> O3 --> O5
O2 --> O4 --> O5
end
Every audit produces a deterministic JSON matrix that captures conversation quality across multiple psychological and compliance dimensions. Identical inputs produce identical outputs via SHA-256 result caching.
{
"summary": "Customer reported a billing discrepancy; agent resolved without escalation.",
"agent_f1_score": 0.91,
"satisfaction_prediction": "High",
"compliance_risk": "Green",
"quality_matrix": {
"language_proficiency": 9,
"cognitive_empathy": 8,
"efficiency": 9,
"bias_reduction": 10,
"active_listening": 9
},
"emotional_timeline": [
{ "turn": 1, "speaker": "Customer", "emotion": "Frustrated", "intensity": 8 },
{ "turn": 2, "speaker": "Agent", "emotion": "Empathetic", "intensity": 6 }
],
"compliance_flags": ["Missing identity-verification step"],
"emotions": { "agent": "calm", "customer": "frustrated" },
"behavioral_nudges": [
"Mirroring: Repeat the specific billing date back to the customer earlier."
]
}

The engine attempts providers in a waterfall, so an outage or rate limit at one tier does not fail the audit:
- Tier 1: OpenRouter (configurable model set via `OPENROUTER_MODELS`).
- Tier 2: Groq (default includes `llama-3.3-70b-versatile`).
- Tier 3: HuggingFace Inference (fallback instruction-tuned models).
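The SHA-256 result caching mentioned at the top of this section would sit in front of the provider waterfall. The following is a minimal sketch, not Qualora's actual code: `cache_key`, `CachedJudge`, and the in-memory dictionary are hypothetical names chosen for illustration, and the exact fields that feed the hash are an assumption.

```python
import hashlib
import json
from typing import Callable, Dict


def cache_key(transcript: str, policy_context: str = "") -> str:
    """Return a stable SHA-256 hex digest for one audit input (hypothetical helper)."""
    # Serialize with sorted keys so the same input always produces the same bytes.
    payload = json.dumps(
        {"transcript": transcript, "policy_context": policy_context},
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CachedJudge:
    """Wrap a judge callable so identical inputs reuse the stored result."""

    def __init__(self, judge: Callable[[str, str], dict]) -> None:
        self._judge = judge
        self._store: Dict[str, dict] = {}  # assumption: a real system would persist this

    def audit(self, transcript: str, policy_context: str = "") -> dict:
        key = cache_key(transcript, policy_context)
        if key not in self._store:
            # Cache miss: run the (possibly non-deterministic) LLM judge once.
            self._store[key] = self._judge(transcript, policy_context)
        return self._store[key]
```

Because the stored result is returned on every later call with the same input, repeated audits of one transcript stay identical even if the underlying model would sample differently.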
| Service Layer | Provider | Role in Pipeline | Failover Behavior |
|---|---|---|---|
| Voice T1 | Hugging Face Space (Gradio) | Primary transcription + diarization + acoustic profile | If unavailable/empty transcript, system moves to Voice T2 |
| Voice T2a | ElevenLabs | Premium transcription with diarization | On error, move to Deepgram |
| Voice T2b | Deepgram | STT fallback with utterance diarization | On error, move to Groq Whisper |
| Voice T2c | Groq Speech | Last fallback in voice capture chain | On error, voice request fails with provider exhausted |
| Audit LLM T1 | OpenRouter | Primary JSON audit judge | Rate-limit/error skips to Groq tier |
| Audit LLM T2 | Groq | Secondary audit judge | Rate-limit/error skips to HF Inference tier |
| Audit LLM T3 | Hugging Face Inference | Final audit fallback | If all fail, request returns provider exhaustion |
| RAG Retrieval L1 | Voyage AI embeddings + MongoDB Atlas Vector Search | Primary policy retrieval | If unavailable, fallback to Chroma local embeddings |
| RAG Retrieval L2 | ChromaDB (local) | Local persistent vector fallback | If unavailable, fallback to FAISS memory |
| RAG Retrieval L3 | FAISS (in-memory) | Emergency retrieval fallback | If unavailable, audit proceeds with no policy context sentinel |
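The failover behavior in the table above follows one pattern: try each tier in order, treat an exception or an empty result as a failure, and report provider exhaustion only when every tier has failed. Below is a minimal sketch of that pattern. `run_cascade` and `ProviderExhausted` are hypothetical names, and the callables stand in for real provider clients.

```python
from typing import Callable, List, Tuple


class ProviderExhausted(RuntimeError):
    """Raised when every tier in the cascade has failed."""


def run_cascade(
    tiers: List[Tuple[str, Callable[[str], str]]], payload: str
) -> Tuple[str, str]:
    """Return (tier_name, result) from the first tier that yields non-empty output."""
    errors = []
    for name, call in tiers:
        try:
            result = call(payload)
        except Exception as exc:  # rate limit, timeout, outage, ...
            errors.append(f"{name}: {exc}")
            continue
        if result and result.strip():
            return name, result
        # An empty transcript or empty completion counts as a failure too.
        errors.append(f"{name}: empty result")
    raise ProviderExhausted("; ".join(errors))
```

Collecting the per-tier errors keeps the final exception useful for debugging, since it shows why each provider was skipped.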
| Service Layer | Models / Engines | Override Variable(s) |
|---|---|---|
| Voice T1 | faster-whisper (large-v3), pyannote/speaker-diarization-3.1, SpeechBrain wav2vec2-IEMOCAP, Parselmouth | `WHISPER_MODEL` (HF Space secret) |
| Voice T2a | scribe_v1 | None |
| Voice T2b | nova-2 | None |
| Voice T2c | whisper-large-v3, whisper-large-v3-turbo | None |
| Audit LLM T1 | openrouter/free, qwen/qwen3.6-plus:free, x-ai/grok-4.20 | `OPENROUTER_MODELS` |
| Audit LLM T2 | llama-3.3-70b-versatile, llama-3.1-8b-instant, groq/compound, groq/compound-mini | `GROQ_MODELS` |
| Audit LLM T3 | Qwen/Qwen2.5-7B-Instruct, mistralai/Mistral-7B-Instruct-v0.3, meta-llama/Llama-3.1-8B-Instruct | `HF_MODELS` |
| RAG Embeddings L1 | voyage-4-lite (documents + queries) | `VOYAGE_MODEL_DOC`, `VOYAGE_MODEL_QUERY` |
| RAG Embeddings L2 | all-MiniLM-L6-v2 | `LOCAL_EMBED_MODEL` |
| RAG Embeddings L3 | In-process FAISS index | None |
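The override variables hold comma-separated model lists. One way such a variable could be read is sketched below. `models_from_env` and `DEFAULT_GROQ_MODELS` are hypothetical names, and falling back to the defaults when the variable is unset or blank is an assumption about the intended behavior.

```python
import os
from typing import List, Mapping, Optional, Sequence

# Assumption: a shortened default list, taken from the Groq row of the table above.
DEFAULT_GROQ_MODELS = ["llama-3.3-70b-versatile", "llama-3.1-8b-instant"]


def models_from_env(
    var: str,
    defaults: Sequence[str],
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Parse a comma-separated model list, keeping order and dropping blanks."""
    source = os.environ if env is None else env
    raw = source.get(var, "")
    models = [item.strip() for item in raw.split(",") if item.strip()]
    # Fall back to the defaults when the variable is unset or empty.
    return models or list(defaults)
```

Order matters here because the cascade tries models in the order they are listed.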
- MongoDB Atlas (`kb_chunks`) stores Voyage vectors for organization-scoped retrieval.
- ChromaDB (`data/chroma`) stores local sentence-transformer vectors for warm local search.
- Atomic indexing is enforced: if Chroma indexing fails, MongoDB chunk vectors are rolled back to prevent split-brain policy context.
- Null-context protocol is explicit: when no policy chunks are available across all tiers, the auditor receives a structured no-context sentinel instead of hallucinated policy data.
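The rollback and null-context rules above can be illustrated with in-memory stand-ins. This is a sketch under stated assumptions, not the project's implementation: `InMemoryStore`, `index_document`, `build_context`, and the shape of `NO_POLICY_CONTEXT` are all hypothetical, and real code would call the MongoDB and ChromaDB clients instead.

```python
from typing import Dict, List, Sequence

# Assumption: one possible shape for the "no-context sentinel".
NO_POLICY_CONTEXT = {"policy_context": None, "reason": "no_policy_chunks_available"}


class InMemoryStore:
    """Stand-in for a vector store; set fail_on_add to simulate an indexing failure."""

    def __init__(self, fail_on_add: bool = False) -> None:
        self.chunks: Dict[str, List[str]] = {}
        self.fail_on_add = fail_on_add

    def add(self, doc_id: str, chunks: Sequence[str]) -> None:
        if self.fail_on_add:
            raise RuntimeError("index write failed")
        self.chunks[doc_id] = list(chunks)

    def delete(self, doc_id: str) -> None:
        self.chunks.pop(doc_id, None)


def index_document(primary, secondary, doc_id: str, chunks: Sequence[str]) -> bool:
    """Write to both stores; undo the primary write if the secondary one fails."""
    primary.add(doc_id, chunks)
    try:
        secondary.add(doc_id, chunks)
    except Exception:
        primary.delete(doc_id)  # roll back so the two stores never disagree
        return False
    return True


def build_context(chunks: Sequence[str]) -> dict:
    """Return retrieved policy text, or the explicit sentinel when nothing was found."""
    if not chunks:
        return dict(NO_POLICY_CONTEXT)
    return {"policy_context": "\n\n".join(chunks), "reason": None}
```

Passing an explicit sentinel lets the judge prompt state that no policy was found, rather than leaving an empty slot the model might fill in on its own.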
| Layer | Technologies | Purpose |
|---|---|---|
| Frontend | Vanilla JS, CSS3, MD3 | Zero-framework, high-performance UI |
| Backend | Python 3.10+, Flask | RESTful API and Job Orchestration |
| Databases | MongoDB Atlas, ChromaDB | Hybrid transactional & vector storage |
| ASR Engine | Faster-Whisper, Pyannote | Acoustic profiling and diarization |
| Security | JWT, CSRF, HTTP-Only | Enterprise-grade session protection |
| Analytics | ECharts, Chart.js | 3D Emotional Landscapes & Radar charts |
| Category | Primary | Supporting | Notes |
|---|---|---|---|
| Language Runtime | Python 3.10+ | JavaScript (ES6+) | Python powers API and orchestration; JS drives modular controllers |
| Backend Framework | Flask 3 | Flask-CORS, PyMongo | Blueprint architecture across auth, audits, KB, admin, alerts, agents |
| Frontend Framework Style | Vanilla JS + server-rendered Jinja templates | Material Design 3 patterns | No SPA framework dependency; optimized for controlled enterprise UX |
| RAG Vector Stores | MongoDB Atlas Vector Search | ChromaDB, FAISS | Hierarchical retrieval with strict rollback semantics |
| LLM Providers | OpenRouter | Groq, Hugging Face Inference | Multi-provider cascade for deterministic JSON auditing |
| LLM Model Families | Qwen, Llama | Mistral | Fallback-safe instruction models for judge output |
| Embedding Providers | Voyage AI | sentence-transformers | Query and document embedding generation |
| Speech Providers | HF Space (Gradio app) | ElevenLabs, Deepgram, Groq Speech | T1/T2 failover for resilient voice processing |
| Speech Model Families | faster-whisper, pyannote | Whisper, Nova, Scribe | Transcription, diarization, and acoustic profiling |
- Python 3.10 or higher
- MongoDB 5.0+ (Atlas cloud recommended)
- At least one LLM provider key (`OPENROUTER_API_KEY` or `GROQ_API_KEY` or `HF_SPACE_TOKEN`)
- Clone the Repository

      git clone https://github.com/prathamamritkar/genAI-qualityBot.git
      cd genAI-qualityBot

- Environment Setup

      python -m venv venv
      source venv/bin/activate  # Windows: venv\Scripts\activate
      pip install -r requirements.txt

- Database Initialization: The application automatically verifies and creates the required MongoDB indexes and seeds the system admin accounts upon the first successful connection.

- Launch Application

      python app.py

  Access the dashboard at http://localhost:8000.
Create a .env file in the root directory. Values here override standard OS environment variables.
# --- DATABASE ---
MONGODB_URI=mongodb+srv://user:pass@cluster.mongodb.net/qualora
# --- SECURITY ---
JWT_SECRET=your_random_32_char_secret_key
JWT_EXPIRATION_SECONDS=3600
ALLOWED_ORIGIN=http://localhost:5173
ELEVENLABS_WEBHOOK_SECRET=your_webhook_secret
# --- AI PROVIDERS (CORE) ---
GROQ_API_KEY=your_groq_api_key
OPENROUTER_API_KEY=your_openrouter_key
VOYAGE_API_KEY=your_voyage_api_key
# --- AI PROVIDERS (OPTIONAL TRANSCRIPTION) ---
ELEVENLABS_API_KEY=your_elevenlabs_key
DEEPGRAM_API_KEY=your_deepgram_key
# --- DISTRIBUTED NODES ---
HF_SPACE_URL=your-username/qualora-asr-node
HF_SPACE_TOKEN=your_hf_read_token
# --- OPTIONAL MODEL OVERRIDES ---
OPENROUTER_MODELS=openrouter/free,qwen/qwen3.6-plus:free,x-ai/grok-4.20
GROQ_MODELS=llama-3.3-70b-versatile,llama-3.1-8b-instant
HF_MODELS=Qwen/Qwen2.5-7B-Instruct,mistralai/Mistral-7B-Instruct-v0.3
VOYAGE_MODEL_DOC=voyage-4-lite
VOYAGE_MODEL_QUERY=voyage-4-lite
LOCAL_EMBED_MODEL=all-MiniLM-L6-v2

On first successful database connection, Qualora seeds accounts only when these environment variables are present.
| Role | Login Email | Password |
|---|---|---|
| Admin | admin@auditor.local | change_this_immediately |
| Demo User (Agent) | demo_agent@demo.local | demo_pass_123 |

Notes:
- The demo user email is generated as `<DEMO_USER_LOGIN>@demo.local`.
- Change these credentials before any shared/staging/production deployment.
The HF Space node provides the deep acoustic analysis and diarization engine. It acts as the primary voice path, with API-chain failover when unavailable.
- Create a Space: Create a new Private Space on Hugging Face using the Docker SDK.
- Push Source: Push the contents of the `/hf_space` directory (Dockerfile, app.py, requirements.txt) to the repository.
- Repository Secrets: Add the following secrets in your Space Settings:
  - `HF_TOKEN`: Your HuggingFace Read Token.
  - `WHISPER_MODEL`: Set to `medium` or `large-v3`.
- Model Access: You must visit pyannote/speaker-diarization-3.1 and pyannote/segmentation-3.0 to accept the user agreements, or the node will fail to initialize.
| Endpoint | Method | Description |
|---|---|---|
| `/api/audits/process-chat` | POST | Audit raw text or pasted transcripts. |
| `/api/audits/process-file` | POST | Extract text from PDF/TXT/CSV and audit. |
| `/api/audits/process-call` | POST | Voice pipeline for audio files (T1 HF Space, T2 API fallback). |
| `/api/audits/<id>/override` | POST | HITL: Approve, Flag, or Reject an AI audit. |
| `/api/audits/history` | GET | Retrieve paginated audit logs for the organization. |
| `/api/audits/<id>` | GET | Retrieve a full single-audit detail payload. |
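As an illustration of calling one of these endpoints from a script, the sketch below builds a `POST /api/audits/process-chat` request with the standard library. The request body shape is not documented here, so the `transcript` field is an assumption, as is the helper name `build_chat_audit_request`. The request is only constructed, not sent.

```python
import json
import urllib.request


def build_chat_audit_request(
    base_url: str, transcript: str, csrf_token: str
) -> urllib.request.Request:
    """Build (but do not send) a chat-audit request."""
    # Assumption: the endpoint accepts a JSON body with a "transcript" field.
    body = json.dumps({"transcript": transcript}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/audits/process-chat",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # State-changing requests carry the CSRF token in this header.
            "X-CSRF-Token": csrf_token,
        },
    )
```

Sending it would also require the HTTP-only auth cookie issued at login, for example through an opener built with `urllib.request.HTTPCookieProcessor`.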
| Endpoint | Method | Description |
|---|---|---|
| `/api/kb/upload` | POST | Upload policy documents (Max 5 per batch). |
| `/api/kb/documents` | GET | List all indexed organizational policies. |
| `/api/kb/<id>/file` | GET | Stream the original uploaded document. |
| `/api/kb/<id>/status` | GET | Poll the indexing progress (Mongo vs Chroma). |
| `/api/kb/<id>/reindex` | POST | Re-trigger indexing for failed or partial documents. |
| `/api/kb/<id>` | DELETE | Atomic removal of document and vector chunks. |
| Endpoint | Method | Description |
|---|---|---|
| `/api/auth/register` | POST | Create user + organization and set secure auth cookies. |
| `/api/auth/login` | POST | Authenticate and issue HTTP-only JWT cookie + CSRF token. |
| `/api/auth/logout` | POST | Clear auth cookies and terminate the session. |
| `/api/auth/me` | GET | Fetch current authenticated user profile. |
| `/api/health` | GET | Fast health check for DB connectivity. |
| `/api/health/deep` | GET | Deep health check including vector-store readiness. |
Qualora is built with a "Security-First" architecture to handle sensitive customer interactions:
- XSS Immunity: JWTs are stored in `HTTP-Only` cookies. JavaScript cannot read the session token, which blocks token theft through most cross-site scripting vectors.
- CSRF Protection: All state-changing requests (POST/PUT/DELETE) require a valid `X-CSRF-Token` header that is validated using constant-time comparison.
- Atomic RAG Indexing: Our "Zombie Guard" ensures that if the local ChromaDB index fails, the MongoDB Atlas vectors are rolled back immediately, so audits never run against a half-indexed policy.
- Prompt Injection Defense: Transcripts are strictly bounded by XML sanitization tags within the LLM prompt, which mitigates "Ignore previous instructions" style attacks.
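Two of the controls above map onto small standard-library calls. The sketch below is illustrative only: `csrf_token_matches` and `wrap_transcript` are hypothetical helper names, and the `<transcript>` tag name is an assumption. It uses `hmac.compare_digest` for the constant-time comparison and `xml.sax.saxutils.escape` so a transcript cannot close its own wrapper tag.

```python
import hmac
from xml.sax.saxutils import escape


def csrf_token_matches(expected: str, supplied: str) -> bool:
    """Compare CSRF tokens without leaking how many leading characters match."""
    if not expected or not supplied:
        return False
    return hmac.compare_digest(expected.encode("utf-8"), supplied.encode("utf-8"))


def wrap_transcript(transcript: str) -> str:
    """Escape &, <, and > and then bound the transcript with a single tag pair."""
    safe = escape(transcript)
    # Assumption: the prompt tells the model to treat this block as data only.
    return f"<transcript>\n{safe}\n</transcript>"
```

Escaping before wrapping means a transcript containing `</transcript>` arrives as `&lt;/transcript&gt;`, so it cannot end the bounded region early.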
The sitemap, privacy policy, and terms are loaded dynamically to allow updates without redeploying code.
- Sitemap: Managed in `/data/sitemap.json`. Defines the dashboard hierarchy and icons.
- Policies: Managed in `/data/privacy-policy.md` and `/data/terms-of-service.md`.
- UI Hint: Use standard Markdown. The frontend automatically converts these to MD3-compliant HTML on load.
- 401 Unauthorized: Session has expired. The `fetch-wrapper.js` will automatically redirect you to the landing page.
- Indexing Stuck: Check if the `VOYAGE_API_KEY` is active. Use `POST /api/kb/<doc_id>/reindex` to retry.
- All Speakers are "Speaker 0": This indicates the HF Space node is offline. Configure `HF_SPACE_URL` to enable diarization.
- Real-time Streaming: WebSockets for mid-call live auditing.
- CRM Sync: Automated export of audit scores to Salesforce and Zendesk.
- Multilingual Support: Expanding RAG and ASR to Spanish, French, and German.
MIT License - Copyright (c) 2026 Qualora. Built by Pratham Amritkar.