AlfaPankaj/README.md

Pankaj Yadav

Generative AI Researcher & Systems Engineer

Building production-grade AI for the next billion users — not just the next billion dollars.



About Me

I build AI systems that run on hardware most researchers discard.

My work sits at the intersection of edge AI, agentic systems, and low-resource NLP — with a focus on making powerful AI accessible on 4 GB VRAM consumer hardware, in Indian languages, without cloud dependency.

  • 🧠 Building CHAARI 2.0 — a privacy-first bilingual agentic AI OS companion (Hinglish, 4 GB VRAM, cryptographic mesh, arXiv in prep)
  • ⚡ Researching NMOS — 70B+ inference on 4 GB VRAM via anticipatory behavioral signal loading
  • 🎓 MSc Information Technology @ Lovely Professional University (May 2026)
  • 💼 ex-Intern @ LG Electronics (predictive maintenance, >95% recall)
  • 🏐 State & National Volleyball player — teamwork on and off the court

Flagship Projects

🤖 CHAARI 2.0 — Privacy-First Bilingual Agentic AI

Comprehensive Hinglish AI Agentic Runtime Interface

Production-grade two-node agentic AI companion running entirely on an RTX 2050 (4 GB VRAM). Built solo. No research lab. No cloud budget.

| Component | Detail |
|---|---|
| Scale | 39+ Python modules · 8,000+ lines of code · 369+ automated tests |
| Model | Fine-tuned Qwen 2.5 4.2B on custom Hinglish dataset · 30–40 tok/s on 4 GB VRAM |
| Safety | 7-layer Constitutional AI-inspired pipeline (code-based, not prompt-based) |
| Security | RSA-2048 two-node TCP mesh · nonce replay protection · 3-step handshake |
| RAG | RAPTOR 3-level hierarchical RAG · 1.14 GB vector index · sub-second retrieval |
| Voice | Full-duplex STT+TTS · sub-800 ms conversation latency · sub-100 ms tool calls |
| Vision | OCR + LLaVA 7B for screen/image understanding |
| Research | arXiv paper in preparation (cs.CL / cs.AI) |
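
To illustrate the "nonce replay protection" item in the Security row, here is a minimal sketch of one common approach: remember recently seen nonces and reject repeats or stale timestamps. It is not taken from the CHAARI 2.0 codebase. `NonceCache`, `new_nonce`, and the 30-second window are hypothetical names and values chosen for illustration, and the RSA-2048 signature check that would bind the nonce and timestamp to a message is deliberately left out.

```python
import secrets
import time
from typing import Dict, Optional


class NonceCache:
    """Rejects a message if its nonce was already seen or its timestamp is stale.

    Hypothetical helper for illustration; signature verification is out of scope.
    """

    def __init__(self, max_age_s: float = 30.0) -> None:
        self.max_age_s = max_age_s
        self._seen: Dict[str, float] = {}  # nonce -> timestamp the message carried

    def accept(self, nonce: str, sent_at: float, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Forget entries older than the window so the cache stays bounded.
        self._seen = {n: t for n, t in self._seen.items() if now - t <= self.max_age_s}
        if abs(now - sent_at) > self.max_age_s:
            return False  # stale, or timestamped too far in the future
        if nonce in self._seen:
            return False  # same nonce inside the window: treat as a replay
        self._seen[nonce] = sent_at
        return True


def new_nonce() -> str:
    """Return a fresh 128-bit random nonce as a hex string."""
    return secrets.token_hex(16)
```

Evicting old entries is safe in this sketch because a replayed message carries its original timestamp, so once its nonce has been forgotten it fails the freshness check instead.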

🧠 NMOS — Neural Memory Operating System (Active Research)

Anticipatory Inference for LLMs Using User Interaction Signals

The Zero-Lag Hypothesis: Perceived Latency ≈ max(0, T_load − T_typing)

Running 70B+ parameter models on 4 GB VRAM by using human behavioral signals to mask the physical memory wall.
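
The Zero-Lag Hypothesis above can be written as a one-line function. This is a sketch of the formula only; the function name and the timings in the usage lines are made-up numbers for illustration, not NMOS measurements.

```python
def perceived_latency(t_load_s: float, t_typing_s: float) -> float:
    """Perceived latency ≈ max(0, T_load − T_typing), both in seconds.

    t_load_s:   time needed to bring the required weights into VRAM
    t_typing_s: time the user spends typing, used as the loading window
    """
    return max(0.0, t_load_s - t_typing_s)


# Loading finishes while the user is still typing: no visible wait.
hidden = perceived_latency(t_load_s=3.0, t_typing_s=5.0)    # 0.0

# Loading outlasts typing: only the remainder is felt.
visible = perceived_latency(t_load_s=6.0, t_typing_s=4.5)   # 1.5
```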

| Module | Role | Status |
|---|---|---|
| Scout (SmolLM2-135M) | Real-time shard affinity prediction | ✅ 90% accuracy |
| River | Async double-buffered prefetcher | ✅ Zero GPU stall |
| Memory | Paged-KV controller with H2O folding | ✅ Active |
| Engine | Speculative decoding orchestrator (K=15) | ✅ ~16 tok/s on 70B |
| Failure Memory | HNSW vector DB for misprediction learning | 🔄 Next phase |
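
The idea behind an async double-buffered prefetcher such as River can be sketched in a few lines: while the caller works on one item, a background worker is already loading the next. This is a generic illustration using Python's standard library, not the NMOS implementation. `prefetch` and the `load` callback are hypothetical names, and a real system would copy weight shards to the GPU instead of calling a plain Python function.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, TypeVar

K = TypeVar("K")
V = TypeVar("V")


def prefetch(keys: Iterable[K], load: Callable[[K], V]) -> Iterator[V]:
    """Yield load(key) for each key, loading one item ahead of the consumer."""
    it = iter(keys)
    with ThreadPoolExecutor(max_workers=1) as pool:
        try:
            pending = pool.submit(load, next(it))  # fill the first buffer
        except StopIteration:
            return  # nothing to load
        for key in it:
            current = pending.result()        # wait for the buffer being filled
            pending = pool.submit(load, key)  # start filling the other buffer
            yield current                     # consumer computes while the load runs
        yield pending.result()                # last item has no successor to prefetch
```

A caller would iterate with `for shard in prefetch(shard_ids, load_shard): ...`, where `shard_ids` and `load_shard` are placeholders. Using a single worker keeps at most one load in flight, which is what limits the scheme to two buffers.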

Other Projects

| Project | Stack | Highlights |
|---|---|---|
| Autonomous Financial Research Agent | LangGraph · MCP · FinBERT | Multi-step reasoning workflow with neurosymbolic guardrails |
| HinglishSearch RAG | Endee VectorDB · Docker · CHAARI 2.0 | Semantic search for Hinglish documents · sub-second retrieval |
| Industrial Predictive Maintenance | PyTorch · LSTM · Isolation Forest | >95% recall · deployed at LG Electronics |

Tech Stack

Core Languages

Python · C++ · SQL

AI / ML

PyTorch · TensorFlow · Hugging Face · LangGraph · Ollama

LLMs & GenAI

Qwen · Llama · Groq · Gemini · Vertex AI

RAG & Vector DBs

FAISS · ChromaDB · Endee

Infrastructure

Docker · Git · Google Colab


Certifications

  • 🏅 Train/Build Small Language Models — Google DeepMind (Advanced)
  • 🏅 Enterprise AI Agents & Fundamentals — Google Cloud
  • 🏅 Gemini Enterprise Applications — Google Cloud
  • 🏅 Quantitative Research — JPMorgan Chase & Co. (Forage)
  • 🏅 Data Analytics — Deloitte Australia (Forage)



"Building AI for the next billion users, not just the next billion dollars."

— Built in Rudrapur. Running everywhere.

Pinned

  1. Hinglish--search--endee Public

    Semantic Search and RAG for Hinglish documents using Endee vector database

    Python 5

  2. AlfaPankaj Public

    about me

    3

  3. hinglish-search-endee Public

    Forked from endee-io/endee

    Semantic search and RAG for Hinglish documents using Endee vector database

    C++

  4. YOLOv11_-_CLIP_Natural_Language_Visual_Search Public

    This project takes object detection to the next level by combining **YOLOv11** (for spatial localization) with **CLIP** (for semantic understanding). It allows you to search for specific object…

    Python

  5. CHAARI_2.0 Public

    CHAARI 2.0 is a privacy-first, bilingual agentic AI voice OS—not a chatbot. It controls your computer, speaks Hinglish natively, and runs on a two-node cryptographic mesh with RSA-2048 signed comma…

    Python 1

  6. Neural_Memory_Operating_system Public

    NMOS (Neural Memory OS) is a predictive partial execution engine enabling 70B-level reasoning on 4GB VRAM. It uses the “Zero-Lag” hypothesis, leveraging typing latency as a compute window to mask m…

    Python 1