sumreen7/README.md

Hi there, I'm Sumreen! 👋

🚀 About Me

I'm a Master's student at Carnegie Mellon University pursuing Information Systems Management with a focus on Business Intelligence & Data Analytics (GPA: 4.0/4.0). Previously, I worked as a Software Engineer at Salesforce, where I built automation pipelines, CI/CD systems, and AI-powered workflows.

I'm passionate about leveraging AI, data science, and automation to solve complex problems and create impactful solutions. Currently exploring LLM applications, agentic AI systems, and ML for real-world product innovation.

🔭 Currently working on: AI workflow orchestration & multi-agent systems
🌱 Learning: Advanced ML techniques, cloud architecture, and product strategy
💬 Ask me about: Python automation, AI agents, RAG systems, data pipelines, or Salesforce development
📫 Reach me: sumreenf@andrew.cmu.edu | Portfolio


💻 Tech Stack

💻 Programming

Python • SQL • Java • Apex • TypeScript • JavaScript • R • Bash • REST APIs

🧠 AI & Machine Learning

PyTorch • TensorFlow • scikit-learn • OpenAI • Hugging Face • LangChain • LangGraph • Ollama • ChromaDB

Skills: RAG (Naive + Agentic) • LLM Fine-tuning • Prompt Engineering • AI Agents • Tool Calling • RAGAS Evaluation

📊 Data Science & Analytics

Pandas • NumPy • Jupyter • Tableau • Power BI

☁️ Cloud & DevOps

AWS • Docker • Kubernetes • Git • Jenkins

Skills: CI/CD • Containerization • Cloud-native Development

🔗 Platforms & Ecosystems

Salesforce • MuleSoft • GitHub • Slack • n8n • Google APIs


🏆 Featured Projects

Production RAG system that lets developers ask natural language questions about documentation and receive grounded, source-cited answers — powered by a local LLM running entirely on your machine.

  • Tech Stack: Python, LangChain, LangGraph, Ollama (Mistral 7B), ChromaDB, SentenceTransformers, FastAPI, Streamlit, Docker, RAGAS
  • Key Features:
    • Naive + Agentic RAG — two pipeline modes switchable via one env var. Naive retrieves once and generates. Agentic uses a ReAct agent that decides when and how to search, can perform multiple searches per question, and decomposes complex queries into sub-searches.
    • End-to-end pipeline: document ingestion → chunking → embedding → vector search → LLM answer with source citations
    • RAGAS evaluation: faithfulness 1.0, context precision 1.0, overall 0.819
    • 63 automated tests, Docker deployment, Streamlit chat UI with live health monitoring
  • Impact: RAGAS faithfulness of 1.0 in evaluation, meaning every evaluated answer traces back to a specific document chunk
  • 📘 Interactive Guide →
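
The naive pipeline described above (chunking → embedding → vector search → answer with source citations) can be sketched in a few lines. This is a minimal illustration, not the project's code: the bag-of-words `embed` function stands in for a real SentenceTransformers model, the in-memory list stands in for ChromaDB, and the file names, chunk sizes, and prompt wording are all hypothetical. The LLM call itself is left out; the sketch stops at the prompt that would be sent to it.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=12, overlap=4):
    """Split text into overlapping word windows (sizes are hypothetical)."""
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(docs):
    """Ingest {source: text} into a list of chunk records with vectors."""
    index = []
    for source, text in docs.items():
        for i, chunk in enumerate(chunk_text(text)):
            index.append({"source": source, "chunk_id": i,
                          "text": chunk, "vector": embed(chunk)})
    return index


def retrieve(index, question, k=2):
    """Naive mode: a single similarity search per question."""
    q = embed(question)
    ranked = sorted(index, key=lambda e: cosine(q, e["vector"]), reverse=True)
    return ranked[:k]


def build_prompt(question, hits):
    """Attach a [source#chunk] tag to each chunk so the answer can cite it."""
    context = "\n".join(
        f"[{h['source']}#{h['chunk_id']}] {h['text']}" for h in hits
    )
    return ("Answer using only the context below and cite the tags.\n"
            f"{context}\nQuestion: {question}")


# Hypothetical two-document corpus.
docs = {
    "deploy.md": "Run docker compose up to start the API and the chat UI.",
    "testing.md": "Run pytest from the repository root to execute the automated tests.",
}
index = build_index(docs)
hits = retrieve(index, "How do I run the tests?", k=1)
prompt = build_prompt("How do I run the tests?", hits)
print(prompt)
```

An agentic variant would wrap `retrieve` as a tool that the agent may call several times with rewritten sub-queries, rather than calling it once up front.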

A distributed multi-agent AI system orchestrating 7+ autonomous agents with RAG pipelines and OAuth integrations.

  • Tech Stack: Python, LangChain, OpenAI API, Google Workspace APIs, Vector Embeddings
  • Key Features:
    • Event-driven workflow engine for Gmail, Calendar, and Notion integration
    • RAG-based knowledge retrieval with structured JSON outputs
    • Reduced manual productivity tasks by 80%
  • Impact: Automated complex multi-step workflows across interconnected systems

📊 NVIDIA × CMU: Social Listening & Market Insights

Large-scale NLP analysis project for NVIDIA GTC positioning strategy.

  • Tech Stack: Python, NLP, Sentiment Analysis, Tableau, Brandwatch
  • Key Features:
    • Analyzed 100K+ social media conversations
    • Built sentiment analysis and topic modeling pipelines
    • Generated strategic recommendations for stakeholder presentations
  • Impact: Identified perception gaps and emerging AI trends for product positioning

End-to-end ETL pipeline on real insurance claims data using Bronze/Silver/Gold medallion architecture.

  • Tech Stack: Python, DuckDB, dbt, SQL, Tableau
  • Key Features:
    • Built Bronze → Silver → Gold pipeline ingesting 1,338 healthcare claims with zero data loss
    • Kimball dimensional model with dim_patient and fact_claims tables using dbt
    • 6 automated dbt data quality tests passing across all transformation layers
    • 📱 Interactive HTML Report
  • Impact: Gold-layer analytics showed that smokers' claims average about 3.8× those of non-smokers ($32,108 vs $8,415)
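
As a rough illustration of the Bronze → Silver → Gold flow, the sketch below uses plain Python in place of DuckDB and dbt. The four records, the column names, and the helper functions are hypothetical toy values chosen for readability, not the project's data or models. The row-count assertion between layers mirrors the idea of a "zero data loss" check.

```python
from collections import defaultdict

# Bronze: raw rows kept as ingested (hypothetical toy records).
bronze = [
    {"claim_id": "1", "smoker": "yes", "charges": "30000.00"},
    {"claim_id": "2", "smoker": "no", "charges": "8000.00"},
    {"claim_id": "3", "smoker": "No ", "charges": "10000.00"},
    {"claim_id": "4", "smoker": "YES", "charges": "34000.00"},
]


def to_silver(rows):
    """Silver: normalize and type the raw values without dropping rows."""
    return [
        {
            "claim_id": int(r["claim_id"]),
            "smoker": r["smoker"].strip().lower() == "yes",
            "charges": float(r["charges"]),
        }
        for r in rows
    ]


def to_gold(rows):
    """Gold: business-level aggregate, here average charges per smoker flag."""
    totals = defaultdict(lambda: [0.0, 0])
    for r in rows:
        totals[r["smoker"]][0] += r["charges"]
        totals[r["smoker"]][1] += 1
    return {flag: total / count for flag, (total, count) in totals.items()}


silver = to_silver(bronze)
assert len(silver) == len(bronze)  # no rows lost between layers

gold = to_gold(silver)
ratio = gold[True] / gold[False]
print(f"smokers: {gold[True]:.0f}, non-smokers: {gold[False]:.0f}, ratio: {ratio:.2f}")
```

Applying the same ratio to the reported averages gives $32,108 / $8,415 ≈ 3.8.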

AI-powered travel planning platform with real-time itinerary adaptation.

  • Tech Stack: Python, LLM APIs, NLP, React, Weather APIs
  • Key Features:
    • LLM tool-calling pipelines for dynamic travel recommendations
    • Processed 50K+ travel reviews for safety scoring and sentiment
    • Achieved 84% classification accuracy, validated through A/B testing with 200+ users
  • Impact: Personalized destination insights with real-time contextual signals

End-to-end demand forecasting pipeline for eyewear retail assortment optimization.

  • Tech Stack: Python, XGBoost, Scikit-learn, Pandas, Time Series Analysis
  • Key Features:
    • Built ML pipeline with 33 engineered features analyzing 910K+ units across 162 SKUs
    • Achieved 0.737 R² using XGBoost with hyperparameter tuning and cross-validation
    • Generated 14 data-driven product management recommendations
    • 📱 Interactive HTML Report
  • Impact: Identified $2.3K/SKU revenue opportunity through strategic assortment reallocation
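
For readers unfamiliar with the R² metric quoted above, here is a small sketch of how it is computed, paired with a time-ordered train/test split. The split function, the `week` and `units` field names, the naive last-value baseline, and all numbers are hypothetical and are not taken from the project; the actual pipeline uses XGBoost.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def time_split(rows, cutoff_week):
    """Train on weeks before the cutoff, test on the rest (no shuffling)."""
    train = [r for r in rows if r["week"] < cutoff_week]
    test = [r for r in rows if r["week"] >= cutoff_week]
    return train, test


# Hypothetical weekly unit sales for one SKU.
rows = [{"week": w, "units": u}
        for w, u in [(1, 10), (2, 12), (3, 11), (4, 13), (5, 14), (6, 12)]]
train, test = time_split(rows, cutoff_week=5)

# Naive baseline: predict the last observed training value for every test week.
baseline = [train[-1]["units"]] * len(test)

# Worked example of the metric on made-up predictions.
score = r2_score([10, 20, 30, 40], [12, 18, 33, 37])
print(round(score, 3))
```

Splitting by time rather than at random keeps future weeks out of the training set, which matters when the features are built from past sales.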

🎓 Education & Certifications

Carnegie Mellon University | Master of Information Systems Management (MISM)
Business Intelligence & Data Analytics | GPA: 4.0/4.0 | Expected Dec 2026

Certifications:

  • 🏅 Salesforce AI Associate
  • 🏅 Salesforce Advanced Admin & App Builder
  • 🏅 Salesforce Advanced Developer & DevOps
  • 🏅 Wharton School: AI for Business
  • 🌟 Salesforce Ranger: 82,000+ Trailhead points

💼 Professional Experience

Software Engineer @ Salesforce (Jul 2023 - Aug 2025)
Built predictive analytics and automation workflows using Python, SQL, and APIs to optimize enterprise asset onboarding for 40K+ users.

Product Analytics & Insights Intern @ SRM Films (May 2023 - Jul 2023)
Analyzed engagement, retention, and churn metrics and developed dashboards to support data-driven content strategy.

Summer Analyst Intern @ Salesforce (May 2022 - Jul 2022)
Developed automation pipelines integrating Slack and MuleSoft APIs to streamline enterprise onboarding workflows.


🤝 Connect With Me

LinkedIn | Email | Portfolio | GitHub


💡 "Making the world better one project at a time"

Pinned

  1. developer-knowledge-rag (Public)

    Developer Documentation AI Assistant using RAG, Mistral (Ollama), ChromaDB, and FastAPI. Supports semantic search and domain-specific Q&A over engineering documentation.

    Python

  2. ai-chief-of-staff (Public)

    Automates email triage, calendar management, task extraction, and daily briefings, so you can focus on what matters.

    Shell

  3. My_AiPortfolio (Public)

    Personal AI portfolio

    TypeScript

  4. esha-pandya0203/dfp-job-analyzer (Public)

    Python 1 1

  5. VSPVision_IntelligentAnalytics (Public)

    XGBoost demand forecasting pipeline with feature engineering, time-based validation, and hyperparameter tuning, built on SKU-level retail sales data.

    Jupyter Notebook

  6. Predicting-Depression-from-Social-Netwroking-Data-using-Machine-Learning-Techniques (Public)

    Forked from NANDINI-star/Predicting-Depression-from-Social-Netwroking-Data-using-Machine-Learning-Techniques

    Jupyter Notebook