I'm a Master's student at Carnegie Mellon University pursuing Information Systems Management with a focus on Business Intelligence & Data Analytics (GPA: 4.0/4.0). Previously, I worked as a Software Engineer at Salesforce, where I built automation pipelines, CI/CD systems, and AI-powered workflows.
I'm passionate about leveraging AI, data science, and automation to solve complex problems and create impactful solutions. I'm currently exploring LLM applications, agentic AI systems, and ML for real-world product innovation.
🔭 Currently working on: AI workflow orchestration & multi-agent systems
🌱 Learning: Advanced ML techniques, cloud architecture, and product strategy
💬 Ask me about: Python automation, AI agents, RAG systems, data pipelines, or Salesforce development
📫 Reach me: sumreenf@andrew.cmu.edu | Portfolio
Skills: RAG (Naive + Agentic) • LLM Fine-tuning • Prompt Engineering • AI Agents • Tool Calling • RAGAS Evaluation
Skills: CI/CD • Containerization • Cloud-native Development
Production RAG system that lets developers ask natural language questions about documentation and receive grounded, source-cited answers — powered by a local LLM running entirely on your machine.
- Tech Stack: Python, LangChain, LangGraph, Ollama (Mistral 7B), ChromaDB, SentenceTransformers, FastAPI, Streamlit, Docker, RAGAS
- Key Features:
- Naive + Agentic RAG — two pipeline modes switchable via one env var. Naive retrieves once and generates. Agentic uses a ReAct agent that decides when and how to search, can perform multiple searches per question, and decomposes complex queries into sub-searches.
- End-to-end pipeline: document ingestion → chunking → embedding → vector search → LLM answer with source citations
- RAGAS evaluation: faithfulness 1.0, context precision 1.0, overall 0.819
- 63 automated tests, Docker deployment, Streamlit chat UI with live health monitoring
- Impact: No hallucinations detected in the RAGAS evaluation (faithfulness 1.0): every answer traces back to a specific document chunk
- 📘 Interactive Guide →
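As a rough illustration of the chunking → embedding → vector search → cited answer flow and the naive/agentic switch described above, here is a minimal standard-library sketch. It is not the project's code: the bag-of-words `embed` is a toy stand-in for SentenceTransformers, the "generation" step simply quotes the retrieved chunks instead of calling an LLM, splitting on "and" is a crude stand-in for a ReAct agent's query decomposition, and the `RAG_MODE` variable name is an assumption.

```python
import math
import os
import re
from collections import Counter


def chunk(text, size=8):
    """Split text into fixed-size word chunks (no overlap, for brevity)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=1):
    """Return the k most similar chunks as (index, text) pairs."""
    q = embed(question)
    ranked = sorted(range(len(chunks)), key=lambda i: cosine(q, embed(chunks[i])), reverse=True)
    return [(i, chunks[i]) for i in ranked[:k]]


def answer_naive(question, chunks):
    """Naive mode: retrieve once, then answer from that single chunk."""
    idx, text = retrieve(question, chunks, k=1)[0]
    return {"answer": text, "sources": [idx]}


def answer_agentic(question, chunks):
    """Agentic-style mode: decompose the question and search once per sub-question."""
    sub_questions = [q.strip() for q in re.split(r"\band\b", question) if q.strip()]
    sources = []
    for sub in sub_questions:
        idx, _ = retrieve(sub, chunks, k=1)[0]
        if idx not in sources:
            sources.append(idx)
    return {"answer": " ".join(chunks[i] for i in sources), "sources": sources}


PIPELINES = {"naive": answer_naive, "agentic": answer_agentic}

# Hypothetical env var name; the real project may use a different one.
pipeline = PIPELINES.get(os.environ.get("RAG_MODE", "naive"), answer_naive)

docs = (
    "FastAPI serves the query endpoint on port 8000. "
    "ChromaDB stores the chunk embeddings on disk. "
    "Streamlit renders the chat interface."
)
chunks = chunk(docs)
result = pipeline("Where are embeddings stored?", chunks)
print(result["answer"], "| sources:", result["sources"])
```

Returning the chunk indices alongside the answer is what makes source citation possible: the caller can always show which chunks an answer was built from.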
A distributed multi-agent AI system orchestrating 7+ autonomous agents with RAG pipelines and OAuth integrations.
- Tech Stack: Python, LangChain, OpenAI API, Google Workspace APIs, Vector Embeddings
- Key Features:
- Event-driven workflow engine for Gmail, Calendar, and Notion integration
- RAG-based knowledge retrieval with structured JSON outputs
- Reduced manual productivity tasks by 80%
- Impact: Automated complex multi-step workflows across interconnected systems
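One way to picture an event-driven workflow engine with structured JSON outputs is a small publish/subscribe bus where each agent is a handler. The sketch below is illustrative only: `EventBus`, the two agent functions, and the `email.received` event name are all hypothetical, and no real Gmail, Calendar, or Notion calls are made.

```python
import json
from collections import defaultdict


class EventBus:
    """Minimal in-process pub/sub: event type -> list of handler callables."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        # Each handler returns a JSON-serializable dict describing its action.
        return [handler(payload) for handler in self._handlers[event_type]]


def calendar_agent(payload):
    """Hypothetical agent: proposes a calendar event from an email subject."""
    return {"agent": "calendar", "action": "create_event", "title": payload["subject"]}


def notes_agent(payload):
    """Hypothetical agent: proposes appending the subject to a notes page."""
    return {"agent": "notes", "action": "append_page", "text": payload["subject"]}


bus = EventBus()
bus.subscribe("email.received", calendar_agent)
bus.subscribe("email.received", notes_agent)

results = bus.publish("email.received", {"subject": "Project sync Friday 3pm"})
print(json.dumps(results, indent=2))
```

Keeping each agent's output as a plain dict means downstream steps can validate or log it before any external API is touched.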
Large-scale NLP analysis project for NVIDIA GTC positioning strategy.
- Tech Stack: Python, NLP, Sentiment Analysis, Tableau, Brandwatch
- Key Features:
- Analyzed 100K+ social media conversations
- Built sentiment analysis and topic modeling pipelines
- Generated strategic recommendations for stakeholder presentations
- Impact: Identified perception gaps and emerging AI trends for product positioning
End-to-end ETL pipeline on real insurance claims data using Bronze/Silver/Gold medallion architecture.
- Tech Stack: Python, DuckDB, dbt, SQL, Tableau
- Key Features:
- Built Bronze → Silver → Gold pipeline ingesting 1,338 healthcare claims with zero data loss
- Kimball dimensional model with dim_patient and fact_claims tables using dbt
- 6 automated dbt data quality tests passing across all transformation layers
- 📱 Interactive HTML Report
- Impact: Identified that smokers cost 3.8× as much in claims ($32,108 vs $8,415 avg) through Gold layer analytics
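The Bronze → Silver → Gold flow can be sketched in a few SQL statements. The example below uses Python's built-in `sqlite3` as a stand-in for DuckDB and dbt so it runs without extra installs; the table names, column names, and the four toy rows are invented for illustration and are not the project's data. The last line only re-derives the reported 3.8× figure from the two averages quoted above.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Bronze: raw ingest, everything stored as text exactly as it arrived.
con.execute("CREATE TABLE bronze_claims (patient_id TEXT, smoker TEXT, charges TEXT)")
con.executemany(
    "INSERT INTO bronze_claims VALUES (?, ?, ?)",
    [("p1", " yes", "30000.0"), ("p2", "no ", "8000.0"),
     ("p3", "NO", "9000.0"), ("p4", "Yes", "34000.0")],  # toy rows
)

# Silver: cleaned and typed (trimmed, lowercased flags, numeric charges).
con.execute("""
    CREATE TABLE silver_claims AS
    SELECT patient_id,
           LOWER(TRIM(smoker)) AS smoker,
           CAST(charges AS REAL) AS charges
    FROM bronze_claims
""")

# Gold: business-level aggregate ready for reporting.
con.execute("""
    CREATE TABLE gold_avg_by_smoker AS
    SELECT smoker, AVG(charges) AS avg_charges, COUNT(*) AS n_claims
    FROM silver_claims
    GROUP BY smoker
""")

# A simple "zero data loss" check: row counts match between layers.
bronze_n = con.execute("SELECT COUNT(*) FROM bronze_claims").fetchone()[0]
silver_n = con.execute("SELECT COUNT(*) FROM silver_claims").fetchone()[0]
assert bronze_n == silver_n

gold = dict(con.execute("SELECT smoker, avg_charges FROM gold_avg_by_smoker").fetchall())
print(gold)

# Ratio of the two averages reported in the project summary.
reported_ratio = 32108 / 8415
print(round(reported_ratio, 1))
```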
AI-powered travel planning platform with real-time itinerary adaptation.
- Tech Stack: Python, LLM APIs, NLP, React, Weather APIs
- Key Features:
- LLM tool-calling pipelines for dynamic travel recommendations
- Processed 50K+ travel reviews for safety scoring and sentiment
- Achieved 84% classification accuracy, validated through A/B testing with 200+ users
- Impact: Personalized destination insights with real-time contextual signals
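A tool-calling pipeline generally has the model emit a structured request that application code validates and executes. The sketch below assumes a simple `{"name": ..., "arguments": ...}` JSON shape and a stubbed `get_weather` tool with canned data; both are hypothetical, and the hard-coded `model_output` string takes the place of a real LLM response.

```python
import json


def get_weather(city):
    """Stub tool: a real version would call a weather API."""
    canned = {"Lisbon": {"temp_c": 24, "condition": "sunny"}}
    return canned.get(city, {"temp_c": None, "condition": "unknown"})


# Registry of tools the model is allowed to call.
TOOLS = {"get_weather": get_weather}


def dispatch(tool_call_json):
    """Parse a tool call, reject unknown tools, and run the matching function."""
    call = json.loads(tool_call_json)
    name = call["name"]
    args = call.get("arguments", {})
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    return {"tool": name, "result": TOOLS[name](**args)}


# Stand-in for what an LLM might emit when asked about conditions in Lisbon.
model_output = '{"name": "get_weather", "arguments": {"city": "Lisbon"}}'
observation = dispatch(model_output)
print(observation)
```

The returned observation would typically be fed back to the model so it can fold the live signal into its recommendation.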
End-to-end demand forecasting pipeline for eyewear retail assortment optimization.
- Tech Stack: Python, XGBoost, Scikit-learn, Pandas, Time Series Analysis
- Key Features:
- Built ML pipeline with 33 engineered features analyzing 910K+ units across 162 SKUs
- Achieved 0.737 R² using XGBoost with hyperparameter tuning and cross-validation
- Generated 14 data-driven product management recommendations
- 📱 Interactive HTML Report
- Impact: Identified $2.3K/SKU revenue opportunity through strategic assortment reallocation
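For readers unfamiliar with the metric, R² is one minus the ratio of residual to total sum of squares. The sketch below computes it in plain Python and shows one common kind of engineered time-series feature (lagged values). The `lag_features` helper and the sample unit counts are illustrative assumptions, not the project's 33 features or its data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def lag_features(series, lags=(1, 2)):
    """Build one row per time step with lagged values as features."""
    rows = []
    for t in range(max(lags), len(series)):
        row = {f"lag_{k}": series[t - k] for k in lags}
        row["target"] = series[t]
        rows.append(row)
    return rows


weekly_units = [10, 12, 15, 14, 18]  # toy demand series
rows = lag_features(weekly_units)
print(rows[0])

# Perfect predictions score 1.0; always predicting the mean scores 0.0.
print(r_squared([1, 2, 3], [1, 2, 3]))
print(r_squared([1, 2, 3], [2, 2, 2]))
```

Read this way, an R² of 0.737 means the model explains roughly 74% of the variance in the target relative to a mean-only baseline.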
Carnegie Mellon University | Master of Information Systems Management (MISM)
Business Intelligence & Data Analytics | GPA: 4.0/4.0 | Expected Dec 2026
Certifications:
- 🏅 Salesforce AI Associate
- 🏅 Salesforce Advanced Admin & App Builder
- 🏅 Salesforce Advanced Developer & DevOps
- 🏅 Wharton School: AI for Business
- 🌟 Salesforce Ranger: 82,000+ Trailhead points
Software Engineer @ Salesforce (Jul 2023 - Aug 2025)
Built predictive analytics and automation workflows using Python, SQL, and APIs to optimize enterprise asset onboarding for 40K+ users.
Product Analytics & Insights Intern @ SRM Films (May 2023 - Jul 2023)
Analyzed engagement, retention, and churn metrics and developed dashboards to support data-driven content strategy.
Summer Analyst Intern @ Salesforce (May 2022 - Jul 2022)
Developed automation pipelines integrating Slack and MuleSoft APIs to streamline enterprise onboarding workflows.

