I'm a Data Science Master's candidate (expected June 2026) with hands-on experience building end-to-end analytics pipelines at scale, including a 16M+ row machine learning project using U.S. Census data. I specialize in predictive modeling, fairness-aware ML, and translating complex findings into actionable insights for business and policy stakeholders. My background in legal operations and academic administration gives me an edge in data governance, compliance, and communicating with non-technical audiences.
- NYC 311 + Weather Correlation Dashboard: Production-grade ETL pipeline ingesting 941K real NYC civic complaints + NOAA weather data via REST APIs; cleaned with Python & PostgreSQL, visualized in an interactive 5-tab Power BI dashboard with automated daily refresh. Live Dashboard
- Rent Burden Prediction: Fairness & ML analysis on 16M+ ACS PUMS household records (Logistic Regression, Random Forest, Gradient Boosting); equity analysis across race, sex, and geography for HUD policy context
- Home Loan Approval Prediction: ML pipeline on 4.25M real HMDA 2023 mortgage applications; XGBoost ROC-AUC 0.9932, 96.3% accuracy across 121 features
- Marketing Campaign Effectiveness: End-to-end ROI analysis for Nike Inc. using real Google Trends (pytrends API) + SEC EDGAR 10-K filings; ROAS modeling, lag correlation, and 6-panel dashboard in Python
- Customer Churn & CLV Analysis: End-to-end SaaS churn prediction pipeline; synthetic data calibrated to HubSpot 2023 10-K & SaaS Capital benchmarks; Logistic Regression (AUC 0.92) identifying $2.3M annual MRR at risk across 5,000 customers; interactive Tableau dashboard with risk segmentation and intervention ROI modeling. Live Dashboard