ahartshorn416/README.md

Hi! 👋 I'm Alison Hartshorn

💫 About Me:

I'm a Data Science Master's candidate (expected June 2026) with hands-on experience building end-to-end analytics pipelines at scale, including a 16M+ row machine learning project using U.S. Census data. I specialize in predictive modeling, fairness-aware ML, and translating complex findings into actionable insights for business and policy stakeholders. My background in legal operations and academic administration gives me an edge in data governance, compliance, and communicating with non-technical audiences.

🔭 Featured Projects:

  • 🗽 NYC 311 + Weather Correlation Dashboard: Production-grade ETL pipeline ingesting 941K real NYC civic complaints + NOAA weather data via REST APIs; cleaned with Python & PostgreSQL, visualized in an interactive 5-tab Power BI dashboard with automated daily refresh. Live Dashboard
  • 🏠 Rent Burden Prediction: Fairness & ML analysis on 16M+ ACS PUMS household records (Logistic Regression, Random Forest, Gradient Boosting); equity analysis across race, sex, and geography for HUD policy context
  • 🏦 Home Loan Approval Prediction: ML pipeline on 4.25M real HMDA 2023 mortgage applications; XGBoost ROC-AUC 0.9932, 96.3% accuracy across 121 features
  • 📊 Marketing Campaign Effectiveness: End-to-end ROI analysis for Nike Inc. using real Google Trends (pytrends API) + SEC EDGAR 10-K filings; ROAS modeling, lag correlation, and 6-panel dashboard in Python
  • 🔄 Customer Churn & CLV Analysis: End-to-end SaaS churn prediction pipeline; synthetic data calibrated to HubSpot 2023 10-K & SaaS Capital benchmarks; Logistic Regression (AUC 0.92) identifying $2.3M annual MRR at risk across 5,000 customers; interactive Tableau dashboard with risk segmentation and intervention ROI modeling. Live Dashboard
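
For readers unfamiliar with the lag correlation mentioned in the marketing project, here is a minimal sketch of the general idea. It is not the project's code: the function names (`pearson`, `lag_correlations`) and the toy numbers are made up for illustration, and it uses only the Python standard library rather than pandas.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lag_correlations(signal, outcome, max_lag):
    """Correlate signal[t] with outcome[t + lag] for lag = 0..max_lag."""
    results = {}
    for lag in range(max_lag + 1):
        shifted_outcome = outcome[lag:]                   # outcome, `lag` periods later
        aligned_signal = signal[:len(shifted_outcome)]    # trim signal to match
        results[lag] = pearson(aligned_signal, shifted_outcome)
    return results


# Toy data (hypothetical, not from the project): the outcome echoes
# the signal two periods later.
search_interest = [10, 12, 30, 14, 11, 28, 13, 12, 35, 15, 11, 10]
revenue = [5, 5] + [x * 2 for x in search_interest[:-2]]

correlations = lag_correlations(search_interest, revenue, max_lag=4)
best_lag = max(correlations, key=correlations.get)
```

The lag with the highest correlation suggests how many periods the outcome trails the signal. On real data, a high lagged correlation is only a hint of a relationship, not evidence of causation.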

🌐 Socials:

LinkedIn Tableau

💻 Tech Stack:

Python R MySQL PostgreSQL NumPy Pandas Matplotlib scikit-learn XGBoost PyTorch Tableau Power BI SciPy

Pinned

  1. nyc311-weather-dashboard Public

    Does rain make New Yorkers angrier? ETL pipeline + Power BI dashboard correlating 941K NYC 311 complaints with daily weather data. Python · PostgreSQL · Automated daily refresh.

    Python

  2. predicting-rent-burden Public

    ML models predicting U.S. household rent burden using 16M+ ACS survey records, including fairness analysis across race, sex & geography to inform housing policy.

    Python

  3. home_loan_approval_prediction Public

    Predicts U.S. home loan approvals using 4.25M real HMDA 2023 applications: XGBoost, Random Forest, Logistic Regression, ROC-AUC 0.9932

    Python

  4. marketing-effectiveness-analysis Public

    Analyzing marketing campaign ROI for Nike using Google Trends, SEC EDGAR financials, and lag correlation; built in Python with pandas, matplotlib, and pytrends.

    Python

  5. customer-churn-analysis Public

    End-to-end SaaS customer churn prediction and CLV analysis · Logistic regression model (AUC 0.92) · $2.3M MRR at risk identified · Calibrated to HubSpot 2023 10-K and SaaS Capital benchmarks · Pyth…

    Jupyter Notebook