Data Analyst based in Nouakchott, Mauritania 🇲🇷
Google Certified · Python · SQL · Power BI · Machine Learning
🔗 LinkedIn · Available for Data Analyst / BI roles
End-to-end retail analytics · PACE framework · Statistical validation
Full analytical pipeline on 9,994 transactions: EDA → T-Test / ANOVA / Chi² → K-Means (k=4) → Ridge regression with K-Fold CV per cluster. Delivered quantified pricing recommendations.
- Segments: VIP (+$5,912), Loyal (+$1,323), Occasional (+$290), At-Risk (−$212)
- Per-cluster R²: 0.57 · 0.99 · 0.32 · 0.80; three of the four beat global OLS (R²=0.34)
- Business impact: ~$23,318 recoverable profit (Furniture discount cap)
- Stack: Python · scipy · statsmodels · scikit-learn (K-Means, Ridge, K-Fold CV)
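
The cluster-then-regress step described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project's actual code: the feature matrix, the target, `alpha=1.0`, and the 5-fold split are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for transaction features (hypothetical columns,
# e.g. sales, discount, quantity) and a profit-like target.
X = rng.normal(size=(400, 3))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=400)

# Scale before K-Means so no single feature dominates the distances.
X_scaled = StandardScaler().fit_transform(X)
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X_scaled)

# Fit one Ridge model per cluster and score it with K-Fold cross-validation.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
r2_by_cluster = {}
for k in range(4):
    mask = labels == k
    scores = cross_val_score(
        Ridge(alpha=1.0), X_scaled[mask], y[mask], cv=cv, scoring="r2"
    )
    r2_by_cluster[k] = float(scores.mean())
    print(f"cluster {k}: n={mask.sum()}, mean CV R2={r2_by_cluster[k]:.2f}")
```

Scoring each cluster separately is what makes a per-cluster R² comparable against a single global model fitted on all rows.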
🏆 Kaggle Competition entry — Playground Series S4E1
Predicted bank customer churn at a competitive level. Trained and benchmarked three models (Logistic Regression, XGBoost with default parameters, XGBoost with tuned hyperparameters).
- Public leaderboard: ~0.887 AUC-ROC ⭐ (surpassing the ~0.88 benchmark)
- Critical segment identified: Germany (37% churn, double France/Spain)
- Method: GridSearchCV · SMOTE on train only · 3-model comparison
- Stack: Python · pandas · scikit-learn · imbalanced-learn · XGBoost
Capstone — Google Advanced Data Analytics Certificate
Predicted employee attrition at Salifort Motors (14,999 employees) using Random Forest + SMOTE, following the PACE framework.
- Model: Random Forest with GridSearchCV (5-fold CV)
- Performance: 98% accuracy · ROC-AUC 0.997 · 92% recall on leavers
- Key insight: Workload and tenure drive turnover — NOT salary
- Stack: Python · pandas · scikit-learn · imbalanced-learn · seaborn
Capstone — Google Data Analytics Certificate
Analyzed ride patterns of Cyclistic members versus casual riders to inform a marketing strategy for converting casual riders into annual members.
- Focus: Exploratory data analysis + dashboard design
- Deliverable: Interactive dashboard with actionable marketing recommendations
- Stack: SQL · Data cleaning · EDA · Visualization · Dashboard
Languages: Python · SQL
Data & BI: pandas · numpy · Power BI · BigQuery · Google Sheets
Machine Learning: scikit-learn · XGBoost · imbalanced-learn · Random Forest · K-Means · Ridge · Linear Regression
Statistics: scipy · statsmodels · hypothesis testing (T-Test, ANOVA, Chi²)
Visualization: matplotlib · seaborn · Power BI
Automation: n8n · Google Workspace APIs
Frameworks: PACE · CRISP-DM
French (native) · Arabic (native) · German (C1) · English (B2)
- Google Advanced Data Analytics Professional Certificate
- Google Business Intelligence Professional Certificate
- Google Data Analytics Professional Certificate
Projects cover supervised ML (classification · regression), unsupervised ML (clustering), statistical inference, and BI dashboarding — a full Data Analyst toolkit.