MSc Applied Mathematics & Statistics (Data Science) — Université de Caen Normandie 🇫🇷
I build end-to-end ML projects (data → features → model → evaluation) with clean, reproducible pipelines.
🎯 Seeking a Data Science / Machine Learning internship (Feb–Aug 2026)
Open to: Data Science • Machine Learning • Data Engineering (ETL) • Analytics • Cloud (Azure)
- 📌 Focus: Fraud detection, imbalanced classification, feature engineering, model evaluation
- 🧱 Also: ETL / data warehouse (star schema), SQL, data quality checks
- ☁️ Cloud: Azure ML (DP-100 prep), experiments, pipelines, MLOps fundamentals
Tech: PySpark • Spark MLlib • Python • SQL • (Optional) Superset
What I did:
- Distributed pipeline: load + join identity table, cleaning, missing-value handling
- EDA: `isFraud` distribution, key drivers, first engineered features
- Supervised modeling + metrics tracking (AUC, precision, recall)
➡️ Repo: https://github.com/elfahad98/ieee-fraud-pyspark
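
To make the AUC metric listed above concrete, here is a minimal sketch in plain Python. It is not taken from the repo and does not use Spark MLlib; the function name `roc_auc` and the toy data are assumptions for illustration. It uses the rank-based formulation: AUC is the probability that a randomly chosen fraud case gets a higher score than a randomly chosen legitimate one.

```python
def roc_auc(labels, scores):
    """Rank-based AUC (hypothetical helper, labels are 0 or 1)."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank shared by the ties
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    pos_rank_sum = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    # Mann-Whitney U statistic, normalised by the number of (pos, neg) pairs.
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy example: 2 legitimate and 2 fraudulent transactions.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

On heavily imbalanced fraud data, AUC is usually read alongside precision and recall, since it can look strong while precision on the positive class stays low.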
Tech: Python • scikit-learn • XGBoost • Pandas • NumPy
What I did:
- Behavioral features (time patterns, frequency, device signals, etc.)
- Imbalanced learning, model benchmarking, and decision-threshold tuning to keep false positives low
➡️ Repo: https://github.com/elfahad98/ato-fraud-detection-mlp
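
One way the thresholding idea can be sketched, assuming a cap on the false-positive rate is the constraint: scan candidate thresholds from high to low and keep the lowest one that still respects the cap. The function `pick_threshold`, the `max_fpr` parameter, and the toy scores are hypothetical and are not the repo's code.

```python
def pick_threshold(labels, scores, max_fpr=0.01):
    """Lowest score threshold whose false-positive rate stays within max_fpr.

    Hypothetical helper: labels are 0/1, scores are model probabilities.
    Returns None if even the strictest threshold breaks the cap.
    """
    n_neg = sum(1 for y in labels if y == 0)
    n_pos = len(labels) - n_neg
    best = None
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        fpr = fp / n_neg if n_neg else 0.0
        if fpr > max_fpr:
            break  # FPR can only grow as the threshold drops further
        best = {
            "threshold": t,
            "fpr": fpr,
            "recall": tp / n_pos if n_pos else 0.0,
            "precision": tp / (tp + fp) if tp + fp else 0.0,
        }
    return best


# Toy data: 5 legitimate (0) and 3 fraudulent (1) transactions.
labels = [0, 0, 0, 0, 1, 1, 0, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9, 0.05, 0.4]
strict = pick_threshold(labels, scores, max_fpr=0.0)
relaxed = pick_threshold(labels, scores, max_fpr=0.25)
```

Loosening the cap trades a few false alarms for higher recall, which is the trade-off the benchmarking step is meant to expose.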
Tech: Apache Hop • PostgreSQL • SQL
What I did:
- Star schema design (fact + dimensions)
- ETL workflows: ingestion, cleaning, quality checks, error handling
- (Concepts) Slowly Changing Dimensions (SCD2)
➡️ Repo: https://github.com/elfahad98/etl-datawarehouse
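
The SCD2 concept can be illustrated with a small in-memory sketch in Python. This is only an assumption-laden illustration, not the Apache Hop workflow or the PostgreSQL schema from the repo: the function `apply_scd2` and the column names (`key`, `attrs`, `valid_from`, `valid_to`, `is_current`) are hypothetical, and surrogate keys are left out for brevity.

```python
from datetime import date


def apply_scd2(dimension, key, new_attrs, effective):
    """Type 2 update: close the current row if attributes changed, then add a new version."""
    current = next(
        (row for row in dimension if row["key"] == key and row["is_current"]),
        None,
    )
    if current is not None:
        if current["attrs"] == new_attrs:
            return dimension  # nothing changed, history stays as is
        # Expire the old version instead of overwriting it.
        current["valid_to"] = effective
        current["is_current"] = False
    dimension.append({
        "key": key,
        "attrs": dict(new_attrs),
        "valid_from": effective,
        "valid_to": None,  # open-ended: this is the live version
        "is_current": True,
    })
    return dimension


# A customer moves city: the old row is closed, a new row becomes current.
dim = []
apply_scd2(dim, "C1", {"city": "Caen"}, date(2024, 1, 1))
apply_scd2(dim, "C1", {"city": "Caen"}, date(2024, 6, 1))   # no change, no new row
apply_scd2(dim, "C1", {"city": "Paris"}, date(2025, 1, 1))  # change, new version
```

Keeping both rows lets a fact recorded in 2024 still join to the attributes that were valid at that time.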
- Email: elfahad98@gmail.com



