Exchange rate volatility significantly impacts businesses, investors, and individuals in emerging economies like Nigeria.
This project implements a production-oriented machine learning system for forecasting the USD/NGN exchange rate, designed to handle the full ML lifecycle:
- Feature engineering
- Model training and selection
- Monitoring and drift detection
- Automated retraining
- API-based inference
Unlike typical notebook-based projects, this system is built to be robust, extensible, and deployment-ready.
- 📊 End-to-end ML pipeline (data → training → deployment → monitoring)
- ⚙️ Shared feature pipeline to prevent training-serving skew
- 🤖 Multi-model training (Linear, Random Forest, XGBoost)
- 🏆 Automated model selection based on RMSE
- 📉 Drift detection using statistical tests (KS-test)
- 🔁 Automated retraining pipeline
- 🚀 FastAPI-based prediction service
- 🧾 Prediction logging (SQLite, Postgres-ready)
git clone https://github.com/Udeibom/exchange-rate-forecasting.git
cd exchange-rate-forecasting
python -m venv .venv
source .venv/bin/activate   # Linux/Mac
.venv\Scripts\activate      # Windows
pip install -r requirements.txt
python training/train_all.py
uvicorn api.main:app --reload
POST http://127.0.0.1:8000/predict
- Target: USD/NGN closing exchange rate
- Task: Regression
- Horizons:
  - 1-day ahead (primary)
  - 7-day ahead (secondary)
Evaluation metrics:
- RMSE (primary)
- MAE
- MAPE
Validation uses time-ordered splits only: every validation fold comes strictly after its training data, so no future observations leak into training.
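The metrics and the validation rule above can be sketched in a few lines of dependency-free Python. This is an illustration only: the function names (`rmse`, `mae`, `mape`, `expanding_window_splits`) and the split parameters are assumptions and are not taken from the project's `evaluation/metrics.py`.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero, which holds for an exchange rate.
    """
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) pairs in time order.

    Each test block starts where its training block ends, so the
    training data never contains observations from the future.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for k in range(n_splits):
        start = first_test_start + k * test_size
        yield list(range(0, start)), list(range(start, start + test_size))


# Usage: every training index precedes every test index.
for train_idx, test_idx in expanding_window_splits(n_samples=10, n_splits=2, test_size=3):
    assert max(train_idx) < min(test_idx)
```

The training window grows with each fold, which mirrors how a deployed model would be refit as new rates arrive.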
     ┌────────────────────┐
     │    Raw FX Data     │
     └─────────┬──────────┘
               ▼
     ┌────────────────────┐
     │  Feature Pipeline  │
     └─────────┬──────────┘
               ▼
┌────────────────────────────────────┐
│    Model Training (LR, RF, XGB)    │
└──────────────┬─────────────────────┘
               ▼
     ┌────────────────────┐
     │ Best Model Selector│
     └─────────┬──────────┘
               ▼
     ┌────────────────────┐
     │   Model Registry   │
     └─────────┬──────────┘
               ▼
     ┌────────────────────┐
     │ FastAPI Inference  │
     └─────────┬──────────┘
               ▼
     ┌────────────────────┐
     │ Monitoring & Drift │
     └─────────┬──────────┘
               ▼
     ┌────────────────────┐
     │ Automated Retrain  │
     └────────────────────┘
A shared feature pipeline ensures consistency between training and inference.
Features include:
- Lag features (1, 7, 14, 30 days)
- Rolling mean and standard deviation
- Returns and volatility
- Calendar-based features
The pipeline follows a scikit-learn-style API (fit, transform).
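To make the fit/transform idea concrete, here is a minimal sketch using only the standard library. The class `FeaturePipeline`, its parameters, and the feature names are hypothetical and do not reproduce the project's `features/pipeline.py`. The point it illustrates is that training and serving call the same `transform`, and that each row uses only values from before the day being predicted.

```python
from datetime import date, timedelta
from statistics import mean, stdev


class FeaturePipeline:
    """Illustrative stand-in, not the project's features/pipeline.py."""

    def __init__(self, lags=(1, 7, 14, 30), window=7):
        if window < 3:
            raise ValueError("window must be >= 3 to compute a return volatility")
        self.lags = tuple(lags)
        self.window = window
        self.feature_names_ = None

    def fit(self, dates, rates):
        # Nothing is estimated from the data in this sketch; fit() only
        # freezes the column order that training and serving must share.
        self.feature_names_ = (
            [f"lag_{k}" for k in self.lags]
            + ["roll_mean", "roll_std", "return_1d", "volatility",
               "day_of_week", "month"]
        )
        return self

    def transform(self, dates, rates):
        if self.feature_names_ is None:
            raise RuntimeError("call fit() before transform()")
        start = max(max(self.lags), self.window)
        rows = []
        for t in range(start, len(rates)):
            past = rates[t - self.window:t]  # values strictly before day t
            returns = [past[i] / past[i - 1] - 1.0 for i in range(1, len(past))]
            rows.append(
                [rates[t - k] for k in self.lags]
                + [mean(past), stdev(past), returns[-1], stdev(returns),
                   dates[t].weekday(), dates[t].month]
            )
        return rows


# Usage with synthetic data (not real USD/NGN rates).
dates = [date(2024, 1, 1) + timedelta(days=i) for i in range(40)]
rates = [1000.0 + i for i in range(40)]
pipeline = FeaturePipeline().fit(dates, rates)
features = pipeline.transform(dates, rates)
print(dict(zip(pipeline.feature_names_, features[0])))
```

The first 30 days produce no rows because the longest lag needs 30 days of history.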
Models trained:
- Linear Regression (baseline)
- Random Forest Regressor
- XGBoost Regressor
Models are evaluated using time-series cross-validation, and the model with the lowest RMSE is selected automatically.
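The selection step can be sketched as follows. To keep the example dependency-free, three simple forecasters (`naive_last`, `moving_average`, `linear_drift`) stand in for Linear Regression, Random Forest, and XGBoost; they are assumptions for illustration, as are the helper names and fold sizes. The selection logic is what matters here: score every candidate on the same time-ordered folds and keep the lowest mean RMSE.

```python
import math


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Stand-in forecasters: each maps (history, horizon) to a list of forecasts.
def naive_last(history, horizon):
    return [history[-1]] * horizon


def moving_average(history, horizon, window=7):
    recent = history[-window:]
    return [sum(recent) / len(recent)] * horizon


def linear_drift(history, horizon):
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(horizon)]


def cross_validated_rmse(forecaster, series, n_splits=3, test_size=5):
    """Mean RMSE over expanding-window folds; test data always follows training data."""
    first_test_start = len(series) - n_splits * test_size
    if first_test_start < 2:
        raise ValueError("series too short for the requested folds")
    scores = []
    for k in range(n_splits):
        start = first_test_start + k * test_size
        train, test = series[:start], series[start:start + test_size]
        scores.append(rmse(test, forecaster(train, len(test))))
    return sum(scores) / len(scores)


def select_best(candidates, series):
    """Return the name of the lowest-RMSE candidate and all scores."""
    scores = {name: cross_validated_rmse(fn, series) for name, fn in candidates.items()}
    return min(scores, key=scores.get), scores


# Usage on a synthetic, perfectly linear series (not real FX data).
series = [1000.0 + 2.0 * i for i in range(30)]
candidates = {
    "naive_last": naive_last,
    "moving_average": moving_average,
    "linear_drift": linear_drift,
}
best_name, scores = select_best(candidates, series)
print(best_name, scores)
```

Including a naive last-value forecaster among the candidates gives a floor that any trained model should beat.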
The model registry:
- Stores trained models and pipelines
- Tracks feature schema
- Enables reproducible inference
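One way to picture the registry is a directory per model holding the serialized model next to a small metadata file with the feature schema. The sketch below assumes that layout; the function names `save_model` and `load_model`, the file names, and the metadata fields are all hypothetical and may differ from how the project writes to `models/` or `artifacts/`.

```python
import json
import pickle
import tempfile
from pathlib import Path


def save_model(registry_dir, name, model, feature_names, metrics):
    """Persist a model together with the feature schema it was trained on."""
    model_dir = Path(registry_dir) / name
    model_dir.mkdir(parents=True, exist_ok=True)
    with open(model_dir / "model.pkl", "wb") as fh:
        pickle.dump(model, fh)
    metadata = {"feature_names": list(feature_names), "metrics": metrics}
    (model_dir / "metadata.json").write_text(json.dumps(metadata, indent=2))
    return model_dir


def load_model(registry_dir, name, expected_features=None):
    """Load a model and refuse to serve it if the feature schema has changed."""
    model_dir = Path(registry_dir) / name
    metadata = json.loads((model_dir / "metadata.json").read_text())
    if expected_features is not None and list(expected_features) != metadata["feature_names"]:
        raise ValueError("feature schema mismatch between pipeline and stored model")
    # pickle can execute arbitrary code on load; only load files you created.
    with open(model_dir / "model.pkl", "rb") as fh:
        model = pickle.load(fh)
    return model, metadata


# Usage: a plain dict stands in for a trained estimator.
with tempfile.TemporaryDirectory() as registry:
    save_model(registry, "best", {"coef": [0.9, 0.1]}, ["lag_1", "lag_7"], {"rmse": 12.3})
    model, metadata = load_model(registry, "best", expected_features=["lag_1", "lag_7"])
    print(model, metadata["metrics"])
```

Checking the schema at load time turns a silent training-serving mismatch into an explicit error.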
Monitoring includes:
- Rolling error tracking
- KS-test for distribution shift
- Backtesting against naive baseline
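The KS-based drift check in the list above can be illustrated without any dependencies. The sketch computes the two-sample Kolmogorov-Smirnov statistic directly and compares it with the standard large-sample critical value; in practice a library routine such as `scipy.stats.ks_2samp` would be the usual choice. The function names and the default `alpha` are assumptions, not the contents of `evaluation/drift.py`.

```python
import math


def ks_statistic(reference, current):
    """Largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    n, m = len(ref), len(cur)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(ref[i], cur[j])
        # Advance both samples past every value <= x, then compare the CDFs at x.
        while i < n and ref[i] <= x:
            i += 1
        while j < m and cur[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the KS statistic exceeds the large-sample critical value."""
    n, m = len(reference), len(current)
    critical = math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))
    d = ks_statistic(reference, current)
    return d > critical, d, critical


# Usage with synthetic feature values (not real FX data).
reference = [float(i) for i in range(100)]
shifted = [value + 1000.0 for value in reference]
print(drift_detected(reference, reference))
print(drift_detected(reference, shifted))
```

The critical-value formula is an approximation that works best for reasonably large samples; for small monitoring windows an exact p-value is the safer choice.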
Retraining is triggered when:
- New data is available
- Drift thresholds are exceeded
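The two trigger conditions can be combined in a small decision function. This is a sketch under stated assumptions: `should_retrain`, its arguments, and the `ks_threshold` default of 0.2 are hypothetical and do not describe the project's `training/retrain.py` or `training/scheduler.py`.

```python
from datetime import date


def should_retrain(last_trained_on, latest_data_date, ks_statistics, ks_threshold=0.2):
    """Return (retrain, reasons) for the two triggers: new data and drift.

    ks_statistics maps a feature name to its latest KS statistic.
    """
    reasons = []
    if latest_data_date > last_trained_on:
        reasons.append("new_data")
    drifted = sorted(name for name, stat in ks_statistics.items() if stat > ks_threshold)
    if drifted:
        reasons.append("drift:" + ",".join(drifted))
    return bool(reasons), reasons


# Usage: newer data plus one drifting feature.
decision = should_retrain(
    last_trained_on=date(2024, 1, 31),
    latest_data_date=date(2024, 2, 7),
    ks_statistics={"lag_1": 0.35, "roll_mean": 0.05},
)
print(decision)
```

Returning the reasons alongside the boolean makes each retraining run easy to audit later.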
POST /predict
Request → Feature Pipeline → Model → Prediction → Logging
Predictions are stored in SQLite (Postgres-ready).
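The logging step at the end of the flow might look like the following, using Python's built-in `sqlite3` module. The table name, columns, and helper functions are assumptions made for illustration; the project's actual schema is not shown in this README.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema; the real table layout may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS predictions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    created_at TEXT NOT NULL,
    model_name TEXT NOT NULL,
    horizon_days INTEGER NOT NULL,
    predicted_rate REAL NOT NULL
)
"""


def open_log(path=":memory:"):
    """Open (or create) the prediction log; the default is an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def log_prediction(conn, model_name, horizon_days, predicted_rate):
    """Insert one prediction and return its row id."""
    created_at = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back if the insert raises
        cursor = conn.execute(
            "INSERT INTO predictions (created_at, model_name, horizon_days, predicted_rate) "
            "VALUES (?, ?, ?, ?)",
            (created_at, model_name, horizon_days, predicted_rate),
        )
    return cursor.lastrowid


# Usage with an illustrative value (not a real forecast).
conn = open_log()
row_id = log_prediction(conn, "xgboost", 1, 1500.25)
print(row_id, conn.execute("SELECT COUNT(*) FROM predictions").fetchone())
```

Keeping the SQL in parameterized statements limits the changes needed when moving to Postgres, although the placeholder style and the auto-increment syntax differ between the two databases.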
.
├── api/
│   └── main.py
├── data/
├── features/
│   ├── build_features.py
│   └── pipeline.py
├── training/
│   ├── train_all.py
│   ├── retrain.py
│   └── scheduler.py
├── evaluation/
│   ├── metrics.py
│   ├── drift.py
│   ├── backtesting.py
│   └── report.py
├── models/
├── artifacts/
├── README.md
└── requirements.txt
Key design decisions:
- Shared feature pipeline to eliminate training-serving skew
- Time-series validation to prevent data leakage
- Automated model selection based on RMSE
- Drift detection (KS-test) for monitoring
- Modular architecture for scalability
Planned improvements:
- Docker + cloud deployment
- CI/CD pipelines
- Alerting system (Slack / Email)
- Real-time data ingestion
- Shadow model evaluation
Caleb Udeibom
Machine Learning Engineer