Udeibom/rate-forcaster

End-to-End Exchange Rate Forecasting System (USD/NGN)

Overview

Exchange rate volatility significantly impacts businesses, investors, and individuals in emerging economies like Nigeria.

This project implements a production-oriented machine learning system for forecasting the USD/NGN exchange rate, designed to handle the full ML lifecycle:

  • Feature engineering
  • Model training and selection
  • Monitoring and drift detection
  • Automated retraining
  • API-based inference

Unlike a typical notebook-based project, the system is split into separate modules for features, training, evaluation, and serving, which keeps it robust, extensible, and deployment-ready.


Key Features

  • 📊 End-to-end ML pipeline (data → training → deployment → monitoring)
  • ⚙️ Shared feature pipeline to prevent training-serving skew
  • 🤖 Multi-model training (Linear, Random Forest, XGBoost)
  • 🏆 Automated model selection based on RMSE
  • 📉 Drift detection using statistical tests (KS-test)
  • 🔁 Automated retraining pipeline
  • 🚀 FastAPI-based prediction service
  • 🧾 Prediction logging (SQLite, Postgres-ready)

Quick Start

Run Locally

git clone https://github.com/Udeibom/exchange-rate-forecasting.git

cd exchange-rate-forecasting

python -m venv .venv
source .venv/bin/activate   # Linux/Mac
.venv\Scripts\activate      # Windows

pip install -r requirements.txt
python training/train_all.py

uvicorn api.main:app --reload

Make a Prediction

POST http://127.0.0.1:8000/predict
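The request body for this endpoint is not documented here, so the following is only a sketch of how a client might build the call with the Python standard library. The `horizon_days` field is a hypothetical placeholder, not the service's real schema. FastAPI serves interactive documentation at `/docs` by default, which is the place to check the actual fields once the server is running.

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:8000/predict"  # local address from the Quick Start


def build_predict_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for /predict."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# HYPOTHETICAL payload: the field name is a placeholder, not the real schema.
request = build_predict_request({"horizon_days": 1})

# With the server running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```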


Forecasting Objective

  • Target: USD/NGN closing exchange rate
  • Task: Regression
  • Horizons:
    • 1-day ahead (primary)
    • 7-day ahead (secondary)

Evaluation Metrics

  • RMSE (primary)
  • MAE
  • MAPE
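For reference, the three metrics can be written in a few lines of dependency-free Python. This is a minimal sketch of the standard definitions, not the contents of `evaluation/metrics.py`; the sample rates are made up for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalizes large misses more than MAE does."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error, in the same units as the rate (NGN per USD)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Assumes no actual value is 0."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Made-up rates, for illustration only
actual = [1500.0, 1510.0, 1490.0]
predicted = [1505.0, 1500.0, 1490.0]
```

RMSE is a natural primary metric when large misses are the most costly, since squaring the errors weights them more heavily than MAE does.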

Validation is strictly chronological: each validation fold comes after the data used to train for it, so no future information leaks into training.
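One common way to get leakage-free folds is an expanding-window split, sketched below in plain Python. The function name and parameters are illustrative assumptions; scikit-learn's `TimeSeriesSplit` offers the same idea and may well be what the training code uses.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Each fold trains on everything before its test window, so the model
    never sees observations from the future.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for fold in range(n_splits):
        test_start = first_test_start + fold * test_size
        test_end = test_start + test_size
        yield list(range(test_start)), list(range(test_start, test_end))


# 10 observations, 3 folds, 2 test points per fold
splits = list(expanding_window_splits(n_samples=10, n_splits=3, test_size=2))
```

Here the first fold trains on indices 0 to 3 and tests on 4 and 5; each later fold extends the training window by the previous test window.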


System Architecture

         ┌────────────────────┐
         │   Raw FX Data       │
         └─────────┬──────────┘
                   ▼
         ┌────────────────────┐
         │ Feature Pipeline   │
         └─────────┬──────────┘
                   ▼
┌────────────────────────────────────┐
│   Model Training (LR, RF, XGB)     │
└──────────────┬─────────────────────┘
               ▼
     ┌────────────────────┐
     │ Best Model Selector│
     └─────────┬──────────┘
               ▼
     ┌────────────────────┐
     │ Model Registry     │
     └─────────┬──────────┘
               ▼
     ┌────────────────────┐
     │ FastAPI Inference  │
     └─────────┬──────────┘
               ▼
     ┌────────────────────┐
     │ Monitoring & Drift │
     └─────────┬──────────┘
               ▼
     ┌────────────────────┐
     │ Automated Retrain  │
     └────────────────────┘

Feature Engineering

A shared feature pipeline ensures consistency between training and inference.

Features include:

  • Lag features (1, 7, 14, 30 days)
  • Rolling mean and standard deviation
  • Returns and volatility
  • Calendar-based features

The pipeline follows a scikit-learn-style API (fit, transform).
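To make the fit/transform idea concrete, here is a dependency-free sketch covering a subset of the features listed above (lags, rolling statistics, a one-day return, and calendar fields). The class name, feature names, and 7-day window are assumptions made for illustration and are not taken from `features/pipeline.py`.

```python
from datetime import date, timedelta
from statistics import mean, stdev


class FeaturePipeline:
    """Hypothetical fit/transform pipeline over (date, rate) observations."""

    def __init__(self, lags=(1, 7, 14, 30), window=7):
        self.lags = lags
        self.window = window  # must be >= 2 so a standard deviation exists
        self.feature_names_ = None

    def fit(self, dates, rates):
        # Nothing is learned from the data; fit only records the output schema.
        self.feature_names_ = [f"lag_{k}" for k in self.lags] + [
            "rolling_mean", "rolling_std", "return_1d", "day_of_week", "month",
        ]
        return self

    def transform(self, dates, rates):
        """Build one feature row per date that has enough history."""
        if self.feature_names_ is None:
            raise RuntimeError("call fit() before transform()")
        start = max(max(self.lags), self.window, 2)
        rows = []
        for t in range(start, len(rates)):
            # Only values strictly before index t are used, so the rate
            # being predicted never leaks into its own features.
            past = rates[t - self.window:t]
            values = [rates[t - k] for k in self.lags]
            values += [
                mean(past),
                stdev(past),
                rates[t - 1] / rates[t - 2] - 1.0,
                dates[t].weekday(),
                dates[t].month,
            ]
            rows.append(dict(zip(self.feature_names_, values)))
        return rows


# Synthetic series, not real FX data
dates = [date(2024, 1, 1) + timedelta(days=i) for i in range(40)]
rates = [1500.0 + i for i in range(40)]
pipeline = FeaturePipeline().fit(dates, rates)
rows = pipeline.transform(dates, rates)
```

Because training and serving would both call the same `transform`, a feature can only be computed one way, which is how a shared pipeline avoids training-serving skew.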


Models

  • Linear Regression (baseline)
  • Random Forest Regressor
  • XGBoost Regressor

Models are evaluated using time-series cross-validation, and the model with the lowest RMSE is selected automatically.
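The selection step itself can be very small. The sketch below assumes each candidate has already produced one RMSE per validation fold and picks the lowest mean; the function name and the scores are invented for illustration and are not results from this project.

```python
from statistics import mean


def select_best_model(fold_rmse_by_model):
    """Return (name, mean_rmse) for the model with the lowest mean CV RMSE."""
    mean_rmse = {name: mean(scores) for name, scores in fold_rmse_by_model.items()}
    best_name = min(mean_rmse, key=mean_rmse.get)
    return best_name, mean_rmse[best_name]


# Made-up per-fold RMSE values, for illustration only
cv_scores = {
    "linear_regression": [12.0, 14.0, 13.0],
    "random_forest": [10.0, 11.0, 12.0],
    "xgboost": [9.0, 13.0, 11.5],
}
best_name, best_rmse = select_best_model(cv_scores)
```

Averaging over folds rather than taking the single best fold keeps one lucky validation window from deciding the winner.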


Model Lifecycle Management

Model Registry

  • Stores trained models and pipelines
  • Tracks feature schema
  • Enables reproducible inference
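A file-based registry with these three properties might look roughly like the sketch below. The directory layout, file names, and function names are assumptions for illustration; the project's actual registry format is not described here.

```python
import json
import pickle
import tempfile
from pathlib import Path


def save_to_registry(registry_dir, version, model, feature_names):
    """Store a model and its feature schema under registry_dir/version."""
    target = Path(registry_dir) / version
    target.mkdir(parents=True, exist_ok=False)  # never overwrite a version
    (target / "model.pkl").write_bytes(pickle.dumps(model))
    (target / "schema.json").write_text(json.dumps({"features": feature_names}))
    return target


def load_from_registry(registry_dir, version, incoming_features):
    """Load a model, refusing to serve if the feature schema differs."""
    target = Path(registry_dir) / version
    schema = json.loads((target / "schema.json").read_text())
    if schema["features"] != list(incoming_features):
        raise ValueError("feature schema mismatch")
    return pickle.loads((target / "model.pkl").read_bytes())


# Stand-in "model": any picklable object works for the sketch
with tempfile.TemporaryDirectory() as tmp:
    save_to_registry(tmp, "v1", {"coef": [0.5]}, ["lag_1"])
    restored = load_from_registry(tmp, "v1", ["lag_1"])
```

Checking the schema at load time turns a silent feature mismatch into an immediate error, which is the point of tracking it alongside the model.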

Monitoring & Drift Detection

  • Rolling error tracking
  • KS-test for distribution shift
  • Backtesting against naive baseline
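The two-sample KS statistic is the largest gap between the empirical CDFs of a reference sample (for example, training data) and a recent sample. The sketch below computes it in plain Python to show the idea. The `drift_detected` helper and its 0.3 threshold are hypothetical. In practice `scipy.stats.ks_2samp` is the usual choice, since it also returns a p-value.

```python
def ks_statistic(reference, current):
    """Two-sample KS statistic: max gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    i = j = 0
    max_gap = 0.0
    while i < len(ref) and j < len(cur):
        x = min(ref[i], cur[j])
        # Advance past every value equal to x in both samples (handles ties)
        while i < len(ref) and ref[i] == x:
            i += 1
        while j < len(cur) and cur[j] == x:
            j += 1
        max_gap = max(max_gap, abs(i / len(ref) - j / len(cur)))
    return max_gap


def drift_detected(reference, current, threshold=0.3):
    """Flag drift when the statistic exceeds a HYPOTHETICAL threshold."""
    return ks_statistic(reference, current) > threshold
```

A statistic of 0 means the two samples have identical empirical distributions, and 1 means they do not overlap at all.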

Automated Retraining

Triggered when:

  • New data is available
  • Drift thresholds are exceeded
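The trigger logic can be expressed as a small pure function, sketched below. The snapshot fields, reason labels, and threshold values are all assumptions for illustration; the real thresholds would live in the project's retraining code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class MonitoringSnapshot:
    last_trained_on: date    # most recent date in the current training data
    latest_data_date: date   # most recent date available now
    ks_statistic: float      # drift score for a monitored feature
    rolling_rmse: float      # recent prediction error


def retrain_reasons(snapshot, ks_threshold=0.3, rmse_threshold=25.0):
    """Return the triggers that fired; an empty list means no retrain."""
    reasons = []
    if snapshot.latest_data_date > snapshot.last_trained_on:
        reasons.append("new_data")
    if snapshot.ks_statistic > ks_threshold:
        reasons.append("feature_drift")
    if snapshot.rolling_rmse > rmse_threshold:
        reasons.append("error_drift")
    return reasons


# Example snapshots with made-up values
quiet = MonitoringSnapshot(date(2024, 1, 5), date(2024, 1, 5), 0.1, 10.0)
stale = MonitoringSnapshot(date(2024, 1, 1), date(2024, 1, 5), 0.4, 10.0)
```

Returning the reasons rather than a bare boolean lets a scheduler log why a retrain was started.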

Prediction API

Endpoint

POST /predict

Flow

Request → Feature Pipeline → Model → Prediction → Logging

Predictions are stored in SQLite (Postgres-ready).
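The logging step at the end of the flow could be as simple as the following `sqlite3` sketch. The table and column names are hypothetical, and an in-memory database stands in for the real file. A move to Postgres would need a different driver and placeholder syntax, so treat this as SQLite-only.

```python
import sqlite3
from datetime import datetime, timezone


def init_log(conn):
    """Create the (hypothetical) predictions table if it does not exist."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS predictions (
               id INTEGER PRIMARY KEY,
               created_at TEXT NOT NULL,
               horizon_days INTEGER NOT NULL,
               predicted_rate REAL NOT NULL,
               model_version TEXT NOT NULL
           )"""
    )


def log_prediction(conn, horizon_days, predicted_rate, model_version):
    """Insert one prediction with a UTC timestamp."""
    conn.execute(
        "INSERT INTO predictions "
        "(created_at, horizon_days, predicted_rate, model_version) "
        "VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), horizon_days,
         predicted_rate, model_version),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # a file path would be used in practice
init_log(conn)
log_prediction(conn, 1, 1525.4, "v1")  # made-up prediction
logged = conn.execute(
    "SELECT horizon_days, predicted_rate, model_version FROM predictions"
).fetchall()
```

Storing the model version with each prediction makes it possible to compare rolling error across model versions later.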


Project Structure

.
├── api/
│   └── main.py
├── data/
├── features/
│   ├── build_features.py
│   └── pipeline.py
├── training/
│   ├── train_all.py
│   ├── retrain.py
│   └── scheduler.py
├── evaluation/
│   ├── metrics.py
│   ├── drift.py
│   ├── backtesting.py
│   └── report.py
├── models/
├── artifacts/
├── README.md
└── requirements.txt


Key Engineering Decisions

  • Shared feature pipeline to eliminate training-serving skew
  • Time-series validation to prevent data leakage
  • Automated model selection based on RMSE
  • Drift detection (KS-test) for monitoring
  • Modular architecture for scalability

Future Improvements

  • Docker + cloud deployment
  • CI/CD pipelines
  • Alerting system (Slack / Email)
  • Real-time data ingestion
  • Shadow model evaluation

Author

Caleb Udeibom
Machine Learning Engineer
