Pearls AQI Predictor - Islamabad Forecast Lab

Pearls AQI Predictor is my end-to-end Data Sciences internship project. It forecasts the Air Quality Index (AQI) for Islamabad, Pakistan, for the next 3 days.

I built this as a working ML product, not just a notebook. The system collects live air quality data, engineers features, stores them in MongoDB Atlas, trains multiple models automatically, registers champion models, and serves predictions through a FastAPI backend and a Next.js dashboard.

Live Links

Item	Link
Live frontend	https://pearls-aqi.vercel.app/
Forecast dashboard	https://pearls-aqi.vercel.app/dashboard
Backend API	https://aqi-predictor-api-cuec.onrender.com
FastAPI docs	https://aqi-predictor-api-cuec.onrender.com/docs
Final report PDF	documentation/Pearls_AQI_Predictor_Final_Internship_Report.pdf

Project Screenshots

What The System Does

Predicts Islamabad AQI for Day +1, Day +2, and Day +3.
Uses live API-based weather and pollutant data from Open-Meteo.
Stores processed features in MongoDB Atlas as a cloud feature store.
Trains multiple models: Ridge, Random Forest, Gradient Boosting, and MLP Neural Net.
Evaluates models using RMSE, MAE, and R2.
Selects champion models dynamically instead of hardcoding a winner.
Stores model registry metadata and model binaries through MongoDB Atlas and GridFS.
Runs automated feature and training pipelines using GitHub Actions.
Serves predictions through a deployed FastAPI backend on Render.
Presents results through a deployed Next.js frontend on Vercel.
Includes EDA, feature-importance-style evidence, quality audit checks, and pipeline evidence.
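
Dynamic champion selection can be sketched as picking, for each horizon, the candidate with the best validation score. The rule used here (lowest RMSE), the function name `select_champions`, and the sample scores are assumptions for illustration. They are not taken from the project's registry or from the code in backend/.

```python
from typing import Dict

# Hypothetical layout: horizon -> model name -> validation RMSE.
Scores = Dict[str, Dict[str, float]]


def select_champions(rmse_by_horizon: Scores) -> Dict[str, str]:
    """Return the model with the lowest RMSE for each horizon."""
    champions: Dict[str, str] = {}
    for horizon, scores in rmse_by_horizon.items():
        # min() walks the model names and compares them by their RMSE;
        # on a tie, the first model listed wins.
        champions[horizon] = min(scores, key=scores.get)
    return champions


# Illustrative numbers only.
example_scores: Scores = {
    "day_1": {"ridge": 12.0, "random_forest": 13.5, "gradient_boosting": 14.1, "mlp": 15.0},
    "day_2": {"ridge": 25.0, "random_forest": 22.0, "gradient_boosting": 23.4, "mlp": 27.9},
}
champions = select_champions(example_scores)
print(champions)
```

Because the winner is recomputed from the scores on every training run, a different model can take over a horizon without any code change.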

Architecture

Open-Meteo APIs
   |
   | hourly GitHub Actions feature pipeline
   v
MongoDB Atlas Feature Store
   |
   | daily/catch-up GitHub Actions training pipeline
   v
Model Metrics + Model Registry + GridFS Artifacts
   |
   | inference endpoints
   v
FastAPI Backend on Render
   |
   | public API calls
   v
Next.js Frontend on Vercel

Latest Forecast Snapshot

The live backend returns the latest 3-day Islamabad forecast from the model registry.

Horizon	Date	Predicted AQI	Risk	Champion model
Day +1	2026-06-08	87.89	Moderate	ridge
Day +2	2026-06-09	86.41	Moderate	random_forest
Day +3	2026-06-10	100.50	Unhealthy for Sensitive Groups	random_forest

The dashboard also supports model override, so individual trained models can be compared against the automatic horizon champions.
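
The Risk labels in the table match the standard US EPA AQI bands. Below is a minimal sketch of that mapping, assuming the project uses the US scale and treats any value above 100 as the next band (which is consistent with 100.50 being labelled "Unhealthy for Sensitive Groups"). The function name `risk_category` is hypothetical.

```python
from typing import List, Tuple

# Upper bound of each band and its label (US EPA AQI categories).
AQI_BANDS: List[Tuple[float, str]] = [
    (50, "Good"),
    (100, "Moderate"),
    (150, "Unhealthy for Sensitive Groups"),
    (200, "Unhealthy"),
    (300, "Very Unhealthy"),
]


def risk_category(aqi: float) -> str:
    """Map a predicted AQI value to its risk label."""
    if aqi < 0:
        raise ValueError("AQI cannot be negative")
    for upper, label in AQI_BANDS:
        if aqi <= upper:
            return label
    # Anything above 300 falls in the top band.
    return "Hazardous"


for value in (87.89, 86.41, 100.50):
    print(value, risk_category(value))
```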

Model Training Summary

The training pipeline reads historical features from MongoDB Atlas, creates future targets for 1-day, 2-day, and 3-day forecasting, trains all candidate models, evaluates them, and stores the full result in the cloud registry.
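
Creating future targets means pairing each feature row with the AQI observed 1, 2, and 3 days later. The sketch below assumes hourly rows with `timestamp` and `aqi` fields. Those field names, the `target_day_N` naming, and the function name are illustrative, not the project's schema.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Sequence


def add_future_targets(rows: List[dict], horizons: Sequence[int] = (1, 2, 3)) -> List[dict]:
    """Attach target_day_N = AQI observed N days after each row's timestamp."""
    # Look up by timestamp rather than by list position so gaps in the
    # hourly series do not silently shift the targets.
    aqi_at: Dict[datetime, float] = {r["timestamp"]: r["aqi"] for r in rows}
    labelled = []
    for r in rows:
        out = dict(r)
        for h in horizons:
            future_time = r["timestamp"] + timedelta(days=h)
            # None marks rows whose future value has not been observed yet.
            out[f"target_day_{h}"] = aqi_at.get(future_time)
        labelled.append(out)
    return labelled


# Toy series: 96 hourly rows whose AQI equals the hour index.
start = datetime(2026, 1, 1)
rows = [{"timestamp": start + timedelta(hours=i), "aqi": float(i)} for i in range(96)]
labelled = add_future_targets(rows)
print(labelled[0]["target_day_1"], labelled[0]["target_day_3"], labelled[95]["target_day_1"])
```

Rows with a `None` target are too recent to train on for that horizon and would be dropped before fitting.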

Horizon	Selected champion	RMSE	MAE	R2
Day +1	ridge	12.19	8.88	0.507
Day +2	random_forest	22.24	16.54	-0.673
Day +3	random_forest	24.10	17.45	-0.973

Overall leaderboard winner: random_forest.
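
For reference, the three metrics in the table can be computed as follows. This is a plain-Python sketch of the standard definitions with toy data. The training code itself may well use a library such as scikit-learn instead.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are identical")
    return 1 - ss_res / ss_tot


# Toy data, not project results.
y_true = [80.0, 90.0, 100.0, 110.0]
close_pred = [82.0, 88.0, 101.0, 109.0]
poor_pred = [120.0, 120.0, 60.0, 60.0]

print(rmse(y_true, close_pred), mae(y_true, close_pred), r2(y_true, close_pred))
print(r2(y_true, poor_pred))  # negative: worse than always predicting the mean
```

R2 compares a model against a baseline that always predicts the mean of the test targets. A negative value, as in the Day +2 and Day +3 rows, means the champion still did worse than that baseline on the evaluation split, so those horizons should be read as rough estimates.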

Cloud Evidence

MongoDB Atlas

GitHub Actions Automation

Backend and Frontend Deployments

API, EDA, and Submission Proof

Automation Details

Feature Pipeline

Workflow file: .github/workflows/feature-pipeline.yml

Runs on GitHub Actions schedule.
Has primary and backup cron triggers because scheduled GitHub Actions runs can be delayed.
Fetches current Islamabad air quality and weather data.
Engineers features and stores them in MongoDB Atlas.
Uses deduplication so repeated scheduled runs do not corrupt the feature store.
Logs each run to the pipeline_runs collection.
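
The deduplication step can be sketched as an upsert keyed on city and timestamp, so a backup cron run that fetches the same hour overwrites the existing row instead of adding a second one. The example uses an in-memory dict in place of MongoDB, and the key choice and function name are assumptions. With PyMongo, the same idea is commonly written as `update_one(filter, {"$set": row}, upsert=True)`, usually backed by a unique index.

```python
from datetime import datetime
from typing import Dict, Tuple

FeatureKey = Tuple[str, datetime]


def upsert_feature(store: Dict[FeatureKey, dict], row: dict) -> bool:
    """Insert or overwrite the row for (city, timestamp).

    Returns True when the key is new and False when an existing row was replaced.
    """
    key = (row["city"], row["timestamp"])
    is_new = key not in store
    store[key] = row
    return is_new


store: Dict[FeatureKey, dict] = {}
row = {"city": "Islamabad", "timestamp": datetime(2026, 6, 7, 12), "aqi": 88.0}

first = upsert_feature(store, row)
# Same hour fetched again, for example by the backup cron trigger.
second = upsert_feature(store, dict(row, aqi=90.0))
print(first, second, len(store))
```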

Training Pipeline

Workflow file: .github/workflows/training-pipeline.yml

Runs daily and also includes catch-up logic after feature runs.
Fetches historical feature data from MongoDB Atlas.
Trains Ridge, Random Forest, Gradient Boosting, and MLP Neural Net models.
Evaluates with RMSE, MAE, and R2.
Saves model metrics, model registry records, and model artifacts.
Generates latest 3-day prediction records.
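
Saving a model artifact together with its registry record can be sketched as serializing the model to bytes and storing a checksum beside the metrics. Everything below is an assumption for illustration: the record fields, the function names, and the use of `pickle`. In a GridFS setup, the bytes would typically go through something like `gridfs.GridFS(db).put(blob, filename=...)` and the record into a registry collection.

```python
import hashlib
import pickle
from datetime import datetime, timezone
from typing import Any, Dict, Tuple


def package_model(model: Any, name: str, horizon: int,
                  metrics: Dict[str, float]) -> Tuple[bytes, dict]:
    """Serialize a model and build a registry record that describes it."""
    blob = pickle.dumps(model)
    record = {
        "model_name": name,
        "horizon_days": horizon,
        "metrics": metrics,
        "artifact_sha256": hashlib.sha256(blob).hexdigest(),
        "artifact_bytes": len(blob),
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    return blob, record


def load_model(blob: bytes, record: dict) -> Any:
    """Verify the checksum before unpickling. Only load trusted artifacts."""
    if hashlib.sha256(blob).hexdigest() != record["artifact_sha256"]:
        raise ValueError("artifact checksum mismatch")
    return pickle.loads(blob)


# A plain dict stands in for a fitted estimator here.
toy_model = {"coef": [0.5, 1.5], "intercept": 2.0}
blob, record = package_model(toy_model, "ridge", 1,
                             {"rmse": 12.19, "mae": 8.88, "r2": 0.507})
restored = load_model(blob, record)
print(record["model_name"], record["artifact_bytes"], restored == toy_model)
```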

Manual Recovery

Workflow file: .github/workflows/manual-recovery.yml

This exists so I can manually recover the system if any external platform delays or skips a scheduled run.
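
One way to decide when a manual recovery is needed is to compare the newest successful entry in pipeline_runs with the current time. The sketch below is an assumption about how such a check could look. The 3-hour threshold and the function name are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def needs_recovery(last_success: Optional[datetime], now: datetime,
                   max_age: timedelta = timedelta(hours=3)) -> bool:
    """True when no successful run exists or the newest one is older than max_age."""
    if last_success is None:
        return True
    return now - last_success > max_age


now = datetime(2026, 6, 7, 12, 0, tzinfo=timezone.utc)
recent = now - timedelta(hours=1)
stale = now - timedelta(hours=5)
print(needs_recovery(recent, now), needs_recovery(stale, now), needs_recovery(None, now))
```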

Repository Structure

backend/               FastAPI service, database layer, feature pipeline, training code
frontend/              Next.js frontend (product dashboard)
.github/workflows/     Feature, training, and manual recovery automation
assets/readme/         Public screenshots used inside this README
documentation/         Final internship report PDF only
render.yaml            Render deployment blueprint

Local Setup

Backend

cd backend
python -m venv .venv
# Windows activation; on macOS/Linux use: source .venv/bin/activate
.venv\Scripts\activate
pip install -r requirements.txt
uvicorn app.api:app --reload --port 8000

Required backend environment variables:

MONGODB_URI=your_mongodb_atlas_uri
MONGODB_DB_NAME=aqi_predictor
CITY=Islamabad
LATITUDE=33.6844
LONGITUDE=73.0479
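
A minimal sketch of how these variables could be read at startup, assuming MONGODB_URI is the only one without a safe default. The `Settings` class and `load_settings` function are hypothetical names, not the ones used in backend/.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    mongodb_uri: str
    mongodb_db_name: str
    city: str
    latitude: float
    longitude: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment variables and fail fast if the URI is missing."""
    uri = env.get("MONGODB_URI")
    if not uri:
        raise RuntimeError("MONGODB_URI is required")
    return Settings(
        mongodb_uri=uri,
        mongodb_db_name=env.get("MONGODB_DB_NAME", "aqi_predictor"),
        city=env.get("CITY", "Islamabad"),
        latitude=float(env.get("LATITUDE", "33.6844")),
        longitude=float(env.get("LONGITUDE", "73.0479")),
    )


# Placeholder URI for the example; use your own Atlas connection string.
settings = load_settings({"MONGODB_URI": "mongodb://localhost:27017"})
print(settings.city, settings.latitude, settings.longitude)
```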

Frontend

cd frontend
npm install
npm run dev

The frontend reads the backend URL from its environment configuration. For local testing, point it to the local FastAPI server or to the deployed Render API.

Final Submission

The candidate portal requested a public GitHub repository link. This repository contains the working project code, deployed frontend/backend links, automation workflows, screenshots, evidence, and final report PDF.

Final report PDF:

documentation/Pearls_AQI_Predictor_Final_Internship_Report.pdf

Built By

Salman Khan

GitHub: https://github.com/codewithsalty
Live project: https://pearls-aqi.vercel.app/
Backend API docs: https://aqi-predictor-api-cuec.onrender.com/docs
