conqueror/mcgill-showcases

McGill ML Showcases

Public, student-friendly machine learning showcase projects for learning by doing.

This repository contains tutorial-style projects with reproducible tooling (uv + make), clear learning flows, and practical artifacts.

Start Here

  1. Install Python 3.11+ and uv.
  2. Run:
make sync
  3. Pick one project from the catalog below.
  4. Enter that project and follow its README.md.
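
Before running make sync, the two prerequisites from step 1 can be checked with a few lines of Python. This is a minimal sketch, not part of the repository: the helper name check_prerequisites is hypothetical, and it assumes uv is installed somewhere on PATH.

```python
import shutil
import sys


def check_prerequisites(version_info, uv_path):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper for illustration, not a command shipped by the repo.
    """
    problems = []
    # The setup steps ask for Python 3.11 or newer.
    if tuple(version_info[:2]) < (3, 11):
        problems.append(
            f"Python 3.11+ required, found {version_info[0]}.{version_info[1]}"
        )
    # shutil.which returns None when the executable is not on PATH.
    if uv_path is None:
        problems.append("uv not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites(sys.version_info, shutil.which("uv")):
        print(problem)
```

Run as a script, it prints nothing when both prerequisites are met. The check takes the version and the uv path as arguments, so it can be tried with arbitrary values without touching the local environment.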

If this is your first time, start with sota-supervised-learning-showcase. If you want the deep-learning sequence specifically, start with deep-learning-math-foundations-showcase.

Project Catalog

| Project | Topic | Difficulty | Estimated Time | Prerequisites | Start Link |
| --- | --- | --- | --- | --- | --- |
| deep-learning-math-foundations-showcase | Essential math for deep learning: vectors, derivatives, entropy, and gradient descent | Beginner | 1.0-1.5 hours | Python basics, high-school algebra | projects/deep-learning-math-foundations-showcase/README.md |
| neural-network-foundations-showcase | Perceptrons, activations, backprop intuition, initialization, and decision boundaries | Beginner | 1.0-1.5 hours | Python, basic algebra, deep-learning math foundations recommended | projects/neural-network-foundations-showcase/README.md |
| pytorch-training-regularization-showcase | PyTorch training loops, optimizers, schedulers, dropout, batch norm, and regularization experiments | Beginner-Intermediate | 1.5-2.0 hours | Python, neural-network foundations recommended | projects/pytorch-training-regularization-showcase/README.md |
| sota-supervised-learning-showcase | Supervised learning foundations + SOTA-style evaluation | Beginner-Intermediate | 1.5-2.5 hours | Python, basic classification/regression | projects/sota-supervised-learning-showcase/README.md |
| credit-risk-classification-capstone-showcase | Credit default capstone (EDA, imbalance handling, threshold decisions) | Intermediate | 2-3 hours | Supervised ML basics, tabular data prep | projects/credit-risk-classification-capstone-showcase/README.md |
| nyc-demand-forecasting-foundations-showcase | Time-aware demand forecasting with explicit train/val/test splits | Intermediate | 1.5-2.5 hours | Python, regression basics, time-based validation intuition | projects/nyc-demand-forecasting-foundations-showcase/README.md |
| sota-unsupervised-semisup-showcase | Unsupervised, semi-supervised, self-supervised, active learning | Intermediate | 2-3 hours | Python, basic ML intuition | projects/sota-unsupervised-semisup-showcase/README.md |
| causalml-kaggle-showcase | Causal inference, uplift modeling, policy simulation | Intermediate | 2-3 hours | Python, basic ML, Kaggle token | projects/causalml-kaggle-showcase/README.md |
| mlops-drift-production-showcase | MLOps lifecycle, drift detection, retraining decisions, local API serving | Intermediate | 2-3 hours | Python, ML basics, API basics | projects/mlops-drift-production-showcase/README.md |
| xai-fairness-audit-showcase | Explainability, subgroup fairness metrics, mitigation tradeoffs | Intermediate | 2-3 hours | Python, classification metrics | projects/xai-fairness-audit-showcase/README.md |
| automl-hpo-showcase | Hyperparameter optimization strategy benchmarking (grid/random/TPE) | Intermediate | 1.5-2.5 hours | Python, model tuning basics | projects/automl-hpo-showcase/README.md |
| autoresearch | Fixed-budget autonomous research loops with Codex/Claude launch briefs for macOS and Unix | Intermediate-Advanced | 2-3 hours | Python, basic ML, Git, access to Apple Silicon or an NVIDIA GPU for the real upstream path | projects/autoresearch/README.md |
| agentic-course-assistant-showcase | Agent routing, tools, guardrails, traces, eval rubrics, A2A/session/memory concepts, and optional OpenAI Agents SDK / Google ADK examples | Intermediate | 1-1.5 hours | Python, basic ML workflow vocabulary | projects/agentic-course-assistant-showcase/README.md |
| eda-leakage-profiling-showcase | Data profiling, missingness diagnostics, leakage analysis, split strategy comparison | Beginner-Intermediate | 1.5-2.0 hours | Python, pandas basics | projects/eda-leakage-profiling-showcase/README.md |
| feature-engineering-dimred-showcase | Encoding, feature selection, PCA/t-SNE/UMAP comparison | Beginner-Intermediate | 1.5-2.5 hours | Python, preprocessing basics | projects/feature-engineering-dimred-showcase/README.md |
| modern-nlp-pipeline-showcase | Shared text pipeline for classification, retrieval, QA, and summarization on research abstracts | Intermediate | 2-3 hours | Python, basic ML, interest in NLP systems | projects/modern-nlp-pipeline-showcase/README.md |
| rl-bandits-policy-showcase | Multi-armed bandits, reward/regret analysis, policy recommendation | Intermediate | 1.5-2.5 hours | Python, probability basics | projects/rl-bandits-policy-showcase/README.md |
| batch-vs-stream-ml-systems-showcase | Batch vs stream KPI pipelines, parity and latency analysis | Intermediate | 2-3 hours | Python, data systems basics | projects/batch-vs-stream-ml-systems-showcase/README.md |
| model-release-rollout-showcase | Canary rollout, promote/hold/rollback decisions, registry artifacts | Intermediate | 1.5-2.0 hours | Python, model metrics basics | projects/model-release-rollout-showcase/README.md |
| learning-to-rank-foundations-showcase | Learning-to-rank foundations with grouped splits and NDCG | Intermediate | 1.5-2.5 hours | Python, ranking/recommendation basics | projects/learning-to-rank-foundations-showcase/README.md |
| ranking-api-productization-showcase | FastAPI ranking service, schema contracts, structured logging, OpenAPI | Intermediate | 1.5-2.5 hours | Python, API basics, model serving basics | projects/ranking-api-productization-showcase/README.md |
| demand-api-observability-showcase | Demand prediction API with Prometheus metrics and optional OTel tracing | Intermediate | 1.5-2.5 hours | Python, API basics, observability basics | projects/demand-api-observability-showcase/README.md |

Clean-Checkout Data And Artifacts

This repo keeps generated outputs out of git so students can reproduce them locally. Most projects write files under artifacts/ only after make run, make smoke, or a similar project command. A clean checkout may therefore contain only placeholders such as .gitkeep.

Raw local inputs are also kept out of git by default. Projects that need starter data either generate it in code or ship a small bundled sample dataset inside src/ so tests and smoke runs work on a normal laptop without private files.

Use each project README as the source of truth, but the usual flow is:

make sync
make smoke  # or make run
make verify
make test

If make verify reports missing artifacts before a run, generate the artifacts first. The verifier checks the stable contract for what each project is expected to produce; it does not require generated outputs to be committed.
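
To make the idea of an artifact contract concrete, here is a minimal sketch of a check that reports which expected files are still missing. It is an illustration only and does not reproduce the repository's verifier: the function name missing_artifacts and the file names in EXPECTED_ARTIFACTS are hypothetical.

```python
from pathlib import Path

# Hypothetical contract: relative paths a project is expected to produce.
# These file names are made up for illustration.
EXPECTED_ARTIFACTS = ["artifacts/metrics.json", "artifacts/report.md"]


def missing_artifacts(project_dir, expected):
    """Return the expected relative paths that do not exist as files yet."""
    root = Path(project_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # On a clean checkout this would list every expected artifact.
    for rel in missing_artifacts(".", EXPECTED_ARTIFACTS):
        print(f"missing: {rel}")
```

Under this sketch, a clean checkout that holds only a .gitkeep placeholder reports every expected path as missing, and the list shrinks as a run writes each file.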

Repository Commands

Use root commands to run quality gates across all projects:

make help
make sync
make lint
make ty
make test
make check
make check-contracts
make verify
make smoke
make docs-build
make docs-serve
make docs-check
make harness-preflight
make harness-lint

Start project-specific runs from inside that project's folder.

Contract note:

  • make check-contracts bootstraps missing supervised artifacts in quick mode, then validates split/EDA/leakage/eval/experiment contracts.
  • make harness-preflight and make harness-lint validate the repo-local public harness-lite bootstrap.
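
This README does not spell out what each contract checks. As one illustration of the kind of rule a split or leakage contract could enforce, the sketch below flags row IDs that appear in more than one of the train, validation, and test splits. The function name split_overlaps is hypothetical and is not taken from the repository.

```python
def split_overlaps(train_ids, val_ids, test_ids):
    """Return row IDs shared by any two splits, sorted for stable output.

    Hypothetical check for illustration; an empty list means the
    three splits are pairwise disjoint.
    """
    train, val, test = set(train_ids), set(val_ids), set(test_ids)
    # Union of the three pairwise intersections.
    return sorted((train & val) | (train & test) | (val & test))


if __name__ == "__main__":
    overlaps = split_overlaps([1, 2, 3], [3, 4], [5, 6])
    print("overlapping IDs:", overlaps)
```

A rule like this treats any non-empty result as a failure, since a shared row would let evaluation data leak into training.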

Documentation Site

  • Local docs preview:
make docs-serve
  • Strict docs build check:
make docs-check
  • API docs note:
    • GitHub Pages hosts static API reference pages and embedded ReDoc viewers backed by versioned OpenAPI JSON assets.
    • Interactive Swagger UI (/docs) is available when running each FastAPI showcase locally with make dev.
  • Main docs entry points:
    • docs/index.md
    • docs/showcase-architecture.md
    • docs/new-showcase-playbook.md
    • docs/api/index.md
    • docs/api/ranking-api.md
    • docs/api/demand-api.md

Learning Path

  • Deep learning foundations path: deep-learning math foundations -> neural-network foundations -> pytorch training regularization -> supervised or unsupervised next.
  • Core ML path: supervised -> unsupervised/semisup -> causal.
  • Production path: supervised -> mlops drift -> batch vs stream.
  • Forecasting path: nyc-demand forecasting foundations -> demand API observability -> model rollout.
  • Ranking path: learning-to-rank foundations -> ranking API productization -> model rollout.
  • NLP systems path: pytorch training regularization -> modern NLP pipeline -> learning to rank -> ranking API productization.
  • Release path: mlops drift -> batch vs stream -> model rollout.
  • Responsible AI path: supervised -> xai fairness -> causal.
  • Optimization path: supervised -> automl hpo -> autoresearch -> rl bandits.
  • Agent frameworks path: automl hpo -> autoresearch -> agentic course assistant -> model rollout.
  • Data quality path: eda leakage profiling -> feature engineering -> supervised contract artifacts.
  • See detailed guidance in docs/learning-path.md.

Coverage Matrix

  • Full aspect mapping is available in docs/aspect-coverage-matrix.md.
  • Use this matrix to match course topics to concrete commands and artifacts.

How to Get Help

  • Read docs/faq.md and docs/troubleshooting.md first.
  • Ask learning questions using the GitHub Issues template "Learning Question".
  • Open bug reports with reproducible steps and command output.

Contributing

See CONTRIBUTING.md for setup, standards, and pull request workflow.

Harness Lite

  • Repo-local harness config: .codex/config.toml
  • Routing manifest: .codex/harness/role-skill-matrix.toml
  • Operating pack: docs/agents/oodaris-harness-v2-operating-pack.md

License

MIT License. See LICENSE.
