🔬 Model Explainability Dashboard

An interactive machine learning explainability dashboard built with Streamlit, scikit-learn, and SHAP.
Trains a RandomForestClassifier and explains its predictions through global and local SHAP explanations.

📌 Why Explainability Matters

Machine learning models are increasingly used to support high-stakes decisions in healthcare, finance, and beyond. Without explainability, a model is a black box: its predictions are hard to audit, hard to trust, and risky to deploy.

Explainable AI (XAI) bridges this gap by answering:

Which features drove this prediction?
How much did each feature contribute — positively or negatively?
Is the model relying on spurious patterns or meaningful signals?

SHAP (SHapley Additive exPlanations) is a widely used framework for feature attribution, grounded in Shapley values from cooperative game theory. This project uses its tree-specific TreeExplainer.
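To make the idea of additive attribution concrete, here is a minimal sketch that computes exact Shapley values by brute-force enumeration. The toy model, the input point, and the all-zero baseline are hypothetical and chosen only for illustration; SHAP's TreeExplainer uses a much faster tree-specific algorithm rather than this enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function defined over feature indices."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Shapley weight shared by every coalition of this size.
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                # Marginal contribution of feature i when it joins the coalition.
                phi[i] += weight * (value_fn(coalition | {i}) - value_fn(coalition))
    return phi


# Hypothetical toy model with one interaction term: f(v) = 2*v0 + 3*v1 + v0*v2.
def model(v):
    return 2 * v[0] + 3 * v[1] + v[0] * v[2]


x = (1.0, 1.0, 1.0)         # the sample being explained
baseline = (0.0, 0.0, 0.0)  # features outside a coalition take these values


def value_fn(coalition):
    mixed = [x[j] if j in coalition else baseline[j] for j in range(len(x))]
    return model(mixed)


phi = shapley_values(value_fn, len(x))
print([round(p, 6) for p in phi])  # [2.5, 3.0, 0.5]
```

The attributions sum to `model(x) - model(baseline)`, and the interaction term `v0*v2` is split evenly between the two features involved. That additivity is the property the dashboard relies on when it breaks a prediction into per-feature contributions.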

✨ Features

| Feature | Description |
| --- | --- |
| 📊 Model metrics | Accuracy, Precision, Recall, F1-score, ROC-AUC displayed as rich metric cards |
| 🌍 Global explanations | Mean absolute SHAP values ranked across all test samples |
| 🔍 Local explanations | Per-sample SHAP waterfall showing top positive/negative contributions |
| 🎛️ Interactive explorer | Slider to select any test sample and inspect the model's reasoning |
| 🖼️ Static chart export | SHAP summary bar chart saved to `outputs/shap_summary.png` |
| 📋 Dataset overview | Feature matrix preview and class distribution |
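The global explanation described above (mean absolute SHAP value per feature, ranked) can be sketched in a few lines. This is an illustration only: the function name `global_importance` is hypothetical, the SHAP numbers are made up, and the project itself may compute the ranking differently (for example with NumPy).

```python
def global_importance(shap_matrix, feature_names):
    """Rank features by mean absolute SHAP value across all samples.

    shap_matrix is a list of rows, one per sample, with one SHAP value per feature.
    """
    n_samples = len(shap_matrix)
    totals = [0.0] * len(feature_names)
    for row in shap_matrix:
        for j, value in enumerate(row):
            totals[j] += abs(value)  # sign is discarded: only magnitude counts
    means = [total / n_samples for total in totals]
    return sorted(zip(feature_names, means), key=lambda pair: pair[1], reverse=True)


# Made-up SHAP values for three samples and three features (illustrative only).
shap_matrix = [
    [0.20, -0.05, 0.01],
    [-0.30, 0.10, 0.00],
    [0.10, -0.15, 0.02],
]
names = ["mean radius", "mean texture", "mean smoothness"]

ranking = global_importance(shap_matrix, names)
for name, score in ranking:
    print(f"{name:<16} {score:.3f}")
```

Taking the absolute value before averaging matters: a feature that pushes some predictions up and others down would otherwise cancel out and look unimportant.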

🗂️ Project Structure

model-explainability-dashboard/
│
├── app.py                  # Streamlit dashboard (main entry point)
│
├── src/
│   ├── __init__.py
│   ├── data.py             # Dataset loading & splitting
│   ├── train.py            # Model training, evaluation & persistence
│   ├── explain.py          # SHAP explainability utilities
│   └── utils.py            # Shared helpers
│
├── models/
│   ├── .gitkeep
│   └── model.joblib        # Trained model (generated — git-ignored)
│
├── outputs/
│   ├── .gitkeep
│   ├── metrics.json        # Evaluation metrics (generated — git-ignored)
│   └── shap_summary.png    # Static SHAP chart (generated — git-ignored)
│
├── requirements.txt
├── .gitignore
└── README.md

🚀 Installation

1. Clone the repository

git clone https://github.com/<your-username>/model-explainability-dashboard.git
cd model-explainability-dashboard

2. Create and activate a virtual environment

python -m venv .venv
source .venv/bin/activate          # macOS / Linux
# .venv\Scripts\activate           # Windows

3. Install dependencies

pip install -r requirements.txt

🏋️ How to Train the Model

python -m src.train

This will:

1. Load the Breast Cancer Wisconsin dataset from `sklearn.datasets`.
2. Split it into train (80%) and test (20%) sets.
3. Train a RandomForestClassifier with 200 trees.
4. Evaluate accuracy, precision, recall, F1-score, and ROC-AUC.
5. Save the trained model → `models/model.joblib`
6. Save evaluation metrics → `outputs/metrics.json`
7. Print a full training report to the console.

Expected output:

🔬 Model Explainability Dashboard — Training Pipeline
=======================================================

[1/4] Loading dataset …
      Samples: 569  |  Features: 30
[2/4] Splitting into train / test sets …
      Train: 455  |  Test: 114
[3/4] Training RandomForestClassifier …
      Trees: 200
[4/4] Evaluating & saving artefacts …
  ✓ Model saved → models/model.joblib
  ✓ Metrics saved → outputs/metrics.json

📊 Test-set results:
     accuracy        0.9737
     precision       0.9722
     recall          0.9859
     f1_score        0.9790
     roc_auc         0.9974
     n_test_samples  114
     n_estimators    200

✅ Done!  Run `streamlit run app.py` to open the dashboard.
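The training steps listed above can be sketched roughly as follows. This is not the contents of `src/train.py`: the function name `train_and_evaluate`, the `random_state` value, and the use of a stratified split are assumptions made for the sketch, so the exact metric values it produces may differ from the report shown above.

```python
import json
from pathlib import Path

import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split


def train_and_evaluate(
    model_path="models/model.joblib",
    metrics_path="outputs/metrics.json",
    n_estimators=200,
    random_state=42,  # assumption: the source does not state a seed
):
    """Train a random forest on the breast cancer dataset and save artefacts."""
    X, y = load_breast_cancer(return_X_y=True, as_frame=True)

    # 80/20 split; stratification is an assumption, not confirmed by the source.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=random_state
    )

    model = RandomForestClassifier(
        n_estimators=n_estimators, random_state=random_state
    )
    model.fit(X_train, y_train)

    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)[:, 1]  # probability of class 1

    metrics = {
        "accuracy": float(accuracy_score(y_test, y_pred)),
        "precision": float(precision_score(y_test, y_pred)),
        "recall": float(recall_score(y_test, y_pred)),
        "f1_score": float(f1_score(y_test, y_pred)),
        "roc_auc": float(roc_auc_score(y_test, y_prob)),
        "n_test_samples": int(len(X_test)),
        "n_estimators": int(n_estimators),
    }

    # Create the output folders if needed, then persist model and metrics.
    for path in (model_path, metrics_path):
        Path(path).parent.mkdir(parents=True, exist_ok=True)
    joblib.dump(model, model_path)
    Path(metrics_path).write_text(json.dumps(metrics, indent=2))
    return metrics


if __name__ == "__main__":
    for name, value in train_and_evaluate().items():
        print(f"{name:<16}{value}")
```

With 569 samples and `test_size=0.2`, scikit-learn rounds the test set up to 114 samples, leaving 455 for training, which matches the split sizes in the report.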

📊 How to Run the Dashboard

streamlit run app.py

The app opens at http://localhost:8501 and provides:

Dataset overview and class distribution
Model performance metric cards
Global feature importance bar chart (SHAP)
Per-sample local explanation with interactive slider
Positive/negative SHAP contribution breakdown
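The positive/negative breakdown for a single sample can be illustrated with a small helper. The name `split_contributions`, the SHAP values, and the base value below are all made up for the example; the dashboard's own implementation in `app.py` may differ.

```python
def split_contributions(feature_names, shap_row, top_k=3):
    """Return the top_k largest positive and the top_k most negative contributions."""
    pairs = list(zip(feature_names, shap_row))
    positive = sorted(
        (p for p in pairs if p[1] > 0), key=lambda p: p[1], reverse=True
    )
    negative = sorted((p for p in pairs if p[1] < 0), key=lambda p: p[1])
    return positive[:top_k], negative[:top_k]


# Made-up SHAP values for one test sample (illustrative only).
names = ["worst radius", "worst concave points", "mean texture", "mean area"]
shap_row = [0.12, -0.08, 0.03, -0.01]
base_value = 0.63  # hypothetical average model output over the background data

positive, negative = split_contributions(names, shap_row, top_k=2)

# SHAP values are additive: base value plus all contributions gives the model output.
prediction = base_value + sum(shap_row)

print("pushes up:  ", positive)
print("pushes down:", negative)
print(f"explained output: {prediction:.2f}")
```

Positive values push the model output above the base value and negative values pull it below, which is the same information a waterfall chart lays out visually.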

🧰 Technologies Used

| Library | Purpose |
| --- | --- |
| scikit-learn | RandomForestClassifier, dataset, metrics |
| SHAP | TreeExplainer, global & local attributions |
| Streamlit | Interactive web dashboard |
| pandas | Data manipulation |
| NumPy | Numerical computation |
| Altair | Interactive Vega-Lite charts |
| matplotlib | Static SHAP chart export |
| joblib | Model serialisation |

⚠️ Disclaimer

This is an educational demo only.
The model and SHAP explanations are provided for learning and portfolio purposes.
They should not be used for any clinical, diagnostic, or medical decision-making.
The Breast Cancer Wisconsin dataset is a public benchmark dataset widely used in ML research.

📄 License

MIT License — feel free to fork, extend, and learn from this project.
