Abdo-ateM/model-explainability-dashboard


🔬 Model Explainability Dashboard

An interactive machine learning explainability dashboard built with Streamlit, scikit-learn, and SHAP.
Trains a RandomForestClassifier and explains its predictions with global and local SHAP attributions.


📌 Why Explainability Matters

Machine learning models are increasingly used to support high-stakes decisions in healthcare, finance, and beyond. Without explainability, a model is a black box — its predictions cannot be audited, trusted, or safely deployed.

Explainable AI (XAI) bridges this gap by answering:

  • Which features drove this prediction?
  • How much did each feature contribute — positively or negatively?
  • Is the model relying on spurious patterns or meaningful signals?

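SHAP (SHapley Additive exPlanations) is a widely used feature-attribution framework grounded in Shapley values from cooperative game theory. It provides model-agnostic explainers as well as faster model-specific ones such as TreeExplainer, which this project uses for its random forest.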
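
To make the idea of additive attribution concrete, here is a minimal sketch that computes exact Shapley values by brute force for a tiny hand-written scoring function. It is not how the shap library or TreeExplainer computes attributions (those use far more efficient algorithms and a background dataset rather than a single baseline row). The names `shapley_values` and `toy_model`, the input values, and the all-zero baseline are assumptions made up for this illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating every feature coalition.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2**n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        # Evaluate the model with coalition features taken from x, the rest from baseline.
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


def toy_model(features):
    # Hypothetical scoring function with an interaction between the first two features.
    a, b, c = features
    return 2.0 * a + a * b - 3.0 * c


x = [1.0, 2.0, 0.5]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)

for index, contribution in enumerate(phi):
    print(f"feature {index}: {contribution:+.3f}")
print(f"sum of contributions: {sum(phi):.3f}")
print(f"f(x) - f(baseline):   {toy_model(x) - toy_model(baseline):.3f}")
```

The contributions sum to the difference between the prediction and the baseline output, which is the additivity property that lets per-feature values be read as a breakdown of a single prediction. The interaction term is split evenly between the two features involved.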


✨ Features

| Feature | Description |
| --- | --- |
| 📊 Model metrics | Accuracy, Precision, Recall, F1-score, ROC-AUC displayed as rich metric cards |
| 🌍 Global explanations | Mean absolute SHAP values ranked across all test samples |
| 🔍 Local explanations | Per-sample SHAP waterfall showing top positive/negative contributions |
| 🎛️ Interactive explorer | Slider to select any test sample and inspect the model's reasoning |
| 🖼️ Static chart export | SHAP summary bar chart saved to outputs/shap_summary.png |
| 📋 Dataset overview | Feature matrix preview and class distribution |
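
The global and local explanations described above boil down to two simple aggregations over a matrix of SHAP values (one row per test sample, one column per feature). The sketch below shows one plausible way to compute them in plain Python. The function names `global_importance` and `local_breakdown`, the `top_k` parameter, and the numbers in `shap_matrix` are made up for illustration; they are not the project's code or real results.

```python
def global_importance(shap_matrix, feature_names):
    """Rank features by mean absolute SHAP value across all samples."""
    n_samples = len(shap_matrix)
    scores = {
        name: sum(abs(row[j]) for row in shap_matrix) / n_samples
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def local_breakdown(shap_row, feature_names, top_k=2):
    """Split one sample's SHAP values into its strongest positive and negative contributions."""
    pairs = list(zip(feature_names, shap_row))
    positive = sorted((p for p in pairs if p[1] > 0), key=lambda p: p[1], reverse=True)
    negative = sorted((p for p in pairs if p[1] < 0), key=lambda p: p[1])
    return positive[:top_k], negative[:top_k]


# Illustrative values only: three feature names and four invented SHAP rows.
feature_names = ["mean radius", "mean texture", "worst area"]
shap_matrix = [
    [0.25, -0.25, 0.75],
    [-0.25, 0.0, 0.5],
    [0.5, 0.25, -1.0],
    [0.0, -0.25, 0.75],
]

ranking = global_importance(shap_matrix, feature_names)
positive, negative = local_breakdown(shap_matrix[2], feature_names)

print("global ranking:", ranking)
print("sample 2 pushes up:", positive)
print("sample 2 pushes down:", negative)
```

Taking the absolute value before averaging matters: a feature that pushes some predictions up and others down would otherwise cancel out and look unimportant globally.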

🗂️ Project Structure

model-explainability-dashboard/
│
├── app.py                  # Streamlit dashboard (main entry point)
│
├── src/
│   ├── __init__.py
│   ├── data.py             # Dataset loading & splitting
│   ├── train.py            # Model training, evaluation & persistence
│   ├── explain.py          # SHAP explainability utilities
│   └── utils.py            # Shared helpers
│
├── models/
│   ├── .gitkeep
│   └── model.joblib        # Trained model (generated — git-ignored)
│
├── outputs/
│   ├── .gitkeep
│   ├── metrics.json        # Evaluation metrics (generated — git-ignored)
│   └── shap_summary.png    # Static SHAP chart (generated — git-ignored)
│
├── requirements.txt
├── .gitignore
└── README.md

🚀 Installation

1. Clone the repository

git clone https://github.com/<your-username>/model-explainability-dashboard.git
cd model-explainability-dashboard

2. Create and activate a virtual environment

python -m venv .venv
source .venv/bin/activate          # macOS / Linux
# .venv\Scripts\activate           # Windows

3. Install dependencies

pip install -r requirements.txt

🏋️ How to Train the Model

python -m src.train

This will:

  1. Load the Breast Cancer Wisconsin dataset from sklearn.datasets.
  2. Split it into train (80%) and test (20%) sets.
  3. Train a RandomForestClassifier with 200 trees.
  4. Evaluate accuracy, precision, recall, F1-score, and ROC-AUC.
  5. Save the trained model → models/model.joblib
  6. Save evaluation metrics → outputs/metrics.json
  7. Print a full training report to the console.
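
The steps above can be sketched with scikit-learn as follows. This is an approximation written for this README, not the contents of src/train.py: the function name `train_and_evaluate`, the seed of 42, the stratified split, and the use of a temporary directory are all assumptions, so the resulting metrics may differ slightly from the project's own run.

```python
import json
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split


def train_and_evaluate(output_dir, seed=42):
    """Train a 200-tree random forest and persist the model and its metrics."""
    X, y = load_breast_cancer(return_X_y=True)

    # 80/20 split; stratification and the seed are assumptions, not taken from the project.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )

    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)

    y_pred = model.predict(X_test)
    y_score = model.predict_proba(X_test)[:, 1]  # probability of class 1, used for ROC-AUC

    metrics = {
        "accuracy": float(accuracy_score(y_test, y_pred)),
        "precision": float(precision_score(y_test, y_pred)),
        "recall": float(recall_score(y_test, y_pred)),
        "f1_score": float(f1_score(y_test, y_pred)),
        "roc_auc": float(roc_auc_score(y_test, y_score)),
        "n_test_samples": int(len(y_test)),
        "n_estimators": 200,
    }

    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    joblib.dump(model, output_dir / "model.joblib")
    (output_dir / "metrics.json").write_text(json.dumps(metrics, indent=2))
    return metrics


# Write to a throwaway directory so the sketch does not touch models/ or outputs/.
with tempfile.TemporaryDirectory() as tmp:
    metrics = train_and_evaluate(tmp)
    saved = json.loads((Path(tmp) / "metrics.json").read_text())

for name, value in metrics.items():
    print(f"{name:<16}{value}")
```

The console report produced by the project's actual `python -m src.train` command is shown next.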

Expected output:

🔬 Model Explainability Dashboard — Training Pipeline
=======================================================

[1/4] Loading dataset …
      Samples: 569  |  Features: 30
[2/4] Splitting into train / test sets …
      Train: 455  |  Test: 114
[3/4] Training RandomForestClassifier …
      Trees: 200
[4/4] Evaluating & saving artefacts …
  ✓ Model saved → models/model.joblib
  ✓ Metrics saved → outputs/metrics.json

📊 Test-set results:
     accuracy        0.9737
     precision       0.9722
     recall          0.9859
     f1_score        0.9790
     roc_auc         0.9974
     n_test_samples  114
     n_estimators    200

✅ Done!  Run `streamlit run app.py` to open the dashboard.
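
As a sanity check on how the reported metrics relate to one another, the short sketch below recomputes them from a confusion matrix. The four counts are not stated anywhere in the project; they are one combination, inferred here from the 114 test samples, that reproduces the accuracy, precision, recall, and F1 figures in the report above. ROC-AUC is omitted because it depends on predicted probabilities, not on these counts.

```python
# Hypothetical confusion-matrix counts, inferred from the reported figures (not from the source).
tp, fp, fn, tn = 70, 2, 1, 41
n_test = tp + fp + fn + tn

accuracy = (tp + tn) / n_test          # share of all predictions that are correct
precision = tp / (tp + fp)             # share of predicted positives that are truly positive
recall = tp / (tp + fn)                # share of true positives that were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall

report = {
    "accuracy": round(accuracy, 4),
    "precision": round(precision, 4),
    "recall": round(recall, 4),
    "f1_score": round(f1, 4),
}

for name, value in report.items():
    print(f"{name:<12}{value:.4f}")
```

Under these assumed counts the model makes three errors on the test set: two false positives and one false negative.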

📊 How to Run the Dashboard

streamlit run app.py

The app opens at http://localhost:8501 and provides:

  • Dataset overview and class distribution
  • Model performance metric cards
  • Global feature importance bar chart (SHAP)
  • Per-sample local explanation with interactive slider
  • Positive/negative SHAP contribution breakdown

🧰 Technologies Used

| Library | Purpose |
| --- | --- |
| scikit-learn | RandomForestClassifier, dataset, metrics |
| SHAP | TreeExplainer, global & local attributions |
| Streamlit | Interactive web dashboard |
| pandas | Data manipulation |
| NumPy | Numerical computation |
| Altair | Interactive Vega-Lite charts |
| matplotlib | Static SHAP chart export |
| joblib | Model serialisation |

⚠️ Disclaimer

This is an educational demo only.
The model and SHAP explanations are provided for learning and portfolio purposes.
They should not be used for any clinical, diagnostic, or medical decision-making.
The Breast Cancer Wisconsin dataset is a public benchmark dataset widely used in ML research.


📄 License

MIT License — feel free to fork, extend, and learn from this project.
