
Machine Teaching for Explainable AI in Industry:
A Novel Approach for Time Series Classifiers

Master's thesis project in informatics (machine learning specialisation)
by Sebastian Einar Salas Røkholt

Spring 2026

Project Context

This repository contains the software implementation for the MSc thesis paper "Machine Teaching for Explainable AI in Industry: A Novel Approach for Time Series Classifiers", graded A in April 2026. My thesis work is part of "Machine Teaching for Explainable AI" (MT4XAI), a joint research project between the Department of Informatics at the University of Bergen and the Valencian Research Institute for AI (VRAIN).

The scope of the thesis project was to build an end-to-end MT4XAI pipeline for sequence-level time series anomaly detection on electric vehicle (EV) charging sessions. It aims to answer the research question: "How can techniques from MTXAI be applied to time series classifiers in order to generate simple, understandable and faithful explanations of model decisions in a real-world industry setting?"

The implemented system currently covers:

  • forecasting-based anomaly detection with LSTM and TCN modelling support
  • anomaly scoring with session-level metrics and threshold-based classification
  • classifier-preserving charging curve simplification with ORS
  • machine teaching set construction with facility location selection and curriculum-aware serving
  • MLLM-based proxy learner experiments across conditions A to F

You can read the thesis paper here: Docs/INF399___MT4XAI_Master_s_Project.pdf

Architecture and Pipeline

This section documents the current and target pipeline architecture.

1. MT4XAI Experimental System Pipeline Diagram (implemented)

[Diagram: MT4XAI experimental system pipeline, see Docs/Diagrams/]

This diagram summarises the offline experimental pipeline implemented in this repository. Most of the software components are implemented in Jupyter notebooks (for ease of documentation) that draw heavily on a shared Python package. An anonymised version of the dataset extracted from the database is available in the /Data/ folder. Prepared charging session batches are used to train the forecasting model, select the anomaly scoring configuration, generate ORS simplifications, construct teaching sets, and run the MLLM trial engine with groups A to F.
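
The anomaly detection step follows the usual forecasting-residual pattern: the trained model forecasts the charging curve, pointwise residuals are aggregated into a session-level score, and a calibrated threshold turns that score into a normal/anomalous label. The snippet below is a minimal illustrative sketch, assuming NumPy arrays and a mean-absolute-residual metric; the actual metrics and threshold selection live in src/mt4xai/inference.py and notebook 04.

import numpy as np

def session_anomaly_score(actual: np.ndarray, forecast: np.ndarray) -> float:
    """Aggregate pointwise forecast residuals into a single session-level score.

    Illustrative choice: mean absolute residual. The real metric in
    mt4xai.inference may aggregate differently (e.g. quantile- or max-based).
    """
    return float(np.abs(actual - forecast).mean())

def classify_session(score: float, threshold: float) -> str:
    """Threshold-based classification of one charging session."""
    return "anomalous" if score > threshold else "normal"

# Toy usage: a flat forecast against a session with a sudden power drop.
actual = np.array([7.2, 7.1, 7.2, 2.0, 1.9, 7.0])
forecast = np.full_like(actual, 7.1)
print(classify_session(session_anomaly_score(actual, forecast), threshold=1.0))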

2. MT4XAI Pipeline with Detailed Teaching Set Construction Diagram (implemented)

[Diagram: MT4XAI pipeline with detailed teaching set construction, see Docs/Diagrams/]

This diagram expands the implemented teaching set flow. Simplification candidates are embedded, stratified by simplicity level and budget, then selected with a facility location objective to produce compact and diverse teaching sets. The resulting teaching sets feed the multimodal LLM experiment for trial groups A to F: groups A-D use teaching sets A-D, respectively, group E reuses the group D assets, and group F provides the no-teaching baseline for hypothesis testing.
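
Facility location selection greedily adds the candidate that most increases the summed best-similarity between every pool item and its nearest selected exemplar, which is what yields compact yet diverse sets. Below is a minimal sketch of that greedy objective, assuming L2-normalised embeddings in a NumPy array; the actual stratification, budgets and objective used here are implemented in src/mt4xai/teach.py.

import numpy as np

def facility_location_select(embeddings: np.ndarray, budget: int) -> list[int]:
    """Greedy facility location over candidate embeddings.

    At each step, pick the candidate whose addition most increases
    sum_i max_{s in selected} sim(i, s). Rows are assumed L2-normalised,
    so the dot product is cosine similarity.
    """
    sims = embeddings @ embeddings.T
    n = sims.shape[0]
    selected: list[int] = []
    coverage = np.zeros(n)  # best similarity of each item to the selected set
    for _ in range(min(budget, n)):
        gains = np.maximum(sims, coverage).sum(axis=1) - coverage.sum()
        gains[selected] = -np.inf  # never re-pick an exemplar
        best = int(np.argmax(gains))
        selected.append(best)
        coverage = np.maximum(coverage, sims[best])
    return selected

# Toy usage: pick 2 exemplars from 5 random unit vectors.
rng = np.random.default_rng(42)
emb = rng.normal(size=(5, 8))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)
print(facility_location_select(emb, budget=2))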

3. MT4XAI Target Production System Pipeline Diagram (future work)

[Diagram: MT4XAI target production system pipeline, see Docs/Diagrams/]

This diagram shows the target production concept for combined offline and online operation. Offline components periodically train and refresh the model and the global teaching set, while online components process a single user's charging session, classify it, optionally simplify it, and support user-facing local or global explanation workflows.

Repository Structure

.
├── Docs/
│   ├── INF399___MT4XAI_Master_s_Project.pdf
│   └── Diagrams/
├── Notebooks/
│   ├── 00__Data_Anonymisation.ipynb
│   ├── 01__Data_Wrangling_and_FE.ipynb
│   ├── 02__EDA.ipynb
│   ├── 03__Modelling.ipynb
│   ├── 04__Anomaly_Detection.ipynb
│   ├── 05__Curve_Simplification.ipynb
│   ├── 06__MT4XAI.ipynb
│   └── 07__MLLM_Experiment.ipynb
├── src/
│   ├── mt4xai/                  # Core modelling, inference, ORS and teaching logic
│   └── mllm_experiment/         # MLLM experiment runner, prompts and metadata loaders
├── Data/                        # Datasets and generated metadata or results
├── Figures/                     # Teaching and exam set figures
├── Models/                      # Trained model checkpoints and tuning outputs
├── scripts/                     # Utility scripts such as ORS validation
├── config.yaml
├── project_config.py
├── linux_requirements.txt
└── pyproject.toml

Notebook Overview

Notebook execution is split into a reproducibility-only step and a recommended main workflow.

Important

  • 00__Data_Anonymisation.ipynb depends on the private raw dataset and external weather APIs
  • readers without private access to the PI-sensitive raw dataset should start from 01__Data_Wrangling_and_FE.ipynb

Step | Notebook | Role in pipeline | Main outputs
0 | 00__Data_Anonymisation.ipynb | Reproducibility-only preprocessing on the private raw source | Data/etron55-charging-sessions-public.parquet
1 | 01__Data_Wrangling_and_FE.ipynb | Public dataset wrangling and feature engineering for modelling | Data/etron55-charging-sessions.parquet
2 | 02__EDA.ipynb | Exploratory analysis used for design choices | EDA tables and figures
3 | 03__Modelling.ipynb | Forecasting model training and tuning | Models/ checkpoints and the selected final model
4 | 04__Anomaly_Detection.ipynb | Session-level scoring and threshold calibration | Anomaly metric and threshold configuration
5 | 05__Curve_Simplification.ipynb | ORS diagnostics and parameter selection | Simplification diagnostics and examples
6 | 06__MT4XAI.ipynb | Teaching pool construction and teaching/exam set export | Figures/teaching_sets/, Figures/exam_sets/, metadata build inputs
7 | 07__MLLM_Experiment.ipynb | Trial execution wrapper and effect analysis for groups A to F | Data/mllm_experiment_results/ analysis outputs

The recommended order for most users is 01 to 07.

Source Code Overview

src/mt4xai

Reusable pipeline package used by notebooks.

  • data.py: data handling, splits, scaling, session containers
  • model.py: LSTM and TCN forecasting models and loaders
  • train.py: Ray-based tuning and training loops
  • inference.py: residual prediction reconstruction, anomaly metrics and classification helpers
  • ors.py: merged ORS implementation including the DP prefix v3 mode
  • ors_v3.py: compatibility wrapper that maps legacy ors_v3 imports to mt4xai.ors (see the sketch after this list)
  • teach.py: teaching pool and teaching set construction logic
  • plot.py: visualisation helpers
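
The ors_v3.py wrapper is the standard re-export shim pattern, which keeps legacy "from ors_v3 import ..." call sites working after the merge into mt4xai.ors. A minimal sketch of that pattern follows; the deprecation warning is an assumption and not necessarily present in the real wrapper.

# ors_v3.py -- sketch of a legacy compatibility shim
"""Deprecated: the ORS v3 implementation now lives in mt4xai.ors."""
import warnings

# Re-export the merged ORS API so legacy call sites keep working.
from mt4xai.ors import *  # noqa: F401,F403

warnings.warn(
    "ors_v3 is deprecated; import from mt4xai.ors instead",
    DeprecationWarning,
    stacklevel=2,
)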

src/mllm_experiment

Scriptable MLLM evaluation package.

  • run_trial.py: trial CLI entry point and run metadata handling
  • trial.py: participant lifecycle across pre-exam, teaching and post-exam phases
  • data_loading.py: metadata and image loading with group-aware resolution
  • prompts.py: multimodal prompt builders per phase and condition
  • openai_client.py: OpenAI Responses API wrapper with retries and shared rate limiting (pattern sketched after this list)
  • build_teaching_sets.py: anonymised teaching set image export and teaching_items.csv
  • build_exam_sets.py: anonymised exam set image export and exam_items.csv
  • utils.py: structured logging and result writers
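
The retry and rate-limiting behaviour in openai_client.py follows a common pattern: exponential backoff around the API call, with a semaphore shared across parallel participants. The sketch below shows that pattern around an arbitrary callable; names and parameters are illustrative, not the module's actual API.

import random
import threading
import time
from typing import Callable, TypeVar

T = TypeVar("T")

_rate_limit = threading.Semaphore(2)  # shared cap on concurrent API calls (illustrative)

def call_with_retries(fn: Callable[[], T], max_attempts: int = 5) -> T:
    """Run fn() under the shared semaphore, retrying with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            with _rate_limit:
                return fn()
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(2 ** (attempt - 1) + random.random())  # backoff with jitter
    raise AssertionError("unreachable")

# Usage (illustrative): call_with_retries(lambda: client.responses.create(...))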

Experiment Conditions

Current trial support includes conditions A to F (see also the summary sketch after this list).

  • A: overlay plus simplified modality, with curriculum
  • B: overlay plus simplified modality, without curriculum
  • C: raw modality only
  • D: simplified modality, with curriculum
  • E: simplified modality, with enforced rule update protocol and locked post-exam rule use
  • F: baseline with no teaching phase
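
For orientation, the conditions can be summarised as a small configuration mapping. The field names below are illustrative only, not the experiment runner's actual schema, and attributes not stated in the list above are simply omitted.

CONDITIONS = {
    "A": {"modality": "overlay + simplified", "curriculum": True},
    "B": {"modality": "overlay + simplified", "curriculum": False},
    "C": {"modality": "raw"},
    "D": {"modality": "simplified", "curriculum": True},
    "E": {"modality": "simplified",
          "protocol": "enforced rule update, locked post-exam rule use"},
    "F": {"teaching": False},  # no-teaching baseline
}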

Secrets and Environment Variables

Create .env from the committed template:

cp example.env .env

Required secrets by workflow:

  • OPENAI_API_KEY for real API calls in 07__MLLM_Experiment.ipynb and mllm_experiment.run_trial
  • FROST_API_CLIENT_ID and DMI_MET_OBS_API_KEY for 00__Data_Anonymisation.ipynb

The template also includes legacy optional placeholders currently present in local environment files.
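
A fail-fast check before launching API-dependent steps avoids half-completed runs. The snippet below is an illustrative, standard-library-only check; it assumes the .env values have already been loaded into the process environment by whatever mechanism you use.

import os
import sys

REQUIRED_FOR_MLLM = ["OPENAI_API_KEY"]
REQUIRED_FOR_ANONYMISATION = ["FROST_API_CLIENT_ID", "DMI_MET_OBS_API_KEY"]

def check_env(required: list[str]) -> None:
    """Exit with a clear message if any required secret is unset or empty."""
    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        sys.exit(f"Missing environment variables: {', '.join(missing)}")

check_env(REQUIRED_FOR_MLLM)  # before 07__MLLM_Experiment.ipynb or run_trial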

Setup

Python 3.12 is the target runtime.

Option A: CUDA setup on Linux or WSL2 with NVIDIA GPU

python3.12 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r linux_requirements.txt
pip install -e .
python - <<'PY'
import torch
print("cuda_available=", torch.cuda.is_available())
PY

Option B: CPU only setup

python3.12 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
grep -vE '^(torch==|nvidia-|triton==)' linux_requirements.txt > /tmp/requirements.cpu.txt
pip install --index-url https://download.pytorch.org/whl/cpu torch==2.5.0
pip install -r /tmp/requirements.cpu.txt
pip install -e .
python - <<'PY'
import torch
print("cuda_available=", torch.cuda.is_available())
PY

Running the Pipeline

1. Run notebooks

cd Notebooks
../.venv/bin/jupyter lab

Run in order:

  1. 01__Data_Wrangling_and_FE.ipynb
  2. 02__EDA.ipynb
  3. 03__Modelling.ipynb
  4. 04__Anomaly_Detection.ipynb
  5. 05__Curve_Simplification.ipynb
  6. 06__MT4XAI.ipynb
  7. 07__MLLM_Experiment.ipynb

Run notebook 00__Data_Anonymisation.ipynb only when reproducing private data anonymisation.

2. Build anonymised teaching and exam metadata with package entry points

source .venv/bin/activate
exp-build-teaching-sets
exp-build-exam-sets

Equivalent module form:

PYTHONPATH=src python -m mllm_experiment.build_teaching_sets
PYTHONPATH=src python -m mllm_experiment.build_exam_sets

3. Validate trial runner with dry run

PYTHONPATH=src python -m mllm_experiment.run_trial \
  --participants 1 \
  --teaching_set_dir Figures/teaching_sets/mllm_experiment_sets \
  --exam_sets_dir Figures/exam_sets/mllm_experiment_sets \
  --metadata_dir Data/mllm_experiment_metadata \
  --conditions all \
  --output_dir Data/mllm_experiment_results/dry_run \
  --random_seed 42 \
  --dry_run

4. Run real MLLM experiment

PYTHONPATH=src python -m mllm_experiment.run_trial \
  --participants 30 \
  --teaching_set_dir Figures/teaching_sets/mllm_experiment_sets \
  --exam_sets_dir Figures/exam_sets/mllm_experiment_sets \
  --metadata_dir Data/mllm_experiment_metadata \
  --conditions all \
  --output_dir Data/mllm_experiment_results/gpt-5-nano_experiment_1 \
  --model_name gpt-5-nano \
  --parallel_participants 2 \
  --random_seed 42

Reproducibility Notes

  • Keep config.yaml, notebook parameters, and CLI flags versioned together for each run
  • Use unique output directories under Data/mllm_experiment_results/
  • Keep immutable snapshots of Data/mllm_experiment_metadata/*.csv, run metadata JSON, and selected model checkpoints (one way to do this is sketched below)
  • Run dry runs before real API runs to validate metadata and prompt protocol formatting
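
One lightweight way to implement the snapshot advice is to write a hash manifest alongside each results directory. The sketch below is a suggestion rather than part of the current pipeline; the input paths are the ones listed above and the output location is arbitrary.

import hashlib
import json
from pathlib import Path

def write_manifest(patterns: list[str], out_file: str) -> None:
    """Record a SHA-256 hash per matched file so a run's inputs can be verified later."""
    manifest = {
        str(p): hashlib.sha256(p.read_bytes()).hexdigest()
        for pattern in patterns
        for p in sorted(Path(".").glob(pattern))
    }
    Path(out_file).write_text(json.dumps(manifest, indent=2))

# Run from the repository root before or after an experiment run.
write_manifest(
    ["config.yaml", "Data/mllm_experiment_metadata/*.csv"],
    "input_manifest.json",
)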