Master's thesis project in informatics (machine learning specialisation)
by Sebastian Einar Salas Røkholt
Spring 2026
This repository contains the software implementation for the MSc thesis titled "Machine Teaching for Explainable AI in Industry: A Novel Approach for Time Series Classifiers", graded A in April 2026. My thesis work is part of "Machine Teaching for Explainable AI" (MT4XAI), a joint research project between the Department of Informatics at the University of Bergen and the Valencian Research Institute for AI (VRAIN).
The scope of the thesis project was to build an end-to-end MT4XAI pipeline for sequence-level time series anomaly detection on electric vehicle (EV) charging sessions. It aims to answer the research question: "How can techniques from MT4XAI be applied to time series classifiers in order to generate simple, understandable and faithful explanations of model decisions in a real-world industry setting?"
The implemented system currently covers:
- forecasting-based anomaly detection with LSTM and TCN modelling support
- anomaly scoring with session-level metrics and threshold-based classification
- classifier-preserving charging curve simplification with ORS
- machine teaching set construction with facility location selection and curriculum-aware serving
- MLLM-based proxy learner experiments across conditions A to F
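As a rough sketch of the first two bullets (the function names and the residual metric below are illustrative, not the repository's API): the forecaster predicts the charging curve, pointwise residuals are aggregated into one session-level score, and a calibrated threshold turns that score into a label.

```python
import numpy as np

def session_anomaly_score(actual: np.ndarray, forecast: np.ndarray) -> float:
    """Aggregate pointwise forecast residuals into one session-level score.

    Mean absolute residual is used here for illustration; the thesis
    pipeline selects and calibrates its own metric.
    """
    residuals = np.abs(actual - forecast)
    return float(residuals.mean())

def classify_session(score: float, threshold: float) -> str:
    """Threshold-based classification of a session-level anomaly score."""
    return "anomalous" if score > threshold else "normal"

# Toy example: a charging curve and a forecast that tracks it closely.
actual = np.array([0.0, 3.5, 7.0, 7.0, 6.8, 2.0])
forecast = np.array([0.1, 3.4, 6.9, 7.1, 6.7, 2.2])
score = session_anomaly_score(actual, forecast)
print(classify_session(score, threshold=0.5))  # small residuals -> "normal"
```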
You can read the thesis paper here: Docs/INF399___MT4XAI_Master_s_Project.pdf
This section documents the current and target pipeline architecture.
This diagram summarises the implemented offline experimental pipeline in this repository. Most of these software components were implemented in Jupyter Notebooks (for ease of documentation) that heavily use assets from a shared Python package. An anonymised version of the dataset extracted from the database is available in the /Data/ folder. Prepared charging session batches are used to train the forecasting model, select anomaly scoring configuration, generate ORS simplifications, construct teaching sets, and run the MLLM trial engine with groups A to F.
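The "prepared charging session batches" step can be illustrated with a minimal sliding-window split into forecaster training pairs (array shapes and names here are assumptions, not the notebooks' actual session containers):

```python
import numpy as np

def make_forecast_pairs(series: np.ndarray, window: int, horizon: int = 1):
    """Split one charging-session series into (input window, target) pairs
    for a one-step-ahead forecaster. X has shape (n, window), y (n, horizon)."""
    n = len(series) - window - horizon + 1
    X = np.stack([series[i:i + window] for i in range(n)])
    y = np.stack([series[i + window:i + window + horizon] for i in range(n)])
    return X, y

session = np.arange(10, dtype=float)   # stand-in for one session's power curve
X, y = make_forecast_pairs(session, window=4)
print(X.shape, y.shape)  # -> (6, 4) (6, 1)
```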
This diagram expands the implemented teaching set flow. Simplification candidates are embedded, stratified by simplicity level and budget, then selected with a facility location objective to produce compact and diverse teaching sets. The resulting teaching sets were used in the multimodal LLM experiment across experimental conditions (trial groups) A to F. Trial groups A to D use teaching sets A to D, respectively, while group E reuses the group D assets and group F provides the no-teaching baseline for hypothesis testing.
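The facility location selection step can be sketched with a small greedy maximiser. Everything below (embeddings, the RBF similarity kernel, tie-breaking) is illustrative; the repository's selection additionally stratifies by simplicity level and budget.

```python
import numpy as np

def greedy_facility_location(sim: np.ndarray, budget: int) -> list:
    """Greedily pick `budget` exemplars maximising the facility-location
    objective f(S) = sum_i max_{j in S} sim[i, j], where `sim` is a
    pairwise similarity matrix over embedded candidates."""
    selected = []
    coverage = np.zeros(sim.shape[0])  # best similarity of each point to S
    for _ in range(budget):
        # Marginal gain of adding each candidate column j to the set S.
        gains = np.maximum(sim, coverage[:, None]).sum(axis=0) - coverage.sum()
        gains[selected] = -np.inf      # never re-select an exemplar
        best = int(np.argmax(gains))
        selected.append(best)
        coverage = np.maximum(coverage, sim[:, best])
    return selected

# Two tight clusters of 1-D embeddings; an RBF kernel gives similarities.
emb = np.array([[0.0], [0.1], [5.0], [5.1]])
sim = np.exp(-(emb - emb.T) ** 2)
chosen = greedy_facility_location(sim, budget=2)
print(chosen)  # one exemplar from each cluster
```

Greedy selection is the standard approach here because the facility location objective is monotone submodular, so the greedy solution carries a (1 - 1/e) approximation guarantee.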
This diagram shows the target production concept for combined offline and online operation. Offline components periodically train and refresh the model and global teaching set, while online components process a single user charging session, classify it, optionally simplify it, and support user facing local or global explanation workflows.
```
.
├── Docs/
│   ├── INF399___MT4XAI_Master_s_Project.pdf
│   └── Diagrams/
├── Notebooks/
│   ├── 00__Data_Anonymisation.ipynb
│   ├── 01__Data_Wrangling_and_FE.ipynb
│   ├── 02__EDA.ipynb
│   ├── 03__Modelling.ipynb
│   ├── 04__Anomaly_Detection.ipynb
│   ├── 05__Curve_Simplification.ipynb
│   ├── 06__MT4XAI.ipynb
│   └── 07__MLLM_Experiment.ipynb
├── src/
│   ├── mt4xai/            # Core modelling, inference, ORS and teaching logic
│   └── mllm_experiment/   # MLLM experiment runner, prompts and metadata loaders
├── Data/                  # Datasets and generated metadata or results
├── Figures/               # Teaching and exam set figures
├── Models/                # Trained model checkpoints and tuning outputs
├── scripts/               # Utility scripts such as ORS validation
├── config.yaml
├── project_config.py
├── linux_requirements.txt
└── pyproject.toml
```
Notebook execution is split into a reproducibility only step and a recommended main workflow.
> [!IMPORTANT]
> `00__Data_Anonymisation.ipynb` depends on the private raw dataset and external weather APIs. Readers without access to the PII-sensitive raw dataset should start from `01__Data_Wrangling_and_FE.ipynb`.
| Step | Notebook | Role in pipeline | Main outputs |
|---|---|---|---|
| 0 | `00__Data_Anonymisation.ipynb` | Reproducibility-only preprocessing on private raw source | `Data/etron55-charging-sessions-public.parquet` |
| 1 | `01__Data_Wrangling_and_FE.ipynb` | Public dataset wrangling and feature engineering for modelling | `Data/etron55-charging-sessions.parquet` |
| 2 | `02__EDA.ipynb` | Exploratory analysis used for design choices | EDA tables and figures |
| 3 | `03__Modelling.ipynb` | Forecasting model training and tuning | `Models/` checkpoints and selected final model |
| 4 | `04__Anomaly_Detection.ipynb` | Session-level scoring and threshold calibration | Anomaly metric and threshold configuration |
| 5 | `05__Curve_Simplification.ipynb` | ORS diagnostics and parameter selection | Simplification diagnostics and examples |
| 6 | `06__MT4XAI.ipynb` | Teaching pool construction and teaching/exam set export | `Figures/teaching_sets/`, `Figures/exam_sets/`, metadata build inputs |
| 7 | `07__MLLM_Experiment.ipynb` | Trial execution wrapper and effect analysis for groups A to F | `Data/mllm_experiment_results/` analysis outputs |
Recommended order for most users is 01 to 07.
`src/mt4xai/` is the reusable pipeline package used by the notebooks.
- `data.py`: data handling, splits, scaling, session containers
- `model.py`: LSTM and TCN forecasting models and loaders
- `train.py`: Ray-based tuning and training loops
- `inference.py`: residual prediction reconstruction, anomaly metrics and classification helpers
- `ors.py`: merged ORS implementation including DP prefix v3 mode
- `ors_v3.py`: compatibility wrapper that maps legacy `ors_v3` imports to `mt4xai.ors`
- `teach.py`: teaching pool and teaching set construction logic
- `plot.py`: visualisation helpers
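The compatibility wrapper idea behind `ors_v3.py` is a standard module-aliasing pattern. A self-contained sketch with stand-in module names (not the repository's real layout or API):

```python
import sys
import types

# Register the legacy module name as an alias of the merged module, so old
# `import ors_v3`-style imports keep resolving to the new implementation.
merged = types.ModuleType("demo_ors")        # stands in for mt4xai.ors
merged.simplify = lambda curve: curve[::2]   # hypothetical public function
sys.modules["demo_ors"] = merged
sys.modules["demo_ors_v3"] = merged          # legacy alias -> same module

import demo_ors_v3                           # legacy import still works

print(demo_ors_v3.simplify([1, 2, 3, 4]))    # -> [1, 3]
```

In practice a shim file that does `from mt4xai.ors import *` plus a `DeprecationWarning` achieves the same effect with less magic.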
`src/mllm_experiment/` is the scriptable MLLM evaluation package.
- `run_trial.py`: trial CLI entry point and run metadata handling
- `trial.py`: participant lifecycle across pre-exam, teaching and post-exam
- `data_loading.py`: metadata and image loading with group-aware resolution
- `prompts.py`: multimodal prompt builders per phase and condition
- `openai_client.py`: OpenAI Responses API wrapper with retries and shared rate limiting
- `build_teaching_sets.py`: anonymised teaching set image export and `teaching_items.csv`
- `build_exam_sets.py`: anonymised exam set image export and `exam_items.csv`
- `utils.py`: structured logging and result writers
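The retry and shared rate limiting behaviour described for `openai_client.py` follows a common pattern. The sketch below is generic: the class name, backoff schedule and intervals are made up here, and the real wrapper targets the OpenAI Responses API rather than arbitrary callables.

```python
import threading
import time

class RateLimitedRetry:
    """Serialise calls through a shared lock so concurrent participants
    respect one global minimum interval, and retry failures with
    exponential backoff."""

    def __init__(self, min_interval: float = 0.0, max_retries: int = 3):
        self._lock = threading.Lock()
        self._last_call = 0.0
        self.min_interval = min_interval
        self.max_retries = max_retries

    def call(self, fn, *args, **kwargs):
        for attempt in range(self.max_retries + 1):
            with self._lock:                       # shared across threads
                wait = self._last_call + self.min_interval - time.monotonic()
                if wait > 0:
                    time.sleep(wait)
                self._last_call = time.monotonic()
            try:
                return fn(*args, **kwargs)
            except Exception:
                if attempt == self.max_retries:
                    raise
                time.sleep(2 ** attempt * 0.1)     # exponential backoff

# A flaky function that fails twice before succeeding.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient error")
    return "ok"

client = RateLimitedRetry(min_interval=0.0, max_retries=3)
print(client.call(flaky))  # -> "ok" on the third attempt
```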
Current trial support includes conditions A to F.
- A: overlay plus simplified, with curriculum
- B: overlay plus simplified, without curriculum
- C: raw modality only
- D: simplified modality with curriculum
- E: simplified modality with enforced rule update protocol and locked post-exam rule use
- F: baseline with no teaching phase
Create `.env` from the committed template:

```shell
cp example.env .env
```

Required secrets by workflow:

- `OPENAI_API_KEY` for real API calls in `07__MLLM_Experiment.ipynb` and `mllm_experiment.run_trial`
- `FROST_API_CLIENT_ID` and `DMI_MET_OBS_API_KEY` for `00__Data_Anonymisation.ipynb`
The template also includes legacy optional placeholders currently present in local environment files.
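A minimal fail-fast check for the required secrets could look like this (the workflow keys below are illustrative; the variable names are the ones listed above):

```python
import os

REQUIRED_BY_WORKFLOW = {
    "mllm_experiment": ["OPENAI_API_KEY"],
    "data_anonymisation": ["FROST_API_CLIENT_ID", "DMI_MET_OBS_API_KEY"],
}

def missing_secrets(workflow: str) -> list:
    """Return required environment variables that are unset or empty."""
    return [k for k in REQUIRED_BY_WORKFLOW[workflow] if not os.environ.get(k)]

# Example: fail fast before starting an expensive run.
missing = missing_secrets("mllm_experiment")
if missing:
    print("missing:", ", ".join(missing))
```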
Python 3.12 is the target runtime.
Default (CUDA) install:

```shell
python3.12 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r linux_requirements.txt
pip install -e .
python - <<'PY'
import torch
print("cuda_available=", torch.cuda.is_available())
PY
```

CPU-only install (strip the CUDA wheels from the pinned requirements and install a CPU build of torch):

```shell
python3.12 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
grep -vE '^(torch==|nvidia-|triton==)' linux_requirements.txt > /tmp/requirements.cpu.txt
pip install --index-url https://download.pytorch.org/whl/cpu torch==2.5.0
pip install -r /tmp/requirements.cpu.txt
pip install -e .
python - <<'PY'
import torch
print("cuda_available=", torch.cuda.is_available())
PY
```

Launch JupyterLab:

```shell
cd Notebooks
../.venv/bin/jupyter lab
```

Run in order:

1. `01__Data_Wrangling_and_FE.ipynb`
2. `02__EDA.ipynb`
3. `03__Modelling.ipynb`
4. `04__Anomaly_Detection.ipynb`
5. `05__Curve_Simplification.ipynb`
6. `06__MT4XAI.ipynb`
7. `07__MLLM_Experiment.ipynb`
Run notebook 00__Data_Anonymisation.ipynb only when reproducing private data anonymisation.
```shell
source .venv/bin/activate
exp-build-teaching-sets
exp-build-exam-sets
```

Equivalent module form:

```shell
PYTHONPATH=src python -m mllm_experiment.build_teaching_sets
PYTHONPATH=src python -m mllm_experiment.build_exam_sets
```

Dry run (no real API calls):

```shell
PYTHONPATH=src python -m mllm_experiment.run_trial \
  --participants 1 \
  --teaching_set_dir Figures/teaching_sets/mllm_experiment_sets \
  --exam_sets_dir Figures/exam_sets/mllm_experiment_sets \
  --metadata_dir Data/mllm_experiment_metadata \
  --conditions all \
  --output_dir Data/mllm_experiment_results/dry_run \
  --random_seed 42 \
  --dry_run
```

Full run:

```shell
PYTHONPATH=src python -m mllm_experiment.run_trial \
  --participants 30 \
  --teaching_set_dir Figures/teaching_sets/mllm_experiment_sets \
  --exam_sets_dir Figures/exam_sets/mllm_experiment_sets \
  --metadata_dir Data/mllm_experiment_metadata \
  --conditions all \
  --output_dir Data/mllm_experiment_results/gpt-5-nano_experiment_1 \
  --model_name gpt-5-nano \
  --parallel_participants 2 \
  --random_seed 42
```

- Keep `config.yaml`, notebook parameters, and CLI flags versioned together for each run
- Use unique output directories under `Data/mllm_experiment_results/`
- Keep immutable snapshots of `Data/mllm_experiment_metadata/*.csv`, run metadata JSON, and selected model checkpoints
- Run dry runs before real API runs to validate metadata and prompt protocol formatting
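The snapshotting points above could be automated with a small helper. Everything below (paths, file names, manifest format) is illustrative, not part of the repository:

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path

def snapshot_run(run_dir: Path, config_path: Path, metadata_files: list) -> Path:
    """Copy the run's config and metadata files into the output directory and
    record their SHA-256 hashes, so every result folder is self-describing."""
    run_dir.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for src in [config_path, *metadata_files]:
        dest = run_dir / src.name
        shutil.copy2(src, dest)
        manifest[src.name] = hashlib.sha256(dest.read_bytes()).hexdigest()
    manifest_path = run_dir / "snapshot_manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest_path

# Demo with throwaway files standing in for config.yaml and the metadata CSVs.
with tempfile.TemporaryDirectory() as d:
    base = Path(d)
    cfg = base / "config.yaml"
    cfg.write_text("seed: 42\n")
    meta = base / "teaching_items.csv"
    meta.write_text("id,label\n1,normal\n")
    manifest_path = snapshot_run(base / "run_001", cfg, [meta])
    print(sorted(json.loads(manifest_path.read_text())))
    # -> ['config.yaml', 'teaching_items.csv']
```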


