Releases: RobotStudyCompanion/Benchmark_LM
v0.1
First tagged release accompanying the paper
"Benchmarking Local Language Models for Social Robots using Edge Devices"
(accepted at IEEE ARSO 2026).
Release summary. Reproducible benchmark suite covering 25 open-source
language models on Raspberry Pi 4, Raspberry Pi 5, and laptop-GPU hosts.
Evaluates inference efficiency (TPS, TPJ), knowledge (six-category MMLU
subset), and teaching effectiveness (LLM-rated against eight criteria,
validated by five human raters).
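As an illustration of the two efficiency metrics, the sketch below assumes TPS means tokens per second and TPJ means tokens per joule; the release text does not expand either abbreviation. The `RunSample` type, its field names, and the numbers are hypothetical and are not taken from the benchmark code or its results.

```python
from dataclasses import dataclass


@dataclass
class RunSample:
    """One generation run (hypothetical record layout, for illustration only)."""
    tokens_generated: int   # tokens produced by the model
    duration_s: float       # wall-clock generation time in seconds
    energy_j: float         # energy drawn during generation in joules


def tokens_per_second(sample: RunSample) -> float:
    """Throughput: generated tokens divided by elapsed seconds."""
    if sample.duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return sample.tokens_generated / sample.duration_s


def tokens_per_joule(sample: RunSample) -> float:
    """Energy efficiency: generated tokens divided by joules consumed."""
    if sample.energy_j <= 0:
        raise ValueError("energy_j must be positive")
    return sample.tokens_generated / sample.energy_j


# Illustrative values only, not measured data.
sample = RunSample(tokens_generated=120, duration_s=30.0, energy_j=240.0)
tps = tokens_per_second(sample)
tpj = tokens_per_joule(sample)
print(f"TPS={tps:.2f} TPJ={tpj:.2f}")
```

Reporting both values side by side separates raw speed from energy cost, which matters when comparing a battery-constrained board with a laptop GPU.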
Accompanying data record: https://doi.org/10.5281/zenodo.19643021
Highlights since dorian-original:
- Consolidated per-platform runners and analysers from the development
  repository (orlandossss/Master_Benchmark, archiving).
- Disk-I/O telemetry on the Raspberry Pi runners, matching the data
  published in the Zenodo record.
- Linux-only packaging with pinned requirements.txt and setup.sh.
- Syntax-check CI workflow on push and pull request.
- Apache-2.0 licence, CITATION.cff, hardened .gitignore.
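The release notes do not show which commands the syntax-check workflow runs. As a rough sketch of what such a check can amount to for Python sources, the snippet below compiles each source without executing it and collects the failures. The function name and the demo file names are hypothetical.

```python
def find_syntax_errors(sources: dict[str, str]) -> dict[str, str]:
    """Map each file name whose source fails to compile to a short message."""
    failures: dict[str, str] = {}
    for name, text in sources.items():
        try:
            # Parse and compile only; the code is never executed.
            compile(text, name, "exec")
        except SyntaxError as exc:
            failures[name] = f"line {exc.lineno}: {exc.msg}"
    return failures


# Hypothetical in-memory sources standing in for repository files.
demo_sources = {
    "runner_ok.py": "print('tokens')\n",
    "runner_broken.py": "def run(:\n    pass\n",
}
failures = find_syntax_errors(demo_sources)
for name, message in failures.items():
    print(f"{name}: {message}")
```

For files on disk, the standard-library `python -m compileall <directory>` command performs a comparable compile-only pass.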
Known scope: the three benchmark runners and three analysers remain
separate per-platform scripts for v0.1. Consolidation into a single
platform-aware runner is scoped for v0.2 — see future_work/ for the
broader forward-looking roadmap.
Full Changelog: dorian-original...v0.1