Measurement uncertainty MCP server + 6 AI agents + verifiable audit trail. ISO/IEC 17025 + KOLAS-accredited laboratories.
A compliance operating system for Korea's 1,200+ KOLAS-accredited testing, calibration, RMP, and inspection institutions — and a measurement uncertainty MCP server that any MCP-compatible AI client (Claude Desktop, Cursor, VS Code, etc.) can call directly.
Built solo by Yongbeom Kim (KTR ISO 17034 cert. 2024-CM-007), who ran KOLAS audits at KIM's Reference with zero major non-conformities and got tired of redoing everything in Excel and email every quarter.

    claude mcp add --transport http measurement-uncertainty \
      https://measurement-uncertainty.mcpize.run

→ Then ask your AI: "Compute the GUM uncertainty for this voltage divider" or "Apply the TEM lattice template at 95% confidence".

    pip install -e ".[dev,ml]"
    streamlit run app.py

Live demo: metroai-gnbdv7pqq3quqsudb5pwvj.streamlit.app

    pip install metroai

Then, in Python:

    from metroai.templates import create_tem_lattice_calculator

    # Build the TEM lattice-spacing template and run it
    calc = create_tem_lattice_calculator()
    result = calc.calculate()

    # Report the measurand with its expanded uncertainty and coverage factor
    print(f"d = {result.measurand_value:.6f} nm "
          f"± {result.expanded_uncertainty:.4e} (k={result.coverage_factor:.2f})")

After we surveyed the SEM-lab-operator journey end to end, four new features landed in v0.7.0 to cover the applicant side of the accreditation workflow:
- 🔬 Domain-specific entry wizard — Landing page asks "Which instrument are you accrediting?" Five paths: SEM / TEM / AFM / OCD / general measurement. Each one routes to a domain dashboard with only the standards, KOLAS steps, uncertainty templates, and SOP checks that matter for that domain.
- 📚 Domain-specific KOLAS guides — Per-domain content: applicable ISO standards (4–6 per domain), six-step KOLAS accreditation process with typical pitfalls, 3–4 common nonconformities with root cause + MetroAI fix, and the typical uncertainty budget components. Content sourced from public ISO/SEMI/KOLAS-G-002 documents.
- 📝 KOLAS application form auto-generator — Fill an organization profile once → ReportLab generates a 7-section ISO/IEC 17025-style accreditation application PDF (organization info, scope, personnel, equipment, reference standards, environmental control, quality system). Generic template; final submission should be cross-checked against KAB's latest official form.
- 📋 Domain SOP rule-based checklist — Each domain ships with a 10-item SOP checklist derived from KOLAS-evaluator-perspective common findings. Real-time gap score and 1-click "add to orchestrator queue" for remediation.
End-to-end, the v0.7.0 changes raised our internal "lab-operator journey fit score" from 45% to 68% on a 7-stage scenario (entry → guide → form → KOLAS process → SOP check → simulation → end-to-end). The remaining 32% includes stages we can't automate ourselves (the "consulting + on-site evaluator hand-holding" piece of the journey).
| Agent | Role | Data source |
|---|---|---|
| `semi-intel` | Semiconductor industry signals | DART (Korea FSS) + NTIS R&D feeds |
| `job-scout` | Personnel turnover signal | Public job postings (baseline stub) |
| `kolas-monitor` | KOLAS / KAB / KTR notice scan | knab.go.kr live fetch (with stub fallback) |
| `kolas-audit-predictor` | Next-audit risk prediction | Rule baseline + optional GBT model |
| `orchestrator` | Integrated P0/P1/P2 task queue | All other agents |
| `schedule` | Calibration / audit / review calendar | Internal events DB |
Every agent output carries `is_live` / `data_origin` flags (live · stub ·
synthetic), so the UI can clearly distinguish authoritative data from
heuristics.
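The flag-and-fallback pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not MetroAI's actual code: `DataOrigin`, `AgentOutput`, and `fetch_with_fallback` are hypothetical names, and only the `is_live` / `data_origin` fields and the live · stub · synthetic values come from the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class DataOrigin(str, Enum):
    """The three origins named in the README (hypothetical enum)."""
    LIVE = "live"
    STUB = "stub"
    SYNTHETIC = "synthetic"


@dataclass(frozen=True)
class AgentOutput:
    """Hypothetical container for one agent result."""
    agent: str
    payload: dict
    data_origin: DataOrigin

    @property
    def is_live(self) -> bool:
        # Derived from data_origin so the two flags can never disagree.
        return self.data_origin is DataOrigin.LIVE


def fetch_with_fallback(agent: str,
                        live_fetch: Callable[[], dict],
                        stub_payload: dict) -> AgentOutput:
    """Try the live source first; on any failure, return the stub and say so."""
    try:
        return AgentOutput(agent, live_fetch(), DataOrigin.LIVE)
    except Exception:
        # Broad catch is deliberate in this sketch: any fetch error means "not live".
        return AgentOutput(agent, stub_payload, DataOrigin.STUB)


def offline_fetch() -> dict:
    """Stand-in for a network call that fails."""
    raise OSError("network unreachable")


result = fetch_with_fallback("kolas-monitor", offline_fetch, {"notices": []})
print(result.agent, result.data_origin.value, result.is_live)
```

Deriving `is_live` from `data_origin` is one possible design choice; storing both as independent fields would also match the description but allows them to drift apart.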
- GUM (ISO/IEC Guide 98-3) — symbolic partial derivatives, Welch–Satterthwaite, expanded U
- MCM (ISO/IEC Guide 98-3 Suppl. 1) — Monte Carlo with configurable n
- QMC — Sobol low-discrepancy sequence (verified ±0.003% agreement with GUM analytic on simple linear models)
- `reverse_uncertainty` — novel as far as our prior-art search could determine. Given a target combined U, it computes the maximum allowed standard uncertainty per component. We did not find this in GUM Workbench, NIST Uncertainty Machine, or major open-source GUM tools as of 2026-05.
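To make the forward and reverse directions concrete, here is a small standard-library sketch for an uncorrelated linear model y = Σ cᵢxᵢ. It shows the GUM law of propagation, the Welch–Satterthwaite effective degrees of freedom, a Monte Carlo cross-check, and one possible reverse allocation. The equal-contribution split is an assumption chosen for illustration, as is reading the target as an expanded U with coverage factor k; the actual `reverse_uncertainty` tool may allocate differently. All function names and numbers are hypothetical.

```python
import math
import random


def combined_uncertainty(c, u):
    """GUM propagation, uncorrelated inputs: u_c = sqrt(sum((c_i * u_i)^2))."""
    return math.sqrt(sum((ci * ui) ** 2 for ci, ui in zip(c, u)))


def effective_dof(c, u, nu):
    """Welch-Satterthwaite: nu_eff = u_c^4 / sum((c_i * u_i)^4 / nu_i)."""
    uc = combined_uncertainty(c, u)
    return uc ** 4 / sum((ci * ui) ** 4 / ni for ci, ui, ni in zip(c, u, nu))


def monte_carlo_std(c, u, n=20000, seed=0):
    """Monte Carlo check: sample Gaussian inputs, return the std of y."""
    rng = random.Random(seed)
    ys = [sum(ci * rng.gauss(0.0, ui) for ci, ui in zip(c, u)) for _ in range(n)]
    mean = sum(ys) / n
    return math.sqrt(sum((y - mean) ** 2 for y in ys) / (n - 1))


def allocate_equal_contribution(target_U, k, c):
    """Reverse step (assumed rule): give every component the same variance share.

    u_c,target = U / k, and each |c_i| * u_i may be at most u_c,target / sqrt(N).
    """
    share = (target_U / k) / math.sqrt(len(c))
    return [share / abs(ci) for ci in c]


# Illustrative budget: sensitivity coefficients, standard uncertainties, dof
c = [1.0, 2.0, 0.5]
u = [0.3, 0.1, 0.4]
nu = [9, 50, 4]

uc = combined_uncertainty(c, u)
print(f"u_c = {uc:.4f}, nu_eff = {effective_dof(c, u, nu):.1f}")
print(f"Monte Carlo std = {monte_carlo_std(c, u):.4f}")

limits = allocate_equal_contribution(target_U=1.0, k=2.0, c=c)
print("max u_i per component:", [round(x, 4) for x in limits])
```

Feeding the allocated limits back into `combined_uncertainty` reproduces the target u_c = U/k, which is a quick way to sanity-check any allocation rule.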
| Template | Domain | Standard |
|---|---|---|
| `gauge_block` | Length | KOLAS-G-002 |
| `mass` | Mass (weights) | OIML R 111 |
| `temperature` | Temperature (PRT) | ITS-90 |
| `pressure` | Pressure | KOLAS-G-002 |
| `dc_voltage` | DC voltage | KOLAS-G-002 |
| `tem_lattice` (v0.6 new) | TEM d-spacing | Si CRM reference |
| `sem_eds` (v0.6 new) | SEM-EDS quantitative | ZAF, ISO 22489 |
| `afm_roughness` (v0.6 new) | AFM surface roughness Sa/Sq | ISO 25178-2 |
| `ocd_scatterometry` (v0.6 new) | OCD CD measurement | RCWA, SEMI MF-1789 |
- Ed25519 digital signatures (RFC 8032) — tamper-evident outputs
- W3C PROV-O provenance graphs (JSON-LD) — full input → model → output lineage
- Designed so a KOLAS auditor can verify no post-hoc tampering
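The lineage idea above can be illustrated with a small JSON-LD document that uses real PROV-O terms (`prov:Entity`, `prov:Activity`, `prov:used`, `prov:wasGeneratedBy`, `prov:wasDerivedFrom`). This is a sketch under stated assumptions, not the project's audit module: the `ex:` namespace, the `ex:sha256` and `ex:model` properties, and every function name are hypothetical. The Ed25519 signing step is left out because it needs a third-party library; it would sign the canonical bytes produced here.

```python
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    """Serialize deterministically so equal content always hashes equally."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sha256_hex(obj) -> str:
    """Content digest of any JSON-serializable object."""
    return hashlib.sha256(canonical_bytes(obj)).hexdigest()


def build_provenance(inputs: dict, model: str, result: dict) -> dict:
    """Link input -> calculation -> output as a PROV-O graph in JSON-LD."""
    return {
        "@context": {
            "prov": "http://www.w3.org/ns/prov#",
            "ex": "https://example.org/metro#",  # placeholder namespace
        },
        "@graph": [
            {"@id": "ex:input", "@type": "prov:Entity",
             "ex:sha256": sha256_hex(inputs)},
            {"@id": "ex:calculation", "@type": "prov:Activity",
             "prov:used": {"@id": "ex:input"}, "ex:model": model},
            {"@id": "ex:output", "@type": "prov:Entity",
             "prov:wasGeneratedBy": {"@id": "ex:calculation"},
             "prov:wasDerivedFrom": {"@id": "ex:input"},
             "ex:sha256": sha256_hex(result)},
        ],
    }


inputs = {"V_in": 10.0, "R1": 1000.0, "R2": 1000.0}
result = {"V_out": 5.0}
record = build_provenance(inputs, "V_out = V_in * R2 / (R1 + R2)", result)

# Any later change to the result changes its digest, so tampering is detectable.
tampered = build_provenance(inputs, "V_out = V_in * R2 / (R1 + R2)", {"V_out": 5.1})
print(sha256_hex(record) == sha256_hex(tampered))
```

Hashing a canonical serialization rather than the pretty-printed file keeps the digest stable across whitespace and key-order changes, which matters once a signature is attached.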
| Tool | Use case |
|---|---|
| `calculate_uncertainty` | GUM calculation across the 9 templates |
| `pt_analysis` | Proficiency Testing — z-score / En / zeta per ISO 13528 + ISO/IEC 17043 |
| `reverse_uncertainty` | Target-U → per-component limit allocation |
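For readers unfamiliar with the three proficiency-testing statistics named in the table, the sketch below computes them from their standard definitions. It is illustrative only and does not show the `pt_analysis` tool's real inputs or outputs; the function names and the sample numbers are made up.

```python
import math


def z_score(x, assigned, sigma_pt):
    """z = (x - X) / sigma_pt, with sigma_pt the standard deviation for PT."""
    return (x - assigned) / sigma_pt


def en_score(x, assigned, U_lab, U_ref):
    """En = (x - X) / sqrt(U_lab^2 + U_ref^2), using expanded uncertainties."""
    return (x - assigned) / math.sqrt(U_lab ** 2 + U_ref ** 2)


def zeta_score(x, assigned, u_lab, u_ref):
    """zeta = (x - X) / sqrt(u_lab^2 + u_ref^2), using standard uncertainties."""
    return (x - assigned) / math.sqrt(u_lab ** 2 + u_ref ** 2)


def classify_z(z):
    """Conventional limits: |z| <= 2 satisfactory, 2 < |z| < 3 questionable."""
    magnitude = abs(z)
    if magnitude <= 2.0:
        return "satisfactory"
    if magnitude < 3.0:
        return "questionable"
    return "unsatisfactory"


# Made-up participant result against an assigned value
x, assigned = 10.12, 10.00
z = z_score(x, assigned, sigma_pt=0.05)
en = en_score(x, assigned, U_lab=0.10, U_ref=0.05)
zeta = zeta_score(x, assigned, u_lab=0.05, u_ref=0.025)

print(f"z = {z:.2f} ({classify_z(z)})")
print(f"En = {en:.2f} ({'satisfactory' if abs(en) <= 1.0 else 'unsatisfactory'})")
print(f"zeta = {zeta:.2f}")
```

The z-score judges the result against the scheme's spread, while En and zeta judge it against the uncertainties the laboratory itself claims, so a lab can pass one and fail the other.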
5-fold CV on synthetic data: accuracy 60.6% ± 3.1pp · ROC-AUC 0.628 ± 0.038 · Brier 0.241 · F1 0.636 (n=2000 × 6 features, label noise 0.15)
- GradientBoostingClassifier (n_estimators=200, depth=3, lr=0.05)
- Top 3 feature importances: `months_since_last_audit` (0.34) · `personnel_turnover` (0.25) · `sop_completeness` (0.24), aligned with domain intuition.
- External validation on real KOLAS audit outcomes is pending. Synthetic-data metrics do not imply real-world accuracy.
- A prior sandbox figure of 87.1% has been removed from all artifacts. See `docs/HONESTY_NOTES.md` for citation rules.
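As a reading aid for the Brier score quoted above, this short sketch computes it from its definition, the mean squared difference between predicted probability and the 0/1 outcome. It is not the project's evaluation code, and the labels and probabilities are invented for illustration. A forecast that always predicts 0.5 scores exactly 0.25 on any labels, which gives a reference point for reading the reported 0.241.

```python
def brier_score(probs, labels):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and equal in length")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


# Invented example: five audits, 1 = non-conformity found
labels = [1, 0, 1, 1, 0]
model_probs = [0.8, 0.3, 0.6, 0.4, 0.1]
constant_probs = [0.5] * len(labels)

print(f"model:    {brier_score(model_probs, labels):.3f}")
print(f"constant: {brier_score(constant_probs, labels):.3f}")
```

Lower is better, with 0 for a perfect forecaster.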
- ISO/IEC 17025:2017 (testing & calibration laboratories)
- ISO/IEC Guide 98-3 (GUM) + Suppl. 1 (MCM)
- ISO 13528 + ISO/IEC 17043 (proficiency testing)
- ISO 18516 (microscope methods)
- ISO 25178-2 (areal surface texture)
- ISO 22489 (SEM-EDS quantitative)
- KOLAS-G-001 / G-002 (Korean accreditation guidelines)
- SEMI MF-1789 (OCD scatterometry)
- W3C PROV-O (audit provenance)
- RFC 8032 (Ed25519 signatures)
v2 backbone (since v0.6.0):
- 🏠 Landing (`app.py`) — KOLAS Compliance OS positioning + domain wizard
- 🤖 6 Agents Dashboard (`pages/11`) — Quality Manager daily view, KPI strip + task queue
- 📋 SOP Gap Analyzer (`pages/12`) — Technical Manager work surface, AI-detected gaps + v0.7 domain-specific checklist
- 📰 KOLAS Feed (`pages/13`) — kolas-monitor regulatory news
- 🎯 Audit Risk Detail (`pages/14`) — explainability, waterfall + AI reasoning + what-if
- 📅 Ops Backbone (`pages/15`) — certificates / personnel / schedule
v0.7.0 P0 — lab-operator journey (NEW):
- 🔬 SEM domain dashboard (`pages/16`) — SEM-EDS standards + KOLAS process + nonconformities + SOP checklist
- ⚛️ TEM domain dashboard (`pages/17`) — lattice constant, ISO 29301 + Cs-corrector spec
- 📐 AFM domain dashboard (`pages/18`) — surface roughness Sa/Sq/Sz per ISO 25178-2
- 📏 OCD domain dashboard (`pages/19`) — Scatterometry / RCWA library matching per SEMI MF-1789
- 📝 KOLAS application form (`pages/20`) — Fill-once → 7-section ISO 17025-style PDF (KAB-F-21 reference)
Plus the legacy v0.5 calibration / PT / certificate pages (pages/1–10).
metroai/
├── app.py ← v2-spec landing page (Streamlit entry)
├── app_v0_5_backup.py ← Legacy v0.5 landing (backup)
├── pages/ ← Streamlit multi-page
│ ├── 1_📐_불확도_계산.py ← Uncertainty calculator (KR)
│ ├── 2_📊_PT_분석.py ← PT analysis (KR)
│ ├── 3_📄_교정성적서.py ← Calibration certificate PDF
│ ├── 4_🔄_불확도_역설계.py ← Reverse uncertainty (novel)
│ ├── 11_🤖_6_Agents.py ← v2 block 2: main dashboard
│ ├── 12_📋_SOP_갭_분석.py ← v2 block 4: SOP gap analyzer
│ ├── 13_📰_KOLAS_피드.py ← v2 block 5: regulatory feed
│ ├── 14_🎯_감사_위험_상세.py ← v2 block 3: risk explainability
│ └── 15_📅_인증서_인력_일정.py ← v2 block 6: operations
├── metroai/
│ ├── core/ ← GUM / MCM / model parsing
│ ├── agents/ ← 6 AI agents backbone
│ ├── audit/ ← Ed25519 + PROV-O
│ ├── connectors/ ← KOLAS / DART / NTIS live fetch + stub fallback
│ ├── math/ ← Sobol QMC
│ ├── ml/ ← GBT audit-risk model + synthetic data
│ ├── templates/ ← 9 calibration templates
│ ├── schemas.py ← Pydantic v2 input validation
│ ├── exceptions.py ← MetroAIError hierarchy
│ └── mcp_server.py ← MCP stdio server
├── tests/ ← 80+ unit tests (pytest)
├── docs/
│ ├── HONESTY_NOTES.md ← Citation rules
│ ├── v0.7.0_ROADMAP.md ← Next 3 months
│ └── RELEASE_NOTES_v0.6.0.md
├── mcp_manifest.json ← MCPize manifest (v0.6.0)
└── pyproject.toml
The roadmap was reordered on 5/19 around the lab-operator journey, after a virtual-user audit revealed that v0.6.0 covered only 45% of the path to accreditation. The philosophy shifted from outbound-first to user-fit-first.
| Priority | Item | Status | Goal |
|---|---|---|---|
| P0 | Domain-specific entry wizard (SEM/TEM/AFM/OCD/general) | ✅ shipped v0.7.0 | Stage 1 of journey |
| P0 | Domain-specific KOLAS guides | ✅ shipped v0.7.0 | Stage 2, 4 |
| P0 | KOLAS application form auto-generator | ✅ shipped v0.7.0 | Stage 3 |
| P0 | Domain SOP rule-based checklist | ✅ shipped v0.7.0 | Stage 5 |
| P1 | Real KOLAS audit data + GBT retrain | pending | Replace synthetic 60.6% |
| P2 | HF Spaces migration | guide ready | Eliminate Streamlit Cloud sleep |
| P2 | Cold email to 5 KOLAS labs (post-P0 fit ≥ 80%) | pending | First user signal |
| P2 | Show HN + Reddit r/MachineLearning publish | drafts ready | External signal |
| P3 | Consulting SOP guide (per-domain on-site eval prep) | needs author | Cover stage 7 partially |
| P3 | LLM-assisted kolas-monitor (real inference) | stub now | Clear AI differentiation |
See docs/v0.7.0_ROADMAP.md for the full plan.
- Python 3.10+ (tested on 3.10 / 3.11 / 3.12)
- Streamlit (web UI)
- sympy / numpy / scipy (numerical)
- Pydantic v2 (input validation)
- cryptography (Ed25519)
- scikit-learn (GBT model, optional `[ml]` extra)
- reportlab + openpyxl (PDF + Excel export)
- altair / plotly (visualizations)

    pip install -e ".[dev,ml]"
    pytest tests/ -v

Latest CI on Python 3.10 / 3.11 / 3.12: 36 passing v0.6.0 tests plus the v0.5 legacy suite.
MIT License. See LICENSE.
- GitHub Issues — bug reports, feature requests
- GitHub Discussions — Q&A, design discussion
- MCPize page — install + reviews: mcpize.com/mcp/measurement-uncertainty
- Glama listing — glama.ai/mcp/servers?query=metroai
- Email — kyb8801@gmail.com (KOLAS-side feedback especially welcome)
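This project provides Korean-language pages and a Korean UI so that practitioners at
KOLAS-accredited institutions in Korea can use it directly. For a detailed Korean guide, see
docs/RELEASE_NOTES_v0.6.0.md and the Korean pages of the Streamlit app (uncertainty calculation /
PT analysis / calibration certificate / reverse uncertainty design / KOLAS roadmap /
6 Agents dashboard / SOP gap analysis / KOLAS feed / audit risk detail / operations backbone).
Cold feedback is welcome at kyb8801@gmail.com.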
Built with care by @kyb8801 · KIM's Reference KOLAS RMP cert. KRMPs-021 background.