feat: integrate step-level metrics into STOI product by liziye627-design · Pull Request #1 · KevinYoung-Kw/stoi-cli

liziye627-design · 2026-04-09T00:18:25Z

Summary

stoi_analyze: extract output_text from session content blocks, add compute_step_metrics(), aggregate_step_metrics(), print_step_metrics_report() functions, support --metrics flag
stoi: add metrics and report CLI commands with F/V/C/U quality dimensions and TE/SUS/FR/MG/RR core metrics
stoi_tui: add metrics-box to dashboard with F/V/C/U bars and core metrics display, enhance suggestions with step-level insights
tests: add 10 integration tests for compute/aggregate/output_text extraction
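As a rough illustration of what aggregating step-level metrics could look like, here is a minimal sketch. It assumes each step yields a dict of the five core metrics keyed by their abbreviations and that aggregation is a plain mean. The name `aggregate_sketch`, the dict shape, and the averaging rule are all hypothetical; the actual `aggregate_step_metrics()` in this PR may differ.

```python
from statistics import mean

# Abbreviations taken from the PR description; their ordering here is arbitrary.
METRIC_KEYS = ("TE", "SUS", "FR", "MG", "RR")


def aggregate_sketch(per_step):
    """Average each core metric over a list of per-step dicts (hypothetical shape)."""
    if not per_step:
        # Assumption: an empty session reports zeros rather than raising.
        return {key: 0.0 for key in METRIC_KEYS}
    return {
        key: mean(step.get(key, 0.0) for step in per_step)
        for key in METRIC_KEYS
    }


steps = [
    {"TE": 0.5, "SUS": 0.25, "FR": 0.0, "MG": 1.0, "RR": 0.5},
    {"TE": 1.0, "SUS": 0.75, "FR": 0.5, "MG": 0.0, "RR": 0.0},
]
summary = aggregate_sketch(steps)
```

A weighted mean (for example, weighting each step by its token count) would be an equally plausible design; the sketch uses an unweighted mean only for simplicity.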

Test plan

All 83 tests pass (python3 -m pytest tests/ -v)
python3 stoi.py metrics --latest — verify metrics command works
python3 stoi.py report --days 7 — verify report command works
python3 stoi.py analyze --metrics --top 5 — verify --metrics flag works
python3 stoi.py stats — verify existing commands still work
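For reviewers who want to picture the command surface exercised by the test plan, the following is a minimal `argparse` sketch. The subcommand and flag names come from the commands above; the defaults, help strings, and the `build_parser` helper are assumptions for illustration and are not taken from `stoi.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the commands in the test plan (illustrative only)."""
    parser = argparse.ArgumentParser(prog="stoi")
    sub = parser.add_subparsers(dest="command", required=True)

    metrics = sub.add_parser("metrics", help="show step-level metrics")
    metrics.add_argument("--latest", action="store_true",
                         help="only the most recent session")

    report = sub.add_parser("report", help="metrics report over a time window")
    report.add_argument("--days", type=int, default=7)  # default is an assumption

    analyze = sub.add_parser("analyze", help="analyze sessions")
    analyze.add_argument("--metrics", action="store_true",
                         help="include step-level metrics in the analysis")
    analyze.add_argument("--top", type=int, default=5)  # default is an assumption

    sub.add_parser("stats", help="existing stats command")
    return parser


args = build_parser().parse_args(["analyze", "--metrics", "--top", "5"])
```

Parsing `["analyze", "--metrics", "--top", "5"]` yields a namespace with `command="analyze"`, `metrics=True`, and `top=5`, which a dispatcher could then route to the matching handler.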

Based on the research report defining F/V/C/U quality dimensions and TE/SUS/FR/MG/RR core metrics:

- Add stoi_metrics.py: StepParser, TokenAttributor, QualityScorer, MetricsCalculator implementing step-level token attribution with tiktoken/heuristic fallback, four-dimensional quality scoring (Factuality, Validity, Coherence, Utility), and five quantitative metrics (Token Efficiency, Step Utility Score, Faithfulness Risk, Monitorability Gain, Redundancy Ratio)
- Fix calc_stoi algorithm: wasted_tokens now correctly counts only new_tokens instead of (total_context - cache_read), and applies cache_creation investment ratio to reduce waste penalty
- Fix calc_stoi_score L3 weight comment (0.35 -> 0.20 to match code)
- Update stoi_proxy.py with the corrected waste calculation
- Add token efficiency and useful output ratio to session reports
- Add comprehensive test suite (73 tests) covering engine fixes and the new metrics framework

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
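The "tiktoken/heuristic fallback" mentioned in this commit message can be sketched roughly as follows. This is an assumption-laden illustration, not the code in stoi_metrics.py: the function name `count_tokens`, the choice of the `cl100k_base` encoding, and the four-characters-per-token heuristic are all hypothetical.

```python
def count_tokens(text: str) -> int:
    """Count tokens with tiktoken when available, else estimate from length."""
    if not text:
        return 0
    try:
        # tiktoken is optional; importing or loading the encoding may fail
        # (package missing, or encoding files unavailable offline).
        import tiktoken

        encoding = tiktoken.get_encoding("cl100k_base")  # encoding choice is an assumption
        return len(encoding.encode(text))
    except Exception:
        # Heuristic fallback (assumption): roughly 4 characters per token,
        # never reporting fewer than 1 token for non-empty text.
        return max(1, len(text) // 4)


sample_count = count_tokens("step-level token attribution")
```

Catching a broad `Exception` keeps the fallback active both when the package is absent and when the encoding cannot be loaded, at the cost of hiding unrelated errors; a stricter version would catch `ImportError` separately.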

- stoi_analyze: extract output_text from session content blocks, add compute_step_metrics/aggregate_step_metrics/print_step_metrics_report functions, support --metrics flag in analyze command
- stoi: add 'metrics' and 'report' CLI commands with F/V/C/U quality dimensions and TE/SUS/FR/MG/RR core metrics
- stoi_tui: add metrics-box to dashboard with F/V/C/U bars and core metrics display, enhance suggestions with step-level insights
- tests: add 10 integration tests for compute/aggregate/output_text

KevinYoung-Kw and others added 6 commits April 9, 2026 03:13

Add STOI-Demo-cola with updated README

e7dbfb8

feat: add feedback validity analysis commands (backfill-feedback-validity, feedback-validity)

8405067

fix: add missing argparse import

4ba0514

fix: add UnicodeDecodeError handling and update table colors to cola theme

f54037b

liziye627-design wants to merge 6 commits into KevinYoung-Kw:main from liziye627-design:feat/step-level-metrics
