fix: add units to result summary output by hashwnath · Pull Request #118 · fw-ai/benchmark

hashwnath · 2026-06-01T02:32:18Z

Summary

Appends (ms) to all time-based metric labels in the summary output, so users can immediately tell what unit each value uses
Covers base metrics (Time To First Token, Latency Per Token, Overall Latency Per Token, Total Latency, Latency Per Embedding, Latency Per Char) and their percentile variants (P50, P90, P95, P99, P99.9)

Closes #5

Before

Time To First Token: 5.705300167132269
Latency Per Token  : 135.50119360148753
Total Latency      : 28838.560053018486

After

Time To First Token (ms): 5.705300167132269
Latency Per Token (ms)  : 135.50119360148753
Total Latency (ms)      : 28838.560053018486

Test plan

Run a load test with --stream and verify summary labels show (ms) for time-based metrics
Run with --embeddings and verify Latency Per Embedding (ms) appears
Confirm non-time metrics (e.g. Num Tokens, Qps, Prompt Tokens) do not get a unit suffix

🤖 Generated with Claude Code

Note

Low Risk
Display-only label formatting in the quit summary and optional CSV export; no change to measured values or request behavior.

Overview
Load test summary output (console and --summary-file CSV column headers) now appends (ms) to labels for time-based metrics.

The one-line pretty_name lambda is replaced with logic that maps known latency metrics (time_to_first_token, latency_per_token, overall_latency_per_token, total_latency, latency_per_embedding, latency_per_char) and their percentile variants (e.g. P50_total_latency) by stripping the P50_ / P90_ / etc. prefix before deciding whether to add the unit suffix. Count/rate fields like QPS and token counts are unchanged.

^{Reviewed by Cursor Bugbot for commit aee9790. Bugbot is set up for automated code reviews on this repo. Configure here.}

Appends "(ms)" to all time-based metric labels in the summary output so users can immediately see the unit. This covers both the base metrics (e.g. "Total Latency (ms)") and their percentile variants (e.g. "P99 Total Latency (ms)"). Closes fw-ai#5 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{Reviewed by Cursor Bugbot for commit bcbbce5. Configure here.}

P99.9_ metrics were matching the P99_ prefix first, causing the (ms) suffix to be silently dropped for P99.9 percentile time metrics. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cursor Bot reviewed Jun 1, 2026

View reviewed changes

Comment thread llm_bench/load_test.py

fix: reorder percentile prefixes so P99.9 is checked before P99

aee9790

P99.9_ metrics were matching the P99_ prefix first, causing the (ms) suffix to be silently dropped for P99.9 percentile time metrics. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: add units to result summary output#118

fix: add units to result summary output#118
hashwnath wants to merge 2 commits into
fw-ai:mainfrom
hashwnath:fix/5-add-units-to-summary

hashwnath commented Jun 1, 2026 •

edited by cursor Bot

Loading

Uh oh!

cursor Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

hashwnath commented Jun 1, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Before

After

Test plan

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

hashwnath commented Jun 1, 2026 •

edited by cursor Bot

Loading