Add multiple runs per question and report average/stdev #149

@ghost

Description

Hello, please add the ability to run each question a fixed number of times instead of once, and report the average and standard deviation of all metrics (perhaps also min/max or some sort of histogram). That would reduce the effect of outliers in the testing process, such as network connection issues or LLM temperature effects.
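A rough sketch of the kind of aggregation being requested, in case it helps. Everything here is hypothetical and not taken from this project's code: `run_once` stands in for whatever function currently evaluates a single question and returns a dict of numeric metrics, and `num_runs` is an assumed name for the new setting.

```python
import statistics
from typing import Callable, Dict, List


def summarize(values: List[float]) -> Dict[str, float]:
    """Aggregate one metric across repeated runs."""
    return {
        "mean": statistics.mean(values),
        # Sample standard deviation needs at least two data points.
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


def run_repeated(
    run_once: Callable[[str], Dict[str, float]],  # hypothetical single-run evaluator
    question: str,
    num_runs: int = 5,  # hypothetical setting name
) -> Dict[str, Dict[str, float]]:
    """Call run_once num_runs times and summarize each metric it returns."""
    if num_runs < 1:
        raise ValueError("num_runs must be at least 1")
    samples: Dict[str, List[float]] = {}
    for _ in range(num_runs):
        for name, value in run_once(question).items():
            samples.setdefault(name, []).append(value)
    return {name: summarize(vals) for name, vals in samples.items()}
```

With `num_runs=1` this would behave like the current single-run mode, with `stdev` reported as 0.0. A histogram could be built from the raw `samples` lists if they were kept alongside the summary.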
