feat: curated failure dataset from traces (agent-strace dataset) #66

@Siddhant-K-code

Description

Implemented in #78 (v0.37.0). All acceptance criteria met:

  • `eval dataset add`, `list`, `show`, `export` commands
  • `eval dataset auto` with 6 signal filters: `has-errors`, `high-retry`, `cost-above:N`, `wide-blast`, `long-duration:Ns`, `low-eval-score:N`
  • Deduplication, `--since-days`, `--label` support
  • Stored as JSONL in `.agent-traces/datasets/`
  • Zero new dependencies
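
For readers unfamiliar with the idea, here is a minimal, stdlib-only sketch of how signal filters of the form listed above could select traces, skip duplicates, and serialize the result as JSONL. It is not the code from #78. The trace fields (`id`, `error_count`, `cost_usd`, `duration_s`, `eval_score`), the function names, and the choice to keep a trace when any filter matches are all assumptions made for illustration. `high-retry` and `wide-blast` are left out because their thresholds are not stated here.

```python
import json
from typing import Callable, Iterable

# Hypothetical trace summary: a plain dict. The real agent-strace schema may differ.
Trace = dict


def parse_filter(spec: str) -> Callable[[Trace], bool]:
    """Turn a spec such as 'cost-above:1.5' into a predicate over a trace."""
    name, _, arg = spec.partition(":")
    if name == "has-errors":
        return lambda t: t.get("error_count", 0) > 0
    if name == "cost-above":
        limit = float(arg)
        return lambda t: t.get("cost_usd", 0.0) > limit
    if name == "long-duration":
        limit = float(arg.rstrip("s"))  # accepts '30s' or '30'
        return lambda t: t.get("duration_s", 0.0) > limit
    if name == "low-eval-score":
        limit = float(arg)
        # A trace with no score is treated as not matching (assumption).
        return lambda t: t.get("eval_score") is not None and t["eval_score"] < limit
    raise ValueError(f"filter not covered by this sketch: {spec}")


def select(traces: Iterable[Trace], specs: list[str], seen_ids: Iterable[str] = ()) -> list[Trace]:
    """Keep traces matching at least one filter, skipping ids already seen."""
    predicates = [parse_filter(s) for s in specs]
    seen = set(seen_ids)
    picked = []
    for trace in traces:
        if trace["id"] in seen:
            continue  # deduplicate by trace id
        if any(p(trace) for p in predicates):
            seen.add(trace["id"])
            picked.append(trace)
    return picked


def to_jsonl(entries: Iterable[Trace]) -> str:
    """Serialize entries as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(e, sort_keys=True) + "\n" for e in entries)


traces = [
    {"id": "a", "error_count": 1, "cost_usd": 0.1, "duration_s": 5.0},
    {"id": "b", "error_count": 0, "cost_usd": 2.5, "duration_s": 5.0},
    {"id": "a", "error_count": 1, "cost_usd": 0.1, "duration_s": 5.0},  # duplicate id
    {"id": "c", "error_count": 0, "cost_usd": 0.1, "duration_s": 5.0},
]
picked = select(traces, ["has-errors", "cost-above:1"])
print(to_jsonl(picked), end="")
```

JSONL keeps each dataset entry on its own line, so new entries can be appended without rewriting the file, and the set of already stored ids can be passed as `seen_ids` to avoid adding the same trace twice.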
