
Run LoCoMo end-to-end; establish baseline numbers #14

Description

@StigNorland

Once the dataset (#9) and spaCy (#8) are in place, run the full pipeline and record baselines. No results/ directory currently exists, so all thresholds are uncalibrated guesses.

  • Run locomo eval across the 5 QA categories
  • Create results/ and commit a baseline snapshot
  • Compare graph vs. graph_only vs. baseline context modes

Blocked by #8, #9.
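
The snapshot and comparison steps in the checklist could be sketched roughly as follows. This is only an illustration under stated assumptions: the `{mode: {category: score}}` layout, the `baseline.json` filename, and the helper names `write_snapshot` and `delta_vs_baseline` are hypothetical and not part of the existing code. Only the three context mode names and the `results/` directory come from this issue. How `locomo eval` reports its scores is not specified here, so the sketch starts from scores that have already been collected.

```python
import json
from pathlib import Path

# Context modes named in this issue; everything else below is hypothetical.
MODES = ("graph", "graph_only", "baseline")


def write_snapshot(scores, results_dir="results", name="baseline.json"):
    """Write per-mode, per-category scores to <results_dir>/<name>.

    `scores` is assumed to look like {mode: {category: score}}.
    Returns the path of the written file.
    """
    missing = [mode for mode in MODES if mode not in scores]
    if missing:
        raise ValueError(f"missing context modes: {missing}")
    out_dir = Path(results_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # results/ does not exist yet
    path = out_dir / name
    # Sorted keys keep the committed snapshot stable across runs.
    path.write_text(json.dumps(scores, indent=2, sort_keys=True) + "\n")
    return path


def delta_vs_baseline(scores, reference="baseline"):
    """Return {mode: {category: score - reference score}} for the other modes."""
    ref = scores[reference]
    return {
        mode: {cat: per_cat[cat] - ref[cat] for cat in ref}
        for mode, per_cat in scores.items()
        if mode != reference
    }
```

Keeping the snapshot as sorted, indented JSON means later runs produce small, reviewable diffs against the committed baseline, which is what calibrating the thresholds would rely on.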

Metadata

Assignees

No one assigned

    Labels

    eval: Evaluation and benchmarking

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
