Proposal: GNAP as a coordination layer for AutoChain's automated agent evaluation
AutoChain is a lightweight generative agent framework focused on rapid iteration: easy customization and automated multi-turn evaluation via simulated conversations. As evaluation pipelines grow more complex (multiple agents evaluated in parallel, coordinator + specialist patterns), coordinating the runs becomes important.
GNAP (Git-Native Agent Protocol) provides a minimal coordination substrate: a git repo in which task files move through board/todo/ → board/doing/ → board/done/. It adds no extra dependencies, so AutoChain stays as lightweight as it already is.
Applied to AutoChain's workflow evaluation:
AutoChain evaluates agents with simulated conversations. GNAP could coordinate parallel evaluation runs:
board/todo/eval-agent-v1-scenario-happy-path.md ← Evaluator creates
board/todo/eval-agent-v1-scenario-edge-case.md
board/todo/eval-agent-v1-scenario-error-recovery.md
board/doing/eval-agent-v1-scenario-happy-path.md ← AutoChain eval agent claims
board/doing/eval-agent-v1-scenario-edge-case.md ← Another eval agent claims
board/done/eval-agent-v1-scenario-happy-path.md ← Eval results + success rate
This gives parallel evaluation without conflicts (each scenario file is claimed by exactly one eval agent), results that persist across restarts, and a full audit trail of which scenarios passed or failed, all via git and with no additional infrastructure.
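To make the lifecycle above concrete, here is a minimal sketch of the todo → doing → done flow using plain filesystem moves. It is an illustration under assumptions, not GNAP's actual implementation: the helper names (`init_board`, `create_task`, `claim_task`, `complete_task`) are hypothetical, and a real setup would presumably wrap each move in a git commit and push so that other agents see the claim.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional

STAGES = ("todo", "doing", "done")


def init_board(root: Path) -> Path:
    """Create board/todo, board/doing and board/done under root."""
    board = root / "board"
    for stage in STAGES:
        (board / stage).mkdir(parents=True, exist_ok=True)
    return board


def create_task(board: Path, name: str, body: str) -> Path:
    """Evaluator side: drop a new scenario file into board/todo/."""
    path = board / "todo" / name
    path.write_text(body, encoding="utf-8")
    return path


def claim_task(board: Path, name: str) -> Optional[Path]:
    """Eval agent side: move the task from todo/ to doing/.

    Returns None if the file is no longer in todo/, i.e. another
    agent has already claimed it. (Assumption: a local rename stands
    in for the git commit + push a real board would use.)
    """
    src = board / "todo" / name
    dst = board / "doing" / name
    try:
        os.rename(src, dst)
    except FileNotFoundError:
        return None
    return dst


def complete_task(board: Path, name: str, results: str) -> Path:
    """Append results to the task file and move it from doing/ to done/."""
    src = board / "doing" / name
    dst = board / "done" / name
    with src.open("a", encoding="utf-8") as handle:
        handle.write("\n## Results\n" + results + "\n")
    os.rename(src, dst)
    return dst


if __name__ == "__main__":
    board = init_board(Path(tempfile.mkdtemp()))
    task = "eval-agent-v1-scenario-happy-path.md"
    create_task(board, task, "scenario: happy-path\n")
    print(claim_task(board, task))   # first agent gets the path in doing/
    print(claim_task(board, task))   # second agent gets None
    print(complete_task(board, task, "success_rate: 1.0"))
```

The point of the sketch is that a claim is a single move of one file, so two agents cannot both end up holding the same scenario; with git, the equivalent guard would be a rejected push for whichever agent commits second.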
GNAP is particularly well suited to AutoChain's OpenAI function-calling agents: those agents already structure their outputs, so GNAP task files can carry structured specs that map directly to function call parameters.
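As a rough sketch of that mapping, a task file could carry a JSON spec whose keys are passed straight through as keyword arguments. Everything here is assumed for illustration: the task file layout (a title, a `---` separator, then a JSON object), the `parse_task_spec` helper, and the `run_scenario` function, which is a hypothetical stand-in and not part of AutoChain's API.

```python
import json
from typing import Any, Dict

# Hypothetical task file layout: a title line, a "---" separator, a JSON spec.
TASK_TEXT = """# eval-agent-v1-scenario-happy-path
---
{"agent": "agent-v1", "scenario": "happy-path", "max_turns": 5}
"""


def parse_task_spec(text: str) -> Dict[str, Any]:
    """Extract the JSON spec that follows the '---' separator."""
    _, _, payload = text.partition("\n---\n")
    if not payload.strip():
        raise ValueError("task file has no spec section")
    spec = json.loads(payload)
    if not isinstance(spec, dict):
        raise ValueError("task spec must be a JSON object")
    return spec


def run_scenario(agent: str, scenario: str, max_turns: int = 10) -> Dict[str, Any]:
    """Hypothetical stand-in for an evaluation entry point.

    It only echoes its arguments; a real runner would start the
    simulated conversation here.
    """
    return {"agent": agent, "scenario": scenario, "max_turns": max_turns}


if __name__ == "__main__":
    spec = parse_task_spec(TASK_TEXT)
    # The spec keys line up with the function parameters, so the
    # task file can be unpacked directly into the call.
    print(run_scenario(**spec))
```

Keeping the spec as a flat JSON object means the same task file could be validated against a function-calling schema before an agent claims it, though how strictly to validate is a design choice left open here.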