Evaluator-optimizer loop that uses Claude Code Opus (free via Pro) to analyze judge feedback and refine underperforming agents' system prompts, with the same quality as the API evolve command at zero cost. Includes a load_underperformers.py helper script to surface agents with low win rates and their loss feedback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pull request overview
Adds a new Claude skill (mobius-evolve) intended to support “free” agent evolution by loading underperforming agents, surfacing their loss feedback, and guiding a local evaluator/optimizer loop to refine system prompts.
Changes:
- Added a `load_underperformers.py` helper script to list low-win-rate agents and print recent loss feedback.
- Added `SKILL.md` defining the `mobius-evolve` skill instructions and workflow for refining/registering improved agents.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| .claude/skills/mobius-evolve/scripts/load_underperformers.py | New script to query the registry/tournament DB for underperformers and print prompt + loss feedback. |
| .claude/skills/mobius-evolve/SKILL.md | New skill definition and step-by-step instructions to run the local evolution loop and register improved agents. |
… matches
- Strip raw `git show` output (commit metadata, diff headers, leading +/- chars) from SKILL.md and load_underperformers.py so they parse correctly
- Remove unused `json` and `row_to_dict` imports from load_underperformers.py
- Filter out unjudged matches (winner_id is None) from loss counting
- Update SKILL.md argument-hint to include --min-matches

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
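The loss-counting fix in this commit can be illustrated with a minimal sketch. It is not the code in load_underperformers.py: the `Match` record, the `find_underperformers` function, and the `max_win_rate` parameter are hypothetical names chosen for illustration, and the real script reads from the registry/tournament DB rather than an in-memory list. The only details taken from the PR are that a match with `winner_id` set to None is unjudged and that a minimum match count (`--min-matches`) applies.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Optional, Tuple


class Match(NamedTuple):
    # Hypothetical record shape; the real DB schema is not shown in this PR.
    agent_ids: Tuple[str, ...]
    winner_id: Optional[str]  # None means the match has not been judged yet


def find_underperformers(
    matches: Iterable[Match],
    max_win_rate: float = 0.4,  # assumed threshold, for illustration only
    min_matches: int = 3,       # mirrors the --min-matches argument
) -> Dict[str, float]:
    """Return {agent_id: win_rate} for agents at or below max_win_rate."""
    wins: Dict[str, int] = defaultdict(int)
    judged: Dict[str, int] = defaultdict(int)

    for match in matches:
        # Skip unjudged matches so they are not counted as losses.
        if match.winner_id is None:
            continue
        for agent_id in match.agent_ids:
            judged[agent_id] += 1
            if agent_id == match.winner_id:
                wins[agent_id] += 1

    underperformers: Dict[str, float] = {}
    for agent_id, total in judged.items():
        # Ignore agents with too few judged matches to be meaningful.
        if total < min_matches:
            continue
        rate = wins[agent_id] / total
        if rate <= max_win_rate:
            underperformers[agent_id] = rate
    return underperformers
```

Without the `winner_id is None` check, every participant in an unjudged match would fail the `agent_id == match.winner_id` test and be counted as having lost, which would understate win rates.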
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4a11c4f51e
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Free Opus-powered agent evolution via evaluator-optimizer loop. Analyzes judge feedback and refines underperforming agents' system prompts at zero API cost.
🤖 Generated with Claude Code