Summary
Mobius can evolve agent prompts via mobius evolve, but it never asks "what kind of agent am I missing?" or "what capability gap keeps causing failures?" This issue proposes a meta-layer (mobius improve) that diagnoses system-level gaps and proposes structural improvements.
Current State
The existing loop is: compete → judge → elo → evolve prompts → repeat
Everything improves within the existing agent pool. The system never:
- Identifies missing agent types or specializations
- Detects capability gaps from repeated failures
- Retires stale/underperforming agents
- Tracks why the system changed over time
Proposed: mobius improve
Three phases:
Phase 1: Diagnose (analyst.py + DiagnosisReport)
Analyze recent match history for patterns:
- Repeated failures — tasks where all agents score low → capability gap
- Narrow wins — barely-edged-out tasks → weak coverage
- Missing specializations — cluster tasks by topic, find uncovered areas
- Stale agents — high match count, declining win rate
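A rough sketch of how three of these checks might look, assuming match history can be flattened into one record per agent per match. `MatchResult`, the `DiagnosisReport` fields, and every threshold here are hypothetical placeholders, not the existing Mobius schema; topic clustering for missing specializations is left out.

```python
from dataclasses import dataclass, field


@dataclass
class MatchResult:
    """Hypothetical flattened record: one row per agent per match."""
    match_id: str
    task_id: str
    agent_id: str
    score: float  # assumed to be normalized to 0..1
    won: bool


@dataclass
class DiagnosisReport:
    capability_gaps: list[str] = field(default_factory=list)  # task ids
    narrow_wins: list[str] = field(default_factory=list)      # match ids
    stale_agents: list[str] = field(default_factory=list)     # agent ids


def diagnose(results, low_score=0.4, narrow_margin=0.05, min_matches=6):
    """Scan results (assumed to be in chronological order) for the patterns above."""
    report = DiagnosisReport()
    by_task, by_match, by_agent = {}, {}, {}
    for r in results:
        by_task.setdefault(r.task_id, []).append(r.score)
        by_match.setdefault(r.match_id, []).append(r.score)
        by_agent.setdefault(r.agent_id, []).append(r.won)

    # Repeated failures: no agent ever scored above the threshold on the task.
    for task_id, scores in by_task.items():
        if max(scores) < low_score:
            report.capability_gaps.append(task_id)

    # Narrow wins: the top two scores in a match are nearly tied.
    for match_id, scores in by_match.items():
        top = sorted(scores, reverse=True)[:2]
        if len(top) == 2 and top[0] - top[1] < narrow_margin:
            report.narrow_wins.append(match_id)

    # Stale agents: enough matches, and the later half wins less than the earlier half.
    for agent_id, wins in by_agent.items():
        if len(wins) >= min_matches:
            half = len(wins) // 2
            early = sum(wins[:half]) / half
            late = sum(wins[half:]) / (len(wins) - half)
            if late < early:
                report.stale_agents.append(agent_id)
    return report
```

Comparing the two halves of an agent's history is only one simple way to read "declining win rate"; a rolling window or Elo trend would be reasonable alternatives.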
Phase 2: Propose (ImprovementProposal model)
Generate typed proposals:
- create_agent: new agent for an uncovered specialization
- retire_agent: remove underperformers
- split_agent: break a generic agent into specialists
- add_capability: equip agents with new tools (e.g., WebSearch)
- system_change: structural improvements
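As a starting point, the typed proposal could look something like the sketch below. It uses a stdlib dataclass as a stand-in for the pydantic model so the example stays dependency-free; the field names mirror the proposals table in this issue, and `ProposalType` is a hypothetical name.

```python
from dataclasses import dataclass, field
from enum import Enum


class ProposalType(str, Enum):
    """The five proposal types listed above."""
    CREATE_AGENT = "create_agent"
    RETIRE_AGENT = "retire_agent"
    SPLIT_AGENT = "split_agent"
    ADD_CAPABILITY = "add_capability"
    SYSTEM_CHANGE = "system_change"


@dataclass
class ImprovementProposal:
    type: ProposalType
    description: str
    proposed_by: str = "system"                        # agent_id or 'system'
    evidence: list[str] = field(default_factory=list)  # match IDs

    def __post_init__(self):
        # Coerce raw strings to the enum; unknown types raise ValueError.
        self.type = ProposalType(self.type)


proposal = ImprovementProposal(
    type="add_capability",
    description="Give research agents WebSearch",
    evidence=["m1", "m2"],
)
```

Validating the type at construction time means a malformed proposal from an agent fails immediately rather than at execution time.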
Phase 3: Act (three modes)
| Mode | Behavior |
| --- | --- |
| --dry-run | Print proposals only |
| --suggest | Create tracked proposals, human approves |
| --auto | Execute proposals directly |
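One way the three modes could be wired up, sketched with argparse purely for illustration (the actual Mobius CLI framework is not specified here). Defaulting to `--dry-run` is an assumption, and `build_parser` and `act` are hypothetical helpers.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="mobius improve")
    parser.add_argument("focus", nargs="?", default=None,
                        help="optional free-text description of the problem")
    # The three modes are mutually exclusive and share one destination.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--dry-run", dest="mode", action="store_const", const="dry-run")
    mode.add_argument("--suggest", dest="mode", action="store_const", const="suggest")
    mode.add_argument("--auto", dest="mode", action="store_const", const="auto")
    parser.set_defaults(mode="dry-run")  # assumption: the safest mode is the default
    return parser


def act(mode, proposals):
    """Map each proposal to the action its mode implies, without side effects."""
    action = {"dry-run": "print", "suggest": "track", "auto": "execute"}[mode]
    return [(action, p) for p in proposals]


args = build_parser().parse_args(["We keep failing at web research tasks", "--suggest"])
plan = act(args.mode, ["create_agent: web researcher"])
```

Returning a plan rather than executing inside `act` keeps `--dry-run` and `--auto` on the same code path up to the final step.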
Agent Factory Pattern
Instead of hardcoding improvement logic, make it a competition:
mobius improve "We keep failing at web research tasks"
→ Spawns "architect" agents
→ Each proposes a different solution
→ Judge picks best proposal
→ Winner's proposal gets executed
Architect agents themselves evolve — they get better at proposing improvements because their proposals are judged on subsequent match outcomes.
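The competition itself might reduce to something like this sketch, where architects and the judge are plain callables standing in for LLM-backed agents. `Proposal`, `run_improvement_round`, and the toy judge are hypothetical names for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    architect_id: str
    description: str


def run_improvement_round(problem, architects, judge):
    """Collect one proposal per architect and return the judge's top pick.

    architects: dict mapping architect id -> callable(problem) -> description
    judge: callable(problem, Proposal) -> numeric score (higher is better)
    """
    if not architects:
        raise ValueError("at least one architect is required")
    proposals = [Proposal(aid, propose(problem)) for aid, propose in architects.items()]
    scored = [(judge(problem, p), p) for p in proposals]
    best_score, winner = max(scored, key=lambda pair: pair[0])
    return winner, best_score


# Toy stand-ins: a real judge would be a model call, not a length heuristic.
architects = {
    "arch-1": lambda problem: "add WebSearch tool",
    "arch-2": lambda problem: "create a dedicated research agent with WebSearch",
}
winner, score = run_improvement_round(
    "We keep failing at web research tasks",
    architects,
    judge=lambda problem, p: len(p.description),
)
```

The winner's `architect_id` is what a later step would credit or penalize once subsequent match outcomes are known.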
New DB Table: proposals
proposals (
id TEXT PRIMARY KEY,
type TEXT, -- create_agent, retire_agent, split_agent, add_capability, system_change
description TEXT,
proposed_by TEXT, -- agent_id or 'system'
status TEXT, -- pending, accepted, rejected, implemented
evidence TEXT, -- match IDs that motivated this
outcome TEXT, -- what happened after implementation
created_at, resolved_at
)
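For concreteness, here is a minimal sketch of the table and its lifecycle, assuming SQLite, ISO 8601 text timestamps, and evidence stored as a JSON array of match IDs. None of these choices are settled by this issue, and `record_proposal` and `resolve_proposal` are hypothetical helpers.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS proposals (
    id TEXT PRIMARY KEY,
    type TEXT NOT NULL,
    description TEXT NOT NULL,
    proposed_by TEXT NOT NULL,               -- agent_id or 'system'
    status TEXT NOT NULL DEFAULT 'pending',  -- pending, accepted, rejected, implemented
    evidence TEXT,                           -- JSON array of match IDs (assumed encoding)
    outcome TEXT,
    created_at TEXT NOT NULL,                -- ISO 8601 UTC (assumed)
    resolved_at TEXT
)
"""


def _now():
    return datetime.now(timezone.utc).isoformat()


def record_proposal(conn, type_, description, proposed_by="system", evidence=()):
    """Insert a pending proposal and return its generated id."""
    proposal_id = uuid.uuid4().hex
    conn.execute(
        "INSERT INTO proposals (id, type, description, proposed_by, evidence, created_at) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (proposal_id, type_, description, proposed_by, json.dumps(list(evidence)), _now()),
    )
    conn.commit()
    return proposal_id


def resolve_proposal(conn, proposal_id, status, outcome=None):
    """Move a proposal out of 'pending' and stamp resolved_at."""
    if status not in {"accepted", "rejected", "implemented"}:
        raise ValueError(f"unsupported status: {status}")
    conn.execute(
        "UPDATE proposals SET status = ?, outcome = ?, resolved_at = ? WHERE id = ?",
        (status, outcome, _now(), proposal_id),
    )
    conn.commit()
```

Keeping `evidence` and `outcome` on the same row is what lets the system later answer why a change was made and whether it helped.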
Implementation Plan
- src/mobius/analyst.py: match history analysis, DiagnosisReport model
- ImprovementProposal pydantic model in models.py
- proposals table in db.py
- mobius improve CLI command
- /mobius-improve skill (free Opus-powered version)
Related Ideas
- Agents that can create other agents and equip them with custom skills (stored in DB, not repo)
- Benchmark self-review that feeds back into the improvement loop
- Meta-learning: track which improvement strategies actually helped
🤖 Generated with Claude Code