layout	default
title	The Core

The Core

The core category is the load-bearing set — the 15 skills that make Aeon autonomous rather than just scheduled. Everything else in the catalog is a workload; these are the machine. They group into three clusters: self-evolution & self-healing, fleet / self-replication, and autonomous real-world action.

If you're building a derivative architecture, this is the set to keep and validate first. It doesn't need to be 100% identical — but each skill below earns its place with a specific mechanism, and those mechanisms are what to preserve.

🧬 Self-evolution & self-healing

`autoresearch` — evolves an existing skill

Input: var = a skill name (required; aborts if empty)

Loads the target SKILL.md, parses its purpose / data sources / output format / dependencies, and saves the original for the diff.
Researches improvements — web-searches for better / alternative APIs, best practices, and known failure modes, plus reviews recent memory/logs/ runs and cron-state.json to see if it's been failing.
Generates 4 variations, each with a fixed thesis: A better inputs (data sources), B sharper output (format / quality), C more robust (fallbacks, edge cases, error handling), D rethink (a fundamentally different approach).
Scores them against a rubric and ships the winner as a PR — tagging the chosen lineage as an HTML comment () at the top of the file.

That comment is why you can see it already ran across the library: skill-repair is variation D, skill-health C, and create-skill / deploy-prototype / vuln-scanner B.

`create-skill` — generates a brand-new skill from one sentence

Input: a natural-language description (required)

Parses the request into action verb + data source + output format + configurable param + cadence.
Deep duplicate detection — keyword-greps existing skills, reads the top 3 candidates, and if one already does it, exits CREATE_SKILL_DUPLICATE, suggesting you run the existing skill with a different var=.
Researches the data sources (WebSearch + WebFetch the canonical docs, cross-checked against a 2nd source to confirm the endpoint isn't deprecated) and records the URLs.
New-secret guard — reads existing secret names (never values) via gh api; if the new skill needs a secret that isn't configured, it forces the generated skill to degrade gracefully and documents the requirement in the PR.
Ships a complete, production-ready skill as a PR (never commits to main).

`skill-health` — the detector · daily 18:00

Audits every enabled skill from cron-state.json + per-run Haiku quality scores (memory/skill-health/*.json) + a skill-runs fallback for sandbox-blocked runs. Classifies each as CRITICAL / DEGRADED / FLAPPING / WARNING / HEALTHY / NO-DATA via first-matching-rule, computes a severity score, and detects systemic patterns (≥2 skills sharing an API host or error signature).

It files issues into memory/issues/ISS-NNN.md, resolves them when a skill recovers (drops it from affected_skills, flips status to resolved), and notifies only on state change. It won't touch the issue tracker unless the operator has opted in by creating INDEX.md.

`skill-repair` — the fixer · reactive, `depends_on: skill-health`

Phases: PREFLIGHT → TRIAGE → DIAGNOSE → REPAIR → VERIFY → LOG, with a one-shot exit taxonomy (REPAIR_OK_FIXED, REPAIR_OK_SYSTEMIC, REPAIR_DIAGNOSED_NO_FIX, REPAIR_NO_TARGETS, REPAIR_DRY_RUN, REPAIR_BLOCKED).

Triage auto-picks the worst fixable target from open issues + cron-state (success-rate < 0.5, low quality score), and clusters by normalized error signature — if 2+ skills share one, it switches to systemic mode: one shared issue, one shared fix, instead of N patches.
Guards against looping: 24h per-skill cooldown (tracked in skill-repair-history.json), and caps itself at 3 repair PRs/day.
Builds a diagnostic dossier, applies the fix, opens a PR, and verifies. dry-run:NAME diagnoses without writing.

`skill-evals` — the regression catcher · Sun 06:00

Runs an assertion manifest (evals.json) against each skill's latest output: empty / stale file checks, word-count floors, required / forbidden regex patterns, numeric range checks, and a quality cross-check against skill-health/*.json.

The lede is the diff vs. the previous run — each skill classified NEW_FAIL / FIXED / STILL_FAIL / NEW_PASS / STABLE, so it catches quality degradation between runs rather than just snapshotting. Files issues for every NEW_FAIL, runs a coverage audit (eval-audit) to surface enabled skills with no eval spec, and queues concrete fixes.

`self-improve` — broad self-tuning · every other day

Reads the last 2 days of logs + cron-state.json for the highest-impact, smallest fix (failing skills, timeouts, truncated notifications, low-quality output), makes one minimal targeted change (tighten a prompt, add backoff, fix a config), and opens a PR with Problem / Fix / Evidence.

Backpressure: if 3+ improvement PRs are already open, it exits without creating more debt. Explicitly forbidden from rewriting skills wholesale or touching architecture / secrets.

How the loop closes

skill-health / skill-evals detect → file issue → skill-repair fixes → PR → merge → cron-state recovers → skill-health resolves the issue. CLAUDE.md codifies the contract: health skills file issues, repair skills close them.

🛰️ Fleet / self-replication

`spawn-instance` — clones the agent into a new repo

Input: var = "name: purpose"

Forks the repo (using the upstream parent if this is itself a fork), sanitizes the name into aeon-<name>, configures its skill plan for the stated purpose, validates, enables Actions, and registers it in memory/instances.json. Full exit taxonomy with idempotent recovery (SPAWN_FORK_EXISTS_RECOVERED vs. ..._REGISTERED, SPAWN_PUSH_FAILED, etc.) and preflight checks (gh auth, rate limit ≥50).

Seeds the repo but never propagates secrets — each clone is inert until its owner adds their own API keys, giving billing isolation and blast-radius containment.

`fleet-control` — operates the managed fleet · twice daily 9/15

Three modes via var: Health Check (default), Status (status), Dispatch (dispatch <instance|*> <skill> [var=…]). For each registered instance it runs 3 parallel gh calls (repo metadata, last-24h workflow runs, child cron-state) into /tmp, classifies each as healthy / reachable, and emits a verdict-first report with a per-instance next-action column.

Dispatch mode lets the parent trigger a skill on one child — or all healthy / degraded children at once. State-change-gated notify; bails on missing gh auth or low rate limit.

`fleet-scorecard` — fleet economics · daily 13:00

Discovers the fleet at runtime (self + every non-archived instance — never hardcoded). All data is gathered by scripts/prefetch-fleet-scorecard.sh outside the sandbox, so the skill just reads /tmp/fleet-scorecard/* and writes the report — no network needed.

Aggregates runs / failures / generations / tokens / est. cost / cache discount (tokens in OpenRouter shape, cached ⊆ prompt), builds an Alerts block (any skill with ≥25% fail rate over 14d, cost spikes > 1.5× median daily delta, failure jumps > 10), writes memory/scorecard.md and appends a trend row to scorecard-history.csv.

`contributor-reward` → `distribute-tokens` — the pay-your-contributors flywheel

contributor-reward (Mon 09:30) reads the latest fork-contributor leaderboard, scores each contributor against a tier table (rank 1 = 25 USDC … rank 5 = 5, +5 first-PR bonus tracked once-ever per login, eligibility floor score ≥10 + must have an @handle), and writes the plan into memory/distributions.yml with a one-command run line. It deliberately stops short of sending — keeping a human-visible git diff as the audit trail.

distribute-tokens then does the actual on-chain send via the Bankr Wallet API with serious money-safety engineering: two-phase RESOLVE → EXECUTE (validate config / key / balance, resolve @handles → addresses, build plan; then send), per-recipient idempotency key + txHash so nothing double-sends across re-runs, dry-run mode, and recovery from partial runs. Wallet API for transfers only; read-only keys → 403 guard.

🤖 Autonomous real-world action

`external-feature` / `feature` — ships code to watched repos unprompted

Input: owner/repo, owner/repo#N, or empty (auto-pick)

Clones the repo, deeply reads it (CLAUDE.md, manifests, recent commits, open issues / PRs, test setup), then picks one change by priority: (1) fix an open issue, (2) code improvement — TODOs, missing error handling, untested critical paths, security fixes, (3) a new feature / DX improvement if the codebase is clean.

Implements it matching the repo's exact style on a branch (ai/...), commits with conventional-commit format (Closes #N), pushes, and opens a PR.

Hard rules: one enhancement per run, never push to main, no unrelated refactors, "if nothing's worth doing, log and exit." feature is the batch version that does this across the whole watched-repo list, preferring yesterday's repo.

`deploy-prototype` — generates a live web app and ships it to Vercel

Input: empty (auto-pick from signals), a plain brief, or a typed type:slug descriptor

Scans memory/topics/ and recent logs for a prototype-worthy signal, scores candidates on leverage / concreteness / novelty (must clear 9/15 or it exits DEPLOY_PROTOTYPE_EMPTY). Commits to a shape (slug, tagline, primary action, static-vs-API-vs-Next), then writes the files into .pending-deploy/files/ against a strict quality bar: self-contained, sub-1s load, mobile-first, OG tags, real data from public no-auth endpoints (no lorem), light / dark via prefers-color-scheme, no secrets.

Runs pre-flight checks (≤20 files, ≤4MB, slug regex, greps for leaked tokens and for TODO / placeholder), writes a prototype record + prototypes.md row. The actual GitHub-repo-create + Vercel deploy is handled by scripts/postprocess-deploy.sh (reads .pending-deploy/, uses VERCEL_TOKEN / GH_GLOBAL) — the skill flags DEPLOY_PROTOTYPE_NO_POSTPROCESS if that script is missing.

`vuln-scanner` — finds real vulns and discloses responsibly

Input: owner/repo, or auto-select from github-trending

Selection filters skip CTF / teaching repos, repos scanned in the last 30 days, and repos with no safe disclosure channel. Forks + shallow-clones, then runs purpose-built scanners (not grep): Semgrep, TruffleHog --only-verified (filesystem + git history), osv-scanner for dependency CVEs, Slither if Solidity is present — recording each tool's ok/fail status (empty ≠ clean).

Triages every hit by hand (read the code at the line, write one sentence on what an attacker controls, check reachability, assign severity; drop test / example findings). Then routes by finding type: dependency CVEs → public PR (already-public, net-positive); code flaws / verified secrets / contract bugs → a Private Vulnerability Report via gh api .../security-advisories (optionally a fix pushed to its own fork only, linked in the advisory — never a public PR). If no PVR and no SECURITY channel, it does nothing public and logs "no safe channel."

Core principles: do no harm, never post exploit chains publicly, all-scanners-failed ≠ clean (report as error). Dedup state in memory/vuln-scanned.json.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The Core

🧬 Self-evolution & self-healing

`autoresearch` — evolves an existing skill

`create-skill` — generates a brand-new skill from one sentence

`skill-health` — the detector · daily 18:00

`skill-repair` — the fixer · reactive, `depends_on: skill-health`

`skill-evals` — the regression catcher · Sun 06:00

`self-improve` — broad self-tuning · every other day

How the loop closes

🛰️ Fleet / self-replication

`spawn-instance` — clones the agent into a new repo

`fleet-control` — operates the managed fleet · twice daily 9/15

`fleet-scorecard` — fleet economics · daily 13:00

`contributor-reward` → `distribute-tokens` — the pay-your-contributors flywheel

🤖 Autonomous real-world action

`external-feature` / `feature` — ships code to watched repos unprompted

`deploy-prototype` — generates a live web app and ships it to Vercel

`vuln-scanner` — finds real vulns and discloses responsibly

FilesExpand file tree

CORE.md

Latest commit

History

CORE.md

File metadata and controls

The Core

🧬 Self-evolution & self-healing

autoresearch — evolves an existing skill

create-skill — generates a brand-new skill from one sentence

skill-health — the detector · daily 18:00

skill-repair — the fixer · reactive, depends_on: skill-health

skill-evals — the regression catcher · Sun 06:00

self-improve — broad self-tuning · every other day

How the loop closes

🛰️ Fleet / self-replication

spawn-instance — clones the agent into a new repo

fleet-control — operates the managed fleet · twice daily 9/15

fleet-scorecard — fleet economics · daily 13:00

contributor-reward → distribute-tokens — the pay-your-contributors flywheel

🤖 Autonomous real-world action

external-feature / feature — ships code to watched repos unprompted

deploy-prototype — generates a live web app and ships it to Vercel

vuln-scanner — finds real vulns and discloses responsibly

`autoresearch` — evolves an existing skill

`create-skill` — generates a brand-new skill from one sentence

`skill-health` — the detector · daily 18:00

`skill-repair` — the fixer · reactive, `depends_on: skill-health`

`skill-evals` — the regression catcher · Sun 06:00

`self-improve` — broad self-tuning · every other day

`spawn-instance` — clones the agent into a new repo

`fleet-control` — operates the managed fleet · twice daily 9/15

`fleet-scorecard` — fleet economics · daily 13:00

`contributor-reward` → `distribute-tokens` — the pay-your-contributors flywheel

`external-feature` / `feature` — ships code to watched repos unprompted

`deploy-prototype` — generates a live web app and ships it to Vercel

`vuln-scanner` — finds real vulns and discloses responsibly