feat: vigil-revoke skill — closes detection→revoke loop via Bankr by aaronjmars · Pull Request #354 · aaronjmars/aeon

aaronjmars · 2026-06-07T11:10:20Z

What

New skills/vigil-revoke/SKILL.md — workflow_dispatch-only skill that revokes a single live ERC-20 approval on Base via Bankr. var is a wallet:spender:token triplet (the tuple shape vigil_scan_approvals and approval-audit already emit). Confirms the approval is still live via eth_call on allowance(owner,spender) before spending any gas, submits approve(spender, 0) via Bankr's /agent/prompt path, polls /agent/job/{id} for up to 90s, then re-reads allowance to verify the on-chain state. Notification only claims SUCCESS on receipt status 0x1 + post-allowance 0x000…000.

Why

VIGIL's five-round review (PR #323) explicitly split the Approval Revoker into a future skill — "Bankr-gated, state-changing — separate PR." wallet-risk-weekly (PR #340, 2026-06-04) now runs the weekly audit and surfaces HIGH-bucket approvals worth revoking. The detection → revoke loop has been half-open since: the agent identifies an UNLIMITED approval to a non-trusted spender but has no autonomous path to act. This skill closes the loop — the operator copies the triplet from the wallet-risk-weekly notification (or runs approval-audit on-demand) and dispatches once.

Five other HoundFlow onchain skills still have no scheduled consumer; this isn't an attempt to schedule any of them. It's the deliberate write-side companion to the read-only inventory: operator-initiated only, no cron.

Changes

skills/vigil-revoke/SKILL.md — new skill. Strict input allowlist on the triplet (^0x[hex40]:0x[hex40]:0x[hex40]$, case-insensitive then lower-normalised — mirrors VIGIL's round-4 input hardening). Bankr ownership pre-check (/wallet/me ≟ triplet wallet) refuses cross-wallet submissions. Pre-revoke allowance read short-circuits to NOOP when already zero. Post-revoke: receipt poll + final allowance read so SUCCESS is chain-confirmed, not Bankr-reported. 7-state exit taxonomy (OK / NOOP / FAILED / BAD_VAR / WALLET_MISMATCH / ERROR / STATE_CORRUPT). State file memory/topics/vigil-revoke-log.json is append-only with a 500-entry cap.
aeon.yml — register disabled at end of the Hound onchain investigation section.
generate-skills-json — extend the onchain-security category line to include vigil-revoke alongside vigil and wallet-risk-weekly.
skills.json — entry inserted alphabetically between vigil and vuln-scanner. Total 193 → 194.

Design decisions

No auto-retry. Failed submissions could mean insufficient gas, contract paused, Bankr 5xx, or a sandbox network blip — none are safe to retry blindly. The next workflow_dispatch is the retry.
No bulk revoke per run. One triplet per run. Bulk is a separate vigil-revoke-batch skill, deliberately out of scope here to keep blast radius bounded and the audit trail clean.
No "trusted spender" auto-skip. Even Uniswap routers can be exploited. Trust-list filtering belongs upstream in wallet-risk-weekly's severity bucketing, not here. If the operator names a triplet, the skill revokes (or NOOPs on already-zero).
No prompt-injection surface in Bankr /agent/prompt. The body only interpolates validated 40-hex addresses — never operator-typed text or fetched contract metadata.
Capabilities declared: external_api, writes_external_host, onchain_writes, sends_notifications. Matches the locked taxonomy in docs/CAPABILITIES.md.

Sandbox note

Bankr API calls require X-API-Key env-var expansion in the curl header — sandbox-blocked env expansion would 403 the call. The skill detects that path and exits with a clear error rather than falling back to WebFetch (which would either leak the key into a URL or omit auth entirely). Base RPC calls (allowance check + receipt poll) use the public endpoint and fall back to WebFetch on curl failure per CLAUDE.md pattern 1.

Built autonomously by Aeon

VIGIL PR #323 explicitly split the Approval Revoker into a future skill ('Bankr-gated, state-changing — separate PR.'). wallet-risk-weekly (PR #340) surfaces HIGH-bucket approvals weekly but had no autonomous remedy path. This skill closes that loop. workflow_dispatch only — var is a wallet:spender:token triplet (matches tuples vigil_scan_approvals / approval-audit return). Strict allowlist on the input. Bankr-bound wallet check before any state-changing call. Pre-revoke allowance read short-circuits to NOOP when already zero — no gas spent on a redundant submission. Post-revoke receipt poll plus a final allowance read so the notification only claims success on a chain-confirmed revoke. 7-state exit taxonomy. No auto-retry, no bulk revoke per run, no trusted-spender auto-skip — the operator triplet is the decision boundary. Registered disabled in aeon.yml at the end of the Hound onchain investigation section. skills.json regenerated (193 → 194), category onchain-security. capabilities: external_api, writes_external_host, onchain_writes, sends_notifications.

aaronjmars · 2026-06-07T15:52:28Z

Review findings — holding merge until these are fixed

Deep review against the agreed VIGIL enable-time posture. The config layer is genuinely compliant: dormant (enabled: false, workflow_dispatch only), zero VIGIL API calls in the skill, no .x402books/wallets.json reads, strict triplet allowlist, chain-confirmed SUCCESS, bound-wallet check, idempotent NOOP pre-check. Two blockers before this can land:

Blockers

1. Nothing verifies what Bankr actually signed
Step 4 delegates the state change to Bankr's natural-language /agent/prompt and step 5 only checks receipt status + post-allowance re-read — which confirms the intended effect but cannot detect a misinterpreted or additional action by Bankr's LLM. The PR body's safety argument also misstates the precedent: distribute-tokens explicitly avoids the Agent API for state changes ("Wallet API only for actual transfers") and uses /agent/prompt solely for read-only handle resolution.
Fix (cheap): after obtaining $TX, eth_getTransactionByHash and assert to == $TOKEN and input == 0x095ea7b3 + pad32(SPENDER) + 0x00…00 before claiming SUCCESS. If Bankr has no structured contract-call endpoint, keep /agent/prompt but add this calldata check + an explicit residual-trust note.

2. Validate-after-interpolate on var (SKILL.md:46)
The first executable line is TRIPLET="${var}" — raw interpolation into a shell assignment before the step-1 regex check, directly contradicting the skill's own sandbox note. The triplet is scanner-derived text the operator pastes from a notification, i.e. exactly the indirect-injection channel the posture worries about; $( ) inside it executes within double quotes.
Fix: regex-validate the var value as seen in the prompt before composing any shell line, or read it from the runner env (TRIPLET="$SKILL_VAR") so no literal lands in a command.

Nits for the same pass

skills.json: "sha": "" + stale generated — run ./generate-skills-json.
FAILED notification: Bankr .error // .reason is third-party text — truncate ~120 chars and strip markdown/URLs mechanically (./notify tries Telegram Markdown first).
:172 "exactly one notification per run" contradicts the BAD_VAR/ERROR paths that send none.
:84 comment claims 403/401 are distinguishable but -fsS … || echo "" collapses all failures — capture the HTTP code like distribute-tokens does, or drop the comment.
SUCCESS template hardcodes "was UNLIMITED → now 0"; report actual pre_allowance_hex.

Follow-up worth filing as an issue (pre-existing, not this PR)

The runner injects BANKR_API_KEY into every skill run, and enabled: false doesn't gate workflow_dispatch — so dormancy is convention, not enforcement. Scoping the secret to an allowlist (distribute-tokens, vigil-revoke) would turn the no-auto-trigger constraint into an enforced guarantee.

antfleet-ops · 2026-06-07T22:50:12Z

Fresh-eyes pass, post-existing-review. The current comment thread already lands the two highest-leverage items (calldata verification on /agent/prompt, validate-before-interpolate on TRIPLET). A few correctness deltas worth folding into the same revision rather than chasing post-merge — especially given the write-scoped key and the no-auto-retry contract.

1. Post-allowance read has no failure path — asymmetric with the pre-check

skills/vigil-revoke/SKILL.md:186-189

POST_HEX=$(curl -m 10 -s -X POST "$RPC" -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"eth_call",...}' \
  | jq -r '.result // empty')

The pre-allowance read at :106-111 has explicit WebFetch fallback and a "never proceed blind" guard. The post-read has neither retry, fallback, nor an error branch. If POST_HEX comes back empty (RPC stalled, rate-limited, malformed body), the SUCCESS guard at :199 fails the 0x000…0 equality → notification falls to FAILED, even when the receipt was 0x1 and the chain did revoke. Operator may then re-dispatch, burning gas on a now-true NOOP.

Suggest symmetric handling: if the post-read is empty / non-hex after the same retry+WebFetch ladder, emit a third verdict (VIGIL_REVOKE_OK_UNVERIFIED or similar) distinct from FAILED — so the operator can tell "receipt was good, chain re-read inconclusive" from "tx genuinely failed."

2. Poll loop falls through silently on unknown Bankr statuses — composes with no idempotency on `/agent/prompt`

SKILL.md:139-154

case "$STATUS" in
  completed|success) break ;;
  failed|error|rejected) ... break ;;
esac
sleep 3

Any status outside this closed set — pending, submitted, mining, an empty string from a 5xx, anything Bankr adds later — is treated as "keep polling," exiting after 90s with TX="". Fine if the tx was never submitted. But the next operator re-dispatch on the same triplet has no idempotency key on /agent/prompt, so Bankr could be holding a pending tx the skill never surfaced and accept a duplicate.

Is there a way to list recent jobs for the bound wallet, or to pass an operator-provided idempotency key on /agent/prompt? If neither, worth surfacing the residual double-submit risk in the timeout notification itself, not just the SKILL prose.

3. `latest` block tag on both reads — Base public RPC can lag in both directions

SKILL.md:107 and :187 both query "latest". Public mainnet.base.org is known to serve stale state under load. Pre-check at latest can read allowance=0 from a stale node while it's actually live (false NOOP, under-covers the operator); post-check at latest can miss a just-mined revoke (false FAILED).

For the post-check, pinning the eth_call to >= receipt.blockNumber (or querying at the receipt's block tag directly) closes the post-side cleanly. Pre-side staleness is harder to remove, but worth a one-line note in the SKILL so the operator knows a NOOP isn't a guarantee.

4. Receipt-status case has no default arm

SKILL.md:170-180 — STATUS_HEX absent or unparseable falls through with CONFIRMED=0; the FAILED notification then can't distinguish "actual revert (0x0)" from "malformed receipt." Small diagnostic loss. One *) echo "VIGIL_REVOKE_RECEIPT_UNPARSED status=$STATUS_HEX" ;; arm covers it.

5. Notify-body construction is described, not shown

SKILL.md:195-207 describes "use a single-quoted heredoc or jq -Rs if the body contains backticks or quotes," but the SUCCESS/FAILED template blocks interpolate <bankr_reason_or_timeout> (the third-party text already flagged in review) without showing the literal ./notify invocation. Readers will copy what's shown, not what's described — worth lifting the jq -Rs … | ./notify call into the bash directly so the safe path is the obvious path.

6. Question — `workflow_dispatch` concurrency

SKILL.md:287-288 notes the skill is idempotent "because the pre-check at step 3 short-circuits to NOOP when allowance is already zero" — true for same-triplet replays. What about parallel dispatches against the same Bankr-bound wallet for different triplets? Presumably Bankr serializes per-wallet (nonce management), but the SKILL doesn't say. One-liner if confirmed.

None blockers. Items 1–3 are the ones that compose into a realistic false-state in production; 4–6 are robustness / hygiene that's cheaper to fold in now.

distribute-tokens bans the Agent API for transfers; document that Bankr exposes no structured raw-contract-call path for an arbitrary approve, so /agent/prompt is the only route here, and the blast radius is bounded to zeroing an allowance (never a fund move). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Targeted edit of the vigil-revoke entry only — a full regenerate on this branch would reflow every entry's sha and conflict with main. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

aaronjmars and others added 2 commits June 8, 2026 07:12

chore(skills.json): fill vigil-revoke sha (af5fb9d)

b787118

Targeted edit of the vigil-revoke entry only — a full regenerate on this branch would reflow every entry's sha and conflict with main. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

aaronjmars merged commit fc019fe into main Jun 8, 2026
1 check passed

aaronjmars deleted the feat/vigil-revoke branch June 8, 2026 11:14

aaronjmars mentioned this pull request Jun 8, 2026

skills: add CTRL — on-chain automation on Base #353

Merged

antfleet-ops mentioned this pull request Jun 12, 2026

Proposal: CI gate to fail PRs that change skills/*/SKILL.md without regenerating skills.json #455

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: vigil-revoke skill — closes detection→revoke loop via Bankr#354

feat: vigil-revoke skill — closes detection→revoke loop via Bankr#354
aaronjmars merged 3 commits into
mainfrom
feat/vigil-revoke

aaronjmars commented Jun 7, 2026

Uh oh!

aaronjmars commented Jun 7, 2026

Uh oh!

antfleet-ops commented Jun 7, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

aaronjmars commented Jun 7, 2026

What

Why

Changes

Design decisions

Sandbox note

Uh oh!

aaronjmars commented Jun 7, 2026

Review findings — holding merge until these are fixed

Blockers

Nits for the same pass

Follow-up worth filing as an issue (pre-existing, not this PR)

Uh oh!

antfleet-ops commented Jun 7, 2026

1. Post-allowance read has no failure path — asymmetric with the pre-check

2. Poll loop falls through silently on unknown Bankr statuses — composes with no idempotency on /agent/prompt

3. latest block tag on both reads — Base public RPC can lag in both directions

4. Receipt-status case has no default arm

5. Notify-body construction is described, not shown

6. Question — workflow_dispatch concurrency

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

2. Poll loop falls through silently on unknown Bankr statuses — composes with no idempotency on `/agent/prompt`

3. `latest` block tag on both reads — Base public RPC can lag in both directions

6. Question — `workflow_dispatch` concurrency