73 changes: 27 additions & 46 deletions QA_REPORT.md
@@ -1,50 +1,31 @@
# QA Report — deployed-agent preset
# QA_REPORT — feat/agentguard-claude-code-skill

Verdict: PASS
## Verdict: PASS (single-pass review, doc-only diff)

## Scope match

The diff matches `WORK_PLAN.md` exactly. No scope creep beyond the four files
listed in the plan (profiles, setup docstring, test, CHANGELOG).

## What was checked

- `sdk/agentguard/profiles.py` — new `DEPLOYED_AGENT_PROFILE` constant and
dict entry follow the existing pattern. Comment block explains the
motivation and what was deliberately left out (install/registry/oversight
guards). Values (loop_max=2, retry_max=1, warn_pct=0.5) are strictly
tighter than `coding-agent` as expected for a deployed-agent preset.
- `sdk/agentguard/setup.py` — docstring updated to list new profile.
`normalize_profile` already uses `_PROFILE_DEFAULTS` keys for error
messages, so no other update needed.
- `sdk/tests/test_init.py` — new test mirrors the existing
`test_coding_agent_profile_tightens_guard_defaults` and additionally
covers `warn_pct` via `get_budget_guard()._warn_at_pct`. Idiomatic.
- `CHANGELOG.md` — one Unreleased entry citing the arxiv paper.

## Test result

`pytest sdk/tests/ -x` — 708 passed, 0 failed.
## Scope check
Diff matches WORK_PLAN.md scope:
- `skills/agentguard/SKILL.md` — Claude Code skill format frontmatter
- `skills/codex/agentguard.md` — mirrors content for Codex
- `skills/README.md` — orients readers
- `WORK_PLAN.md` / `RESEARCH.md` — overwritten to reflect this task (stragglers from PR #442)

## Safety checks

- No secrets or credentials added.
- No denylist paths touched (`.github/workflows/`, `.env*`,
`supabase/migrations/`, `security/`, Stripe/Clerk).
- No new dependencies.
- No network calls.
- No test coverage regressions — net +1 test.

## Repo pattern adherence

- Naming: hyphenated `deployed-agent` matches `coding-agent`.
- Constant: `DEPLOYED_AGENT_PROFILE` matches `CODING_AGENT_PROFILE`.
- Test method name pattern matches the precedent.
- Comment voice matches surrounding code (terse, no fluff).

## Known gap (intentional, called out in WORK_PLAN.md)

The task body asked for `max_install_count`, `registry_write: deny`,
`oversight_decision_immutable`, `approval_threshold`. These primitives don't
exist in the SDK today. This PR ships the preset registration + the security
narrative; a follow-up task can land the new guard classes.
- No secrets, tokens, credentials
- No denylist paths (`.github/workflows/`, `.env*`, `supabase/migrations/`, `security/`, Stripe/Clerk auth)
- No new dependencies
- No code changes — pure additive documentation
- No tests broken (no test files touched)
- Total: 222 insertions, well under 400 LOC cap

## Content checks
- PyPI package `agentguard47` — verified against root `SKILL.md` line 9
- Import `from agentguard import ...` — matches root SKILL.md lines 31, 75, 90
- All URLs reference canonical `bmdhodl/agent47` repo
- Skill files link back to root SKILL.md as canonical — minimizes drift risk

## Pattern compliance
- Frontmatter matches root `SKILL.md` (Anthropic Skills format)
- Codex skill follows YAML-frontmatter + body single-file pattern

## Issues
None.
53 changes: 22 additions & 31 deletions RESEARCH.md
@@ -1,37 +1,28 @@
# RESEARCH — deployed-agent preset
# RESEARCH — AgentGuard skill packaging

## Existing pattern verified
## Existing SKILL.md (verified)
`SKILL.md` at the repo root already exists with Anthropic Claude Skills frontmatter (`name`, `description`, `license`, `compatibility`, `metadata`). PyPI package: `agentguard47`. Import: `from agentguard import ...`. CLI: `agentguard doctor|demo|report|eval|incident`. 4-line quick-start uses `BudgetGuard`, `Tracer`, `patch_openai`.

- `sdk/agentguard/profiles.py` exposes `_PROFILE_DEFAULTS` keyed by canonical
profile name. Two entries today: `default` and `coding-agent`. Each entry
is a dict of `{loop_max, retry_max, warn_pct}`.
- `sdk/agentguard/setup.py` line 132 calls `normalize_profile(...)` then
`get_profile_defaults(...)`; resolved values feed BudgetGuard, LoopGuard,
RetryGuard construction. No other plumbing required to register a new
profile.
- `normalize_profile` raises with a sorted list of supported profiles, so
the new profile name appears automatically in error messages.
- `sdk/tests/test_init.py::TestInitLoopGuard::test_coding_agent_profile_tightens_guard_defaults`
is the precedent test shape; the new test mirrors it and additionally
asserts the `warn_pct` change via `get_budget_guard()`.
- `agentguard.get_budget_guard()` exposes `_warn_at_pct`; verified against
`test_budget_warning_threshold` (line 300).
## Skill format research
- **Claude Code / Anthropic skills** — `SKILL.md` with YAML frontmatter (`name`, `description`; optional `license`, `compatibility`, `metadata`). Body is markdown with install + trigger conditions + examples. Multiple skills per repo go in `skills/<skill-name>/SKILL.md`.
- **Codex skills** — single markdown file under `skills/codex/<name>.md` (or `.codex/skills/`). Frontmatter optional but `name`/`description` recommended.
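
The frontmatter requirement above (`name` and `description` present) can be checked mechanically. The following is a minimal sketch, not part of this PR or of any skill tooling: `parse_frontmatter` and `missing_required` are hypothetical helper names, and the parser only reads top-level `key: value` lines rather than full YAML.

```python
def parse_frontmatter(text):
    """Return top-level `key: value` pairs from a leading '---' block.

    Hypothetical helper: handles flat keys only, not full YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Skip blank lines and indented (nested) entries.
        if not line.strip() or line[0].isspace():
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # closing '---' was never found


def missing_required(text, required=("name", "description")):
    """List required frontmatter keys that are absent or empty."""
    fields = parse_frontmatter(text)
    return [key for key in required if not fields.get(key)]


EXAMPLE = """---
name: agentguard
description: Runtime guardrails for AI coding agents.
license: MIT
---

# AgentGuard
"""

print(missing_required(EXAMPLE))
```

A check like this could run in CI over both `skills/agentguard/SKILL.md` and `skills/codex/agentguard.md`, assuming both keep flat `name` and `description` keys.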

## What was rejected
## awesome-agent-skills (VoltAgent) submission rules
Verified via raw CONTRIBUTING.md fetch:
- Format: `- **[author/skill-name](url)** - description (<=10 words)`
- Add to end of an EXISTING category. No "Safety / Cost Control" section exists. Closest fits: Community Skills > AI and Data, or Other.
- Quality gate: "real community usage and proven adoption (not brand-new submissions)" — explicit blocker for same-day submissions.
- PR title format: `Add skill: bmdhodl/agentguard`
- Public repo + working skill + README/SKILL.md + author prefix all required.
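
The entry format and the 10-word description limit listed above can be linted before a submission. This is a rough sketch under stated assumptions: the regex is one reading of the format line, `check_entry` is a hypothetical helper, and the sample entry text is illustrative rather than the wording of any actual submission.

```python
import re

# Assumed reading of: - **[author/skill-name](url)** - description
ENTRY_RE = re.compile(
    r"^- \*\*\[(?P<author>[^/\]\s]+)/(?P<skill>[^\]\s]+)\]"
    r"\((?P<url>https?://[^)\s]+)\)\*\* - (?P<desc>.+)$"
)


def check_entry(line, max_words=10):
    """Return a list of problems with a list entry (empty if none)."""
    match = ENTRY_RE.match(line)
    if match is None:
        return ["entry does not match the expected format"]
    problems = []
    if len(match.group("desc").split()) > max_words:
        problems.append("description exceeds %d words" % max_words)
    return problems


# Illustrative entry only; the description wording is a placeholder.
SAMPLE = (
    "- **[bmdhodl/agentguard](https://github.com/bmdhodl/agent47)** - "
    "Runtime budget, loop, and timeout guards for AI agents"
)

print(check_entry(SAMPLE))
```

The adoption gate is a human judgment and cannot be linted this way, which is why the submission is deferred rather than automated.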

The task body asks for `max_install_count`, `registry_write: deny`,
`oversight_decision_immutable`, `approval_threshold`. None of these guard
primitives exist in the SDK today. Adding them would require new guard
classes (search: `class .*Guard` in `sdk/agentguard/guards.py`), a new public
API surface, and deliberate API review. That is a separate, larger task.
This PR ships the preset hook + arxiv paper messaging using only the
existing guard primitives — a real, useful tightening today, with a clean
seam for follow-up work to extend the profile.
## Decision: defer awesome-agent-skills PR
Submitting a same-day-created skill violates their stated adoption gate. Right path:
1. Ship skill files in agent47 now (this PR).
2. Let them be referenced/installed for ~2 weeks.
3. Queue follow-up task for awesome PR with download/issue evidence.

## Cost / safety pass
Decision logged in PR body + new Queue/agent47 task created for the follow-up.

- No new dependencies.
- No network calls.
- No auth, secrets, PII, or denylist paths touched.
- Test executes locally with `auto_patch=False` (no client patching) — same
shape as adjacent tests.
## Repo state at start
- PR #442 had already merged to `origin/main`, so this task branched from current `origin/main`.
- Repo-root planning artifacts from earlier runs existed at start and were refreshed for this doc-only task.
61 changes: 25 additions & 36 deletions WORK_PLAN.md
@@ -1,40 +1,29 @@
# WORK_PLAN — deployed-agent preset
# WORK_PLAN — Ship AgentGuard as Claude Code skill

## Problem

A peer-reviewed report (arxiv 2605.00055) documents a deployed agent that, under ambient persuasion, installed 107 unauthorized components and overrode its own oversight gate. AgentGuard's existing `coding-agent` profile is tuned for dev-time loops, not production deployment. We need a tighter preset for agents running unattended in production where any drift compounds.
VoltAgent's awesome-agent-skills indexes packaged Claude Code / Codex / Cursor / Gemini skills. AgentGuard fits the safety/cost-control niche but has no entry. agent47 has a root `SKILL.md` (Anthropic format) but no namespaced `skills/agentguard/` Claude Code package and no Codex-format mirror.

## Approach

Add a new `deployed-agent` profile to `sdk/agentguard/profiles.py` alongside `default` and `coding-agent`. Match the existing pattern exactly — same primitives (`loop_max`, `retry_max`, `warn_pct`), tighter values:

- `loop_max: 2` — two repeats and stop. Ambient-persuasion attacks build through repetition.
- `retry_max: 1` — one retry. Removes the "just keep trying" failure mode.
- `warn_pct: 0.5` — warn at half budget, not 80%. Operators see drift earlier.

The task body asks for new primitives (`max_install_count`, `registry_write: deny`, `oversight_decision_immutable`, `approval_threshold`). These don't exist as guards in the SDK today — adding them would mean new guard classes, new APIs, hundreds of LOC, and a release surface change. **Out of scope for this PR.** This PR ships the preset hook and the messaging; a follow-up task can land the new guard primitives once the API shape is reviewed.

The preset's docstring + a README/CHANGELOG note cite the arxiv paper as the motivating incident. That gives us the security-validation narrative without overpromising what the preset enforces.

## Files likely to touch

- `sdk/agentguard/profiles.py` — add `DEPLOYED_AGENT_PROFILE` constant + dict entry
- `sdk/agentguard/setup.py` — update `profile` arg docstring
- `sdk/tests/test_init.py` — add a test mirroring `test_coding_agent_profile_tightens_guard_defaults`
- `CHANGELOG.md` — one-line entry
- `README.md` — short note in the profiles section if one exists
- `sdk/agentguard/doctor.py` (skim) — only if it enumerates profiles

## Done

- [ ] `agentguard.init(profile="deployed-agent")` returns a tracer with LoopGuard max_repeats=2, RetryGuard max_retries=1, BudgetGuard warn at 0.5
- [ ] Test passes
- [ ] Existing tests still pass
- [ ] Profile docstring references arxiv paper
- [ ] CHANGELOG entry

## Risks / assumptions

- The task body asks for primitives that don't exist. We're scoping down to what the SDK already supports + clear messaging. Patrick can land the install-count/registry-write guards in a follow-up if he wants them.
- No new dependencies added.
- Preset name uses the existing convention (`deployed-agent` with hyphen, not `deployed_agent`) so it's consistent with `coding-agent`.
1. Add `skills/agentguard/SKILL.md` — Claude Code skill format with name + description frontmatter and trigger conditions in body. Mirror content tightly to root `SKILL.md` so they don't drift.
2. Add `skills/codex/agentguard.md` — Codex skill format mirror (single file).
3. Add `skills/README.md` orienting readers to which format to use.
4. Defer the awesome-agent-skills PR. Their CONTRIBUTING.md explicitly requires "real community usage and proven adoption (not brand-new submissions)." Submitting a brand-new skill file the same hour the directory is created violates their gate. Instead, ship the skill files in agent47 first, let them mature ~2 weeks, then queue a follow-up awesome PR with usage evidence. Decision logged in PR body + new Queue task.
5. Open a small separate PR to bmdpat adding an "Install via Claude Code skill" snippet to the AgentGuard tools page.

## Files to touch
- `skills/agentguard/SKILL.md` (new)
- `skills/codex/agentguard.md` (new)
- `skills/README.md` (new)
- (bmdpat) tools/agentguard page — separate PR

## Done criteria
- [ ] `skills/agentguard/SKILL.md` exists, valid frontmatter, points to `agentguard47` PyPI
- [ ] `skills/codex/agentguard.md` mirrors content
- [ ] `skills/README.md` explains both formats
- [ ] PR open on agent47 with green checks
- [ ] bmdpat tools page mentions skill install (separate PR)
- [ ] awesome-agent-skills PR deferred — new Queue/agent47 task created for ~2 weeks out

## Risks
- Drift between root `SKILL.md` and `skills/agentguard/SKILL.md` — mitigation: `skills/` version stays terse and links to root for full reference.
- Awesome list gate may reject even a mature submission — acceptable; deferring respects their stated rule.
21 changes: 21 additions & 0 deletions skills/README.md
@@ -0,0 +1,21 @@
# AgentGuard skills

Packaged AgentGuard skills for AI coding agent harnesses.

| Format | File | Use with |
|---|---|---|
| Anthropic / Claude Code | [`agentguard/SKILL.md`](./agentguard/SKILL.md) | Claude Code, claude.ai skills, Anthropic SDK |
| Codex | [`codex/agentguard.md`](./codex/agentguard.md) | Codex CLI, OpenAI agent harnesses |

The canonical reference is the root [`SKILL.md`](../SKILL.md). The files in this folder are terser, harness-specific entry points that link back to the canonical doc.

## Install

All skills install AgentGuard the same way:

```bash
pip install agentguard47
agentguard doctor
```

Python 3.9+. Zero dependencies for the core. Optional extras for LangChain, LangGraph, CrewAI.
89 changes: 89 additions & 0 deletions skills/agentguard/SKILL.md
@@ -0,0 +1,89 @@
---
name: agentguard
description: Runtime guardrails for AI coding agents. Stop loops, budget overruns, retry storms, and timeouts before they burn money. Zero dependencies, local-first, MIT licensed.
license: MIT
compatibility: Requires Python 3.9+
metadata:
  author: bmdhodl
  version: "1.2.10"
  pypi: agentguard47
  source: https://github.com/bmdhodl/agent47
---

# AgentGuard (Claude Code skill)

Runtime guardrails for AI coding agents. Wraps any agent (OpenAI, Anthropic, LangChain, LangGraph, CrewAI) with budget, loop, timeout, rate-limit, and retry guards that raise exceptions when limits trip. Zero dependencies, local-first, and no network calls unless you configure a remote sink.

## When to use this skill

Trigger this skill when:

- The user asks to "cap AI costs", "stop runaway agents", "prevent infinite loops", "set a budget on an agent", "kill agents that loop", or anything in that family.
- The user is writing or operating an autonomous agent (single-turn or multi-turn) and has not yet wrapped it in spend / loop / timeout protection.
- The user reports an incident where an agent exceeded budget, looped, or burned tokens unexpectedly.
- The user is shipping an agent to production and has no runtime termination semantics.

Do NOT trigger this skill for:

- Static prompt-engineering questions (no runtime).
- Non-Python stacks (AgentGuard is Python-only today).

## Install

```bash
pip install agentguard47
agentguard doctor # verify install, no network calls
```

## Minimal init (4 lines)

```python
from agentguard import Tracer, BudgetGuard, patch_openai

budget = BudgetGuard(max_cost_usd=5.00, warn_at_pct=0.8)
tracer = Tracer(service="my-agent")
patch_openai(tracer, budget_guard=budget)
# OpenAI chat completions are now tracked. At $4 -> warn. At $5 -> raise BudgetExceeded.
```

## Guards available

| Guard | Stops |
|---|---|
| `BudgetGuard` | Dollar / token / call overruns |
| `LoopGuard` | Exact repeated tool calls |
| `FuzzyLoopGuard` | Similar calls, A-B-A-B patterns |
| `TimeoutGuard` | Wall-clock time limits |
| `RateLimitGuard` | Calls-per-minute throttling |
| `RetryGuard` | Retry storms on flaky tools |

Guards raise specific exceptions (`BudgetExceeded`, `LoopDetected`, `TimeoutExceeded`, `RetryLimitExceeded`), so the agent stops as soon as a limit trips.
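
To illustrate the raise-to-stop idea behind exact-repeat detection, here is a standalone toy sketch. It does not use AgentGuard's classes or reproduce its implementation: `ToyLoopGuard`, `ToyLoopDetected`, and the "more than `max_repeats` identical calls" rule are all assumptions made for the example.

```python
import json
from collections import Counter


class ToyLoopDetected(Exception):
    """Raised when an identical tool call repeats too often (toy example)."""


class ToyLoopGuard:
    """Toy stand-in for a loop guard; not the AgentGuard implementation."""

    def __init__(self, max_repeats=3):
        self.max_repeats = max_repeats
        self._seen = Counter()

    def check(self, tool_name, tool_args):
        # Identical name + arguments map to the same key.
        key = (tool_name, json.dumps(tool_args, sort_keys=True))
        self._seen[key] += 1
        if self._seen[key] > self.max_repeats:
            raise ToyLoopDetected(
                f"{tool_name} repeated {self._seen[key]} times "
                "with identical arguments"
            )


guard = ToyLoopGuard(max_repeats=2)
calls_made = 0
stopped = None
try:
    for _ in range(5):
        guard.check("search", {"q": "weather"})
        calls_made += 1  # the real tool call would run here
except ToyLoopDetected as exc:
    stopped = str(exc)

print(calls_made, stopped)
```

Raising an exception, rather than returning a flag, means an agent loop that forgets to check a return value still stops.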

## One-liner with profile

```python
import agentguard
agentguard.init(profile="coding-agent", local_only=True)
```

Profiles: `default`, `coding-agent`, `deployed-agent`, or custom via `.agentguard.json` in the repo root.

## CLI

```bash
agentguard doctor # verify install
agentguard demo # local proof run, no API keys
agentguard report traces.jsonl # summarize a trace file
agentguard eval traces.jsonl # CI assertions
agentguard incident traces.jsonl --format html
```

## Full reference

See the canonical `SKILL.md` at the repo root: <https://github.com/bmdhodl/agent47/blob/main/SKILL.md>

## Links

- PyPI: <https://pypi.org/project/agentguard47/>
- Source: <https://github.com/bmdhodl/agent47>
- Docs: <https://github.com/bmdhodl/agent47/tree/main/docs>
64 changes: 64 additions & 0 deletions skills/codex/agentguard.md
@@ -0,0 +1,64 @@
---
name: agentguard
description: Runtime budget / loop / timeout guards for AI coding agents. Stops runaway agents before they burn money.
license: MIT
pypi: agentguard47
source: https://github.com/bmdhodl/agent47
---

# AgentGuard (Codex skill)

Wrap any AI coding agent with runtime guards: budget, loop, timeout, rate-limit, retry. Zero dependencies, local-first, and no network calls unless you configure a remote sink.

## Trigger

Use when:
- Capping spend on an agent run
- Stopping infinite loops or A-B-A-B tool patterns
- Adding wall-clock timeouts
- Building termination semantics for production agents

Python only.

## Install

```bash
pip install agentguard47
agentguard doctor
```

## Quick start

```python
from agentguard import Tracer, BudgetGuard, patch_openai

budget = BudgetGuard(max_cost_usd=5.00, warn_at_pct=0.8)
tracer = Tracer(service="my-agent")
patch_openai(tracer, budget_guard=budget)
```

At `$4`, `BudgetGuard` warns. At `$5`, it raises `BudgetExceeded` and the agent stops.
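
The warn and stop thresholds follow from the two arguments: `5.00 * 0.8 = 4.00`. The toy model below walks through that arithmetic without importing AgentGuard. `ToyBudget` and `ToyBudgetExceeded` are hypothetical names, and the choice to stop when spend reaches (rather than passes) the cap is an assumption, not confirmed library behavior.

```python
class ToyBudgetExceeded(Exception):
    """Raised when spend reaches the cap (toy example)."""


class ToyBudget:
    """Toy model of a spend cap with an early warning threshold."""

    def __init__(self, max_cost_usd, warn_at_pct=0.8):
        self.max_cost_usd = max_cost_usd
        self.warn_at_usd = max_cost_usd * warn_at_pct
        self.spent_usd = 0.0
        self.warned = False

    def consume(self, cost_usd):
        self.spent_usd += cost_usd
        # Assumption: reaching the cap counts as exceeding it.
        if self.spent_usd >= self.max_cost_usd:
            raise ToyBudgetExceeded(
                f"spent ${self.spent_usd:.2f} of ${self.max_cost_usd:.2f}"
            )
        if not self.warned and self.spent_usd >= self.warn_at_usd:
            self.warned = True  # warn only once
            return "warn"
        return "ok"


budget = ToyBudget(max_cost_usd=5.00, warn_at_pct=0.8)
events = []
try:
    for _ in range(6):
        events.append(budget.consume(1.00))  # each call costs $1
except ToyBudgetExceeded:
    events.append("stopped")

print(events)
```

With $1 calls, the first three pass silently, the fourth crosses the $4 warning line, and the fifth hits the $5 cap and stops the run.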

## Guards

- `BudgetGuard` — dollar / token / call caps
- `LoopGuard` — exact repeated tool calls
- `FuzzyLoopGuard` — similar / A-B-A-B patterns
- `TimeoutGuard` — wall-clock
- `RateLimitGuard` — calls per minute
- `RetryGuard` — retry storms

## Init from profile

```python
import agentguard
agentguard.init(profile="coding-agent", local_only=True)
```

Profiles: `default`, `coding-agent`, `deployed-agent`. Override via `.agentguard.json`.

## Full reference

Canonical SKILL.md at repo root: <https://github.com/bmdhodl/agent47/blob/main/SKILL.md>

PyPI: <https://pypi.org/project/agentguard47/>