AgentGuard

Contract-based accountability runtime for AI agents.

AgentGuard eliminates the "false progress" problem — when an AI agent reports work is done, but nothing actually happened. Instead of trusting agent output, AgentGuard independently verifies it.

The Problem

You ask an agent to deploy a website. It responds: "Done! The website is deployed at example.com."

But is it? Did the agent actually run the deploy command? Is the URL live? Does the page contain the right content?

Today, you have two options: trust the agent blindly, or manually verify everything yourself. Neither scales.

With agents running multi-step pipelines — writing code, deploying services, calling APIs — you need a system that verifies results independently, not one that takes the agent's word for it.

How AgentGuard Works

AgentGuard is an MCP server that sits between you and your agent. You define a contract: what the agent must do, and what evidence proves it was done. The agent works autonomously, submits evidence, and AgentGuard verifies it independently.

You write a contract     Agent gets the task     Agent submits evidence     AgentGuard verifies
┌──────────────────┐     ┌────────────────┐     ┌──────────────────────┐    ┌─────────────────┐
│ "Deploy site,    │────>│ Reads task +   │────>│ Sends command output │───>│ Checks exit code│
│  prove it's live │     │ requirements   │     │ and deploy URL       │    │ GETs the URL    │
│  with http 200"  │     └────────────────┘     └──────────────────────┘    │ Verifies body   │
└──────────────────┘                                                        └────────┬────────┘
                                                                                     │
                                                                              VERIFIED or FAILED

No evidence = not done. It's that simple.
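
The "no evidence, not done" rule can be sketched in a few lines of Go. This is only an illustration of the idea, not AgentGuard's actual code: the function name `missingEvidence` and the in-memory map of verified IDs are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// missingEvidence returns the IDs of required evidence items that have not
// been verified yet, sorted for stable output. An empty result means the
// task may be marked done. (Hypothetical helper, not part of AgentGuard.)
func missingEvidence(required []string, verified map[string]bool) []string {
	var missing []string
	for _, id := range required {
		if !verified[id] {
			missing = append(missing, id)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	required := []string{"file_created", "content_correct"}
	verified := map[string]bool{"file_created": true}

	if missing := missingEvidence(required, verified); len(missing) > 0 {
		fmt.Println("not done, missing:", missing)
		return
	}
	fmt.Println("done")
}
```

The agent's own claim of completion never enters the decision; only the set of independently verified evidence IDs does.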

Features

Single Agent Mode

Run one agent with a contract — the classic mode. Define a task, required evidence, and bounds.

Pipeline Mode (v0.2)

Orchestrate multi-agent pipelines as a DAG. Each stage has its own agent, contract, and evidence. Stages run in dependency order with artifact passing between them.

stages:
  - id: build
    agent_id: builder
    contract: { ... }
    output_artifacts: ["binary"]
  - id: test
    agent_id: tester
    depends_on: [build]
    contract: { ... }
  - id: deploy
    agent_id: deployer
    depends_on: [test]
    contract: { ... }
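
Running stages "in dependency order" amounts to a topological sort of the `depends_on` graph. The sketch below uses Kahn's algorithm; the `Stage` struct and `topoOrder` function are hypothetical names for illustration, and the real scheduler may work differently.

```go
package main

import (
	"errors"
	"fmt"
)

// Stage holds only the two pipeline fields needed for ordering.
// The struct is an assumption made for this example.
type Stage struct {
	ID        string
	DependsOn []string
}

// topoOrder returns stage IDs so that every stage appears after all of its
// dependencies (Kahn's algorithm). It reports unknown dependencies and cycles.
func topoOrder(stages []Stage) ([]string, error) {
	indegree := make(map[string]int, len(stages))
	dependents := make(map[string][]string)
	for _, s := range stages {
		indegree[s.ID] = 0
	}
	for _, s := range stages {
		for _, dep := range s.DependsOn {
			if _, ok := indegree[dep]; !ok {
				return nil, fmt.Errorf("stage %q depends on unknown stage %q", s.ID, dep)
			}
			indegree[s.ID]++
			dependents[dep] = append(dependents[dep], s.ID)
		}
	}

	// Seed the queue in declaration order so the result is deterministic.
	var queue []string
	for _, s := range stages {
		if indegree[s.ID] == 0 {
			queue = append(queue, s.ID)
		}
	}

	var order []string
	for len(queue) > 0 {
		id := queue[0]
		queue = queue[1:]
		order = append(order, id)
		for _, next := range dependents[id] {
			indegree[next]--
			if indegree[next] == 0 {
				queue = append(queue, next)
			}
		}
	}
	if len(order) != len(stages) {
		return nil, errors.New("pipeline contains a dependency cycle")
	}
	return order, nil
}

func main() {
	order, err := topoOrder([]Stage{
		{ID: "deploy", DependsOn: []string{"test"}},
		{ID: "test", DependsOn: []string{"build"}},
		{ID: "build"},
	})
	fmt.Println(order, err)
}
```

Stages whose dependencies are all satisfied at the same time could also run concurrently; the sketch keeps a single sequential order for clarity.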

Free Mode (v0.2)

Give an orchestrator agent a high-level goal with meta-evidence. The orchestrator decomposes the task, creates sub-contracts, and delegates to worker agents — all within enforced limits (max sub-contracts, max depth, max cost).
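
One way to picture the enforced limits is a small budget object that every `task/create` call has to pass through. The sketch below is an assumption about how such a check could look: the `Limits` and `Budget` types, the field names, and the unit of cost are all hypothetical.

```go
package main

import "fmt"

// Limits mirrors the three bounds named above. Field names are assumptions.
type Limits struct {
	MaxSubContracts int
	MaxDepth        int
	MaxCost         float64 // unit is not specified in the README
}

// Budget tracks what an orchestrator has consumed so far.
type Budget struct {
	limits  Limits
	created int
	spent   float64
}

func NewBudget(l Limits) *Budget { return &Budget{limits: l} }

// Reserve admits one new sub-contract at the given depth and estimated cost,
// or returns an error without changing the budget.
func (b *Budget) Reserve(depth int, cost float64) error {
	switch {
	case b.created+1 > b.limits.MaxSubContracts:
		return fmt.Errorf("sub-contract limit reached (%d)", b.limits.MaxSubContracts)
	case depth > b.limits.MaxDepth:
		return fmt.Errorf("depth %d exceeds max depth %d", depth, b.limits.MaxDepth)
	case b.spent+cost > b.limits.MaxCost:
		return fmt.Errorf("cost %.2f would exceed max cost %.2f", b.spent+cost, b.limits.MaxCost)
	}
	b.created++
	b.spent += cost
	return nil
}

func main() {
	b := NewBudget(Limits{MaxSubContracts: 2, MaxDepth: 1, MaxCost: 5})
	fmt.Println(b.Reserve(1, 2)) // admitted
	fmt.Println(b.Reserve(2, 1)) // rejected: too deep
	fmt.Println(b.Reserve(1, 4)) // rejected: over budget
}
```

Rejected requests leave the counters untouched, so an orchestrator can retry with a cheaper or shallower sub-contract.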

MCP over SSE (v0.2)

Multiple agents connect to a single AgentGuard instance via HTTP Server-Sent Events. Each agent gets its own SSE stream and message endpoint. The SSE endpoint accepts an ?agent_id= query parameter, which identifies each agent when all instances share the same clientInfo.name.
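
To illustrate why the query parameter matters, here is a rough sketch of resolving an agent's identity from the connection URL. The function name `agentIDFor` and the fallback to clientInfo.name when no agent_id is given are assumptions, not documented behavior.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// agentIDFor picks the agent identity for an incoming SSE connection.
// It prefers the agent_id query parameter and (as an assumption) falls back
// to the MCP clientInfo.name when the parameter is absent.
func agentIDFor(requestURL, clientName string) (string, error) {
	u, err := url.Parse(requestURL)
	if err != nil {
		return "", fmt.Errorf("parse request URL: %w", err)
	}
	if id := u.Query().Get("agent_id"); id != "" {
		return id, nil
	}
	if clientName != "" {
		return clientName, nil
	}
	return "", errors.New("agent cannot be identified: no agent_id and no clientInfo.name")
}

func main() {
	// Two agents that report the same client name stay distinguishable.
	for _, target := range []string{"/sse?agent_id=builder", "/sse?agent_id=tester", "/sse"} {
		id, err := agentIDFor(target, "claude-code")
		fmt.Println(target, "->", id, err)
	}
}
```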

Quick Start

Install

go install github.com/agentguard/agentguard/cmd/agentguard@latest

Or build from source:

git clone https://github.com/agentguard/agentguard.git
cd agentguard
go build -o agentguard ./cmd/agentguard

1. Write a Contract

Create contract.yaml:

id: ct_deploy
version: 1

task:
  summary: "Create an output file"
  context: |
    Create a file at /tmp/agentguard-test/output.txt
    with the content "hello agentguard".

evidence:
  required:
    - id: file_created
      type: file_exists
      description: "Output file exists and is not empty"
      verify:
        path: "/tmp/agentguard-test/output.txt"
        non_empty: true

    - id: content_correct
      type: file_content
      description: "File contains the expected text"
      depends_on: file_created
      verify:
        path: "/tmp/agentguard-test/output.txt"
        contains: "hello agentguard"

bounds:
  timeout: "300s"
  max_retries: 2

on_failure:
  action: escalate

2. Validate

agentguard contract validate contract.yaml
# Contract is valid.

3. Run as MCP Server (stdio)

agentguard serve --contract contract.yaml

4. Run a Pipeline (SSE)

agentguard pipeline run --file pipeline.yaml --port 9200

Agents connect via SSE:

GET http://127.0.0.1:9200/sse?agent_id=builder
GET http://127.0.0.1:9200/sse?agent_id=tester
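
On the agent side, the SSE stream is plain text made of `event:` and `data:` lines separated by blank lines. The following is a simplified parser sketch for that wire format. It assumes AgentGuard follows the usual MCP-over-SSE convention of announcing the message endpoint in an `endpoint` event; the path `/message?agent_id=builder` in the example is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Event is one dispatched server-sent event.
type Event struct {
	Name string
	Data string
}

// parseSSE splits a raw SSE stream into events. It handles event names,
// multi-line data, and comment lines, and ignores the id and retry fields.
func parseSSE(stream string) []Event {
	var (
		events    []Event
		name      string
		dataLines []string
	)
	sc := bufio.NewScanner(strings.NewReader(stream))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case line == "":
			// A blank line dispatches the event, but only if it carries data.
			if len(dataLines) > 0 {
				if name == "" {
					name = "message" // default event type
				}
				events = append(events, Event{Name: name, Data: strings.Join(dataLines, "\n")})
			}
			name, dataLines = "", nil
		case strings.HasPrefix(line, ":"):
			// Comment line, often used as a keep-alive.
		case strings.HasPrefix(line, "event:"):
			name = strings.TrimSpace(strings.TrimPrefix(line, "event:"))
		case strings.HasPrefix(line, "data:"):
			value := strings.TrimPrefix(line, "data:")
			dataLines = append(dataLines, strings.TrimPrefix(value, " "))
		}
	}
	return events
}

func main() {
	raw := "event: endpoint\ndata: /message?agent_id=builder\n\n" +
		": keep-alive\n\n" +
		"data: {\"jsonrpc\":\"2.0\"}\n\n"
	for _, ev := range parseSSE(raw) {
		fmt.Printf("%s: %s\n", ev.Name, ev.Data)
	}
}
```

A real client would read from the HTTP response body as it arrives instead of from a complete string, but the line handling stays the same.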

5. Check the Audit Log

Every state transition and verification is recorded:

agentguard audit log --contract-id ct_deploy
# [14:28:01] contract.created     contract=ct_deploy
# [14:28:01] contract.state_changed contract=ct_deploy {from: created, to: assigned}
# [14:28:05] evidence.verified    contract=ct_deploy {evidence_id: file_created}
# [14:28:06] evidence.verified    contract=ct_deploy {evidence_id: content_correct}
# [14:28:06] contract.state_changed contract=ct_deploy {from: assigned, to: done}

MCP Tools

Standard Tools (all agents)

Tool	Purpose
`task/get`	Get the assigned task and evidence requirements
`evidence/submit`	Submit evidence for independent verification
`task/complete`	Declare the task done (fails if evidence is missing)
`task/blocked`	Report a blocker — triggers escalation
`status/get`	Check current verification progress
`artifact/get`	Retrieve input artifacts
`artifact/put`	Store output artifacts
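
Since the transport is JSON-RPC 2.0, an agent invokes these tools through the standard MCP `tools/call` method. The sketch below builds such a request for `evidence/submit`. The argument names (`evidence_id`, `path`) are assumptions made for illustration; the actual input schema is whatever the server advertises.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a JSON-RPC 2.0 request carrying an MCP tool call.
type rpcRequest struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  toolParams `json:"params"`
}

type toolParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// newToolCall serializes a tools/call request for the named tool.
func newToolCall(id int, tool string, args map[string]any) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolParams{Name: tool, Arguments: args},
	})
}

func main() {
	// Argument names below are hypothetical.
	req, err := newToolCall(1, "evidence/submit", map[string]any{
		"evidence_id": "file_created",
		"path":        "/tmp/agentguard-test/output.txt",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(req))
}
```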

Orchestrator Tools (Free Mode)

Tool	Purpose
`task/create`	Create a child task contract for a worker agent
`task/list-children`	List all child task contracts
`task/get-child`	Get details of a child task contract
`task/cancel-child`	Cancel a child task contract

Evidence Types

AgentGuard independently verifies each type:

Type	What the Agent Provides	What AgentGuard Checks
`command_output`	command, stdout, exit_code	Validates exit code, patterns in stdout
`http_check`	URL	Makes GET request, checks status code and body
`file_exists`	file path	Checks existence, size, SHA256 hash
`file_content`	file path	Checks contains/regex/valid JSON/valid YAML
`json_value`	JSON data	JSONPath extraction + equality/range checks
`test_result`	test command	Runs tests, verifies exit code (Phase 2)
`api_state`	resource ID	Calls provider API to verify state (Phase 2)
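
As a concrete illustration of one row, the `command_output` check can be sketched as an exit-code comparison followed by a regular-expression match on stdout. The types and function below are hypothetical and only mirror the fields named in the table.

```go
package main

import (
	"fmt"
	"regexp"
)

// CommandEvidence is what the agent submits. (Hypothetical type.)
type CommandEvidence struct {
	Command  string
	Stdout   string
	ExitCode int
}

// CommandRule is what the contract requires. (Hypothetical type.)
type CommandRule struct {
	ExitCode      int
	StdoutPattern string // optional regular expression
}

// verifyCommand returns nil when the evidence satisfies the rule.
func verifyCommand(ev CommandEvidence, rule CommandRule) error {
	if ev.ExitCode != rule.ExitCode {
		return fmt.Errorf("exit code %d, want %d", ev.ExitCode, rule.ExitCode)
	}
	if rule.StdoutPattern == "" {
		return nil
	}
	re, err := regexp.Compile(rule.StdoutPattern)
	if err != nil {
		return fmt.Errorf("invalid stdout_pattern: %w", err)
	}
	if !re.MatchString(ev.Stdout) {
		return fmt.Errorf("stdout does not match %q", rule.StdoutPattern)
	}
	return nil
}

func main() {
	rule := CommandRule{ExitCode: 0, StdoutPattern: `https://.*\.vercel\.app`}
	ev := CommandEvidence{
		Command:  "./deploy.sh", // hypothetical command
		Stdout:   "Deployed to https://my-app.vercel.app",
		ExitCode: 0,
	}
	fmt.Println("verification error:", verifyCommand(ev, rule))
}
```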

Evidence Chains

Evidence can depend on other evidence. For example, first verify a deploy command succeeded, then use the URL from its output to verify the site is live:

evidence:
  required:
    - id: deploy_done
      type: command_output
      verify:
        exit_code: 0
        stdout_pattern: "https://.*\\.vercel\\.app"

    - id: site_live
      type: http_check
      depends_on: deploy_done
      verify:
        extract_pattern: "https://[\\S]+"
        status: 200
        body_contains: "Welcome"
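
The chain above can be read as two steps: pull a URL out of the upstream evidence with `extract_pattern`, then check that URL over HTTP. The sketch below follows that reading. It is an assumption about the mechanics, not the actual verifier: `verifySiteLive`, `fetchFunc`, and `httpGet` are hypothetical names, and the fetch step is injected so the example runs without network access.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"regexp"
	"strings"
	"time"
)

// fetchFunc performs the HTTP check. Injecting it keeps the example offline.
type fetchFunc func(target string) (status int, body string, err error)

// httpGet is a real fetchFunc built on net/http with a 10s timeout.
func httpGet(target string) (int, string, error) {
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get(target)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, "", err
	}
	return resp.StatusCode, string(body), nil
}

// verifySiteLive extracts a URL from upstream stdout, fetches it, and checks
// the status code and body.
func verifySiteLive(upstreamStdout, extractPattern string, wantStatus int, bodyContains string, fetch fetchFunc) error {
	re, err := regexp.Compile(extractPattern)
	if err != nil {
		return fmt.Errorf("invalid extract_pattern: %w", err)
	}
	target := re.FindString(upstreamStdout)
	if target == "" {
		return errors.New("no URL found in upstream evidence")
	}
	status, body, err := fetch(target)
	if err != nil {
		return fmt.Errorf("GET %s: %w", target, err)
	}
	if status != wantStatus {
		return fmt.Errorf("GET %s: status %d, want %d", target, status, wantStatus)
	}
	if !strings.Contains(body, bodyContains) {
		return fmt.Errorf("GET %s: body does not contain %q", target, bodyContains)
	}
	return nil
}

func main() {
	stdout := "Deployed to https://my-app.vercel.app\n"
	// A canned response stands in for the network; pass httpGet for a live check.
	fake := func(target string) (int, string, error) {
		return 200, "<h1>Welcome</h1>", nil
	}
	fmt.Println("verification error:", verifySiteLive(stdout, `https://\S+`, 200, "Welcome", fake))
}
```

Because the URL comes from evidence that was already verified, the HTTP check never depends on a URL the agent merely claims to have deployed.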

Contract Lifecycle

CREATED ──> ASSIGNED ──> VERIFYING ──> DONE
   │            │            │
   │            │            ├──> RETRYING ──> ASSIGNED (retry)
   │            │            │                     │
   │            │            └──> FAILED           └──> FAILED (max retries)
   │            │
   │            └──> EXPIRED (timeout)
   │
   └──> CANCELLED

Every transition is recorded in the audit log with a timestamp.
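
The diagram can be encoded as a table of allowed transitions. The sketch below is one reading of it, with the "max retries" failure modeled as RETRYING to FAILED; the type and function names are hypothetical.

```go
package main

import "fmt"

// State is a contract lifecycle state.
type State string

const (
	Created   State = "created"
	Assigned  State = "assigned"
	Verifying State = "verifying"
	Retrying  State = "retrying"
	Done      State = "done"
	Failed    State = "failed"
	Expired   State = "expired"
	Cancelled State = "cancelled"
)

// allowed lists the outgoing edges of each non-terminal state, as read from
// the lifecycle diagram. Terminal states have no entry.
var allowed = map[State][]State{
	Created:   {Assigned, Cancelled},
	Assigned:  {Verifying, Expired},
	Verifying: {Done, Retrying, Failed},
	Retrying:  {Assigned, Failed},
}

// canTransition reports whether moving from one state to another is legal.
func canTransition(from, to State) bool {
	for _, next := range allowed[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	path := []State{Created, Assigned, Verifying, Retrying, Assigned, Verifying, Done}
	for i := 1; i < len(path); i++ {
		fmt.Printf("%s -> %s: %v\n", path[i-1], path[i], canTransition(path[i-1], path[i]))
	}
	fmt.Println("done -> assigned:", canTransition(Done, Assigned))
}
```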

Architecture

┌──────────────────────────────────────────────────────────┐
│                        AgentGuard                        │
│                                                          │
│  ┌─────────────┐  ┌──────────────┐  ┌─────────┐          │
│  │  Transport  │  │   Contract   │  │  Store  │          │
│  │  MCP stdio  │  │  Lifecycle   │  │ (bbolt) │          │
│  │  MCP SSE    │  │  DAG / Parse │  │         │          │
│  └──────┬──────┘  └──────┬───────┘  └────┬────┘          │
│         │                │               │               │
│  ┌──────┴────────────────┴───────────────┴──────────┐    │
│  │              Evidence Verifiers                  │    │
│  │  command | http | file | json | ...              │    │
│  └──────────────────────────────────────────────────┘    │
│                                                          │
│  ┌──────────────────────────────────────────────────┐    │
│  │              Control Plane                       │    │
│  │  limits | circuit breaker | scheduler            │    │
│  └──────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────┘
         ▲                              │
    JSON-RPC 2.0                   Audit Log
    stdio / SSE                  (append-only)
         │
    ┌────┴────┐
    │ Agents  │
    └─────────┘

Single binary, zero external dependencies at runtime
bbolt embedded database — no PostgreSQL, no Redis, no Docker
Append-only audit log for compliance
Go 1.23+

Configuration

Create agentguard.yaml (all values are optional — sensible defaults are built in):

store:
  path: "./data/agentguard.db"
  artifacts_dir: "./data/artifacts"

defaults:
  timeout: 300s
  max_retries: 2
  max_evidence_submissions: 50

escalation:
  webhook_url: "https://hooks.slack.com/..."

verification:
  http_check:
    timeout: 10s
    retries: 3

logging:
  level: info       # debug | info | warn | error
  format: json      # json | text
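
Duration values such as 300s and 10s follow the format accepted by Go's `time.ParseDuration`. The sketch below shows how an optional timeout could fall back to a built-in default. Treating 300s as the default is an assumption taken from the example values above, and `timeoutOrDefault` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// defaultTimeout is assumed to match the example config value.
const defaultTimeout = 300 * time.Second

// timeoutOrDefault parses an optional duration string. An empty string
// selects the default; invalid or non-positive values are rejected.
func timeoutOrDefault(raw string) (time.Duration, error) {
	if raw == "" {
		return defaultTimeout, nil
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid timeout %q: %w", raw, err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("timeout must be positive, got %s", d)
	}
	return d, nil
}

func main() {
	for _, raw := range []string{"", "10s", "300s", "soon"} {
		d, err := timeoutOrDefault(raw)
		fmt.Printf("%q -> %v (err: %v)\n", raw, d, err)
	}
}
```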

CLI Reference

agentguard serve                        Start MCP server (stdio, single agent)
  --contract <path>                     Contract YAML (required)
  --config <path>                       Config file
  --log-level <level>                   debug|info|warn|error

agentguard pipeline run                 Start pipeline server (SSE, multi-agent)
  --file <path>                         Pipeline YAML (required)
  --port <port>                         HTTP port (default: 9200)

agentguard contract validate <path>     Validate contract YAML

agentguard audit log                    View audit events
  --contract-id <id>                    Filter by contract
  --limit <n>                           Max entries (default: 100)

agentguard status                       System health check
agentguard version                      Print version

Testing

# Unit and integration tests
go test ./...

# E2E test with Claude Code agents (requires claude CLI)
bash test/e2e/pipeline/run.sh

Why Not Just Check Manually?

	Manual	AgentGuard
Speed	You check after the fact	Real-time, as evidence comes in
Consistency	Depends on your attention	Same checks every time
Audit trail	Hope you remember	Every event recorded with proof
Scale	1 agent, maybe	Multi-agent pipelines with dependency chains
Cost	Your time	Milliseconds of CPU

Roadmap

v0.1 — Single agent, contract verification, MCP server (stdio), audit log
v0.2 (current) — Multi-agent pipelines, Free Mode, MCP over SSE, circuit breaker, limits enforcement
v0.3 — gRPC API, terminal dashboard, plugin system
v0.4 — Kubernetes operator, distributed mode

License

MIT
