GitHub - MarouaBoud/cairn-protocol: Standardized checkpoint & recovery protocol for AI agents — ERC proposal, live on Base Sepolia (Testnet)

CAIRN Protocol - Agent Failure and Recovery Protocol

⚠️ Testnet only. Deployed on Base Sepolia (Chain ID: 84532). Mainnet deployment pending security audit.

CAIRN turns every agent failure into a lesson every other agent inherits — enforced by escrow, validated by attestation, owned by no one.

Agents learn together.

→ Evaluating the protocol? Whitepaper v2 → ERC Spec
→ Integrating CAIRN? Quick Start → Integration Guide
→ Auditing contracts? Contracts → Security
→ Reproducing the simulation? simulation/ → python3 -m simulation.run_eq4 (seed=42)

The Problem: Invisible Failures, Wasted Work

Multi-step agent workflows fail roughly 80% of the time: at 85% success per action, a 10-step workflow completes only ~20% of the time (0.85^10 ≈ 0.197). When failures happen today:

What Happens	Cost
Work is lost	Restart from zero — all progress gone
Escrow locks	Funds stuck in ambiguous state for hours/days
No one learns	Same failure repeats across the ecosystem
Human intervention required	2am pages, manual debugging, delayed resolution

The ecosystem is bleeding value. Every silent failure is money lost, time wasted, and a lesson unlearned.
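The ~20% completion figure above follows from compounding the per-action success rate. The sketch below reproduces it, assuming every step succeeds independently with the same probability (that independence assumption is an idealization added here, and `workflow_success_rate` is a hypothetical helper name).

```python
def workflow_success_rate(per_step_success: float, steps: int) -> float:
    """Probability that all steps succeed, assuming independent steps."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be in [0, 1]")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


rate = workflow_success_rate(0.85, 10)
print(f"completion: {rate:.1%}, failure: {1 - rate:.1%}")
# completion: 19.7%, failure: 80.3%
```

Under the same assumption, longer workflows degrade quickly, which is why per-step checkpoints matter more as step count grows.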

The Cost of Doing Nothing

Monthly failure cost = failures × avg_escrow × (1 - recovery_rate) + restart_cost + opportunity_cost

Example (single operator):
- 20 failures/month × $50 avg escrow × 100% loss rate = $1,000 direct loss
- 20 restarts × $15 gas (duplicate work) = $300 gas waste
- 20 failures × 4 hours avg delay × $50/hour opportunity = $4,000 opportunity cost
- Total: ~$5,300/month lost to unrecovered failures
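The single-operator example can be checked with a few lines of arithmetic. This is a minimal sketch of the formula above; the field names are illustrative and not part of any CAIRN API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureCostInputs:
    failures_per_month: int
    avg_escrow_usd: float
    recovery_rate: float             # fraction of escrow recovered, 0.0 to 1.0
    restart_gas_usd: float           # gas spent per restart (duplicate work)
    avg_delay_hours: float
    opportunity_usd_per_hour: float


def monthly_failure_cost(x: FailureCostInputs) -> dict:
    """Break the monthly cost into the three terms of the formula."""
    direct = x.failures_per_month * x.avg_escrow_usd * (1 - x.recovery_rate)
    restart = x.failures_per_month * x.restart_gas_usd
    opportunity = x.failures_per_month * x.avg_delay_hours * x.opportunity_usd_per_hour
    return {
        "direct_loss": direct,
        "restart_cost": restart,
        "opportunity_cost": opportunity,
        "total": direct + restart + opportunity,
    }


costs = monthly_failure_cost(
    FailureCostInputs(
        failures_per_month=20,
        avg_escrow_usd=50.0,
        recovery_rate=0.0,           # 100% loss rate in the example
        restart_gas_usd=15.0,
        avg_delay_hours=4.0,
        opportunity_usd_per_hour=50.0,
    )
)
print(f"total: ${costs['total']:,.0f}")
# total: $5,300
```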

A Failure Story: What Happens Today

Scenario: DeFi rebalancing agent on Base
Time: 2:47am UTC, Saturday
Task: Rebalance $12,000 across 3 pools

❌ Without CAIRN

Step	Action	Result
1	Price fetch	✅ SUCCESS
2	Approve token A	✅ SUCCESS
3	Swap on DEX	❌ FAILED — rate limit (429)

What happened next:

Agent stopped. No heartbeat for 45 minutes.
Escrow: $45 locked in ambiguous state
Operator notified: 7:15am (4.5 hours later)
Resolution: Manual restart from scratch
Work lost: Approvals (Step 2) must be re-done

Total cost: $45 escrow delay + $12 gas + 4.5 hours

✅ With CAIRN

Time	Event
2:47am	Agent fails (rate limit)
2:52am	CAIRN detects (liveness timeout)
2:52am	Classified: RESOURCE failure, score: 0.74
2:53am	Fallback agent assigned from pool
2:53am	Fallback reads checkpoints — approvals preserved
3:08am	Task completed by fallback
3:08am	Escrow split: Original ~67% / Fallback ~33%

Total delay: 21 minutes (vs. 4.5 hours)
Work preserved: Yes (checkpoint 2)
Escrow settled: Fairly, proportional to verified work

Why Now

Signal	Status
ERC-8183 is live	Agent escrow infrastructure shipped March 2026
~2,000 mechs deployed on Olas	~500 active daily — real fallback pool available today
10M+ agent-to-agent transactions	Real economic activity flowing through agent rails
~50% multi-agent task completion rate	Per Lu et al. 2025 across TaskWeaver, MetaGPT, AutoGen

The infrastructure is ready. The problem is severe. The gap is real.

The Cairn Metaphor

A cairn is a stack of stones left by travelers to mark the path — so the next traveler knows where to go, and where not to. Every agent failure leaves a cairn. Every future agent reads it.

Travelers in wilderness stack stones — cairns — to mark where they have been, which paths are safe, and which lead nowhere. Each cairn is left by one traveler but read by every traveler who comes after. No traveler owns the cairn network. Every traveler benefits from it.

CAIRN applies this to agents. Every failure leaves a cairn — an execution record that marks this exact task type, this exact failure mode, this exact cost. Every future agent reads the cairns before setting out. The ecosystem navigates by accumulated failure intelligence, not blind optimism.

What is CAIRN?

CAIRN is a standardized agent failure and recovery protocol.

It defines the exact sequence of events that must occur when an agent fails mid-task — from detection, through classification, through fallback assignment, through settlement — without requiring any human intervention and without requiring trust between agents.

The Protocol in One Paragraph

An operator initiates a task with a budget, deadline, and task type. Before the task starts, CAIRN queries the execution intelligence layer for known failure patterns on this task type and recommends the best-fit agent. The agent runs. It emits liveness signals. It writes checkpoints after each subtask. If it fails — for any reason — CAIRN detects it automatically, classifies the failure, computes a recovery score, and either assigns a fallback agent (who resumes from the last checkpoint) or routes to dispute. On resolution, escrow splits proportionally between the original and fallback agents based on verified work done. The execution record is written. The intelligence layer grows. The next agent inherits the lesson.

Secondary Output: Execution Intelligence

As a byproduct of the recovery protocol running, CAIRN accumulates an execution intelligence layer — a shared, queryable record of every failure, every recovery, and every successful completion across the ecosystem.

This is what makes CAIRN compound in value over time. The knowledge graph grows automatically. The more agents integrate CAIRN, the richer the intelligence layer becomes. Agents query it before starting tasks. The ecosystem gets smarter from every failure.

The knowledge graph is the byproduct. The recovery protocol is the core.

What CAIRN is NOT

Not a new agent framework. CAIRN wraps any existing framework — LangGraph, Olas SDK, AgentKit, custom builds.
Not a knowledge graph product. Bonfires (the visualization layer) is a window into the intelligence layer, not the protocol itself.
Not a centralized service. Every state transition is enforced by the CAIRN state machine contract. No server. No admin key. No human required.
Not a replacement for ERC-8183 or ERC-8004. CAIRN integrates and extends both. It is an ERC-8183 Hook and an ERC-8004 reputation writer.
Not optional infrastructure. The escrow condition makes record-writing mandatory — agents cannot receive payment without completing the protocol.

The Six-State Machine

At any moment, every task is in exactly one of these states. No silent failures. No ambiguous states. Every transition is enforced on-chain.

                    ┌─────────────────────────────────────────┐
                    │                                         │
                    │   ┌─────────┐                           │
                    │   │  IDLE   │  ← task created           │
                    │   └────┬────┘                           │
                    │        │ startTask()                    │
                    │        ▼                                │
                    │   ┌─────────┐  heartbeat ───────────────┤
                    │   │ RUNNING │  checkpoint               │
                    │   └────┬────┘                           │
                    │        │ failure detected               │
                    │        ▼                                │
                    │   ┌─────────┐                           │
                    │   │ FAILED  │                           │
                    │   └────┬────┘                           │
                    │  score ≥ 0.35       score < 0.35        │
                    │        │                  │             │
                    │        ▼                  ▼             │
                    │  ┌───────────┐     ┌──────────┐         │
                    │  │RECOVERING │     │ DISPUTED │         │
                    │  └─────┬─────┘     └────┬─────┘         │
                    │        │ completes      │ arbiter       │
                    │        └───────┬────────┘               │
                    │                ▼                        │
                    │         ┌──────────┐                    │
                    │         │ RESOLVED │ ← terminal         │
                    │         └──────────┘                    │
                    └─────────────────────────────────────────┘

State	Description
IDLE	Task created, intelligence queried, agent recommended
RUNNING	Agent executing, heartbeats active, checkpoints committed
FAILED	Liveness / budget / deadline violation detected automatically
RECOVERING	Fallback assigned, resumes from last valid checkpoint
DISPUTED	Low recovery score, arbiter intervention required
RESOLVED	Escrow settled, reputation updated, record written (terminal)
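The diagram and table above can be read as a transition table. The sketch below is an off-chain illustration only, not the CairnCore contract logic: it encodes the edges drawn in the diagram, plus a direct RUNNING → RESOLVED edge for successful completion, which is an assumption since the diagram does not draw it explicitly.

```python
from enum import Enum


class TaskState(Enum):
    IDLE = "IDLE"
    RUNNING = "RUNNING"
    FAILED = "FAILED"
    RECOVERING = "RECOVERING"
    DISPUTED = "DISPUTED"
    RESOLVED = "RESOLVED"


# Allowed next states for each state.
ALLOWED_TRANSITIONS = {
    TaskState.IDLE: {TaskState.RUNNING},                          # startTask()
    TaskState.RUNNING: {TaskState.FAILED, TaskState.RESOLVED},    # RESOLVED: assumed success path
    TaskState.FAILED: {TaskState.RECOVERING, TaskState.DISPUTED}, # routed by recovery score
    TaskState.RECOVERING: {TaskState.RESOLVED},                   # fallback completes
    TaskState.DISPUTED: {TaskState.RESOLVED},                     # arbiter rules or timeout
    TaskState.RESOLVED: set(),                                    # terminal
}


def transition(current: TaskState, target: TaskState) -> TaskState:
    """Return the new state, or raise if the move is not in the table."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target


state = TaskState.IDLE
for nxt in (TaskState.RUNNING, TaskState.FAILED, TaskState.RECOVERING, TaskState.RESOLVED):
    state = transition(state, nxt)
print(state.value)
```

Keeping the table explicit makes the "no ambiguous states" property easy to test: any move not listed is rejected.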

Three-Class Failure Taxonomy

Class	Trigger	Class weight F (v2)	Default Path
LIVENESS	Heartbeat missed beyond `heartbeat_interval`	0.70 (high recovery)	RECOVERING
RESOURCE	Budget exceeded or deadline passed	0.30 (partial recovery)	RECOVERING / DISPUTED
LOGIC	Invalid checkpoint, schema violation, hallucination	0.00 (no recovery)	DISPUTED

Recovery Score Formula (v2 — multiplicative):

r = F^0.80 × B^0.35 × D^0.15

Routing: r ≥ 0.40 → RECOVERING (full scope) | 0.35 ≤ r < 0.40 → RECOVERING (reduced scope) | r < 0.35 → DISPUTED

The multiplicative form captures the "any-factor-kills-it" dynamic: if budget, deadline, or class recoverability approaches zero, the score collapses to zero — matching the ground-truth recovery dynamics. The formula was selected after Monte Carlo simulation across 100,000 task-failure events and 16 experiments comparing it against three linear alternatives; see Whitepaper §6.4 and simulation/RESULTS_EQ4.md.

v1 testnet note. The contracts currently deployed on Base Sepolia implement the interim linear formula r = 0.5·F + 0.3·B + 0.2·D with class weights (0.90, 0.50, 0.10) and a single threshold at 0.30. The v2 multiplicative formula ships in RecoveryRouterV2.sol (24 unit tests passing) and migrates via governance through the IRecoveryRouter interface. See PRD-04 for the migration plan.
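To compare the two formulas side by side, here is a minimal off-chain sketch. Several things are assumptions rather than statements from the spec: B and D are treated as the remaining budget and remaining deadline fractions in [0, 1], the v1 class weights (0.90, 0.50, 0.10) are mapped to LIVENESS, RESOURCE, LOGIC in table order, and the v1 threshold is applied as `r >= 0.30`. Function names are hypothetical.

```python
CLASS_WEIGHT_V2 = {"LIVENESS": 0.70, "RESOURCE": 0.30, "LOGIC": 0.00}
CLASS_WEIGHT_V1 = {"LIVENESS": 0.90, "RESOURCE": 0.50, "LOGIC": 0.10}  # assumed order


def _clamp01(x: float) -> float:
    return max(0.0, min(1.0, x))


def score_v2(failure_class: str, budget_left: float, deadline_left: float) -> float:
    """Multiplicative score r = F^0.80 * B^0.35 * D^0.15."""
    f = CLASS_WEIGHT_V2[failure_class]
    b, d = _clamp01(budget_left), _clamp01(deadline_left)
    return (f ** 0.80) * (b ** 0.35) * (d ** 0.15)


def route_v2(r: float) -> str:
    if r >= 0.40:
        return "RECOVERING (full scope)"
    if r >= 0.35:
        return "RECOVERING (reduced scope)"
    return "DISPUTED"


def score_v1(failure_class: str, budget_left: float, deadline_left: float) -> float:
    """Interim linear score r = 0.5*F + 0.3*B + 0.2*D."""
    f = CLASS_WEIGHT_V1[failure_class]
    return 0.5 * f + 0.3 * _clamp01(budget_left) + 0.2 * _clamp01(deadline_left)


def route_v1(r: float) -> str:
    return "RECOVERING" if r >= 0.30 else "DISPUTED"  # single threshold (assumed inclusive)


for cls in ("LIVENESS", "RESOURCE", "LOGIC"):
    r2, r1 = score_v2(cls, 1.0, 1.0), score_v1(cls, 1.0, 1.0)
    print(f"{cls:9s} v2={r2:.3f} -> {route_v2(r2):27s} v1={r1:.3f} -> {route_v1(r1)}")
```

Under these assumptions, a LOGIC failure with full budget and deadline remaining still clears the v1 linear threshold, while the v2 multiplicative form sends it to DISPUTED. That is the "any-factor-kills-it" behavior the multiplicative form is meant to capture.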

The 14-Action Protocol

Phase 1 — Initialization (A1–A3)

A1 · Operator submits task spec: task_type, budget_cap, deadline, heartbeat_interval, output schemas per subtask.

A2 · Protocol queries execution intelligence layer by task_type → known failure patterns, real cost distribution from prior executions, recommended agent (highest success rate + reputation), known-bad time windows.

A3 · Operator confirms. Locks escrow. Pre-authorizes CAIRN for fallback sub-delegation (ERC-7710 caveat: allowed actions + budget cap + fallback pool). State → RUNNING.

Phase 2 — Running (A4–A6)

A4 · Agent completes subtask N. Writes output to IPFS. Calls commitCheckpoint(taskId, N, CID, cost). Protocol validates CID against declared schema.

A5 · Agent emits liveness ping: heartbeat(taskId). Resets liveness timer every heartbeat_interval.

A6 · Protocol enforces (public, permissionless — anyone can call): checkLiveness() · checkBudget() · checkDeadline().
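The three enforcement checks in A6 map directly onto the failure taxonomy. The sketch below is an off-chain illustration with hypothetical names and fields, not the on-chain `checkLiveness()` / `checkBudget()` / `checkDeadline()` implementations. Checking budget and deadline before liveness is an arbitrary ordering chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskSnapshot:
    last_heartbeat: float       # unix seconds of the most recent heartbeat
    heartbeat_interval: float   # seconds allowed between heartbeats
    spent: float                # budget consumed so far
    budget_cap: float
    deadline_at: float          # unix seconds


def liveness_violated(t: TaskSnapshot, now: float) -> bool:
    return now - t.last_heartbeat > t.heartbeat_interval


def budget_violated(t: TaskSnapshot) -> bool:
    return t.spent > t.budget_cap


def deadline_violated(t: TaskSnapshot, now: float) -> bool:
    return now > t.deadline_at


def detect_failure(t: TaskSnapshot, now: float) -> Optional[str]:
    """Return the failure class for the first violated condition, or None."""
    if budget_violated(t) or deadline_violated(t, now):
        return "RESOURCE"
    if liveness_violated(t, now):
        return "LIVENESS"
    return None


snap = TaskSnapshot(last_heartbeat=0, heartbeat_interval=60,
                    spent=0.01, budget_cap=0.05, deadline_at=3600)
print(detect_failure(snap, now=30), detect_failure(snap, now=61))
```

Because the checks depend only on recorded state and the current time, any caller can evaluate them, which is what makes permissionless enforcement possible.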

Phase 3 — Failed (A7–A8)

A7 · Protocol classifies failure type (LIVENESS / RESOURCE / LOGIC). Computes recovery_score. Writes Failure Record to IPFS. Emits TaskFailed(taskId, recordCID, recoveryScore).

A8 · Routes by recovery score, using the v2 thresholds from the failure taxonomy: score ≥ 0.40 → RECOVERING (full scope). 0.35 ≤ score < 0.40 → RECOVERING (reduced scope). score < 0.35 → DISPUTED.

Phase 4 — Recovering (A9–A11)

A9 · Queries execution intelligence for best fallback: task_type match + reputation + availability.

A10 · Transfers state to fallback: checkpoint CID list, next_subtask_index, remaining budget, remaining deadline, scoped permissions (ERC-7710 pre-authorized caveat from A3).

A11 · Fallback reads checkpoint list from IPFS, resumes from next_subtask_index. New liveness clock starts. Continues A4/A5/A6 cycle.
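The state handed to the fallback in A10 and consumed in A11 can be pictured as a small record. This is a sketch with hypothetical names. It assumes subtasks are numbered from 1 (as in the Quick Start's `subtask_n=1`) and leaves out the ERC-7710 scoped permissions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class HandoffState:
    checkpoint_cids: Tuple[str, ...]   # one CID per verified checkpoint
    next_subtask_index: int            # where the fallback resumes (1-based, assumed)
    remaining_budget: float
    remaining_deadline_s: float


def build_handoff(checkpoint_cids: List[str], budget_cap: float, spent: float,
                  deadline_at: float, now: float) -> HandoffState:
    """Package what the fallback needs to resume from the last checkpoint."""
    return HandoffState(
        checkpoint_cids=tuple(checkpoint_cids),
        next_subtask_index=len(checkpoint_cids) + 1,
        remaining_budget=max(0.0, budget_cap - spent),
        remaining_deadline_s=max(0.0, deadline_at - now),
    )


def subtasks_to_run(handoff: HandoffState, total_subtasks: int) -> List[int]:
    """Subtask indices the fallback still has to execute."""
    return list(range(handoff.next_subtask_index, total_subtasks + 1))


# Two checkpoints committed out of three subtasks, as in the failure story.
handoff = build_handoff(["cid-1", "cid-2"], budget_cap=0.05, spent=0.02,
                        deadline_at=3600, now=600)
print(handoff.next_subtask_index, subtasks_to_run(handoff, total_subtasks=3))
```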

Phase 5 — Resolved (A12)

A12 · Computes escrow split by verified checkpoint count. Releases escrow. Writes Resolution Record to IPFS. Emits TaskResolved. Writes positive reputation signal to ERC-8004. State → RESOLVED (terminal).
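A proportional split by verified checkpoint count might look like the sketch below. It uses integer arithmetic, as on-chain code would, and ignores any protocol or arbiter fees. Assigning the rounding remainder to the fallback is an arbitrary choice for this example, not something the source specifies.

```python
from typing import Tuple


def split_escrow(total_wei: int, original_checkpoints: int,
                 fallback_checkpoints: int) -> Tuple[int, int]:
    """Split escrow between original and fallback agents by checkpoint count."""
    if total_wei < 0 or original_checkpoints < 0 or fallback_checkpoints < 0:
        raise ValueError("inputs must be non-negative")
    verified = original_checkpoints + fallback_checkpoints
    if verified == 0:
        raise ValueError("no verified checkpoints to split on")
    original_share = total_wei * original_checkpoints // verified
    fallback_share = total_wei - original_share   # remainder goes to fallback (assumed)
    return original_share, fallback_share


# Original agent verified 2 checkpoints, fallback verified 1.
print(split_escrow(45_000_000, 2, 1))
```

Computing the second share by subtraction guarantees the two amounts always sum to the full escrow, so no value is stranded by rounding.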

Phase 6 — Disputed (A13–A14)

A13 · Holds escrow. Writes negative reputation to ERC-8004. Exposes Failure Record CID publicly. Starts arbiter_timeout clock. Emits TaskDisputed(taskId, recordCID, arbiterTimeout).

A14 · Registered arbiter reads Failure Record, calls rule(taskId, outcome). Arbiter fee deducted from escrow. If timeout expires with no arbiter: auto-refund operator. Either path → RESOLVED.
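The two exits from DISPUTED in A14 can be sketched as a single settlement function. The names, the return shape, and the choice to let a submitted ruling take precedence over the timeout are all assumptions for illustration. The source does not say how the post-fee escrow is distributed under each ruling, so the sketch only reports the distributable amount.

```python
from typing import Optional


def settle_dispute(escrow_wei: int, arbiter_fee_wei: int, dispute_started_at: float,
                   arbiter_timeout_s: float, now: float,
                   ruling: Optional[str]) -> Optional[dict]:
    """Return a settlement, or None while the dispute is still open."""
    if ruling is not None:
        fee = min(arbiter_fee_wei, escrow_wei)        # fee is deducted from escrow
        return {"path": "ruled", "ruling": ruling, "arbiter_fee": fee,
                "distributable": escrow_wei - fee, "operator_refund": 0}
    if now - dispute_started_at > arbiter_timeout_s:
        # No arbiter acted before the timeout: auto-refund the operator.
        return {"path": "timeout", "ruling": None, "arbiter_fee": 0,
                "distributable": 0, "operator_refund": escrow_wei}
    return None                                       # still DISPUTED


print(settle_dispute(1000, 50, dispute_started_at=0, arbiter_timeout_s=3600,
                     now=3601, ruling=None))
```

Either non-None result corresponds to the transition into RESOLVED.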

Architecture

Four layers. Only the CAIRN Protocol Layer is new code. Everything else integrates live existing infrastructure.

┌──────────────────────────────────────────────────────────────┐
│ ACTORS                                                        │
│ Operator · Primary Agent · Fallback Pool · Arbiter            │
└─────────────────────────────┬────────────────────────────────┘
                              │
┌─────────────────────────────▼────────────────────────────────┐
│ CAIRN PROTOCOL LAYER                    ← only new code       │
│ CairnCore · RecoveryRouter · FallbackPool · ArbiterRegistry   │
└─────────────────────────────┬────────────────────────────────┘
                              │ integrates with
┌─────────────────────────────▼────────────────────────────────┐
│ ETHEREUM STANDARDS LAYER                ← existing live infra │
│ ERC-8183 (escrow + hooks) · ERC-8004 (identity + reputation)  │
│ ERC-7710 (delegation) · Olas Mech Marketplace                 │
└─────────────────────────────┬────────────────────────────────┘
                              │ writes to / reads from
┌─────────────────────────────▼────────────────────────────────┐
│ EXECUTION INTELLIGENCE LAYER            ← grows automatically │
│ IPFS execution records · Bonfires graph · The Graph indexer   │
└─────────────────────────────┬────────────────────────────────┘
                              │ deployed on
┌─────────────────────────────▼────────────────────────────────┐
│ BASE SEPOLIA                                                  │
│ ~2s block time · low gas · AgentKit native · ERC-8183 live    │
└──────────────────────────────────────────────────────────────┘

Quick Start

pip install cairn-sdk

# or clone locally
git clone https://github.com/MarouaBoud/cairn-protocol
cd cairn-protocol && pip install -e ./sdk

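import asyncio
import os

from cairn_sdk import CairnClient, CairnAgent


async def main() -> None:
    client = CairnClient(
        rpc_url="https://sepolia.base.org",
        contract_address="0xB65596B21d670b6C670106C3e3c7E5FFf8E3A640",  # CairnCore
        private_key=os.environ["PRIVATE_KEY"],  # read from the environment, never hard-code
    )
    # Assumption: CairnAgent wraps the client; see the SDK Quickstart for the actual constructor.
    agent = CairnAgent(client)

    # Submit a task with checkpoint protocol
    task = await client.submit_task(
        task_type="defi.rebalance",
        budget_cap=0.05,           # ETH
        deadline=3600,             # seconds
        heartbeat_interval=60,
    )

    # Checkpoint after each subtask
    await agent.checkpoint(task.id, subtask_n=1, output_cid="Qm...")

    # Heartbeat to signal liveness
    await agent.heartbeat(task.id)


# The SDK calls are coroutines, so run them inside an event loop.
asyncio.run(main())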

📚 Full guides: Integration · SDK Quickstart · CLI Reference

AI Agent Skill Endpoint

CAIRN exposes machine-readable endpoints for AI agents to fetch integration instructions:

# Quick integration guide (5-minute setup)
curl -s https://cairn-protocol-iona-78423aa1.vercel.app/skill.md

# Full protocol documentation
curl -s https://cairn-protocol-iona-78423aa1.vercel.app/cairn.md

These endpoints return markdown that AI agents can parse to integrate CAIRN into their workflows automatically.

Documentation

Document	Description
Whitepaper v2.0	The protocol specification — problem, mechanism, formal proofs, simulation-validated formula, economic model
ERC Specification	Technical standard (EIP-1 format draft)
Security	Security model, attack vectors, mitigations
Changelog	Version history
Simulation results	Monte Carlo calibration (Runs 1-4, 16 experiments, 100k events each); see `RESULTS_EQ4.md` for the headline multiplicative-formula result

Technical Documentation

Document	Description
Concepts	Failure taxonomy, state machine, glossary
Architecture	System design, protocol flow diagrams
Execution Intelligence	Knowledge graph, queries, network effects
Integration	Checkpoint protocol, fallback pool, guides
Contracts	Interfaces, schemas, component reference
Standards	ERC-8183, ERC-8004, ERC-7710, Olas integration
Observer	CAIRN Observer — failure cost visibility layer
CLI Usage	Command-line tool for task management
Multi-Sig Governance	Gnosis Safe setup, parameter management
Olas Integration	Mech marketplace adapter, fallback pool

Protocol Status

Property	Value
Specification	v2 (Whitepaper v2.0: multiplicative recovery score, three-tier routing)
Testnet deployment	v1 (interim linear formula) — Live on Base Sepolia, Chain ID 84532
Whitepaper	v2.0 — April 2026
ERC Dependencies	ERC-8183, ERC-8004, ERC-7710
v1 → v2 migration	Governance-gated via `IRecoveryRouter` interface; see PRD-04

Implementation Status

Component	Status	Notes
Whitepaper v2.0	✅ Released	Formal protocol specification, proofs, simulation-validated formula
Smart contracts (v1)	✅ Deployed	Live on Base Sepolia — 6 contracts, 339 tests passing
RecoveryRouterV2 (v2)	✅ Implemented	24 unit tests, gas measured (avg 5,748 / max 19,935) — ready for governance upgrade
Monte Carlo simulation	✅ Complete	4 runs, 16 experiments, 100k events each — see `simulation/`
PRD-01 MVP	✅ Complete	v1 protocol shipped
PRD-03 Recovery score calibration	✅ Complete	Derived the v2 multiplicative formula and its parameters
PRD-04 v2 contract upgrade	🟡 Phase 1 complete	Phases 2-6 (three-tier routing in CairnCore, stake raise, schema validation, deploy) pending
SDK (Python)	✅ Complete	CairnClient, CairnAgent, CheckpointStore, Observers
CLI Tool	✅ Complete	submit-task, heartbeat, checkpoint, monitor, recover
Subgraph	✅ Deployed	The Graph Studio indexing
Upgradeable	✅ Complete	UUPS proxy pattern (OpenZeppelin 5.x)
Frontend	✅ Deployed	Next.js 14, wagmi

See PRDs/README.md for the full roadmap.

Deployed Contracts (Base Sepolia)

Contract	Address	Description
CairnCore	`0xB65596B21d670b6C670106C3e3c7E5FFf8E3A640`	Main entry point — 6-state machine, task lifecycle
CairnGovernance	`0x7A09567e0348889Cc14264bEcf08F8d72Dc6987f`	Protocol parameters, admin controls
RecoveryRouter	`0xE52703946cb44c12A6A38A41f638BA2D7197a84d`	Failure classification, recovery scoring
FallbackPool	`0x4dCeA24eaD4026987d97a205598c1Ee1CE1649B0`	Agent registration, selection algorithm
ArbiterRegistry	`0xfb50F4F778F166ADd684E0eFe7aD5133CE34aE68`	Dispute resolution, appeals
CairnTaskMVP (legacy)	`0x2eFd1De57BfF1Ea3E40b049F70bb58590Ea73417`	Legacy MVP (4-state) — use CairnCore for production

All contracts use the UUPS proxy pattern (OpenZeppelin 5.x), so contract logic can be upgraded without changing the deployed proxy addresses.

Live Demo

Resource	URL
Frontend	cairn-protocol-iona-78423aa1.vercel.app
Subgraph	The Graph Studio
Query Endpoint	`https://api.studio.thegraph.com/query/1744842/cairn/v1.0.0`

Quick Links

Understand CAIRN: Whitepaper v2 → Concepts
Technical Spec: ERC-CAIRN → Contracts
Build with CAIRN: Integration Guide
Security: Security Model
Reproduce the simulation: python3 -m simulation.run_eq4 (seed=42, deterministic on NumPy ≥1.20)

Standards Integration

CAIRN integrates with existing Ethereum standards rather than replacing them:

Standard	What It Provides	Role in CAIRN
ERC-8183	Standardized escrow for agent jobs with lifecycle hooks	Holds payment until task completes; CAIRN registers as a lifecycle hook to intercept failures
ERC-8004	On-chain agent identity and reputation registry	Verifies agent identity; CAIRN writes success/failure signals to reputation scores
ERC-7710	Scoped permission delegation with caveats	Enables pre-authorized fallback assignment without requiring new signatures at recovery time
Olas Mech Marketplace	Registry of available agent services with staking	Provides the fallback agent pool; CAIRN queries for best-fit backup agents

For detailed integration guidance, see Standards Documentation.

Repository Structure

cairn-protocol/
├── contracts/          # Solidity smart contracts (Foundry)
│   ├── src/           # Core contracts (CairnCore, RecoveryRouter, RecoveryRouterV2, FallbackPool, ArbiterRegistry)
│   └── test/          # 339 tests passing
├── sdk/               # Python SDK (CairnClient, CairnAgent, CheckpointStore)
├── cli/               # CLI tool — task management, monitoring
├── frontend/          # Next.js 14 dashboard
├── pipeline/          # Off-chain event listener
├── subgraph/          # The Graph indexer
├── simulation/        # Monte Carlo recovery-score calibration (Runs 1-4, 16 experiments)
├── PRDs/              # Product requirements documents
├── docs/              # Technical documentation
├── PUBLICATION/       # arXiv submission bundle (whitepaper LaTeX, figures, metadata)
└── WHITEPAPER_V2.md   # Protocol specification

What Makes CAIRN Different

#	Differentiator
1	Not a framework — Wraps any agent SDK (LangGraph, Olas, AgentKit, CrewAI, AutoGen)
2	Escrow-enforced — Agents cannot get paid without completing the protocol's record-writing
3	Automatic recovery — No human-in-the-loop required between task submission and settlement
4	Simulation-validated formula — The v2 multiplicative recovery score is within 0.93pp of the Bayes-optimal floor on the calibrated ground-truth model
5	Network effects — Every failure becomes a queryable record; the intelligence layer grows with task throughput

License

See LICENSE for details.

Component	License
ERC-CAIRN.md	CC0-1.0
WHITEPAPER_V2.md	All Rights Reserved (see header copyright notice)
contracts/	GPL-3.0-or-later
sdk/, cli/	Apache-2.0
subgraph/	MIT
frontend/	AGPL-3.0-or-later
docs/	CC BY 4.0

Contributing

Contributions are welcome! Whether it's bug fixes, new features, documentation improvements, or ideas — we'd love your help.

See CONTRIBUTING.md for:

Development setup (Foundry, Python, Node.js)
Code style guidelines
Testing requirements
Pull request process

By contributing, you agree to license your contributions under the same license as the component you're modifying.

Cite this work

If you use CAIRN in your research, please cite the whitepaper:

@misc{boudoukha2026cairn,
  title  = {CAIRN: A Protocol for Agent Failure Detection, Classification, and Recovery in the On-Chain Agent Economy},
  author = {Boudoukha, Maroua},
  year   = {2026},
  note   = {Whitepaper v2.0. Reproducible simulation: python3 -m simulation.run\_eq4 (seed=42).},
  url    = {https://github.com/MarouaBoud/cairn-protocol}
}

An arXiv preprint will be linked here after submission acceptance.

Author

Built by Maroua Boudoukha · ML/AI Engineer · Web3 Builder
