Agentfootprint

agentfootprint mascot composing context flavors (Skills, Steering, Guardrails, RAG, Tool APIs, Memory) into three structured LLM slots (system, messages, tools) — the central abstraction, visualized.


We abstract context engineering — and hand back the trace.
Live to develop · offline to monitor · detailed to improve.

1. What we abstract

When you build an Agentic Application, you collect domain-specific data and instructions, then wire them up based on what your system receives.

That data and those instructions wear many names — Skills · Steering · Guardrails · RAG · Tool APIs · Memory — with more on the way. But they all do the same thing: they inject into one of three slots in the LLM call (system, messages, tools).

So we abstracted the injection itself.

The abstraction is three rules:

Three slots are fixed. system, messages, tools — the LLM API surface.
N flavors are open. You declare what you have. Tomorrow's flavor (few-shot, reflection, persona, A2A handoff…) plugs in the same way.
Rules decide where and when. You provide the rules. We collect your data, fire the right one, land it in the right slot at the right iteration.

That's the whole model: Injection = slot × trigger × cache.

Slot — which of the 3 LLM API regions the content lands in (system / messages / tools).
Trigger — when the content fires (see below).
Cache — how stable the content is across iterations. The framework places provider cache markers for you — stable content gets 80–90% cheaper prefixes.
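
One way to picture the model is as a small data type plus a composer. The sketch below is illustrative only: `Injection`, `Slots`, `compose`, and `stableCount` are hypothetical names invented for this example, not agentfootprint's API, and the "stable content first" ordering is an assumption about how a cacheable prefix could be formed.

```typescript
// Hypothetical types for illustration; not agentfootprint's real API.
type Slot = 'system' | 'messages' | 'tools';
type Trigger = 'always' | 'rule' | 'on-tool-return' | 'llm-activated';
type Cache = 'stable' | 'volatile';

interface Injection {
  id: string;
  slot: Slot;
  trigger: Trigger;
  cache: Cache;
  content: string;
}

type Slots = Record<Slot, string[]>;

// Land each fired injection in its slot, stable content first, so that a
// provider cache marker could sit right after the stable prefix.
function compose(fired: Injection[]): Slots {
  const slots: Slots = { system: [], messages: [], tools: [] };
  const ordered = [...fired].sort(
    (a, b) => Number(a.cache === 'volatile') - Number(b.cache === 'volatile'),
  );
  for (const inj of ordered) slots[inj.slot].push(inj.content);
  return slots;
}

// How many leading entries of a slot are stable (the cacheable prefix).
function stableCount(fired: Injection[], slot: Slot): number {
  return fired.filter((i) => i.slot === slot && i.cache === 'stable').length;
}

const firedNow: Injection[] = [
  { id: 'rag', slot: 'messages', trigger: 'rule', cache: 'volatile', content: 'Refund policy excerpt' },
  { id: 'steer', slot: 'system', trigger: 'always', cache: 'stable', content: 'You are a triage agent.' },
  { id: 'date', slot: 'system', trigger: 'always', cache: 'volatile', content: 'Today is Monday.' },
];
const slots = compose(firedNow);
const systemStable = stableCount(firedNow, 'system');
```

Here `slots.system` holds the steering text followed by the volatile date line, and `systemStable` is 1, so only the first `system` entry would sit in front of a cache marker.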

The 4 triggers

Trigger	Flavor	Fires when	Builder example	Default slot
`always`	static	Every iteration	`.steering('You are a triage agent…')`	`system`
`rule`	runtime (predicate)	Your rule returns true	`.rag({ when: s => /price|refund/.test(s.userQuery) })`	`messages`
`on-tool-return`	runtime (lifecycle)	After a specific tool returns	`.instruction({ after: 'search_db', text: 'Cite source IDs.' })`	`messages`
`llm-activated`	runtime (agent-driven)	LLM calls `read_skill('id')`	`.skill({ id: 'refund-policy', activatedBy: 'read_skill' })`	`messages` (body)

Note

Slot is a default, not a coupling — the same Skill can live in tools (schema only, discovered via read_skill), messages (body injected on activation), or system (baked into the prompt as steering).

3 slots × 4 triggers × N flavors = the entire context-engineering surface.
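
To make the four triggers concrete, here is a minimal sketch of how "does this injection fire on this iteration?" could be decided. `LoopState`, `TriggerSpec`, and `fires` are hypothetical names for this example, and the state fields are assumptions about what a loop would need to track; the real framework may model this differently.

```typescript
// Hypothetical loop state; field names are assumptions for this sketch.
interface LoopState {
  userQuery: string;
  lastToolReturned?: string; // name of the tool that just returned, if any
  activatedSkills: string[]; // skill ids the LLM has requested via read_skill
}

type TriggerSpec =
  | { kind: 'always' }
  | { kind: 'rule'; when: (s: LoopState) => boolean }
  | { kind: 'on-tool-return'; after: string }
  | { kind: 'llm-activated'; skillId: string };

// Decide whether a trigger fires for the current iteration's state.
function fires(t: TriggerSpec, s: LoopState): boolean {
  switch (t.kind) {
    case 'always':
      return true;
    case 'rule':
      return t.when(s);
    case 'on-tool-return':
      return s.lastToolReturned === t.after;
    case 'llm-activated':
      return s.activatedSkills.indexOf(t.skillId) !== -1;
  }
}

const state: LoopState = {
  userQuery: 'Can I get a refund?',
  lastToolReturned: 'search_db',
  activatedSkills: [],
};

const firedFlags = [
  fires({ kind: 'always' }, state),
  fires({ kind: 'rule', when: (s) => /price|refund/.test(s.userQuery) }, state),
  fires({ kind: 'on-tool-return', after: 'search_db' }, state),
  fires({ kind: 'llm-activated', skillId: 'refund-policy' }, state),
];
```

With this state the first three triggers fire and the skill does not, because the LLM has not yet asked for `refund-policy`.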

2. Why we chose this abstraction

The agent space has many credible primary abstractions:

Framework	What it abstracts
LangChain	Pipelines of composable components
LangGraph	State machines of nodes and edges
CrewAI · AutoGen	Crews of role-playing agents
Mastra · Genkit · Pydantic AI	Typed full-stack bundles
DSPy	Compiled prompts
Inngest AgentKit	Durable workflows

We didn't have to choose between them.

agentfootprint is built on footprintjs — the flowchart pattern for backend code. footprintjs gives us every one of those abstractions out of the box:

Capability	What footprintjs hands us
Composition	`Sequence` · `Parallel` · `Conditional` · `Loop`
State machines	The ReAct loop is a flowchart
Multi-agent crews	Compose Agents through control flow — no special class needed
Durable workflows	`pauseHere()` plus JSON-portable `resume()`
Typed observation	60+ events for free, because the framework owns the loop

So we used the budget those abstractions would have cost us to invest deeply in something they all leave to the developer: the injection loop.

Important

We abstract context engineering — and hand back the trace. Live to develop · offline to monitor · detailed to improve.

The reason — agents have a new class of bug

For fifty years, software bugs have been logic errors. A wrong condition, a missed edge case, an off-by-one. You step through the code until you find the bad branch.

LLM-powered apps add a second class of bug: contextual errors. The code is correct. The model is correct. The answer is wrong because the LLM's decision rests on context that was ambiguous, confusing, or misleading at the moment of inference.

Tracking which content the model actually saw, and why, is the entire debugging job. Without it, the failure mode is invisible:

What got injected wrong	What the model did
Wrong instruction landed in the `system` slot	Followed the wrong rule
Predicate fired one iteration too early	Reasoned with stale assumptions
Skill body missing when the LLM called `read_skill`	Invented its own
Cache prefix invalidated mid-iteration	Saw a silently rewritten stale version
Tool returned but the `on-tool-return` injection didn't fire	Couldn't interpret the result

Important

The model doesn't tell you which of these went wrong. It just gives you the wrong answer.

You can't step through that with a debugger. By the time you read the response, the context that produced it is gone unless something recorded it.

That's the gap agentfootprint fills. A framework that owns the control flow can debug logic errors. A framework that owns the injection can debug contextual errors — because every injection is a typed event with a where, when, why, and how-it-cached.

What that buys you

Because we own the injection, every LLM call backtracks to four typed answers:

What was injected
Who triggered it (which rule)
When it fired
How it landed — slot, position, cache
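
As a rough illustration of those four answers, an injection event could be recorded and queried like this. The `InjectionEvent` shape and both helper functions are assumptions made for the sketch, not the library's actual event schema.

```typescript
// Hypothetical event shape; field names are assumptions, not the real schema.
interface InjectionEvent {
  what: string; // id of the injected content
  who: string; // rule or trigger that fired it
  when: number; // iteration index
  how: { slot: 'system' | 'messages' | 'tools'; position: number; cached: boolean };
}

// Everything the model saw at one iteration, ordered by position.
function seenAt(trace: InjectionEvent[], iteration: number): InjectionEvent[] {
  return trace
    .filter((e) => e.when === iteration)
    .sort((a, b) => a.how.position - b.how.position);
}

// Ids present at iteration `b` that were absent at iteration `a`.
function addedBetween(trace: InjectionEvent[], a: number, b: number): string[] {
  const before = seenAt(trace, a).map((e) => e.what);
  return seenAt(trace, b)
    .map((e) => e.what)
    .filter((id) => before.indexOf(id) === -1);
}

const trace: InjectionEvent[] = [
  { what: 'steering', who: 'always', when: 1, how: { slot: 'system', position: 0, cached: false } },
  { what: 'steering', who: 'always', when: 2, how: { slot: 'system', position: 0, cached: true } },
  {
    what: 'cite-sources',
    who: 'on-tool-return:search_db',
    when: 2,
    how: { slot: 'messages', position: 1, cached: false },
  },
];

const newAtTwo = addedBetween(trace, 1, 2);
```

Diffing two iterations this way is the kind of question a contextual bug usually reduces to: what changed in the context between the good answer and the bad one.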

Same trace, three workflows:

Live — debug as you build. See exactly which injection produced which token, which predicate fired this iteration, which prefix actually got cached.
Offline — monitor what shipped. Replay any past run from its trace. Alert on drift. Attribute cost per injection.
Detailed — improve via export. Every successful trajectory is labeled training data for SFT, DPO, or RL — no separate data-collection phase.

And a fourth, novel workflow: the agent can read its own trace. Six months after the agent rejected loan #42, the question "why did you reject it?" is answered from the recorded evidence (creditScore=580, threshold=620), not from a rerun. Causal memory turns the trace into the agent's working memory.

3. How do I design my agent or system of agents?

Two scales — same alphabet. Four control flows are the entire vocabulary.

import { Sequence } from 'agentfootprint';
const flow = Sequence.create()
  .step('a', stageA)
  .step('b', stageB)
  .step('c', stageC)
  .build();

import { Parallel } from 'agentfootprint';
const fan = Parallel.create()
  .branch('web', searchWeb)
  .branch('docs', searchDocs)
  .mergeWithFn(synthesizer)
  .build();

import { Conditional } from 'agentfootprint';
const router = Conditional.create()
  .when('billing', s => s.intent === 'billing', billingAgent)
  .when('tech', s => s.intent === 'tech', techAgent)
  .otherwise('default', defaultAgent)
  .build();

import { Loop } from 'agentfootprint';
const reflexion = Loop.create()
  .repeat(thinkAgent)
  .until(s => s.satisfied)
  .build();

Inside one agent — Dynamic vs Classic ReAct

Figure: Classic ReAct vs Dynamic ReAct loop topology. Both have the same 5 stages (SystemPrompt, Messages, Tools, CallLLM, Route → ExecuteTools/Finalize), but the loop edge differs: Classic returns to CallLLM only (slots frozen at 12 tools every iteration); Dynamic returns to SystemPrompt (slots recompose, and the tool list grows from 1 to 5 as skills activate instead of staying at 12).

Same five stages on both sides. Only one thing differs — where the loop returns. Classic ReAct loops back to CallLLM and slots stay frozen. Dynamic ReAct (agentfootprint) loops back to SystemPrompt, so injections that fired on the previous tool result recompose the next prompt. Per-iteration recomposition is also the structural prerequisite for the cache layer.

Iteration	Classic ReAct	Dynamic ReAct (agentfootprint)
1	12 tools shown	1 tool (`read_skill`)
2	12 tools shown	5 tools (skill activated)
3	12 tools shown	5 tools
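
The tool counts in the table can be reproduced with a small simulation. This is a sketch under stated assumptions: the skill catalog, the tool names, and `toolsFor` are all invented for illustration, and the only point is that the tools slot is recomputed from the activated skills on every iteration.

```typescript
// Hypothetical catalog: each skill unlocks some tools once activated.
const skillTools: Record<string, string[]> = {
  'refund-policy': ['lookup_order', 'check_eligibility', 'issue_refund', 'send_email'],
};

// Classic ReAct: every tool is shown on every iteration.
const classicToolCount = 12;

// Dynamic ReAct: recompute the tools slot from the skills activated so far.
function toolsFor(activated: string[]): string[] {
  let tools = ['read_skill'];
  for (const id of activated) tools = tools.concat(skillTools[id] ?? []);
  return tools;
}

// Skills active at the start of iterations 1, 2, and 3.
const activatedPerIteration: string[][] = [[], ['refund-policy'], ['refund-policy']];

const dynamicCounts = activatedPerIteration.map((a) => toolsFor(a).length);
const classicCounts = activatedPerIteration.map(() => classicToolCount);
```

`dynamicCounts` comes out as 1, 5, 5 and `classicCounts` as 12, 12, 12, matching the three rows above.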

📖 Dynamic ReAct guide · Key concepts

Multi-agent — compose with the alphabet

Pick the flows that match your problem. Chain them. That's your Agentic Application.

const research = Loop.create()
  .repeat(Sequence.create().step('plan', plan).step('search', searchAll).build())
  .until(s => s.satisfied).build();

Same .create().method().build() shape as the four rows above — just composed.

Named patterns — also compositions of the same 4

The patterns the field knows reduce to the same alphabet:

Pattern	Composition
Swarm	`Loop( Parallel( Agent×N ) → merge )`
Tree-of-Thoughts	`Loop( Parallel( Agent×N ) → Conditional(score) )`
Reflexion	`Loop( Agent → Conditional(critique) → Agent )`
Debate	`Parallel( Agent_pro, Agent_con ) → Agent_judge`
Router	`Conditional → Agent_A | Agent_B | Agent_C`
Hierarchical	`Agent_planner → Sequence( Agent_worker×N ) → synth`

Same trick as section 1: instead of N libraries for N patterns, we found the 4 building blocks all N patterns are made of.
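
To see why four flows are enough, the sketch below models them as plain function combinators and rebuilds three of the patterns from the table. Everything here (`Stage`, `sequence`, `parallel`, `conditional`, `loop`, and the toy agents) is a hypothetical stand-in, not agentfootprint's implementation. Stages are kept synchronous for brevity; real agents are async, and a real Parallel would run its branches concurrently.

```typescript
// A stage is a state transformer (synchronous here to keep the sketch short).
type Stage<S> = (s: S) => S;

function sequence<S>(...stages: Stage<S>[]): Stage<S> {
  return (s) => {
    let cur = s;
    for (const stage of stages) cur = stage(cur);
    return cur;
  };
}

function parallel<S>(merge: (results: S[]) => S, ...branches: Stage<S>[]): Stage<S> {
  return (s) => merge(branches.map((b) => b(s)));
}

function conditional<S>(pick: (s: S) => Stage<S>): Stage<S> {
  return (s) => pick(s)(s);
}

function loop<S>(body: Stage<S>, until: (s: S) => boolean, maxIterations = 10): Stage<S> {
  return (s) => {
    let cur = s;
    for (let i = 0; i < maxIterations && !until(cur); i++) cur = body(cur);
    return cur;
  };
}

// Debate: Parallel(Agent_pro, Agent_con) → Agent_judge
interface Debate { notes: string[]; verdict?: string }
const pro: Stage<Debate> = (s) => ({ ...s, notes: s.notes.concat('pro: cheaper') });
const con: Stage<Debate> = (s) => ({ ...s, notes: s.notes.concat('con: riskier') });
const judge: Stage<Debate> = (s) => ({ ...s, verdict: `${s.notes.length} arguments heard` });
const mergeNotes = (rs: Debate[]): Debate => ({
  notes: rs.reduce<string[]>((all, r) => all.concat(r.notes), []),
});
const debated = sequence(parallel(mergeNotes, pro, con), judge)({ notes: [] });

// Router: Conditional → Agent_A | Agent_B
interface Ticket { intent: string; handledBy?: string }
const billing: Stage<Ticket> = (t) => ({ ...t, handledBy: 'billing' });
const tech: Stage<Ticket> = (t) => ({ ...t, handledBy: 'tech' });
const routed = conditional<Ticket>((t) => (t.intent === 'billing' ? billing : tech))({
  intent: 'billing',
});

// Reflexion-style loop: keep revising until the draft is long enough.
const revise: Stage<{ draft: string }> = (s) => ({ draft: s.draft + '!' });
const refined = loop(revise, (s) => s.draft.length >= 5)({ draft: 'ok' });
```

Each pattern is only a different nesting of the same four functions, which is the claim the table makes about the real builders.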

📖 Compare: hand-rolled vs declarative · migration from LangChain / CrewAI / LangGraph

4. How do I see what my agent did?

Because we own the loop (section 2), every decision and execution is captured during traversal, not bolted on. The default capture is the causal trace: every stage, read, write, and piece of decision evidence, as a JSON-portable, scrubbable, queryable, exportable artifact. Beyond the default, you can wire custom recorders for cost, latency, or quality scoring; any observation hook fires on the same stream.

The same trace serves three downstream consumers — no extra instrumentation:

Audit / compliance. Six months later, "why was loan #42 rejected?" answers from the chain (creditScore=580 < 620 ∧ dti=0.6 > 0.43 → riskTier=high → REJECTED). No LLM call. GDPR Art. 22, ECOA, and EU AI Act adverse-action notices write themselves from the captured decision evidence.
Cheap-model triage. A Sonnet trace becomes good input for Haiku to answer follow-ups: ~200 tokens on a small model ($0.25/1M, about $0.00005) vs ~2,500 tokens on a reasoning model ($15/1M, about $0.0375), roughly 750× cheaper per follow-up. It is memoization for agent thinking, with no agent rerun.
Training data — the substrate is already there. Every successful chain is a labeled trajectory. SFT pairs ({prompt, completion}) fall out of the snapshot's history field; the export wrapper is roadmap work tracked in GitHub issues. DPO and process-RL need additional collection layers (preference feedback, per-step reward annotation) that don't ship today.
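
For the audit case, answering "why?" from captured evidence can be as simple as rendering the recorded comparisons. The sketch below assumes a hypothetical `Evidence` record; the real trace format is whatever footprintjs captures, and `explain` is an invented helper.

```typescript
// Hypothetical evidence record; field names are assumptions for this sketch.
interface Evidence {
  field: string;
  value: number;
  op: '<' | '>';
  threshold: number;
}

interface Decision {
  id: string;
  outcome: string;
  evidence: Evidence[];
}

// Render the recorded chain as a readable reason. No model call is involved.
function explain(d: Decision): string {
  const chain = d.evidence
    .map((e) => `${e.field}=${e.value} ${e.op} ${e.threshold}`)
    .join(' and ');
  return `${d.id} was ${d.outcome} because ${chain}`;
}

// The loan example from the audit bullet, as it might be stored.
const loan42: Decision = {
  id: 'loan #42',
  outcome: 'REJECTED',
  evidence: [
    { field: 'creditScore', value: 580, op: '<', threshold: 620 },
    { field: 'dti', value: 0.6, op: '>', threshold: 0.43 },
  ],
};

const reason = explain(loan42);
// → "loan #42 was REJECTED because creditScore=580 < 620 and dti=0.6 > 0.43"
```

Because the explanation is a pure function of stored data, it gives the same answer six months later regardless of model version.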

Two built-in lenses view the same trace:

Lens	View	When to use
Lens	Agent-centric — User/Agent[3 slots]/Tool flowchart with iteration scrubber and round commentary	Live debugging, "what did Neo see at step 5?"
Explainable Trace	Structural — subflow tree, full flowchart, memory inspector, per-stage execution timeline	Architecture review, root-cause analysis

📖 Powered by footprintjs causalChain() — backward thin-slicing on the commit log. Causal memory deep dive · Explainability & compliance

One recording. Two lenses. Three consumers. Zero extra instrumentation.

Quick start — runs offline, no API key

npm install agentfootprint footprintjs

import { Agent, defineTool, mock } from 'agentfootprint';

const weather = defineTool({
  name: 'weather',
  description: 'Get current weather for a city.',
  inputSchema: {
    type: 'object',
    properties: { city: { type: 'string' } },
    required: ['city'],
  },
  execute: async ({ city }: { city: string }) => `${city}: 72°F, sunny`,
});

const agent = Agent.create({
  provider: mock({ reply: 'I checked: it is 72°F and sunny.' }),
  model: 'mock',
})
  .system('You answer weather questions using the weather tool.')
  .tool(weather)
  .build();

const result = await agent.run({ message: 'Weather in Paris?' });
console.log(result);  // → "I checked: it is 72°F and sunny."

Swap mock(...) for anthropic(...) / openai(...) / bedrock(...) / ollama(...) for production. Nothing else changes.

Mocks first, production second

Build the entire app against in-memory mocks with zero API cost, then swap real infrastructure one boundary at a time.

Boundary	Dev	Prod
LLM provider	`mock(...)`	`anthropic()` · `openai()` · `bedrock()` · `ollama()`
Memory store	`InMemoryStore`	`RedisStore` · `AgentCoreStore`
MCP	`mockMcpClient(...)`	`mcpClient({ transport })`
Cache strategy	`NoOpCacheStrategy`	auto-selected per provider

The flowchart, recorders, and tests don't change between dev and prod.
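
The swap works because application code depends on a boundary, not on a vendor. A minimal sketch of that idea follows; `LlmProvider`, `mockProvider`, `realProvider`, and `providerFor` are hypothetical names for this example and do not describe agentfootprint's actual interfaces.

```typescript
// Hypothetical provider boundary; the library's real interfaces may differ.
interface LlmProvider {
  name: string;
  complete(prompt: string): Promise<string>;
}

// Dev: deterministic reply, no network, no API cost.
function mockProvider(reply: string): LlmProvider {
  return { name: 'mock', complete: async () => reply };
}

// Prod placeholder: the network call is deliberately left out of this sketch.
function realProvider(apiKey: string): LlmProvider {
  if (!apiKey) throw new Error('missing API key');
  return {
    name: 'real',
    complete: async () => {
      throw new Error('not implemented in this sketch');
    },
  };
}

// The only place that knows which side of the boundary is in use.
function providerFor(env: 'dev' | 'prod', apiKey = ''): LlmProvider {
  return env === 'dev'
    ? mockProvider('I checked: it is 72°F and sunny.')
    : realProvider(apiKey);
}

// Application code is identical in dev and prod.
async function answer(provider: LlmProvider, question: string): Promise<string> {
  return provider.complete(question);
}

const devProvider = providerFor('dev');
const prodProvider = providerFor('prod', 'example-key');
```

Tests can call `answer(devProvider, ...)` and get a fixed reply, while production wiring changes only the argument passed to `providerFor`.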

What ships today

Core

2 primitives — LLMCall, Agent (the ReAct loop)
4 control flows — Sequence, Parallel, Conditional, Loop
1 Injection primitive — defineSkill / defineSteering / defineInstruction / defineFact
1 reliability gate — .reliability({ preCheck, postDecide, providers, circuitBreaker, fallback })
1 tool dispatch primitive — ToolProvider (sync OR async) — staticTools · gatedTools · skillScopedTools · custom discoveryProvider over hubs / MCP / per-tenant catalogs

LLM providers (7)

Factory	Use for
`anthropic`	Claude (Sonnet, Opus, Haiku) via `@anthropic-ai/sdk`
`openai`	GPT-4o, GPT-4-turbo via `openai` SDK
`bedrock`	Claude / Titan / Mistral via AWS Bedrock runtime
`ollama`	Local models (OpenAI-compatible endpoint)
`browserAnthropic`	Browser-side Claude calls (no proxy server)
`browserOpenai`	Browser-side OpenAI calls (no proxy server)
`mock`	Deterministic dev/test (zero API cost)

Memory + adapters

Memory factory — 4 types (episodic / semantic / narrative / causal) × 7 strategies (window / budget / summarize / topK / extract / decay / hybrid)
Memory stores — InMemoryStore, RedisStore (peer-dep ioredis), AgentCoreStore (peer-dep AWS SDK)
RAG · MCP adapters — mockMcpClient(...) / mcpClient({ transport })
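
As a rough idea of what two of the strategy names could mean, the sketch below implements a `window` (keep the last n messages) and a `budget` (keep the newest messages that fit a token allowance). Both readings are assumptions based on the names alone, and the four-characters-per-token estimate is a crude placeholder, not the library's tokenizer.

```typescript
interface Message {
  role: 'user' | 'assistant';
  text: string;
}

// 'window' (assumed meaning): keep only the last n messages.
function windowStrategy(history: Message[], n: number): Message[] {
  return n > 0 ? history.slice(-n) : [];
}

// 'budget' (assumed meaning): keep the newest messages that fit the budget.
function budgetStrategy(history: Message[], maxTokens: number): Message[] {
  const kept: Message[] = [];
  let used = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = Math.ceil(history[i].text.length / 4); // crude estimate
    if (used + cost > maxTokens) break;
    used += cost;
    kept.unshift(history[i]); // preserve chronological order
  }
  return kept;
}

const history: Message[] = [
  { role: 'user', text: 'Hi' },
  { role: 'assistant', text: 'Hello, how can I help?' },
  { role: 'user', text: 'Refund for order 17' },
];

const lastOne = windowStrategy(history, 1);
const withinBudget = budgetStrategy(history, 11);
```

With a budget of 11 estimated tokens, the two newest messages fit (5 + 6) and the oldest is dropped.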

Operability

Provider-agnostic prompt caching — declarative per-injection, per-iteration marker recomputation
Pause / resume — JSON-serializable checkpoints; resume hours later on a different server
Resilience primitives — withRetry, withFallback, withCircuitBreaker, .outputFallback, agent.resumeOnError
60+ typed observability events — agent · composition · context · stream · tools · skill · memory · cache · cost · permission · eval · embedding · pause · error · fallback · resilience · reliability · risk
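
Pause / resume rests on one property: the checkpoint is plain JSON, so any process can pick it up. Here is a minimal sketch of that property. The `Checkpoint` shape, the toy steps, and `run` are assumptions for illustration and are not the footprintjs checkpoint format.

```typescript
// Hypothetical checkpoint: the index of the next step plus plain-JSON state.
type State = { total: number };
interface Checkpoint {
  next: number;
  state: State;
}

const steps: Array<(s: State) => State> = [
  (s) => ({ total: s.total + 1 }),
  (s) => ({ total: s.total * 10 }),
  (s) => ({ total: s.total - 3 }),
];

// Run from a checkpoint; stop before step `pauseBefore` if one is given.
function run(cp: Checkpoint, pauseBefore = steps.length): Checkpoint {
  let { next, state } = cp;
  while (next < steps.length && next < pauseBefore) {
    state = steps[next](state);
    next++;
  }
  return { next, state };
}

// Pause after two steps, serialize, then resume from the serialized form.
const paused = run({ next: 0, state: { total: 1 } }, 2);
const wire = JSON.stringify(paused); // could be stored and read by another server
const resumed = run(JSON.parse(wire) as Checkpoint);

// An uninterrupted run for comparison.
const direct = run({ next: 0, state: { total: 1 } });
```

The resumed run and the uninterrupted run end in the same state, which is the guarantee a JSON-portable checkpoint is meant to provide.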

Tooling

Lens · Explainable Trace — two visual replays of the causal trace (separate agentfootprint-lens package)
AI-coding-tool support — Claude Code · Cursor · Windsurf · Cline · Kiro · Copilot

📖 Agent API reference · CHANGELOG

Where to next

If you are...	Go here
New to agents	5-minute quick start
Coming from LangChain / CrewAI / LangGraph	Migration guide
Architecting an enterprise rollout	Production guide
Doing due diligence	Architecture overview
Researcher / academic background	Citations & prior art
Curious about design	Inspiration docs

Or jump into the examples gallery — every example is also an end-to-end CI test.

Built on

footprintjs — the flowchart pattern for backend code. agentfootprint's decision-evidence capture, narrative recording, and time-travel checkpointing are footprintjs primitives at the runtime layer.

You don't need to learn footprintjs to use agentfootprint — but if you want to build your own primitives at this depth, start there.
