LegionForge

A local-first, security-native AI agent framework built on LangGraph.

Security is enforced in the execution path — not layered on afterward.

Important

The source repository is not yet public — this is a pre-release preview. LegionForge v0.7.1-alpha is complete and verified (2247 smoke tests, 41 integration tests), currently in final UAT before the public repository goes live. Installation instructions and the full codebase will be available at release.

Watch / star this repo to be notified the moment it opens. Questions or early access requests: jp@legionforge.org

What Is LegionForge?

LegionForge is an open-source framework for running hardened AI agent systems on your own hardware. It runs local LLMs via Ollama or cloud APIs (OpenAI, Anthropic), with a full security stack baked into every layer of the execution pipeline — not bolted on afterward.

The one-line pitch: The hardened, self-hosted alternative to cloud agent platforms — and a security layer that other agent frameworks can plug into.

Why it exists: In January 2026, OpenClaw hit 60,000 GitHub stars in 72 hours and 300,000+ users in weeks. Kaspersky found 512 vulnerabilities (8 critical). Cisco found active data exfiltration in third-party skills. LegionForge is built in the opposite order: security first, product on top.

Key Numbers

Metric	Value
Smoke tests (no services required)	2247 / 2247 passing (~21s)
Integration tests (PostgreSQL)	41 / 41 passing
Kerberos live-KDC tests	5 / 5 passing
UI tests (Playwright)	40 / 40 passing
Tool accuracy tests	79 / 79 passing
Guardian security checks per tool call	7 (deterministic, 0 LLM calls)
Threat classes covered	11
Prompt injection detection patterns	29 (2-tier: halt + log)
Auth backends	5 (API key, OIDC, GitHub, LDAP, Kerberos)
Channel connectors	4 (Discord, Telegram, Slack, Webhook)

Core Design Principles

Principle	Implementation
Fail-safe tiering	`halt → sandbox/retry → degrade` — never silent failure
Human gates on all mutations	No component autonomously changes security rules, promotes tools, or escalates privileges
Replace AI with determinism	Repeated deterministic tasks crystallize into signed, containerized tools with zero LLM overhead
Validate at trust boundaries	Guardian enforces checks at every inbound/outbound boundary — agents are processing nodes, not trust boundaries
Privilege tied to tasks	Short-lived JWT task tokens scoped to exactly what the current task requires

User Interfaces

Submit tasks from wherever you already are:

Interface	How
Web UI	`http://localhost:8080/ui` — live SSE streaming, tool call blocks, session history
Discord	`!<task>` in any channel the bot can see
Telegram	`/<task>` to your bot
Slack	`!<task>` via Socket Mode (no public URL needed)
Webhook	`POST :8081/inbound` — HMAC-SHA256 verified, works with n8n / Zapier / any HTTP client
REST API	`POST /tasks` + `GET /tasks/{id}/stream` — full SSE streaming

Security Architecture

Guardian Sidecar

A standalone FastAPI process (:9766) running a deterministic-only 7-check pipeline on every tool call. No LLM calls in the hot path. Unpoisonable. Fails safe to halt on any error.

Check 0  Tool revocation      — REVOKED tools blocked before any other check (cache TTL: 10s)
Check 1  Registry + Hash      — tool must be APPROVED; SHA-256 hash must match registration
Check 2  Capability boundary  — negative capability list enforced per agent profile
Check 3  Destructive patterns — regex detection of destructive command patterns in tool args
Check 4  Sequence contract    — agent tool sequences validated; deviations → sandbox retry
Check 5  Ed25519 signature    — crystallized tools must carry a valid signing-keypair signature
Check 6  Adaptive rules       — human-approved threat rules hot-reload every 10s; no restart

Crystallization Pipeline

When agents solve the same deterministic problem repeatedly, LegionForge crystallizes the solution into a signed, containerized tool — zero LLM inference cost for routine work.

Observer ──▶ Crystallizer ──▶ Pre-HITL Analyzer ──▶ Human Gate ──▶ Ed25519-Signed Tool

The Pre-HITL Analyzer runs AST guards (subscript bypass, MRO traversal, globals()/locals() hijack) and behavioral diffs before any human reviews a proposal.

Multi-Provider Authentication

The gateway supports five auth backends — swap without touching agent code:

Backend	Scheme	Use case
`ApiKeyBackend`	Bearer	Default — bcrypt API keys in PostgreSQL
`OIDCBackend`	Bearer	Google, Okta, Auth0, Azure AD, Keycloak, Cognito
`GitHubOAuthBackend`	Bearer	GitHub OAuth app tokens
`LDAPBackend`	Basic	OpenLDAP, Active Directory
`KerberosBackend`	Negotiate	Kerberos/GSSAPI (requires KDC + keytab)

Set gateway.auth_provider in your hardware profile YAML.

Threat Coverage

Threat	Defense
Tool Poisoning	SHA-256 hash validation at registration + Ed25519 cryptographic signing
Rug-Pull	Hash mismatch detection + signed tool versioning
Prompt Injection (direct + indirect)	29-pattern sanitizer (Tier 1 halt / Tier 2 log) + NFKC normalization + zero-width char stripping + RAG provenance scoring
Capability Amplification	Negative capability list enforced by Guardian Check 2
Resource Bomb / Economic DOS	Pre-execution token cost estimator + per-user daily budgets + per-provider rate limits
Credential Theft	macOS Keychain / `~/.pgpass` storage + PII redaction from all outbound API calls
RAG / Memory Poisoning	Document provenance scoring + embedding trust threshold flagging
Multi-Agent Cascade	Orchestrator-only routing + signed inter-agent messages
Supply Chain	AI-BOM + signed tool library + SHA-256 GGUF model integrity verification
TOCTOU	`approved_snapshot` stored pre-execution; post-execution result verified against snapshot
AST Bypass	Subscript, MRO traversal, `globals()`/`locals()` hijack detection in crystallization analyzer

Phase Roadmap

All phases complete.

Phase	What Was Built
0	PostgreSQL + pgvector, async LLM factory (Ollama/OpenAI/Anthropic), health server
1	Researcher agent, tool registry + SHA-256 hash validation, capability boundaries, threat event logging
2	Docker containerization, Guardian security sidecar, immutable audit log (SHA-256 hash chain), RAG provenance
3	JWT task tokens + ACLs, sub-agent orchestrator, sandbox retry tier
4	Threat Analyst agent, adaptive Guardian rules, AI Bill of Materials
5	Crystallization Pipeline — Observer + Crystallizer agents, pre-HITL AST analyzer, Ed25519-signed tools
5.5	DB RBAC (`legionforge_app` restricted role), AST bypass guards, tool revocation, TOCTOU mitigation, Ollama model integrity
6	PentestAgent — air-gapped red-team bot, 8 attack classes × 3 variants, stop-at-proof mode
7	Guardian feedback loop (pentest → threat rules → Guardian), SECURITY.md, v1.0 hardening
8	FastAPI gateway (:8080), PostgreSQL task queue, SSE streaming, web UI, A2A + MCP endpoints, Discord connector
9	langchain 1.x migration, 5-tool library, parallel agent fan-out, Phase 9.5 hardening sprint
10	Multi-user auth — DB-backed stream tokens, per-user daily token budgets, `/usage/me`, user management CLI
11	`SecureToolNode` fix, 38 integration tests, `AuthBackend` protocol, `Dockerfile.gateway`, `docs/SCALING.md`
12	`OIDCBackend`, `GitHubOAuthBackend`, `LDAPBackend`, `KerberosBackend`; multi-scheme `require_user`
13	Real GSSAPI Kerberos backend, Redis-backed stream tokens, multi-instance docker-compose + Nginx
14	Redis global budget counters, Prometheus `/metrics` endpoint, `X-Request-ID` trace middleware
15	Polished web UI — localStorage key + task history, cancel, tool call blocks, live timer, copy, keyboard shortcuts
16	Telegram (polling), Slack (Socket Mode), generic Webhook (HMAC-SHA256 + async callback) connectors
60–381	381-tool operator dashboard — every gateway API endpoint surfaced as a UI function with smoke tests
Security sprint	Extended exfiltration detection, NFKC normalization, DESTRUCTIVE_PATTERN DB logging, PostgreSQL scram-sha-256
Web + Browser tools	`web_fetch_js` Playwright headless browser tool for JS-rendered sites; two-layer SSRF guard; private-IP PII regex fix
Lazy-load Dashboard	296 operator tool cards moved into `<template>` — injected on first click; eliminates startup parse cost
Guardian spinoff G1–G4	`packages/legionforge_guardian` standalone package; `python -m legionforge_guardian` entry point; `legionforge-guardian` v0.2.0 on PyPI; public repo at LegionForge/LegionForge-Guardian with auto-sync Action
Agent Memory — all 5 gaps	Persona bootstrap (Gap 1, DB-backed SOUL.md), user prefs (Gap 5), `memory_write`/`memory_recall` tools (Gap 3), daily episodic summaries (Gap 2), pre-compaction flush (Gap 4) — OpenClaw parity
UI polish	4 color themes (Solarized, Warm, Nord, High-Contrast) + multi-theme cycler + favicon; session continuity sidebar with per-session turn count badge
Phase I — Multi-modal input	Paste or drag images directly into the task input; vision API routing to Ollama vision models or Anthropic Claude (auto-detected by model capability)
HITL approval flow	LangGraph `interrupt_before` operator gate — destructive tasks pause for human approval before execution; `GET /hitl/pending` + `POST /hitl/{id}/approve`
Phase J — WhatsApp	WhatsApp Business Cloud API connector — webhook ingestion, HMAC signature verification, message routing to gateway task queue
Security hardening	8 targeted fixes: timing oracle, SSRF hardening, log injection guard, prompt injection tightening, pgvector isolation, budget atomicity, concurrency fix, admin audit trail

Requirements

Component	Version	Notes
Python	3.11+	via pyenv recommended
PostgreSQL	16 or 17	with pgvector extension
Ollama	latest	for local LLM inference
Docker	24+	for Guardian sidecar + analyzer container
macOS	14+ (Apple Silicon)	primary target; Linux support planned

Recommended hardware: Mac Mini M4 16GB for single-user; Mac Mini M4 Pro 24GB for household (2–4 concurrent users).

Quick Start

# 1. Clone and set up
git clone https://github.com/LegionForge/LegionForge.git
cd LegionForge
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# 2. Set hardware profile
export AGENT_HARDWARE_PROFILE=mac_m4_mini_16gb
echo 'export AGENT_HARDWARE_PROFILE=mac_m4_mini_16gb' >> ~/.zshrc

# 3. Store PostgreSQL admin password
echo "localhost:5432:*:$(whoami):yourpassword" >> ~/.pgpass && chmod 0600 ~/.pgpass
# First install with default Homebrew trust auth? Use: export POSTGRES_TRUST_AUTH=true

# 4. Pull Ollama models
ollama pull llama3.1:8b        # ~4.7 GB — primary agent model
ollama pull qwen2.5:3b         # ~2.0 GB — fast router model
ollama pull nomic-embed-text   # ~0.3 GB — RAG embeddings

# 5. Initialize the database and generate secrets
make db-init
make setup-task-token-secret
make setup-signing-key

# 6. Run smoke tests (no services required)
make test-smoke
# Expected: 2247 passed in ~21s

# 7. Start services (three terminals)
make health-server   # Operator API at :8765
make gateway-start   # User API + Web UI at :8080
make guardian-start  # Security sidecar at :9766 (requires Docker)

# 8. Create a user and open the web UI
make create-user USERNAME=myname
open http://localhost:8080/ui

→ Full setup guide: docs/quick-start.md

Documentation

Document	What It Covers
`docs/quick-start.md`	Step-by-step setup, connecting channels, first task, troubleshooting
`docs/architecture.md`	All components, ports, ASCII diagram, connection rationale
`docs/SCALING.md`	Horizontal scaling, Redis, Kerberos KDC, multi-instance Docker
`SECURITY.md`	Threat model, HITL policy, injection detection architecture
`TLDR.md`	Project orientation — what was built and why
`CHANGELOG.md`	Full version history
`docs/VISION.md`	Product vision, architecture rationale, design decisions

Key Files

File	Purpose
`src/base_graph.py`	LangGraph agent template — copy to create new agents
`packages/guardian/src/legionforge_guardian/app.py`	Guardian sidecar — canonical source; deterministic 7-check security pipeline
`src/security/guardian.py`	Backward-compat shim — re-exports everything from `legionforge_guardian.app`
`src/security/core.py`	Keychain loader, PII redaction (8 patterns), injection detection (29 patterns), I/O sanitizer
`src/database.py`	Async PostgreSQL pool, LangGraph checkpointer, pgvector, audit log hash chain
`src/safeguards.py`	Three-layer loop protection (step counter, action history, token budget)
`src/gateway/app.py`	FastAPI gateway (:8080) — task queue, SSE streaming, web UI, A2A, MCP
`src/gateway/backends/`	Auth backend package — ApiKey, OIDC, GitHub, LDAP, Kerberos
`src/connectors/`	Channel connectors — Discord, Telegram, Slack, Webhook
`src/tools/signing.py`	Ed25519 keypair management + tool manifest signing
`src/tools/crystallization_analyzer.py`	Pre-HITL AST + behavioral diff analyzer
`config/settings.py`	Pydantic settings singleton (loaded from hardware YAML profile)
`Makefile`	All development, test, and operational commands

Makefile Reference

# Environment
make check             # Verify environment before starting
make start             # Full startup (Ollama + PostgreSQL + model warmup)
make stop              # Graceful shutdown

# Testing
make test-smoke        # 2247 smoke tests, ~21s, no services required
make test-integration  # 41 integration tests (requires PostgreSQL)
make test-kerberos     # 5 Kerberos live-KDC tests (requires KDC)
make test-ui           # 40 UI tests (Playwright)
make test-fast         # All tests except slow ones

# Code quality
make lint              # Black formatter check
make format            # Auto-format
make security-audit    # Smoke + bandit + URI secret scan

# Services
make health-server     # Operator health API at :8765
make gateway-start     # User-facing gateway at :8080
make guardian-start    # Guardian sidecar at :9766 (requires Docker)
make discord-start     # Discord bot connector
make telegram-start    # Telegram bot connector
make slack-start       # Slack Socket Mode connector
make webhook-start     # Generic webhook connector at :8081

# Users & auth
make create-user USERNAME=<name>
make create-user USERNAME=<name> DAILY_LIMIT=<tokens>

# Security
make pentest           # Air-gapped red-team suite (stop-at-proof)
make pentest-report    # View most recent pentest findings
make audit-log-verify  # Verify SHA-256 hash chain integrity
make revoke-tool TOOL_ID=<id>   # Emergency tool revocation

Known Gaps (Accepted Residual Risk)

Embedding-level anomaly detection — RAG poisoning at the semantic vector level is an open research problem. Provenance scoring and trust flagging exist; embedding-level detection is deferred.
pip-audit / dependency hash pinning — Managed via Dependabot. pip-audit reports no known CVEs; transitive hash pinning is accepted residual risk.
langchain 1.x migration — Planned for a future phase; current stack runs langchain 0.3.x. A LOW-severity SSRF CVE in langchain-core is accepted risk (LegionForge never calls the affected method with image content).

Acknowledgements

LegionForge exists in a space shaped by several projects and thinkers worth calling out directly.

OpenClaw (Peter Steinberger — née Clawd → Clawdbot → Moltbot → OpenClaw) — the primary inspiration for LegionForge. Proved the demand (60,000 GitHub stars in 72 hours), proved the architecture (six-component structure, workspace-as-files memory), and proved what happens when security is an afterthought (512 vulnerabilities, active data exfiltration in third-party skills). LegionForge is building in the opposite order: security first, product on top.

LangGraph — the graph execution engine underneath everything. Checkpoint-based state persistence, loop protection, and graph resumption are LangGraph primitives that LegionForge builds on heavily.

LATM — Learning to Use Tools by Making Them (Cai et al., ICLR 2024) and Voyager (Wang et al., NVIDIA 2023) — the foundational academic work closest to LegionForge-Anneal's tool crystallization pipeline.

Anchor Engine by Robert S. Balch II — deterministic semantic memory using graph traversal (the STAR algorithm). Anchor's insight that agent memory should be explainable and deterministic directly informed LegionForge's temporal decay memory recall. The STAR gravity formula is adapted from Anchor's published whitepaper.

The AI-Human Engineering Stack by Hayen Mill and Henrique Jr. Sanchez (March 2026) — a five-layer framework for AI engineering (Prompt, Context, Intent, Judgment, Coherence). The KV-cache stability insight from this paper directly motivated the context ordering in LegionForge's agent message assembly.

The security-first design here is a direct response to watching these ecosystems grow fast and ship security as an afterthought. That's not a criticism — it's the reality of how open-source moves. This project is an attempt to show what the stack looks like when security is the first constraint, not the last.

For the full canonical record of design influences, academic inspirations, and third-party attributions, see CREDITS.md. For machine-readable citation data, see CITATION.cff.

License

AGPL-3.0 with Section 7(b) attribution requirement.

Status

v0.7.1-alpha — Active Development. This project is currently under active development and is not yet at a stable 1.0 release. The security stack, gateway, and tool library are functionally complete and well-tested, but the project is still evolving toward its v1.0.0 public release.


Version	v0.7.1-alpha
Smoke tests	2247/2247 passing
Integration tests	41/41
Kerberos tests	5/5
UI tests	40/40
Pre-v1.0 security blockers	All resolved
APIs / config formats	May change before v1.0.0

Contributions, issues, and commercial licensing inquiries welcome via GitHub Issues.

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
CNAME		CNAME
LICENSE		LICENSE
README.md		README.md
index.md		index.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

LegionForge

What Is LegionForge?

Key Numbers

Core Design Principles

User Interfaces

Security Architecture

Guardian Sidecar

Crystallization Pipeline

Multi-Provider Authentication

Threat Coverage

Phase Roadmap

Requirements

Quick Start

Documentation

Key Files

Makefile Reference

Known Gaps (Accepted Residual Risk)

Acknowledgements

License

Status

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

LegionForge

What Is LegionForge?

Key Numbers

Core Design Principles

User Interfaces

Security Architecture

Guardian Sidecar

Crystallization Pipeline

Multi-Provider Authentication

Threat Coverage

Phase Roadmap

Requirements

Quick Start

Documentation

Key Files

Makefile Reference

Known Gaps (Accepted Residual Risk)

Acknowledgements

License

Status

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Packages