harombe

The full-stack agent framework that runs entirely on your hardware.

Manages the complete agent lifecycle — tool execution in containers, network-level security, voice I/O, persistent memory with semantic search, and multi-node cluster routing. No cloud required.

Paper

The security architecture is described in our whitepaper:

The Capability-Container Pattern: Infrastructure-Level Security for Autonomous AI Agents Ricardo Ledan, 2026. DOI: 10.5281/zenodo.18614503

Quick Start

pip install harombe
harombe init          # detects hardware, writes config
ollama pull qwen2.5:7b
harombe chat          # autonomous agent with tools

Or use it as a library:

import asyncio
from harombe.agent.loop import Agent
from harombe.llm.ollama import OllamaClient
from harombe.tools.registry import get_enabled_tools

async def main():
    llm = OllamaClient(model="qwen2.5:7b")
    tools = get_enabled_tools(shell=True, filesystem=True, web_search=True)
    agent = Agent(llm=llm, tools=tools, system_prompt="You are a helpful assistant.")
    response = await agent.run("Find all Python files in src/ and count them.")
    print(response)

asyncio.run(main())

See examples/ for more.

What is harombe?

harombe manages the full stack for AI agents — hardware detection, tool execution (in containers), security (network-level), I/O (voice + CLI + API), memory (SQLite + vector search), and networking (multi-node clusters with service discovery). Think of it as an operating system for AI agents — as a pip install.

Security

harombe's security is infrastructure-level, not bolt-on:

Every tool runs in its own Docker container with resource limits
Per-container network egress filtering (iptables, DNS allowlists)
Audit logging with automatic credential redaction
Human-in-the-loop approval gates with risk classification
Secret management via Vault/SOPS
ZKP-based audit proofs (experimental)

What harombe manages

Responsibility	What it does
Execution	ReAct agent loop, tool calling (shell, filesystem, web search, browser, code exec)
Voice I/O	Whisper STT + Piper TTS, push-to-talk, VAD, WebSocket streaming
Memory	SQLite conversations, ChromaDB vectors, cross-session semantic search
Security	Container isolation, network filtering, audit logging, HITL gates, anomaly detection
Networking	Multi-node clusters, mDNS discovery, complexity-based routing, circuit breakers
Privacy	PII detection, data sanitization, local/hybrid/cloud routing modes
Extensibility	MCP server + client, container-isolated plugins with auto-generated MCP scaffolds

Architecture

┌─────────────────────────────────────┐
│  Layer 6: Clients                   │  Voice, iOS, Web, CLI
├─────────────────────────────────────┤
│  Layer 5: Privacy Router            │  Hybrid local/cloud AI
│  PII detection, context sanitizer   │  local-only / hybrid / cloud
├─────────────────────────────────────┤
│  Layer 4: Agent & Memory            │  ReAct loop, tools, memory
├─────────────────────────────────────┤
│  Layer 3: Security                  │  Defense-in-depth
│  MCP Gateway, container isolation   │  Credential vault, audit log
│  Per-tool egress, secret scanning   │  HITL gates, browser pre-auth
├─────────────────────────────────────┤
│  Layer 2: Orchestration             │  Smart routing, health monitoring
│  Cluster config, mDNS discovery     │  Circuit breakers, metrics
├─────────────────────────────────────┤
│  Layer 1: Runtimes                  │  llama.cpp, Whisper, TTS, embeddings
└─────────────────────────────────────┘

Each layer only talks to its neighbors. Security (Layer 3) wraps every tool invocation — there is no path from the agent to a tool that bypasses the gateway. See ARCHITECTURE.md for full design documentation.

Security Deep Dive

harombe enforces the Capability-Container Pattern: agents never execute tools directly. Every tool call goes through the MCP Gateway, which routes it to an isolated Docker container.

Agent  ──→  MCP Gateway  ──→  [ Container: shell ]
                          ──→  [ Container: browser ]
                          ──→  [ Container: web_search ]

Container Isolation — Each tool runs in its own Docker container with CPU/memory limits, read-only filesystem mounts, and no host network access.

Network Egress — Per-container iptables rules and DNS allowlists. A web_search container can reach DuckDuckGo; a filesystem container can reach nothing.

Audit Logging — Every tool call, approval decision, and security event is logged to SQLite with automatic redaction of API keys, passwords, and tokens.

HITL Gates — Operations are risk-classified (LOW/MEDIUM/HIGH/CRITICAL). High-risk operations require explicit human approval with default-deny on timeout.

Anomaly Detection — Per-agent Isolation Forest models learn baseline behavior and flag deviations. Integrated with SIEM (Splunk, Elasticsearch, Datadog).

See docs/security-quickstart.md for setup instructions.

Features

Agent Core

ReAct agent loop with autonomous planning, tool calling, and multi-step execution. Tools: shell, read/write files, web search, browser automation. Configurable step limits and confirmation gates.

Voice

Whisper STT (tiny to large-v3) + Piper TTS. Push-to-talk CLI (harombe voice), REST + WebSocket API, voice activity detection. Cross-platform audio I/O.

Memory & RAG

SQLite conversation persistence with token-based context windowing. ChromaDB vector store with sentence-transformers embeddings (local, no API calls). Semantic search across sessions. RAG-enabled agents auto-inject relevant context.

Distributed Clusters

Define nodes in YAML, route queries by complexity. mDNS auto-discovery, health monitoring, circuit breakers, load balancing. Works with any hardware mix: Apple Silicon, NVIDIA, AMD, CPU.

MCP Protocol

JSON-RPC 2.0 MCP server and client. Gateway-mediated tool execution with per-tool container isolation. Compatible with the broader MCP ecosystem.

Plugins

Container-isolated plugins with auto-generated MCP scaffolds. ZKP audit proofs, compliance reporting, and container-based extensions ship as built-in plugins. See src/harombe/plugins/ for examples.

Privacy

PII detection and data sanitization. Three routing modes: local-only (nothing leaves your machine), hybrid (sensitive data stays local, general queries can use cloud), and cloud. Configurable per-agent.

Examples

#	Example	Description
01	`simple_agent.py`	Basic single-node agent with all tools
02	`api_usage.py`	Programmatic agent creation and tool usage
03	`data_pipeline.py`	Data processing with autonomous agents
04	`code_review.py`	Automated code review workflows
05	`research_agent.py`	Research automation with web search
06	`memory_conversation.py`	Persistent conversation history
07	`cluster_routing.py`	Task-based routing across nodes
08	`semantic_memory.py`	Semantic search and RAG
09	`voice_assistant.py`	Voice-enabled assistant (STT + TTS)

Configuration

Configuration lives at ~/.harombe/harombe.yaml. Minimal example:

model:
  name: qwen2.5:7b
  temperature: 0.7

tools:
  shell: true
  filesystem: true
  web_search: true
  confirm_dangerous: true

memory:
  enabled: true
  storage_path: ~/.harombe/memory.db

All fields have sensible defaults — you can run with no config at all. See harombe.yaml.example for the full reference.

Status

Production	Experimental	Planned
ReAct agent loop	Hardware security modules	iOS/Web clients
Tool execution (shell, fs, web)	Distributed cryptography	Multi-modal vision
Container isolation + egress
Code execution sandbox
ZKP audit proofs
Audit logging + secret management
HITL approval gates
Conversation memory + RAG
Voice (STT + TTS)
Multi-node clusters
MCP server + client
Privacy router + PII detection
Anomaly detection + SIEM

2400+ tests. Python 3.11-3.13. CI on Ubuntu + macOS. See docs/roadmap.md for full phase history.

Development

git clone https://github.com/smallthinkingmachines/harombe.git
cd harombe && pip install -e ".[dev]"
pytest

See docs/DEVELOPMENT.md for detailed setup and docs/CONTRIBUTING.md for contribution guidelines.

License

Apache 2.0 — see LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 182 Commits
.github/workflows		.github/workflows
benchmarks		benchmarks
docker		docker
docs		docs
examples		examples
paper		paper
src/harombe		src/harombe
tests		tests
.dockerignore		.dockerignore
.envrc		.envrc
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
ARCHITECTURE.md		ARCHITECTURE.md
CHANGELOG.md		CHANGELOG.md
CITATION.cff		CITATION.cff
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
Dockerfile		Dockerfile
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
SECURITY.md		SECURITY.md
flake.lock		flake.lock
flake.nix		flake.nix
harombe.yaml.example		harombe.yaml.example
mkdocs.yml		mkdocs.yml
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

harombe

Paper

Quick Start

What is harombe?

Security

What harombe manages

Architecture

Security Deep Dive

Features

Agent Core

Voice

Memory & RAG

Distributed Clusters

MCP Protocol

Plugins

Privacy

Examples

Configuration

Status

Development

License

About

Uh oh!

Releases 3

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

harombe

Paper

Quick Start

What is harombe?

Security

What harombe manages

Architecture

Security Deep Dive

Features

Agent Core

Voice

Memory & RAG

Distributed Clusters

MCP Protocol

Plugins

Privacy

Examples

Configuration

Status

Development

License

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 3

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages