Persistent identity and memory across AI tools - mcp-native, local-first, framework-agnostic, production-ready.
git clone https://github.com/huss-mo/GroundMemory && cd GroundMemory
docker compose up -d
# -> listening on http://127.0.0.1:4242/mcppip install groundmemory && groundmemory-mcp
# -> listening on http://127.0.0.1:4242/mcp{
"mcpServers": {
"GroundMemory": {
"url": "http://127.0.0.1:4242/mcp"
}
}
}You can enable network access and replace 127.0.0.1 with your server's LAN IP - see DOCS.md - Network Access.
You can also use the MCP server over the stdio transport - see DOCS.md - Client Configuration.
Your agent now has structured, searchable memory that persists across every session - long-term facts, a user profile, agent instructions, an entity graph, and daily logs - all managed automatically. No changes to your agent's code required.
This config works with any MCP-compatible client, including AI coding assistants (Cursor, Cline, Windsurf, Claude Code, Codex CLI), AI desktop clients (Claude Desktop, Open WebUI), and agent frameworks and platforms (LangChain, CrewAI, AutoGen, Google ADK, LiteLLM, n8n).
For installation options, embedding providers, multiple workspaces, and the Python API, see DOCS.md.
Without memory, every session in every AI tool starts from zero. With GroundMemory, agents can maintain continuity across time, accumulate knowledge, and behave like they actually know the person they're working with. It makes conversationg stateful, fluid, and natural.
A coding/personal assistant that builds a profile over time. After a few conversations, it knows your schedule, your priorities, how you like to communicate, your tech stack, your preferred patterns, the architectural decisions you've already made. It doesn't need to ask.
A research agent that constructs a knowledge graph. As it reads papers and sources across many sessions, it records entities, relationships, and findings.
A single identity across every AI tool you use. Your workspace is not bound to one assistant. Connect Claude Desktop, Cursor, Cline, and any other MCP-compatible tool to the same GroundMemory server and they all share the same memory - your preferences, your stack, your ongoing work. You stop being a stranger every time you open a different tool. There is something genuinely different about being known rather than just answered - it shifts the relationship from transactional to collaborative, and removes the quiet tax of re-establishing context that most people don't notice until it's gone.
A customer-facing agent with per-user memory. In multi-user setups, each user gets their own workspace - preferences, history, ongoing context - giving every interaction a personalised, stateful feel without any custom infrastructure.
A long-running autonomous agent that survives context limits. When the context window fills, compaction hooks instruct the agent to flush important facts to memory before the window rolls over. The next session picks up exactly where the last one left off.
Most agents are stateless. They ask the same questions again, repeat the same mistakes, and lose track of the user's preferences and ongoing work. This is not a model limitation - it is missing infrastructure.
GroundMemory provides that infrastructure. It gives your agent a structured, searchable memory that persists across sessions, organised into distinct tiers with clear ownership:
| File | Purpose |
|---|---|
MEMORY.md |
Curated long-term facts - preferences, decisions, persistent knowledge. Written by the agent using memory_write(tier="long_term"). Survives forever. |
USER.md |
Stable user profile - name, role, working style. Edited manually or by the agent. Injected at every session start. |
AGENTS.md |
Agent operating instructions - how this agent should behave, what tools to use and when. Seeded with sensible defaults. |
RELATIONS.md |
Entity relationship graph - typed triples (Alice → works_at → Acme Corp). Written by memory_relate, human-readable mirror of the SQLite graph. |
daily/YYYY-MM-DD.md |
Append-only daily logs - task progress, running notes, session context. Written by memory_write(tier="daily"). |
At session start, all of these files are assembled into a compact system prompt block your agent receives as context - called bootstrap injection. At search time, all tiers are queried together or individually.
Additional capabilities:
- Hybrid search - BM25 keyword scoring and vector cosine similarity are combined and re-ranked in a single query, so recall is accurate even when the wording differs from what was stored.
- Zero-setup mode - with
provider: none, GroundMemory runs entirely on SQLite with FTS5. No API key, no GPU, no extra dependencies. - Pluggable embedding providers - swap between a local sentence-transformers model, any OpenAI-compatible endpoint (OpenAI, Ollama, LM Studio, LiteLLM), or BM25-only without touching your agent code.
- Workspace isolation - each project, user, or agent gets its own directory-backed workspace with independent memory, relations, and daily logs.
- Relation graph with semantic deduplication - the graph automatically suppresses near-duplicate triples using configurable cosine similarity thresholding.
- Compaction hooks - when a session approaches the context window limit, GroundMemory emits structured prompts that instruct the agent to flush important facts to storage before the window rolls over.
Comparison reflects publicly documented features as of MAR-2026. Submit a PR if anything is inaccurate.
| Feature | GroundMemory | Mem0 | Letta | memsearch | Zep |
|---|---|---|---|---|---|
| Zero-setup (no API key, no GPU) | ✅ | - | - | - | - |
| Local-first / offline | ✅ | - | - | Partial¹ | - |
| Human-readable Markdown memory | ✅ | - | - | ✅ | - |
| Structured memory tiers | ✅ | ✅² | ✅³ | - | - |
| Hybrid BM25 + vector search | ✅ | - | - | ✅ | ✅ |
| Entity relation graph | ✅ | ✅ | - | - | ✅ |
| MCP-native server | ✅ | Partial⁴ | Partial⁵ | - | - |
| Compaction hooks | ✅ | - | ✅ | - | - |
| Temporal knowledge graph | -⁶ | - | - | - | ✅ |
| Full agent framework | - | - | ✅ | - | - |
| Managed cloud service | - | ✅ | ✅ | - | ✅ |
¹ memsearch supports local ONNX embeddings + Milvus Lite, but requires initial model download
² Mem0 organizes memory into Conversation, Session, User, and Organizational layers
³ Letta uses Core Memory blocks (in-context) + Archival Memory (vector DB) + Conversation Search
⁴ Mem0 offers an MCP integration but the primary interface is the Python/Node SDK
⁵ Letta agents can consume external MCP servers as tools; Letta itself is not an MCP server
⁶ GroundMemory timestamps all relations but does not support date-range queries.
In normal mode, GroundMemory exposes four tools: memory_bootstrap, memory_read, memory_write, and memory_relate. An optional memory_list tool can be enabled via config. In dispatcher mode, all actions are routed through a single memory_tool call - useful for clients that perform better with fewer tools in scope.
When using the MCP server, instruct your agent to call memory_bootstrap at the start of every session before doing anything else, if you find out that it doesn't do that by default. This loads the full memory context (MEMORY.md, USER.md, AGENTS.md, RELATIONS.md, daily logs) into the conversation. Clients that support the MCP Prompts primitive (Cline, Claude Desktop) can instead use the memory_bootstrap_prompt prompt from their Prompts panel.
When using the Python API, call session.bootstrap() and pass the result as your system prompt - no tool call is needed.
For the full tools reference including parameters, tiers, and source filters, see DOCS.md - Tools Reference.
┌─────────────────────────────────────────────────────┐
│ AI Agent / LLM │
│ (OpenAI, Anthropic, or any framework) │
└────────────────────┬────────────────────────────────┘
│ tool calls + bootstrap prompt
▼
┌─────────────────────────────────────────────────────┐
│ MemorySession │
│ workspace · index · provider · config │
└───────┬──────────────┬──────────────────────────────┘
│ │
▼ ▼
┌───────────┐ ┌───────────────────────────────────┐
│ Workspace │ │ MemoryIndex │
│ │ │ SQLite + FTS5 (BM25 keyword) │
│ MEMORY.md │ │ + optional vector store │
│ USER.md │ │ hybrid re-ranking + MMR │
│ AGENTS.md │ └──────────────┬────────────────────┘
│ RELATIONS │ │
│ daily/ │ ▼
└───────────┘ ┌───────────────────────────────────┐
│ EmbeddingProvider │
│ NullProvider (BM25-only) │
│ SentenceTransformer (local) │
│ OpenAICompatible (HTTP API) │
└───────────────────────────────────┘
For a detailed breakdown of each layer, the full data flow, and the tech stack, see DOCS.md - Architecture.
GroundMemory is designed around three values:
- Simplicity over features. Every addition must justify its complexity. A zero-dependency BM25-only mode must always work.
- Offline-first. The default configuration must not require an API key, a network connection, or a GPU.
- Test-driven. New behaviour ships with tests. The full suite must pass before any PR is merged.
git clone https://github.com/huss-mo/GroundMemory.git
cd GroundMemory
# Install with all dev dependencies
pip install -e ".[dev,local]"
# Or with uv
uv sync --extra dev --extra local# Core tests - no extra deps or config required (fast, always passes)
pytest
# Local model tests - sentence-transformers + cross-encoder
# Requires: pip install groundmemory[local]
pytest -m local
# API embedding tests - OpenAI-compatible HTTP endpoint
# Requires: endpoint configured via .env or groundmemory.yaml (see below)
pytest -m api_embeddings
# All marked tests together
pytest -m "local or api_embeddings"
# With coverage
pytest --cov=groundmemory --cov-report=term-missinglocal tests download real sentence-transformers and cross-encoder models on first run.
Model names are read from config (embedding.local_model, search.rerank_model).
They skip automatically when sentence-transformers is not installed.
api_embeddings tests require a configured OpenAI-compatible embedding endpoint.
All settings are read from .env or groundmemory.yaml — whichever is found first
(.env takes priority). The tests skip automatically when embedding.provider is
not openai or the endpoint is unreachable.
Minimal .env for api_embeddings (project root or ~/.groundmemory/):
GROUNDMEMORY_EMBEDDING__PROVIDER=openai
GROUNDMEMORY_EMBEDDING__BASE_URL=http://localhost:11434/v1
GROUNDMEMORY_EMBEDDING__API_KEY=ollama
GROUNDMEMORY_EMBEDDING__MODEL=nomic-embed-textOr equivalently via groundmemory.yaml:
embedding:
provider: openai
base_url: http://localhost:11434/v1
api_key: ollama
model: nomic-embed-textAny OpenAI-compatible endpoint works: OpenAI, Ollama, LM Studio, LiteLLM, etc.
Running pytest with no -m flag runs everything — marked tests skip gracefully
when their requirements are not met.
- Fork the repository and create a branch:
git checkout -b feature/your-feature-name - Make your changes with accompanying tests.
- Run tests - all unit tests must pass.
- Open a pull request with a clear description of what changes and why.
This project is licensed under the MIT License - see the LICENSE file for details.
