MindReader V2 — Technical Reference

Architecture

MindReader is organized as an npm workspaces monorepo:

mindreaderv2/
  packages/
    mindgraph/         # Python core - Graphiti memory engine, CLI, background worker
    mindreader-ui/     # Express server + React UI - visualization & management
      server/
        server.js      # App assembly (~150 lines)
        routes/        # 8 route modules (graph, entity, categories, search, cleanup, audit, tokens, cli)
        lib/           # Shared utilities (daemon, categorizer, LLM client, preprocessor)
      ui/              # React frontend
    openclaw-plugin/   # Optional AI agent integration - auto-recall/capture

How It Works

Conversations ──> Capture ──> Preprocessor ──> Neo4j Knowledge Graph ──> Recall ──> AI Context
                    │              │                     │                            │
                    ▼              ▼                     ▼                            ▼
              Filter messages  Classify facts       Auto-categorize            Semantic search
              Extract facts    Attributes → Neo4j   Auto-tag                   Entity profiles
              Find entities    Relationships → Graphiti  Relationship repair    Structured JSON
                                                         │
                                                         ▼
                                                   Self-Evolution
                                                   (web search LLM discovers
                                                    new entities & relationships)

Capture — Conversations are filtered, then an LLM preprocessor extracts and classifies facts before storage
Preprocess — Each fact is classified as either an attribute (written directly to entity tags/summary in Neo4j) or a relationship (forwarded to Graphiti for graph storage). This prevents junk entities like "Developer" or "15 Years Experience" from polluting the graph.
Organize — LLM auto-categorizes, auto-tags, and maintains the graph continuously
Evolve — Any node can be expanded via web-search-powered research, discovering new entities and connections
Recall — Semantic search retrieves relevant memories with full entity context

Smart Storage — Quality Over Quantity

Traditional knowledge graph systems treat everything as entities. Tell it "Dell is a developer with 15 years experience" and you get three entities: "Dell", "Developer", and "15 Years Experience". The last two are attributes, not independent entities — they pollute the graph.

MindReader's preprocessing pipeline solves this:

Known entity lookup — Before classification, the preprocessor searches Neo4j for entities mentioned in the text
LLM classification — Each fact is classified as an attribute (role, skill, trait, preference → direct Neo4j update) or a relationship (connection between entities → Graphiti)
Direct attribute writes — Tags and summary updates are written directly to existing entity nodes, avoiding unnecessary graph traversal
Graceful degradation — If preprocessing fails, the system falls back to Graphiti with custom extraction instructions that still prevent junk entity creation

Both the manual store path (memory_store tool) and the auto-capture path (end-of-conversation hook) go through this pipeline. The auto-capture path additionally extracts key facts from conversation history before classification, filtering out code, debug output, and tool results.

Preprocessing Modes

Mode	Behavior	LLM Calls
`merged` (default)	One LLM call extracts facts + classifies	1
`two-pass`	Separate extraction and classification steps	1 + N

Set via the PREPROCESS_MODE environment variable.

Features Detail

Smart Search

Search across entity names, summaries, and tags from a single search bar (Ctrl+K).

Tag-aware search — Search "swimmer" finds entities tagged "swimmer" even if the word isn't in the name
Context-aware — On the graph tab, search highlights and zooms to matching nodes
Relevance ranking — Exact name match > prefix match > contains match > tag match

Auto-Maintenance

Feature	How It Works
Auto-Categorization	LLM classifies uncategorized entities using name, summary, and tags
Auto-Tagging	Extracts descriptive tags (roles, skills, locations, relationships) in the same LLM call
Batch Re-categorization	Process uncategorized entities in configurable batches with one click
Duplicate Detection	Scans for entities with similar names and properties
Relationship Repair	Detects reversed, misspelled, and vague relationships (rule-based + LLM)
Orphan Cleanup	Finds and removes disconnected entities

Multiple Views

View	Purpose
List	Browse and search entities with pagination, filter by category
Timeline	See memories organized chronologically (Today, Yesterday, This Week, Earlier)
Graph	Interactive visual exploration of entities and relationships
Categories	Browse and manage entity categories with per-category counts
Activity	Audit log of captured and recalled memories
Tokens	Track LLM API usage and costs
Maintenance	Cleanup tools, relationship repair, batch re-categorization

Graph Visualization

Zoom, pan, click to explore
Nodes sized by connection count — important entities are larger
Color-coded by category (person, project, company, etc.)
Hover for quick preview with category, tags, and summary
Filter by category to focus on what matters
6 layout modes — Force, ForceAtlas2, Radial, Circular, Cluster, Grid

Multi-Round Evolve

Evolve supports up to 3 rounds of progressive discovery:

Round 1: Internal discovery (connections to existing graph entities) + external web research
Round 2: Deeper research, excluding Round 1 discoveries, finding second-degree connections
Round 3: Further depth, building on all prior discoveries

Each round sends previous discoveries as exclusion context to avoid re-discovering the same entities. The UI shows a sequential feed with round dividers. After each round, choose to continue or save.

Memory Decay — Temporal Lifecycle

Memories decay over time unless reinforced by access. This keeps the knowledge graph fresh and prevents stale data from cluttering recall results.

How It Works

Strength decay: Every entity and relationship has a strength field (0.0-1.0). Strength decays exponentially based on time since last access: strength = exp(-lambda * days_since_last_access). Default lambda=0.03 gives a ~23-day half-life.
Access reinforcement: When an entity is accessed (search, recall, detail view), its strength is boosted and last_accessed_at is reset. Frequently-used memories stay strong.
Auto-expiry: When strength drops below the threshold (default 0.1), the item is soft-expired via expired_at timestamp. It's hidden from normal queries but preserved for history.
Cascade expiry: When all relationships of an entity are expired, the entity is auto-expired too.
Contradiction expiry: Handled by Graphiti — when new info contradicts an existing edge, the old edge is invalidated immediately regardless of strength.

Configuration

Variable	Default	Description
`MEMORY_DECAY_ENABLED`	`true`	Enable/disable the decay system
`MEMORY_DECAY_INTERVAL_MS`	`3600000` (1hr)	How often the decay job runs
`MEMORY_DECAY_LAMBDA`	`0.03`	Decay rate (~23-day half-life)
`MEMORY_DECAY_THRESHOLD`	`0.1`	Auto-expire below this strength
`MEMORY_DECAY_REINFORCE_DELTA`	`0.3`	Strength boost on access

API Endpoints

Endpoint	Method	Description
`/api/decay/status`	GET	Decay statistics and config
`/api/decay/run`	POST	Manually trigger a decay cycle
`/api/decay/restore/:name`	POST	Un-expire an entity and its relationships

Time-Travel UI

Enable "Time Travel" in the graph view to reveal a date slider:

Drag the slider to see the graph at any point in time
Nodes created after the selected date disappear
Expired nodes reappear as ghosts
Date gauge ticks show key dates
Auto Play: animates from the oldest entity to now over 30 seconds

All expiry is soft-delete. The underlying data model uses created_at and expired_at timestamps:

-- Show the graph as of March 1
WHERE e.created_at <= datetime('2026-03-01')
  AND (e.expired_at IS NULL OR e.expired_at > datetime('2026-03-01'))

Entity Details — Rich Knowledge Storage

Each entity has two text fields:

Summary (max 200 chars): Brief identifier used in recall context and search previews
Details (max 10KB, markdown): Comprehensive information — every fact, relationship context, and historical change

How Details Are Populated

When new facts are captured about an entity, an LLM synthesizes the existing details with the new information into an updated markdown document. The LLM also generates a concise 200-char summary.

This happens automatically on:

Auto-capture: conversation facts merged into entity details
Manual store: memory_store tool updates trigger synthesis
Evolve: web search discoveries merged into details
Direct API: POST /api/entities can set details explicitly

API

Endpoint	Method	Description
`/api/entity/:name`	GET	Returns entity with `details` field
`/api/entity/:name/details`	PUT	Update details directly (body: `{ details: "markdown..." }`)
`/api/entities`	POST	Create entities with optional `details` field

Search

Details are included in the fulltext search index. Searching for any term in the details will find the entity, even if the summary doesn't mention it.

Direct Entity API

For systems that require precise, deterministic memory management without LLM processing:

POST /api/entities

Create or update entities directly in Neo4j with batch support, optional relationships, and upsert behavior.

See Direct Entity API Reference for full documentation with examples.

CLI Tool

Full command-line interface for power users and automation:

mg search "swimming competitions"     # Semantic search with entity profiles
mg search "Aria" --json               # Machine-readable JSON output
mg tags "Aria Lu"                     # View tags: Aria Lu [person]: swimmer, daughter
mg tags "Aria Lu" --add "competitive" # Add a tag
mg tags --backfill                    # LLM-extract tags for all entities
mg add "Alice is a data scientist"    # Store a new memory
mg entities --limit 20                # List entities
mg maint scan                        # Scan for issues
mg maint fix                         # Auto-fix duplicates and orphans

OpenClaw Integration

MindReader is built as a first-class OpenClaw extension.

Plugin Capabilities

Auto-capture (agent_end hook) — extracts entities, facts, and relationships from conversations
Auto-recall (before_agent_start hook) — retrieves relevant memories and injects into agent context
Tool calls — memory_search, memory_store, memory_entities tools
Web UI — graph explorer served on configurable port (default 18900)
Auto-sync — plugin files automatically sync from monorepo on gateway restart

Plugin Configuration

Configuration in openclaw.json under plugins.entries.mindreader.config:

Key	Description	Default
`neo4jUri`	Neo4j bolt connection URI	`bolt://localhost:7687`
`neo4jUser`	Neo4j username	`neo4j`
`neo4jPassword`	Neo4j password	(empty)
`llmProvider`	LLM provider (openai/dashscope/anthropic)	(from .env)
`llmApiKey`	LLM API key	(from .env)
`llmBaseUrl`	LLM API base URL	(from .env)
`llmModel`	LLM model name	(from .env)
`autoCapture`	Enable auto-capture from conversations	`true`
`autoRecall`	Enable auto-recall before agent responses	`true`
`recallLimit`	Max memories to recall	`5`
`captureMaxChars`	Max chars to capture per conversation	`2000`
`uiPort`	Web UI port	`18900`
`uiEnabled`	Enable web UI	`true`

Values left empty in openclaw.json fall through to .env variables automatically.

Configuration

Configuration is stored in .env at the monorepo root. The setup wizard generates this automatically.

Key Environment Variables

Variable	Description
`LLM_PROVIDER`	`openai`, `dashscope`, `anthropic`, or `ollama`
`LLM_API_KEY`	API key for the LLM provider
`LLM_BASE_URL`	API base URL
`LLM_MODEL`	Primary LLM model
`LLM_SMALL_MODEL`	Smaller/faster model for extraction
`LLM_EVOLVE_MODEL`	Model for evolve feature (ideally with web search)
`EMBEDDER_API_KEY`	Embedder API key (defaults to LLM_API_KEY)
`EMBEDDER_BASE_URL`	Embedder API base URL
`EMBEDDER_MODEL`	Embedding model name
`EMBEDDER_DIM`	Embedding dimensions
`NEO4J_URI`	Neo4j connection URI
`NEO4J_USER`	Neo4j username
`NEO4J_PASSWORD`	Neo4j password
`UI_PORT`	Web UI port (default: 18900)
`SEQ_URL`	Seq structured logging URL (optional)
`SEQ_API_KEY`	Seq API key (optional)

LLM Providers

Provider	Status	Default Model	Web Search	Notes
OpenAI	Supported	`gpt-4o-mini`	No	Most widely available
DashScope (Alibaba)	Supported	`qwen3.5-flash`	Yes (built-in)	Best for evolve feature
Anthropic	Supported (native)	`claude-sonnet-4-6`	No	Uses native Anthropic SDK
Ollama	Supported	`llama3.2`	No	Free, local, no API key. Embeddings via `nomic-embed-text`

Development

# Dev mode (server + UI with hot reload)
npm run dev

# Build the UI for production
npm run build

# Run setup wizard
npm run setup

# CLI tool
mg --help   # or: python3 packages/mindgraph/python/mg_cli.py --help

Cross-Platform Support

MindReader runs on Linux, macOS, and Windows. The setup wizard detects the platform and uses the appropriate script (bash on Unix, PowerShell on Windows). The Python daemon uses platform-specific stdin handling (asyncio pipes on Unix, threaded readline on Windows).

Multi-Tenant Support

MindReader supports multi-tenant isolation via a mandatory tenantId property on all data nodes and edges.

How It Works

Every Entity, Episodic, TokenUsage, AuditLog node and every RELATES_TO/MENTIONS edge has a tenantId property
In open-source (single-user) mode, tenantId is always "master"
In cloud mode, the upstream API proxy sets X-Tenant-Id header per authenticated user
All queries automatically filter by tenantId via AsyncLocalStorage context injection
Category and Migration nodes are system-wide (no tenant filtering)

Configuration

Variable	Default	Description
`INTERNAL_SECRET`	(empty)	Shared secret for X-Internal-Secret header validation. When set, only requests with this secret can set X-Tenant-Id to non-"master" values.

Security

Express server should NOT be publicly accessible in cloud deployments
X-Internal-Secret header validates that tenant ID comes from the trusted proxy
AsyncLocalStorage ensures tenant context flows to all downstream queries automatically
Two-tenant isolation test suite (tests/tenant-isolation.test.js) validates no cross-tenant leakage

Running Isolation Tests

# Start MindReader server first, then:
node tests/tenant-isolation.test.js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MindReader V2 — Technical Reference

Architecture

How It Works

Smart Storage — Quality Over Quantity

Preprocessing Modes

Features Detail

Smart Search

Auto-Maintenance

Multiple Views

Graph Visualization

Multi-Round Evolve

Memory Decay — Temporal Lifecycle

How It Works

Configuration

API Endpoints

Time-Travel UI

Entity Details — Rich Knowledge Storage

How Details Are Populated

API

Search

Direct Entity API

CLI Tool

OpenClaw Integration

Plugin Capabilities

Plugin Configuration

Configuration

Key Environment Variables

LLM Providers

Development

Cross-Platform Support

Multi-Tenant Support

How It Works

Configuration

Security

Running Isolation Tests

FilesExpand file tree

technical.md

Latest commit

History

technical.md

File metadata and controls

MindReader V2 — Technical Reference

Architecture

How It Works

Smart Storage — Quality Over Quantity

Preprocessing Modes

Features Detail

Smart Search

Auto-Maintenance

Multiple Views

Graph Visualization

Multi-Round Evolve

Memory Decay — Temporal Lifecycle

How It Works

Configuration

API Endpoints

Time-Travel UI

Entity Details — Rich Knowledge Storage

How Details Are Populated

API

Search

Direct Entity API

CLI Tool

OpenClaw Integration

Plugin Capabilities

Plugin Configuration

Configuration

Key Environment Variables

LLM Providers

Development

Cross-Platform Support

Multi-Tenant Support

How It Works

Configuration

Security

Running Isolation Tests