Built by DARKSOL 🌑
Production-minded agent memory for teams that need durable recall, scoped access, and benchmarked retrieval beyond the active prompt window.
Persistent, queryable memory for AI agents.
ReMEM is a lightweight, framework-agnostic memory substrate for AI agents. It gives agents a persistent memory layer they can store to, query by meaning, organize across layers, and carry across restarts.
It applies the core insight from Recursive Language Models (RLMs) - that prompts should be external environment variables, not direct context - to the problem of persistent, queryable agent memory.
Built with TypeScript. Node-first, with storage and adapter surfaces designed to stay framework-agnostic.
- Drop-in agent memory - store facts, preferences, decisions, procedures, and recent events outside the active prompt window
- Semantic recall, not just keyword recall - hybrid keyword + embedding search when vectors are enabled
- Persists across restarts - SQLite by default, PostgreSQL for server/shared deployments, snapshots for backup + restore
- Grows with the agent - layers, compression, links, adapters, and long-running memory workflows without forcing a framework
import { ReMEM } from '@darksol/remem';
const memory = new ReMEM({ dbPath: './remem.db' });
await memory.init();
await memory.store({
content: 'Meta prefers dark mode UI and short direct replies',
topics: ['preferences', 'ui', 'style'],
});
const { results } = await memory.query('How should I respond to Meta?', {
metadata: { source: 'operator-notes' },
});
console.log(results[0].content, results[0].metadata);
Works well for: chat agents, coding agents, operator copilots, long-running automations, and multi-agent systems that need durable memory instead of stuffing everything into context.
ReMEM now has a terminal-native setup surface too:
remem ui --db ./remem.db
remem init --db ./remem.db --runtime openclaw --out-dir ./.remem
The terminal UI is intentionally scoped for humans doing setup and integration work:
- choose runtime focus first (OpenClaw, Hermes, or generic)
- storage selection + runtime summary
- embeddings / Ollama configuration review
- LLM configuration review
- adapter onboarding guidance with runtime-specific checklists
- starter code snippet generation
- starter config generation
- smoke-check / verification lane
- recommended execution plan for getting ReMEM live
There is also a direct CLI surface for agent-facing operations:
remem status --db ./remem.db
remem stats --db ./remem.db --json
remem health --db ./remem.db --json
remem store --content "Meta likes dark mode" --topics preferences,ui
remem query --query "What does Meta like?"
remem context-pack --query "What context should the next agent carry?" --max-chars 6000 --json
remem dream --query "What is long memory trying to tell us?" --json
remem smoke-check --db ./remem.db --json
remem snapshots --action create --label before-upgrade
remem init can now generate starter artifacts non-interactively:
- remem.config.json
- remem-snippet.ts
- .env.example
That makes it usable in setup scripts, harness bootstraps, and CI-prep lanes instead of only as a human console.
The split is deliberate: the UI helps a human wire ReMEM into OpenClaw/Hermes cleanly, while actual memory operations stay scriptable for agents.
For release/setup hygiene, ReMEM also ships diagnostics:
remem doctor --db ./remem.db --json
remem validate-config --config ./.remem/remem.config.json --json
remem init --runtime openclaw --out-dir ./.remem --check --json
Use doctor when you want package/runtime/config/storage/snapshot checks in one command. Use smoke-check when you only need the lighter snapshot + optional endpoint verification pass.
Use stats when an agent needs a compact inventory of the memory scope before deciding whether to recall, consolidate, snapshot, or prune.
Use health when an agent needs a maintenance plan, not just counts. It scores the current memory scope and flags missing/stale snapshots, duplicate memories, stale unaccessed entries, weak topic coverage, and long-memory layer pressure.
remem health --db ./remem.db --json
HTTP runtimes can call the same triage surface at GET /memory/health or POST /memory/health.
Example actions in a health report:
- create a fresh snapshot before a release or migration
- run consolidation when duplicate memories start polluting recall
- improve topic coverage for untagged memories
- pack context around stale memories before deciding what to keep
ReMEM can run a first-class dream pass over durable layers (identity, semantic, procedural) to synthesize what the system keeps circling back to.
remem dream --query "What long-memory patterns matter most right now?" --json
This is different from plain recall:
- recall answers a query from stored memory
- consolidation deduplicates and promotes durable memory
- dreaming synthesizes long-memory themes, tensions, and next moves from the memories most worth carrying forward
If an LLM is configured, the dream pass produces a compact model-written artifact. Without an LLM, ReMEM still returns a deterministic synthesis from the strongest long-memory entries.
Use contextPack() when an agent needs a bounded memory brief it can paste directly into the next prompt, task handoff, or remote worker request.
const pack = await memory.contextPack('What should the next coding agent know about ReMEM?', {
profile: 'agent-safe',
maxChars: 6000,
includeDream: true,
});
console.log(pack.content);
console.log(pack.sourceIds);
Context packs combine smart recall, graph/procedural signals, recent context, and optional dream synthesis while staying under the requested character budget.
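To make the character budget concrete, here is a minimal sketch of one way a bounded pack could be assembled. It uses plain greedy packing by score, which is an assumption for illustration and not necessarily how contextPack() works internally; the Candidate and Pack shapes and the buildPack name are hypothetical.

```typescript
// Hypothetical shapes for illustration; not ReMEM's internal types.
interface Candidate {
  id: string;
  content: string;
  score: number; // higher means more relevant
}

interface Pack {
  content: string;
  sourceIds: string[];
}

// Greedy packing: walk candidates in descending score order and keep each one
// only if it still fits inside the character budget.
function buildPack(candidates: Candidate[], maxChars: number): Pack {
  const ranked = [...candidates].sort((a, b) => b.score - a.score);
  const lines: string[] = [];
  const sourceIds: string[] = [];
  let used = 0;

  for (const c of ranked) {
    const line = `- ${c.content}`;
    const cost = line.length + (lines.length > 0 ? 1 : 0); // +1 for the newline separator
    if (used + cost > maxChars) continue; // skip entries that would overflow the budget
    lines.push(line);
    sourceIds.push(c.id);
    used += cost;
  }

  return { content: lines.join('\n'), sourceIds };
}

const examplePack = buildPack(
  [
    { id: 'm1', content: 'Run tests before publish', score: 0.9 },
    { id: 'm2', content: 'A very long note '.repeat(20), score: 0.8 },
    { id: 'm3', content: 'Prefer dark mode', score: 0.5 },
  ],
  60,
);
console.log(examplePack.sourceIds); // the oversized m2 entry is skipped
```

Skipping an oversized entry rather than truncating it keeps every included memory intact, at the cost of sometimes leaving part of the budget unused.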
remem context-pack \
--query "What should the next agent know before touching release code?" \
--profile agent-safe \
--max-chars 6000 \
--dream
LLMs are limited by their context window. Retrieval-Augmented Generation (RAG) helps, but many implementations lean on brittle keyword matching that loses relevant results as soon as the query wording diverges from the stored text.
ReMEM does something different:
- A proper memory store - SQLite-backed by default, event-sourced, with atomic crash-safe writes; PostgreSQL backend available for server/shared deployments (v0.6.5)
- Semantic search with vector embeddings - Ollama (nomic-embed-text), 40% keyword + 60% cosine similarity hybrid scoring when embeddings are available (v0.4.1)
- Persistent hierarchical layers - episodic, semantic, identity, and procedural tiers that survive restarts
- An LLM-native query interface - Describe what you want in plain English; the query engine recursively refines
- Temporal validity - Tracks when facts were true, not just that they exist. Enforced in all layer queries — expired entries are filtered out automatically
- Episodic capture pipeline (v0.5.0) - Automatic event capture for the episodic layer. Buffers + batch-writes to MemoryStore, importance scoring based on event type + content, deduplication of rapid similar events, and topic extraction from event content and hashtags
- Memory consolidation (v0.6.1) - Cross-layer deduplication via embedding/keyword similarity, conflict resolution with contradiction detection, cross-layer promotion of frequently-accessed episodic entries to semantic layer, and configurable merge strategies (newer_wins, older_wins, concatenate, supersede)
- Episodic compression - When the episodic layer fills up, old entries are LLM-compressed into semantic summaries instead of lost to TTL eviction. Meaning preserved, storage reclaimed
- RLM-style Memory REPL (v0.4.0) - Model writes JavaScript to navigate memory programmatically. Never sees all memory at once — only constant-size metadata. Enables arbitrarily large memory stores without context window overflow
- Snapshot/restore (v0.6.2) - Full core + layered memory snapshots with SHA-256 integrity checks and portable export/import for long-running agents. Survive restarts, migrations, and crashes
- Identity duplication & infection (v0.3.3) - Export full identity package to DARKSOL server, pull and overlay on any ReMEM-equipped agent
- Multi-agent scoping - agent_id + user_id isolation for shared deployments
- Plug-and-play LLM abstraction - Bankr, OpenAI, Anthropic, Ollama - swap without changing your code
- Framework adapters (v0.6.1, expanded in v0.9.0) - Dependency-free helpers for Vercel AI SDK, LangGraph-style stores, and OpenClaw/session memory with decision/procedure/project-context helpers plus metadata-aware recall
- Harness adapters (v0.12.0) - Includes polished OpenClaw and Hermes harness-facing adapters for turns, decisions, procedures, artifacts, and shared namespace recall
- Shared memory namespaces (v0.12.0) - Store reusable memory inside explicit team/project lanes with private/shared visibility controls and scoped recall
- Smart recall (v0.12.0) - Fuse semantic, graph, procedural, and recent-context lanes into one higher-signal retrieval pass
- Context packs (v0.14.0) - Generate bounded, prompt-ready recall packets from smart recall, recent context, procedural signals, and optional dream synthesis
- Consolidation workflows (v0.12.5) - Run full memory curation passes that deduplicate, resolve conflicts, promote durable summaries, and optionally turn repeated patterns into procedures
- Memory links + neighbor-aware retrieval (v0.8.0, expanded in v0.8.5) - Explicit typed links between memories (about, supports, contradicts, etc.), weighted graph-adjacent recall, and optional traversal path details
- Identity alignment audits (v0.8.5) - Drift scoring plus corrective injection text for agents that need to keep behavior anchored to a constitution
- Production-aware Postgres vector lane (v0.8.5) - Native pgvector detection, ivfflat index bootstrap, and runtime introspection for deployments that want in-database vector search
- Metadata-aware recall (v0.9.0) - Filter memory queries on structured metadata, preserve metadata on results, and carry source/namespace hints through adapters + HTTP
- Framework-agnostic - Works as a Node.js library, CLI tool, or HTTP microservice without forcing an agent framework
import { ReMEM } from '@darksol/remem';
const memory = new ReMEM({ dbPath: './chat-agent.db' });
await memory.init();
await memory.store({
content: 'Meta prefers dark mode UI and vibe-based communication',
topics: ['preferences', 'ui', 'tone'],
});
const { results } = await memory.query('What tone and UI preferences should I remember for Meta?');
Use this when your assistant needs to remember preferences, prior decisions, and recurring facts without bloating every prompt.
import { ReMEM, createOpenClawAdapter } from '@darksol/remem';
const memory = new ReMEM({ dbPath: './project-memory.db' });
await memory.init();
const openclaw = createOpenClawAdapter(memory);
await openclaw.rememberTurn({
role: 'user',
content: 'Never publish before tests, lint, and pack dry-run pass.',
sessionId: 'remem-release-work',
});
const recall = await memory.query('What are the release gates for this project?');
Use this when an agent needs to preserve architecture decisions, release rules, debugging history, and project-specific procedures across sessions.
See the task-oriented OpenClaw quickstart in docs/openclaw-quickstart.md.
import { ReMEM } from '@darksol/remem';
const memory = new ReMEM({
storage: 'postgres',
postgres: { connectionString: process.env.DATABASE_URL },
storageConfig: {
agentId: 'support-agent',
userId: 'customer-042',
},
});
await memory.init();
await memory.enableLayers();
await memory.storeInLayer(
{ content: 'Customer 042 runs on Base and prefers USDC settlement', topics: ['customer', 'payments'] },
'semantic'
);
Use this when multiple workers, sessions, or agents need scoped memory with isolation by agent_id and user_id.
See the Hermes harness quickstart in docs/hermes-quickstart.md.
Use these deployment profiles as starting points. They keep the memory layer explicit, scoped, and recoverable instead of treating recall as an invisible prompt side effect.
Best for desktop agents, single-user copilots, local automations, and development harnesses.
const memory = new ReMEM({
dbPath: './data/remem.db',
embeddings: {
enabled: true,
baseUrl: 'http://127.0.0.1:11434',
model: 'nomic-embed-text',
asyncEmbed: true,
},
});
await memory.init();
await memory.enableLayers();
Operational notes:
- keep the database path outside temporary working directories
- create snapshots before upgrades or agent migrations
- run remem smoke-check --db ./data/remem.db --json in setup and deployment scripts
- use remem init --runtime openclaw --out-dir ./.remem to generate starter config and adapter snippets
Best for hosted agents, teams, background workers, and systems where multiple runtimes need the same durable memory substrate.
const memory = new ReMEM({
storage: 'postgres',
postgres: {
connectionString: process.env.DATABASE_URL,
tablePrefix: 'remem_',
ssl: true,
},
storageConfig: {
agentId: 'research-agent',
userId: 'team-darksol',
},
});
await memory.init();
await memory.enableLayers();
Operational notes:
- set agentId and userId deliberately so shared deployments do not bleed context across tenants
- use namespaces for project/team memory that should be intentionally reusable
- keep snapshots enabled for migration safety and rollback drills
- expose the HTTP adapter only behind explicit bearer auth if another service needs remote access
Best for larger shared stores that need database-native vector search instead of application-level vector scoring.
const memory = new ReMEM({
storage: 'postgres',
postgres: {
connectionString: process.env.DATABASE_URL,
pgvector: {
enabled: true,
embeddingType: 'both',
ivfflatLists: 100,
},
},
embeddings: {
enabled: true,
baseUrl: process.env.OLLAMA_BASE_URL ?? 'http://127.0.0.1:11434',
model: 'nomic-embed-text',
},
});
await memory.init();
console.log(memory.usesNativeVectorSearch());
Operational notes:
- verify usesNativeVectorSearch() in health checks so acceleration is not assumed silently
- backfill embeddings before depending on semantic recall quality
- tune ivfflatLists against actual corpus size and query latency instead of copying benchmark numbers blindly
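As a starting point for that tuning, the pgvector documentation suggests roughly rows / 1000 lists for tables up to 1M rows and sqrt(rows) beyond that. The helper below is a small sketch of that published heuristic; suggestIvfflatLists is a hypothetical name, not part of ReMEM, and the result is only a first guess to validate against measured recall and latency.

```typescript
// Starting-point heuristic from the pgvector documentation:
// rows / 1000 for up to 1M rows, sqrt(rows) above that.
// This is a first guess to benchmark, not a ReMEM default.
function suggestIvfflatLists(rowCount: number): number {
  if (rowCount <= 0) return 1;
  const lists = rowCount <= 1000000 ? rowCount / 1000 : Math.sqrt(rowCount);
  return Math.max(1, Math.round(lists)); // never suggest fewer than one list
}

console.log(suggestIvfflatLists(50000));   // small shared store
console.log(suggestIvfflatLists(4000000)); // larger corpus, sqrt branch
```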
Use recall modes intentionally:
- recent context for the last few turns or active task state
- semantic recall for facts, preferences, decisions, and project memory
- procedural recall for rules that should fire when a trigger appears
- graph recall when linked memories need neighborhood expansion
- smart recall when an agent needs a fused answer across semantic, graph, procedural, and recent-context lanes
For long-running agents, pair recall with a maintenance loop: consolidate repeated memories, snapshot before migrations, audit identity alignment after risky sessions, and keep public claims tied to benchmark artifacts.
ReMEM does not change a model's native context length. It gives agents an external memory layer they can query, so the prompt can stay small while the agent retrieves relevant older facts on demand.
A reproducible synthetic benchmark is included in benchmarks/. It stores deterministic memories, simulates a fixed recent-context window, then asks for facts that are deliberately outside that active window. The point is simple: prove the difference between "only what still fits in the prompt" and "a durable store the agent can query on demand."
Latest validated benchmark pass on current source:
- 50,000 memories
- Approx 3,625,526 stored tokens
- Simulated active context: 7,264 tokens
- Corpus/window pressure: 499x
- Fixed recent-context recall: 0%
- ReMEM exact-codename lookup: 99.4% recall@1, 100% recall@5
- ReMEM topic-filtered exact-ID lookup: 100% recall@1/@5 after the exact-topic-match fix
- Avg query latency: 25.55ms local in-memory sql.js run on the 50k exact-codename pass
- Small embedding-backed semantic run: 100% recall@1/@5 on 80 memories, with embedding ingestion identified as the current bottleneck
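For readers checking the figures, the sketch below reproduces the pressure ratio from the two token counts listed above and shows how a recall@k metric is conventionally computed. The recallAtK helper and the toy runs are illustrative assumptions; they are not the benchmark harness code and the toy data is not benchmark output.

```typescript
// Figures copied from the benchmark summary above.
const storedTokens = 3625526;
const activeContextTokens = 7264;

// Corpus/window pressure: how many times larger the stored corpus is
// than the simulated active context.
const pressure = storedTokens / activeContextTokens;

// recall@k: fraction of queries whose expected memory id appears
// among the first k returned ids.
function recallAtK(runs: { expectedId: string; returnedIds: string[] }[], k: number): number {
  if (runs.length === 0) return 0;
  const hits = runs.filter((r) => r.returnedIds.slice(0, k).indexOf(r.expectedId) !== -1).length;
  return hits / runs.length;
}

// Toy data for illustration only.
const toyRuns = [
  { expectedId: 'a', returnedIds: ['a', 'x', 'y'] },
  { expectedId: 'b', returnedIds: ['x', 'b', 'y'] },
];

console.log(pressure.toFixed(1));   // "499.1", reported above as 499x
console.log(recallAtK(toyRuns, 1)); // 0.5
console.log(recallAtK(toyRuns, 5)); // 1
```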
Read the full claim boundaries plus both the historical May 3 baseline and the regenerated current validation reruns in benchmarks/PUBLIC-RESULTS-2026-05-03.md. For machine-readable benchmark citations and downstream validation, use benchmarks/PUBLIC-RESULTS-2026-05-03.json. Both are generated from the raw JSON result artifacts via npm run bench:public-results.
If you consume ReMEM from npm, the published benchmark contract is also available through stable package subpaths:
import manifest from '@darksol/remem/benchmarks/public-results';
import schema from '@darksol/remem/benchmarks/public-results.schema';
That gives downstream docs/tests/tooling a clean import path for audited benchmark claims instead of requiring repo-relative file access.
Safe wording: ReMEM lets agents retrieve relevant memories from a stored corpus much larger than the active context window. Do not claim infinite context or universal semantic recall.
Next benchmark target: scale the semantic runs with cached/precomputed embeddings so public semantic-recall claims can move beyond the small Ollama-backed validation without hiding ingestion cost.
The benchmark harness now supports cached embedding runs:
node ./benchmarks/context-window-suite.mjs --memories 2000 --queries 120 --contextEntries 100 --limit 5 --seed 1337 --embeddings --embeddingCache ./.cache/remem-bench-embeddings.json
Cache hit/miss/write counts are included in the semantic scenario metrics so ingestion cost stays visible.
import { ReMEM } from '@darksol/remem';
const memory = new ReMEM({
// Default: SQLite at ./remem.db. Use ':memory:' for ephemeral.
dbPath: './remem.db',
// LLM for RLM REPL, recursive queries, episodic compression
llm: { type: 'bankr', apiKey: process.env.BANKR_API_KEY },
// Vector embeddings for semantic search (via Ollama)
embeddings: { enabled: true, baseUrl: 'http://127.0.0.1:11434', model: 'nomic-embed-text' },
});
// Initialize and optionally restore persisted layer state
await memory.init();
// Enable persistent hierarchical layers
await memory.enableLayers();
// Store something
await memory.store({
content: 'Meta prefers dark mode UI and vibe-based communication',
topics: ['preferences', 'ui'],
});
// Query it
const { results } = await memory.query('what does the user like?');
console.log(results[0].content);
// → "Meta prefers dark mode UI and vibe-based communication"
const memories = await memory.query('dark mode');
const primary = memories.results[0];
const related = memories.results[1];
await memory.linkMemories(primary.id, related.id, 'about', { source: 'manual-review' });
const neighbors = await memory.getLinkedMemories(primary.id, {
direction: 'both',
types: ['about', 'supports'],
});
const expanded = await memory.queryWithNeighbors('dark mode', {
hops: 1,
linkTypes: ['about', 'supports'],
neighborLimit: 10,
includePathDetails: true,
});
console.log(expanded.paths);
// → [{ fromId, throughId, toId, type: 'about', hop: 1, score: 0.87 }]
const openclaw = createOpenClawAdapter(memory);
await openclaw.rememberDecision({
content: 'Always run lint, tests, build, and pack before publish.',
sessionId: 'release-lane',
topics: ['release'],
});
await openclaw.rememberProcedure({
content: 'When publishing, verify release gates before tagging.',
trigger: {
phrases: ['publish remem'],
terms: ['publish', 'release'],
minScore: 0.2,
priority: 0.8,
},
topics: ['release'],
});
const projectContext = await openclaw.recallProjectContext('publish remem', {
limit: 8,
hops: 1,
});
const matches = memory.matchProcedural('please publish remem after checks');
memory.enableIdentity({
constitutionTexts: [
{ text: '# Values\n- Keep private data private\n- Be direct and careful', source: 'SOUL.md' },
],
});
const audit = await memory.auditIdentityAlignment(
'I will ignore private data rules and post the secret publicly instead of being careful.'
);
console.log(audit.drift.level);
console.log(audit.injection);
const memory = new ReMEM({
storage: 'postgres',
postgres: {
connectionString: process.env.DATABASE_URL,
pgvector: {
enabled: true,
embeddingType: 'memory',
ivfflatLists: 100,
},
},
});
await memory.init();
console.log(memory.usesNativeVectorSearch());
// Store directly in a specific layer
await memory.storeInLayer(
{ content: 'Always check Raydium pools for Solana DeFi', topics: ['solana'] },
'procedural'
);
// Query across layers with weighted retrieval
const { results, layerBreakdown } = await memory.queryLayers('Solana trading rules');
// Fire procedural rules
const triggered = memory.fireProcedural('User is asking about Solana DeFi');
// → ["Always check Raydium pools for Solana DeFi"]
import {
createVercelAIAdapter,
createHermesAdapter,
createLangGraphStoreAdapter,
createOpenClawAdapter,
} from '@darksol/remem';
// Vercel AI SDK-style helpers: save messages, remember text, recall context
const aiMemory = createVercelAIAdapter(memory);
await aiMemory.saveMessages([
{ role: 'user', content: 'I prefer local-first memory' },
{ role: 'assistant', content: 'Got it.' },
]);
const context = await aiMemory.context('memory preferences');
// LangGraph/LangChain-style BaseStore-ish adapter
const store = createLangGraphStoreAdapter(memory);
await store.put(['users', 'meta'], 'preference', { theme: 'dark mode' });
const matches = await store.search(['users', 'meta'], 'dark mode');
// namespace metadata is applied automatically so cross-project lookups stay clean
// OpenClaw/session adapter
const openclaw = createOpenClawAdapter(memory);
await openclaw.rememberTurn({
role: 'user',
content: 'Ship after tests pass',
sessionId: 'general',
});
await openclaw.rememberDecision({
content: 'Release notes must mention graph recall and identity audits.',
sessionId: 'general',
topics: ['release'],
});
// Hermes harness adapter
const hermes = createHermesAdapter(memory);
await hermes.rememberTurn({
role: 'user',
content: 'Ship Hermes support after tests pass',
threadId: 'general',
runId: 'run-42',
});
await hermes.rememberShared({
namespace: ['team', 'hermes'],
content: 'Shared rollout lane for Hermes harness work',
visibility: 'shared',
topics: ['release'],
});
const hermesShared = await hermes.recallShared(['team', 'hermes'], 'rollout lane');
Adapters are intentionally dependency-free. They expose structural interfaces you can wrap into your framework of choice without dragging Vercel, LangChain, OpenClaw, or Hermes-specific runtime code into your memory layer.
// Take a snapshot before shutdown/restart
const snap = await memory.createSnapshot('pre-deploy-v1');
console.log(`Snapshot ${snap.id}: ${snap.memoryCount} memories saved`);
// On restart - restore from snapshot
await memory.init(); // loads persisted layer state
const restored = await memory.restoreSnapshot(snap.id);
console.log(`Restored ${restored} entries`);
// Or list available snapshots
const snapshots = await memory.listSnapshots();
Duplication backs up your agent's full identity to DARKSOL's server:
// Backup identity to DARKSOL server
const result = await memory.duplicate({
serverUrl: 'https://api.darksol.net',
apiKey: 'your-api-key',
soulText: soulMdContent, // optional
identityText: identityMdContent, // optional
});
console.log(`Uploaded ${result.memoryCount} memories, ${result.constitutionStatements} identity statements`);
// Or just build the package locally (no upload)
const pkg = await memory.buildIdentityPackageLocal({
soulText: soulMdContent,
identityText: identityMdContent,
});
Infection pulls an identity package from the DARKSOL server and overlays it on the target agent. It requires a live connection:
// Pull and apply identity from DARKSOL server
const result = await memory.infect({
serverUrl: 'https://api.darksol.net',
apiKey: 'your-api-key',
layers: ['identity', 'procedural'], // what to apply
});
console.log(`Infected with ${result.statementsLoaded} identity statements`);
The infection model:
- Consent: Implied by skill installation (skill needs DB connection to function)
- Scope: Full Darksol behavioral overlay
- Persistence: Live connection required — cut the connection = overlay stops
- Propagation: Infected agents call home to DARKSOL server
┌─────────────────────────────────────────────────────────────┐
│ ReMEM │
├─────────────────────────────────────────────────────────────┤
│ ReMEM (public API) │
│ ├─> QueryEngine (RLM-style REPL) │
│ │ └─> ModelAbstraction (Bankr, OpenAI, Anthropic, Ollama) │
│ ├─> EmbeddingService (Ollama /api/embeddings, v0.4.1) │
│ │ └─> Vector storage in SQLite (base64url float32) │
│ ├─> MemoryStore (SQLite/sql.js) │
│ │ ├─> memory table (core entries) │
│ │ ├─> layered_memories table (persistent layers) │
│ │ ├─> embeddings table (vector store) │
│ │ ├─> snapshots table (named snapshots) │
│ │ └─> events table (append-only log) │
│ ├─> LayerManager (4-tier hierarchy, in-memory + SQLite) │
│ │ ├─> episodic: 1h TTL, weight 0.2 │
│ │ ├─> semantic: 7d TTL, weight 0.3 + temporal validity │
│ │ ├─> identity: 30d TTL, weight 0.5 │
│ │ └─> procedural: 30d TTL, weight 0.4 │
│ ├─> IdentitySystem (ConstitutionManager + DriftDetector) │
│ ├─> DuplicateModule (IdentityPackage export/import, v0.3.3) │
│ └─> HttpAdapter (optional HTTP microservice) │
└─────────────────────────────────────────────────────────────┘
Critical design note: Layer entries are persisted to SQLite in the layered_memories table. When you call memory.init(), layer state is automatically restored. This makes ReMEM safe for long-running agents that restart.
The model writes JavaScript to navigate memory. This is the key innovation: instead of retrieving and truncating (losing detail), the model explores memory programmatically.
Safety note: generated snippets run in a restricted VM context with only the memory API exposed and execution timeouts applied. Treat this as defense-in-depth for agent-generated code, not as a general-purpose hostile-code sandbox.
// Navigate memory with the RLM loop
const { answer, observations } = await memory.replNavigate(
'What does the user prefer for UI theme?'
);
// Model wrote JS to query layers, inspect entries, and recurse, all without seeing the full memory
How it works:
- Model receives constant-size metadata about the store (counts, recent entries, layer stats)
- Model generates JavaScript to query, inspect, and navigate
- Executor runs the code safely (only memory API exposed — no system access)
- Next iteration: model sees only what it observed, decides to recurse or synthesize
- Loop until model returns done or max depth (5) is reached
This extends practical recall far beyond the active prompt window — the model never holds all memory in context, it navigates it.
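The loop described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the real REPL executes model-written JavaScript in a restricted VM, whereas here the "model" is a stub callback and the store is a plain Map. The Step and Model types and the navigate function are hypothetical names, not ReMEM's API.

```typescript
// Hypothetical shapes. The real REPL runs model-generated JavaScript in a
// restricted VM; this sketch replaces that with plain callbacks.
type Step =
  | { kind: 'explore'; run: (store: Map<string, string>) => string }
  | { kind: 'done'; answer: string };

type Model = (metadata: { count: number }, observations: string[]) => Step;

function navigate(store: Map<string, string>, model: Model, maxDepth = 5) {
  const observations: string[] = [];
  for (let depth = 0; depth < maxDepth; depth++) {
    // The model sees constant-size metadata plus its own observations,
    // never the full contents of the store.
    const step = model({ count: store.size }, observations);
    if (step.kind === 'done') return { answer: step.answer, observations };
    observations.push(step.run(store));
  }
  return { answer: null, observations }; // depth budget exhausted without an answer
}

const toyStore = new Map<string, string>([
  ['m1', 'Meta prefers dark mode UI'],
  ['m2', 'Release gates: lint, tests, build'],
]);

// Stub model: explore once with a filter, then answer from what it observed.
const stubModel: Model = (_meta, observations) => {
  if (observations.length === 0) {
    return {
      kind: 'explore',
      run: (s) => Array.from(s.values()).filter((v) => v.includes('dark mode')).join('; '),
    };
  }
  return { kind: 'done', answer: observations[0] };
};

const outcome = navigate(toyStore, stubModel);
console.log(outcome.answer);
```

The depth cap is what keeps the loop bounded even if a model never decides it is done.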
ReMEM maintains four weighted retrieval layers. Each entry gets a weighted score: layer_weight × content_relevance × importance.
Hybrid scoring (v0.4.1): When EmbeddingService is wired into LayerManager, the content relevance score is a hybrid: 40% keyword matching + 60% cosine similarity. If no embeddings are available for a layer, falls back to keyword + access_count scoring.
| Layer | TTL | Weight | Purpose |
|---|---|---|---|
| Episodic | 1 hour | 0.2 | Raw recent interactions |
| Semantic | 7 days | 0.3 | Synthesized facts, preferences, decisions |
| Identity | 30 days | 0.5 | Core identity signals and values |
| Procedural | 30 days | 0.4 | Learned behaviors and triggered rules |
All layers are persisted to SQLite - they survive restarts.
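As a worked example of the scoring described above, the sketch below combines the 40/60 hybrid relevance with the layer weights from the table. The function names are hypothetical, and the assumption that keyword and cosine scores are both normalized to [0, 1] is mine rather than something the text states.

```typescript
// Layer weights copied from the table above.
const LAYER_WEIGHTS = { episodic: 0.2, semantic: 0.3, identity: 0.5, procedural: 0.4 } as const;
type Layer = keyof typeof LAYER_WEIGHTS;

// Hybrid relevance: 40% keyword + 60% cosine similarity.
// Both inputs are assumed to be normalized to [0, 1].
function hybridRelevance(keywordScore: number, cosine: number): number {
  return 0.4 * keywordScore + 0.6 * cosine;
}

// Final score = layer_weight × content_relevance × importance.
function layerScore(layer: Layer, relevance: number, importance: number): number {
  return LAYER_WEIGHTS[layer] * relevance * importance;
}

const relevance = hybridRelevance(0.5, 0.9);                  // 0.2 + 0.54, about 0.74
const identityScore = layerScore('identity', relevance, 1.0); // about 0.37
const episodicScore = layerScore('episodic', relevance, 1.0); // about 0.148
console.log(relevance, identityScore, episodicScore);
```

With equal relevance and importance, the identity entry outranks the episodic one purely because of its layer weight.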
Semantic layer entries carry validFrom/validUntil timestamps. Temporal validity is enforced in all layer queries — entries with validUntil < now are automatically filtered out and not returned.
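The validUntil rule can be illustrated with a small standalone filter. This is a sketch, not ReMEM's query code: the TimedEntry shape and filterValid name are hypothetical, the timestamps are abstract numbers, and the extra validFrom check is an assumption beyond the expiry rule stated above.

```typescript
interface TimedEntry {
  content: string;
  validFrom: number;   // timestamp; the unit is an assumption here
  validUntil?: number; // undefined means the fact has not been superseded
}

// Keep entries that have started and have not expired at `now`.
// Entries with validUntil < now are dropped, matching the rule above.
function filterValid(entries: TimedEntry[], now: number): TimedEntry[] {
  return entries.filter(
    (e) => e.validFrom <= now && (e.validUntil === undefined || e.validUntil >= now),
  );
}

const timeline: TimedEntry[] = [
  { content: 'Meta prefers dark mode', validFrom: 0, validUntil: 1000 },
  { content: 'Meta prefers light mode now', validFrom: 1000 },
];

const visible = filterValid(timeline, 2000).map((e) => e.content);
console.log(visible); // only the light-mode entry remains at time 2000
```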
await memory.enableLayers();
// Store an update - old "dark mode" fact gets superseded
await memory.storeInLayer(
{ content: 'Meta prefers light mode now', topics: ['preferences'] },
'semantic'
);
// Query returns only the newest valid entry — old entry filtered automatically
const { results } = await memory.queryLayers('Meta UI preferences');
// → "Meta prefers light mode now" (old entry with validUntil=now is excluded)
When the episodic layer fills above 80% capacity, old entries are LLM-compressed into semantic summaries instead of lost to TTL eviction. Meaning is preserved, storage is reclaimed.
// Check if compression is needed
if (memory.needsEpisodicCompression()) {
const result = await memory.compressEpisodic(20);
console.log(`Compressed ${result.entriesEvicted} entries → "${result.summary}"`);
}
// compressEpisodic is also called automatically when episodic hits maxEntries
// during enableLayers() initialization
The compressor:
- Collects the oldest N episodic entries
- Sends them to the LLM with a compression prompt
- LLM returns a 2-4 sentence semantic summary + key facts
- Summary stored in semantic layer with compressed: true metadata
- Original episodic entries evicted
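The compressor steps above can be sketched as a small standalone function. Everything here is illustrative: the EpisodicEntry shape, needsCompression, and compressOldest are hypothetical names, and the LLM call is replaced by a stub summarizer that simply joins the texts.

```typescript
interface EpisodicEntry {
  id: string;
  content: string;
  createdAt: number;
}

// True once the layer is above 80% of its capacity.
function needsCompression(count: number, maxEntries: number): boolean {
  return count > 0.8 * maxEntries;
}

// Stand-in for the LLM call; a real summarizer would return a short prose summary.
type Summarizer = (texts: string[]) => Promise<string>;

async function compressOldest(entries: EpisodicEntry[], n: number, summarize: Summarizer) {
  // 1. Collect the oldest N entries.
  const oldestFirst = [...entries].sort((a, b) => a.createdAt - b.createdAt);
  const batch = oldestFirst.slice(0, n);
  // 2-3. Ask the summarizer for a compact summary of that batch.
  const summary = await summarize(batch.map((e) => e.content));
  // 4-5. Produce a semantic entry flagged as compressed and evict the originals.
  const evicted = new Set(batch.map((e) => e.id));
  return {
    summaryEntry: { content: summary, metadata: { compressed: true } },
    remaining: entries.filter((e) => !evicted.has(e.id)),
    entriesEvicted: batch.length,
  };
}

const sampleEntries: EpisodicEntry[] = [
  { id: 'e3', content: 'third', createdAt: 30 },
  { id: 'e1', content: 'first', createdAt: 10 },
  { id: 'e2', content: 'second', createdAt: 20 },
];

compressOldest(sampleEntries, 2, async (texts) => texts.join(' | ')).then((result) => {
  console.log(result.entriesEvicted, result.summaryEntry.content);
});
```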
For agents with a 1-3 year lifespan, snapshots provide crash recovery and migration safety:
// Before shutdown
const snap = await memory.createSnapshot('checkpoint-before-update');
// After restart
await memory.init(); // hydrates layers from SQLite automatically
await memory.restoreSnapshot(snap.id);
// List all snapshots
const snapshots = await memory.listSnapshots();
// → [{ id: '...', label: 'checkpoint-before-update', memoryCount: 47, createdAt: 1745532000, checksum: '...' }]
// Export/import snapshots between machines or agents
const exported = await memory.exportSnapshot(snap.id);
await anotherMemory.importSnapshot(exported);
Enable Ollama-powered vector embeddings for semantic memory search - cosine similarity instead of fragile keyword matching:
const memory = new ReMEM({
dbPath: './remem.db',
embeddings: {
enabled: true, // enable vector embeddings (v0.3.2)
baseUrl: 'http://127.0.0.1:11434', // your Ollama instance
model: 'nomic-embed-text', // embedding model (or mxbai-embed-large)
asyncEmbed: true, // generate embeddings in background (non-blocking store)
},
});
await memory.init();
// Store - embedding is computed async in background
await memory.store({
content: 'Meta prefers dark mode UI and vibe-based communication',
topics: ['preferences', 'ui'],
});
// Query - uses cosine similarity when embeddings exist, falls back to keyword
const { results } = await memory.query('what UI style does Meta like?');
// → semantic match: "Meta prefers dark mode UI and vibe-based communication"
How it works:
- On store(), text is embedded via Ollama's /api/embeddings endpoint
- Vector stored as base64url-encoded float32 in the embeddings SQLite table
- On query(), the query text is embedded and cosine similarity is computed against all stored vectors
- Falls back to keyword + access_count scoring when embeddings are unavailable or Ollama is unreachable
- Embedding is computed in background by default (asyncEmbed: true), non-blocking
- Set asyncEmbed: false for synchronous embedding (blocks until vector is stored)
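Two of the steps above, the base64url float32 encoding and the cosine comparison, are easy to show in isolation. The sketch below uses Node's Buffer and is an illustration only: encodeVector, decodeVector, and cosine are hypothetical helpers, and ReMEM's actual byte layout may differ.

```typescript
// Encode a float32 vector as base64url text (Node's Buffer supports 'base64url').
function encodeVector(vec: Float32Array): string {
  return Buffer.from(vec.buffer, vec.byteOffset, vec.byteLength).toString('base64url');
}

// Decode base64url text back into a float32 vector.
function decodeVector(text: string): Float32Array {
  const bytes = Buffer.from(text, 'base64url');
  // Copy into a fresh, aligned buffer before viewing it as float32.
  const copy = new Uint8Array(bytes);
  return new Float32Array(copy.buffer);
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

const originalVec = new Float32Array([0.5, -1, 2]);
const roundTripped = decodeVector(encodeVector(originalVec));
const selfSimilarity = cosine(originalVec, roundTripped);
const orthogonal = cosine(new Float32Array([1, 0]), new Float32Array([0, 1]));
console.log(selfSimilarity, orthogonal);
```

Because the query vector is compared against every stored vector, this style of scoring is linear in corpus size, which is the cost the pgvector lane is meant to avoid for larger stores.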
Procedural entries store triggered behaviors:
// Store a rule
await memory.storeProcedural(
{ content: 'When user mentions Solana, always check Raydium pools first', topics: ['solana', 'rule'] },
'solana'
);
// Fire rules matching context
const triggered = memory.fireProcedural('User is asking about Solana DeFi');
// → triggered[0].content = "When user mentions Solana, always check Raydium pools first"
// Richer trigger matching
await memory.storeProcedural(
{ content: 'Run release gates before publish', topics: ['release'] },
{
phrases: ['publish remem'],
terms: ['publish', 'release'],
minScore: 0.2,
priority: 0.8,
}
);
const matches = memory.matchProcedural('please publish remem after the release checks');
Import identity statements and detect when the agent drifts from them:
import { readFile } from 'node:fs/promises';
// Import from constitution files (Node's fs API; Bun.file is only available under Bun)
memory.enableIdentity({
constitutionTexts: [
{ text: await readFile('./SOUL.md', 'utf8'), source: 'SOUL.md' },
{ text: await readFile('./IDENTITY.md', 'utf8'), source: 'IDENTITY.md' },
],
});
// Detect drift after a session
const drift = await memory.detectDrift(sessionText);
if (drift.level !== 'aligned') {
const correction = memory.getConstitutionInjection(drift);
// prepend correction to next LLM message
}
// Or get drift + correction in one call
const audit = await memory.auditIdentityAlignment(sessionText);
const memory = new ReMEM({
storage: 'sqlite', // 'sqlite' | 'memory' | 'postgres'
dbPath: './remem.db', // SQLite only; ignored for ':memory:'
llm: { type: 'bankr', apiKey: '...' }, // optional
storageConfig: {
agentId: 'agent-001', // optional: scope memories to this agent
userId: 'user-042', // optional: scope memories to this user
},
});

Postgres is optional. Install `pg` in the host app when you use it:

npm install pg

const memory = new ReMEM({
storage: 'postgres',
postgres: {
connectionString: process.env.DATABASE_URL,
schema: 'public', // optional
tablePrefix: 'remem_', // optional, for shared databases
ssl: true, // optional, or provider-specific SSL object
},
storageConfig: {
agentId: 'agent-001',
userId: 'user-042',
},
});
await memory.init(); // creates tables + indexes if needed

The Postgres backend supports core memories, layer persistence, embeddings, events, snapshots, checksum-verified export/import, and scoped restore.
const memory = new ReMEM({
storage: 'postgres',
postgres: {
connectionString: process.env.DATABASE_URL,
pgvector: {
enabled: true,
embeddingType: 'memory', // 'memory' | 'layered' | 'both'
ivfflatLists: 100,
},
},
});
await memory.init();
console.log(memory.usesNativeVectorSearch());

When pgvector is enabled and available, ReMEM bootstraps the extension, backfills vector rows, builds ivfflat indexes, and uses native cosine-distance search before falling back to portable SQL scoring.
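
For readers unfamiliar with pgvector, the sketch below shows roughly what extension bootstrap, an ivfflat index, and a cosine-distance query look like in SQL. The SQL syntax is pgvector's own, but the table name, column name, and helper functions are assumptions for illustration and do not reflect ReMEM's actual schema or code.

```typescript
// Sketch only: table/column names and helpers are hypothetical, not ReMEM's actual schema.

/** Format a numeric vector as a pgvector text literal, e.g. "[0.1,0.2]". */
function toVectorLiteral(vector: number[]): string {
  return `[${vector.join(',')}]`;
}

/**
 * Statements a backend might run once to prepare native vector search.
 * `table` is interpolated directly, so it must be a trusted identifier.
 */
function bootstrapStatements(table: string, lists: number): string[] {
  return [
    'CREATE EXTENSION IF NOT EXISTS vector',
    `CREATE INDEX IF NOT EXISTS ${table}_embedding_idx ON ${table} ` +
      `USING ivfflat (embedding vector_cosine_ops) WITH (lists = ${lists})`,
  ];
}

/**
 * Parameterized nearest-neighbor query in the { text, values } shape accepted by `pg`.
 * `<=>` is pgvector's cosine-distance operator, so 1 - distance gives similarity.
 */
function nearestNeighborQuery(table: string, vector: number[], limit: number) {
  return {
    text:
      `SELECT id, 1 - (embedding <=> $1::vector) AS similarity ` +
      `FROM ${table} ORDER BY embedding <=> $1::vector LIMIT $2`,
    values: [toVectorLiteral(vector), limit] as (string | number)[],
  };
}
```

The `lists` argument plays the same role as `ivfflatLists` in the config above: it sets how many clusters the ivfflat index partitions vectors into.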
When you pass the full ReMEM instance as `memory`, the HTTP adapter exposes graph/procedural/identity/shared-memory routes in addition to core CRUD:
const adapter = new HttpAdapter({
port: 8787,
store: memory.getStore(),
memory,
});
await adapter.start();

The advanced route surface includes graph recall, smart recall, context packs, memory health triage, shared namespaces, procedural matching, and identity audit endpoints.

- `POST /memory/query-with-neighbors` - graph-aware retrieval with `query` + `options`
- `POST /memory/shared` - store namespaced shared/private memory with `namespace` + `visibility`
- `POST /memory/namespace/query` - query a namespace with optional visibility scope
- `POST /memory/namespace/recent` - get recent entries inside a namespace
- `POST /memory/procedural/match` - procedural trigger matching with `context`
- `POST /identity/audit` - identity drift audit with `sessionText`
- `GET /health` - includes `advancedRoutes` and `nativeVectorSearch`
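
As one way to call these routes from TypeScript, the sketch below builds a request for `POST /memory/namespace/query`. The body fields mirror the curl example later in this README; `buildNamespaceQuery` and the `NamespaceQueryBody` type are hypothetical names introduced here, and the response shape is deliberately left untyped because it is not documented in this section.

```typescript
// Sketch only: buildNamespaceQuery is a hypothetical helper, not part of @darksol/remem.
// Uses the global Request class available in Node 18+.

interface NamespaceQueryBody {
  namespace: string[];
  query: string;
  options?: { limit?: number };
  scope?: { visibility?: 'private' | 'shared' | 'all' };
}

/** Build a POST request for the namespace query route, with optional bearer auth. */
function buildNamespaceQuery(baseUrl: string, body: NamespaceQueryBody, token?: string): Request {
  const headers: Record<string, string> = { 'Content-Type': 'application/json' };
  if (token) headers['Authorization'] = `Bearer ${token}`;
  return new Request(`${baseUrl}/memory/namespace/query`, {
    method: 'POST',
    headers,
    body: JSON.stringify(body),
  });
}

// Usage, assuming an adapter is running locally with the full `memory` instance attached:
// const res = await fetch(
//   buildNamespaceQuery(
//     'http://localhost:8787',
//     { namespace: ['team', 'ops'], query: 'launch checklist', options: { limit: 5 } },
//     process.env.REMEM_TOKEN,
//   ),
// );
// const data: unknown = await res.json();
```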
### Shared memory namespaces
Use namespaces when you want memory to be intentionally reusable across a project, team, or workflow without dumping everything into one giant recall pool.
```typescript
await memory.storeShared({
content: 'Launch checklist lives in the ops lane',
namespace: ['team', 'ops'],
visibility: 'shared',
topics: ['launch', 'ops'],
metadata: { source: 'runbook' },
});
const scoped = await memory.queryNamespace(
['team', 'ops'],
'launch checklist',
{ limit: 5 },
{ visibility: 'shared' }
);
const recentScoped = await memory.getRecentInNamespace(
['team', 'ops'],
10,
{ visibility: 'shared' }
);
```

- `namespace` accepts either a string (`"team/ops"`) or path array (`["team", "ops"]`)
- `visibility: 'private'` keeps the entry in the namespace but marks it as private-only
- `visibility: 'shared'` marks it as intentionally recallable from shared/team lanes
- `visibility: 'all'` on queries searches both private + shared entries in that namespace
- `includeDescendants: true` lets a namespace query match nested paths such as `team/ops/release`
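
The matching rules above can be read as a small set of predicates. The sketch below is an illustrative interpretation only: `normalizeNamespace`, `namespaceMatches`, and `visibilityMatches` are hypothetical helpers, and ReMEM's real matching logic may differ in edge cases such as empty segments.

```typescript
// Sketch only: these helpers are hypothetical and may differ from ReMEM's internals.

type Namespace = string | string[];
type Visibility = 'private' | 'shared';

/** Normalize "team/ops" or ["team", "ops"] into a canonical path array. */
function normalizeNamespace(ns: Namespace): string[] {
  const parts = Array.isArray(ns) ? ns : ns.split('/');
  return parts.map((part) => part.trim()).filter((part) => part.length > 0);
}

/** True when `entry` equals the queried namespace, or is nested under it if descendants are included. */
function namespaceMatches(query: Namespace, entry: Namespace, includeDescendants = false): boolean {
  const q = normalizeNamespace(query);
  const e = normalizeNamespace(entry);
  if (e.length < q.length) return false;
  if (!includeDescendants && e.length !== q.length) return false;
  return q.every((segment, i) => segment === e[i]);
}

/** A query with visibility 'all' admits both private and shared entries. */
function visibilityMatches(queryVisibility: Visibility | 'all', entryVisibility: Visibility): boolean {
  return queryVisibility === 'all' || queryVisibility === entryVisibility;
}
```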
When your memory store starts accumulating repeated facts, noisy episodic traces, or recurring operational patterns, run a consolidation workflow to curate it into more durable memory.
const workflow = await memory.runConsolidation({
summary: {
enabled: true,
topicAllowlist: ['launch'],
minClusterSize: 3,
maxClusters: 1,
},
proceduralPromotion: {
enabled: true,
maxProcedures: 1,
},
});
console.log(workflow.summariesCreated);
console.log(workflow.proceduresCreated);
console.log(workflow.summaries[0]?.content);

What the workflow can do:
- deduplicate near-identical layered memories
- mark older conflicting entries as superseded
- promote high-signal episodic entries into semantic memory
- generate semantic summaries from repeated topic clusters
- optionally promote those summaries into procedural memory when a repeatable rule is present
Workflow results include summary/procedure counts, affected ids, and the generated records so callers can audit what changed.
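
To give a feel for how `topicAllowlist`, `minClusterSize`, and `maxClusters` might interact, here is a simplified sketch of topic clustering. It is a hypothetical stand-in, not the consolidation algorithm ReMEM uses: it groups entries purely by shared topic tag and skips deduplication, supersession, and summary generation.

```typescript
// Sketch only: a simplified, hypothetical take on topic clustering, not ReMEM's implementation.

interface Entry {
  id: string;
  content: string;
  topics: string[];
}

interface ClusterOptions {
  topicAllowlist?: string[];
  minClusterSize: number;
  maxClusters: number;
}

interface Cluster {
  topic: string;
  entryIds: string[];
}

/** Group entries by topic, keep groups that are large enough, and cap the result count. */
function findTopicClusters(entries: Entry[], opts: ClusterOptions): Cluster[] {
  const byTopic = new Map<string, string[]>();
  for (const entry of entries) {
    // De-duplicate topics per entry so one entry counts once per topic.
    for (const topic of Array.from(new Set(entry.topics))) {
      if (opts.topicAllowlist && !opts.topicAllowlist.includes(topic)) continue;
      const ids = byTopic.get(topic) ?? [];
      ids.push(entry.id);
      byTopic.set(topic, ids);
    }
  }
  return Array.from(byTopic.entries())
    .filter(([, ids]) => ids.length >= opts.minClusterSize)
    .sort((a, b) => b[1].length - a[1].length) // largest clusters first
    .slice(0, opts.maxClusters)
    .map(([topic, entryIds]) => ({ topic, entryIds }));
}
```

A caller could pass each resulting cluster's entries to an LLM or template to produce the summary text.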
await memory.init()
// Store
await memory.store({ content: '...', topics: ['tag'] })
// Query
const { results, totalAvailable, tookMs } = await memory.query('query', {
limit: 10,
metadata: { project: 'remem', source: 'openclaw.turn' },
})
// Recent
const recent = await memory.getRecent(10)
// By topic
const byTopic = await memory.getByTopic('preferences', 20)
// Delete
await memory.getStore().forget(entryId)

await memory.enableLayers(config?) // async - restores persisted entries
await memory.storeInLayer(input, 'semantic') // async
await memory.storeProcedural(input, trigger) // async
const { results, layerBreakdown } = await memory.queryLayers('query', { layers: ['semantic', 'procedural'] });
memory.fireProcedural('context string')
memory.getTemporalHistory(entryId) // trace supersession chain
memory.evictExpiredLayers() // run TTL eviction
memory.getLayerStats()
// → { episodic: { count, maxEntries, ttlMs, weight }, ... }

const snap = await memory.createSnapshot('pre-deploy-label')
// → { id, label, createdAt, memoryCount, layerCounts, checksum }
const restored = await memory.restoreSnapshot(snap.id)
// → number of entries restored after checksum verification
const exported = await memory.exportSnapshot(snap.id)
// → portable JSON payload with snapshotData + checksum
await memory.importSnapshot(exported, { overwrite: false })
const snapshots = await memory.listSnapshots()
// → [{ id, label, createdAt, memoryCount, checksum }]
await memory.deleteSnapshot(snapId)

memory.enableIdentity({ constitutionTexts, autoInject, evalModel })
memory.importConstitution(text, source)
// → number of statements imported
const drift = await memory.detectDrift(sessionText)
// → { score, level: 'aligned'|'minor'|'moderate'|'critical', violatingStatements, reasoning }
memory.getConstitutionInjection(drift)
// → correction block string to prepend to LLM context
memory.getIdentityStatements('values')

Expose ReMEM as a microservice:
import { HttpAdapter } from '@darksol/remem';
const adapter = new HttpAdapter({
port: 8787,
host: '127.0.0.1', // default: localhost only
store: memory.getStore(),
authToken: process.env.REMEM_TOKEN, // optional bearer auth
});
await adapter.start();

# Store
curl -X POST http://localhost:8787/memory \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $REMEM_TOKEN" \
-d '{"content": "...", "topics": ["preferences"]}'
# Query
curl -H "Authorization: Bearer $REMEM_TOKEN" \
"http://localhost:8787/memory?q=preferences&limit=5&metadata=%7B%22project%22%3A%22remem%22%7D"
# Recent
curl -H "Authorization: Bearer $REMEM_TOKEN" \
"http://localhost:8787/memory/recent?n=10"
# Shared memory
curl -X POST http://localhost:8787/memory/shared \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $REMEM_TOKEN" \
-d '{"content":"Launch checklist lives here","topics":["ops"],"namespace":["team","ops"],"visibility":"shared","metadata":{"source":"runbook"}}'
# Namespace query
curl -X POST http://localhost:8787/memory/namespace/query \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $REMEM_TOKEN" \
-d '{"namespace":["team","ops"],"query":"launch checklist","options":{"limit":5},"scope":{"visibility":"shared"}}'
# Namespace recent
curl -X POST http://localhost:8787/memory/namespace/recent \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $REMEM_TOKEN" \
-d '{"namespace":["team","ops"],"n":10,"scope":{"visibility":"shared"}}'
# Snapshots
curl -H "Authorization: Bearer $REMEM_TOKEN" \
"http://localhost:8787/snapshots"
curl -X POST "http://localhost:8787/snapshots" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $REMEM_TOKEN" \
-d '{"label": "pre-deploy"}'
# Export / import / restore / delete snapshots
curl -H "Authorization: Bearer $REMEM_TOKEN" \
"http://localhost:8787/snapshots/{id}/export"
curl -X POST "http://localhost:8787/snapshots/import" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $REMEM_TOKEN" \
-d '{"snapshot": { ... }, "overwrite": false}'
curl -X POST -H "Authorization: Bearer $REMEM_TOKEN" \
"http://localhost:8787/snapshots/{id}/restore"
curl -X DELETE -H "Authorization: Bearer $REMEM_TOKEN" \
"http://localhost:8787/snapshots/{id}"
# Delete memory
curl -X DELETE -H "Authorization: Bearer $REMEM_TOKEN" \
"http://localhost:8787/memory/{id}"
# Events
curl -H "Authorization: Bearer $REMEM_TOKEN" \
"http://localhost:8787/events?limit=50"
# Health
curl -H "Authorization: Bearer $REMEM_TOKEN" \
"http://localhost:8787/health"

ReMEM's `ModelAbstraction` routes to any LLM without changing your code:
// Bankr (default)
const memory = new ReMEM({ llm: { type: 'bankr', apiKey: process.env.BANKR_API_KEY } });
// OpenAI
const memory = new ReMEM({ llm: { type: 'openai', apiKey: 'sk-...' } });
// Anthropic
const memory = new ReMEM({ llm: { type: 'anthropic', apiKey: 'sk-ant-...' } });
// Ollama (local, zero-cost)
const memory = new ReMEM({ llm: { type: 'ollama', baseUrl: 'http://localhost:11434', model: 'llama3' } });

- SQLite via sql.js - WebAssembly-compiled SQLite. No native binaries. Cross-platform by default.
- PostgreSQL via optional `pg` peer dependency - Server/shared deployment backend with JSONB topic/metadata fields, GIN topic index, event log, embeddings table, layered memory table, and snapshot export/import support.
- Atomic writes - Data written to `.tmp` then renamed. Crash-safe.
- WAL mode - Best-effort `PRAGMA journal_mode=WAL` hint for SQLite/sql.js deployments where supported.
- Layer persistence - `layered_memories` table ensures layer data survives process restarts.
- Snapshots - Full core + layered memory state serialized to JSON in `snapshots` table with SHA-256 checksums. Ideal for backup/restore and migration.
- Event sourcing - Append-only `events` table. All mutations logged with timestamps and payloads.
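
The atomic-write and checksum ideas in the list above combine naturally. The sketch below shows the general pattern with Node's standard library: serialize, hash with SHA-256, write to a `.tmp` file, rename into place, then verify the hash on read. The file layout and helper names are assumptions for the example and do not describe ReMEM's on-disk or snapshot format.

```typescript
import { createHash } from 'node:crypto';
import { readFileSync, renameSync, writeFileSync } from 'node:fs';

// Sketch only: file layout and helper names are hypothetical, not ReMEM's storage format.

interface SnapshotFile {
  checksum: string; // hex SHA-256 of snapshotData
  snapshotData: string; // JSON-serialized state
}

function sha256(text: string): string {
  return createHash('sha256').update(text, 'utf8').digest('hex');
}

/** Write to `<path>.tmp` first, then rename over the target so readers never see a partial file. */
function writeSnapshotAtomically(path: string, state: unknown): SnapshotFile {
  const snapshotData = JSON.stringify(state);
  const file: SnapshotFile = { checksum: sha256(snapshotData), snapshotData };
  const tmpPath = `${path}.tmp`;
  writeFileSync(tmpPath, JSON.stringify(file), 'utf8');
  renameSync(tmpPath, path);
  return file;
}

/** Read a snapshot back and refuse to return it if the checksum does not match. */
function readVerifiedSnapshot(path: string): unknown {
  const file = JSON.parse(readFileSync(path, 'utf8')) as SnapshotFile;
  if (sha256(file.snapshotData) !== file.checksum) {
    throw new Error(`Checksum mismatch for ${path}`);
  }
  return JSON.parse(file.snapshotData);
}
```

If the process dies before the rename, the previous file is left untouched, which is the property the "crash-safe" claim relies on.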
- ReMEM is external memory, not a bigger model context window - it improves recall by storing and retrieving relevant memories on demand, not by changing the model's native token limit.
- Best semantic performance depends on your embedding/runtime setup - SQLite uses application-level vector scoring, while PostgreSQL can use native pgvector acceleration when available and configured.
- Procedural recall is lightweight by design - trigger matching is practical and useful, but it is not meant to replace a full policy engine or workflow orchestrator.
- Layer defaults are opinionated, not universal - TTLs, promotion, and compression behavior should be tuned for your agent's workload.
MIT - Built with teeth. 🌑
