🦀 Kōhaku

🧠 Episodic memory engine for LLMs — persistent, associative, and beyond context windows.

🍂 Meaning

Kōhaku (琥珀) — amber, preserved in time.

Like insects trapped in amber, memories are captured, compressed, and preserved — not lost to context limits.

🚀 What it is

Kōhaku is a neural episodic memory system:

Stores experiences as HDC hypervectors
Retrieves via associative similarity
Works as a drop-in memory layer for any LLM

Not:

❌ RAG
❌ vector database

But:

✅ learned memory with recall

❗ The problem

LLMs forget.

Context windows are finite
RAG loses nuance
Summaries lose detail

There is no true memory system.

✨ Why kohaku is different — memory that reasons

Every other LLM-memory tool (Mem0, Zep/Graphiti, Letta) follows the same recipe: dense embeddings + vector search + a graph. They all do one thing: retrieve by cosine top-k. Dense embeddings have no invertible structure (you can rank stored items by similarity, but you cannot compose or unbind them), so that's the ceiling.

kohaku's hyperdimensional substrate can do algebra over memory — something that's impossible to copy without changing substrates:

from kohaku import AnalogicalMemory

mem = AnalogicalMemory()
mem.add_record("USA",    {"currency": "dollar", "capital": "washington"})
mem.add_record("Mexico", {"currency": "peso",   "capital": "mexico_city"})

mem.get("USA", "currency").value          # -> "dollar"   (attribute recall)
mem.analogy("USA", "Mexico", "dollar").value   # -> "peso"  (the dollar of Mexico)

No model call, no extra storage — just binding + bundling + cleanup. That last line ("What is the dollar of Mexico?", Kanerva 2010) is relational transfer over an agent's own memory: learn a preference in one domain, infer the analog in another. Try it: PYTHONPATH=python python3 examples/analogy_demo.py.
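The binding + bundling + cleanup steps named above can be sketched in plain NumPy. This is a minimal illustration of the Kanerva-style construction over random bipolar vectors, not kohaku's implementation; the helpers `hv`, `bind`, and `cleanup` and the `codebook` dictionary are hypothetical names introduced only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000  # hypervector dimensionality (illustrative)

def hv():
    """Random bipolar (+1/-1) hypervector."""
    return rng.integers(0, 2, size=D) * 2 - 1

# Hypothetical codebook of atomic symbols (roles and fillers).
symbols = ["currency", "capital", "dollar", "peso", "washington", "mexico_city"]
codebook = {s: hv() for s in symbols}

def bind(a, b):
    """Binding: elementwise multiply. Self-inverse for bipolar vectors."""
    return a * b

def cleanup(v):
    """Cleanup: nearest codebook symbol by dot product."""
    return max(codebook, key=lambda s: int(codebook[s] @ v))

c = codebook
# Bundling: elementwise sum of the bound role/filler pairs.
usa = bind(c["currency"], c["dollar"]) + bind(c["capital"], c["washington"])
mexico = bind(c["currency"], c["peso"]) + bind(c["capital"], c["mexico_city"])

# Attribute recall: unbind the role, then clean up the noisy result.
recalled = cleanup(bind(usa, c["currency"]))

# Analogy: usa * mexico maps each USA filler onto its Mexico counterpart.
analog = cleanup(bind(bind(usa, mexico), c["dollar"]))

print(recalled, analog)  # dollar peso
```

The analogy works because multiplying the two records yields `dollar * peso` plus cross terms that look like random noise; multiplying by `dollar` cancels it and leaves a noisy `peso` that cleanup resolves.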

Operating envelope (honest; see benchmarks/bench_analogy.py): attribute recall stays exact past 40 attributes per record; analogical transfer is ≥95% accurate up to ~16 bound pairs per record at 10,000 dimensions, then degrades gracefully. Every answer carries a confidence and a margin, so you can threshold.

🧠 What you learn

Hyperdimensional computing (HDC)
Associative memory / Hopfield networks
Memory-augmented architectures
Episodic vs semantic memory

⚙️ Architecture

🐍 Python: the full engine and API. The pure-Python path is the correctness baseline and works with zero native dependencies.
🦀 Rust accelerator (optional): bit-packed XOR + popcount cosine top-k behind a PyO3 extension. Running pip install . from the repo root (via maturin) builds kohaku._kohaku_rs, after which kohaku._BACKEND == "rust-accel"; the extension is parity-tested against NumPy in CI. Retrieval crosses the FFI boundary zero-copy (borrowed int8 arrays, no list marshaling). The big win is kohaku.RetrievalIndex, a resident index that packs the keys once and is ~160–230× faster than NumPy on repeated probes (benchmarks/bench_backends.py); query and query_with_decay use a per-memory cached index automatically. One-shot batches stay on NumPy, because re-packing on every call only reaches rough parity with BLAS.

pip install ./python    # pure-Python baseline
pip install .           # + Rust accelerator (needs a Rust toolchain + maturin)
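To see why bit-packing helps, here is a NumPy sketch of the XOR + popcount idea. It is an illustration only, not the Rust kernel: for bipolar vectors of dimension D, cosine equals 1 - 2 * hamming / D, so similarity reduces to counting differing bits. The names `pack`, `POPCOUNT`, and `packed_cosine` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000

def pack(v):
    """Pack a bipolar (+1/-1) vector into bytes: +1 -> bit 1, -1 -> bit 0."""
    return np.packbits(v > 0)

# Lookup table: number of set bits for every possible byte value.
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint16)

def packed_cosine(pa, pb, dim):
    """Cosine of two packed bipolar vectors via Hamming distance."""
    hamming = int(POPCOUNT[np.bitwise_xor(pa, pb)].sum())
    return 1.0 - 2.0 * hamming / dim

# Pack the keys once, then probe repeatedly against the packed matrix.
keys = rng.integers(0, 2, size=(100, D)) * 2 - 1
packed_keys = np.packbits(keys > 0, axis=1)

query = keys[42].copy()
query[:1000] *= -1                      # corrupt 10% of the coordinates

hamming = POPCOUNT[packed_keys ^ pack(query)].sum(axis=1)
best = int(np.argmin(hamming))          # smallest Hamming = highest cosine
print(best, packed_cosine(packed_keys[best], pack(query), D))
```

Packing once and reusing the packed matrix is the part that pays off on repeated probes; a native kernel would replace the lookup table with a hardware popcount instruction.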

🚀 Quick Start

pip install kohaku

from kohaku import Memory

mem = Memory()
mem.store("User prefers Italian wine")
mem.store("User is allergic to shellfish", importance=0.9, tags=["health"])

hits = mem.query("What does the user like to drink?")
for h in hits:
    print(h.text, round(h.similarity, 3))
# → User prefers Italian wine 0.63

mem.save("user.json")          # labels + metadata; HVs re-derived on load
mem2 = Memory.load("user.json")

Memory is the one-line front door: store strings, get ranked MemoryHit results back (.text, .similarity, .salience, .source, .tags). It wraps the full EnrichedMemoryStore — temporal validity, salience, source-trust, tags — behind a string-in/string-out API. Reach for EnrichedMemoryStore, MemorySystem, and friends directly when you need provenance graphs, version history, or consolidation daemons.

🧬 Semantic recall (opt-in)

The default encoder bundles per-token hypervectors, so similarity is token overlap — "the customer enjoys merlot" won't match "User prefers Italian wine". For meaning-based recall, plug in an EmbeddingEncoder that projects a dense embedding into HDC space (SimHash — sign of a fixed random projection, which approximately preserves cosine):

pip install "kohaku[semantic]"     # pulls sentence-transformers

from kohaku import Memory, EmbeddingEncoder

enc = EmbeddingEncoder(model_name="all-MiniLM-L6-v2")   # or embed_fn=<your callable>
mem = Memory(encoder=enc)
mem.store("User prefers Italian wine")
mem.query("the customer enjoys a glass of merlot")[0].text
# → 'User prefers Italian wine'   (zero shared tokens, still matches)

EmbeddingEncoder takes any embed_fn (str -> float array), whether sentence-transformers, OpenAI embeddings, or your own, so there's no hard dependency. A store saved with a custom encoder must be reloaded with the same one (Memory.load(path, encoder=enc)).
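The SimHash projection mentioned above can be sketched as follows. This is a generic sign-of-random-projection example, not EmbeddingEncoder's code; the `projection` matrix, the `simhash` helper, and the 384-dimensional input size are assumptions for illustration. For SimHash, the fraction of agreeing bits is expected to be 1 - θ/π, where θ is the angle between the two inputs, which is what lets the hypervectors stand in for cosine.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM, D = 384, 10_000  # illustrative sizes

# Fixed random projection, shared by every vector that is encoded.
projection = rng.standard_normal((D, EMBED_DIM))

def simhash(embedding):
    """Project a dense embedding to a bipolar hypervector (sign of projection)."""
    return np.where(projection @ embedding >= 0, 1, -1).astype(np.int8)

# Two dense vectors with a known angle between them.
x = rng.standard_normal(EMBED_DIM)
y = x + 0.5 * rng.standard_normal(EMBED_DIM)
true_angle = np.arccos(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

# Recover the angle from the fraction of agreeing hypervector bits.
agree = np.mean(simhash(x) == simhash(y))
estimated_angle = np.pi * (1.0 - agree)

print(round(float(true_angle), 3), round(float(estimated_angle), 3))
```

The estimate tightens as D grows, which is one reason a high-dimensional target space is useful here.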

⚡ Scaling past 10⁴ memories

Exact cosine retrieval is O(N·D) per query. Flip on the bipolar-LSH index to narrow each similarity query to a small candidate set before exact ranking:

mem = Memory(ann=True)            # maintains a kohaku.ann.LSHIndex
# ... store thousands of memories ...
mem.query("...")                  # sub-linear: LSH candidates, exact re-rank

Candidates are always scored with exact cosine, so results match a full scan except for the rare LSH miss. Queries sorted by salience or recency, and queries whose candidate set is empty, fall back to a full scan. LSHIndex is pure NumPy (no FAISS/hnswlib) and can be used standalone.
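The candidate-then-re-rank pattern described above can be sketched with a simple bit-sampling LSH over bipolar vectors. This is an assumption-laden illustration, not kohaku.ann.LSHIndex: the table count, bits per table, and the `signature` and `query` helpers are all hypothetical choices.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
D, N, TABLES, BITS = 2_000, 2_000, 8, 12  # illustrative sizes

keys = rng.integers(0, 2, size=(N, D), dtype=np.int8) * 2 - 1

# Each table hashes a vector by the signs of BITS randomly chosen coordinates.
coords = [rng.choice(D, size=BITS, replace=False) for _ in range(TABLES)]
tables = [defaultdict(list) for _ in range(TABLES)]

def signature(v, idx):
    return (v[idx] > 0).tobytes()

for i, k in enumerate(keys):
    for table, idx in zip(tables, coords):
        table[signature(k, idx)].append(i)

def query(q, top_k=3):
    # Gather candidates from every table, then re-rank with exact cosine.
    cand = set()
    for table, idx in zip(tables, coords):
        cand.update(table.get(signature(q, idx), ()))
    # An empty candidate set falls back to a full scan.
    ids = np.fromiter(cand, dtype=np.int64) if cand else np.arange(N)
    sims = keys[ids].astype(np.int32) @ q.astype(np.int32) / D
    order = np.argsort(-sims)[:top_k]
    return [(int(ids[j]), float(sims[j])) for j in order]

noisy = keys[7].copy()
noisy[rng.choice(D, size=40, replace=False)] *= -1   # flip 2% of coordinates
print(query(noisy)[0])
```

More tables raise recall at the cost of memory; more bits per table shrink the candidate set but make misses more likely.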

📦 Whole-system snapshots

save_system / load_system persist an entire enriched setup — episodic hypervectors, per-memory metadata, and the provenance / version / relationship side stores — into one directory with a manifest:

from kohaku import save_system, load_system

save_system(store, "snapshot/", provenance=pg, versions=vs, relationships=rel)
bundle = load_system("snapshot/")
bundle.store, bundle.provenance, bundle.versions, bundle.relationships

SQLite side stores are copied via the backup API (so :memory: stores persist too), and recall is exact after the round-trip.

💾 Persistence (v0.4.0)

from kohaku import EpisodicMemory, save, load

mem = EpisodicMemory(capacity=1000)
# ... store entries ...
save(mem, "memories.hkb")        # packed binary, ~10x smaller than JSON
save(mem, "memories.json")       # human-readable

mem2 = load("memories.hkb")      # round-trip preserves IDs, timestamps, recall

🌱 Consolidation

from kohaku import consolidate_to_memory

semantic = consolidate_to_memory(mem, similarity_threshold=0.3)
# Greedy bundle-of-bundles clustering: N noisy episodic traces → K semantic centroids.
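As a rough picture of what greedy clustering into centroids can look like, here is a standalone NumPy sketch. It assumes one plausible reading of the comment above (join the most similar existing cluster if cosine clears the threshold, otherwise start a new one) and is not consolidate_to_memory itself; `greedy_consolidate` and `cosine` are hypothetical helpers.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def greedy_consolidate(traces, similarity_threshold=0.3):
    """Greedily cluster bipolar traces; return one bipolar centroid per cluster."""
    sums = []  # running integer bundle for each cluster
    for t in traces:
        t = t.astype(np.int64)
        best, best_sim = None, similarity_threshold
        for i, s in enumerate(sums):
            sim = cosine(s, t)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            sums.append(t.copy())      # no cluster is close enough: start one
        else:
            sums[best] += t            # bundle the trace into its cluster
    return [np.where(s >= 0, 1, -1) for s in sums]

rng = np.random.default_rng(0)
D = 10_000
protos = rng.integers(0, 2, size=(2, D)) * 2 - 1

def noisy(p):
    """Copy of p with roughly 10% of coordinates flipped."""
    return np.where(rng.random(D) < 0.1, -p, p)

# Ten noisy traces, alternating between the two underlying prototypes.
traces = [noisy(protos[i % 2]) for i in range(10)]
centroids = greedy_consolidate(traces)
print(len(centroids))  # 2
```

Bundling averages out the per-trace noise, so each centroid ends up closer to its prototype than any single trace was.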

🧠 Online learning + Hopfield + episodic↔semantic (v0.5.0)

from kohaku import ItemMemory, HopfieldAssociator, MemorySystem, encode_text

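# Online HDC learning: prototypes update with every example
# (cat_examples, canonical_prototypes, noisy_query, key, value, and query are placeholders you supply)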
im = ItemMemory()
for example in cat_examples:
    im.add("cat", encode_text(example))
im.train_from_feedback("cat", encode_text("a dog barked"), correct=False)
top = im.predict(encode_text("a kitten napping"), top_k=3)

# Modern Hopfield — clean noisy queries by softmax-weighted retrieval
hop = HopfieldAssociator(beta=0.05)
for proto in canonical_prototypes:
    hop.store(proto)
cleaned = hop.complete(noisy_query)

# Combined episodic + semantic store with sleep-style consolidation
ms = MemorySystem(episodic_capacity=1000)
ms.store_episode(key, value, label="meeting on monday")
ms.consolidate_to_semantic(similarity_threshold=0.3)  # promote clusters → prototypes
results = ms.recall(query, top_k=3, use_decay=True)   # tagged by source
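The softmax-weighted retrieval behind a modern Hopfield cleanup can be written in a few lines of NumPy. This sketch shows the general technique (one update step over stored bipolar patterns) under the assumption that scores are raw dot products scaled by beta; it is not HopfieldAssociator's source, and `hopfield_complete` is a hypothetical name.

```python
import numpy as np

def hopfield_complete(patterns, query, beta=0.05):
    """One modern-Hopfield update: softmax over similarities, then a weighted sum."""
    scores = beta * (patterns @ query)         # similarity to each stored pattern
    weights = np.exp(scores - scores.max())    # numerically stable softmax
    weights /= weights.sum()
    return np.where(weights @ patterns >= 0, 1, -1)

rng = np.random.default_rng(0)
D = 10_000
patterns = rng.integers(0, 2, size=(5, D)) * 2 - 1

# Corrupt 20% of the coordinates of pattern 2, then clean the query up.
noisy_query = patterns[2].copy()
noisy_query[rng.choice(D, size=2_000, replace=False)] *= -1

cleaned = hopfield_complete(patterns, noisy_query)
print(np.array_equal(cleaned, patterns[2]))  # True
```

A larger beta makes the softmax sharper, so retrieval behaves more like a nearest-neighbour lookup; a smaller beta blends several stored patterns.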

🕰️ Temporal decay

from kohaku import DecayConfig, query_with_decay

cfg = DecayConfig(half_life=100.0, floor=0.05)
results = query_with_decay(mem, query_key, top_k=5, config=cfg)
# Older memories decay exponentially: weight = max(0.5 ** (age / half_life), floor)
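For intuition, the weight formula in the comment above can be evaluated directly. The helper below is a standalone restatement of that formula, not a kohaku function; `decay_weight` is a hypothetical name, and the time unit is whatever unit half_life is expressed in.

```python
def decay_weight(age, half_life=100.0, floor=0.05):
    """weight = max(0.5 ** (age / half_life), floor)"""
    return max(0.5 ** (age / half_life), floor)

for age in (0, 100, 200, 1000):
    print(age, decay_weight(age))
# 0 1.0
# 100 0.5
# 200 0.25
# 1000 0.05
```

With half_life=100 and floor=0.05, the floor takes over at an age of about 432 (100 × log2(20)), so very old memories keep a small, constant weight instead of vanishing.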

🎬 Live demo

python demo/server.py        # starts a localhost server with REAL kohaku
open http://127.0.0.1:8000

The page detects the API and switches from offline simulation to live mode — every similarity number, decay weight, and .hkb file size you see is computed by the live library. Add a phrase, click any node, drag the days slider, hit save — it's all real.

PYTHONPATH=python python3 demo/demo.py    # rich-terminal walkthrough

🎯 Vision

Give models memory — not just context.
