Project Memory — DynamicGraphMemory (DGM)

Durable project state, architecture-as-built, and open design questions. Human-facing and committed to the repo. Actionable work lives in GitHub Issues; this file holds the context behind it.

Last updated: 2026-06-17


What DGM actually is

A conversational long-term memory pipeline, evaluated on the LoCoMo benchmark (long-term conversational QA — 5 question categories, scored by a Claude-Haiku LLM-as-judge).

Note: the README.md still describes an earlier design (a nanoGPT KV-cache replacement with src/graph/, src/memory/, src/model/). That structure no longer exists. The real code lives in src/library/. README needs updating — see issues.

Pipeline

turn
  → ingest_entities   (Phase 1: spaCy NER → L0 GROUND nodes + speaker)
  → ingest_concepts   (Phase 2: Haiku LLM → L1 CONCEPT nodes, INSTANTIATES edges)
  → sleep_consolidation (global re-merge pass — "hippocampal replay")
  → assemble_graph / assemble_graph_only  (context within a token budget)
  → LLM answer → LoCoMo LLM-as-judge score
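The "context within a token budget" step can be illustrated with a small greedy selector. This is a sketch under stated assumptions, not the behaviour of assemble_graph itself: the function name `assemble_within_budget`, the `(score, text)` snippet shape, and whitespace token counting are all hypothetical stand-ins.

```python
from typing import Callable, Iterable


def assemble_within_budget(
    snippets: Iterable[tuple[float, str]],
    token_budget: int,
    count_tokens: Callable[[str], int] = lambda text: len(text.split()),
) -> list[str]:
    """Greedily keep the highest-scored snippets that still fit the budget.

    Hypothetical helper: the real assembler may rank and count differently.
    """
    chosen: list[str] = []
    used = 0
    # Highest score first; sorted() is stable, so ties keep input order.
    for _score, text in sorted(snippets, key=lambda item: item[0], reverse=True):
        cost = count_tokens(text)
        if used + cost > token_budget:
            continue  # does not fit; a smaller, lower-scored snippet still might
        chosen.append(text)
        used += cost
    return chosen


if __name__ == "__main__":
    candidates = [
        (0.9, "Alice moved to Berlin in May"),
        (0.5, "Bob likes jazz"),
        (0.7, "Alice adopted a dog named Rex"),
    ]
    print(assemble_within_budget(candidates, token_budget=9))
```

With a budget of 9 whitespace tokens, the 0.7-scored snippet is skipped because it would overflow, and the shorter 0.5-scored one is kept instead. Skipping rather than stopping at the first overflow is a design choice of this sketch.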

Graph model (src/library/graph.py)

Three node levels in one unified graph:

| Level | Source | Behaviour |
|---|---|---|
| L0 GROUND | spaCy NER + speaker | Concrete entities; stable immediately; never merge |
| L1 CONCEPT | Haiku extraction | Abstract concepts; accrue maturity; merge-eligible |
| L2 META | | Declared, currently unused |

Plus: Hebbian edge reinforcement, maturity-gated merge events ("aha moments"), a temporal/event/place sub-graph, a JSONL ingest log, and a holographic "bioelectric field" layer (field.py, field_reconstruction.py).
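A minimal data-structure sketch may help make the node levels and Hebbian edge reinforcement concrete. It is not the code in src/library/graph.py: the class names, the `maturity` field's type, and the saturating update rule with a `rate` parameter are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Level(IntEnum):
    GROUND = 0   # L0: concrete entities (NER + speaker)
    CONCEPT = 1  # L1: abstract concepts (LLM extraction)
    META = 2     # L2: declared, currently unused


@dataclass
class Node:
    label: str
    level: Level
    maturity: float = 0.0  # accrual rule is not specified here; placeholder

    @property
    def merge_eligible(self) -> bool:
        # Only L1 CONCEPT nodes are merge-eligible; L0 GROUND never merges.
        return self.level == Level.CONCEPT


@dataclass
class Graph:
    nodes: dict[str, Node] = field(default_factory=dict)
    weights: dict[tuple[str, str], float] = field(default_factory=dict)

    def add(self, label: str, level: Level) -> Node:
        # Reuse the existing node if the label is already present.
        return self.nodes.setdefault(label, Node(label, level))

    def reinforce(self, a: str, b: str, rate: float = 0.1) -> float:
        """Hebbian-style update: co-activated nodes strengthen their edge.

        Assumed rule: w <- w + rate * (1 - w), which saturates below 1.
        """
        key = (a, b) if a <= b else (b, a)  # undirected edge, canonical order
        old = self.weights.get(key, 0.0)
        new = old + rate * (1.0 - old)
        self.weights[key] = new
        return new


if __name__ == "__main__":
    g = Graph()
    g.add("Alice", Level.GROUND)
    g.add("friendship", Level.CONCEPT)
    g.reinforce("Alice", "friendship")
    print(g.reinforce("friendship", "Alice"))  # same edge, reinforced twice
```

The saturating form keeps weights bounded without a separate normalisation pass; an additive or decaying rule would be an equally plausible reading of "Hebbian".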

Key files

  • src/library/graph.py — relational concept graph (write policy, maturity, merge)
  • src/library/context.py — ContextAssembler; orchestrates ingest + context assembly
  • src/library/llm_extractor.py — triple extractor (currently shadowed; see gaps)
  • src/library/field.py / field_reconstruction.py — holographic field layer
  • src/locomo.py — LoCoMo parsing, scoring, sleep consolidation, convergence
  • src/{gpt2,llama,qwen}_evaluator.py, evaluate.py, experiment.py — eval runners
  • paper/DESIGN.md — original design rationale (partly aspirational; see gaps)

Design-vs-reality gaps

  1. Relational-signature identity is only half-built. DESIGN.md's central claim is that concept identity is its relational signature (language-agnostic, structural). In the code, node matching at write time is pure surface form — character-bigram Jaccard on labels (_label_similarity). Only merging uses the relational signature (_signature_similarity). The language-agnostic promise is therefore not realized at write/retrieval time.

  2. Thresholds are heuristic, not calibrated. DESIGN.md's "open questions" (maturity threshold, merge threshold, provisional lifetime) were answered with dynamic, density-scaled formulas rather than empirically tuned values. No results/ directory exists yet, so there are no recorded benchmark runs to calibrate them against.

  3. Holographic field is oversold by the README. The field layer's own evaluation concludes it is a compression tool, not a recovery mechanism — graph topology does the recovery work.

  4. Redundancy / dead code. llm_extractor.py (triple extraction) is shadowed by the direct _extract_concepts_llm path in context.py, which wins. The L2 META level is declared but never used.
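Gap 1 is easier to see with both similarity measures side by side. The sketch below is a standalone illustration, not the project's _label_similarity or _signature_similarity: it assumes lowercased labels, a signature modelled as a set of (edge type, neighbour) pairs, and plain Jaccard for both. The "co-occurs" edge type and the example neighbours are hypothetical.

```python
def char_bigrams(label: str) -> set[str]:
    """Set of adjacent character pairs in a lowercased, stripped label."""
    text = label.lower().strip()
    return {text[i:i + 2] for i in range(len(text) - 1)}


def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def label_similarity(a: str, b: str) -> float:
    """Surface-form similarity: character-bigram Jaccard on labels."""
    return jaccard(char_bigrams(a), char_bigrams(b))


def signature_similarity(
    sig_a: set[tuple[str, str]], sig_b: set[tuple[str, str]]
) -> float:
    """Structural similarity: Jaccard over (edge type, neighbour) pairs."""
    return jaccard(sig_a, sig_b)


if __name__ == "__main__":
    # Same concept in two languages: no shared bigrams, identical neighbourhood.
    sig_dog = {("INSTANTIATES", "pet"), ("co-occurs", "Alice")}
    sig_hund = {("INSTANTIATES", "pet"), ("co-occurs", "Alice")}
    print(label_similarity("dog", "Hund"))
    print(signature_similarity(sig_dog, sig_hund))
    # Near-identical surface forms score high on labels alone.
    print(label_similarity("night", "nights"))
```

Under these assumptions, "dog" and "Hund" share no bigrams, so a write-time matcher based on labels alone would create two nodes even though their signatures are identical. That is the language-agnostic behaviour the design describes and the write path does not yet deliver.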


Highest-value threads

  • Signature identity at write time (gap 1) — the most interesting design thread.
  • Real LoCoMo numbers to calibrate the dynamic thresholds (gap 2) — unblocks evidence-based tuning.

Conventions

  • Task tracking: GitHub Issues (adopted 2026-06-17).
  • This file: durable project context and decisions, not a task list.