epmem: opt-in kernel-mediated copy of long-present constant WMEs into smem, with episode eviction (experimental) #578
Periodically scan episodic memory for stable WME structures and write
them to semantic memory as new LTIs. This implements the compose+test
framework (Casteigts et al., 2019) for automatic episodic-to-semantic
knowledge transfer — the operation Soar's long-term declarative stores
have been missing.
Algorithm:
compose — union of constant WMEs currently active in epmem
test — continuous presence >= consolidate-threshold episodes
write — create smem LTI with qualifying augmentations via CLI_add
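The test and write steps above can be sketched in a few lines of Python. This is an illustration only, not the C++ in this patch: the dictionary keys (parent, attr, value, start_episode_id) and both function names are hypothetical, and the exact string format CLI_add expects may differ from what is shown.

```python
from collections import defaultdict


def stable_wmes(open_wmes, current_episode, threshold):
    """Test step: keep WMEs whose open interval began at least
    `threshold` episodes before the current one."""
    cutoff = current_episode - threshold
    return [w for w in open_wmes if w["start_episode_id"] <= cutoff]


def build_add_strings(wmes):
    """Write step (string building only): one '(<c_n> ^attr value ...)'
    string per parent identifier. Passing it to smem is not shown."""
    by_parent = defaultdict(list)
    for w in wmes:
        by_parent[w["parent"]].append((w["attr"], w["value"]))
    strings = []
    for n, (_parent, augs) in enumerate(sorted(by_parent.items()), start=1):
        body = " ".join(f"^{attr} {value}" for attr, value in augs)
        strings.append(f"(<c_{n}> {body})")
    return strings


# Hypothetical input: three constant WMEs currently open in epmem.
open_wmes = [
    {"parent": 1, "attr": "color", "value": "red", "start_episode_id": 80},
    {"parent": 1, "attr": "size", "value": 3, "start_episode_id": 85},
    {"parent": 2, "attr": "name", "value": "box", "start_episode_id": 95},
]
qualifying = stable_wmes(open_wmes, current_episode=100, threshold=10)
add_strings = build_add_strings(qualifying)
```

With current_episode=100 and threshold=10, only the two WMEs under parent 1 qualify, so a single string is produced for that parent.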
New parameters (all under epmem):
consolidate on/off (default off)
consolidate-interval integer (default 100) — episodes between runs
consolidate-threshold integer (default 10) — min episode persistence
Deduplication via epmem_consolidated tracking table prevents repeated
writes across consolidation runs. Table is dropped on reinit alongside
other epmem graph tables.
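A minimal sketch of the dedup behaviour, using Python's sqlite3 module and a simplified stand-in schema. The table names and the wc_id and start_episode_id columns come from this PR; the remaining columns, the one-column layout of epmem_consolidated, and the make_db and scan_once helpers are assumptions made for illustration.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database with a simplified, assumed schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE epmem_wmes_constant (wc_id INTEGER PRIMARY KEY, attr TEXT, value TEXT);
        CREATE TABLE epmem_wmes_constant_now (wc_id INTEGER, start_episode_id INTEGER);
        CREATE TABLE epmem_consolidated (wc_id INTEGER PRIMARY KEY);
        INSERT INTO epmem_wmes_constant VALUES (1, 'color', 'red'), (2, 'name', 'box');
        INSERT INTO epmem_wmes_constant_now VALUES (1, 80), (2, 95);
        """
    )
    return db


def scan_once(db, current_episode, threshold):
    """Return stable wc_ids not yet consolidated, then record them
    in the tracking table so later runs skip them."""
    rows = db.execute(
        """
        SELECT c.wc_id
        FROM epmem_wmes_constant AS c
        JOIN epmem_wmes_constant_now AS n ON n.wc_id = c.wc_id
        LEFT JOIN epmem_consolidated AS d ON d.wc_id = c.wc_id
        WHERE n.start_episode_id <= ? AND d.wc_id IS NULL
        ORDER BY c.wc_id
        """,
        (current_episode - threshold,),
    ).fetchall()
    ids = [r[0] for r in rows]
    db.executemany(
        "INSERT INTO epmem_consolidated (wc_id) VALUES (?)", [(i,) for i in ids]
    )
    return ids
```

Calling scan_once twice at the same episode returns the qualifying wc_ids the first time and an empty list the second time, which is the "no repeated writes" property described above.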
Off by default — zero behavior change until explicitly enabled.
Limitations (deferred to follow-up):
- Only consolidates constant-valued WMEs, not identifier edges
- No back-invalidation across the WM/smem tier boundary
- last-consolidation stat does not persist across agent reinit
Motivation: Derbinsky & Laird (2013) proved forgetting is essential to
Soar's scaling but only built it for working and procedural memory.
Episodic and semantic memory have no eviction and no capacity bound.
This patch addresses the first half: automatic semantic learning from
episodic experience. With semantic entries derived from episodes,
episodic eviction becomes safe (merged episodes leave no reconstruction
debt), and R4's forgettable WME scope expands automatically.
Reference:
Casteigts et al. (2019), "Computing Parameters of Sequence-Based
Dynamic Graphs," Theory of Computing Systems.
Derbinsky & Laird (2013), "Effective and efficient forgetting of
learned knowledge in Soar's working and procedural memories,"
Cognitive Systems Research.
https://june.kim/prescription-soar — full prescription
After consolidation writes stable WMEs to smem, old episodes become redundant. Delete point entries and episode rows older than consolidate-evict-age episodes. This is safe: the consolidated knowledge is in smem, so there is no reconstruction debt.
New parameter:
consolidate-evict-age integer (default 0 = off): min age before an episode is eligible for eviction
Range and _now interval entries are preserved (they span multiple episodes). Only point entries and episode rows are removed.
Reference: Derbinsky & Laird (2013), §5: "forgotten working-memory knowledge may be recovered via deliberate reconstruction from semantic memory." Consolidation creates the semantic entries; eviction removes the source episodes that are no longer needed for reconstruction.
- Delete _range entries whose intervals end before the eviction cutoff (previously only _point entries were evicted, leaving dead weight)
- Wrap all eviction DELETEs in BEGIN/COMMIT when lazy_commit is off for atomicity (when lazy_commit is on, already inside a transaction)
Retrieval of evicted episodes is already safe: epmem_install_memory checks valid_episode and returns ^retrieved no-memory.
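The eviction step can be sketched with Python's sqlite3 module as follows. This is a hedged illustration, not the patch itself: the table names epmem_wmes_constant_point, epmem_wmes_constant_range, and epmem_episodes, the episode_id and end_episode_id columns, the strict "< cutoff" boundary, and the helper names are all assumptions. The `with db:` block stands in for the BEGIN/COMMIT wrapping: it commits all three DELETEs together or rolls them back on error.

```python
import sqlite3


def make_db():
    """In-memory database with an assumed, simplified epmem layout."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE epmem_episodes (episode_id INTEGER PRIMARY KEY);
        CREATE TABLE epmem_wmes_constant_point (wc_id INTEGER, episode_id INTEGER);
        CREATE TABLE epmem_wmes_constant_range
            (wc_id INTEGER, start_episode_id INTEGER, end_episode_id INTEGER);
        CREATE TABLE epmem_wmes_constant_now (wc_id INTEGER, start_episode_id INTEGER);
        INSERT INTO epmem_wmes_constant_point VALUES (1, 2), (2, 9);
        INSERT INTO epmem_wmes_constant_range VALUES (3, 1, 3), (4, 4, 9);
        INSERT INTO epmem_wmes_constant_now VALUES (5, 1);
        """
    )
    db.executemany("INSERT INTO epmem_episodes VALUES (?)", [(e,) for e in range(1, 11)])
    db.commit()
    return db


def count(db, table):
    """Row count for one of the fixed table names used in this sketch."""
    return db.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def evict(db, current_episode, evict_age):
    """Delete point rows, completed ranges, and episode rows older than
    the cutoff. _now rows are never touched. evict_age <= 0 means off."""
    if evict_age <= 0:
        return
    cutoff = current_episode - evict_age
    with db:  # one transaction: commit on success, roll back on exception
        db.execute("DELETE FROM epmem_wmes_constant_point WHERE episode_id < ?", (cutoff,))
        db.execute("DELETE FROM epmem_wmes_constant_range WHERE end_episode_id < ?", (cutoff,))
        db.execute("DELETE FROM epmem_episodes WHERE episode_id < ?", (cutoff,))
```

In this toy data set, evicting at episode 10 with an age of 5 removes episodes 1 to 4, the point row at episode 2, and the range ending at episode 3, while the open _now row stays in place.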
Wow, this and the other pull request. I need to review these, but thanks! These look exciting.
Retiring this PR. The assumptions it was built on were wrong:
See the re-intake at june.kim for the corrected SOAP analysis.
What this PR does
Adds an opt-in mechanism that periodically scans episodic memory for constant WMEs continuously present for ≥ N episodes and inserts them into semantic memory via the kernel's CLI_add path, outside the agent's production system. Optionally evicts old episode rows after each run.
All four parameters are off by default. The scan (including eviction) only runs when consolidate=on; setting consolidate-evict-age alone has no effect.
This is kernel-mediated semantic memory insertion
The central architectural choice in this PR is that a kernel-side scan, not the agent's own productions, decides what gets added to semantic memory. The scan is driven by an epmem persistence heuristic (continuous presence for N episodes). This is not a claim about how Soar should derive semantic memory, or about which regularities "deserve" promotion. It is a mechanism that exposes one heuristic and lets users decide when and whether to invoke it.
Any agent that queries smem after a run will observe entries that were not there before. Any agent that queries episodes older than the eviction age will get different retrieval results. These are deliberate semantic changes, not transparent optimizations.
Open for discussion on whether such a mechanism belongs in Soar and under what conditions. Not a merge proposal.
Parameters
- consolidate
- consolidate-interval
- consolidate-threshold
- consolidate-evict-age
Algorithm
- Query epmem_wmes_constant_now with start_episode_id ≤ current - threshold
- Skip wc_ids already recorded in the epmem_consolidated tracking table
- Build one (<c_n> ^attr value ...) string per parent, pass to CLI_add
- Insert wc_ids into epmem_consolidated to prevent reprocessing
- If consolidate-evict-age > 0, delete point and range rows and episode rows older than current - evict-age; _now intervals are untouched
- Repeat every consolidate-interval episodes
The scan uses a single SQL statement joining epmem_wmes_constant, epmem_wmes_constant_now, and the dedup table. For WMEs currently open in _now, "continuous presence for N episodes" is detectable by comparing start_episode_id against current - threshold, avoiding aggregation over episode history. This relies on current epmem interval semantics and only considers WMEs in the _now set at scan time. It does not attempt to detect WMEs that were interrupted and re-added during the window, or to deduplicate WMEs that are semantically or structurally equivalent across different wc_ids.
Known limitations
- wc_id dedup only. The epmem_consolidated table prevents reprocessing of the same wc_id but does not deduplicate semantically or structurally equivalent WMEs across different wc_ids. Repeated runs can produce multiple smem LTIs covering overlapping content.
- _now intervals preserved during eviction. Active WME intervals are never split; only completed ranges, points, and episode rows older than the eviction age are removed.
Changes
- episodic_memory.h
- episodic_memory.cpp: epmem_consolidate() with eviction (~200 lines), hook in epmem_go()
- EpMemFunctionalTests.hpp
- EpMemFunctionalTests.cpp
- .soar test agents
Tests
46/46 epmem tests pass (43 existing + 3 new), zero regressions:
- testConsolidation: stable WMEs written to smem after consolidation fires
- testConsolidationOff: smem stays empty when feature is disabled
- testConsolidationEviction: old episodes evicted, recent episodes preserved, smem entries intact
Background references