smem: add --redundancy-check for rooted structural containment (read-only) by kimjune01 · Pull Request #579 · SoarGroup/Soar

kimjune01 · 2026-03-27T15:31:57Z

What this PR does

Adds smem --redundancy-check: a read-only CLI command that scans semantic memory for LTIs structurally contained in other LTIs under a rooted inclusion predicate. Reports pairs where one LTI is contained in another, and groups of LTIs that are mutually contained under the same predicate. Makes no changes to the store.

Under the predicate, LTI A contains LTI B if B's augmentation graph embeds into A's: every slot in B has a matching slot in A with a superset of values, child LTIs matched recursively, subject to a global injective node mapping rooted at B and A. The predicate ignores activation state, base-level activation history, support and status metadata, aliases, ordering, and multiplicity. It is a structural test only.

The implementation is inspired by tree inclusion (Kilpeläinen & Mannila, 1995) but adapted for smem's rooted graphs, which may contain cycles and shared substructure. The adaptation has not been proved correct; see limitations below.

Read-only command. Intended as a diagnostic tool for inspecting smem contents.

Algorithm

Coinductive cycle handling. When recursion revisits an active pair, the pair is assumed included for the purpose of the local check; the global result is only accepted if every non-cyclic obligation also succeeds. This is an implementation choice, not a proved soundness argument for cyclic inputs.
Global injective mapping over LTI nodes. A b_to_a map threaded through all recursion prevents two distinct B-side LTI nodes from mapping to the same A-side LTI node. Injectivity is enforced at the LTI-node level; constants and slot-value occurrences are compared but not themselves tracked in the map.
Backtracking child matching. Full backtracking over candidate assignments per attribute, with transactional b_to_a snapshot and restore on failed branches. Root nodes are pinned before recursion begins.

Limitations

Cyclic and shared-substructure cases are not covered by tests in this PR. The cycle handling described above is designed to be sound on the non-cyclic portion of any proof, but correctness on inputs with cycles or shared substructure has not been established by regression tests.
Predicate ignores non-structural state. Activation, BLA history, status/support metadata, aliases, multiplicity, and ordering are not considered. Two LTIs that differ only in these dimensions can be reported as one containing the other.
"Structural containment" is not retrieval equivalence. Even when A contains B structurally, B may be preferred over A under activation or spreading rules. This command does not reason about retrieval behavior at all.
No regression run against the existing smem test suite is included here; only the targeted smoke test is reported.

Implementation

New file: smem_inclusion.cpp (~400 lines)
- load_lti_augs() — loads augmentations via web_expand into lightweight structs keyed by hash IDs (no Symbol allocation)
- smem_lti_includes_impl() — recursive inclusion with coinductive cycles, global injectivity, backtracking
- CLI_redundancy_check() — iterates all LTI pairs, reports pairs under the predicate
CLI: smem --redundancy-check (option 'R', no arguments)
No schema changes. Reuses existing web_expand and lti_all infrastructure.

Implementation

New file: smem_inclusion.cpp (~400 lines)
- load_lti_augs() — loads augmentations via web_expand into lightweight structs keyed by hash IDs (no Symbol allocation)
- smem_lti_includes_impl() — recursive inclusion with coinductive cycles, global injectivity, backtracking
- CLI_redundancy_check() — iterates all LTI pairs, reports dominated entries and equivalence classes
CLI: smem --redundancy-check (option 'R', no arguments)
No schema changes. Reuses existing web_expand and lti_all infrastructure

References

Kilpeläinen, P. & Mannila, H. (1995). "Ordered and Unordered Tree Inclusion." SIAM J. Computing 24(2):340–356. DOI
Bille, P. & Gørtz, I.L. (2011). "The Tree Inclusion Problem: In Linear Space and Faster." ACM TALG 7(3). arXiv
Pinter, R.Y. et al. (2007). "Approximate Labelled Subtree Homeomorphism." J. Discrete Algorithms 5(4). DOI

Test plan

Build succeeds (CMake + make, macOS)
Smoke test: @1 (^name alice ^age 30) contained in @2 (^name alice ^age 30 ^city boston)
Test with cyclic structures (@1 ^next @1 vs @2 ^next @2)
Test with shared substructure across attributes
Verify no regression on existing smem unit tests

Periodically scan episodic memory for stable WME structures and write them to semantic memory as new LTIs. This implements the compose+test framework (Casteigts et al., 2019) for automatic episodic-to-semantic knowledge transfer — the operation Soar's long-term declarative stores have been missing. Algorithm: compose — union of constant WMEs currently active in epmem test — continuous presence >= consolidate-threshold episodes write — create smem LTI with qualifying augmentations via CLI_add New parameters (all under epmem): consolidate on/off (default off) consolidate-interval integer (default 100) — episodes between runs consolidate-threshold integer (default 10) — min episode persistence Deduplication via epmem_consolidated tracking table prevents repeated writes across consolidation runs. Table is dropped on reinit alongside other epmem graph tables. Off by default — zero behavior change until explicitly enabled. Limitations (deferred to follow-up): - Only consolidates constant-valued WMEs, not identifier edges - No back-invalidation across the WM/smem tier boundary - last-consolidation stat does not persist across agent reinit Motivation: Derbinsky & Laird (2013) proved forgetting is essential to Soar's scaling but only built it for working and procedural memory. Episodic and semantic memory have no eviction and no capacity bound. This patch addresses the first half: automatic semantic learning from episodic experience. With semantic entries derived from episodes, episodic eviction becomes safe (merged episodes leave no reconstruction debt), and R4's forgettable WME scope expands automatically. Reference: Casteigts et al. (2019), "Computing Parameters of Sequence-Based Dynamic Graphs," Theory of Computing Systems. Derbinsky & Laird (2013), "Effective and efficient forgetting of learned knowledge in Soar's working and procedural memories," Cognitive Systems Research. https://june.kim/prescription-soar — full prescription

After consolidation writes stable WMEs to smem, old episodes become redundant. Delete point entries and episode rows older than consolidate-evict-age episodes. This is safe: the consolidated knowledge is in smem, so there is no reconstruction debt. New parameter: consolidate-evict-age integer (default 0 = off) — min age before an episode is eligible for eviction Range and _now interval entries are preserved (they span multiple episodes). Only point entries and episode rows are removed. Reference: Derbinsky & Laird (2013), §5 — "forgotten working-memory knowledge may be recovered via deliberate reconstruction from semantic memory." Consolidation creates the semantic entries; eviction removes the source episodes that are no longer needed for reconstruction.

- Delete _range entries whose intervals end before the eviction cutoff (previously only _point entries were evicted, leaving dead weight) - Wrap all eviction DELETEs in BEGIN/COMMIT when lazy_commit is off for atomicity (when lazy_commit is on, already inside a transaction) Retrieval of evicted episodes is already safe: epmem_install_memory checks valid_episode and returns ^retrieved no-memory.

Implements Kilpeläinen-Mannila tree inclusion to detect which LTI entries are structurally dominated by others. Detection only — no eviction. Works at the raw hash level via web_expand to avoid Symbol allocation overhead.

…rror routing Codex review found four issues: 1. Greedy child matching could produce false negatives when first-fit blocks later required matches. Replaced with backtracking injective matcher. 2. Cycle detection conflated "currently exploring" with "proven to include". Split into separate active_pairs (recursion stack) and memo (proven results). Cycles now conservatively return false. 3. CLI_redundancy_check returned void, DoSMem always returned true. Changed to bool with SetError routing on failure. 4. help smem did not list --redundancy-check. Added.

@1

Round 2 codex review found two remaining issues: 1. Per-attribute a_used vectors allowed two distinct B nodes to map to the same A node via different attributes. Fixed by threading a global b_to_a map (B node → A node assignment) through all recursion. Backtracking undoes assignments on failure. 2. Active-pair cycle check returned false (pessimistic), which missed valid cyclic equivalences like @1 ^next @1 vs @2 ^next @2. Changed to coinductive (optimistic): revisiting an active pair returns true. If the assumption is wrong, non-cyclic proof obligations will fail.

Round 3 codex review found two issues: 1. Failed speculative branches leaked descendant b_to_a bindings. Fixed by snapshotting b_to_a before each branch and restoring on failure. Also pre-bind b_child before recursion so descendants see the intended assignment. 2. Memoization keyed by (lti_a, lti_b) was unsound because results depend on the current b_to_a context. Dropped memo entirely — smem entries are shallow (depth 1-2) so re-evaluation is cheap.

@b

Round 4 codex review found two issues: 1. Root of B was not pinned to root of A in the global assignment. Counterexample: @b ^next @b vs @A ^next @A1, @A1 ^next @A1 — B's root could map to A1 instead of A. Fixed by seeding b_to_a[lti_b] = lti_a before recursion. 2. smem --redundancy-check was missing from the runtime help screen in smem_settings.cpp. Added.

@1

…ated LTIs Mark phase reuses tree inclusion detection from SoarGroup#579. Sweep phase: 1. R4 safety check: skip LTIs currently referenced in working memory 2. Dependency-safe deletion: disconnect_ltm, then delete from all smem tables (augmentations, activation history, aliases, lti) 3. Per-invocation budget via optional numeric argument Includes functional test proving full eviction: add 3 LTIs where @1 is structurally dominated by @2, sweep, verify @1 is gone and @2/@3 survive. Post-sweep redundancy check confirms no remaining dominated entries.

kimjune01 · 2026-04-11T06:02:31Z

Retiring this PR. The assumptions it was built on were wrong:

D&L 2013 extrapolation. The prior diagnosis extrapolated from D&L 2013's WM/procedural forgetting to claim smem eviction was needed. D&L 2013 treats smem as the stable backup enabling WM forgetting (R4), not as a target for eviction.
Scaling motivation retired. Laird rates real-time capability "yes, yes" with "millions of items" (§10 item 3). The scaling-wall framing does not hold.
Wrong level of analysis. Smem redundancy checking and structural compaction assumed smem accumulates unboundedly in a way that degrades performance. The evidence does not support this at current hardware scales (Laird §10 items 3, 12).

See the re-intake at june.kim for the corrected SOAP analysis.

kimjune01 added 8 commits March 25, 2026 04:52

Add smem --redundancy-check for tree inclusion detection

e32d4d6

Implements Kilpeläinen-Mannila tree inclusion to detect which LTI entries are structurally dominated by others. Detection only — no eviction. Works at the raw hash level via web_expand to avoid Symbol allocation overhead.

kimjune01 marked this pull request as draft March 27, 2026 16:12

kimjune01 mentioned this pull request Mar 27, 2026

smem: add --sweep-dominated for destructive structural compaction (experimental, changes retrieval behavior) #580

Closed

7 tasks

kimjune01 changed the title ~~Add smem --redundancy-check (tree inclusion)~~ smem: add --redundancy-check for structural dominance (read-only) Apr 9, 2026

kimjune01 changed the title ~~smem: add --redundancy-check for structural dominance (read-only)~~ smem: add --redundancy-check for rooted structural containment (read-only) Apr 9, 2026

kimjune01 closed this Apr 11, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

smem: add --redundancy-check for rooted structural containment (read-only)#579

smem: add --redundancy-check for rooted structural containment (read-only)#579
kimjune01 wants to merge 8 commits intoSoarGroup:developmentfrom
kimjune01:smem-redundancy-check

kimjune01 commented Mar 27, 2026 •

edited

Loading

Uh oh!

kimjune01 commented Apr 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

kimjune01 commented Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this PR does

Algorithm

Limitations

Implementation

Implementation

References

Test plan

Uh oh!

kimjune01 commented Apr 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

kimjune01 commented Mar 27, 2026 •

edited

Loading