
Lower memory thresholds, add dedup, store from free-mode #14

Merged: AaronGoldsmith merged 2 commits into main from fix/memory-thresholds-dedup on Mar 21, 2026

@AaronGoldsmith (Owner)

Summary

  • Lower similarity thresholds from 0.9/0.7 to 0.5/0.3 (real similarities are 0.3-0.5)
  • Add dedup check in memory.store() to prevent duplicate task+winner entries
  • Store memory entries from free-mode matches via record_verdict.py

Why

Memory was never influencing agent selection because the thresholds were higher than the similarity scores real embeddings produce (0.3-0.5). Dedup keeps duplicate entries from polluting the vector index.
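To illustrate why the old thresholds never fired, here is a minimal sketch of threshold-based strategy selection. Only the numeric thresholds (0.9/0.7 and 0.5/0.3) and the 0.3-0.5 similarity range come from this PR. The function name `choose_strategy`, the `"default"` fallback, the `>=` comparisons, and the mapping of the higher threshold to "specialist" and the lower one to "ensemble" are assumptions made for illustration, not the actual Mobius code.

```python
def choose_strategy(best_similarity: float, high: float, low: float) -> str:
    """Hypothetical mapping from the best memory similarity to a strategy."""
    if best_similarity >= high:
        return "specialist"  # assumed: strong match, trust the past winner
    if best_similarity >= low:
        return "ensemble"    # assumed: partial match, blend memory with defaults
    return "default"         # assumed: memory has no influence


OLD = {"high": 0.9, "low": 0.7}  # thresholds before this PR
NEW = {"high": 0.5, "low": 0.3}  # thresholds after this PR

# Similarities in the range the PR describes as realistic (0.3-0.5).
for sim in (0.30, 0.40, 0.50):
    print(sim, choose_strategy(sim, **OLD), choose_strategy(sim, **NEW))
```

Under these assumptions, every similarity in the realistic range falls through to the fallback with the old thresholds, while the new ones let memory participate.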

🤖 Generated with Claude Code

- Lower similarity thresholds from 0.9/0.7 to 0.5/0.3 so memory
  actually influences agent selection (real similarities are 0.3-0.5)
- Add dedup check in memory.store() to prevent duplicate task+winner entries
- Store memory entries from free-mode matches via record_verdict.py
  so /mobius-run competitions feed back into the selection system

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 21, 2026 16:27

Copilot AI left a comment


Pull request overview

Adjusts Mobius’s vector-memory behavior so past task outcomes can influence agent selection in practice, while reducing duplicate memory entries and capturing outcomes recorded via the judge script.

Changes:

  • Lowered similarity thresholds used to choose specialist/ensemble selection strategies.
  • Added a duplicate check in Memory.store() to skip repeated (task_text, winning_agent_id) entries.
  • Extended record_verdict.py to store a memory entry for the winning agent after recording a verdict.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/mobius/memory.py | Adds dedup logic before inserting memory entries. |
| src/mobius/config.py | Lowers similarity thresholds to make memory influence selection more often. |
| .claude/skills/mobius-judge/scripts/record_verdict.py | Stores memory entries when verdicts are recorded via the script. |
Comments suppressed due to low confidence (1)

src/mobius/memory.py:56

  • The SELECT-then-INSERT dedup check is not atomic. If two processes/threads call store() concurrently, both can pass the SELECT and insert duplicates. Consider enforcing a DB-level uniqueness constraint (e.g., UNIQUE index on (task_text, winning_agent_id)) and switching the insert to INSERT ... ON CONFLICT DO NOTHING / INSERT OR IGNORE to make dedup race-free and avoid the extra round-trip.
        existing = self.conn.execute(
            "SELECT id FROM memory WHERE task_text = ? AND winning_agent_id = ?",
            (entry.task_text, entry.winning_agent_id),
        ).fetchone()
        if existing:
            logger.debug(
                "Duplicate memory entry for agent %s on task, skipping",
                entry.winning_agent_id,
            )
            return

        row = dict_to_row(entry.model_dump(exclude={"task_embedding"}))
        cols = ", ".join(row.keys())
        placeholders = ", ".join(["?"] * len(row))
        self.conn.execute(
            f"INSERT INTO memory ({cols}) VALUES ({placeholders})",
            list(row.values()),
        )
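The race-free alternative suggested in this comment could look roughly like the sketch below. It assumes the store is SQLite accessed through Python's `sqlite3` module, which the `?` placeholders and the `INSERT OR IGNORE` suggestion hint at but the PR does not confirm. The reduced table schema, the index name `idx_memory_task_winner`, and the standalone `store()` function are hypothetical stand-ins for the real `Memory.store()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced schema: the real memory table has more columns.
conn.execute(
    "CREATE TABLE memory ("
    "id INTEGER PRIMARY KEY, task_text TEXT, winning_agent_id TEXT)"
)
# The uniqueness constraint lets the database enforce dedup atomically.
conn.execute(
    "CREATE UNIQUE INDEX IF NOT EXISTS idx_memory_task_winner "
    "ON memory (task_text, winning_agent_id)"
)


def store(conn: sqlite3.Connection, task_text: str, winning_agent_id: str) -> bool:
    """Insert one entry; return False if it already existed."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO memory (task_text, winning_agent_id) VALUES (?, ?)",
        (task_text, winning_agent_id),
    )
    conn.commit()
    # rowcount is 1 when a row was inserted and 0 when the insert was ignored.
    return cur.rowcount == 1
```

With the constraint in place, a second call with the same `(task_text, winning_agent_id)` pair is a no-op, and no separate SELECT round-trip is needed.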


Comment thread .claude/skills/mobius-judge/scripts/record_verdict.py Outdated
Comment thread .claude/skills/mobius-judge/scripts/record_verdict.py
Comment thread .claude/skills/mobius-judge/scripts/record_verdict.py Outdated
- Guard memory-write path with `if vec_available:` so Memory.store()
  is only called when vector search is operational
- Reuse existing task_embedding blob from the match row instead of
  re-embedding task_text via embed()
- Add two tests: vec_unavailable skips memory, existing embedding
  skips embed() call

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
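The guard and embedding reuse described in this commit can be sketched as follows. The names `vec_available`, `task_embedding`, `embed()`, and `Memory.store()` appear in the commit message; the helper `maybe_store_memory`, its signature, and the callables passed in are hypothetical, chosen only so that the example is self-contained.

```python
from typing import Callable, Optional


def maybe_store_memory(
    vec_available: bool,
    task_text: str,
    task_embedding: Optional[bytes],
    embed: Callable[[str], bytes],
    store: Callable[[str, bytes], None],
) -> bool:
    """Hypothetical helper: return True if a memory entry was written."""
    if not vec_available:
        # Vector search is not operational, so skip the memory write entirely.
        return False
    # Prefer the embedding blob already saved on the match row;
    # fall back to embed() only when none exists.
    embedding = task_embedding if task_embedding is not None else embed(task_text)
    store(task_text, embedding)
    return True


# Usage with stand-in callables (assumptions, not the real Mobius functions).
written = []
wrote = maybe_store_memory(
    vec_available=True,
    task_text="sort a list",
    task_embedding=b"cached",
    embed=lambda text: b"fresh",
    store=lambda text, emb: written.append((text, emb)),
)
print(wrote, written)
```

The two tests named in the commit correspond to the two branches here: the early return when `vec_available` is false, and the path where an existing embedding makes the `embed()` call unnecessary.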
@AaronGoldsmith AaronGoldsmith marked this pull request as ready for review March 21, 2026 17:08
@AaronGoldsmith AaronGoldsmith merged commit 550bb5c into main Mar 21, 2026
2 checks passed
@AaronGoldsmith AaronGoldsmith deleted the fix/memory-thresholds-dedup branch March 21, 2026 17:31