feat: add FTS5 full-text search #22

Merged
valiantone merged 1 commit into main from feat/1a-fts5 on Jun 1, 2026

@valiantone (Contributor) commented May 31, 2026

What

Add SQLite FTS5-backed keyword retrieval and use BM25 as the lexical component in hybrid ranking.

Changes:

  • Creates a memories_fts FTS5 virtual table backed by memories.fact_text.
  • Adds INSERT/UPDATE/DELETE triggers to keep the FTS index synchronized.
  • Rebuilds the FTS index on DB init so existing databases get populated.
  • Adds MemoryDB.fts_search(query) with raw BM25 scores.
  • Replaces naive whitespace keyword overlap in search_memories() with normalized BM25 scores.
  • Uses the porter unicode61 tokenizer and prefix query terms to improve stemming/prefix behavior.
  • Applies TTL filtering to FTS results, building on #21 (feat: add per-memory TTL expiry), so expired memories do not contribute lexical scores.
  • Adds coverage for FTS search, replace sync, hybrid ranking behavior, and the combined TTL/FTS path.
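
For readers who have not seen the diff, the index setup in the list above can be sketched roughly as follows. This is a minimal illustration, not the merged code: only `memories`, `memories.fact_text`, `memories_fts`, and the `porter unicode61` tokenizer come from this description. The `id` column, the trigger names, and the `init_db` helper are assumptions, and `fts_search` is written here as a free function rather than the `MemoryDB` method.

```python
import sqlite3

# Assumed schema: the id column and trigger names are hypothetical.
SCHEMA = """
CREATE TABLE IF NOT EXISTS memories (
    id INTEGER PRIMARY KEY,
    fact_text TEXT NOT NULL
);
-- External-content FTS5 table: the index reads its text from memories.
CREATE VIRTUAL TABLE IF NOT EXISTS memories_fts USING fts5(
    fact_text,
    content='memories',
    content_rowid='id',
    tokenize='porter unicode61'
);
CREATE TRIGGER IF NOT EXISTS memories_ai AFTER INSERT ON memories BEGIN
    INSERT INTO memories_fts(rowid, fact_text) VALUES (new.id, new.fact_text);
END;
CREATE TRIGGER IF NOT EXISTS memories_ad AFTER DELETE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, fact_text)
    VALUES ('delete', old.id, old.fact_text);
END;
CREATE TRIGGER IF NOT EXISTS memories_au AFTER UPDATE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, fact_text)
    VALUES ('delete', old.id, old.fact_text);
    INSERT INTO memories_fts(rowid, fact_text) VALUES (new.id, new.fact_text);
END;
"""


def init_db(conn):
    """Create the schema, then rebuild the index so pre-existing rows are covered."""
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO memories_fts(memories_fts) VALUES ('rebuild')")
    conn.commit()


def fts_search(conn, query, limit=10):
    """Return (rowid, raw_bm25) pairs, best match first.

    FTS5's bm25() is lower-is-better: stronger matches are more negative.
    """
    return conn.execute(
        "SELECT rowid, bm25(memories_fts) FROM memories_fts "
        "WHERE memories_fts MATCH ? ORDER BY bm25(memories_fts) LIMIT ?",
        (query, limit),
    ).fetchall()
```

The UPDATE trigger issues a `'delete'` command with the old text before inserting the new text, which is how an external-content FTS5 index is kept consistent when a row is replaced.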

Closes #2

Dependency

Resolved: this branch has been rebased onto main after #21 merged, so FTS now includes the TTL schema and filters.
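
The combined TTL/FTS path could look something like the sketch below. The `expires_at` column (epoch seconds, `NULL` meaning no expiry), the `to_prefix_query` and `live_fts_search` helpers, and the demo schema are all assumptions made for illustration; the actual column names and filter from #21 are not shown in this description.

```python
import sqlite3
import time


def make_demo_db():
    """In-memory database with an assumed expires_at column (hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE memories (
            id INTEGER PRIMARY KEY,
            fact_text TEXT NOT NULL,
            expires_at REAL  -- assumed: epoch seconds, NULL = never expires
        );
        CREATE VIRTUAL TABLE memories_fts USING fts5(
            fact_text, content='memories', content_rowid='id',
            tokenize='porter unicode61'
        );
        CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
            INSERT INTO memories_fts(rowid, fact_text)
            VALUES (new.id, new.fact_text);
        END;
        """
    )
    return conn


def to_prefix_query(text):
    """Quote each whitespace-separated term and mark it as an FTS5 prefix term."""
    terms = [t.replace('"', '""') for t in text.split()]
    return " ".join(f'"{t}"*' for t in terms)


def live_fts_search(conn, text, now=None):
    """BM25-ranked matches, excluding memories whose TTL has passed."""
    now = time.time() if now is None else now
    return conn.execute(
        """
        SELECT m.id, bm25(memories_fts) AS score
        FROM memories_fts
        JOIN memories AS m ON m.id = memories_fts.rowid
        WHERE memories_fts MATCH ?
          AND (m.expires_at IS NULL OR m.expires_at > ?)
        ORDER BY score
        """,
        (to_prefix_query(text), now),
    ).fetchall()
```

Quoting each term before appending `*` keeps user input from being parsed as FTS5 query operators, while still allowing prefix matches such as `mara` against `marathon`.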

Why

The old keyword score only split on whitespace and counted overlap. FTS5 gives HotMem a credible local lexical retrieval layer while keeping the existing API response shape and SQLite-only architecture.
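
To make the contrast concrete, here is a rough sketch of a whitespace-overlap score next to one possible BM25 normalization. The function names, the divide-by-best normalization, and the `alpha` blend weight are assumptions for illustration; the scoring actually used in `search_memories()` may differ.

```python
def naive_overlap(query, text):
    """Whitespace-split overlap, similar in spirit to the old keyword score."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def normalize_bm25(raw_scores):
    """Map raw FTS5 bm25() values (more negative = better) into [0, 1].

    Assumed scheme: flip the sign and divide by the strongest match.
    """
    strengths = {key: -value for key, value in raw_scores.items()}
    top = max(strengths.values(), default=0.0)
    if top <= 0.0:
        return {key: 0.0 for key in raw_scores}
    return {key: max(s, 0.0) / top for key, s in strengths.items()}


def hybrid_score(semantic, lexical, alpha=0.7):
    """Blend a semantic score with a lexical score; alpha is a hypothetical weight."""
    return alpha * semantic + (1.0 - alpha) * lexical
```

Note that `naive_overlap("running shoes", "The user runs daily")` scores zero because no whitespace token matches exactly, which is the kind of miss that stemming and prefix terms are meant to address.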

Verification

  • Tests pass: PYTHONPATH=src /Users/zubinj/forge/.knwb/bin/pytest -> 42 passed in 2.01s on Python 3.13.5
  • Lint passes: /Users/zubinj/forge/.knwb/bin/ruff check src/ tests/ -> All checks passed
  • Format check passes: /Users/zubinj/forge/.knwb/bin/ruff format --check src/ tests/ -> 18 files already formatted
  • No new external dependencies added to core

@valiantone added the enhancement and phase:1-search-quality (Phase 1: Credible Search Quality) labels May 31, 2026
@valiantone self-assigned this May 31, 2026
@valiantone requested a review from alphakenz May 31, 2026 09:57
Closes #2

Co-authored-by: Codex <codex@openai.com>
@valiantone merged commit 73a8906 into main Jun 1, 2026
4 checks passed
@valiantone deleted the feat/1a-fts5 branch June 1, 2026 09:15

Development

Successfully merging this pull request may close these issues.

1a: FTS5 full-text search

2 participants