feat(smart-search): boost title/narrative matches on 'who/what is X' queries by efenex · Pull Request #571 · rohitg00/agentmemory

efenex · 2026-05-20T15:26:00Z

Summary

For named-concept queries ("who is the careful generator?", "what is a circuit breaker", "what does eventual consistency mean?"), the BM25 hybrid ranker scores busier observations above records that name the concept directly — question scaffolding tokens ("who", "is", "the") add noise that dilutes the true match signal. The record that defines the concept ranks below records that mention it incidentally.

What it does

1. Detect the query as a named-concept pattern via 5 regexes (`/who is/`, `/what is/`, `/what's/`, `/what does X mean/`, `/who's/`). Skip if no match.
2. Extract the concept phrase (e.g. "careful generator"). Reject degenerate phrases: single tokens shorter than 3 chars (`it`, `x`) and phrases longer than 6 tokens.
3. Deepen the BM25 sweep to `limit*3` so the boost has candidates to re-rank (a boost on a top-10 set has limited room to move records around).
4. Re-rank with multiplicative boosts:
   - Title contains the phrase → 2.0×
   - Narrative contains the phrase → 1.3×
5. Same treatment for lessons whose content contains the phrase (2.0×).
6. Re-sort by combined score, trim to original `limit`.
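
The detection and extraction steps above could look roughly like the sketch below. The exact regex sources, the anchoring, the lowercasing, and the leading-article stripping are assumptions made for illustration (the description names the five patterns but does not show them); only the function name `extractNamedConcept` and the rejection thresholds come from this PR.

```typescript
// Hypothetical patterns: the PR lists five query shapes but not their exact regexes.
const NAMED_CONCEPT_PATTERNS: RegExp[] = [
  /^who is\s+(.+)$/i,
  /^who's\s+(.+)$/i,
  /^what is\s+(.+)$/i,
  /^what's\s+(.+)$/i,
  /^what does\s+(.+?)\s+mean$/i,
];

function extractNamedConcept(query: string): string | null {
  // Drop trailing punctuation so "generator?" does not end up in the phrase.
  const cleaned = query.trim().replace(/[?!.\s]+$/, "");
  for (const pattern of NAMED_CONCEPT_PATTERNS) {
    const match = pattern.exec(cleaned);
    if (!match) continue;
    // Assumed normalization: lowercase and strip one leading article,
    // so "the careful generator" becomes "careful generator".
    const phrase = (match[1] ?? "")
      .trim()
      .toLowerCase()
      .replace(/^(the|a|an)\s+/, "");
    const tokens = phrase.split(/\s+/).filter(Boolean);
    // Reject degenerate phrases: empty, longer than 6 tokens,
    // or a single token shorter than 3 characters.
    if (tokens.length === 0 || tokens.length > 6) return null;
    if (tokens.length === 1 && phrase.length < 3) return null;
    return phrase;
  }
  // Not a named-concept query: the caller skips the boost entirely.
  return null;
}
```

Returning `null` for anything that does not match keeps the non-concept path trivially unchanged, since the caller only needs a single null check.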

Non-named-concept queries are untouched.
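
A minimal sketch of the over-fetch, boost, re-sort, and trim steps, assuming a simplified `ScoredObservation` shape and case-insensitive substring matching (both hypothetical). The multipliers are the ones stated above, and the compounding of title and narrative matches follows the behavior settled on later in this thread. `fetchSize` and `boostAndTrim` are illustrative names, not functions from the repository.

```typescript
// Simplified, hypothetical result shape for illustration.
interface ScoredObservation {
  title: string;
  narrative: string;
  combinedScore: number;
}

const NAMED_CONCEPT_TITLE_BOOST = 2.0;
const NAMED_CONCEPT_BODY_BOOST = 1.3;

// Fetch three times as many candidates when a concept was detected,
// so the re-rank has records to promote. Any upper cap is omitted here.
function fetchSize(limit: number, concept: string | null): number {
  return concept === null ? limit : limit * 3;
}

function boostAndTrim(
  candidates: ScoredObservation[],
  concept: string | null,
  limit: number,
): ScoredObservation[] {
  // No named concept: keep the ranker's ordering untouched.
  if (concept === null) return candidates.slice(0, limit);

  const phrase = concept.toLowerCase();
  const boosted = candidates.map((r) => {
    let mult = 1;
    if (r.title.toLowerCase().includes(phrase)) mult *= NAMED_CONCEPT_TITLE_BOOST;
    if (r.narrative.toLowerCase().includes(phrase)) mult *= NAMED_CONCEPT_BODY_BOOST;
    return mult === 1 ? r : { ...r, combinedScore: r.combinedScore * mult };
  });

  // Re-sort by boosted score (descending), then trim to the caller's limit.
  return boosted.sort((a, b) => b.combinedScore - a.combinedScore).slice(0, limit);
}
```

With this shape, a defining record scored 0.6 whose title names the concept is lifted to 1.2 and overtakes an unrelated record scored 1.0.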

Why this lives in smart-search and not lineage

`mem::lineage` is chronologically-ordered and multi-channel; this is a ranking concern that affects the primary recall path (smart-search), which is what `memory_recall` / `memory_smart_search` MCP tools land on. Lineage benefits from upstream improvements in BM25 score, so this lift propagates.

Test plan

`npm test` passes
New unit tests for `extractNamedConcept` (7 cases) — pattern matching, degenerate-phrase rejection
New integration test that proves the boost re-ranks: an observation whose title contains "careful generator" but has a lower BM25 score than a busier unrelated observation gets moved to rank #1
Non-named-concept query preserves original ordering (regression test)

Related

Discovered while working on the "careful generator" test case in #570 (`mem::lineage`), where it is documented as Gap 4.
Independent of #414 (time-range filtering for smart-search): different concern, different lines.

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Concept-aware search: queries like "who is/what is/what does…" are detected and re-rank results so matching observations and lessons surface higher for concept-focused queries.
- Lessons receive additional boosting when their content matches the detected concept, improving relevance for conceptual queries.
Tests
- Added tests covering concept extraction and verifying that conceptual queries trigger expected re-ranking while non-concept queries keep original order.

vercel · 2026-05-20T15:26:04Z

@efenex is attempting to deploy a commit to the rohitg00's projects Team on Vercel.

A member of the Team first needs to authorize it.

coderabbitai · 2026-05-20T15:26:12Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a835dd0f-832c-4bab-983f-d009646ac476

📥 Commits

Reviewing files that changed from the base of the PR and between 997d25d and 3a1f8e7.

📒 Files selected for processing (3)

src/functions/smart-search.ts
src/types.ts
test/smart-search.test.ts

💤 Files with no reviewable changes (3)

src/types.ts
test/smart-search.test.ts
src/functions/smart-search.ts

📝 Walkthrough

Adds extraction of named concepts from "who is / what is / what does X mean" queries and uses the concept to expand observation fetches and apply multiplicative boosts when the concept appears in observation titles/narratives or lesson content, then re-sorts and trims results.

Changes

Named-Concept Query Detection and Ranking Boost

| Layer / File(s) | Summary |
| --- | --- |
| Named-concept extraction and boost constants: `src/functions/smart-search.ts`, `test/smart-search.test.ts` | `extractNamedConcept()` parses "who is/what is/what does ... mean / what's ..." queries with regex, trims punctuation, filters degenerate token-length matches, and defines title/body boost multipliers. Unit tests verify extraction success and null cases. |
| Smart-search pipeline boost and re-ranking: `src/functions/smart-search.ts`, `src/types.ts`, `test/smart-search.test.ts` | `mem::smart-search` derives `namedConcept`, increases observation fetch size when present, runs hybrid observation search and `recallLessons()` (passing `boostPhrase`) in parallel, sets `CompactLessonResult.boostMatched`, applies multiplicative boosts to observation `combinedScore` (title/narrative) and lesson `score` (boostMatched or content match fallback), re-sorts and truncates results back to `limit`. Integration tests assert boosted re-ranking and stable ordering for non-matching queries. |

Sequence Diagram

sequenceDiagram
  participant Query
  participant extractNamedConcept
  participant hybridSearch
  participant lessonRecall
  participant boostProcessor
  participant returnSorted

  Query->>extractNamedConcept: parse query -> concept|null
  extractNamedConcept-->>Query: concept|null
  Query->>hybridSearch: run observation search (expanded limit if concept)
  Query->>lessonRecall: run lesson recall (pass boostPhrase)
  hybridSearch-->>boostProcessor: observations with combinedScore
  lessonRecall-->>boostProcessor: lessons with boostMatched flag
  boostProcessor->>boostProcessor: multiply observation combinedScore for title/narrative matches
  boostProcessor->>boostProcessor: multiply lesson score when boostMatched or content includes concept
  boostProcessor->>returnSorted: re-sort and truncate to limit
  returnSorted-->>Query: final observations and lessons

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

rohitg00/agentmemory#473: Adds compact lesson inclusion and recallLessons/CompactLessonResult plumbing that this PR extends with boostMatched and named-concept ranking.

Poem

🐰 I sniff a phrase beneath the moonlit log,

"Who is" I twitch — a hopping catalog.
Titles sparkle where the concept lands,
Lessons hum like clapping hands.
Hooray — a carrot-coded search that stands!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding a named-concept detection and boosting mechanism for 'who/what is X' style queries in smart-search. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

test/smart-search.test.ts (1)
331-335: ⚡ Quick win

Tighten this assertion so dual-match regressions actually fail.

obsNamed already contains "careful generator" in both title and narrative, but the test only asserts score > 1.0. That still passes with a single applied boost, so it won't catch the bug in the new re-ranker. Either remove the narrative match from the fixture for a pure title-only case, or assert the full expected multiplier for a dual-match case.

Also applies to: 387-389
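
One way to tighten the assertion, sketched under the assumption that the boosts are meant to compound (title 2.0× and narrative 1.3×): compare the boosted score against the exact expected product instead of a loose `> 1.0` bound. `expectedMultiplier` and `assertBoosted` are hypothetical helpers, not part of the test file under review.

```typescript
const TITLE_BOOST = 2.0;
const BODY_BOOST = 1.3;

// Expected multiplier when boosts compound: each matching field contributes its factor.
function expectedMultiplier(titleMatch: boolean, narrativeMatch: boolean): number {
  return (titleMatch ? TITLE_BOOST : 1) * (narrativeMatch ? BODY_BOOST : 1);
}

// Throws unless the boosted score equals base * expected multiplier (within a tolerance).
function assertBoosted(
  baseScore: number,
  boostedScore: number,
  titleMatch: boolean,
  narrativeMatch: boolean,
): void {
  const expected = baseScore * expectedMultiplier(titleMatch, narrativeMatch);
  if (Math.abs(boostedScore - expected) > 1e-9) {
    throw new Error(`expected boosted score ${expected}, got ${boostedScore}`);
  }
}
```

For a fixture that matches in both title and narrative, an implementation that stops at the title boost produces `base * 2.0` and fails this check, whereas it would still pass `> 1.0`.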
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/smart-search.test.ts` around lines 331 - 335, The test fixture obsNamed
created via makeObs currently contains "careful generator" in both title and
narrative, which makes the weak assertion (score > 1.0) insufficient; either
remove the phrase from the narrative so the fixture is a title-only match and
keep the simple assertion, or tighten the assertion to check the full expected
boosted score for a dual-match (compute and assert the exact expected
multiplier/threshold instead of >1.0). Update the corresponding duplicate
assertions mentioned (around the second occurrence at lines 387-389) to use the
same fix and reference obsNamed/makeObs when locating the fixture and
assertions.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/functions/smart-search.ts`:
- Around line 151-156: The current boost logic uses the truncated preview in
rawLessons, so named-concept matching misses occurrences beyond the 240-char
cutoff; update the scoring to operate on the full lesson text before any preview
truncation by either (A) running this phrase includes check against the
untruncated field returned by recallLessons (e.g., use the original full content
property such as fullContent or contentFull instead of the previewed content) or
(B) change recallLessons to preserve a fullContent field on each lesson and use
that field in the map that adjusts score (referencing rawLessons, lessons,
phrase, and NAMED_CONCEPT_TITLE_BOOST). Ensure the boost is applied using the
full text and only truncate for presentation after ranking is complete.
- Around line 143-145: The current logic in smart-search that sets mult using an
if/else if (checking title.includes(phrase) then else if
narrative.includes(phrase)) prevents applying both NAMED_CONCEPT_TITLE_BOOST and
NAMED_CONCEPT_BODY_BOOST when both title and narrative match; change it to
compute the multiplier by starting mult = 1 and multiplying by
NAMED_CONCEPT_TITLE_BOOST if title.includes(phrase) and by
NAMED_CONCEPT_BODY_BOOST if narrative.includes(phrase), then return r unchanged
when mult === 1 else return { ...r, combinedScore: r.combinedScore * mult } so
dual matches get the product of both boosts (use the existing symbols title,
narrative, phrase, mult, NAMED_CONCEPT_TITLE_BOOST, NAMED_CONCEPT_BODY_BOOST, r,
combinedScore).

---

Nitpick comments:
In `@test/smart-search.test.ts`:
- Around line 331-335: The test fixture obsNamed created via makeObs currently
contains "careful generator" in both title and narrative, which makes the weak
assertion (score > 1.0) insufficient; either remove the phrase from the
narrative so the fixture is a title-only match and keep the simple assertion, or
tighten the assertion to check the full expected boosted score for a dual-match
(compute and assert the exact expected multiplier/threshold instead of >1.0).
Update the corresponding duplicate assertions mentioned (around the second
occurrence at lines 387-389) to use the same fix and reference obsNamed/makeObs
when locating the fixture and assertions.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c24364d3-8993-4417-a12c-9c0c02cd7c30

📥 Commits

Reviewing files that changed from the base of the PR and between 93d1bdd and d1fcb71.

📒 Files selected for processing (2)

src/functions/smart-search.ts
test/smart-search.test.ts

…l content

CodeRabbit caught two issues on rohitg00#571:

1. The boost branch used `if (title) ... else if (narrative) ...`, capping observations that contain the concept in BOTH fields at the title-only 2.0× multiplier. The feature is specified as multiplicative, so title-and-narrative matches now compound to 2.0 × 1.3 = 2.6×. Single-field matches behave as before.
2. The lesson boost path was scanning the 240-char preview emitted by recallLessons, not the lesson's full pre-truncation content. Any concept that appeared past the preview boundary silently missed the boost. Fix: thread the concept phrase into recallLessons via a new `boostPhrase` parameter. The function now decides the match against `content + context` BEFORE truncation, stamps each result with `boostMatched: boolean`, and the smart-search caller uses that flag instead of re-scanning the preview.

`boostMatched` added as an optional field on CompactLessonResult. Callers that don't pass `boostPhrase` get `boostMatched: false`; the smart-search caller falls back to scanning the (truncated) content for the phrase if `boostMatched` is absent, preserving the pre-fix behavior for any non-smart-search caller of recallLessons.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
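
The second fix can be illustrated with the sketch below: the match is decided on the full text before the preview is cut, and the caller trusts the flag when it is present. The `Lesson` shape, the helper names `compactLessons` and `boostLessons`, and the case-insensitive matching are assumptions for illustration; only `boostPhrase`, `boostMatched`, `CompactLessonResult`, the 240-char preview, and the 2.0× lesson boost come from this thread.

```typescript
// Hypothetical input shape; the real lesson record likely has more fields.
interface Lesson {
  content: string;
  context?: string;
  score: number;
}

interface CompactLessonResult {
  content: string; // truncated preview
  score: number;
  boostMatched?: boolean;
}

const PREVIEW_CHARS = 240;
const LESSON_BOOST = 2.0;

function compactLessons(lessons: Lesson[], boostPhrase?: string): CompactLessonResult[] {
  const phrase = boostPhrase?.toLowerCase();
  return lessons.map((lesson) => {
    // Decide the match on content + context BEFORE truncating.
    const fullText = `${lesson.content} ${lesson.context ?? ""}`.toLowerCase();
    return {
      content: lesson.content.slice(0, PREVIEW_CHARS),
      score: lesson.score,
      boostMatched: phrase !== undefined && fullText.includes(phrase),
    };
  });
}

function boostLessons(results: CompactLessonResult[], phrase: string): CompactLessonResult[] {
  const needle = phrase.toLowerCase();
  return results
    .map((l) => {
      // Prefer the flag; fall back to the truncated preview only when it is absent.
      const matched = l.boostMatched ?? l.content.toLowerCase().includes(needle);
      return matched ? { ...l, score: l.score * LESSON_BOOST } : l;
    })
    .sort((a, b) => b.score - a.score);
}
```

Truncation stays a presentation concern here: ranking sees the full text through `boostMatched`, and only the preview is returned to the caller.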

rohitg00

Audited locally: build clean, 5 new tests pass (1086 total). Self-contained regex + boost multipliers, well-documented, fallback path leaves untouched queries unchanged. Multiplicative (title 2.0 × body 1.3 = 2.6×) when both match — CodeRabbit had already caught the prior else-if cap. Ready to merge.

…queries

For identity/definition queries ("who is the careful generator?", "what is mem::lineage?"), extract the named concept and boost hybrid + lesson hits whose title/narrative names it directly, so the defining record outranks incidental mentions. Over-fetches (3x, capped) before boosting so the rerank has headroom; composes with the upstream agentId filter (filter first, then boost, then trim).

Rebased on current main.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

efenex · 2026-06-05T01:10:00Z

Rebased on current main. Composes with the upstream agentId filter — filters by agent first, then applies the named-concept boost, then trims to limit. Tests green; the only red check is the Vercel fork-preview.
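
The ordering described here (filter, then boost, then trim) might be sketched as follows. The `Hit` shape and the `filterBoostTrim` name are hypothetical, and only the title boost is shown to keep the example short.

```typescript
// Hypothetical hit shape for illustration.
interface Hit {
  agentId: string;
  title: string;
  combinedScore: number;
}

const TITLE_BOOST = 2.0;

function filterBoostTrim(
  hits: Hit[],
  limit: number,
  agentId?: string,
  concept?: string,
): Hit[] {
  // 1. Scope to the requested agent first, so other agents' records never take a slot.
  const scoped = agentId === undefined ? hits : hits.filter((h) => h.agentId === agentId);

  // 2. Boost records whose title names the concept.
  const phrase = concept?.toLowerCase();
  const boosted =
    phrase === undefined
      ? scoped
      : scoped.map((h) =>
          h.title.toLowerCase().includes(phrase)
            ? { ...h, combinedScore: h.combinedScore * TITLE_BOOST }
            : h,
        );

  // 3. Trim last, after the boost has had a chance to reorder the candidates.
  return [...boosted].sort((a, b) => b.combinedScore - a.combinedScore).slice(0, limit);
}
```

Trimming before the boost would risk cutting the defining record out of the candidate set before it could be promoted, which is why the trim comes last.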

coderabbitai Bot reviewed May 20, 2026

rohitg00 approved these changes May 27, 2026

efenex force-pushed the feat/v4-b-smart-search-named-concept-boost branch from 997d25d to 3a1f8e7 on June 5, 2026 00:53

efenex wants to merge 1 commit into rohitg00:main from efenex:feat/v4-b-smart-search-named-concept-boost
