Releases: rmk40/opencode-session-recall

v0.12.1

@github-actions github-actions released this 21 Jun 00:06

A bug-fix release. Search worked, but retrieving and browsing the results was
broken, so after recall found a hit, reading or paginating it could come
back empty. Searching for things you can't then retrieve is no use; this fixes
the retrieval half of the flow.

Fixed

  • Tool-argument defaults were not applied, breaking the browse/retrieve
    tools.
    opencode passes the model's raw argument object to a plugin tool's
    execute; it validates against the Zod schema but does not feed back the
    parsed value, so schema .default()s are never materialized. When the model
    omitted role, recall_messages saw role: undefined, its role filter
    matched nothing, and a fully-loaded session reported total: 0 with no
    messages (verified live: 590 messages fetched, 0 after the filter). The same
    missing-default behavior left recall_context with NaN slice bounds and
    could make recall_get throw on an undefined messageID. recall (search)
    already coerced its own args defensively; the four browse/retrieve tools now
    do too, via shared coerceEnum / coerceBool / coerceInt helpers plus
    required-argument guards. Impact: retrieving or browsing a session —
    including cross-project results that recall surfaces — now works regardless
    of which optional args the caller sends.
  • recall could surface its own prior output under renamed tool calls. When
    tool names are namespaced upstream (e.g. mcp__…__recall), they slipped past
    the self-exclusion guard, so recall could match earlier recall results; and
    inline expand could include a nearby recall call's output. Both are now
    excluded from search and redacted from expansion, matching by suffix so
    namespaced variants are caught. Explicit recall_get / recall_context
    remain full-fidelity.
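
The missing-default failure and its fix can be sketched as follows. This is a minimal illustration, not the plugin's code: the helper names come from the notes above, but their signatures, the role values ("all", "user", "assistant"), and the filterByRole and requireString functions are assumptions.

```typescript
// Hypothetical signatures: only the helper names appear in the release notes.
function coerceEnum<T extends string>(
  value: unknown,
  allowed: readonly T[],
  fallback: T,
): T {
  return typeof value === "string" &&
    (allowed as readonly string[]).includes(value)
    ? (value as T)
    : fallback;
}

function coerceBool(value: unknown, fallback: boolean): boolean {
  return typeof value === "boolean" ? value : fallback;
}

function coerceInt(
  value: unknown,
  fallback: number,
  min: number,
  max: number,
): number {
  const n = typeof value === "number" ? value : Number.NaN;
  if (!Number.isFinite(n)) return fallback;
  return Math.min(max, Math.max(min, Math.trunc(n)));
}

// Required-argument guard (hypothetical name): fail with a clear message
// instead of throwing later on an undefined value.
function requireString(value: unknown, name: string): string {
  if (typeof value !== "string" || value.length === 0) {
    throw new Error(`Missing required argument: ${name}`);
  }
  return value;
}

// Assumed role values, for illustration only.
const ROLES = ["all", "user", "assistant"] as const;
type Role = (typeof ROLES)[number];

interface StoredMessage {
  role: "user" | "assistant";
  text: string;
}

// rawArgs is the unparsed object from the model; optional keys may be absent.
function filterByRole(
  messages: StoredMessage[],
  rawArgs: Record<string, unknown>,
): StoredMessage[] {
  // Without coercion, an absent role would be undefined and match nothing.
  const role: Role = coerceEnum(rawArgs["role"], ROLES, "all");
  return role === "all" ? messages : messages.filter((m) => m.role === role);
}
```

With this shape, filterByRole(messages, {}) returns every message rather than none, which is the behavior the fix restores.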

Verified end-to-end by driving a fresh opencode session against real history.
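
The suffix matching described under Fixed might look like the sketch below. The "__" separator is inferred from the mcp__…__recall example and the function name is hypothetical; the actual guard may differ.

```typescript
// Assumption: namespaced tool names end in "__" followed by the bare name.
const SELF_TOOL = "recall";

function isRecallSearchCall(toolName: string): boolean {
  // Match the bare name or any namespaced variant that ends with it.
  return toolName === SELF_TOOL || toolName.endsWith(`__${SELF_TOOL}`);
}
```

Because the check is anchored to the end of the name, recall_get and recall_context do not match, which fits the note that explicit retrieval remains full-fidelity.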

Latest Snapshot

Pre-release

@github-actions github-actions released this 21 Jun 01:20

Rolling pre-release tracking the latest main commit.

Commit: 06b3bbf
Date: 2026-06-21T01:20:33Z

This is not a stable release. Use a versioned tag (e.g., v0.1.0) for production.

v0.12.0

@github-actions github-actions released this 18 Jun 21:08

This release rebuilds how recall ranks results and adds the ability for the
agent to reach for its history on its own. No existing tool parameter changed
its meaning, so upgrades are drop-in.

Highlights

  • Relevance ranking is now BM25 instead of fuzzy string matching. smart
    and fuzzy search are powered by an in-memory MiniSearch
    BM25 index built per query. BM25 weights rare, discriminative terms over
    common boilerplate and normalizes for document length, so a short message that
    is actually about your query beats a long log that merely mentions the words.
  • Proactive recall. Three configurable features help the agent search history
    when it should, instead of waiting to be told: a default-on system-prompt
    nudge, and two opt-in hooks (autoRecall, compactionRecall).
  • regex match mode for exact shapes — error codes, stack traces, file
    paths, IDs, URLs.
  • Result diversity so one noisy session can't flood a result list.
  • Query-shape routing that suggests regex when a query looks like a
    pattern, without ever overriding the caller.
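
To illustrate why BM25 favors a short, on-topic message over a long log, here is a minimal self-contained Okapi BM25 scorer. It is a textbook sketch, not MiniSearch's implementation: the IndexedDoc type, the function name, and the k1 = 1.2, b = 0.75 defaults are assumptions chosen for illustration.

```typescript
interface IndexedDoc {
  id: string;
  tokens: string[]; // duplicates kept so term frequency is meaningful
}

// Textbook Okapi BM25. k1 controls term-frequency saturation and
// b controls how strongly long documents are penalized.
function bm25Scores(
  docs: IndexedDoc[],
  query: string[],
  k1 = 1.2,
  b = 0.75,
): Map<string, number> {
  const n = docs.length;
  const totalLen = docs.reduce((sum, d) => sum + d.tokens.length, 0);
  const avgLen = totalLen / Math.max(n, 1);
  const scores = new Map<string, number>();

  for (const term of Array.from(new Set(query))) {
    // Document frequency: how many documents contain the term at all.
    const df = docs.filter((d) => d.tokens.includes(term)).length;
    if (df === 0) continue;
    // Rare terms get a larger inverse document frequency.
    const idf = Math.log(1 + (n - df + 0.5) / (df + 0.5));

    for (const d of docs) {
      const tf = d.tokens.filter((t) => t === term).length;
      if (tf === 0) continue;
      // Length normalization: longer-than-average documents score lower.
      const norm = tf + k1 * (1 - b + b * (d.tokens.length / avgLen));
      const termScore = idf * ((tf * (k1 + 1)) / norm);
      scores.set(d.id, (scores.get(d.id) ?? 0) + termScore);
    }
  }
  return scores;
}
```

Given a three-token message and a fifty-token log that each mention the query terms once, the shorter document receives the higher score because of the length normalization term.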

Expected effectiveness

The ranker change is measured, not asserted. A new relevance eval harness
(test/eval/) scores a labeled corpus of eight retrieval cases that exercise
the situations recall is for: rare-term recall, prior-decision recall, vague
"same as before" recall, typo tolerance, exact-phrase preference, cross-project
recall, and old-but-strong vs. recent-but-weak ranking.

Ranker                 MRR    recall@5
Previous (Fuse.js)     0.50   0.50
BM25 (this release)    1.00   1.00

The previous ranker returned nothing on four of the eight cases (exact
phrase, cross-project error, old-strong-vs-recent-weak, and long-document
competition). BM25 returns the correct session at rank 1 for all eight. The
eval is wired into npm run check as a regression gate, so future ranking
changes must meet or beat these numbers.
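
For readers unfamiliar with the two metrics, the sketch below shows how MRR and recall@k are conventionally computed when each case has a single expected session. The EvalCase shape and function names are assumptions; the harness in test/eval/ may be structured differently.

```typescript
interface EvalCase {
  expected: string; // the session that should be retrieved
  ranked: string[]; // session IDs returned by the ranker, best first
}

// Reciprocal rank: 1 for a hit at rank 1, 1/2 at rank 2, 0 if absent.
function reciprocalRank(c: EvalCase): number {
  const idx = c.ranked.indexOf(c.expected);
  return idx === -1 ? 0 : 1 / (idx + 1);
}

function meanReciprocalRank(cases: EvalCase[]): number {
  if (cases.length === 0) return 0;
  return cases.reduce((sum, c) => sum + reciprocalRank(c), 0) / cases.length;
}

// Fraction of cases whose expected session appears in the top k results.
function recallAtK(cases: EvalCase[], k: number): number {
  if (cases.length === 0) return 0;
  const hits = cases.filter((c) => c.ranked.slice(0, k).includes(c.expected));
  return hits.length / cases.length;
}
```

Under these definitions, a ranker that returns nothing on four of eight cases and the correct session first on the other four scores 0.50 on both metrics, which is consistent with the Fuse.js row above.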

Practical effect: queries that name a specific symbol, error string, file, or
decision now rank the right hit at or near the top far more reliably, and broad
queries no longer get drowned out by long, boilerplate-heavy tool output.

Added

  • BM25 ranking (smart, fuzzy) via MiniSearch, replacing Fuse.js.
    Structural boosts (exact phrase, full token coverage, reasoning traces, error
    output, user messages, recency) and penalties (weak single-token fuzzy, poor
    coverage) are layered on the BM25 base score as multipliers. Scores are
    reported 0–1.
  • match: "regex" — bounded regular-expression scan over message and tool
    content. Invalid patterns return a clear error instead of silently matching
    nothing.
  • Result diversity — in part-grouped results, a single session's share of
    the initial result list is capped so it can't crowd out other sessions;
    held-back hits backfill if room remains.
  • Query routing — when a literal query looks like a regular expression, the
    response includes a non-overriding suggestion to use match: "regex".
  • Proactive recall options:
    • nudge (default on): adds a short system-prompt reminder to search
      history when you reference prior work. Text only — a few tokens per request,
      no latency, no I/O.
    • autoRecall (default off): when a message clearly references earlier
      work ("last time", "what did we decide", "same as before", "previously"),
      runs a bounded recall and injects the top one to three cited hits into the
      agent's context before it answers. Hard-bounded to 1.5s and a capped session
      scan so it can never stall a turn; stays quiet when it finds nothing.
    • compactionRecall (default off): before a session is compacted, pulls
      the strongest durable findings from that session and appends them to the
      compaction prompt so the summary preserves them.
  • Relevance eval harness (test/eval/) with a labeled corpus, MRR and
    recall@5 metrics, and a locked baseline that gates npm run check.
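
The result-diversity rule in the list above (cap each session's share, then backfill) can be sketched like this. The Hit shape, the function name, and the per-session cap parameter are assumptions; the notes do not state the actual cap value.

```typescript
interface Hit {
  sessionID: string;
  score: number;
}

// Cap each session's share of the list, then backfill with held-back
// hits if fewer than `limit` results were picked.
function diversify(hits: Hit[], limit: number, maxPerSession: number): Hit[] {
  const sorted = [...hits].sort((a, b) => b.score - a.score);
  const picked: Hit[] = [];
  const heldBack: Hit[] = [];
  const perSession = new Map<string, number>();

  for (const hit of sorted) {
    const count = perSession.get(hit.sessionID) ?? 0;
    if (count < maxPerSession) {
      picked.push(hit);
      perSession.set(hit.sessionID, count + 1);
    } else {
      heldBack.push(hit);
    }
  }

  const result = picked.slice(0, limit);
  for (const hit of heldBack) {
    if (result.length >= limit) break;
    result.push(hit); // backfilled hits are appended after the diverse ones
  }
  return result;
}
```

In this sketch backfilled hits land at the end of the list rather than being re-sorted by score, which is a design choice of the example, not something the notes specify.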

Changed

  • smart/fuzzy no longer have a "degraded mode" that silently switched
    ranking algorithms under load. A time budget still applies, but it only flags
    elevated latency (degradeKind: "time") — the ranking itself is unchanged.
  • Tokenization is split: a duplicate-preserving tokenizer feeds the BM25 index
    (so term frequency is meaningful), while a deduplicated tokenizer backs
    set-membership checks.
  • README reorganized so the value proposition and install come first and the
    agent-facing reference is grouped at the end. CONTRIBUTING's architecture
    section rewritten for the BM25 pipeline, the three execution paths, and the
    invocation hooks.
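
The tokenizer split under Changed can be pictured as two thin wrappers over one splitting rule. This is a sketch under assumptions: the function names and the lowercase ASCII splitting rule are illustrative and not taken from the plugin.

```typescript
// Duplicate-preserving tokenizer: feeds an index where term frequency matters.
function tokenizeForIndex(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9_]+/)
    .filter((t) => t.length > 0);
}

// Deduplicated tokenizer: backs set-membership checks such as coverage.
function tokenizeForSet(text: string): Set<string> {
  return new Set(tokenizeForIndex(text));
}

// Example membership check: does the document contain every query token?
function coversAllTokens(query: string, document: string): boolean {
  const docTokens = tokenizeForSet(document);
  return Array.from(tokenizeForSet(query)).every((t) => docTokens.has(t));
}
```

Keeping both forms avoids re-deduplicating on every coverage check while still letting repeated terms count toward term frequency in the index.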

Removed

  • Fuse.js dependency and the legacy fuse / prefilter / rank modules.
    MiniSearch's built-in fuzzy matching covers typo tolerance.

Compatibility

  • All existing recall parameters keep their meaning; match gains a new
    "regex" value. The score field on results is now BM25-derived (still
    0–1). nudge is on by default; autoRecall and compactionRecall are
    opt-in. No configuration changes are required to upgrade.
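
As a rough illustration of the new "regex" mode's contract (bounded scan, clear error on an invalid pattern), consider the sketch below. The function names, result shapes, and both size caps are assumptions; the notes do not say how the scan is bounded.

```typescript
type CompileResult =
  | { ok: true; regex: RegExp }
  | { ok: false; error: string };

// Compile defensively so a bad pattern yields an error message
// instead of silently matching nothing.
function compilePattern(pattern: string, maxLength = 200): CompileResult {
  if (pattern.length > maxLength) {
    return { ok: false, error: `Pattern exceeds ${maxLength} characters` };
  }
  try {
    return { ok: true, regex: new RegExp(pattern, "i") };
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `Invalid regex: ${detail}` };
  }
}

// Bounded scan: only the first maxCharsPerText characters are tested.
function regexScan(
  texts: string[],
  pattern: string,
  maxCharsPerText = 10000,
): { matches: string[]; error: string | null } {
  const compiled = compilePattern(pattern);
  if (!compiled.ok) return { matches: [], error: compiled.error };
  const regex = compiled.regex;
  const matches = texts.filter((t) => regex.test(t.slice(0, maxCharsPerText)));
  return { matches, error: null };
}
```

The pattern is compiled without the global flag, so repeated test() calls carry no state between texts.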

v0.11.0

@github-actions github-actions released this 25 Apr 21:28

Forgiving search UX with coverage, suggestions, and live-MCP defenses. All additions are non-breaking; existing call sites continue to work.

Behavior

  • Unified title + message/tool/reasoning content search in recall. Results carry source, why (matchedFields/matchedTerms/recency/confidence), titleMatch, and directoryRelevance.
  • Forgiving time API: new last, from, to fields, plus ISO-string forms of before/after. Conflicting bounds resolve to the most restrictive valid window (newest lower / oldest upper) and warn instead of erroring. Degenerate durations and relative durations on absolute-only fields are normalized with warnings.
  • Default scan covers all eligible sessions. The previous implicit 1000-session cap is gone; sessions is now an optional cap, bounded by maxSessions, provider limits, and time/abort budgets.
  • Directory fallback (fallback: true) buckets exact → project → global with bucket counts and limitedBy reasons. Never broadens beyond scope: "project".
  • Partial expansion: expand: "context" now returns base hits plus as much expansion as fits, with warnings, instead of hard-failing on budget caps. New inputs: window: "auto", expandBudgetMessages, expandBudgetChars.
  • Tool inputs (command, cwd) and tool outputs are first-class searchable fields; why.matchedFields reports only fields that actually matched.
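
The conflict rule for time bounds (newest lower bound, oldest upper bound, warn rather than error) can be sketched as below. The function name, result shape, and warning wording are assumptions, and the sketch leaves out how an impossible window (lower above upper) is handled, since the notes do not describe it.

```typescript
interface ResolvedWindow {
  lower: number | undefined; // epoch ms; results must be newer than this
  upper: number | undefined; // epoch ms; results must be older than this
  warnings: string[];
}

// Several inputs can imply a lower bound (for example after, from, last)
// or an upper bound (before, to). Keep the most restrictive of each.
function resolveWindow(lowers: number[], uppers: number[]): ResolvedWindow {
  const warnings: string[] = [];
  const lower = lowers.length > 0 ? Math.max(...lowers) : undefined;
  const upper = uppers.length > 0 ? Math.min(...uppers) : undefined;
  if (lowers.length > 1) {
    warnings.push("Multiple lower bounds given; using the newest.");
  }
  if (uppers.length > 1) {
    warnings.push("Multiple upper bounds given; using the oldest.");
  }
  return { lower, upper, warnings };
}
```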

New output

  • warnings (max 5), suggestions (max 3), nearMisses (max 3).
  • coverage with totalSessionsKnown, sessionsDiscovered, sessionsEligible, sessionsSearched, messagesSearched, partsSearched, sessionsSkipped, skippedByReason, directoryBucketsSearched, directoryBucketCounts, and a limitedBy array (scope, time, directory, sessionsLimit, maxSessions, loadError, rankingBudget, timeBudget, abortSignal, etc.).
  • SearchResult adds source, why, directoryRelevance, titleMatch.

Live-MCP hardening

The plugin now defends at the execute boundary against MCP hosts that forward raw caller args without applying Zod schema defaults:

  • pickEnum / pickNumber / clampNumber coerce, clamp, and whitelist scope, match, group, type, role, expand, explain, fallback, expandResults, window, width, results, sessions, expandBudgetMessages, expandBudgetChars.
  • Unknown enum values fall back with a message of the form Ignored {label}:"value"; using {label}:"fallback".
  • Out-of-range numerics clamp with warnings; non-numeric inputs render literally (NaN, Infinity) rather than as null.
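
A sketch of how such a defensive boundary might report what it changed follows. The helper names and the enum message format come from the list above; the signatures, the numeric message wording, the choice to fall back on non-finite input, and the decision to collect messages in a warnings array are assumptions.

```typescript
function pickEnum<T extends string>(
  label: string,
  value: unknown,
  allowed: readonly T[],
  fallback: T,
  warnings: string[],
): T {
  if (value === undefined) return fallback; // unset is not worth a warning
  if (
    typeof value === "string" &&
    (allowed as readonly string[]).includes(value)
  ) {
    return value as T;
  }
  warnings.push(
    `Ignored ${label}:"${String(value)}"; using ${label}:"${fallback}"`,
  );
  return fallback;
}

function clampNumber(
  label: string,
  value: unknown,
  min: number,
  max: number,
  fallback: number,
  warnings: string[],
): number {
  if (value === undefined) return fallback;
  const n = typeof value === "number" ? value : Number.NaN;
  if (!Number.isFinite(n)) {
    // String() keeps NaN and Infinity readable; JSON.stringify emits null.
    warnings.push(`Ignored ${label}:${String(n)}; using ${label}:${fallback}`);
    return fallback;
  }
  const clamped = Math.min(max, Math.max(min, n));
  if (clamped !== n) warnings.push(`Clamped ${label}:${n} to ${clamped}`);
  return clamped;
}
```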

Fixes

  • No more literal type:undefined text in suggestions when type arrives unset.
  • Grammar: the "Only N session(s) was/were searched." message now agrees with the count.

Tests

  • 60 → 70 Vitest tests. New runToolRaw helper bypasses Zod parsing to exercise the defensive boundary directly.
  • New coverage: suggestion gating + grammar, defensive defaults, enum fallback, numeric clamping, non-numeric inputs, time-bound conflicts and impossible windows, malformed dates, expansion clamp warnings, directory bucket counts.

Docs

  • README.md updated for the new contract (unified search, time API, fallback, window:"auto", output additions).
  • New docs/recall-search-ux-improvement-plan.md (P0/P1 marked implemented).
  • docs/recall-tool-surface-plan.md reconciled with implemented behavior.

Full changelog: v0.10.0...v0.11.0

v0.10.0

@github-actions github-actions released this 25 Apr 05:20

Bounded search expansion plus automated quality gates.

Features

  • recall adds bounded multi-message context expansion with expand: "context" and expandResults (default 1, max 3). Returns surrounding conversation around top hits, capped by MAX_EXPANDED_CONTEXT_MESSAGES and MAX_EXPANDED_TOTAL_TEXT_CHARS to prevent context blowups.
  • Expansion budgets enforced at response time: oversized expansions error with normalized totals so callers can retry with smaller expandResults.
  • Compressed recall tool-instruction payload to keep the schema description under the ~2k-2.2k token target.

Quality gates

  • New automated CI gates: format:check, lint, test:typecheck, test, typecheck, compile all run on every push.
  • Husky pre-commit (scripts/precommit.sh) auto-fixes formatting/lint and re-runs full check; pre-push (scripts/prepush.sh) requires a clean tree and a clean check.
  • 60 behavior-focused Vitest tests covering recall ranking, fallback, expansion, and helpers.

Commits

  • a18c805 feat(recall): add bounded search expansion
  • 546c4c0 docs(tools): compress recall instructions
  • 84a1019 ci: add automated quality gates
  • 5254c5a test(recall): add behavior coverage

Full changelog: v0.9.2...v0.10.0

v0.9.2

@github-actions github-actions released this 24 Apr 23:29

Bug fixes around session loading and timestamp filtering.

Fixes

  • Surface session load failures (29b15e1): when a session fails to load during scan, the failure is no longer silently swallowed. The session is counted in sessionsSkipped with loadError as the reason and the response surfaces the error so callers can distinguish empty results from broken indexing.
  • Ignore zero-timestamp filters (#2, 44d9413): before: 0 (and equivalently after: 0) used to filter out every positive timestamp because 0 was treated as a meaningful epoch bound. Zero values are now ignored as no-ops, restoring expected behavior when callers default-initialize numeric filters.
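
The zero-timestamp fix amounts to treating 0 as "no bound". A minimal sketch, assuming millisecond timestamps and exclusive comparisons (neither is stated in the notes), with hypothetical names:

```typescript
interface TimeFilter {
  before?: number;
  after?: number;
}

// A zero bound is treated as unset, so default-initialized numeric
// filters do not exclude every message.
function activeBound(value: number | undefined): number | undefined {
  return value === undefined || value === 0 ? undefined : value;
}

function matchesTime(timestamp: number, filter: TimeFilter): boolean {
  const before = activeBound(filter.before);
  const after = activeBound(filter.after);
  if (before !== undefined && timestamp >= before) return false;
  if (after !== undefined && timestamp <= after) return false;
  return true;
}
```

Without the activeBound step, before: 0 would reject every positive timestamp, which is the bug described above.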

Docs

  • Clarified load-error metadata in the recall response shape.

Commits

  • 29b15e1 fix(recall): surface session load failures
  • 44d9413 fix(recall): ignore zero timestamp filters
  • b21a82e docs(recall): clarify load error metadata
  • 0f4b4b3 Merge pull request #2 from MatthewK30/fix/ignore-zero-timestamp-filters

Full changelog: v0.9.1...v0.9.2

v0.9.1

@github-actions github-actions released this 15 Apr 23:40

Removes the hard ceiling on session scanning.

Behavior

  • Remove session scan ceiling (5e08f42): the previous fixed cap on the number of sessions scanned per recall call is removed. The default request now scans up to 1000 sessions, and the configured maxSessions plugin option becomes the upper bound rather than an internal hard limit. This makes recall usable on archives with thousands of sessions without losing coverage to an arbitrary internal cap.

Callers can still pass an explicit sessions argument to keep individual queries cheap; the change only affects the implicit ceiling.

Full changelog: v0.9.0...v0.9.1

v0.9.0

@github-actions github-actions released this 15 Apr 23:26

Smart/fuzzy search across all scopes plus session grouping.

Features

  • Smart/fuzzy ranking everywhere (1316840): smart and fuzzy match modes (introduced for project scope in v0.8.0) now apply to global and session scopes as well. The same Fuse.js-based ranking that handled cross-project search now powers single-session and global queries, so match quality is consistent regardless of scope.
  • Session grouping (1316840): added group: "session" to roll up multiple per-message hits into one result per session, ranked by best match within the session. The previous group: "part" (one result per matching message/part) remains the default. Useful when callers want a session-level overview rather than every individual hit.
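
Session grouping as described above, with one result per session ranked by its best hit, can be sketched as follows. The PartHit and SessionResult shapes and the hitCount field are assumptions for illustration.

```typescript
interface PartHit {
  sessionID: string;
  messageID: string;
  score: number;
}

interface SessionResult {
  sessionID: string;
  best: PartHit; // highest-scoring hit within the session
  hitCount: number; // how many per-message hits were rolled up
}

function groupBySession(hits: PartHit[]): SessionResult[] {
  const bySession = new Map<string, SessionResult>();
  for (const hit of hits) {
    const existing = bySession.get(hit.sessionID);
    if (existing === undefined) {
      bySession.set(hit.sessionID, {
        sessionID: hit.sessionID,
        best: hit,
        hitCount: 1,
      });
    } else {
      existing.hitCount += 1;
      if (hit.score > existing.best.score) existing.best = hit;
    }
  }
  // Rank sessions by their best match, strongest first.
  return Array.from(bySession.values()).sort(
    (a, b) => b.best.score - a.best.score,
  );
}
```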

Docs

  • Rewrote the README to document smart/fuzzy ranking, scope behavior, and grouping.
  • Added CONTRIBUTING.md with the architecture/module map and contributor guidance.
  • Removed the now-implemented SMART_RECALL_PLAN.md planning doc.

Commits

  • 1316840 feat(recall): add smart/fuzzy for all scopes and session grouping
  • 789190b docs: rewrite README with smart/fuzzy docs, add CONTRIBUTING with architecture guide
  • c843db4 chore: remove SMART_RECALL_PLAN.md
  • 82fba93 chore: remove scope and grouping plan

Full changelog: v0.8.0...v0.9.0

v0.8.0

@github-actions github-actions released this 15 Apr 22:12

Smart and fuzzy ranked search via Fuse.js.

Features

  • Smart/fuzzy match modes (5cabf23): recall adds match: "smart" | "fuzzy" (alongside the existing literal substring match) backed by Fuse.js. Smart mode tokenizes the query, ranks by match quality with a small typo budget, and returns weighted scores; fuzzy mode is more permissive for spelling variations and partial recall. Literal stays the default and remains an exact-substring filter for predictable results.
  • Ranked results are ordered by Fuse score plus recency, so cross-session searches surface the strongest matches first instead of relying on insertion order.

Scope

  • This release introduced smart/fuzzy ranking for project scope. Global and single-session scopes inherit the same ranking in v0.9.0.

Docs

  • README emphasizes that recall searches across sessions and across projects (cross-project / cross-session is a first-class use case, not a sidecar).

Commits

  • 5cabf23 feat(recall): add smart/fuzzy search with Fuse.js ranking
  • c3feb1a docs: emphasize cross-session, cross-project scope in README

Full changelog: v0.7.1...v0.8.0

v0.7.1

@github-actions github-actions released this 15 Apr 02:18

Full changelog: v0.7.0...v0.7.1