
feat(salience): Phase 1 — first-class standing tier (default-off, soak-gated)#154

Merged
cipher813 merged 1 commit into main from feat/salience-tier-phase-1
May 22, 2026


@cipher813
Owner

Summary

Ships the salience-tier substrate as a schema-backed, MCP-tooled, CLI-driven first-class feature. Memories explicitly promoted via memory_promote are injected into every <mnemon-context> envelope on every prompt, regardless of query similarity. Cap=15 (hard ceiling 20). Default-off behind STANDING_TIER_ENABLED.

Closes the salience-tier Phase 1 work per the 2026-05-22 reframing: Phase 1 IS the validation gate. The operator promotes ~5 career-context memories via the new memory_promote MCP tool, flips the flag, and runs a soak of ≥1 week, watching whether runway-style under-weighting recurs or stays absent. The earlier plan, a synthetic A/B against Phase 0, was dropped because the injection mechanism is identical between Phase 0's env-var-flagged form and Phase 1's schema-backed form, so an A/B would carry no marginal information.

Driver

The original Claude Desktop failure mode: load-bearing facts (Brian's runway, career posture) were retrieved into context but under-weighted against a dominant generic prior. RAG worked. Salience didn't. This PR closes the recall-side weighting gap. The upstream curation gap (capture fragmenting load-bearing facts into pieces) was closed by capture attention Phase A (#153, merged earlier today).

What ships

Store API

  • promote_to_standing(id) — raises StandingTierCapReached / StandingTierProvenanceRejected / StandingTierError with user-actionable messages. Idempotent re-promote returns True.
  • demote_to_situational(id) — round-trip. Idempotent on already-situational (returns False).
  • list_standing() — live tier members, content-included, ordered DESC by created_at.
  • standing_tier_status() returns {count, cap, hard_ceiling}.
  • search_bm25 + search_vector — gain include_standing: bool = False kw param. Tier 1 docs excluded from ranked retrieval by default (no double-counting; they're already injected unconditionally).
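
The promote / demote contract above can be sketched with an in-memory stand-in. This is not the project's store: `SketchStore`, `Memory`, the `source` and `invalidated` fields, the exception hierarchy, the order of the checks, and the substring `search` (standing in for search_bm25 / search_vector) are all assumptions made for illustration. Only the method names, exception names, return values, and cap numbers come from the description.

```python
from dataclasses import dataclass, field


class StandingTierError(Exception):
    """Missing or invalidated memory (used as the base class by assumption)."""


class StandingTierCapReached(StandingTierError):
    """The tier already holds `cap` live members."""


class StandingTierProvenanceRejected(StandingTierError):
    """The memory is hook-sourced and cannot be promoted."""


@dataclass
class Memory:
    id: int
    content: str
    created_at: float
    source: str = "user"        # hypothetical provenance field
    tier: str = "situational"
    invalidated: bool = False   # hypothetical liveness flag


@dataclass
class SketchStore:
    cap: int = 15
    hard_ceiling: int = 20
    docs: dict = field(default_factory=dict)

    def add(self, mem: Memory) -> None:
        self.docs[mem.id] = mem

    def list_standing(self) -> list:
        # Live tier members only, newest first.
        live = [m for m in self.docs.values()
                if m.tier == "standing" and not m.invalidated]
        return sorted(live, key=lambda m: m.created_at, reverse=True)

    def standing_tier_status(self) -> dict:
        return {"count": len(self.list_standing()),
                "cap": self.cap, "hard_ceiling": self.hard_ceiling}

    def promote_to_standing(self, doc_id: int) -> bool:
        mem = self.docs.get(doc_id)
        if mem is None or mem.invalidated:
            raise StandingTierError(f"memory {doc_id} is missing or invalidated")
        if mem.tier == "standing":
            return True  # idempotent re-promote, even when the tier is full
        if mem.source == "hook":
            raise StandingTierProvenanceRejected(
                f"memory {doc_id} is hook-sourced and cannot be promoted")
        if len(self.list_standing()) >= self.cap:
            raise StandingTierCapReached(
                f"standing tier is full ({self.cap}); demote a memory first")
        mem.tier = "standing"
        return True

    def demote_to_situational(self, doc_id: int) -> bool:
        mem = self.docs.get(doc_id)
        if mem is None or mem.tier != "standing":
            return False  # no-op; treating a missing id as a no-op is an assumption
        mem.tier = "situational"
        return True

    def search(self, term: str, include_standing: bool = False) -> list:
        # Substring match stands in for ranked retrieval.
        return [m.id for m in self.docs.values()
                if not m.invalidated and term in m.content
                and (include_standing or m.tier != "standing")]
```

Because `list_standing()` counts only live rows, demoting a member (or invalidating one, in this sketch) frees a cap slot, which matches the "demote frees cap slot" and "cap respects invalidated" cases named in the test list.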

MCP tools (both stdio + Streamable HTTP)

  • memory_promote(id) — surfaces cap / provenance / missing rejections as readable messages.
  • memory_demote(id) — idempotent.
  • memory_list_standing() — JSON array consumed by build_context.
  • Tool count: 14 → 17.
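
One way a tool can surface store rejections as readable messages, rather than letting them escape to the transport, is to catch each exception at the tool boundary. The sketch below is a plain function, not the server's actual tool registration: the `promote` callable parameter, the message wording, and the exception hierarchy are assumptions.

```python
from typing import Callable


class StandingTierError(Exception):
    """Missing or invalidated memory (base class by assumption)."""


class StandingTierCapReached(StandingTierError):
    pass


class StandingTierProvenanceRejected(StandingTierError):
    pass


def memory_promote(doc_id: int, promote: Callable[[int], bool]) -> str:
    """Call the store and translate every rejection into a readable string."""
    try:
        promote(doc_id)
    except StandingTierCapReached as exc:
        return f"Not promoted: standing tier is full ({exc}). Demote a memory first."
    except StandingTierProvenanceRejected as exc:
        return f"Not promoted: hook-sourced memories cannot be promoted ({exc})."
    except StandingTierError as exc:
        return f"Not promoted: {exc}."
    return f"Memory {doc_id} is now in the standing tier."
```

The more specific exceptions are listed first so that the base-class handler only sees the missing / invalidated case.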

CLI

  • mnemon standing list / promote <id> / demote <id>
  • mnemon status gains a Standing tier: N/CAP line.

build_context wiring

  • When STANDING_TIER_ENABLED (config constant OR MNEMON_STANDING_TIER_ENABLED env override accepting 1/true/yes/on): single memory_list_standing round-trip → renders as ## Standing context sub-section inside the existing Layer 1 envelope, ahead of ## Situational recall.
  • Phase 0 env-var path PRESERVED as fallback (MNEMON_STANDING_TIER_FILE → standing.json → standing-rendered.md cache). Operators retain a per-session override mechanism.
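
The flag check and the section ordering can be sketched as follows. Several details are assumptions: the description says "config constant OR env override", so this sketch enables the tier when either is truthy; matching is case-insensitive; and `render_sections` is a hypothetical helper that omits the `<mnemon-context>` envelope and nonce entirely.

```python
import os

_TRUTHY = {"1", "true", "yes", "on"}


def standing_tier_enabled(config_enabled: bool = False, env=None) -> bool:
    """True when the config constant or the env override is truthy."""
    env = os.environ if env is None else env
    raw = env.get("MNEMON_STANDING_TIER_ENABLED", "")
    return config_enabled or raw.strip().lower() in _TRUTHY


def render_sections(standing: list, situational: list, enabled: bool) -> str:
    """Render the standing block ahead of situational recall."""
    lines = []
    if enabled and standing:
        lines.append("## Standing context")
        lines.extend(f"- {item}" for item in standing)
    lines.append("## Situational recall")
    lines.extend(f"- {item}" for item in situational)
    return "\n".join(lines)
```

With the flag off, the output contains only the situational section, so existing behavior is unchanged.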

Schema

  • documents.tier TEXT NOT NULL DEFAULT 'situational' — additive migration in _migrate_tier(). Index idx_documents_tier scoped to live rows. Pre-existing memories default to 'situational'; harmless if flag stays off.
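
An additive migration of this shape might look like the sketch below. It assumes a SQLite backend and a hypothetical `invalidated_at` column as the "live row" marker; neither is confirmed by the description, and `migrate_tier` is an illustrative stand-in for `_migrate_tier()`, not its actual body.

```python
import sqlite3


def migrate_tier(conn: sqlite3.Connection) -> bool:
    """Add documents.tier if absent; return True when the column was added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(documents)")}
    added = "tier" not in columns
    if added:
        # NOT NULL is allowed on ADD COLUMN because a non-NULL default is given,
        # so pre-existing rows read back as 'situational'.
        conn.execute(
            "ALTER TABLE documents "
            "ADD COLUMN tier TEXT NOT NULL DEFAULT 'situational'"
        )
    # Partial index scoped to live rows (assumed: invalidated_at IS NULL).
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_documents_tier "
        "ON documents(tier) WHERE invalidated_at IS NULL"
    )
    conn.commit()
    return added
```

Checking the column list first makes the migration safe to run on every startup.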

Composability (all preserved)

  • Layer 0 (is_well_shaped) — capture rejection runs upstream of any tier consideration
  • Layer 1 envelope — standing block sits inside the same <mnemon-context> data-marking + nonce
  • Layer 4 (HOOK_SOURCE_CONFIDENCE_CEILING + provenance) — hook-sourced memories cannot be promoted; explicit StandingTierProvenanceRejected rejection
  • rc16 source_key upsert — unchanged; tier orthogonal
  • Capture attention Phase A: recurrence_count signals candidates that operator-review can promote (bridges to salience-tier Phase 2 promotion signals work)

Soak gates (before flipping default-on)

  1. ≥1 week with STANDING_TIER_ENABLED=true (set via env or config)
  2. Observed reduction in runway-style under-weighting recurrence on real career-strategy conversations
  3. Zero spurious-injection complaints from operator review of every promoted memory

Test plan

  • All 22 new tests pass (pytest tests/test_standing_tier.py)
  • Full suite 814 → 836 passing, no regression beyond the tool-count assertions in test_server_remote.py (bumped 14 → 17)
  • CLI smoke: mnemon status shows tier line; mnemon standing list/promote/demote round-trip works on a temp vault
  • Post-merge operator workflow: promote 5 career-context memories (ids 2543, 2084, 131, 2402, 2401 are the candidates surfaced 2026-05-22) via memory_promote MCP tool; set MNEMON_STANDING_TIER_ENABLED=true in the Claude Code launching shell; verify the <mnemon-context> block contains ## Standing context sub-section; observe ≥1 week soak.

🤖 Generated with Claude Code

…k-gated)

Adds the standing-context recall tier as a schema-backed, MCP-tooled,
CLI-driven first-class feature. Memories explicitly promoted via
memory_promote are injected into every <mnemon-context> envelope on
every prompt, regardless of query similarity. Cap=15 (hard ceiling 20).
Default-off behind STANDING_TIER_ENABLED; soak-gated.

Closes the salience-tier Phase 1 work from the 2026-05-22 reframing —
ships the substrate gated, operator promotes ~5 career-context
memories via memory_promote MCP tool, flips the flag, observes ≥1
week soak for runway-style under-weighting recurrence vs absence.
Phase 1 IS the validation gate (vs the original synthetic A/B against
the Phase 0 env-var form, which carried no marginal information once
Phase 1 ships gated — same injection mechanism).

Schema additive: documents.tier TEXT NOT NULL DEFAULT 'situational'.
Idx idx_documents_tier on live rows for cap-count + search-exclusion
queries. Pre-existing memories default to 'situational'; harmless if
the flag stays off.

Store API:
  promote_to_standing(id) — raises StandingTierCapReached at cap,
    StandingTierProvenanceRejected on hook-sourced (Layer 4 compose),
    StandingTierError on missing/invalidated; idempotent re-promote.
  demote_to_situational(id) — round-trip; no-op on already-situational.
  list_standing() — live tier members, content-included, ordered DESC.
  standing_tier_status() — count/cap/hard_ceiling stats.
  search_bm25 + search_vector — gain include_standing kw (default False)
    so the unconditionally-injected tier isn't double-counted in
    ranked retrieval.

MCP tools (server.py, both stdio + Streamable HTTP):
  memory_promote(id) — surface cap / provenance / missing rejections
    as user-actionable messages.
  memory_demote(id) — idempotent.
  memory_list_standing() — JSON array consumed by build_context.
  Tool count: 14 → 17.

build_context wiring (context_surfacing.py): when STANDING_TIER_ENABLED
(config constant OR MNEMON_STANDING_TIER_ENABLED env override accepting
1/true/yes/on), call memory_list_standing in one round-trip; render
as "Standing context" sub-section inside the existing Layer 1
envelope. Phase 0 env-var path (MNEMON_STANDING_TIER_FILE → standing.json
→ standing-rendered.md cache) PRESERVED as fallback so operators
retain per-session override mechanism.

CLI: `mnemon standing list / promote <id> / demote <id>`. `mnemon
status` gains a Standing tier: N/CAP line.

Composes with: Layer 0 is_well_shaped (capture rejection upstream),
Layer 1 envelope (standing block inside same envelope + nonce as
situational), Layer 4 HOOK_SOURCE_CONFIDENCE_CEILING (provenance
demotion enforced via StandingTierProvenanceRejected), rc16 source_key
(orthogonal), capture-attention Phase A (recurrence_count signals
candidates that operator-review can promote — bridges to salience-tier
Phase 2 promotion signals).

22 new tests in tests/test_standing_tier.py covering: promote
success / cap-rejection (cap=2 fixture, 3rd raises) / hook-sourced
rejection / invalidated rejection / missing rejection / cap respects
invalidated / demote round-trip / demote idempotent / demote frees
cap slot / list_standing ordering + content / search excludes by
default / search includes when requested / build_context flag-off /
flag-on memory_list_standing call / env-var truthy parsing.

Suite 814 → 836 passing. test_server_remote.py tool-count assertions
bumped 14 → 17.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cipher813 cipher813 merged commit dab4859 into main May 22, 2026
9 checks passed
@cipher813 cipher813 deleted the feat/salience-tier-phase-1 branch May 22, 2026 14:06
cipher813 added a commit that referenced this pull request May 22, 2026
Seals the 2026-05-22 substrate arc:
- #153 capture-attention Phase A (recurrence-weighted preserve+relate+boost)
- #154 salience-tier Phase 1 (first-class standing tier, +3 MCP tools)
- #155 build_standing_set.py exemplar bias fix

Both new feature paths gated default-off (CAPTURE_ATTENTION_ENABLED,
STANDING_TIER_ENABLED) — operator flips per the soak workflow.

Post-merge ritual:
- tag v0.7.0rc1 + GitHub Release
- twine upload (dist/mnemon-memory-0.7.0rc1.{tar.gz,whl})
- mnemon upgrade web --app-name mnemon-memory --mnemon-version 0.7.0rc1
- mnemon doctor 7/7 against live remote
- operator promotes 5 career memories + flips MNEMON_STANDING_TIER_ENABLED
  for ≥1 week soak

Suite 836 passing. mnemon --version returns 0.7.0rc1.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request May 22, 2026
Closes the unit-test-coverage gap that let memory_check_contradictions
ship to Fly with a hidden bug. Pre-existing test_server.py mocks every
external dep (LLM, NLI, embedder, vecstore), which is right for
isolated-contract testing but leaves a gap: a tool's real call path
can raise an uncaught exception that no mocked test exercises.

tests/test_tools_integration.py iterates the entire registered MCP
tool manager and invokes each tool against a real seeded vault with
minimal-valid inputs. Three test classes:

  1. test_every_tool_invokes_cleanly — no unhandled exception, return
     shape matches MCP contract (str/dict/list). Coverage-check
     assertions force every newly-registered tool to have a fixture
     input AND every removed tool to have its fixture cleaned up.

  2. test_no_tool_returns_opaque_error_string — outputs must not
     contain "Error occurred during tool execution" or "Internal
     server error" verbatim. Opaque envelopes come from the MCP
     transport wrapping an escaped Python exception — never
     acceptable as a tool's own output.

  3. test_destructive_tools_respect_dry_run — locks the dry_run
     contract on memory_check_contradictions + memory_sweep. Mutating
     inputs are stubbed to "would-decay" labels via mocked NLI; the
     test asserts pre-state == post-state.
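
The first two checks can be sketched against a plain dict of callables. This is an illustration only: `check_tools`, the `tools` and `fixtures` dicts, and the problem-string format are hypothetical, and the real test iterates the registered MCP tool manager against a seeded vault rather than a dict.

```python
import json

OPAQUE_MARKERS = (
    "Error occurred during tool execution",
    "Internal server error",
)


def check_tools(tools: dict, fixtures: dict) -> list:
    """Invoke every tool with its fixture input and collect problems."""
    problems = []
    # Coverage check: every tool needs a fixture, and no fixture may be stale.
    for name in sorted(set(tools) - set(fixtures)):
        problems.append(f"{name}: registered tool has no fixture input")
    for name in sorted(set(fixtures) - set(tools)):
        problems.append(f"{name}: fixture for a tool that is no longer registered")
    for name in sorted(set(tools) & set(fixtures)):
        try:
            out = tools[name](**fixtures[name])
        except Exception as exc:  # an escaped exception is itself the failure
            problems.append(f"{name}: unhandled {type(exc).__name__}: {exc}")
            continue
        if not isinstance(out, (str, dict, list)):
            problems.append(f"{name}: unexpected return type {type(out).__name__}")
            continue
        text = out if isinstance(out, str) else json.dumps(out, default=str)
        for marker in OPAQUE_MARKERS:
            if marker in text:
                problems.append(f"{name}: opaque error string {marker!r}")
    return problems
```

An empty list means every tool returned cleanly; a test would assert `check_tools(...) == []`.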

Heavy paths (NLI classify, FastEmbed re-embed) are stubbed so the
suite stays under ~17s; real-model paths are validated by
scripts/calibrate_capture_threshold.py and the operator Layer-3
web test ritual (extended in follow-up PR #159).

Catches the failure class that bit memory_check_contradictions on
2026-05-22: the LLM-required path would have raised under [server]
extras even with the broad try/except, and this canary's assertion
"no unhandled exception" would have fired on PR #154 when the
salience-tier tools were added — forcing the fix before merge.

Note: this PR doesn't yet exercise the [server]-only install matrix
(that's PR #159). It DOES catch any tool that raises through its
wrapper regardless of extras — the universal subset of the gap.

Composes with feedback_no_silent_fails: every tool must catch
failures at its boundary and return a clean error string, never
let an exception escape.

Suite 847 → 850 passing.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request May 22, 2026
…#159)

Follow-up to PR #158 — closes the [server]-extras gap that the local
integration canary can't see + extends the operator Layer-3 web test
ritual to probe every MCP tool against the live Fly app.

Two artifacts:

1. .github/workflows/ci-server-extras.yml — installs mnemon-memory[server]
   ONLY (the Fly Docker install) + pytest as a separate test runner.
   Runs the full suite under that minimal install. Includes a guard
   that asserts llama-cpp-python is NOT installed under [server] — so
   future PRs can't accidentally drag the LLM dep into the production
   path. This is the workflow that would have caught
   memory_check_contradictions's LLM hard-dependency on PR #154 when
   the salience-tier tools were first added; ci.yml passed because
   [dev] installs everything.

2. scripts/promote_stable.sh layer3 --exercise-all-tools — opt-in flag
   that, after the test Fly app is up but before downgrade, iterates
   every registered MCP tool against the remote and asserts each
   returns cleanly. Catches Fly-specific breakage (missing baked
   models, Anthropic MCP proxy timeouts, transport regressions) that
   the local Python-level canary in tests/test_tools_integration.py
   can't see.

   Tool list resolved dynamically from mcp._tool_manager._tools, so
   tools added in future PRs are exercised automatically — no per-
   release maintenance burden. Per-tool inputs mirror the integration-
   test fixture; destructive tools (memory_forget, memory_rebuild)
   skipped; mutating tools constrained to dry_run / round-trip.
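
The "not installed" guard from the first artifact could be expressed in Python roughly as below. How the workflow actually implements it is not shown here, so `assert_not_installed` is a hypothetical helper; `llama_cpp` is the import name that the llama-cpp-python distribution provides.

```python
import importlib.util


def assert_not_installed(module_name: str) -> None:
    """Fail if a top-level module is importable in this environment."""
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module_name) is not None:
        raise AssertionError(
            f"{module_name} is importable, but the minimal install must not "
            "pull it in"
        )


# In the minimal-install job one would call:
#     assert_not_installed("llama_cpp")
```

Using `find_spec` avoids importing the package, so the guard stays cheap and has no side effects.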

scripts/_layer3_remote_helper.py gains an exercise-all-tools
subcommand wired through the FastMCP tool manager. Two regression-
lock tests added to tests/test_promote_stable.sh harness (13 → 15
passing) covering helper dispatch + flag plumbing through the bash
dispatcher (cmd_layer3 "$@" forwarding + EXERCISE_ALL_TOOLS=1 set).

Full Python suite still 850 passing.

Driver: Brian's 2026-05-22 ask after the memory_check_contradictions
incident — "given the difficulty of checking each individual mnemon
tool available, are we properly using unit tests to confirm that
everything works as expected?" PR #158 addressed the Python-level
canary; this PR addresses the deployment-environment + Fly-level
canary. Together they form the test trio for catching the
2026-05-22 failure class on the next PR rather than on the next
operator MCP call.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>