Fix metadata wipe on content updates + CLI metadata-search parity (v0.11.1) #88
Merged
…earch parity
Two fixes surfaced by a real metadata-loss incident (a content update via the
CLI silently cleared a document's tags, and metadata is unversioned so the
loss was unrecoverable):
1. metadata contract: NULL = "not provided" → KEEP existing (create uses {}),
enforced once in the cerefox_ingest_document RPC
(metadata = COALESCE(p_metadata, metadata)). Every transport used to
default an absent metadata argument to {} and the RPC applied it verbatim —
wiping tags on any content update that didn't re-pass them. Callers fixed
in lockstep: MCP handler, cerefox-ingest EF, CLI ingest/ingest-dir, frozen
Python client. Pass {} explicitly to deliberately clear. Schema 0.5.0 →
0.6.0 (RPC-only; `cerefox server deploy` required — v0.11.1 clients send
NULL, which a 0.5.0 server would reject on update via the NOT NULL column).
2. CLI parity: `cerefox metadata search` no longer hard-requires
--metadata-filter — same contract as the MCP tool / EF (relaxed in v0.10.x;
the CLI was missed): at least one of filter / --project-name /
--updated-since / --created-since; --project-name alone lists a project.
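The metadata contract in item 1 can be illustrated with a small sketch. This is not the RPC itself (which is SQL); it is a Python mirror of `metadata = COALESCE(p_metadata, metadata)` under the assumption that `None` stands in for NULL. The names `resolve_metadata` and `old_caller_default` are hypothetical and exist only for illustration.

```python
from typing import Optional


def resolve_metadata(existing: Optional[dict], provided: Optional[dict]) -> dict:
    """Hypothetical mirror of COALESCE(p_metadata, metadata), with {} on create."""
    if provided is not None:
        # An explicit value always wins, including {} (a deliberate clear).
        return provided
    if existing is not None:
        # None means "not provided": keep whatever is already stored.
        return existing
    # Create path with no metadata supplied.
    return {}


def old_caller_default(metadata: Optional[dict]) -> dict:
    """The pre-fix caller behaviour: an absent argument became {} before the call."""
    return metadata if metadata is not None else {}


stored = {"tags": ["incident", "postmortem"]}

# Content-only update: metadata is not passed, so the stored tags survive.
assert resolve_metadata(stored, None) == stored

# Passing {} explicitly is the way to clear metadata on purpose.
assert resolve_metadata(stored, {}) == {}

# Creating a document without metadata yields {}.
assert resolve_metadata(None, None) == {}

# With the old caller-side default, "absent" was indistinguishable from "clear".
assert resolve_metadata(stored, old_caller_default(None)) == {}
```

The last assertion shows why the callers had to change in lockstep with the RPC: as long as a transport turns an absent argument into `{}`, the server cannot tell "not provided" apart from "clear".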
Tests: RPC-level preserve/clear contract (synthetic embedding, no OpenAI);
CLI update-flow asserts tags survive four metadata-less updates; metadata
search parity smokes; live-suite schema gates bumped to 0.6.0. Backlog:
docs/research/metadata-versioning.md (recovery for the unversioned-metadata
gap) + TODO entry.
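The relaxed `metadata search` rule from item 2 (at least one of the four criteria) can be sketched as below. This is an assumption-laden illustration rather than the CLI's actual code: the function name `check_search_args`, the guidance text, and the choice to treat empty values as absent are all hypothetical. Only the four option names and the exit code of 1 when no criteria are given come from this PR.

```python
from typing import Optional, Tuple

# Hypothetical guidance text; the real CLI message is not shown in this PR.
GUIDANCE = (
    "Provide at least one of: --metadata-filter, --project-name, "
    "--updated-since, --created-since."
)


def check_search_args(
    metadata_filter: Optional[dict] = None,
    project_name: Optional[str] = None,
    updated_since: Optional[str] = None,
    created_since: Optional[str] = None,
) -> Tuple[int, str]:
    """Return (exit_code, message): 0 if any criterion is present, else 1."""
    criteria = (metadata_filter, project_name, updated_since, created_since)
    # Assumption: None, "" and {} all count as "not supplied".
    if not any(criteria):
        return 1, GUIDANCE
    return 0, ""


# --project-name alone is now enough to list a project's documents.
assert check_search_args(project_name="demo") == (0, "")

# No criteria at all still fails, with guidance.
assert check_search_args() == (1, GUIDANCE)
```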
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Two fixes surfaced by a real incident (a content update silently cleared a document's tags — and metadata is unversioned, so the loss was unrecoverable).
1. Content updates no longer wipe metadata. Every transport defaulted an absent `metadata` argument to `{}` and the ingest RPC applied it verbatim, so any content update that didn't re-pass the tags cleared them. New contract, enforced once in `cerefox_ingest_document`: NULL = "not provided" → keep existing (`metadata = COALESCE(p_metadata, metadata)`; create uses `{}`). Pass `{}` explicitly to deliberately clear. Callers fixed in lockstep: MCP handler, `cerefox-ingest` EF, CLI `ingest` / `ingest-dir`, frozen Python client. (The pipeline already had keep-on-absent semantics; the CLI's `?? {}` default was defeating them.)
2. CLI parity: `cerefox metadata search` no longer hard-requires `--metadata-filter`. It now follows the same contract as the MCP tool / EF (relaxed in v0.10.x; the CLI was missed): at least one of filter / `--project-name` / `--updated-since` / `--created-since`; `--project-name` alone lists that project's documents.
Versioning: schema 0.5.0 → 0.6.0 (RPC-only, no migration). ⚠️ Run `cerefox server deploy` with the release: v0.11.1 clients send `NULL` for absent metadata, which a 0.5.0 server would reject on update (NOT NULL column). OpenAPI → 2.1.0 (`metadata` description only; re-paste optional).
Backlog: the incident exposed that metadata has no recovery layer (version snapshots are content-only). Proposal with options: `docs/research/metadata-versioning.md` + TODO entry + plan.md note.
Test plan
- `_shared`: 211 pass; typecheck clean; help bundle in sync
- `packages/memory`: 141 ran, 0 fail (live suites gate on deployed schema ≥ 0.6.0 with a deploy hint)
- `p_metadata` omitted → tags preserved; explicit `{}` → cleared (synthetic embedding, no OpenAI dependency)
- `metadata search --project-name <p>` (no filter) → lists docs; no criteria → exit 1 with guidance
- `cerefox server deploy` → live suites go fully green
🤖 Generated with Claude Code