feat(store): temporal + weighted edge metadata on relations#606
Conversation
Add migration 15: nullable weight (DOUBLE), valid_at / invalid_at (TIMESTAMPTZ, bi-temporal validity window), and metadata (JSON) columns on entry_relations. Fully additive — existing edges and the metadata / wikilink backfill paths keep working; NULL means "unspecified". - add_relation accepts optional weight / valid_at / invalid_at / metadata; re-asserting an existing (from,to,type) triple upserts supplied attrs while staying idempotent on the row. - get_related / list_relations return the new fields; timestamps render as true UTC (AT TIME ZONE 'UTC'), fixing a latent created_at local-time/'Z' mislabel. - distillery_relations MCP tool action="add" accepts and echoes the new fields, with validation for weight (number) and ISO 8601 timestamps. Closes the "relations carry no weight/timestamp column" gap surfaced by the graph post-recommendation spikes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Warning Review limit reached
More reviews will be available in 3 minutes and 14 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more credits in the billing tab to continue. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR extends relation edges throughout the distillery system to support optional attributes: ChangesEdge Attributes for Relations
Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly Related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/distillery/mcp/tools/relations.py (1)
157-161:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDo not echo store exception text to MCP clients.
This branch now relays the diagnostic payload from
_sync_add_relationverbatim, so a missing endpoint leaks internal visibility details in a client-facingNOT_FOUNDresponse. Return a fixed safe message here and keep the full exception only in logs.Based on learnings: do not include raw exception details via f-strings or string interpolation in client-facing error responses in
src/distillery/mcp/tools/*.py.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/distillery/mcp/tools/relations.py` around lines 157 - 161, The except ValueError handler in the create-relation flow currently returns error_response("NOT_FOUND", f"Cannot create relation: {exc}") and thus leaks internal exception text to clients; change this to return a fixed, non-sensitive message (e.g. "Cannot create relation: endpoint not found" or similar) and move the actual exception details into a server-side log call (use the same logger used elsewhere in this module) so the full exc is recorded but not echoed to the MCP client; update the handler around _sync_add_relation/except ValueError and keep the error_response call only containing the safe static message.Source: Learnings
src/distillery/store/duckdb.py (1)
3290-3346:⚠️ Potential issue | 🟠 Major | ⚡ Quick winCheckpoint relation writes before returning success.
Both the re-assert
UPDATEpath and the newINSERTpath mutateentry_relations, but unlike the other write methods in this class they never call_checkpoint_after_write(). A hard restart can therefore lose the relation or its new edge attributes after the caller already received arelation_id.Suggested fix
if existing is not None: relation_id = str(existing[0]) # Re-asserting an edge with fresh attributes upserts the supplied # (non-None) fields onto the existing row; the (from,to,type) triple # stays idempotent. set_parts: list[str] = [] set_params: list[Any] = [] @@ if set_parts: self._conn.execute( f"UPDATE entry_relations SET {', '.join(set_parts)} WHERE id = ?", [*set_params, relation_id], ) + self._checkpoint_after_write(self._conn) logger.debug( "Relation already exists id=%s from=%s to=%s type=%s (attrs upserted=%d)", relation_id, @@ self._conn.execute( "INSERT INTO entry_relations " "(id, from_id, to_id, relation_type, weight, valid_at, invalid_at, metadata) " "VALUES (?, ?, ?, ?, ?, ?, ?, ?)", [ @@ metadata_json, ], ) + self._checkpoint_after_write(self._conn) logger.debug( "Added relation id=%s from=%s to=%s type=%s", relation_id,🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/distillery/store/duckdb.py` around lines 3290 - 3346, The UPDATE and INSERT branches that mutate the entry_relations table (the block that executes self._conn.execute(...) and the subsequent INSERT block creating relation_id) do not call _checkpoint_after_write(), so the new/updated relation may not be flushed to durable storage; after performing the UPDATE (when set_parts is non-empty) and after the INSERT (before each return that returns relation_id) call self._checkpoint_after_write() to checkpoint the write, ensuring durability of the change to entry_relations.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/distillery/store/duckdb.py`:
- Around line 3287-3289: After coercing timestamps in _coerce_relation_timestamp
(i.e., after computing valid_at_dt and invalid_at_dt) validate that
invalid_at_dt is not earlier than valid_at_dt; if invalid_at_dt < valid_at_dt
raise a ValueError (or a DomainError) with a clear message referencing valid_at
and invalid_at so the caller knows why the relation is rejected. Place this
check immediately after the valid_at_dt/invalid_at_dt assignments in the method
that persists relations (the block that sets valid_at_dt, invalid_at_dt, and
metadata_json) to prevent storing inverted validity windows.
In `@tests/test_relations.py`:
- Around line 389-413: Add an assertion to verify the persisted timestamp
normalization: after calling store.get_related() in
test_relations_tool_add_with_edge_attributes, assert that the returned
row["valid_at"] equals the expected UTC-normalized string (e.g.,
"2026-03-01T00:00:00Z") to match the normalization behavior tested in
test_add_relation_edge_attributes_round_trip; locate the timestamp coming from
_handle_relations/store.get_related and compare it to the normalized form
instead of only checking weight and metadata.
---
Outside diff comments:
In `@src/distillery/mcp/tools/relations.py`:
- Around line 157-161: The except ValueError handler in the create-relation flow
currently returns error_response("NOT_FOUND", f"Cannot create relation: {exc}")
and thus leaks internal exception text to clients; change this to return a
fixed, non-sensitive message (e.g. "Cannot create relation: endpoint not found"
or similar) and move the actual exception details into a server-side log call
(use the same logger used elsewhere in this module) so the full exc is recorded
but not echoed to the MCP client; update the handler around
_sync_add_relation/except ValueError and keep the error_response call only
containing the safe static message.
In `@src/distillery/store/duckdb.py`:
- Around line 3290-3346: The UPDATE and INSERT branches that mutate the
entry_relations table (the block that executes self._conn.execute(...) and the
subsequent INSERT block creating relation_id) do not call
_checkpoint_after_write(), so the new/updated relation may not be flushed to
durable storage; after performing the UPDATE (when set_parts is non-empty) and
after the INSERT (before each return that returns relation_id) call
self._checkpoint_after_write() to checkpoint the write, ensuring durability of
the change to entry_relations.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 8892706f-9c37-45a2-ad60-e64ecb2cb251
📒 Files selected for processing (6)
src/distillery/mcp/server.pysrc/distillery/mcp/tools/relations.pysrc/distillery/store/duckdb.pysrc/distillery/store/migrations.pysrc/distillery/store/protocol.pytests/test_relations.py
Reject invalid_at earlier than valid_at in add_relation; normalize naive datetime inputs to UTC in _coerce_relation_timestamp so the comparison is always between aware datetimes (matches the documented contract). Assert valid_at persistence through the MCP layer in tests. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
@coderabbitai review |
✅ Action performedReview finished.
|
Adds edge attributes to
entry_relationsso relationships carry strength and a bi-temporal validity window, not just a type. Closes the "relations carry no weight/timestamp column" gap surfaced by the graph post-recommendation spikes (#605).Migration 15
Four nullable columns on
entry_relations:weightDOUBLE — edge strength (interest/engagement magnitude)valid_atTIMESTAMPTZ — when the relationship became trueinvalid_atTIMESTAMPTZ — when it stopped being true (NULL = still valid)metadataJSON — arbitrary per-edge attributesFully additive: existing edges and the metadata/wikilink backfill paths keep working; NULL means "unspecified".
API
add_relation(from, to, type, weight=, valid_at=, invalid_at=, metadata=)— all optional. Re-asserting an existing(from, to, type)triple upserts the supplied non-None attrs while staying idempotent on the row.get_related/list_relationsreturn the new fields. Timestamps now render as true UTC viaAT TIME ZONE 'UTC', fixing a latentcreated_atlocal-time-labelled-Zmislabel.distillery_relationsMCPaction="add"accepts and echoes the new fields, validatingweight(number) and ISO 8601 timestamps.Tests
get_related+list_relationstest_get_related_row_structurefor the additive key setFull suite green (2930 passed),
mypy --strictclean on all ofsrc, ruff clean.Follow-ups (not in this PR)
get_related(as_of=...)filtering on the validity window)🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests