Skip to content

feat(store): temporal + weighted edge metadata on relations#606

Merged
norrietaylor merged 2 commits into
mainfrom
feat/relations-edge-metadata
Jun 11, 2026
Merged

feat(store): temporal + weighted edge metadata on relations#606
norrietaylor merged 2 commits into
mainfrom
feat/relations-edge-metadata

Conversation

@norrietaylor

@norrietaylor norrietaylor commented Jun 9, 2026

Copy link
Copy Markdown
Owner

Adds edge attributes to entry_relations so relationships carry strength and a bi-temporal validity window, not just a type. Closes the "relations carry no weight/timestamp column" gap surfaced by the graph post-recommendation spikes (#605).

Migration 15

Four nullable columns on entry_relations:

  • weight DOUBLE — edge strength (interest/engagement magnitude)
  • valid_at TIMESTAMPTZ — when the relationship became true
  • invalid_at TIMESTAMPTZ — when it stopped being true (NULL = still valid)
  • metadata JSON — arbitrary per-edge attributes

Fully additive: existing edges and the metadata/wikilink backfill paths keep working; NULL means "unspecified".

API

  • add_relation(from, to, type, weight=, valid_at=, invalid_at=, metadata=) — all optional. Re-asserting an existing (from, to, type) triple upserts the supplied non-None attrs while staying idempotent on the row.
  • get_related / list_relations return the new fields. Timestamps now render as true UTC via AT TIME ZONE 'UTC', fixing a latent created_at local-time-labelled-Z mislabel.
  • distillery_relations MCP action="add" accepts and echoes the new fields, validating weight (number) and ISO 8601 timestamps.

Tests

  • Edge-attribute round-trip through get_related + list_relations
  • Re-assert upserts attrs, no duplicate row
  • MCP add accepts/echoes attrs; rejects non-numeric weight and unparseable timestamps
  • Updated test_get_related_row_structure for the additive key set

Full suite green (2930 passed), mypy --strict clean on all of src, ruff clean.

Follow-ups (not in this PR)

  • Temporal-aware reads (e.g. get_related(as_of=...) filtering on the validity window)
  • Weight-ordered traversal / weighted graph metrics

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Extended relation definitions to support optional weight, temporal validity timestamps (valid_at/invalid_at), and custom metadata attributes on edges.
    • Re-asserting an existing relation now upserts the provided attributes without creating duplicate entries.
  • Tests

    • Added comprehensive coverage for edge attribute persistence, validation, timestamp parsing, metadata round-tripping, and MCP tool integration.

Add migration 15: nullable weight (DOUBLE), valid_at / invalid_at
(TIMESTAMPTZ, bi-temporal validity window), and metadata (JSON) columns
on entry_relations. Fully additive — existing edges and the metadata /
wikilink backfill paths keep working; NULL means "unspecified".

- add_relation accepts optional weight / valid_at / invalid_at / metadata;
  re-asserting an existing (from,to,type) triple upserts supplied attrs
  while staying idempotent on the row.
- get_related / list_relations return the new fields; timestamps render as
  true UTC (AT TIME ZONE 'UTC'), fixing a latent created_at local-time/'Z'
  mislabel.
- distillery_relations MCP tool action="add" accepts and echoes the new
  fields, with validation for weight (number) and ISO 8601 timestamps.

Closes the "relations carry no weight/timestamp column" gap surfaced by
the graph post-recommendation spikes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Warning

Review limit reached

@norrietaylor, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 3 minutes and 14 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more credits in the billing tab to continue.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: e208034a-54ee-4266-ab31-6b7ebf718072

📥 Commits

Reviewing files that changed from the base of the PR and between e33fa31 and d12f7db.

📒 Files selected for processing (2)
  • src/distillery/store/duckdb.py
  • tests/test_relations.py
📝 Walkthrough

Walkthrough

This PR extends relation edges throughout the distillery system to support optional attributes: weight (edge strength), valid_at/invalid_at (bi-temporal validity windows), and metadata (arbitrary JSON). Changes span the protocol contract, DuckDB backend upsert/read logic, MCP tool validation, and comprehensive tests.

Changes

Edge Attributes for Relations

Layer / File(s) Summary
Protocol contract and schema migration
src/distillery/store/protocol.py, src/distillery/store/migrations.py
DistilleryStore.add_relation signature now accepts optional weight, valid_at, invalid_at, and metadata parameters. Return schemas for get_related and list_relations document the new fields. Migration 15 extends entry_relations with four new nullable columns: weight (DOUBLE), valid_at (TIMESTAMPTZ), invalid_at (TIMESTAMPTZ), and metadata (JSON).
DuckDB store backend: attribute support
src/distillery/store/duckdb.py
New _coerce_relation_timestamp normalizes temporal inputs. _sync_add_relation now upserts: updates only provided non-None attributes on existing relations, or inserts with attributes for new ones. Async add_relation forwards parameters to sync. New _RELATION_SELECT_COLUMNS and _relation_row_to_dict standardize relation projection and parsing (ISO 8601 formatting, JSON metadata handling). Both get_related and list_relations now return dicts including edge attributes.
MCP tool validation and response
src/distillery/mcp/tools/relations.py
add action parses optional weight (float validation), valid_at/invalid_at (ISO-8601 via new _validate_optional_timestamp helper), and metadata. Validated values are forwarded to store.add_relation and echoed in the MCP success response.
MCP server tool signature and forwarding
src/distillery/mcp/server.py
Top-level distillery_relations tool signature exposes the four new optional parameters. Tool docstring documents them. Documented action="add" response includes the new fields. Arguments are forwarded to _handle_relations.
Comprehensive test coverage
tests/test_relations.py
Tests validate row structure (new edge attributes present with None defaults), round-trip persistence (attributes stored and read back correctly with timestamp normalization), upsert behavior (re-asserting a relation updates supplied attributes without duplicates), MCP tool acceptance and response echoing, and validation rejection (bad weight and unparseable timestamps return INVALID_PARAMS).

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly Related PRs

  • norrietaylor/distillery#189: Original PR that introduced typed entry_relations and distillery_relations tool; this PR extends the same relation schema and APIs to add edge attribute support.
  • norrietaylor/distillery#495: Adds action="reconcile" to distillery_relations tool and a corresponding store.reconcile_relations() API, overlapping on the same MCP/store relation workflow and tooling.

Poem

🐰 A rabbit hops through edges fine,
With weight and time to mark each line,
Metadata blooms where relations grow,
Our schema shifts to steal the show!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(store): temporal + weighted edge metadata on relations' directly and accurately summarizes the main changeset: adding temporal validity windows (valid_at, invalid_at) and weighted edges (weight) plus arbitrary metadata to relation edges across the store layer.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/distillery/mcp/tools/relations.py (1)

157-161: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not echo store exception text to MCP clients.

This branch now relays the diagnostic payload from _sync_add_relation verbatim, so a missing endpoint leaks internal visibility details in a client-facing NOT_FOUND response. Return a fixed safe message here and keep the full exception only in logs.

Based on learnings: do not include raw exception details via f-strings or string interpolation in client-facing error responses in src/distillery/mcp/tools/*.py.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/distillery/mcp/tools/relations.py` around lines 157 - 161, The except
ValueError handler in the create-relation flow currently returns
error_response("NOT_FOUND", f"Cannot create relation: {exc}") and thus leaks
internal exception text to clients; change this to return a fixed, non-sensitive
message (e.g. "Cannot create relation: endpoint not found" or similar) and move
the actual exception details into a server-side log call (use the same logger
used elsewhere in this module) so the full exc is recorded but not echoed to the
MCP client; update the handler around _sync_add_relation/except ValueError and
keep the error_response call only containing the safe static message.

Source: Learnings

src/distillery/store/duckdb.py (1)

3290-3346: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Checkpoint relation writes before returning success.

Both the re-assert UPDATE path and the new INSERT path mutate entry_relations, but unlike the other write methods in this class they never call _checkpoint_after_write(). A hard restart can therefore lose the relation or its new edge attributes after the caller already received a relation_id.

Suggested fix
         if existing is not None:
             relation_id = str(existing[0])
             # Re-asserting an edge with fresh attributes upserts the supplied
             # (non-None) fields onto the existing row; the (from,to,type) triple
             # stays idempotent.
             set_parts: list[str] = []
             set_params: list[Any] = []
@@
             if set_parts:
                 self._conn.execute(
                     f"UPDATE entry_relations SET {', '.join(set_parts)} WHERE id = ?",
                     [*set_params, relation_id],
                 )
+                self._checkpoint_after_write(self._conn)
             logger.debug(
                 "Relation already exists id=%s from=%s to=%s type=%s (attrs upserted=%d)",
                 relation_id,
@@
         self._conn.execute(
             "INSERT INTO entry_relations "
             "(id, from_id, to_id, relation_type, weight, valid_at, invalid_at, metadata) "
             "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
             [
@@
                 metadata_json,
             ],
         )
+        self._checkpoint_after_write(self._conn)
         logger.debug(
             "Added relation id=%s from=%s to=%s type=%s",
             relation_id,
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/distillery/store/duckdb.py` around lines 3290 - 3346, The UPDATE and
INSERT branches that mutate the entry_relations table (the block that executes
self._conn.execute(...) and the subsequent INSERT block creating relation_id) do
not call _checkpoint_after_write(), so the new/updated relation may not be
flushed to durable storage; after performing the UPDATE (when set_parts is
non-empty) and after the INSERT (before each return that returns relation_id)
call self._checkpoint_after_write() to checkpoint the write, ensuring durability
of the change to entry_relations.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/distillery/store/duckdb.py`:
- Around line 3287-3289: After coercing timestamps in _coerce_relation_timestamp
(i.e., after computing valid_at_dt and invalid_at_dt) validate that
invalid_at_dt is not earlier than valid_at_dt; if invalid_at_dt < valid_at_dt
raise a ValueError (or a DomainError) with a clear message referencing valid_at
and invalid_at so the caller knows why the relation is rejected. Place this
check immediately after the valid_at_dt/invalid_at_dt assignments in the method
that persists relations (the block that sets valid_at_dt, invalid_at_dt, and
metadata_json) to prevent storing inverted validity windows.

In `@tests/test_relations.py`:
- Around line 389-413: Add an assertion to verify the persisted timestamp
normalization: after calling store.get_related() in
test_relations_tool_add_with_edge_attributes, assert that the returned
row["valid_at"] equals the expected UTC-normalized string (e.g.,
"2026-03-01T00:00:00Z") to match the normalization behavior tested in
test_add_relation_edge_attributes_round_trip; locate the timestamp coming from
_handle_relations/store.get_related and compare it to the normalized form
instead of only checking weight and metadata.

---

Outside diff comments:
In `@src/distillery/mcp/tools/relations.py`:
- Around line 157-161: The except ValueError handler in the create-relation flow
currently returns error_response("NOT_FOUND", f"Cannot create relation: {exc}")
and thus leaks internal exception text to clients; change this to return a
fixed, non-sensitive message (e.g. "Cannot create relation: endpoint not found"
or similar) and move the actual exception details into a server-side log call
(use the same logger used elsewhere in this module) so the full exc is recorded
but not echoed to the MCP client; update the handler around
_sync_add_relation/except ValueError and keep the error_response call only
containing the safe static message.

In `@src/distillery/store/duckdb.py`:
- Around line 3290-3346: The UPDATE and INSERT branches that mutate the
entry_relations table (the block that executes self._conn.execute(...) and the
subsequent INSERT block creating relation_id) do not call
_checkpoint_after_write(), so the new/updated relation may not be flushed to
durable storage; after performing the UPDATE (when set_parts is non-empty) and
after the INSERT (before each return that returns relation_id) call
self._checkpoint_after_write() to checkpoint the write, ensuring durability of
the change to entry_relations.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 8892706f-9c37-45a2-ad60-e64ecb2cb251

📥 Commits

Reviewing files that changed from the base of the PR and between a0947e8 and e33fa31.

📒 Files selected for processing (6)
  • src/distillery/mcp/server.py
  • src/distillery/mcp/tools/relations.py
  • src/distillery/store/duckdb.py
  • src/distillery/store/migrations.py
  • src/distillery/store/protocol.py
  • tests/test_relations.py

Comment thread src/distillery/store/duckdb.py
Comment thread tests/test_relations.py
Reject invalid_at earlier than valid_at in add_relation; normalize naive
datetime inputs to UTC in _coerce_relation_timestamp so the comparison is
always between aware datetimes (matches the documented contract). Assert
valid_at persistence through the MCP layer in tests.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@norrietaylor

Copy link
Copy Markdown
Owner Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@norrietaylor norrietaylor merged commit 9cfd5cd into main Jun 11, 2026
19 of 20 checks passed
@norrietaylor norrietaylor deleted the feat/relations-edge-metadata branch June 11, 2026 22:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant