Skip to content

feat(mcp): constraint + link_prediction graph metrics#607

Open
norrietaylor wants to merge 1 commit into
mainfrom
feat/graph-constraint-link-prediction
Open

feat(mcp): constraint + link_prediction graph metrics#607
norrietaylor wants to merge 1 commit into
mainfrom
feat/graph-constraint-link-prediction

Conversation

@norrietaylor

@norrietaylor norrietaylor commented Jun 9, 2026

Copy link
Copy Markdown
Owner

Extends distillery_relations action="metrics" with the two algorithms the post-recommendation spikes (#605) ran in-process, so they're now first-class Distillery metrics next to bridges/communities.

New metrics

  • constraint — Burt's structural-hole constraint on the undirected projection, sorted ascending. Lowest-constraint entries are the strongest brokers: topics/people bridging otherwise-disconnected clusters. Nodes with no neighbours (NaN) are excluded. Returns [{id, score}].
  • link_prediction — Adamic-Adar index over non-existent edges. Pass entry_id to score emerging adjacencies for one entry (candidate edges to its non-neighbours); omit it for the top predicted edges graph-wide. Returns [{source, target, score}]. Quadratic in node count when unsourced — bound with scope="ego".

Both live in distillery.graph.metrics, compute on the undirected projection, and stay gated behind the [graph] extra.

Example

// strongest brokers in the whole relations graph
{"action":"metrics","metric":"constraint","scope":"global","limit":10}
// emerging adjacencies for one entry
{"action":"metrics","metric":"link_prediction","scope":"ego","entry_id":"<uuid>"}

Tests

  • Unit (tests/graph/test_metrics.py): bowtie → broker has lowest constraint; Adamic-Adar from a source predicts the shared-neighbour node; global surfaces the shared pair; unknown source / empty graph → [].
  • MCP (tests/test_mcp_tools/test_relations_metrics.py): constraint ranks the broker first; link_prediction with entry_id returns {source, target, score} with the right top pair; unknown metric rejected with the expanded valid-set message.

mypy --strict clean, ruff clean, all graph/relations/metrics suites green.

Companion to #606 (temporal/weighted edge metadata); together they close the graph-algorithm and edge-attribute gaps the spikes surfaced.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added two new graph metrics: constraint (identifies structural-hole brokers) and link_prediction (predicts non-existent edges using Adamic–Adar index).
    • Both metrics now available via the relations MCP tool with configurable top-k results.
  • Documentation

    • Updated tool documentation to list new supported metric options and their requirements.
  • Tests

    • Added comprehensive test coverage for new metrics and tool integrations.

Extend distillery_relations action="metrics" with two algorithms the
post-recommendation spikes (#605) ran in-process:

- constraint: Burt's structural-hole constraint, ascending — lowest-constraint
  entries are the strongest brokers (topics/people bridging disconnected
  clusters). Returns [{id, score}].
- link_prediction: Adamic-Adar index over non-existent edges. Pass entry_id to
  score emerging adjacencies for one entry; omit for top predicted edges
  graph-wide (bound via scope="ego"). Returns [{source, target, score}].

Both compute on the undirected projection alongside the existing
bridges/communities, gated behind the [graph] extra.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

This PR adds two graph-analysis metrics (constraint and link_prediction) to the distillery toolkit. The constraint metric identifies Burt's structural-hole brokers; link prediction identifies top non-existent edges using Adamic-Adar scoring. Both metrics are implemented in the graph module, integrated into the MCP relations tool, documented, and validated through comprehensive unit and integration tests.

Changes

Graph Metrics: Constraint and Link Prediction

Layer / File(s) Summary
Core metric implementations
src/distillery/graph/metrics.py
constraint() and link_prediction() functions added with math import; constraint filters None/NaN, sorts ascending, and returns top-k brokers; link_prediction handles empty graphs and optional source restriction, ranks by Adamic-Adar score, returns top-k candidates.
MCP tool integration and dispatch
src/distillery/mcp/tools/relations.py
_VALID_METRICS expanded to accept constraint and link_prediction; metrics imports updated; new dispatch branches added for each metric with appropriate result shape ({id, score} for constraint, {source, target, score} for link_prediction).
Tool documentation updates
src/distillery/mcp/server.py
distillery_relations docstring updated to list constraint and link_prediction as supported metrics, notes optional [graph] extra, revises limit parameter description for top-k semantics.
Unit tests for graph metrics
tests/graph/test_metrics.py
Bowtie and shared-neighbors fixture helpers added; tests validate constraint identifies broker with lowest score, returns empty for empty graphs; link_prediction tests validate source-based prediction, global pair surfacing, and unknown-source handling.
MCP tool integration tests
tests/test_mcp_tools/test_relations_metrics.py
Bowtie-graph seeding helper and three new async tests: constraint metric ordering, link_prediction output structure and scoring, and error message validation for invalid metrics.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • norrietaylor/distillery#426: Introduced the initial distillery_relations metrics framework that this PR extends with two new metric implementations and their full integration pipeline.

Suggested labels

graph

Poem

🐰 Two metrics bloom on graph's stage,
Brokers and predictions engage—
Constraint finds holes, bonds draw near,
Adamic-Adar crystal clear!
NetworkX magic, tests all cheer!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(mcp): constraint + link_prediction graph metrics' accurately describes the main change—adding two new graph metrics (constraint and link_prediction) to the MCP relations tool.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
tests/test_mcp_tools/test_relations_metrics.py (2)

135-141: 💤 Low value

Consider clarifying the docstring.

The docstring says "M brokers two otherwise-disconnected pairs {A,B} and {X,Y}", but the topology actually creates two triangles {M,A,B} and {M,X,Y} since edges A-B and X-Y are included. A more accurate description would be "M brokers two otherwise-disconnected triangles" to reflect that A-B and X-Y are directly connected within their components.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_mcp_tools/test_relations_metrics.py` around lines 135 - 141,
Update the docstring for _seed_bowtie to accurately describe the topology
created: instead of saying "M brokers two otherwise-disconnected pairs {A,B} and
{X,Y}", state that it creates two triangles {M,A,B} and {M,X,Y} (or "M brokers
two otherwise-disconnected triangles") since the pairs list in _seed_bowtie
(pairs = [("M","A"),("M","B"),("A","B"),("M","X"),("M","Y"),("X","Y")]) includes
the A-B and X-Y edges; keep the rest of the function (ids creation via
_store_entry and return ids) unchanged.

185-195: 💤 Low value

Consider consolidating with the existing invalid-metric test.

This test duplicates test_metrics_invalid_metric_returns_invalid_params (lines 198–207), which already validates that unknown metrics are rejected with INVALID_PARAMS. The only addition here is asserting that the error message enumerates the new metrics. You could update the existing test to include that assertion rather than maintaining two separate tests for the same behavior.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_mcp_tools/test_relations_metrics.py` around lines 185 - 195, Two
tests duplicate coverage for rejecting unknown metrics: keep one and add the
additional assertion about enumerated metrics to it. Remove the redundant test
function test_metrics_invalid_metric_rejected and update the existing
test_metrics_invalid_metric_returns_invalid_params to also assert that the error
message contains the expanded metric names (e.g., check "constraint" and
"link_prediction" in data["message"]) after calling _handle_relations(store,
{"action":"metrics","metric":"pagerank","scope":"global"}) and parsing the
result with _parse.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/test_mcp_tools/test_relations_metrics.py`:
- Around line 135-141: Update the docstring for _seed_bowtie to accurately
describe the topology created: instead of saying "M brokers two
otherwise-disconnected pairs {A,B} and {X,Y}", state that it creates two
triangles {M,A,B} and {M,X,Y} (or "M brokers two otherwise-disconnected
triangles") since the pairs list in _seed_bowtie (pairs =
[("M","A"),("M","B"),("A","B"),("M","X"),("M","Y"),("X","Y")]) includes the A-B
and X-Y edges; keep the rest of the function (ids creation via _store_entry and
return ids) unchanged.
- Around line 185-195: Two tests duplicate coverage for rejecting unknown
metrics: keep one and add the additional assertion about enumerated metrics to
it. Remove the redundant test function test_metrics_invalid_metric_rejected and
update the existing test_metrics_invalid_metric_returns_invalid_params to also
assert that the error message contains the expanded metric names (e.g., check
"constraint" and "link_prediction" in data["message"]) after calling
_handle_relations(store,
{"action":"metrics","metric":"pagerank","scope":"global"}) and parsing the
result with _parse.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: fd516e07-deb1-4463-93ee-e19800b0d616

📥 Commits

Reviewing files that changed from the base of the PR and between a0947e8 and 8272a7a.

📒 Files selected for processing (5)
  • src/distillery/graph/metrics.py
  • src/distillery/mcp/server.py
  • src/distillery/mcp/tools/relations.py
  • tests/graph/test_metrics.py
  • tests/test_mcp_tools/test_relations_metrics.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant