Fill tools/graph-equiv with corpus and wire CI equivalence gate #463
Merged
… [ORB-00320] Planned-By: codex
Task
ORB-00320 — Fill tools/graph-equiv with corpus and wire CI equivalence gate
Description
Problem
GRAPH_SPEC.md §16 Step 2 prescribes an equivalence harness that runs both orbit-knowledge (v1) and orbit-graph (v2) backends against a frozen corpus of roughly 30 representative selectors covering rust, typescript, python, and go. It fails CI on any diff outside the documented per-query tolerances from the §16 table. The harness skeleton (ORB-00297 / P0.4) landed the backend trait and v1 implementation; the v2 backend was stubbed with unimplemented!(). This task fills in the corpus, wires the v2 backend to orbit-graph through the CLI (ORB-00318 / P5.1), implements the per-query diff logic, and adds the CI gate.
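The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the harness's actual code: the `Backend` trait shape, the `Tolerance` fields, the `FixedBackend` test double, and the selector strings in the usage note are all hypothetical, and the real per-query tolerances are whatever the §16 table defines.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical backend trait: resolves a selector to a set of result ids.
trait Backend {
    fn query(&self, selector: &str) -> BTreeSet<String>;
}

/// Hypothetical per-query tolerance (assumed shape, not taken from the spec).
#[derive(Clone, Copy)]
struct Tolerance {
    max_missing: usize, // results v1 returns that v2 may omit
    max_extra: usize,   // results v2 may return beyond v1
}

#[derive(Debug, PartialEq)]
struct Diff {
    missing: Vec<String>, // present in v1, absent from v2
    extra: Vec<String>,   // present in v2, absent from v1
}

/// In-memory stand-in for a real backend, used only for illustration.
struct FixedBackend(HashMap<String, BTreeSet<String>>);

impl Backend for FixedBackend {
    fn query(&self, selector: &str) -> BTreeSet<String> {
        self.0.get(selector).cloned().unwrap_or_default()
    }
}

fn fixed(entries: Vec<(&str, Vec<&str>)>) -> FixedBackend {
    FixedBackend(
        entries
            .into_iter()
            .map(|(sel, rows)| {
                (sel.to_string(), rows.into_iter().map(String::from).collect())
            })
            .collect(),
    )
}

/// Runs one selector against both backends and records the set differences.
fn diff_query(v1: &dyn Backend, v2: &dyn Backend, selector: &str) -> Diff {
    let a = v1.query(selector);
    let b = v2.query(selector);
    Diff {
        missing: a.difference(&b).cloned().collect(),
        extra: b.difference(&a).cloned().collect(),
    }
}

fn within_tolerance(diff: &Diff, tol: Tolerance) -> bool {
    diff.missing.len() <= tol.max_missing && diff.extra.len() <= tol.max_extra
}

/// Returns the selectors whose diff exceeds tolerance; empty means the gate passes.
fn run_gate(v1: &dyn Backend, v2: &dyn Backend, corpus: &[(&str, Tolerance)]) -> Vec<String> {
    let mut failures = Vec::new();
    for &(selector, tol) in corpus {
        let diff = diff_query(v1, v2, selector);
        if !within_tolerance(&diff, tol) {
            failures.push(selector.to_string());
        }
    }
    failures
}
```

A CI wrapper would build the corpus from the frozen query files, call something like `run_gate`, and exit non-zero when the returned list is non-empty. With a made-up selector such as `"rust::fn:parse"` where v2 drops one of v1's two results, a zero tolerance reports the selector as a failure, while `max_missing: 1` lets it pass.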
Why It Matters
The equivalence harness is the technical gate for Step 3 (default flip from v1 to v2). Without it, 'v2 matches v1' is a hand-wave; with it, every PR runs the comparison and any regression blocks merge. The harness is also the place that surfaces honest disagreements early — selectors where v2's confidence ladder catches things v1 silently fudged, or where v2 misses something v1 had heuristically right. Those disagreements become triage tickets, not silent shipping risks.
Constraints / Notes
Plan ID: P6.1. Depends on ORB-00297 (P0.4 — harness skeleton) and ORB-00318 (P5.1 — CLI to invoke v2). Runs in parallel with P6.2.
Acceptance Criteria
Execution Summary
Outcome: success
Changes:
Added tools/graph-equiv/corpus/ with 30 frozen query lines split across rust, typescript, python, and go, backed by small language fixtures under tools/graph-equiv/fixtures/.
Wired the v2 backend by invoking orbit-graph-cli as a subprocess and normalizing its JSON output for comparison.
Added bench/equiv-waivers.md as the reviewed-waiver stub and expanded tools/graph-equiv/README.md with corpus, tolerance, run, and waiver documentation.
Added make ci-equiv and a GitHub Actions Graph Equivalence job that gates PRs through the harness.
Added L-0054 documenting why graph-equiv keeps v1 checks fixture-scoped for CI speed.
Strategic decisions:
Validation:
cargo build -p graph-equiv
cargo test -p graph-equiv
cargo clippy -p graph-equiv -- -D warnings
make ci-equiv (30/30 corpus queries passed)
make ci-fast
Assessment: The equivalence gate is wired and passing locally; the main residual risk is that the initial corpus is intentionally fixture-sized, so future tasks should expand it with reviewed real-world selectors as v1/v2 parity hardens.
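The subprocess-and-normalize step listed under Changes might look something like the sketch below. It is an assumption-laden illustration: `run_backend` and `normalize` are hypothetical names, no orbit-graph-cli flags are assumed (the caller supplies the program and arguments), and JSON decoding is left out to keep the example dependency-free, so `normalize` works on one result per line rather than on the CLI's actual JSON.

```rust
use std::io;
use std::process::Command;

/// Runs `program` with `args` and returns its stdout as UTF-8 text.
/// A non-zero exit status is turned into an error so the gate cannot
/// silently compare against partial output.
fn run_backend(program: &str, args: &[&str]) -> io::Result<String> {
    let output = Command::new(program).args(args).output()?;
    if !output.status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("{program} exited with {}", output.status),
        ));
    }
    String::from_utf8(output.stdout)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

/// Normalizes raw output so that ordering, duplicates, and stray
/// whitespace do not register as differences between backends.
/// Assumes one result per line, which is a simplification.
fn normalize(raw: &str) -> Vec<String> {
    let mut rows: Vec<String> = raw
        .lines()
        .map(|line| line.trim().to_string())
        .filter(|line| !line.is_empty())
        .collect();
    rows.sort();
    rows.dedup();
    rows
}
```

Normalizing both backends' output the same way before diffing keeps the comparison about content rather than about incidental ordering or formatting.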
Validation
Branch Freshness
origin/agent-main
orbit/ORB-00320-6a13e23f